The Muppix Team provides clients with innovative, value-driven Professional Services and Training to quickly make sense of Large Scale Data without needing any computer skills.
Based on extensive data-mining experience at Blue Chip clients such as Morgan Stanley, JP Morgan, Santander and Credit Suisse, the Muppix Team have developed a Free Unix/Linux Data Science Toolkit for Consultants and Entrepreneurs to extract and analyse unstructured information from diverse data sources.
Friday 13th June talk by Roger Willink entitled
"Big Data for Entrepreneurs using Muppix"
Pakhuis de Zwijger Piet Heinkade 179, Amsterdam, Holland
MUPPIX Toolkit 2.3 is now available for download. It includes the Keyword Spreadsheet: try out the keywords on your own text and see what the commands extract from a sample of your data.
Check out sample usages on http://muppix.blogspot.nl/
The Toolkit is a comprehensive Unix Cheatsheet described in simple language so that commands can be easily found and used
Designed specifically for professionals with no technical experience
Over 1300 industry-strength Unix commands
As a data scientist, I spend quite a bit of time on the command-line, especially when there's data to be obtained, scrubbed, or explored - Jeroen Janssens
Part of the skillset of a data scientist is knowing how to obtain a sufficient corpus of usable data, possibly from multiple sources, and possibly from sites which require specific query syntax. A data scientist should know how to do this from the command line, e.g. in a Un*X Environment - Hilary Mason
Few tools are more indispensable to my work than Unix. Manipulating data into different formats, performing transformations, and conducting exploratory data analysis (EDA) is the lingua franca of data science. The coffers of Unix hold many simple tools, which by themselves are powerful, but when chained together facilitate complex data manipulations. Although languages like R and Python are invaluable for data analysis, I find Unix to be superior in many scenarios for quick and simple data cleaning, idea prototyping, and understanding data. - Seth Brown
While it's sometimes difficult to remember all of the parameters for the Unix commands, getting familiar with them has been beneficial to my productivity and allowed me to avoid many headaches when working with large text files... Writing a script in Python/Ruby/Perl would probably take a few minutes and then even more time for the script to actually complete. Thankfully, the Unix Utilities exist and they're awesome. - Greg Reda
Whenever you need to work with data, don’t overlook the Unix “hand tools.” Once you get used to working on the Unix command line, you’ll find that it’s often faster than the alternatives. And the more you use these tools, the more fluent you’ll become. - Mike Loukides
80% of time spent on data cleaning & exploring
Huge Demand for technical decision makers
Exponential increase in volume, complexity and sources of data