The Muppix Team provides clients with innovative, value-driven Professional Services and Training to help them quickly make sense of large-scale data without needing any computer skills.
Based on extensive data-mining experience in Front Office Trading systems at Blue Chip clients such as Morgan Stanley, JP Morgan, Santander and Credit Suisse, the Muppix Team have developed a free Unix/Linux & SQL Data Science Toolkit for Consultants and Entrepreneurs to extract and analyse unstructured information from diverse data sources.
So much data is created every day. Amazingly, a lot of it is actually accessible, but it's all very unstructured and difficult to put to any real practical use.
No single software programme on your PC can handle really large data, and not one of them is going to give you the flexibility to select out the useful information. Ctrl+F just doesn't cut it.
Muppix enables you to easily perform sophisticated selections on very large data, on any PC or Apple Mac. It uses industry-strength Linux to do the heavy lifting, but each Muppix command is described in a simple keyword language, so commands are easy to find. Our mission is that any granny can use Muppix within 5 minutes!
In the future we will all be setting up our own websites, making apps and showing new connections between different areas. This involves extracting information from large data sets drawn from different sources. Muppix puts you fully in control of the information exploration.
We're planning another free introductory course entitled
"A Hands-on Exploration of Big Data"
on Saturday morning, 21st May, at our facilities in Bosch en Duin, near Bilthoven, just north of Utrecht. (Click the Training tab at the top of this page to book.)
MUPPIX Toolkit 2.6 is now available for download. It includes the SQL commands and Keyword Spreadsheet: try out the keywords on your own text and see what the commands extract from a sample of your data.
Check out Muppix examples on http://muppix.blogspot.nl/
Muppix have just added a new Toolkit:
Muppix are now developing a Cross-Over Toolkit to Excel and to SQL. Basic Muppix commands will be converted to the equivalent Excel or standard SQL commands.
The Toolkit is a comprehensive Unix Cheatsheet described in a simple language so that commands can be easily found and used
Designed specifically for professionals with no technical experience
Over 1300 industry-strength Unix commands
As a data scientist, I spend quite a bit of time on the command-line, especially when there's data to be obtained, scrubbed, or explored - Jeroen Janssens
Part of the skillset of a data scientist is knowing how to obtain a sufficient corpus of usable data, possibly from multiple sources, and possibly from sites which require specific query syntax. A data scientist should know how to do this from the command line, e.g. in a Un*X Environment - Hilary Mason
Few tools are more indispensable to my work than Unix. Manipulating data into different formats, performing transformations, and conducting exploratory data analysis (EDA) is the lingua franca of data science. The coffers of Unix hold many simple tools, which by themselves are powerful, but when chained together facilitate complex data manipulations. Although languages like R and Python are invaluable for data analysis, I find Unix to be superior in many scenarios for quick and simple data cleaning, idea prototyping, and understanding data. - Seth Brown
While it's sometimes difficult to remember all of the parameters for the Unix commands, getting familiar with them has been beneficial to my productivity and allowed me to avoid many headaches when working with large text files... Writing a script in Python/Ruby/Perl would probably take a few minutes and then even more time for the script to actually complete. Thankfully, the Unix Utilities exist and they're awesome. - Greg Reda
Whenever you need to work with data, don’t overlook the Unix “hand tools.” Once you get used to working on the Unix command line, you’ll find that it’s often faster than the alternatives. And the more you use these tools, the more fluent you’ll become. - Mike Loukides
80% of time spent on data cleaning & exploring
Huge demand for technical decision makers
Exponential increase in volume, complexity and sources of data