From time to time I help clients make the most of their data. Below is a non-exhaustive list of the tools I have experience with:
- Amazon Web Services (AWS)
- SQL databases and dialects, including MySQL, PostgreSQL and Hive (on Qubole)
- custom Python scripts using NumPy, SciPy and pandas
- common job queueing systems such as PBS and SGE
- methods from network theory
- C++-based packages for network-related problems
- GNU/Linux and Mac OS X
My consulting covers problems at several levels of abstraction:
- which questions do you want answered?
- is the right data to answer those questions already available?
- which of the available data is actually usable?
- which other data sources can we acquire?
- planning: finding the most efficient way to answer the initial questions, given the answers to all of the above
- collect relevant internal data
- collect relevant external data (e.g. from social network sites)
- data analysis using statistical methods and state-of-the-art distributed computing tools
- extraction of the relevant information from the analysis
- an interactive, visually appealing summary of the results
I’d be happy to help with your projects, too! Please contact me if you think I could help solve your problem.