According to the Harvard Business Review, Data Scientist is “The Sexiest Job of the 21st Century”. Data scientists are often considered wizards who deliver value from big data. These wizards need knowledge in three very distinct subject areas: scalable data management, data analysis, and domain expertise. However, it is a challenge to find such jacks-of-all-trades who cover all three areas.
Or, as the Wall Street Journal puts it, “Big Data’s Problem is Little Talent”. Naturally, finding talented data scientists is also a requirement if we are to put big data to good use. If data analysis were specified in a declarative language, data scientists would no longer have to worry about low-level programming; instead, they would be free to concentrate on the analysis problem itself.
We aim to enable deep analytics of huge, heterogeneous data sets with low latency by developing advanced, scalable data analysis and machine learning methods. Our goal is to specify these methods in a declarative way and to optimize and parallelize them automatically, in order to empower data scientists to focus on the analysis problem at hand, relieving them of the need to be systems programmers.
Part of my research deals with computational methods and big data.
Herein, I briefly list my current knowledge of the subject.
Numerical techniques:
Density functional theory (DFT) as implemented in Siesta and PWscf
Numerical methods for open quantum systems
Density functional tight binding (DFTB)
Kinetic Monte Carlo (KMC; see the sketch after this list)
Optimization and systems of non-linear equations
Dynamical mean field theory (DMFT)
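As an illustration of the kinetic Monte Carlo entry above, the following is a minimal sketch in Python (one of the languages listed below) of a single Gillespie-style KMC step, applied to a toy one-dimensional random walker. The hop rates and the lattice are purely illustrative assumptions, not taken from any particular code I work with.

    import numpy as np

    def kmc_step(rates, rng):
        """One Gillespie-style kinetic Monte Carlo step.

        rates: array of transition rates for the events available from the
        current configuration (illustrative values only). Returns the index
        of the chosen event and the stochastic time increment.
        """
        total = rates.sum()
        # Choose an event with probability proportional to its rate.
        event = np.searchsorted(np.cumsum(rates), rng.random() * total)
        # Advance the clock by an exponentially distributed waiting time.
        dt = rng.exponential(1.0 / total)
        return event, dt

    # Toy usage: a walker hopping left/right on a 1-D lattice.
    rng = np.random.default_rng(0)
    hop_rates = np.array([1.0, 2.0])   # [left, right] rates, arbitrary example
    position, time = 0, 0.0
    for _ in range(1000):
        event, dt = kmc_step(hop_rates, rng)
        position += -1 if event == 0 else 1
        time += dt
    print(f"after t = {time:.2f} the walker is at site {position}")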
Programming languages:
Matlab
Mathematica
Fortran 90
Python