Summer school

Learning in Data Science: Models, Algorithms and Tools

July 17-22, 2017

Organised by: School of Engineering and Applied Science (SEAS), Ahmedabad University

PROGRAM

SEAS is delighted to announce an intensive summer school on pattern recognition. The school blends the theoretical and practical aspects of data science, with a focus on models, tools and techniques illustrated through problems in pattern matching and pattern recognition, and with applications in Computer Science, Bioinformatics/Computational Biology and Image Processing.

The school is led by experts from both industry and academia and offers an immersive experience in the theoretical and practical aspects of data science for advanced UG/PG students and faculty members in Computer Science / Mathematics. It is an opportunity for prospective researchers to get a rigorous introduction to the fast-emerging field of data science.


Workshop Content:

Theory:

Probability and Statistics Basics: continuous and discrete distributions, the normal distribution, conditional probability, Bayesian inference, expectation, variance, covariance and linearity of expectation, Markov chains, and hidden Markov models (HMMs).

Parametric Estimation: maximum likelihood and expectation maximization.

Pairwise Alignment: different scoring models, dynamic programming algorithms, heuristic algorithms (BLAST, FASTA), statistics of similarity scores and scoring parameters, and modelling pairwise alignments using Markov chains and HMMs.

Multiple Sequence Alignment and Phylogenetic Tree Construction: exact and approximation algorithms, progressive alignment methods (Clustal X and T-Coffee), multiple alignments using profile HMMs, and probabilistic approaches to phylogeny construction.

Introduction to convex optimization.

Application: face recognition using principal component analysis (PCA), incremental PCA, probabilistic PCA, linear discriminant analysis (LDA), incremental LDA and probabilistic LDA.

Lab/Tutorial:

Design and development of algorithms for exact string matching using SMART, a framework for the development and testing of pattern matching algorithms.

Models, algorithms and heuristics for pairwise sequence alignment, with applications and statistical analysis of the results.

How to use BLAST and FASTA (key search and analysis tools for pairwise local sequence alignment) and understand the algorithms and statistics behind their workings.

How to use CLUSTALW and T-Coffee (tools for multiple sequence alignment) and understand the underlying algorithms and the interpretation of their results.

Modelling, design and analysis of algorithms for face recognition using variants of principal component analysis and linear discriminant analysis.

The labs will use a combination of MATLAB, Python and OpenCV. The emphasis of the labs will be on understanding and analysing the probabilistic and online variants of PCA and LDA.