Software

I write industrial-strength software, which I attribute to my years of working in industry and to skills honed in academia. As of January 2018, I was ranked 153rd globally on Matlab Central (an online repository for Matlab code contributed by users all over the world) and was among the top 5% of code contributors worldwide. Here is a screenshot (link1) (link2) (link3) and here is a link to my profile on Matlab Central. My code was downloaded 436 times in March 2016.

I am comfortable programming in several languages (R, Python and MATLAB).

Here is some of the code that I wrote. If you like it or have any suggestions/feedback, please drop me a line. If you use my software, kindly cite my work (if you are not sure how to cite my work, just send me an email and I can help you).

Here is a link to my code repository on bitbucket, a link to my open source data science projects (github link) (bitbucket link) and a link to my github repository.


I have also written R packages for privacy-preserving survival analysis (link, link, link).


Other minor packages from open source repositories and other resources that I have modified or contributed to (credit goes to the original authors) (link) (link) (link) (link) (link)



1) Some code in MATLAB for doing linear regression and plotting the results. It will output all the statistics and plot the regression lines (available on Matlab Central) (github)

Dynamical Systems and Nonlinear Differential Equations

2) A GUI for solving ordinary differential equations (available on Matlab Central) (github)

3) A GUI for solving ordinary differential equations (model with immune response).

4) R code to perform inference using Stan and RStan for an ordinary differential equation model (Lotka-Volterra model) (available on bitbucket)

5) A GUI for solving delay differential equations and doing local search for best-fit parameters (available on Matlab Central)

6) A graphical user interface for solving ordinary differential equations for in-vitro infection: shows both viral and target cell dynamics (on Matlab Central)

Machine Learning and Statistics

7) Code in Matlab to do linear regression for multiple categories and output relevant statistics (available on Matlab Central)

8) Code in Matlab to do linear regression for multiple categories with different slopes for each category (on Matlab Central)

9) Code in Matlab to do boxplots of a biological quantity measured in healthy vs. disease subjects in two different cohorts (on Matlab Central)

10) Code in Matlab to do a t-test and create box plots for genes/compounds/bugs etc in healthy vs. disease individuals (on Matlab Central)

11) Code in Matlab to do a Wilcoxon rank-sum test and create box plots for genes/compounds/bugs etc in healthy vs. disease individuals (on Matlab Central)

12) MATLAB code to perform Bayesian linear regression (on Matlab Central)

13) Simple MATLAB example code and generic function to perform LASSO (on Matlab Central)

14) Function to perform Bayesian LASSO (on Matlab Central)

15) Simple MATLAB example code and generic function to perform LASSO on GLM (on Matlab Central)

16) Simple MATLAB example code and generic function to perform LASSO on logistic regression and predict (on Matlab Central)

17) MATLAB generic function for performing LASSO on GLM and using it for prediction (on MATLAB Central)

18) Generic function to perform elastic net regularization on GLM or logistic regression (on MATLAB Central)

19) Simple MATLAB example code and function to do binary classification using SVM (support vector machine) (on Matlab Central)

20) Simple MATLAB example code and function to do binary classification using SVM (support vector machine) for a general matrix (on Matlab Central)

21) Simple MATLAB example code and generic function to perform kmeans clustering (on Matlab Central)

22) Simple MATLAB generic function to perform PCA (on Matlab Central)

23) MATLAB generic function for random forests (on MATLAB Central and more powerful general function to try different leaf sizes on MATLAB Central)

24) MATLAB generic function for neural networks (on MATLAB Central)

25) Example MATLAB script to plot ROC curve and compute AUC for 4 different classification algorithms (on MATLAB Central)

26) R code to test for allometric power law relationship (on bitbucket)

27) MATLAB code to plot co-ordinate data from a file on a US map (on MATLAB Central)

28) R code to create a stacked plot (on bitbucket)

29) R code to perform forecasting and SQL-like queries for a road accident forecasting and data exploration project (on bitbucket) (github) (deployed on shinyapps)

     Deployed web application to perform data exploration using SQL-like queries and perform machine learning analysis.

     This is an example of doing a very simple time-series model and visualization

     (on shinyapps)

30) R code for forecasting and time series (on bitbucket)

31) Basic example in R to do matrix completion using softImpute (on bitbucket)

32) Python code for a generic random forest for regression and classification (on bitbucket)

33) R code for LASSO and elastic net in R (on github)

34) R code for random forests (on github)


Process Automation

33) Scripts to run MATLAB on cluster using bsub (on Matlab Central)

34) Shell script to iteratively go into each directory and compile MATLAB code (on Matlab Central)

35) Here are some other tidbits of code that I either wrote or compiled (MATLAB nuggets)

36) Shell script to go into each directory (in the current directory) and compile matlab code in each of these directories (code)

37) Shell script to launch multiple commands (generate "swarm" of jobs) on cluster (code) (also on bitbucket)

38) Shell script to automatically compile a LaTeX file (calls latex, bibtex, dvips and ps2pdf to generate the final pdf from the tex file) (code) (also on bitbucket)

39) Shell script to iteratively go into each directory and compile MATLAB code (on bitbucket)

40) Shell script to install R, RStudio and other utilities on UNIX/Ubuntu (on github)

41) Shell scripts for essential work (on github)
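
As a rough illustration of the tool chain named in item 38 (latex, bibtex, dvips, ps2pdf), here is a minimal Python sketch. It is not the linked shell script. The function names latex_build_commands and build_pdf are hypothetical, and running latex twice after bibtex to resolve citations is an assumption based on common practice rather than something stated above.

```python
from pathlib import Path
import subprocess


def latex_build_commands(tex_file):
    """Return the command sequence for a tex -> dvi -> ps -> pdf build.

    Hypothetical helper: the order follows the tools named in item 38.
    The two extra latex passes after bibtex are an assumption.
    """
    stem = Path(tex_file).stem
    return [
        ["latex", stem],                               # first pass, writes .aux
        ["bibtex", stem],                              # builds the bibliography
        ["latex", stem],                               # pulls in the bibliography
        ["latex", stem],                               # settles cross-references
        ["dvips", f"{stem}.dvi", "-o", f"{stem}.ps"],  # dvi -> PostScript
        ["ps2pdf", f"{stem}.ps", f"{stem}.pdf"],       # PostScript -> pdf
    ]


def build_pdf(tex_file):
    """Run each command in the directory of the tex file, stopping on failure."""
    workdir = Path(tex_file).resolve().parent
    for command in latex_build_commands(tex_file):
        subprocess.run(command, cwd=workdir, check=True)
```

Calling build_pdf("paper.tex") requires a working TeX installation; latex_build_commands can be inspected on its own without one.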


Bioinformatics

42) Pipeline and vignette to use Seurat for single-cell RNA sequencing data analysis (link)

43) Vignette to use Seurat CCA to align multiple datasets for single-cell RNA sequencing data (link)

44) Vignette to use ZINBWAVE to align datasets and plug into Seurat for single-cell RNA sequencing data (link)

45) Vignette for differential expression analysis of single-cell RNA sequencing data using DESeq2 (on bitbucket)

46) Analysis of single-cell data using SC3 and scater (on bitbucket)

47) Mapping single-cell data to other datasets using scmap (on bitbucket)

48) R code for converting from Entrez ID to Ensembl ID using annotation packages (on bitbucket)

49) Analysis of microarray data (differential expression) using limma (on bitbucket)

50) Shell script and python script to parse output from SignalP bioinformatics software (on bitbucket)

51) Shell script and python script to parse output from PSortB bioinformatics package (on bitbucket)

52) Python program to get data using curl from OpenTargets biological repository and parse resulting json file (on bitbucket)


Data visualization

53) R code to visualize gene expression data as heatmaps using pheatmap (on bitbucket)

54) R code to visualize gene expression data as heatmaps using ComplexHeatmap (on bitbucket)

55) R code to visualize gene expression data as heatmaps using Morpheus R plugin (on bitbucket)

56) Python generic function for tSNE visualization (on bitbucket)


Statistics and machine learning

57) Vignette for meta-analysis in R using the metafor package (on bitbucket)

58) Fisher's method of combining p-values to do meta-analysis in R (on bitbucket)

59) Fisher's method of combining p-values to perform meta-analysis in MATLAB (on MATLAB Central)

60) Example code to perform supervised PCA (on bitbucket)

61) A generic function to use a Generalized Linear Model (GLM) with factors (on MATLAB Central)

62) Non-negative matrix factorization for gene expression data to construct metagene (on bitbucket)

63) Plotting survival curves by estimating a Kaplan-Meier and Cox proportional hazards model (on bitbucket)

64) Chi-squared test and Fisher's exact test example R script (on bitbucket)

65) Example code for Gaussian mixture model in R (on bitbucket)

66) Example code in R to perform logistic regression and plot ROC curve and precision recall curve in ggplot, perform test and train split and also perform optional cross-validation (on bitbucket)

67) Code to perform mixed effects logistic regression in R using glmer (on bitbucket)

68) Simple example code in R to perform principal component regression (on bitbucket)

69) Example tutorials for ANOVA and linear mixed effects models (fixed effects and random effects) (on bitbucket)

70) Automated machine learning using TPOT (on bitbucket)

71) Draw samples from gamma and Wishart distribution (on MATLAB Central)

72) Example to perform linear mixed effects regression in a Bayesian setting using the Edward framework (on bitbucket)

73) Example to perform linear mixed effects regression in a Bayesian setting using the PyMC3 framework (on bitbucket)

74) Example of linear mixed effects regression in a Bayesian setting (probabilistic programming) using the rstanarm framework (on bitbucket)

75) Simple example of regression and decision tree in R (on bitbucket)

76) Example of using bridge sampling to perform model selection on a Bayesian GLM (on bitbucket)

77) Simple example of Poisson regression GLM (on bitbucket)

78) Simple examples of GLM and GLM mixed effects models in frequentist and Bayesian (using rstanarm) (on bitbucket)

79) Simple example of GAM (Generalized Additive Model) with spline fits; also has logistic regression with GAM (on bitbucket)

80) Simple examples of NLP (on bitbucket)

81) Generic function to perform PCA and show biplots in R (on bitbucket)

82) Error message in glm() function in R (on bitbucket)

83) Simple example of bootstrapping in python and R (on bitbucket)

84) Packages in R for survival analysis of secure federated data (link, link, link)

85) Code in R for generating survival curves (link)

86) Integrated R code repository for statistical analysis (link) (builds on work by Rudolf Cardinal)

87) Example script for isotonic regression (link, link)

88) Gaussian process (link)

89) Examples of probabilistic programming using PyMC3 (link)

90) Software to generate complex explanations from machine learning models using a class-contrastive approach (software created by student Yujia Yang) (link) (more complex code on github)

91) Software to perform patient stratification on genomic data and explain using class-contrastive reasoning (Sharday Olowu) (github)
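
For readers unfamiliar with Fisher's method (items 58 and 59), here is a minimal Python sketch of the idea. It is not the code in the linked repositories, and the function name fisher_combine is hypothetical. It assumes the p-values come from independent tests: the statistic is -2 times the sum of the log p-values, which follows a chi-squared distribution with 2k degrees of freedom under the null hypothesis, where k is the number of p-values.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p). Hypothetical helper for illustration;
    it assumes the underlying tests are independent.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)

    # Upper tail of a chi-squared distribution with 2k degrees of freedom.
    # For even degrees of freedom it has the closed form
    #   exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    combined_p = math.exp(-half) * total
    return statistic, combined_p
```

With a single p-value the combined p-value equals the input, which is a quick sanity check; for example, fisher_combine([0.05]) returns a combined p-value of 0.05 up to floating-point error.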


Software Engineering

92) Example script to show how to do testing in R using the testthat framework (on bitbucket)


Time Series

93) Example code to perform clustering and community detection in time series data (on bitbucket)

94) Example code to perform changepoint detection in time series data (on bitbucket)

R code to perform forecasting and SQL-like queries on a road accident forecasting project (on bitbucket) (on shinyapps)

R code for forecasting and time series (on bitbucket)


Deep learning

95) Example code and prototypes for deep learning (feedforward networks, auto-encoders, LSTMs) (on bitbucket)

96) Python and command-line tools for easily working with the Abstraction & Reasoning Corpus (ARC) dataset (created by Mikel Bober-Irizar) (link on github)


Epidemiology

97) Code for calculating incidence ratio of a disease in a population (on bitbucket) (private: email me if you need access)


Data science tools and data munging tools

98) Example code to perform diff on data (on bitbucket)

Repository of tools and scripts for data munging (link) (private; email me if you need access)

99) Advanced data.table operations (like detecting next row for each patient) (on bitbucket)

100) Tutorial and template for reproducible science in R using rmarkdown (link)

Templates

101) Template for R package (link)

102) Template for R bookdown (link)

103) LaTeX new manuscript project template (link)

104) New project template (link)


Miscellaneous

105) Repository of tools and scripts for data munging (link) (private; email me if you need access)


Open source data science

106) Repository of my public open source data science projects (bitbucket) (github)

107) Open data repository (link)