Group Meeting 2020

Fall, 2020

  • Xiaomeng WAN, Nov. 27. Implicit bias of gradient descent. An optimization algorithm (e.g. gradient descent) may induce its own implicit regularization. When the optimization problem is underdetermined, i.e. there are many global minima corresponding to zero training error, different optimization algorithms may prefer different types of minima. We discussed several examples for gradient descent.

  • Mingxuan CAI, Nov. 20. Statistical methods for cell type deconvolution. RNA-seq samples generated from bulk tissues only represent gene expression profiles averaged across cell types. By utilizing cell type-specific gene expression references derived from independent single-cell RNA-seq data sets, many computational deconvolution methods have been developed to infer the cell type proportions in bulk samples. We discussed several widely used deconvolution methods, including CIBERSORT, MuSiC, and Bisque, summarized their practical considerations, and analyzed their performance on real datasets. We also took note of Scaden.

  • Jia ZHAO, Nov. 6. Generalized matrix factorization. We discussed a fast algorithm for fitting generalized linear latent variable models (GLLVM), which is scalable to high-volume and high-dimensional datasets. This is achieved via a good approximation based on quasi-likelihood and parallelization. [ref]

  • Gefei WANG, Oct. 30. Learning disentangled representation in VAE. We discussed how to learn interpretable representations in deep generative models in an unsupervised manner. For Variational Auto-Encoders (VAEs), one way to achieve this goal is to enforce statistical independence among the latent factors. A short calculation shows that the KL divergence term of the VAE objective implicitly encourages independence because it contains a Total Correlation (TC) term. Several models, including beta-VAE, beta-TCVAE and FactorVAE, make use of this property to further improve disentanglement in VAEs.

  • Jiashun XIAO, Oct. 23. Matching human face with DNA with applications. Suspect identification is a common problem in forensics, and it often involves matching a human face to DNA samples collected at the crime scene. To address this issue, we explored two different strategies to connect the human face to DNA. The first strategy predicted the face from DNA information and then matched the predicted face to the real face image. The second strategy predicted DNA-encoded aspects (sex, ancestry, SNP) from face images, and then matched the predicted DNA aspects against the real DNA. We found that the first method seemed straightforward but had a higher computational cost and lower identification power than the second. We thus concluded that the face-to-DNA regime was more practicable in real applications.
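
As a hedged illustration of the implicit-bias entry above (Nov. 27): the sketch below assumes the simplest underdetermined setting, a least-squares problem with fewer samples than features. This is an illustrative choice and not necessarily one of the examples covered in the meeting. In that setting, gradient descent started from zero converges to the minimum-norm interpolating solution, while a different starting point reaches a different zero-error minimum. All names (`gradient_descent`, `w_gd`, `w_other`) and the problem sizes are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Underdetermined problem (assumed sizes): 5 equations, 20 unknowns,
# so infinitely many w satisfy X @ w = y exactly.
n_samples, n_features = 5, 20
X = rng.standard_normal((n_samples, n_features))
y = rng.standard_normal(n_samples)


def gradient_descent(X, y, w0, step, n_iters):
    """Plain gradient descent on the loss 0.5 * ||X w - y||^2."""
    w = w0.copy()
    for _ in range(n_iters):
        w -= step * X.T @ (X @ w - y)
    return w


# Step size 1/L, where L is the largest eigenvalue of X^T X.
step = 1.0 / np.linalg.norm(X, 2) ** 2

# Start from zero: iterates stay in the row space of X.
w_gd = gradient_descent(X, y, np.zeros(n_features), step, 5000)

# Minimum-Euclidean-norm interpolating solution, via the pseudoinverse.
w_min_norm = np.linalg.pinv(X) @ y

# Start from a random point: still zero training error, but a different minimum.
w_other = gradient_descent(X, y, rng.standard_normal(n_features), step, 5000)

print("GD from zero matches min-norm solution:", np.allclose(w_gd, w_min_norm))
print("norm from zero init:  ", np.linalg.norm(w_gd))
print("norm from random init:", np.linalg.norm(w_other))
```

Both runs fit the training data exactly; only the initialization decides which global minimum is reached, which is the sense in which the algorithm itself acts as a regularizer in this toy case.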

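The calculation mentioned in the VAE entry above (Oct. 30) can be sketched as follows. It assumes a factorized prior p(z) = ∏_j p(z_j) and writes q(z) = E_{p(x)}[q(z|x)] for the aggregated posterior. This is the decomposition used in the beta-TCVAE literature; the notation is an assumption and may differ from what was presented in the meeting.

```latex
\begin{aligned}
\mathbb{E}_{p(x)}\big[\mathrm{KL}\big(q(z \mid x)\,\|\,p(z)\big)\big]
&= \underbrace{I_q(x; z)}_{\text{mutual information}}
 + \underbrace{\mathrm{KL}\Big(q(z)\,\Big\|\,\prod_j q(z_j)\Big)}_{\text{total correlation (TC)}}
 + \underbrace{\sum_j \mathrm{KL}\big(q(z_j)\,\|\,p(z_j)\big)}_{\text{dimension-wise KL}}
\end{aligned}
```

The middle term is zero exactly when the latent coordinates are independent under q(z), so penalizing the KL term also penalizes dependence among the latent factors.
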
Spring, 2020

  • Xiaomeng WAN, June 12. A glance at the overparameterized region [pdf].

Fast Integration of scRNA-seq data

Jingsi MING, April 17, 2020

Mendelian Randomization

Xianghong HU, April 24, 2020

Cross Population Analysis

Mingxuan CAI, April 03, 2020

Blessing of multiple causes

Jia ZHAO, April 10, 2020

Ancestry inference [link][code]

Jiashun XIAO, March 20, 2020

Parallel Programming [link]

Shunkang ZHANG, March 27, 2020