I am a PhD student in the Machine Learning Department (MLD) within the School of Computer Science at CMU, working under Prof. Eric Xing.

My research interests include statistical machine learning, non-parametric Bayesian methods, information retrieval, and clustering. I have recently worked on transferring supervision in finite and infinite (Dirichlet process) mixture models and on information retrieval.

I interned at Microsoft Research Redmond in the summer of 2014, where I had the opportunity to work with great mentors: Rich Caruana, Evelyne Viegas and Matthew Richardson. I also received the prestigious IBM Fellowship for 2013-2015. I worked as a researcher at IBM Research India for two years, from 2009 to 2011. I received my Master of Technology degree in Computer Science from IIT Bombay in 2009.
I did my Master's thesis in Information Retrieval under the guidance of Prof. Soumen Chakrabarti.

Publications (dblp, scholar)

Latest Manuscripts
  • Embarrassingly Parallel MCMC in Quasi-ergodic Settings - W. Neiswanger, A. Dubey, C. Wang, E. Xing (In submission)
  • Bayesian Nonparametric Kernel-Learning - J. Oliva*, A. Dubey*, B. Poczos, J. Schneider, E. P. Xing (In submission)
  • Estimating Accuracy from Unlabeled Data: A Bayesian Approach - E. Platanios, A. Dubey, T. Mitchell (In submission)
  • Large-scale randomized-coordinate descent methods with non-separable linear constraints - S. Reddi, A. Hefny, A. Dubey, C. Downey, S. Sra. Conference on Uncertainty in Artificial Intelligence (UAI) 2015. (arXiv:1409.2617)
  • Large-scale Distributed Dependent Nonparametric Trees - Z. Hu, Q. Ho, A. Dubey, E. P. Xing, The 32nd International Conference on Machine Learning (ICML) 2015. (pdf)
  • Learning Answer-Entailing Structures for Machine Comprehension - M. Sachan, A. Dubey, M. Richardson, E. P. Xing, Association for Computational Linguistics (ACL) 2015. (pdf) (Honorable Mention)
  • Dependent nonparametric trees for dynamic hierarchical clustering - A. Dubey*, Q. Ho*, S. Williamson and E. P. Xing, Advances in Neural Information Processing Systems (NIPS) 2014. (pdf)
  • Integrating Transition-based and Graph-based Parsing Using Integer Linear Programming - A. Hefny, A. Dubey, S. J. Reddy, Advances in Neural Information Processing Systems 28 (NIPS) Workshop Modern ML + NLP (pdf coming)
  • Parallel Markov Chain Monte Carlo for Pitman-Yor Mixture Models - A. Dubey, S. Williamson and E. P. Xing, Conference on Uncertainty in Artificial Intelligence (UAI) 2014. (pdf)
  • Spatial Compactness meets Topical Consistency: Jointly modeling Links and Content for Community Detection - M. Sachan, A. Dubey, S. Srivastava, E. P. Xing and E. Hovy, International Conference on Web Search and Data Mining (WSDM) 2014. (pdf) (Considered for best paper)
  • Parallel Markov Chain Monte Carlo for Nonparametric Mixture Models - S. Williamson, A. Dubey and E. P. Xing. The 30th International Conference on Machine Learning (ICML) 2013. [preprint]
  • A Non-parametric Mixture Model for Topic Modeling Over Time - A. Dubey, A. Hefny, S. Williamson, E. P. Xing, Proceedings of The Thirteenth SIAM International Conference on Data Mining (SDM) 2013. (previous version pdf)
  • AUSUM: approach for unsupervised bug report summarization, S. Mani, R. Catherine, V. S. Sinha, A. Dubey, ACM 20th International Symposium on the Foundations of Software Engineering (SIGSOFT) 2012. (pdf)
  • Learning Dirichlet Processes from Partially Observed Groups, A. Dubey, I. Bhattacharya, M. Das, T. Faruquie, and C. Bhattacharyya, IEEE International Conference on Data Mining (ICDM), Vancouver, Canada, 2011. (pdf)
  • Diversity in Ranking via Resistive Graph Centers, A. Dubey, S. Chakrabarti and C. Bhattacharyya, 17th ACM Conference on Knowledge Discovery and Data Mining (SIGKDD), San Diego, CA, USA, 2011. (pdf)
  • A Cluster-Level Semi-Supervision Model for Interactive Clustering, A. Dubey, I. Bhattacharya, S. Godbole, The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), Barcelona, Spain, September 2010. (pdf)
  • Conditional Models for Non-smooth Ranking Loss Functions, A. Dubey, J. Machchhar, C. Bhattacharyya and S. Chakrabarti, IEEE International Conference on Data Mining (ICDM), Miami, Florida, USA, December 2009. (pdf)
  • Efficient and Accurate Local Learning for Ranking, S. Banerjee, A. Dubey, J. Machchhar, S. Chakrabarti, 32nd Annual ACM SIGIR Conference workshop on Learning to Rank for Information Retrieval, Boston, USA, July 2009. (pdf)


  • Systems and Methods for Interactive Clustering, Avinava Dubey, Indrajit Bhattacharya, Shantanu Godbole.

Contact details