The Artificial Intelligence for Social Good Lab

DBLP1, DBLP2|Google Scholar|Research Gate

New Jersey Institute of Technology

GITC #5115, NJIT Media Center, Newark, NJ 07104, USA

Email: phan@njit.edu, Phone: 973-596-6367 | Open Positions

Dr. Phan is an assistant professor at NJIT. His research interests mainly concern privacy and security, social network analysis, machine learning, and spatio-temporal data mining. In particular, he is interested in understanding the nature of human behavior and human expression on health social networks and social media. Dr. Phan received his PhD in Computer Science from the University of Montpellier 2 in October 2013, where he was part of the Knowledge Discovery in Databases Team (TaToo) in the LIRMM laboratory (supervised by Prof. Pascal Poncelet, Mme. Maguelonne Teisseire, and Dr. Dino Ienco). He joined the University of Oregon as a Research Associate in fall 2013 and worked with Prof. Dejing Dou.

We are looking for a PhD student with a strong background in math and machine learning! The position is expected to start in Spring (January) 2021. Please send your CV to phan at njit.edu.


Han Hu (PhD) 2016 - present

Phung Lai (PhD) 2018 - present (NJIT), 01/2020 - present (Adobe)

Pelin Ayranci (PhD) 2020 - present

Khang Tran (PhD) 2020 - present

Andrew Dennis (PhD) 2020 - present (Bloomberg)

Khang Dang (PhD) 2021 - present

Anuja Badeti (Honor Undergraduate) 2018 - present

Pradnya Desai (Honor Undergraduate) 2019 - present

Hang Nguyen (Honor Undergraduate) 2019 - present

"Everything should be made as simple as possible, but not simpler." (Albert Einstein)

Media Coverage: NJIT Weekly, Vox.com, NJSpotlight, NJ101.5, newstrotteur.fr, modrogen.com, actualite-sante.fr, scienmag.com, Medicalxpress, Healthitanalytics.com, Princeton Magazine, Daily Emerald


May 2021: Serving on the PC of ICLR 2022.

May 2021: Our project about Federated Learning in the Wild has been funded by NJIT Seed Grant! This is a joint project with Cristian Borcea.

Apr 2021: Invited to serve on two NSF Panels. Joyful to serve!

Apr 2021: Phung's provisional patent on preserving privacy in natural language modeling has been approved. Congrats Phung! This is joint work with Adobe Inc.

Dec 2020: Serving on the PC of ICML 2021, KDD 2021, NeurIPS'21, ICDM'21, MedInfo'21.

Dec 2020: Excited to serve on an NSF Panel.

Nov 2020: Grateful to continue receiving funding support from Adobe Systems Inc.

Aug 2020: Excited to serve on an NSF Panel.

Aug 2020: Our project, "Secure and Scalable Federated Learning in the Wild during the COVID-19 Pandemic," has been funded by an NSF: IIS: EAGER grant. This is a collaborative project with Prof. Ruoming Jin from KSU.

Aug 2020: Serving on the PC of AAAI 2021 and WSDM 2021.

June 2020: Grateful to receive an award from the NSF: SaTC: Core for our project, "When Adversarial Learning Meets Differential Privacy: Theoretical Foundation and Applications," collaborating with Prof. My T. Thai from the University of Florida. Great thanks to the community for continuing to support our research!

May 2020: Our paper "Scalable Differential Privacy with Certified Robustness in Adversarial Learning" has been accepted to ICML'20. The paper established 1) the first connection between differential privacy in adversarial training and certified robustness in both input and latent spaces, and 2) scalable training that bypasses the iterative training process in the leading DP mechanisms. [Code has been released on GitHub]

May 2020: Grateful to continue receiving great support from Adobe Systems Inc. to push the boundary of "Differential Privacy in Natural Language Processing." Stay tuned for significant results to come out soon!

May 2020: Invited to serve on the PC of NeurIPS'20, IEEE BigData'20, and IEEE ICMLA'20.

May 2020: Our research on human behavior during and after the COVID-19 pandemic using federated machine learning was covered in NJIT Weekly.

May 2020: Gave an invited talk at Adobe Systems Inc. on "Deep Learning in the Wild with Certified Defenses."

March 2020: Our paper about "Ontology-based Interpretable Machine Learning for Textual Data" has been accepted for an Oral Presentation at the IEEE International Joint Conference on Neural Networks (IJCNN'2020). [Github] See you in Glasgow, Scotland, UK!

Feb 2020: Excited to announce a long-term partnership with Qualcomm Technologies Inc. to build a Federated Machine Learning Framework in the Wild.

Feb 2020: Excited to serve on an NSF Panel.

Jan 2020: Our paper "Limiting the Neighborhood: De-Small-World Network for Outbreak Prevention" has been selected as one of the "Best Papers" at CSoNet 2019.

Dec 2019: Invited to serve on the PC of IJCAI-PRICAI'20, ICDM'20, PAKDD'20.

Nov 2019: Honored to receive the "Best Service Award" at CSoNet 2019.

Nov 2019: Our class "IS 698: Emerging Topics in Deep Learning - Artificial Intelligence" will be available from Spring 2020. Please consider registering for the course, which will cover significant research and practice directions in DL - AI.

Sept 2019: Our paper about "Differentially Private Lifelong Learning" has been accepted for presentation at the Privacy in Machine Learning (PriML) Workshop at NeurIPS'19. See you at NeurIPS'19!

Sept 2019: Our paper about "Ontology-based Interpretable Machine Learning" has been accepted for an Oral Presentation at the Knowledge Representation & Reasoning Meets Machine Learning (KR2ML) Workshop at NeurIPS'19. See you at NeurIPS'19!

Sept 2019: Gave an Invited Talk about "Deep Learning in the Wild with Certified Defenses" at the Computer Science Department, Kent State University.

Sept 2019: Two papers accepted at Springer CSoNet 2019!

Aug 2019: Grateful to receive gift funding from Adobe Systems!

Aug 2019: Attended MedInfo 2019, presented our work on DrugTracker, and was grateful for the exciting questions! Went back to Montpellier and reconnected with Prof. Pascal Poncelet.

Aug 2019: Our paper "DrugTracker: A Community-focused Drug Abuse Monitoring and Supporting System using Social Media and Geospatial Data" has been accepted at the International Conference on Advances in Geographic Information Systems 2019 (ACM SIGSPATIAL 2019), Nov 5-8, 2019, Chicago.

May 2019: Invited to serve on the PC of AAAI 2020.

June 2019: We are organizing the "AI with Geospatial Information System for Social Good" Special Track at CSoNet 2019. Please consider submitting your work.

June 2019: Grateful to receive the IJCAI-AIJ Travel Grant 2019, and invited to serve as a Session Chair at IJCAI'19.

May 2019: Invited to serve on the PC of CIKM 2019, ICDM 2019, ICTAI 2019, FLAIRS 2020.

May 2019: Our paper "Heterogeneous Gaussian Mechanism: Preserving Differential Privacy in Deep Learning with Provable Robustness" has been accepted at the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019), August 10-16, 2019, Macao, China. (acceptance rate = 850/4752, 17.9%) [GitHub]

March 2019: Our paper "An Ensemble Deep Learning Model for Drug Abuse Detection in Sparse Twitter-Sphere" has been accepted at the 17th MedInfo 2019, Lyon, France (a leading venue in Medical and Health Informatics). In this work, we addressed the extremely sparse classification problem of detecting drug abuse risk behaviors in the Twitter-sphere.

March 2019: We are looking for a Ph.D. student, fully funded, to work on Privacy and Security in Deep Learning...

Feb 2019: NJIT has been named a "Very High Research Activity (R1)" institution by the Carnegie Classification.

Feb 2019: Our paper "Deep Self-Taught Learning for Detecting Drug Abuse Risk Behavior in Tweets" has been selected as one of the best papers at CSoNet 2018 and invited to a Special Issue of COSN (Computational Social Networks, Springer).

Dec 2018: Excited to serve on the PC of IJCAI 2019 and MedInfo 2019.

Oct 2018: Excited to serve on the PC of ICDCS 2019, PAKDD 2019, and ICDS 2019.

August 2018: Our paper, titled "Deep Self-Taught Learning for Detecting Drug Abuse Risk Behavior in Tweets," has been invited to The 7th International Conference on Computational Data & Social Networks (CSoNet 2018).

August 2018: Our paper, titled "Recursive Structure Similarity: A Novel Algorithm for Graph Clustering," has been accepted at The 30th IEEE ICTAI 2018.

May 2018: Our team visited Google Research in New York City to present our work on Morphological Neural Networks and Differentially Private Deep Learning.

May 2018: A dataset of tweets labeled for drug abuse risk behavior (5,000 tweets) has been released at: https://github.com/hu7han73/DrugAbuseLabeledTweets.

May 2018: A paper, titled "Deep Learning Model for Classifying Drug Abuse Risk Behavior in Tweets" has been accepted at IEEE International Conference on Healthcare Informatics (ICHI 2018).

May 2018: Invited to serve on the PC of CIKM 2018, ICDM 2018.

May 2018: A paper, titled "DPNE: Differentially Private Network Embedding" has been accepted at PAKDD 2018.

April 2018: Differentially Private Deep Learning Package Release.

Nov 2017: Selected as one of the Best Papers at IEEE ICDM 2017.

Nov 2017: I gave an invited talk (jointly with Prof. Dejing Dou) about: "Deep Learning and Privacy Preserving in a Health Social Network" at The 10th International Workshop on Privacy and Anonymity in the Information Society (PAIS), IEEE ICDM 2017.

Nov 2017: Invited to serve on the PC of IJCAI 2018 and KDD 2018.

Sept 2017: I will give an invited talk about "Differential Privacy Preservation in Deep Learning" at IEEE NJACS-2017.

Sept 2017: A paper, titled "Time-Sensitive Behavior Prediction in a Health Social Network" has been accepted at IEEE ICMLA 2017.

Aug 2017: Two papers have been accepted as regular papers at IEEE ICDM 2017 (acceptance rate: 9.25% = 72 / 778).

June 2017: Our article, titled "Preserving Differential Privacy in Convolutional Deep Belief Networks," has been accepted by Machine Learning 2017, ECML-PKDD Journal Track, IF 1.848. Arxiv version (acceptance rate: 13.5%)

June 2017: Invited to serve on the PC of ICDM 2017, KDIR 2017, and ADMA 2017.

June 2017: NSF funded our planning meeting to establish an I/UCRC Center for Big Learning (CBL) in 2017.

June 2017: I will give an invited talk about "Differential Privacy Preservation in Deep Learning" at United Technologies Research Center (UTRC) in August.

June 2017: I will give an invited talk about "Enabling Real-Time Drug Abuse Detection on Online Social Media" at Tumblr.com in July.

May 2017: Invited to serve as a Reviewer for IEEE Intelligent Systems.

April 2017: Invited to serve on the PC of ACM CIKM 2017 and IEEE ICTAI 2017.

April 2017: My tutorial on Deep Learning has reached 20,000 views on Slideshare.

Mar 2017: I gave an invited talk about "Differential Privacy Preservation in Deep Learning" at University of Arkansas.

Recent Projects

A Large-Scale Deep Learning-based Drug Abuse Detection on Online Social Media


IJCNN'20 | SIGSPATIAL'19 | MedInfo'19 | KR2ML'19 (NeurIPS) | CSoNet'19 (selected as best papers) | CSoNet'18 (selected as best papers) | ICHI'18 | ICTAI'18 | ICDE-17 | Media Coverage: NJSpotlight, NJ101.5, Medicalxpress

Every day, about 7,000 people are treated in emergency rooms for misuse of prescription drugs. Furthermore, every day about 44 people in the US die from overdoses of prescription painkillers. Opioid overdoses claimed more than 33,000 lives in 2015. Heroin and fentanyl deaths have risen sharply, by 23%, in one year, to 12,989. Deaths from synthetic opioids, including fentanyl, rose by 73% to 9,580. Untreated drug abuse cases can cause tremendous economic and societal burdens, posing serious public health challenges.

This project will develop a computational system to monitor, detect, predict and discover activities, events, or topical relationships, recurring patterns, and spatial/temporal/social trends in drug abuse, by analyzing large-scale social media and other online data.

Deep Private Auto-Encoder (dPA)

DeepPrivate - Differential Privacy Preservation in Deep Learning under Model Attacks

Xintao Wu (University of Arkansas), Dejing Dou (University of Oregon)

Grants: NSF CRII: SaTC, NSF I/UCRC for Big Learning, NJIT Seed Grant

ICML'20 | IJCAI-19 | PriML-19 (NeurIPS) | CSN-19 | MedInfo-19 | PAKDD-18 | ICDM-17 | Machine Learning 2017, ECML-PKDD'17 | AAAI-16 | Media Coverage: Princeton Magazine

Today, the remarkable development of deep learning in the medicine and healthcare domains raises obvious privacy issues when deep neural networks are built on patients’ personal and highly sensitive data, e.g., clinical records. However, no deep learning techniques have yet been developed that incorporate privacy protection against model attacks. Such lack of protection and efficacy may put patient data at high risk and expose health care providers to legal action based on HIPAA/HITECH law. This project will develop a mechanism, called "DeepPrivate," for privacy preservation in deep learning under model attacks.

A key thrust of the project is to better understand and defend against model inference attacks, including both well-known fundamental model attacks and novel attacks developed through the prism of the classical confidentiality and integrity models. Through an extensive analysis of these attacks, the team will develop an understanding of the relative risks of key aspects of learning approaches. In particular, vulnerable features, parameters, and correlations, which are essential to conduct model attacks, will be automatically identified and protected in a novel threat-aware privacy-preserving approach based on ideas from differential privacy.
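
As a rough illustration of the differential privacy idea mentioned above, the following minimal Python sketch releases a noisy mean using the classical Laplace mechanism. It is not the DeepPrivate mechanism; the function names (laplace_noise, private_mean), the clipping bounds, and the choice of a mean query are assumptions made only for this example.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng):
    """Return an epsilon-differentially-private estimate of the mean.

    Hypothetical helper for illustration. Each value is clipped to
    [lower, upper], so replacing one record changes the mean by at
    most (upper - lower) / n, which is the sensitivity of the query.
    """
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / n
    # Laplace mechanism: noise scale = sensitivity / epsilon.
    return sum(clipped) / n + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    ages = [34, 51, 29, 62, 47]  # made-up data
    print(private_mean(ages, lower=0, upper=100, epsilon=1.0, rng=rng))
```

A smaller epsilon gives stronger privacy but a noisier answer; the clipping bounds must be chosen without looking at the private data.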

Semantic Mining of Activity, Social, and Health data Project (SMASH)

Dejing Dou, Xiao Xiao, Hao Wang, Javid Ebrahimi (University of Oregon), Xintao Wu (University of Arkansas)

Brigitte Piniewski (PeaceHealth Labs), David Kil (HealthMantic)

Information Sciences - Elsevier'16 | AAAI-16 | ICMLA'16 | Digital Health'16 | CIKM'14 | SDM'15 | ASONAM'15 | ACM BCB'15 | IEEE Intelligent Systems'15 | KAIS'15 | ACM TIST'16 | SNAM'16 ...

Two thirds of the US population are now overweight or obese. This incurs significant health risks and financial costs to society. Traditionally, support groups and other social reinforcement approaches have been popular and effective in dealing with unhealthy behaviors, including those associated with being overweight. Of the factors associated with sustained weight loss, one of the most important is continued intervention with frequent social contacts. Research in the design and implementation of the SMASH (Semantic Mining of Activity, Social, and Health data) system will address a critical need for data mining tools to help understand the influence of healthcare social networks, such as YesiWell, on sustained weight loss where the data are multi-dimensional, temporal, semantically heterogeneous, and very sensitive.

Mining Object Movement Patterns from Trajectory Data (GeT_Move)

Dino Ienco (IRSTEA Montpellier)

Pascal Poncelet (University Montpellier 2)

Maguelonne Teisseire (IRSTEA Montpellier)

http://www.lirmm.fr/~phan/multimove.jsp | IJITDM 2016| ISI 2013 | PAKDD 2013 | ADMA 2012 | ACM GIS 2012 | IDA 2012 | BDA 2012 (Best paper) | ECML-PKDD 2012 | ICSDM 2011

Dataset Released and Code Released

As a part of the CNRS-LIRMM project, we built a unifying framework to extract multiple movement patterns from trajectory data. We have also applied different techniques, such as fuzzy logic and the MDL principle, to avoid redundant patterns. In addition, novel movement patterns have been introduced in this project. A demonstration system has been developed that allows end users to interact with the system in real time.

User Similarity Ranking and User Clustering Algorithms based on Social Network Analysis

Van Duc Thong Hoang (University of Gent, Belgium), Hyoseop Shin (Konkuk University, South Korea)

ACM MM 2010 | APWeb 2010 | EDB 2010

Dataset Released

Most clustering algorithms are not effective on dense and concentrated graphs that do not have any meaningful cut points. To address this problem, we first propose a graph transformation to separate large-scale online communities into two different types of meaningful subgraphs. The first subgraph is the intimacy graph and the second is the reputation graph. Then, we present effective algorithms for discovering good sub-communities and for excluding incompatible users in these subgraphs.

In this research, we also propose an adaptive combination scheme of tag-based similarity and link-based similarity in which the weight factors are dynamically determined for each user by evaluating each user’s characteristics such as tag commonness and link strength. The experimental results with a Flickr data set show that the proposed scheme consistently outperforms the previous work by about 20%.
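
The per-user weighting idea can be sketched as follows. This is only an illustrative Python sketch under stated assumptions, not the published scheme: the Jaccard measure, the two per-user scores (tag_score, link_score), and the normalization used to derive the weight are all hypothetical choices made for this example.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def adaptive_similarity(tags_u, tags_v, links_u, links_v,
                        tag_score, link_score):
    """Blend tag-based and link-based similarity with a per-user weight.

    tag_score and link_score are hypothetical non-negative numbers that
    summarize how informative user u's tags and links are; the tag
    weight is their normalized ratio, so it differs from user to user.
    """
    total = tag_score + link_score
    w_tag = tag_score / total if total > 0 else 0.5  # fall back to equal weights
    tag_sim = jaccard(tags_u, tags_v)
    link_sim = jaccard(links_u, links_v)
    return w_tag * tag_sim + (1.0 - w_tag) * link_sim


if __name__ == "__main__":
    # Made-up users: u relies more on links than on tags.
    sim = adaptive_similarity(
        tags_u={"sunset", "beach"}, tags_v={"beach", "city"},
        links_u={1, 2}, links_v={1, 2},
        tag_score=1.0, link_score=3.0,
    )
    print(sim)
```

The point of the design is that a single global weight is replaced by one computed from each user's own profile, so users with sparse or generic tags lean on link structure and vice versa.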