The Artificial Intelligence for Social Good Lab
NJIT Media Center, Newark, NJ 07104, USA 
(GITC #5115)
Phone: 973-596-6367
Open Positions
Dr. Phan is an assistant professor at NJIT. His research interests mainly concern privacy and security, social network analysis, machine learning, and spatio-temporal data mining. In particular, Dr. Phan is interested in understanding the nature of human behavior and human expression on health social networks and social media. Dr. Phan received his PhD in Computer Science from the University of Montpellier 2 in October 2013. He was a part of the Knowledge Discovery in Databases Team (TaToo) in the LIRMM laboratory (supervised by Prof. Pascal Poncelet, Mme. Maguelonne Teisseire, and Dr. Dino Ienco). Dr. Phan joined the University of Oregon as a Research Associate in fall 2013 and worked with Prof. Dejing Dou.

Han Hu (PhD) 2016 - present
Phung Lai (PhD) 2018 - present
Anuja Badeti (Honor Undergraduate) 2018 - present

"Everything should be made as simple as possible, but not simpler." (Albert Einstein)

Oct 2018: Excited to serve on the PC of ICDCS 2019, PAKDD 2019, and ICDS 2019.
May 2018: Our team visited Google Research in New York City to present our work on Morphological Neural Networks and Differentially Private Deep Learning.
May 2018: A dataset of 5,000 tweets labeled for drug abuse risk behavior has been released at: https://github.com/hu7han73/DrugAbuseLabeledTweets
May 2018: Invited to serve on the PC of CIKM 2018 and ICDM 2018.
May 2018: A paper, titled "DPNE: Differentially Private Network Embedding" has been accepted at PAKDD 2018
Nov 2017: Selected as one of the Best Papers of IEEE ICDM 2017.
Nov 2017: I gave an invited talk (jointly with Prof. Dejing Dou) titled "Deep Learning and Privacy Preserving in a Health Social Network" at The 10th International Workshop on Privacy and Anonymity in the Information Society (PAIS), IEEE ICDM 2017.
Nov 2017: Invited to serve on the PC of IJCAI 2018 and KDD 2018.
Sept 2017: I will give an invited talk about "Differential Privacy Preservation in Deep Learning" at IEEE NJACS-2017.
Sept 2017: A paper, titled "Time-Sensitive Behavior Prediction in a Health Social Network" has been accepted at IEEE ICMLA 2017
Aug 2017: Two papers have been accepted as regular papers at IEEE ICDM 2017 (acceptance rate: 9.25% = 72 / 778). 
June 2017: Our article, titled "Preserving Differential Privacy in Convolutional Deep Belief Networks," has been accepted by Machine Learning 2017, ECML-PKDD Journal Track (IF 1.848, acceptance rate: 13.5%). An arXiv version is available.
June 2017: Invited to serve on the PC of ICDM 2017, KDIR 2017, and ADMA 2017.
June 2017: NSF funded our planning meeting to establish an I/UCRC Center for Big Learning (CBL) in 2017.
June 2017: I will give an invited talk about "Differential Privacy Preservation in Deep Learning" at United Technologies Research Center (UTRC) in August.
June 2017: I will give an invited talk about "Enabling Real-Time Drug Abuse Detection on Online Social Media" at Tumblr.com in July.
May 2017: Invited to serve as a Reviewer for IEEE Intelligent Systems.
April 2017: Invited to serve on the PC of ACM CIKM 2017 and IEEE ICTAI 2017.
April 2017: My tutorial on Deep Learning has reached 20,000 views on Slideshare.
Mar 2017: I gave an invited talk about "Differential Privacy Preservation in Deep Learning" at the University of Arkansas.

Recent Projects

A Large-Scale Deep Learning-based Drug Abuse Detection on Online Social Media 

CSoNet 2018 | ICHI 2018 | ICTAI 2018 | ICDE-17 | ...

Every day, about 7,000 people are treated in emergency rooms for misuse of prescription drugs. Furthermore, every day about 44 people in the US die from overdoses of prescription painkillers. Opioid overdoses claimed more than 33,000 lives in 2015. Heroin and fentanyl deaths have risen sharply, by 23%, in one year, to 12,989. Deaths from synthetic opioids, including fentanyl, rose by 73% to 9,580.  Untreated drug abuse cases can cause tremendous economic and societal burdens, posing serious public health challenges.
This project will develop a computational system to monitor, detect, predict and discover activities, events, or topical relationships, recurring patterns, and spatial/temporal/social trends in drug abuse, by analyzing large-scale social media and other online data.  

Deep Private Auto-Encoder (dPA)
DeepPrivate - Differential Privacy Preservation in Deep Learning under Model Attacks
Xintao Wu (University of Arkansas)
Dejing Dou (University of Oregon)

Grant: NSF I/UCRC for Big Learning, NJIT Seed Grant
PAKDD-18 | ICDM-17 | Machine Learning 2017, ECML-PKDD'17 | AAAI-16 | ...

Today, the remarkable development of deep learning in the medicine and healthcare domains presents obvious privacy issues when deep neural networks are built on patients’ personal and highly sensitive data, e.g., clinical records. However, no deep learning techniques have yet been developed that incorporate privacy protection against model attacks. This lack of protection and efficacy may put patient data at high risk and expose health care providers to legal action based on HIPAA/HITECH law. This project will develop a mechanism, called "DeepPrivate," for privacy preservation in deep learning under model attacks.

Semantic Mining of Activity, Social, and Health data Project (SMASH)
Dejing Dou, Xiao Xiao, Hao Wang, Javid Ebrahimi (University of Oregon), Xintao Wu (University of Arkansas)
Brigitte Piniewski (PeaceHealth Labs), David Kil (

Information Sciences - Elsevier'16 | AAAI-16 | ICMLA'16 | Digital Health'16 | CIKM'14 | SDM'15 | ASONAM'15 | ACM BCB'15 | IEEE Intelligent Systems'15 | KAIS'15 | ACM TIST'16 | SNAM'16 ...

Two thirds of the US population are now overweight or obese. This incurs significant health risks and financial costs to society. Traditionally, support groups and other social reinforcement approaches have been popular and effective in dealing with unhealthy behaviors, including being overweight. Of the factors associated with sustained weight loss, one of the most important is continued intervention with frequent social contacts. Research in the design and implementation of the SMASH (Semantic Mining of Activity, Social, and Health data) system will address a critical need for data mining tools that help in understanding the influence of healthcare social networks, such as YesiWell, on sustained weight loss, where the data are multi-dimensional, temporal, semantically heterogeneous, and very sensitive.

Mining Object Movement Patterns from Trajectory Data (GeT_Move)
Dino Ienco (IRSTEA Montpellier)
Pascal Poncelet (University Montpellier 2)
Maguelonne Teisseire (IRSTEA Montpellier)

http://www.lirmm.fr/~phan/multimove.jsp | IJITDM 2016 | ISI 2013 | PAKDD 2013 | ADMA 2012 | ACM GIS 2012 | IDA 2012 | BDA 2012 (Best paper) | ECML-PKDD 2012 | ICSDM 2011
Dataset Released

As a part of the CNRS-LIRMM project, we built a unifying framework to extract multiple movement patterns from trajectory data. We have also applied different techniques, such as fuzzy logic and the MDL principle, to avoid redundant patterns. In addition, novel movement patterns have been introduced in this project. A demonstration system has been developed that lets end users interact with the system in real time.

User Similarity Ranking and User Clustering Algorithms based on Social Network Analysis
VanDucThong Hoang (University of Gent, Belgium)
Hyoseop Shin (Konkuk University, Seoul, South Korea)

ACM MM 2010 | APWeb 2010 | EDB 2010
Dataset Released

Most clustering algorithms are not effective on dense and concentrated graphs that do not have any meaningful cut points. To address this problem, we first propose a graph transformation that separates large-scale online communities into two different types of meaningful subgraphs. The first subgraph is the intimacy graph and the second is the reputation graph. Then, we present effective algorithms for discovering good sub-communities and for excluding incompatible users in these subgraphs.
In this research, we also propose an adaptive combination scheme of tag-based similarity and link-based similarity in which the weight factors are dynamically determined for each user by evaluating each user’s characteristics, such as tag commonness and link strength. The experimental results with a Flickr data set show that the proposed scheme consistently outperforms the previous work by about 20%.
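
The idea of a per-user adaptive combination of tag-based and link-based similarity can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the published method: the `User` structure, the use of Jaccard overlap for both similarities, the definitions of `tag_commonness` and `link_strength`, and the weighting rule in `tag_weight` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    # Hypothetical representation: a set of tags and a map of
    # neighbor name -> link strength in [0, 1].
    tags: set = field(default_factory=set)
    links: dict = field(default_factory=dict)


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def tag_commonness(user, tag_share):
    """Mean share of all users who use each of this user's tags (assumed measure)."""
    if not user.tags:
        return 1.0
    return sum(tag_share.get(t, 0.0) for t in user.tags) / len(user.tags)


def link_strength(user):
    """Mean strength of this user's links (assumed measure)."""
    if not user.links:
        return 0.0
    return sum(user.links.values()) / len(user.links)


def tag_weight(user, tag_share):
    """Assumed rule: rare tags raise the tag weight, strong links lower it."""
    distinctiveness = 1.0 - tag_commonness(user, tag_share)
    strength = link_strength(user)
    total = distinctiveness + strength
    return distinctiveness / total if total > 0 else 0.5


def combined_similarity(u, v, tag_share):
    """Blend tag and link similarity using a weight computed for user u."""
    w = tag_weight(u, tag_share)
    tag_sim = jaccard(u.tags, v.tags)
    link_sim = jaccard(set(u.links), set(v.links))
    return w * tag_sim + (1.0 - w) * link_sim
```

In this sketch the weight depends only on the first user, so the score is not symmetric; a ranking for a given user would call `combined_similarity(u, candidate, tag_share)` for each candidate and sort by the result.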