Dr. Phan is an assistant professor at NJIT. His research interests mainly concern privacy and security, social network analysis, machine learning, and spatio-temporal data mining. In particular, he is interested in understanding the nature of human behavior and human expression on health social networks and social media. Dr. Phan received his PhD in Computer Science from the University of Montpellier 2 in October 2013. He was part of the Knowledge Discovery in Databases Team (TaToo) in the LIRMM laboratory (supervised by Prof. Pascal Poncelet, Mme. Maguelonne Teisseire, and Dr. Dino Ienco). He joined the University of Oregon as a Research Associate in fall 2013 and worked with Prof. Dejing Dou.
We are looking for a PhD student with a strong background in math and machine learning! Please send your CV to phan at njit.edu.
Han Hu (PhD) 2016 - present
Phung Lai (PhD) 2018 - present
Anuja Badeti (Honors Undergraduate) 2018 - present
"Everything should be made as simple as possible, but not simpler." (Albert Einstein)
May 2019: Our paper "Heterogeneous Gaussian Mechanism: Preserving Differential Privacy in Deep Learning with Provable Robustness" has been accepted at the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019), August 10-16, 2019, Macao, China. (acceptance rate = 850/4752, 17.9%)
March 2019: Our paper "An Ensemble Deep Learning Model for Drug Abuse Detection in Sparse Twitter-Sphere" has been accepted at the 17th MedInfo 2019, Lyon, France (a leading venue in medical and health informatics). In this work, we addressed the extremely sparse classification problem of detecting drug abuse risk behaviors in the Twitter-sphere.
March 2019: We are looking for a PhD student, fully funded, to work on Privacy and Security in Deep Learning...
Feb 2019: NJIT has been named a "Very High Research Activity (R1)" institution by the Carnegie Classification.
Feb 2019: Our paper "Deep Self-Taught Learning for Detecting Drug Abuse Risk Behavior in Tweets" has been selected as one of the best papers at CSoNet 2018 and invited to a Special Issue of COSN (Computational Social Networks, Springer).
August 2018: Our paper, titled "Deep Self-Taught Learning for Detecting Drug Abuse Risk Behavior in Tweets," has been invited to The 7th International Conference on Computational Data & Social Networks (CSoNet 2018).
August 2018: Our paper, titled "Recursive Structure Similarity: A Novel Algorithm for Graph Clustering," has been accepted at The 30th IEEE ICTAI 2018.
May 2018: Our team visited Google Research in New York City to present our work on Morphological Neural Networks and Differentially Private Deep Learning.
May 2018: A dataset of 5,000 tweets labeled for drug abuse risk behavior has been released at: https://github.com/hu7han73/DrugAbuseLabeledTweets.
May 2018: A paper, titled "Deep Learning Model for Classifying Drug Abuse Risk Behavior in Tweets," has been accepted at the IEEE International Conference on Healthcare Informatics (ICHI 2018).
Nov 2017: Selected as one of the Best Papers at IEEE ICDM 2017.
Nov 2017: I gave an invited talk (jointly with Prof. Dejing Dou) on "Deep Learning and Privacy Preserving in a Health Social Network" at the 10th International Workshop on Privacy and Anonymity in the Information Society (PAIS), IEEE ICDM 2017.
Sept 2017: I will give an invited talk about "Differential Privacy Preservation in Deep Learning" at IEEE NJACS-2017.
Sept 2017: A paper, titled "Time-Sensitive Behavior Prediction in a Health Social Network" has been accepted at IEEE ICMLA 2017.
Aug 2017: Two papers have been accepted as regular papers at IEEE ICDM 2017 (acceptance rate: 9.25% = 72 / 778).
June 2017: Our article, titled "Preserving Differential Privacy in Convolutional Deep Belief Networks," has been accepted by Machine Learning 2017, ECML-PKDD Journal Track (IF 1.848; acceptance rate: 13.5%). An arXiv version is available.
June 2017: NSF funded our planning meeting to establish an I/UCRC Center for Big Learning (CBL) in 2017.
June 2017: I will give an invited talk about "Differential Privacy Preservation in Deep Learning" at United Technologies Research Center (UTRC) in August.
June 2017: I will give an invited talk about "Enabling Real-Time Drug Abuse Detection on Online Social Media" at Tumblr.com in July.
May 2017: Invited to serve as a Reviewer for IEEE Intelligent Systems.
April 2017: My tutorial on Deep Learning has reached 20,000 views on Slideshare.
Mar 2017: I gave an invited talk about "Differential Privacy Preservation in Deep Learning" at the University of Arkansas.
A Large-Scale Deep Learning-based Drug Abuse Detection on Online Social Media
NJIT, CUNY, NYU, and VCU
MedInfo'19 | CSoNet'18 (selected as best papers) | ICHI'18 | ICTAI'18 | ICDE-17 | ...
Every day, about 7,000 people are treated in emergency rooms for misuse of prescription drugs. Furthermore, every day about 44 people in the US die from overdoses of prescription painkillers. Opioid overdoses claimed more than 33,000 lives in 2015. Heroin and fentanyl deaths have risen sharply, by 23%, in one year, to 12,989. Deaths from synthetic opioids, including fentanyl, rose by 73% to 9,580. Untreated drug abuse cases can cause tremendous economic and societal burdens, posing serious public health challenges.
This project will develop a computational system to monitor, detect, predict and discover activities, events, or topical relationships, recurring patterns, and spatial/temporal/social trends in drug abuse, by analyzing large-scale social media and other online data.
DeepPrivate - Differential Privacy Preservation in Deep Learning under Model Attacks
Xintao Wu (University of Arkansas), Dejing Dou (University of Oregon)
IJCAI-19 | CSN-19 | MedInfo-19 | PAKDD-18 | ICDM-17 | Machine Learning 2017, ECML-PKDD'17 | AAAI-16 | ...
Today, the remarkable development of deep learning in the medicine and healthcare domains presents obvious privacy issues when deep neural networks are built on patients’ personal and highly sensitive data, e.g., clinical records. However, no deep learning techniques have yet been developed that incorporate privacy protection against model attacks. This lack of protection and efficacy may put patient data at high risk and expose health care providers to legal action under HIPAA/HITECH law. This project will develop a mechanism, called "DeepPrivate," for privacy preservation in deep learning under model attacks.
A key thrust of the project is to better understand and defend against model inference attacks, including both well-known fundamental model attacks and novel attacks developed through the prism of the classical confidentiality and integrity models. Through an extensive analysis of these attacks, the team will develop an understanding of the relative risks of key aspects of learning approaches. In particular, vulnerable features, parameters, and correlations, which are essential to conducting model attacks, will be automatically identified and protected in a novel threat-aware, privacy-preserving approach based on ideas from differential privacy.
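For readers unfamiliar with differential privacy, the sketch below illustrates the general idea of protecting a vector of values by adding calibrated noise. It uses the textbook Gaussian mechanism, not the DeepPrivate mechanism itself, whose details are not given here. The function names `gaussian_sigma` and `privatize` are hypothetical, and the noise formula assumes an L2 sensitivity bound and a privacy budget with 0 < epsilon < 1.

```python
import math
import random


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism.

    Assumes `sensitivity` is an L2 sensitivity bound and 0 < epsilon < 1,
    the range in which this textbook calibration is known to hold.
    """
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("requires 0 < epsilon < 1 and 0 < delta < 1")
    return math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon


def privatize(values, sensitivity, epsilon, delta, rng=None):
    """Return a noisy copy of `values` (hypothetical helper for illustration)."""
    rng = rng or random.Random()
    sigma = gaussian_sigma(sensitivity, epsilon, delta)
    # Independent zero-mean Gaussian noise is added to every coordinate.
    return [v + rng.gauss(0.0, sigma) for v in values]


if __name__ == "__main__":
    # Example: perturb a small vector under an (epsilon, delta) budget.
    noisy = privatize([0.2, -1.3, 0.7], sensitivity=1.0, epsilon=0.5, delta=1e-5)
    print(noisy)
```

A smaller epsilon or delta yields a larger noise scale, which is the usual trade-off between privacy and utility.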
Semantic Mining of Activity, Social, and Health data Project (SMASH)
Dejing Dou, Xiao Xiao, Hao Wang, Javid Ebrahimi (University of Oregon), Xintao Wu (University of Arkansas)
Brigitte Piniewski (PeaceHealth Labs), David Kil (HealthMantic)
Information Sciences - Elsevier'16 | AAAI-16 | ICMLA'16 | Digital Health'16 | CIKM'14 | SDM'15 | ASONAM'15 | ACM BCB'15 | IEEE Intelligent Systems'15 | KAIS'15 | ACM TIST'16 | SNAM'16 ...
Two-thirds of the US population is now overweight or obese. This incurs significant health risks and financial costs to society. Traditionally, support groups and other social reinforcement approaches have been popular and effective in dealing with unhealthy behaviors, including those that lead to being overweight. Of the factors associated with sustained weight loss, one of the most important is continued intervention with frequent social contacts. Research in the design and implementation of the SMASH (Semantic Mining of Activity, Social, and Health data) system will address a critical need for data mining tools that help us understand the influence of healthcare social networks, such as YesiWell, on sustained weight loss, where the data are multi-dimensional, temporal, semantically heterogeneous, and very sensitive.
Mining Object Movement Patterns from Trajectory Data (GeT_Move)
Dino Ienco (IRSTEA Montpellier)
Pascal Poncelet (University Montpellier 2)
Maguelonne Teisseire (IRSTEA Montpellier)
http://www.lirmm.fr/~phan/multimove.jsp | IJITDM 2016 | ISI 2013 | PAKDD 2013 | ADMA 2012 | ACM GIS 2012 | IDA 2012 | BDA 2012 (Best paper) | ECML-PKDD 2012 | ICSDM 2011
As part of the CNRS-LIRMM project, we built a unifying framework to extract multiple movement patterns from trajectory data. We also applied techniques such as fuzzy logic and the MDL principle to avoid redundant patterns. In addition, novel movement patterns were introduced in this project. A demonstration system has been developed that allows end users to interact with the system in real time.
User Similarity Ranking and User Clustering Algorithms based on Social Network Analysis
Van Duc Thong Hoang (University of Gent, Belgium), Hyoseop Shin (Konkuk University, South Korea)
ACM MM 2010 | APWeb 2010 | EDB 2010
Most clustering algorithms are not effective on dense and concentrated graphs that do not have any meaningful cut points. To address this problem, we first propose a graph transformation that separates large-scale online communities into two types of meaningful subgraphs: the intimacy graph and the reputation graph. We then present effective algorithms for discovering good sub-communities and for excluding incompatible users in these subgraphs.
In this research, we also propose an adaptive combination scheme of tag-based similarity and link-based similarity in which the weight factors are dynamically determined for each user by evaluating each user’s characteristics such as tag commonness and link strength. The experimental results with a Flickr data set show that the proposed scheme consistently outperforms the previous work by about 20%.
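The adaptive combination described above can be sketched as a per-user weighted sum of the two similarity scores. This is only an illustration under stated assumptions: the original weighting formula is not given here, so the Jaccard similarity, the `adaptive_weight` rule, and the `tag_signal` and `link_signal` inputs (hypothetical per-user scores for how informative a user's tags and links are) are all stand-ins rather than the published method.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def adaptive_weight(tag_signal, link_signal):
    """Hypothetical per-user weight on the tag-based score, in [0, 1].

    Both inputs are assumed to be non-negative; when neither signal is
    available the two similarities are weighted equally.
    """
    total = tag_signal + link_signal
    return tag_signal / total if total > 0 else 0.5


def combined_similarity(tags_u, tags_v, links_u, links_v, tag_signal, link_signal):
    """Blend tag-based and link-based similarity with a per-user weight."""
    w = adaptive_weight(tag_signal, link_signal)
    tag_sim = jaccard(tags_u, tags_v)      # overlap of the users' tags
    link_sim = jaccard(links_u, links_v)   # overlap of the users' contacts
    return w * tag_sim + (1.0 - w) * link_sim


if __name__ == "__main__":
    score = combined_similarity(
        tags_u={"sunset", "beach"}, tags_v={"beach", "city"},
        links_u={1, 2, 3}, links_v={2, 3, 4},
        tag_signal=1.0, link_signal=3.0,
    )
    print(score)
```

Because the weight is computed per user, a user with sparse or generic tags can lean on link structure, while a well-tagged but poorly connected user can lean on tags.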