Home

Associate Professor, New Jersey Institute of Technology

Director of AI4Good Lab, Associate Chair, Ph.D. Program Director and Founding Member of the Data Science Department

GITC #2105, NJIT Media Center, Newark, NJ 07104, USA 

Email: phan@njit.edu, Phone: 973-596-6367 | Open Positions

Google Scholar|Research Gate

Dr. Phan is an Associate Professor at NJIT. Dr. Phan's research focuses mainly on privacy and security, social network analysis, machine learning, and spatio-temporal data mining. In particular, Dr. Phan is interested in understanding the nature of human behavior and human expression on health social networks and social media. At NJIT, Phan has established expertise in privacy and security, ML, and health informatics, with over 65 publications. Many of them were published at leading venues, including ACM CCS, IEEE S&P, ICML, ECML, AAAI, IJCAI, IEEE ICDM, IEEE PerCom, AISTATS, IEEE BigData, ACM SIGSPATIAL GIS, and ACM Multimedia, with an AAAI 2023 Distinguished Paper Award and several papers selected as best papers at IEEE SDS'22, IEEE ICDM’17, Springer CSoNet’19, Springer CSoNet’18, ACM BCB’15, and IEEE/ACM ASONAM’15. Phan’s research has been generously supported by NSF and (long-term) industry partners, including Qualcomm Technologies Inc., Adobe Systems Inc., and Wells Fargo.

We are hiring postdocs and PhD students with strong math and machine learning backgrounds for Spring/Fall 2024! Please send your CV to phan at njit.edu.

Students:

Apurv Verma (PhD) 2024 - present

Khiem Ton (PhD) 2024 - present

Khoa Nguyen (PhD) 2023 - present

Khang Tran (PhD) 2020 - present

Andrew Dennis (PhD) 2020 - present (Bloomberg)

Hang Nguyen (Honor Undergraduate) 2019 - present

Simran Kaur (Honor Undergraduate) 2021 - present

Huong Ly Ngo (Master in Data Science) 2021 - present

Alumni:

Phung Lai (PhD) 2018 - 2023, Tenure-track Assistant Professor, University at Albany, SUNY

Pelin Ayranci (PhD) 2020 - 2022, Technical Product Manager, PayPal

Han Hu (PhD) 2016 - 2020

Pradnya Desai (Honor Undergraduate) 2019 - 2022, Microsoft

Anuja Badeti (Honor Undergraduate) 2018 - 2022, Bloomberg

Media Coverage: YWCC News, YWCC News (2), NJIT Weekly, Vox.com, NJSpotlight, NJ101.5, newstrotteur.fr, modrogen.com, actualite-sante.fr, scienmag.com, Medicalxpress, Healthitanalytics.com, Princeton Magazine, Daily Emerald

Dec 2024: Invited to review for NSF.

Oct 2024: Our paper, "An Analysis of the Prevalence and Trends in Drug-Related Lyrics on Twitter (X): A Quantitative Approach," has been accepted for publication in the Journal of Medical Internet Research (JMIR)

Oct 2024: Invited to serve on an NSF panel.

Sept 2024: Invited Panel Speaker on "ML (for) Supply Chain Security: Promises, Pitfalls and Opportunities" at the ACM SCORED'24 workshop at ACM CCS'24, the flagship venue in Cybersecurity and Privacy.

August 2024: Our demonstration system, "SGCode: A Flexible Prompt-Optimizing System for Secure Generation of Code," has been accepted to ACM CCS'24, the flagship venue in Cybersecurity and Privacy. You can access the web-based SGCode system here: https://sgcode.codes

July 2024: Our paper "PromSec: Prompt Optimization for Secure Generation of Functional Source Code with Large Language Models (LLMs)" has been accepted to ACM CCS'24, the flagship venue in Cybersecurity and Privacy. We show that we can generate secure software code at low cost, outperforming existing work.

June 2024: Invited to serve on NSF Panels.

April 2024: Our project "XCopilot: Private Code Generating with Large Language Models" in collaboration with Microsoft has been partially funded by an NSF Accelerating Research Translation (ART) grant and NJIT Collaborative Research and Innovation Strategic Partnership (CRISP) investment plan through the NJIT Center for Translational Research (CTR).

March 2024: Our Handbook of Trustworthy Federated Learning is out. The first-of-its-kind book focuses on providing insights into trustworthy federated learning. Part of the book series: Springer Optimization and Its Applications (SOIA, volume 213).

Feb 2024: Our patent application, "Secure Prompt," which fixes vulnerabilities in software code generated by LLMs in a cost-effective and optimized manner, has been accepted as a US provisional patent, and we are working on a US patent filing.

Nov 2023: Exciting news! We will work with Qualcomm Inc. to bring the next generation of Generative AI to mobile devices in a secure, private, and scalable manner

Nov 2023: Invited talk "Effects of Generative AI and Opportunities" at the Newark Regional Business Partnership - Regional Economic Outlook, Nov 8, 2023. 

Nov 2023: Our team won a QRDI Award of ~$740k without overhead (~$1M+ with overhead) to develop Trustworthy Federated Learning for Deployable Distributed Generative AI (award rate: 18 / 500+, 3.6%).

Nov 2023: Invited to serve on an NSF panel.

August 2023: Our paper "Differential Privacy in HyperNetworks for Personalized Federated Learning" has been accepted to the 32nd ACM International Conference on Information and Knowledge Management (ACM CIKM 2023)

July 2023: Our paper "Multi-Instance Adversarial Attack on GNN-Based Malicious Domain Detection" has been accepted to the 45th IEEE Symposium on Security and Privacy (IEEE S&P 2024).

Jun 2023: Invited to serve on an NSF panel.

May 2023: Our paper "Exploring COVID-19’s Impact on Mental Health: A Longitudinal and Thematic Analysis of Reddit Users’ Discourse" has been accepted in the Journal of Medical Internet Research (JMIR). IF: 7.08.

Feb 2023: Our paper "XRand: Differentially Private Defense against Explanation-Guided Attacks" received one of the twelve AAAI 2023 Distinguished Paper Awards. The AAAI Conference Paper Awards and Recognition honor papers that exemplify the highest standards in technical contribution and exposition.

Jan 2023: Our paper "Active Membership Inference Attack under Local Differential Privacy in Federated Learning" has been accepted to the 26th AISTATS 2023. This paper shows that passively applying Local Differential Privacy (LDP) is vulnerable to Active Membership Inference Attacks.

Jan 2023: Our paper "Un-Fair Trojan: Targeted Backdoor Attacks Against Model Fairness" has won the Best Paper Award at the 9th IEEE International Conference on Software Defined Systems (IEEE SDS-2022). In this paper, we develop Un-Fair Trojan to severely damage model fairness while remaining stealthy in Federated Learning.

Dec 2022: Our paper "Zone-based Federated Learning for Mobile Sensing Data" has been accepted at the IEEE PerCom 2023, a leading venue in mobile computing. We propose Zone-based Federated Learning (ZoneFL) to simultaneously achieve good model accuracy while adapting to user mobility behavior, scale well as the number of users increases, and protect user data privacy.

Dec 2022: Serving as a Senior PC member for IJCAI 2023, PC member for KDD 2023, and reviewer for ICML 2023.

Dec 2022: Gave an invited lecture on  "Privacy in Machine Learning: Policies, Societal Concerns, and Challenging Gaps" at the University of Alberta.

Nov 2022: Our paper "XRand: Differentially Private Defense against Explanation-Guided Attacks" has been accepted at AAAI 2023. In this paper, we introduce a new concept of achieving local differential privacy (LDP) in the explanations, and from that, we establish a defense, called XRand, against explanation-guided attacks.

Nov 2022: Our paper "FLSys: Toward an Open Ecosystem for Federated Learning Mobile Apps" has been accepted in IEEE Transactions on Mobile Computing. We developed a scalable system to open an ecosystem of FL on mobile apps. This is joint work among NJIT, Kent State University, Qualcomm, and other industrial partners.

Oct 2022: Invited to serve on an NSF Panel.

Oct 2022: Two papers accepted at IEEE BigData'22.

August 2022: I gave an invited talk on "Privacy in Machine Learning: Policies, Societal Concerns, and Challenging Gaps" at Qualcomm, San Diego.

July 2022: Our paper "Lifelong DP: Consistently Bounded Differential Privacy in Lifelong Machine Learning" has been accepted to the Conference on Lifelong Learning Agents - CoLLAs 2022 and will be published in the Proceedings of Machine Learning Research (PMLR).

March 2022: Our paper "Social and Motivational Factors for the Spread of Physical Activities in a Health Social Network" has been invited to a fast-track of the IEEE Transactions on Network Science and Engineering (TNSE), IF: 5.213.

March 2022:  We have launched our field trial for our FL system on mobile devices. If you are interested in this opportunity, please sign up here: https://docs.google.com/forms/d/1oAkQT8apIdHC2Cbo2804UZu2PdKu-knlUpxytpxn6gE/edit

Jan 2022: Pradnya Desai (an undergraduate student from the CS Department on my research team) has been selected as one of the 80 finalists for the 2022 National Center for Women & Information Technology (NCWIT) Collegiate Award. NCWIT is a non-profit community of over 1,500 universities, companies, non-profits, and government organizations nationwide working to increase the influential and meaningful participation of girls and women in the field of computing. This recognition stems from our research work on "Continual Learning with Differential Privacy," in which Pradnya Desai is the lead author.

Dec 2021: Our article "Ontology-based Interpretable Machine Learning: A Comprehensive Study" has been accepted with minor revision by the Journal of Combinatorial Optimization - Springer.

Nov 2021: Excited to serve on an NSF Panel.

Nov 2021: We are happy to announce our Ecosystem for Federated Learning on Mobile Applications. Check out our paper "FLSys: Toward an Open Ecosystem for Federated Learning Mobile Apps." This is joint work among NJIT, Kent State University, Qualcomm, and associated industrial partners. If you are interested in collaborating with us on this journey, please feel free to drop me an email.

Oct 2021: Two regular papers about Trustworthy and Secure AI, including a synergic poisoning attack on deep neural networks and a novel metric, c-eval, to evaluate Explainable AI models, were accepted at the IEEE International Conference on Big Data (IEEE BigData'21). Acceptance Rate: 97 / 486.

Oct 2021: Our great Han Hu has successfully defended his Ph.D. dissertation, titled "Private and Federated Deep Learning: System, Theory, and Applications for Social Good." Big congratulations to Dr. Hu!!!

Sept 2021: Our Honor Undergraduate Student, Pradnya Desai, has a paper, titled "Continual Learning with Differential Privacy," accepted at the 28th International Conference on Neural Information Processing (ICONIP2021) (Rank A, CORE2020) with an oral presentation. This paper establishes the first formal connection between Differential Privacy and Continual Learning.  

Sept 2021: We are preparing to launch the first Data Science Challenge for all NJIT students! An official announcement is coming soon.

Sept 2021: Invited paper to Springer CSoNet'21. NhatHai Phan, David Kil, Brigitte Piniewski, and Dejing Dou. "Social and Motivational Factors for the Spread of Physical Activities in a Health Social Network."

May 2021: Serving on the PCs of ICLR'22, AAAI'22, and WSDM'22.

May 2021: Our project about Federated Learning in the Wild has been funded by NJIT Seed Grant! This is a joint project with Cristian Borcea.

Apr 2021: Invited to serve on two NSF Panels. Joyful to serve!

Apr 2021: Phung's provisional patent on preserving privacy in natural language modeling has been approved. Congrats Phung! This is joint work with Adobe Inc.

Dec 2020: Serving on the PCs of ICML 2021, KDD 2021, NeurIPS'21, ICDM'21, and MedInfo'21.

Dec 2020: Excited to serve on an NSF Panel.

Nov 2020: Grateful to continue receiving funding support from Adobe Systems Inc.

Aug 2020: Excited to serve on an NSF Panel.

Aug 2020: Our Secure and Scalable Federated Learning in the Wild during the COVID-19 Pandemic project has been funded by NSF: IIS: EAGER grant. This is a collaborative project with Prof. Ruoming Jin from KSU.

Aug 2020: Serving on the PCs of AAAI 2021 and WSDM 2021.

June 2020: Grateful to receive an award from the NSF: SaTC: Core for our project, "When Adversarial Learning Meets Differential Privacy: Theoretical Foundation and Applications," collaborating with Prof. My T. Thai from the University of Florida. Great thanks to the community for continuing to support our research!

May 2020: Our paper "Scalable Differential Privacy with Certified Robustness in Adversarial Learning" has been accepted to ICML'20. The paper established 1) the first connection between differential privacy for adversarial training and certified robustness in both input and latent spaces, and 2) scalable training to bypass the iterative training process in the leading DP mechanisms. [Code has been released on GitHub]

May 2020: Grateful to continue receiving great support from Adobe Systems Inc. to push the boundary of "Differential Privacy in Natural Language Processing." Stay tuned for significant results to come out soon!

May 2020: Invited to serve on the PC of NeurIPS'20, IEEE BigData'20, and IEEE ICMLA'20.

May 2020: Our research on Human Behavior during and after the COVID-19 pandemic using federated machine learning is covered on NJIT Weekly.

May 2020: Gave an invited talk at Adobe Systems Inc. on "Deep Learning in the Wild with Certified Defenses."

March 2020: Our paper "Ontology-based Interpretable Machine Learning for Textual Data" has been accepted for an Oral Presentation at the IEEE International Joint Conference on Neural Networks (IJCNN'2020). [GitHub] See you in Glasgow, Scotland, UK!

Feb 2020: Excited to announce a long-term partnership with Qualcomm Technologies Inc. to build a Federated Machine Learning Framework in the Wild.

Feb 2020: Excited to serve on an NSF Panel.

Jan 2020: Our paper "Limiting the Neighborhood: De-Small-World Network for Outbreak Prevention" has been selected as one of the best papers at CSoNet 2019.

Dec 2019: Invited to serve on the PC of IJCAI-PRICAI'20, ICDM'20, and PAKDD'20.

Nov 2019: Honored to receive the "Best Service Award" at CSoNet 2019.

Nov 2019: Our class "IS 698: Emerging Topics in Deep Learning - Artificial Intelligence" will be available from Spring 2020. Please consider registering for the course, which will cover significant research and practice directions in DL - AI.

Sept 2019: Our paper about "Differentially Private Lifelong Learning" has been accepted for presentation at the Privacy in Machine Learning (PriML) Workshop at NeurIPS'19. See you at NeurIPS'19!

Sept 2019: Our paper about "Ontology-based Interpretable Machine Learning" has been accepted for an Oral Presentation at the Knowledge Representation & Reasoning Meets Machine Learning (KR2ML) Workshop at NeurIPS'19. See you at NeurIPS'19!

Sept 2019: Gave an Invited Talk about "Deep Learning in the Wild with Certified Defenses" at the Computer Science Department, Kent State University.

Sept 2019: Two papers accepted at Springer CSoNet 2019!

Aug 2019: Grateful to receive gift funding from Adobe Systems!

Aug 2019: Attended MedInfo 2019 and presented our work on DrugTracker; grateful for the exciting questions! Went back to Montpellier and regrouped with Prof. Pascal Poncelet.

Aug 2019: Our paper "DrugTracker: A Community-focused Drug Abuse Monitoring and Supporting System using Social Media and Geospatial Data" has been accepted at the International Conference on Advances in Geographic Information Systems 2019 (ACM SIGSPATIAL 2019), Nov 5-8, 2019, Chicago.

May 2019: Invited to serve on the PC of AAAI 2020.

June 2019: We are organizing the "AI with Geospatial Information System for Social Good" Special Track at CSoNet 2019. Please submit your work.

June 2019: Grateful to receive the IJCAI-AIJ Travel Grant 2019 and to be invited to serve as a Session Chair at IJCAI'19.

May 2019: Invited to serve on the PC of CIKM 2019, ICDM 2019, ICTAI 2019, and FLAIRS 2020.

May 2019:  Our paper "Heterogeneous Gaussian Mechanism: Preserving Differential Privacy in Deep Learning with Provable Robustness" has been accepted at the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019), August 10-16, 2019, Macao, China. (acceptance rate = 850/4752, 17.9%) [GitHub]

March 2019: Our paper "An Ensemble Deep Learning Model for Drug Abuse Detection in Sparse Twitter-Sphere" has been accepted at the 17th MedInfo 2019, Lyon, France (a leading venue in Medical and Health Informatics). In this work, we addressed the extremely sparse classification problem to detect drug abuse risk behaviors in the Twitter-Sphere.

March 2019: We are looking for a Ph.D. student, fully funded, to work on Privacy and Security in Deep Learning...

Feb 2019: NJIT has been named a "Very High Research Activity (R1)" institution by the Carnegie Classification.

Feb 2019: Our paper "Deep Self-Taught Learning for Detecting Drug Abuse Risk Behavior in Tweets" has been selected as one of the best papers at CSoNet 2018 and invited to a Special Issue of COSN (Computational Social Networks, Springer).

Dec 2018: Excited to serve on the PC of IJCAI 2019 and MedInfo 2019.

Oct 2018: Excited to serve on the PC of ICDCS 2019, PAKDD 2019, and ICDS 2019.

August 2018: Our paper, titled "Deep Self-Taught Learning for Detecting Drug Abuse Risk Behavior in Tweets," has been invited to The 7th International Conference on Computational Data & Social Networks (CSoNet 2018).

August 2018: Our paper, titled "Recursive Structure Similarity: A Novel Algorithm for Graph Clustering," has been accepted at The 30th IEEE ICTAI 2018.

May 2018: Our team visited Google Research in New York City to present our work on Morphological Neural Networks and Differentially Private Deep Learning.

May 2018: A dataset of 5,000 tweets labeled for drug abuse risk behavior has been released at: https://github.com/hu7han73/DrugAbuseLabeledTweets

May 2018: A paper, titled "Deep Learning Model for Classifying Drug Abuse Risk Behavior in Tweets" has been accepted at IEEE International Conference on Healthcare Informatics (ICHI 2018)

May 2018: Invited to serve on the PC of CIKM 2018 and ICDM 2018.

May 2018: A paper, titled "DPNE: Differentially Private Network Embedding" has been accepted at PAKDD 2018

April 2018: Differentially Private Deep Learning Package Release.

Nov 2017: Our paper was selected as one of the best papers at IEEE ICDM 2017.

Nov 2017: I gave an invited talk (jointly with Prof. Dejing Dou) about: "Deep Learning and Privacy Preserving in a Health Social Network" at The 10th International Workshop on Privacy and Anonymity in the Information Society (PAIS), IEEE ICDM 2017.

Nov 2017: Invited to serve on the PC of IJCAI 2018 and KDD 2018.

Sept 2017: I will give an invited talk about "Differential Privacy Preservation in Deep Learning" at IEEE NJACS-2017.

Sept 2017: A paper, titled "Time-Sensitive Behavior Prediction in a Health Social Network" has been accepted at IEEE ICMLA 2017

Aug 2017: Two papers have been accepted as regular papers at IEEE ICDM 2017 (acceptance rate: 9.25% = 72 / 778). 

June 2017: Our article, titled "Preserving Differential Privacy in Convolutional Deep Belief Networks," has been accepted by Machine Learning 2017, ECML-PKDD Journal Track, IF 1.848 (acceptance rate: 13.5%). [arXiv version]

June 2017: Invited to serve on the PC of ICDM 2017, KDIR 2017, and ADMA 2017.

June 2017: NSF funded our planning meeting to establish an I/UCRC Center for Big Learning (CBL) in 2017.

June 2017: I will give an invited talk about "Differential Privacy Preservation in Deep Learning" at United Technologies Research Center (UTRC) in August.

June 2017: I will give an invited talk about "Enabling Real-Time Drug Abuse Detection on Online Social Media" at Tumblr.com in July.

May 2017: Invited to serve as a Reviewer for IEEE Intelligent Systems.

April 2017: Invited to serve on the PC of ACM CIKM 2017 and IEEE ICTAI 2017.

April 2017: My tutorial on Deep Learning has reached 20,000 views on Slideshare.

Mar 2017: I gave an invited talk about "Differential Privacy Preservation in Deep Learning" at the University of Arkansas.


Recent Projects

A Large-Scale Deep Learning-based Drug Abuse Detection on Online Social Media 

NJIT, CUNY, NYU, and VCU

IJCNN'20 | SIGSPATIAL'19 | MedInfo'19 | KR2ML'19 (NeurIPS) | CSoNet'19 (selected as best papers) | CSoNet'18 (selected as best papers) | ICHI'18 | ICTAI'18 | ICDE-17 | Media Coverage: NJSpotlight, NJ101.5, Medicalxpress

Every day, about 7,000 people are treated in emergency rooms for misuse of prescription drugs. Furthermore, every day about 44 people in the US die from overdoses of prescription painkillers. Opioid overdoses claimed more than 33,000 lives in 2015. Heroin and fentanyl deaths have risen sharply, by 23%, in one year, to 12,989. Deaths from synthetic opioids, including fentanyl, rose by 73% to 9,580.  Untreated drug abuse cases can cause tremendous economic and societal burdens, posing serious public health challenges. 

This project will develop a computational system to monitor, detect, predict and discover activities, events, or topical relationships, recurring patterns, and spatial/temporal/social trends in drug abuse, by analyzing large-scale social media and other online data.  

Deep Private Auto-Encoder (dPA)

DeepPrivate - Differential Privacy Preservation in Deep Learning under Model Attacks

Xintao Wu (University of Arkansas), Dejing Dou (University of Oregon)

Grants: NSF CRII: SaTC, NSF I/UCRC for Big Learning, NJIT Seed Grant

ICML'20 | IJCAI-19 | PriML-19 (NeurIPS) | CSN-19 | MedInfo-19 | PAKDD-18 | ICDM-17 | Machine Learning 2017, ECML-PKDD'17 | AAAI-16 | Media Coverage: Princeton Magazine

Today, the remarkable development of deep learning in the medicine and healthcare domains presents obvious privacy issues when deep neural networks are built on patients’ personal and highly sensitive data, e.g., clinical records. However, no deep learning techniques have yet been developed that incorporate privacy protection against model attacks. Such lack of protection and efficacy may put patient data at high risk and expose health care providers to legal action based on HIPAA/HITECH law. This project will develop a mechanism, called "DeepPrivate," for privacy preservation in deep learning under model attacks.

A key thrust of the project is to better understand and defend against model inference attacks, including both well-known fundamental model attacks and novel attacks developed through the prism of the classical confidentiality and integrity models. Through an extensive analysis of these attacks, the team will develop an understanding of the relative risks of key aspects of learning approaches. In particular, vulnerable features, parameters, and correlations, which are essential to conduct model attacks, will be automatically identified and protected in a novel threat-aware privacy-preserving approach based on ideas from differential privacy.

Semantic Mining of Activity, Social, and Health data Project (SMASH)

Dejing Dou, Xiao Xiao, Hao Wang, Javid Ebrahimi (University of Oregon), Xintao Wu (University of Arkansas)

Brigitte Piniewski (PeaceHealth Labs), David Kil (HealthMantic)

Information Sciences - Elsevier'16 | AAAI-16 | ICMLA'16 | Digital Health'16 | CIKM'14 | SDM'15 | ASONAM'15 | ACM BCB'15 | IEEE Intelligent Systems'15 | KAIS'15 | ACM TIST'16 | SNAM'16 ...

Two-thirds of the US population is now overweight or obese. This incurs significant health risks and financial costs to society. Traditionally, support groups and other social reinforcement approaches have been popular and effective in dealing with unhealthy behaviors, including being overweight. Of the factors associated with sustained weight loss, one of the most important is continued intervention with frequent social contacts. Research in the design and implementation of the SMASH (Semantic Mining of Activity, Social, and Health data) system will address a critical need for data mining tools to help understand the influence of healthcare social networks, such as YesiWell, on sustained weight loss where the data are multi-dimensional, temporal, semantically heterogeneous, and very sensitive.

Mining Object Movement Patterns from Trajectory Data (GeT_Move)

Dino Ienco (IRSTEA Montpellier)

Pascal Poncelet (University Montpellier 2)

Maguelonne Teisseire (IRSTEA Montpellier)

http://www.lirmm.fr/~phan/multimove.jsp | IJITDM 2016| ISI 2013 | PAKDD 2013 | ADMA 2012 | ACM GIS 2012 | IDA 2012 | BDA 2012 (Best paper) | ECML-PKDD 2012 | ICSDM 2011

Dataset Released and Code Released

As a part of the CNRS-LIRMM project, we built a unifying framework to extract multiple-movement patterns from trajectory data. We have also applied different techniques, such as fuzzy logic and the MDL principle, to avoid redundant patterns. In addition, novel movement patterns have been introduced in this project. A demonstration system has been developed that allows end users to interact with the system in real time.

User Similarity Ranking and User Clustering Algorithms based on Social Network Analysis

Van Duc Thong Hoang (University of Gent, Belgium), Hyoseop Shin (Konkuk University, South Korea)

ACM MM 2010 | APWeb 2010 | EDB 2010

Dataset Released

Most clustering algorithms are not effective on dense and concentrated graphs that do not have any meaningful cut points. To address this problem, we first propose a graph transformation to separate large-scale online communities into two different types of meaningful subgraphs. The first subgraph is the intimacy graph and the second is the reputation graph. Then, we present effective algorithms for discovering good sub-communities and for excluding incompatible users in these subgraphs.

In this research, we also propose an adaptive combination scheme of tag-based similarity and link-based similarity in which the weight factors are dynamically determined for each user by evaluating each user’s characteristics such as tag commonness and link strength. The experimental results with a Flickr data set show that the proposed scheme consistently outperforms the previous work by about 20%.
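
The adaptive combination described above can be illustrated with a minimal sketch. It is not the published algorithm: the Jaccard overlap used for both similarities, the `tag_weight` formula, and every function and parameter name below are assumptions made for illustration only. The sketch shows only the general idea of a per-user weight, derived from tag commonness and link strength, that blends a tag-based score with a link-based score.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def tag_weight(tag_commonness: float, link_strength: float) -> float:
    """Hypothetical per-user weight for the tag-based similarity.

    Assumes both inputs are already normalized to [0, 1]. Tags are
    trusted more when they are distinctive (low commonness), links more
    when the user's links are strong. Falls back to an even split when
    neither signal is informative.
    """
    tag_signal = 1.0 - tag_commonness
    total = tag_signal + link_strength
    return tag_signal / total if total > 0 else 0.5


def adaptive_similarity(tags_u: set, tags_v: set,
                        links_u: set, links_v: set,
                        weight_u: float) -> float:
    """Blend tag-based and link-based similarity using user u's weight."""
    tag_sim = jaccard(tags_u, tags_v)
    link_sim = jaccard(links_u, links_v)
    return weight_u * tag_sim + (1.0 - weight_u) * link_sim


# Example with made-up users: moderately common tags, moderately strong links.
w = tag_weight(tag_commonness=0.5, link_strength=0.5)
score = adaptive_similarity({"cat", "sunset"}, {"cat", "beach"},
                            {"a", "b"}, {"a", "b"}, w)
```

Because the weight is computed per user, two users with the same tags and contacts can still receive different blends; this is the property the paragraph above attributes to the adaptive scheme, as opposed to a single global weight.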