Bio
I currently work as an AI Research Scientist and Tech Lead at Dataminr, Inc., focusing my R&D on data mining, machine learning, and natural language processing for real-time event discovery and summarization from publicly available data sources (tweets, articles, sensor data, etc.). Previously, I worked as a Senior Member of Technical Staff at AT&T Labs Research, where my R&D focused on spatio-temporal data mining, applied machine learning, and deep learning on large-scale, spatially distributed networking data to automate dynamic resource management and heterogeneous 4G/5G network planning. I also have one year of R&D internship experience at NEC Labs America, TuSimple, and MSKCC in predictive modeling, text mining, and anomaly detection. I received my PhD from the School of Computing and Information at the University of Pittsburgh, with a background in data mining, machine learning, network science, and statistics. My PhD thesis focused on large-scale geo-social network data mining to model the spatial, social, and temporal aspects of user mobility behavior in the physical world, and its applications to the local economy. I am passionate about learning insights from various structured and unstructured datasets, communicating findings to scientists, engineers, and non-experts via publications, presentations, and demos, and delivering data-driven software and products that drive intelligent, automated businesses and services.
Education
University of Pittsburgh, Pittsburgh, PA, USA
Ph.D. in Information Sciences, Jan 2012 ~ Oct 2016
Thesis: "Urban Mobility and Location-based Social Networks: Social, Economic and Environmental Incentives"
Advisor: Prof. Konstantinos Pelechrinis
Huazhong University of Science and Technology, Wuhan, Hubei, China
M.S. in Communication and Information System, Sep 2009 ~ Dec 2011
Thesis: "Wireless Indoor Target Tracking using Particle Filter Estimation"
B.S. in Telecommunication Engineering, Sep 2005 ~ Jun 2009
Thesis: "Design of ZigBee-based Indoor Localization System"
Work Experience
Senior Research Scientist, AI & Data Science, Dataminr, Inc., NYC, Jan 2022 - Present
Research Scientist II, AI & Data Science, Dataminr, Inc., NYC, June 2019 - Dec 2021
Lead research and development on multiple projects and initiatives:
Natural language generation with Deep Learning for event summarization from online social media posts
Text classification with deep learning for real-time event detection from online social media data
Spatio-temporal data mining and machine learning/deep learning for real-time event and anomaly detection from sensor data
Senior Member of Technical Staff, AT&T Labs Research, Bedminster, NJ. Nov 2016 - June 2019
Provided data-driven solutions with ML/AI for spatially distributed network planning/optimization
Led the design of time-series models (LSTM, Attention) capturing graph interactions to predict network load
Led the development of building-ranking systems that mine heterogeneous spatial data to automate indoor site planning
Built urban object image recognition micro-services using transfer learning to automate site deployment
Software Engineering Intern, TuSimple LLC, San Diego, CA. May 2016 ~ Aug 2016
Mined spatial trajectories and ego-motion sensor time-series data in an autonomous driving system
Delivered data-driven algorithms and services to detect driving behaviors and anomalous events
Research Intern, NEC Labs America, Princeton, NJ. Jun 2015 ~ Nov 2015
Analyzed machine logs and designed a deep learning based algorithm for system failure prediction
Learnt regular text patterns via hierarchical clustering from large-scale heterogeneous machine logs
Formalized system failure prediction as a sequential learning problem with devised TF-IDF features
Implemented a deep RNN (LSTM) from scratch to predict rare failure events with high PR-AUC
Data Scientist Intern, MSKCC, New York City, NY. May 2014 ~ Aug 2014
Analyzed and extracted features from clinical data to predict patient re-admission after discharge
Explored and experimented with various Hadoop distributions/solutions for healthcare applications
Journal Paper
Geng, Li, and Ke Zhang. 2023. "Correlation of Road Network Structure and Urban Mobility Intensity: An Exploratory Study Using Geo-Tagged Tweets" ISPRS International Journal of Geo-Information 12, no. 1: 7. https://doi.org/10.3390/ijgi12010007
Michele Polese, Rittwik Jana, Velin Kounev, Ke Zhang, Supratim Deb, Michele Zorzi, "Machine Learning at the Edge: A Data-Driven Architecture with Applications to 5G Cellular Networks," in IEEE Transactions on Mobile Computing, doi: https://doi.org/10.1109/TMC.2020.2999852.
Lei Li, Daqing He, Chengzhi Zhang, Li Geng, Ke Zhang, "Characterizing peer-judged answer quality on academic Q&A sites: A cross-disciplinary case study on ResearchGate", Aslib Journal of Information Management, Vol. 70 Issue: 3, pp.269-287. https://doi.org/10.1108/AJIM-11-2017-0246
Ke Zhang, Konstantinos Pelechrinis, Theodoros Lappas, "Effects of Promotions on Location-based Social Media: Evidence from Foursquare", in International Journal of Electronic Commerce, Vol. 22, Issue. 1, 2018. https://doi.org/10.1080/10864415.2018.1396118
L. Jin, Ke Zhang, J. Lu, Y.R. Lin. "Towards understanding the gamification upon users’ scores in a location-based social network". Multimedia Tools and Applications, Vol. 75, Issue. 15, pp. 8895–8919 (2016). https://doi.org/10.1007/s11042-014-2317-3
L. Jin, X. Long, Ke Zhang, Y.R. Lin and J. Joshi. "Characterizing users’ check-in activities using their scores in a location-based social network". Multimedia Systems, Vol. 22, Issue. 3, pp. 87–98 (2016). https://doi.org/10.1007/s00530-014-0395-8
Ke Zhang, Konstantinos Pelechrinis, and Prashant Krishnamurthy. "ACM HotMobile 2013 poster: detecting fake check-ins in location-based social networks through honeypot venues". ACM SIGMOBILE Mobile Computing and Communications Review. 17, 3 (July 2013), 29–30. DOI:https://doi.org/10.1145/2542095.2542111. [Link] [PDF]
Conference Paper
Ma, Liang, Shuyang Cao, Robert L. Logan IV, Di Lu, Shihao Ran, Ke Zhang, Joel Tetreault, Aoife Cahill, and Alejandro Jaimes. "BUMP: A Benchmark of Unfaithful Minimal Pairs for Meta-Evaluation of Faithfulness Metrics." arXiv preprint arXiv:2212.09955 (2022).
Hossein Rajaby Faghihi, Bashar Alhafni, Ke Zhang, Shihao Ran, Joel Tetreault, Alejandro Jaimes. CrisisLTLSum: A Benchmark for Local Crisis Event Timeline Extraction and Summarization, in the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022 Findings) [Arxiv]
Vivian Lai, Alison Smith-Renner, Ke Zhang, Ruijia Cheng, Wenjuan Zhang, Joel Tetreault, Alejandro Jaimes. An Exploration of Post-Editing Effectiveness in Text Summarization, in Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2022) [Link][PDF]
Ruijia Cheng, Alison Smith-Renner, Ke Zhang, Joel R. Tetreault, Alejandro Jaimes; Mapping the Design Space of Human-AI Interaction in Text Summarization, in Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2022) [Link][PDF]
Xin Dong, Tanay Kumar Saha, Ke Zhang, Joel Tetreault, Alejandro Jaimes, Gerard de Melo. Temporal Event Reasoning Using Multi-source Auxiliary Learning Objectives. In: Advances in Information Retrieval. ECIR 2022. Lecture Notes in Computer Science, vol 13186. Springer, Cham. https://doi.org/10.1007/978-3-030-99739-7_12
Eleftheria Briakou, Sweta Agrawal, Ke Zhang, Joel Tetreault, Marine Carpuat, "A Review of Human Evaluation for Style Transfer", in Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021). http://dx.doi.org/10.18653/v1/2021.gem-1.6 [PDF]
Eleftheria Briakou, Di Lu, Ke Zhang, Joel Tetreault, "Olá, Bonjour, Salve! XFORMAL: A Benchmark for Multilingual Formality Style Transfer", in Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2021). http://dx.doi.org/10.18653/v1/2021.naacl-main.256 [PDF]
Chidubem Arachie, Manas Gaur, Sam Anzaroot, William Groves, Ke Zhang, Alejandro Jaimes, "Unsupervised Detection of Sub-events in Large Scale Disasters", in Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2020), NYC, NY USA, 34(01), 354-361. https://doi.org/10.1609/aaai.v34i01.5370, [PDF]
Abhijeet Bhorkar*, Ke Zhang*, Jin Wang, "DeepAuto: A Hierarchical Deep Learning Framework for Real-Time prediction in Cellular Networks", in Globecom 2019, Waikoloa, HI, USA. (* equal contribution) [PDF]
L. Geng, Ke Zhang, X. Wei and X. Feng, "Soft Biometrics in Online Social Networks: A Case Study on Twitter User Gender Recognition," 2017 IEEE Winter Applications of Computer Vision Workshops (WACVW), Santa Rosa, CA, 2017, pp. 1-8, doi: https://doi.org/10.1109/WACVW.2017.8. [Link][PDF]
Ke Zhang, J. Xu, M. R. Min, G. Jiang, K. Pelechrinis and H. Zhang, "Automated IT system failure prediction: A deep learning approach", 2016 IEEE International Conference on Big Data (IEEE Big Data 2016), Washington D.C. USA. pp. 1291-1300, doi: https://doi.org/10.1109/BigData.2016.7840733. (Acceptance Rate 18.68%) [Link][PDF][Slides]
Ke Zhang, Konstantinos Pelechrinis, "Do Street Fairs Boost Local Businesses? A Quasi-Experimental Analysis Using Social Network Data". In The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2016 (ECML-PKDD 2016), Riva del Garda, Italy. Lecture Notes in Computer Science, vol 9853. Springer, Cham. https://doi.org/10.1007/978-3-319-46131-1_22 (Acceptance rate 20%) [Link][PDF][Slides]
Ke Zhang, Yu-ru Lin, Konstantinos Pelechrinis, "EigenTransitions with Hypothesis Testing: The Anatomy of Urban Mobility", in International AAAI Conference on Web and Social Media 2016 (ICWSM 2016), Cologne, Germany. Available at: https://www.aaai.org/ocs/index.php/ICWSM/ICWSM16/paper/view/13042/12768. (Acceptance rate 17%) [PDF][Slides]
Ke Zhang, Konstantinos Pelechrinis, Theodoros Lappas. "Analyzing and Modeling Special Offer Campaigns in Location-Based Social Networks", in International AAAI Conference on Web and Social Media 2015 (ICWSM 2015), Oxford, UK. Available at: https://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10481 (Acceptance rate 19%) [PDF][Slides]
Ke Zhang and Konstantinos Pelechrinis. "Understanding spatial homophily: the case of peer influence and social selection". In Proceedings of the 23rd international conference on World wide web (WWW '14). Association for Computing Machinery, New York, NY, USA, 271–282. DOI:https://doi.org/10.1145/2566486.2567990. (Acceptance rate 12.9%) [PDF][Slides]
Ke Zhang, Qiuye Jin, Konstantinos Pelechrinis, and Theodoros Lappas. "On the importance of temporal dynamics in modeling urban activity". In Proceedings of the 2nd ACM SIGKDD International Workshop on Urban Computing (UrbComp '13). Association for Computing Machinery, New York, NY, USA, Article 7, 1–8. DOI:https://doi.org/10.1145/2505821.2505825. [PDF][Slides]
Ke Zhang, Wei Jeng, Francis Fofie, Konstantinos Pelechrinis, and Prashant Krishnamurthy, "Towards reliable spatial information in LBSNs", In Proceedings of the 2012 ACM Conference on Ubiquitous Computing (UbiComp '12), Association for Computing Machinery, New York, NY, USA, 950–955. DOI:https://doi.org/10.1145/2370216.2370426. [Link][PDF][Slides]
Granted Patents
Chen J, Zhao W, Zhang K, Huijing YA, Wang H, inventors; AT&T Intellectual Property I LP, assignee. Real-time user traffic classification in wireless networks. United States patent US 10,772,016. 2020 Sep 8. [link]
Chen J, Zhang K, Zhao W, Yang Y, Wen X, inventors; AT&T Intellectual Property I LP, assignee. System and method for classifying a physical object. United States patent US 10,362,491. 2019 Jul 23. [link]
Jianwu XU, Zhang K, Zhang H, Min R, Jiang G, inventors; NEC Corp, assignee. Mobile phone with system failure prediction using long short-term memory neural networks. United States patent US 10,296,430. 2019 May 21. [link]
Jianwu XU, Zhang K, Zhang H, Min R, Jiang G, inventors; NEC Corp, assignee. System failure prediction using long short-term memory neural networks. United States patent US 10,289,509. 2019 May 14. [link]
Media Press
Our work on analyzing and predicting the effectiveness of location-based advertising through social media has been featured in Pitt News, Pittsburgh Post-Gazette, Communications of the ACM, Pittsburgh's NPR News Station, Radio PA, Bloomberg Business, Pitt News, GeoMarketing, World News.
Services
Conference Area Chair/SPC/PC/Reviewer
ACL ARR Aug 2024 [Area Chair], COLING 2025 [PC for Demo Track, Area Chair for NLP/LLM Track], EMNLP 2024 [Area Chair], ACL 2024 [Area Chair], CIKM 2024 [Area Chair], NAACL 2024, ACM ICME 2024, ACL 2023, ACM ICME 2023, AAAI 2023, CIKM 2022 (Best SPC Award), CtrlGen Workshop at NeurIPS 2021, AAAI 2022, ACM ICME 2021, AAAI 2021, KDD-KiML 2020, ACM ICME 2020, ACM WWW 2020, ACM ICME 2019, ACM WWW 2019, ACM CIKM 2018, ACM ICME 2018, ACM CIKM 2017/2018, AAAI ICWSM 2016/2017, IEEE MDM 2017, ACM WSDM 2017/2018, ACM WWW 2017, IJCAI 2016, IEEE IRI 2016
Journal Reviewer
IEEE Internet of Things Journal (2022), IEEE MultiMedia, IEEE Open Journal of Intelligent Transportation Systems, IEEE Access, IEEE Transactions on Knowledge and Data Engineering (TKDE), IEEE Journal on Selected Areas in Communications (JSAC), Big Data, International Journal of Distributed Sensor Networks, IEEE Transactions on Intelligent Transportation Systems, IEEE Transactions on Information Forensics and Security, Travel Behavior and Society
Guest Lecture/Talks
Awards
Hackathon ML competition 1st prize (Video Categorization), AT&T, April 2018
Hackathon ML competition 3rd prize (Malicious Domain Name Detection), AT&T, April 2017
Student grant awards: IEEE BigData'16, AAAI ICWSM'16, AAAI ICWSM'15
Research/Teaching Assistantship, Jan 2012 ~ Oct 2016
Contact
Email: kzhang AT dataminr DOT com (work), zhangke290 AT gmail DOT com (personal), kez11 AT pitt DOT edu (univ)