Sangkeun Matthew Lee
Senior Research Staff, Critical Infrastructure Resilience Group,
Geospatial Science and Human Security Division
National Security Sciences Directorate
Oak Ridge National Laboratory, Oak Ridge, TN
lees4 at ornl.gov
B.S., Computer Science, KAIST, South Korea, 2004
Ph.D., Computer Science and Engineering, Seoul National University, South Korea, 2012
2015-Present R&D Associate, Computer Science & Mathematics Division, ORNL
2013-2015 Post-Doc, Computational Sciences & Engineering Division, ORNL
Sangkeun Lee earned his Ph.D. in Computer Science and Engineering from Seoul National University in 2012 and joined Oak Ridge National Laboratory (ORNL) in 2013 as a post-doctoral research associate. He has since advanced to a leadership role as a computer scientist within the Discrete Algorithm Group of the Computer Science and Mathematics Division. Throughout his career at ORNL, Lee has demonstrated exceptional leadership in advancing data analysis technologies across diverse fields, including power systems, building science, materials science, and medical sciences. As the lead of the URBAN-NET critical infrastructure interdependency analytics team, Lee has spearheaded impactful research in energy resilience, contributing significantly to key projects such as the Environment for Analysis of Geo-Located Energy Information (EAGLE-I), the Technical Assistance for States and Tribes Initiative: Assisting Grid Resilience Investment Decision-Making (TASTI-GRID), and the North American Energy Resilience Model (NAERM). Lee has been instrumental in advancing ORNL’s mission by leading interdisciplinary research initiatives aimed at enhancing the resilience of critical infrastructure systems through innovative data analytics and collaboration. His work includes significant contributions to academia, numerous high-impact publications, and the development of essential U.S. copyrighted and open-source data analytics software for both scientific research and government sectors.
URBAN-NET: Predicting Propagation Consequences Using Synergistically Interacting Infrastructure Networks (DOE Office of Cybersecurity, Energy Security, and Emergency Response, Infrastructure Security and Energy Restoration) – Role: EAGLE-I URBAN-NET Analytics Team Lead
Technical Assistance for States and Tribes Initiative: Assisting Grid Resilience Investment Decision-Making (TASTI-GRID), State and Tribal Assistance Program (DOE Grid Deployment Office) – Role: Weather and power system resilience correlation analysis
North American Energy Resilience Model (NAERM) – Role: Critical infrastructure interdependency & system resilience curve analysis
EAGLE-I: Environment for Analysis of Geo-Located Energy Information Project (DOE Office of Cybersecurity, Energy Security, and Emergency Response) – Role: Data collection & standardization
Machine Learning and Supercomputing to Predict Corrosion/Oxidation of High-Performance Valve Alloys (DOE Office of Energy Efficiency and Renewable Energy (EERE), Vehicle Technologies Office) – Role: Machine learning model development
Clinical Knowledge: Advanced Analytics Platform to develop an environment for advanced analytics and decision support in the Learning Health Care System (Department of Veterans Affairs) – Role: System architecture design for data analytics
Benchmark Datasets Development and Applications (DOE Building Technologies Office) – Role: Data standardization tool development
Workshop Chair (Organizer) - the 1st-4th Workshop on Big Data Tools and Use Cases for Innovative Scientific Discovery, in conjunction with IEEE Big Data 2019-2022
Program Committee Member – ACM RecSys 2015-2017, CDMCS 2017, APWeb-WAIM 2017
External Reviewer – Expert Systems with Applications – Elsevier Journal (ISSN:0957-4174)
Member of the IEEE
Geoinformatics: Information visualization and analysis of complex systems, such as power grid systems and their interdependencies.
Large-Scale Graph/Data Analysis: High-performance computing for graph analysis, including homogeneous and heterogeneous graph analysis (e.g., node ranking, link analysis, clustering, path mining, etc.).
Machine Learning and Artificial Intelligence: Leveraging data to extract valuable insights for real-world problems.
Big Data & NoSQL Databases: Scalable ETL (Extract, Transform, Load) processes, graph construction, and the integration of heterogeneous data sources.
2023 ORNL CSMD Distinguished Software Award
2023 IEEE IRI Best Poster Award
2016 R&D 100 Award
2013 Honorable Mention Poster Award at the 1st ORNL Annual Postdoctoral Research Symposium
2011 Best Paper Award at PIKM 2011 in conjunction with CIKM
Clean Energy Innovation Ecosystems Discovery - Measuring Innovation through Data Analytics (MIDAS) - Copyright No. 80000023
URBAN-NET-TOOLKIT - Copyright No. 80000012
ASCENDS (Advanced data SCiENce toolkit for Non-Data Scientists) - ASCENDS is a toolkit that helps scientists, or anyone else, apply machine learning to their own data, specifically for correlation analysis, regression, and classification. It requires no programming skills. Instead, it provides a set of simple but powerful CLI (Command Line Interface) and GUI (Graphical User Interface) tools that let non-data scientists perform advanced data analysis and machine learning intuitively. See the GitHub project page for more details: https://github.com/ornlpmcp/ASCENDS
VizBrick - VizBrick (https://github.com/liza183/vizbrick) is a web-based tool with a graphical user interface that helps users create Brick models (https://brickschema.org/) for their building datasets interactively, without having to know the detailed syntax of RDF Turtle (TTL, Terse RDF Triple Language), the format used to describe Brick models. An accompanying tutorial explains how to create a Brick model using VizBrick with the Ecobee building dataset (https://bbd.labworks.org/ds/bbd/ecobee).
ORiGAMI (Oak Ridge Graph Analytics for Medical Innovation) - An Open Science Reasoning Framework on the National Library of Medicine’s Semantic Medline - Winner of a 2016 DOE R&D 100 Award: ORiGAMI is a tool for discovering and evaluating potentially interesting associations and creating novel hypotheses in medicine. ORiGAMI connects the dots across 70 million knowledge nuggets published in 23 million papers in the medical literature. The tool works on a ‘Knowledge Graph’ derived from Semantic Medline, published by the National Library of Medicine, and integrated with scalable software that enables term-based, path-based, meta-pattern, and analogy-based reasoning.
“CCSD’s Lee wins best poster for storm risk study” - https://www.ornl.gov/news/ccsds-lee-wins-best-poster-storm-risk-study
A collaboration with Dr. Raiman has been featured on ORNL news - “Materials—Quelling corrosion” https://www.ornl.gov/news/materials-quelling-corrosion
An ORNL LDRD project “Advancing Domain Science with Explainable Deep-Learning: Application to High-Temperature Alloy Design” (LDRD, PI: Matt Lee) has been featured on ORNL news - “ORNL Researchers Achieve Explainable AI for Alloy Design” https://www.ornl.gov/news/ornl-researchers-achieve-explainable-ai-alloy-design
(Full List available at: https://scholar.google.com/citations?hl=en&user=DcyrVNoAAAAJ)
Lee, S., Chinthavali, S., Bhusal, N., Stenvig, N., Tabassum, A., & Kuruganti, T. (2023). Quantifying the power system resilience of the US power grid through weather and power outage data mapping. IEEE Access (Impact Factor 3.4).
Lee, S., Cui, B., Bhandari, M. S., & Im, P. (2023). Visual Brick model authoring tool for building metadata standardization. Automation in Construction (Impact Factor 9.6), 156, 105122.
Jung, G. S., Choi, J. Y., & Lee, S. M. (2024). Active learning of neural network potentials for rare events. Digital Discovery (Impact Factor 6.2), 3(3), 514-527.
Cui, B., Im, P., Bhandari, M., & Lee, S. (2023). Performance analysis and comparison of data-driven models for predicting indoor temperature in multi-zone commercial buildings. Energy and Buildings (Impact Factor 6.6), 298, 113499.
Brelsford, C., Tennille, S., Myers, A., Chinthavali, S., Tansakul, V., Denman, M., ... & Bhaduri, B. (2024). A dataset of recorded electricity outages by United States county 2014–2022. Scientific Data, 11(1), 271. (5-year Impact Factor: 8.9)
Lee, S., Choi, J., Jung, G., Tabassum, A., Stenvig, N., & Chinthavali, S. (2023, August). Predicting Power Outage During Extreme Weather Events with EAGLE-I and NWS Datasets. In 2023 IEEE 24th International Conference on Information Reuse and Integration for Data Science (IRI) (pp. 211-212). IEEE.
Lee, S., Chinthavali, S., Tennille, S., Chae, J., Tabassum, A., Tansakul, V., ... & Myers, A. (2022, December). Graph-based Cascading Impact Estimation for Identifying Crucial Infrastructure Components. In 2022 IEEE International Conference on Big Data (Big Data) (pp. 6749-6751). IEEE.
Allen-Dumas, M. R., Lee, S., & Chinthavali, S. (2022, December). Analysis of Correlation between Cold Weather Meteorological Variables and Electricity Outages. In 2022 IEEE International Conference on Big Data (Big Data) (pp. 3398-3401). IEEE.
Chinthavali, S., Tansakul, V., Lee, S., Whitehead, M., Tabassum, A., Bhandari, M., ... & Cortner, C. (2022). COVID-19 pandemic ramifications on residential Smart homes energy use load profiles. Energy and Buildings, 259, 111847.
Pillai, R., Romedenne, M., & Lee, S. M. (2022). Development of an Open-source Alloy selection and Lifetime assessment tool for structural components in CSP (No. ORNL/TM-2021/2365). Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States).
Chinthavali, S., Lee, S., Starke, M., Chae, J., Tansakul, V., Munk, J., ... & Leverette, J. (2021, February). Data Analysis Approach for Large Data Volumes in a Connected Community. In 2021 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT) (pp. 1-5). IEEE.
Lee, S., Peng, J., Williams, A., & Shin, D. (2020). ASCENDS: Advanced data SCiENce toolkit for Non-Data Scientists. Journal of Open Source Software, 5(46), 1656.
Chinthavali, S., Tansakul, V., Lee, S., Tabassum, A., Munk, J., Jakowski, J., ... & Leverette, J. (2019, November). Quantification of Energy Cost Savings through Optimization and Control of Appliances within Smart Neighborhood Homes. In Proceedings of the 1st ACM International Workshop on Urban Building Energy Sensing, Controls, Big Data Analysis, and Visualization (pp. 59-68).
Shin, D., Yamamoto, Y., Brady, M. P., Lee, S., & Haynes, J. A. (2019). Modern data analytics approach to predict creep of high-temperature alloys. Acta Materialia, 168, 321-330.
Raiman, S. S., & Lee, S. (2018). Aggregation and data analysis of corrosion studies in molten chloride and fluoride salts. Journal of Nuclear Materials, 511, 523-535.
Lee, S., Sim, H., Kim, Y., & Vazhkudai, S. S. (2019). A programmable shared-memory system for an array of processing-in-memory devices. Cluster Computing, 22(2), 385-398.
Shin, D., Lee, S., Shyam, A., & Haynes, J. A. (2017). Petascale supercomputing to accelerate the design of high-temperature alloys. Science and Technology of Advanced Materials, 18(1), 828-838.
Lee, S., Vazhkudai, S. S., & Gunasekaran, R. (2017, December). Applying Graph Analytics to Understand Compute Core Usage and Publication Trends in a Petascale Supercomputing Facility. In 2017 IEEE 24th International Conference on High Performance Computing (HiPC) (pp. 294-305). IEEE.
Lee, S., Sim, H., Kim, Y., & Vazhkudai, S. S. (2017, May). Analyzethat: a programmable shared-memory system for an array of processing-in-memory devices. In 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID) (pp. 619-624). IEEE.
Lee, S., Chen, L., Duan, S., Chinthavali, S., Shankar, M., & Prakash, B. A. (2016, December). URBAN-NET: A network-based infrastructure monitoring and analysis system for emergency management and public safety. In 2016 IEEE International Conference on Big Data (Big Data) (pp. 2600-2609). IEEE.
Lee, S., Chinthavali, S., Shankar, M., Zeng, C., & Hendrickson, S. (2016). Energy Finance Data Warehouse Manual (No. ORNL/TM-2016/762). Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States).
Lee, S., Chinthavali, S., Duan, S., & Shankar, M. (2016, June). Utilizing semantic big data for realizing a national-scale infrastructure vulnerability analysis system. In Proceedings of the International Workshop on Semantic Big Data (pp. 1-6).
Lee, S., Sukumar, S. R., Hong, S., & Lim, S. H. (2016). Enabling graph mining in RDF triplestores using SPARQL for holistic in-situ graph analysis. Expert Systems with Applications, 48, 9-25.
Kim, Y., Atchley, S., Vallee, G. R., Lee, S., & Shipman, G. M. (2016). Optimizing end-to-end big data transfers over terabits network infrastructure. IEEE Transactions on Parallel and Distributed Systems, 28(1), 188-201.
Hong, S., Lee, S., Lim, S. H., Sukumar, S. R., & Vatsavai, R. R. (2016, May). Evaluation of pattern matching workloads in graph analysis systems. In Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing (pp. 263-266).
Singh, R., Graves, J. A., Lee, S., Sukumar, S. R., & Shankar, M. (2015, October). Enabling graph appliance for genome assembly. In 2015 IEEE International Conference on Big Data (Big Data) (pp. 2583-2590). IEEE.
Lee, S., Kahng, M., & Lee, S. G. (2015). Constructing compact and effective graphs for recommender systems via node and edge aggregations. Expert Systems with Applications, 42(7), 3396-3409.
Lee, S., Lee, S., & Park, B. H. (2015, April). PathMining: A path-based user profiling algorithm for heterogeneous graph-based recommender systems. In The Twenty-Eighth International FLAIRS Conference.
Hong, S., Lim, S. H., Lee, S., Sukumar, S. R., & Vatsavai, R. R. (2015). Benchmarking high performance graph analysis systems with graph mining and pattern matching workloads. In Proceedings of the ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (IEEE Supercomputing).