BAY-YUAN HSU 許倍源

Assistant Professor, Dept. of Industrial Engineering and Engineering Management

National Tsing Hua University

Hsu received the B.S. degree from the Department of Life Science and the M.S. degree from the Department of Computer Science at National Tsing Hua University, Hsinchu, Taiwan, in 2010 and 2012, respectively. Hsu received the Ph.D. degree from the Department of Computer Science, University of California, Santa Barbara, USA, in 2019. Hsu was an assistant professor in the Department of Computer Science at National Taipei University from 2020 to 2021. Hsu is now an assistant professor in the Department of Industrial Engineering and Engineering Management at National Tsing Hua University, Hsinchu, Taiwan. His research interests include graph mining, big data, machine learning, social network analytics, and bioinformatics.


Education


09/2013 – 09/2019 Ph.D. in Computer Science, University of California, Santa Barbara, CA, USA


09/2010 – 06/2012 M.S. in Computer Science, National Tsing Hua University, Taiwan


09/2006 – 06/2010 B.E. in Life Science, National Tsing Hua University, Taiwan


Honors and Awards

  • Scholarship of International University Visiting Program, National Tsing Hua University, 12/2011

  • Honorary Member of the Phi Tau Phi Scholastic Honor Society, 06/2012

  • Scholarship of Pan Wen Yuan Foundation, 06/2012


Project

  • Multi-layer graphs for social network analysis, intervention, and extraction based on machine learning and graph theory



Research Interests

  • Data Mining and Artificial Intelligence

  • Big Data and Social Network Analytics

  • Combinatorial Optimization and Bioinformatics

RESEARCH DESCRIPTION



DATA ANALYSIS AND DATA MINING


Big data analysis and large-scale machine learning have become a central part of many disciplines. My research revolves around elegant modeling of problems of practical interest, understanding the fundamental principles of statistical machine learning, designing scalable algorithms with guarantees, and rigorously analyzing their computational and information-theoretic properties. For data analysis, data mining, and artificial intelligence, I summarize the topics as follows. The extraction of meaningful features is essential for good performance in multi-class classification. A novel feature learning method involves higher-order score functions computed for certain models from data. Feature extraction aims to reduce the number of features in a dataset by creating new features from the existing ones (and then discarding the original features). This new, reduced set of features should then be able to summarize most of the information contained in the original set. In this way, a summarized version of the original features can be created from a combination of the original set.
Deep learning is currently an active research area in machine learning and pattern classification. Some research results show the impact of feature extraction in deep learning techniques such as convolutional neural networks (CNNs). Our goal is to train a CNN to recognize different patterns and features in social networks. Moreover, while neural networks hold state-of-the-art empirical performance and have shown tremendous promise in many domains, much is yet to be understood about them. To this end, I am interested in understanding their fundamental properties by trying to answer the question, “Why do deep learning models work the way they do?” In addition, I am also interested in applying neural networks to advance state-of-the-art results on various problems of practical importance in artificial intelligence.
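As a minimal sketch of the feature extraction idea described above (creating a smaller set of new features that summarizes the original ones), the following Python snippet computes the first principal component of a small dataset by power iteration and projects each row onto it. Principal component analysis is used here only as a familiar illustration; it is not the higher-order score function method mentioned above, and the function name `top_component` and the toy data are assumptions made for this example.

```python
def top_component(rows, iters=100):
    """Return (direction, scores) for the first principal component.

    rows  : list of equal-length lists of floats (samples x features)
    iters : number of power-iteration steps (illustrative default)
    """
    n, d = len(rows), len(rows[0])
    # Center every feature so the component captures variance, not the mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    v = [1.0] * d  # arbitrary starting direction
    for _ in range(iters):
        # One power-iteration step: w = X^T (X v), then normalize.
        proj = [sum(x * vj for x, vj in zip(row, v)) for row in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]

    # The new single feature: each sample's coordinate along v.
    scores = [sum(x * vj for x, vj in zip(row, v)) for row in centered]
    return v, scores


# Toy data (hypothetical): three features that all depend on one hidden value t.
data = [[float(t), 2.0 * t, -1.0 * t] for t in range(10)]
direction, new_feature = top_component(data)
print(direction)
print(len(new_feature))
```

In this toy case the three original columns are replaced by one new column that carries all of the variation, which is the sense in which the reduced set "summarizes" the original features.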
The rapidly emerging fields of Machine Learning (ML) and Artificial Intelligence (AI) are disrupting many traditional businesses and industries, and promise ultimately to reorganize many aspects of daily life. The existence and persistence of financial statement fraud (FSF) are detrimental to the financial health of global capital markets. Several detective and predictive methods have been used to prevent, detect, and correct FSF. I am planning to build a credit card fraud detection system. Some of my lab mates now work at banks. They told me that banks need a method to automatically attribute credit card purchases to the cardholder, or to detect fraudulent usage from the user’s previous behavior and a time snapshot of their purchases. We need a suitable feature selection method for the large dataset, one that can classify, group, and segment data in order to search through up to millions of transactions to find patterns and detect fraud. We can use machine learning methods to learn suspicious-looking patterns and use those patterns to identify the characteristics of fraud automatically.
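To make the idea of detecting fraudulent usage from a user's previous behavior concrete, here is a minimal Python sketch that flags new purchase amounts lying far from a cardholder's historical spending. It assumes a simple z-score rule with a hypothetical threshold of 3 standard deviations; this is an illustrative baseline only, not the planned system, and the function name `flag_unusual` and the sample amounts are made up for this example.

```python
from statistics import mean, stdev


def flag_unusual(history, new_amounts, threshold=3.0):
    """Return the new amounts whose z-score against `history` exceeds `threshold`.

    history     : past purchase amounts for one cardholder (at least two values)
    new_amounts : amounts of incoming transactions to screen
    threshold   : hypothetical cut-off, in standard deviations
    """
    mu = mean(history)
    sigma = stdev(history)
    flagged = []
    for amount in new_amounts:
        # If the history has no spread, no z-score can be computed; flag nothing.
        z = (amount - mu) / sigma if sigma > 0 else 0.0
        if abs(z) > threshold:
            flagged.append(amount)
    return flagged


# Hypothetical purchase history and two incoming transactions.
history = [12.5, 30.0, 22.4, 18.9, 25.0, 15.75, 27.3]
print(flag_unusual(history, [21.0, 950.0]))  # prints [950.0]
```

A real system would combine many such signals (purchase time, merchant, location) and learn the decision rule from labeled transactions rather than fixing a threshold by hand.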


AI FOR SOCIAL NETWORKS AND HUMAN LEARNING

The goals of my future research work are to develop fundamental concepts and new principles of data mining, design intelligent algorithms, and build scalable systems. For example, I am trying to extract critical features from the social network that can affect human behavior. The timeliness and quality of healthcare are often compromised by the large number of decisions to be taken and a shortage of human experts. For example, in certain medical specialties, patients in most countries need to wait for months to be diagnosed by a specialist. I have published a paper that finds a correlation between social network structure, such as small dense subgroups, and mental health status. However, there are still many features that we can try to find by using feature selection and data mining methods, such as time snapshots of dynamic network interactions or correlations across multiple social networks. Moreover, when people feel anxious, we can use AI to detect this from the data of mobile devices or sensors and provide corresponding coping methods. We can also use AI to support anger management, and use graph mining and intervention to improve mental well-being by finding relevant graph patterns. Moreover, human learning has become an online activity, made popular by emerging online tutoring systems and learning platforms. We can focus on the following problems: (i) Can we find a way of reviewing that helps people memorize information more efficiently? (ii) Can we spot knowledge items (e.g., questions and answers) in social media that systematically help people increase their expertise? For the first problem, we can try to develop a computational framework to derive optimal spaced repetition algorithms, specially designed to adapt to the learners’ performance.
To answer the second question, we can first focus on spotting specific knowledge items (e.g., questions and answers) that systematically help people increase their expertise, and then develop a probabilistic modeling framework that leverages the crowd to spot information with high (knowledge) value.


HIGH-QUALITY DATA COLLECTION FROM ONLINE SOCIAL NETWORKS

I also work on collecting high-quality information on communities from online social networks (OSNs). Social network analysis usually requires user data from the network for user studies. However, collecting (crawling) information from OSNs is not a simple task, both because of the GDPR (General Data Protection Regulation) and because node and edge information should be crawled in its entirety. Currently, when taking users' willingness into consideration, researchers usually recruit users to contribute their data manually, e.g., through online forums and crowdsourcing platforms, or by sending messages to potential users in dense communities such as classes or offices. These manual approaches usually fail to obtain good ego networks: users recruited from online forums and crowdsourcing platforms usually do not know each other, so no ego network structure exists, whereas in a dense community (a class or an office) only a small portion of users may be willing to contribute their data.


GRAPH MINING FOR BIG SOCIAL NETWORKS

My recent research is on big data analytics for mental disorder detection and status improvement. The research topic is graph mining and intervention to improve mental well-being. I have used a range of techniques, such as graph mining, neural networks, and deep learning, to find graph patterns that are related to mental disorders. I found that some patterns, such as small dense subgroups and the patient's ego network, are related to mental disorders. Here, small dense subgroups refer to small groups in the social network whose members are densely connected to one another but have no or few links to individuals outside the group. Our work shows that we can improve patients’ mental health status by reducing these small dense subgroups in the social network through network intervention. I have designed a linear-time approximation algorithm with a ½(1-1/e) approximation ratio that reduces the subgroups in the social network through network intervention effectively and efficiently. To further improve patients’ mental health status, I also designed an algorithm that helps psychologists and doctors find a suitable therapy group for mental disorder patients. Several psychologists and doctors evaluated the results, which demonstrated that the therapy groups selected by our algorithm are of much better quality than groups manually selected by professionals. I also work on extracting a socially tight group with optimized group diversity from the social network, which can help leaders find a high-performing team for their work.
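The notion of a small dense subgroup described above (members densely linked to each other, with no or few links to the outside) can be illustrated with a short Python check. This is only a sketch under stated assumptions: the text gives no numeric definition, so the size limit, density threshold, and external-link limit below are hypothetical parameters, and the function names and toy graph are made up for this example. It is not the approximation algorithm itself.

```python
def subgroup_stats(adj, group):
    """Return (internal density, number of links leaving the group).

    adj   : dict mapping each node to the set of its neighbours (undirected graph)
    group : iterable of nodes forming the candidate subgroup
    """
    group = set(group)
    # Each internal edge is seen from both endpoints, so halve the count.
    internal = sum(1 for u in group for v in adj[u] if v in group) // 2
    external = sum(1 for u in group for v in adj[u] if v not in group)
    possible = len(group) * (len(group) - 1) // 2
    density = internal / possible if possible else 0.0
    return density, external


def is_small_dense_subgroup(adj, group, max_size=4, min_density=0.8, max_external=1):
    """Check a candidate group against hypothetical thresholds."""
    density, external = subgroup_stats(adj, group)
    return (len(set(group)) <= max_size
            and density >= min_density
            and external <= max_external)


# Toy network (hypothetical): a triangle a-b-c attached to a chain c-d-e-f.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}
print(is_small_dense_subgroup(adj, ["a", "b", "c"]))  # prints True
print(is_small_dense_subgroup(adj, ["c", "d", "e"]))  # prints False
```

In the toy network, the triangle is fully connected and has a single link to the rest of the graph, so it qualifies; the chain segment is sparser and has three outgoing links, so it does not.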


PUBLICATIONS



Conference Papers


B.-Y. Hsu, C.-L. Tu, M.-Y. Chang, and C.-Y. Shen, “CrawlSN: Community-aware Data Acquisition with Maximum Willingness in Online Social Networks,” The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), Sept. 2020. (Journal Track Paper)


B.-Y. Hsu, C.-Y. Shen, and M.-Y. Chang, “WMEgo: Willingness Maximization for Ego Network Data Extraction in Online Social Networks,” ACM International Conference on Information and Knowledge Management (CIKM), Oct. 2020. (Research Track Full Paper, Acceptance Rate = 21%, 920 Submissions)

B.-Y. Hsu, C.-L. Tu, M.-Y. Chang, C.-Y. Shen, “On Crawling Community-aware Online Social Network Data,” ACM Conference on Hypertext and Social Media (HT’19), Hof, Germany, 2019. (Research Track Poster)


B.-Y. Hsu, C.-Y. Shen, G.-S. Lee, Y.-J. Hsu, C.-H. Yang, C.-W. Lu, M.-Y. Chang, K.-P. Lin, “Optimizing k-Collector Routing for Big Data Collection in Road Networks,” IEEE Global Communications Conference (GLOBECOM), Hawaii, USA, Dec. 2019.


B.-Y. Hsu and C.-Y. Shen, “On Extracting Social-Aware Diversity-Optimized Groups in Social Networks,” IEEE Global Communications Conference (GLOBECOM), Abu Dhabi, UAE, Dec. 2018.

X.-H. Dang, A. K. Singh, P. Bogdanov, H. You, and B.-Y. Hsu, “Discriminative Subnetworks with Regularized Spectral Learning for Global-State Network Data,” European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD), France, pp. 290-306, Sep. 2014. (Acceptance Rate = 25%)


B.-Y. Hsu, T. K. F. Wong, W.-K. Hon, X. Liu, T.-W. Lam, and S.-M. Yiu, “A Local Structural Prediction Algorithm for RNA Triple Helix Structure,” International Conference on Pattern Recognition in Bioinformatics (PRIB), pp. 102-113, June 2013.


T. K. F. Wong, H.-T. Yu, B.-Y. Hsu, T. W. Lam, W.-K. Hon, and S.-M. Yiu, “Algorithms for Pseudoknot Classification,” Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine, Chicago, IL, USA, pp. 484-486, Aug. 2011. (Poster)



Journal Articles


B.-Y. Hsu, Y.-L. Chen, Y.-C. Ho, P.-Y. Chang, C.-C. Chang, B.-C. Shia, and C.-Y. Shen, “Diversity-Optimized Group Extraction in Social Networks”, IEEE Transactions on Computational Social Systems (TCSS), 2022.

B.-Y. Hsu, L.-Y. Yeh, M.-Y. Chang, and C.-Y. Shen, “Willingness Maximization for Ego Network Data Extraction in Multiple Online Social Networks”, IEEE Transactions on Knowledge and Data Engineering (TKDE), 2022.

B.-Y. Hsu, C.-Y. Shen, and X. Yan, “Network Intervention for Mental Disorders with Minimum Small Dense Subgroups”, IEEE Transactions on Knowledge and Data Engineering (TKDE), Oct. 2019 (Online DOI: 10.1109/TKDE.2019.2949294).


B.-Y. Hsu, Y.-F. Lan, and C.-Y. Shen, “On Automatic Formation of Effective Therapy Groups in Social Networks”, IEEE Transactions on Computational Social Systems (TCSS), Vol. 5, No. 3, pp. 713-726, Sep. 2018.

T. K. F. Wong, K.-L. Wan, B.-Y. Hsu, B. W.Y. Cheung, W.-K. Hon, T.-W. Lam, and S.-M. Yiu, “RNASAlign: RNA Structural Alignment System”, Bioinformatics, Vol. 27, No. 15, pp. 2151-2152, Aug. 2011.