Selected Publications
A complete list of publications is available on DBLP and Google Scholar.
2024
Privacy-Enhanced Database Synthesis for Benchmark Publishing.
Yongrui Zhong, Yunqing Ge, Jianbin Qin, Shuyuan Zheng, Bo Tang, Yu-Xuan Qiu, Rui Mao, Ye Yuan, Makoto Onizuka, and Chuan Xiao.
arXiv preprint.
[paper] [source code]
Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation.
Jiawei Wang, Renhe Jiang, Chuang Yang, Zengqing Wu, Makoto Onizuka, Ryosuke Shibasaki, Noboru Koshizuka, and Chuan Xiao.
arXiv preprint.
[paper] [source code]
Shall We Talk: Exploring Spontaneous Collaborations of Competing LLM Agents.
Zengqing Wu, Shuyuan Zheng, Qianying Liu, Xu Han, Brian Inhyuk Kwon, Makoto Onizuka, Shaojie Tang, Run Peng, and Chuan Xiao.
arXiv preprint.
Agentic Markets Workshop (AMW) 2024 (non-archival).
[paper] [source code]
Probabilistic Routing for Graph-Based Approximate Nearest Neighbor Search.
Kejing Lu, Chuan Xiao, and Yoshiharu Ishikawa.
International Conference on Machine Learning (ICML) 2024.
[paper] [source code]
Utilization of Information Entropy in Training and Evaluation of Students' Abstraction Performance and Algorithm Efficiency in Programming.
Zengqing Wu, Huizhong Liu, and Chuan Xiao.
IEEE Transactions on Education (ToE).
[paper]
2023
Jellyfish: A Large Language Model for Data Preprocessing.
Smart Agent-Based Modeling: On the Use of Large Language Models in Computer Simulations.
Zengqing Wu, Run Peng, Xu Han, Shuyuan Zheng, Yixin Zhang, and Chuan Xiao.
arXiv preprint.
[paper] [slides] [source code]
Large Language Models as Data Preprocessors.
Haochen Zhang, Yuyang Dong, Chuan Xiao, and Masafumi Oyamada.
International Workshop on Tabular Data Analysis (TaDA) 2024.
[paper]
BClean: A Bayesian Data Cleaning System.
Jianbin Qin, Sifan Huang, Yaoshu Wang, Jing Zhu, Yifan Zhang, Yukai Miao, Rui Mao, Makoto Onizuka, and Chuan Xiao.
IEEE International Conference on Data Engineering (ICDE) 2024.
[paper] [source code]
High-Ratio Compression for Machine-Generated Data.
Jiujing Zhang, Zhitao Shen, Shiyu Yang, Meng Lingkai, Chuan Xiao, Wei Jia, Yue Li, Qinhui Sun, Wenjie Zhang, and Xuemin Lin.
Proceedings of the ACM on Management of Data (PACMMOD).
[paper] [source code]
"Guinea Pig Trials" Utilizing GPT: A Novel Smart Agent-Based Modeling Approach for Studying Firm Competition and Collusion.
Xu Han, Zengqing Wu, and Chuan Xiao.
Conference on Information Systems and Technology (CIST) 2023 (non-archival).
[paper] [slides] [source code]
DeepJoin: Joinable Table Discovery with Pre-trained Language Models.
Yuyang Dong, Chuan Xiao, Takuma Nozawa, Masafumi Enomoto, and Masafumi Oyamada.
Proceedings of the VLDB Endowment (PVLDB).
[paper] [source code]
2022
MQH: Locality Sensitive Hashing on Multi-level Quantization Errors for Point-to-Hyperplane Distances.
Kejing Lu, Yoshiharu Ishikawa, and Chuan Xiao.
Proceedings of the VLDB Endowment (PVLDB).
[paper] [slides] [source code]
FedMe: Federated Learning via Model Exchange.
Koji Matsuda, Yuya Sasaki, Chuan Xiao, and Makoto Onizuka.
SIAM International Conference on Data Mining (SDM).
[paper] [source code]
2021
HVS: Hierarchical Graph Structure Based on Voronoi Diagrams for Solving Approximate Nearest Neighbor Search.
Kejing Lu, Mineichi Kudo, Chuan Xiao, and Yoshiharu Ishikawa.
Proceedings of the VLDB Endowment (PVLDB).
[paper] [slides] [source code]
High-Dimensional Similarity Query Processing for Data Science.
Jianbin Qin, Wei Wang, Chuan Xiao, Ying Zhang, and Yaoshu Wang.
ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD).
Tutorial.
[slides]
Consistent and Flexible Selectivity Estimation for High-Dimensional Data.
Yaoshu Wang, Chuan Xiao, Jianbin Qin, Rui Mao, Makoto Onizuka, Wei Wang, Rui Zhang, and Yoshiharu Ishikawa.
ACM SIGMOD International Conference on Management of Data (SIGMOD).
[paper] [extended version] [slides] [source code]
Efficient Joinable Table Discovery in Data Lakes: A High-Dimensional Similarity-Based Approach.
Yuyang Dong, Kunihiro Takeoka, Chuan Xiao, and Masafumi Oyamada.
IEEE International Conference on Data Engineering (ICDE).
[paper] [extended version] [source code]
HSGAN: Reducing Mode Collapse in GANs by the Latent Code Distance of Homogeneous Samples.
Simin Yu, Kuntian Zhang, Chuan Xiao, Joshua Zhexue Huang, Mark Junjie Li, and Makoto Onizuka.
Computer Vision and Image Understanding (CVIU).
[paper] [source code]
2020
Similarity Query Processing for High-Dimensional Data.
Jianbin Qin, Wei Wang, Chuan Xiao, and Ying Zhang.
Proceedings of the VLDB Endowment (PVLDB).
Tutorial.
[slides]
Fast Subtrajectory Similarity Search in Road Networks under Weighted Edit Distance Constraints.
Satoshi Koide, Chuan Xiao, and Yoshiharu Ishikawa.
Proceedings of the VLDB Endowment (PVLDB).
[paper] [extended version] [video]
Monotonic Cardinality Estimation of Similarity Selection: A Deep Learning Approach.
Yaoshu Wang, Chuan Xiao, Jianbin Qin, Xin Cao, Yifang Sun, Wei Wang, and Makoto Onizuka.
ACM SIGMOD International Conference on Management of Data (SIGMOD).
[paper] [extended version] [slides] [source code]
Continuous Top-k Spatial-Keyword Search on Dynamic Objects.
Yuyang Dong, Chuan Xiao, Hanxiong Chen, Jeffrey Xu Yu, Kunihiro Takeoka, Masafumi Oyamada, and Hiroyuki Kitagawa.
VLDB Journal (VLDBJ).
[paper]
2019
Autocompletion for Prefix-Abbreviated Input.
Dynamic Set kNN Self-Join.
Daichi Amagata, Takahiro Hara, and Chuan Xiao.
IEEE International Conference on Data Engineering (ICDE).
Indexing Trajectories for Travel-Time Histogram Retrieval.
Robert Waury, Christian S. Jensen, Satoshi Koide, Yoshiharu Ishikawa, and Chuan Xiao.
International Conference on Extending Database Technology (EDBT).
[paper]
Efficient Query Autocompletion with Edit Distance-Based Error Tolerance.
Jianbin Qin, Chuan Xiao, Sheng Hu, Jie Zhang, Wei Wang, Yoshiharu Ishikawa, Koji Tsuda, and Kunihiko Sadakane.
VLDB Journal (VLDBJ).
Extension of the VLDB 2013 paper "Efficient Error-Tolerant Query Autocompletion".
[paper]
Generalizing the Pigeonhole Principle for Similarity Search in Hamming Space.
Jianbin Qin, Chuan Xiao, Yaoshu Wang, Wei Wang, Xuemin Lin, Yoshiharu Ishikawa, and Guoren Wang.
IEEE Transactions on Knowledge and Data Engineering (TKDE).
Extension of the ICDE 2018 paper "GPH: Similarity Search in Hamming Space", invited as best of ICDE 2018 papers.
[paper]
Scope-aware Code Completion with Discriminative Modeling.
Sheng Hu, Chuan Xiao, and Yoshiharu Ishikawa.
IPSJ Journal of Information Processing (JIP).
[paper]
2018
Pigeonring: A Principle for Faster Thresholded Similarity Search.
Jianbin Qin and Chuan Xiao.
Proceedings of the VLDB Endowment (PVLDB).
[paper] [extended version] [slides]
GPH: Similarity Search in Hamming Space.
Jianbin Qin, Yaoshu Wang, Chuan Xiao, Wei Wang, Xuemin Lin, and Yoshiharu Ishikawa.
IEEE International Conference on Data Engineering (ICDE).
[paper] [extended version] [slides] [source code]
CiNCT: Compression and Retrieval for Massive Vehicular Trajectories via Relative Movement Labeling.
Satoshi Koide, Yukihiro Tadokoro, Chuan Xiao, and Yoshiharu Ishikawa.
IEEE International Conference on Data Engineering (ICDE).
[paper]
Enhanced Indexing and Querying of Trajectories in Road Networks via String Algorithms.
Satoshi Koide, Yukihiro Tadokoro, Takayoshi Yoshimura, Chuan Xiao, and Yoshiharu Ishikawa.
ACM Transactions on Spatial Algorithms and Systems (TSAS).
[paper]
Building Hierarchical Spatial Histograms for Exploratory Analysis in Array DBMS.
Jing Zhao, Yoshiharu Ishikawa, Lei Chen, Chuan Xiao, and Kento Sugiura.
IEICE Transactions on Information and Systems.
[paper]
2017
Set Similarity Query Processing.
Jianbin Qin and Chuan Xiao.
International Conference on Web Information Systems Engineering (WISE).
Tutorial.
[slides]
Efficient Structure Similarity Searches: A Partition-Based Approach.
Xiang Zhao, Chuan Xiao, Xuemin Lin, Wenjie Zhang, and Yang Wang.
VLDB Journal (VLDBJ).
Extension of the VLDB 2014 paper "A Partition-Based Approach to Structure Similarity Search".
[paper]
An Efficient Algorithm for Location-Aware Query Autocompletion.
Sheng Hu, Chuan Xiao, and Yoshiharu Ishikawa.
IEICE Transactions on Information and Systems.
Outstanding paper award winner.
[paper]
2016
Local Similarity Search for Unstructured Text.
2015
BEVA: An Efficient Query Processing Algorithm for Error Tolerant Autocompletion.
Xiaoling Zhou, Jianbin Qin, Chuan Xiao, Wei Wang, Xuemin Lin, and Yoshiharu Ishikawa.
ACM Transactions on Database Systems (TODS).
Invited as a poster to SIGMOD 2016.
[paper]
Frequent Subgraph Mining Based on Pregel.
Xiang Zhao, Yifan Chen, Chuan Xiao, and Yoshiharu Ishikawa.
The Computer Journal.
[paper]
2014
Improving Performance of Graph Similarity Joins Using Selected Substructures.
Xiang Zhao, Chuan Xiao, Wenjie Zhang, Xuemin Lin, and Jiuyang Tang.
International Conference on Database Systems for Advanced Applications (DASFAA).
[paper]
2013
A Partition-Based Approach to Structure Similarity Search.
Xiang Zhao, Chuan Xiao, Xuemin Lin, Qing Liu, and Wenjie Zhang.
Proceedings of the VLDB Endowment (PVLDB).
[paper] [extended version] [slides]
Efficient Error-Tolerant Query Autocompletion.
Chuan Xiao, Jianbin Qin, Wei Wang, Yoshiharu Ishikawa, Koji Tsuda, and Kunihiko Sadakane.
Proceedings of the VLDB Endowment (PVLDB).
[paper] [slides] [source code]
Asymmetric Signature Schemes for Efficient Exact Edit Similarity Query Processing.
Jianbin Qin, Wei Wang, Chuan Xiao, Yifei Lu, Xuemin Lin, and Haixun Wang.
ACM Transactions on Database Systems (TODS).
Extension of the SIGMOD 2011 paper "Efficient Exact Edit Similarity Query Processing with the Asymmetric Signature Scheme", invited as best of SIGMOD 2011 papers.
[paper] [source code]
Efficient Processing of Graph Similarity Queries with Edit Distance Constraints.
Xiang Zhao, Chuan Xiao, Xuemin Lin, Wei Wang, and Yoshiharu Ishikawa.
VLDB Journal (VLDBJ).
Extension of the ICDE 2012 paper "Efficient Graph Similarity Joins with Edit Distance Constraints".
[paper]
2012
Efficient Graph Similarity Joins with Edit Distance Constraints.
Efficient Subgraph Similarity All-Matching.
VChunkJoin: An Efficient Algorithm for Edit Similarity Joins.
Wei Wang, Jianbin Qin, Chuan Xiao, Xuemin Lin, and Heng Tao Shen.
IEEE Transactions on Knowledge and Data Engineering (TKDE).
[paper] [source code]
2011
Efficient Exact Edit Similarity Query Processing with the Asymmetric Signature Scheme.
Jianbin Qin, Wei Wang, Yifei Lu, Chuan Xiao, and Xuemin Lin.
ACM SIGMOD International Conference on Management of Data (SIGMOD).
Best paper award nominee.
[paper] [slides] [source code]
2010
Efficient Similarity Joins for Near Duplicate Detection.
Chuan Xiao, Wei Wang, Xuemin Lin, Jeffrey Xu Yu, and Guoren Wang.
ACM Transactions on Database Systems (TODS).
Extension of the WWW 2008 paper "Efficient Similarity Joins for Near Duplicate Detection".
2009
Approximate Entity Extraction with Edit Constraints.
Top-k Set Similarity Joins.
2008
Ed-Join: An Efficient Algorithm for Similarity Join with Edit Distance Constraints.
Chuan Xiao, Wei Wang, and Xuemin Lin.
Proceedings of the VLDB Endowment (PVLDB).
Efficient Similarity Joins for Near Duplicate Detection.
Chuan Xiao, Wei Wang, Xuemin Lin, and Jeffrey Xu Yu.
International World Wide Web Conference (WWW).