Resources
Software
Smart Agent-Based Modeling
number-guessing game
emergency evacuation
plea bargaining
firm pricing competition
Data Preprocessing
Jellyfish-7B / 13B: large language models for data preprocessing.
BClean: a Bayesian data cleaning system.
Similarity Query Processing
Please refer to my GitHub and the Efficient Similarity Query Processing Project (a Wayback Machine version) for the source code and binaries of our work.
MQH: Locality Sensitive Hashing on Multi-level Quantization Errors for Point-to-Hyperplane Distances.
HVS: Hierarchical Voronoi Structure for Approximate Nearest Neighbor Search.
HSP: Hamming Distance Search in PostgreSQL (ongoing work).
GPH: Exact Hamming Distance Search.
CardNet: Cardinality Estimation for Similarity Selection.
SelNet: Cardinality Estimation for High-Dimensional Range Query.
NNS Benchmark: Benchmark for Approximate Nearest Neighbor Search.
Data Sharing
Dejima: An Architecture for Transactional Peer Data Integration by Bidirectional Transformation (ongoing work).
Tutorials and Invited Talks
Smart Agent-Based Modeling: On the Use of Large Language Models in Computer Simulations (KJDM 2023).
Data Science for the Study of History: From Statistics to Machine Learning (KJDB 2022).
Querying High-Dimensional Vectors: Challenges, Techniques, and Software (NDBC 2022).
High-Dimensional Similarity Query Processing for Data Science (KDD 2021, a concise version of the VLDB 2020 tutorial).
Similarity Query Processing for High-Dimensional Data (VLDB 2020).
Set Similarity Query Processing (WISE 2017).