Random Sampling for Group-By Queries [PDF, arXiv version]
Trong Duc Nguyen*, Ming-Hung Shih*, Sai Sree Parvathaneni, Bojian Xu*, Divesh Srivastava, and Srikanta Tirthapura*
Proceedings of International Conference on Data Engineering (ICDE), pp.541-552, 2020.
Stratified Random Sampling from Streaming and Stored Data (♠) [PDF, arXiv version]
Trong Duc Nguyen, Ming-Hung Shih, Divesh Srivastava, Srikanta Tirthapura, and Bojian Xu
Proceedings of the 22nd International Conference on Extending Database Technology (EDBT), pp.25-36, 2019.
(Selected as one of the 3 best papers out of 157 full research conference paper submissions and invited for publication in the Distributed and Parallel Databases journal. Check out the journal version at Distributed and Parallel Databases, 2021.)
A Practical and Efficient Algorithm for the k-mismatch Shortest Unique Substring Finding Problem (♠) [PDF, code]
Daniel R. Allen, Sharma V. Thankachan, and Bojian Xu
Proceedings of the 9th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB), pp.428-437, 2018.
(Selected as one of the 18 best papers out of 148 full research conference paper submissions and invited for publication in IEEE/ACM Transactions on Computational Biology and Bioinformatics. Check out the journal version at IEEE/ACM TCBB.)
On k-mismatch Shortest Unique Substring Queries Using GPU (♠) [PDF]
Daniel W. Schultz and Bojian Xu
Proceedings of the 14th International Symposium on Bioinformatics Research and Applications (ISBRA), pp.193-204, 2018.
(Check out the journal version at IEEE/ACM Transactions on Computational Biology and Bioinformatics.)
An In-place Framework for Exact and Approximate Shortest Unique Substring Queries (♠) [PDF, slides]
Wing-Kai Hon, Sharma V. Thankachan, and Bojian Xu
Proceedings of the 26th International Symposium on Algorithms and Computation (ISAAC), pp.755-767, 2015.
(Check out the journal version at Theoretical Computer Science, 2017)
On Stabbing Queries for Generalized Longest Repeat [PDF, slides]
Bojian Xu
Proceedings of IEEE International Conference on Bioinformatics & Biomedicine (BIBM), pp.523-530, 2015.
(Selected as one of the best papers to be published in a special issue of International Journal of Data Mining and Bioinformatics.
Check out the journal version at International Journal of Data Mining and Bioinformatics, 2016.)
Note: The conference version of this paper has typos in Figures 2 and 4, where references [8] and [19] should be replaced by [12] and [14], respectively.
CloudTree: A Library to Extend Cloud Services for Trees [PDF, full version@arXiv]
Yun Tian*, Bojian Xu, Yanqing Ji, and Jesse Scholer
Proceedings of IEEE International Congress on Big Data (BigData Congress), pp.689-693, 2015.
On Longest Repeat Queries Using GPU (♠) [PDF, slides]
Yun Tian and Bojian Xu
Proceedings of International Conference on Database Systems for Advanced Applications (DASFAA), pp.316-333, 2015.
Boosting the Basic Counting on Distributed Streams [PDF]
Bojian Xu
Proceedings of International Conference on Scientific and Statistical Database Management (SSDBM), 12 pages, 2014.
(Check out the journal version at Theoretical Computer Science, 2015)
Shortest Unique Substring Query Revisited (♠) [PDF]
Atalay Mert Ileri*, M. Oguzhan Kulekci, and Bojian Xu*
Proceedings of Annual Symposium on Combinatorial Pattern Matching (CPM), pp.172-181, 2014.
(Check out the journal version at Theoretical Computer Science, 2015)
Wavelet Trees: from Theory to Practice (♠) [PDF]
Roberto Grossi, Jeffrey Scott Vitter, and Bojian Xu*
Proceedings of International Conference on Data Compression, Communication and Processing (CCP), pp.210-221, 2011.
PSI-RA: A Parallel Sparse Index for Read Alignment on Genomes
M. Oguzhan Kulekci*, Wing-Kai Hon, Rahul Shah, Jeffrey Scott Vitter, and Bojian Xu
Proceedings of IEEE International Conference on Bioinformatics & Biomedicine (BIBM), pp.663-668, 2010.
(Selected as one of the best papers to be published in a special issue of BMC Genomics. Check out the journal version at BMC Genomics, 2011.)
Time- and Space-efficient Maximal Repeat Finding Using the Burrows-Wheeler Transform and Wavelet Trees (♠)
M. Oguzhan Kulekci, Jeffrey Scott Vitter, and Bojian Xu*
Proceedings of IEEE International Conference on Bioinformatics & Biomedicine (BIBM), pp.622-625, 2010.
(Check out the journal version at IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2012)
Boosting Pattern Matching Performance via k-bit Filtering (♠)
M. Oguzhan Kulekci*, Jeffrey Scott Vitter, and Bojian Xu
Proceedings of the 25th International Symposium on Computer and Information Sciences (ISCIS), pp.27-32, 2010.
(Check out the journal version at Computer Journal, 2012)
Time-decayed Correlated Aggregates over Data Streams (♠) [slides]
Graham Cormode, Srikanta Tirthapura, and Bojian Xu
Proceedings of SIAM International Conference on Data Mining (SDM), pp.269-280, 2009.
(Selected as one of the 7 best papers out of 351 conference submissions and invited to a special issue of Statistical Analysis and Data Mining journal. Check out the journal version at Statistical Analysis and Data Mining, 2009)
Forward Decay: A Practical Time Decay Model for Streaming Systems (♠) [PDF, slides]
Graham Cormode, Vladislav Shkapenyuk, Divesh Srivastava, and Bojian Xu
Proceedings of International Conference on Data Engineering (ICDE), pp.138-149, 2009.
Time-Decaying Sketches for Sensor Data Aggregation (♠) [slides]
Graham Cormode, Srikanta Tirthapura, and Bojian Xu
Proceedings of ACM Symposium on Principles of Distributed Computing (PODC), pp.215-224, 2007.
(Check out the journal version at SIAM Journal on Computing, 2009)
Sketching Asynchronous Streams Over Sliding Windows [slides]
Srikanta Tirthapura*, Bojian Xu*, and Costas Busch
Proceedings of ACM Symposium on Principles of Distributed Computing (PODC), pp.82-91, 2006.
(Check out the journal version at Distributed Computing, 2008)
Stratified Random Sampling from Streaming and Stored Data (♠)
Trong Duc Nguyen, Ming-Hung Shih, Divesh Srivastava, Srikanta Tirthapura, and Bojian Xu
Distributed and Parallel Databases, vol. 39, pp. 665–710, 2021. (special issue for EDBT2019)
An Ultra-Fast and Parallelizable Algorithm for Finding k-Mismatch Shortest Unique Substrings (♠) [PDF]
Daniel R. Allen, Sharma V. Thankachan, and Bojian Xu
IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 18, no. 1, pp. 138-148, 2021. (special issue for ACM-BCB2018)
Parallel Methods for Finding k-Mismatch Shortest Unique Substrings Using GPU (♠) [PDF]
Daniel W. Schultz and Bojian Xu
IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 18, no. 1, pp. 386-395, 2021. (special issue for ISBRA2018)
In-place Algorithms for Exact and Approximate Shortest Unique Substring Problems (♠) [PDF]
Wing-Kai Hon, Sharma V. Thankachan, and Bojian Xu
Theoretical Computer Science, vol. 690, pp.12–25, 2017.
On Stabbing Queries for Generalized Longest Repeat [PDF]
Bojian Xu
International Journal of Data Mining and Bioinformatics, vol. 15, no. 4, pp. 350-371, 2016. (Special issue for the best BIBM2015 papers.)
Boosting Distinct Random Sampling for Basic Counting on the Union of Distributed Streams [PDF]
Bojian Xu
Theoretical Computer Science, vol. 602, pp. 60-79, 2015.
A Simple Yet Time-Optimal and Linear-Space Algorithm for Shortest Unique Substring Queries (♠) [PDF]
Atalay Mert Ileri*, M. Oguzhan Kulekci, and Bojian Xu*
Theoretical Computer Science, vol. 562, pp. 621-633, 2015.
Efficient Maximal Repeat Finding Using the Burrows-Wheeler Transform and Wavelet Tree (♠) [PDF]
M. Oguzhan Kulekci, Jeffrey Scott Vitter, and Bojian Xu*
IEEE/ACM Transactions on Computational Biology and Bioinformatics, 9(2), pp. 421-429, 2012.
Fast Pattern Matching via k-bit Filtering Based Text Decomposition (♠) [PDF]
M. Oguzhan Kulekci*, Jeffrey Scott Vitter, and Bojian Xu
The Computer Journal 55(1):62-68, 2012. (Special issue for the best ISCIS2010 papers)
PSI-RA: A Parallel Sparse Index for Genomic Read Alignment [PDF]
M. Oguzhan Kulekci*, Wing-Kai Hon, Rahul Shah, Jeffrey Scott Vitter, and Bojian Xu
BMC Genomics, 12(Suppl 2):S7, 2011. (Special issue for the best BIBM2010 papers)
Time-decayed Correlated Aggregates over Data Streams (♠) [PDF]
Graham Cormode, Srikanta Tirthapura, and Bojian Xu
Statistical Analysis and Data Mining, 2(5-6), pp. 294-310, 2009 (Special issue for the best SDM2009 papers)
Time-Decaying Sketches for Robust Aggregation of Sensor Data (♠) [PDF]
Graham Cormode, Srikanta Tirthapura, and Bojian Xu
SIAM Journal on Computing, 39(4), pp. 1309-1339, 2009.
Sketching Asynchronous Streams Over Sliding Windows [PDF]
Bojian Xu*, Srikanta Tirthapura*, and Costas Busch
Distributed Computing, 20(5), pp. 359-374, 2008