*spaced k-mers, gapped q-grams, gapped k-mers, gapped n-mers, ...*

Note that I also try to maintain a list of optimized spaced seed patterns for continuously varying percent identity; please contact me if you need specific values or specific models.

[1]

M. C. Frith, L. Noé, and G. Kucherov, “Minimally-overlapping words for sequence similarity search,” *Bioinformatics*, December 2020. [ DOI | http ]

[2]

A. Mallik and L. Ilie, “ALeS: Adaptive-length spaced-seed design,” *Bioinformatics*, November 2020. [ DOI | http ]

[3]

J. Chu, H. Mohamadi, E. Erhan, J. Tse, R. Chiu, S. Yeo, and I. Birol, “Mismatch-tolerant, alignment-free sequence classification using multiple spaced seeds and multiindex bloom filters,” *Proceedings of the National Academy of Sciences*, vol. 117, pp. 16961--16968, July 2020. [ DOI | http ]

[4]

G. J. Filion, R. Cortini, and E. Zorita, “Calibrating seed-based heuristics to map short reads with Sesame,” *Frontiers in Genetics*, vol. 11, p. 572, June 2020. [ DOI | http ]

[5]

T. Dencker, C.-A. Leimeister, M. Gerth, C. Bleidorn, S. Snir, and B. Morgenstern, “Multi-SpaM: a maximum-likelihood approach to phylogeny reconstruction using multiple spaced-word matches and quartet trees,” *NAR Genomics and Bioinformatics*, vol. 2, March 2020. [ DOI | http ]

[6]

S. Röhling, A. Linne, J. Schellhorn, M. Hosseini, T. Dencker, and B. Morgenstern, “The number of *k*-mer matches between two DNA sequences as a function of *k* and applications to estimate phylogenetic distances,” *PLOS ONE*, February 2020. [ DOI | http ]

[7]

L. Salmela, K. Mukherjee, S. J. Puglisi, M. D. Muggli, and C. Boucher, “Fast and accurate correction of optical mapping data via spaced seeds,” *Bioinformatics*, vol. 36, pp. 682--689, February 2020. [ DOI | http ]

[8]

E. Petrucci, L. Noé, C. Pizzi, and M. Comin, “Iterative spaced seed hashing: Closing the gap between spaced seed hashing and k-mer hashing,” *Journal of Computational Biology*, vol. 27, pp. 223--233, February 2020. [ DOI | http ]

[9]

C. R. De Pierri, R. Voyceik, L. G. C. Santos de Mattos, M. G. Kulik, J. O. Camargo, A. M. Repula de Oliveira, B. T. de Lima Nichio, J. N. Marchaukoski, A. C. da Silva Filho, D. Guizelini, J. M. Ortega, F. O. Pedrosa, and R. T. Raittz, “SWeeP: representing large biological sequences datasets in compact vectors,” *Nature Scientific Reports*, vol. 10, January 2020. [ DOI | http ]

[10]

A.-K. Lau, S. Dörrer, C.-A. Leimeister, C. Bleidorn, and B. Morgenstern, “Read-SpaM: assembly-free and alignment-free comparison of bacterial genomes with low sequencing coverage,” *BMC Bioinformatics*, vol. 20, December 2019. [ DOI | http ]

[11]

D. E. Wood, J. Lu, and B. Langmead, “Improved metagenomic analysis with Kraken 2,” *Genome Biology*, vol. 20, November 2019. [ DOI | http ]

[12]

C.-A. Leimeister, J. Schellhorn, S. Dörrer, M. Gerth, C. Bleidorn, and B. Morgenstern, “Prot-SpaM: Fast alignment-free phylogeny reconstruction based on whole-proteome sequences,” *GigaScience*, vol. 8, March 2019. [ DOI | http ]

[13]

S. Girotto, M. Comin, and C. Pizzi, “Efficient computation of spaced seed hashing with block indexing,” in *Proceedings from the 12th International BBCC conference*, vol. 19 suppl 15 of *BMC Bioinformatics*, November 2018. [ DOI | http ]

[14]

T. Dencker, C.-A. Leimeister, M. Gerth, C. Bleidorn, S. Snir, and B. Morgenstern, “Multi-SpaM: a Maximum-Likelihood approach to phylogeny reconstruction using multiple spaced-word matches and quartet trees,” in *Proceedings of the 16th RECOMB international conference on Comparative Genomics, Magog-Orford (Canada)*, vol. 11183 of *Lecture Notes in Computer Science*, pp. 227--241, Springer, October 2018. [ DOI | http ]

[15]

S. Girotto, M. Comin, and C. Pizzi, “FSH: fast spaced seed hashing exploiting adjacent hashes,” *Algorithms for Molecular Biology*, vol. 13, March 2018. (earlier version in WABI 2017). [ DOI | http ]

[16]

D. E. K. Martin, “Minimal auxiliary Markov chains through sequential elimination of states,” *Communications in Statistics - Simulation and Computation*, vol. 48, pp. 1040--1054, February 2018. [ DOI | http ]

[17]

L. Mallet, T. Bitard-Feildel, F. Cerutti, and H. Chiapello, “PhylOligo: a package to identify contaminant or untargeted organism sequences in genome assemblies,” *Bioinformatics*, vol. 33, pp. 3283--3285, October 2017. [ DOI | http ]

[18]

S. Girotto, M. Comin, and C. Pizzi, “Metagenomic reads binning with spaced seeds,” *Theoretical Computer Science*, vol. 698, pp. 88--99, October 2017. [ DOI | http ]

[19]

S. Girotto, M. Comin, and C. Pizzi, “Fast spaced seed hashing,” in *Proceedings of the 17th International Workshop on Algorithms in Bioinformatics (WABI), Boston (USA)* (R. Schwartz and K. Reinert, eds.), vol. 88 of *Leibniz International Proceedings in Informatics (LIPIcs)*, pp. 7:1--7:14, Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik, August 2017. [ DOI | http ]

[20]

S. Girotto, M. Comin, and C. Pizzi, “Binning metagenomic reads with probabilistic sequence signatures based on spaced seeds,” in *Proceedings of the 12th IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), Manchester (UK)*, August 2017. [ DOI | http ]

[21]

C.-A. Leimeister, S. Sohrabi-Jahromi, and B. Morgenstern, “Fast and accurate phylogeny reconstruction using filtered spaced-word matches,” *Bioinformatics*, vol. 33, pp. 971--979, April 2017. [ DOI | http ]

[22]

L. Noé, “Best hits of 11110110111: model-free selection and parameter-free sensitivity calculation of spaced seeds,” *Algorithms for Molecular Biology*, vol. 12, February 2017. [ DOI | http | .pdf ]

[23]

D. E. K. Martin and L. Noé, “Faster exact distributions of pattern statistics through sequential elimination of states,” *Annals of the Institute of Statistical Mathematics*, vol. 69, pp. 231--248, February 2017. [ DOI | http | .pdf ]

[24]

L. Hahn, C.-A. Leimeister, R. Ounit, S. Lonardi, and B. Morgenstern, “rasbhari: optimizing spaced seeds for database searching, read mapping and alignment-free sequence comparison,” *PLoS Computational Biology*, vol. 12, p. e1005107, October 2016. [ DOI | http ]

[25]

R. Ounit and S. Lonardi, “Higher classification sensitivity of short metagenomic reads with CLARK-S,” *Bioinformatics*, vol. 32, pp. 3823--3825, August 2016. [ DOI | http ]

[26]

H. Chen, A. D. Smith, and T. Chen, “WALT: fast and accurate read mapping for bisulfite sequencing,” *Bioinformatics*, vol. 32, pp. 3507--3509, July 2016. [ DOI | http ]

[27]

J. Healy, “FLAK: Ultra-fast Fuzzy Whole Genome Alignment,” in *Proceedings of the 10th International Conference on Practical Applications of Computational Biology & Bioinformatics (PACBB)*, vol. 477 of *Advances in Intelligent Systems and Computing*, pp. 123--131, Springer, June 2016. [ DOI | http ]

[28]

I. Sović, M. Šikić, A. Wilm, S. N. Fenlon, S. Chen, and N. Nagarajan, “Fast and sensitive mapping of nanopore sequencing reads with GraphMap,” *Nature Communications*, vol. 7, April 2016. [ DOI | http ]

[29]

R. Wang, Y. Xu, and B. Liu, “Recombination spot identification based on gapped k-mers,” *Nature Scientific Reports*, vol. 6, March 2016. RETRACTED: 20 March 2018. [ DOI | http ]

[30]

Y. Gheraibia, A. Moussaoui, Y. Djenouri, S. Kabir, P.-Y. Yin, and S. Mazouzi, “Penguin search optimisation algorithm for finding optimal spaced seeds,” *International Journal of Software Science and Computational Intelligence (IJSSCI)*, vol. 7, pp. 85--99, November 2015. [ DOI | http ]

[31]

P.-T. Do and C.-G. Tran-Thi, “An improvement of the overlap complexity in the spaced seed searching problem between genomic DNAs,” in *Proceedings of the 2nd National Foundation for Science and Technology Development Conference on Information and Computer Science (NICS), Ho Chi Minh City (Vietnam)*, pp. 271--276, IEEE Computer Society Press, September 2015. [ DOI | http ]

[32]

R. Ounit and S. Lonardi, “Higher classification accuracy of short metagenomic reads by discriminative spaced k-mers,” in *Proceedings of the 15th International Workshop on Algorithms in Bioinformatics (WABI), Atlanta (USA)*, vol. 9289 of *Lecture Notes in Bioinformatics*, pp. 286--295, Springer, August 2015. [ DOI | http ]

[33]

K. Břinda, M. Sykulski, and G. Kucherov, “Spaced seeds improve k-mer based metagenomic classification,” *Bioinformatics*, vol. 31, pp. 3584--3592, July 2015. [ DOI | http ]

[34]

I. Petrov, S. Brillet, E. Drezen, S. Quiniou, L. Antin, P. Durand, and D. Lavenier, “KLAST: fast and sensitive software to compare large genomic databanks on cloud,” in *Proceedings of the World Congress in Computer Science, Computer Engineering, and Applied Computing (WORLDCOMP), Las Vegas (USA)*, pp. 85--90, July 2015. [ .pdf ]

[35]

T. T. Tran, M. Giraud, and J.-S. Varré, “Perfect hashing structures for parallel similarity searches,” in *Proceedings of the 14th IEEE International Workshop on High Performance Computational Biology (HICOMB), Hyderabad, India*, pp. 332--341, May 2015. [ DOI | .pdf ]

[36]

L. Egidi and G. Manzini, “Multiple seeds sensitivity using a single seed with threshold,” *Journal of Bioinformatics and Computational Biology*, vol. 13, p. 1550011, March 2015. [ DOI | http ]

[37]

I. Birol, J. Chu, H. Mohamadi, S. D. Jackman, K. Raghavan, B. P. Vandervalk, A. Raymond, and R. L. Warren, “Spaced seed data structures for de novo assembly,” *International Journal of Genomics*, vol. 2015, p. ID 196591, March 2015. [ DOI | .pdf ]

[38]

B. Morgenstern, B. Zhu, S. Horwege, and C.-A. Leimeister, “Estimating evolutionary distances between genomic sequences from spaced-word matches,” *Algorithms for Molecular Biology*, vol. 10, February 2015. [ DOI | http | .pdf ]

[39]

L. Noé and D. E. K. Martin, “A coverage criterion for spaced seeds and its applications to support vector machine string kernels and k-mer distances,” *Journal of Computational Biology*, vol. 21, pp. 947--963, December 2014. [ DOI | http | http | http ]

[40]

B. Buchfink, C. Xie, and D. H. Huson, “Fast and sensitive protein alignment using DIAMOND,” *Nature Methods*, vol. 12, pp. 59--60, November 2014. [ DOI | .html ]

[41]

E. Giaquinta, K. Fredriksson, S. Grabowski, A. I. Tomescu, and E. Ukkonen, “Motif matching using gapped patterns,” *Theoretical Computer Science*, vol. 548, pp. 1--13, September 2014. [ DOI | http ]

[42]

M. Ghandi, M. Mohammad-Noori, and M. A. Beer, “Robust k-mer frequency estimation using gapped k-mers,” *Journal of Mathematical Biology*, vol. 69, pp. 469--500, August 2014. [ DOI | http | .pdf ]

[43]

M. Ghandi, D. Lee, M. Mohammad-Noori, and M. A. Beer, “Enhanced regulatory sequence prediction using gapped k-mer features,” *PLoS Computational Biology*, vol. 10, p. e1003711, July 2014. [ DOI | http ]

[44]

S. Horwege, S. Lindner, M. Boden, K. Hatje, M. Kollmar, C.-A. Leimeister, and B. Morgenstern, “Spaced words and kmacs: Fast alignment-free sequence comparison based on inexact word matches,” *Nucleic Acids Research*, vol. 42, pp. W7--W11, May 2014. [ DOI | http | .pdf ]

[45]

C.-A. Leimeister, M. Boden, S. Horwege, S. Lindner, and B. Morgenstern, “Fast alignment-free sequence comparison using spaced-word frequencies,” *Bioinformatics*, vol. 30, pp. 1991--1999, March-April 2014. [ DOI | http | .pdf ]

[46]

K. Břinda, “Languages of lossless seeds,” in *Proceedings of the 14th International Conference on Automata and Formal Languages (AFL), Szeged, Hungary* (Z. Ésik and Z. Fülöp, eds.), vol. 151 of *Electronic Proceedings in Theoretical Computer Science*, pp. 139--150, 2014. [ DOI | http | .pdf ]

[47]

J. Healy and D. Chambers, “Approximate k-mer matching using fuzzy hash maps,” *IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)*, vol. 11, pp. 258--264, March 2014. [ DOI | http ]

[48]

T. Gagie, G. Manzini, and D. Valenzuela, “Compressed spaced suffix arrays,” in *Proceedings of the 2nd International Conference on Algorithms for Big Data (ICABD), Palermo (Italy)*, vol. 1146 of *CEUR-WS*, pp. 37--45, 2014. [ .pdf ]

[49]

W. Li, B. Ma, and K. Zhang, “Optimizing spaced k-mer neighbors for efficient filtration in protein similarity search,” *IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)*, vol. 11, pp. 398--406, February 2014. [ DOI | http ]

[50]

L. Egidi and G. Manzini, “Spaced seeds design using perfect rulers,” *Fundamenta Informaticae*, vol. 131, pp. 187--203, March 2014. (earlier version in SPIRE 2011). [ DOI | http ]

[51]

M. C. Frith and L. Noé, “Improved search heuristics find 20 000 new alignments between human and mouse genomes,” *Nucleic Acids Research*, vol. 42, p. e59, February 2014. [ DOI | http | .pdf ]

[52]

L. Egidi and G. Manzini, “Design and analysis of periodic multiple seeds,” *Theoretical Computer Science*, vol. 522, pp. 62--76, February 2014. [ DOI | http ]

[53]

A. M. S. Shrestha, M. C. Frith, and P. Horton, “A bioinformatician's guide to the forefront of suffix array construction algorithms,” *Briefings in Bioinformatics*, vol. 15, pp. 138--154, January 2014. [ DOI | http | .pdf ]

[54]

M. Boden, M. Schöneich, S. Horwege, S. Lindner, C. Leimeister, and B. Morgenstern, “Alignment-free sequence comparison with spaced *k*-mers,” in *Proceedings of the German Conference on Bioinformatics (GCB)*, vol. 34 of *OpenAccess Series in Informatics (OASIcs)*, pp. 24--34, September 2013. [ DOI | .pdf ]

[55]

T. Onodera and T. Shibuya, “The gapped spectrum kernel for support vector machines,” in *Proceedings of the International Conference on Machine Learning and Data Mining in Pattern Recognition (MLDM)*, vol. 7988 of *Lecture Notes in Computer Science*, pp. 1--15, Springer, April 2013. [ DOI | http | .pdf ]

[56]

L. Egidi and G. Manzini, “Better spaced seeds using quadratic residues,” *Journal of Computer and System Sciences*, vol. 79, pp. 1144--1155, November 2013. [ DOI | http ]

[57]

L. Ilie, H. Mohamadi, G. Brian Golding, and W. F. Smyth, “BOND: Basic OligoNucleotide Design,” *BMC Bioinformatics*, vol. 14, February 2013. [ DOI | http | .pdf ]

[58]

M. Hou, L. Zhang, and R. S. Harris, “Alignment seeding strategies using contiguous pyrimidine purine matches,” in *Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine (BCB), Orlando (USA)*, pp. 384--391, October 2012. [ DOI | http ]

[59]

W. Li, B. Ma, and K. Zhang, “Efficient filtration for similarity search with spaced k-mer neighbors,” in *Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Philadelphia (USA)*, pp. 11--16, IEEE Computer Society Press, October 2012. [ DOI | http ]

[60]

T. Marschall, I. Herms, H.-M. Kaltenbach, and S. Rahmann, “Probabilistic arithmetic automata and their applications,” *IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)*, vol. 9, pp. 1737--1750, December 2012. [ DOI | http ]

[61]

D. Do Duc, H. Q. Dinh, T. H. Dang, K. Laukens, and X. H. Hoang, “AcoSeeD: An ant colony optimization for finding optimal spaced seeds in biological sequence search,” in *Proceedings of the 8th International Conference on Swarm Intelligence (ANTS), Brussels (Belgium)*, vol. 7461 of *Lecture Notes in Computer Science*, pp. 204--211, Springer, September 2012. [ DOI | http | .pdf ]

[62]

S. Ilie, “Efficient computation of spaced seeds,” *BMC Research Notes*, vol. 5, February 2012. [ DOI | http | .pdf ]

[63]

M. Startek, S. Lasota, M. Sykulski, A. Bulak, L. Noé, G. Kucherov, and A. Gambin, “Efficient alternatives to PSI-BLAST,” *Bulletin of the Polish Academy of Sciences: Technical Sciences*, vol. 60, pp. 495--505, December 2012. [ DOI | http | .pdf ]

[64]

M. Pellegrini, M. E. Renda, and A. Vecchio, “Ab initio detection of fuzzy amino acid tandem repeats in protein sequences,” *BMC Bioinformatics*, vol. 13, p. S8, March 2012. [ DOI | http ]

[65]

M. David, M. Dzamba, D. Lister, L. Ilie, and M. Brudno, “SHRiMP2: Sensitive yet practical short read mapping,” *Bioinformatics*, vol. 27, pp. 1011--1012, April 2011. [ DOI | http ]

[66]

E. Bao, T. Jiang, I. Kaloshian, and T. Girke, “SEED: efficient clustering of next-generation sequences,” *Bioinformatics*, vol. 27, pp. 2502--2509, August 2011. [ DOI | http | .pdf ]

[67]

L. Egidi and G. Manzini, “Spaced seeds design using perfect rulers,” in *Proceedings of the 18th International Symposium on String Processing and Information Retrieval (SPIRE), Pisa (Italy)*, vol. 7024 of *Lecture Notes in Computer Science*, pp. 32--43, Springer, October 2011. [ DOI | http | .pdf ]

[68]

K. Chen, K. She, and Q. Zhu, “Overlap digraph: An effective model for finding good spaced seeds for biological sequence local alignment,” *Chinese Science Bulletin*, vol. 56, pp. 1100--1107, April 2011. [ DOI | http | .pdf ]

[69]

L. Ilie, S. Ilie, and A. Mansouri Bigvand, “SpEED: fast computation of sensitive spaced seeds,” *Bioinformatics*, vol. 27, pp. 2433--2434, September 2011. [ DOI | http | .pdf ]

[70]

L. Ilie, S. Ilie, S. Khoshraftar, and A. Mansouri Bigvand, “Seeds for effective oligonucleotide design,” *BMC Genomics*, vol. 12, p. 280, June 2011. [ DOI | http | .pdf ]

[71]

A. Gambin, S. Lasota, M. Startek, M. Sykulski, L. Noé, and G. Kucherov, “Subset seed extension to Protein BLAST,” in *Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms (BIOINFORMATICS 2011), January 26-29 2011, Rome (Italy)*, pp. 149--158, SciTePress Digital Library, January 2011. [ DOI | http ]

[72]

S. M. Kielbasa, R. Wan, K. Sato, P. Horton, and M. C. Frith, “Adaptive seeds tame genomic sequence comparison,” *Genome Research*, vol. 21, pp. 487--493, March 2011. [ DOI | http | .pdf ]

[73]

M. Crochemore and G. Tischler, “The gapped suffix array: A new index structure for fast approximate matching,” in *Proceedings of the 17th International Symposium on String Processing and Information Retrieval (SPIRE), Los Cabos (Mexico)* (E. Chavez and S. Lonardi, eds.), vol. 6393 of *Lecture Notes in Computer Science*, pp. 359--364, Springer, October 2010. [ DOI | http | .pdf ]

[74]

E. Giladi, J. Healy, G. Myers, C. Hart, P. Kapranov, D. Lipson, S. Roels, E. Thayer, and S. Letovsky, “Error tolerant indexing and alignment of short reads with covering template families,” *Journal of Computational Biology*, vol. 17, pp. 1397--1411, October 2010. [ DOI | http | http ]

[75]

L. Noé, M. Gîrdea, and G. Kucherov, “Designing efficient spaced seeds for SOLiD read mapping,” *Advances in Bioinformatics*, vol. 2010, p. ID 708501, July 2010. [ DOI | http | .pdf ]

[76]

L. Zhou, I. Mihai, and L. Florea, “Spaced seeds for cross-species cDNA-to-genome sequence alignment,” *Communications in Information and Systems*, vol. 10, no. 2, pp. 115--136, 2010. [ http | .pdf ]

[77]

L. Noé, M. Gîrdea, and G. Kucherov, “Seed design framework for mapping SOLiD reads,” in *Proceedings of the 14th Annual International Conference on Research in Computational Molecular Biology (RECOMB), April 25-28, 2010, Lisbon (Portugal)* (B. Berger, ed.), vol. 6044 of *Lecture Notes in Computer Science*, pp. 384--396, Springer, April 2010. [ DOI | http | http | http ]

[78]

W.-H. Chung and S.-B. Park, “Hit integration for identifying optimal spaced seeds,” *BMC Bioinformatics - Selected articles from the 8th Asia-Pacific Bioinformatics Conference (APBC), 18-21 January, Bangalore, India*, vol. 11, p. S37, January 2010. [ DOI | http | .pdf ]

[79]

G. Battaglia, D. Cangelosi, R. Grossi, and N. Pisanti, “Masking patterns in sequences: A new class of motif discovery with don't cares,” *Theoretical Computer Science*, vol. 410, pp. 4327--4340, October 2009. [ DOI | http ]

[80]

W.-H. Chung and S.-B. Park, “An empirical study of choosing efficient discriminative seeds for oligonucleotide design,” *BMC Genomics*, vol. 10, p. S3, December 2009. [ DOI | http | .pdf ]

[81]

V.-H. Nguyen and D. Lavenier, “PLAST: parallel local alignment search tool for database comparison,” *BMC Bioinformatics*, vol. 10, p. 329, October 2009. [ DOI | http | .pdf ]

[82]

Y. Chen, T. Souaiaia, and T. Chen, “PerM: efficient mapping of short sequencing reads with periodic full sensitive spaced seeds,” *Bioinformatics*, vol. 25, pp. 2514--2521, October 2009. [ DOI | http | .pdf ]

[83]

K. Chen, Q. Zhu, F. Yang, and D. Tang, “An efficient way of finding good indel seeds for local homology search,” *Chinese Science Bulletin*, vol. 54, pp. 3837--3842, November 2009. [ DOI | http | .pdf ]

[84]

B. Ma and H. Yao, “Seed optimization for i.i.d. similarities is no easier than optimal Golomb ruler design,” *Information Processing Letters*, vol. 109, pp. 1120--1124, September 2009. (earlier version in APBC 2008). [ DOI | http ]

[85]

S. M. Rumble, P. Lacroute, A. V. Dalca, M. Fiume, A. Sidow, and M. Brudno, “SHRiMP: Accurate mapping of short color-space reads,” *PLoS Computational Biology*, vol. 5, p. e1000386, May 2009. [ DOI | http ]

[86]

L. Zhou, M. Pertea, A. L. Delcher, and L. Florea, “Sim4cc: A cross-species spliced alignment program,” *Nucleic Acids Research*, vol. 37, p. e80, May 2009. [ DOI ]

[87]

W. Li, B. Ma, and K. Zhang, “Amino acid classification and hash seeds for homology search,” in *Proceedings of the 1st International Conference in Bioinformatics and Computational Biology, BICoB 2009, New Orleans LA (USA)*, vol. 5462 of *Lecture Notes in Computer Science*, pp. 44--51, Springer, April 2009. [ DOI | http | .pdf ]

[88]

L. Ilie and S. Ilie, “Fast computation of neighbor seeds,” *Bioinformatics*, vol. 25, pp. 822--823, March 2009. [ DOI | http | .pdf ]

[89]

D. Y. Mak and G. Benson, “All hits all the time: parameter free calculation of seed sensitivity,” *Bioinformatics*, vol. 25, pp. 302--308, February 2009. (earlier version in APBC 2007). [ DOI | http | .pdf ]

[90]

M. A. Roytberg, A. Gambin, L. Noé, S. Lasota, E. Furletova, E. Szczurek, and G. Kucherov, “On subset seeds for protein alignment,” *IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)*, vol. 6, pp. 483--494, July 2009. [ DOI | http | http | http ]

[91]

K.-M. Chao and L. Zhang, *Sequence Comparison: Theory and Methods*, vol. 7 of *Computational Biology*. Springer, 2008. [ DOI | http ]

[92]

D. Lavenier, “Ordered index seed algorithm for intensive DNA sequence comparison,” in *IEEE International Symposium on Parallel and Distributed Processing (IPDPS)*, pp. 1--8, April 2008. [ DOI | http | .pdf ]

[93]

G. Benson and D. Y. Mak, “Exact distribution of a spaced seed statistic for DNA homology detection,” in *Proceedings of the 15th International Symposium on String Processing and Information Retrieval (SPIRE), Melbourne (Australia)* (A. Amir, A. Turpin, and A. Moffat, eds.), vol. 5280 of *Lecture Notes in Computer Science*, pp. 282--293, Springer, November 2008. [ DOI | http | .pdf ]

[94]

V.-H. Nguyen and D. Lavenier, “Speeding up subset seed algorithm for intensive protein sequence comparison,” in *Proceedings of the 6th IEEE International Conference on research, innovation & vision for the future*, pp. 57--63, July 2008. [ DOI | http ]

[95]

J. Yang and L. Zhang, “Run probabilities of seed-like patterns and identifying good transition seeds,” *Journal of Computational Biology*, vol. 15, pp. 1295--1313, December 2008. (earlier version in APBC 2008). [ DOI | http | http ]

[96]

F. Nicolas and É. Rivals, “Hardness of optimal spaced seed design,” *Journal of Computer and System Sciences*, vol. 74, pp. 831--849, August 2008. (earlier version in CPM 2005). [ DOI | http | .pdf ]

[97]

I. Herms and S. Rahmann, “Computing alignment seed sensitivity with probabilistic arithmetic automata,” in *Proceedings of the 8th International Workshop on Algorithms in Bioinformatics (WABI), Karlsruhe (Germany)*, vol. 5251 of *Lecture Notes in Bioinformatics*, pp. 318--329, Springer, September 2008. [ DOI | http | .pdf ]

[98]

H. Lin, Z. Zhang, M. Q. Zhang, B. Ma, and M. Li, “ZOOM! Zillions Of Oligos Mapped,” *Bioinformatics*, vol. 24, pp. 2431--2437, November 2008. [ DOI | http | .pdf ]

[99]

M. A. Roytberg, A. Gambin, L. Noé, S. Lasota, E. Furletova, E. Szczurek, and G. Kucherov, “Efficient seeding techniques for protein similarity search,” in *Bioinformatics Research and Development, Proceedings of the 2nd International Conference BIRD 2008, Vienna (Austria), July 7-9, 2008* (M. Elloumi, J. Küng, M. Linial, R. Murphy, K. Schneider, and C. Toma, eds.), vol. 13 of *Communications in Computer and Information Science*, pp. 466--478, Springer, July 2008. [ DOI | http | http | http ]

[100]

D. G. Brown, *Bioinformatics Algorithms: Techniques and Applications*, ch. A survey of seeding for sequence alignment, pp. 126--152. Wiley-Interscience (I. Mandoiu, A. Zelikovsky), February 2008. [ DOI ]

[101]

J. Yang and L. Zhang, “Run probability of high-order seed patterns and its applications to finding good transition seeds,” in *Proceedings of the 6th Asia Pacific Bioinformatics Conference (APBC), 14-17 January 2008, Kyoto, Japan* (A. Brazma, S. Miyano, and T. Akutsu, eds.), vol. 6 of *Advances in Bioinformatics and Computational Biology*, pp. 123--132, Imperial College Press, January 2008. [ DOI | http | .pdf ]

[102]

B. Ma and H. Yao, “Seed optimization is no easier than optimal Golomb ruler design,” in *Proceedings of the 6th Asia Pacific Bioinformatics Conference (APBC), 14-17 January 2008, Kyoto, Japan* (A. Brazma, S. Miyano, and T. Akutsu, eds.), vol. 6 of *Advances in Bioinformatics and Computational Biology*, pp. 133--144, Imperial College Press, January 2008. [ DOI | http | .pdf ]

[103]

L. Zhou, J. Stanton, and L. Florea, “Universal seeds for cDNA-to-genome comparison,” *BMC Bioinformatics*, vol. 9, p. 36, January 2008. [ DOI | http | .pdf ]

[104]

Z. Zhang, H. Lin, and M. Li, “Mango: multiple alignment with N gapped oligos,” *Journal of Bioinformatics and Computational Biology*, vol. 6, pp. 521--541, June 2008. [ DOI | .html | .pdf ]

[105]

R. S. Harris, *Improved pairwise alignment of genomic DNA*. Ph.D. thesis, The Pennsylvania State University, December 2007. [ bib ]

[106]

Z. Zhang, H. Lin, and M. Li, “Mango: A new approach to multiple sequence alignment,” in *Proceedings of the 6th International Conference on Computational Systems Bioinformatics (CSB), San Diego (USA)*, vol. 6, pp. 237--247, August 2007. [ .html | .pdf ]

[107]

G. Kucherov, L. Noé, and M. A. Roytberg, “Subset seed automaton,” in *Proceedings of the 12th International Conference on Implementation and Application of Automata (CIAA), July 16-18, 2007, Prague (Czech Republic)* (J. Holub and J. Zdarek, eds.), vol. 4783 of *Lecture Notes in Computer Science*, pp. 180--191, Springer, July 2007. [ DOI | http | http | http ]

[108]

P. Peterlongo, L. Noé, D. Lavenier, G. Georges, J. Jacques, G. Kucherov, and M. Giraud, “Protein similarity search with subset seeds on a dedicated reconfigurable hardware,” in *Proceedings of the 2nd Workshop on Parallel Bio-Computing (PBC), September 9-12, 2007 Gdansk (Poland)* (R. Wyrzykowski, J. Dongarra, K. Karczewski, and J. Wasniewski, eds.), vol. 4967 of *Lecture Notes in Computer Science*, pp. 1240--1248, Springer, September 2008. [ DOI | http | .pdf ]

[109]

J.-E. Duchesne, M. Giraud, and N. El-Mabrouk, “Seed-based exclusion method for non-coding RNA gene search,” in *Proceedings of the 13th International Computing and Combinatorics Conference (COCOON)*, vol. 4598 of *Lecture Notes in Computer Science*, pp. 27--39, Springer, July 2007. [ DOI | http | .pdf ]

[110]

L. Ilie and S. Ilie, “Long spaced seeds for finding similarities between biological sequences,” in *Proceedings of the 2nd International Conference on Bioinformatics & Computational Biology (BIOCOMP)*, pp. 3--8, 2007. [ .pdf ]

[111]

L. Ilie and S. Ilie, “Multiple spaced seeds for homology search,” *Bioinformatics*, vol. 23, pp. 2969--2977, September 2007. [ DOI | http | .pdf ]

[112]

L. Ilie and S. Ilie, “Fast computation of good multiple spaced seeds,” in *Proceedings of the 7th International Workshop on Algorithms in Bioinformatics (WABI), Philadelphia (USA)*, vol. 4645 of *Lecture Notes in Bioinformatics*, pp. 346--358, Springer, September 2007. [ DOI | http | .pdf ]

[113]

L. Zhang, “Superiority of spaced seeds for homology search,” *IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)*, vol. 4, pp. 496--505, July 2007. [ DOI | http ]

[114]

X. Gao, S. C. Li, and Y. Lu, “New algorithms for the spaced seeds,” in *Frontiers of Algorithmic Workshop 2007 (FAW2007)*, vol. 4613 of *Lecture Notes in Computer Science*, pp. 51--61, Springer, August 2007. [ DOI | http | .pdf ]

[115]

S. Feng and E. R. Tillier, “A fast and flexible approach to oligonucleotide probe design for genomes and gene families,” *Bioinformatics*, vol. 23, pp. 1195--1202, May 2007. [ DOI | http | .pdf ]

[116]

Y. Kong, “Generalized correlation functions and their applications in selection of optimal multiple spaced seeds for homology search,” *Journal of Computational Biology*, vol. 14, pp. 238--254, March 2007. [ DOI | http | http ]

[117]

B. Ma and M. Li, “On the complexity of spaced seeds,” *Journal of Computer and System Sciences*, vol. 73, pp. 1024--1034, March 2007. [ DOI | http ]

[118]

M. Farach-Colton, G. M. Landau, S. Cenk Sahinalp, and D. Tsur, “Optimal spaced seeds for faster approximate string matching,” *Journal of Computer and System Sciences*, vol. 73, pp. 1035--1044, November 2007. [ DOI | http ]

[119]

L. Zhou and L. Florea, “Designing sensitive and specific spaced seeds for cross-species mRNA-to-genome alignment,” *Journal of Computational Biology*, vol. 14, pp. 113--130, March 2007. [ DOI | http | http ]

[120]

D. Y. Mak and G. Benson, “All hits all the time: parameter free calculation of seed sensitivity,” in *Proceedings of the 5th Asia Pacific Bioinformatics Conference (APBC)* (D. Sankoff, L. Wang, and F. Chin, eds.), vol. 5 of *Advances in Bioinformatics and Computational Biology*, pp. 327--340, Imperial College Press, January 2007. [ DOI | http | .pdf ]

[121]

M. Csűrös and B. Ma, “Rapid homology search with neighbor seeds,” *Algorithmica*, vol. 48, pp. 187--202, June 2007. (earlier version in COCOON 2005). [ DOI | http | .pdf ]

[122]

J. Xu, D. G. Brown, M. Li, and B. Ma, “Optimizing multiple spaced seeds for homology search,” *Journal of Computational Biology*, vol. 13, pp. 1355--1368, September 2006. (earlier version in CPM 2004). [ DOI | http | http ]

[123]

D. Y. Mak, Y. Gelfand, and G. Benson, “Indel seeds for homology search,” *Bioinformatics*, vol. 22, no. 14, pp. e341--e349, 2006. [ DOI | http | .pdf ]

[124]

A. E. Darling, T. J. Treangen, L. Zhang, C. Kuiken, X. Messeguer, and N. T. Perna, “Procrastination leads to efficient filtration for local multiple alignment,” in *Proceedings of the 6th International Workshop on Algorithms in Bioinformatics (WABI), Zürich (Switzerland)*, vol. 4175 of *Lecture Notes in Bioinformatics*, pp. 126--137, Springer, September 2006. [ DOI | http | .pdf ]

[125]

Y. Sun and J. Buhler, “Choosing the best heuristic for seeded alignment of DNA sequences,” *BMC Bioinformatics*, vol. 7, p. 133, March 2006. [ DOI | http | .pdf ]

[126]

M. Li, B. Ma, and L. Zhang, “Superiority and complexity of the spaced seeds,” in *Proceedings of the 17th Symposium on Discrete Algorithms (SODA)*, pp. 444--453, ACM Press, January 2006. [ DOI | http | .pdf ]

[127]

G. Kucherov, L. Noé, and M. A. Roytberg, “A unifying framework for seed sensitivity and its application to subset seeds,” *Journal of Bioinformatics and Computational Biology*, vol. 4, pp. 553--569, November 2006. [ DOI | .html | http | http ]

[128]

A. Pol and T. Kahveci, “Highly scalable and accurate seeds for subsequence alignments,” in *Proceedings of the IEEE 5th Symposium on Bioinformatics and Bioengineering (BIBE), Minneapolis (USA)*, pp. 27--31, IEEE Computer Society Press, October 2005. [ DOI | http ]

[129]

K. P. Choi and L. Zhang, “Analysis of spaced seed technique in sequence alignment,” *COSMOS*, vol. 1, pp. 57--73, May 2005. [ DOI | .html | .pdf ]

[130]

M. Fontaine, S. Burkhardt, and J. Kärkkäinen, “BDD-based analysis of gapped *q*-gram filters,” *International Journal of Foundations of Computer Science*, vol. 16, pp. 1121--1134, December 2005. (earlier version in PSC 2004). [ DOI | .ps.gz ]

[131]

M. Csűrös and B. Ma, “Rapid homology search with two-stage extension and daughter seeds,” in *Proceedings of the 11th International Computing and Combinatorics Conference (COCOON)*, vol. 3595 of *Lecture Notes in Computer Science*, pp. 104--114, Springer, August 2005. [ DOI | http | .pdf ]

[132]

F. P. Preparata, L. Zhang, and K. P. Choi, “Quick, practical selection of effective seeds for homology search,” *Journal of Computational Biology*, vol. 12, pp. 1137--1152, November 2005. [ DOI | http | http ]

[133]

J. Buhler, U. Keich, and Y. Sun, “Designing seeds for similarity search in genomic DNA,” *Journal of Computer and System Sciences*, vol. 70, no. 3, pp. 342--363, 2005. (earlier version in RECOMB 2003). [ DOI | http | .pdf ]

[134]

B. Brejová, D. G. Brown, and T. Vinař, “Vector seeds: An extension to spaced seeds,” *Journal of Computer and System Sciences*, vol. 70, no. 3, pp. 364--380, 2005. (earlier version in WABI 2003). [ DOI | http ]

[135]

Y. Sun and J. Buhler, “Designing multiple simultaneous seeds for DNA similarity search,” *Journal of Computational Biology*, vol. 12, no. 6, pp. 847--861, 2005. (earlier version in RECOMB 2004). [ DOI | http | http ]

[136]

M. Farach-Colton, G. M. Landau, S. Cenk Sahinalp, and D. Tsur, “Optimal spaced seeds for faster approximate string matching,” in *Proceedings of the 32nd International Colloquium on Automata, Languages and Programming (ICALP'05), Lisboa (Portugal)*, vol. 3580 of *Lecture Notes in Computer Science*, pp. 1251--1262, Springer, 2005. [ DOI | http | .pdf ]

[137]

F. Nicolas and É. Rivals, “Hardness of optimal spaced seed design,” in *Proceedings of the 16th Annual Symposium on Combinatorial Pattern Matching (CPM), Jeju Island (Korea)* (A. Apostolico, M. Crochemore, and K. Park, eds.), vol. 3537 of *Lecture Notes in Computer Science*, pp. 144--155, Springer, 2005. [ DOI | http | .pdf ]

[138]

D. G. Brown, “Optimizing multiple seeds for protein homology search,” *IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)*, vol. 2, pp. 29--38, January 2005. (earlier version in WABI 2004). [ DOI | http ]

[139]

D. Kisman, M. Li, B. Ma, and W. Li, “tPatternhunter: gapped, fast and sensitive translated homology search,” *Bioinformatics*, vol. 21, pp. 542--544, February 2005. [ DOI | http | .pdf ]

[140]

B. Brejová, *Evidence Combination in Hidden Markov Models for Gene Prediction*. PhD thesis, University of Waterloo, 2005. [ http | .pdf ]

[141]

L. Noé and G. Kucherov, “YASS: enhancing the sensitivity of DNA similarity search,” *Nucleic Acids Research*, vol. 33 (web-server issue), pp. W540--W543, April 2005. [ DOI | http | .pdf ]

[142]

G. Kucherov, L. Noé, and M. A. Roytberg, “A unifying framework for seed sensitivity and its application to subset seeds (extended abstract),” in *Proceedings of the 5th International Workshop on Algorithms in Bioinformatics (WABI), October 3-6, 2005, Mallorca (Spain)* (R. Casadio and G. Myers, eds.), vol. 3692 of *Lecture Notes in Computer Science*, pp. 251--263, Springer, October 2005. [ DOI | http | http | http ]

[143]

G. Kucherov, L. Noé, and M. A. Roytberg, “Multiseed lossless filtration,” *IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)*, vol. 2, pp. 51--61, January 2005. [ DOI | http | http | http ]

[144]

L. Noé, *Recherche de similarités dans les séquences d'ADN: modèles et algorithmes pour la conception de graines efficaces*. PhD thesis, Université Henri Poincaré - Nancy, September 2005. [ http | http ]

[145]

J. Flannick and S. Batzoglou, “Using multiple alignments to improve seeded local alignment algorithms,” *Nucleic Acids Research*, vol. 33, pp. 4563--4577, August 2005. [ DOI | http ]

[146]

P. Peterlongo, N. Pisanti, F. Boyer, and M.-F. Sagot, “Lossless filter for finding long multiple approximate repetitions using a new data structure, the bi-factor array,” in *Proceedings of the 12th International Conference on String Processing and Information Retrieval (SPIRE), Buenos Aires (Argentina)* (M. Consens and G. Navarro, eds.), vol. 3772 of *Lecture Notes in Computer Science*, pp. 179--190, November 2005. [ DOI | http | .pdf ]

[147]

D. G. Brown, M. Li, and B. Ma, “A tutorial of recent developments in the seeding of local alignment,” *Journal of Bioinformatics and Computational Biology*, vol. 2, no. 4, pp. 819--842, 2004. [ DOI | .html | .pdf ]

[148]

M. Fontaine, S. Burkhardt, and J. Kärkkäinen, “BDD-based analysis of gapped *q*-gram filters,” in *Proceedings of the 9th Prague Stringology Conference (PSC)*, pp. 56--68, 2004. [ .html | .pdf ]

[149]

D. G. Brown, “Multiple vector seeds for protein alignment,” in *Proceedings of the 4th International Workshop on Algorithms in Bioinformatics (WABI), Bergen (Norway)* (I. Jonassen and J. Kim, eds.), vol. 3240 of *Lecture Notes in Bioinformatics*, pp. 170--181, Springer, September 2004. [ DOI | http | .pdf ]

[150]

D. G. Brown and A. K. Hudek, “New algorithms for multiple DNA sequence alignment,” in *Proceedings of the 4th International Workshop on Algorithms in Bioinformatics (WABI), Bergen (Norway)* (I. Jonassen and J. Kim, eds.), vol. 3240 of *Lecture Notes in Bioinformatics*, pp. 314--325, Springer, September 2004. [ DOI | http | .pdf ]

[151]

X. Huang, L. Ye, H.-H. Chou, I.-H. Yang, and K.-M. Chao, “Efficient combination of multiple word models for improved sequence comparison,” *Bioinformatics*, vol. 20, no. 16, pp. 2529--2533, 2004. [ DOI | http | .pdf ]

[152]

U. Keich, M. Li, B. Ma, and J. Tromp, “On spaced seeds for similarity search,” *Discrete Applied Mathematics*, vol. 138, no. 3, pp. 253--263, 2004. (earlier version in 2002). [ DOI | http ]

[153]

M. Csűrös, “Performing local similarity searches with variable length seeds,” in *Proceedings of the 15th Annual Combinatorial Pattern Matching Symposium (CPM), Istanbul (Turkey)* (S. Sahinalp, S. Muthukrishnan, and U. Dogrusoz, eds.), vol. 3109 of *Lecture Notes in Computer Science*, pp. 373--387, Springer, 2004. [ DOI | http | .pdf ]

[154]

J. Xu, D. G. Brown, M. Li, and B. Ma, “Optimizing multiple spaced seeds for homology search,” in *Proceedings of the 15th Symposium on Combinatorial Pattern Matching (CPM), Istanbul (Turkey)* (S. Sahinalp, S. Muthukrishnan, and U. Dogrusoz, eds.), vol. 3109 of *Lecture Notes in Computer Science*, pp. 47--58, Springer, 2004. [ DOI | http | http ]

[155]

I.-H. Yang, S.-H. Wang, Y.-H. Chen, P.-H. Huang, L. Ye, X. Huang, and K.-M. Chao, “Efficient methods for generating optimal single and multiple spaced seeds,” in *Proceedings of the IEEE 4th Symposium on Bioinformatics and Bioengineering (BIBE), Taichung (Taiwan)*, pp. 411--416, IEEE Computer Society Press, 2004. [ DOI | http ]

[156]

B. Brejová, D. G. Brown, and T. Vinař, “Optimal spaced seeds for homologous coding regions,” *Journal of Bioinformatics and Computational Biology*, vol. 1, pp. 595--610, January 2004. [ DOI | .html | .pdf ]

[157]

M. Li, B. Ma, D. Kisman, and J. Tromp, “PatternHunter II: Highly sensitive and fast homology search,” *Journal of Bioinformatics and Computational Biology*, vol. 2, no. 3, pp. 417--439, 2004. (earlier version in GIW 2003). [ DOI | .html ]

[158]

Y. Sun and J. Buhler, “Designing multiple simultaneous seeds for DNA similarity search,” in *Proceedings of the 8th Annual International Conference on Research in Computational Molecular Biology (RECOMB), San Diego (California)*, pp. 76--84, March 2004. [ DOI | http ]

[159]

K. P. Choi, F. Zeng, and L. Zhang, “Good spaced seeds for homology search,” *Bioinformatics*, vol. 20, no. 7, pp. 1053--1059, 2004. [ DOI | http | .pdf ]

[160]

L. Noé and G. Kucherov, “Improved hit criteria for DNA local alignment,” *BMC Bioinformatics*, vol. 5, p. 149, October 2004. [ DOI | http | .pdf ]

[161]

G. Kucherov, L. Noé, and M. A. Roytberg, “Multi-seed lossless filtration (extended abstract),” in *Proceedings of the 15th Annual Combinatorial Pattern Matching Symposium (CPM), July 5-7, 2004, Istanbul (Turkey)* (S. Sahinalp, S. Muthukrishnan, and U. Dogrusoz, eds.), vol. 3109 of *Lecture Notes in Computer Science*, pp. 297--310, Springer, July 2004. [ DOI | http | http | http ]

[162]

G. Kucherov, L. Noé, and Y. Ponty, “Estimating seed sensitivity on homogeneous alignments,” in *Proceedings of the IEEE 4th Symposium on Bioinformatics and Bioengineering (BIBE), May 19-21, 2004, Taichung (Taiwan)*, pp. 387--394, IEEE Computer Society Press, April 2004. [ DOI | http | http | http ]

[163]

L. Noé and G. Kucherov, “Improved hit criteria for DNA local alignment,” in *Proceedings of the 5th Open Days in Biology, Computer Science and Mathematics (JOBIM), June 28-30, 2004, Montréal (Canada)*, June 2004. [ http | http ]

[164]

W. Chen and W.-K. Sung, “On half gapped seed,” *Genome Informatics*, vol. 14, pp. 176--185, 2003. (earlier version in GIW 2003). [ DOI | http | .pdf ]

[165]

B. Brejová, D. G. Brown, and T. Vinař, “Vector seeds: an extension to spaced seeds allows substantial improvements in sensitivity and specificity,” in *Proceedings of the 3rd International Workshop on Algorithms in Bioinformatics (WABI)*, vol. 2812 of *Lecture Notes in Computer Science*, pp. 39--54, Springer, September 2003. [ DOI | http | .pdf ]

[166]

K. P. Choi and L. Zhang, “Sensitivity analysis and efficient method for identifying optimal spaced seeds,” *Journal of Computer and System Sciences*, vol. 68, no. 1, pp. 22--40, 2004. [ DOI | http ]

[167]

J. Buhler, U. Keich, and Y. Sun, “Designing seeds for similarity search in genomic DNA,” in *Proceedings of the 7th Annual International Conference on Research in Computational Molecular Biology (RECOMB), Berlin (Germany)*, pp. 67--75, ACM Press, April 2003. [ DOI | .pdf ]

[168]

S. Schwartz, W. J. Kent, A. Smit, Z. Zhang, R. Baertsch, R. C. Hardison, D. Haussler, and W. Miller, “Human–mouse alignments with BLASTZ,” *Genome Research*, vol. 13, pp. 103--107, 2003. [ DOI | http ]

[169]

B. Ma, J. Tromp, and M. Li, “PatternHunter: Faster and more sensitive homology search,” *Bioinformatics*, vol. 18, no. 3, pp. 440--445, 2002. [ DOI | http | .pdf ]

[170]

B. Brejová, D. G. Brown, and T. Vinař, “Optimal spaced seeds for Hidden Markov Models, with application to homologous coding regions,” in *Proceedings of the 14th Symposium on Combinatorial Pattern Matching (CPM), Morelia (Mexico)* (R. Baeza-Yates, E. Chávez, and M. Crochemore, eds.), vol. 2676 of *Lecture Notes in Computer Science*, pp. 42--54, Springer, June 2003. [ DOI | http | .pdf ]

[171]

S. Burkhardt and J. Kärkkäinen, “Better filtering with gapped *q*-grams,” *Fundamenta Informaticae*, vol. 56, no. 1-2, pp. 51--70, 2002. (earlier version in CPM 2001). [ http | .ps.gz ]

[172]

S. Burkhardt and J. Kärkkäinen, “One-gapped *q*-gram filters for Levenshtein Distance,” in *Proceedings of the 13th Symposium on Combinatorial Pattern Matching (CPM)*, vol. 2373 of *Lecture Notes in Computer Science*, pp. 225--234, Springer, 2002. [ DOI | http | .pdf ]

[173]

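J. Buhler, “Provably sensitive indexing strategies for biosequence similarity search,” in *Proceedings of the 6th Annual International Conference on Research in Computational Molecular Biology (RECOMB), Washington DC (USA)*, pp. 90--99, ACM Press, April 2002. [ DOI | http ]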

[174]

P. Nicodème, B. Salvy, and P. Flajolet, “Motif statistics,” *Theoretical Computer Science*, vol. 287, no. 2, pp. 593--617, 2002. [ DOI | http ]

[175]

S. Burkhardt and J. Kärkkäinen, “Better filtering with gapped *q*-grams,” in *Proceedings of the 12th Symposium on Combinatorial Pattern Matching (CPM)*, vol. 2089 of *Lecture Notes in Computer Science*, pp. 73--85, Springer, July 2001. [ DOI | http | .pdf ]

[176]

J. Buhler and M. Tompa, “Finding motifs using random projections,” in *Proceedings of the 5th Annual International Conference on Research in Computational Molecular Biology (RECOMB)*, pp. 69--76, ACM Press, 2001. [ DOI | http ]

[177]

J. Buhler, “Efficient large-scale sequence comparison by locality-sensitive hashing,” *Bioinformatics*, vol. 17, no. 5, pp. 419--428, 2001. [ DOI | http | .pdf ]

[178]

W. J. Kent and A. M. Zahler, “Conservation, regulation, synteny, and introns in a large-scale C. briggsae–C. elegans genomic alignment,” *Genome Research*, vol. 10, pp. 1115--1125, August 2000. [ DOI | http | .pdf ]

[179]

A. Califano and I. Rigoutsos, “FLASH: A fast look-up algorithm for string homology,” in *Proceedings of the 1st International Conference on Intelligent Systems for Molecular Biology (ISMB)*, pp. 56--64, July 1993. [ DOI | http | .pdf ]