ICTIR'2020 Tutorial: Modern Query Performance Prediction - Theory and Practice

Part 1: Introduction to Query Performance Prediction in IR
The Robustness problem in IR
Task definition and Motivation
Core challenges and potential applications
QPP evaluation
Pre-retrieval QPP
Post-retrieval QPP
Towards QPP theory: basic QPP frameworks
- The Query Difficulty Model
- Utility Estimation Framework
- Unified Post-Retrieval QPP

[slides]

Part 2: Modern QPP Frameworks
The Probabilistic QPP framework
QPP and Cluster Ranking
Query-based vs. Ranking-based QPP
Score Distribution QPP frameworks
Generalized Post-retrieval QPP framework: The Weighted Product Model
Reference Lists and the role of Asymmetric Co-Relevance
QPP with minimal relevance feedback
Neural QPP frameworks

[slides]

Part 3: Task-specific QPP Methods
Query processing and evaluation
Passage-retrieval and QA
Fusion
Semantic Search
Conversational search

[slides]

References
Giambattista Amati, Claudio Carpineto, and Giovanni Romano. 2004. Query difficulty, robustness and selective application of query expansion. In Proceedings of the European Conference on Information Retrieval (ECIR 2004). Springer, 127--137.
Negar Arabzadeh, Fattane Zarrinkalam, Jelena Jovanovic, and Ebrahim Bagheri. 2020. Neural Embedding-Based Metrics for Pre-retrieval Query Performance Prediction. In Advances in Information Retrieval, Joemon M. Jose, Emine Yilmaz, João Magalhães, Pablo Castells, Nicola Ferro, Mário J. Silva, and Flávio Martins (Eds.). Springer International Publishing, Cham, 78--85.
Javed A. Aslam and Virgiliu Pavlu. 2007. Query hardness estimation using Jensen-Shannon divergence among multiple scoring functions. In Proceedings of the European Conference on Information Retrieval (ECIR 2007). 198--209. https://doi.org/10.1007/978-3-540-71496-5_20
Olga Butman, Anna Shtok, Oren Kurland, and David Carmel. 2013. Query-Performance Prediction Using Minimal Relevance Feedback. In Proceedings of the 2013 Conference on the Theory of Information Retrieval (ICTIR '13). Association for Computing Machinery, New York, NY, USA, 14--21. https://doi.org/10.1145/2499178.2499201
David Carmel and Elad Yom-Tov. 2010. Estimating the query difficulty for information retrieval. Morgan and Claypool Publishers.
David Carmel, Elad Yom-Tov, Adam Darlow, and Dan Pelleg. 2006. What Makes a Query Difficult?. In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '06). Association for Computing Machinery, New York, NY, USA, 390--397. https://doi.org/10.1145/1148170.1148238
Steve Cronen-Townsend, Yun Zhou, and W. Bruce Croft. 2002. Predicting Query Performance. In Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '02). Association for Computing Machinery, New York, NY, USA, 299--306. https://doi.org/10.1145/564376.564429
Ronan Cummins. 2014. Document Score Distribution Models for Query Performance Inference and Prediction. ACM Trans. Inf. Syst., Vol. 32, 1, Article 2 (Jan. 2014), 28 pages. https://doi.org/10.1145/2559170
Fernando Diaz. 2007. Performance prediction using spatial autocorrelation. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '07). ACM, Amsterdam, The Netherlands, 583--590. https://doi.org/10.1145/1277741.1277841
Helia Hashemi, Hamed Zamani, and W. Bruce Croft. 2019. Performance Prediction for Non-Factoid Question Answering. In Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval (ICTIR '19). Association for Computing Machinery, New York, NY, USA, 55--58. https://doi.org/10.1145/3341981.3344249
Claudia Hauff, Leif Azzopardi, Djoerd Hiemstra, and Franciska de Jong. 2010. Query Performance Prediction: Evaluation Contrasted with Effectiveness. In Advances in Information Retrieval, Cathal Gurrin, Yulan He, Gabriella Kazai, Udo Kruschwitz, Suzanne Little, Thomas Roelleke, Stefan Rüger, and Keith van Rijsbergen (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 204--216.
Claudia Hauff, Djoerd Hiemstra, and Franciska de Jong. 2008. A Survey of Pre-Retrieval Query Performance Predictors. In Proceedings of the 17th ACM Conference on Information and Knowledge Management (CIKM '08). Association for Computing Machinery, New York, NY, USA, 1419--1420. https://doi.org/10.1145/1458082.1458311
Donna Harman and Chris Buckley. 2004. The NRRC Reliable Information Access (RIA) Workshop. In Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '04). Association for Computing Machinery, New York, NY, USA, 528--529. https://doi.org/10.1145/1008992.1009104
Ben He and Iadh Ounis. 2004. Inferring Query Performance Using Pre-retrieval Predictors. In Proceedings of String Processing and Information Retrieval.
Shay Hummel, Anna Shtok, Fiana Raiber, Oren Kurland, and David Carmel. 2012. Clarity Re-visited. In Proceedings of SIGIR. 1039--1040.
Gilad Katz, Anna Shtok, Oren Kurland, Bracha Shapira, and Lior Rokach. 2014. Wikipedia-Based Query Performance Prediction. In Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '14). Association for Computing Machinery, New York, NY, USA, 1235--1238. https://doi.org/10.1145/2600428.2609553
Eyal Krikon, David Carmel, and Oren Kurland. 2012. Predicting the Performance of Passage Retrieval for Question Answering. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM '12). Association for Computing Machinery, New York, NY, USA, 2451--2454. https://doi.org/10.1145/2396761.2398664
Oren Kurland, Fiana Raiber, and Anna Shtok. 2012a. Query-Performance Prediction and Cluster Ranking: Two Sides of the Same Coin. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM '12). Association for Computing Machinery, New York, NY, USA, 2459--2462. https://doi.org/10.1145/2396761.2398666
Oren Kurland, Anna Shtok, David Carmel, and Shay Hummel. 2011. A unified framework for post-retrieval query-performance prediction. In Conference on the Theory of Information Retrieval. Springer, 15--26.
Oren Kurland, Anna Shtok, Shay Hummel, Fiana Raiber, David Carmel, and Ofri Rom. 2012b. Back to the Roots: A Probabilistic Framework for Query-Performance Prediction. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM '12). Association for Computing Machinery, New York, NY, USA, 823--832. https://doi.org/10.1145/2396761.2396866
Gad Markovits, Anna Shtok, Oren Kurland, and David Carmel. 2012. Predicting Query Performance for Fusion-Based Retrieval. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM '12). Association for Computing Machinery, New York, NY, USA, 813--822. https://doi.org/10.1145/2396761.2396865
Craig Macdonald, Rodrygo L.T. Santos, and Iadh Ounis. 2012. On the Usefulness of Query Features for Learning to Rank. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management (CIKM '12). Association for Computing Machinery, New York, NY, USA, 2559--2562. https://doi.org/10.1145/2396761.2398691
Hafeezul Rahman Mohammad, Keyang Xu, Jamie Callan, and J. Shane Culpepper. 2018. Dynamic Shard Cutoff Prediction for Selective Search. In The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '18). Association for Computing Machinery, New York, NY, USA, 85--94. https://doi.org/10.1145/3209978.3210005
Josiane Mothe and Ludovic Tanguy. 2005. Linguistic features to predict query difficulty.
Alberto Oliveira, Eric Oakley, Ricardo da Silva Torres, and Anderson Rocha. 2019. Relevance prediction in similarity-search system using extreme value theory. Journal of Visual Communication and Image Representation, Vol. 60 (2019), 236--249. https://doi.org/10.1016/j.jvcir.2019.02.019
Ahmet Ozdemiray and Ismail Altingövde. 2014. Query Performance Prediction for Aspect Weighting in Search Result Diversification. CIKM 2014 - Proceedings of the 2014 ACM International Conference on Information and Knowledge Management, 1871--1874. https://doi.org/10.1145/2661829.2661975
Vassilis Plachouras, Ben He, and Iadh Ounis. 2004. University of Glasgow at TREC 2004: Experiments in Web, Robust, and Terabyte Tracks with Terrier. In Proceedings of the 13th Text REtrieval Conference (TREC-13).
Joaquín Pérez-Iglesias and Lourdes Araujo. 2010. Evaluation of Query Performance Prediction Methods by Range. In String Processing and Information Retrieval (SPIRE 2010). 225--236.
Fiana Raiber and Oren Kurland. 2014. Query-Performance Prediction: Setting the Expectations Straight. In Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '14). Association for Computing Machinery, New York, NY, USA, 13--22. https://doi.org/10.1145/2600428.2609581
Hadas Raviv, Oren Kurland, and David Carmel. 2014. Query Performance Prediction for Entity Retrieval. In Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '14). Association for Computing Machinery, New York, NY, USA, 1099--1102. https://doi.org/10.1145/2600428.2609519
Haggai Roitman. 2017. An Enhanced Approach to Query Performance Prediction Using Reference Lists. In Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '17). Association for Computing Machinery, New York, NY, USA, 869--872. https://doi.org/10.1145/3077136.3080665
Haggai Roitman. 2018a. Enhanced Performance Prediction of Fusion-Based Retrieval. In Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval (ICTIR '18). Association for Computing Machinery, New York, NY, USA, 195--198. https://doi.org/10.1145/3234944.3234950
Haggai Roitman. 2018b. An Extended Query Performance Prediction Framework Utilizing Passage-Level Information. In Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval (ICTIR '18). Association for Computing Machinery, New York, NY, USA, 35--42. https://doi.org/10.1145/3234944.3234946
Haggai Roitman. 2019. Normalized Query Commitment Revisited. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'19). Association for Computing Machinery, New York, NY, USA, 1085--1088. https://doi.org/10.1145/3331184.3331334
Haggai Roitman, Shai Erera, and Guy Feigenblat. 2019. A Study of Query Performance Prediction for Answer Quality Determination. In Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval. 43--46.
Haggai Roitman, Shai Erera, Oren Sar-Shalom, and Bar Weiner. 2017b. Enhanced Mean Retrieval Score Estimation for Query Performance Prediction. In Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval (ICTIR '17). Association for Computing Machinery, New York, NY, USA, 35--42. https://doi.org/10.1145/3121050.3121051
Haggai Roitman, Shai Erera, and Bar Weiner. 2017a. Robust Standard Deviation Estimation for Query Performance Prediction. In Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval (ICTIR '17). Association for Computing Machinery, New York, NY, USA, 245--248. https://doi.org/10.1145/3121050.3121087
Haggai Roitman, Shay Hummel, and Oren Kurland. 2014. Using the Cross-Entropy Method to Re-Rank Search Results. In Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '14). Association for Computing Machinery, New York, NY, USA, 839--842. https://doi.org/10.1145/2600428.2609454
Haggai Roitman and Oren Kurland. 2019. Query Performance Prediction for Pseudo-Feedback-Based Retrieval. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'19). Association for Computing Machinery, New York, NY, USA, 1261--1264. https://doi.org/10.1145/3331184.3331369
Haggai Roitman, Ella Rabinovich, and Oren Sar Shalom. 2018. As Stable As You Are: Re-Ranking Search Results Using Query-Drift Analysis. In Proceedings of the 29th on Hypertext and Social Media (HT '18). Association for Computing Machinery, New York, NY, USA, 33--37. https://doi.org/10.1145/3209542.3209567
Dwaipayan Roy, Debasis Ganguly, Mandar Mitra, and Gareth J. F. Jones. 2019. Estimating Gaussian mixture models in the local neighbourhood of embedded word vectors for query performance prediction. Information Processing & Management, Vol. 56, 3 (2019), 1026--1045.
Anna Shtok, Oren Kurland, and David Carmel. 2010. Using Statistical Decision Theory and Relevance Models for Query-Performance Prediction. In Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '10). Association for Computing Machinery, New York, NY, USA, 259--266. https://doi.org/10.1145/1835449.1835494
Anna Shtok, Oren Kurland, and David Carmel. 2016. Query Performance Prediction Using Reference Lists. ACM Trans. Inf. Syst., Vol. 34, 4, Article 19 (June 2016), 34 pages. https://doi.org/10.1145/2926790
Anna Shtok, Oren Kurland, David Carmel, Fiana Raiber, and Gad Markovits. 2012. Predicting Query Performance by Query-Drift Estimation. ACM Trans. Inf. Syst., Vol. 30, 2, Article 11 (May 2012), 35 pages. https://doi.org/10.1145/2180868.2180873
Mor Sondak, Anna Shtok, and Oren Kurland. 2013. Estimating Query Representativeness for Query-Performance Prediction. In Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '13). Association for Computing Machinery, New York, NY, USA, 853--856. https://doi.org/10.1145/2484028.2484107
Yongquan Tao and Shengli Wu. 2014. Query Performance Prediction By Considering Score Magnitude and Variance Together. In Proceedings of the 23rd ACM International Conference on Information and Knowledge Management (CIKM '14). Association for Computing Machinery, New York, NY, USA, 1891--1894. https://doi.org/10.1145/2661829.2661906
Stephen Tomlinson. 2004. Robust, Web and Terabyte Retrieval with Hummingbird Search Server at TREC 2004. In Proceedings of TREC-13.
Vishwa Vinay, Ingemar J. Cox, Natasa Milic-Frayling, and Kenneth R. Wood. 2006. On ranking the effectiveness of searches. In Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '06). ACM, Seattle, Washington, USA, 398--404. https://doi.org/10.1145/1148170.1148239
Ellen M. Voorhees. 1998. Variations in Relevance Judgments and the Measurement of Retrieval Effectiveness. In Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '98). Association for Computing Machinery, New York, NY, USA, 315--323. https://doi.org/10.1145/290941.291017
Elad Yom-Tov, Shai Fine, David Carmel, and Adam Darlow. 2005. Learning to estimate query difficulty: including applications to missing content detection and distributed information retrieval. In Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '05). ACM, Salvador, Brazil, 512--519. https://doi.org/10.1145/1076034.1076121
Ying Zhao, Falk Scholer, and Yohannes Tsegay. 2008. Effective pre-retrieval query performance prediction using similarity and variability evidence. In Proceedings of the European Conference on Information Retrieval (ECIR 2008). 52--64. https://doi.org/10.1007/978-3-540-78646-7_8
Hamed Zamani, W Bruce Croft, and J Shane Culpepper. 2018. Neural query performance prediction using weak supervision from multiple signals. In The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval. 105--114.
Oleg Zendel, Anna Shtok, Fiana Raiber, Oren Kurland, and J. Shane Culpepper. 2019. Information Needs, Queries, and Query Performance Prediction. In Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR'19). Association for Computing Machinery, New York, NY, USA, 395--404. https://doi.org/10.1145/3331184.3331253
Yun Zhou and W. Bruce Croft. 2007. Query Performance Prediction in Web Search Environments. In Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '07). Association for Computing Machinery, New York, NY, USA, 543--550. https://doi.org/10.1145/1277741.1277835
