
Prediction, Biases, and Sampling Algorithms in Sentence Comprehension

LPP Graduate Seminar

Matt Husband

TT 2024

 

The past twenty years have seen a sea change in psycholinguistic theorizing about the mechanisms of sentence comprehension. Predictive mechanisms that anticipate the upcoming linguistic signal at a variety of levels of representation are now well established for language comprehension. Much of the theorizing about these predictive mechanisms has taken place at Marr’s (1982) computational level of analysis, proposing that these predictions reflect incremental probabilistic updating of linguistic representations during the comprehension process (Kuperberg & Jaeger, 2016). How comprehenders execute this updating process algorithmically, however, is currently unclear. The goal of this seminar is to start building bridges between our current computational accounts of the comprehension process and the algorithms that realize these processes.

We are not completely in the dark about how to proceed. Questions concerning the algorithmic nature of probabilistic computational processes have been developed and discussed in areas of cognitive psychology outside of the psycholinguistics literature. Perhaps the most well developed are in the domain of probability judgment and decision making. As has been argued for language comprehension, this literature has proposed that human probability judgments are probabilistic computational processes that are rational, coming close to the Bayesian ideal, especially when the hypothesis space is given and relatively small (Frank & Goodman, 2012; Griffiths & Tenenbaum, 2006, 2011; Oaksford & Chater, 2007; a.o.). However, as the hypothesis space grows or becomes more generally unknown, computation of the Bayesian ideal becomes intractable. Under these more typical everyday conditions, humans tend to depart from rational expectations in predictable ways, often generating only a subset of hypotheses, which leads to systematic biases.

Explaining these biases has guided research toward models that approximate Bayesian processes at the algorithmic level of analysis. One class of processes that has proved productive and promising is sampling. Sampling algorithms provide methods for estimating complex distributions. In the limit, different sampling algorithms are indistinguishable, as they all converge to the ideal response. However, under resource limitations and the resulting small sample sizes, different algorithms display distinct behaviors and biases. This suggests that by discovering the biases that humans show, we can narrow in on the classes of algorithms known to share those biases.
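
As a toy illustration of this logic (my own sketch, not drawn from the readings; the hypothesis labels and probabilities below are invented), consider estimating a distribution over hypotheses by drawing samples from it. With many samples the relative frequencies converge to the true distribution; with only a few, low-probability hypotheses are often never generated at all, one simple route to systematic bias:

import numpy as np

rng = np.random.default_rng(0)
hypotheses = ["H1", "H2", "H3", "H4"]
true_p = np.array([0.70, 0.20, 0.07, 0.03])  # hypothetical target distribution

def sample_estimate(n):
    # Estimate the distribution as relative frequencies over n samples.
    draws = rng.choice(len(hypotheses), size=n, p=true_p)
    return np.bincount(draws, minlength=len(hypotheses)) / n

print("true      ", true_p)
print("n = 5     ", sample_estimate(5))      # rare hypotheses typically get probability 0
print("n = 10000 ", sample_estimate(10000))  # close to the true distribution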

While this approach has been fruitful in cognitive domains like probability judgment, surprisingly little research has investigated sampling algorithms in the context of sentence processing (cf. Levy et al., 2008; Hoover et al., 2023). We will wrap up the seminar with some initial ideas and directions for psycholinguistics, considering how an empirical exploration of biases in sentence comprehension might relate to different classes of sampling algorithms.

 

 

Week 1: Prediction in Sentence Comprehension

 

Week 1 sets the stage with our current theoretical understanding of predictive mechanisms in sentence comprehension as incremental belief (Bayesian) updating. We will examine some of the evidence linking metrics like Bayesian surprise and information-theoretic surprisal to measures of processing time and neural activity, focusing on word predictability as a target domain.
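
As a quick illustration of the central metric (a toy sketch with invented cloze probabilities, not taken from the readings), surprisal is just the negative log probability of a word given its context; the Week 1 readings debate whether reading times track this logarithmic quantity (Smith & Levy, 2013; Shain et al., 2024) or raw probability itself (Brothers & Kuperberg, 2021):

import math

# Hypothetical cloze probabilities for three continuations of the same context.
cloze = {"coffee": 0.60, "tea": 0.25, "broth": 0.01}

for word, p in cloze.items():
    surprisal = -math.log2(p)  # surprisal in bits: -log2 P(word | context)
    print(f"{word:>6}: P = {p:.2f}, surprisal = {surprisal:.2f} bits")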

 

Suggested readings

Kuperberg, G. R., & Jaeger, T. F. (2016). What do we mean by prediction in language comprehension? Language, Cognition and Neuroscience, 31(1), 32-59.

Shain, C., Meister, C., Pimentel, T., Cotterell, R., & Levy, R. (2024). Large-scale evidence for logarithmic effects of word predictability on reading time. Proceedings of the National Academy of Sciences, 121(10), e2307876121.

 

Prior research on word predictability

Brothers, T., & Kuperberg, G. R. (2021). Word predictability effects are linear, not logarithmic: Implications for probabilistic models of sentence comprehension. Journal of Memory and Language, 116, 104174.

Smith, N. J., & Levy, R. (2013). The effect of word predictability on reading time is logarithmic. Cognition, 128(3), 302-319.

 

Other readings

Chater, N., & Manning, C. D. (2006). Probabilistic models of language processing and acquisition. Trends in Cognitive Sciences, 10(7), 335-344.

Levy, R. (2008). Expectation-based syntactic comprehension. Cognition, 106(3), 1126-1177.

Norris, D. (2006). The Bayesian reader: Explaining word recognition as an optimal Bayesian decision process. Psychological Review, 113(2), 327.

Ryskin, R., & Nieuwland, M. S. (2023). Prediction during language comprehension: What is next? Trends in Cognitive Sciences, 27(11), 1032-1052.

 

 

Week 2: Probability Judgments and Cognitive Biases

 

UPDATE: We will spend the first 30 minutes on Shain et al. (2024) before turning to the suggested readings for Week 2.


The probabilistic models of sentence comprehension discussed in Week 1 rest on assumptions about incremental belief updating, relating Bayesian surprise and information-theoretic surprisal to measures of processing time and neural activity. These metrics are fundamentally about probability distributions over hypotheses.

Outside of language, there is now a very large literature exploring how humans display probabilistic behaviors, including our ability to make probabilistic judgments. Central to this enterprise have been discoveries that humans are systematically biased in particular ways that push their judgments up and down. Week 2 focuses on a subset of these biases, unpacking effects, as these seem especially interesting when we return to language comprehension.
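
To fix ideas before the readings (all support values below are invented for illustration), Tversky and Koehler’s (1994) support theory takes the judged probability of a focal hypothesis over an alternative to be the ratio of their supports. Unpacking the focal hypothesis into typical components tends to recruit extra support and raise the judgment, while unpacking into atypical components can lower it (Sloman et al., 2004):

def judged_probability(support_focal, support_alternative):
    # Support theory: judged P(A rather than B) = s(A) / (s(A) + s(B)).
    return support_focal / (support_focal + support_alternative)

s_alternative = 10.0  # invented support for the alternative hypothesis

# Packed description of the focal hypothesis.
packed = judged_probability(5.0, s_alternative)

# Unpacking into typical components recruits extra support (subadditivity).
typical_unpacking = judged_probability(4.0 + 3.0, s_alternative)

# Unpacking into atypical components can yield less total support
# (superadditivity; Sloman et al., 2004).
atypical_unpacking = judged_probability(1.0 + 0.5 + 2.0, s_alternative)

print(f"packed: {packed:.2f}, typical unpacking: {typical_unpacking:.2f}, "
      f"atypical unpacking: {atypical_unpacking:.2f}")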

 

Suggested readings

Sloman, S., Rottenstreich, Y., Wisniewski, E., Hadjichristidis, C., & Fox, C. R. (2004). Typical versus atypical unpacking and superadditive probability judgment. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30(3), 573-582.

Tversky, A., & Koehler, D. J. (1994). Support theory: A nonextensional representation of subjective probability. Psychological Review, 101(4), 547-567.

 

Background reading

Gershman, S. (2021). What Makes Us Smart: The Computational Logic of Human Cognition. Princeton University Press.

 

Other readings

Griffiths, T. L., Kemp, C., & Tenenbaum, J. B. (2008). Bayesian models of cognition. In R. Sun (ed.), The Cambridge Handbook of Computational Cognitive Modeling. Cambridge University Press.

Hadjichristidis, C., Summers, B., & Thomas, K. (2014). Unpacking estimates of task duration: The role of typicality and temporality. Journal of Experimental Social Psychology, 51, 45-50.

Hadjichristidis, C., Geipel, J., & Gopalakrishna Pillai, K. (2022). Diversity effects in subjective probability judgment. Thinking & Reasoning, 28(2), 290-319.

 

Week 3: Approximate Bayesian Inference via Sampling

 

Week 2 reviews some of the literature demonstrating that humans display certain systematic biases when they make probabilistic judgments. While this literature has sometimes argued that these biases show humans departing from the rational response, others have proposed that humans are behaving rationally given resource limitations on their cognition. One prominent way to account for this is approximate Bayesian inference using sampling algorithms. While sampling algorithms all converge to the Bayesian ideal in the limit, different classes of sampling algorithms show different systematic biases when resources are limited.

Week 3 examines the idea that human probabilistic judgment, which has a certain profile of systematic biases, is underpinned by a particular class of sampling algorithm. We will also consider how this approach can be extended to sentence comprehension.
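
As a minimal sketch of what "different biases under resource limits" can look like (my own toy example in the spirit of the MCMC proposals below, not code from the readings), a Metropolis-Hastings sampler targeting a standard normal converges to the correct mean given enough samples, but with only a handful of samples its estimate is pulled toward wherever the chain happened to start:

import numpy as np

rng = np.random.default_rng(1)

def log_target(x):
    # Unnormalized log density of a standard normal posterior.
    return -0.5 * x ** 2

def metropolis_hastings(n_samples, start, proposal_sd=0.5):
    samples = [start]
    x = start
    for _ in range(n_samples - 1):
        proposal = x + rng.normal(0.0, proposal_sd)
        # Accept with probability min(1, p(proposal) / p(current)).
        if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
            x = proposal
        samples.append(x)
    return np.array(samples)

for n in (10, 10000):
    estimate = metropolis_hastings(n, start=3.0).mean()
    print(f"n = {n:>5}: estimated mean = {estimate:+.2f} (true mean = 0.00)")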

 

Suggested readings

Dasgupta, I., Schulz, E., & Gershman, S. J. (2017). Where do hypotheses come from? Cognitive Psychology, 96, 1-25. [explores MCMC sampling]

Shi, L., Griffiths, T. L., Feldman, N. H., & Sanborn, A. N. (2010). Exemplar models as a mechanism for performing Bayesian inference. Psychonomic Bulletin & Review, 17(4), 443-464. [explores importance sampling]

 

Background reading

Zhu, J. Q., Chater, N., León-Villagrá, P., Spicer, J., Sundh, J., & Sanborn, A. (2023). An introduction to psychologically plausible sampling schemes for approximating Bayesian inference. In K. Fiedler, P. Juslin, J. Denrell (eds.) Sampling in Judgment and Decision Making, 467-489.

 

Research on sampling algorithms in language comprehension

Hoover, J. L., Sonderegger, M., Piantadosi, S. T., & O’Donnell, T. J. (2023). The plausibility of sampling as an algorithmic theory of sentence processing. Open Mind, 7, 350-391.

Levy, R., Reali, F., & Griffiths, T. (2008). Modeling the effects of memory on human online sentence processing with particle filters. Advances in Neural Information Processing Systems, 21.

 

Other readings

Chater, N., Zhu, J. Q., Spicer, J., Sundh, J., León-Villagrá, P., & Sanborn, A. (2020). Probabilistic biases meet the Bayesian brain. Current Directions in Psychological Science, 29(5), 506-512.

Griffiths, T. L., Vul, E., & Sanborn, A. N. (2012). Bridging levels of analysis for probabilistic models of cognition. Current Directions in Psychological Science, 21(4), 263-268.

Lewis, R. L., Howes, A., & Singh, S. (2014). Computational rationality: Linking mechanism and behavior through bounded utility maximization. Topics in Cognitive Science, 6(2), 279-311.

Sanborn, A. N., & Chater, N. (2016). Bayesian brains without probabilities. Trends in Cognitive Sciences, 20(12), 883-893.