Abstract:
Predicting human decisions under risk and uncertainty remains a fundamental challenge across disciplines. Existing models often struggle even in highly stylized tasks like choice between lotteries. Here we introduce BEAST gradient boosting (BEAST-GB), a hybrid model integrating behavioural theory (BEAST) with machine learning. We first present CPC18, a competition for predicting risky choice, in which BEAST-GB won. Then, using two large datasets, we demonstrate that BEAST-GB predicts more accurately than neural networks trained on extensive data and dozens of existing behavioural models. BEAST-GB also generalizes robustly across unseen experimental contexts, surpassing direct empirical generalization, and helps to refine and improve the behavioural theory itself. Our analyses highlight the potential of anchoring predictions on behavioural theory even in data-rich settings and even when the theory alone falters. Our results underscore how integrating machine learning with theoretical frameworks, especially those—like BEAST—designed for prediction, can improve our ability to predict and understand human behaviour.
Bio:
Ori Plonsky is an Assistant Professor in the Faculty of Data and Decision Sciences at the Technion-Israel Institute of Technology. His research spans behavioral decision-making, human learning, and the integration of data science with behavioral science, focusing on human choice prediction and computational modeling of behavior. Ori holds a PhD in Behavioral Sciences and has an engineering background, enhancing his interdisciplinary approach. His work has been published in journals such as Psychological Review, Nature Human Behaviour, and PNAS. In 2022, he received the Hillel Einhorn New Investigator Award from the Society for Judgment and Decision Making.
Summary:
Goal: predict people’s response to incentive structures
Want to incentivize some desired behavior in a population (e.g. reduce car use, ensure people take their medicine)
Given some budget for incentivizing a goal, how to do this best?
Many approaches:
Loss aversion: give people money at the start of the month, take it away when they don't do the desired thing
Lotteries
Accounting for diminishing sensitivity: people are more sensitive to a $1 difference between $10 and $11 than between $99 and $100
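A quick numeric illustration of diminishing sensitivity, using a square-root value function purely as an assumed example (not a model from the talk):

```python
# Diminishing sensitivity under a concave value function.
# sqrt() is an illustrative assumption, not the talk's actual model.
import math

near_ten = math.sqrt(11) - math.sqrt(10)       # ~0.154: a $1 gain feels sizable near $10
near_hundred = math.sqrt(100) - math.sqrt(99)  # ~0.050: the same $1 feels small near $99
print(near_ten > near_hundred)  # True: the subjective impact of $1 shrinks as amounts grow
```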
Can experiment with many different types of treatments: effective but expensive
Can ask experts but they disagree and are often wrong
Want to create an ML model that predicts human responses to stimuli
Approach: model competitions using human training data (e.g. via Kaggle)
Given a new choice problem involving decisions under risk and uncertainty
Goal: predict human responses to the choice problem
Baseline models provided
Researchers submit solutions/models
Problem: choice under risk and uncertainty
Choices: A or B
A: 3 with certainty
B: 4 with probability 0.8, 0 with probability 0.2
Can expand to lotteries, more outcomes, unknown probabilities, etc.
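A minimal sketch of how such a problem could be encoded; the class and field names are hypothetical, not the CPC18 data schema:

```python
from dataclasses import dataclass

@dataclass
class Lottery:
    """One option in a binary choice problem: outcomes with probabilities."""
    outcomes: list[float]
    probs: list[float]

    def expected_value(self) -> float:
        return sum(x * p for x, p in zip(self.outcomes, self.probs))

A = Lottery(outcomes=[3.0], probs=[1.0])            # 3 with certainty
B = Lottery(outcomes=[4.0, 0.0], probs=[0.8, 0.2])  # 4 w.p. 0.8, else 0

# B has the higher expected value (3.2 vs 3.0), yet many people choose the safe A.
print(A.expected_value(), B.expected_value())
```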
Experiments:
270 choice tasks (210 train, 60 test)
930 participants, ~700k choices
Accuracy metrics:
mean squared error
Completeness: proportion of the predictable error that the model eliminates
Completeness = (error_base - error_model) / (error_base - error_perfect_model)
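A sketch of both metrics, assuming predictions and observations are aggregate choice rates in [0, 1]; the perfect-model error term would typically be estimated from the irreducible noise across repeated experiments:

```python
import numpy as np

def mse(pred: np.ndarray, obs: np.ndarray) -> float:
    # Mean squared error between predicted and observed choice rates.
    return float(np.mean((pred - obs) ** 2))

def completeness(error_model: float, error_base: float, error_perfect: float) -> float:
    # Fraction of the predictable error (baseline error minus irreducible
    # noise) that the model eliminates; 1.0 means the model is as good as
    # the noise ceiling allows.
    return (error_base - error_model) / (error_base - error_perfect)
```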
46 teams registered, 20 submitted models
Best: BEAST-GB (BEAST gradient boosting)
BEAST theory: people use strategies that are useful in many situations and apply them to similar new situations
Unknown:
What similarity function do people use?
What strategies do they have?
Model:
Input:
Task payoffs and probabilities
The risky choice task is also given to the BEAST theory model, and its predicted choice rate is included as a feature
BEAST thus serves as a foundation model
Train an extreme gradient boosting (XGBoost) model on these features
Output: prediction of human choice
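A rough, runnable sketch of this pipeline under stated assumptions: beast_predict below is a crude stand-in for the real BEAST model, the feature set and toy data are invented for illustration, and the actual BEAST-GB feature set is considerably richer:

```python
import numpy as np
import xgboost as xgb

def beast_predict(problem: dict) -> float:
    # Placeholder for the BEAST theory model's predicted rate of choosing B.
    # Crude stand-in: logistic in the expected-value difference (NOT real BEAST).
    ev_a = problem["a_outcome"]
    ev_b = problem["b_high"] * problem["b_prob"] + problem["b_low"] * (1 - problem["b_prob"])
    return float(1 / (1 + np.exp(-(ev_b - ev_a))))

def featurize(problem: dict) -> np.ndarray:
    # Illustrative features: raw payoffs/probabilities plus the BEAST prediction.
    return np.array([
        problem["a_outcome"],
        problem["b_high"], problem["b_prob"], problem["b_low"],
        beast_predict(problem),  # theory-based anchor feature
    ])

# Toy training data: each problem paired with an observed B-choice rate.
train_problems = [
    ({"a_outcome": 3, "b_high": 4, "b_prob": 0.8, "b_low": 0}, 0.42),
    ({"a_outcome": 2, "b_high": 6, "b_prob": 0.5, "b_low": 0}, 0.61),
]

X = np.stack([featurize(p) for p, _ in train_problems])
y = np.array([rate for _, rate in train_problems])

model = xgb.XGBRegressor(n_estimators=200, max_depth=3, learning_rate=0.1)
model.fit(X, y)  # gradient boosting on theory + task features
print(model.predict(X))  # predicted human choice rates
```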
All submissions combined theory and ML
Prior work has shown that an unconstrained neural network works better than pure BEAST
BEAST-GB beats both BEAST and neural nets: 96.2% completeness. It stays accurate even when trained on only a few records
Ablation studies show that the psychological, BEAST, and payoff features are all key inputs for the ML model (see the ablation sketch below)
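A sketch of the general ablation pattern (drop a feature group, retrain, compare held-out error); the grouping follows the featurize() column layout from the sketch above and illustrates the idea, not the paper's exact protocol:

```python
import numpy as np
import xgboost as xgb

def ablation_errors(feature_groups, X, y, X_val, y_val):
    # Retrain without each feature group and record the validation MSE;
    # a large jump in error marks that group as important.
    results = {}
    for name, cols in feature_groups.items():
        keep = [c for c in range(X.shape[1]) if c not in cols]
        m = xgb.XGBRegressor(n_estimators=200, max_depth=3, learning_rate=0.1)
        m.fit(X[:, keep], y)
        results[name] = float(np.mean((m.predict(X_val[:, keep]) - y_val) ** 2))
    return results

# Hypothetical grouping; column indices follow the featurize() sketch above.
groups = {"payoffs": [0, 1, 2, 3], "beast": [4]}
```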
BEAST helps the ML generalize across experiments
BEAST-GB is more accurate than other approaches
Also slightly more accurate than generalizing directly from other experiments with the same payoffs but different user interfaces
That baseline captures the inherent variability of the experimental process
BEAST-GB is more accurate because it captures the mean of the experimental noise process, whereas any single real experiment is a draw from a biased distribution of study designs, populations, and interfaces
Next: larger experimental studies with more verbal cues to people
LLMs are showing some capability for predicting behavior, but the results are still inconsistent