Calling all students, researchers, academics, and industry practitioners to participate in our shared task on social-media-based persona modeling, hosted on Kaggle! To participate, please sign up here: https://www.kaggle.com/competitions/social-sim-challenge-social-media-based-personas
Competition Start: June 2nd, 2025 (10 PM EST)
Competition End: Aug 31st, 2025 (8 AM EST) (previously July 2nd, 2025, AoE)
This task focuses on building realistic social-media personas from user activity and evaluating how well a model can predict their future actions. Inspired by papers on generative-agent-style simulations, the goal is to explore the predictive limits of LLM-based agents when grounded in real-world behavioral data. Given a cluster of anonymized individuals and their historical activity on a social platform (e.g., Bluesky), your model must predict the most plausible next social-media action the persona would take. The challenge is to model subtle persona traits, habits, and social behavior grounded in real-world clusters while maintaining privacy and generalizability.
Participants will be provided with an anonymized dataset of processed, clustered user activity logs, divided into train, validation, and test splits. Each cluster contains sequences of actions (e.g., posts, replies, follows) made by individuals over time.
Train set: Ground truth provided
Validation set: Ground truth hidden; evaluation shown on leaderboard
Test set: Ground truth hidden; final evaluation only run once at the end of the competition
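To make the task setup concrete, here is a minimal sketch of a trivial baseline: predict the action type a persona has used most often in its history. The `Action` class, its `kind` and `text` fields, and the `most_frequent_action` function are hypothetical names chosen for illustration; the actual competition data schema may differ, so check the Kaggle data description before adapting this.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Action:
    # Hypothetical record layout; the real dataset schema may differ.
    kind: str        # action type, e.g. "post", "reply", "follow"
    text: str = ""   # text content; empty for actions without text


def most_frequent_action(history: list[Action]) -> str:
    """Baseline: return the action kind that occurs most often in the history."""
    if not history:
        raise ValueError("history must contain at least one action")
    counts = Counter(action.kind for action in history)
    return counts.most_common(1)[0][0]


# Toy history for one persona (made-up data, not from the competition).
history = [
    Action("post", "hello"),
    Action("reply", "agreed"),
    Action("post", "news"),
    Action("follow"),
]
print(most_frequent_action(history))
```

A baseline like this ignores ordering and content entirely, so it mainly serves as a lower bound to compare an LLM-based persona model against.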
Participants will submit up to three result sets at the end of the competition, each consisting of predictions for both the validation and test sets. Only validation scores will be visible during the competition; the final ranking will be based on the average test performance across the submitted sets.
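The averaging rule can be sketched as follows. This assumes each result set yields a single test score and that the mean is taken over however many sets were submitted; the announcement does not say how fewer than three submissions or multiple metrics are handled, so `final_score` is an illustrative helper only.

```python
from statistics import mean


def final_score(test_scores: list[float]) -> float:
    """Average the test scores of the submitted result sets (one to three)."""
    if not 1 <= len(test_scores) <= 3:
        raise ValueError("expected between one and three result sets")
    return mean(test_scores)


# Made-up scores for three submitted result sets.
print(final_score([0.5, 0.6, 0.7]))
```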
Evaluation Metrics:
F1-score (action space)
Cosine-similarity
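For local validation, the two metrics can be approximated with a short sketch. Two assumptions are made here that the announcement does not confirm: the F1-score is computed as a macro average over action types, and cosine similarity compares the predicted and reference text. The bag-of-words vectors below are a simple stand-in; the official evaluation may use a different averaging scheme and a different text representation (for example, learned embeddings).

```python
import math
from collections import Counter


def macro_f1(y_true: list[str], y_pred: list[str]) -> float:
    """Unweighted mean of per-label F1 over all labels seen in either list."""
    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def bag_of_words(text: str) -> Counter:
    """Stand-in text representation: lowercase whitespace-token counts."""
    return Counter(text.lower().split())


def cosine_similarity(u: Counter, v: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * v[token] for token, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy labels and texts (made-up data, not from the competition).
y_true = ["post", "reply", "post", "follow"]
y_pred = ["post", "post", "post", "follow"]
print(macro_f1(y_true, y_pred))
print(cosine_similarity(bag_of_words("good morning all"),
                        bag_of_words("good morning everyone")))
```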
We encourage participants to submit a short (up to 4 pages) or long (up to 9 pages) paper describing their approach, insights, or analysis. Selected submissions will be invited to present their work at the workshop.
Please don't hesitate to reach out if you have any questions, including uncertainties about the relevance of a particular topic. You can contact us at social-simulation@googlegroups.com.
The shared task competition on Kaggle has started: https://www.kaggle.com/competitions/social-sim-challenge-social-media-based-personas
The shared task will be hosted on Kaggle, but we're still finalizing the setup. Thank you for your patience!
In the meantime, to help you get started, the task will be based on a dataset similar to this: 🔗 BluePrint Dataset on Hugging Face
This version uses 25 user clusters and is derived from Bluesky data. The shared task will use a similar structure, but with an extended time window and updated content.
Thanks again for bearing with us!