🧱 CRAFT: A Benchmark for Causal Reasoning About Forces and inTeractions

Tayfun Ates¹ M. Samil Atesoglu¹ Cagatay Yigit¹ Ilker Kesen²,³ Mert Kobas⁴

Erkut Erdem¹,² Aykut Erdem²,³ Tilbe Goksun⁴ Deniz Yuret²,³

¹ Hacettepe University, Computer Engineering Department ² Koç University, KUIS AI Center

³ Koç University, Computer Engineering Department ⁴ Koç University, Psychology Department

Humans are able to perceive, understand, and reason about causal events. Developing models with similar physical and causal understanding capabilities is a long-standing goal of artificial intelligence. As a step in this direction, we introduce CRAFT, a new video question answering dataset that requires causal reasoning about physical forces and object interactions. It contains 58K video and question pairs generated from 10K videos from 20 different virtual environments, containing various objects in motion that interact with each other and the scene. Two of the question categories in CRAFT, descriptive and counterfactual, have been studied previously. In addition, inspired by the Force Dynamics Theory in cognitive linguistics, we introduce a new causal question category that involves understanding the causal interactions between objects through notions like cause, enable, and prevent. Our results show that even though the questions in CRAFT are easy for humans, the tested baseline models, including existing state-of-the-art methods, do not yet meet the challenges posed by our benchmark.

Descriptive
Q: "What is the color of the last object that collided with the tiny red circle?" A: "Green"
Q: "What is the shape of the first object that collided with the tiny green circle?" A: "Triangle"
Q: "After hitting the floor, does the small green triangle collide with other objects?" A: "False"
Q: "Before hitting the ground, does the small green circle collide with other objects?" A: "True"

Counterfactual
Q: "If the tiny green circle is removed, will the small green triangle fall to the ground?" A: "False"
Q: "If any of the other objects are removed, will the tiny green circle end up in the basket?" A: "True"
Q: "If any one of the other objects are removed, will the small green circle hit the floor?" A: "False"

Cause
Q: "There is a tiny green circle, does it stimulate the tiny green triangle to fall to the floor?" A: "True"
Q: "Does the tiny green triangle lead to the small red circle ending up in the bucket?" A: "True"

Enable
Q: "There is a small red circle, does it enable the tiny green circle to hit the ground?" A: "True"
Q: "What is the number of objects that the small red circle allows to hit the floor?" A: "1"

Prevent
Q: "There is a tiny green triangle, does it hinder the tiny green circle from going into the container?" A: "True"
Q: "How many objects are prevented by the tiny green triangle from falling into the basket?" A: "1"
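
As a rough illustration of how the sample questions above could be organized by category, here is a minimal Python sketch. The `QAExample` class, its field names, and the lowercase category strings are assumptions made for this example and are not the schema of the released CRAFT files. The "descriptive" and "cause" labels are inferred from the abstract's description of the question categories.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class QAExample:
    """One question-answer pair (hypothetical record layout, not the CRAFT schema)."""
    category: str  # "descriptive", "counterfactual", "cause", "enable", or "prevent"
    question: str
    answer: str


# A handful of the sample questions listed above, copied verbatim.
EXAMPLES = [
    QAExample("descriptive",
              "What is the color of the last object that collided with the tiny red circle?",
              "Green"),
    QAExample("counterfactual",
              "If the tiny green circle is removed, will the small green triangle fall to the ground?",
              "False"),
    QAExample("cause",
              "Does the tiny green triangle lead to the small red circle ending up in the bucket?",
              "True"),
    QAExample("enable",
              "What is the number of objects that the small red circle allows to hit the floor?",
              "1"),
    QAExample("prevent",
              "How many objects are prevented by the tiny green triangle from falling into the basket?",
              "1"),
]

# The causal category groups the three notions named in the abstract.
CAUSAL_CATEGORIES = {"cause", "enable", "prevent"}


def count_by_category(examples):
    """Return a Counter mapping each category name to its number of questions."""
    return Counter(ex.category for ex in examples)


def is_causal(example):
    """True if the question belongs to the cause/enable/prevent group."""
    return example.category in CAUSAL_CATEGORIES


counts = count_by_category(EXAMPLES)
num_causal = sum(is_causal(ex) for ex in EXAMPLES)
print(counts)
print("causal questions:", num_causal)
```

Keeping cause, enable, and prevent as separate labels under one causal group mirrors the way the abstract presents them, and makes it easy to report results either per notion or for the causal category as a whole.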


CRAFT: A Benchmark for Causal Reasoning About Forces and inTeractions
Tayfun Ates, Muhammed Samil Atesoglu, Cagatay Yigit, Ilker Kesen, Mert Kobas, Erkut Erdem, Aykut Erdem, Tilbe Goksun, Deniz Yuret
Findings of ACL 2022 (arXiv).