Learning Differentiable Logic Programs for
Abstract Visual Reasoning

Hikaru Shindo* 1, Viktor Pfanschilling 1,4, Devendra Singh Dhami 1,2, Kristian Kersting 1,2,3,4
1 TU Darmstadt, Germany
2 Hessian Center for AI (hessian.AI), Darmstadt, Germany
3 Centre for Cognitive Science, TU Darmstadt, Germany
4 German Center for Artificial Intelligence (DFKI), Germany
* Corresponding author: A@B, where (A, B) = (hikaru.shindo, tu-darmstadt.de)

Abstract
Visual reasoning is essential for building intelligent agents that understand the world and perform problem-solving beyond perception. Differentiable forward reasoning has been developed to integrate reasoning with gradient-based machine learning paradigms. However, because they are memory-intensive, most existing approaches cannot exploit the full expressivity of first-order logic. This rules out a crucial ability: solving abstract visual reasoning tasks, where agents must reason by analogy over abstract concepts across different scenarios. To overcome this problem, we propose NEUro-symbolic Message-pAssiNg reasoNer (NEUMANN), a graph-based differentiable forward reasoner that passes messages in a memory-efficient manner and handles structured programs with functors. Moreover, we propose a computationally efficient structure learning algorithm to perform explanatory program induction on complex visual scenes. For evaluation, in addition to conventional visual reasoning tasks, we propose a new task, visual reasoning behind-the-scenes, where agents need to learn abstract programs and then answer queries by imagining scenes that are not observed. We empirically demonstrate that NEUMANN solves visual reasoning tasks efficiently, outperforming neural, symbolic, and neuro-symbolic baselines.
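
To give a rough feel for what differentiable forward reasoning means, the sketch below propagates soft truth values from body atoms to head atoms over a few steps. It is a minimal illustration and not NEUMANN's implementation: the atom names, the ground clauses, the product as soft conjunction, and max as soft disjunction are all assumptions made for this example (a smooth approximation of max would typically be used for gradient-based training).

```python
from functools import reduce

def soft_and(values):
    """Soft conjunction (assumed here): product of truth values in [0, 1]."""
    return reduce(lambda a, b: a * b, values, 1.0)

def soft_or(a, b):
    """Soft disjunction (assumed here): max of two truth values."""
    return max(a, b)

def forward_step(valuation, ground_clauses):
    """One round of message passing: body atoms -> clause -> head atom."""
    updated = dict(valuation)
    for head, body in ground_clauses:
        body_value = soft_and(valuation[atom] for atom in body)
        updated[head] = soft_or(updated.get(head, 0.0), body_value)
    return updated

def forward_reason(valuation, ground_clauses, steps):
    """Apply forward_step a fixed number of times."""
    for _ in range(steps):
        valuation = forward_step(valuation, ground_clauses)
    return valuation

# Hypothetical perception output: soft truth values for ground atoms.
facts = {
    "color(obj1,cyan)": 0.9,
    "shape(obj1,cube)": 0.8,
    "cyan_cube(obj1)": 0.0,
    "positive(img)": 0.0,
}
# Hypothetical ground clauses, written as (head, [body atoms]).
clauses = [
    ("cyan_cube(obj1)", ["color(obj1,cyan)", "shape(obj1,cube)"]),
    ("positive(img)", ["cyan_cube(obj1)"]),
]

result = forward_reason(facts, clauses, steps=2)
print(result["positive(img)"])
```

Two steps are needed here because the second clause depends on an atom that is only derived in the first step.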

Scalable Visual Logic Reasoning and Learning

Reasoning and learning using logic programming on complex visual scenes.

Complex Reasoning Behind-the-Scenes

Perform visual reasoning beyond observation:
"What is the color of the 2nd left-most object after deleting a cyan object?"
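
The query above can be read as a small program over a list of objects: delete one element, then inspect the remaining order. The sketch below is a plain Python analogue under stated assumptions, not the learned logic program: the scene, the object attributes, and the helper names `delete_color` and `nth_leftmost_color` are hypothetical.

```python
def delete_color(scene, color):
    """Return a copy of the scene without the first object of the given color."""
    for i, obj in enumerate(scene):
        if obj["color"] == color:
            return scene[:i] + scene[i + 1:]
    return list(scene)

def nth_leftmost_color(scene, n):
    """Color of the n-th object (1-indexed) when ordered from left to right."""
    ordered = sorted(scene, key=lambda obj: obj["x"])
    return ordered[n - 1]["color"]

# Hypothetical observed scene; x is the horizontal position of each object.
scene = [
    {"color": "red", "x": 0.1},
    {"color": "cyan", "x": 0.3},
    {"color": "gray", "x": 0.5},
    {"color": "blue", "x": 0.8},
]

# The answer refers to an imagined scene (after deletion), not the observed one.
answer = nth_leftmost_color(delete_color(scene, "cyan"), 2)
print(answer)
```

In the observed scene the 2nd left-most object is the cyan one; the answer changes only because the reasoning is carried out on the imagined scene after the deletion.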

Explainable Visual Logic Reasoning

Visualize the important factors for the reasoning using attention maps. Using the end-to-end reasoning architecture, NEUMANN highlights only relevant objects by computing input gradients.
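
The idea of using input gradients as a relevance signal can be sketched on a toy scoring function. This is only an illustration under assumptions: `scene_score` is a hypothetical stand-in for the reasoning output, the inputs are per-object probabilities rather than pixels, and central finite differences replace backpropagation through an end-to-end model.

```python
def scene_score(object_probs):
    """Hypothetical reasoning output: soft truth of 'some object is a cyan cube'."""
    not_any = 1.0
    for cyan, cube in object_probs:
        not_any *= 1.0 - cyan * cube
    return 1.0 - not_any

def input_gradients(f, object_probs, eps=1e-6):
    """Central finite-difference gradient of f with respect to every input value."""
    grads = []
    for i, probs in enumerate(object_probs):
        row = []
        for j in range(len(probs)):
            plus = [list(p) for p in object_probs]
            minus = [list(p) for p in object_probs]
            plus[i][j] += eps
            minus[i][j] -= eps
            row.append((f(plus) - f(minus)) / (2 * eps))
        grads.append(row)
    return grads

def relevance(grads):
    """Per-object relevance: sum of absolute input gradients."""
    return [sum(abs(g) for g in row) for row in grads]

# Hypothetical per-object probabilities: (is cyan, is cube).
objects = [(0.9, 0.8), (0.1, 0.9), (0.05, 0.1)]

rel = relevance(input_gradients(scene_score, objects))
most_relevant = max(range(len(rel)), key=lambda i: rel[i])
print(most_relevant, rel)
```

Objects whose attributes barely influence the score receive small gradients, which is why only the relevant objects stand out in such a map.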

Related Projects

Check out our list of papers for visual logic reasoning and learning.

αILP: Thinking Visual Scenes as Differentiable Logic Programs
We propose a novel differentiable inductive logic programming framework that learns to represent scenes as logic programs—intuitively, logical atoms correspond to objects, attributes, and relations, and clauses encode high-level scene information.

V-LoL😂: A Diagnostic Dataset for Visual Logical Learning
We propose a visual logic learning benchmark by realizing a long-standing task of AI, the Michalski train problem, in 3D visual scenes. By incorporating intricate visual scenes and flexible logical reasoning tasks within a versatile framework, V-LoL-Train provides a platform for investigating a wide range of visual logical learning challenges.

Acknowledgements

This work was supported by the AI lighthouse project “SPAICER” (01MK20015E), the EU ICT-48 Network of AI Research Excellence Center “TAILOR” (EU Horizon 2020, GA No 952215), and the Collaboration Lab “AI in Construction” (AICO). The work has also benefited from the Hessian Ministry of Higher Education, Research, Science and the Arts (HMWK) cluster projects “The Third Wave of AI” and “The Adaptive Mind”, the Hessian Centre for Artificial Intelligence overall, the Hessian research priority program LOEWE within the project WhiteBox, and from the German Center for Artificial Intelligence (DFKI) project “SAINT”.