RayNet was proposed by Paschalidou et al. [1] as a framework combining the strengths of probabilistic models [2,3,4] and neural networks [5,6] to reconstruct 3D models from a collection of images and their camera poses. RayNet integrates (see figure below) a convolutional neural network that learns view-invariant feature representations with a Markov network that explicitly encodes the physics of perspective projection and occlusion. The authors trained RayNet end-to-end using empirical risk minimization and evaluated it on real-world datasets. The code is available on GitHub.
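To make the occlusion reasoning concrete, the sketch below illustrates the kind of ray potential such a Markov network encodes: given per-voxel occupancy probabilities sampled along a camera ray, the probability that the ray terminates at a voxel is the probability that the voxel is occupied and every voxel in front of it is free. This is a minimal illustrative sketch under that assumption, not the authors' implementation; the function name and example values are hypothetical.

```python
import numpy as np

def ray_termination_distribution(occupancy):
    """Per-voxel probability that a ray terminates at each voxel.

    occupancy: 1-D array of occupancy probabilities for the voxels a ray
    traverses, ordered from the camera outward.
    """
    # Probability that all voxels before voxel i are free.
    free_before = np.concatenate(([1.0], np.cumprod(1.0 - occupancy)[:-1]))
    # Ray stops at voxel i iff voxel i is occupied and all earlier ones are free.
    return occupancy * free_before

# Hypothetical occupancies along one ray: the distribution plus the
# probability that the ray escapes (all voxels free) sums to 1.
occ = np.array([0.1, 0.2, 0.8, 0.9])
p = ray_termination_distribution(occ)
print(p)                                # [0.1, 0.18, 0.576, 0.1296]
print(p.sum() + np.prod(1.0 - occ))     # 1.0
```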
Presented at IMPA in Luiz Velho's course: Fundamentals and Trends in Vision and Image Processing.