Colloquia and Symposia

Symposium: Directable editing tools for image synthesis and color palettes

David Vanderhaeghe (IRIT, Université de Toulouse, CNRS, INPT, UPS, UT1, UT2J, France)

David Vanderhaeghe is an Associate Professor at IRIT-Université de Toulouse, where he started in 2010. He completed his Ph.D. in 2008 at Université de Grenoble-INRIA Rhône-Alpes under the supervision of Joëlle Thollot and François X. Sillion, then spent two years as a Postdoctoral Researcher at LaBRI-INRIA Bordeaux Sud-Ouest/Université Bordeaux 1. His research focuses on image synthesis, rendering, stylization, and user control for image and animation creation. He has served on several program committees (Eurographics short papers, Expressive, WSCG, ...) and is a regular reviewer for major journals and international conferences (ACM SIGGRAPH/TOG, Eurographics, IEEE TVCG, CGF). He is co-head of the computer graphics and image analysis master's program of Université de Toulouse, France. He has also been involved in the national coordination of the research community, as a committee member of the French association for computer graphics (AFIG) and of the CNRS GDR IG-RV, the national research group on computer graphics and virtual reality, where he led the working group on rendering. He currently serves on the IRIT laboratory council.

Abstract:

After a quick overview of my professional career, I will present some recent research works I was involved in. Physically based rendering offers tremendous visual quality, but lacks the artistic control needed for storytelling. Skilled artists tend to break the production workflow to gain artistic freedom. We explore how to provide art-direction tools to efficiently control computer-generated images in a physically based context. I will present Ray Portals [1] and Global Illumination Shadow Layers [2]. Next I will focus on color content editing and present our work on constrained palette-space exploration [3], as well as the research directions I am currently focusing on.

[1] Thomas Subileau, Nicolas Mellado, David Vanderhaeghe, Mathias Paulin. RayPortals: a light transport editing framework. The Visual Computer, Springer, 2017.

[2] François Desrichard, David Vanderhaeghe, Mathias Paulin. Global Illumination Shadow Layers. Conditionally accepted to Computer Graphics Forum (Proc. of Eurographics Symposium on Rendering 2019).

[3] Nicolas Mellado, David Vanderhaeghe, Charlotte Hoarau, Sidonie Christophe, Mathieu Brédif, et al. Constrained Palette-Space Exploration. ACM Transactions on Graphics, 2017.

Much of his presentation material was more advanced than my present knowledge, but I was definitely able to understand about 75% of the information presented today. Funnily enough, the parts I most wanted to hear about were the ones he spent the least time on. His initial research, when he was first interning at IRIT, included modeling mushrooms, which I thought was quirky and cool. However, since it was not his most advanced or most recent research, he did not dwell on the topic for very long. He also did some research on map stylization that I found fascinating and would like to look at more deeply. The other topic he did not cover as much was the constrained palette-space exploration (unfortunately, he ran out of time!), but he was able to do a little "Norm's Notes" summary of the information he had prepared. Still, his presentation was well made, and I enjoyed what I understood!

It was also interesting to hear what challenges the researchers encountered as they tried to market their work to animation artists and rendering companies. For instance, they completed a lot of research and developed a rendering tool that allows artists to directly manipulate the light around an object (in ways that would not naturally occur) using "ray portals". The shadows around the object update automatically, giving the scene a "natural" look while allowing the artist direct control over details in the image. Although their renderer was well made, artists didn't want to have to switch interfaces, and rendering companies would only incorporate the technology if there was proof of concept by an artist using it. Thus, the researchers found themselves in a dilemma with regard to introducing their technology into the industry market.

Colloquium: Dr. Kosecka on Semantic Understanding for Robot Perception

It was really cool to hear from a professor who is working on research that sits where several other areas of research converge. The research we have been doing also has this quality of being a meeting place where several major computer science areas converge, so it was interesting to see how she and her group navigated all the different challenges and opportunities their research presents. I'm including here my very, very general notes from the presentation, but her website is the best place to look for papers relating to what she spoke about.

Scene understanding for robotics:

- detecting target object and 3D orientation

- obstacles in room, free floor (safe directions)

- plan a safe path

Single image doesn’t have all information needed

Rely a lot on learning techniques, particularly neural networks

Shallow vs Deep Architecture:

Shallow:

image/video -> hand-crafted feature representation -> trainable classifier -> object class (see the sketch after this section)

- brittle b/c features informed by intuition

Deep:

image/video -> layer 1…layer n -> object class

(Not done by hand)
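To make the "shallow" pipeline concrete for myself, here is a tiny sketch (my own illustration, not code from the talk) using HOG as the hand-crafted feature representation and a linear SVM as the trainable classifier; the images and labels are just placeholders.

# "Shallow" pipeline sketch: hand-crafted features -> trainable classifier.
# HOG + linear SVM are illustrative choices, not necessarily the ones from the talk.
import numpy as np
from skimage.feature import hog          # hand-crafted feature representation
from sklearn.svm import LinearSVC        # trainable classifier

def extract_features(images):
    # images: iterable of (H, W) grayscale arrays; HOG encodes gradient-orientation statistics
    return np.array([hog(img, pixels_per_cell=(8, 8), cells_per_block=(2, 2)) for img in images])

# Placeholder data: 20 random 64x64 "images" with alternating binary labels.
rng = np.random.default_rng(0)
train_images = rng.random((20, 64, 64))
train_labels = np.array([0, 1] * 10)

clf = LinearSVC().fit(extract_features(train_images), train_labels)
predicted = clf.predict(extract_features(train_images[:5]))   # object class per image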

Linear classifiers - basic idea underlying neural networks

Neural networks - can apply in multiple class setting; determine weights given features
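My own minimal NumPy illustration of the multi-class linear classifier idea: each class gets a score W·x + b and the prediction is the highest-scoring class (the weights here are random stand-ins for learned ones, and the sizes are made up).

import numpy as np

# Multi-class linear classifier: score each class as a weighted sum of the features.
num_classes, num_features = 3, 4096          # e.g. a 64x64 image flattened to 4096 values
rng = np.random.default_rng(0)
W = rng.standard_normal((num_classes, num_features)) * 0.01   # stand-in for learned weights
b = np.zeros(num_classes)

x = rng.random(num_features)                 # one flattened image
scores = W @ x + b                           # one score per class
predicted_class = int(np.argmax(scores))     # pick the highest-scoring class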

Single Layer NN

  • Take image and turn into one large vector
  • There is a library of components we know work well, and we can optimize the parameters effectively
  • Adding more layers helps with accuracy
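A small PyTorch sketch of the bullets above (my own, with made-up image sizes and class counts): flatten the image into one big vector, feed it to a fully connected layer, and optionally add more layers.

import torch
import torch.nn as nn

# Flatten a 3x64x64 image into one 12288-dimensional vector, then classify with one linear layer.
single_layer = nn.Sequential(
    nn.Flatten(),                 # image -> one large vector
    nn.Linear(3 * 64 * 64, 10),   # 10 = number of object classes (illustrative)
)

# "Adding more layers helps": the same idea with one hidden layer and a nonlinearity.
two_layer = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 64 * 64, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

images = torch.randn(8, 3, 64, 64)   # a batch of 8 placeholder images
logits = two_layer(images)           # shape (8, 10): one score per class per image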

Problem - when we flatten the image into a vector and feed it to the neural network, the pixels are treated as unrelated, but in reality neighboring pixels are related

Convolutional neural networks:

  • Instead of a traditional fully connected layer, looks at neighborhoods of pixels and lets neighborhoods share weights across the image (see the sketch after this list)
  • Pixels not independent anymore
  • One of the first successful applications was handwritten digit recognition for the post office
  • Every set of weights interpreted as filter and applied to image
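Again my own sketch, not the speaker's code: in PyTorch, a Conv2d layer slides the same small set of weights (a filter) over every pixel neighborhood, so neighborhoods share weights and pixels are no longer treated independently. The tiny classifier below just shows how such layers stack; all sizes are illustrative.

import torch
import torch.nn as nn

# A convolutional layer: 16 filters of size 3x3, shared across all positions in the image,
# instead of one independent weight per pixel as in a fully connected layer.
conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)

image = torch.randn(1, 3, 64, 64)       # one placeholder RGB image
feature_maps = conv(image)              # shape (1, 16, 64, 64): one response map per filter

# A tiny CNN classifier in the same spirit (layer sizes are illustrative):
tiny_cnn = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64x64 -> 32x32
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32x32 -> 16x16
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, 10),                                  # 10 object classes
)
logits = tiny_cnn(image)                # shape (1, 10)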

ImageNet dataset - accuracy on this dataset increased from 72% to 84% to 92% because of convolutional neural networks

Rule of thumb: deeper is better for classification; problem: gradients get smaller as they propagate back through the layers, which makes learning harder; optimization techniques have been developed to cope with this
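She didn't go into which optimization techniques; one widely used remedy for shrinking gradients that I know of is the residual (skip) connection, sketched below purely as my own example rather than the speaker's method.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Skip connection: the input is added back to the block's output, which gives gradients
    a shorter path back through very deep networks (one common remedy for vanishing
    gradients; not necessarily the technique discussed in the talk)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.conv2(self.relu(self.conv1(x)))
        return self.relu(out + x)   # "+ x" is the skip connection

block = ResidualBlock(16)
y = block(torch.randn(1, 16, 32, 32))   # same shape in and out: (1, 16, 32, 32)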

Visual QA Challenge: find associations between brief word descriptions and images

  • how we learn and associate meaning with things; ground what we see into semantic concepts

Robotics environments are challenging:

  • not a lot of training examples
  • Lots of clutter
  • Need to recognize object and location in lots of different contexts

Exploration topics:

  • What if there are not a lot of training examples? Can we synthetically generate different scenarios and feed them to the algorithm? —> create 3D renderings of objects and then place them onto table/surface images so they appear in the right context, scale, and size; we can now have lots of training examples (rough sketch after this list)
  • Target-driven instance detection? Learn how to do some kind of template matching? —> calls for a different architecture and learning approach
  • How to teach robot how to look for things?
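As I understood the synthetic-data idea, it boils down to compositing rendered object cutouts onto real background images; here is a rough PIL sketch of that, with hypothetical file names and placement, not the actual pipeline from the talk.

from PIL import Image

# Composite a rendered object cutout (with transparency) onto a background image to create
# a synthetic training example. File names and placement are hypothetical placeholders.
background = Image.open("table_scene.jpg").convert("RGBA")
obj = Image.open("rendered_mug.png").convert("RGBA")     # object rendered with an alpha channel

# Scale the object so it appears at a plausible size in this scene, then paste it.
obj = obj.resize((obj.width // 2, obj.height // 2))
position = (200, 150)                                    # top-left corner of the paste
background.paste(obj, position, mask=obj)                # alpha channel drives the blending

background.convert("RGB").save("synthetic_example_0001.jpg")
# Repeating this over many objects, backgrounds, scales, and positions yields lots of
# labeled training examples without hand annotation.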

3D bounding box: given a detection bounding box, estimate the 3D bounding box of the object, defined by orientation, translation, and physical dimensions (a pose estimation problem)
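To unpack that parametrization for myself: the 3D box is described by its physical dimensions, an orientation (here just a yaw angle), and a translation, and its eight corners can be projected into the image with the camera intrinsics. All numbers below are made up; this is only a sketch of the parametrization, not her method.

import numpy as np

# A 3D bounding box parametrized by physical dimensions, orientation (yaw) and translation,
# with its 8 corners projected into the image. All numeric values are placeholders.
w, h, l = 0.6, 0.5, 0.4            # physical dimensions in meters
yaw = np.deg2rad(30.0)             # orientation about the vertical axis
t = np.array([0.2, 0.0, 2.5])      # translation: box center in camera coordinates (meters)
K = np.array([[525.0, 0.0, 320.0], # camera intrinsics (focal lengths and principal point)
              [0.0, 525.0, 240.0],
              [0.0, 0.0, 1.0]])

# 8 corners of an axis-aligned box centered at the origin.
corners = np.array([[sx * w / 2, sy * h / 2, sz * l / 2]
                    for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])

# Rotate about the vertical (y) axis, translate, then project with the intrinsics.
R = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
              [0.0, 1.0, 0.0],
              [-np.sin(yaw), 0.0, np.cos(yaw)]])
corners_cam = corners @ R.T + t                    # corners in camera coordinates
pixels = corners_cam @ K.T                         # homogeneous image coordinates
pixels = pixels[:, :2] / pixels[:, 2:3]            # divide by depth -> (u, v) pixel positions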

Semantic segmentation: assigning a label to each pixel

  • Labels can be object categories such as closet, fridge, etc.
  • one option: reduce resolution until a prediction can be made; problem: the lost resolution of the feature maps needs to be recovered —> deconvolutional neural network
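My own toy PyTorch sketch of that reduce-then-recover idea: convolution and pooling shrink the feature maps, and transposed ("deconvolutional") layers upsample them back so every pixel gets a label. Layer sizes and the label set are illustrative, not from the talk.

import torch
import torch.nn as nn

num_classes = 5   # e.g. closet, fridge, floor, wall, other (illustrative labels)

# Encoder reduces resolution; decoder ("deconvolution") recovers it so every pixel gets a label.
segmenter = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64x64 -> 32x32
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32x32 -> 16x16
    nn.ConvTranspose2d(32, 16, 2, stride=2), nn.ReLU(),           # 16x16 -> 32x32
    nn.ConvTranspose2d(16, num_classes, 2, stride=2),             # 32x32 -> 64x64
)

image = torch.randn(1, 3, 64, 64)
per_pixel_logits = segmenter(image)                  # shape (1, num_classes, 64, 64)
label_map = per_pixel_logits.argmax(dim=1)           # one class label per pixel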

Can we represent prior knowledge about places?

Doing things in simulation and then transferring to reality/open world