tl;dr Learning general-purpose robot manipulation: robots that can put groceries in floppy grocery bags, prepare food, and help your grandmother around the house.
I don't know yet! But to start I am trying to define the specific problems within the above grand vision. In this post, I will attempt to describe the problem I am working on, or at least the broad category of problems. I think one way to put it would be:
"How do we represent knowledge such that it is useful for selecting actions, amenable to gradient-based learning, and able to be combined with prior human knowledge?"
As I was writing that sentence I wanted to keep adding more clauses, but three felt like the magic number. First, why am I interested in the representation of knowledge, and what does that even mean? Second, why does the representation need to be useful for selecting actions? Third, why gradient descent? Fourth, what kinds of prior knowledge should we incorporate, and how?
1. In robotics, we might have fact-like knowledge or equations about physics, and we want the robot to use this knowledge in order to select actions (i.e. planning). Knowledge could be detecting an object in the scene, or it could be a skill like unscrewing a lid. So for example, one representation would be an English-language description of facts or equations. But that representation requires English-language understanding, which is itself very hard, so it's a bad starting point.
So what might a good representation be? Well, I don't know yet! But one representation other researchers have used is the set of weights in a neural network, which, together with the network's structure, represents a function (which we may liberally call knowledge here). But even then, there are a million ways to apply the weight-matrices-as-knowledge idea. Other options include probabilistic graphical models, or strings of symbols, and I'm sure the list goes on. So, again, my first question is: how do we represent knowledge?
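To make the weight-matrices-as-knowledge idea concrete, here is a toy sketch (my own illustration, not from any specific system): a two-layer network whose weight matrices, together with the fixed structure, define a function from an observation to action scores. The dimensions (4, 8, 2) are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))  # "knowledge": 4-dim observation -> 8 hidden units
W2 = rng.normal(size=(8, 2))  # "knowledge": 8 hidden units -> 2 action scores

def f(obs):
    # The structure (matmul -> tanh -> matmul) plus the weights W1, W2
    # together represent the function we loosely call knowledge.
    hidden = np.tanh(obs @ W1)
    return hidden @ W2

obs = np.ones(4)          # a made-up observation
scores = f(obs)           # two action scores
print(scores.shape)       # (2,)
```

The point is only that "knowledge" here is nothing more than a bag of numbers plugged into a fixed computational structure.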
2. Robots are not just software crunching numbers; they make actual physical decisions about the world and then act according to them. The actions they take affect the future actions they will take, and so whatever knowledge our robot has needs to be useful for selecting the right actions (however you may define "right actions").
3. Gradient descent underlies nearly every recent result in machine learning, so it seems like a good idea to make the representation of knowledge optimizable by gradient descent. Basically, this means that if we represent our knowledge as a set of numbers, and we can write an equation that tells us how changing those numbers will change our success, then we can adjust those numbers to increase success.
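A minimal sketch of that idea (the target value 3.0 and learning rate are arbitrary choices for illustration): our "knowledge" is a single number w, the loss tells us how far we are from success, and gradient descent nudges w downhill.

```python
w = 0.0     # the "knowledge": a single adjustable number
lr = 0.1    # learning rate: how big a step to take each update

def loss(w):
    # Lower is better; minimized at w = 3.0 (an arbitrary target).
    return (w - 3.0) ** 2

def grad(w):
    # Derivative of the loss with respect to w: tells us how changing
    # w changes our success.
    return 2.0 * (w - 3.0)

for _ in range(100):
    w -= lr * grad(w)  # step in the direction that decreases the loss

print(round(w, 4))  # 3.0 — w has converged to the target
```

The same mechanics apply when w is millions of neural network weights instead of one number; automatic differentiation computes the gradient for us.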
4. Prior human knowledge is a tricky thing. First, we must acknowledge all the cool things that have been done learning from scratch, without much prior knowledge. It's very impressive, but I still believe there is a use for methods that can incorporate prior human knowledge. The main reason is laziness: I would guess that even if we had the optimal learning algorithm and assumed infinite free computing power, it would take months or years to train agents from scratch that are useful for general-purpose manipulation. Think how long it takes a human child to become good at general-purpose manipulation! My position is: why train from scratch when we have so many useful models and equations that we can give to a learning algorithm and essentially say, "Here's some information that might be useful; do what you will with it"? We must be careful not to limit our agents to the information we give them, and they should not require any prior information, but I think they ought to make good use of it.
Thanks for reading. Check back on the 1st of next month (Feb 1st, 2019) for the next post.