Earlier this week, I attended the SNN Symposium – Intelligent Machines. SNN is the Dutch foundation for Neural Networks; it coordinates the national platform on machine learning, which connects most of the ML groups in the Netherlands.
It’s not typical for a one-day, Netherlands-specific academic symposium to sell out – but this one did. That’s down to a combination of the topic (machine learning is hot!) and the speakers. The organizers put together a great line-up:
- Zoubin Ghahramani – Professor in the Machine Learning group at Cambridge
- Daan Wierstra – Google DeepMind
- Sethu Vijayakumar – Professor Robotics at University of Edinburgh
- Ralf Herbrich – Director of Machine Learning Amazon
It’s also not typical to get essentially four keynotes in one day. Instead of going through each talk in turn, I’ll try to draw out some of the major themes I took away from across the talks.
The Case for Probability Theory
Both Prof. Ghahramani and Dr. Herbrich made strong arguments for probability as the core way to think about machine learning/intelligence, and in particular for a Bayesian view of the world. Herbrich summarized the argument for using probability as follows:
- Probability is a calculus of uncertainty (argued via the “naturalness” of Cox’s axioms)
- It maps well onto computational systems (factor graphs allow computation to be distributed)
- It decouples inference, prediction, and decision-making
For me, it was a nice reminder to think of optimization as an approximation to computing probabilities. More generally, coming back to a simplified high-level framework makes the complexities of the individual algorithms easier to understand. Ghahramani did a great job of connecting this framework with the underlying mathematics. Slides from his ML course are here – unfortunately without the lecturer himself.
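To make that “optimization approximates probability” point concrete for myself, here’s a minimal sketch (my own illustration, not from the talks) using a beta-binomial coin-flip model: the fully Bayesian answer is a whole posterior distribution, while the optimization view (MAP estimation) collapses it to a single point.

```python
import numpy as np
from scipy import stats

# Beta-binomial coin-flip model: Beta(a, b) prior over the coin's bias,
# binomial likelihood for the observed flips. (Toy numbers, assumed.)
a, b = 2.0, 2.0
heads, tails = 7, 3

# Fully Bayesian answer: the posterior is a whole distribution (Beta again,
# by conjugacy), which we can query for any quantity we care about.
posterior = stats.beta(a + heads, b + tails)
print("posterior mean:", posterior.mean())   # (a+heads)/(a+b+heads+tails) ~= 0.643

# Optimization view: MAP estimation maximizes the log posterior density,
# collapsing the distribution to a single point estimate.
grid = np.linspace(0.001, 0.999, 999)
log_post = stats.beta(a, b).logpdf(grid) + stats.binom(heads + tails, grid).logpmf(heads)
theta_map = grid[np.argmax(log_post)]
print("MAP estimate:", theta_map)            # (a+heads-1)/(a+b+heads+tails-2) ~= 0.667
```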
The Rise of Reinforcement Learning
The presentations by Daan Wierstra and Sethu Vijayakumar both featured pretty amazing demos. Dr. Wierstra was on the team that developed algorithms that can learn to play Atari games purely from pixels and knowledge of the game score. This uses reinforcement learning to train a convolutional neural network. The key invention here was experience replay – keeping past experience around and feeding it back into the neural network as training input.
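As a rough sketch of the experience-replay idea (my own simplification – the actual DQN implementation differs): transitions get stored in a buffer, and training draws random minibatches from it, which breaks the correlation between consecutive frames.

```python
import random
from collections import deque

class ReplayBuffer:
    """Stores past (state, action, reward, next_state, done) transitions
    and samples random minibatches from them for training."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)  # oldest experience is evicted first

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=32):
        # Uniform sampling over *all* past experience, not just the latest
        # frames, decorrelates the training input to the network.
        return random.sample(self.buffer, batch_size)
```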
Likewise, Prof. Vijayakumar showed how robots can also learn via reinforcement. Here’s an example of a robot arm learning to balance a pole.
Reinforcement learning can help attack the problem of data efficiency that machine learning faces. Essentially, it’s hard to get enough training data, let alone labelled training data. We’ve seen the rise of unsupervised methods to take advantage of the data we do have. (Side note: unsupervised approaches just keep getting better.) But by situating the agent in an environment, it becomes easier to provide the sort of training signal necessary. Instead of labelled examples, one needs to provide the appropriate feedback environment. From Wierstra’s talk, again, the apparent difficulty for reinforcement learning is temporal abstraction – using knowledge from the past to learn. Both the Atari and robot examples receive fairly immediate reinforcement on their tasks.
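To illustrate “feedback instead of labelled examples”, here’s a minimal tabular Q-learning loop – a textbook sketch of my own, not anything the speakers showed – where the only supervision is a scalar reward from a toy environment.

```python
import random

# A toy 5-state corridor: move left/right, reward only at the right end.
N_STATES, ACTIONS = 5, [0, 1]   # action 0 = left, 1 = right
Q = [[0.0, 0.0] for _ in range(N_STATES)]
alpha, gamma, epsilon = 0.1, 0.9, 0.1

def step(state, action):
    """Environment dynamics: returns (next_state, reward, done). This scalar
    reward is the only training signal -- no labelled examples anywhere."""
    nxt = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

for episode in range(500):
    state, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
        action = random.choice(ACTIONS) if random.random() < epsilon \
                 else max(ACTIONS, key=lambda a: Q[state][a])
        nxt, reward, done = step(state, action)
        # Q-learning update: bootstrap from the best next-state value.
        Q[state][action] += alpha * (reward + gamma * max(Q[nxt]) - Q[state][action])
        state = nxt

print(Q)  # after training, "move right" dominates in every state
```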
This takes us back to the classic ideas of situated cognition and of course the work of Luc Steels.
Good Task Formulation
Sometimes half the battle in research is coming up with a good task formulation. This sounds obvious, but it’s actually quite difficult. What struck me was that each of the speakers was good at formulating their problem and the metrics by which it can be tested. For example, Prof. Ghahramani was able to articulate his goals and measures of success for the development of the Automatic Statistician – a system for finding a good model of a given dataset and producing a nifty human-readable and transparent report. Here’s one for affairs 🙂
(Side note: the combination of parameter search and search through model components reminds me of work on the Wings workflow environment.)
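As a toy illustration of that combination (a sketch of the general idea, not the Automatic Statistician’s actual algorithm): enumerate candidate model components, fit the parameters of each, and score them with a penalized criterion like BIC.

```python
import numpy as np

# Toy "search through components": fit polynomial models of increasing
# degree and pick the one with the best (lowest) BIC score.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0, 0.1, size=x.shape)  # assumed toy data

def bic(y, y_hat, n_params):
    # Gaussian BIC up to an additive constant: fit quality plus a
    # complexity penalty that grows with the number of parameters.
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    return n * np.log(rss / n) + n_params * np.log(n)

scores = {}
for degree in range(1, 6):                  # the "component" search space
    coeffs = np.polyfit(x, y, degree)       # the "parameter" search
    scores[degree] = bic(y, np.polyval(coeffs, x), degree + 1)

best = min(scores, key=scores.get)
print(f"selected degree {best}")            # typically recovers degree 2
```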
Likewise, Dr. Herbrich was good at translating the various problems faced within Amazon into specific ML tasks. For example, here’s his definition for Content Linkage:
He then broke this down into specific well-defined tasks through the rest of the talk. The important thing here is to keep coming back to these core tasks and to have well-defined evaluation criteria. (See also Watson’s approach.)
Attacking General AI?
One thing that stood out to me was the audacity of the Google DeepMind goal – to solve general AI. Essentially, designing “AI that can operate over a wide range of tasks”. Why now? Wierstra emphasized the available compute power and advances in different algorithms. The comment I found most interesting was that they have something like a 30-year time horizon within a company. Of course, the funding may not last that long, but articulating that goal and demonstrably attacking it is something I would expect more from academia. Indeed, I wonder if we are not thinking big enough. They already have very impressive results: the Atari example, but also their DRAW algorithm for learning to generate images:
I also liked their approach of Neural Turing Machines – using a recurrent neural network to create a computer itself. By adding memory to neural networks, they’re trying to tackle the “memory” problem discussed above.
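Here’s a tiny numpy sketch of what a differentiable memory read can look like – my own simplification of the NTM’s content-based addressing, not DeepMind’s implementation: the controller emits a key, rows of memory are weighted by cosine similarity to the key, and the read is a soft blend rather than a hard lookup.

```python
import numpy as np

def content_read(memory, key, beta=5.0):
    """Soft, differentiable read from a memory matrix: weight each row by
    (sharpened) cosine similarity to the key, then blend the rows."""
    sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    weights = np.exp(beta * sims)
    weights /= weights.sum()          # softmax over memory rows
    return weights @ memory           # weighted combination, not a hard lookup

memory = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
print(content_read(memory, np.array([0.9, 0.1, 0.0])))  # ~= the first row
```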
Overall, it was an invigorating day.
Random thoughts:
- Robot demos are cool!
- Text Kernel and Potsdam’s use of word2vec for entity extraction in CVs was interesting.