Geoffrey Hinton has a powerful and relatively rare skill to complement his research: the ability to articulate complex concepts in language that anyone can understand. He recently spoke to CBC Toronto’s Anna Maria Tremonti about deep learning, starting with a perfectly simple question from Tremonti to hold listeners’ hands through the fundamentals of deep learning, then moving further into Hinton’s work and what the future holds for Google.
AMT: Can you describe how deep learning mirrors how a toddler would learn about the world?
Geoffrey Hinton: One thing about how children learn about the world is that they don’t get given the right answer for everything. They look at images, they hear sounds, they figure out for themselves what is in those images and what those sounds mean without anybody telling them the right answer.
So one aspect of deep learning that is important is called unsupervised learning, where you just take input from the world, maybe a TV camera or maybe a microphone, and you figure out what is going on without anybody telling you.
AMT: And you talk about neural networks. What are neural networks?
Geoffrey Hinton: So our brain has lots of neurons connected together and the strengths of the connections between neurons are where all your long-term knowledge is, so everything you know is encoded by how much one neuron affects another neuron.
So every so often a neuron will go ping, and that sends a signal to other neurons, and they have to decide whether to go ping. They do that based on the pings they are getting from other neurons and how strong the connections are, so it’s like each neuron is voting for whether other neurons should go ping.
AMT: So in other words we have all of this information, all these bits of information in the neurons and they connect themselves as we look at the world or as we try to decide something or as we are confronted with something.
Geoffrey Hinton: So the active neurons that are going ping are representing what you’re currently thinking or what you’re currently seeing, and the way they are activated is determined by the connection strengths between neurons. So when I look at some pixels in an image, the connection strengths determine whether a neuron that gets input from those pixels will go ping.
It might, for example, have a strong positive connection to one pixel and a strong negative connection to another pixel, and so if the pixel with the positive connection is bright and the pixel with the negative connection is dim, then that neuron will go ping and it will have detected an edge between those two pixels.
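A minimal sketch of such an edge-detecting neuron, with hand-picked weights purely for illustration (nothing here comes from the interview itself):

```python
# A toy "edge detector" neuron: one pixel gets a strong positive weight, its
# neighbour a strong negative one, and the neuron "goes ping" (fires) when the
# weighted input exceeds a threshold.

def neuron_goes_ping(pixel_a, pixel_b, threshold=0.5):
    w_positive, w_negative = 1.0, -1.0           # hand-picked weights for illustration
    activation = w_positive * pixel_a + w_negative * pixel_b
    return activation > threshold                # True means the neuron fires

print(neuron_goes_ping(pixel_a=0.9, pixel_b=0.1))  # True: bright-to-dim edge detected
print(neuron_goes_ping(pixel_a=0.5, pixel_b=0.5))  # False: uniform patch, no edge
```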
AMT: Ok, and if we take that a little further, it depends on how the pixels come together, so I might see it as if I am looking at something from a distance, and the more I look at it or the closer it comes, the more I can discern exactly what it is.
Geoffrey Hinton: The neurons close to the input detect little things you might see through a very small peephole, like a little piece of edge. The neurons connected to those detect little combinations of edges, like corners, that are a bit bigger, and as you go through layers of neurons you detect bigger and more complicated features. The thing about deep learning is that instead of hand-engineering all those features like we used to do, we have a learning algorithm that will learn all the connection strengths, so the neural net just decides for itself what features it should use.
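To make the layered hierarchy concrete, here is a small sketch of activity flowing through a stack of layers; the weight matrices below are random placeholders, whereas in deep learning they would be learned from data:

```python
import numpy as np

def relu(x):
    # Simple nonlinearity: a neuron's output is zero unless its input is positive.
    return np.maximum(0.0, x)

def forward(pixels, layer_weights):
    # Each layer detects combinations of the features found by the layer below it:
    # edges near the input, then corners, then bigger, more complicated features.
    activity = pixels
    for w in layer_weights:
        activity = relu(activity @ w)
    return activity

rng = np.random.default_rng(0)
layer_sizes = [64, 32, 16, 8]                    # a small, made-up hierarchy
weights = [rng.normal(0.0, 0.1, size=(m, n))     # random here; learned in practice
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]

top_level_features = forward(rng.random(64), weights)  # one 8x8 "image" flattened to 64 pixels
print(top_level_features.shape)                        # (8,)
```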
AMT: Ok, so you have been working on neural networks for decades, but they have only exploded in application potential in the last few years. Why is that?
Geoffrey Hinton: I think it’s mainly because of the amount of computation and the amount of data now around, but it’s also partly because there have been some technical improvements in the algorithms, particularly in the algorithms for doing unsupervised learning, where you’re not told what the right answer is. But the main thing is the computation and the amount of data.
The algorithms we had in the old days would have worked perfectly well if computers had been a million times faster and datasets had been a million times bigger but if we’d said that thirty years ago people would have just laughed.
AMT: So it is the speed then and just the volume. So, when Google came knocking, other than the undisclosed sum, what is it that they offered that made you want to work with them?
Geoffrey Hinton: Well, Google is at the cutting edge of technology, and if you can make something that makes a Google product work better, it will be used by hundreds of millions of people, so that’s very exciting. They’ve also got tremendous resources, so you can get your hands on thousands of computers there very easily, and someone called Jeff Dean at Google put together a big infrastructure that allows you to run these algorithms on very large numbers of computers, and that’s very appealing.
AMT: And they have the database, right? They have that very thing you are talking about: they’ve got the speed and the ability to hold all that information that the artificial intelligence will then be able to start working with in a neural network.
Geoffrey Hinton: Yes, they have a lot of data, so if you are trying to recognise things in images, for example, it’s easy at Google to get 100 million images and have the computer power to process them all.
AMT: Ok, so I am going to play a clip that you will probably recognise, this is a clip of Watson, IBM’s artificial intelligence system. Watson was pitted against Jeopardy champions a few years ago, Watson won, let’s listen.
AMT: Well, there we go. How is the artificial intelligence you are working on different from what Watson is up to?
Geoffrey Hinton: Ok, so Watson does use some machine learning, but it also uses a very large amount of hand programming, putting in heuristics, more conventional computer programming. In deep learning, what we try to do is minimize the amount of hand engineering and get the neural nets to learn more or less everything. In perception, for example, you’d put in the raw pixels of an image and get the neural net to learn how to extract features from that so it can recognise complicated objects.
So it’s much more to do with learning everything, and that’s why it scales much better with data and computation: there is not much human labour involved, so as you get more data and more computation, the systems get better.
AMT: Oh I see because you can keep feeding that in but you’ve already taught it how to go through with the neural networks, how to actually use that information.
Geoffrey Hinton: Yes, instead of programming the computer to solve a particular task we program the computer to know how to learn and then you can give it any old task and the more data and the more computation you provide the better it will get because what is being programmed is the learning algorithm not the particular heuristics for that task.
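A rough sketch of that idea on a toy task: the same generic learning loop gets better at whatever the example data describe, because what is programmed is the learning procedure rather than task-specific rules (the task and numbers below are illustrative, not anything Google runs):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(inputs, targets, steps=2000, lr=0.5):
    # The same learning loop works for whatever task the (input, target) examples
    # describe: nothing in this code mentions the task itself.
    rng = np.random.default_rng(0)
    w = rng.normal(0.0, 0.1, size=inputs.shape[1])
    b = 0.0
    for _ in range(steps):
        predictions = sigmoid(inputs @ w + b)
        error = predictions - targets
        w -= lr * inputs.T @ error / len(targets)   # nudge weights to reduce the error
        b -= lr * error.mean()
    return w, b

# Toy task: learn the OR function purely from examples, with no OR-specific rules.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 1], dtype=float)
w, b = train(X, y)
print(sigmoid(X @ w + b).round(2))   # approaches [0, 1, 1, 1] as training continues
```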
AMT: Ok, so Watson is essentially loaded up with a bunch of information that exists, and then Watson picks from the list that is available, whereas what you are doing would take the jumble and be able to pull out all of the different pieces of information and put them together.
Geoffrey Hinton: Yes, that’s broadly correct
AMT: You can tell who the lay person is here, but we knew that. So, let’s talk a little bit more about some of the things you have been able to accomplish with your neural networks. Google’s photo search function got much better within six months of the purchase of your company. Can you walk me through in baby steps how that works?
Geoffrey Hinton: So when you upload your own photos to Google Plus, you can give it the name of something you’d like to find in a photo and it will show you all the photos that it thinks contain that thing, and it can be a fairly abstract thing: you can say jewellery, or you can say food. If you think about jewellery, there are lots of different ways of identifying jewellery.
If you see a woman’s neck with something shiny on it, that’s probably jewellery. If you just see something shiny by itself, that’s not necessarily jewellery, and jewellery comes in many sizes and shapes. It would be very, very hard to hand-program something that could recognise jewellery in images, but if you show it, say, tens of thousands of images of jewellery, the neural networks simulated by the computer can learn for themselves how to identify jewellery. They can learn how to use contextual information, they can learn all the sorts of varieties of jewellery, all sorts of stuff that would be very hard to program in.
AMT: You’re almost asking it to think, right? I mean, in lay terms?
Geoffrey Hinton: Yes, as time goes by these neural nets are going to get better and better at thinking. So let me give you a little example of a neural network that is beginning to think.
If you want to translate English into French, the standard way to do it is to have a huge table of little phrases in English and the corresponding phrases in French; that’s called phrase-based translation, and you need to store this huge table. But there is a different way to do it. You can take a neural net in which the neurons have connections to themselves, and you can feed in the English one word at a time, and after you’ve fed in the English sentence, there will be a pattern of activity inside the neural net, that is, a bunch of the neurons will be active.
That set of active neurons, you can call a thought and I’ll tell you why I think it is reasonable to call it a thought. It’s what you get from hearing the English sentence and if you now take that pattern of active neurons and feed it to another neural network, that knows how to produce French, it will cause the second neural network to produce a translation of the English sentence.
But you could feed the same thought to a neural network that knows how to produce German, and it will produce a translation in German, so this pattern of active neurons really is the thought behind the English sentence. It has extracted the thought, and it can learn to do that just by being given a large set of pairs of an English sentence and its translation into French.
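For the technically curious, here is a minimal sketch of the encoder-decoder arrangement being described: the encoder’s final hidden state plays the role of the “thought”, and a decoder for each target language reads from it. The weights below are random placeholders just to show the data flow; a real system learns them from many sentence pairs.

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, EMBED, VOCAB = 16, 8, 20      # made-up sizes for illustration

def encode(word_vectors, w_in, w_rec):
    # Feed the sentence in one word at a time; the neurons have connections to
    # themselves, and the final pattern of activity is the "thought".
    h = np.zeros(HIDDEN)
    for x in word_vectors:
        h = np.tanh(w_in @ x + w_rec @ h)
    return h

def decode_step(thought, h, w_rec, w_out):
    # One step of a decoder; a French decoder and a German decoder would read the
    # same thought but use their own separately trained weights.
    h = np.tanh(w_rec @ h + thought)
    return h, w_out @ h               # scores over the target-language vocabulary

# Random placeholder weights purely to show the data flow.
w_in  = rng.normal(0.0, 0.3, size=(HIDDEN, EMBED))
w_rec = rng.normal(0.0, 0.3, size=(HIDDEN, HIDDEN))
w_out = rng.normal(0.0, 0.3, size=(VOCAB, HIDDEN))

english = [rng.random(EMBED) for _ in range(5)]   # five "words" as vectors
thought = encode(english, w_in, w_rec)
h, word_scores = decode_step(thought, np.zeros(HIDDEN), w_rec, w_out)
print(thought.shape, word_scores.shape)           # (16,) (20,)
```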
AMT: So that would revolutionize the way translation can be done by computer?
Geoffrey Hinton: So right now it’s about comparable with phrase-based translation on a medium-sized database, and not as good on a really big database, but it’s improving very fast.
My belief is that in the next five years, people will switch to doing machine translation that way.
AMT: Really? So you’re talking about the nuance in language, the way that it would be found in the translation of an amazing novel, for example. That use of the word.
Geoffrey Hinton: It can’t yet capture subtle nuances, it captures fairly simple things at present but as time goes by it will get better and better at capturing complex nuances, yes.
AMT: Ok, so let’s go back to the toddler example, how efficiently can this learning be done in neural networks compared to the human brain?
Geoffrey Hinton: Well, nobody really knows exactly how the human brain is doing it. We are fairly sure that what the human brain is doing is learning the strengths of connections between neurons and that’s where the knowledge is stored. We’re not certain exactly what the algorithm is but now on computers, we can get something that does that fairly efficiently and with a lot of computer power it is beginning to rival people.
AMT: I talked about toddlers, but there is also a connection to cats, because it was actually work in the ’60s, was it not, that looked at how a cat’s brain processes information.
Geoffrey Hinton: Yes, so two scientists called Hubel and Wiesel got a Nobel Prize for work they did in the ’60s. What you do is put electrodes into the brain cells of cats in the visual system, and you wave things at the cats and see what makes those neurons get excited. It turned out that in one part of the visual system, what makes neurons get excited is little pieces of edge that are moving, and so those neurons, they decided, were edge detectors, and that was the beginning of understanding physiologically how perception works in the brain.
People had long suspected there would be edge detectors but they actually found that’s how it works and that’s the beginning of a hierarchy where early on you detect little bits of edge, later on combinations of edges and as you go up the visual system you detect more and more complex things until you can detect things like a cat.
AMT: And so it is this hierarchy with these neural networks which is what you are working on, this ability to detect and fine tuning it.
Geoffrey Hinton: Yes, so as you can imagine, if you’ve got a hierarchy that is half a dozen levels deep, you wouldn’t want to try to hand-engineer all those features, you wouldn’t want to try to decide exactly what features you should use; you’d much rather just program a learning algorithm that looks at data and decides what features are going to work best for getting the right answer.
And that is what has worked in vision. Before it worked really well in vision, it worked really well in speech recognition. The first big win that deep learning got was in speech recognition, where in 2009 two different students of mine discovered that you could use deep learning to recognise little fragments of speech, and it worked better than existing systems. Very quickly it took over the speech recognition business, and by 2012 Google already had it in Android.
So, speech recognition in Android for doing voice search got a lot better in 2012, and that was because of deep neural nets that were taken to Google by a couple of my students.
AMT: Tell me about the Merck drug discovery competition, you thought your neural nets would have a chance of winning that, what was that about? What happened?
Geoffrey Hinton: So, one of the two students involved in the speech recognition research, called George Dahl, entered this competition on the web that was set up by Merck. When Merck designs new drugs, they don’t want to actually synthesize all the molecules and test them, because that is very expensive. They want a computer to predict which molecules are likely to bind to some target and make a good drug, so they have computer programs that try to predict, from descriptions of the properties of a molecule, whether it will bind to a particular target.
What George said was, well, deep learning may be good at that. He entered the competition late, and he won it, and Merck is now using that in their production pipeline.
AMT: Ok and how did he win it? What process did he put that through?
Geoffrey Hinton: Ok, so he had a neural network that had several layers of neurons, and the bottom layer consists of descriptors, provided by Merck, of the properties of the molecule. It would be something like 12,000 different properties they can tell you, things like whether it has an -OH group attached to some other group, little properties of the chemistry of the molecule, and from all these properties you want to predict whether it will bind to some particular target that would be good for a drug for schizophrenia or something.
So what George did was say, well, let’s put in these descriptors of the molecule at the bottom level, let’s tell it for various molecules whether they bound well or didn’t bind well, and let’s let it learn all the features to predict that, and the deep learning algorithms just worked better than any other machine learning algorithms.
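A rough illustration of that setup, with synthetic data and a generic off-the-shelf network (not George Dahl’s actual model or Merck’s real descriptors): a multi-layer net is trained to map descriptor vectors to a measured activity and then scored on held-out molecules.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins: each row is a molecule described by numeric chemical
# descriptors (the real data had thousands of them), and the target is a
# measured binding activity against one drug target.
n_molecules, n_descriptors = 500, 100
descriptors = rng.random((n_molecules, n_descriptors))
activity = descriptors[:, :5].sum(axis=1) + 0.1 * rng.normal(size=n_molecules)

# A multi-layer network learns for itself which combinations of descriptors
# predict binding, rather than a chemist hand-crafting the rules.
model = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)
model.fit(descriptors[:400], activity[:400])
print("held-out R^2:", round(model.score(descriptors[400:], activity[400:]), 2))
```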
AMT: And he entered late but that system actually worked fast enough as well.
Geoffrey Hinton: Yes, it won the competition. One reason he could do it is that we used the accelerator boards that are used for computer graphics, the things called graphics processing units, which were designed by the gaming industry for gamers. They are actually much more useful than the supercomputers that were designed by governments, and much cheaper, and so all the people doing deep learning now use these graphics processing units because they are very powerful; they can do lots of things in parallel.
AMT: How much of a game changer is this technology then?
Geoffrey Hinton: I think it’s a huge game changer. More or less everything we apply deep learning to, it wins, and there are going to be more and more things like that. In particular, one area where I think it’s going to be a huge game changer is in understanding natural language. I already talked a little bit about the translation work, but I think in future Google is going to be able to take documents and understand what they mean. It’s going to be able to read them and understand what the sentences mean, and understand how the sentences are connected together, why you would say this sentence after those previous two sentences.
So, things that logicians would say were done by rules of inference, but actually it’s complicated and fuzzy and has all sorts of subtleties that you couldn’t program by hand. People in artificial intelligence tried for years to program natural human reasoning, and you can’t do it, but you can learn it.
That is, these neural nets can learn to reason in a natural way, like humans do.
AMT: And if they are able to understand documents, what jobs exist now that... could they be replaced?
Geoffrey Hinton: I think the obvious thing for Google is in search where at present you can find a document that contains particular words and it works very well for that but suppose you wanted to say, find me a document that pretends to support climate change but is trying to undermine the idea that there is climate change.
Well, Google can’t do that at present, but if you understood what was going on in the documents and what the argument structure was in the sentences, like a person would, then Google could do that. It’s going to be some way in the future, but maybe 10-20 years down the road, or maybe even sooner, Google will be able to understand documents and therefore give you much more reasonable answers, and in particular you can search for what you really want, which is to do with the meaning of the documents and the slant the documents are putting on things.
AMT: That’s extraordinary, and this is all coming out of the work you are doing on neural networks?
Geoffrey Hinton: Yes this is coming out of the fact that we now have very general purpose ways of training neural networks to do more or less anything.
AMT: Of course we are talking about this now and how Google purchased your company and all of this stuff, but when did it really start to gel for you in your research? Tell us about that time.
Geoffrey Hinton: So I started as a research graduate in 1972, and this is what I was working on. It was extremely unpopular back then, and everyone told me this was crazy and it would never work, and back then it didn’t work; computers weren’t fast enough.
AMT: And when they told you that you were crazy, what did you say to them?
Geoffrey Hinton: I said, basically, this is the only way it could be. If you want to understand how the human mind works, you better understand how the brain works and how the brain is implementing the mind and the brain is clearly a big neural network.
So we better understand how systems that compute like that work. The argument back in the 1970s was, oh, that’s just silly, we know that computer hardware doesn’t matter, it’s the software that matters. The thing that’s wrong with that argument is that if you write software for the wrong type of computer, it will be millions of times less efficient than software for the right type of computer.
AMT: And so when was it that your colleagues, or the people who were overseeing your work, started to understand that you were onto something?
Geoffrey Hinton: So in the 1980s, people came up with a learning algorithm called back propagation and there was a huge wave of interest in neural networks, it was a very exciting time. A bunch of different research groups all came up with the same algorithm and began to show that it could do impressive things.
But they weren’t quite impressive enough; they didn’t beat the existing technologies at things like recognizing objects in images or recognizing speech, and so the interest waned again. And the reason, we now know, that they didn’t beat existing techniques is that computers weren’t fast enough.
AMT: But you were persistent because you understood that, you understood the potential that was there.
Geoffrey Hinton: I never thought there was any alternative. I mean the brain must work somehow and if we can figure out how the brain is doing it or get close enough to a method that does it like the brain, then we are going to be able to do things like common sense reasoning and recognizing objects and recognizing speech.
AMT: So could we get to a point where anything our brain can do, a computer with the right neural network algorithms can do as well?
Geoffrey Hinton: Yes we could get to that point.
AMT: Does that scare you?
Geoffrey Hinton: We are a long way off, so I don’t think it’s something I am going to have to worry about...
AMT: But does it excite you?
Geoffrey Hinton: It’s very exciting, yes and I think the consequences of that depend entirely on politics and what people decide to use it for.
AMT: Now there are some very bright individuals who say we need to be careful of advancing this type of technology. I’ve got a couple of clips I want you to hear: Elon Musk and Stephen Hawking with their concerns.
Stephen Hawking: The primitive forms of artificial intelligence we already have, have proved very useful but I think the development of full artificial intelligence could spell the end of the human race. Once humans develop artificial intelligence it would take off on its own and redesign itself at an ever-increasing rate. Humans, who are limited by slow biological evolution couldn’t compete and would be superseded.
Elon Musk: I think we should be very careful about artificial intelligence. If I were to guess what our biggest existential threat is, it’s probably that. I’m increasingly inclined to think that there should be some regulatory oversight, maybe at the national and international level, just to make sure that we don’t do something very foolish. I mean, with artificial intelligence we are summoning the demon.
AMT: Ok, summoning the demon, end of the human race, biggest existential threat, Geoffrey Hinton what do you think of what they are saying there?
Geoffrey Hinton: Well, when they were building that big accelerator at CERN, people said, you know, they are going to create a big black hole that is going to swallow the world (I remember that, they did say that), and I was a little nervous about that, but I’m not a physicist, and all the physicists said don’t worry about it. I don’t know whether you can trust a physicist, but I guess I think two things. One is, it’s not going to happen for a long time: the level they are talking about, where you get autonomous intelligent systems that can somehow support themselves without requiring people to babysit them, I think that’s a long way off. And I think really the outcome of the advance of artificial intelligence will depend a lot on the political system and how it gets used.
I’m much more worried about things that are much more in the near term, things like drones; if you make those smarter and they’re used by the military, that’s not good.
AMT: So in other words, we as a society have a role to play in the decisions we make and who is making the decisions for us.
Geoffrey Hinton: Yes, I think it’s extremely important; it’s as important as the technology itself, and that’s why I think that taking all the funding out of social sciences and putting it into hard sciences is probably a mistake. We need to fund social scientists to figure out how this stuff is going to be used and how best to use it. It ought to be that when you get these wonderful new abilities, it’s good for people. We need a political system where it’s going to be good for people when we get better at doing things, rather than bad for people.
AMT: As your work continues, shoot way ahead to the future for me: how will people in the future view this period of time and what you are doing? What will they credit this development with, 100 years from now?
Geoffrey Hinton: It’s very hard for me to speculate on that but I’m hoping they’ll see it as a sort of wonderful birth period where a new technique suddenly took off and this technique turned out to be the right way of doing things.
AMT: It’s fascinating, thank you for coming in.
Geoffrey Hinton: Thank you.