When Google announced in January that Ray Kurzweil would be joining the company, a lot of people wondered why the phenomenally accomplished entrepreneur and futurist would want to work for a large company he didn’t start.
Kurzweil’s answer: No one but Google could provide the kind of computing and engineering resources he needed to fulfill his life’s work. Ever since age 14, the 65-year-old inventor of everything from music synthesizers to speech recognition systems has aimed to create a true artificial intelligence, even going so far as to predict that machines would match human intelligence by 2029.
Now, as a director of engineering at Google, he’s focusing specifically on enabling computers to truly understand and even speak in natural language. As I outlined in a recent story on deep learning–a fast-rising branch of AI that attempts to mimic the human neocortex to recognize patterns in speech, images, and other data–Kurzweil eventually wants to help create a “cybernetic friend” that knows what you want before you do (that is, if someone else doesn’t get to it first).
In a recent interview I conducted for the story, Kurzweil revealed a surprising amount of detail about his planned work at Google. No doubt the nature of that work will evolve as he settles in at the company, but this interview provides possibly the deepest look so far at his plans.
At least initially, that work won’t relate directly to advertising, the main subject of this blog. But marketers will need to understand how profoundly Kurzweil’s and others’ work at Google could change not only what search becomes in the age of ever more intelligent machines, but also the way we interact with information and even each other. All that is sure to mean big changes in the nature of advertising and marketing–well before 2029.
Q: In your book, How to Create a Mind, you lay out a theory of how the brain works. Can you explain it briefly?
A: The world is hierarchical. Only mammals have a neocortex, and the neocortex evolved to provide a better understanding of the structure of the world so you can do a better job of modifying it to your needs and solving problems within a hierarchical world. We think in a hierarchical manner. Our first invention was language, and language is hierarchical.
The theory behind deep learning, which I would call hierarchical learning, is that you have a model that reflects the hierarchy in the natural phenomenon you’re trying to learn. If you don’t do that, it’s going to be much weaker and fooled by apparent ambiguities.
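Kurzweil doesn’t spell out an implementation here, but the shape of the idea can be sketched. In the toy Python below (my illustration, with invented pattern tables, not his method), letter-level patterns are recognized first, and a second level then recognizes patterns over those outputs–a model whose hierarchy mirrors the hierarchy in the data.

```python
# Toy sketch of hierarchical pattern recognition (an illustration of
# the idea only; the pattern tables are invented). Each level
# recognizes patterns over the outputs of the level below it.

letters_to_words = {("c", "a", "t"): "cat", ("s", "a", "t"): "sat"}
words_to_clauses = {("cat", "sat"): "NOUN-VERB clause"}

def recognize(tokens, patterns, width):
    """Slide a fixed-width window over tokens; on a match, emit the
    higher-level label, otherwise pass the token through unchanged."""
    out, i = [], 0
    while i < len(tokens):
        window = tuple(tokens[i:i + width])
        if window in patterns:
            out.append(patterns[window])
            i += width
        else:
            out.append(tokens[i])
            i += 1
    return out

words = recognize(list("catsat"), letters_to_words, width=3)
clauses = recognize(words, words_to_clauses, width=2)
print(words)    # ['cat', 'sat']
print(clauses)  # ['NOUN-VERB clause']
```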
Q: How will you apply that theory at Google?
A: What I’ll be doing here is developing hierarchical methods specifically aimed at understanding natural language, extracting semantic meaning … actually developing a way to represent and model the semantic content of documents to do a better job of search and answering questions.
An increasing percentage of queries to Google are in the form of questions. The questions right now can’t have an indefinite complexity to them. But if we can actually model language in a hierarchical fashion, we can do a better job of answering questions and doing search in general, by actually modeling what all these billions of web pages are trying to say.
Q: Hasn’t Google or someone else done that yet?
A: There are some academic projects that have attempted to do this, but nobody has really developed a full solution to it. I will say that IBM’s Watson [the famous Jeopardy!-playing computer] does an impressive job of actually understanding semantic language, and it shows the feasibility of doing this. All the knowledge that Watson had was not hand-coded in some computer language. The idea that you could write down all this common-sense knowledge … turns out to be very brittle, because it doesn’t sufficiently reflect the ambiguities in language and common-sense knowledge.
Watson didn’t work that way. It actually got its knowledge by reading Wikipedia and several other encyclopedias, and then played a game that is not a narrow task. It’s really equivalent to answering questions. The queries can be very diverse. For example, it got the query, “A long, tiresome speech delivered by a frothy pie topping,” and it answered, “What is a meringue harangue?” Watson got a higher score in Jeopardy! than the next best two human players put together.
Q: Why did you come to Google?
A: I definitely gave that a lot of thought. It’s actually my first job for a company that I didn’t found myself. And I don’t think I could have ended up anywhere else.
I’ve had the opportunity to work with Larry Page on a number of projects over the years. Larry and I have had a series of conversations about artificial intelligence. Some of the techniques here use learning algorithms that are not deep but get their tremendous power from Google-scale data. The thrust of our conversations was that Google-scale data and computing infrastructure is a necessary ingredient to creating even stronger forms of artificial intelligence.
In July, I had a meeting with Larry to talk about my book, because I had given him a pre-publication draft. I said I was interested in doing a project, or maybe starting a company to develop some of these ideas. He made the pitch that I should consider doing that at Google because, as we had discussed, Google-scale data and computing infrastructure is a necessary ingredient. He said, “I could try to give you some access to it, Ray, but it’s going to be very difficult to do that for an independent company.” It really made sense. So I was pretty easily convinced that I really couldn’t do the project outside of Google the way I could do it inside.
Q: Still, it struck some people as surprising that you’d essentially become an employee after so many years as an entrepreneur.
A: It’s an opportunity to have impact. That’s what motivates me as an inventor. A reading machine for the blind involved some scientific breakthroughs, but the real satisfaction is having hundreds or thousands of blind people saying it has helped them get a job or an education. Here you’ve got around a billion people who use Google. If I can contribute to that, it has tremendous leverage in terms of helping people. It really leverages human knowledge.
This is not some little project. This is the culmination of literally 50 years of my focus on artificial intelligence. I’ve always had in mind working on the ultimate challenge, which I believe is being able to really model and understand natural language and using that to do practical things.
Q: Why is understanding language the ultimate challenge?
A: Alan Turing based the Turing Test entirely on written language. Basically it’s an instant-messaging game. To really master natural language, even in written form, at a level that’s completely convincing to a human–and that’s the key to the Turing Test–requires the full scope of human intelligence. You can’t just do some simple language processing tricks. There are chatbots that do exactly that, and they may fool some people for a few minutes, but they don’t really pass a valid Turing Test.
So the point is that natural language is a very profound domain to do artificial intelligence in. I really couldn’t do the project anywhere else in the way I could do it here. And now that I’ve been here for two months, I can see the wisdom of Larry’s counsel.
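To see what those “simple language processing tricks” look like in practice, here’s a minimal ELIZA-style sketch in Python (my illustration; the rules are invented). It matches keywords and reflects the user’s words back with no model of meaning at all, which is exactly why such programs can seem conversational for a moment yet can’t pass a valid Turing Test.

```python
# ELIZA-style keyword tricks: pattern-match the input and echo a
# fragment of it back inside a canned template. No understanding.

import re

RULES = [
    (r"\bI am (.*)", "How long have you been {0}?"),
    (r"\bI want (.*)", "Why do you want {0}?"),
    (r"\bmy (.*)", "Tell me more about your {0}."),
]

def reply(text):
    """Return the first matching rule's template, filled with the
    captured fragment of the user's own words."""
    for pattern, template in RULES:
        match = re.search(pattern, text, re.IGNORECASE)
        if match:
            return template.format(*match.groups())
    return "Please go on."

print(reply("I am worried about my job"))
# How long have you been worried about my job?
```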
Q: How so? What else at Google will help you achieve your goal?
A: Take the Knowledge Graph [Google’s catalog of 700 million topics, locations, people, and other concepts]. If you’re going to understand natural language, you have to understand the concepts and things in the world, abstract things as well as concrete things. The Knowledge Graph now has 700 million entries and billions of links between them, and it’s growing quickly. That’s not something I could possibly create. The Knowledge Graph is definitely something we’re going to use, because if you’re going to model what language is saying, you have to link into the knowledge base of all the concepts, and it already has many of their relationships. And there are many other technologies, like syntactic parsing, at a level that you don’t see outside of Google.
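A knowledge graph of this kind is commonly modeled as typed (subject, relation, object) triples. The Python sketch below shows the idea at toy scale; the entities and relation names are invented for illustration, since the Knowledge Graph’s internal schema isn’t public.

```python
# Minimal knowledge-graph sketch: facts as (subject, relation, object)
# triples, indexed by subject for lookup. Entities and relation names
# are invented examples.

from collections import defaultdict

triples = [
    ("Ray Kurzweil", "works_at", "Google"),
    ("Ray Kurzweil", "wrote", "How to Create a Mind"),
    ("Google", "founded_by", "Larry Page"),
]

index = defaultdict(list)
for subj, rel, obj in triples:
    index[subj].append((rel, obj))

def relations_of(entity):
    """Return everything the graph asserts about an entity."""
    return index[entity]

print(relations_of("Ray Kurzweil"))
# [('works_at', 'Google'), ('wrote', 'How to Create a Mind')]
```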
Q: There’s a chapter in your book where you lay out the steps to actually create a mind in silicon and software. Is that your intention at Google, at least eventually?
A: I revealed a general direction. I have proprietary ideas that, for obvious reasons, I didn’t reveal in the book–specifically, how do you build the hierarchy? I don’t actually talk about that in the book. The key thing is that we’re not born with that hierarchy. We’re born with these unconnected modules that don’t have any patterns in them, and then we start to learn, even before we’re born, because our eyes open at 26 weeks and we start hearing sounds.
The key thing is that the neocortex creates this hierarchy–how these modules connect to higher modules–from its own experience. It takes many years or even decades to get to a point where we can achieve certain levels of performance. So even if you did a perfect job of creating a neocortex, it wouldn’t do anything without that learning experience. So a lot of creating an AI is to create that learning experience.
Q: How do you plan to do that?
A: I have ideas about how to actually build that hierarchy from the data that a simulated neocortex would be exposed to. That’s what I’m doing here. Larry was excited about the book, and he has given AI a high priority here. So I have enough independence; I wasn’t given a 16-page spec on exactly how I’m going to do this and this and this over the next several years.
Watson actually beat the best two human players, but it did that because of scale–because it could actually do something with 200 million pages. You and I can’t read 1 million pages. So the idea is to extract enough semantic meaning from every web page and every book page, even if we can’t extract all of it, or nearly as much as a human would, and do a better job at search. That’s the direction that search and knowledge navigation in general is going to go.
Q: Where are you starting? Are there certain kinds of challenges you need to tackle first to get this on the road?
A: I have an idea in mind of how to model semantic meaning. That’s immediately a challenge. If you talk about speech recognition, I can describe conceptually, very easily, what it means to translate a speech signal into an output: it’s a transcription of what somebody said. Then I can do the hard work of building up a library of a million utterances and the correct translation of each. It’s a big project, but it’s a doable project. And then you have some kind of learning algorithm. In our case, we used a hierarchical learning method. And then you can learn from experience.
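The workflow he describes is classic supervised learning: paired examples of inputs and correct outputs, plus an algorithm that generalizes from them. Here is a toy Python sketch of that shape; the feature vectors are invented, and the nearest-neighbor “model” is a stand-in, not the hierarchical method his teams actually used.

```python
# Supervised setup at toy scale: a labeled corpus of (signal,
# transcription) pairs and a trivial nearest-neighbor learner that
# generalizes to unseen signals. Feature values are made up.

corpus = [
    ((0.1, 0.9, 0.4), "hello"),    # toy "acoustic features" -> transcript
    ((0.8, 0.2, 0.7), "goodbye"),
]

def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def transcribe(signal):
    """Return the transcript of the closest training example."""
    return min(corpus, key=lambda ex: distance(ex[0], signal))[1]

print(transcribe((0.15, 0.85, 0.5)))  # hello
```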
That very first step, which is to describe the correct translation of a speech utterance, is actually difficult to even describe in natural language understanding. How do you represent what the correct meaning of language is? Even ignoring the fact that there are lots of ambiguities in what people say, how do you even describe that? I have an idea in mind of a graphical way to represent the semantic meaning of language. It may not capture every subtlety, but right now computer programs don’t capture anything of the semantics.
And then I have in mind a way to build up the kind of database we had in speech recognition. It’s easy to get lots of examples of text; you just go to Wikipedia and you’ve got millions of pages. But then I’ve got in mind a way to build up a database of the correct translation into this method of representing semantic meaning. And then we’ll develop a deep learning algorithm that will be able to do that translation. So when it gets a new sentence–maybe a sentence that the user puts in with a question, or some new web pages that come out every minute–it can do a correct job of that translation.
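Kurzweil hasn’t published this representation, but “a graphical way to represent the semantic meaning of language” suggests something like a labeled graph of concepts and typed relations. Here is a purely hypothetical Python sketch of what one sentence’s entry in such a training database might look like; the schema and relation names are entirely invented.

```python
# Hypothetical sketch of a sentence paired with a semantic graph.
# The schema ("nodes", "edges", relation names) is invented for
# illustration; Kurzweil did not publish his representation.

sentence = "Ray Kurzweil works at Google"

semantic_graph = {
    "nodes": ["work-event", "Ray Kurzweil", "Google"],
    "edges": [
        ("Ray Kurzweil", "agent_of", "work-event"),
        ("Google", "location_of", "work-event"),
    ],
}

# A training database would pair many sentences with graphs like this,
# giving a learning algorithm examples of the "correct translation"
# from text into the semantic representation.
training_pairs = [(sentence, semantic_graph)]
print(training_pairs[0][1]["edges"])
```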
Q: Will you and your team be working with other AI-related teams at Google?
A: I already have part of my team assembled, some of them internal transfers, some people from the outside. I’ll be extensively utilizing other resources that are here. For example, we need to flesh out the Knowledge Graph to incorporate a broader set of relationships. It’s not sufficiently comprehensive to represent all the kinds of relationships you’d want to express in language. So one of the things we will do along the way to natural language understanding is work with the Knowledge Graph team to expand the Knowledge Graph and incorporate more relationships.
Q: Ultimately, is your goal to create an AI, something that can pass the Turing Test and all that that involves, or is it to improve the human brain using these techniques? Or do you even separate those two things?
A: If you’re talking about my career, I’m first and foremost an inventor. I got into futurism in the service of being an inventor. Timing was critical; the inventors whose names you recognize got the timing right. Larry and Sergey had a great idea about reverse-engineering the links on the Internet to create a better search engine. If they had been a couple of years earlier or later, you might not recognize their names. It’s like skeet shooting. Think back three or four years: most people didn’t use social networks, wikis, and blogs. Go back a dozen years or so, and most people didn’t use search engines.
My personal motivation is not to create a computer that someday passes the Turing Test. My personal goal is to do work that makes a near-term contribution. Natural-language understanding is not a goal that is finished at some point, any more than search. That’s not a project that I think I’ll ever finish.
Q: How will the use of deep learning and neural networks change the nature of computers?
A: The actual structure of the Von Neumann computer [the basis for all conventional computers today] is quite different from the organization of the brain. Computers today are modestly parallel. But in the brain, every one of the 100 trillion interneuronal connections is computing simultaneously, so it’s truly massively parallel processing. Each of those computations is extremely slow, though–on the order of 100 calculations per second. And none of those computations by themselves is critical; it’s all organized in a probabilistic manner. That’s not true of computers.
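Those two figures imply a rough estimate of the brain’s total computational throughput. A quick back-of-the-envelope calculation, using only the numbers Kurzweil cites here:

```python
# Back-of-the-envelope arithmetic from the figures in the interview:
# ~100 trillion interneuronal connections, each computing ~100
# calculations per second.

connections = 100e12            # 1e14 connections
calcs_per_second_each = 100     # per connection

total = connections * calcs_per_second_each
print(f"{total:.0e} calculations per second")  # 1e+16

# For scale: the fastest supercomputers circa 2013 were in the
# tens-of-petaflops range (1 petaflop = 1e15), the same order of
# magnitude, which is the point Kurzweil makes next.
```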
Supercomputers already deliver more computation than you’d need to completely simulate the brain–not simulate it at the molecular level, but simulate it functionally. That’s an important distinction. There’s the Blue Brain Project of Henry Markram, which has a billion euros in funding. And now there’s a project here in the U.S. to copy that [the Brain Activity Map]. That’s to simulate the brain at the molecular level.
Q: So you don’t feel that’s a doable approach?
A: Well, it’s a great project, but its purpose is not to create AI; rather, it’s to test out our ideas about how the brain actually works. It’s a great way to learn about the brain–to learn how neurons work, how ion channels work–and to build a large-scale simulation of the brain and see if it functions correctly. It’s really a way of studying the brain; from that we can learn some methods and then use those biologically inspired algorithms as ways to create AI.
Q: What else will determine when we see a true AI?
A: There’s the law of accelerating returns. Information technology progresses the way it does because we always use today’s technology to create the next generation. The same thing applies to software: once we have a system working, we can improve it, and we tend to improve things in multiples, not linearly. The same technology is enabling us to see into the brain–the spatial resolution of brain scanning is increasing exponentially, and the models we build from neuroscience data are improving exponentially. Then we can use these insights into how the brain does things as biologically inspired algorithms to do a better job of artificial intelligence.
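“Improving in multiples, not linearly” is compounding, and the gap it opens is easy to underestimate. A tiny illustration with arbitrary numbers:

```python
# A capability that doubles each generation (exponential) quickly
# dwarfs one that grows by a fixed increment (linear). The starting
# values and generation count are arbitrary.

linear, exponential = 1.0, 1.0
for generation in range(1, 11):
    linear += 1.0        # fixed additive gain per generation
    exponential *= 2.0   # each generation builds on the last
print(linear, exponential)  # 11.0 1024.0
```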
Source: Robert Hof, Forbes, 4/29/2013