Our quest to build artificially intelligent
robots faces five challenges. The first, which we covered in another video, was seeing. This
is the second one. Let's say you've solved the problem of seeing, and a robot can recognize everything in the refrigerator (we can't do anything like that, but just imagine that we can). Even so, it still wouldn't understand anything, because it cannot contextualize.
If you were driving through town and saw a soccer ball bouncing in the street, a child running out to get it, and a woman frantically waving the child down, you would instantly know what was going on. But to a computer, that's just a bunch of pixels changing color; really, it's just a bunch of 1s and 0s changing. Think, for instance, of how easily a human can tell what's going on in a photo. You could say, "that's a conga line, that's people hiding for a surprise party, that's a prom photo, and this one's a piano recital," and so forth. Every one of those is easy for you because you have the cultural
context to decipher it. Now in theory, you can train a computer to do all that. If you
show it enough conga lines, it’s going to get really good at spotting conga lines. However,
that just brings us to our third challenge.

If you're interested in artificial intelligence, visit GigaOM.com or check out my new book, "The Fourth Age: Smart Robots, Conscious Computers,
and the Future of Humanity”.