AI detectives are cracking open the black box of deep learning (video)
Jason Yosinski sits in a small glass box at Uber’s San Francisco, California, headquarters, pondering the mind of an artificial intelligence. An Uber research scientist, Yosinski is performing a kind of brain surgery on the AI running on his laptop. Like many of the AIs that will soon be powering so much of modern life, including self-driving Uber cars, Yosinski’s program is a deep neural network, with an architecture loosely inspired by the brain. And like the brain, the program is hard to understand from the outside: It’s a black box.
This particular AI has been trained, using a vast sum of labeled images, to recognize objects as random as zebras, fire trucks, and seat belts. Could it recognize Yosinski and the reporter hovering in front of the webcam? Yosinski zooms in on one of the AI’s individual computational nodes—the neurons, so to speak—to see what is prompting its response. Two ghostly white ovals pop up and float on the screen. This neuron, it seems, has learned to detect the outlines of faces. “This responds to your face and my face,” he says. “It responds to different size faces, different color faces.”
No one trained this network to identify faces. Humans weren’t labeled in its training images. Yet learn faces it did, perhaps as a way to recognize the things that tend to accompany them, such as ties and cowboy hats. The network is too complex for humans to comprehend its exact decisions. Yosinski’s probe had illuminated one small part of it, but overall, it remained opaque. “We build amazing models,” he says. “But we don’t quite understand them. And every year, this gap is going to get a bit larger.”
This video provides a high-level overview of the problem:
Each month, it seems, deep neural networks, or deep learning, as the field is also called, spread to another scientific discipline. They can predict the best way to synthesize organic molecules. They can detect genes related to autism risk. They are even changing how science itself is conducted. The AIs often succeed in what they do. But they have left scientists, whose very enterprise is founded on explanation, with a nagging question: Why, model, why?
That interpretability problem, as it’s known, is galvanizing a new generation of researchers in both industry and academia. Just as the microscope revealed the cell, these researchers are crafting tools that will allow insight into the how neural networks make decisions. Some tools probe the AI without penetrating it; some are alternative algorithms that can compete with neural nets, but with more transparency; and some use still more deep learning to get inside the black box. Taken together, they add up to a new discipline. Yosinski calls it “AI neuroscience.”
Opening up the black box
Loosely modeled after the brain, deep neural networks are spurring innovation across science. But the mechanics of the models are mysterious: They are black boxes. Scientists are now developing tools to get inside the mind of the machine.
Marco Ribeiro, a graduate student at the University of Washington in Seattle, strives to understand the black box by using a class of AI neuroscience tools called counter-factual probes. The idea is to vary the inputs to the AI—be they text, images, or anything else—in clever ways to see which changes affect the output, and how. Take a neural network that, for example, ingests the words of movie reviews and flags those that are positive. Ribeiro’s program, called Local Interpretable Model-Agnostic Explanations (LIME), would take a review flagged as positive and create subtle variations by deleting or replacing words. Those variants would then be run through the black box to see whether it still considered them to be positive. On the basis of thousands of tests, LIME can identify the words—or parts of an image or molecular structure, or any other kind of data—most important in the AI’s original judgment. The tests might reveal that the word “horrible” was vital to a panning or that “Daniel Day Lewis” led to a positive review. But although LIME can diagnose those singular examples, that result says little about the network’s overall insight.
New counterfactual methods like LIME seem to emerge each month. But Mukund Sundararajan, another computer scientist at Google, devised a probe that doesn’t require testing the network a thousand times over: a boon if you’re trying to understand many decisions, not just a few. Instead of varying the input randomly, Sundararajan and his team introduce a blank reference—a black image or a zeroed-out array in place of text—and transition it step-by-step toward the example being tested. Running each step through the network, they watch the jumps it makes in certainty, and from that trajectory they infer features important to a prediction.
Sundararajan compares the process to picking out the key features that identify the glass-walled space he is sitting in—outfitted with the standard medley of mugs, tables, chairs, and computers—as a Google conference room. “I can give a zillion reasons.” But say you slowly dim the lights. “When the lights become very dim, only the biggest reasons stand out.” Those transitions from a blank reference allow Sundararajan to capture more of the network’s decisions than Ribeiro’s variations do. But deeper, unanswered questions are always there, Sundararajan says—a state of mind familiar to him as a parent. “I have a 4-year-old who continually reminds me of the infinite regress of ‘Why?’”
The urgency comes not just from science. According to a directive from the European Union, companies deploying algorithms that substantially influence the public must by next year create “explanations” for their models’ internal logic. The Defense Advanced Research Projects Agency, the U.S. military’s blue-sky research arm, is pouring $70 million into a new program, called Explainable AI, for interpreting the deep learning that powers drones and intelligence-mining operations. The drive to open the black box of AI is also coming from Silicon Valley itself, says Maya Gupta, a machine-learning researcher at Google in Mountain View, California. When she joined Google in 2012 and asked AI engineers about their problems, accuracy wasn’t the only thing on their minds, she says. “I’m not sure what it’s doing,” they told her. “I’m not sure I can trust it.”
Rich Caruana, a computer scientist at Microsoft Research in Redmond, Washington, knows that lack of trust firsthand. As a graduate student in the 1990s at Carnegie Mellon University in Pittsburgh, Pennsylvania, he joined a team trying to see whether machine learning could guide the treatment of pneumonia patients. In general, sending the hale and hearty home is best, so they can avoid picking up other infections in the hospital. But some patients, especially those with complicating factors such as asthma, should be admitted immediately. Caruana applied a neural network to a data set of symptoms and outcomes provided by 78 hospitals. It seemed to work well. But disturbingly, he saw that a simpler, transparent model trained on the same records suggested sending asthmatic patients home, indicating some flaw in the data. And he had no easy way of knowing whether his neural net had picked up the same bad lesson. “Fear of a neural net is completely justified,” he says. “What really terrifies me is what else did the neural net learn that’s equally wrong?”
Today’s neural nets are far more powerful than those Caruana used as a graduate student, but their essence is the same. At one end sits a messy soup of data—say, millions of pictures of dogs. Those data are sucked into a network with a dozen or more computational layers, in which neuron-like connections “fire” in response to features of the input data. Each layer reacts to progressively more abstract features, allowing the final layer to distinguish, say, terrier from dachshund.
At first the system will botch the job. But each result is compared with labeled pictures of dogs. In a process called backpropagation, the outcome is sent backward through the network, enabling it to reweight the triggers for each neuron. The process repeats millions of times until the network learns—somehow—to make fine distinctions among breeds. “Using modern horsepower and chutzpah, you can get these things to really sing,” Caruana says. Yet that mysterious and flexible power is precisely what makes them black boxes.
Complete original article here.