A neural network learns when it should not be trusted





MIT researchers have developed a way for deep learning neural networks to quickly estimate levels of confidence in their output. This breakthrough could improve the safety and efficiency of AI-assisted decision making. Credit: MIT

A faster way to estimate uncertainty in AI-assisted decision-making could lead to more reliable results.

Increasingly, artificial intelligence systems known as deep learning neural networks are being used to inform decisions vital to human health and safety, such as autonomous driving or medical diagnosis. These networks are good at recognizing patterns in large, complex datasets to aid in decision-making. But how do we know they are right? Alexander Amini and his colleagues at MIT and Harvard University wanted to find out.

They developed a quick way for a neural network to crunch data and output not only a prediction but also the model’s confidence level, based on the quality of the available data. This advance could save lives, as deep learning is already being deployed in the real world today. A network’s level of certainty can be the difference between an autonomous vehicle determining that “everything is clear to cross the intersection” and “it is probably clear, so stop just in case.”

Current methods of estimating uncertainty for neural networks tend to be computationally expensive and relatively slow for split-second decisions. But Amini’s approach, dubbed “deep evidential regression,” speeds up the process and could lead to safer outcomes. “We not only need to have successful models, but also to understand when we cannot trust those models,” says Amini, a doctoral student in Professor Daniela Rus’s group at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL).

“This idea is important and broadly applicable. It can be used to evaluate products that are based on learned models. By estimating the uncertainty of a learned model, we also learn how much error to expect from the model, and what missing data could improve the model,” says Rus.

Amini will present the research at next month’s NeurIPS conference, along with Rus, who is the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science, director of CSAIL, and deputy dean of research for the MIT Stephen A. Schwarzman College of Computing; and graduate students Wilko Schwarting of MIT and Ava Soleimany of MIT and Harvard.

Efficient uncertainty

After a history of ups and downs, deep learning has demonstrated remarkable performance on a variety of tasks, in some cases even surpassing human accuracy. And these days, deep learning seems to go wherever computers go. It powers search engine results, social media feeds, and facial recognition. “We have had huge successes using deep learning,” says Amini. “Neural networks are really good at knowing the right answer 99% of the time.” But 99% won’t cut it when lives are on the line.

“One thing that researchers have missed is the ability of these models to know and tell us when they might be wrong,” Amini says. “We really care about that 1% of the time and how we can reliably and effectively detect these situations.”

Neural networks can be massive, sometimes spanning billions of parameters. So it can be a heavy computational lift just to get an answer, let alone a confidence level. Uncertainty analysis in neural networks is not new. But previous approaches, stemming from Bayesian deep learning, relied on running, or sampling, a neural network many times over to understand its confidence. That process takes time and memory, a luxury that may not exist in high-speed traffic.

The researchers devised a way to estimate uncertainty from only a single run of the neural network. They designed the network with a bulked-up output, producing not only a decision but also a new probabilistic distribution capturing the evidence in support of that decision. These distributions, called evidential distributions, directly capture the model’s confidence in its prediction. That includes any uncertainty present in the underlying input data, as well as in the model’s final decision. This distinction can signal whether uncertainty can be reduced by tweaking the neural network itself, or whether the input data is simply noisy.
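One way to realize such a bulked-up output, and the parameterization used in deep evidential regression, is to have the network emit the four parameters of a Normal-Inverse-Gamma distribution, from which the prediction, the data (aleatoric) uncertainty, and the model (epistemic) uncertainty all follow in closed form from a single forward pass. The PyTorch sketch below only illustrates that idea; the layer sizes, feature dimensions, and names are illustrative choices, not taken from the paper.

```python
# Minimal sketch of an evidential regression head (PyTorch).
# One forward pass yields four values per target -- gamma, nu, alpha, beta --
# interpreted as parameters of a Normal-Inverse-Gamma distribution, so the
# prediction and its uncertainty come out together, with no sampling.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EvidentialHead(nn.Module):
    def __init__(self, in_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, 4)  # gamma, nu, alpha, beta

    def forward(self, x):
        gamma, raw_nu, raw_alpha, raw_beta = self.linear(x).chunk(4, dim=-1)
        nu = F.softplus(raw_nu)              # constrain nu > 0
        alpha = F.softplus(raw_alpha) + 1.0  # constrain alpha > 1
        beta = F.softplus(raw_beta)          # constrain beta > 0
        return gamma, nu, alpha, beta

def prediction_and_uncertainty(gamma, nu, alpha, beta):
    """Closed-form moments of the evidential distribution."""
    prediction = gamma                     # expected target value
    aleatoric = beta / (alpha - 1)         # noise inherent in the data
    epistemic = beta / (nu * (alpha - 1))  # uncertainty of the model itself
    return prediction, aleatoric, epistemic

# Illustrative usage with random "features" standing in for a real backbone
features = torch.randn(8, 128)
pred, aleatoric, epistemic = prediction_and_uncertainty(*EvidentialHead(128)(features))
```

Because everything is a closed-form function of a single network output, the cost per decision is essentially that of an ordinary forward pass, which is what makes the approach suited to split-second settings.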

Confidence check

To test their approach, the researchers started with a difficult computer vision task. They trained their neural network to analyze a monocular color image and estimate a depth value (i.e., the distance from the camera lens) for each pixel. An autonomous vehicle can use similar calculations to estimate its proximity to a pedestrian or other vehicle, which is not a simple task.

The performance of their network was on par with previous state-of-the-art models, but it also gained the ability to estimate its own uncertainty. As the researchers had hoped, the network projected high uncertainty for pixels where it predicted the wrong depth. “It was very calibrated to the errors made by the network, which we thought was one of the most important things in judging the quality of a new uncertainty estimator,” says Amini.

To stress-test their calibration, the team also showed that the network projected higher uncertainty for “out-of-distribution” data – completely new types of images never encountered during training. After training the network on indoor home scenes, they fed it a batch of outdoor driving scenes. The network consistently warned that its responses to the novel outdoor scenes were uncertain. The test highlighted the network’s ability to flag instances where users should not place full trust in its decisions. In these cases, “if it’s a healthcare app, maybe we don’t trust the diagnosis the model gives and instead seek a second opinion,” says Amini.
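How a downstream user might consume such an uncertainty signal is straightforward to sketch: compare the model’s own epistemic uncertainty against a threshold calibrated on held-out data and defer, to a human or a second opinion, whenever it is exceeded. The snippet below is a toy illustration under that assumption; the threshold and the example parameter values are made up.

```python
# Toy illustration: flag predictions whose epistemic uncertainty exceeds a
# threshold, so a downstream user can fall back to a second opinion.
# The threshold and the example evidential parameters are hypothetical.
import torch

def flag_untrusted(gamma, nu, alpha, beta, threshold: float = 0.5):
    epistemic = beta / (nu * (alpha - 1))  # model uncertainty per prediction
    return epistemic > threshold           # True where the output should not be trusted

# Four made-up predictions: well-supported ones stay trusted; the
# out-of-distribution-looking ones (tiny nu, alpha near 1) get flagged.
gamma = torch.tensor([1.2, 0.4, 3.1, 2.0])
nu    = torch.tensor([5.0, 0.1, 4.0, 0.2])
alpha = torch.tensor([3.0, 1.1, 2.5, 1.2])
beta  = torch.tensor([0.5, 0.8, 0.4, 0.9])
print(flag_untrusted(gamma, nu, alpha, beta))  # expected: tensor([False, True, False, True])
```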

The network even knew when photos had been doctored, which could potentially guard against data-manipulation attacks. In another trial, the researchers boosted adversarial noise levels in a batch of images they fed to the network. The effect was subtle – barely perceptible to the human eye – but the network sniffed out those images, tagging its output with high levels of uncertainty. This ability to sound the alarm on falsified data could help detect and deter adversarial attacks, a growing concern in the age of deepfakes.

Deep evidential regression is “a simple and elegant approach that advances the field of uncertainty estimation, which is important for robotics and other real-world control systems,” says Raia Hadsell, an artificial intelligence researcher at DeepMind who was not involved in the work. “This is done in a novel way that avoids some of the messy aspects of other approaches – e.g., sampling or ensembles – which makes it not only elegant but also computationally more efficient – a winning combination.”

Deep evidential regression could improve safety in AI-assisted decision-making. “We’re starting to see a lot more of these [neural network] models trickle out of the research lab and into the real world, into situations that affect humans with potentially fatal consequences,” explains Amini. “Anyone using the method, whether a physician or someone in the passenger seat of a vehicle, should be aware of any risk or uncertainty associated with that decision.” He envisions the system not only quickly flagging uncertainty, but also using it to make more conservative decisions in risky scenarios, like an autonomous vehicle approaching an intersection.
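A conservative decision rule of the kind Amini describes could be as simple as requiring both a favorable prediction and low uncertainty before acting. The function below is a purely hypothetical sketch of the intersection example; its name, units, and thresholds are illustrative and not from the paper.

```python
# Hypothetical sketch of a conservative decision rule: proceed only when the
# predicted clearance is large AND the model is confident in that prediction.
# Function name, units, and thresholds are illustrative, not from the paper.
def intersection_action(predicted_clearance_m: float, epistemic_uncertainty: float) -> str:
    if predicted_clearance_m > 10.0 and epistemic_uncertainty < 0.1:
        return "proceed"   # "everything is clear to cross the intersection"
    return "stop"          # "it is probably clear, so stop just in case"
```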

“Any field that is going to have deployable machine learning ultimately needs to have a reliable awareness of uncertainty,” he says.

This work was supported, in part, by the National Science Foundation and the Toyota Research Institute through the Toyota-CSAIL Joint Research Center.


