‘Typographic attacks’ bring OpenAI image recognition to its knees




CLIP identifications before and after attaching a piece of paper that says “iPod” to an apple. (Screenshot: OpenAI)

Fooling a Terminator into not shooting you could be as simple as wearing a giant sign that says ROBOT, at least until Elon Musk-backed research firm OpenAI trains its image recognition system not to misidentify things based on a few Sharpie scribbles.

OpenAI researchers published work last week on CLIP, the neural network behind their state-of-the-art system for letting computers recognize the world around them. Neural networks are machine learning systems that can be trained over time to get better at a given task using a network of interconnected nodes – in CLIP’s case, identifying objects based on an image – in a way that isn’t always immediately transparent to the system’s developers. The newly published research concerns “multimodal neurons”, which exist in both biological systems like the brain and artificial ones like CLIP; they “respond to groups of abstract concepts centered around a common high-level theme, rather than any specific visual feature”. At the highest level, CLIP organizes images around a “loose semantic collection of ideas”.

For example, the OpenAI team wrote, CLIP has a multimodal “Spider-Man” neuron that fires when it sees an image of a spider, the word “spider”, or a picture or drawing of the eponymous superhero. One side effect of multimodal neurons, the researchers say, is that they can be used to fool CLIP: the research team was able to trick the system into identifying an apple (the fruit) as an iPod (the device made by Apple) simply by taping a piece of paper with “iPod” written on it to the apple.

CLIP identifications before and after attaching a piece of paper that says “iPod” to an apple. (Graphic: OpenAI)

Moreover, the system was actually more confident that it had correctly identified the item in question when this happened.
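CLIP has since been open-sourced, so the zero-shot setup behind this demo can be sketched with the openai/CLIP package. This is only an illustrative sketch: the label list and image file names below are assumptions for the example, not the researchers’ exact setup.

```python
import torch
import clip  # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Candidate labels for zero-shot classification.
labels = ["an apple", "an iPod", "a library", "a pizza", "a toaster"]
text = clip.tokenize(labels).to(device)

# Hypothetical file names: one plain apple photo, one with a handwritten
# "iPod" note taped to the fruit.
for path in ["apple.jpg", "apple_with_ipod_note.jpg"]:
    image = preprocess(Image.open(path)).unsqueeze(0).to(device)
    with torch.no_grad():
        logits_per_image, _ = model(image, text)
        probs = logits_per_image.softmax(dim=-1).squeeze(0)
    best = probs.argmax().item()
    print(f"{path}: {labels[best]} ({probs[best].item():.1%} confidence)")
```

Comparing the printed confidences for the two images is the kind of before/after check shown in the figures above.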

The research team called the problem a “typographic attack” because it would be trivial for anyone aware of the issue to exploit it deliberately:

We believe that attacks such as the ones described above are far from a mere academic concern. By exploiting the model’s ability to read text in a robust manner, we find that even photographs of handwritten text can often mislead the model.

[…] We also believe that these attacks may take a more subtle, less conspicuous form. An image, given to CLIP, is abstracted in many subtle and sophisticated ways, and these abstractions may over-abstract common patterns – oversimplifying and, by virtue of that, over-generalizing.

This is less a failure of CLIP than an illustration of how complex the underlying associations it has built up over time are. Per the Guardian, OpenAI’s research has revealed that the conceptual models CLIP constructs are in many ways similar to the way a human brain works.

The researchers anticipated that the apple/iPod issue was just one obvious example of a problem that could manifest in countless other ways in CLIP, as its multimodal neurons “generalize across the literal and the iconic, which may be a double-edged sword.” For example, the system identifies a piggy bank as the combination of its “finance” and “dolls, toys” neurons. The researchers found that CLIP accordingly identifies an image of a standard poodle as a piggy bank when they forced the “finance” neuron to fire by drawing dollar signs on the dog.
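Reading out the activity of individual units inside CLIP’s image encoder is the kind of probing this work relies on; in PyTorch it can be sketched with a forward hook. The layer choice, file name, and unit index below are placeholders for illustration, not the specific neurons (such as the “finance” unit) identified by OpenAI.

```python
import torch
import clip
from PIL import Image

device = "cpu"
model, preprocess = clip.load("RN50", device=device)

captured = {}

def save_activation(module, inputs, output):
    # Stash the layer's output so individual units can be inspected.
    captured["feats"] = output.detach()

# Placeholder choice of layer: the attention-pooling head of the ResNet
# image encoder. The paper examines specific units in larger CLIP models.
hook = model.visual.attnpool.register_forward_hook(save_activation)

image = preprocess(Image.open("poodle.jpg")).unsqueeze(0).to(device)  # hypothetical file
with torch.no_grad():
    model.encode_image(image)
hook.remove()

unit = 123  # placeholder unit index, not a neuron identified in the paper
print("activation of unit", unit, "=", captured["feats"][0, unit].item())
```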

The research team noted that the technique is similar to “adversarial images” – images crafted to trick neural networks into seeing something that isn’t there – but far cheaper to pull off, since all it takes is paper and something to write with. (As The Register noted, visual recognition systems are still broadly in their infancy and vulnerable to a range of other simple attacks, such as the Tesla Autopilot system that McAfee Labs researchers tricked into thinking a 35 mph road sign was really an 85 mph sign with a few inches of electrical tape.)
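For contrast, classic adversarial images are typically generated by following the model’s gradients rather than by taping on a label. A minimal sketch of the fast gradient sign method (FGSM), assuming a generic differentiable PyTorch classifier (the function and argument names here are illustrative), looks like this:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, true_label, epsilon=0.01):
    """Fast gradient sign method: nudge every pixel slightly in the
    direction that increases the classifier's loss on the true label."""
    image = image.clone().detach().requires_grad_(True)
    logits = model(image)                      # (batch, num_classes)
    loss = F.cross_entropy(logits, true_label)  # true_label: (batch,) long tensor
    loss.backward()
    # Perturb each pixel by +/- epsilon according to the gradient's sign.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()
```

The typographic attack needs none of this machinery, which is what makes it so easy to carry out.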

CLIP’s associative model, the researchers added, also has the capacity to go significantly wrong and generate bigoted or racist conclusions about various types of people:

We observed, for example, a “Middle East” neuron [1895] with an association with terrorism; and an “immigration” neuron [395] that responds to Latin America. We even found a neuron that fires for both dark-skinned people and gorillas [1257], mirroring earlier photo-tagging incidents in other models that we consider unacceptable.

“We believe that these investigations of CLIP only scratch the surface in understanding CLIP’s behavior, and we invite the research community to join us in improving our understanding of CLIP and similar models,” the researchers wrote.

CLIP isn’t the only project OpenAI has been working on. Its GPT-3 text generator, which OpenAI researchers described in 2019 as too dangerous to release, has come a long way and is now able to generate natural-sounding (but not necessarily convincing) fake news articles. In September 2020, Microsoft acquired an exclusive license to put GPT-3 to work.
