Systems designed to detect deepfakes – videos that manipulate real footage via artificial intelligence – can be fooled, computer scientists showed for the first time at the WACV 2021 conference, which took place online January 5-9, 2021.
The researchers showed that detectors can be defeated by inserting inputs called adversarial examples into every video frame. Adversarial examples are slightly perturbed inputs that cause artificial intelligence systems, such as machine learning models, to make mistakes. In addition, the team showed that the attack still works after the videos are compressed.
“Our work shows that attacks on deepfake detectors could be a real threat,” said Shehzeen Hussain, a computer engineering Ph.D. student at the University of California San Diego and first co-author of the WACV paper. “Even more alarmingly, we demonstrate that it is possible to create robust adversarial deepfakes even when an adversary may not be aware of the inner workings of the machine learning model used by the detector.”
In deepfakes, a subject’s face is altered to create realistic, compelling footage of events that never actually happened. As a result, typical deepfake detectors focus on the face in a video: first tracking it, then passing the cropped face data to a neural network that determines whether it is real or fake. For example, eye blinking is not reproduced well in deepfakes, so detectors focus on eye movements as one way of making that determination. State-of-the-art deepfake detectors rely on machine learning models to identify fake videos.
XceptionNet, a deepfake detector, labels an adversarial video created by the researchers as real. Credit: University of California San Diego
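To make the pipeline described above concrete, here is a minimal Python sketch. The face_detector and classifier objects, their method names, and the score threshold are hypothetical placeholders for illustration, not the specific detectors evaluated in the paper (such as XceptionNet).

```python
# Minimal sketch of a generic frame-level deepfake detection pipeline.
# The face_detector and classifier objects are hypothetical placeholders,
# not the models studied by the researchers.

import numpy as np

def detect_deepfake(frames, face_detector, classifier, threshold=0.5):
    """Return True if the video is judged fake, based on per-frame face crops."""
    fake_scores = []
    for frame in frames:
        box = face_detector.locate(frame)          # (x, y, w, h) of the face
        if box is None:
            continue                               # no face found in this frame
        x, y, w, h = box
        face_crop = frame[y:y + h, x:x + w]        # crop the face region
        score = classifier.predict_fake_probability(face_crop)
        fake_scores.append(score)
    # Aggregate per-frame scores into a single video-level decision.
    return len(fake_scores) > 0 and float(np.mean(fake_scores)) > threshold
```

The attacks described next target exactly this kind of per-frame classification: if each cropped face can be nudged past the classifier, the whole video passes as real.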
The widespread dissemination of fake videos through social media platforms has raised significant concerns around the world, in particular by undermining the credibility of digital media, the researchers point out. “If the attackers have some knowledge of the detection system, they can design inputs to target the detector’s blind spots and bypass it,” said Paarth Neekhara, the paper’s other first co-author and a computer science student at UC San Diego.
The researchers created an adversarial example for each face in a video frame. Standard operations such as compressing and resizing a video usually remove adversarial perturbations from an image, but these examples are built to withstand those processes. The attack algorithm does this by estimating, over a set of input transformations, how the model classifies images as real or fake. From there, it uses this estimate to perturb the images so that the adversarial image remains effective even after compression and decompression.
The modified version of the face is then inserted into the video frame, and the process is repeated for every frame in the video to create an adversarial deepfake video. The attack can also be applied to detectors that operate on entire video frames rather than just face crops.
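The expectation-over-transformations idea described above is a general, previously published technique; the sketch below is a purely conceptual PyTorch illustration of it, not the authors' implementation (which, as noted below, has not been released). The detector model, the set of differentiable transforms approximating compression and resizing, and the step sizes are all assumptions made for the example.

```python
# Conceptual illustration of optimizing a perturbation whose effect is
# averaged over a set of input transformations. All components (the detector,
# the transforms, the step sizes) are hypothetical; this is not the
# researchers' unreleased attack code.

import torch

def eot_perturb(face, detector, transforms, steps=10, step_size=0.01, eps=0.03):
    """Craft a small perturbation that survives a set of transformations.

    face:       tensor of shape (1, 3, H, W), pixel values in [0, 1]
    detector:   differentiable model returning the probability the input is fake
    transforms: list of differentiable functions approximating compression,
                resizing, etc.
    """
    delta = torch.zeros_like(face, requires_grad=True)
    for _ in range(steps):
        # Average the detector's "fake" probability over all transformations.
        loss = torch.stack([detector(t(face + delta)) for t in transforms]).mean()
        loss.backward()
        with torch.no_grad():
            # Step in the direction that lowers the expected "fake" score,
            # while keeping the perturbation imperceptibly small.
            delta -= step_size * delta.grad.sign()
            delta.clamp_(-eps, eps)
        delta.grad.zero_()
    return (face + delta).clamp(0, 1).detach()
```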
The team declined to release its code so that it could not be used by hostile parties.
High success rate
The researchers tested their attacks in two scenarios: one where the attackers have full access to the detector model, including the face extraction pipeline and the architecture and parameters of the classification model; and one where attackers can only query the machine learning model to determine the likelihood of an image being classified as real or fake.
In the first scenario, the attack success rate was over 99 percent for uncompressed videos and 84.96 percent for compressed videos. In the second scenario, the success rate was 86.43 percent for uncompressed videos and 78.33 percent for compressed videos. This is the first work to demonstrate successful attacks against state-of-the-art deepfake detectors.
“To use these deepfake detectors in practice, we contend that it is essential to evaluate them against an adaptive adversary who is aware of these defenses and intentionally tries to outsmart those defenses,” the researchers write. “We show that the current state-of-the-art deepfake detection methods can be easily bypassed if the adversary has full or even partial knowledge of the detector.”
To improve detectors, the researchers recommend an approach similar to what is known as adversarial training: during training, an adaptive adversary keeps generating new deepfakes that can bypass the current state-of-the-art detector, and the detector keeps improving so that it can detect the new deepfakes.
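As a rough illustration of that training loop, here is a minimal Python sketch; the generate_adversarial_deepfakes helper, the detector's fit interface, and the label convention are all hypothetical stand-ins for whatever attack generator and detector a practitioner actually uses.

```python
# Minimal sketch of an adversarial-training loop of the kind the researchers
# recommend. The generate_adversarial_deepfakes() helper and the detector's
# fit() interface are hypothetical placeholders.

def adversarial_training(detector, real_samples, generate_adversarial_deepfakes,
                         rounds=10):
    for _ in range(rounds):
        # The adaptive adversary crafts new fakes against the current detector.
        adversarial_fakes = generate_adversarial_deepfakes(detector)
        # The detector is then retrained on real data plus the new fakes,
        # so that it learns to catch them (label 0 = real, 1 = fake).
        detector.fit(
            samples=real_samples + adversarial_fakes,
            labels=[0] * len(real_samples) + [1] * len(adversarial_fakes),
        )
    return detector
```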
Adversarial Deepfakes: Evaluating Vulnerability of Deepfake Detectors to Adversarial Examples
* Shehzeen Hussain, Malhar Jere, Farinaz Koushanfar, Department of Electrical and Computer Engineering, UC San Diego
Paarth Neekhara, Julian McAuley, Department of Computer Science and Engineering, UC San Diego