Last week, the Mona Lisa smiled. A big, broad smile, followed by what appeared to be a laugh and the silent mouthing of words that could only be an answer to the mystery that has captivated her viewers for centuries.
A lot of people were confused.
Mona's "living portrait", as well as the portraits of Marilyn Monroe, Salvador Dali and others, showcased the latest deepfake technologies: seemingly realistic videos or sounds generated at the same time. 39, help with machine learning. Developed by researchers at Samsung's Artificial Intelligence Laboratory in Moscow, the portraits show a new way to create credible videos from a single image. With only a few real face photographs, the results dramatically improve, producing what the authors describe as "photorealistic talking heads". The researchers call (with horror) the result "puppeteering", a reference to how invisible strings seem to manipulate the targeted face. . And yes, it could, in theory, be used to animate the photo of your Facebook profile. But do not be afraid that chains will maliciously pull your face so soon.
"Nothing tells me that you will use this turnkey to generate deepfakes at home. Not short, medium or even long term, "says Tim Hwang, director of AI's Harvard-MIT Ethics and Governance Initiative. The reasons are related to the high costs and the technical know-how necessary to create fake quality products, obstacles that will not disappear anytime soon.
Deepfakes first entered the public eye in late 2017, when an anonymous poster under the name "deepfakes" began uploading videos of celebrities like Scarlett Johansson stitched onto the bodies of pornographic actors. The first examples involved tools that could insert a face into existing footage, frame by frame, a glitchy process then and now, and they quickly expanded to political figures and TV personalities. Celebrities are the easiest targets, given the wealth of public images available to train deepfake algorithms; it's relatively easy to make a high-fidelity video of Donald Trump, for example, who appears on TV day and night and from all angles.
The technology underlying deepfakes is a hot area for companies working on augmented reality. On Friday, Google announced a breakthrough in controlling depth perception in video footage. In their paper, released Monday as a preprint, the Samsung researchers point to applications such as quickly creating avatars for games or video conferences. Conceivably, the company could use the underlying model to generate an avatar from just a few images, a photorealistic answer to Apple's Memoji. The same lab also released a paper this week on generating full-body avatars.
Worries about the misuse of these advances have sparked a debate over whether deepfakes could be used to undermine democracy. The concern is that a cleverly crafted fake, perhaps mimicking a grainy cell-phone video so that its imperfections are overlooked, and timed just right, could sway a lot of opinions. That has triggered an arms race to automate ways of detecting them ahead of the 2020 elections. The Pentagon's Darpa has sunk tens of millions of dollars into a media-forensics research program, and several startups are angling to become arbiters of truth as the campaign gets underway. In Congress, politicians have called for legislation banning the "malicious use" of deepfakes.
But Robert Chesney, a law professor at the University of Texas, points out that political havoc doesn't require cutting-edge technology; it can result from lower-quality material intended to sow discord, but not necessarily to deceive. Take, for example, the three-minute clip of House Speaker Nancy Pelosi circulating on Facebook, which appears to show her drunkenly slurring her words in public. It wasn't even a deepfake; the miscreants had simply slowed down the footage.
By reducing the number of photos required, Samsung's method does add another wrinkle: "This means bigger problems for ordinary people," says Chesney. "Some people might have felt a bit insulated by the anonymity of not having much video or photo evidence online." Called "few-shot learning," the approach does most of the heavy computational lifting in advance. Rather than being trained with, say, Trump-specific footage, the system is fed a far larger amount of video featuring diverse people. The idea is that the system learns the basic contours of human heads and facial expressions. From there, the neural network can apply what it knows to manipulate a given face based on only a few photos, or, as in the case of the Mona Lisa, just one.
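To make that structure concrete, here is a minimal, hypothetical sketch in PyTorch of the few-shot idea: an embedder distills a handful of reference photos of one person into an identity vector, and a generator conditioned on that vector renders new frames from target facial landmarks. This is not Samsung's actual code; the module names, layer sizes, and resolutions are all illustrative assumptions.

```python
# Illustrative sketch of few-shot talking-head synthesis (hypothetical, not
# the Samsung implementation). Meta-training over many identities is assumed
# to have already happened; here we only show the inference-time structure.
import torch
import torch.nn as nn

class Embedder(nn.Module):
    """Maps one or more reference photos to a single identity embedding."""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, dim),
        )

    def forward(self, photos):           # photos: (K, 3, H, W), K can be 1
        return self.net(photos).mean(0)  # average over the K references

class Generator(nn.Module):
    """Renders a frame from target landmarks plus the identity embedding."""
    def __init__(self, dim=256):
        super().__init__()
        self.fc = nn.Linear(dim, 16 * 16 * 16)
        self.up = nn.Sequential(
            nn.ConvTranspose2d(16 + 1, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, landmarks, emb):   # landmarks: (1, 1, 16, 16) heatmap
        feat = self.fc(emb).view(1, 16, 16, 16)
        return self.up(torch.cat([feat, landmarks], dim=1))

embedder, generator = Embedder(), Generator()
refs = torch.randn(3, 3, 64, 64)      # K=3 reference photos (random stand-ins)
identity = embedder(refs)             # one vector capturing the target face
pose = torch.rand(1, 1, 16, 16)       # desired facial-landmark heatmap
frame = generator(pose, identity)     # (1, 3, 64, 64) synthesized frame
```

Roughly speaking, the expensive part is the meta-training over videos of many different people; adapting to one new face afterward only requires the few reference photos, which is why a single image of the Mona Lisa can suffice.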
The approach resembles methods that have revolutionized how neural networks learn other things, such as language, with massive datasets that teach them generalizable principles. That has given rise to models like OpenAI's GPT-2, which wields written language so fluently that its creators decided not to release it, out of fear it would be used to craft fake news.
Wielding this new technique maliciously against you and me would still present big challenges. The system relies on fewer images of the target face, but it requires training a large model from scratch, which is costly and time-consuming, and will likely only become more so. It also takes expertise to wield. And it's unclear why you would want to generate a video from scratch rather than turning to, say, established film-editing techniques or Photoshop. "Propagandists are pragmatists. There are far cheaper ways of doing this," says Hwang.
For now, even if it were adapted for malicious use, this kind of fakery would be easy to spot, says Siwei Lyu, a professor at the State University of New York at Albany who studies deepfake forensics as part of the Darpa program. The demo, while impressive, misses finer details, he notes, like Marilyn Monroe's famous mole, which vanishes as she throws back her head to laugh. The researchers also haven't yet addressed other challenges, such as properly syncing audio with the deepfake and ironing out glitchy backgrounds. For comparison, Lyu sends me a state-of-the-art example using a more traditional technique: a video welding Obama's face onto an impersonator singing Pharrell Williams' "Happy." The Albany researchers aren't releasing their method, because of its potential to be weaponized.
Hwang has little doubt that improving technology will eventually make fakes difficult to distinguish from reality. Costs will come down, or a better-trained model will be released, allowing some savvy person to create a powerful online tool. When that happens, he argues, the solution won't necessarily be cutting-edge digital forensics, but the ability to look at contextual clues: a robust way for the public to assess evidence outside the video that corroborates or refutes its veracity. Fact-checking, basically.
But fact-checking of that sort is already proving a challenge for digital platforms, especially when it comes to taking action. As Chesney points out, it's currently simple enough to detect doctored footage, like the Pelosi video. The question is what to do next, without sliding down a slippery slope of determining the creators' intent, whether it's satire, perhaps, or made with malice. "If it's clearly intended to defraud the listener into thinking something derogatory, it seems obvious to remove it," he says. "But once you go down that path, you're in a line-drawing dilemma." Facebook appeared to reach a similar conclusion over the weekend: the Pelosi video was still being shared, with, the company said, additional context from independent fact-checkers.
This story was originally published on Wired.com.