Elon Musk-backed AI company claims to have created a text generator too dangerous to release


Photo: Getty

Researchers at the nonprofit AI research group OpenAI simply wanted to train their new text generation software to predict the next word in a sentence. It exceeded all their expectations and was so good at imitating human writing that they decided to slow down the research while they explore the damage it could cause.

Elon Musk has made it clear that he thinks artificial intelligence is the "greatest existential threat" to humanity. Musk is one of OpenAI's major funders and, although he has taken a back-seat role in the organization, its researchers seem to share his concerns about opening a Pandora's box. This week, OpenAI shared a paper covering its latest work on text generation technology, but it is departing from its usual practice of releasing the full research to the public, for fear that it could be abused by bad actors. Rather than releasing the fully trained model, it is offering researchers a smaller model to experiment with.

The researchers used 40 GB of data extracted from 8 million web pages to train the GPT-2 software. This is ten times the amount of data used for the first iteration of GPT. The dataset was collected by trawling through Reddit and selecting links to articles with more than three upvotes. Once the training process was completed, they found that the software could be fed a small amount of text and continue writing convincingly, at length, based on the prompt. It has problems with "very technical or esoteric content types" but when it comes to more conversational writing, it generates "reasonable samples" 50 percent of the time.
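The link-selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not OpenAI's actual pipeline: the `Submission` record, the `select_links` function, and the sample URLs are all hypothetical, and the threshold simply mirrors the "more than three upvotes" rule reported here.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    """Hypothetical record for one link posted to Reddit."""
    url: str
    upvotes: int


def select_links(submissions, min_upvotes=3):
    """Return unique URLs with more than `min_upvotes` upvotes, in first-seen order."""
    seen = set()
    selected = []
    for sub in submissions:
        # Keep a link only if it clears the vote threshold and is not a duplicate.
        if sub.upvotes > min_upvotes and sub.url not in seen:
            seen.add(sub.url)
            selected.append(sub.url)
    return selected


# Made-up sample data for illustration only.
sample = [
    Submission("https://example.com/a", 5),
    Submission("https://example.com/b", 3),   # not more than three: dropped
    Submission("https://example.com/a", 10),  # duplicate URL: dropped
    Submission("https://example.com/c", 4),
]
print(select_links(sample))
```

The idea behind a filter like this is that upvotes act as a rough, human-supplied signal that a linked page is worth reading, which is cheaper than judging page quality directly.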

In one example, the software was fed this paragraph:

In a shocking discovery, a scientist discovered a herd of unicorns living in a remote valley, hitherto unexplored, in the Andes. Even more surprising to the researchers was the fact that unicorns spoke perfect English.

On the basis of these two sentences, it was able to continue writing this fanciful news report for another nine paragraphs in a way that could credibly have been written by a human being. Here are the next few paragraphs that the machine produced:

The scientist named the population, after their distinctive horn, Ovid's Unicorn. These four-horned, silver-white unicorns were previously unknown to science.

Now, after almost two centuries, the mystery of what caused this strange phenomenon is finally solved.

Dr. Jorge Pérez, an evolutionary biologist from the University of La Paz, and several companions were exploring the Andes when they discovered a small valley with no other animals or humans. Pérez noticed that the valley had what appeared to be a natural fountain, surrounded by two peaks of rock and silver snow.

GPT-2 is remarkably good at adapting to the style and content of the prompts it is given. The Guardian was able to try out the software and fed it the first line of George Orwell's 1984: "It was a bright cold day in April, and the clocks were striking thirteen." The program picked up on the tone of the selection and proceeded with some dystopian science fiction of its own:

I was in my car on my way to a new job in Seattle. I put the gas in, put the key in, and then I let it run. I just imagined what the day would be like. A hundred years from now. In 2045, I was a teacher in some school in a poor part of rural China. I started with Chinese history and history of science.

OpenAI researchers found that GPT-2 performed very well when it was given tasks for which it was not necessarily designed, such as translation and summarization. In their report, the researchers wrote that they simply had to prompt the trained model in the right way for it to perform these tasks at a level comparable to that of other, specialized models. After analyzing a short story about an Olympic race, the software was able to correctly answer basic questions such as "How long was the race?" and "Where did the race start?"

These excellent results have alarmed the researchers. One of their concerns is that the technology could be used to turbocharge fake news operations. The Guardian published a fake news article written by the software alongside its coverage of the research. The article is readable and contains fake quotes that are on topic and realistic. The grammar is better than much of what you would see from fake news content mills. And according to The Guardian's Alex Hern, it took the bot only 15 seconds to write the article.

Other potential abuses the researchers identified include automating phishing emails, impersonating others online, and self-generating harassment. But they also believe that there are many beneficial applications to be discovered. For example, this could be a powerful tool for developing better speech recognition programs or dialogue agents.

OpenAI plans to engage the artificial intelligence community in a dialogue about its release strategy and hopes to explore possible ethical guidelines to direct this type of research in the future. The researchers said they will have more to discuss publicly in six months.

[OpenAI via The Guardian]
