SpaceX founder Elon Musk looks on at a press conference after the SpaceX Falcon 9 rocket, carrying the Crew Dragon spacecraft, lifted off on an unmanned test flight to the International Space Station from the Kennedy Space Center in Cape Canaveral, Fla., March 2, 2019.
Mike Blake | Reuters
Armchairs shaped like avocados and baby daikon radishes wearing tutus are among the original images created by new software from OpenAI, an artificial intelligence lab in San Francisco backed by Elon Musk.
OpenAI trained the software, known as Dall-E, to generate images from short text captions. Specifically, it used a dataset of 12 billion images and their captions, which were found on the internet.
The lab said Dall-E, whose name is a portmanteau of Spanish surrealist artist Salvador Dali and Wall-E, a small animated robot from the Pixar film of the same name, had learned to create images for a wide range of concepts.
OpenAI showed some of the results in a blog post published on Tuesday. “We’ve found that [Dall-E] has a diverse set of capabilities, including creating anthropomorphized versions of animals and objects, combining unrelated concepts in plausible ways, rendering text, and applying transformations to existing images,” the company wrote.
Dall-E is built on a neural network, which is a computer system loosely inspired by the human brain that can spot patterns and recognize relationships between vast amounts of data.
While neural networks have generated images and videos in the past, Dall-E is unusual because it relies on text input while others do not.
Synthetic videos and images have become more sophisticated in recent years, to the point that it has become difficult for humans to distinguish between what is real and what is computer generated. Generative adversarial networks (GANs), which use two neural networks, have been used to create fake videos of politicians, for example.
OpenAI acknowledged that Dall-E has the “potential for large and broad societal impacts,” adding that it plans to analyze how models like Dall-E “relate to societal issues such as economic impact on certain work processes and occupations, the potential for bias in model results and the longer-term ethical challenges this technology entails.”
Successor to GPT-3
Dall-E comes just months after OpenAI announced that it had built a text generator called GPT-3 (Generative Pre-trained Transformer 3), which is also underpinned by a neural network.
The language generation tool can produce human-like text on demand, and it became relatively famous for an AI program when people realized it could write its own poetry, news articles and short stories.
“Dall-E is a Text2Image system based on GPT-3 but trained on text and images,” Mark Riedl, associate professor at the Georgia Tech School of Interactive Computing, told CNBC.
“Text2Image is nothing new, but the Dall-E demo is remarkable for producing illustrations that are much more consistent than other Text2Image systems I’ve seen in recent years.”
OpenAI competes with companies like DeepMind and the Facebook AI Research group to create general-purpose algorithms that can perform a wide range of tasks at the human level and beyond.
Researchers have built AIs that can play complex games like chess and the Chinese board game Go, translate one human language into another, and spot tumors in mammograms. But getting an AI system to demonstrate true “creativity” is a major challenge in the field.
Riedl said that Dall-E’s results show that it has learned to mix concepts in a coherent way, adding that “the ability to mix concepts in a coherent way is seen as a key form of creativity in humans.”
“From a creativity standpoint, this is a big step forward,” Riedl added. “While there is not much agreement on what it means for an AI system to ‘understand’ something, the ability to use concepts in new ways is an important part of creativity and intelligence.”
Neil Lawrence, the former director of machine learning at Amazon Cambridge, told CNBC that Dall-E looked “very impressive.”
Lawrence, who is now a professor of machine learning at the University of Cambridge, described it as “an inspiring demonstration of the ability of these models to store information about our world and to generalize in ways that humans find very natural.”
He said: “I think there will be all kinds of applications of this type of technology; I can’t even begin to imagine. But it’s also interesting in terms of being another pretty mind-blowing technology that solves problems we didn’t even know we actually had.”
‘Doesn’t advance the state of AI’
Not everyone is so impressed with Dall-E, however.
Gary Marcus, an entrepreneur who sold a machine learning startup to Uber in 2016 for an undisclosed sum, told CNBC it was interesting but “it doesn’t advance the state of AI.”
He also pointed out that it has not been open-sourced and that the company has yet to publish an academic paper on the research.
Marcus has previously questioned whether some of the research published by rival lab DeepMind in recent years should be classified as “breakthroughs.”
OpenAI was created as a non-profit organization with a pledge of $1 billion from a group of founders including Tesla CEO Elon Musk. In February 2018, Musk stepped down from OpenAI’s board of directors but continues to donate to and advise the organization.
OpenAI became a for-profit company in 2019 and raised an additional $1 billion from Microsoft to fund its research. GPT-3 is slated to be OpenAI’s first commercial product, and Reddit has signed up as one of the first customers.