AI bots trained for 180 years a day to beat humans at Dota 2




Beating humans at board games is old news in the AI world. Now top academics and technology companies want to beat us at video games instead. Today, OpenAI, a research lab founded by Elon Musk and Sam Altman, announced its latest milestone: a team of AI agents that can beat the top 1 percent of amateur players at the popular battle arena game Dota 2.

You may remember that OpenAI first entered the world of Dota 2 last August, unveiling a system capable of beating the best players in 1v1 matches. However, that type of match greatly simplifies the challenge of Dota 2. OpenAI has now upgraded its bots to play humans in 5v5 matches, which require more coordination and long-term planning. And although OpenAI has not yet challenged the game's very best players, it will do so later this year at The International, a Dota 2 tournament that is the biggest annual event on the e-sports calendar.

The motivation for research like this is simple: if we can teach AI systems the skills they need to play video games, we can use them to solve complex real-world challenges that, in some ways, look like video games, such as managing a city's transportation infrastructure.

"It's an important step, and it's really because it's about moving toward real-world applications," Greg Brockman, co-founder and chief technology officer of OpenAI, told The Verge. "If you have a simulation [of a problem] and you can run it at a large enough scale, there is no barrier to what you can do with it."



Dota 2 is a complex battle arena game in which teams of five work toward a common goal. Matches usually last about 45 minutes.
Picture: Valve

Fundamentally, video games pose challenges that board games like chess or Go do not. They hide information from the players, which means that an AI cannot see the entire playing field and calculate the best next move. There is also more information to process and a far larger number of possible moves. OpenAI says that at any given moment its Dota 2 bots must choose between 1,000 different actions while processing 20,000 data points that represent what is happening in the game.

To create its bots, the lab turned to a machine learning method called reinforcement learning. This is a deceptively simple technique that can produce complex behavior. AI agents are thrown into a virtual environment where they teach themselves how to achieve their goals through trial and error. Programmers define what are called reward functions (awarding a bot points for things like killing an enemy), then leave the AI agents to play against themselves over and over again.
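
As a rough illustration of the trial-and-error loop described above, here is a minimal sketch in Python. It is not OpenAI's training setup: it uses a single-step toy problem (a "bandit") that is far simpler than Dota 2, and the action names, outcome probabilities, and point values are all invented for the example. The only idea taken from the article is that a programmer-defined reward function scores events, and the agent learns from repeated play which actions earn more points.

```python
import random

# Hypothetical point values. The article only says that points are awarded
# "for things like killing an enemy"; these numbers are made up.
REWARDS = {"kill_enemy": 1.0, "idle": 0.0, "die": -1.0}

# Toy environment (an assumption): each action leads to an event
# with a fixed probability.
OUTCOMES = {
    "attack": [("kill_enemy", 0.8), ("die", 0.2)],
    "wander": [("idle", 1.0)],
    "dive_tower": [("kill_enemy", 0.2), ("die", 0.8)],
}


def reward(event):
    """Reward function: map a game event to points."""
    return REWARDS[event]


def step(action, rng):
    """Sample the event that results from taking an action."""
    events, weights = zip(*OUTCOMES[action])
    return rng.choices(events, weights=weights)[0]


def train(episodes=5000, epsilon=0.1, seed=0):
    """Learn an average reward per action by trial and error."""
    rng = random.Random(seed)
    actions = list(OUTCOMES)
    value = {a: 0.0 for a in actions}  # running average reward per action
    count = {a: 0 for a in actions}
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)           # explore: try something random
        else:
            action = max(actions, key=value.get)   # exploit: best estimate so far
        r = reward(step(action, rng))
        count[action] += 1
        value[action] += (r - value[action]) / count[action]  # incremental mean
    return value


if __name__ == "__main__":
    for action, estimate in train().items():
        print(f"{action}: {estimate:+.2f}")
```

The agent starts with no knowledge of which action is good and ends up preferring the one that scores best under the reward function, which is the same basic loop the article describes, at a vastly smaller scale.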

For this new batch of Dota bots, the amount of self-play is staggering. Every day, the bots played 180 years' worth of game time at an accelerated pace, and they trained at this rate for several months. "It starts out totally random, wandering around the map, and then, after a few hours, it starts to pick up basic skills," says Brockman. He says that it takes a human between 12,000 and 20,000 hours of play to become a professional, which means that OpenAI's agents "play 100 human lifetimes of experience every day."
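
The figures in the paragraph above can be checked with a little arithmetic. The sketch below assumes a year of 365.25 days and reads one "lifetime" as the 12,000 to 20,000 hours Brockman says it takes a human to turn professional; both readings are assumptions made for this example, not definitions from OpenAI.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumption: average year length


def hours_of_play_per_day(years_per_day=180):
    """Game hours played per real-world day."""
    return years_per_day * HOURS_PER_YEAR


def pro_careers_per_day(hours_to_go_pro, years_per_day=180):
    """How many 'time to become a professional' spans fit in one day of training."""
    return hours_of_play_per_day(years_per_day) / hours_to_go_pro


def aggregate_speedup(years_per_day=180):
    """Game time per unit of real time, summed over all training."""
    return years_per_day * 365.25


if __name__ == "__main__":
    print(hours_of_play_per_day())       # 1577880.0 game hours per day
    print(pro_careers_per_day(12_000))   # about 131
    print(pro_careers_per_day(20_000))   # about 79
    print(aggregate_speedup())           # 65745.0 times real time
```

The range of roughly 79 to 131 brackets the quoted figure of 100. The speed-up of about 65,745 times real time is an aggregate; the article does not say how much of it comes from running the game faster and how much from running many games at once.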

On the one hand, this demonstrates the power of modern machine learning methods and the latest computer chips to handle large amounts of data. On the other hand, it's a reminder of how fundamentally unintelligent AI agents are. If humans took thousands of years to learn to play a single video game, we would not have gotten very far as a species.



The OpenAI bots were still restricted. For example, they only played with five of the 115 heroes available, including Necrophos (pictured).
Picture: Valve

Although the OpenAI bots are now playing 5v5 matches, they are not yet exposed to the full complexity of Dota 2. A number of limitations are in place. They play using only five of the 115 available heroes, each of which has its own style of play. (Their choice: Necrophos, Sniper, Viper, Crystal Maiden, and Lich.) Some elements of their decision-making are hard-coded, such as which items they buy from vendors and which skills they level up with their experience points. Other tricky parts of the game have been disabled altogether, including invisibility, summons, and the placement of wards, which are items that act as remote cameras and are essential in high-level play. (As one game guide puts it: "If there is one topic that troubles newcomers more than anything else, it is warding.")

OpenAI's agents also have all the advantages you would expect from a computer. Their reaction times are faster than those of humans, they never misclick, and they have instant and accurate access to data such as item inventories, the health of heroes, and the distance between objects on the map, which matters when casting spells. This is all information that human players must check manually or judge by instinct.

All this may seem like an indictment of the bots' abilities, but Brockman argues that it's a distraction. According to him, the real achievement is that OpenAI's agents can play entire games of Dota 2, which last 45 minutes on average. This type of long-term planning was considered difficult or impossible to teach through reinforcement learning, but OpenAI's work suggests otherwise. Brockman says that the main reason for their success is simply that they brought more computing power to bear on the problem. "It's really scale," he says.

Andreas Theodorou, an AI researcher at the University of Bath who uses computer games to study collaboration, says the latest research on 5v5 games is a big step forward. He also pointed to the interactive visualizations OpenAI has published of the behavior of their agents. (These interactive visualizations can be seen here.) "These techniques show that even reinforcement learning and machine learning systems, in general, can be transparent," Theodorou told The Verge. These additions "increase the value of the system," he says, especially for educational purposes.

The researchers' use of a separate reward function to encourage the bots to work together was also notable, says Theodorou. This reward function was labeled "team spirit" and was increased over the course of each game. The bots start each game pursuing individual goals, such as racking up kills, but over time they focus more on shared goals.
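
One way to picture a "team spirit" setting like the one described above is as a weight that blends each bot's own reward with the team's average reward. The article gives no formula, so the Python sketch below is an assumption throughout: the weighted average, the linear ramp over a 45-minute game, and the function names are all hypothetical and chosen only to illustrate the idea.

```python
def blend_rewards(individual, team_spirit):
    """Mix each bot's own reward with the team average.

    team_spirit = 0.0 -> every bot keeps only its own reward.
    team_spirit = 1.0 -> every bot receives the team average.
    (Hypothetical formula; the article does not specify one.)
    """
    if not 0.0 <= team_spirit <= 1.0:
        raise ValueError("team_spirit must be between 0 and 1")
    team_mean = sum(individual) / len(individual)
    return [(1.0 - team_spirit) * r + team_spirit * team_mean for r in individual]


def team_spirit_at(minute, game_length=45.0):
    """Assumed linear ramp from 0 to 1 over the length of a game."""
    return min(max(minute / game_length, 0.0), 1.0)


if __name__ == "__main__":
    # One bot scores a kill worth 1 point; its four teammates score nothing.
    rewards = [1.0, 0.0, 0.0, 0.0, 0.0]
    for minute in (0.0, 22.5, 45.0):
        weight = team_spirit_at(minute)
        print(minute, blend_rewards(rewards, weight))
```

With the weight at zero, only the bot that scored is rewarded. With the weight at one, all five bots receive the same share, so an action that helps a teammate is worth as much as one that helps the bot itself. The total reward handed out stays the same in both cases.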

Brockman says that, unlike with human players, this means there is absolutely no ego involved. "The bots are totally willing to sacrifice a lane or abandon a hero for the greater good," he told The Verge. "For fun, we had a human drop in to replace one of the bots. We had not trained them to do anything special, but he said that he felt so well supported. Anything he wanted, the bots got him."

OpenAI's team of bots has so far played five multigame matches against amateur and semi-pro teams, winning four and drawing one. But their biggest challenge will come later this year at The International. Can machines with perfect timing and no ego match the fluid, intuitive play of human professionals? At this point, it's anyone's guess.
