Robots beat humans in poker



On a hot summer day two weeks ago, I sat in a cavernous room in the Mojave Desert with some 2,000 other people I had never met before. Somewhere on the premises, more than $8 million in cash was held in a secure location: the sum that we, and others like us in adjoining rooms, had collectively paid for the privilege. For 14 hours that day, we sat. From time to time, one of us would quietly get up and leave, never to return. The last survivor among us would become an instant millionaire.

We were playing poker. Unbeknownst to me at the time, two Intel processors on the other side of the country had just been put through a similar test. At the crescendo of the World Series of Poker in Las Vegas, two computer scientists announced that they had created an artificial intelligence poker player stronger than a full table of top professionals at the most popular form of the game: no-limit Texas Hold 'em.

Noam Brown, a researcher at Facebook AI Research, and Tuomas Sandholm, a computer scientist at Carnegie Mellon, describe their findings in a new paper titled "Superhuman AI for multiplayer poker," published today in the journal Science.

In recent decades, artificial intelligence has surpassed the best humans at many of our species' most beloved games: checkers and its long-term planning, chess and its iconic strategy, Go and its complexity, backgammon and its element of chance, and now poker and its imperfect information. Ask the researchers who have worked on these projects why they do it and they will tell you one thing: games are a test bed. Games are where techniques are tested, results are measured and machines are compared to humans. And each game adds another layer that more faithfully reproduces the real world. The real world requires planning, strategy, complexity, chance and, perhaps most vexingly, it contains an indescribable sea of hidden information.

"No other popular recreational game captures the challenges of hidden information as effectively and as elegantly as poker," write Brown and Sandholm.

I have been working on a book about the collision of games and artificial intelligence for about nine months, and I am still working on it; unfortunately, I did not become an instant millionaire at the World Series of Poker. As humans have surrendered their dominance in game after game, I have come to see artificial intelligence as both an omen and an object lesson: it offers a glimpse of a potential future of superintelligent systems and teaches us how we could and should react.

Poker, because of its complexity and the fact that players hide crucial information, has been one of the last frontiers among these popular games, and that frontier is rapidly being settled. Computers' conquest of poker has been gradual, and much of the work to date has focused on the relatively simple heads-up, or two-player, version of the game.

In 2007 and 2008, computers, led by a program called Polaris, showed promise in the first man-versus-machine matches, holding their own against or even defeating human pros at heads-up limit Hold 'em, in which two players are restricted to certain fixed bet sizes.

In 2015, heads-up limit Hold 'em was "solved" by an artificial intelligence player named Cepheus. This means that you could not distinguish Cepheus's play from perfection, even after watching it for a lifetime.

In 2017, in a Pittsburgh casino, a quartet of human professionals each faced a program called Libratus at the incredibly complex heads-up no-limit Hold 'em. The human pros were summarily destroyed. At about the same time, another program, DeepStack, also claimed superiority over human pros at heads-up no-limit.

And in 2019, Wired reported that the game-theoretic technology behind Libratus was being put to use in the service of the US military, in the form of a two-year, $10 million contract with a Pentagon agency called the Defense Innovation Unit.

The latest creation of Brown and Sandholm, named Pluribus, is superhuman at no-limit poker with more than two players, specifically six. That format is identical to one of the most popular forms of the game played online, and it closely resembles the game I played in that desert room.

In a seminal 1951 paper on game theory, one of the field's fathers, John Nash, examined an ultra-simplified version of poker, calling the game "the most obvious target" for applications of his theory. "The analysis of a more realistic poker game than our very simple model should be quite an interesting affair," he wrote. He predicted that the analysis would be complicated and that computational methods would be needed. He was right.

Pluribus, like other superhuman game-playing AIs, learned to play poker solely by playing against itself, over eight days and 12,400 core hours of computation. It begins by playing randomly. It observes what works and what does not. And it adjusts its approach along the way, using an algorithm that steers it toward the eponymous Nash equilibria. This process created its plan of attack for the entire game, called its "blueprint strategy," which was computed offline before the competition for what the authors estimate would be just $144 in current cloud computing costs. During its competitive games, Pluribus searches in real time for improvements to that blueprint.
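To give a feel for how random self-play can drift toward a Nash equilibrium, here is a minimal sketch in Python. It is not Pluribus's algorithm or anything taken from the paper: it applies a much simpler technique, regret matching, to rock-paper-scissors, a toy game chosen here purely for illustration. The function names (`payoff`, `strategy_from_regrets`, `train`) are hypothetical.

```python
import random

ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors (toy game, an assumption)


def payoff(a, b):
    """Payoff to the player choosing a against b: +1 win, 0 tie, -1 loss."""
    return [0, 1, -1][(a - b) % 3]


def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet, so play uniformly at random.
    return [1.0 / ACTIONS] * ACTIONS


def train(iterations, seed=0):
    """Two copies of the same learner play each other; return average strategies."""
    rng = random.Random(seed)
    regrets = [[0.0] * ACTIONS for _ in range(2)]
    strategy_sums = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        moves = [rng.choices(range(ACTIONS), weights=s)[0] for s in strategies]
        for p in range(2):
            opponent_move = moves[1 - p]
            actual = payoff(moves[p], opponent_move)
            for a in range(ACTIONS):
                # Regret: how much better action a would have done this round.
                regrets[p][a] += payoff(a, opponent_move) - actual
                strategy_sums[p][a] += strategies[p][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, strategy in enumerate(train(100_000)):
        print(player, [round(p, 3) for p in strategy])
```

Each player's average strategy should settle near one third for each move, which is the Nash equilibrium of rock-paper-scissors. Six-player no-limit Hold 'em is vastly larger, but the loop of playing, measuring regret and adjusting has the same flavor as the process described above.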

The finished program, which ran on just two Intel processors, was pitted against top players, each of whom had won at least $1 million playing professionally, in two experiments over thousands of hands: one with one copy of Pluribus and five humans, and another with one human and five copies of Pluribus. The humans were paid per hand and given an incentive to play their best, with the money put up by Facebook. Pluribus was found to be profitable in both experiments, at levels of statistical significance worthy of publication in Science.
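What "profitable at a level of statistical significance" means can be sketched with a basic one-sided test on per-hand winnings. This is only an illustration under simple assumptions (independent hands, a normal approximation); it is not the authors' analysis, the function name `one_sided_z_test` is hypothetical, and the numbers in the usage line are made up.

```python
import math
from statistics import mean, stdev


def one_sided_z_test(winnings):
    """Test whether average per-hand winnings are greater than zero.

    Returns (average, standard_error, z_score, p_value).
    Assumes hands are independent and uses a normal approximation.
    """
    n = len(winnings)
    average = mean(winnings)
    standard_error = stdev(winnings) / math.sqrt(n)
    z_score = average / standard_error
    # Probability of a z-score at least this large if the true mean were zero.
    p_value = 0.5 * math.erfc(z_score / math.sqrt(2))
    return average, standard_error, z_score, p_value


if __name__ == "__main__":
    # Made-up winnings for five hands, for illustration only.
    print(one_sided_z_test([2.0, -1.0, 3.0, 0.0, 1.0]))
```

A small p-value says that winnings this large would be unlikely if the player were merely breaking even. Poker results are extremely noisy, which is why the experiments needed thousands of hands.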

"I think this was the last milestone in poker," Brown said. "I think poker has served its purpose as a benchmark and challenge problem for AI."

"I probably have more experience battling the best poker AI systems than any other poker professional in the world," said Jason Les, one of Pluribus's opponents. "I know all the spots to look for weaknesses, all the tricks to try to take advantage of a computer's shortcomings. In this competition, the AI played a sound, game-theory-optimal strategy that you really only see from the best human professionals. Despite my best efforts, I was not able to find a way to exploit it. I would not want to play in a poker game where this AI poker bot was at the table."

Sandholm and Brown told me that they expect the technology behind Pluribus to have even wider applications than that of their previous bots. They believe Pluribus is the first multiplayer (as in more than two players) milestone in AI game playing, and that it could affect a long list of multiplayer "games" in the real world: auctions, multiparty negotiations, political advertising, self-driving cars.

In that cavernous desert hall at the World Series of Poker in Vegas, the humans were not thinking about political ads or autonomous cars, but many of them were thinking about game theory. The best poker professionals are increasingly drawing on the academic literature on AI, commercially available programs such as PokerSnowie and PioSOLVER, and even computer science Ph.D.s whom they recruit as consultants to refine their games. As a result, the quality of human poker play has never been higher, and Pluribus could raise it further still.

But I have spoken to both professionals and scientists who think that poker AIs could kill the very game they are trying to conquer. Indeed, they may have already killed heads-up limit Hold 'em. On the one hand, say these skeptics, modern elite poker can seem sterile, with young pros playing optimized games behind their sunglasses and under their headphones. On the other hand, poker resembles a pyramid scheme: it needs a wide range of skill levels to allow the pros at the top to play for big money. As humans learn quickly from the robots, everyone becomes good, skill levels flatten, the pyramid collapses and the game dies.

"Unfortunately, that could have some merit," said Sandholm. "It would be very sad. I've come to love this game."

