Meet Pluribus, He Just Beat the World’s Best Poker Players

Pluribus is the world’s most advanced poker AI, developed by Carnegie Mellon University and Facebook AI research scientists. Here is how the software beat the world’s poker elite.

Enter Pluribus, First Among the Poker AI

It’s a classic case of man against the machine. Well, if it’s any consolation, the men originally outnumbered the machines, but now that Pluribus is up and running, the AI can take on as many as five poker opponents at the same table. Developed by Carnegie Mellon University and Facebook AI scientists, the software faced as many as 15 professional poker players and defeated them all in No-Limit Texas Hold’em.

Pluribus first took on Chris “Jesus” Ferguson, a player who has earned one of poker’s highest distinctions by winning six World Series of Poker events. Darren Elias also lost, despite holding the record for the most World Poker Tour titles.

In the first format, five copies of the AI were set against a single pro, with 5,000 hands played each time. When a single copy of Pluribus took on five human players at once, 10,000 hands were played. While some may think that machines would logically have the upper hand, Tuomas Sandholm, who co-authored the project at Carnegie Mellon University, begged to differ.

According to Sandholm, who has spent 16 years teaching an AI to play poker and defeat the world’s elite, poker is a game of imperfect information with many unknowns. Players readily offer misleading information, and an AI is no judge of character. It can only work with probabilities.

Rise of the Machines: From Claudico to Libratus, to Pluribus

The evolution of AI has been somewhat quicker than your average species development. Sandholm’s team has been trying to create the ultimate poker-playing machine for 16 years, and to this end alone, much testing was needed.

After years of research, a predecessor of the AI, Claudico, was fielded at the Brains vs. Artificial Intelligence tournament held at the Rivers Casino in Pittsburgh. Claudico played 80,000 hands over two weeks, but it fell short of a convincing result, and the outcome was close enough that judges said it could have come down to luck.

Emboldened by their efforts, the team went back to the table and gave Claudico a new purpose – to improve its own game rather than try and home in on the mistakes of others. Thus, Libratus was born. This new AI was designed to be much more assertive as well. Researchers also saw merit in increasing the number of hands the computer played, so 120,000 hands were assigned.

Libratus was in much better form than its predecessor and proved to be a powerful AI that knew when to bluff and when to play. It took on four players over a combined 120,000 hands of poker, winning by $1.7 million in chips and achieving the statistical significance that established it as the clear winner.

It is important to note that at this time, Libratus was taking on single opponents, rather than being challenged by several different players at the same time. That meant more work for Sandholm and his team, who were glad to go back to improving the AI.

Libratus was superior to Claudico, but the two also shared similarities. For example, Libratus didn’t try to memorize strategies. Instead, it relied on the quality that many of today’s professionals would agree makes a great player: being unpredictable.

The objective was clear: devise a way for the AI to account for all the different moves players can make. To do this, Sandholm’s team had to develop basic behavioral rules which, while intended to limit the possible outcomes, still amounted to as many as “six million continuation strategies”, as the researcher himself noted.

For every move a player made, Libratus went back four moves and tried to reason about why the player had acted that way.

Libratus was smart enough to devise its own strategies by training against itself and experiencing new situations from which it could learn. Uniquely, the AI taught itself to be random of its own accord rather than being programmed to devise randomized strategies.
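
To give a feel for how self-play alone can lead a program to randomize, here is a toy sketch. It is not the researchers' code and makes no claim about how Libratus or Pluribus is implemented. It assumes a far simpler game (rock-paper-scissors) and a simple learning rule called regret matching, and every name in it (`payoff`, `current_strategy`, `self_play`) as well as the iteration count is made up for illustration. Two copies of the same learner play each other, and each one shifts weight toward the moves it wishes it had played.

```python
import random

ACTIONS = ["rock", "paper", "scissors"]


def payoff(a, b):
    """Return +1 if action a beats b, -1 if it loses, 0 on a tie.

    a and b are indices into ACTIONS; each action beats the one before it.
    """
    if a == b:
        return 0
    return 1 if (a - b) % 3 == 1 else -1


def current_strategy(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to a uniform mix.
    return [1.0 / len(regrets)] * len(regrets)


def self_play(iterations, seed=0):
    """Let two regret-matching learners play each other.

    Returns the average strategy of each player over all iterations.
    """
    rng = random.Random(seed)
    regrets = [[0.0] * 3 for _ in range(2)]
    strategy_sums = [[0.0] * 3 for _ in range(2)]
    for _ in range(iterations):
        strategies = [current_strategy(r) for r in regrets]
        moves = [rng.choices(range(3), weights=s)[0] for s in strategies]
        for player in range(2):
            me, opp = moves[player], moves[1 - player]
            earned = payoff(me, opp)
            for alt in range(3):
                # Regret: how much better the alternative would have done.
                regrets[player][alt] += payoff(alt, opp) - earned
                strategy_sums[player][alt] += strategies[player][alt]
    return [[s / iterations for s in sums] for sums in strategy_sums]


average = self_play(100_000)
for name, prob in zip(ACTIONS, average[0]):
    print(f"{name}: {prob:.3f}")
```

Nothing in the code tells the learner to be random, yet over many rounds its average play should drift toward an even mix of the three moves, because any predictable pattern gets punished by its opponent.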

Soaking Up the AI Wisdom: Play Only When You Are Sure

And so we arrive at Pluribus, the most advanced version of the AI, capable of taking on multiple opponents at the same table. If there is one immediate observation that researchers made while watching Pluribus play, it was that the AI didn’t enter hands passively. It either raised or folded, rather than calling.

Conversely, among the AI’s stranger strategies was so-called donk betting, whereby the AI ended one betting round with a call and then opened the next round with a bet. In fact, Pluribus did this far more often than a human player would.

The true strength of the AI, however, lay in the fact that it achieved complete randomization of its strategies. It didn’t follow any patterns and played with bet sizes that varied all the time, adding to the confusion among the players. The AI’s behavior demonstrated a level of play that hasn’t yet been mastered by humans. Here is what poker player Michael Gagliano had to say:

“There were several plays that humans simply are not making at all, especially relating to its bet sizing. Bots/AI are an important part in the evolution of poker, and it was amazing to have first-hand experience in this large step toward the future.”

Meanwhile, Sandholm’s team is already looking into real-world applications for its software. Pluribus, true to its name, could be used to research new drugs and address pressing health issues, such as antibiotic-resistant bacteria. The team has also licensed the AI to companies working in defense and intelligence as well as gaming and entertainment.

How Are Bots Going to Impact the Community?

The success of Pluribus has been welcomed, but it has also made people question the future of the game. Could such an AI be exploited by teams that send a player to the table and feed him the best strategies? According to the team, a similar AI could be developed for about $150 on a cloud service. Security at live tournaments is too tight to allow such schemes, but the fact that Pluribus needs only some 12,000 CPU core hours of training and just 28 cores during play is disconcerting.

Yet, it’s unlikely to mess with the poker community. For starters, the team behind the project isn’t releasing the code, cognizant of the havoc it could cause. There have been similar qualms about AI in other verticals, including competitive video gaming where an AI taught itself how to beat the world’s elite in StarCraft II.

Simon Deloit: Simon is a freelance writer who specializes in gambling news and has been an author in the poker/casino scene for 10+ years. He brings valuable knowledge to the team and a different perspective, especially as a casual casino player.