How a computer program took the gambling out of poker

Researchers at the University of Alberta say they have developed a program that has 'solved' a variant of poker, meaning that it knows the optimal move in essentially every scenario, and therefore can't be beat in the long run.

Scientists say they have developed a computer program that has essentially solved a variant of poker, meaning that it knows the optimal move for just about every possible scenario.

Staff / The Christian Science Monitor

January 8, 2015

If you were to spend an evening playing poker against Cepheus, a computer program developed by researchers at the University of Alberta, you wouldn't really be gambling. That's because you're guaranteed to lose.

The researchers have announced that they have essentially "solved" a popular variant of poker called Heads-up Limit Texas Hold'em. That is, for virtually every scenario that could arise in the game, Cepheus knows the optimal move. 

Cepheus is the latest in a long line of poker players bred by UAlberta's Computer Poker Research Group, beginning in 1997 with a program called Loki. While these programs could all play competitively – a program called Polaris bested six human poker professionals in 2008 – none managed to “solve” the game until now. CPRG's dubbed their newest program Cepheus, a sly reference to a constellation whose brightest star – Gamma Cephei – is set to replace Polaris as the North Star a thousand years from now. 

In Kentucky, the oldest Black independent library is still making history

“It doesn’t actually need much computing power while playing,” says UAlberta computer scientist Michael Bowling, who lead the research. “It has one huge table that holds every possible decision point that can arise in the game, so all it needs to do is look up the appropriate action in the table.”

While the program doesn't need much power to run, preparation for the game was a different story. Dr. Bowling’s team had 4000 CPUs “training” Cepheus simultaneously, each one simulating six billion hands per second – that’s 24 trillion hands per second total. This process lasted two months.

“We estimate that it has played more hands than all of humanity in the history of poker,” Bowling says.

Cepheus represents a step forward for thinking machines – previously, programs have only been able to solve “perfect information” games like tic-tac-toe, checkers, and Connect Four. In those games, past moves are known by both players. Poker, by contrast, is a family of “imperfect information” games, which means that certain aspects of the game cannot be known by both players. 

There are only two players in Heads-up Limit Hold’em. Each player begins with a number of chips, and each is dealt a two-card hand that remains secret until the end of the game. Over a maximum of four rounds, five “community cards” are dealt face-up on the table. In each round, players may “fold,” “call,” or “raise.” If a player folds, the game ends and the “pot” (the sum of chips wagered) goes to the other player. A player calls by matching the wagered number of chips, thus beginning the next round. If a player raises by increasing the bet, the round continues until a player folds or calls. If the game progresses to the end of the fourth round, the player with the best five-card hand – using cards from their own hand as well as the community cards – wins the pot.

A majority of Americans no longer trust the Supreme Court. Can it rebuild?

Add competitive betting and bluffing to the mix, and the game becomes very hard to crack. Despite improvements in computational power and speed, the solution to the game eluded researchers for over a decade. Finally, Bowling’s team found the key in a surprisingly human concept: regret.

“Many years ago, people started investigating – and this is not just poker, this is artificial intelligence – algorithms that minimize the amount of regret,” says Jonathan Schaeffer a former CPRG leader who also led the team that solved checkers in 2007. “These algorithms look at situations and come up with a decision that has, to put it simply, the best worst case. Regret minimization tries to come up with answers with the smallest amount of downside.”

Cepheus plays to minimize loss, rather than maximize gain. In other words, it doesn't win every hand – in fact, it only wins the hand about 60 percent of the time – but it reliably wins the pot over a long period of time. And if you thought Bowling’s program was casino-bound, you would be wrong. Poker is more than a game for Bowling and CPRG – to them, it is the perfect “testbed” to develop and improve artificial intelligence.

“The real world is a whole lot like a poker game,” Bowling said. “One of the things we have to cope with in making any real world decision is uncertainty. Humans can deal with uncertainty. We’re still able to make reasonable decisions. Poker embodies that uncertainty. If we’re going to build artificial intelligence systems that can answer complex real-world problems, they need to deal with uncertainty. So looking at this through the lens of gaming is actually easier.”

By expressing worldly problems as games, the same programming that allows Cepheus to dominate on the poker table could have a myriad of other uses – including cybersecurity, military strategy, and political negotiation.

“Game theory was always intended to be broadly construed,” Bowling said. “The word ‘game’ is actually misleading – it is really meant to represent any strategic problem-solving scenario with more than one person. One obvious example is security, which is already an area where game theory has had an impact. How do we protect things of strategic importance in a post-9/11 world? We have patrols and checkpoints and searches. These programs ask, ‘What’s the worst case that can happen, and how can we avoid that?’”

The game theory research community is also seeking out medical uses for programs like Cepheus. Tuomas Sandholm, a computer scientist and professor at Carnegie Mellon University, is working on applying game theory to epidemiology. 

“The conceptualization is that the disease is one player, and the treater is another player,” Sandholm said. “Traditionally in medicine, doctors try to give you a treatment that makes you myopically better. But here is the idea is that we can make sequential plans that may not make you myopically better, but they will drive the disease into a state where we can defeat it more effectively. Because diseases evolve or adapt to treatments in a myopic way instead of a strategic way, we can trap them.”

“There are real-world applications for this, and in many ways poker is a much better game for understanding the real world because the real world is all about probabilities,” said Schaeffer “Solving checkers, in some sense, was easy because the decisions were unequivocal. Here, it’s a matter of probability. Think about buying stocks – it’s not a sure thing. You’re going to read all the information you can about stocks, but it’s an educated guess. The whole world is about probability and decisions. Whether it be love or financial or war or politics, it doesn’t matter. It’s all about probabilities.” 

"Heads-up limit hold’em poker is solved" appears in the current issue of the journal Science. If you wish, you can play a few hands against Cepheus here, but don't get your hopes up.