Similar articles
Id      Source       Date       Prob   Score  Title
142924  TECHCRUNCH   2019-7-11  1.000  —      AI smokes 5 poker champs at a time in no-limit Hold’em with ‘relentless consistency’
143039  THEVERGE     2019-7-11  0.980  0.693  Facebook and CMU’s ‘superhuman’ poker AI beats human pros
142999  ARSTECHNICA  2019-7-11  0.972  0.692  Facebook AI Pluribus defeats top poker professionals in 6-player Texas Hold ’em
143004  ENGADGET     2019-7-11  0.972  0.653  Facebook and CMU's poker AI beat five pros at once
143037  VENTUREBEAT  2019-7-11  0.975  0.564  Facebook’s AI beats human poker champions
142995  ARSTECHNICA  2019-7-11  0.035  0.536  DeepMind AI is secretly lurking on the public StarCraft II 1v1 ladder
142974  ENGADGET     2019-7-11  0.014  0.492  DeepMind's ‘Starcraft II’ AI will play public matches
143258  TECHCRUNCH   2019-7-12  0.003  0.475  Digging into the Roblox growth strategy
142709  THEVERGE     2019-7-9   0.004  0.469  Apex Legends’ ranked mode will be the key to keeping it alive
143351  VENTUREBEAT  2019-7-15  —      0.455  How video game engines help create smarter AI
143125  ARSTECHNICA  2019-7-12  0.002  0.431  Want to be more creative? Playing Minecraft can help, new study finds
142940  TECHCRUNCH   2019-7-11  0.002  0.429  There’s a tennis game hidden in Google right now; here’s how to find it
142986  ARSTECHNICA  2019-7-11  0.002  0.419  Steam uses machine learning for its new game recommendation engine
142462  THEVERGE     2019-7-8   —      0.404  Dr. Mario World feels more like Candy Crush than the classic NES game
142957  ARSTECHNICA  2019-7-11  0.003  0.403  Dragon Quest Builders 2 review: Building on success
143216  VENTUREBEAT  2019-7-12  0.002  0.398  The RetroBeat: I can’t wait for the TurboGrafx-16 Mini
142982  VENTUREBEAT  2019-7-11  0.002  0.387  Man of Medan hands-on — Multiplayer adds new dimension to choice-based game
142492  VENTUREBEAT  2019-7-8   —      0.385  Why Dino Patti wants to bring online connectivity and persistence to single-player games
143346  VENTUREBEAT  2019-7-15  —      0.381  NetEase invests in Dead by Daylight maker Behaviour Interactive
143009  VENTUREBEAT  2019-7-11  0.007  0.372  Call of Duty: Modern Warfare will have 2 vs. 2 Gunfight multiplayer mode
143193  VENTUREBEAT  2019-7-12  0.002  0.371  The DeanBeat: A few days without game journalism and a few days without gaming
142990  VENTUREBEAT  2019-7-11  —      0.368  Man of Medan interview — How 2 players can share 1 story in new multiplayer mode
142852  VENTUREBEAT  2019-7-10  —      0.368  Digi-Capital: Investors poured $9.6 billion into game deals in past 18 months, but M&A and IPOs slowed
142979  ENGADGET     2019-7-11  0.004  0.367  Steam's new experiment hub includes AI-based game recommendations
142671  VENTUREBEAT  2019-7-9   —      0.360  Universal and Unity will launch two games by indie contest winners on Steam


ID: 142924

URL: https://techcrunch.com/2019/07/11/ai-smokes-5-poker-champs-at-a-time-in-no-limit-holdem-with-ruthless-consistency/

Date: 2019-07-11

AI smokes 5 poker champs at a time in no-limit Hold’em with ‘relentless consistency’

The machines have proven their superiority in one-on-one games like chess and Go, and even poker — but in complex multiplayer versions of the card game, humans have retained their edge… until now. An evolution of the last AI agent to flummox poker pros individually is now decisively beating them in a championship-style 6-person game. As documented in a paper published in the journal Science today, the CMU/Facebook collaboration they call Pluribus reliably beats five professional poker players in the same game, or one pro pitted against five independent copies of itself. It's a major leap forward in capability for the machines, and amazingly it is also far more efficient than previous agents.

One-on-one poker is a weird game, and not a simple one, but its zero-sum nature (whatever you lose, the other player gets) makes it susceptible to certain strategies by which a computer able to calculate far enough ahead can put itself at an advantage. But add four more players into the mix and things get real complex, real fast.

With six players, the possibilities for hands, bets, and outcomes are so numerous that it is effectively impossible to account for all of them, especially in a minute or less. It'd be like trying to exhaustively document every grain of sand on a beach between waves. Yet over 10,000 hands played with champions, Pluribus managed to win money at a steady rate, exposing no weaknesses or habits that its opponents could take advantage of. What's the secret? Consistent randomness.

Pluribus was trained, like many game-playing AI agents these days, not by studying how humans play but by playing against itself. At the beginning this is probably like watching kids, or for that matter me, play poker: constant mistakes, but at least the AI and the kids learn from them. The training program used something called Monte Carlo counterfactual regret minimization.
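Before unpacking that mouthful, a minimal, hypothetical sketch of the underlying idea may help: regret matching in self-play, on rock-paper-scissors rather than poker, and without the Monte Carlo sampling of a huge game tree that the full algorithm adds. Every name and number below is illustrative, not Pluribus's actual code.

```python
import random

# Toy regret-matching self-play on rock-paper-scissors
# (0 = rock, 1 = paper, 2 = scissors).
ACTIONS = 3
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]  # PAYOFF[a][b]: +1 if a beats b, -1 if a loses, 0 on a tie

class RegretMatcher:
    def __init__(self):
        self.regret_sum = [0.0] * ACTIONS
        self.strategy_sum = [0.0] * ACTIONS

    def strategy(self):
        # Mix actions in proportion to positive accumulated regret.
        pos = [max(r, 0.0) for r in self.regret_sum]
        total = sum(pos)
        s = [p / total for p in pos] if total > 0 else [1.0 / ACTIONS] * ACTIONS
        for i in range(ACTIONS):
            self.strategy_sum[i] += s[i]
        return s

    def average_strategy(self):
        # The time-averaged strategy is what converges toward equilibrium.
        total = sum(self.strategy_sum)
        return [x / total for x in self.strategy_sum]

def train(iterations=50000):
    p1, p2 = RegretMatcher(), RegretMatcher()
    for _ in range(iterations):
        s1, s2 = p1.strategy(), p2.strategy()
        a1 = random.choices(range(ACTIONS), weights=s1)[0]
        a2 = random.choices(range(ACTIONS), weights=s2)[0]
        # Counterfactual step: replay the round with every alternative action
        # and accumulate how much better each would have done.
        for rm, mine, theirs in ((p1, a1, a2), (p2, a2, a1)):
            got = PAYOFF[mine][theirs]
            for alt in range(ACTIONS):
                rm.regret_sum[alt] += PAYOFF[alt][theirs] - got
    return p1.average_strategy()
```

Run long enough, each player's average strategy approaches the unexploitable uniform mix of roughly one-third each, which is exactly the kind of consistent randomness at issue here.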
That sounds like when you have whiskey for breakfast after losing your shirt at the casino, and in a way it is — machine learning style. Regret minimization just means that when the system finished a hand (against itself, remember), it would then play that hand out again in different ways, exploring what might have happened had it checked here instead of raised, folded instead of called, and so on. (Since it didn't really happen, it's counterfactual.)

A Monte Carlo tree is a way of organizing and evaluating lots of possibilities, akin to climbing a tree of them branch by branch and noting the quality of each leaf you find, then picking the best one once you think you've climbed enough. If you do it ahead of time (this is done in chess, for instance), you're looking for the best move to choose. But if you combine it with the regret function, you're looking through a catalog of possible ways the game could have gone and observing which would have had the best outcome. So Monte Carlo counterfactual regret minimization is just a way of systematically investigating what might have happened if the computer had acted differently, and adjusting its model of how to play accordingly.

[Figure: The game originally played out as shown, with a loss, but the engine explores other avenues where it might have done better.]

Of course the number of games is nigh-infinite if you want to consider what would happen if you had bet $101 rather than $100, or whether you would have won that big hand if you'd had an eight kicker instead of a seven. Therein also lies nigh-infinite regret, the kind that keeps you in bed in your hotel room until past lunch. The truth is these minor changes matter so seldom that the possibility can basically be ignored entirely. It will never really matter that you bet an extra buck, so any bet between, say, $70 and $130 can be considered exactly the same by the computer.
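That bucketing of near-identical actions can be sketched in a few lines. The menu of pot fractions and both helper names below are illustrative assumptions, not the abstraction Pluribus actually uses; a companion helper applies the same idea to card suits.

```python
def abstract_bet(bet, pot, sizes=(0.5, 0.75, 1.0, 1.5, 2.0)):
    """Snap an arbitrary bet to the nearest fraction-of-pot size in a small
    menu, so $100 and $101 into a $100 pot become the same abstract action.
    The menu here is an illustrative guess, not Pluribus's real one."""
    return min(sizes, key=lambda s: abs(s - bet / pot))

def canonical_hand(cards):
    """Relabel suits in order of first appearance, so hands that differ only
    by suit (a jack of hearts vs. a jack of spades) fall into one bucket.
    cards is a list of (rank, suit) tuples."""
    mapping = {}
    out = []
    for rank, suit in cards:
        mapping.setdefault(suit, len(mapping))
        out.append((rank, mapping[suit]))
    return tuple(out)
```

For example, `abstract_bet(101, 100)` and `abstract_bet(100, 100)` both snap to a pot-sized bet, so the solver never distinguishes them.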
The same goes for cards: whether the jack is a heart or a spade doesn't matter except in very specific (and usually obvious) situations, so 99.999 percent of the time such hands can be considered equivalent. This abstraction of gameplay sequences and bucketing of possibilities greatly reduces what Pluribus has to consider. It also keeps the computational load low; Pluribus was trained on a relatively ordinary 64-core server rack over about a week, while other models might take processor-years in high-power clusters. It even runs on an (admittedly beefy) rig with two CPUs and 128 gigs of RAM.

The training produces what the team calls a blueprint for how to play that's fundamentally strong and would probably beat plenty of players. But a weakness of AI models is that they develop tendencies that can be detected and exploited. Facebook's writeup of Pluribus provides the example of two computers playing rock-paper-scissors. One picks randomly while the other always picks rock. Theoretically they'd both win the same number of games. But if the computer tried the all-rock strategy on a human, it would start losing with a quickness and never stop.

As a simple example in poker, maybe a particular series of bets always makes the computer go all in regardless of its hand. If a player can spot that series, they can take the computer to town any time they like. Finding and preventing ruts like these is important to creating a game-playing agent that can beat resourceful and observant humans. To do this, Pluribus does a couple of things. First, it has modified versions of its blueprint to put into play should the game lean toward folding, calling, or raising. Different strategies for different games mean it's less predictable, and it can switch in a minute should the bet patterns change and the hand go from a calling one to a bluffing one.
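At bottom, that unpredictability comes from sampling each action from a probability mix instead of always playing the single highest-value action. A minimal sketch, with made-up probabilities standing in for whatever the strategy computation actually produces:

```python
import random

def balanced_action(strategy_probs):
    """Sample an action in proportion to its probability, so the same
    situation does not always produce the same bet.  strategy_probs is a
    hypothetical {action: probability} mapping."""
    actions = list(strategy_probs)
    weights = [strategy_probs[a] for a in actions]
    return random.choices(actions, weights=weights)[0]
```

Two identical spots can thus get different actions, which is what keeps an observer from ever pinning down a tell.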
It also engages in a short but comprehensive introspective search, looking at how it would play if it had every other hand, from a big nothing up to a straight flush, and how it would bet. It then picks its bet in the context of all those, careful to do so in such a way that it doesn't point to any one in particular. Given the same hand and same play again, Pluribus wouldn't choose the same bet, but rather vary it to remain unpredictable.

These strategies contribute to the consistent randomness I alluded to earlier, which was part of the model's ability to slowly but reliably beat some of the best players in the world. There are too many hands to point to a particular one or ten that indicate the power Pluribus was bringing to bear on the game. Poker is a game of skill, luck, and determination, and one where winners emerge only after dozens or hundreds of hands.

And here it must be said that the experimental setup is not entirely reflective of an ordinary 6-person poker game. Unlike a real game, chip counts were not maintained as an ongoing total; for every hand, each player was given 10,000 chips to use as they pleased, and win or lose they were given 10,000 in the next hand as well.

[Image: The interface used to play poker with Pluribus. Fancy!]

Obviously this rather limits the long-term strategies possible, and indeed the bot was not looking for weaknesses in its opponents that it could exploit, said Facebook AI research scientist Noam Brown. Truly Pluribus was living in the moment the way few humans can. But simply because it was not basing its play on long-term observations of opponents' individual habits or styles does not mean that its strategy was shallow. On the contrary, it is arguably more impressive, and casts the game in a different light, that a winning strategy exists that does not rely on behavioral cues or exploitation of individual weaknesses.

The pros who had their lunch money taken by the implacable Pluribus were good sports, however.
They praised the system's high-level play, its validation of existing techniques, and its inventive use of new ones. Here's a selection of laments from the fallen humans:

"I was one of the earliest players to test the bot so I got to see its earlier versions. The bot went from being a beatable mediocre player to competing with the best players in the world in a few weeks. Its major strength is its ability to use mixed strategies. That's the same thing that humans try to do. It's a matter of execution for humans — to do this in a perfectly random way and to do so consistently. It was also satisfying to see that a lot of the strategies the bot employs are things that we do already in poker at the highest level. To have your strategies more or less confirmed as correct by a supercomputer is a good feeling." - Darren Elias

"It was incredibly fascinating getting to play against the poker bot and seeing some of the strategies it chose. There were several plays that humans simply are not making at all, especially relating to its bet sizing." - Michael "Gags" Gagliano

"Whenever playing the bot, I feel like I pick up something new to incorporate into my game. As humans I think we tend to oversimplify the game for ourselves, making strategies easier to adopt and remember. The bot doesn't take any of these shortcuts and has an immensely complicated/balanced game tree for every decision." - Jimmy Chou

"In a game that will, more often than not, reward you when you exhibit mental discipline, focus, and consistency, and certainly punish you when you lack any of the three, competing for hours on end against an AI bot that obviously doesn't have to worry about these shortcomings is a grueling task. The technicalities and deep intricacies of the AI bot's poker ability were remarkable, but what I underestimated was its most transparent strength: its relentless consistency." - Sean Ruane

Beating humans at poker is just the start.
As good a player as it is, Pluribus is more importantly a demonstration that an AI agent can achieve superhuman performance at something as complicated as 6-player poker. "Many real-world interactions, such as financial markets, auctions, and traffic navigation, can similarly be modeled as multi-agent interactions with limited communication and collusion among participants," writes Facebook in its blog. Yes, and war.