Alessandro, a Quantitative Researcher at G-Research, discusses the impact Artificial Intelligence is beginning to have on imperfect information games such as Poker.
AI has been dominating games like Chess for a long time, and last year’s success of AlphaGo proved that super-human performance is also achievable in a game like Go, where the extremely large number of states (game-tree complexity of 10^360, more than the estimated number of atoms in the visible universe) makes a brute force approach completely hopeless.
Yet a new frontier for AI remains to be broken: imperfect information games, with Poker on the front line. What makes these problems hard to solve is the deep, intricate entanglement between current actions and future rewards. The value of a specific state depends strongly on the full history of the game (which defines what the opponent can infer about our current hand, and vice versa). This makes the problem highly non-Markovian, and much less tractable than perfect information games, where selecting the best-valued action at each branch of the game tree is sufficient to guarantee optimal play. By contrast, Nash equilibrium strategies in poker are in general stochastic (mixed), and much more complicated to infer.
In recent months, exciting progress in the field has been achieved by two independent research groups, under the project names “DeepStack” and “Libratus”. Both groups managed to outperform professional poker players (with good statistical significance) in Heads-Up Texas Hold’em, which is one of the simplest variants of the game. While some mystery still surrounds the approach used by Libratus, DeepStack revealed its secrets in a paper early this year (DeepStack, Matej Moravcik et al, 2017). Not surprisingly, the DeepStack strategy is not very different from the one used by AlphaGo (in its original version): they both use one function to approximate the value of potential future states (value net) and another to encode the most likely actions given a specific situation (policy net). Moreover, they both rely on deep neural networks to learn and represent those functions. In addition, DeepStack combines those networks with an approach called “Counterfactual Regret Minimisation”, which allows it to deal with the imperfect information nature of the game.
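To give a flavour of the regret-based idea, here is a minimal sketch of regret matching, the update rule that Counterfactual Regret Minimisation applies at each decision point. It is not DeepStack's implementation: it uses Rock-Paper-Scissors as a stand-in toy game with no neural networks, and the function names and the deliberately biased starting regrets are assumptions made purely for illustration. Each player shifts probability towards actions it regrets not having played, and the time-averaged strategies drift towards the stochastic equilibrium of one third each.

```python
# Toy illustration only: regret matching in self-play on Rock-Paper-Scissors.
# All names here are hypothetical; this is not DeepStack's code.

N_ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

# PAYOFF[a][b] is the payoff to a player choosing a against an opponent choosing b.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def regret_matching(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    # No positive regret yet: fall back to the uniform strategy.
    return [1.0 / N_ACTIONS] * N_ACTIONS


def action_values(opponent_strategy):
    """Expected payoff of each pure action against a mixed opponent strategy."""
    return [
        sum(PAYOFF[a][b] * opponent_strategy[b] for b in range(N_ACTIONS))
        for a in range(N_ACTIONS)
    ]


def train(iterations, initial_regrets=(1.0, 0.0, 0.0)):
    """Run self-play and return the time-averaged strategy of both players.

    Player 0 starts with an assumed bias towards rock, so the dynamics
    are not trivially stuck at the uniform strategy from the first step.
    """
    regrets = [list(initial_regrets), [0.0] * N_ACTIONS]
    strategy_sums = [[0.0] * N_ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [regret_matching(r) for r in regrets]
        for player in (0, 1):
            values = action_values(strategies[1 - player])
            expected = sum(p * v for p, v in zip(strategies[player], values))
            for a in range(N_ACTIONS):
                # Regret: how much better action a would have done than the current mix.
                regrets[player][a] += values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]


if __name__ == "__main__":
    for player, strategy in enumerate(train(100_000)):
        print(player, [round(p, 3) for p in strategy])
```

The strategy played at any single iteration keeps cycling; it is the average over iterations that approaches equilibrium, which is why the sketch accumulates `strategy_sums` rather than returning the last strategy.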
Despite the recent success, AI-powered computers are still unable to compete with professional poker players in the full nine-player no-limit Texas Hold’em game. Moreover, even in simple variants of the game (like Heads-Up, solved by DeepStack), AI can beat professional players, but professional players still have the lead when it comes to playing against non-expert opponents. In fact, expert humans are able to maximise profit by observing, categorising and adapting to different classes of sub-optimal players, and they are much more efficient at doing so than any existing AI.
An AI Poker revolution would be very interesting to follow from various perspectives, and the large amount of money involved in the Poker industry is probably one of them. Currently, the online poker market is estimated at about 50 billion dollars, having doubled in the last 5 years, and is consistently growing. Although it cannot yet beat professional players in every format, most currently available AI can very likely outperform, statistically, the average poker player found on those websites. Online poker communities are fighting against the use of bots in tournaments and cash games, yet AI-controlled agents are becoming harder and harder to detect. Poker rooms also have little authority beyond banning certain players from their specific platform, and nothing prevents those players from registering again under a different name.
As in many other industries, the development of AI can have a disruptive effect on the world of poker. The development of statistically advantageous ways to “gamble” (and to make a consistent profit out of it) is not something new. It happened in the past for Blackjack (“Beat the Dealer”, Edward Thorp, 1962) and in the finance world with algorithmic trading (“Beat the Market”, Edward Thorp, 1967). Yet counting cards in Blackjack proved easy to counter with a few changes to the rules of the game (such as frequent reshuffling of the cards), while in finance algorithmic trading is sustainable (since investment is a positive-sum game), and it has actually proved beneficial for the whole system, by improving the efficiency and liquidity of the financial markets. But Poker is a negative-sum game: the expected gain of the average player will always be negative as long as there is a dealer collecting fees. Whether online Poker as we know it will be able to survive the AI revolution remains unknown.
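The negative-sum argument can be made concrete with a little arithmetic. The sketch below assumes a simplified model that is not taken from any real poker room: every player puts a stake into a single pot, the house keeps a fixed fraction of that pot as its fee, and the winner takes the rest. The 5% fee and the function names are hypothetical values chosen only for illustration.

```python
# Simplified, hypothetical model of a raked pot; figures are illustrative only.


def average_net_result(stakes, fee_rate):
    """Average net result per player for one pot.

    Whoever wins, the players collectively get back the pot minus the fee,
    so the average net result is always the fee shared across players.
    """
    pot = sum(stakes)
    fee = fee_rate * pot
    return -fee / len(stakes)


def heads_up_expected_value(win_probability, stake, fee_rate):
    """Expected net result of one heads-up pot where both players stake the same."""
    payout_if_win = 2 * stake * (1 - fee_rate)
    return win_probability * payout_if_win - stake


def break_even_win_probability(fee_rate):
    """Win probability needed for a heads-up expected value of exactly zero."""
    return 1 / (2 * (1 - fee_rate))


if __name__ == "__main__":
    fee_rate = 0.05  # assumed 5% fee, for illustration
    print(average_net_result([10, 10], fee_rate))
    print(heads_up_expected_value(0.5, 10, fee_rate))
    print(break_even_win_probability(fee_rate))
```

Under these assumptions an evenly matched player loses half a unit per 10 staked, and a player needs to win a little under 53% of pots just to break even, so any profit made by a bot has to come out of the other players' pockets on top of the fee.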
Despite the recent progress and successes, Artificial Intelligence is still very far from being solved, and it is hard to forecast what it might lead to in the future, whether applied to making money in online Poker or to transporting people in self-driving cars. After all, AI has been able to beat the best human players at Chess for 20 years now, yet a 5-year-old child is still much better than any AI-powered robot at moving the Chess pieces on the board.
G-Research is a leading quantitative research and technology company based in central London. We use scientific techniques, big data and world-class technology to predict future movements in financial markets, and develop the platform to deploy these ideas globally. Find out more at: www.gresearch.co.uk .