Game theory is widely used to model strategic interactions in economics and finance. Concepts from game theory are applied in fields ranging from evolutionary economics and behavioral economics to political economy and further to political science and to ecology and evolutionary biology. Especially in economics, it is common practice to assume that players in strategic games instantly coordinate on an equilibrium. However, when the players have to learn their strategies by playing a game repeatedly the strategic dynamics may fail to converge. The existing literature proves convergence or non-convergence in the few specific classes of games in which analytical derivations are possible, and it is not clear how representative these games are. Is convergence to equilibrium to be expected in general? In this paper we give a possible answer by studying the generic properties of an ensemble of normal form games. We span the space of games by generating payoff matrices at random, and we introduce a computational approach to quantify how often convergence fails.
This approach depends on the best reply structure of the payoff matrix: Best replies either form cycles or fixed points (corresponding to pure strategy Nash equilibria), and the relative share of cycles is a good measure of the probability of non-convergence. We run extensive numerical simulations and show that our formalism predicts the convergence frequency of six different learning algorithms. We explore how the best reply structure changes as we vary several key parameters. This includes the number of players and the number of moves per player, characteristics that define how complicated the game and the associated decision problems are. We also vary the correlation of the payoffs, that determines how competitive the game is. We show that normal form games can be represented as boolean networks and that a wide range of results on cycles in boolean networks apply to the question at hand. Finally, we expand our formalism in order to include quasi-best replies. This is accomplished by transforming the game into a Markov process.
We find that as games become more complicated and more competitive, best reply cycles become dominant and so convergence becomes very unlikely. Our results imply that in this situation equilibrium is typically not a realistic assumption, and one should consider other behavioral models. This would have far-reaching consequences for both theoretical and applied questions in economics and finance. Alternatively, if for some reason real-world games are special, and do not possess cycles, this raises the interesting question of why this should be the case.