Multi-Agent Reinforcement Learning Modeling, Validation and Policy Simulation of the Tragedy of the Commons from the Perspective of Computational Economics

tao liu | Published Sunday, April 12, 2026

The tragedy of the commons in public resource governance arises from repeated interaction and adaptive learning among heterogeneous agents under dynamic resource constraints. Existing studies offer rich insights into common-pool resource governance, institutional constraints, and cooperation, but they rely mainly on theoretical deduction, static games, or econometric identification, which makes it difficult to jointly characterize resource dynamics, agent heterogeneity, behavioral learning, and policy scenarios. From the perspective of computational economics, this paper develops a multi-agent reinforcement learning simulation model of commons governance. A fish-pond resource system is formulated as a discrete-state Markov decision process, and adaptive decision-making is characterized under four settings (pure economic incentives, sustainability penalties, behavioral heterogeneity, and cooperation) through a resource dynamics equation, a harvesting equation, and differentiated reward functions. The model is examined through sensitivity analysis, parameter calibration, and theoretical validation, and is then used for policy simulation. The results show that pure economic incentives quickly induce resource collapse; sustainability penalties significantly reduce harvesting intensity and maintain a low but sustainable steady state; heterogeneous behavior parameters generate clear strategic divergence; and cooperation internalizes group harvesting constraints into individual payoffs, yielding the strongest resource recovery and behavioral convergence.
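The abstract names the model's building blocks (a discrete-state fish-pond system, a resource dynamics equation, a harvesting equation, and differentiated reward functions) without giving their functional forms. The sketch below is one plausible way to wire such pieces together, not the author's implementation. Everything specific in it is an assumption made for illustration: logistic regrowth, proportional harvesting, independent tabular Q-learning with epsilon-greedy exploration, the three reward variants, and all constants and function names.

```python
import random

# All constants are illustrative assumptions, not values from the paper.
CAPACITY = 100.0                   # carrying capacity of the pond
GROWTH = 0.3                       # logistic regrowth rate
N_BINS = 10                        # number of discrete stock states
EFFORTS = [0.0, 0.05, 0.10, 0.20]  # share of the stock an agent may request
THRESHOLD = 30.0                   # stock level below which the penalty applies


def to_state(stock):
    """Map a continuous stock level to one of N_BINS discrete states."""
    return min(N_BINS - 1, int(stock / CAPACITY * N_BINS))


def step(stock, actions):
    """Harvest first, then apply logistic regrowth to what remains."""
    requested = [EFFORTS[a] * stock for a in actions]
    total = sum(requested)
    # Scale requests down if together they exceed the available stock.
    scale = min(1.0, stock / total) if total > 0 else 0.0
    harvests = [r * scale for r in requested]
    remaining = stock - sum(harvests)
    regrown = remaining + GROWTH * remaining * (1.0 - remaining / CAPACITY)
    return min(CAPACITY, max(0.0, regrown)), harvests


def reward(own_harvest, harvests, stock, mode, penalty=5.0):
    """Hypothetical reward variants loosely mirroring the abstract's scenarios."""
    if mode == "economic":      # pure economic incentive: own catch only
        return own_harvest
    if mode == "penalty":       # own catch minus a fixed penalty at low stock
        return own_harvest - (penalty if stock < THRESHOLD else 0.0)
    if mode == "cooperative":   # every agent is paid the group's mean catch
        return sum(harvests) / len(harvests)
    raise ValueError(f"unknown mode: {mode}")


def simulate(mode, n_agents=4, episodes=200, horizon=50,
             alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Train independent tabular Q-learners; return last stock and Q-tables."""
    rng = random.Random(seed)
    n_actions = len(EFFORTS)
    q = [[[0.0] * n_actions for _ in range(N_BINS)] for _ in range(n_agents)]
    stock = CAPACITY
    for _ in range(episodes):
        stock = CAPACITY  # each episode starts from a full pond
        for _ in range(horizon):
            s = to_state(stock)
            actions = []
            for i in range(n_agents):
                if rng.random() < epsilon:   # explore
                    actions.append(rng.randrange(n_actions))
                else:                        # exploit the current estimate
                    actions.append(max(range(n_actions),
                                       key=lambda a: q[i][s][a]))
            stock, harvests = step(stock, actions)
            s_next = to_state(stock)
            for i, a in enumerate(actions):
                r = reward(harvests[i], harvests, stock, mode)
                target = r + gamma * max(q[i][s_next])
                q[i][s][a] += alpha * (target - q[i][s][a])
    return stock, q
```

Calling `simulate("economic")`, `simulate("penalty")`, and `simulate("cooperative")` and comparing the returned stock levels gives a rough way to explore the scenarios described above. Behavioral heterogeneity could be approximated by giving each agent its own `alpha`, `gamma`, or `penalty`, which this sketch keeps uniform for brevity.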

Under development.
