A study explores how Multi-Agent Reinforcement Learning (MARL) can enhance dynamic pricing strategies in supply chains by considering strategic interactions among market actors.
The research evaluates three MARL algorithms (MADDPG, MADQN, and QMIX) against static rule-based baselines in a simulated environment using real e-commerce transaction data.
Results indicate that rule-based agents achieve high fairness and price stability but lack competitive dynamics, while MADQN displays aggressive pricing behavior with high volatility and low fairness.
MADDPG offers a more balanced approach: it supports market competition while maintaining high fairness and stable pricing. Taken together, the results suggest that MARL introduces emergent strategic behavior in dynamic pricing scenarios.
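The summary does not say how fairness or price stability were measured. As a rough illustration only, the sketch below assumes fairness is scored with Jain's index over per-agent revenues and volatility with the population standard deviation of a price series. Both choices, and the function names `jain_fairness` and `price_volatility`, are assumptions made for this example and are not taken from the study.

```python
import statistics


def jain_fairness(revenues):
    """Jain's index over per-agent revenues (assumed fairness metric).

    Equals 1.0 when every agent earns the same amount and 1/n when a
    single agent captures all revenue.
    """
    total = sum(revenues)
    squares = sum(r * r for r in revenues)
    if squares == 0:
        # No agent earned anything; treat the outcome as trivially fair.
        return 1.0
    return total * total / (len(revenues) * squares)


def price_volatility(prices):
    """Population standard deviation of a price series (assumed volatility metric)."""
    return statistics.pstdev(prices)
```

Under these assumptions, a rule-based market with near-equal revenues and flat prices would score close to 1.0 on fairness and close to 0 on volatility, while an aggressive pricing policy would tend to push the first number down and the second up.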