MMedAgent-RL is a reinforcement learning-based multi-agent framework designed to optimize collaboration among medical agents for multimodal medical reasoning.
The framework addresses the limitations of existing single-agent models in generalizing across diverse medical specialties by introducing dynamic and optimized collaboration inspired by clinical workflows.
MMedAgent-RL trains two general practitioner (GP) agents through reinforcement learning: a triage doctor that assigns each patient to the appropriate specialty, and an attending physician that combines the specialists' judgments with its own knowledge to make the final decision.
Experiments on five medical visual question answering (VQA) benchmarks show that MMedAgent-RL surpasses both open-source and proprietary medical large vision-language models (Med-LVLMs), exhibits human-like reasoning patterns, and achieves an average performance gain of 18.4% over supervised fine-tuning baselines.
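
The triage-then-attending workflow described above can be illustrated with a minimal Python sketch. This is not the authors' implementation. Every name here (`Case`, `triage`, `attending`, `run_pipeline`) is hypothetical, the agents are simple stubs rather than RL-trained vision-language models, and the majority-vote rule used by the attending physician is an assumption chosen only to make the control flow concrete.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Case:
    """A multiple-choice medical question (image input omitted in this sketch)."""
    question: str
    options: List[str]


# A specialist is modeled as any callable that maps a case to one answer.
Specialist = Callable[[Case], str]


def triage(case: Case) -> str:
    """Triage doctor (stub): route the case to a specialty.

    Assumption: a keyword rule stands in for the trained triage agent.
    """
    if "x-ray" in case.question.lower():
        return "radiology"
    return "pathology"


def attending(opinions: List[str], own_answer: str) -> str:
    """Attending physician (stub): merge specialist opinions with its own view.

    Assumption: follow the specialists only when a strict majority agree,
    otherwise fall back to the attending physician's own answer.
    """
    if opinions:
        answer, votes = Counter(opinions).most_common(1)[0]
        if votes > len(opinions) / 2:
            return answer
    return own_answer


def run_pipeline(
    case: Case,
    specialists: Dict[str, List[Specialist]],
    own_view: Specialist,
) -> str:
    """Triage -> consult the chosen specialty -> attending makes the final call."""
    specialty = triage(case)
    opinions = [ask(case) for ask in specialists.get(specialty, [])]
    return attending(opinions, own_view(case))


if __name__ == "__main__":
    case = Case("What does the chest X-ray show?", ["pneumonia", "normal"])
    specialists = {
        "radiology": [lambda c: "pneumonia", lambda c: "pneumonia", lambda c: "normal"],
        "pathology": [lambda c: "normal"],
    }
    print(run_pipeline(case, specialists, own_view=lambda c: "normal"))
```

In this sketch, the two functions that would be replaced by learned policies are `triage` and `attending`. The specialists are treated as fixed callables whose outputs the attending physician may accept or override.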