The O-RAN architecture supports dynamic service demands in wireless networks through modular, disaggregated components such as the RAN Intelligent Controller (RIC), Centralized Unit (CU), and Distributed Unit (DU).
Deep reinforcement learning (DRL) is well suited to resource allocation and slicing in O-RAN networks, but it struggles to interpret raw, high-dimensional inputs such as RF features and QoS metrics, which limits how well learned policies generalize.
To address these limitations, ORAN-GUIDE introduces a dual-LLM framework that augments multi-agent reinforcement learning (MARL) with structured, task-relevant prompts generated by ORANSight, a domain-specific language model.
Experimental results demonstrate that ORAN-GUIDE improves sample efficiency, policy convergence, and generalization performance over standard MARL and single-LLM baselines.
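To make the dual-LLM idea concrete, the sketch below shows one plausible reading of the framework: a domain language model (standing in for ORANSight) emits a structured textual prompt describing current RAN context, and an embedding of that prompt is concatenated with each agent's raw KPI observation before it reaches the policy network. All function names, the stub prompt generator, and the toy hash-based embedding are illustrative assumptions, not the paper's actual implementation.

```python
# Hedged sketch of prompt-augmented observations for a MARL agent.
# The "domain LLM" and "embedding" below are stand-ins: a real system
# would query ORANSight and a frozen text encoder instead.
import hashlib
from typing import List

def domain_llm_prompt(slice_id: str, load: float) -> str:
    """Stub for the domain LLM: returns a structured context prompt."""
    level = "high" if load > 0.7 else "moderate" if load > 0.3 else "low"
    return f"[slice={slice_id}] traffic load is {level}; prioritize latency-sensitive flows"

def embed(text: str, dim: int = 8) -> List[float]:
    """Toy deterministic embedding (stands in for a frozen encoder)."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255.0 for b in digest[:dim]]

def augmented_observation(kpis: List[float], slice_id: str, load: float) -> List[float]:
    """Concatenate raw KPI features with the prompt embedding."""
    prompt = domain_llm_prompt(slice_id, load)
    return kpis + embed(prompt)

# 3 raw KPI features + 8 embedding dims -> an 11-dim policy input.
obs = augmented_observation([0.42, 0.91, 0.10], slice_id="eMBB-1", load=0.8)
print(len(obs))
```

The design point this illustrates: the RL agent never parses raw text; the language model's domain knowledge enters the policy only through a fixed-size embedding, so standard MARL machinery applies unchanged.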