This work introduces an approach that integrates large language models (LLMs) as priors in reinforcement learning (RL).
It presents a cache-efficient framework for posterior sampling with LLM-derived priors, reducing computational cost while maintaining strong performance.
The framework uses an adaptive caching mechanism, resulting in a 3.8--4.7$\times$ reduction in LLM queries and 4.0--12.0$\times$ lower median latency on a consumer GPU.
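The summary does not spell out the cache's internals, so the following is a minimal Python sketch of what an adaptive cache over LLM-derived action priors could look like; the names `query_llm_prior` and `AdaptivePriorCache`, the coarse state-bucketing heuristic, and all parameter values are illustrative assumptions rather than the authors' implementation.

```python
import hashlib
from collections import OrderedDict


def query_llm_prior(state_description: str) -> dict:
    """Placeholder for an LLM call returning a prior over actions.

    In the framework described above this would prompt a language model;
    here it returns a uniform prior so the sketch runs standalone.
    """
    actions = ["left", "right", "up", "down"]
    return {a: 1.0 / len(actions) for a in actions}


class AdaptivePriorCache:
    """LRU-style cache for LLM-derived action priors (illustrative only).

    States are bucketed by a coarse key so that similar states reuse a
    cached prior instead of triggering a fresh LLM query; the bucket
    granularity is a tunable knob standing in for the adaptive policy.
    """

    def __init__(self, max_entries: int = 1024, granularity: int = 2):
        self.max_entries = max_entries
        self.granularity = granularity  # coarser key -> more cache hits
        self._cache: "OrderedDict[str, dict]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def _key(self, state_description: str) -> str:
        # Coarsen the state description (here: keep the first few words)
        # before hashing so nearby states map to the same bucket.
        coarse = " ".join(state_description.split()[: self.granularity])
        return hashlib.sha1(coarse.encode()).hexdigest()

    def prior(self, state_description: str) -> dict:
        key = self._key(state_description)
        if key in self._cache:
            self.hits += 1
            self._cache.move_to_end(key)  # LRU bookkeeping
            return self._cache[key]
        self.misses += 1
        prior = query_llm_prior(state_description)  # only on a cache miss
        self._cache[key] = prior
        if len(self._cache) > self.max_entries:
            self._cache.popitem(last=False)  # evict least recently used
        return prior


if __name__ == "__main__":
    cache = AdaptivePriorCache(granularity=2)
    for step in range(100):
        cache.prior(f"room 3, agent near door, step {step}")
    print(f"hits={cache.hits} misses={cache.misses}")  # most steps hit the cache
```

In this toy run nearly all steps reuse the cached prior, which is the mechanism by which a cache of this kind reduces LLM queries and latency.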
Extensive evaluations across multiple tasks show the generalizability and practical viability of LLM-guided RL in resource-constrained settings.