Researchers from Stanford University's Scaling Intelligence Lab introduced a new inference framework called Archon.
Archon uses an inference-time architecture search (ITAS) algorithm to improve the performance of large language models (LLMs) without additional training.
It is model-agnostic, open-source, and designed to be plug-and-play for both large and small models.
In benchmark tests, Archon has outperformed open-source LLMs in both task generalization and response quality.
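
To make the idea of an inference-time architecture search concrete, here is a minimal toy sketch in Python. It is not Archon's code or API. Every name in it (`noisy_model`, `run_architecture`, `score`, `search`, `DEV_SET`) is hypothetical, the "model" is a stand-in that adds two numbers and is sometimes wrong, and the search space is an assumed grid over two knobs: how many answers to sample and how to combine them. The point it illustrates is only that the model is never retrained; the search changes how it is called at inference time and keeps whichever configuration scores best on a small development set.

```python
import itertools
import random
from collections import Counter

# Hypothetical stand-in for an LLM call: it should answer a + b,
# but returns a slightly wrong answer about 40% of the time.
def noisy_model(question, rng):
    a, b = question
    truth = a + b
    if rng.random() < 0.6:
        return truth
    return truth + rng.choice([-2, -1, 1, 2])

# One "architecture": sample the model num_samples times, then either
# keep the first answer or take a majority vote over all answers.
def run_architecture(question, num_samples, aggregate, rng):
    candidates = [noisy_model(question, rng) for _ in range(num_samples)]
    if aggregate == "first":
        return candidates[0]
    return Counter(candidates).most_common(1)[0][0]

# Accuracy of one configuration on the development set.
# A fixed seed keeps the comparison between configurations reproducible.
def score(config, dev_set, seed=0):
    num_samples, aggregate = config
    rng = random.Random(seed)
    correct = sum(
        run_architecture(q, num_samples, aggregate, rng) == q[0] + q[1]
        for q in dev_set
    )
    return correct / len(dev_set)

# Exhaustive search over the (assumed) configuration grid.
def search(dev_set, sample_options=(1, 5, 15), aggregators=("first", "majority")):
    configs = list(itertools.product(sample_options, aggregators))
    scores = {config: score(config, dev_set) for config in configs}
    best = max(configs, key=lambda config: scores[config])
    return best, scores

# Toy development set: 200 addition questions.
DEV_SET = [(a, b) for a in range(20) for b in range(10)]

best, scores = search(DEV_SET)
for config, accuracy in scores.items():
    print(config, round(accuracy, 3))
print("selected configuration:", best)
```

In this toy setting the search tends to favor sampling many answers and voting, because combining several noisy answers is more reliable than trusting one. A real system would search over far richer components and could not afford an exhaustive grid, so treat this only as an illustration of the search loop.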