Arxiv


HERA: Hybrid Edge-cloud Resource Allocation for Cost-Efficient AI Agents

  • Large language models (LLMs) like GPT-4 predominantly operate in the cloud, incurring high operational costs.
  • As locally run small language models (SLMs) become more accurate, the need to run AI agents exclusively in the cloud is being reconsidered.
  • A lightweight scheduler, the Adaptive Iteration-level Model Selector (AIMS), is proposed to partition an AI agent's subtasks between an SLM and an LLM based on subtask features, maximizing SLM usage while maintaining accuracy.
  • Experimental results show that AIMS improves accuracy by up to 9.1% and increases SLM usage by up to 10.8% compared to existing approaches.
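
The idea of routing each subtask to an SLM or an LLM based on its features can be illustrated with a toy sketch. This is not the AIMS scheduler from the paper: the summary does not say which features or decision rule AIMS uses, so the `Subtask` fields, the weights in `difficulty_score`, and the `threshold` value below are all hypothetical, chosen only to show the general shape of a feature-based selector.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subtask:
    """One step of an agent's work. All three features are hypothetical."""
    name: str
    iteration: int         # position of the step in the agent loop
    prompt_tokens: int     # size of the context passed to the model
    needs_tool_call: bool  # whether the step must emit a structured tool call


def difficulty_score(task: Subtask) -> float:
    """Combine the features into a score in [0, 1]; weights are illustrative."""
    score = 0.5 * min(task.prompt_tokens / 2000.0, 1.0)
    score += 0.3 if task.needs_tool_call else 0.0
    score += 0.2 * min(task.iteration / 10.0, 1.0)
    return score


def select_model(task: Subtask, threshold: float = 0.5) -> str:
    """Send easy subtasks to the local SLM and the rest to the cloud LLM."""
    return "SLM" if difficulty_score(task) < threshold else "LLM"


def slm_share(tasks: List[Subtask], threshold: float = 0.5) -> float:
    """Fraction of subtasks that the selector keeps on the SLM."""
    if not tasks:
        return 0.0
    local = sum(1 for t in tasks if select_model(t, threshold) == "SLM")
    return local / len(tasks)


if __name__ == "__main__":
    tasks = [
        Subtask("summarize", iteration=1, prompt_tokens=200, needs_tool_call=False),
        Subtask("plan", iteration=8, prompt_tokens=3000, needs_tool_call=True),
    ]
    for t in tasks:
        print(t.name, "->", select_model(t))
    print("SLM share:", slm_share(tasks))
```

In a sketch like this, raising `threshold` keeps more subtasks on the SLM at the risk of lower accuracy, which is the trade-off the summary says the scheduler is meant to balance.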
