Advancements in large language models (LLMs) have made on-device deployment practical. WebLLM is an open-source JavaScript framework that enables high-performance LLM inference within web browsers. It leverages WebGPU for GPU acceleration and WebAssembly for CPU computation. WebLLM thus paves the way for locally powered LLM applications that run entirely in the browser.