The Mu language model has been introduced to power the agent in Windows Settings, processing natural-language input efficiently on NPUs.
Mu is a 330M-parameter encoder-decoder language model: the encoder turns the input into a fixed latent representation once, and the decoder reuses that representation at every generation step, yielding lower latency and higher throughput on specialized hardware.
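One way to picture the encoder-decoder efficiency gain is that input-side compute does not grow with output length. A toy Python sketch (all class and function names here are hypothetical illustrations, not Mu's actual implementation):

```python
# Toy illustration of encoder-decoder generation: the encoder runs once,
# and every decoder step reuses the same latent representation.
# Hypothetical names -- not Mu's real API or weights.

class ToyEncoderDecoder:
    def __init__(self):
        self.encoder_calls = 0  # track how often the input is re-processed

    def encode(self, input_tokens):
        self.encoder_calls += 1
        # Stand-in "latent": just a tuple of the input tokens.
        return tuple(input_tokens)

    def decode_step(self, latent, generated):
        # Stand-in decoding rule: echo input tokens back, cycling.
        return latent[len(generated) % len(latent)]

    def generate(self, input_tokens, max_new_tokens):
        latent = self.encode(input_tokens)  # input processed exactly once
        out = []
        for _ in range(max_new_tokens):
            out.append(self.decode_step(latent, out))
        return out

model = ToyEncoderDecoder()
model.generate(["turn", "on", "bluetooth"], max_new_tokens=5)
print(model.encoder_calls)  # 1 -- independent of output length
```

However many tokens are generated, `encoder_calls` stays at 1; that fixed one-time cost is the latency advantage the encoder-decoder design exploits.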
Mu's design is optimized for NPUs through weight sharing, hardware-aware operator choices, and transformer architecture upgrades that improve performance on edge devices.
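Weight sharing in language models commonly means tying the input embedding table and the output projection to the same matrix, cutting that parameter count roughly in half. A minimal sketch of the idea, assuming this form of sharing (illustrative only, not Mu's actual implementation):

```python
# Minimal sketch of input/output weight tying (a common form of weight
# sharing). One vocab_size x dim matrix serves both directions.
# Hypothetical illustration -- not Mu's real code.

class TiedEmbedding:
    def __init__(self, table):
        self.table = table  # shared vocab_size x dim matrix

    def embed(self, token_id):
        # Input side: look up the token's vector.
        return self.table[token_id]

    def logits(self, hidden):
        # Output side: score every vocab entry with the SAME matrix
        # (dot product of the hidden state with each embedding row).
        return [sum(h * w for h, w in zip(hidden, row)) for row in self.table]

table = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy 3-token vocab, dim 2
emb = TiedEmbedding(table)
hidden = emb.embed(1)       # pretend the model produced this hidden state
print(emb.logits(hidden))   # [0.0, 1.0, 0.5] -- token 1 scores highest
```

Because `table` is stored once and used both ways, the memory footprint drops, which matters most under edge-device constraints.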
Mu was trained with techniques such as a warmup-stable-decay learning-rate schedule and the Muon optimizer, achieving strong accuracy and fast inference within edge-device constraints.
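A warmup-stable-decay schedule ramps the learning rate up, holds it flat for most of training, then decays it at the end. A minimal sketch of that shape (the fractions and peak value below are illustrative defaults, not Mu's actual hyperparameters):

```python
def wsd_lr(step, total_steps, peak_lr=1e-3, warmup_frac=0.1, decay_frac=0.2):
    """Warmup-stable-decay learning-rate schedule (illustrative values,
    not Mu's real hyperparameters)."""
    warmup_steps = int(total_steps * warmup_frac)
    decay_start = int(total_steps * (1 - decay_frac))
    if step < warmup_steps:                      # linear warmup
        return peak_lr * step / warmup_steps
    if step < decay_start:                       # long stable plateau
        return peak_lr
    # linear decay to zero over the final phase
    return peak_lr * (total_steps - step) / (total_steps - decay_start)

# Shape over 1000 steps: ramp, plateau, decay.
print([round(wsd_lr(s, 1000), 6) for s in (0, 50, 500, 900, 1000)])
# [0.0, 0.0005, 0.001, 0.0005, 0.0]
```

The long plateau is the distinguishing feature versus cosine schedules: most of training runs at peak rate, and the decay phase can be restarted from a plateau checkpoint.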
Mu performed strongly on tasks such as SQuAD, CodeXGLUE, and the Windows Settings agent, remaining efficient despite being far smaller than the Phi models.
Model quantization, specifically Post-Training Quantization (PTQ), enabled Mu to run efficiently on the NPUs of Copilot+ PCs while preserving accuracy and shrinking its memory footprint.
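Post-training quantization maps already-trained float weights to low-bit integers plus a scale factor, with no retraining, which is what shrinks memory and speeds up integer-friendly NPU hardware. A minimal per-tensor symmetric int8 sketch of the idea (an illustration of PTQ in general, not Mu's actual quantization pipeline):

```python
def quantize_int8(weights):
    """Symmetric per-tensor post-training quantization to int8.
    Illustrative sketch -- not Mu's actual quantization pipeline."""
    scale = max(abs(w) for w in weights) / 127.0
    if scale == 0.0:
        scale = 1.0  # avoid divide-by-zero for an all-zero tensor
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 codes."""
    return [v * scale for v in q]

w = [0.31, -1.27, 0.05, 0.9]
q, s = quantize_int8(w)
approx = dequantize_int8(q, s)
# Each recovered weight is within scale/2 of the original; storage per
# weight drops from 32-bit float to 8-bit int plus one shared scale.
```

Because only a scale per tensor is stored alongside the int8 codes, the memory footprint falls roughly 4x versus float32, at the cost of bounded rounding error.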
Mu was fine-tuned for the Windows Settings agent, meeting quality targets with response times under 500 milliseconds and integrating seamlessly into the user interface for changing system settings.
The agent in Settings uses Mu to handle a wide range of user queries, focusing on multi-word inputs, which yield the most precise, actionable responses.
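Since precision is highest on multi-word queries, one plausible way to realize that focus is a front-end gate that routes very short inputs to ordinary lexical settings search instead of the agent. A hypothetical routing sketch (the threshold and route names are assumptions, not the shipped Windows behavior):

```python
def route_settings_query(query, min_words=2):
    """Hypothetical gate for a Settings search-box query -- not the
    shipped Windows logic. Multi-word queries go to the agent; short
    ones fall back to lexical settings search."""
    words = query.strip().split()
    if len(words) >= min_words:
        return "agent"           # e.g. "my mouse pointer is too small"
    return "lexical_search"      # e.g. "bluetooth"

print(route_settings_query("bluetooth"))            # lexical_search
print(route_settings_query("turn on night light"))  # agent
```

A gate like this keeps the model out of the loop for ambiguous single keywords, where a ranked list of matching settings pages is the safer answer.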
The challenge of covering the sheer breadth of Windows settings was addressed by refining the training data to prioritize commonly used settings and by handling ambiguous user queries gracefully.
The model's performance in the Windows Settings agent scenario was further improved by fine-tuning with synthetic data generation, metadata tuning, noise injection, and smart sampling.
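Noise injection typically means perturbing clean training queries (dropped or doubled characters, typos) so the model stays robust to messy real-world input. A minimal sketch of the idea with a seeded RNG for reproducibility (illustrative only, not Mu's actual data pipeline):

```python
import random

def inject_noise(query, rate=0.15, seed=0):
    """Character-level noise injection for training queries: randomly
    drops or doubles characters at the given rate. Hypothetical sketch
    of the general technique -- not Mu's actual data pipeline."""
    rng = random.Random(seed)  # seeded so augmentation is reproducible
    out = []
    for ch in query:
        r = rng.random()
        if r < rate / 2:
            continue               # omission typo: drop the character
        out.append(ch)
        if r > 1 - rate / 2:
            out.append(ch)         # repetition typo: double the character
    return "".join(out)

clean = "turn on bluetooth"
noisy = inject_noise(clean)
# Pair the noisy query with the clean query's label so the model learns
# to map imperfect input to the right settings action.
```

Training on (noisy query, original label) pairs is what makes the augmentation useful: the target stays correct while the input distribution widens.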
The Windows Insider Program welcomes feedback on the agent in Settings as refinements continue to improve the experience.