Building a Real-Time Voice Assistant with Local LLMs on a Raspberry Pi
In this project, the goal was to capture voice input through a web interface, process the transcribed text with a local LLM running on the Raspberry Pi, generate spoken responses with a TTS engine, and stream everything back in real time over WebSockets.
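One way to picture that pipeline is as a small WebSocket message protocol between the browser and the Pi. The sketch below is illustrative only: the message type names (`user_text`, `llm_token`, `audio_chunk`) are assumptions for this write-up, not the project's actual wire format.

```javascript
// Hypothetical WebSocket message envelopes for the voice-assistant pipeline.
// The type names here are assumptions, not the project's actual protocol.

// Client -> server: the user's utterance as text.
function userTextMessage(text) {
  return JSON.stringify({ type: "user_text", text });
}

// Server -> client: one streamed LLM token, so text renders in real time.
function llmTokenMessage(token, done = false) {
  return JSON.stringify({ type: "llm_token", token, done });
}

// Server -> client: a chunk of TTS audio, base64-encoded for JSON transport.
function audioChunkMessage(audioBuffer) {
  return JSON.stringify({ type: "audio_chunk", data: audioBuffer.toString("base64") });
}

// A client-side dispatcher simply branches on `type`.
function handleMessage(raw, handlers) {
  const msg = JSON.parse(raw);
  if (msg.type === "llm_token") handlers.onToken(msg.token, msg.done);
  else if (msg.type === "audio_chunk") handlers.onAudio(Buffer.from(msg.data, "base64"));
  return msg.type;
}
```

Keeping text and audio in separate message types lets the UI show the LLM's reply as it streams, while audio chunks queue up for playback independently.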
The Raspberry Pi was set up with the latest Raspberry Pi OS, and the required hardware interfaces were enabled. Ollama was installed to run local LLMs such as Mistral directly on the Pi, and Piper, an open-source TTS engine, was chosen for fully offline speech generation.
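For reference, the setup might look roughly like this; the exact commands and the voice model used are assumptions and should be checked against the Ollama and Piper documentation for your setup.

```shell
# Install Ollama via its official install script.
curl -fsSL https://ollama.com/install.sh | sh

# Pull the Mistral model (several GB, so this takes a while on a Pi).
ollama pull mistral

# Install the Piper TTS Python package, which provides the `piper` CLI.
pip install piper-tts

# Piper also needs a voice model: a .onnx file plus its .json config,
# downloaded separately (voices are published under rhasspy/piper-voices).
```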
A simple Node.js server was created to accept text from the client, process it using Mistral, convert the LLM response to speech with Piper, and stream the audio back to the client. For the frontend, a React app was developed to record voice input, display real-time text responses, and play the generated speech audio.
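The server's core glue can be sketched as two small helpers: one builds the streaming request for Ollama's `/api/generate` endpoint, the other builds the argument list for the `piper` CLI. The model name, voice file, and port are assumptions (Ollama's default port is 11434); the actual `fetch`, `child_process.spawn`, and WebSocket wiring are shown only as comments.

```javascript
// Sketch of the server-side glue, assuming Ollama on its default port and
// Piper invoked as a CLI subprocess. Names below are illustrative assumptions.

// Build the request for POST http://localhost:11434/api/generate.
// With stream: true, Ollama returns newline-delimited JSON objects, each
// carrying a `response` token that can be forwarded over the WebSocket.
function buildOllamaRequest(prompt, model = "mistral") {
  return {
    url: "http://localhost:11434/api/generate",
    body: JSON.stringify({ model, prompt, stream: true }),
  };
}

// Build the argv for the Piper CLI. Piper reads text on stdin and, with
// --output-raw, streams raw audio to stdout, which the server can chunk
// into WebSocket messages. The voice model path is an example only.
function buildPiperArgs(voicePath = "en_US-lessac-medium.onnx") {
  return ["--model", voicePath, "--output-raw"];
}

// Wiring (not executed here):
//   const req = buildOllamaRequest(userText);
//   const res = await fetch(req.url, { method: "POST", body: req.body });
//   // for each NDJSON line: ws.send(JSON.stringify({ token: obj.response }))
//   const tts = spawn("piper", buildPiperArgs());
//   tts.stdin.end(fullResponse);
//   tts.stdout.on("data", (chunk) => ws.send(chunk));
```

Keeping the request-building logic in pure functions like these makes it easy to unit-test the server without a running Ollama instance or a microphone attached.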