Medium

Controlling IOT devices using LLMs

  • The conversational AI agent is built as a distributed service composed of a transcription service, a text-to-speech engine, an LLM server, and a gRPC server and client architecture that ties all the services together.
  • This project eschews adding a web interface since it makes the assumption that natural conversational interactions are the future of interactivity.
  • The automatic speech recognition client uses the open-source RealtimeSTT library, which integrates a Faster-Whisper model, with English as the primary language.
  • The main application integrates a gRPC server together with the LLM client API calls to the Ollama server and the text-to-speech functionality.
  • An acknowledgment handshake was implemented to get around the issue of the user talking over the agent chatbot while it replies.
  • The latency of the above pipeline ranges from 1 to 10 seconds, which does not meet real-time requirements.
  • As of early 2025, a limited subset of current open-source models does include multi-modal capabilities.
  • This article showed how to build a complete conversational AI agent that can be used fully locally with no dependence on cloud services.
  • The main idea was to evaluate the feasibility of current open-source models to implement fully local conversational interfaces that can actuate IOT devices.
  • The conversational chatbot was tested on a Jetson AGX Orin, resulting in almost real-time conversations.
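
The acknowledgment handshake mentioned above can be illustrated with a minimal in-process sketch. This is not the article's implementation, which ties the services together over gRPC. The names `TurnGate`, `handle_utterance`, `reply_fn`, and `speak_fn` are hypothetical and exist only for illustration. The idea shown is that transcribed speech is ignored until the agent has acknowledged that its spoken reply is finished.

```python
import threading


class TurnGate:
    """Hypothetical helper: tracks whether the agent has finished speaking."""

    def __init__(self):
        self._agent_done = threading.Event()
        self._agent_done.set()  # nobody is speaking at start-up

    def agent_starts_speaking(self):
        # Close the gate: utterances arriving now count as talking over the agent.
        self._agent_done.clear()

    def acknowledge(self):
        # Sent by the speaking side once playback of the reply has ended.
        self._agent_done.set()

    def may_listen(self, timeout=None):
        # Returns True if the acknowledgment has arrived (within `timeout` seconds).
        return self._agent_done.wait(timeout)


def handle_utterance(gate, text, reply_fn, speak_fn):
    """Process one transcribed utterance, or drop it if the agent is still replying.

    reply_fn stands in for the LLM call and speak_fn for the text-to-speech
    engine; both are assumptions, injected so the sketch stays self-contained.
    """
    if not gate.may_listen(timeout=0):
        return None  # user talked over the agent: ignore this utterance
    gate.agent_starts_speaking()
    try:
        reply = reply_fn(text)
        speak_fn(reply)
    finally:
        gate.acknowledge()  # always reopen the gate, even if the reply fails
    return reply


if __name__ == "__main__":
    gate = TurnGate()
    handle_utterance(gate, "turn on the lamp", lambda t: "Done: " + t, print)
```

In a distributed setup the `acknowledge` step would be a message sent back to the speech recognition client rather than a local method call, but the turn-taking logic is the same.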
