International Journal of Advanced Research in Computer and Communication Engineering
A monthly peer-reviewed and refereed journal
ISSN Online 2278-1021 | ISSN Print 2319-5940 | Since 2012
IJARCCE adheres to the suggestive parameters outlined by the University Grants Commission (UGC) for peer-reviewed journals, upholding high standards of research quality, ethical publishing, and academic excellence.
Volume 15, Issue 4, April 2026

Multilingual AI-Based Voice-Controlled Robotic System Using Distributed Architecture

Mrs. V. Divya Vani, Dr. G. Anand Kumar, K. Dharan, Ch. Tharun

Abstract: Human–robot interaction is evolving rapidly with the convergence of artificial intelligence, cloud computing, and embedded systems. This paper presents the design and implementation of Astra, a multilingual AI-driven voice-controlled robotic system that operates through a distributed architecture. High-level natural language intelligence runs on a laptop while real-time motor control is managed by an ESP32 microcontroller. The system supports three languages (English, Hindi, and Telugu), enabling broad accessibility across India’s linguistically diverse population. Audio input is captured via a wired microphone and transcribed by Sarvam AI, a cloud-based speech recognition service optimized for Indian languages.
The transcribed text is forwarded via OpenRouter to a GPT-4o-mini large language model, which classifies the input as either a movement command or a general conversational query and generates a structured JSON response. Movement commands are transmitted from the laptop to the ESP32 over Wi-Fi using HTTP, while conversational answers are spoken aloud via gTTS. A soft wake-word mechanism (“Astra”) enhances usability without strict keyword dependency.
Experimental evaluation demonstrates an average speech recognition accuracy of 85% across all three languages, end-to-end command latency under 2 s, and robust motor control with no packet loss over a local Wi-Fi

Keywords: Human–Robot Interaction, Multilingual Speech Recognition, Sarvam AI, GPT-4o-mini, ESP32, Distributed AI Architecture, Wi-Fi HTTP Control, Natural Language Understanding, Voice-Controlled Robot, IoT.
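
Illustrative sketch (not taken from the paper): the laptop-side routing step described in the abstract could look roughly like the Python below. The JSON field names (`type`, `action`, `reply`), the ESP32 address, the `/move` endpoint, and the list of allowed actions are all assumptions made for illustration; the abstract does not specify them. The sketch only decides what to do with a model response and performs no network or audio I/O.

```python
import json
from urllib.parse import urlencode

# Hypothetical values: the abstract gives no address, endpoint, or schema.
ESP32_BASE_URL = "http://192.168.4.1"
ALLOWED_ACTIONS = {"forward", "backward", "left", "right", "stop"}


def strip_wake_word(text, wake_word="astra"):
    """Drop a leading wake word if present; otherwise return the text as-is.

    This mirrors a "soft" wake word: the name is optional, not required.
    """
    words = text.strip().split()
    if words and words[0].strip(",.!?").lower() == wake_word:
        words = words[1:]
    return " ".join(words)


def route_response(raw_json, base_url=ESP32_BASE_URL):
    """Map a structured model response to ("http", url) or ("speak", text)."""
    reply = json.loads(raw_json)
    if reply.get("type") == "movement":
        action = reply.get("action")
        if action not in ALLOWED_ACTIONS:
            # Reject anything outside the assumed command set.
            raise ValueError(f"unsupported action: {action!r}")
        return ("http", f"{base_url}/move?{urlencode({'action': action})}")
    # Anything else is treated as a conversational answer to be spoken.
    return ("speak", reply.get("reply", ""))
```

In such a design, the caller would issue an HTTP request for an `("http", url)` result and pass a `("speak", text)` result to the text-to-speech step; validating the action on the laptop keeps malformed model output from reaching the motor controller.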

How to Cite:

[1] Mrs. V. Divya Vani, Dr. G. Anand Kumar, K. Dharan, Ch. Tharun, “Multilingual AI-Based Voice-Controlled Robotic System Using Distributed Architecture,” International Journal of Advanced Research in Computer and Communication Engineering (IJARCCE), DOI: 10.17148/IJARCCE.2026.154126

This work is licensed under a Creative Commons Attribution 4.0 International License.