Sensory Announces the World’s Smallest and Most Powerful On-Device Speech-to-Text Engine

15th Apr, 2026

About The Author

Todd Mozer

Founder & CEO

A serial entrepreneur with an IPO, an acquisition, 50+ patents, and a lifetime in audio-tech innovation. Todd has deep experience licensing and working with the largest tech companies in the world, including Amazon, Apple, Google, Microsoft, Samsung, and many others.

Now Supporting TensorFlow Lite Micro and NPU Architectures

SANTA CLARA, CA – April 15, 2026 – Sensory Inc., a pioneer in on-device AI for over 30 years, today announced a breakthrough in embedded speech recognition with the launch of its latest Speech-to-Text (STT) engine. Optimized for TensorFlow Lite Micro (TFLM) and advanced Neural Processing Units (NPUs), including the Arm® Ethos™-U55, this new engine delivers unparalleled accuracy and performance in an ultra-compact footprint.

Global Reach: 37 Languages on the Edge

Breaking down the barriers of localized AI, Sensory’s STT engine supports 37 languages, enabling manufacturers to deploy truly global products with a single, ultra-efficient architecture. Supported languages include:

Afrikaans, Arabic, Belarusian, Bengali, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Farsi, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Malay, Mandarin, Norwegian, Polish, Portuguese, Romanian, Russian, Spanish, Swahili, Swedish, Thai, Turkish, Ukrainian, and Vietnamese.

Maximum Power, Minimum Footprint

By leveraging specialized Neural Processing Units (NPUs), Sensory’s STT engine is built to eliminate the performance-draining data transfers between a CPU and an NPU. This architecture offloads the entire tensor computation graph to the hardware accelerator, which can significantly reduce power consumption and latency. By keeping the CPU idle during inference, device manufacturers can extend battery life in portables or reserve processing cycles for complex system tasks and UI management.
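To illustrate why full offload matters, the following is a minimal sketch that counts how often execution would cross the CPU/NPU boundary for a given operator placement. The operator names, the `Op` type, and the placements are hypothetical and are not taken from Sensory's engine or any toolchain; the point is only that a single CPU fallback operator introduces two handoffs, while a fully mapped graph introduces none.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str    # illustrative operator name (hypothetical)
    device: str  # "npu" or "cpu": where the operator is assumed to run


def count_handoffs(ops):
    """Count CPU<->NPU boundaries crossed while walking ops in execution order."""
    return sum(1 for prev, cur in zip(ops, ops[1:]) if prev.device != cur.device)


def npu_fraction(ops):
    """Fraction of operators assigned to the NPU."""
    return sum(op.device == "npu" for op in ops) / len(ops)


# Hypothetical graph in which one operator falls back to the CPU.
mixed = [
    Op("conv_1", "npu"),
    Op("custom_norm", "cpu"),
    Op("conv_2", "npu"),
    Op("softmax", "npu"),
]
# The same graph with every operator mapped to the NPU.
full = [Op(op.name, "npu") for op in mixed]

print(count_handoffs(mixed), npu_fraction(mixed))  # 2 0.75
print(count_handoffs(full), npu_fraction(full))    # 0 1.0
```

Each handoff in a real system would typically involve synchronizing and moving tensors between the two processors, which is the overhead the fully mapped design is meant to avoid.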

The engine is available in two optimized configurations:

2.7 MB Domain-Specific Model: Optimized for large vocabulary “Command & Control” tasks, this model utilizes domain adaptation to maintain high accuracy in specific environments, such as automotive cabins. It features a peak SRAM usage of 787.11 KiB and operates at 892.9 Million MACs per inference.
13 MB General-Purpose Model: A versatile model designed to handle natural language and large vocabularies without per-domain tuning. It fits within a standard 2 MB SRAM limit with a peak footprint of 1.68 MB, operating at 4.37 Billion MACs per inference.
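As a quick way to check these peak SRAM figures against a target device, here is a small sketch that computes the remaining headroom for a given SRAM budget. The function name and the model keys are illustrative. The announcement mixes KiB and MB, so the sketch assumes "MB" means 1024 × 1024 bytes; with decimal megabytes the numbers shift slightly, but the conclusion for a 2 MB budget is the same.

```python
KIB = 1024
MIB = 1024 * KIB

# Peak SRAM figures from the announcement, converted to bytes.
# Assumption: "MB" is read as MiB (1024 * 1024 bytes).
PEAK_SRAM_BYTES = {
    "domain_specific": 787.11 * KIB,
    "general_purpose": 1.68 * MIB,
}


def sram_headroom(model, budget_bytes):
    """Return bytes left in the budget (negative if the model does not fit)."""
    return budget_bytes - PEAK_SRAM_BYTES[model]


budget = 2 * MIB  # the 2 MB SRAM limit mentioned above
for model in PEAK_SRAM_BYTES:
    left = sram_headroom(model, budget)
    print(f"{model}: fits={left >= 0}, headroom={left / KIB:.1f} KiB")
```

The headroom matters in practice because the application, audio buffers, and any RTOS also need SRAM alongside the model's working memory.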

Technical Specifications & Performance

Feature	Domain-Specific Model	General-Purpose Model
Ideal Use Case	Targeted commands, large vocabulary	Natural language, unlimited vocabulary
Model Size	2.7 MB	13 MB
Peak SRAM Usage	787.11 KiB	1.68 MB
Compute Requirements	892.9 Million MACs/inference	4.37 Billion MACs/inference
Acceleration	100% NPU Mapping	100% NPU Mapping
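The compute figures in the table can be turned into a rough lower bound on inference time for a given accelerator. The sketch below is a back-of-the-envelope estimate only: the 256 MACs per cycle and the 500 MHz clock are hypothetical values chosen for illustration, not published numbers for this engine, and the calculation ignores memory stalls and any operator that cannot keep the MAC array fully busy. The announcement also does not state how much audio one inference covers, so the result says nothing about real-time factor.

```python
# MACs per inference, taken from the table above.
MACS_PER_INFERENCE = {
    "domain_specific": 892.9e6,
    "general_purpose": 4.37e9,
}


def ideal_inference_ms(macs, macs_per_cycle, clock_hz, utilization=1.0):
    """Lower-bound time for one inference in milliseconds, ignoring memory stalls."""
    return macs / (macs_per_cycle * clock_hz * utilization) * 1e3


# Hypothetical accelerator: 256 MACs/cycle at 500 MHz (illustrative values only).
for model, macs in MACS_PER_INFERENCE.items():
    print(f"{model}: {ideal_inference_ms(macs, 256, 500e6):.2f} ms")
```

Lowering the `utilization` argument (for example to 0.5) gives a more conservative estimate when the MAC array is not expected to be fully occupied.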

Universal Compatibility: LiteRT Micro & High-Performance Silicon

Sensory’s STT engine is engineered for rapid portability across a broad ecosystem. By using LiteRT Micro (formerly known as TensorFlow Lite Micro) as the runtime layer, Sensory provides seamless integration for:

Arm® Ethos™ NPU Family: Native support for Ethos-U55, U65, and U85.
Cadence® Tensilica® HiFi DSPs: Full compatibility with the HiFi 4, HiFi 5, and HiFi iQ series.
Edge Platforms: Optimized for Arm Cortex-M (M4, M7, M55) and popular boards like Arduino Nano 33 BLE Sense, ESP32, and Sony Spresense.

Privacy and Performance

“Our STT engine demonstrates that natural language interfaces can be powerful without relying on the cloud,” said Todd Mozer, Chairman and CEO of Sensory. By processing 100% of voice data on-device, Sensory helps developers protect user privacy, reduce latency, and maintain reliability in environments with limited or no connectivity.

Why Embedded STT Matters

“Our latest STT engine proves that you don’t need a cloud connection or even a big embedded model for powerful, natural language interfaces,” said Todd Mozer, Chairman and CEO of Sensory. Sensory’s on-device approach offers several critical advantages over cloud solutions:

Privacy & Security: Voice data never leaves the device, ensuring total user privacy.
Low Latency: Instantaneous results without relying on internet connectivity or bandwidth.
Lower Power and Heat: Model efficiency and NPU usage reduce power consumption substantially.
Cost Efficiency: Eliminates ongoing cloud processing fees and reduces data transmission costs.
Reliability: Guaranteed performance even in “comms-denied” environments or areas with poor cellular service.

About Sensory Inc.

Sensory Inc. creates a safer and superior on-device user experience through vision and voice technologies. Sensory’s technologies are widely deployed in consumer electronics applications including mobile phones, automotive, wearables, and smart home devices.

Media Contact: Amanda Defelice, Head of Marketing, Sensory Inc.
