Going Deep Series – Part 3 of 3

1st May, 2015

About The Author

Todd Mozer

Founder & CEO

A serial entrepreneur with an IPO, an acquisition, 50+ patents, and a lifetime in audio-tech innovation. Todd has deep experience licensing and working with the largest tech companies in the world, including Amazon, Apple, Google, Microsoft, Samsung, and many others.

Winning on Accuracy & Speed… How can a tiny player like Sensory compete in deep learning technology with giants like Microsoft, Google, Facebook, Baidu and others?

There are a number of ways, so let me address them specifically:

Personnel: We all know it’s about quality, not quantity. I’d like to think that at Sensory we hire higher-caliber engineers than Google and Microsoft do, and maybe to an extent that is true, though probably not when comparing their best with our best. We probably do, however, have less turnover. Less turnover means our experience and knowledge base are more likely to stay in-house rather than walk off to our competitors or get lost because they were never documented.
Focus and strategy: Sensory has stayed ahead in speech recognition and vision because we have remained focused and consistent from our start. We pioneered the use of neural networks for speech recognition in consumer products. We were focused on consumer electronics before anyone thought it was a market…more than a dozen years before Siri!
“Specialized” learning: Deep learning works. But Sensory has a theory that it can also be destructive when individual users fall outside the learned norms. Sensory learns deep on a general usage model, but once we go on device, we learn shallow through a specialized adaptive process. We adapt to the specific users of the device rather than to a generalized population.
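
To make the “learn deep, then learn shallow” idea concrete, here is a minimal sketch in Python. It is not Sensory’s actual method: the frozen weights, the UserAdapter class, the toy samples, and the learning rate are all hypothetical, chosen only to show a general model that stays fixed while a small per-user layer adapts on the device.

```python
import math

# Hypothetical weights, standing in for a model trained offline on a broad
# population and then frozen when it ships to the device.
GENERAL_WEIGHTS = (1.0, -1.0)


def general_score(x):
    """Frozen 'deep' part, reduced here to one fixed projection."""
    return sum(w * xi for w, xi in zip(GENERAL_WEIGHTS, x))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class UserAdapter:
    """Shallow per-user layer: one scale and one bias on the frozen score."""

    def __init__(self):
        self.scale = 1.0  # identity at start, so output equals the general model
        self.bias = 0.0

    def predict(self, x):
        return sigmoid(self.scale * general_score(x) + self.bias)

    def adapt(self, samples, lr=0.5, epochs=200):
        """Stochastic gradient descent on log loss, updating only scale and bias."""
        for _ in range(epochs):
            for x, label in samples:
                score = general_score(x)
                error = self.predict(x) - label  # d(log loss) / d(logit)
                self.scale -= lr * error * score
                self.bias -= lr * error


if __name__ == "__main__":
    # A user whose positive example falls outside the general model's norm.
    user_samples = [((0.2, 0.5), 1), ((0.0, 1.5), 0)]
    adapter = UserAdapter()
    print("before:", round(adapter.predict((0.2, 0.5)), 3))
    adapter.adapt(user_samples)
    print("after: ", round(adapter.predict((0.2, 0.5)), 3))
```

In this toy setup the general model scores the user’s positive example below 0.5, and a few hundred cheap updates to two parameters move it above 0.5 without touching the frozen weights. Keeping the adapted part this small is one plausible way to fit on-device adaptation into a tight compute and memory budget.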

Together, these three items have given Sensory the highest-quality embedded speech engines in the world. It’s worth reiterating why embedded is needed, even if speech recognition can all be done in the cloud:

Privacy: Privacy is among today’s most heated topics. There is growing concern about “big brother” organizations (and governments) that know the intimate details of our lives. Using embedded speech recognition can help improve privacy by not sending personal data into the cloud for analysis.
Speed: Embedded speech recognition can be ripping fast and consistently available. Online or cloud-based recognition services can be spotty, or unavailable altogether, when Internet connections are unstable.
Accuracy: Embedded speech systems have the potential advantage of a superior signal-to-noise ratio and don’t risk data loss or performance issues due to a poor or non-existent connection.
