Our recent webinar brought together Sensory leaders to explore one of the biggest questions in voice AI today: what belongs on-device, and what should live in the cloud.
Dave Rich, Dr. Andreas Hagen, and Dr. Joseph Tepperman walked through the tradeoffs across embedded, hybrid, and cloud architectures, with a clear message: the best approach depends on the use case, the device, and the user experience. As the team noted, “cloud might not always be the best,” especially when reliability, privacy, latency, and bandwidth are top priorities.

A major theme of the session was the value of hybrid voice systems. By keeping wake word detection and speech-to-text on the device, teams can reduce data transfer, improve responsiveness, and maintain a better experience in low-connectivity environments. In fact, the webinar highlighted that sending only text to the cloud can reduce bandwidth by about 99% compared with streaming raw audio. That approach also helps preserve privacy by keeping more of the interaction local to the device.
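To get a feel for where a figure like that comes from, here is a rough back-of-envelope sketch. Every constant in it is our own illustrative assumption rather than a number from the webinar: 16 kHz, 16-bit mono PCM audio, a speaking rate of about 150 words per minute, and roughly 6 bytes per word of plain text. It also ignores protocol overhead such as packet headers and encryption, which would narrow the gap somewhat for very short messages.

```python
# Back-of-envelope comparison: streaming raw audio vs. sending transcribed text.
# All constants below are illustrative assumptions, not figures from the webinar.

SAMPLE_RATE_HZ = 16_000   # assumed: a common sample rate for speech audio
BYTES_PER_SAMPLE = 2      # assumed: 16-bit mono PCM
WORDS_PER_MINUTE = 150    # assumed: a typical conversational speaking rate
BYTES_PER_WORD = 6        # assumed: about 5 ASCII letters plus a space


def audio_bytes_per_second(sample_rate_hz: int, bytes_per_sample: int) -> float:
    """Payload rate of uncompressed mono PCM audio."""
    return float(sample_rate_hz * bytes_per_sample)


def text_bytes_per_second(words_per_minute: float, bytes_per_word: float) -> float:
    """Average payload rate of the transcript of the same speech."""
    return words_per_minute / 60.0 * bytes_per_word


def bandwidth_reduction(audio_bps: float, text_bps: float) -> float:
    """Fraction of payload saved by sending text instead of audio."""
    return 1.0 - text_bps / audio_bps


audio_bps = audio_bytes_per_second(SAMPLE_RATE_HZ, BYTES_PER_SAMPLE)
text_bps = text_bytes_per_second(WORDS_PER_MINUTE, BYTES_PER_WORD)
reduction = bandwidth_reduction(audio_bps, text_bps)

print(f"raw audio: {audio_bps:.0f} B/s")
print(f"text only: {text_bps:.0f} B/s")
print(f"reduction: {reduction:.2%}")
```

Under these assumptions the text payload is a tiny fraction of the audio payload, so the savings land comfortably above 99%. Swapping in a compressed audio codec or a different speaking rate changes the exact number, but the text stream remains orders of magnitude smaller.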
The webinar also showed how these ideas apply in real products, from automotive and wearables to TV and retail experiences. Dr. Tepperman emphasized that model size and task scope should be matched carefully to the application, whether that means a compact command set, a lightweight NLU, or a larger model for more open-ended queries.
The takeaway? Voice experiences work best when the architecture is designed around the constraints and goals of the product. Curious which option is the best fit for your next product? Contact us to learn more about the solutions available to you, or watch the full webinar recording here.