AI That Listens, Sees, and Understands — On the Edge

Sensory Text-Independent Speaker Verification

A flexible, on-device biometric engine that verifies identity from natural speech: any phrase, any wording, no script required.

Amazon, Arcelik, ATT, Bentley, Best Buy Health, Canon, Docomo, ecobee, Edwards LifeSciences, Electronic Caregiver, Foxconn, Fujifilm, Fujitsu, Garmin, GN, Google, GoPro, Harman, Hasbro, HiMirror, HMD, Honda, Honeywell, Huawei, Intuition Robotics, Jabra, Kakao, Lenovo, LG, Lifepod, Logitech, Mastercard, Mattel, Meizu, Merry, Microsoft, Midea, Motorola, NEC, Nextbase, Nokia, Norlase, Panasonic, Peloton, Philohealth, Plantronics, Primax, Quanta Computer, Samsung, Sena, Simplehuman, SK Telecom, Snap, Spotify, Stryker, Telly, Tencent, Tymphany, Universal Electronics, Vtech, Vuzix, VW, Waze, Zoom, ZTE

What Is Sensory Text-Independent Speaker Verification?

Sensory Text-Independent Speaker Verification authenticates users based on how they speak, not what they say. Instead of requiring a fixed phrase, it analyzes vocal characteristics across any natural speech. This makes it ideal for frictionless logins, continuous authentication, and hands-free verification across devices. Everything runs fully on-device, keeping biometric data private while delivering fast and reliable performance.


Core Capabilities

Flexible voice authentication designed for natural speech and real-world conditions.

Authenticate From Any Spoken Phrase

No script, no required wording.
Verifies users through natural speech for seamless experiences.

Fully On-Device Biometrics

Nothing leaves the device.
All matching happens locally, ensuring privacy and quick response times.

Continuous or One-Time Verification

Adaptable to your workflow.
Support login, session monitoring, or hands-free user recognition.

Noise-Robust Neural Modeling

Built for everyday environments.
Delivers strong accuracy across accents, backgrounds, and microphone types.

Can be combined with text-dependent wake words or face authentication for improved performance and increased flexibility.
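
One common way to combine biometric modalities is score-level fusion, where each modality produces a normalized score and the scores are blended before a single accept/deny decision. The sketch below illustrates that general idea only. The page does not say how Sensory combines voice with wake words or face, so the function names, the equal weighting, and the 0.7 threshold are all assumptions made for illustration.

```python
def fuse_scores(voice_score: float, face_score: float,
                voice_weight: float = 0.5) -> float:
    """Blend two scores in [0, 1] with a weighted average.

    voice_weight is a hypothetical tuning knob, not a documented parameter.
    """
    if not 0.0 <= voice_weight <= 1.0:
        raise ValueError("voice_weight must be between 0 and 1")
    return voice_weight * voice_score + (1.0 - voice_weight) * face_score


def accept(voice_score: float, face_score: float,
           threshold: float = 0.7, voice_weight: float = 0.5) -> bool:
    """Accept the user when the fused score reaches the assumed threshold."""
    return fuse_scores(voice_score, face_score, voice_weight) >= threshold


# A borderline voice score can be rescued by a strong face score.
strong_face = accept(voice_score=0.6, face_score=0.9)
# The same voice score with an equally borderline face score is denied.
weak_face = accept(voice_score=0.6, face_score=0.6)
```

In this sketch a second modality adds flexibility because neither score has to clear the threshold on its own.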

The Sensory Text-Independent Speaker Verification Flow

Identity is confirmed through vocal characteristics, not the words spoken.

Step-by-Step Process
  • Step 1: User Speaks Naturally
    The system captures any phrase or sentence spoken during interaction—no predefined script required.
  • Step 2: Voice Features Are Extracted
    Acoustic markers unique to the speaker (like pitch, timbre, and speaking style) are analyzed locally on the device.
  • Step 3: AI Model Performs Speaker Matching
    The model compares extracted features to enrolled profiles using Sensory’s neural networks, optimized for edge devices.
  • Step 4: Access Granted or Denied Instantly
    Verification happens in real time without cloud contact, giving a fast, private authentication result.
Good to Know:

Because users can speak freely, authentication feels natural and effortless—ideal for voice-controlled or conversational products.
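
Steps 3 and 4 of the flow above can be sketched as a comparison between a speaker embedding and a set of enrolled profiles. This is a minimal illustration, not Sensory's implementation: it assumes Step 2 has already produced a fixed-length embedding vector, it uses cosine similarity as the matching score, and the 0.7 threshold and the sample vectors are invented for the example.

```python
import math
from typing import Dict, List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify(embedding: List[float],
           enrolled: Dict[str, List[float]],
           threshold: float = 0.7) -> Tuple[Optional[str], float]:
    """Return (matched user or None, best score).

    The threshold is an assumed value; a real system would tune it
    against measured false-reject and impostor-accept rates.
    """
    best_user, best_score = None, -1.0
    for user, profile in enrolled.items():
        score = cosine_similarity(embedding, profile)
        if score > best_score:
            best_user, best_score = user, score
    if best_score >= threshold:
        return best_user, best_score   # access granted
    return None, best_score            # access denied


# Hypothetical enrolled profiles, stored locally on the device.
enrolled_profiles = {
    "alice": [0.9, 0.1, 0.3],
    "bob": [0.1, 0.8, 0.5],
}

matched_user, matched_score = verify([0.85, 0.15, 0.35], enrolled_profiles)
unknown_user, unknown_score = verify([0.0, 0.0, 1.0], enrolled_profiles)
```

Here the first utterance lands close to one enrolled profile and is accepted, while the second is not close enough to any profile and is denied. Nothing in the sketch needs a network call, which mirrors the on-device design described above.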

Compact, CPU-Optimized Models for Fast, Low-Latency Biometrics

Open-source and academic models tend to be large and perform poorly on CPU

Model      Parameter Count   Model Size   Inference Latency
Model A    15.4M             62 MB        38 ms per inference
Model B    21.5M             86 MB        46 ms per inference
Model C    1.0M              4.8 MB       50 ms per inference

Sensory offers CPU-optimized Text-Independent Speaker Verification (TISV) models in several sizes

Model         Parameter Count   Model Size   Inference Latency
Extra Small   0.6M              0.7 MB       6 ms per inference
Small         1.4M              1.4 MB       10 ms per inference
Medium        2.5M              2.5 MB       16 ms per inference

  • The 1st stage provides strong false-reject (FR) and impostor-accept (IA) rates at low computational cost
  • The 2nd stage significantly reduces the IA rate with a negligible increase in the FR rate
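
The two-stage behavior described above resembles a cascade, in which a cheap first stage screens every utterance and a stricter second stage re-checks only what the first stage accepted. The sketch below shows that general pattern under stated assumptions. The scoring functions, thresholds, and feature values are hypothetical stand-ins, and the page does not describe how Sensory's stages actually work.

```python
from typing import Callable, List

Scorer = Callable[[List[float]], float]


def cascade_verify(features: List[float],
                   stage1: Scorer, stage2: Scorer,
                   t1: float = 0.5, t2: float = 0.8) -> bool:
    """Accept only if both stages pass; stage2 runs only after stage1 accepts."""
    if stage1(features) < t1:
        return False                 # rejected cheaply, stage2 never runs
    return stage2(features) >= t2    # stricter re-check of stage1 accepts


stage2_calls: List[int] = []         # records each time the costly stage runs


def cheap_score(features: List[float]) -> float:
    """Toy first-stage scorer: average of the feature values."""
    return sum(features) / len(features)


def strict_score(features: List[float]) -> float:
    """Toy second-stage scorer: the weakest feature value."""
    stage2_calls.append(1)
    return min(features)


genuine = cascade_verify([0.9, 0.9, 0.85], cheap_score, strict_score)
near_miss = cascade_verify([0.9, 0.9, 0.2], cheap_score, strict_score)
obvious_reject = cascade_verify([0.1, 0.2, 0.3], cheap_score, strict_score)
```

Because the second stage only ever sees utterances the first stage already accepted, it can remove accepts but never add them, which is why a cascade of this shape tends to lower the accept rate for impostors. Its cost is paid only on the subset of inputs that pass the first stage.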

Trusted by Global Innovators

See how leading brands use Sensory’s on-device AI to deliver faster, safer, and more intuitive user experiences at scale.

Andrew Doyle
VP for Frontline Workers
Jabra

“Sensory’s technology has exceeded our high standards for accuracy, speed, and efficiency. By enabling hands-free control of key functions through voice commands, we’re boosting productivity and streamlining workflows for retail staff. This allows our frontline workers to focus on what matters most – delivering exceptional customer service.”

Ephrem Chemaly
General Manager & VP of the Automotive Business Unit
MediaTek

“By combining MediaTek’s expertise in generative AI technology with Sensory’s strengths in on-device voice AI, the collective efforts of our companies enable significant strides in providing next-level entertainment and security in vehicles powered by MediaTek Dimensity Auto.”

Cynthia Lee
Lead Product Manager
Zoom

“Zoom is passionate about making collaboration easier, but we always put our customer’s privacy and security front and center. Sensory’s technology checked all the boxes for us: accurate, fast and private…”

Michael Anderson
CEO
Nextbase

“Sensory’s TrulyHandsfree technology is a key component in making the Piqo not just compact and powerful, but also incredibly user-friendly. This partnership enhances our ability to provide unmatched value and safety to our customers.”

Sascha Prueter
Chief Product Officer
Telly

“The smartest TV ever deserves the smartest approach to privacy. With Telly’s use of Sensory’s on-device speech-to-text and voice technologies, we are able to bring extremely fast, low-latency voice commands to the living room.”

Experience Sensory Technology Live

Interact with real-time AI demos and find out what sets us apart.

Works Across Industries

Identity verification through free-form speech, built for mobile, automotive, smart devices, and more.

Frequently Asked Questions

Everything You Need to Know

Text-independent systems don’t require a specific phrase. They authenticate based on the speaker’s voice itself.

No. The system can verify identity from short phrases, commands, or brief conversational snippets.

Yes. All biometric matching happens locally, ensuring privacy and reducing latency.

Yes. The model is optimized for embedded systems and low-power devices.

Absolutely. TISV can verify users throughout a session using natural speech.

The model is built for real-world use and handles typical background noise effectively.