
Sensory Boosts Performance of Embedded Wake Word and Speech Recognition by Infusing Smarter AI

Santa Clara, Calif., April 27, 2017 – Sensory, a Silicon Valley-based company focused on improving the user experience and security of consumer electronics through state-of-the-art embedded AI technologies, today announced that it has made significant updates to the embedded AI in its TrulyHandsfree™ technology to dramatically boost its performance and accuracy, while staying small and low power.

Introduced in 2009, TrulyHandsfree revolutionized voice user interfaces by offering the first commercially successful embedded small-vocabulary speech recognition system to feature an always-listening wake word. Incorporating Sensory’s smartest and most efficient deep neural network technologies to date, TrulyHandsfree 5.0 takes embedded voice interfaces to new heights, offering an on-device voice user interface experience that is more natural and intuitive than ever before. At the same time, a new shallow learning approach compresses the models so that they run at ultra-low power with minimal memory and MIPS. Today, TrulyHandsfree can be found in leading mobile phones, sports cameras, IoT devices, and even toys!

Smarter Speech Activation for Improved Accuracy 

Early on, accuracy concerns were the major factor preventing mass adoption of voice wakeup technology. The risk of false fires had to be minimized to ensure that devices didn’t mistakenly activate at inappropriate times. TrulyHandsfree was the first solution capable of offering this consistent reliability, and since its introduction into products like the Moto X and Galaxy S series smartphones, Sensory’s voice models and neural networks have continually evolved to offer better performance. Today, Sensory’s latest deep neural network models for embedded AI have allowed the company to deliver a 5X reduction in false accepts compared to version 4.01, nearly eliminating the chance of the speech recognition system activating when not actually summoned by the user. A new shallow learning approach takes the biggest speech models and compresses them by a factor of 5 to 10 with no decrease in accuracy. Additionally, the latest neural network models offer greater reliability for user-defined triggers, letting users select the wake word they prefer while retaining the same accuracy and performance offered by specialized fixed triggers.

Enhanced Security Makes Sure That It’s You Speaking

One of the greatest challenges facing the IoT industry is user and data security. TrulyHandsfree 5.0 includes a layer of security in the voice interface that combines Sensory’s expertise in voice biometrics with deep neural nets to authenticate users, limiting who can access the device. TrulyHandsfree 5.0’s embedded speaker verification technology is highly flexible, allowing users to enroll their voice and their own custom trigger or passphrase, which keeps unauthorized users out of the voice user interface. Even if an unauthorized person learns the trigger or passphrase, Sensory’s voice biometrics technology will recognize that it’s not the enrolled user speaking and will not authenticate them, preventing them from accessing the device.
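The release does not describe how the verification is implemented. As a minimal sketch of the general idea only, the Python below grants access when two independent checks both pass: the spoken phrase matches, and the voice resembles the enrolled user. The function names, the embedding vectors and both thresholds are hypothetical and are not taken from Sensory’s SDK.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def should_unlock(phrase_score, voice_embedding, enrolled_embedding,
                  phrase_threshold=0.8, speaker_threshold=0.7):
    """Grant access only if the phrase AND the speaker both match.

    phrase_score: hypothetical confidence (0..1) that the passphrase was spoken.
    voice_embedding / enrolled_embedding: hypothetical voice feature vectors.
    Both thresholds are illustrative assumptions.
    """
    phrase_ok = phrase_score >= phrase_threshold
    speaker_ok = cosine_similarity(voice_embedding, enrolled_embedding) >= speaker_threshold
    return phrase_ok and speaker_ok


# Toy data: the impostor says the right phrase but has a different voice.
enrolled = [0.9, 0.1, 0.4]
same_user = [0.88, 0.12, 0.41]
impostor = [0.1, 0.9, -0.3]

print(should_unlock(0.95, same_user, enrolled))
print(should_unlock(0.95, impostor, enrolled))
```

Requiring both checks is what makes a leaked passphrase insufficient on its own: the second call is rejected even though the phrase score is high.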

Advanced Signal Processing for Voice Barge-In and Far-Field Speech Recognition

TrulyHandsfree 5.0 also adds voice barge-in, enabled by Sensory’s proprietary Acoustic Echo Cancellation (AEC) technology. Users can interrupt a device while it is playing voice prompts, music or other sounds by saying the trigger phrase, and then control music playback by voice or issue any other supported speech command. This provides a more fluid voice user interface experience. Sensory’s new AEC technology is tuned specifically to maximize speech recognition accuracy. This boosts the performance not only of the embedded TrulyHandsfree speech recognizer, but also of any cloud-based speech recognition system to which the speech requests are passed.
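Sensory’s AEC is proprietary and the release gives no algorithmic detail. Purely to illustrate what echo cancellation does, the sketch below uses normalized least mean squares (NLMS), a textbook adaptive filter, to subtract an estimate of the device’s own playback from the microphone signal. The tap count, step size and simulated echo path are assumptions chosen for the demo.

```python
import random


def nlms_echo_cancel(mic, playback, taps=8, mu=0.5, eps=1e-8):
    """Return the mic signal with an adaptive estimate of the playback echo removed."""
    weights = [0.0] * taps   # current estimate of the speaker-to-mic echo path
    history = [0.0] * taps   # most recent playback samples, newest first
    residual = []
    for m, p in zip(mic, playback):
        history = [p] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        err = m - echo_estimate            # what remains after removing the echo
        power = sum(x * x for x in history) + eps
        # NLMS update: step toward lower error, normalized by input power.
        weights = [w + mu * err * x / power for w, x in zip(weights, history)]
        residual.append(err)
    return residual


def demo(num_samples=2000, seed=0):
    """Simulate playback leaking into the mic and measure how much echo is left."""
    rng = random.Random(seed)
    playback = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    echo_path = [0.6, 0.3, 0.1]  # hypothetical room response, for illustration only
    mic = [
        sum(h * playback[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
        for n in range(num_samples)
    ]
    residual = nlms_echo_cancel(mic, playback)
    tail = slice(num_samples - 500, num_samples)  # measure after convergence
    mic_energy = sum(x * x for x in mic[tail])
    residual_energy = sum(x * x for x in residual[tail])
    return mic_energy, residual_energy


mic_energy, residual_energy = demo()
print(residual_energy < 0.01 * mic_energy)
```

In a barge-in scenario the residual would contain the user’s speech with most of the playback removed, which is the signal a wake word detector would then examine.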

Further, the overall performance of voice user interface systems is greatly affected by the signal-to-noise ratio of the audio signal received. Previous versions of TrulyHandsfree boasted excellent robustness to noise; with version 5.0, however, Sensory incorporates new deep learning noise suppression algorithms that reduce the level of ambient noise passed to the speech recognizer, so that wake words and voice requests are heard clearly, further improving TrulyHandsfree’s recognition hit rate. This is especially helpful in home, automotive and mobile applications where background noise can drown out the user’s voice.
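For readers unfamiliar with the term, signal-to-noise ratio is usually expressed in decibels as ten times the base-10 logarithm of signal power over noise power. The short sketch below shows that standard calculation on synthetic samples; the sample values are invented for illustration and say nothing about TrulyHandsfree’s measured performance.

```python
import math


def mean_power(samples):
    """Average power of a sequence of audio samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal, noise):
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    return 10.0 * math.log10(mean_power(signal) / mean_power(noise))


# Synthetic example: speech at amplitude 1.0, ambient noise at amplitude 0.1.
speech = [1.0, -1.0] * 100
noise = [0.1, -0.1] * 100
# Halving the noise amplitude (as a suppressor might) raises SNR by about 6 dB.
suppressed_noise = [0.5 * n for n in noise]

print(round(snr_db(speech, noise), 2))
print(round(snr_db(speech, suppressed_noise), 2))
```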

Same Low-Power and Efficient Footprint

Today, voice has surpassed all other interface options for a growing list of device categories. However, most devices on the market rely on cloud services for AI processing, and these cloud-based solutions cannot be accessed completely hands-free without a client-side voice trigger technology. Many of today’s always-listening voice-enabled device applications, especially low-power devices that don’t have the required resources to run completely off the cloud, can benefit from a hybrid client/cloud approach that taps TrulyHandsfree technology. TrulyHandsfree is extremely resource- and power-efficient, with ports available for platforms ranging from today’s most powerful applications processors to low-power DSPs. For ultra-low power devices that have limited battery capacity, such as wearables, Sensory offers its Low Power Sound Detector (LPSD) hardware component for DSPs and smart microphones, which allows low-power configurations of TrulyHandsfree to operate at an average battery draw of less than 1 mA.
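To put an average draw of about 1 mA in context, the back-of-the-envelope sketch below estimates always-listening battery life. It assumes, as an illustration rather than a description of how LPSD works, that a sound detector keeps the recognizer idle during silence; the duty cycle, current figures and battery capacity are hypothetical.

```python
def average_current_ma(active_fraction, active_ma, idle_ma):
    """Time-weighted average current for a duty-cycled always-listening device."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return active_fraction * active_ma + (1.0 - active_fraction) * idle_ma


def battery_life_hours(capacity_mah, average_ma):
    """Ideal battery life, ignoring self-discharge and other system loads."""
    if average_ma <= 0.0:
        raise ValueError("average_ma must be positive")
    return capacity_mah / average_ma


# Hypothetical wearable: recognizer active 10% of the time at 5 mA, idle at 0.5 mA.
avg = average_current_ma(0.1, 5.0, 0.5)
print(round(avg, 2))
# Hypothetical 100 mAh cell spent on listening alone.
print(round(battery_life_hours(100.0, avg), 1))
```

The numbers show why the idle current dominates: with the recognizer mostly asleep, the average sits close to the idle figure rather than the active one.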

“The demand for voice user interfaces continues to grow rapidly and TrulyHandsfree 5.0 will allow more manufacturers to incorporate low cost, low power voice user interfaces on device without sacrificing the cloud accuracy,” said Todd Mozer, CEO of Sensory. “TrulyHandsfree 5.0 offers the most advanced and efficient embedded AI technologies we’ve ever created. Additionally, we’ve set the bar higher than ever before for speech recognition accuracy by applying our new proprietary echo cancellation and noise reduction algorithms that we are confident will boost far-field voice performance for IoT devices of all kinds.”

TrulyHandsfree is the most widely deployed embedded speech recognition engine in the world, having enabled a hands-free voice user experience on more than 2 billion devices from leading brands worldwide. Additionally, Sensory can deliver voice triggers for all major IoT cloud services, including Amazon AVS, Apple Siri, Google Assistant and Microsoft Cortana, and provide developer support for cloud service interfaces on Linux, Android, iOS and Windows as well as support for dozens of proprietary DSPs, microcontrollers, smart microphones and other low-power embedded devices.

For more information about this announcement, Sensory or its technologies, please contact sales@sensory.com. Press inquiries: press@sensory.com.

About Sensory

Sensory Inc. creates a safer and superior UX through vision and voice technologies. Sensory’s technologies are widely deployed in consumer electronics applications including mobile phones, automotive, wearables, toys, IoT and various home electronics. Sensory’s product line includes TrulyHandsfree voice control, TrulySecure biometric authentication, and TrulyNatural large vocabulary natural language embedded speech recognition. Sensory’s technologies have shipped in over a billion units of leading consumer products. Visit Sensory at www.sensory.com.
