
Mechatronics Meets No-Code Voice AI

I’m currently running around in ever-decreasing circles shouting, “Don’t Panic!” because I’m trying to spin too many plates and juggle too many balls and I was never trained to spin and can’t juggle. Hmmm, now that I come to think about it, that’s not strictly true because I once read a book on the art of juggling and — as a result — I can juggle nine fine china plates, but only for a very short period of time.

The term mechatronics refers to an interdisciplinary branch of engineering that focuses on systems that combine electronic, electrical, and mechanical elements. I just saw some amazing videos of something that really got my mechatronic juices flowing (I’ll mop things up and wipe them down later). Seeing these videos — which I shall present shortly — made me realize that there’s more to life than flashing tricolor LEDs (and I never expected to hear myself say that out loud).

As fate would have it, my degree is in Control Engineering, which — back in the day — was like mechatronics on steroids. However, although the curriculum featured a core of mathematics accompanied by electronics, mechanics, and hydraulics and fluidics, I’ve predominantly focused on electronics and computers career-wise, and — as discussed in Recreating Retro-Futuristic 21-Segment Victorian Displays, for example — flashing tricolor LEDs as a hobby.

I have, of course, played with solenoids and relays and motors and servos — who amongst our number hasn’t dabbled, or been tempted to dabble, if the truth be told? — but I’ve never really plunged into the mechatronic waters with gusto and abandon.

I also have friends who do all sorts of interesting things of a mechatronic bent, such as my chum Paul Parry of Bad Dog Designs, who created a fantastic Nixie tube-based timepiece called Symphony because — as seen in this video — it features and flaunts a form of Tubular Bells comprising 32 pipes.

 

Just looking at this video reminded me that I really want to build something like this myself one day, although I was thinking of simplifying things somewhat. For example, Paul used solenoids in which the actuator core protruded from only one end, which meant he had to create a rather sophisticated striker mechanism to translate the solenoid’s linear motion into rotary motion, as illustrated below:

My interpretation of Paul’s early Symphony striker mechanism
(Image source: Max Maxfield)

Paul has the advantage that he has ready access to a complete complement of mills and drills, lasers and lathes, and all sorts of CNC goodies that make me drool with desire. I also just remembered that, after first seeing the Symphony, I purchased a simple solenoid for prototyping purposes. Recalling this purchase caused me to have a root in the myriad boxes of “useful things” jammed under my desk and crammed into any available nooks and crannies around my office. As we see in this video, amongst a wide variety of weird and wonderful things, I found a couple of rotary mechanisms from old telephones, an Intel 4004 4-bit microprocessor from 1971 in its white ceramic package, and the aforementioned solenoid, which could well find itself featured in a mechatronic project in the not-so-distant future.

 

In turn, this reminded me of the Pipe Dream video from Animusic. As an aside, the way the folks at Animusic work their magic is really rather clever. First, they create 3D models of all of the out-of-this-world instruments, including realistically modelling any physical characteristics, like mass, inertia, elasticity, pliability, and gravity. They also describe the “sound” associated with each instrument — that is, the tone, pitch, timbre, etc. of each note. Next, they capture a MIDI soundtrack of the music they wish to play. Now, this is the clever bit, because their system looks at the MIDI soundtrack and uses this to control the animation to make sure that the virtual machines do the right things at the right times to cause the right notes to sound when they should.
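Just for grins, here’s a minimal sketch of that core trick in Python, assuming the mido MIDI library and a hypothetical fire_solenoid() driver function; a real build would pulse GPIO pins or a driver board rather than printing, but the principle of letting the MIDI events drive the mechanism is the same.

```python
# Minimal sketch: read MIDI events and fire actuators at the right times.
# Assumes the "mido" library; fire_solenoid() and the note-to-solenoid
# mapping are hypothetical placeholders for real driver hardware.
import time
import mido

NOTE_TO_SOLENOID = {60: 0, 62: 1, 64: 2, 65: 3}  # C4, D4, E4, F4

def fire_solenoid(channel):
    # Placeholder: a real build would pulse a GPIO pin or send a
    # command to a solenoid driver board here.
    print(f"Strike solenoid {channel}")

# Iterating over a MidiFile yields messages whose .time attribute is
# the delay (in seconds) since the previous message.
for msg in mido.MidiFile("pipe_dream.mid"):
    time.sleep(msg.time)  # wait until this event is due
    if msg.type == "note_on" and msg.velocity > 0:
        channel = NOTE_TO_SOLENOID.get(msg.note)
        if channel is not None:
            fire_solenoid(channel)
```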

 

I have both of the Animusic DVDs. As soon as I saw the Pipe Dream track, I started to dream of creating a real-world incarnation. I know, I know, “What sort of fool would set out to do something like this?” I hear you cry. Well, how about the fools at Intel, who created an Atom-powered version that was demonstrated at an Embedded Systems Conference (ESC) a few years before I penned these words.

 

Every time I see this video, it reminds me that I one day intend to create a desktop implementation (admittedly on a large desktop). Flashing tricolor LEDs will, of course, play no small part in this production.

And so, finally, we arrive at the artifact that instigated my mechatronic musings. This was the reality-bending Morph LED Ball, which — as described in this Hackaday article — features 486 stepper motors and 86,000 LEDs. You really do need to watch the videos embedded in that article, which also notes:

The result is an artistic assault on reality, as the highly coordinated combinations of light, sound, and motion make this feel alive, otherworldly, or simply a glitch in the matrix. Watching the renders of what animations will look like, then seeing it on the real thing, drives home the point that practical effects can still snap us out of our 21st-century computer-generated graphics trance.

The reality-bending Morph LED Ball (Image source: Hackaday.com)

O-M-G! When I saw this in action I was flabbergasted. In fact, I feel it’s fair to say that rarely has my flabber been gasted to this extent. I really, Really, REALLY want one of these bodacious beauties!

But (and I know this is going to surprise you) none of what we’ve discussed so far is what I actually set out to talk about.

As fate would have it, I was just talking to my chum Ali (you can call him Alireza Kenarsari-Anhari for short). Ali is the CEO of Picovoice.ai, whose mission is to enable embedded systems designers to add speech recognition and voice control to everything (well, everything that makes sense). Picovoice’s context-aware voice AI is efficient enough to run entirely on the edge, scaling all the way from microcontrollers to web browsers.

A key feature of Picovoice is that it works entirely on-device without sending audio data to the cloud, which both maintains privacy and means it continues to function if the internet connection is lost.

Picovoice running on an STM32F469 discovery board from STMicroelectronics (Image source: Picovoice.ai)

Ali made a very good point that we’ve all become accustomed to using tactile interfaces like touch screens, which is unfortunate in these days of COVID-19 when we find ourselves increasingly reluctant to touch anything.

Quite apart from anything else, using touch screens can sometimes be a pain in the nether regions. I recently had to take my car in for a service. The service center has a very pleasant customer waiting room with comfy seats and a state-of-the-art coffee-making machine whose touch screen user interface brought me to my knees. If ever I needed a coffee, it was after trying, and failing, to persuade that machine to give me one!

It would have been so much easier had I been able to simply stroll up to the machine and say, “Please give me an Americano with cream, hold the sugar,” or simply “Coffee,” in which case the machine could respond, “Do you want cream or sugar with that?” 
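To give a flavor of how such an interaction could be wired up, here’s a minimal sketch using Picovoice’s Rhino speech-to-intent engine and its Python SDK. The coffee_maker.rhn context file and the intent and slot names are assumptions on my part; in practice, you would design and train the context in the Picovoice Console and download the resulting file.

```python
# Sketch of on-device intent handling for a voice-controlled coffee
# machine using Picovoice's Rhino speech-to-intent engine. The context
# file and intent/slot names below are hypothetical examples.
import pvrhino
from pvrecorder import PvRecorder

rhino = pvrhino.create(
    access_key="YOUR_PICOVOICE_ACCESS_KEY",  # issued by the Picovoice Console
    context_path="coffee_maker.rhn")         # hypothetical trained context

recorder = PvRecorder(frame_length=rhino.frame_length)
recorder.start()

try:
    while True:
        # process() returns True once an utterance has been finalized
        if rhino.process(recorder.read()):
            inference = rhino.get_inference()
            if inference.is_understood:
                # e.g., intent = "orderDrink",
                #       slots = {"beverage": "americano", "extras": "cream"}
                print(inference.intent, inference.slots)
finally:
    recorder.stop()
    rhino.delete()
```

Everything runs locally, so the coffee machine would need neither a network connection nor a cloud account to take your order.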

As part of our chat, Ali gave me a live demonstration of their Picovoice Console and Picovoice Shepherd tools, which they describe as “The first no-code platform for building voice interfaces on microcontrollers.” Of course, there is code, but you don’t create it yourself. Instead, as shown in this video, you use their interface to capture what you are trying to do, and it then generates the code for you.

 

One thing I really like is that, in addition to defining the main speech interaction, you can also specify your own custom wake word. At home, my wife (Gina the Gorgeous) has an Echo Dot on her bedside table. More recently, I took delivery of a Sandman Doppler smart clock, which resides on my side of the bed. Unfortunately, they both respond to the “Alexa” wake word (even if you w-h-i-s-p-e-r), after which confusion invariably ensues. I would love to be able to change my clock’s wake word to “Sandman,” but this is currently not to be.
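If you’re rolling your own gadget, by contrast, a custom wake word is just another file. Here’s a minimal sketch using Picovoice’s Porcupine wake word engine in Python; the sandman.ppn keyword file is an assumption, standing in for a custom keyword trained in the Picovoice Console.

```python
# Sketch of custom wake word detection with Picovoice's Porcupine
# engine. The "sandman.ppn" keyword file is hypothetical; you would
# train and download a real one from the Picovoice Console.
import pvporcupine
from pvrecorder import PvRecorder

porcupine = pvporcupine.create(
    access_key="YOUR_PICOVOICE_ACCESS_KEY",
    keyword_paths=["sandman.ppn"])

recorder = PvRecorder(frame_length=porcupine.frame_length)
recorder.start()

try:
    while True:
        # process() returns the index of the detected keyword, or -1
        if porcupine.process(recorder.read()) >= 0:
            print("Wake word detected: over to the command handler")
finally:
    recorder.stop()
    porcupine.delete()
```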

However, we digress, because I’m now thinking of all of my flashing tricolor LED hobby projects that would be enhanced by the addition of speech recognition and voice control. How about you? Do you have an embedded system that would benefit from speech recognition and voice control capabilities? If so, you really should take a look at Picovoice.ai to see what they have to offer. As always, I welcome your comments, questions, and suggestions.
