Open-Source AutoML for Edge AI/ML Development

I was just cogitating and ruminating on the futuristic technologies to which I was exposed when “Star Trek: The Original Series (TOS)” first graced our television screens in 1966. Things like the flip-open communicators, which predated the launch of the world’s first flip phone by 30 years.

Also, there was artificial intelligence (AI) and machine learning (ML) of a form. However, in hindsight (the one exact science), the greatest machine intelligence on Star Trek TOS was exhibited by the doors, which opened only when someone was walking towards them but not when someone was walking past them.

All of this flashed through my mind while I was chatting with Chris Rogers, who is CEO at SensiML Corporation. The catchphrase at SensiML, which is a subsidiary of QuickLogic, is “Making Sensor Data Sensible,” but we’ll try to not hold that against them LOL.

As we previously discussed in my Creating Tiny AI/ML-Equipped Systems to Run at the Extreme Edge column: the mission of the folks at SensiML (pronounced “sense-ih-mel” to rhyme with “sensible”) is to help embedded systems designers create AI/ML-equipped systems that run at the edge where the “internet rubber” meets the “real-world road.” SensiML’s role in all of this is to provide embedded systems developers with accurate AI/ML sensor algorithms that can be used to create models capable of running on the smallest IoT devices, along with the tools to make all this magic happen.

Chris was kind enough to bring me up to date with the latest and greatest news from the folks at SensiML, which is that they’ve made the decision to go open-source.

“Why is this such big news?” I hear you cry. After all, there are lots of open-source AI/ML frameworks out there, like TensorFlow and PyTorch, for example. Well, as Chris says, “If you are a data scientist, you’re accustomed to using these open-source frameworks for building models, but our experience is that most of the designers of embedded systems with whom we’re dealing have very little AI expertise.”

I know what he means. Most of the people I talk to in embedded systems space (where no one can hear you scream) don’t know a great deal about machine learning (ML) other than what they’ve been reading in the press. Some may have an “ML 101” level of knowledge—understanding the basic concepts and the general approach—but when it comes to getting down into the weeds of trying to configure an ML model… they haven’t got a clue. The result is a collision between the data scientists in their ivory towers and the firmware engineers in the trenches.

I think most of us are familiar with the term “TinyML,” which refers to ML models that are optimized to run on very low-power and small footprint devices, such as relatively low-end microcontroller units (MCUs). There’s even a TinyML foundation and a TinyML book (which I highly recommend).

What about the term “AutoML”? Short for “automated machine learning,” this refers to the process of automating the tasks of applying machine learning to real-world problems. To put this another way, AutoML means using ML to build ML models, which is delightfully recursive.

The idea is that if we can use the power of ML to do inferencing, then why can’t we use ML to perform the process of searching for the right model in the first place? This reminds me of the old programmers’ joke, “To understand recursion you must first understand recursion” (this joke is equally applicable to young programmers, of course).

AutoML is a work aid that really helps to democratize things. Suppose you’re a firmware developer who doesn’t necessarily know the ins and outs of what type of model to use, including what type of feature transforms, or segmentation, or any of the pre-processing steps that you would need to feed the input vector to a model. In this case, you can use AutoML tools to sort all this out for you or, at least, get you to a good starting point.

As Chris points out, one of the main pain points for embedded developers is that the ML workflow itself is pretty darned complex. For example, the image below depicts the processing steps that take place on/in the edge model.

The SensiML “Knowledge Pack” ML Model (Source: SensiML)

You’ve got raw sensor data coming in. You’re doing some form of signal pre-processing that could include things like filtering, quantizing, and down-sampling to condition the data to be suitable for the task at hand. Next, in the same way you want a trigger event to commence the display on an oscilloscope, you want a trigger event for the interesting area to analyze for your pattern recognition. This could be as simple as a sliding window in time, where you’ve defined what the general size of a feature is, so by setting your sliding window accordingly, you’ve got this snapshot in time that moves over the data stream.
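To make the pre-processing and sliding-window steps a little more concrete, here is a minimal Python sketch. It is purely illustrative and is not taken from SensiML's code; the function names, the down-sampling factor, and the window and step sizes are all assumptions of mine.

```python
def downsample(samples, factor):
    """Keep every `factor`-th sample.

    Illustration only: a real pipeline would low-pass filter first
    to avoid aliasing.
    """
    return samples[::factor]


def sliding_windows(samples, window_size, step):
    """Yield fixed-length windows that advance `step` samples at a time."""
    for start in range(0, len(samples) - window_size + 1, step):
        yield samples[start:start + window_size]


# Hypothetical raw sensor stream: 20 consecutive readings.
raw = list(range(20))

# Condition the data, then slide a 4-sample window over it with 50% overlap.
conditioned = downsample(raw, 2)
windows = list(sliding_windows(conditioned, window_size=4, step=2))

for w in windows:
    print(w)
```

Each window is a snapshot in time that gets handed to the feature transforms in the next stage. Choosing the window size to match the typical duration of the event you care about is the part that takes some thought.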

This next step is where a lot of people have a challenge, just because there are so many ways in which you can transform the raw, sampled data coming in from something like a motion sensor or a microphone. Just taking the raw data as the input vector will require a much more complex model than if you’d performed some feature transforms first.

For example, performing a Fourier transform on sound data results in a plot in the frequency domain as a set of discrete frequency bins with amplitudes, which is a form that’s readily analyzed by a simpler model.
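As a rough sketch of what this looks like in code, the following Python computes a naive discrete Fourier transform over one window and reports the magnitude in each frequency bin. The 64 Hz sample rate and the 8 Hz test tone are values I made up for the example, and a real embedded implementation would use an optimized FFT rather than this O(n²) loop.

```python
import cmath
import math


def dft_magnitudes(window):
    """Return the magnitude of each frequency bin from 0 up to Nyquist."""
    n = len(window)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * cmath.pi * k * i / n)
            for i, x in enumerate(window)
        )
        magnitudes.append(abs(acc) / n)
    return magnitudes


# Assumed values for illustration: 64 samples captured at 64 Hz,
# containing a pure 8 Hz sine wave.
SAMPLE_RATE_HZ = 64
N = 64
window = [math.sin(2 * math.pi * 8 * i / SAMPLE_RATE_HZ) for i in range(N)]

mags = dft_magnitudes(window)
peak_bin = max(range(len(mags)), key=mags.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE_HZ / N
print(f"strongest bin: {peak_bin} ({peak_hz} Hz)")
```

The 64 time-domain samples collapse into 33 bins with essentially all of the energy in one of them, which is a far easier pattern for a small classifier to pick out than the raw waveform.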

The problem is that there are all kinds of transform options available. This is a wide-open space. What are you going to use? If you’ve got good DSP intuition and you know ML well, you can probably (possibly) make a reasonable stab at this, but a lot of people won’t know where to start.

The outputs from the feature transformation stage feed the input vectors of the classifier model. From there, it’s a pattern recognition task, using weights and biases and activation functions and whatnot to get you your result.
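For anyone wondering what "weights and biases and activation functions" boil down to, here is a minimal sketch of a two-layer classifier's forward pass in Python. The weights, biases, and class labels are hand-picked by me for illustration; in practice they would come out of training, and nothing here reflects how a SensiML Knowledge Pack is actually implemented.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus bias per output neuron."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Activation function: pass positive values, clamp negatives to zero."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical, hand-picked parameters (a trained model would supply these).
HIDDEN_W = [[1.0, -1.0], [-1.0, 1.0]]
HIDDEN_B = [0.0, 0.0]
OUT_W = [[2.0, 0.0], [0.0, 2.0]]
OUT_B = [0.0, 0.0]
LABELS = ["low_tone", "high_tone"]


def classify(features):
    """Run the feature vector through both layers and pick the likeliest label."""
    hidden = relu(dense(features, HIDDEN_W, HIDDEN_B))
    probs = softmax(dense(hidden, OUT_W, OUT_B))
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


label, probs = classify([0.9, 0.1])
print(label, [round(p, 3) for p in probs])
```

The feature vector here has just two entries (say, the energy in a low and a high frequency band), which is exactly why a good feature transform lets you get away with such a small model.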

All these things are factors that play into the final model. Rather than doing all of this by hand with tears rolling down your cheeks, AutoML gets you to a starting point with very little upfront input other than specifying the goals you’re looking for. You put in input data, you put in your labels, you let the AutoML tool do its thing, and you’ll come out with a pretty good model without ever having to really do anything yourself.
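To give a flavor of what "letting the tool do its thing" means, here is a toy Python sketch of the core idea: try several candidate pipelines on labeled data, score each one on held-out data, and keep the best. This is emphatically not SensiML's algorithm. The three candidate features, the nearest-centroid classifier, and the tiny data set are all assumptions of mine, and a real AutoML tool searches a vastly larger space while also weighing things like model size and latency.

```python
def mean(w):
    return sum(w) / len(w)


def peak_to_peak(w):
    return max(w) - min(w)


def energy(w):
    return sum(x * x for x in w) / len(w)


# Hypothetical search space: one feature transform per candidate pipeline.
CANDIDATE_FEATURES = {"mean": mean, "peak_to_peak": peak_to_peak, "energy": energy}


def fit_centroids(feature, labeled_windows):
    """Train: average the feature value seen for each label."""
    grouped = {}
    for window, label in labeled_windows:
        grouped.setdefault(label, []).append(feature(window))
    return {label: sum(v) / len(v) for label, v in grouped.items()}


def predict(feature, centroids, window):
    """Classify: pick the label whose centroid is closest to this window."""
    value = feature(window)
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def accuracy(feature, centroids, labeled_windows):
    hits = sum(predict(feature, centroids, w) == label for w, label in labeled_windows)
    return hits / len(labeled_windows)


def search(train, validation):
    """Score every candidate on held-out data; ties go to the earliest candidate."""
    scores = {}
    for name, feature in CANDIDATE_FEATURES.items():
        centroids = fit_centroids(feature, train)
        scores[name] = accuracy(feature, centroids, validation)
    return max(scores, key=scores.get), scores


# Made-up accelerometer windows: "idle" barely moves, "shake" swings widely.
train = [
    ([1.0, 1.25, 0.75, 1.0], "idle"),
    ([2.0, 2.25, 1.75, 2.0], "idle"),
    ([0.0, 2.0, 0.0, 2.0], "shake"),
    ([1.0, 3.0, 1.0, 3.0], "shake"),
]
validation = [
    ([1.5, 1.75, 1.25, 1.5], "idle"),
    ([0.5, 2.5, 0.5, 2.5], "shake"),
]

best_name, scores = search(train, validation)
print(best_name, scores)
```

On this data the mean is useless because both classes hover around the same average, whereas the peak-to-peak swing separates them cleanly. That is the sort of thing a person with good DSP intuition might guess, and the sort of thing an automated search can discover for the rest of us.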

As wonderful as this all sounds, although there are open-source frameworks and open-source libraries—all the things that are the lower-level fodder AI experts can use to build their models when they know what they’re doing—no such AutoML tool as described above exists in open-source form… until now.

Chris says the guys and gals at SensiML decided there was a gap in the market and an opportunity for them to get out there, to show some leadership, to take the Analytics Studio code base (a proprietary tool they’ve been building for the past seven years), and to make an open-source variant. They are putting this open-source version out as a project, much like Apache Spark, hoping to attract community collaborators.

This is very exciting. The really exciting part is how things will evolve in the future because, as much as SensiML’s open-source AutoML solution currently does, there’s a great deal more that needs to happen.

SensiML’s open-source AutoML solution (Source: SensiML)

As Chris says, “We’re still in the early stages of TinyML. There are a lot of things we have envisioned and things we would like to do going forward.”

The puzzle piece graphic above shows the things the chaps and chapesses at SensiML are bringing to the table. The initial project offering features a foundation of code that’s optimized for the edge, so the models you get are going to be small and efficient. AutoML modeling means people who aren’t experts can actually do something productive. Getting C source code as an output, rather than some “black box” whose inner workings you can’t inspect, is important. Also important is the fact that you can work in Python if that’s your preferred programmatic interface, or you can use a point-and-click no-code GUI approach.

Planned project initiatives for the future include things like Generative AI (GenAI), synthetic data augmentation, edge tuning, and image classification.

SensiML’s open-source AutoML solution is of interest for a wide range of real-world applications, including the following:

Wearable devices and garments that analyze and coach proper human motion and ergonomics in real-time.
Predictive maintenance and anomaly detection sensors that recognize and react locally to faults in factory/plant machinery, pumps, and valves.
Building automation and security endpoints with acoustic event detection, keyword recognition, and speaker identification.

Until now, IoT device developers undertaking what are often their first AI/ML projects have had to wade through a fragmented market of proprietary tools with varying capabilities and unclear roadmaps. The open-source release of SensiML’s Analytics Studio marks a significant milestone for the IoT Edge AI software tools industry, providing:

Platform Agnostic Model Generation: SensiML’s plug-in style, open-source architecture supports a broad array of MCUs, AI/ML accelerated SoCs, and AI engines, inspiring developer confidence to build ML datasets using flexible tools not tied to specific vendors, chipsets, or inference engines.
Time-Series Sensor Inputs: Provides support for all conceivable time-series sensors such as microphones, accelerometers, gyros, IMUs, load cells, strain gauges, PIR sensors, and more. Inputs can be mixed for more complex models with sensor fusion algorithms.
Rapid Innovation: AI/ML’s fast evolution demands an open-source approach to harness the broader developer community expertise, accelerating key innovations such as generative AI, synthetic data, and edge learning advancements.
Flexibility: Analytics Studio supports multiple model development mechanisms, from point-and-click AutoML-powered model generation, to code-free GUI-based modeling with full pipeline control, to entirely programmatic Python SDK model creation.
Extensibility: Analytics Studio provides model generation for basic feature-based models, regression models, classic ML, and deep learning neural networks. Its rich library of over 80 feature generators also includes the ability to easily add custom transforms, filters, features, and classifiers, making it easy for community developers to enhance.

I don’t know about you, but I think all this is tremendously exciting. ABI Research projects 3.5 billion AI-enabled edge devices by 2027. I think the folks at SensiML are doing the industry a massive favor by bringing this open-source version of their Analytics Studio to the market. And, of course, they are doing themselves a favor because many users who decide to dip their toes in the open-source AI/ML waters will eventually migrate to the full commercial version of Analytics Studio with its additional bells and whistles (metaphorically speaking).

What say you? Are you as excited as I am? And do you have any thoughts you’d care to share on TinyML, AutoML, and/or AI/ML in general?