
Flex Logix Joins the Race to the Inferencing Edge

Have you noticed that there seem to be a lot more products flaunting the fact that they are “Gluten Free” on the supermarket shelves these days? This sort of thing is obviously of interest to the estimated one person out of a hundred who has Celiac disease and is therefore intolerant to gluten, but do these product labels convey useful information or is it just noise dressed up to look like information?

When I first saw these labels, I assumed that the manufacturers had found some way to remove gluten from the products in question, but then I began to see items like jars of almond butter and cans of chopped green chilies boasting that they too were gluten free. This caused me to raise a quizzical eyebrow on the basis that “gluten” refers to a collection of proteins found in cereal grains, which — where I come from — play no part in almond butter or chilies (strictly speaking, “gluten” pertains only to wheat proteins, but it is often used to refer to the combination of prolamin and glutelin proteins naturally occurring in other grains).

The problem is that the dweebs in marketing have saddled up their gluten-free horses and are now riding them mercilessly across our supermarket shelves. A somewhat related problem is that the gluten-free products that have been bold enough to come into contact with my tastebuds — like pizzas with gluten-free crusts — have failed to impress to the extent that I now opt for products that proudly parade their gluten content or at least refrain from proclaiming that they don’t contain it.

The reason I mention this here (yes, of course there’s a reason, which we will get to if you stop interrupting, metaphorically speaking) is that I remember first hearing the artificial intelligence (AI) moniker back in the 1970s in the context of “expert systems” (this concept was formally introduced around 1965 as part of the Stanford Heuristic Programming Project).

These expert systems were divided into two parts: a knowledge base and an inference engine. The knowledge base contained the facts and rules associated with the domain in question. The role of the inference engine was to apply the rules to the known facts in order to deduce new facts.
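Just to make sure we’re all tap-dancing to the same drumbeat, the following is my own toy illustration in Python (not code from any real expert system of yore) of how the two parts fit together: the knowledge base holds facts and rules, and the inference engine keeps applying the rules until no new facts can be deduced.

```python
# Minimal forward-chaining sketch (illustrative only): facts are strings,
# each rule maps a set of premises to a conclusion, and the inference
# engine applies rules repeatedly until it reaches a fixed point.

facts = {"has_fever", "has_rash"}

rules = [
    ({"has_fever", "has_rash"}, "suspect_measles"),
    ({"suspect_measles"}, "recommend_doctor_visit"),
]

def infer(facts, rules):
    """Apply every rule whose premises are satisfied; repeat until nothing new appears."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

print(infer(facts, rules))
# {'has_fever', 'has_rash', 'suspect_measles', 'recommend_doctor_visit'}
```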

On the bright side, limited as they were compared to today’s offerings, these early expert systems were among the first truly successful forms of AI software. On the downside, desperate drongos sporting marketing trousers flocked to this concept like a slice of lemmings (I know that “slice” is not an official collective noun for a group of lemmings, but it should be; either that or a “twist”). As a result, it wasn’t long before “Powered by AI” began to be associated with the most improbable software applications. This quickly became a joke, then an annoyance, and eventually a distraction. It wasn’t long before engineers started to cast aspersions (not I, you understand, because my throwing arm wasn’t what it used to be). Eventually, the “AI” designation started to leave a bad taste in everyone’s mouths (which neatly returns us to gluten-free pizza crusts), and people stopped using it because they couldn’t face the snickers, sniggers, and slights that would doubtless ensue.

I really don’t recall hearing much about AI from the tail-end of the 1990s onward. If you had asked me ten years ago, I would have guessed that little was happening in this arena. As we now know, of course, the sterling, but sadly unsung, guys and gals in academia continued to slog away in the background concocting ever-more cunning algorithms and sophisticated artificial neural network (ANN) frameworks. Meanwhile, processing capabilities, capacities, and technologies proceeded in leaps and bounds, especially in the FPGA and GPU domains. As I’ve mentioned before, in the 2014 version of the Gartner Hype Cycle, AI and machine learning (ML) weren’t even a blip on the horizon. Just one year later in the 2015 version, ML had already crested the “Peak of Inflated Expectations.”

Today, of course, AI and ML are everywhere. In the early days (by which I mean five years ago), the most exciting AI/ML applications predominantly resided in the cloud. Of course, it wasn’t long before some brave apps started to dip their toes in the waters at the edge of the internet, and now there is a veritable stampede toward edge processing, edge inferencing, and edge security.

Some AI/ML systems are targeted for deployment on relatively low-end microcontroller units (MCUs). For example, see my I Just Created my First AI/ML App! column, whose featured app was powered by a humble Arm Cortex-M0+ processor. Other AI/ML apps need substantially more oomph on the processing front. The problem is that, when it comes to wide deployment, price also has a part to play.

In my previous column, I invited you to Feast Your Orbs on NVIDIA’s Jetson Nano 2GB Dev Kit. This bodacious beauty boasts a 128-core NVIDIA Maxwell graphics processing unit (GPU), a quad-core 64-bit Arm Cortex-A57 central processing unit (CPU), and 2 GB of 64-bit LPDDR4 memory offering 25.6 GB/s of bandwidth. Targeting artificial intelligence (AI) at the edge, this entry-level development kit costs only $59. As powerful as this is, many real-world applications require even more processing power, in which case the folks at NVIDIA are fond of saying that their Jetson Xavier NX system-on-module is the smallest, most powerful AI supercomputer for robotic and embedded computing devices at the edge. The only problem is that the Xavier NX’s ~$350 sticker price is a tad on the high side for many people’s purses.

All of which leads us to the fact that the chaps and chapesses at Flex Logix have just thrown their corporate hat into the AI/ML ring. Until now, most industry observers have associated Flex Logix with their eFPGA (embedded FPGA) technology; that is, FPGA IP that developers of system-on-chip (SoC) devices can embed in their designs. However, wielding their eFPGA skills with gusto, abandon, and panache, the guys and gals at Flex Logix have just announced the availability of their first offerings in a new line of off-the-shelf inferencing chips and boards.

These inferencing engines are designed to accelerate the performance of neural network models — including those used in object detection and recognition — for robotics, industrial automation, medical imaging, gene sequencing, bank security, retail analytics, autonomous vehicles, aerospace, and more.

Let’s start with the InferX X1 chip, which boasts 64 reconfigurable tensor processor units (TPUs) that are closely coupled to on-chip SRAM. The TPUs can be configured in series or in parallel to implement a wide range of tensor operations. Furthermore, the programmable interconnect provides full speed, non-contention data paths from SRAM through the TPUs and back to SRAM. Meanwhile, DRAM traffic bringing in the weights and configuration for the next layer occurs in the background during computation of the current layer, which minimizes compute stalls. The combination of localized data access and compute with a reconfigurable data path, all of which can be reconfigured layer-by-layer (in 4 millionths of a second), provides mindbogglingly high utilization along with eyewatering throughput.

Introducing the InferX X1 chip (Image source: Flex Logix)
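If you’ll forgive a software analogy for what is really a hardware trick, the layer-by-layer scheme described above is essentially double-buffering: while the TPUs crunch the current layer out of SRAM, the next layer’s weights and configuration stream in from DRAM. The following Python sketch is my own conceptual illustration of that overlap (it is emphatically not Flex Logix code, and the function names are made up for the purpose).

```python
# Conceptual double-buffering sketch: overlap "DRAM traffic" for layer N+1
# with "compute" for layer N so the compute units rarely stall.
from concurrent.futures import ThreadPoolExecutor

def load_weights_from_dram(layer):
    # Stand-in for the background DRAM transfer of weights/configuration
    return f"weights_for_layer_{layer}"

def compute_layer(layer, weights, activations):
    # Stand-in for the TPUs running the current layer out of on-chip SRAM
    return f"activations_after_layer_{layer}"

def run_network(num_layers, activations):
    with ThreadPoolExecutor(max_workers=1) as prefetcher:
        weights = load_weights_from_dram(0)  # first layer's weights load up front
        for layer in range(num_layers):
            # Kick off the next layer's weight transfer before computing this one
            next_load = (prefetcher.submit(load_weights_from_dram, layer + 1)
                         if layer + 1 < num_layers else None)
            activations = compute_layer(layer, weights, activations)
            if next_load:
                weights = next_load.result()  # ideally already finished by now
    return activations

print(run_network(4, "input_image"))
```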

But wait — there’s more, because the folks at Flex Logix have also announced the availability of their InferX X1P1, which features a single InferX X1 chip and a single LPDDR4x DRAM on a half-height, half-length PCIe board (future members of the family will boast more InferX X1 chips and additional DRAM).

Introducing the InferX X1P1 board (Image source: Flex Logix)

Flex Logix is also unveiling a suite of software tools to accompany these boards. These include a compiler flow that accepts TensorFlow Lite/ONNX models, along with an nnMAX Runtime Application. Also included is an InferX X1 driver with external APIs that allow applications to easily configure and deploy models, as well as internal APIs for handling the low-level functions that control and monitor the X1 board.
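Since the compiler flow starts from standard TensorFlow Lite or ONNX models, getting a network into the flow begins with an ordinary export step. By way of a generic example (this is plain PyTorch, not Flex Logix’s tooling, and the model and file name are purely illustrative), producing an ONNX file for such a compiler might look like this:

```python
# Export a PyTorch model to ONNX; the resulting .onnx file is the kind of
# artifact an inference compiler flow would take as its input.
import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()
dummy_input = torch.randn(1, 3, 224, 224)  # batch of one 224x224 RGB image

torch.onnx.export(
    model,
    dummy_input,
    "resnet18.onnx",          # file that would be handed to the compiler flow
    input_names=["images"],
    output_names=["logits"],
    opset_version=13,
)
```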

It seems like the folks at Flex Logix have NVIDIA firmly in their sights when it comes to the AI/ML market, because they say, “The InferX X1 runs YOLOv3 object detection and recognition 30% faster than NVIDIA’s industry leading Jetson Xavier and runs other real-world customer models up to ten times faster.”

They go on to claim that “The InferX X1 silicon area is 54mm², which is 1/5th the size of a US penny and is much smaller than NVIDIA’s Jetson Xavier at 350mm². InferX X1’s high-volume price is as much as 10 times lower than NVIDIA’s Xavier NX, enabling high-quality, high-performance AI inference for the first time to be implemented in mass market products selling in the millions of units.”

With regard to their InferX boards, the folks at Flex Logix say, “The Nvidia Tesla T4 Inference accelerator for edge servers is extremely popular, but customers want to deploy inference in lower price point servers for higher volume applications” and “The new InferX PCIe boards deliver higher throughput/$ for lower price point servers.”

Personally, I think it’s tremendously exciting that Flex Logix has entered the AI/ML space (where no one can hear you scream) with these FPGA-based chips and boards. The greater the competition, the faster technologies will develop, and the lower the prices will fall. As always, we truly do live in exciting times.

 
