Generative AI Is Coming to the Edge!

Over the past few months, I’ve waffled on (as is my wont) about various flavors (ooh, tasty) of generative artificial intelligence (GenAI). On the menu were items like GitHub Copilot, which generates code (throwing in errors and security vulnerabilities for free), and Metabob from Metabob (can you spell “Zen”), which looks at the code generated by GitHub Copilot and takes the bugs and security vulnerabilities out again (to understand recursion, you must first understand recursion).

We’ve also discussed other GenAI-based Copilots, like SnapMagic, which can help design engineers pick parts, and Flux.ai, which can help them design and lay out their circuits.

Another flavor of GenAI comes in the form of text-to-image models, such as Stable Diffusion, which can be used to generate detailed images conditioned on text descriptions. Such models can also be used for related tasks, such as inpainting (where damaged, deteriorated, or missing parts of an artwork are filled in to present a complete image), outpainting (extending an existing image beyond its original canvas), and image-to-image translations.
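
For anyone who hasn’t played with one of these models, here’s a minimal sketch of what “conditioned on text descriptions” looks like in practice, using Hugging Face’s open-source diffusers library. Treat it as an illustration under my own assumptions (the model ID, prompt, and parameters are examples, not recommendations):

```python
# A minimal text-to-image sketch using Hugging Face's diffusers library.
# Assumes: pip install diffusers transformers torch, plus a CUDA-capable GPU.
# The model ID, prompt, and parameters are illustrative, not prescriptive.
import torch
from diffusers import StableDiffusionPipeline

# Load a publicly available Stable Diffusion checkpoint.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

# Generate an image conditioned on a plain-text description.
prompt = ("a minimalist pencil sketch of a small boy sliding across "
          "a hotel floor, in the style of a classic book illustration")
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("sketch.png")
```

The same library also exposes inpainting and image-to-image variants (StableDiffusionInpaintPipeline and StableDiffusionImg2ImgPipeline, respectively), which is where the restoration and translation tricks mentioned above come into play.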

The thing is, until now, we all pretty much took it for granted that GenAI models of this class, capacity, and caliber were destined to be run in the cloud. Can you imagine what it would be like to be able to run something like Stable Diffusion on a USB drive that plugs into your notebook or laptop computer? I can. Actually, I don’t have to imagine this because I’ve seen it in action (I’ll show you a video later in this column).

To be honest, I have a personal interest in being able to generate images from text descriptions. I’m currently writing a small book called The Life (of a Boy Called) Clive (It Rhymes with Five). This all came about because my wife (Gina the Gorgeous) has been pushing me to write fiction. She firmly believes that since I’ve written technical tomes (search for “Clive Maxfield” on Amazon), writing fiction would be easy peasy lemon squeezy. She doesn’t seem to realize that a real fiction writer can convey the impression of something like the grandiosity of a ballroom with a few well-chosen words. By comparison, I would find this task to be stressed depressed lemon zest because I would be reduced to documenting the ballroom’s dimensions using both imperial and metric measurements.

Unwilling to give up, Gina next suggested that I write all of the stories about my formative years that she’s heard (over and over again) from my mother. Like the time when I was around 18 months old and I discovered what the bottom of a large barrel full of ice-cold water looked and felt like, thereby necessitating yet another mad rush to the hospital with my parents (suffice it to say that the doctors and nurses at our local hospital knew me by name). Or the time we went on holiday when I was two years old, and I slid in my socks along the linoleum floor in our Victorian hotel bedroom, and I hit the low windowsill and shot out of the window. Did I mention we were on the sixth floor? My mom said she was talking to my dad when he suddenly performed a ballet leap across the room that would have made Nureyev proud (dad used to be a dancer before the war) and threw himself through the open window, leaving only a hand clasped to the window frame as a reminder of his earlier presence. He then slowly pulled himself back inside, grasping me by my ankle in his other hand. Or the time when I fell off the cliff, making a surprise entrance to the family basking below. Or the time when… but I’m sure you get the drift.

The thing is, I would like to illustrate each of these stories with little minimalist pencil sketches in the style of E.H. Shepard’s illustrations for Winnie-the-Pooh. This is the sort of thing I could happily do in the evenings while ensconced in my comfy chair in our family room—if only I had a USB drive that could run my own personal copy of Stable Diffusion.

Am I the only one who wishes for the ability to run GenAI at the edge (by which I mean the edge in the form of laptop computers and edge servers, not the extreme edge in the form of IoT devices, although I’m not saying we won’t end up there before too long)? “No!” I cry, “a thousand times no!” In fact, there are many applications, ranging from AI home assistants to medical devices, that would benefit from GenAI on the edge. The reasons for deploying GenAI at the edge include lower cost, privacy, reliability, increased accuracy (personalized models can be fine-tuned with individual and enterprise data for customization and improved accuracy), and low latency (supporting real-time applications of GenAI such as surveillance, video conferencing, gaming…).

Do you recall those days, deep in the mists of time, that we used to call 2020, when I wrote my column, Say Hello to Deep Vision’s Polymorphic Dataflow Architecture? At that time, we discussed Deep Vision’s Ara-1 device.

Meet the Ara-2 (Source: Kinara)

Well, reminiscent of Michael Jackson’s 1991 Black or White music video (go on, you know you want to see it again), which boasted the first full photorealistic face morphing (was that really 33 years ago?), Deep Vision somehow morphed into a company called Kinara.

I was just chatting with Ravi Annavajjhala, who is CEO at Kinara, Wajahat Qadeer, who is Co-Founder and Chief Architect at Kinara, and the legendary Markus Levy, who seems to be present wherever cutting-edge machine vision and artificial intelligence appear.

Our conversation spanned too many topics to cover here (I’m sorry). Suffice it to say that, as illustrated in the above diagram, the Ara-1 (which is latency-optimized for edge operations and offers a 10X CapEx/TCO improvement over GPUs) has been shipping in volume for several years now, which makes it proven technology. Meanwhile, the currently sampling Ara-2 offers a 5X to 8X performance improvement over the Ara-1.

The Ara-2’s neural cores offer enhanced compute utilization, along with support for INT4, INT8, and MSFP16 data types. Each Ara-2 can support 16GB of LPDDR4 (4X that of the Ara-1), and multiple chips can be used to provide scalable performance with automatic load balancing. Furthermore, the Ara-2 supports secure boot and encrypted memory, thereby keeping our secrets safe.
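
If you haven’t bumped into these low-precision data types before, the idea is that weights and activations are quantized from 32-bit floating-point down to small integers, which slashes memory footprint and bandwidth. Here’s a back-of-the-envelope sketch of symmetric per-tensor INT8 quantization; this is my own illustration of the general technique, not Kinara’s toolchain:

```python
# Back-of-the-envelope sketch of symmetric per-tensor INT8 quantization.
# This illustrates the general technique only; it is not Kinara's toolchain.
import numpy as np

def quantize_int8(x):
    """Map float32 values onto the signed 8-bit range [-127, 127]."""
    scale = np.max(np.abs(x)) / 127.0            # one scale factor per tensor
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 values from the INT8 representation."""
    return q.astype(np.float32) * scale

weights = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int8(weights)

print("float32 size:", weights.nbytes, "bytes")  # 4,194,304 bytes (4 MB)
print("int8 size:   ", q.nbytes, "bytes")        # 1,048,576 bytes (1 MB)
print("max error:   ", np.max(np.abs(weights - dequantize(q, scale))))
```

INT4 halves the footprint again at the cost of coarser quantization steps while, as I understand it, MSFP16 is a block floating-point format that shares an exponent across a group of values, landing somewhere between the two extremes.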

Of particular interest (at least, to me) is the fact that the Ara-1 and Ara-2 are both forward and backward compatible. For example, the guys and gals at Kinara are running GenAI applications on the Ara-1 “because we can.” They say that GenAI on an Ara-1 chugs along a little slower than on an Ara-2 but, as we just mentioned, you can increase performance by ganging two or more Ara-1s together. An Ara-2 is capable of running a GenAI model on its own but, once again, you can gang multiple devices together if the occasion demands.
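
Kinara didn’t walk me through its SDK, so take the following as a purely hypothetical sketch of what “ganging multiple devices together” might look like from the software side: a round-robin dispatcher spreading inference requests across however many accelerators you happen to have. Every name in it (the Device class, the run() method) is invented for illustration:

```python
# Purely hypothetical sketch of round-robin load balancing across multiple
# accelerators. The Device class and run() method are invented for
# illustration; they are NOT Kinara's actual SDK.
import itertools

class Device:
    """Stand-in for one Ara-class accelerator."""
    def __init__(self, dev_id):
        self.dev_id = dev_id

    def run(self, request):
        # A real device would execute the compiled model here.
        return f"device {self.dev_id} handled {request}"

devices = [Device(i) for i in range(4)]   # e.g., 4X Ara-2 chips on a PCIe card
dispatcher = itertools.cycle(devices)     # hand out work in round-robin order

def dispatch(request):
    return next(dispatcher).run(request)

for n in range(8):
    print(dispatch(f"frame-{n}"))         # requests alternate across devices
```

A real dispatcher would, of course, weigh queue depth and device health rather than blindly rotating, but round-robin is the simplest way to picture automatic load balancing.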

All of which leads nicely to the image illustrating Ara-2 products below. We start with the chip itself, which you can purchase standalone to build into your own custom products.

Ara-2 products (Source: Kinara)

Alternatively, you can purchase the Ara-2 as a USB module or an M.2 module, both of which are available with 4GB or 16GB of memory. Or, if you really wish to beef things up, you can opt for a PCIe card with 4X Ara-2 chips and 32GB or 64GB of memory, or with 8X Ara-2 devices and 128GB of memory.

Yes, of course, if given a choice, I’d love to slip a PCIe card with 8X Ara-2 chips and 128GB of memory into my office tower computer and take it for a spin. On the other hand, I was just watching this video.

An Ara-2 USB module running Stable Diffusion locally (Source: Kinara)
As we see, this handy-dandy USB device plugs into your notebook or laptop computer, after which you can bask in the glow of being able to run your very own local Stable Diffusion model. I can only imagine the looks of awe and envy if I were doing this on a plane while on my way to speak at a conference; one of those flights that’s jam-packed with techno-nerds (my peeps) heading to the same conference, the sort of people who would understand and appreciate what they were looking at (history being made).

Not surprisingly, I’m wondering if I could use one of these bodacious beauties to generate the pencil sketch images for my book The Life of Clive. If I get to lay my hands on one, I’ll let you know. What say you? Do you have any thoughts you’d care to share, preferably without asking ChatGPT to craft a cunning comment for you? Although, now I come to think about it, I’d be interested to see such a comment, so long as you identified its author.

4 thoughts on “Generative AI Is Coming to the Edge!”

  1. PERPLEXITY says: The phrase “Generative AI Is Coming to the Edge!” can be interpreted as a synesthetic comment, suggesting a sensory experience related to the advancement of generative AI towards edge computing. This expression combines the concept of generative AI with the idea of movement or progress, possibly evoking a sense of anticipation or excitement.
    The term “synesthetic” typically refers to the production of a sense impression relating to one sense or part of the body by stimulation of another sense or part of the body. In this context, the phrase may metaphorically evoke a sensory experience related to the technological advancement of generative AI towards edge computing, potentially conveying a sense of innovation or transformation.
    The use of the phrase in this manner is creative and metaphorical, linking the advancement of technology to a sensory experience, and it may be interpreted as an imaginative way to convey the significance of this development.
    If you have a specific interpretation or context in which this phrase is being used, please feel free to provide additional details for a more tailored response.

  2. https://www.perplexity.ai/search/Max-Maxfield-or-I3_C9zXDS8K1lWf7vfAZaA Generative AI Is Coming to the Edge! That’s a synesthetic comment. The combination of GenAI and gustatory perception sounds synesthetic to me and reminds me of the cover of your book “Bebop to the Boolean Boogie: An Unconventional Guide to Electronics,” which is synesthetic as well. After deciphering your book cover, I searched for further articles about synesthetic perceptions written by you and found your story about the electronic engineer Jordan A. Mills, who saw schematic diagrams in color. I searched a lot on the internet, but there was no Jordan A. Mills to be found anywhere in the world, so it must have been a pseudonym.
    Some years ago, I gave my students the project requirement “you have to build a synesthesia tester,” and the best of them answered, “we can’t build a synesthesia tester because we are not synesthetes,” which was quite right. So I invited some synesthetes into the classroom, which was an epiphany for most of us. One student later built a synesthesia tester that generated a special colored letter pattern containing a second, hidden pattern, but my specification was wrong, as I realized when I tested it with synesthetes. All the letters had been intentionally rotated by a small angle because I assumed the synesthetic capabilities resided on another visual processing layer, and therefore the synesthetes were not able to read the hidden pattern, only to feel it.

    1. As far as I recall, Jordan A. Mills was just a typical engineer, so he might not have a big digital footprint. I think the term “synesthesia tester” is a bit generic (there are many different types of synesthesia). How did you find synesthetes to come to your class?
