feature article

Supercomputing To Go

HPEC Raises Its Head at SC|05

Some embedded applications are much tougher, however. There are cases when we need to deliver copious amounts of computing power while remaining off the grid. Last week, at Supercomputing 2005 in Seattle, there was ample evidence of just such compute power gone mad. Gigantic racks of powerful processors pumped piles of data through blazing fast networks and onto enormous storage farms. The feel of the place was about as far from “embedded” as you can get, unless your idea of embedding somehow involves giant air-conditioners and 3-phase power.

Behind the huge storage clouds, teraflop racks, and nation-sized networks, there was considerable embedded computing activity going on, however. Although not its main event, high-performance embedded computing (HPEC) was hanging out at the show and getting a good deal of quiet attention. It seems that not all of life’s difficult problems will hold still long enough for you to ship them off to a supercomputer facility. Sometimes, massive processing power is required to interpret images in real-time, process radio signals on the fly, or solve complicated algorithms from inside a moving vehicle. It’s those applications that put the “E” in HPEC, and several forward-thinking companies were at the show, working to help the rest of us see the light.

To briefly trace the history of embedded system architectures: over the past decade we have moved rapidly from systems-in-chassis to systems-on-board, and then into system-on-chip (SoC) integration. With each step of integration, power density has increased as form factors shrank. Interestingly, today's embedded systems have more in common with supercomputers than with commodity desktop and laptop machines. As we highlighted last week in “Changing Waves,” both supercomputers and embedded computers have hit the wall of diminishing returns on single-threaded von Neumann processors and have moved into the domain of multi-core and alternative-architecture processing.

The HPEC folks have just hit the wall a little earlier and a little harder than the rest of us. Supercomputing in embedded applications is a challenging engineering problem with little wiggle room for tradeoffs and compromises. Three primary solution tracks are in evidence today: multi-core embedded processing, specialized processors such as DSPs, and reconfigurable accelerators. Supercomputing 2005 showed us a rich crop of companies targeting multi-core development, and many of the compiler and OS technologies that serve the massively parallel grids and clusters are similarly applicable to HPEC.

A variety of embedded boards and systems from companies like Nallatech, Starbridge Systems, and Annapolis Micro Systems were on display. Most of these combine conventional processors feeding DSPs or FPGA accelerators, with generous helpings of memory for caching and FIFOs, and various high-performance I/O connections to hook up to the outside world. Performance claims and demonstrations on many of these devices were impressive, often rivaling or beating non-embedded supercomputers at the same task.

Unlike the HPC strategy of fitting the algorithm to the hardware, however, the HPEC community tends to fit the hardware to the algorithm. The reasons are economic. A typical supercomputer installation justifies its cost by lending its processing power to as many high-value problems as possible. These problems may be highly diverse, with their only commonality being the need for trillions of CPU cycles. In the embedded supercomputing domain, however, the machine is almost always optimized to solve one specific problem. It doesn’t have to be working on DNA sequence comparisons one day, hurricane forecasting the next day, and seismic data analysis on the third. This luxury allows for some serious specialization, and HPEC designers seldom fail to capitalize on that angle.

Like a race car, an HPEC system can be fine-tuned for precisely the problem it was conceived to solve. In the extreme, a custom ASIC can be designed to deliver massive hardware acceleration of specific compute-intensive tasks with minimal power and space utilization. If more flexibility is needed, programmable logic devices can provide reconfigurable algorithm acceleration with a slight power penalty compared to an ASIC. In any event, making supercomputing embedded almost always involves some additional acceleration beyond simple multi-core processing.

With any of these acceleration strategies, however, there is a formidable programming problem. Supercomputing 2005 was ready with a number of solutions to those issues as well. Mitrionics was debuting its “Mitrion-C” compiler, which takes a C-like parallel programming language and generates a hardware-accelerated executable that can run on a variety of machines, from Cray XD1 supercomputers to custom embedded HPEC equipment with FPGAs. Celoxica showed continued success with its Handel-C environment for hardware acceleration of compute-intensive algorithms, aimed squarely at the embedded high-performance computing area. Starbridge Systems demonstrated its “Viva” graphical language compilers generating re-usable applications to run on a variety of hardware platforms, from accelerated HPCs to FPGA development boards.

While many of us may never need the gigaflops of compute power available with HPEC systems, it is still good to see the state of the art push ahead, giving everyone some extra breathing room. Even though we may not need the power today, it takes only a small market shift to turn a compute-intensive incremental feature into a must-have. If nothing else, Supercomputing 2005 showed us that the embedded MIPS will be there when we need them.
