Accelerating Mainstream Servers with FPGAs

Intel Puts FPGAs Inside

For a number of years, the world’s largest semiconductor company’s primary marketing slogan was designed simply to remind us they existed. The “Intel Inside” slogan and sticker told us, “Hey, by the way, that snazzy new Dell, HP, Toshiba, or even (ulp) Apple device you just bought gets its oomph from our chips. We just thought you’d like to know.”

It’s a novel situation when most of the customers who are dishing out triple-digit and better cash for your product don’t even realize they are buying it. Few laptop or desktop computer customers bothered to learn what semiconductor process their CPU was built on, how many transistors it encompassed, or even how many cores it boasted. Only the more savvy demographic even paid attention to the clock frequency, and most of them had little understanding what that vanity number even meant.

When Intel bought Altera, we believed the company was assembling the essential ingredients for a revolution in data center computing – bringing FPGA-based acceleration to the mainstream. Today, Intel announced a major step in that direction. No, we didn’t see the tapeout of a new exotic single-digit-nanometer accelerator chip. We didn’t learn that Intel compiler technology had been fused with Altera FPGA design tools to give us the ultimate software-to-FPGA development flow. In fact, we didn’t learn a single new thing about technology for FPGA-based acceleration in the data center.

Instead, we got a refresher course in the true meaning of “mainstream” from the company that has dominated the data center server market for decades. Intel announced that top-tier OEMs including Dell and Fujitsu are rolling out servers pre-equipped with Intel programmable acceleration cards (PACs) containing Arria 10 GX FPGAs.

Wait, what?!? Plain-old boring Arria 10? Fabricated on TSMC 20nm planar process no less? In a Dell?

Chevy did not become a mainstream car brand because of the Corvette.  

While we in engineering land breathlessly debate the FinFETs, LUTs, HBMs, and SerDes arriving with the next installment of Moore’s Law, Intel is packing low-cost, hard-working, mass-produced programmable logic devices onto PCIe cards and plugging them into Xeon-powered boxes by the truckload. IT procurement folks who want to future-proof their rack-based server systems will likely be checking order form boxes with Dell R640, R740, R740xd, or Fujitsu Primergy RX2540 M4 servers – without even knowing what an FPGA is.

The battle for supremacy in the FPGA acceleration game will not be decided by next-generation Stratix battling Virtex. It will not hinge on the outcome of the “Falcon Mesa” versus “Everest (ACAP)” showdown. In fact, the war may well be over before those exotic platforms ever see the light of day. The Trojan horses have already been welcomed inside the city walls, and the PACs will be waiting patiently for the right combination of workload and acceleration IP to sneak out at night and show their stuff.

Intel rolled out all the ingredients for today’s announcement back in 2017. They told us about the Intel Programmable Acceleration Cards with Arria 10 GX FPGAs, they announced their “Acceleration Stack” software development flow to allow third parties to create optimized accelerators for Intel Xeon CPUs with Intel FPGAs, and they announced an ecosystem where those third parties could sell acceleration IP for key applications such as financial acceleration, data analytics, cyber security, genomics, video transcoding, and AI.

Today, that recipe has emerged from the oven in the form of servers available for volume shipment with ready-to-run, results-proven acceleration for key workloads including financial risk analytics and database acceleration. Results on both applications are beyond compelling. On financial risk analysis, there’s an 850% per-symbol algorithm speedup and a greater than 2x simulation time speedup compared with a traditional “Spark” implementation. On database acceleration, Intel claims 20x+ faster real-time data analytics, 2x+ faster traditional data warehousing, and 3x+ storage compression.
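Those figures mix percentages and multipliers, so it helps to put them on one scale before comparing them. The sketch below is only an illustration of that arithmetic: the announcement as reported here does not say whether "850%" means "850% faster" or "850% of baseline", so both readings are shown, and the function names are hypothetical.

```python
def multiplier_if_percent_faster(percent):
    """Reading A (assumption): 'N% faster' means new = old * (1 + N/100)."""
    return 1 + percent / 100


def multiplier_if_percent_of_baseline(percent):
    """Reading B (assumption): 'N%' means new = old * (N/100)."""
    return percent / 100


def runtime_saved_fraction(multiplier):
    """Fraction of the original runtime eliminated by a given speedup multiplier."""
    return 1 - 1 / multiplier


if __name__ == "__main__":
    # The quoted per-symbol figure, under each reading.
    print(multiplier_if_percent_faster(850))       # 9.5
    print(multiplier_if_percent_of_baseline(850))  # 8.5
    # Quoted lower bounds for simulation time and real-time analytics.
    for m in (2, 20):
        print(m, round(runtime_saved_fraction(m), 3))
```

Under either reading the per-symbol speedup is roughly an order of magnitude, while a 2x speedup removes half of the original runtime and a 20x speedup removes 95% of it.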

In the financial risk analytics business, performance is a big deal. Terabytes of data must be crunched in real time, and rewards for crunching faster are substantial. Options trading, insurance, and regulatory compliance fuel an expected doubling of the risk analytics market over the next five years. Levyx, an Intel financial analytics partner, has a ready-to-use platform already tested and running on the Intel combo. Many more accelerated applications from a wide range of companies are likely to be available, and the strength of that ecosystem will be key to the success of the new servers.

Because of the low cost of Arria devices, the FPGA accelerator isn’t likely to move the needle much on the overall cost of servers, and the benefits will be substantial for companies with well-matched applications to run. And, because the pre-optimized applications come with FPGA acceleration already dialed in, most companies will be able to take advantage of the substantial performance gains and power savings with zero in-house FPGA expertise.

This presents a bit of a new situation for the Intel PSG division (formerly known as Altera). FPGA companies are used to a world where they compete on merits for every socket. Engineers dive into the Fmax, LUT counts, MAC size and performance, design tool runtimes, and a host of other detailed considerations before selecting an FPGA to design into their system. FPGAs are specialized devices that have always been used and specified by experts. Now, however, we’ll have companies specifying and buying Arria FPGAs in mass quantities with essentially no scrutiny of the usual FPGA properties. FPGAs will just be “Inside” their new servers. Maybe they’ll use them, maybe they won’t. 

This also bodes well for Intel in the ecosystem war for FPGA acceleration. If these FPGAs in commodity servers proliferate, there will be fertile ground for third parties to develop high-value accelerators (such as the Levyx example above) based on the Intel Acceleration Stack. That stack is very “sticky” stuff, and applications developed there are likely to be deeply entwined with Intel FPGA architectures and low-level IP. If a large number of application developers adopt the Acceleration Stack, Intel’s FPGAs have the potential to become the “x86” of acceleration – which is a substantial defense against would-be competitors. It won’t matter whose FPGAs are better. It will only matter which ones support your application on your servers.

4 thoughts on “Accelerating Mainstream Servers with FPGAs”

  1. I have 2 points:
    1) Intel cannot point to one application, based on FPGA acceleration, that they use in a production environment. That’s why they say “you can do it” and not “we use it ourselves.”

    2) It’s no longer about processes and manufacturing might. It’s all about the software and hardware architectures supporting reconfigurable computing. There is no software that can compile a million-line program into an FPGA. There is no hardware you can buy that reconfigures in less time than it takes for a processor to run a billion instructions. Precompiled hardware (bitstreams) can’t be generated and manipulated in real time because FPGA manufacturers keep information on the bitstream a secret. Imagine if Intel was the only company that could compile a program for its device? And they charged you $5K to do it?

    If FPGA manufacturers want to play on the data center floor they need to think like processor companies and not instant ASIC companies.

  2. Commercial companies have been offering FPGA boards to be used together with servers. It just hasn’t become mainstream or common enough for the software application companies to leverage them (as compared to GPUs). However, with the cloud nowadays, you can easily switch to a server with an FPGA. The cloud data centers have kinda solved this distribution issue. The issue is still software. Unlike GPUs, where a single library works with most GPUs, FPGA software is more complicated.
