Storm clouds are building on the horizon. Massive thunderheads darken the sky, their convective currents creating blinding static outbursts as enormous caches of charge follow the shortest path to ground. Change is coming to computing and it’s not going to be smooth or evolutionary. With the IoT driving the cloud and the cloud defining datacenter computing, we are about to be in the midst of perhaps the greatest discontinuous change in the history of computation.
With predictions that the IoT will deploy a trillion sensors worldwide within the next few years, and with billions of connected devices already in the hands of the majority of the civilized world, we are on the brink of an incredible revolution in technology. From self-driving cars to big-data-bolstered healthcare, we will soon see technological feats that even science fiction failed to predict.
At the heart of this revolution is a rapidly changing global computing infrastructure that will process, filter, coalesce, and cross-correlate staggering amounts of data from trillions of independent sources, giving us the information and (eventually) the intelligence to benefit humanity in ways we have barely dreamed of. But what will this global computing infrastructure look like? To paraphrase Douglas Adams’s “Deep Thought”: “I speak of none but the computer that is to come… whose merest operational parameters I am not worthy to calculate…” and yet, we will now try to explain it to you.
Almost the entire electronics industry is in some way now engaged in designing parts of the largest heterogeneous distributed computing engine ever imagined. From fast-paced startups chasing the wearable sensor wealth to old-guard heavy-iron rack builders deploying clouds-in-a-box, just about every aspect of the technology infrastructure is contributing. Starting at the sensors, we can follow the data from the ultra-low-power MCUs in the endpoint through the maze of IoT branches to the backbone of the Internet and ultimately to massive server farms doing cloud-based computing and storage. At every node in this massive web, the key drivers are power and performance. We need maximum throughput with minimum latency on a tiny power budget.
Interestingly, at just about every juncture, programmable logic and FPGA technology play a starring role. Let’s start at the sensor, where massive amounts of data must be gathered and filtered. Many sensors are “always on” but produce interesting data only intermittently. Behind the sensor we need ultra-low-power monitors that watch and wait while the rest of the system sleeps. This standby watcher function is often performed by ultra-low-power programmable logic devices that can sip microwatts while keeping a watchful eye on the sensors. When something interesting does happen, these programmable logic devices can kick into “sensor fusion” gear, aggregating and filtering the input from multiple sensors and trying to derive “context.” What does this set of inputs mean? Perhaps they tell us our customer is now out for a run. That simple piece of information can now be passed up the line, rather than an enormous mass of raw sensor data.
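To make the "watch, wake, fuse" idea concrete, here is a minimal software sketch of that flow. It is only an illustration: real standby watchers are implemented in programmable logic, and every name and number here (`Sample`, `WAKE_ACCEL_G`, `WINDOW`, the thresholds inside `classify`) is a hypothetical assumption rather than anything drawn from an actual device.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable, Iterator, List


@dataclass
class Sample:
    accel_g: float    # acceleration magnitude in g (hypothetical sensor)
    heart_bpm: float  # heart rate in beats per minute (hypothetical sensor)


WAKE_ACCEL_G = 1.5  # assumed wake threshold; below this the watcher stays asleep
WINDOW = 4          # assumed number of samples fused after each wake event


def classify(window: List[Sample]) -> str:
    """Fuse two sensor streams into a single context label (toy thresholds)."""
    avg_accel = mean(s.accel_g for s in window)
    avg_bpm = mean(s.heart_bpm for s in window)
    if avg_accel > 1.5 and avg_bpm > 120:
        return "running"
    if avg_accel > 1.1:
        return "walking"
    return "idle"


def watch(samples: Iterable[Sample]) -> Iterator[str]:
    """Yield one context label per wake event instead of the raw samples."""
    window: List[Sample] = []
    for s in samples:
        if not window and s.accel_g < WAKE_ACCEL_G:
            continue  # standby: nothing interesting, discard and keep sleeping
        window.append(s)  # awake: collect samples for fusion
        if len(window) == WINDOW:
            yield classify(window)
            window = []  # go back to standby


# 17 raw samples go in; a single label comes out.
stream = [Sample(1.0, 60)] * 10 + [Sample(2.0, 150)] * 4 + [Sample(1.0, 60)] * 3
print(list(watch(stream)))  # prints ['running']
```

The point of the sketch is the data reduction: the quiet samples never leave the watcher, and the burst of activity is passed up the line as one word of context.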
When massive amounts of data do get passed up the line, they often travel wirelessly. FPGAs sit very close to the antennas on wireless base stations, performing high-speed signal processing on the antenna output before it is passed to the next stage. Then, FPGAs push packets through multi-gigabit pipes as the data makes its way toward the backbone of the Internet. Once in the backbone, the data hits the traditional sweet spot for programmable logic. FPGAs have been the go-to technology for packet switching for the past two decades.
When those packets arrive at the datacenter, FPGAs are on the job again, gathering gobs of incoming data and helping to distribute it to the appropriate racks of servers. In fact, just about every time there is a giant data pipe, there are FPGAs at both ends pushing and pulling the data and routing it off into smaller pipes.
It is at this point that the structure of the datacenter becomes much more application specific. Different server architectures work best for different problems. The optimal proximity of processor, accelerator, memory, and storage, and the types of connections used between them are determined by the task at hand. All clouds may look alike from the outside, but they are actually arranged differently depending on what tasks they are supposed to be performing.
Once inside the individual server blade, we hit the point where FPGAs are making their move into the last bastion of conventional processors. In these pages, we have discussed at length Intel’s strategy to create heterogeneous computers that include both conventional processors and FPGA fabric in the same package, and probably even on the same chip. FPGAs can accelerate selected computing tasks enormously while reducing overall power consumption, so merging processors with FPGAs will be an improvement of epic proportions. Once the ecosystem is in place, these heterogeneous processors will revolutionize the datacenter in terms of performance-per-watt.
Of course, Intel isn’t the only company pursuing FPGAs in datacenter computing. While its dominant market share in datacenter processors and its pending acquisition of Altera certainly give it an enviable position in the race, it is by no means the only viable force in the fight. Xilinx and Qualcomm recently announced a collaboration to deliver a heterogeneous computing solution to the datacenter that is similar in many ways to the Intel/Altera collaboration. In fact, the Xilinx/Qualcomm announcement fueled increased levels of speculation that Xilinx might be an acquisition target for a company such as Qualcomm. But, regardless of whether it’s through collaboration or acquisition, it is clear that competing solutions are on a collision course in wooing the architects of the cloud-based computers of tomorrow.
That race is a competition with many dimensions. We have the basic semiconductor fabrication technology – with Intel going up against TSMC. We have the advanced multi-chip packaging technologies from both sides. We have Altera’s FPGA fabric architecture competing with that of Xilinx. We have Intel’s processor architecture squaring off against an insurgency from ARM. We have various competing high-speed memory architectures including hybrid memory cube (HMC) versus high-bandwidth memory (HBM) and others. We have a wide variety of mass storage technologies claiming supremacy. And we have Qualcomm’s server platforms against a plethora of incumbents, each with their own strengths and differentiators.
If the new reality is like the present, there will be only one winner. Today, Intel dominates the server market. But the primary reason for that dominance is the strength, legacy, and backward-compatibility of Intel’s x86 architecture. Zillions of lines of datacenter code have been optimized for those processors, and that fact alone has been enough of a moat to fend off even the most serious challengers.
But, with the next wave of computing, instruction set compatibility may not be the sticky superpower it once was. If acceleration with FPGAs is the magic bullet, the engineering effort to optimize an algorithm for FPGA-based acceleration will likely be dramatically greater than the effort required to, say, port software to a different instruction set. And, if the FPGA component is the new “stickiness,” the winning overall system may be the one with the winning FPGA. Finally, the winning FPGA may be the one with the most robust set of software tools for getting applications to take advantage of the incredible power of heterogeneous computing with the least amount of engineering effort and expertise required.
So, it may all boil down to this: The distributed cloud-computing architecture of the future could be determined by which FPGA company has the best software tools.
That’s a high-stakes battle that would be interesting to watch.