Today, Intel made what is likely to be their most important announcement of the year, or perhaps the next couple of years. Clearly the biggest headliner is the introduction of the “Second Generation Xeon Scalable Processors,” which are intended to carry the heaviest load in defending Intel’s estimated 99% market share in data center processors. But the breadth of the announcement is staggering and goes well beyond the obvious, “Here, finally, are Intel’s new Xeons,” with key announcements in networking/connectivity, memory/storage, and FPGAs.
Intel wants us to know that there is going to be a lot of data.
Jennifer Huffstetler, Vice President & General Manager – Data Center Product Management and Storage – referenced a Forbes article from 2018, throwing out the mind-blowing statistic that 90% of the world’s data has been created in the last 2 years, and that only 2% of it is being used. How much data are we talking about? Lisa Spelman, Vice President & General Manager – Intel Xeon Products and Data Center Marketing – says that, in 2018, 33 ZB of data was created (that’s zettabytes – 10²¹ bytes, or billions of terabytes). By 2025, she says that is expected to rise to 175 ZB, an annual growth rate of over 25%. Spelman goes on to say that, by 2025, an estimated 30% of that data will be created by over 150 billion connected devices.
The Internet of Things is drowning us in data.
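For a quick sanity check on those growth numbers (our own arithmetic, not Intel’s), growing from 33 ZB in 2018 to 175 ZB in 2025 works out to a compound annual growth rate of about 27%, which squares with Spelman’s “over 25%”:

```python
# Back-of-the-envelope check of the data-growth figures cited above.
# 33 ZB in 2018 growing to 175 ZB in 2025 spans 7 annual growth steps.
start_zb, end_zb = 33, 175
years = 2025 - 2018

# Compound annual growth rate: (end/start)^(1/years) - 1
cagr = (end_zb / start_zb) ** (1 / years) - 1
print(f"Implied annual growth rate: {cagr:.1%}")  # ~26.9%, i.e. "over 25%"

# One zettabyte is 10**21 bytes -- a billion terabytes.
print(f"175 ZB = {175e21 / 1e12 / 1e9:.0f} billion terabytes")
```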
Intel, of course, is very happy about this shocking trend. They clearly want to own the business of moving, storing, and processing all that data, and they are planning to take advantage of significant resources at their disposal to win that battle. The point of Intel’s spear is their Xeon line of server processors, but the company brings a much broader technology portfolio to the data party, and just about every element is being updated in today’s announcement. In addition to rolling out more than 50 workload-optimized second-generation Xeons (which we’ll cover in detail), Intel rolled out new permutations of their Optane persistent memory, a 30x AI inferencing performance boost for Xeon, a new 10nm FPGA family, and Ethernet adapters with 100Gbps port speeds and application device queues (ADQ).
Intel spent two full days briefing analysts and the press for today’s announcement, and the five-page press release barely skims the surface, so we’ll dive here initially into the aspects we found most intriguing. First, of course, the second-generation Xeon server processors. Intel has replaced their current 81XX, 61XX, 51XX (and so on) lines with second-generation parts numbered 82XX, 62XX, 52XX, etc. In addition, Intel is introducing a new 92XX “Platinum” series, AKA the “Now-we-are-faster-than-AMD-again” series (real part numbers may vary), which is apparently aimed at recapturing bragging rights at the top of the performance curve.
The “Platinum” 92XX series boasts up to 56 cores (so up to 112 cores in a 2S system) with an average 2x performance increase against 81XX Xeons, and “up to 5.8X better performance than AMD EPYC 7601.” (We think they mean “5.8 times the performance of…” but we’ll use their terminology.) The 92XX series also benefits most from significant AI inference performance improvements that Intel is branding as “DL Boost,” which the company claims delivers 30x the performance of Xeon Platinum 8180 processors on AI inferencing tasks. The new processors can attach a whopping 36TB of memory in an 8-socket system, using both DRAM and Intel’s Optane persistent memory.
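For those curious how a general-purpose CPU picks up a gain of that magnitude, DL Boost centers on low-precision arithmetic: the new AVX-512 VNNI instructions fuse the multiply-accumulate sequence for 8-bit integer math, and quantizing a model from 32-bit floats to 8-bit integers multiplies throughput while costing little accuracy. The sketch below shows the quantization idea in NumPy; it is our illustration rather than Intel code, and the layer sizes and per-tensor scaling scheme are arbitrary choices.

```python
import numpy as np

# Minimal sketch of the int8 quantization idea behind DL Boost-style
# inference. Our illustration, not Intel code; shapes and the
# per-tensor scale scheme are arbitrary.

rng = np.random.default_rng(0)
weights = rng.standard_normal((256, 256)).astype(np.float32)  # trained fp32 weights
activations = rng.standard_normal((1, 256)).astype(np.float32)

def quantize(x):
    """Map a float tensor to int8 plus a per-tensor scale factor."""
    scale = np.abs(x).max() / 127.0
    return np.round(x / scale).astype(np.int8), scale

w_q, w_scale = quantize(weights)
a_q, a_scale = quantize(activations)

# Integer multiply-accumulate (the operation VNNI fuses into a single
# instruction), with 32-bit accumulation to avoid overflow, then dequantize.
acc = a_q.astype(np.int32) @ w_q.astype(np.int32)
result_int8_path = acc.astype(np.float32) * (a_scale * w_scale)

result_fp32_path = activations @ weights
err = np.abs(result_int8_path - result_fp32_path).max()
print(f"max abs error vs fp32: {err:.4f}")  # small relative to fp32 outputs
```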
Servers are typically on a five-year replacement cycle for most large customers, so Intel’s self-referenced performance comparisons are mostly against models from five years ago. The company claims “up to 3.5x” 5-year refresh performance improvement, 1.33X against Xeon Gold 5100 processors, and 3.1X “better performance than EPYC 7601” with the new 82XX Xeon. The AI inference from “DL Boost” on the 82XX processors is said to be 14x that of the 2017 Platinum 8180 chip. The new devices also include hardware mitigation for side-channel attacks, hardware acceleration for encryption, and on-module encryption for Optane persistent memory.
Intel also announced the Xeon D-1600 processor, which is an SoC version of the Xeon, designed for “dense environments” closer to the edge where power and form factor are critical. The SoC targets 5G and other applications that move computation away from the data center and toward the ever-growing “edge” where the company has faced much stiffer competition than in the data center itself.
Intel is going after much more than the data center with this new wave of technologies. The company estimates the “Data Centric” market at something like $300B, combining the data center processing, networking, and connectivity; non-volatile memory; IoT and ADAS; and FPGA markets. Within these markets are cross-market technologies such as AI training and inference, video, networking, storage, and so forth. In other words, there’s money in them thar hills. In this game, Intel’s strength is its breadth, with captive proprietary manufacturing technology and a wealth of architectural IP spanning the gamut of the sub-markets they are attacking. On the downside for Intel, however, the company faces tough technical competition in most of those areas, with clear leadership positions in only a few. So, while a 99% share in data center processing might seem an insurmountable advantage, the big-picture battle will still be an uphill struggle for the company.
Intel’s “Data Centric” vision spans well beyond the data center. The company is looking at the entire computing stack, from data center and cloud to the edge. As the IoT expands, demands will push more and more computing toward the edge (and away from the data center), so a strategy that focused solely on data center dominance would be tragically short-sighted.
The “DL Boost” AI inferencing improvements warrant attention. By achieving over an order of magnitude improvement in inference performance with Xeon, the company has moved the goal posts in the enormously competitive emerging market for AI acceleration. After all, if you already have Xeon processors with some available headroom, and those processors can satisfy your inference needs, you don’t need AI accelerators at all. This effectively shrinks the market for AI acceleration from the bottom. Intel has the remaining options covered as well, with FPGAs and their “Nervana” processors to accelerate inference when needed. So, companies selling only accelerators have fewer sockets to capture, while Intel wins either way.
Much of Intel’s “Data Centric” strategy seems to work this way. Want a “normal” car? Buy something like a Toyota (or, in the case of servers, any of the Intel-powered brands). Want more performance? Buy something like a Porsche (or again, in the case of servers, any of the higher-end Intel-powered brands). Want something even faster than that? Start buying aftermarket “tuner” upgrades for your Porsche. By keeping competitors (like NVIDIA GPUs for compute acceleration) cordoned off in the high-end “tuner” market (and then also competing directly with them there), Intel puts themselves in a position to take the lion’s share of the business.
One example of this effect in action is FPGAs for data center acceleration. In this same announcement, Intel is rolling out their “Falcon Mesa” 10nm FPGA families, now branded “Agilex” (more on these in an upcoming article). FPGA archrival Xilinx is basically betting the farm, with their “Data Center First” strategy, on their new “Versal” line of compute acceleration FPGAs, which the company refers to as “ACAP” (Adaptive Compute Acceleration Platform). Xilinx’s devices are accelerators for customers with specialized workloads, such as AI inference, that can’t be handled effectively by the existing server processor.
By giving Xeon a 30x AI inference boost, Intel effectively cuts off a huge chunk of the potential market for Xilinx’s acceleration devices. Then, with their own new family of FPGAs, they compete with Xilinx for the remaining acceleration sockets. But, because Intel is already building most of the rest of the server, they have the ability to “trojan horse” their own FPGAs into the server from the beginning. That effectively narrows Xilinx’s opportunity to those who aren’t satisfied by the performance of the Xeon itself OR the Xeon plus Intel FPGA that would come bundled in their system. Intel enjoys this same kind of leverage advantage across a number of technologies in this space, from processors to accelerators, storage, networking, and more.
Intel’s announcement today also covered new versions of their “Optane” persistent memory. Optane is Intel’s marketing name for the combination of the 3D XPoint memory developed with Micron Technology, Intel’s memory and storage controllers, interconnect IP, and proprietary software. Optane is a game changer in the memory/storage stack, filling the gap between fast/expensive DRAM and slower/cheaper Flash. Since Optane is persistent, one could look at it either as much faster persistent storage, or as much larger RAM.
By blurring the memory/storage boundary, Optane has the potential to re-architect the way we handle data in our computing systems. Today, Intel announced “Optane DC Persistent Memory” to address the memory side of that duo. This version of Optane offers a persistent and lower-cost way to effectively expand the DRAM capacity of servers, using “real” DRAM almost as a hot cache and Optane for the slightly cooler in-memory data. Over time, applications are likely to capitalize on the persistent nature of Optane as well. By building a server with 3TB DRAM and 6TB Optane, Intel can deliver an effective 9TB system memory. The company claims that would yield a 13x improvement in restart time (20 minutes down to 90 seconds) and a 39% reduction in system memory cost.
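The arithmetic behind those claims is easy to reproduce. The restart figure is just the ratio of the two times; the cost figure depends on per-gigabyte pricing, so the prices in the sketch below are our own placeholder assumptions (not Intel numbers), picked only to show how a roughly 39% saving falls out when Optane costs well under half of DRAM per gigabyte:

```python
# Reproducing the arithmetic behind Intel's Optane claims.
# Restart-time improvement: 20 minutes down to 90 seconds.
restart_before_s = 20 * 60
restart_after_s = 90
print(f"Restart speedup: {restart_before_s / restart_after_s:.1f}x")  # ~13.3x

# Memory-cost comparison for an effective 9TB of system memory.
# The per-GB prices below are placeholder assumptions (not Intel figures),
# chosen only to illustrate how a ~39% saving could arise.
dram_per_gb, optane_per_gb = 10.0, 4.2  # hypothetical $/GB
all_dram = 9 * 1024 * dram_per_gb
tiered = 3 * 1024 * dram_per_gb + 6 * 1024 * optane_per_gb
saving = 1 - tiered / all_dram
print(f"Cost saving with 3TB DRAM + 6TB Optane: {saving:.0%}")  # ~39%
```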
Moving to the storage side of the equation, Intel is announcing Optane DC Solid State Drives (SSDs). These add a new, faster layer to the storage equation, giving dual-port, RAM-like performance for critical enterprise IT applications. By delivering Optane in both the DIMM-style (“ruler”) memory form factor and SSD form factors, the company gives customers the flexibility to use either or both sides of the Optane coin to optimize their system performance and cost.
Finally, Intel is also announcing their new “Ethernet 800 Series” cards with 100GbE ports, queue and steering hardware assists, and application device queues (ADQ). ADQ is an application-specific queuing and steering technology that directs application traffic to a dedicated set of queues, giving significant real-world performance advantages. The company claims the new solution (along with the new servers) delivers over 45% latency reduction and over 30% throughput improvement.
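To see why steering an application’s traffic to its own queues helps, consider a toy model (our illustration, not Intel’s implementation; the traffic mix and service times are invented): latency-sensitive packets stuck in one FIFO behind bulk transfers inherit the bulk traffic’s delays, while a dedicated queue with its own service capacity all but eliminates that queueing delay. In the dedicated case, the bulk traffic is assumed to land on other queues and cores.

```python
import random

# Toy model of the ADQ idea (our illustration; all parameters are invented):
# latency-sensitive "app" packets either share one FIFO with bulk traffic,
# or get a dedicated queue with its own service capacity.

random.seed(42)
SERVICE_US = {"app": 2.0, "bulk": 40.0}  # hypothetical per-packet service times

def arrivals(n=20000, app_fraction=0.2, mean_gap_us=50.0):
    """Generate (arrival_time, kind) pairs with exponential inter-arrival gaps."""
    t = 0.0
    for _ in range(n):
        t += random.expovariate(1.0 / mean_gap_us)
        yield t, ("app" if random.random() < app_fraction else "bulk")

def app_latencies(packets, accept):
    """Serve packets matching `accept` through one FIFO; return app latencies."""
    free_at, latencies = 0.0, []
    for t, kind in packets:
        if not accept(kind):
            continue  # in the dedicated case, bulk lands on other queues/cores
        done = max(t, free_at) + SERVICE_US[kind]
        free_at = done
        if kind == "app":
            latencies.append(done - t)
    return latencies

def p99(xs):
    return sorted(xs)[int(0.99 * len(xs)) - 1]

traffic = list(arrivals())
shared = app_latencies(traffic, accept=lambda k: True)          # app + bulk share a FIFO
dedicated = app_latencies(traffic, accept=lambda k: k == "app")  # app-only queue

print(f"shared queue    p99 app latency: {p99(shared):7.1f} us")
print(f"dedicated queue p99 app latency: {p99(dedicated):7.1f} us")
```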
Today’s announcement encompassed far more than we’ve outlined here, including the software ecosystem, integrated/optimized platforms, and application-specific solutions based on these technologies. We’ll be reviewing more focused subsets of these in upcoming articles. It will be interesting to see what effect this rollout has on the analyst community, which has been forecasting market-share losses for Intel. (Admittedly, when you have 99% market share, there is really only one direction you can go.) The real key, however, is how Intel’s strategy will stand up to the dramatic and discontinuous change in the world’s compute infrastructure brought on by the deluge of data from IoT, the convergence of communication with 5G, and the overhaul of application architecture brought on by AI. These forces are rewriting all the rules for computation, storage, and networking, with enormous potential payoffs for the winners.