In 1995, I strutted my marketing suit onto the stage at the Design Automation Conference and told the world that a revolution was afoot. I had seen the light – the path to engineering enlightenment, the road to the future of design – and I wanted to share. No longer would designers have to toil and struggle with the arcane anachronisms of register-transfer-level descriptions and clock-accurate timing. Now, thanks to the amazing capabilities of high-level synthesis, thousands of lines of detailed and incomprehensible RTL would be replaced by a few elegant lines of ordinary C or C++, with clear and self-explanatory loops and conditionals, that would magically be transformed and re-transformed into whatever hardware configurations best suited our requirements for latency, power, chip area, and throughput.
With that kind of incredible capability built into our tools, I was certain that RTL design would be dead within five years and all engineers would be happily sipping cocktails on the beach while their high-level synthesis tools crafted perfectly balanced works of art – the likes of which could never even be envisioned by old-school RTL designers. Yes, a new era was upon us and I was the messenger – announcing the arrival of the new age.
I even had PowerPoint slides.
AND a demo.
Ten years later, in 2005, 99.9 percent of people were still designing with plain-old RTL. I was still convinced that high-level synthesis was the design technology of the future – and always would be.
Now, in 2011, it turns out that some people really are using high-level synthesis – and it works. For some things. You see – all those years that high-level synthesis was trying to win us over, it had quiet but effective competition from another high-level abstraction – IP-based design. Way back in the old days, we designed our custom circuits with schematics – using powerful IP blocks like AND, NOT, and OR. With that methodology, the 10,000-gate design became tedious. The 100,000-gate design became somewhat impractical, and the 1,000,000-gate design was ridiculous. The problem, we surmised, was the schematic. So we left the schematic world and went off into the land of language-based design and logic synthesis. Language-based design at the register-transfer level also became tedious, however, when the code required mushroomed into the tens of thousands of lines.
One school of thought was that we needed to raise the level of abstraction of our language-based design – dropping the cycle-accurate architecture-specific RTL descriptions in favor of un-timed, higher-level behavioral descriptions. The other alternative was to raise the level of our building blocks – and revert to a netlist-based model akin to our former schematic-based design flow. This time, however, our blocks would be much more complex than AND, NOT, and OR. We would be connecting up processor cores, memory-management units, video subsystems, Wi-Fi and USB blocks, and the like.
Option B won the match.
Today, most of us build most of our design from large, complex IP blocks. There is off-the-shelf IP for just about any standard capability we want to include in our design projects. Want a Wi-Fi connection? There’s IP for that. How about SSD storage? Yep – in seven flavors. Need a 32-bit RISC processor subsystem? Just plug-and-play.
We did, however, say “most” of our design. It turns out that you can build a lot of stuff by snapping together large IP blocks and assembling a system-on-chip to be implemented in your choice of ASIC or FPGA technology… and so can your competitors. Then, it’s just a contest to see who can manufacture the cheapest, distribute the best, and succeed at the other intangibles that determine success with a non-differentiated product.
If we want to differentiate our product, it is usually by some “secret sauce” design block that contains our own special magic. In many designs today, that “secret sauce” is some kind of high-performance datapath – like a DSP or video algorithm – that can’t be accomplished in software. That “secret sauce” is the ideal target of high-level synthesis (or ESL synthesis). We can describe our secret sauce in the essence of its algorithm – without regard for the specifics of hardware implementation – and use the power of the high-level synthesis tool to explore alternatives and find the perfect architectural compromise that meets our design goals.
It is that secret sauce that leads us here to the location where our headline has been buried, because it is here that we find the motivation for Xilinx acquiring AutoESL, and it is here that we can examine the latest move in a two-decade game of design-tool chess and estimate the value of the ESL gambit.
FPGAs have enormous potential for accelerating datapath-type algorithms. We’ve waxed on ad nauseam about the thousands of hardware multipliers on our latest FPGAs – just sitting around waiting to be parallelized – if only we had the wherewithal to use them. Unfortunately, getting 100x the performance of a DSP from an FPGA also requires about 100x the engineering work – unless you have something like a high-level synthesis tool. With high-level synthesis, we can describe complex algorithms much as we would for a DSP processor – in C or C++ code – and synthesize the result into FPGA hardware. High-level synthesis could truly be the silver bullet that enables FPGAs to break through the barrier and dominate the DSP scene.
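To make that concrete, here is a minimal sketch of the kind of C description we are talking about: one output sample of a small FIR filter, written exactly as you might write it for a DSP processor. The function name `fir_sample`, the tap count, and the data widths are all assumptions chosen for illustration – they come from no particular tool or vendor example – and nothing here is specific to AutoPilot or any other product.

```c
#include <stdint.h>

#define NUM_TAPS 8  /* illustrative tap count, not taken from any real design */

/*
 * Hypothetical example: compute one output sample of a NUM_TAPS-tap FIR filter.
 * The caller owns delay_line, which holds the most recent input samples
 * (newest at index 0). There is no notion of clocks or registers in this code;
 * deciding how many multipliers to build is left to whatever turns it into hardware.
 */
int32_t fir_sample(const int16_t coeff[NUM_TAPS],
                   int16_t delay_line[NUM_TAPS],
                   int16_t new_sample)
{
    int32_t acc = 0;

    /* Shift the delay line by one position to make room for the new sample. */
    for (int i = NUM_TAPS - 1; i > 0; --i) {
        delay_line[i] = delay_line[i - 1];
    }
    delay_line[0] = new_sample;

    /* Multiply-accumulate across all taps. */
    for (int i = 0; i < NUM_TAPS; ++i) {
        acc += (int32_t)coeff[i] * (int32_t)delay_line[i];
    }

    return acc;
}
```

On a DSP processor, that second loop runs one multiply-accumulate after another. On an FPGA, the same loop could in principle be spread across eight of those idle hardware multipliers at once – which is exactly the gap that high-level synthesis is meant to close.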
Using high-level synthesis, we can describe our algorithm in C or C++ (yes, I see you back there with your hand up, OK, “OR SystemC”), then use the power of the tool to explore hardware architecture alternatives and home in on the one that meets our needs. It should be noted here that “we” generally refers to hardware designers. While one of the dreams of high-level synthesis has always been to enable software developers to create hardware simply by throwing down some C code and pressing a button – today’s HLS technologies are most definitely not “compilers” that take C code and magically generate hardware. These tools must be guided with hardware engineering expertise – and that’s where “we” come in.
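A rough way to picture “architecture alternatives” is to write the same computation two ways in plain C. The sketch below is only an illustration under stated assumptions: the function names `dot_serial` and `dot_unrolled2` are made up, and a real high-level synthesis tool would typically be steered toward one structure or the other through constraints or directives rather than by rewriting the source by hand.

```c
#include <stdint.h>

#define N 8  /* illustrative vector length; must be even for dot_unrolled2 */

/*
 * Serial form: one multiply-accumulate per loop iteration.
 * In hardware terms, this resembles a single multiplier reused N times:
 * small area, longer latency.
 */
int32_t dot_serial(const int16_t a[N], const int16_t b[N])
{
    int32_t acc = 0;
    for (int i = 0; i < N; ++i) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}

/*
 * Unrolled by two: two independent accumulators per iteration.
 * In hardware terms, this resembles two multipliers working side by side:
 * roughly twice the area, roughly half the iterations.
 */
int32_t dot_unrolled2(const int16_t a[N], const int16_t b[N])
{
    int32_t acc0 = 0;
    int32_t acc1 = 0;
    for (int i = 0; i < N; i += 2) {
        acc0 += (int32_t)a[i] * (int32_t)b[i];
        acc1 += (int32_t)a[i + 1] * (int32_t)b[i + 1];
    }
    return acc0 + acc1;
}
```

Both functions return the same answer for the same inputs. The difference is only in how much work happens in parallel, and choosing the right point on that curve for a given latency, area, and power budget is the part that still takes a hardware designer’s judgment.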
There has been just one problem. High-level synthesis technology is extremely difficult to come by.
There are only a handful of players in the high-level synthesis game. It takes years, or some would even say decades, of development to get a high-level synthesis tool to “production ready” status. In addition to AutoESL’s AutoPilot (which is being acquired by Xilinx), there is Cadence’s C-to-Silicon Compiler, Synopsys’s Synphony C compiler (originally developed by Synfora), Cynthesizer from Forte Design Systems, Catapult C from Mentor Graphics, PowerOpt from ChipVision, C-to-Verilog from C-to-Verilog, CoDeveloper from Impulse Accelerated Technologies, and eXCite from Y Explorations. Of these, several are not focused on FPGAs, and those from large EDA companies tend to be very expensive – think five or six digits per year per seat. When one whittles down this list to those that are proven to have high quality-of-results, those that are affordable to most FPGA design teams, and those that actually target FPGA technology, the list gets very small. Compound this with the fact that most of these companies are bare-bones startups, and the options for FPGA designers wanting high-level synthesis in their flow are sparse at best.
Last year, we talked about how BDTI had begun a certification program for high-level synthesis tools – specifically targeting FPGAs with signal-processing algorithms. In that program, they had two tools – Synfora (now owned by Synopsys) and AutoESL (now being acquired by Xilinx). BDTI showed that for this class of problem, high-level synthesis tools can significantly improve productivity while giving results comparable and often superior to hand-coded RTL implementations of the same design. They also demonstrated that the FPGA version of that same design could then deliver many times the performance of traditional DSPs running that same design – with a smaller form factor, lower power, and less total cost.
A few weeks ago, in “Elephant in the Room,” we talked about how synthesis technology was crucial to the use of FPGA technology. We chronicled how the FPGA companies have almost completely succeeded in wresting control of this critical technology away from the EDA industry, and how that threatened to leave the EDA industry mostly out in the cold in the fastest-growing segment of the semiconductor industry – FPGAs. The next great hope for EDA in the FPGA world – past RTL synthesis – is high-level synthesis. All three big EDA companies have high-level synthesis technology that could be effectively deployed for FPGA design, but all three charge an ASIC-like premium for the product, and they seem to be primarily aiming the product at their traditional custom-semiconductor, ASIC, SoC audience.
By acquiring AutoESL, Xilinx does two things: First, they guarantee that an affordable high-level synthesis technology will be available to Xilinx customers. Second, they guarantee that one of the primary, proven tools will not be available for competitors’ FPGAs. If high-level synthesis is the next linchpin in the FPGA tools game, then Xilinx may have made a strong strategic move by locking AutoESL in. The remaining contenders on the tool side now have to contend with the fact that all three big EDA companies already have offerings, and one of the two large FPGA companies does as well. Their options just narrowed. For Xilinx’s rival Altera, the question is now whether they want to depend on EDA to supply a potentially critical tool technology or try to lock in one of the remaining independent tools to restore detente in the high-level-synthesis-for-FPGA wars.
Xilinx says they have big plans for AutoESL – and while it’s too early to judge specifics, the opportunities to optimize high-level synthesis for specific IP libraries and specific target technologies are many. By eliminating some of the huge variables that challenge high-level synthesis tools, it should be possible to make a tighter, better performing tool – after some tuning time. It will be interesting to watch.