Wouldn’t it be great if there were more options in FPGA tools? For decades now, the FPGA community has decried the lack of FPGA design tool options. You’d think that a technology that has been evolving and maturing for over thirty years would have long ago reached the point where there were a wide variety of competitive programming options to choose from. However, we are still basically at the point where there is one and only one option for doing your FPGA design – the tool suite sold and distributed by the FPGA company itself.
It’s not that the third-party and open-source communities haven’t tried to find ways to produce viable alternative design flows. They have. Numerous EDA companies, from fast-moving, highly motivated, innovative startups to big, lumbering, institutionalized EDA vendors, have poured creativity, energy, determination, and piles of cash into efforts to build a third-party ecosystem for FPGA design. Well-meaning communities have applied proven open-source formulas for solving complex problems, looking for an alternative to proprietary tools. All to practically no avail.
Why is it so hard to create a viable alternative FPGA design flow?
If you are looking for villains, there are plenty of options. EDA companies were quick to blame the FPGA vendors themselves, of course, for everything from withholding important technical data, making it difficult or impossible for a third party to create working tools, to trashing the market by offering competing tools for free. These theories have some merit. The EDA business model would have tools selling for significant amounts of money: it really requires at least five-digit price tags to support a sale, although vendors can sometimes trim things down and get by with four-digit pricing. But FPGA companies don’t want such a huge barrier to entry for their devices, so they really do need to have a “free” tool option for designers and teams who don’t have enterprise-grade budgets.
Regarding the proprietary information, there is a fine line that FPGA companies must walk in providing technical data to third-party tool developers. They have hard-earned intellectual property that they don’t want to expose to the world, and a very rich level of detail is required for a typical design tool to do its job. Furthermore, just gathering all the data into a form that would be useful to tool developers is a monumental task.
But, beyond the finger pointing, there are formidable technical barriers to the creation of a viable third-party ecosystem.
First, an FPGA is not a microprocessor. Processors, by and large, can be well characterized by their instruction set. If you know the instruction set, you can make a compiler. If you can make a compiler, you can build an entire development suite on top of it. These days, in fact, you don’t really even need the instruction set. With the proliferation of hypervisors and the like, it takes very little to connect higher levels of software abstraction to the bare-metal logic of the processor.
In the FPGA world, there is no hypervisor. There is no instruction set. There is a complex physical array of components whose behavior depends not only on their logical structure but also on their physical location. And simply changing the number or arrangement of components has a dramatic impact on the upstream tool chain. To be specific, FPGAs are made up of look-up tables (LUTs) and “other stuff.” The “other stuff” includes hardened cells for doing arithmetic (multiply, accumulate, and so on, for digital signal processing applications), memory in various sizes, widths, and configurations, various types of IO, and sometimes even processors and peripherals.
While the output of a software compiler is essentially an ordered list of instructions, the output of an FPGA implementation tool is a complex physical arrangement of components and interconnect. This important difference gives rise to a multitude of problems. On top of that, FPGAs are doing their work in a fine-grained, highly-parallel environment. A processor can tick along doing its job based on a single clock, and that clock can generally be sped up or slowed down with no important changes in the function of the design. An FPGA, however, could have a multitude of clocks, some synchronous and some not, with branches of irregular-length logic chains timed to complete (hopefully) within one clock cycle. So, while the output of a software compiler is basically independent of clock speed, the output of the FPGA flow is most certainly not.
In the old days, this duo of issues was mitigated by the relative speed of interconnect. If you were stacking several layers of logic functions between two registers, the propagation delay through the chain of logic would depend primarily on the logic functions themselves. As geometries got smaller, however, a greater proportion of the total delay was spent in the interconnect – the wires between logic elements. Over time, the interconnect delay became so dominant that the delay picture really depended more on the “wires” between components than on the components themselves.
Why does this matter?
The primary steps in the FPGA implementation flow are synthesis and place-and-route. Synthesis used to be called “logic synthesis” because it took a (supposedly) human-readable hardware description language (HDL) file and synthesized it into a network of interconnected logic components. Synthesis is the closest thing in an FPGA flow to a “compiler” because it converts a human-readable description of an algorithm into a much more detailed but equivalent logical description. Unlike a compiler, however, a synthesis tool must also worry about the timing aspects of the resulting design – making sure that each stack of combinational logic can complete its work within one clock cycle (in the relevant part of the circuit). That means synthesis also needs to perform a detailed timing analysis of the logic design it is creating.
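To make that concrete, here is a minimal sketch of the per-path arithmetic involved (the delay values are invented for illustration and stand in for what a real timing engine computes from device data):

```python
# Toy timing check for one register-to-register path.
# All numbers are hypothetical, in nanoseconds.

clock_period_ns = 5.0                  # 200 MHz target clock
logic_delays_ns = [0.8, 0.6, 0.7]      # delay through each level of logic on the path
routing_delay_ns = 3.0                 # interconnect delay, only known after routing
                                       # (note: the wires cost more than the logic itself)

path_delay_ns = sum(logic_delays_ns) + routing_delay_ns
slack_ns = clock_period_ns - path_delay_ns

print(f"path delay = {path_delay_ns:.1f} ns, slack = {slack_ns:.1f} ns")
if slack_ns < 0:
    print("Timing violation: this path cannot finish within one clock cycle.")
```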
Now, remember when we said that timing now depends more on the interconnect than on the logic? Well, the synthesis tool has no idea what the interconnect delay is going to be. That will be determined by where the components are placed during the place-and-route phase, and by how the router decides to route the interconnections that make up each net. This creates a chicken-and-egg problem: synthesis cannot know the timing until place-and-route has done its job, and place-and-route cannot begin until synthesis has created a netlist of components. This dragon-eating-its-own-tail scenario has gotten worse as designs have gotten larger, interconnect delays have increased, logic delays have decreased, and timing has become more critical.
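A hedged sketch of that loop, with made-up numbers, just to show why neither step can finish alone (real tools iterate over placement and netlist structure, not a single scalar delay):

```python
# Toy model of the synthesis <-> place-and-route timing loop.
# Synthesis must guess the interconnect delay; place-and-route later reveals
# the real one, which may force synthesis to restructure the failing logic.

clock_period_ns = 5.0
logic_delay_ns = 2.1        # delay through the logic on the critical path

def place_and_route(logic_delay_ns: float) -> float:
    """Pretend P&R: returns the 'actual' routing delay for the critical path.
    Shorter logic chains are assumed to route slightly better."""
    return 3.2 - 0.3 * (2.1 - logic_delay_ns)

for iteration in range(1, 6):
    routing_delay_ns = place_and_route(logic_delay_ns)
    slack_ns = clock_period_ns - (logic_delay_ns + routing_delay_ns)
    print(f"pass {iteration}: logic {logic_delay_ns:.2f} ns + route {routing_delay_ns:.2f} ns, "
          f"slack {slack_ns:.2f} ns")
    if slack_ns >= 0:
        break
    logic_delay_ns -= 0.4   # "re-synthesize": restructure the path into fewer logic levels
```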
The implications for the tool chain are probably obvious. It has become almost impossible to do a good job if the synthesis and place-and-route steps are performed by separate tools. Doing the optimal design all but requires a single tool and data model that can iterate internally to find the best combination of logic structure, placement, and routing that will maximize performance, minimize power, minimize resource utilization, and reliably meet timing constraints. The model where the FPGA company produces the place-and-route tool and a third, independent party produces the synthesis tool is fundamentally flawed. Of course, there are examples in which third parties such as EDA companies have, in very close cooperation with FPGA companies, produced synthesis tools that work well with FPGA-company-produced place-and-route, but those examples are few and are extremely difficult to maintain.
But, why are place-and-route tools produced only by the FPGA companies?
In the ASIC world, just about every place-and-route tool comes from an independent EDA company. The various silicon vendors have crafted their libraries and processes around the idea of independent place-and-route. The FPGA problem, however, is much more complex. Since each FPGA family (or at least each FPGA company) has a unique structure for its fundamental logic element, and since each device has a unique configuration of those logic elements and the interconnect between them, place-and-route must practically be co-designed with the silicon itself.
When an FPGA company is developing a new chip, they typically run huge suites of designs through place and route, iterating both the place-and-route tool and the physical layout of the chip until they reach a combination where utilization (the percent of the logic on the chip that can be effectively utilized) and routability (the probability of getting 100% completion of all required routes) are acceptable. This co-design of the layout tools with the silicon is essential to getting competitive results with today’s technology in today’s markets.
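A hedged sketch of what that acceptance test might boil down to (the benchmark figures, device size, and thresholds below are all invented for illustration, not any vendor’s actual criteria):

```python
# Toy evaluation of one candidate FPGA layout against a suite of benchmark designs.
# utilization = fraction of on-chip logic the tools could actually use;
# routability = fraction of designs that reached 100% routing completion.

benchmark_results = [
    # (LUTs used, LUTs available, fully routed?)
    (81_000, 100_000, True),
    (92_000, 100_000, False),
    (76_000, 100_000, True),
    (88_000, 100_000, True),
]

utilization = sum(used / avail for used, avail, _ in benchmark_results) / len(benchmark_results)
routability = sum(routed for _, _, routed in benchmark_results) / len(benchmark_results)

print(f"average utilization: {utilization:.0%}, routability: {routability:.0%}")
if utilization < 0.80 or routability < 0.95:
    print("Not acceptable yet: adjust the interconnect fabric and/or the P&R algorithms and re-run.")
```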
It’s a small leap of logic to see that: if there is an enormous advantage to having the silicon and the place-and-route co-designed, and there is an enormous advantage to having the place-and-route and synthesis co-designed, there is probably an enormous-squared advantage to having all three elements designed together. This conclusion is backed up by the fact that Synopsys, Synplicity, Mentor Graphics, Exemplar Logic, and many other companies have tried and failed to make a significant, sustainable business doing third-party FPGA synthesis.
There have also been numerous attempts to create open-source solutions for FPGA design. But, in addition to the problems cited above, there is also a major problem with engineering expertise. In the open-source software world, the creation of software tools for software engineers is a perfect storm of expertise. The people who are motivated to participate in the open-source effort are software experts, and they are creating tools for software experts like themselves. They have both the expertise to create the tools and the insight to know what tools should be created and how they should work.
In the hardware-centric world of FPGA, however, the end users of FPGA tools generally do not have the expertise to actually write tools. The community of engineers with both the software skills to create complex tools and the algorithmic and hardware-engineering skill and understanding to make those tools work is very small and is mostly gainfully employed at either an EDA company or an FPGA company. The number of free agents in the world with that particular combination of expertise and experience is vanishingly small – especially compared with the number of engineers required to create and maintain a fast-evolving design tool suite.
Also, in the software world, software engineers working in other areas can justify “volunteering” their time to support open-source projects that directly help themselves and their colleagues in their “normal” jobs. But since the audience for FPGA tools is different from the community of engineers required to develop them, no such virtuous cycle exists. There just has never emerged a critical mass of capable developers with the time, energy, and resources required to create a viable, sustainable open-source FPGA tool ecosystem.
So, for the foreseeable future, we are left with a situation where the tools for FPGA design (at least the implementation portion) will be predominantly supplied by the FPGA companies themselves. Competition that might push those tools to evolve will be competition between the entire ecosystems – tools combined with silicon, IP, reference designs, boards, and support. So, rather than choosing an FPGA tool you like, or even an entire FPGA tool suite you like, you’re really left with choosing only an FPGA company that you like.
Of course, as we move up the chain above the level of implementation tools – to high-level languages and synthesis, model-based design, graphical and other design creation methods, simulation, emulation and other debugging and verification tools, there is much more diversity and many more options – owing to the cleaner separation between these software tools and the bare metal of the FPGA itself. In this realm, there is definite hope for a robust, competitive environment for selecting the tool that best suits your team’s needs and preferences.
The complexity of P&R and synthesis tools is also cited as a barrier to entry for new FPGA companies. Several new architectures have appeared but failed to make an impact because the hardware guys underestimated the importance of these tools.
The high-end FPGA business is a systems business. What is being sold is the combination of the hardware and the tools needed to design a hardware system around it.
One analysis I have seen shows that the maximum OEM price for a chip is less than $500, based on chip size, yield and wafer cost. It uses the principle that the minimum yield must be greater than 1%, or the yield will be exactly zero.
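Roughly, the arithmetic behind a bound like that looks as follows; every number here is an illustrative assumption, not a figure from that analysis:

```python
# Back-of-the-envelope bound on the silicon cost of a large FPGA die.
# All inputs are illustrative assumptions, not real foundry figures.

wafer_cost = 10_000        # dollars per processed wafer (assumed)
dies_per_wafer = 200       # very large die (assumed)
yield_fraction = 0.50      # assumed production yield
margin_multiplier = 2.0    # assumed markup from silicon cost to OEM price

silicon_cost = wafer_cost / (dies_per_wafer * yield_fraction)
oem_price_bound = silicon_cost * margin_multiplier

print(f"silicon cost per good die: ${silicon_cost:,.0f}")     # $100 with these assumptions
print(f"implied OEM price bound:   ${oem_price_bound:,.0f}")  # $200 with these assumptions

# At the 1% yield floor, the same wafer gives only 2 good dies,
# i.e. $5,000 of silicon cost each, and the economics break down.
```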
High end FPGAs can sell for $5,000 each. This implies that what is being bought is the combination of the silicon and the tools. The silicon becomes a dongle for the tools.
So it would be expensive for the chip+tools vendor to split them apart.
If you look at the ASIC design flow on Wikipedia, functional verification is done before synthesis and includes logic simulation.
Altera ModelSim requires a netlist created in the compile step, which includes synthesis and probably place and route.
Meanwhile only a small percentage of paths are critical anyway.
It is the design process that needs to be fixed.
FPGAs are just too big to run synthesis for every minor iteration.
Never mind that an EDA company is touting 3X synthesis time improvement.
Logic/function design is independent of the physical implementation, since it can be code running in a CPU or in hardware. Hardware can be an ASIC, an FPGA, or relays/switches.
Logic simulation should be the first step. And whoever sold HDL for logic design entry was an idiot, not a designer. Get the logic right, then do physical design that faithfully preserves the function. It works; I have done it more than once.
I agree that open source FPGA code will be difficult for all the reasons pointed out in this article. However, I do at least applaud the major FPGA vendors for providing their tools on open source platforms. Nothing irritates me more than hardware vendors who provide embedded OS solutions based on Linux but then expect you to develop for the same on Windows.
I do wish the FPGA vendors would do a better job of supporting the open source ecosystem, though. For example, making it easier to integrate HDL code development with revision control systems such as git or subversion, and better support for automation through scripting languages, makefiles and the like.
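Even a small scripted build that a makefile or CI job can call would go a long way. A minimal sketch, assuming a Vivado-style non-project batch flow (the part number, file layout, and top-level name below are placeholders):

```python
#!/usr/bin/env python3
"""Minimal scripted FPGA build, callable from a makefile or a CI job.

A sketch only: it assumes a Vivado-style batch flow, and the part number,
file layout, and top-level name are placeholders.
"""
import subprocess
from pathlib import Path

SOURCES = sorted(Path("rtl").glob("*.v"))        # HDL files kept under version control
CONSTRAINTS = Path("constraints/top.xdc")
BUILD_TCL = Path("build/build.tcl")

def write_tcl() -> None:
    """Generate the Tcl script the vendor tool will execute."""
    BUILD_TCL.parent.mkdir(parents=True, exist_ok=True)
    lines = [f"read_verilog {src.as_posix()}" for src in SOURCES]
    lines += [
        f"read_xdc {CONSTRAINTS.as_posix()}",
        "synth_design -top top -part xc7a35tcpg236-1",   # placeholder part number
        "place_design",
        "route_design",
        "write_bitstream -force build/top.bit",
    ]
    BUILD_TCL.write_text("\n".join(lines) + "\n")

if __name__ == "__main__":
    write_tcl()
    subprocess.run(["vivado", "-mode", "batch", "-source", str(BUILD_TCL)], check=True)
```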
Why not begin with open source libraries?
For all the main computer programming languages one can find a lot of excellent open source libraries (like, for instance, Boost for C++). These libraries also have the added value of having been designed by experts, and they are of excellent pedagogical value for learning good design in the chosen languages.
But nothing for VHDL or Verilog… And most open-source examples available are rather badly written, as if by beginners.
@rdj Yes, I’m also grateful that most tools work fine on Linux nowadays, since most of the other development is done there. If you are interested in better tools for structuring HDL code, I can recommend a tool called FuseSoC (http://github.com/olofk/fusesoc), which I have been working on for some time now. It is intended to handle dependencies between cores and to provide tool flows for simulation and bitstream generation (it currently supports three FPGA backends and six simulators).
@JChMathae I agree that it can be hard to find high-quality Open Source IP cores, but we are working on improving the situation. Last year we formed the FOSSi Foundation, a vendor-neutral non-profit to promote and assist Open Source Silicon in industry, academia, and for hobbyists. At our yearly conference (orconf) last year, we collected insights from the people there about what’s keeping people from using Open Source HDL code. It was very clear that the two main things keeping people from using existing cores were visibility and trustability, meaning that you don’t know where to find a core, and if you do, you don’t know if it’s any good. We have made it a high priority to address this within the FOSSi Foundation, and we will actually reveal some of our first steps at orconf in Bologna later this week.
On a general note, I am more OK with keeping the internal details of the FPGA closed, but I would love to see the frontend side being opened up. Many of the bugs I come across are related to, for example, the GUI or HDL parsing. I don’t think those should be counted as “hard-earned intellectual property that they don’t want to expose to the world,” but they limit productivity, as it usually takes over a year for a bug to be fixed, if they fix it at all. Compare this to bugs found in Yosys, Icarus Verilog, and Verilator, where I have gotten patches the same day, and even if no one would fix it for me, at least I have the option to do it myself. Time is money, and waiting for bug fixes is very expensive.
The other problem with the current FPGA tools is that they are huge monolithic things running on desktops and servers, which prevents quick incremental changes and reconfiguration from within an embedded system. While this might not be an option for bigger FPGAs, I anticipate the need for it on smaller FPGAs which might serve as peripheral controllers or small acceleration units. One very interesting project is Project IceStorm, which is a completely open source reimplementation of synthesis, P&R, and bitstream generation for the small Lattice (previously SiliconBlue) iCE40 devices. The whole toolchain can be run on a Raspberry Pi, and this has opened the door to some innovative new products.
Long reply, but the short answer is: we might not see _completely_ open FPGA toolchains for a long time, but it makes sense to start opening up parts of them, because the industry is starting to take an interest in Open Source Silicon now.
I agree with the fundamental problems that prevent open source initiatives for the implementation tools, but most of my time as an FPGA developer is spent here:
* Design entry: Open-source (O) Emacs and closed-source (C) Sigasi Studio which builds on Eclipse (O)
* Test framework: VUnit (O)
* Simulator: Mainly Riviera-PRO (C) but I also use GHDL (O) on a regular basis
* Version control system: Git (O)
* Code review: Gerrit (O)
* Scripting: Python (O)
* Continuous integration: Jenkins (O)
I don’t think I would be able to create a competitive tool chain without open source.
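As an illustration, the VUnit part of that flow is driven by a small Python script along these lines (library and path names are just placeholders):

```python
# run.py -- VUnit test runner; library and path names are placeholders.
from vunit import VUnit

vu = VUnit.from_argv()                  # simulator and options come from the command line
lib = vu.add_library("lib")
lib.add_source_files("src/*.vhd")       # design under test
lib.add_source_files("test/tb_*.vhd")   # VUnit testbenches
vu.main()                               # compiles, runs all test cases, reports pass/fail
```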
You don’t have the tools because the computer science guys have not worked out how to make parallel programming easy. However, I do have an open-source language solution –
http://parallel.cc
– an extended C++ that supports synchronous and asynchronous fine-grained threading. RTL needs to be replaced by an asynchronous FSM approach to move forward.
Repeated attempts to get my friends/acquaintances at Altera and Xilinx to take an interest have gone nowhere – so you can blame them if you like.
I also have an alternative plan that works with plain old C/C++ (get in touch if you are interested).
@Kev: I assume RTL means VHDL/Verilog, but if you are complaining about transfers between registers, I have news for you: it is the logic in the data path between registers that does the processing. And that is only half of the story, because the conditions that exist when the result is valid must also enable the transfer.
Synchronous or asynchronous simply means how the time to capture the result is determined.
I know from experience how difficult it is to determine the time asynchronously and so far very few asynchronous designs have succeeded.
It is much like multi-core/parallel programming trying to identify and handle data dependencies.
It is not Altera and Xilinx that need to be convinced; it is the design community. If it is not Tcl-scripted HDL running on Linux, no one cares, and it gets ignored.
As I recall ParC made the programmer specify what is parallel, but in the real world of chip design it is the conditions/events that determine the sequences. The data flow must be designed for parallel processing — if there is only one ALU, guess what? No parallel processing.
FSMs do not rule the world, and programmers are not skilled in data-flow design; neither are programming languages (nor are HDLs).
@Karl –
Asynchronous design is different from asynchronous implementation. The problem with RTL is that designers are defining the clocking scheme and the logic together, when you really just want to define the logic and let the tools handle the clocking for you.
ParC is just C++ with the extra semantics of HDLs and message passing to support CSP style software (http://usingcsp.com) – aka asynchronous FSM, or event driven programming, coming whether you like it or not –
http://www.robert-drummond.com/2015/04/21/event-driven-programming-finite-state-machines-and-nodejs/
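Stripped of anything ParC-specific, the style is just a state machine driven by messages on a channel; a minimal Python sketch:

```python
# Event-driven FSM: no clock, the machine only advances when a message arrives.
from queue import Queue

def receiver(events: Queue) -> None:
    state = "IDLE"
    while True:
        event = events.get()            # block until the next event (CSP-style channel)
        if event == "stop":
            break
        if state == "IDLE" and event == "start":
            state = "BUSY"
        elif state == "BUSY" and event == "done":
            print("transaction complete")
            state = "IDLE"

if __name__ == "__main__":
    q = Queue()
    for e in ["start", "done", "start", "done", "stop"]:
        q.put(e)
    receiver(q)   # in a real design this would run in its own thread, one per FSM
```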
The one-ALU thing is incorrect in that you often only want one of an expensive resource, and it’s up to the compilers to multiplex it appropriately between threads.
@Kevin –
What makes open source compilers and operating systems successful is that each vendor’s compiler and OS teams do the porting and long-term maintenance for each of that vendor’s supported architectures. There is help from the community, but the primary effort is funded and paid for by the vendor in the form of their in-house developers.
With this model, each vendor pays for a portion of the tools with its staff budget. What is saved is that each vendor doesn’t have to pay the FULL cost of maintaining a completely proprietary tool chain and operating system.
The same shared development vendor tool chain model can, and should, be applied to FPGA and ASIC tools.
Once patents have been filed, there should not be a significant barrier to exposing low-level FPGA implementation data in the form of open-source low-level device descriptions for open-source P&R tools.
At this point, the main points of your argument provide the compelling case for all the FPGA and ASIC vendors to step up to the plate, and start sharing development with industry standard open source tools. Just like the computer vendors did for compilers and operating systems.
@Kev –
What makes open source successful is the richness of alternative implementations, like yours, that explore new areas. And at the same time, what makes open source difficult for production environments is in-house developers unnecessarily adopting too many different tools that become unsupported later.
Once a case has been made for a new alternative implementation, the best end result is making the case for that technology to become part of the mainstream tool chain. A case in point is OpenMP, for both fine-grained and coarse parallelism in C/C++, which is widely implemented in GCC.
This is where standards bodies for the tool chain are important … both to protect the integrity of the tools, and the integrity of long term support for end user development communities.
@Dwyland –
And that is why using an open source FPGA & ASIC tool chain with shared development by Intel, Xilinx, IBM, Atmel, Fujitsu, Hitachi, and all the smaller FPGA/ASIC vendors would become a huge win.
Each of these vendors dumps huge amounts of money into its tool chain, re-inventing the wheel every product cycle, and that tool chain money inflates the cost of end-user silicon solutions.
And every last one of these proprietary tool chains becomes a boat anchor, slowing the evolution of high-quality, advanced tool chains for the FPGA and ASIC communities. And, sadly, slowing the advance of silicon progress too.
All simply because no single FPGA or ASIC vendor has the budget to radically advance the state of the art in tool chain solutions.
Intel understands this because of the significant success GCC/Linux has had supporting their microprocessor architectures. It would make sense for Intel to push/release their ASIC and Altera tools into the open source community as a first step … and let the rest of the FPGA/ASIC vendors try to catch up.
For Intel, that would be nearly free … it would cost every other vendor the porting effort to align to the defacto Intel standard open source FPGA/ASIC tool chains.
Or maybe it will be IBM, or Xilinx, that is first?
We saw the same happen with early NUMA and cluster Linux architectures … and with time, and some standardization efforts, the vendors’ OSes slowly merged into a standard offering. Ditto for CPU architectures.