
Calypto Refreshes HLS

High-level synthesis (HLS) recently got a round of improvements. Calypto’s Catapult 8 represents yet another fundamental renewal of an EDA tool, aimed at improving ease of use and quality of results.

Let’s review some basics. HLS generally refers to the use of C or C++ for specifying untimed design behavior. That’s actually caused some confusion, since SystemC is based on C++. So, loosely speaking, folks may use the term HLS to refer to designs built with either language.

To my mind, the aspect of HLS that really changes the game is the move from a hardware description language, where the focus is on hardware structure, to a software language, where the focus is on behavior. In other words, you write an untimed algorithm (that is, with no notion of a clock) in C or C++ and then let a tool create the hardware structure (in RTL) and the timing that implements the behavior. This is an enormous transformation.
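
To make that concrete, here’s the sort of untimed C++ you’d hand to an HLS tool: a small FIR filter written purely as behavior, with no clocks, registers, or ports anywhere in the source. (The function, types, and tap count here are my own illustration, not anything from Calypto.)

    #include <array>
    #include <cstddef>
    #include <cstdint>

    // Untimed behavioral C++: a 4-tap FIR filter expressed purely as an algorithm.
    // No clocks, registers, or ports appear here; the HLS tool decides how many
    // cycles the computation takes and what hardware structure implements it.
    // (Illustrative example; the data types and tap count are arbitrary.)
    int32_t fir4(const std::array<int16_t, 4>& coeff,
                 const std::array<int16_t, 4>& taps)
    {
        int32_t acc = 0;
        for (std::size_t i = 0; i < taps.size(); ++i) {
            acc += static_cast<int32_t>(coeff[i]) * taps[i];  // multipliers may be shared or replicated
        }
        return acc;
    }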

SystemC, by contrast, is a hardware language; it just happens to rely on C++ syntax and semantics. While a “pure” C/C++ algorithm specifies behavior, not structure, SystemC specifies structure. That has its benefits, but the transformation to RTL is much less dramatic.
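
For contrast, here’s a minimal SystemC sketch of an equally trivial block. Even at this size, the structure is explicit: ports, a clock, and a clocked process all appear in the source before any tool gets involved. (Again, this is my own illustrative example, not Calypto’s.)

    #include <systemc.h>

    // SystemC: C++ syntax, but the description is structural. The ports, the
    // clock, and the clocked process are all explicit in the source before any
    // tool is involved. (Minimal illustrative sketch.)
    SC_MODULE(Accumulator) {
        sc_in<bool>         clk;
        sc_in<bool>         rst;
        sc_in<sc_int<16> >  din;
        sc_out<sc_int<32> > sum;

        void tick() {
            if (rst.read()) {
                sum.write(0);
            } else {
                sum.write(sum.read() + din.read());  // updated on each rising clock edge
            }
        }

        SC_CTOR(Accumulator) {
            SC_METHOD(tick);
            sensitive << clk.pos();  // explicit clocking: timing is part of the source
        }
    };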

Calypto makes reference to “language wars”: people arguing over which language is better. My sense has been that tool makers lobby for the language they support; that language may or may not be the best tool for a given job. Well, in Catapult 8, Calypto neutralizes this debate by supporting both untimed C/C++ and SystemC. Discussion over.

Catapult’s biggest contribution, historically, has been the creation of hardware from an untimed C/C++ algorithm. Originally created by Mentor Graphics, it was largely used by big companies (historically, it’s not been viewed as inexpensive) and wasn’t a push-button process. But over the years, many of the difficult issues have been handled, removing barriers to use.

So… why isn’t everyone using it then? I mean, what could be more attractive than being able to take a high-level snippet of code and have hardware created for it? (Besides price?)

That’s a question Calypto asked, and they identified three things to improve in Catapult 8:

  • Hierarchy management for incremental design and minimizing the impact of ECOs
  • Better tie-in with verification schemes
  • Power optimization

The ECO problem is the same one that has long plagued RTL flows: at the end of the design cycle, when you want to make a small change, the whole design gets resynthesized, undoing months’ worth of work. The design process has been top-down only, so every change forces that complete recompile.

Catapult 8 allows better management of the hierarchy, automatically maintaining metadata for each block (and allowing engineers to annotate metadata below the block level). If something in one block is changed, it’s possible to lock the other blocks down so that they’re not affected. This puts some bottom-up back into the process.

This also means that a design can proceed incrementally, getting portions working, locking them down, and then working on new portions. You can even plug together pre-synthesized blocks. This change has opened up design capacity as well, allowing 10x bigger designs.
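
In C/C++-based HLS flows, that hierarchy is commonly expressed through the function call tree: sub-functions become candidate sub-blocks that can be synthesized, verified, and then left alone. A rough sketch of what such a partition might look like follows; the function names are mine, and the actual locking happens in the tool, not in the source.

    #include <cstdint>

    // Hierarchy expressed through the call tree: each sub-function is a candidate
    // sub-block. Once color_convert() is synthesized and locked in the tool, later
    // edits to sharpen() needn't disturb it. (Illustrative partition only; the
    // locking itself is a tool setting, not anything written in the C++.)

    // Stable sub-block: already working, so it gets locked down.
    static void color_convert(const uint8_t* in, int16_t* out, int n)
    {
        for (int i = 0; i < n; ++i) {
            out[i] = static_cast<int16_t>(in[i]) - 128;  // offset-binary to signed
        }
    }

    // Sub-block still being tuned; edits here shouldn't force resynthesis of the
    // locked block above.
    static void sharpen(const int16_t* in, int16_t* out, int n)
    {
        out[0] = in[0];
        for (int i = 1; i + 1 < n; ++i) {
            out[i] = static_cast<int16_t>(2 * in[i] - (in[i - 1] + in[i + 1]) / 2);
        }
        out[n - 1] = in[n - 1];
    }

    // Top level: the call structure defines the block hierarchy the tool manages.
    void pipeline_top(const uint8_t* src, int16_t* dst, int n)  // assumes 2 <= n <= 1024
    {
        static int16_t stage[1024];  // scratch buffer between the two stages
        color_convert(src, stage, n);
        sharpen(stage, dst, n);
    }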

As to verification, apparently the old code generation didn’t play so well with UVM. One practical impact was that verification tools might raise a number of issues against the Catapult-generated RTL. But that RTL isn’t really meant to be read by engineers – the whole point is to keep designers focused on the higher level – and yet the tools didn’t translate those issues and error messages back up to that higher level.

This is partly a consequence of the shift from describing structure to describing behavior. With C/C++, you can improve and speed up your functional coverage at the C/C++ level, but because there’s no structure yet at that level, you’re not getting structural coverage – that’s what you’re left to deal with when doing verification after you’ve created the RTL.
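
Functional coverage at the C++ level can be as simple as instrumenting the algorithmic testbench to record which behavioral conditions the stimulus actually exercised. Here’s a minimal hand-rolled sketch (the design, bins, and stimulus are invented for illustration). Note how purely random stimulus can leave the corner-case bins unhit, which is exactly the kind of hole a coverage report at this level exposes.

    #include <cstdint>
    #include <cstdio>
    #include <cstdlib>
    #include <limits>

    // Design under test: a saturating adder (behavioral C++).
    int32_t saturating_add(int32_t a, int32_t b)
    {
        const int64_t s = static_cast<int64_t>(a) + b;
        if (s > std::numeric_limits<int32_t>::max()) return std::numeric_limits<int32_t>::max();
        if (s < std::numeric_limits<int32_t>::min()) return std::numeric_limits<int32_t>::min();
        return static_cast<int32_t>(s);
    }

    int main()
    {
        // Functional coverage bins: did the stimulus ever exercise each behavior?
        bool hit_normal = false, hit_sat_high = false, hit_sat_low = false;

        for (int i = 0; i < 100000; ++i) {
            const int32_t a = std::rand() - RAND_MAX / 2;  // naive random stimulus
            const int32_t b = std::rand() - RAND_MAX / 2;
            saturating_add(a, b);

            const int64_t s = static_cast<int64_t>(a) + b;
            if (s > std::numeric_limits<int32_t>::max())      hit_sat_high = true;
            else if (s < std::numeric_limits<int32_t>::min()) hit_sat_low = true;
            else                                              hit_normal = true;
        }

        // With purely random stimulus, the saturation bins may never be hit;
        // that gap is precisely what this kind of report makes visible.
        std::printf("coverage: normal=%d sat_high=%d sat_low=%d\n",
                    hit_normal, hit_sat_high, hit_sat_low);
        return 0;
    }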

So Calypto worked with a number of their customers to make changes that allowed HLS to drop into common verification flows. In particular, it:

  • automatically synthesizes assertions and cover points (sketched below);
  • identifies points that you can test for equivalence;
  • allows cross-probing between RTL and C/C++; and
  • integrates with formal tools to identify unreachable states.
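
On that first point, source-level checks are the natural raw material: assertions written against the behavioral C++ that the tool can then carry into the generated RTL. Here’s a rough sketch of what such checks might look like, using plain assert() since I’m not reproducing Catapult’s own assertion syntax.

    #include <cassert>
    #include <cstdint>

    // Source-level checks in the behavioral C++. The claim is that the tool can
    // turn checks like these into assertions and cover points in the generated
    // RTL; plain assert() is used here because this is a generic illustration,
    // not Catapult's own assertion syntax.
    uint16_t window_average(const uint16_t* samples, unsigned len)
    {
        assert(samples != nullptr);    // interface assumption worth carrying into RTL
        assert(len > 0 && len <= 64);  // design-level bound on the window size

        uint32_t sum = 0;
        for (unsigned i = 0; i < len; ++i) {
            sum += samples[i];
        }
        assert(sum <= 64u * 0xFFFFu);  // accumulator stays within its 32-bit width

        return static_cast<uint16_t>(sum / len);
    }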

As to power reduction, design choices in prior versions had to be implemented in the design source. That process now relies on constraints expressed separately. The design source code, which specifies the behavior, can remain unchanged while, off to the side, you play with constraints to achieve your power goals. The strategies involved include deciding which frequency to run at, deciding which resources can be shared, automatically implementing clock gating, and minimizing memory accesses.
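
To picture that flow: the behavioral source below stays exactly as written, while tool-side constraints determine what the generated RTL looks like and what it burns. The comments mark where those knobs would act; this is a generic illustration, not Calypto’s constraint format.

    #include <cstdint>

    // The behavioral source stays as written; power choices live in tool-side
    // constraints rather than in the code. (The comments describe generic HLS
    // knobs, not Calypto's actual constraint syntax.)
    int32_t dot_product(const int16_t* a, const int16_t* b, int n)
    {
        int32_t acc = 0;
        // Constraint-side decisions, made without touching this source:
        //  - the target clock frequency the schedule must meet
        //  - how far to unroll this loop, i.e. whether multipliers are shared or replicated
        //  - whether clock gating is inserted automatically for idle registers
        //  - how the arrays map to memories and how often those memories are accessed
        for (int i = 0; i < n; ++i) {
            acc += static_cast<int32_t>(a[i]) * b[i];
        }
        return acc;
    }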

Finally, they’ve assembled a library of functional components optimized for use with Catapult 8. They call it Catware, and it’s basically an IP collection of functions that they find their customers frequently need. The intent is to get designers spun up faster.

You can get more details in their announcement.

