feature article

Hot Links

Are Symbolic Links Evil?

This topic was going to be a blog post, but, when the dust settled, it had turned into an article (as you can see by the length). You see, what should have been a simple discussion turned out to be anything but. And there’s a tendency in some corners of our industry not to come out and deal with an issue directly, but to whisper things in blogs and anonymous posts here and there, raising more questions than they answer.

My desire in what follows was originally to clear up the facts. I must confess that I have not come out of my research feeling like I have a complete handle on what all the facts are – and contradictory claims remain. So I invite both the companies involved and anyone else with experience or an interest in the matter to weigh in via comments.

I guess it would be helpful here for me to mention what this whole kerfuffle is about: it’s yet more noise in the tiny Design Management (DM) corner of the EDA world. Honestly, the perception in this world tends to be that IC Manage does or says something provocative, and a dust-up ensues. My goal at this point isn’t to adjudicate good guys and bad guys, but rather to sort through the rubble to figure out what the issues are.

Let’s start with some background, which may be review for many of you. DM tools exist to manage the complex sets of files and “views” that go primarily with analog design. They can be used for RTL design as well, but because RTL design behaves a lot like software design, many logic designers just use software configuration management (SCM) tools instead of something more specialized.

There are four major players, as outlined in the DM article link above: the big (and quietest) guy is Dassault, with their Enovia/Synchronicity tool. They and Cliosoft have built their systems from the ground up, so to speak. By that, I’m contrasting them with IC Manage and Methodics, which tend to wrap existing SCM tools like Perforce and Subversion. IC Manage, in particular, runs over Perforce.

At issue here are the mechanics by which files are managed. Any design has an enormous number of associated files – both source files and the attendant views that accompany them. We’re talking tens and hundreds of thousands of files. The issue relates to how a workspace is populated with the information needed for a designer to do work. That many files flying through the network could reasonably be expected to cause delays, and everyone is looking for ways to manage that.

There are three fundamental approaches, two of which are the most common.

  • Mirrors appear to be an older approach. A mirror contains a common workspace from which multiple users can work. Because it can house only a single version of a design, an update from one designer might cause a new version of some file to be “pushed” onto other designers when they don’t want it. That said, according to Dassault, who appears to be the only one of the four to use mirrors, numerous large semiconductor vendors use this approach (which is but one option that Dassault offers; Dassault users aren’t forced to use mirrors).
  • Caches provide users with a more direct way to control their own workspaces. They can handle multiple versions at the same time, and so there is no longer an issue of updates being pushed. The most “straightforward” caching approach avoids the cost of caching every single file in the design by using soft links or symbolic links (“symlinks”) – or even hard links (which Dassault has as an option for yet faster performance). This means that the workspace is populated with links to original files, and the files themselves are read only if actually needed.
  • A virtual file system (VFS) wraps the existing file system with a new one that has behaviors and properties customized to the DM problem. The oldest (and, up until recently, only) example of this is ClearCase. Other terms you might hear referring to this approach are “versioned object base” or “VOB” or “FUSE-based approach,” where FUSE stands for “Filesystem in Userspace.” The latter refers to the fact that, normally, a file system is an intimate lower-level aspect of any OS, and, in a Unix-like system, it’s handled in the kernel. A FUSE approach implements another file system up in user space.
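
To make the symlink-cache idea concrete, here is a minimal Python sketch. It is not any vendor's implementation. It assumes a POSIX file system and a hypothetical layout in which a shared cache directory holds one version of the design; the function name `populate_workspace` and the file names are invented for illustration.

```python
import os
import tempfile
from pathlib import Path


def populate_workspace(cache_dir: Path, workspace: Path) -> int:
    """Create one symlink in `workspace` per file under `cache_dir`.

    Returns the number of links created. No file contents are copied.
    """
    count = 0
    for src in cache_dir.rglob("*"):
        if not src.is_file():
            continue
        dest = workspace / src.relative_to(cache_dir)
        dest.parent.mkdir(parents=True, exist_ok=True)
        os.symlink(src, dest)  # the workspace entry only points at the cache
        count += 1
    return count


def demo() -> tuple[int, bool, str]:
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical cache holding version "v1" of a tiny library.
        cache = Path(tmp) / "cache" / "v1"
        (cache / "libA").mkdir(parents=True)
        (cache / "libA" / "cell.sch").write_text("schematic v1")
        (cache / "libA" / "cell.lay").write_text("layout v1")

        ws = Path(tmp) / "workspace"
        ws.mkdir()
        n = populate_workspace(cache, ws)

        link = ws / "libA" / "cell.sch"
        # Reading through the link is the first point at which data moves.
        return n, link.is_symlink(), link.read_text()


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is that populating the workspace touches only directory entries; file data is read only when a tool actually opens one of the links.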

The origin of this whole issue is that IC Manage has cited a number of problems with the more traditional symlink approaches, announcing their Views product as a solution. Views is a VFS approach: they’ve created their own file system that wraps Perforce’s capabilities (which lies over the native file system). The other guys, all of whom use (or at least offer) symlink-based approaches, are firing back. So what gives?

IC Manage has reinforced their offering with data that they have gathered from a survey they did. One of the critical questions, and the associated answers, was as follows:

“What are your biggest issues with using symbolic links for disk space management?”

    • There were 524 respondents
    • 75% of them had experience with symlinks
      • 28% of those had no issues
      • 72% had issues; of those:
        • 49% had an issue with the lack of control of mirrors, including forced updates
        • 32% cited instability when versions were removed from caches to save space
        • 28% cited security issues relating to file permissions
          (These add to > 100% because you could cite > 1 issue)
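
Because the percentages are nested, it can help to turn them into rough head counts. The short Python sketch below does only that arithmetic; the counts are approximate, since the published percentages are themselves rounded, and the dictionary keys are labels chosen here for illustration.

```python
def survey_counts(respondents: int = 524) -> dict[str, int]:
    """Convert the reported nested percentages into approximate head counts."""
    with_symlinks = respondents * 0.75   # had experience with symlinks
    with_issues = with_symlinks * 0.72   # of those, reported at least one issue
    return {
        "used_symlinks": round(with_symlinks),
        "no_issues": round(with_symlinks * 0.28),
        "had_issues": round(with_issues),
        "mirror_control": round(with_issues * 0.49),
        "cache_instability": round(with_issues * 0.32),
        "permissions": round(with_issues * 0.28),
    }


if __name__ == "__main__":
    for label, count in survey_counts().items():
        print(f"{label:>18}: ~{count}")
```

Put another way, roughly 283 of the 524 respondents, or about 54% of everyone surveyed, reported some issue.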

Now… admittedly, the question is prejudicial (“How long has it been since you stopped beating your wife?”). However, they say that this same survey has been given on a yearly basis, so it wasn’t just a one-shot made-for-release event. They say that results have been consistent over the years. They also didn’t survey only their own customers: they rented a list, of which it was estimated that around 3% were IC Manage customers. And the possible answers from which people chose included the equivalent of, “I don’t beat my wife,” so respondents weren’t forced into suggesting there was a problem if there wasn’t.

So, taking all of that at face value, it seems like reasonable data. And the IC Manage Views product is marketed as addressing the concerns voiced by the respondents (with Broadcom going on record with a specific endorsement).

Needless to say, the other participants in the market don’t agree with the conclusions. Here are some of the general discussion points:
 

  • IC Manage suggests that symlinks are brittle; that versions removed to save space can break things.

Methodics also suggested that having tens of thousands of symlinks creates a bottleneck. Now, Methodics and Dassault both use symlinks, but, unlike Cliosoft, not to individual files. Methodics uses one symlink per library; Dassault uses one per “module.”

Cliosoft, on the other hand, argues that, rather than causing a network bottleneck, using symlinks reduces traffic since only individual files that actually have to be read – a small fraction of the ones in a project at any given time – need to be transferred; the symlinks themselves are small and keep the traffic down.

I’m led to understand that the number of inodes (“index nodes”) used to be an issue, but that’s no longer the case. Regardless, whether you use a file or a symlink, either one uses an inode, so one is not better than the other from that standpoint.
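
The inode point is easy to check on a typical Unix file system. The Python sketch below (file names are made up) creates a file, a symlink to it, and a hard link to it, then compares inode numbers. On common Linux file systems the symlink gets an inode of its own, while the hard link, which the caching bullet earlier mentions as an option, is the one case that shares the original file's inode.

```python
import os
import tempfile
from pathlib import Path


def inode_report() -> dict[str, int]:
    """Return inode numbers for a file, a symlink to it, and a hard link to it."""
    with tempfile.TemporaryDirectory() as tmp:
        original = Path(tmp) / "cell.sch"
        original.write_text("schematic")

        soft = Path(tmp) / "cell_soft.sch"
        hard = Path(tmp) / "cell_hard.sch"
        os.symlink(original, soft)
        os.link(original, hard)

        return {
            "file": os.stat(original).st_ino,
            "symlink": os.lstat(soft).st_ino,   # lstat inspects the link itself
            "hardlink": os.stat(hard).st_ino,
            "nlink": os.stat(original).st_nlink,  # names pointing at the inode
        }


if __name__ == "__main__":
    print(inode_report())
```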

As to things being broken by removed versions, the symlinks all appear to be used in caches, so rather than breaking something, a missing file would simply have to be re-read (admittedly taking time, but not qualifying as broken).
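
Here is a hedged Python sketch of that "re-read rather than break" behavior. It assumes a hypothetical `refetch` callback that stands in for whatever the DM tool does to pull a version back into the cache; none of the names come from a real product.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def read_through_link(link: Path, refetch: Callable[[Path], None]) -> str:
    """Read a workspace file, restoring its cache target if it was purged."""
    if link.is_symlink() and not link.exists():
        # exists() follows the link, so False here means the target is gone.
        target = Path(os.readlink(link))
        refetch(target)
    return link.read_text()


def demo() -> tuple[str, int]:
    calls: list[Path] = []

    def refetch(target: Path) -> None:
        # Stand-in for pulling the version back from the repository.
        calls.append(target)
        target.write_text("layout v1")

    with tempfile.TemporaryDirectory() as tmp:
        cached = Path(tmp) / "cache_cell.lay"
        cached.write_text("layout v1")
        link = Path(tmp) / "cell.lay"
        os.symlink(cached, link)

        cached.unlink()  # simulate the cache being trimmed to save space
        text = read_through_link(link, refetch)
        return text, len(calls)


if __name__ == "__main__":
    print(demo())
```

The dangling link costs one extra fetch; whether a given tool handles that case automatically is exactly the kind of detail the vendors disagree about.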
 

  • IC Manage includes a local disk cache so that data is closer and easier to read. (Writes are written through.)

Everyone seems to agree that having a local cache makes things faster. And IC Manage doesn’t appear to be alone in providing this. As to the benefit of read-local/write-through, apparently NFS already does that anyway – there’s some caching that goes on in the file system, so that’s not unique to Views.
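
For readers who haven't met the term, read-local/write-through can be sketched in a few lines of Python. This is a generic illustration using in-memory dictionaries, not a description of how Views or NFS is implemented; the class and path names are invented.

```python
class WriteThroughCache:
    """Local read cache in front of a slower backing store (both dicts here)."""

    def __init__(self, backing: dict[str, bytes]) -> None:
        self.backing = backing
        self.local: dict[str, bytes] = {}
        self.remote_reads = 0

    def read(self, path: str) -> bytes:
        if path not in self.local:
            self.remote_reads += 1          # cache miss: go to the server
            self.local[path] = self.backing[path]
        return self.local[path]

    def write(self, path: str, data: bytes) -> None:
        self.local[path] = data
        self.backing[path] = data           # write-through: server updated at once


def demo() -> tuple[int, bytes]:
    server = {"libA/cell.sch": b"v1"}
    cache = WriteThroughCache(server)
    cache.read("libA/cell.sch")
    cache.read("libA/cell.sch")             # second read is served locally
    cache.write("libA/cell.sch", b"v2")
    return cache.remote_reads, server["libA/cell.sch"]


if __name__ == "__main__":
    print(demo())
```

Repeated reads hit the local copy, while every write lands on the server immediately, so the server copy never lags behind the local one.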

The other caution was that most work isn’t done on a local machine: the local machine simply provides connectivity into the network servers, where the actual designs reside. By staying in the network servers, designs are backed up regularly, which they wouldn’t be if they were on a local machine.

Another benefit of network storage that Cliosoft takes advantage of is snapshotting: they take regular snapshots of the checked-out files so that if anything goes wonky and the files are corrupted, they can be reconstructed from the snapshots.

In addition, much of the processing that requires lots of files – like simulation – is sent by LSF to some machine in the farm. Because you don’t know which machine this is going to be, pre-caching doesn’t have much impact.

A general view is that caching with symlinks does the same thing that Views does, only explicitly and transparently. Views may do the same thing, but the system is proprietary and opaque. Methodics makes use of Perforce’s “shelving” feature, which they see as simpler and safer than a proprietary system.
 

  • There’s an argument that goes something to the effect of, “ClearCase did this years ago – no one else has taken that same approach (up to now).”

This isn’t a completely convincing argument, but it does reflect a couple of facts. One reflects an impression I got back in the late ’90s when I became aware of ClearCase. My sense was that, when you buy ClearCase, you also need to buy a dedicated IT person to manage it. Another comment, “The VOBs are down,” was reportedly not an uncommon thing to hear.

The essence of this is that such systems are very complex to build, and, because they’re not transparent, when things break, you’re stuck. It’s suggested that IT folks are extremely wary of implementing an entirely new, proprietary file system since file systems are so fundamentally basic to any computing infrastructure. I haven’t heard an IT person say that directly, so it may be speculation (if any of you out there are IT folks, feel free to chime in in the comments).
 

  • Mirrors bad.

OK, I oversimplified that. But not by much. No one seems to debate that mirrors have their issues with respect to pushed version updates. They appear to be an older technology that still has a following, but caching is far more common.
 

  • IC Manage had to do this because Perforce doesn’t optimize disk space in the way these other systems can.

If that’s the case, then the rest of the arguments start to sound like something of a red herring. Not being a Perforce user (and wanting to draw this discussion to a close at some point), I’ll leave that for others to comment on. (“I’m a doctor, not a polygon pusher!”)
 

  • Broadcom spoke of being able to populate 10,000 files in less than 15 seconds.

Cliosoft talked about doing 100,000 files in less time than that. (I don’t believe he added a “Neener” after that point.)
 

  • Views becomes a visible (pardon the pun) aspect of the design environment.

This is related to the concern that, if Views breaks, you’re out of luck. Looked at another way, this argument says that Views is a “bump in the wire” vs. other systems. Dassault, for example, says that, after you set their system up, it stays out of the way; it works inconspicuously in the background.
 

  • Symlink-using systems have security issues because you can’t manage permissions at the necessary level of granularity.

No one else seemed to see this as an issue; permissions are normally handled by the systems themselves (like the cache server), so individual designers aren’t getting in there and monkeying with that. In fact, for the most part, they can’t: they don’t have the right permissions.
 

  • Symlinks bad.

This more or less sums up the highest-level contention by IC Manage. In addition to the comments I’ve already given about symlinks, Dassault suggests that they’re not really the issue. Yes, everyone wants everything to go faster, but it’s not symlinks that are in the way. In general, a common theme was the fact that, at the end of the day, if you have to read a file, then you need that file to be opened. Whether that’s done with a hard link, symlink, or some other system, those bits will be carried over the network – there’s no getting around that. And if that’s the bottleneck, then these systems won’t help.

In fact, Cliosoft mentioned that an RTL design house, of all things, specifically picked them because they use symlinks. Oi!

If it were true that only Views was able to selectively open and cache just those files that were needed, it would be a different matter. But it seems like it’s simply another way of doing what’s already being done in symlinks; as far as I’ve been able to tell, there’s no huge savings in network bandwidth. I could be wrong…
 

I’m sure I’ve missed some issues here, and perhaps my interpretations would differ from those of any one or more of the guys I talked to. Or those of you reading this. If so, please sound off and give us your view of reality.

For the record, the official spokespeople that I talked with (both live and by repeated email) were:

  • Shiv Sikand and Dean Drako of IC Manage
  • Srinath Anantharam of Cliosoft
  • Simon Butler of Methodics
  • Rick Stanton of Dassault

I thank them (and their PR folks) all for their patience as I badgered them with emails.
