Two Cores When One Won’t Do

Synopsys Announces Dual-Core Module for ASIL-D

Do you trust your processor?

Yeah, you’re right; that’s not a fair question. If the question is reworded as, “Will your processor always give the correct result?” then the obvious comeback is, “Correct according to what?” If there’s a bug in the software, then the processor will give the correct – but not the desired – result.

So let’s assume good software. Now will the processor always give the correct – and desired – response?

Well, what if there’s a bug in the hardware? Of course, many of you reading this may well be deep in the throes of making sure that’s not going to be the case on your processor. As with software, it’s hard to guarantee that hardware has zero bugs. But, unlike software, great gobs of money and effort are expended on taking the bug count asymptotically close to zero.

So if we assume a good job has been done on verifying the processor, then can we (please) trust our processor?

Yes. Well… maybe. What are you doing with it? If you’re liking a dog picture on social media, then sure. But if my life depends on it? If it runs my car and is the main thing determining whether or not I become a roadside casualty, then… maybe not so much.

Even if the processor design team had truly discovered and resolved all issues, some of those issues aren’t binary. In particular, performance issues are verified to some level of confidence. It’s never 100%. Yeah, you can embrace 6σ, but what if some unlikely condition out at 7σ occurs?
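To put rough numbers on those tails, here is a minimal sketch. It assumes the quantity being verified follows a normal distribution and that only the one-sided upper tail matters; both are simplifying assumptions, and `upper_tail` is a name invented for this illustration. Under those assumptions, the chance of landing beyond 6σ is on the order of one in a billion, and beyond 7σ on the order of one in a trillion. Small, but not zero.

```python
import math

def upper_tail(k_sigma):
    """Probability that a standard normal variable exceeds k_sigma.

    Assumes a normal distribution, which real silicon may not obey.
    """
    return 0.5 * math.erfc(k_sigma / math.sqrt(2))

for k in (6, 7):
    # Print the one-sided tail probability for each sigma level.
    print(f"beyond {k} sigma: {upper_tail(k):.3e}")
```

Multiply a number like that by millions of cars running billions of cycles per second, and "unlikely" stops sounding quite so comforting.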

Then there are uncontrollables like alpha particles. Or silicon wear-out, or walking-wounded issues that manifest later. Some of these may be temporary, some fatal.

So the most tech-naïve of us knows that we can’t count 100% on simple technology all the time, and we make allowances when that web page doesn’t come up the first time or when our call drops.

Running the Critical Parts of the Car

There’s an old set of jokes about what it would be like if cars were run by Windows. Things like, every now and then having to pull over to the side of the road, shut the engine down, and restart it – for no particular reason. That’s all funny – until you realize that upcoming self-driving cars are going to feature technology nominally of the same sort that occasionally features a blue screen (whether or not branded by Microsoft).

So if we can’t 100% guarantee outcomes for so-called safety-critical operations – circuits in planes and trains and automobiles and medical devices and nuclear power plants – then how can we trust that those circuits won’t be our undoing?

In the automotive world, the ISO 26262 standard lays out expectations for different sorts of functions according to how likely a hazardous situation is to arise, how much control the driver has, and what the consequences of failure would be. The resulting ASIL ratings run from A (of least concern) to D (stuff better work or people could die).

So, out at that ASIL-D level, what do you do?

This concern has long been a factor in the mil/aero industries, where planes need to stay aloft and munitions must not deviate from their trajectories. One of the solutions there is referred to as “triple modular redundancy” (TMR). This idea, oversimplified, makes the assumption that, by tripling up the computing at critical nodes, if one processor has an issue (low probability if designed well), then the other two are even less likely to have the same issue. So in the event that all three processors don’t agree, a two-out-of-three vote settles the argument. Democracy in action!
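The two-out-of-three vote can be sketched in a few lines. This is an illustration of the voting idea only, not any particular vendor's voter; the function name `majority_vote` and the choice to raise an error when all three results differ are assumptions made for the example.

```python
from collections import Counter

def majority_vote(a, b, c):
    """Return (winning_value, unanimous) for three redundant results.

    Raises RuntimeError if no two results agree (an assumed policy;
    a real system would define its own fallback).
    """
    value, votes = Counter([a, b, c]).most_common(1)[0]
    if votes < 2:
        raise RuntimeError("no two modules agree")
    return value, votes == 3

# One module glitches; the other two outvote it.
result, unanimous = majority_vote(42, 42, 41)
print(result, unanimous)
```

Note that the voter not only masks the fault but also identifies the odd one out, which is something a two-processor scheme gives up.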

This works – for a price. In that market, prices are indeed higher to support this extra cost burden (and many others). The same can’t be said, however, for the automotive market. Lives are still at stake, but shaving costs is critical. In this case, there’s a different way of handling processor failure. It still involves redundancy, but less than TMR.

The automotive approach is to use two instead of three processors. And, instead of three processors without hierarchy, the dual-core approach has a main processor and a shadow processor that acts as a double-check. Synopsys has announced a dual-core module targeting ASIL-D applications, referring to their instances in a circuit as “safety islands.”

[Figure: Synopsys ASIL-D Ready dual-core lockstep processor IP. Image courtesy Synopsys.]

The idea here is that the main core has primacy, but it’s got this shadow core looking over its shoulder. If the shadow doesn’t agree with a result that the main core produces, it alerts. What happens then depends on the application; think of it as throwing an exception, and the code has to determine the error handler. Except that, this being hardware, there are several options for manifesting a (hopefully) graceful exit from the state of grace.

When such a disagreement occurs, a two-bit error signal is set – and remains set until specifically reset. The state of the cores is also frozen for forensic or debug purposes. For recovery, you get three options: reset the core; interrupt the core; or send a message to a host processor. Synopsys sees the first two as most likely, since trust in the main core is now compromised (even though it’s theoretically possible that it could be the shadow core that glitched).
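As a rough software model of that behavior, consider the sketch below. Everything in it is hypothetical: the class and method names are invented, the two-bit error signal is reduced to a single latched flag because its encoding isn't described here, and the "frozen state" is just a saved pair of values.

```python
from enum import Enum

class Recovery(Enum):
    RESET_CORE = "reset the core"
    INTERRUPT_CORE = "interrupt the core"
    NOTIFY_HOST = "send a message to a host processor"

class SafetyMonitor:
    """Toy model: latch an error on mismatch and keep it until cleared."""

    def __init__(self, recovery):
        self.recovery = recovery
        self.error = False        # stands in for the two-bit error signal
        self.frozen_state = None  # snapshot kept for forensics or debug

    def check(self, main_state, shadow_state):
        """Compare one cycle; return the recovery action on a new mismatch."""
        if self.error:
            return None  # already latched; nothing changes until cleared
        if main_state != shadow_state:
            self.error = True
            self.frozen_state = (main_state, shadow_state)
            return self.recovery
        return None

    def clear_error(self):
        """The explicit reset that the latched error waits for."""
        self.error = False
        self.frozen_state = None

monitor = SafetyMonitor(Recovery.RESET_CORE)
print(monitor.check(0x10, 0x10))  # cores agree
print(monitor.check(0x11, 0x13))  # mismatch: error latches
print(monitor.error, monitor.frozen_state)
```

The point of the latch is that a later run of agreeing cycles doesn't quietly erase the evidence; something has to acknowledge the error on purpose.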

Simple in Principle, But…

So far, so good. But… what happens if some event occurs – a power glitch, an alpha particle, whatever – that affects both processors? As circuits get smaller, even localized events start to affect more circuitry at the same time. If that happens, the main core might generate an incorrect result – and the supervisor, still reeling from the same event, might go along with it. Not a good thing at 70 mph.

So the module includes a notion called “time diversity” – the shadow core does what the main core does, only one or two clock cycles later. (The specific number of cycles is programmable.) This makes it much less likely that something affecting the main core will affect the shadow core equally.

This is done with a FIFO in the safety monitor; the main core’s inputs and result are pushed into the FIFO so that they can be compared at a (slightly) later time with the shadow core’s outcome. This comparison is done for each clock cycle.
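The delayed comparison can be modeled with an ordinary queue. This is a behavioral sketch only, assuming each cycle's output can be treated as one opaque value; `DelayedComparator` and `run` are names made up for the illustration, and the real monitor is hardware, not Python.

```python
from collections import deque

class DelayedComparator:
    """Hold main-core results so they line up with a delayed shadow core."""

    def __init__(self, delay_cycles):
        self.delay = delay_cycles  # programmable offset, e.g. 1 or 2
        self.fifo = deque()

    def step(self, main_result, shadow_result):
        """Call once per clock cycle.

        Returns None while the FIFO is still filling, otherwise True or
        False for whether the delayed main result matches the shadow.
        """
        self.fifo.append(main_result)
        if len(self.fifo) <= self.delay:
            return None
        return self.fifo.popleft() == shadow_result

def run(main_stream, shadow_stream, delay):
    """Feed two per-cycle streams through the comparator."""
    cmp = DelayedComparator(delay)
    return [cmp.step(m, s) for m, s in zip(main_stream, shadow_stream)]

main = [10, 11, 12, 13]
good_shadow = [None, None, 10, 11]  # same work, two cycles later
bad_shadow = [None, None, 10, 99]   # shadow disagrees on the second result
print(run(main, good_shadow, delay=2))
print(run(main, bad_shadow, delay=2))
```

The FIFO depth tracks the programmed delay, so changing the offset only changes how long a main-core result waits before it meets its shadow counterpart.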

Which raises a new question: what is a “result”? Some instructions take more than one cycle to complete; what’s the intermediate result? Some instructions perform a calculation, in which case there is a specific result. But others might store data into memory – what exactly is the result there? Do you then go test whether the data truly ended up in memory? Does the shadow core do a test-and-store if the to-be-stored values disagree?

There are a couple of pieces to the answers. First, you can’t have results with definitions that vary according to the application; that’s just crazy-making. Instead, there’s some subset of the internal state that gets compared. That then works for each clock cycle, regardless of the specific instruction.
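Here is one way to picture comparing a fixed subset of state on every cycle. The fields chosen below (`pc`, `store_addr`, `store_data`) are purely hypothetical; which signals the actual module compares isn't spelled out here.

```python
# Hypothetical subset of internal state compared on every clock cycle.
COMPARED_FIELDS = ("pc", "store_addr", "store_data")

def signature(state):
    """Extract only the compared fields from a core's state snapshot."""
    return tuple(state[field] for field in COMPARED_FIELDS)

def cores_agree(main_state, shadow_state):
    """Same check every cycle, whatever instruction is in flight."""
    return signature(main_state) == signature(shadow_state)

main = {"pc": 0x100, "store_addr": 0x2000, "store_data": 7, "scratch": 1}
shadow = {"pc": 0x100, "store_addr": 0x2000, "store_data": 7, "scratch": 9}
print(cores_agree(main, shadow))  # "scratch" is outside the compared subset

shadow["store_data"] = 8
print(cores_agree(main, shadow))  # a compared field now differs
```

Because the comparison never asks what the instruction "means," a multi-cycle instruction or a store needs no special case.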

The other piece is that the shadow core can read from memory, but it can’t write to it. It’s not there to “do” anything; it simply supervises, tattling when there’s an issue.

Synopsys says that dual-core processors aren’t a new thing, but most are higher performance. They say that their ARC-based dual-core module – intended specifically for ASIL-D usage – is the first one in the microcontroller range.

All of this effort so that, when you’re cruising down the coast, hair blowing all over, magical tunes blaring from your speakers, and your car doing all the work automatically, you won’t have to think about your processors. You’ll just trust them.

More info:

Synopsys ARC Safety-Island IP
