Mystery CPU for the Masses

New Synopsys Processor Makes Leaps in Performance

Pop quiz! What’s the second-most-popular CPU core in the world? First place goes to ARM, of course, but who’s the runner-up?

If you guessed MIPS, PowerPC, x86, Tensilica, 8051, or XMOS, you’re wrong. (In good company, but still wrong.) The correct answer is: ARC.

According to Synopsys, 1.3 billion ARC processors were embedded into chips last year, and that number is growing by about 300 million per year. That puts ARC second only to the mighty ARM. Must be something about the name. Maybe all those designers thought they were getting ARM but licensed ARC by accident.

Not likely. ARC and ARM are vastly different beasts, even though both occupy the same phylum (or is that genus?) of the microprocessor taxonomic tree. They’re both 32-bit RISC processors; both are offered as licensed IP; both are used in SoC development; and both have a number of variations and configuration options. One runs practically every cellphone and tablet in the world, while the other one appears in… uh… where do all those billions of ARC processors go?

In just about anything that’s not a cellphone or a tablet, really. ARC-based chips are in cameras, utility meters, televisions, flash drives, cars, and on and on. Think “embedded system” or “system on chip” and you run a good chance of identifying a product harboring at least one ARC processor. (Extra credit for knowing that ARC has more licensees than ARM does, too.)

This week, Synopsys (the owner of the ARC architecture since acquiring it in 2010) announced its latest and greatest ARC processor, the HS. The ARC-HS is more than a tweak or an upgrade from the existing ARC-EM; it’s a huge leap. In fact, the performance gap between the two suggests there may be a midrange ARC processor in the offing. Whereas the EM putters along in the MHz range, the HS is rated for at least 1.6 GHz (in 28nm high-k silicon), with 2.2 GHz totally doable. The EM’s puny three-stage pipeline is tossed overboard in favor of an all-new 10-stage design with lots of sexy performance-enhancing features. (Disclosure: your humble scribe was once employed by ARC.)

The new pipeline has big-boy features like dynamic branch prediction, out-of-order instruction retirement (albeit with in-order dispatch), and the ability to keep up to eight pending instructions in flight. A unique new feature of the HS is its second, or late, ALU. Arithmetic and logic operations normally execute in stage 6 of the 10-stage pipeline, which is pretty typical. But if the ALU operation depends on data just loaded from memory, that data is unlikely to be available in time. Rather than stall the operation, the HS postpones its resolution to stage 9, in the late ALU. This sidesteps the usual load/use penalty of long pipelines. If the stars align just right (i.e., by accident), the HS can occasionally execute instructions in both the early and the late ALU simultaneously.
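
For a feel of why the late ALU matters, here is a back-of-the-envelope model in Python. It is a sketch under stated assumptions, not Synopsys's actual timing: stages 6 and 9 come from the description above, but LOAD_DATA_STAGE (the stage by whose end loaded data is assumed to be available) is a made-up value chosen for illustration, and the function names are hypothetical.

```python
# Toy model of the load/use penalty in a 10-stage, in-order pipeline.
EARLY_ALU_STAGE = 6   # from the article
LATE_ALU_STAGE = 9    # from the article
LOAD_DATA_STAGE = 8   # ASSUMPTION: load data is ready once the load leaves this stage


def stall_cycles(distance, alu_stage):
    """Bubbles needed before a dependent ALU op can execute at `alu_stage`.

    `distance` is how many instruction slots after the load the consumer
    is issued (1 = the very next instruction).
    """
    # An instruction issued at cycle t occupies stage s during cycle t + s.
    # The consumer enters `alu_stage` at cycle distance + alu_stage and needs
    # the load to have finished LOAD_DATA_STAGE by the previous cycle.
    return max(0, LOAD_DATA_STAGE + 1 - alu_stage - distance)


def schedule(distance, has_late_alu):
    """Pick an ALU for the dependent op; return (which_alu, stall_cycles)."""
    early = stall_cycles(distance, EARLY_ALU_STAGE)
    if early == 0 or not has_late_alu:
        return ("early", early)
    # Data would arrive too late for the early ALU: defer to the late one.
    return ("late", stall_cycles(distance, LATE_ALU_STAGE))


if __name__ == "__main__":
    for d in (1, 2, 3):
        print(d, schedule(d, has_late_alu=False), schedule(d, has_late_alu=True))
```

In this toy model, an add that immediately follows the load it depends on costs two bubbles with only the early ALU and none when it can be deferred to the late one. The real numbers depend on cache latency and forwarding paths that the article does not spell out.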

As quick as it is, the overriding goal of the HS is to remain small, simple, and power-miserly. ARC isn’t trying to give MIPS, ARM, or PowerPC any serious competition. It’s intended as a deeply embedded CPU core for deeply embedded software. The HS has neither superscalar nor out-of-order execution, two tricks that could have improved performance at the cost of die area and power. Instead, its designers embraced RISC simplicity. In 28nm silicon, a minimally configured HS core measures just 0.12 mm², which is about one-fifth the size of ARM’s Cortex-R7. An HS processor will likely be smaller than the SRAMs or caches it’s attached to.

The new features are swell, but performance isn’t the secret to ARC’s volume success. That would be its configurability. ARC built its reputation as a DIY processor, a CPU core that designers can tweak, twist, pull, and reshape to suit their own desires. It’s the Silly Putty of CPUs. Developers can add and remove registers, invent their own instructions, change the caches, swap byte ordering, include an FPU, configure a hardware multiplier to improve performance or to save space, and more. It’s not so much a prepackaged processor as a smorgasbord of processor features that designers can browse and select from. The end result may be radically different from your neighbor’s ARC core. Or it might be the same; it’s your call. (For the record, Synopsys also offers preconfigured ARC cores for the less adventurous.)

It’s this configurability – plus ARC’s low cost of ownership – that has led designers to include it seemingly everywhere. If you don’t need a “brand name” processor with a big third-party software base, ARC fits the bill. Its small size takes up less silicon than its better-known competitors, and its licensing terms are less onerous. Like Tensilica (now part of Cadence), ARC’s configurability means that you get the features that you want, with none of the baggage that you don’t.

On the downside, you’re on your own for software. ARC HS is supported by a few real-time operating systems, including ThreadX and MQX, but that’s about it. The processor doesn’t have an MMU, so there’s no Linux or Android port. The compiler and debugger are clever about tracking ARC’s configurability – remove a hardware feature and they automatically remove software support for it – but that’s useful only if you’re compiling your own code. Third-party applications are pretty much nonexistent.
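
The idea of a toolchain that follows the hardware configuration can be pictured with a small Python sketch. Everything here is hypothetical: CoreConfig, its fields, and lowering_for are illustrative names, not part of any Synopsys tool. The fallback to a library call when a feature is absent is also an assumption about how a compiler might cope.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreConfig:
    """Hypothetical description of one configured core (not a real ARC format)."""
    has_multiplier: bool = True
    has_fpu: bool = False
    big_endian: bool = False


def lowering_for(op, cfg):
    """Say how a hypothetical compiler would implement `op` on this core."""
    # Optional operations, mapped to whether the hardware for them is present.
    optional = {"mul": cfg.has_multiplier, "fadd": cfg.has_fpu}
    if op not in optional:
        return "native"          # part of the fixed base instruction set
    # ASSUMPTION: a missing feature is replaced by a software routine.
    return "native" if optional[op] else "library call"


if __name__ == "__main__":
    tiny = CoreConfig(has_multiplier=False)
    for op in ("add", "mul", "fadd"):
        print(op, "->", lowering_for(op, tiny))
```

Strip the multiplier out of the configuration and the same source code compiles to a software routine instead of a multiply instruction, with no changes needed from the programmer.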

Having said that, the HS is binary compatible with its EM sibling, and it is “source code compatible” with the earlier ARC 600 and 700-series CPUs. All ARC processors implement a core set of instructions that can’t be changed, so it’s not as though the ISA is entirely random.

So maybe the ARC HS isn’t going to power the next Windows Phone or Galaxy tablet. But it might wind up in ten times more devices that have lower profiles. If what you want is a small, unassuming little 32-bit CPU that spins away in some corner of your device, the HS may stand for “hidden secret.”
