
MIPS I7200 Breaks the Chain

New 32-bit CPU Design Isn’t Very RISC-like Anymore

“It is surely harmful to souls to make it a heresy to believe what is proved.” — Galileo Galilei

Heresy! Sacrilege! Apostasy! The RISC orthodoxy has been profaned! Get the pitchforks and assemble the townspeople while I look for my wooden stake.

The High Sparrow and Lord Protector of RISC canon, MIPS Technologies, has decided that the RISC code is more what you’d call guidelines than actual rules. Welcome aboard the MIPS I7200.

To hell with orthodoxy, say we. We just want our embedded microprocessors to work efficiently, quickly, and expeditiously. And if that means tossing aside decades of research, careers of intense evangelism, veritable mountains of scholarly texts, and more than a few doctoral theses, so be it. This be war, and all’s fair in war and microprocessor design.

Our rebellious RISC renegade is the latest 32-bit CPU core design to emerge from the scriptorium in Sunnyvale, birthplace of MIPS Computer Systems and all that is holy in modern CPU architecture. MIPS, ARM, SPARC, and virtually every other new CPU to see the light of day in the past 20-odd years has been based on RISC philosophy. “Less is more,” is the guiding principle. Make the CPU hardware do less and, ipso facto, it will be faster. Shunt complexity to software instead because – hey, programmers are cheaper to hire than real electrical engineers, and software is easier to patch than hardware. Minimize the hardware and balloon the software. So it is written. So it shall be done.

Yeah, but. It’s a tradition more honored in the breach than the observance. No real CPUs adhere strictly to those early RISC principles. It’s simply too ascetic, too Spartan, and too damn hard to program. Almost from the beginning, ARM, MIPS, and all the other so-called RISC processors started contaminating their pure architectures with oddball instructions for shifting bits, calculating addresses, and handling floating-point numbers. Today, most RISC processors are reduced in name only.

But one tenet was cast in stone: Thou shalt have fixed-length instructions. Usually 32 bits wide, same-size instruction words are easy to decode, split, and execute. They neatly align on memory boundaries. They make it easy for compilers and linkers to figure jump addresses. Surely that’s one golden rule we can all agree on?

Eh, not so much.

The new MIPS I7200 marks the debut of a brand-new instruction set called nanoMIPS, and it’s – gasp! – variable-length. The final shoe has dropped and we’re not maintaining any pretense of RISC traditionalism anymore. The new nanoMIPS ISA isn’t even binary compatible with other MIPS processors (or anything else, naturally). It’s an entirely new ISA designed for small code size, never mind what it does to compatibility with the rest of the MIPS product line.
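To see what is being given up, here is a minimal Python sketch of why fixed-length instructions make address arithmetic trivial and variable-length ones do not. The encoding is a toy invented for illustration, not the real nanoMIPS format: the rule that the top two bits of the first byte select a length of 2, 4, or 6 bytes is an assumption, as are the names `toy_length`, `fixed_offset`, and `variable_offset`.

```python
# Illustrative only: a made-up encoding, NOT the real nanoMIPS format.
FIXED_WIDTH = 4  # bytes per instruction in a classic 32-bit RISC ISA


def fixed_offset(index):
    """Byte offset of instruction `index` when every instruction is 4 bytes."""
    return index * FIXED_WIDTH


def toy_length(first_byte):
    """Hypothetical rule: the top two bits of the first byte give the length."""
    return {0b00: 2, 0b01: 2, 0b10: 4, 0b11: 6}[first_byte >> 6]


def variable_offset(code, index):
    """Byte offset of instruction `index`; walks every earlier instruction."""
    offset = 0
    for _ in range(index):
        offset += toy_length(code[offset])
    return offset


# Three toy instructions: 2 bytes, then 4 bytes, then 6 bytes.
code = bytes([0x00, 0x00,
              0x80, 0x00, 0x00, 0x00,
              0xC0, 0x00, 0x00, 0x00, 0x00, 0x00])

print(fixed_offset(2))           # one multiply, no memory reads
print(variable_offset(code, 2))  # has to inspect instructions 0 and 1 first
```

With fixed widths, the third instruction sits at a known offset before anyone looks at the code. With variable widths, the decoder (or linker) has to read each earlier instruction to learn where the next one starts. That is the convenience nanoMIPS trades away in exchange for packing the same three instructions into 12 bytes in this toy example.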

Have the MIPS designers lost their minds? Are they under an evil spell? Or perhaps they’ve been possessed by evil spirits emanating from a certain neighboring CPU facility in Santa Clara? Heaven knows those people design hideously complex processors with no regard whatsoever for the elegant virtues of abstemious design. Perchance MIPS has been affected by the number of the beast: x86.

Or maybe they’re just good CPU designers being practical. The I7200 is a midrange processor in MIPS’s product catalog, somewhere above the existing interAptiv CPU but below its many 64-bit processors. That makes the I7200 the hottest 32-bit CPU in the MIPS catalog. It was designed, the company says, with the help and encouragement of a certain Tier-1 vendor in the LTE/5G space, probably MediaTek. Thus, it’s a good fit for upcoming 5G modems, where parallel processing and low power consumption will be vital.

And one of the best ways to shrink silicon size is to shrink code size. After all, most processor chips are mostly memory. Between the big L1 and L2 caches in your average 32-bit SoC implementation, it’s hard to even find the CPU core swimming amongst all that SRAM. If you can reduce your code size by an appreciable amount, you can cut the memory size as well, and the power consumption with it. It’s the exact opposite of RISC: lard up the CPU silicon so that you need less code. Heresy!

Like its more conventional brethren, the I7200 does multithreading, which has become a MIPS hallmark. The CPU core can handle up to nine threads and switch between threads with zero overhead. Snippets of code can also be preloaded and “parked,” ready for instant deployment in the case of an interrupt handler or a high-priority task. This feature, combined with new scratchpad RAMs that bypass the cache, is designed to make the I7200 more deterministic – another important feature for exotic 5G or LTE Advanced modems.

The processor’s MMU can be dumbed-down to perform faster under an RTOS that prioritizes fast access time over elaborate memory-management schemes. Or, you can enable the full-on MMU to run Linux.

The nine-stage pipeline copies its structure from other recent MIPS processors, and it accommodates up to three (instead of just two) user-defined coprocessors under the existing ASE interface specification. This allows creative designers to add in their own hardware accelerators without inventing a completely new CPU from scratch. In addition to the new scratchpad RAM, the I7200 also supports conventional L1 data and instruction caches. Internal bus interfaces are now AXI4, whereas previous processors used OCP with AXI wrappers. If your ambitions extend to multiprocessor SoC design, the I7200 works in four-processor (36-thread) clusters, with cache coherence throughout.

But does it work? Gracious, yes. The I7200 outperforms its oddly named interAptiv predecessor by somewhere between 35% and 65%, depending on which EEMBC benchmark you prefer. It’s also a wee bit faster than arch-rival Arm’s Cortex-R8, which is pitched at roughly the same types of real-time applications. If it matters, the I7200 is also about 20% faster than a Cortex-A53 running at the same clock speed.

The real raison d’être for the I7200, however, is its code density. MIPS already has a compressed/condensed instruction set in the form of MIPS16e. It’s been deployed for ages and implemented in uncountable millions of devices. So why reinvent that particular wheel?

Because it’s better. Code compiled for the new nanoMIPS ISA is about 12% smaller than the same code compiled for Arm’s Thumb-2, and a good 15% to 20% smaller than for MIPS16e, according to the company. Like MIPS16e, nanoMIPS is a standalone, self-contained instruction set. It’s not a mode or an extension; it really is the processor’s native ISA.
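For a feel of what those percentages mean in bytes, here is a back-of-the-envelope sketch. The 100 KB Thumb-2 baseline is an assumed figure chosen for easy arithmetic, not a measurement, and the helper names `shrink` and `baseline_from` are made up for this example. The only numbers taken from the company's claims are the 12% and the 15% to 20%.

```python
# Assumed baseline for illustration; not a measured figure.
THUMB2_KB = 100.0


def shrink(size_kb, percent_smaller):
    """Size after a relative reduction of `percent_smaller` percent."""
    return size_kb * (1 - percent_smaller / 100)


def baseline_from(smaller_kb, percent_smaller):
    """Invert shrink(): the size that `smaller_kb` is `percent_smaller` percent below."""
    return smaller_kb / (1 - percent_smaller / 100)


nanomips_kb = shrink(THUMB2_KB, 12)             # about 12% smaller than Thumb-2
mips16e_low = baseline_from(nanomips_kb, 15)    # if nanoMIPS is 15% smaller than MIPS16e
mips16e_high = baseline_from(nanomips_kb, 20)   # if nanoMIPS is 20% smaller than MIPS16e

print(f"nanoMIPS: {nanomips_kb:.1f} KB")
print(f"MIPS16e:  {mips16e_low:.1f} to {mips16e_high:.1f} KB")
```

Under that assumption, a 100 KB Thumb-2 image would come in at roughly 88 KB of nanoMIPS code, while the MIPS16e equivalent would land somewhere between about 103.5 KB and 110 KB. Note that "20% smaller" is relative to the larger number, which is why the MIPS16e figure is found by dividing rather than by adding 20%.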

Unlike MIPS16e, however, nanoMIPS is not optional. Up until now, any MIPS licensee had the option of enabling MIPS16e or not. The “real” MIPS instruction set was always the default ISA and always required. That seemed only natural, and it meant that all MIPS processors were binary compatible with one another.

Not anymore. The nanoMIPS instruction set is the only one that the I7200 runs. There is no “standard” 32-bit MIPS option, which means that the I7200 is not binary compatible with any other MIPS processor (so far). It’s a clean break.

The company obviously feels that this was the right move, and it does make some sense, even if it’s a bit weird. Back when MIPS was still competing head-to-head with Arm and the other RISC pretenders, binary compatibility was important. (Look what it did for Intel.) But that ship has sailed. Few customers use their MIPS processors to run application code. There isn’t much of a third-party software library. It’s mostly deeply embedded real-time code that customers never see. So breaking compatibility isn’t a big deal, and it’s a fair trade for smaller code size. So, MIPS Technologies took a deep breath and took the plunge. As a result, we have an entirely new generation of MIPS processors that are unlike other MIPS processors.

The underlying architecture is still the same, and the programmer’s model is identical. If you didn’t look closely at the binaries excreted by your compiler, you wouldn’t know the difference. It’s all MIPS where it counts.

At 2-GHz clock rates without breathing hard, the I7200 is fast, efficient, reasonably small, and definitely scalable. MediaTek is already taping out its first I7200-based design, so it’s not even vaporware. The I7200 has achieved corporeal presence, with more on the way. Sing Hallelujah, O brothers and sisters, the new doctrine is here! Can I get an Amen?

6 thoughts on “MIPS I7200 Breaks the Chain”

  1. Finally some sanity!
    The mythical “execute all instructions in one cycle” BS was pure nonsense from day one. Loads and stores take one cycle PLUS THE MEMORY ACCESS TIME — meanwhile IF THERE ARE ANY IN THE PIPE they can execute.

    CPUs circa 1960 — almost 60 years ago — had interleaved memory and, guess what, the address of the next instruction was calculated while the memory was accessed. And sure enough, the add/subtract kind of instructions executed in one cycle ==> when the data arrived from memory.
    RISC was pure hype from the very beginning, is now and always will be.

  2. None of the processors have really been RISC for a long time, the ISAs are for RISC processors of long ago, and the internals of the CPUs are now largely independent of the instruction sets.

    MIPS guys told me a while back they were breaking with backward compatibility because they had lost control of their ISA, and wanted to disenfranchise people using it – seems unlikely they’ll get a following for their new stuff in the face of RISC-V.

    I was hoping they would buy into some tech of mine to get a performance advantage (since Tallwood seem to have money to burn)…

    http://parallel.cc/cgi-bin/bfx.cgi/WT-2018/WT-2018.html

    … but some organizations just have a death wish.
