At first, the concept of “multi-core” from a processor IP company might seem a bit confusing. Couldn’t we already put multiple MIPS cores on our devices? If your concept of multi-core ends with putting more than one processor on a chip, you may not yet be dialed into the subtleties.
This week, MIPS launched their highest-performance solution ever with the new MIPS32 1004K “Coherent Processing System” – a multi-core, multi-threaded IP solution. The challenge of keeping all your cores busy in a symmetric multi-processing (SMP) system is actually made much easier when the multi-processor is combined with multi-threading. With monolithic processors continuing to pound frequency and particularly power consumption walls, multi-core technology is permeating every segment of computing. Embedded applications have long taken advantage of multiple processors – tasking different cores to perform completely independent system functions. More general multi-core processing is relatively new, however, and the proliferation of higher-end applications running on sophisticated operating systems makes multi-core an imperative for power-sensitive, performance-hungry embedded applications.
Last year, MIPS rolled out their highest-performance monolithic, single-threaded core – the MIPS32 74K. The 74K gets its performance the old-fashioned way – higher frequencies (up to 1GHz) and deeper, more sophisticated pipelines. This approach, however, runs into power problems as you continue to boost performance. When processes can be efficiently parallelized, multiple cores can do the same work at lower frequencies, or more work at the same frequencies, with much better power-per-performance metrics. For those applications, MIPS is now rolling out the 1004K.
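The power argument can be made concrete with the standard first-order model for dynamic CMOS power, P ≈ C·V²·f. The sketch below is illustrative only: the capacitance, voltage, and frequency values are hypothetical placeholders rather than MIPS data, and it assumes that running at a lower clock allows a lower supply voltage, which depends on the process and the design.

```python
def dynamic_power(capacitance, voltage, frequency_hz, cores=1):
    """First-order dynamic power estimate: P = cores * C * V^2 * f."""
    return cores * capacitance * voltage ** 2 * frequency_hz


# Hypothetical numbers chosen only to show the shape of the trade-off.
SWITCHED_CAPACITANCE = 1e-9  # farads per core (assumed)

# One core at 1 GHz and 1.2 V (assumed operating point).
single = dynamic_power(SWITCHED_CAPACITANCE, 1.2, 1.0e9)

# Two cores at 500 MHz and 1.0 V (assumed operating point),
# giving the same aggregate clock cycles per second.
dual = dynamic_power(SWITCHED_CAPACITANCE, 1.0, 0.5e9, cores=2)

ratio = dual / single
print(f"dual-core power relative to single-core: {ratio:.2f}")
```

Under these assumed numbers the dual-core configuration draws roughly 70% of the single-core power for the same aggregate cycle count. The saving comes entirely from the voltage term, so it disappears if the voltage cannot be lowered or if the workload does not parallelize.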
The 1004K can provide up to four processor cores in either single- or multi-threaded configurations. The real excitement comes from combining multiple cores with multi-threading. MIPS’s new product pivots around a Coherence Manager (CM) that coordinates the activities of the (up to four) processor cores. Additionally, an optional I/O Coherence Unit (IOCU) can coordinate coherence for I/O peripherals.
In each core, an intervention port connects to the coherence manager. In the CM, requests from the CPUs or from the IOCU enter through a Request Unit (RQU). A Memory Interface Unit (MIU) communicates with physical memory; it receives coherent reads and writes from the Intervention Unit (IVU), and non-coherent reads and writes and speculative coherent reads from the RQU. The MIU then hands read responses to a Response Unit (RSU), which passes data on to the CPUs or the IOCU as applicable.
The IOCU enables hardware I/O coherence by bridging I/O subsystems to the CM. It translates OCP 2.2 non-coherent requests into OCP 3.0 coherent/non-coherent requests. It breaks up bursts and unaligned accesses into cacheline/dword transactions, which minimizes the impact on the coherence fabric by structuring I/O data to suit the coherent system. Attributes are applied per-transaction, and requests can be tagged to snoop L1+L2, L2 only, or neither. I/O parking gives I/O transactions priority over the CPU cores in the coherence manager.
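To illustrate what breaking a burst into cacheline transactions means, here is a minimal sketch in Python. It is not the IOCU’s actual logic: the function name split_burst is hypothetical, the 32-byte line size is an assumed default, and the further subdivision into dword transactions is not modeled.

```python
def split_burst(address, length, line_size=32):
    """Split a byte range into pieces that never cross a cache-line boundary.

    line_size is an assumed parameter (32 bytes here, for illustration only).
    Returns a list of (start_address, byte_count) tuples.
    """
    if length < 0 or line_size <= 0:
        raise ValueError("length must be >= 0 and line_size must be > 0")

    pieces = []
    end = address + length
    while address < end:
        # Address of the first byte of the next cache line.
        line_end = (address // line_size + 1) * line_size
        chunk_end = min(line_end, end)
        pieces.append((address, chunk_end - address))
        address = chunk_end
    return pieces


# An 80-byte burst starting halfway through a line.
for start, count in split_burst(0x1010, 80):
    print(f"0x{start:04X}: {count} bytes")
```

Each resulting piece touches exactly one cache line, so a coherence lookup for that piece involves a single line rather than a range that straddles several.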
The “MIPS32” part at the beginning of “MIPS32 1004K” tells us that the new multi-core processor is compatible with existing software developed for the MIPS 24K, 24KE and 34K families. Multi-threading is enabled by virtual processing elements (VPEs). Each individual core can be configured in a wide variety of ways – one or two VPEs for single- or multi-threaded operation. An FPU is available, and the CPU/FPU clock ratio can be configured. You can select and size TLB, caches, and scratchpad RAM and also create user-defined instructions for your application.
There is also a Global Interrupt Controller (GIC). CPU access to the GIC is managed through a relocatable memory-mapped address range. The GIC can connect to the CM or elsewhere in the system. The GIC supports system-level and inter-processor interrupts and routes each interrupt to a particular core or VPE. The number of system interrupts is configurable (up to 256).
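The routing idea can be pictured as a table that maps each interrupt source to a (core, VPE) destination. The following is a toy software model under that assumption, not the GIC’s register interface: the class InterruptRouter and its methods are hypothetical, and only the 256-source ceiling comes from the description above.

```python
MAX_SOURCES = 256  # upper limit on system interrupts quoted for the GIC


class InterruptRouter:
    """Toy model: map each interrupt source to one (core, vpe) target."""

    def __init__(self, num_sources, num_cores, vpes_per_core):
        if not 1 <= num_sources <= MAX_SOURCES:
            raise ValueError("num_sources must be between 1 and 256")
        self.num_sources = num_sources
        self.num_cores = num_cores
        self.vpes_per_core = vpes_per_core
        self.routes = {}  # source number -> (core, vpe)

    def route(self, source, core, vpe):
        """Assign an interrupt source to a specific core and VPE."""
        if not 0 <= source < self.num_sources:
            raise ValueError("unknown interrupt source")
        if not 0 <= core < self.num_cores:
            raise ValueError("unknown core")
        if not 0 <= vpe < self.vpes_per_core:
            raise ValueError("unknown VPE")
        self.routes[source] = (core, vpe)

    def deliver(self, source):
        """Return the (core, vpe) target for a source, or None if unrouted."""
        return self.routes.get(source)


# Two cores with two VPEs each, 64 configured interrupt sources (assumed).
router = InterruptRouter(num_sources=64, num_cores=2, vpes_per_core=2)
router.route(source=5, core=1, vpe=0)
print(router.deliver(5))
```

In this model an inter-processor interrupt is simply a source whose sender is another core; the table lookup is the same.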
As an example, consider a system configured with two cores, each set up with two VPEs for multi-threading, plus caches, coherence manager, and global interrupt controller. The base cores can operate at 800MHz, achieving a DMIPS rating of >2400. Using TSMC 65nm, 9-track, low Vt, the total area is about 3.8mm².
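For readers who prefer normalized figures, the quoted numbers can be turned into DMIPS per MHz with simple arithmetic. The helper below is a hypothetical convenience function; because the rating is quoted as “>2400”, the results should be read as lower bounds.

```python
def dmips_per_mhz(total_dmips, frequency_mhz, cores=1):
    """Normalize a DMIPS rating by clock frequency and core count."""
    return total_dmips / frequency_mhz / cores


# Figures from the example configuration: >2400 DMIPS at 800 MHz, two cores.
system_efficiency = dmips_per_mhz(2400, 800)
per_core_efficiency = dmips_per_mhz(2400, 800, cores=2)

print(f"system:   at least {system_efficiency} DMIPS/MHz")
print(f"per core: at least {per_core_efficiency} DMIPS/MHz")
```

That works out to at least 3.0 DMIPS/MHz for the two-core system, or at least 1.5 DMIPS/MHz per core.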
MIPS is a big player in the Linux club, and the new cores are friendly with open-source SMP Linux. MIPS also has a complete software debug environment, including in-system debug from MIPS FS2. FS2’s PDtrace has coherence awareness for compatibility with the multi-core environment. The software tools are GNU-based, including an Eclipse Navigator IDE.
The 1004K core is available now. Early RTL was delivered in Q4 last year.