
A Last Embedded Dance with Jack Ganssle

I’ve known embedded system guru Jack Ganssle for four decades. He and I both started designing embedded systems in the 1970s, when the early 8-bit microprocessors and microcontrollers were so primitive that you could easily call some of them brain dead. In fact, that’s exactly the term Ganssle used to describe the Intel 8051 in a keynote speech at the recent Embedded Online Conference, which Jack described as his last embedded speaking engagement. Today, embedded designers generally build their systems with vastly more powerful 32- and 64-bit processors from companies including AMD, Infineon, Intel, Microchip, NXP, Renesas, STMicroelectronics, Texas Instruments, and Zilog. These processors can handle large data types such as images, audio files, and video streams. They can also run a multitasking RTOS, and they can handle networking stacks with ease. The 8-bit processors can’t.

The title of Jack’s presentation was “Really Real Time…it’s not what you learned in school!” I might describe the talk as “Jack Ganssle’s Greatest Embedded Tips.” Before we get down to the tips, allow me to provide a quick version of Jack’s biography, so you’ll understand why he’s someone worth following. When I first met Jack, he owned a company named Softaid that made in-circuit emulators for various microprocessors and microcontrollers. In those days, these processors had no internal debugging capabilities, so in-circuit emulators glommed onto the microprocessor pins and figured out what the processor was doing on a clock-by-clock basis. Microcontrollers were far more problematic because their external pins didn’t tell the outside world what was going on inside the chip. So, in-circuit emulators for microcontrollers had to replicate the entire internal structure of the microcontroller and then emulate the microcontroller’s pin wiggling for the rest of the embedded system.

In both cases, for microprocessors and for microcontrollers, a wide, braided ribbon cable carrying all the emulated IC’s pin signals usually snaked between the target system and the in-circuit emulator. This scheme worked only when processor clock rates were slow, say a few MHz. Beyond that, Maxwell’s equations began to bite: faster signal transitions pushed more energy into the long tail of the signals’ Fourier series, the electromagnetics of the cable grew correspondingly harder to tame, and signal propagation delays became overwhelmingly significant. If that weren’t enough, the arrival of surface-mount parts quickly made 40-pin socketed processors a thing of the past. Since those days, manufacturers have been forced to build hardware debugging tools into their processors, accessible through JTAG ports.

As of today, Jack has written more than 1000 articles about embedded hardware and software design, published six books on similar topics, and given countless presentations at events and in corporate meeting rooms. Two of his books, “The Art of Programming Embedded Systems” and “The Firmware Handbook,” have a place in my permanent technical library, where they’re always within reach. In addition, Jack has published nearly 500 issues of his newsletter, “The Embedded Muse,” since he started publishing it in 1997.

Contrary to what I expected, Jack’s final embedded talk at the Embedded Online Conference was not a historical review of embedded development since the 1970s. Instead, it contained pure nuggets of experience refined from Jack’s long embedded career. Only about 300 people attended the presentation, so I decided to summarize some of these nuggets here; I think they deserve wider exposure.

 

The Rate-Monotonic Myth

Rate-monotonic scheduling theory tells you that if you have several periodic software tasks with known run times and differing periods, you can guarantee deterministic behavior of your multitasking embedded software system if you meet a few conditions. First, the tasks must not share resources or have any dependencies such as synchronization constraints. Second, you must rank-order the tasks’ priorities, giving the highest-frequency task the highest priority and the lowest-frequency task the lowest priority. Then, if the total CPU utilization for all the tasks running at their required frequencies consumes less than 69 percent of the CPU’s cycles, you can guarantee that all the tasks’ real-time deadlines will be met, at least in theory.
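That 69 percent figure comes from the Liu and Layland utilization bound, n(2^(1/n) − 1), which falls toward ln 2 (about 69.3 percent) as the number of tasks grows. Here is a minimal C sketch of the schedulability check; the execution times and periods are made-up illustrative numbers, not values from Jack’s talk.

```c
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* Worst-case execution time (C) and period (T) of each task, in ms. */
    const double C[] = { 1.0, 2.0, 5.0 };
    const double T[] = { 10.0, 40.0, 100.0 };
    const int n = sizeof C / sizeof C[0];

    double utilization = 0.0;
    for (int i = 0; i < n; i++)
        utilization += C[i] / T[i];

    /* Liu & Layland bound: n * (2^(1/n) - 1), which approaches ln 2
     * (about 0.693) as n grows -- the source of the "69 percent" rule. */
    const double bound = n * (pow(2.0, 1.0 / n) - 1.0);

    printf("U = %.3f, bound = %.3f -> %s\n", utilization, bound,
           utilization <= bound ? "schedulable" : "no guarantee");
    return 0;
}
```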

In practice, says Ganssle, the constraints of rate-monotonic scheduling are impractical given the way today’s software is written. Rate-monotonic scheduling has worked well for tightly coded embedded software such as DSP code, and it was a major topic in the early days of the Ada programming language. However, Ganssle pointed out in his talk that today’s compilers provide software engineers with no timing information, so task durations must be empirically measured because they vary wildly with the compiler, the target processor, and the numeric precision used for the computations. Worse, some compilers will generate code that exhibits different timing after minor changes in the source code, such as the addition or deletion of a space character. In addition, today’s code is littered with calls to software modules developed by other team members, purchased from third-party vendors, or downloaded from GitHub, which makes exact task timing virtually impossible to determine. Consequently, today’s code cannot meet the constraints demanded by rate-monotonic scheduling, making the technique impractical for most of today’s embedded system designs.

 

Simple Ways to Instrument Your Embedded Code

If you want to debug your embedded code, Ganssle has a few low-overhead ways to do that. The simplest way is to ensure that the embedded processor can access a few GPIO signal lines. Critical software tasks can then assert one of these signal lines upon entering the task and can negate the line upon exit. An oscilloscope then shows you when the task starts and how long it takes to complete.
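In C, the instrumentation amounts to a pair of pin writes bracketing the work. The sketch below is generic: DEBUG_PIN_TASK_A, gpio_set(), gpio_clear(), and the other functions are placeholders for whatever your vendor’s HAL or direct register writes provide, not a specific library’s API.

```c
#define DEBUG_PIN_TASK_A  3u      /* spare GPIO wired to a scope probe */

void gpio_set(unsigned pin);      /* placeholder: drive the pin high */
void gpio_clear(unsigned pin);    /* placeholder: drive the pin low  */
void wait_for_trigger(void);      /* placeholder: block until the task should run */
void process_sensor_data(void);   /* placeholder: the work being measured */

void sensor_task(void)
{
    for (;;) {
        wait_for_trigger();

        gpio_set(DEBUG_PIN_TASK_A);    /* rising edge: task entry */
        process_sensor_data();
        gpio_clear(DEBUG_PIN_TASK_A);  /* falling edge: task exit */
    }
}
```

On the oscilloscope, the pulse width is the task’s execution time and the pulse-to-pulse spacing is its period, so one probe gives you both timing and jitter.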

One such task you might want to monitor is the RTOS’s idle task. Modify this task to assert a GPIO signal when it starts and to negate the signal when it ends, and you can watch it on an oscilloscope. Ganssle has another trick for monitoring this task, however: he recommends connecting an analog voltmeter to the associated GPIO pin. If the pin is high all the time, the voltmeter registers the GPIO pin’s maximum voltage. If the software never enters the idle task, the voltmeter reads zero. In between, the meter’s needle averages the rapidly switching signal, so its reading is directly proportional to the fraction of CPU time spent idling, which tells you how much headroom your system has left. This trick reminds me of an old-style dwell meter used for tuning carbureted gasoline engines, which just might be where Ganssle got the idea.

You can also monitor the currently executing task by instrumenting the task-switching module in an RTOS. The following trick will let you monitor as many as eight task IDs with just three of a processor’s GPIO lines and a digital oscilloscope. Instrument the task-switching module by adding code to the module that outputs the task ID number on the three GPIO pins. Then, connect the three GPIO pins to an R-2R ladder DAC, as shown in the graphic below.

 

This 3-bit R-2R ladder DAC uses just three GPIO pins to provide the ID of the currently executing task. Image credit: Jack Ganssle

The output voltage from the R-2R ladder DAC provides an 8-level analog indication of the active task’s ID. You’ll get a waveform that looks like the one shown below.

 

Using three GPIO pins with an R-2R ladder DAC to indicate the active task produces a waveform like this on an oscilloscope. Image credit: Jack Ganssle

If your code has more than eight tasks of interest, the R-2R ladder DAC can be expanded by one bit (a fourth GPIO pin) to accommodate sixteen tasks.
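On the firmware side, the hookup is a few lines in the RTOS’s context-switch or trace hook. The sketch below is generic: gpio_write_bit() and task_switch_hook() are placeholder names, and you would attach the code to whatever switch hook your RTOS exposes (FreeRTOS users might use the traceTASK_SWITCHED_IN macro, for example).

```c
#include <stdint.h>

#define TASKID_PIN0  4u   /* DAC ladder LSB */
#define TASKID_PIN1  5u
#define TASKID_PIN2  6u   /* DAC ladder MSB */

void gpio_write_bit(unsigned pin, unsigned value);   /* placeholder HAL call */

/* Call this from the RTOS's context-switch or trace hook with the ID
 * of the task that is about to run. */
void task_switch_hook(uint8_t task_id)
{
    task_id &= 0x07u;                             /* 3 bits: task IDs 0-7 */
    gpio_write_bit(TASKID_PIN0, task_id & 1u);
    gpio_write_bit(TASKID_PIN1, (task_id >> 1) & 1u);
    gpio_write_bit(TASKID_PIN2, (task_id >> 2) & 1u);
}
```

If you assign the idle task ID 0, a DAC output parked at 0 V doubles as the idle indicator described earlier.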

Reentrant Code is an Enormous Problem for Real-Time Systems

Tasks that are not designed to be reentrant create significant problems for real-time systems. A function is reentrant if:

  • It does not call non-reentrant functions
  • It uses all shared variables in an atomic way
  • It does not use the hardware in a non-atomic way

Any task that calls an external package is liable to become non-reentrant because the called package may not be reentrant.
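As a concrete illustration of the shared-variable rule, here is a minimal C sketch of a counter shared between a task and an ISR. The irq_disable() and irq_restore() calls are placeholders for your processor’s interrupt-masking intrinsics; the point is that the read-modify-write must not be interruptible.

```c
#include <stdint.h>

volatile uint32_t event_count;       /* shared between a task and an ISR */

unsigned irq_disable(void);          /* placeholder: mask interrupts, return old state */
void irq_restore(unsigned state);    /* placeholder: restore saved interrupt state */

void count_event_broken(void)
{
    /* Read-modify-write: an interrupt landing between the load and the
     * store corrupts the count, so this function is not reentrant. */
    event_count++;
}

void count_event_reentrant(void)
{
    unsigned state = irq_disable();  /* make the update atomic */
    event_count++;
    irq_restore(state);
}
```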

An interrupt service routine (ISR) that interrupts a task can cause problems if the ISR isn’t written properly. The ISR must carefully save and restore the processor’s interrupt state and all relevant registers. Otherwise, the ISR may return with the processor’s interrupts incorrectly enabled.
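Most cross-compilers will generate that save/restore for you if you mark the function as an interrupt handler. The exact spelling is target- and toolchain-specific; the GCC-style attribute below is just one example, so check your compiler’s documentation rather than copying it verbatim.

```c
#include <stdint.h>

volatile uint32_t tick_count;   /* shared with foreground code */

/* The attribute asks the compiler to emit the full ISR prologue/epilogue:
 * save the working registers and return with the instruction that
 * restores the saved interrupt state. */
__attribute__((interrupt)) void timer_isr(void)
{
    tick_count++;   /* keep ISRs short and deterministic */
    /* Don't re-enable global interrupts by hand here; let the return
     * from interrupt restore the saved state. */
}
```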

Avoid Using Non-Maskable Interrupts to Fix Missed Interrupts

Embedded system developers often try to solve problems with missed interrupts by attaching the affected interrupt source to the system’s non-maskable interrupt (NMI) pin. Ganssle advises you not to do this because he says that NMIs are “guaranteed to break non-reentrant code.” You should use NMIs only to signal the apocalypse. For embedded systems, the apocalypse might mean the imminent loss of power, for example, or some other pending catastrophe.

Start Debugging When First Writing the Code

Ganssle suggests that you insert debugging aids in your embedded code when you start writing the code. Build the debugging aids into your data structures. For example, today’s processors have immense interrupt vector tables that are rarely used completely. You should fill every unused interrupt vector with a pointer to a debugging routine. If the code ever starts executing that routine, you’ll know that you have gotten a stray interrupt caused by bad code or an improperly initialized piece of hardware, and you’ll be able to track the problem backwards from the interrupt.
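A minimal version of that idea looks like the sketch below. The vector-table layout is processor-specific (Cortex-M tables start with the initial stack pointer, for instance), and the handler and table names here are illustrative, but the pattern is the same everywhere: every vector you don’t use points at a catch-all routine where you park a breakpoint.

```c
#include <stdint.h>

typedef void (*isr_handler_t)(void);

/* Catch-all handler for vectors the application never expects to fire. */
void unexpected_interrupt(void)
{
    /* Set a breakpoint here (or log the event and reset): landing in this
     * routine means a stray interrupt fired from bad code or an
     * improperly initialized peripheral. */
    for (;;) { }
}

/* Real handlers, defined elsewhere. */
void timer_isr(void);
void uart_isr(void);

/* Every unused slot gets the catch-all handler. */
const isr_handler_t vector_table[] = {
    timer_isr,
    uart_isr,
    unexpected_interrupt,   /* unused */
    unexpected_interrupt,   /* unused */
    unexpected_interrupt,   /* unused */
    /* ...and so on for the remaining unused vectors... */
};
```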

Keep Learning

The state of the art for embedded systems continues to advance, so Ganssle suggests you set aside fixed times in your week to stay abreast of these developments. There are abundant online resources, including articles, books in PDF form, and even videos. This may have been Ganssle’s last official presentation, but he’s left us with six books, more than 1000 articles, and almost 500 issues of “The Embedded Muse.” You’ll find all 27 years of “The Embedded Muse” on Ganssle’s website here; you’ll find a large collection of Jack’s published embedded design articles here; and, if you want Jack’s opinion on a wide range of topics, you’ll find his rants here. I suggest that these documents are an ideal place to start expanding your embedded education.

5 thoughts on “A Last Embedded Dance with Jack Ganssle”

  1. How time passes. Brain dead processors? Try a 1 bit processor (Motorola) or the more common 4 bitters.

    I designed the IO board that went into the World’s First BBS. Helped Randy and Ward with the dog 8251 serial chip. I never used that one again. TTL was glue logic back then. Great fun. But for me and all us old timers the Permanent Embed is not far off. Heck of a run though. You are welcome.

    1. The Intel 8251 and 8251A differed in ready-pin response time. I made a design that required the 8251A. NEC second-sourced both parts, and NEC 8251A’s worked well. Toshiba second-sourced the 8251 but labeled their parts 8251A, causing problems in purchasing and production. Yes, for us old timers the Permanent Embed is not far off.

  2. I started working on embedded systems in 1973, using Intel and second-source Microsystems International 8008 processors. From 1980 to 2010, I worked on hard-real-time, math-heavy electromagnetic instrumentation. We coded in assembly language on the bare metal. In 1998, I used one of the first 24-bit audio A/D converters driving an Integrated Device Technology 64-bit microprocessor to keep DFT arithmetic roundoff errors below the roundoff errors of the ADC. I only needed a few spectral components, so the discrete Fourier transform (DFT) was much faster than the fast Fourier transform (FFT): the processor calculated the DFT while the data was arriving, whereas an FFT could not begin until all the data had arrived. I wrote self-contained, stand-alone assembly language (no libraries, no OS, just one dedicated DFT task) so that I could design the hardware to be fast enough for the task’s 10-microsecond execution requirement.

  3. Thanks for sharing this Steve — Jack is one of my heroes — plus he’s a really nice guy — I’d not heard the trick about using a 3-bit R-2R ladder DAC to provide the ID of the currently executing task on an analog meter — VERY COOL!!!
