The linear execution of a program by a processor is unnatural. It was the way in which the first computers of von Neumann and Turing operated, in part because the resources needed to create even a single processor were enormous. It has continued to be the way in which computers operate simply because “it is the way in which computers work.” Well, perhaps that is a bit unfair, but if we train thousands of people to think in ways that allow them to develop systems that execute linearly, then it is very hard to get them to think in other ways.
But the world is inherently parallel. In any society, company, or other organisation, people get on with their own areas of activity and communicate with others as needed. So why is there this fuss about how difficult it is to build systems that will execute on multiple processors? And, even more surprisingly, why is this fuss an issue in the embedded world, where we have been building systems incorporating high degrees of parallelism for decades? The different elements of parallel execution were not necessarily executing on a multi-core processing element, or even on multiple copies of the same processor architecture, but all the issues that are currently the subject of prolonged anguished debate have already been resolved within products that are shipping in millions.
Some of the problems in implementing on a multi-core are related to the architecture chosen, particularly to how the processors and their applications communicate. Intel’s decision to communicate through cache memory, for example, has been questioned, with the biggest question mark over how this will scale to tens or even hundreds of cores. Being Intel, the company will probably find ways of solving that problem. But the real problem is people: people are conservative, and, once trained in a particular mode of thinking, they are reluctant to change. A special case of this is the way in which programmers will cling tenaciously to the language in which they have been trained.
We are now seeing an explosion of concern over programming parallel systems, triggered in part by Intel’s very public embrace of multi-core as the way to achieve improved processor performance at acceptable levels of power consumption. Alongside this has been a rash of new companies launching new multiple-processor architectures. So people are looking for better ways to develop new systems and to convert existing or legacy software to these new systems.
Taking the legacy problem first: next week my colleague, Bryon Moyer, will be looking at how a Dutch company, Vector Fabrics, has developed powerful tools for analysing existing code and parallelising it to run as multiple threads. Others are attempting this as well, but all of them run up against Amdahl’s law, which says that, however much of the code you parallelise, the fraction that remains sequential sets a hard limit on the overall runtime improvement.
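As a reminder of the arithmetic behind that limit (my formulation, not Vector Fabrics’): if a fraction p of the runtime can be spread across N processors, the achievable speed-up is

```latex
S(N) = \frac{1}{(1-p) + p/N},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1-p}
```

So if even 10% of the runtime stays sequential (p = 0.9), no number of cores will ever deliver more than a 10x overall speed-up.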
This week, I am more interested in how to develop new software. So far, the main effort has been devoted to adding parallel constructs to existing languages, and, in the embedded space, this has normally meant C. This raises some interesting issues, as the C language is already sprawling and messy. Normally, the route taken has been to create a sub-set of C and then add the constructs that divide a program into sections that can execute in parallel and provide for the communication between them. But when you start looking at these languages, it is clear that many are designed to run on large machines, even super-computers, and they are not appropriate for an embedded environment. (And Handel-C was designed for generating hardware descriptions and is still in use for FPGA programming.)
OK, so where do we go from here? Tucker Taft, of AdaCore, has no doubts. He thinks we need to consider a new language, and he has been working on one – which he calls ParaSail.
To give this some context – Taft has been around the block. Nearly forty years ago he was the system programmer on the first UNIX machine in the wild, at Harvard University. He has spent much of the intervening time working on Ada, including the rather mind-warping task of building an Ada compiler in Ada. Compilers need to do more than just compile, and he began building in analysis tools that helped to improve optimisation. These tools also highlighted code issues, and he looked at ways to show these to the programmers. In the end, he abandoned the compiling stuff and concentrated on the code analysis, producing, within a company he founded called SofCheck, a set of static code analysis tools for Java and Ada.
One of his licensees was AdaCore, which incorporated SofCheck Inspector into its CodePeer tool, and which, early in 2012, bought the company, making Taft the Director of Language Research. He took with him his work on ParaSail (Parallel Specification and Implementation Language).
His reason for working on ParaSail is that he saw that it would soon be necessary to develop systems to run on hundreds of cores. He argues not just that today’s languages will be difficult to extend to cope with this level of parallelism, but that constructs within many of them make it certain that they won’t be suitable at all. His particular list of parallelisation breakers includes:
Global Variables
Garbage-Collected Global Heap
Parameter Aliasing
Run-Time Exception Handling
Explicit Threads
Explicit Lock/Unlock
Explicit Wait/Signal
Race Conditions
and worst of all …
Pointers
If you are having a meal with Taft and want to catch up on eating, just ask him what is wrong with pointers in a parallel environment. Plenty of time to eat and listen. So ParaSail excludes pointers as well as all the other evil language constructs on his list.
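To make the danger concrete, here is a minimal sketch of my own (not Taft’s) showing how three items on that list interact: a global variable, explicit threads, and a missing lock produce a classic race condition in C++.

```cpp
#include <iostream>
#include <thread>

long counter = 0;  // global variable shared by both threads, unprotected

void worker() {
    for (int i = 0; i < 100000; ++i)
        ++counter;  // read-modify-write: the two threads can interleave here
}

int main() {
    std::thread t1(worker), t2(worker);  // explicit threads, no lock/unlock
    t1.join();
    t2.join();
    std::cout << counter << '\n';  // rarely 200000: updates are silently lost
}
```

The program compiles cleanly and usually runs without complaint; the lost updates show up only as wrong answers, which is exactly why Taft wants such constructs designed out of the language rather than policed by programmer discipline.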
(Catch up on these arguments on the ParaSail blog, which also links through to a download of the language, libraries, etc.)
ParaSail is designed to look familiar to anyone used to working with something like Java or C++. It uses an object-oriented model (and replaces the pointers with expandable and shrinkable objects). But the big initial difference is that it requires the programmer to work hard to write sequential code, the default mode being parallel.
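Setting ParaSail syntax aside, the inversion of defaults can be sketched in C++ terms. As I understand the ParaSail design, independent computations such as the two calls below may be evaluated in parallel automatically, and it is explicit sequencing that the programmer must ask for; in C++ the defaults run the other way, and parallelism takes explicit machinery (the functions here are illustrative stand-ins):

```cpp
#include <future>

// Two independent, side-effect-free computations (hypothetical stand-ins).
int f(int x) { return x * x; }
int g(int x) { return x + 1; }

int combine(int x) {
    // C++ default: f(x) and g(x) would run one after the other.
    // Opting in to parallelism has to be spelled out:
    auto fa = std::async(std::launch::async, f, x);  // spawn f as a task
    int b = g(x);                                    // run g meanwhile
    return fa.get() + b;  // ParaSail would be free to do this implicitly
}

int main() { return combine(3) == 13 ? 0 : 1; }
```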
The compiler is obviously an integral part of the language, and Taft’s experience with developing compilers and code analysis tools contributes to the compiler’s error checking and debugging. It is, in effect, a high-powered and specialised code analyser as well as a compiler: it looks for, and eliminates, race conditions and other run-time error conditions, such as using null values or uninitialised data, indexing outside an array, numeric overflow, dangling references, etc. The compiler also breaks the program into hundreds of micro-threads, which can be assigned to the multiple processors/cores for execution.
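The micro-thread idea can be approximated by hand in today’s languages. A hedged sketch, again in C++ rather than ParaSail (the cutoff of 4096 is an arbitrary assumption): recursively splitting a job into small tasks gives the runtime many independent pieces to spread across cores, which is roughly the decomposition ParaSail’s compiler is said to perform automatically.

```cpp
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Split a summation into many small tasks a scheduler can farm out to cores.
long long parallel_sum(const long long* data, std::size_t n) {
    if (n <= 4096)  // small enough: just run sequentially
        return std::accumulate(data, data + n, 0LL);
    std::size_t half = n / 2;
    auto left = std::async(std::launch::async, parallel_sum, data, half);
    long long right = parallel_sum(data + half, n - half);  // this thread takes the rest
    return left.get() + right;
}

int main() {
    std::vector<long long> v(1 << 20, 1);  // about a million ones
    return parallel_sum(v.data(), v.size()) == (1 << 20) ? 0 : 1;
}
```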
Obviously Taft is a great enthusiast for ParaSail; it is his creation. It’s up to you to carry out a closer examination to see if it is likely to meet your parallel processing requirements. But whatever the future for ParaSail, programmers are going to have to bite the bullet and accept that, if they are going to efficiently develop programs for embedded applications that deploy massively parallel processing, they are going to require a new language. There will be a rear-guard action of further attempts to force existing languages into a parallel processing paradigm, but these are going to be inefficient and will probably require intensive debugging, for which the tools do not yet exist. The big fear is that the recognition of this need is going to spark an explosion of new languages, muddying the waters still further and starting yet more software wars of religion.
Great article about a problem that surely must be solved if we are to keep getting gains in computing performance. I came across this very informative article on the topic by Ed Lee at Berkeley a while back. Interesting stuff, especially the anecdote about Ptolemy in section 5.
http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-1.pdf
Andy Haines
“The linear execution of a program by a processor is unnatural.” Well, no. It is how logical proofs work, but more than that, it is how you write a list of instructions for someone else to do something. Do this, then do that; if you see this situation, do something else; do this N times; etc.
We have had both serial and parallel activities for a very long time, indeed. There are lists of actions to do a task, and there are Gantt charts of parallel activities to get a project done. (Or whatever they were called in Roman times.)
In the machine control world, the CPU does the step-at-a-time task stuff, and the PLC handles the parallel stuff of connecting the contacts in to the relays out in parallel, in real time.
The parallel activity of the PLC was coded as ladder diagrams, which evolved into function blocks (remember schematics?) and then into things like Labview and Matlab/Simulink.
The primary problem is what abstraction to use to represent the parallel activities. The closest analogue in the land of computers is data flow, where you have parallel streams of data that interact and weave together to generate the output. A closely related problem is the graphical representation of this situation, so that you can see how the streams flow and interact in time.
Because if time is not important, you do not need multiple cores!
Perhaps what I should have said is that serial activities, in the overwhelming number of cases, are only a subset of a larger set of parallel activities.
If all we needed was a new programming language to solve the multi-core programming problem, it would have been solved years ago. There have been, literally, dozens of parallel programming languages created over the past few decades. So far, none have succeeded in going mainstream.
There’s a Wikipedia page which lists parallel and concurrent programming languages. I count 85 there. But it’s missing at least a dozen obscure ones which came out of MIT over the years.
I admire Tucker’s enthusiasm, and I wish him well. But it will take more than another great programming language to solve this problem.
Agreed that linear execution of a program is unnatural when compared to the native processes of life and matter. The linear-sequential approach, however, is a most appropriate method of manipulating strings of symbols to decrypt encoded messages, which was Alan Turing’s original motivation for developing a theory of automated computation. Computation can also produce a desired string of symbols from any given string, by applying suitable Boolean and arithmetic operations, transformations and translations. Computation is possibly the best means with which to manipulate symbol strings, regardless of intrinsic meaning. Computation is the workhorse of data management, or data processing, as it is more commonly known.
Do we need yet another programming language to attempt to overcome the inherent limitations of linear-sequential data-processors, or is programming and computation perhaps not the best method to achieve the next level in our drive to automate everything?
Automation, at its most useful, monitors and controls a physical process. It amounts to mechanical decision-making, assessing and acting on such questions as: does the process need correction to continue proper function, should a maintenance person be alerted, or should the process be halted to ensure the safety of personnel or equipment?
At present, the same linear-sequential faculty that is so apt at doing the grunt-work of data-processing has been applied toward management of physical processes. But data-processing is not an appropriate tool for process management, which is best handled in parallel-concurrent fashion. The data-processing solution, when applied to process management, becomes complex and always occurs after-the-fact, due to polling and instruction-fetch and -execution (software) delays. Using data-processing for process management is like making your lowest-rated assembly worker (who must be guided by explicit instructions every step of the way) suspend his labor and act as manager when supervisory duties are required.
Truly correct process management requires an instinctive and immediate response, wherever and whenever the process events and condition changes occur. Better process management thus demands a different kind of tool than linear-sequential data-processing.
I propose a more appropriate process management tool that would have superior managerial qualities and abilities and would be:
Natively parallel-concurrent: It would react immediately to changes in process management events and conditions without affecting other supervisory duties. It could accommodate changes and additional tasks without disrupting or modifying existing functions.
Compatible: It could exist in parallel with linear-sequential data-processors, but would have separate facilities for independent operation.
Dwyland writes: “In the machine control world, the CPU does the step-at-a-time task stuff, and the PLC handles the parallel stuff of connecting the contacts in to the relays out in parallel, in real time.”
It is not that simple: In modern PLCs, the inputs are scanned, the logic performed, and the output conditions are executed—all from programs in step-at-a-time microprocessors or microcontrollers. The result may be touted as “real time” but it is really linear-sequential operation just like general-purpose computers. (Check up on PLC *latency* issues.) See http://falconlabs.com/PLCTIPS.htm…
The Labview/Matlab/Simulink examples are essentially computer (line-by-line) programs, as well.
CharlieM
True; there is a bit of yin and yang here. PLCs implement relay ladder diagrams, which originally showed how to wire up a whole room of contacts and relays to control an automobile assembly line. BTW, this is where the term “relay rack” originated: the racks held the relays being wired up. The relays and contacts all worked independently and in parallel.
The PLC converted the ladder diagrams into sequential code for a 1-bit, 8-instruction computer. Each ladder diagram element became a sequence such as: “If contact 13 is closed and contact 245 is open, turn on relay K561,” a 3-instruction program sequence. The full ladder diagram was implemented as a long list of these instructions.
The key was that you could go through 8,000 of these 1-bit instructions in 1/2 cycle of the 60 Hz line frequency. Since the original relays ran on 60 Hz AC, the PLC was effectively as fast as the original, parallel set of relays.
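For the curious, that works out to roughly a microsecond per instruction: half a 60 Hz cycle is about 8.3 ms, and 8.3 ms divided by 8,000 instructions is just over 1 µs each. A hypothetical sketch of such a scan loop in C-style code, with made-up I/O helpers, might look like this:

```cpp
#include <bitset>

std::bitset<4096> inputs;   // image of the input contacts
std::bitset<4096> outputs;  // image of the output relay coils

void scan_cycle() {
    // read_inputs(inputs);    // hypothetical helper: latch the physical inputs
    // The rung from the example above: contact 13 closed AND contact 245 open
    // energises relay K561.
    outputs[561] = inputs[13] && !inputs[245];
    // ...thousands more rungs, evaluated strictly one after another...
    // write_outputs(outputs); // hypothetical helper: drive the physical relays
}

int main() {
    for (;;) scan_cycle();  // repeat fast enough to appear instantaneous
}
```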
The PLC uses a repeating set of sequential instructions to implement a parallel algorithm model. It works as long as the execution time of each pass through the sequence is fast enough to appear instantaneous to the algorithms being implemented. PID control loops are another example of this method. In the PID case, the update rate is of the order of 100 times the loop bandwidth, to keep the phase delay below 5 degrees.
Even though PLCs and PID loops may be sequential inside, they are implementing a parallel design. The challenge is how best to represent this multiple, parallel thread abstraction.
Labview/Matlab/Simulink can be implemented as sequential code on a single computer, on multiple computers in parallel or as truly parallel hardware in one or more FPGAs, depending on the timing requirements (propagation delay) of the design. And you can easily move from one configuration to another to trade hardware for performance, because the problem is expressed in parallel terms. OTOH, you typically cannot do this with single thread, sequential computer programs because of the difficulty of breaking the program into parallel chunks.
Dwyland, if “Labview/Matlab/Simulink … is expressed in parallel terms,” (as you put it) what is the challenge?
Hasn’t it already been done in those tools, or is there something remaining to solve?
An earlier commenter wrote:
“If all we needed was a new programming language to solve the multi-core programming problem it would have been solved years ago. There have been, literally, dozens of parallel programming languages created over the past few decades. So far, none have succeeded in going mainstream.”
There are features (or perhaps, lack of certain impediments) that make ParaSail different from most of the hundreds of other parallel programming languages out there. But you are certainly right that having a new programming language is not sufficient, even if you buy my view that it is necessary, to solve the multi-core programming problem.
Not to plug a competitor, but if you want more details on the guts of ParaSail, you might want to check out the longer EETimes article:
http://www.eetimes.com/design/embedded/4375616/ParaSail–Less-is-more-with-multicore
For even more gory details, check out the blog, or download the current version (pointed-to from the blog):
http://parasail-programming-language.blogspot.com