Last year, Nimbic put a lot of focus on their cloud implementation – to the point of changing the company name (erstwhile Physware). This year, part of their focus has been on implementing their tools on so-called “private clouds”: making use of the large server farms that some companies have. The drivers for this are the existence of these farms – why not use them? – as well as the usual security concerns that, while not universal, still dog the whole public cloud question.
But this now starts to sound a whole lot like an enterprise installation of the tools on a corporate farm, managed by LSF – a trip back, oh, 20 years or so. Is that, in fact, the case?
Not really. The old model is one of letting LSF assign a particular job to some available server (perhaps one with specific required characteristics). But the key is that LSF schedules independent jobs. The cloud implementation actually exploits two further levels of parallelism. One is the obvious ability to take advantage of multicore within a single system. But it also allows a single job to be distributed over multiple systems, with those systems communicating via MPI.
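To make the distinction concrete, here is a minimal sketch of the single-job-across-nodes pattern, with Python's `multiprocessing` standing in for MPI ranks. This is purely illustrative, not Nimbic's actual code; the names `solve_slice` and `run_job` are invented, and the "work" is a stand-in computation.

```python
# One job, statically sliced across several "nodes" (processes here),
# with partial results gathered at a root -- analogous to an MPI
# scatter/compute/gather pattern. multiprocessing stands in for MPI.
from multiprocessing import Pool

def solve_slice(slice_points):
    # Hypothetical per-node work: each node handles its fixed share
    # of the problem (summing squares as a stand-in for real solving).
    return sum(p * p for p in slice_points)

def run_job(points, n_nodes):
    # Static partition: slices are fixed up front, before any node runs.
    slices = [points[i::n_nodes] for i in range(n_nodes)]
    with Pool(n_nodes) as pool:
        partials = pool.map(solve_slice, slices)  # "scatter" + compute
    return sum(partials)                          # "gather" / reduce

if __name__ == "__main__":
    print(run_job(list(range(1000)), 4))
```

The contrast with the LSF model is that LSF would treat each of those four workers as unrelated jobs; here they are cooperating pieces of one job, which is what forces the coordination discussed next.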
This requires much more coordination than the old model, and it also requires that the server machines be of roughly the same class, since intra-job load balancing is done statically: each machine's share of the work is fixed when the job is partitioned, so a slow machine holds up the whole job.
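A toy model shows why static partitioning wants matched machines. The speeds below are made-up numbers for illustration, and `job_time` is an invented helper, not anything from Nimbic's tools:

```python
# With a static, equal split of work, the job finishes only when the
# slowest node does -- aggregate horsepower doesn't save you.
def job_time(total_work, node_speeds):
    share = total_work / len(node_speeds)  # equal share, fixed up front
    return max(share / s for s in node_speeds)

# Four matched nodes:
matched = job_time(400, [1.0, 1.0, 1.0, 1.0])  # 100.0
# Same aggregate speed (4.0), but one slow node gates the whole job:
mixed = job_time(400, [1.3, 1.3, 1.3, 0.1])    # roughly 10x worse
```

Dynamic load balancing could steal work back from the slow node, but at the cost of exactly the extra coordination the static scheme avoids.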
This adjustment is but one of several we’ll see over the next little while as companies refine their approach to the cloud.