Distributed Computation

Solving a specific task with a computer, as usually understood, is a sequential process: we have a list of commands and data, and the computer executes the commands one after the other. The following figure shows a very simplified example.
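
As a minimal sketch (the task itself, summing squares, is invented for illustration), a sequential program in Python is just one instruction executed after another:

    def sum_of_squares(n):
        # Each iteration runs only after the previous one has finished.
        total = 0
        for i in range(n):
            total += i * i
        return total

    print(sum_of_squares(1_000_000))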

Computers can also multitask, meaning that they can deal with (execute) several unrelated tasks at the same time. With just one processor this multitasking is only apparent, since the tasks have to be executed in turns: they seem to have been solved at the same time, but the time required is the sum of the times required by all the tasks. This is shown in the following figure.
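
A hedged illustration of this in Python (the two tasks are invented for the example): CPython's global interpreter lock forces CPU-bound threads to take turns on a single core, so two "simultaneous" tasks still take roughly the sum of their individual times:

    import threading
    import time

    def task(name, n):
        # A CPU-bound loop standing in for some unrelated task.
        total = 0
        for i in range(n):
            total += i
        print(name, "finished")

    start = time.perf_counter()
    a = threading.Thread(target=task, args=("task A", 5_000_000))
    b = threading.Thread(target=task, args=("task B", 5_000_000))
    a.start(); b.start()
    a.join(); b.join()
    # Under CPython's global interpreter lock these threads alternate on
    # one core, so the elapsed time is close to the sum of both tasks.
    print("elapsed:", round(time.perf_counter() - start, 2), "seconds")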

For some years now we have had processors with several cores, such as the Intel Dual Core or the Core ix family, in which not one but two, four or even eight cores share memory and are able to work in parallel. Nowadays they are very common in our desktops. In a system with two or more cores we can carry out true multitasking, because we can execute more than one command simultaneously, as shown in the following figure.
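
A minimal sketch of this true simultaneity, using Python's standard concurrent.futures module (the workload is again invented): each task runs in its own operating-system process, so on a multi-core machine they can execute at literally the same time:

    import time
    from concurrent.futures import ProcessPoolExecutor

    def task(n):
        total = 0
        for i in range(n):
            total += i
        return total

    if __name__ == "__main__":
        start = time.perf_counter()
        # Two separate processes: two cores can run them truly in parallel.
        with ProcessPoolExecutor(max_workers=2) as pool:
            results = list(pool.map(task, [5_000_000, 5_000_000]))
        print("elapsed:", round(time.perf_counter() - start, 2), "seconds")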

This suggests the idea of using that ability to solve problems faster: parallel computing. Here we do not want to execute several unrelated tasks (as in the previous example) but to solve one specific problem as fast as possible. This is done by means of so-called parallelisation: instead of executing a program sequentially, we execute several parts of it at the same time and produce the final result once every part has been processed (shown in the following figure).
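
As a sketch of parallelisation itself (splitting one problem rather than running unrelated tasks), the following reuses the invented sum-of-squares problem: the range is cut into independent chunks, each chunk is processed in parallel, and the partial results are combined at the end:

    from concurrent.futures import ProcessPoolExecutor

    def partial_sum(bounds):
        lo, hi = bounds
        return sum(i * i for i in range(lo, hi))

    if __name__ == "__main__":
        n, workers = 10_000_000, 4
        step = n // workers
        # Split the single problem into independent parts...
        chunks = [(k * step, (k + 1) * step) for k in range(workers)]
        chunks[-1] = (chunks[-1][0], n)  # last chunk absorbs any remainder
        with ProcessPoolExecutor(max_workers=workers) as pool:
            partials = pool.map(partial_sum, chunks)
        # ...and combine the partial results into the final answer.
        print(sum(partials))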

Note that not every problem can be solved with parallelisation; it depends on the problem's own nature (mainly, on whether the final result can be built from partial results that are unrelated to one another). In addition, if a problem requires a time T to be solved with a single processor, this does not mean that it will require T/2 with two processors, T/3 with three, and so on. In real life, when executing a program in parallel some parts are completed before others, and the effort of synchronising and communicating the partial results has a cost in terms of time. Because of this, there is a point where, no matter how many processors you have, the time required does not improve.
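
The text does not name it, but this limit is usually formalised as Amdahl's law: if a fraction s of the work is inherently serial (synchronisation, communication, non-parallelisable steps), the speedup with p processors is at most 1 / (s + (1 - s) / p). A quick calculation shows the diminishing returns:

    def speedup(serial_fraction, processors):
        # Amdahl's law: the serial part gains nothing from extra processors.
        return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

    for p in (1, 2, 4, 8, 64, 1024):
        print(p, round(speedup(0.10, p), 2))
    # With 10% serial work the speedup flattens out below 10x,
    # no matter how many processors are added.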

Even with these drawbacks, computers with multiple processors are the preferred solution for tasks that require huge amounts of calculation. This is because a single processor cannot increase its speed and memory indefinitely, while nothing stops us (in theory) from putting together as many processors as we want and making them work simultaneously as a single supercomputer. A good example is the Mare Nostrum supercomputer in Barcelona, one of the most powerful computers in the world, composed of 10,240 processors.

Generalising this concept, we arrive at so-called distributed computation, where the supercomputer is a virtual computer: it does not exist physically as a single entity, and instead each of its processors (or nodes) is itself an independent computer, all of them linked together by a network. Each node receives a task or sub-problem, solves it, and returns the result. At the end, the combined results form the solution to the original, complete problem.
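
A minimal sketch of this node/coordinator scheme, using only Python's standard xmlrpc modules (the hostnames, port, and the prime-counting task are all assumptions made for the example):

    # node.py -- runs on each independent computer in the network
    from xmlrpc.server import SimpleXMLRPCServer

    def count_primes(lo, hi):
        # Each node solves the sub-problem it receives and returns the result.
        def is_prime(n):
            if n < 2:
                return False
            d = 2
            while d * d <= n:
                if n % d == 0:
                    return False
                d += 1
            return True
        return sum(1 for n in range(lo, hi) if is_prime(n))

    server = SimpleXMLRPCServer(("0.0.0.0", 8000))
    server.register_function(count_primes)
    server.serve_forever()

    # coordinator.py -- splits the problem and gathers the results
    import xmlrpc.client

    nodes = ["http://node1:8000", "http://node2:8000"]  # hypothetical hosts
    n = 2_000_000
    step = n // len(nodes)
    proxies = [xmlrpc.client.ServerProxy(url) for url in nodes]
    partials = [proxy.count_primes(k * step, (k + 1) * step)
                for k, proxy in enumerate(proxies)]
    # The sum of the nodes' results is the answer to the whole problem.
    print(sum(partials))

Here the coordinator plays the role of the virtual supercomputer: the nodes never talk to each other, only to it, which is exactly the "each node receives a task, solves it, and returns the result" pattern described above.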
