Parallel Computing Toolbox
We demonstrate three concepts in this demo: parallel for-loops (parfor), sliced variables, and the MATLAB pool.
To demonstrate these concepts we use the example of computing the maximum safe velocity of a bobsled traveling through a series of icy banked curves of varying radius R.
Given an icy curve of radius R with elevation angle $\theta$, we can compute the maximum safe (i.e., with no lateral skidding) velocity v of a bobsled entering this curve via $v = \sqrt{R g \tan\theta}$, where g is the gravitational acceleration.
In this demo, for simplicity, we assume that there is no friction between the sled and the ice, and that the curve is banked at an angle of 30 degrees.
The figure above shows the forces acting on the bobsled as it enters an elevated curve of radius R: the normal force N, the gravitational force, and the reactive centrifugal force. For the bobsled to remain safely in the track while inside the curve, the reactive centrifugal force has to be equal to or less than the centripetal force exerted on the bobsled by the track.
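The velocity formula can be recovered from a force balance on the sled. The following is a sketch under the frictionless assumption stated above; the sled mass $m$ is a symbol introduced here for illustration and cancels out of the result.

```latex
% Vertical balance: the vertical component of the normal force supports the weight.
N \cos\theta = m g
% Horizontal balance: the horizontal component of N supplies the centripetal force.
N \sin\theta = \frac{m v^{2}}{R}
% Dividing the second equation by the first eliminates N and m:
\tan\theta = \frac{v^{2}}{R g}
\quad\Longrightarrow\quad
v = \sqrt{R g \tan\theta}
```

For example, with R = 100 m, g = 9.8 m/s^2, and an angle of 30 degrees, this gives a velocity of about 23.8 m/s.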
In this section, we present an implementation of the sequentialBobsledVelocity function, which computes the maximum safe velocity maxV of a bobsled on an icy curve elevated at angle $\theta$. The function computes the maximum safe velocities for a number of curve radii and is implemented as a nested function. It uses a for-loop in place of a more optimized "vectorized" implementation to demonstrate how to use parfor.
First, we set the number of radii that we want to compute maximum safe velocities for, then we initialize the vector R representing the series of curves that our bobsled enters, and finally we initialize the angle to 30 degrees.
radii = 2e6;
R = linspace(1, 10000, radii);
theta = 30;
tic
Call the sequential function for computing the maximum safe bobsled velocities.
[~] = sequentialBobsledVelocity(radii, R, theta);
Measure the sequential elapsed time.
seqElapsedTime = toc;
Implementation of the sequentialBobsledVelocity function.
% This function uses a for-loop in place of a more optimized "vectorized"
% implementation to serve as a demonstration on how to use parfor.
function maxV = sequentialBobsledVelocity(radii, R, theta)
    maxV = zeros(size(R));  % Initialize the output vector to 0's.
    g = 9.8;                % Gravitational acceleration (m/s^2).
    for i = 1:radii
        % Compute the maximum safe bobsled velocity in a curve
        % elevated at angle theta for specific radius R.
        maxV(i) = sqrt(R(i)*g*tand(theta));
    end
end
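For comparison, the "vectorized" form that the comment above alludes to might look like the following. This is a minimal sketch, not part of the original demo; it uses a small illustrative input of five radii rather than the demo's 2e6.

```matlab
% Vectorized sketch: compute all maximum safe velocities in one expression.
R = linspace(1, 10000, 5);  % Small illustrative set of radii (assumption).
theta = 30;                 % Elevation angle in degrees.
g = 9.8;                    % Gravitational acceleration (m/s^2).

% Element-wise operations replace the explicit loop over R.
maxV = sqrt(R .* g .* tand(theta));
```

Because this form has no loop, it offers nothing to convert to parfor, which is why the demo keeps the explicit for-loop.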
In this section, we modify our sequential implementation of the maximum safe bobsled velocity function to use parallel for-loops (parfor) and sliced variables.
We start by analyzing the sequential for-loop implementation. In each iteration of the for-loop, only one element of each of the vectors maxV and R is accessed, indexed by the variable i. Because both maxV and R are indexed by the same value of i, all iterations of the for-loop are independent of each other and can be executed in parallel in any order. The only information that iteration i of the for-loop needs is the contents of R at index i. The variable i is called a slicing index variable, maxV is called a sliced output variable, and R is called a sliced input variable. The variables theta and g are broadcast variables, and every worker has a local copy of them. Since all iterations of the for-loop are independent of each other, we only need to change for to parfor in the function implementation and open a MATLAB pool to take advantage of parallel computing resources.
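To make the independence requirement concrete, the sketch below contrasts the demo's loop with a hypothetical loop that carries a dependency between iterations. The running total is invented purely for illustration and does not appear in the demo; both loops are written with plain for so that the snippet runs as is.

```matlab
n = 5;                      % Small illustrative size (assumption).
R = linspace(1, 10000, n);
theta = 30;
g = 9.8;

% Independent iterations: iteration i reads only R(i) and writes only maxV(i).
% This is the pattern that can be converted to parfor.
maxV = zeros(size(R));
for i = 1:n
    maxV(i) = sqrt(R(i)*g*tand(theta));
end

% Dependent iterations: iteration i reads total(i-1), which an earlier
% iteration wrote. The order of execution matters, so this loop is not
% a candidate for parfor.
total = zeros(size(R));
total(1) = maxV(1);
for i = 2:n
    total(i) = total(i-1) + maxV(i);
end
```

In the second loop, total is indexed by both i and i-1, so it does not fit the sliced-variable pattern described above.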
A MATLAB pool is a collection of MATLAB workers that can be used to offload work from the MATLAB client interactively and in parallel. To take full advantage of parfor to speed up computations, a MATLAB pool has to be open. If no MATLAB pool is open, parfor executes sequentially in the client MATLAB session, i.e., the session in which the user issues the matlabpool open command. All code outside of parfor is executed in the client session.
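A pool could be opened from the client before running the demo along the following lines. This is a sketch that assumes a release in which the matlabpool command is available and a default cluster configuration is set up; the number of workers it starts depends on that configuration.

```matlab
% Open a MATLAB pool with the default configuration if none is open yet.
if matlabpool('size') == 0
    matlabpool open
end

% ... run the parfor-based code here ...

% Release the workers once the parallel work is finished.
matlabpool close
```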
Using sliced input and output variables allows parfor to transfer only the minimum necessary amount of data to each worker in the MATLAB pool, and then use a minimum amount of communication to accumulate the partial results at the end of the parfor loop. This minimizes the communication overhead and increases computational efficiency.
Refer to the parfor documentation for more details. See the Advanced Topics section of the parfor documentation for more information on the classification of variables and sliced variables.
First, we verify that the MATLAB pool is open.
poolSize = matlabpool('size');
if poolSize == 0
    error('parallel:demo:poolClosed', ...
          'This demo needs an open MATLAB pool to run.');
end
fprintf('This demo is running on %d MATLABPOOL workers.\n', ...
        matlabpool('size'));
This demo is running on 8 MATLABPOOL workers.
We scale the number of radii by the size of the MATLAB pool to demonstrate the parfor weak scaling capability.
parforRadii = radii * poolSize;
R = linspace(1, 10000, parforRadii);
tic
Call the parallel function for computing maximum safe bobsled velocity.
[~] = parallelBobsledVelocity(parforRadii, R, theta);
Measure the parallel elapsed time.
parElapsedTime = toc;
Implementation of the nested parallelBobsledVelocity function.
function maxV = parallelBobsledVelocity(radii, R, theta)
Each iteration is independent and involves only a single element R(i) of the sliced input vector, a single element maxV(i) of the sliced output vector, and the two broadcast variables g and theta. Therefore, we only change for to parfor to execute the loop iterations in parallel. parfor automatically distributes the sliced input variable R to the MATLAB pool workers as needed and broadcasts the variables g and theta to all workers.
    maxV = zeros(size(R));  % Initialize the output vector to 0's.
    g = 9.8;                % Gravitational acceleration (m/s^2).
    parfor i = 1:radii
        % Compute the maximum safe bobsled velocity in a curve
        % elevated at angle theta.
        maxV(i) = sqrt(R(i)*g*tand(theta));
    end
parfor automatically accumulates all the partial results from the workers into the sliced output variable maxV on the client.
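One way to gain confidence that the accumulated result is correct is to compare it with a reference computed without parfor. The following is a sketch on a small illustrative input; the names maxVPar, maxVRef, and maxAbsDiff, and the tolerance, are assumptions introduced here.

```matlab
n = 1000;                   % Small illustrative size (assumption).
R = linspace(1, 10000, n);
theta = 30;
g = 9.8;

% Parallel loop, as in parallelBobsledVelocity.
maxVPar = zeros(size(R));
parfor i = 1:n
    maxVPar(i) = sqrt(R(i)*g*tand(theta));
end

% Reference result computed in a single expression on the client.
maxVRef = sqrt(R .* g .* tand(theta));

% Largest element-wise discrepancy between the two results.
maxAbsDiff = max(abs(maxVPar - maxVRef));
assert(maxAbsDiff < 1e-9, 'Parallel and reference results differ.');
```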
We compare the execution times of the sequential and parallel loop implementations. Although the execution times of the two implementations are comparable, the parfor implementation executes poolSize times as many loop iterations and therefore computes safe velocities for poolSize times as many radii in about the same time. In addition, the only code change required to achieve this increase in throughput is replacing the for keyword with parfor in the main loop of the program.
fprintf('The sequential-loop function executed in %8.2f', ...
        seqElapsedTime);
fprintf(' seconds\n and computed %4.2e safe velocities per second.\n', ...
        radii/seqElapsedTime);
fprintf('The parallel-loop function executed in %8.2f', ...
        parElapsedTime);
fprintf(' seconds\n and computed %4.2e safe velocities per second.\n', ...
        parforRadii/parElapsedTime);
The sequential-loop function executed in 23.84 seconds
 and computed 8.39e+04 safe velocities per second.
The parallel-loop function executed in 22.25 seconds
 and computed 7.19e+05 safe velocities per second.
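The reported timings can be condensed into a throughput speedup and a weak-scaling efficiency. The sketch below hard-codes the values printed above; the names seqRate, parRate, speedup, and efficiency are introduced here for illustration and are not part of the demo.

```matlab
% Values taken from the output above.
radii          = 2e6;
poolSize       = 8;
seqElapsedTime = 23.84;     % seconds
parElapsedTime = 22.25;     % seconds

% Throughput in safe velocities per second.
seqRate = radii / seqElapsedTime;
parRate = (radii * poolSize) / parElapsedTime;

% Throughput gain of the parallel run over the sequential run.
speedup = parRate / seqRate;        % approximately 8.57

% Weak-scaling efficiency: 1 means the work per worker took as long
% as the sequential run on the same per-worker problem size.
efficiency = speedup / poolSize;    % approximately 1.07

fprintf('Speedup: %.2f, weak-scaling efficiency: %.2f\n', ...
        speedup, efficiency);
```

An efficiency near 1 is consistent with the observation that the two runs take about the same time although the parallel run processes poolSize times as many radii.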