Parallelism across time
Block methods, special PC methods and RK methods allow a limited amount of parallelism, which is called
method parallelism. The only way it seems possible to exploit large-scale parallelism
on small problems is via parallelism across time with iterative methods.
In order to improve the efficiency of numerical integrators, it has been
proposed to use continuous-time iteration methods to decouple the
ODE and then to discretize the resulting subsystems. One such approach is
the waveform relaxation (WR) method.
A method that is massively parallel across time for ODEs must simultaneously evaluate the right-hand side of the
differential equation $y' = f(t,y)$ at many different time points $t_n$. These evaluations require approximations of the solution
$y(t_n)$ in order to compute a better approximation, so such a method is essentially an iterative method.
Let the successive iterates be denoted by $y_n^{(m)}$ for $m = 0, 1, \ldots$. For each value of $m$ the approximation
can be thought of as a discrete approximation to a waveform, so such a method is really a waveform method. The
number $N$ of discrete points $t_n$, $n = M+1, \ldots, M+N$, handled simultaneously is the degree of parallelism of the method.
The obvious approach for nonlinear ODEs is to linearize and use iteration.
The simplest iterative method is the Picard method.
Example: Picard's method
There is little natural parallelism across the system when solving IVPs, so if the system size is moderate
it is necessary to use temporal techniques in order to exploit
any massive parallelism. The simplest technique in this respect is the Picard
method, which generates a sequence of iterative solutions $y^{(0)}(t), y^{(1)}(t), \ldots$
on the region of integration satisfying the equations
$$y'^{(k+1)}(t) = f(t, y^{(k)}(t)), \qquad y^{(k+1)}(t_0) = y_0,$$
starting from $y^{(0)}(t) = y_0$.
If the dimension of the system is $m$, then this iterative approach decouples the problem
into $m$ independent quadrature problems that can be solved in parallel. Unfortunately, the iterations converge very slowly unless $\|\partial f / \partial y\|$ is small.
Once the decoupling has taken place, different quadrature methods can be used to discretize the resulting subsystems.
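As an illustration, the following is a minimal sketch of the Picard iteration on a uniform grid, using cumulative trapezoidal quadrature for the decoupled integrals; the routine name `picard` and all parameters are illustrative, not from any particular library.

```python
import numpy as np

def picard(f, y0, t0, tf, n_steps=200, n_iters=20):
    """One possible Picard iteration: given the previous waveform y^(k),
    each sweep reduces the ODE to independent quadrature problems,
    one per component, solved here with the trapezoidal rule."""
    t = np.linspace(t0, tf, n_steps + 1)
    h = t[1] - t[0]
    y0 = np.asarray(y0, dtype=float)
    y = np.tile(y0, (n_steps + 1, 1))                    # y^(0)(t) = y0
    for _ in range(n_iters):
        rhs = np.array([f(tn, yn) for tn, yn in zip(t, y)])
        # y^(k+1)(t) = y0 + integral_{t0}^{t} f(s, y^(k)(s)) ds
        incr = 0.5 * h * (rhs[1:] + rhs[:-1])            # trapezoid increments
        y = y0 + np.vstack([np.zeros_like(y0), np.cumsum(incr, axis=0)])
    return t, y

# usage: y' = -y, y(0) = 1 on [0, 2]; the last row approximates y(2) = e^{-2}
t, y = picard(lambda t, y: -y, [1.0], 0.0, 2.0)
print(y[-1, 0], np.exp(-2.0))
```

Within a sweep each component (and each quadrature node) depends only on the previous waveform, so the work parallelizes across both components and time points.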
One way to improve the convergence is by the technique of time-windowing in which the region
of integration is split up into a series of windows and the iterative process then
takes place on each window.
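A windowed driver can be sketched as a thin wrapper around any waveform iteration, for instance the hypothetical `picard` routine above; the convergence handling per window is deliberately simplified to a fixed number of sweeps.

```python
import numpy as np

def windowed(solver, f, y0, t0, tf, n_windows=10, **kwargs):
    """Time-windowing: iterate on each short window in turn, passing the
    converged end value on as the next window's initial condition."""
    edges = np.linspace(t0, tf, n_windows + 1)
    ts, ys, y_start = [], [], np.asarray(y0, dtype=float)
    for a, b in zip(edges[:-1], edges[1:]):
        t, y = solver(f, y_start, a, b, **kwargs)   # e.g. solver=picard
        ts.append(t)                                # window endpoints repeat;
        ys.append(y)                                # acceptable for a sketch
        y_start = y[-1]                             # hand off to next window
    return np.concatenate(ts), np.vstack(ys)
```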
Another way to improve the convergence behaviour is via the concept of splitting the
right-hand side, and this leads to the shifted Picard method (Skeel), which takes the
form
$$y'^{(k+1)}(t) - M_k y^{(k+1)}(t) = f(t, y^{(k)}(t)) - M_k y^{(k)}(t).$$
Clearly there are difficulties in knowing how to choose the
window size and the splitting matrix $M_k$ automatically.
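For concreteness, here is a minimal sketch of the shifted Picard iteration with a constant splitting matrix $M$ (in practice $M_k$ might approximate the Jacobian $\partial f/\partial y$); each iterate is a linear ODE, discretized below with implicit Euler. All names are illustrative.

```python
import numpy as np

def shifted_picard(f, M, y0, t0, tf, n_steps=200, n_iters=10):
    """Shifted Picard: each sweep solves the linear ODE
    y' - M y = f(t, y_k) - M y_k by implicit Euler."""
    t = np.linspace(t0, tf, n_steps + 1)
    h = t[1] - t[0]
    y0 = np.asarray(y0, dtype=float)
    A = np.linalg.inv(np.eye(y0.size) - h * M)   # (I - hM)^{-1}, reused each step
    y = np.tile(y0, (n_steps + 1, 1))
    for _ in range(n_iters):
        # frozen source term built from the previous waveform
        src = np.array([f(tn, yn) - M @ yn for tn, yn in zip(t, y)])
        y_new = np.empty_like(y)
        y_new[0] = y0
        for n in range(n_steps):
            y_new[n + 1] = A @ (y_new[n] + h * src[n + 1])
        y = y_new
    return t, y

# usage: y' = -50 y + sin(t) with the natural choice M = -50
f = lambda t, y: -50.0 * y + np.sin(t)
t, y = shifted_picard(f, np.array([[-50.0]]), [0.0], 0.0, 1.0)
```

If $M$ is diagonal the linear solves decouple componentwise, so the parallelism of the plain Picard method is retained.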
Waveform relaxation method
General form.
The Picard method is a particular example of a much more general class of iterative
methods known as waveform relaxation (WR) methods.
Waveform relaxation, also called dynamic iteration or Picard-Lindelöf iteration, is
a technique for solving large systems of ordinary differential equations.
These methods are based on an iterative scheme extended over the whole interval of integration $[t_0, t_N]$.
As part of this scheme, decoupled subsystems are solved concurrently.
Waveform relaxation methods have been developed for circuit analysis.
The system is decomposed into
$$y_i' = f_i(t, y_i, p_i), \qquad y_i(t_0) = y_{0i}, \qquad i = 1, \ldots, r,$$
with the decoupling vectors $p_i = (y_1, \ldots, y_{i-1}, y_{i+1}, \ldots, y_r)^T$.
Treating the vectors $p_i$ as known parameters allows the simultaneous solution of the $r$ independent IVPs.
The relaxation process starts with an initial approximation over the whole interval $[t_0, t_N]$, which is needed to initialize the decoupling vectors $p_1, \ldots, p_r$.
WR techniques can perform extremely
efficiently on problems arising from electrical network modeling.
These techniques have been generalized to general systems of equations with less
success, because the automatic reordering and efficient grouping of the subsystems is often performed on the
basis of physical insight, which is not always available. Slow convergence of the iterations can occur in the case of strong
coupling between subsystems.
To achieve large-scale parallelism we have to consider iterations that encompass the space of the system
or a significant part of the time interval. Some information from a previous iterate is used to compute the next iterate by replacing
the ODE $y' = f(t,y)$ with
$$y'^{(m+1)} = f(y^{(m)}) + G(y^{(m+1)}, y^{(m)}), \qquad y^{(m+1)}(0) = y_0,$$
where $G(y,z)$ is any function such that $G(z,z) = 0$.
The general form of a continuous-time WR method is given by
$$z'^{(k+1)}(t) = G(t, z^{(k+1)}(t), y^{(k+1)}(t), y^{(k)}(t)), \qquad z^{(k+1)}(t_0) = y_0,$$
$$y^{(0)}(t) = y_0, \qquad y^{(k+1)}(t) = g(t, z^{(k+1)}(t), y^{(k)}(t)),$$
where $G : [t_0, T] \times \mathbb{R}^m \times \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}^m$ and $g : [t_0, T] \times \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}^m$
satisfy $G(t,y,y,y) = f(t,y)$ and $g(t,y,y) = f(t,y)$. The functions $G$ and $g$ are called
the splitting functions and are chosen in an attempt to decouple (as much as possible) the
original system into independent subsystems, which can then be solved independently by a number of different
numerical methods.
A natural generalization of the Picard method is the simpler WR method
$$y'^{(k+1)}(t) = F(t, y^{(k+1)}(t), y^{(k)}(t)), \qquad y^{(k+1)}(t_0) = y_0,$$
where the splitting function $F : [t_0, T] \times \mathbb{R}^m \times \mathbb{R}^m \to \mathbb{R}^m$ satisfies $F(t,y,y) = f(t,y)$.
The Picard-Lindelöf iteration
is obtained for $F(t,y,z) = f(t,z)$. One of the main purposes of the splitting function is to decouple
the original system as much as possible; in other words, the aim is to obtain an iterative process with
as many independent subsystems as possible. If $f$ is continuously differentiable
and Lipschitz continuous in $y$, and $F$
is continuously differentiable and Lipschitz continuous
in its second and third arguments, then the sequence $y^{(k)}$ converges uniformly on $[t_0, t_f]$
to the solution $y$ of the ODE. The rate of convergence is superlinear. If the interval of integration
is long, a very large number of iterations may be necessary before reasonable approximations of
$y(t)$ are obtained far from $t_0$.
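A generic one-stage WR sweep can be sketched as follows, assuming the splitting function is supplied as a callable F(t, y, z) where y is the current iterate and z the frozen previous waveform; forward Euler is used only to keep the sketch short, and all names are illustrative.

```python
import numpy as np

def waveform_relaxation(F, y0, t0, tf, n_steps=400, n_iters=15):
    """One-stage WR: integrate y'^{(k+1)} = F(t, y^{(k+1)}, y^{(k)}) against
    the frozen previous waveform, here with forward Euler."""
    t = np.linspace(t0, tf, n_steps + 1)
    h = t[1] - t[0]
    y_prev = np.tile(np.asarray(y0, dtype=float), (n_steps + 1, 1))  # y^(0) = y0
    for _ in range(n_iters):
        y = np.empty_like(y_prev)
        y[0] = np.asarray(y0, dtype=float)
        for n in range(n_steps):
            y[n + 1] = y[n] + h * F(t[n], y[n], y_prev[n])
        y_prev = y
    return t, y_prev

# F(t, y, z) = f(t, z) recovers the Picard-Lindelof iteration:
f = lambda t, y: -y
t, y = waveform_relaxation(lambda t, y, z: f(t, z), [1.0], 0.0, 2.0)
```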
In waveform iteration, a processor obtains the current values of the waveforms from other processors prior
to the start of the next iteration and then integrates its subsystem of equations over the integration interval independently
of the progress of other processors. Thus, stepsizes appropriate to the subsystems can be used. In waveform methods it is
necessary:
- to estimate the error introduced in each subsystem by the errors in the other subsystems;
- to determine when an iteration has converged sufficiently;
- to develop criteria for setting the window length (the subinterval over which the integration is performed at each iteration).
The main weakness of waveform relaxation is the iteration process,
which may suffer from slow convergence or even divergence. Another
disadvantage is the need to store the waveforms, which adds substantially to the
memory requirements.
The WR approach has been extended to the scalar and parallel solution of PDEs.
There are a number of possibilities for waveform relaxation iterations:
- WR Jacobi (WRJ) is described by
$$F_i(t, y^{(k+1)}, y^{(k)}) = f_i(t, y_1^{(k)}, \ldots, y_{i-1}^{(k)}, y_i^{(k+1)}, y_{i+1}^{(k)}, \ldots, y_m^{(k)}), \qquad i = 1, \ldots, m,$$
if $f = (f_1, \ldots, f_m)$, so that
$$y_i'^{(k+1)} = f_i(t, y_1^{(k)}, \ldots, y_{i-1}^{(k)}, y_i^{(k+1)}, y_{i+1}^{(k)}, \ldots, y_m^{(k)}), \qquad i = 1, \ldots, m.$$
- WR Gauss-Seidel (WRGS) is described by
$$F_i(t, y^{(k+1)}, y^{(k)}) = f_i(t, y_1^{(k+1)}, \ldots, y_{i-1}^{(k+1)}, y_i^{(k+1)}, y_{i+1}^{(k)}, \ldots, y_m^{(k)}), \qquad i = 1, \ldots, m.$$
- WR successive overrelaxation (WRSOR) is described by
$$G_i(t, z^{(k+1)}, y^{(k+1)}, y^{(k)}) = f_i(t, y_1^{(k+1)}, \ldots, y_{i-1}^{(k+1)}, z_i^{(k+1)}, y_{i+1}^{(k)}, \ldots, y_m^{(k)}), \qquad i = 1, \ldots, m,$$
$$g_i(t, z^{(k+1)}, y^{(k)}) = \omega z_i^{(k+1)} + (1 - \omega) y_i^{(k)}, \qquad i = 1, \ldots, m.$$
The advantage of the Jacobi approach is that each component of the system can be
solved independently in parallel, while a drawback is that a great deal of past information
must be stored if $m$ is large. Storage is not such a problem in the WRGS case, but
there is no obvious decoupling of the systems. The SOR methods can be parallelized by different orderings of the
components (the so-called chequer-board effect).
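The sketch below implements the WRJ and WRGS sweeps under the same assumptions as the earlier examples (uniform grid, forward Euler per scalar component); the `seidel` flag and all names are illustrative.

```python
import numpy as np

def wr_sweeps(f, y0, t0, tf, n_steps=400, n_iters=15, seidel=False):
    """Jacobi (seidel=False) or Gauss-Seidel (seidel=True) waveform
    relaxation, integrating each scalar subsystem with forward Euler."""
    t = np.linspace(t0, tf, n_steps + 1)
    h = t[1] - t[0]
    m = len(y0)
    y_prev = np.tile(np.asarray(y0, dtype=float), (n_steps + 1, 1))
    for _ in range(n_iters):
        y = y_prev.copy()
        for i in range(m):              # in WRJ this loop runs in parallel
            for n in range(n_steps):
                arg = y_prev[n].copy()  # other components frozen at sweep k ...
                if seidel:
                    arg[:i] = y[n, :i]  # ... except j < i, already updated (WRGS)
                arg[i] = y[n, i]        # own component is the current iterate
                y[n + 1, i] = y[n, i] + h * f(t[n], arg)[i]
        y_prev = y
    return t, y_prev

# usage on a weakly coupled 2x2 linear system y' = A y:
A = np.array([[-2.0, 0.1], [0.1, -3.0]])
f = lambda t, y: A @ y
t, y_j = wr_sweeps(f, [1.0, 1.0], 0.0, 1.0)                # WRJ
t, y_gs = wr_sweeps(f, [1.0, 1.0], 0.0, 1.0, seidel=True)  # WRGS
```

A WRSOR variant would additionally blend the Gauss-Seidel value with the previous waveform, $y_i^{(k+1)} = \omega z_i^{(k+1)} + (1 - \omega) y_i^{(k)}$.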
A different approach from the previous iterations is to linearize the problem, as in the
waveform Newton approach (WRN), in which the function $F$ is
$$F(t, y^{(k+1)}, y^{(k)}) = f(t, y^{(k)}) + f_y(t, y^{(k)})(y^{(k+1)} - y^{(k)}).$$
This is a linear problem, and so,
for example, quadrature techniques can be used, or techniques based on the fast
parallel solution of linear recurrences.
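A minimal sketch of a waveform Newton sweep follows, assuming the Jacobian fy(t, y) is supplied explicitly and settling for an implicit Euler discretization of each (linear) iterate; this stands in for the quadrature or linear-recurrence solvers mentioned above, and all names are illustrative.

```python
import numpy as np

def wr_newton(f, fy, y0, t0, tf, n_steps=200, n_iters=8):
    """Waveform Newton: each sweep solves the linear ODE
    y' = f(t, yk) + fy(t, yk) (y - yk) by implicit Euler."""
    t = np.linspace(t0, tf, n_steps + 1)
    h = t[1] - t[0]
    y0 = np.asarray(y0, dtype=float)
    I = np.eye(y0.size)
    yk = np.tile(y0, (n_steps + 1, 1))
    for _ in range(n_iters):
        y = np.empty_like(yk)
        y[0] = y0
        for n in range(n_steps):
            J = fy(t[n + 1], yk[n + 1])                  # frozen Jacobian
            g = f(t[n + 1], yk[n + 1]) - J @ yk[n + 1]   # frozen source term
            y[n + 1] = np.linalg.solve(I - h * J, y[n] + h * g)
        yk = y
    return t, yk

# usage: logistic equation y' = y (1 - y)
f = lambda t, y: y * (1.0 - y)
fy = lambda t, y: np.diag(1.0 - 2.0 * y)
t, y = wr_newton(f, fy, [0.1], 0.0, 5.0)
```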
One of the difficulties associated with the Jacobi and Gauss-Seidel schemes is their poor convergence behaviour.
One way of accelerating this convergence is to use a multigrid WR approach. In particular, if a system of
ODEs is derived from a parabolic PDE by a semi-discretization of the spatial variables, a
multigrid acceleration of WR is possible which exploits the geometry of the problem.
Standard time-stepping techniques for solving parabolic PDEs cannot be parallelized or vectorized efficiently unless the
problem to be solved is very large and the number of processors is small.
New parallel algorithms are needed for solving relatively small parabolic problems, i.e. problems with few unknowns per time
level, or problems solved on large-scale parallel machines.