Computational science applications
----------------------------------

0. Goal of lecture
------------------

To understand parallel computing, you need to understand how parallelism arises in high-performance computing applications. In this lecture, we introduce the finite-difference method for solving systems of ordinary differential equations. We discuss two simple techniques known as the forward-Euler and backward-Euler methods, and show that the key computational kernel for the forward-Euler method is a matrix-vector product (MVM), while the key kernel for backward-Euler is a linear solve (Ax = b). The matrix in both cases is often sparse. Large sparse linear systems of equations are usually solved using iterative techniques, and the key kernel in these is again a matrix-vector product. Therefore, ultimately, the key kernel for implementing finite-differences is a dense or sparse matrix-vector product.

1. Introduction
---------------

We develop this lecture in the following stages:
(i) solving a single linear ode: forward-Euler and backward-Euler methods
(ii) generalizing to systems of linear ode's: kernels are MVM and Ax = b
(iii) reducing higher-order ode's to systems of first-order ode's
(iv) showing that iterative methods for solving linear systems require MVM computations
(v) describing how sparse MVMs can be implemented on a parallel computer.

The goal of this lecture is not to make you an expert in finite-differences. Rather, it is to (i) give you an intuitive understanding of the method, and (ii) give you an idea of what the key computational kernels are in using the finite-difference method.

2. Single ode
-------------

Consider the ode

   u'(t) = -3u(t) + 2    ---(1)
   u(0)  = 1

This is an initial value problem because the initial value of u is given by the second equation, and the way in which u evolves with time is given by the differential equation.

Exact solution:
---------------

Using elementary calculus, it is easy to solve this equation exactly.
The solution is

   u(t) = (e^(-3t) + 2)/3    ---(2)

In general, we will not be so lucky, and we will not be able to find an exact solution. In that case, we can use finite-differences to solve these kinds of equations approximately. As we have said many times in class, computers, like people, are bad at calculus but good at matrix methods, so we replace the calculus problem with a linear algebra problem and use the computer to solve the new linear algebra problem efficiently. The linear algebra problem is usually either MVM or solving a linear system Ax = b.

Forward-Euler method:
---------------------

The simplest way to do this is to discretize time into fixed-size steps (t = 0, h, 2h, 3h, ...) and to replace the derivative with a *difference*, as follows:

   u'(t) --> (u(t+h) - u(t))/h    ---(3)

Replacing the derivative with a difference of this kind is essentially the inverse of how derivatives were probably introduced to you, where you started with differences and then took the limit of the difference as h --> 0. If we do this for our equation, we get

   (u(t+h) - u(t))/h = -3u(t) + 2

so

   u(t+h) = (1-3h)*u(t) + 2h    ---(4)

We can now compute the first few values of u as follows:

   u(0)  = 1
   u(h)  = 1 - h
   u(2h) = 1 - 2h + 3h^2
   ...

By choosing a value for h, we can numerically calculate the value of u at any time step. Actually, in this simple case, we can do better. It is not hard to show that if the difference equation is

   u(t+h) = a*u(t) + b
   u(0)   = 1

then the solution at time step n*h is

   u(n*h) = a^n + b*(1-a^n)/(1-a)

Therefore, for equation (4), the value of u at the n-th time step is

   u(n*h) = ((1-3h)^n + 2)/3    ---(5)

One note of caution: when using forward-Euler, it is important to choose a "small enough" value of the time-step h; otherwise, the results of the numerical computations may exhibit pathological behavior such as wild oscillations. In our example, we need to use a value of h in the open interval (0, 2/3).
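As a concrete illustration, the update in equation (4) can be sketched in a few lines of Python. The step size h = 0.01 and the end time t = 1 are illustrative choices, not part of the lecture.

```python
import math

def forward_euler(h, nsteps):
    """Iterate the forward-Euler update u(t+h) = (1-3h)*u(t) + 2h from (4)."""
    u = 1.0                          # initial condition u(0) = 1
    for _ in range(nsteps):
        u = (1.0 - 3.0 * h) * u + 2.0 * h
    return u

def exact(t):
    """Exact solution u(t) = (e^(-3t) + 2)/3 from equation (2)."""
    return (math.exp(-3.0 * t) + 2.0) / 3.0

# h lies in the stable interval (0, 2/3); 100 steps of size 0.01 reach t = 1.
print(forward_euler(0.01, 100), exact(1.0))   # both approximately 0.68
```

Shrinking h brings the numerical value closer to the exact one; choosing h outside (0, 2/3) makes the iterates oscillate with growing amplitude, which is the instability warned about above.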
Backward-Euler method:
----------------------

Another approach is to use the backward-Euler method:

   u'(t+h) --> (u(t+h) - u(t))/h    ---(6)

For our differential equation, this gives us

   (u(t+h) - u(t))/h = -3*u(t+h) + 2

which, after rearrangement, gives

   u(t+h) = (u(t) + 2h)/(1 + 3h)

As before, this difference equation is simple enough that we can write down the exact solution:

   u(n*h) = ((1/(1+3h))^n + 2)/3    ---(7)

Notice that for backward-Euler, we can choose any positive value of h without suffering instabilities.

3. Systems of ode's
-------------------

We now consider systems of ode's, and show that the forward-Euler and backward-Euler methods require MVM and the solution of linear systems, respectively. Consider a system of the following form:

   u'(x) = a11*u(x) + a12*v(x) + a13*w(x) + c1(x)
   v'(x) = a21*u(x) + a22*v(x) + a23*w(x) + c2(x)    ---(8)
   w'(x) = a31*u(x) + a32*v(x) + a33*w(x) + c3(x)

If we use forward-Euler to discretize this system, we get

   (u(x+h) - u(x))/h = a11*u(x) + a12*v(x) + a13*w(x) + c1(x)
   (v(x+h) - v(x))/h = a21*u(x) + a22*v(x) + a23*w(x) + c2(x)
   (w(x+h) - w(x))/h = a31*u(x) + a32*v(x) + a33*w(x) + c3(x)

which can be written in matrix notation as

   U(x+h) = (I+hA)U(x) + h*C(x)    ---(9)

where U(x) is the column vector [u(x), v(x), w(x)]', and A and C(x) are defined analogously. The key computation is the matrix-vector product (I+hA)U(x).

If we use backward-Euler to discretize the system, we get

   (u(x+h) - u(x))/h = a11*u(x+h) + a12*v(x+h) + a13*w(x+h) + c1(x+h)
   (v(x+h) - v(x))/h = a21*u(x+h) + a22*v(x+h) + a23*w(x+h) + c2(x+h)    ---(10)
   (w(x+h) - w(x))/h = a31*u(x+h) + a32*v(x+h) + a33*w(x+h) + c3(x+h)

which can be written as

   (I-hA)U(x+h) = U(x) + h*C(x+h)    ---(11)

Therefore, to compute U(x+h) given U(x), we need to solve a linear system of equations. In short, for systems of ode's, forward-Euler requires MVM and backward-Euler requires the solution of linear systems of equations.
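To make the contrast between equations (9) and (11) concrete, here is a small sketch using NumPy. The 2x2 matrix A, the constant vector C, the step size h, and the initial vector U0 are all illustrative choices.

```python
import numpy as np

# Illustrative 2x2 system U'(x) = A*U(x) + C with constant C.
A = np.array([[0.0,  1.0],
              [-1.0, 0.0]])
C = np.zeros(2)
U0 = np.array([1.0, 0.0])
h = 0.1
I = np.eye(2)

# Forward-Euler step, equation (9): one matrix-vector product.
U_fwd = (I + h * A) @ U0 + h * C

# Backward-Euler step, equation (11): one linear solve.
U_bwd = np.linalg.solve(I - h * A, U0 + h * C)
```

The forward step costs one MVM per time step; the backward step costs one linear solve per time step, which is more expensive but remains stable for any positive h.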
4. Higher-order ode's
---------------------

Before we study how to perform MVM and solve linear systems, it is useful to understand how systems of ode's might arise. In many problems, systems of ode's arise from the reduction of higher-order ode's to first-order form. For example, consider the equation

   y'' + y = f(x)

We introduce an auxiliary variable v where

   v = y'

Then v' = y'', so the original differential equation becomes

   v' = -y + f(x)

Therefore, the original equation can be reduced to the following system of ode's:

   y'(x) =  0*y(x) + v(x)   + 0
   v'(x) = -y(x)   + 0*v(x) + f(x)

We can now use the methods of the previous section to solve this system using finite-differences. This idea can obviously be extended to linear ode's of any order.

One point to note is that there are a lot of 0's on the right-hand side. This is true for most problems that arise in practice. Therefore, we conclude that the matrix (I+hA) in equation (9) and the matrix (I-hA) in equation (11) are often sparse. To apply the finite-difference method, we therefore need to implement the following kernels:
(i) MVM: compute Ax, where A is a large, sparse matrix
(ii) linear solve: solve Ax = b, where A is a large, sparse matrix

5. Solving Ax = b when A is large and sparse
--------------------------------------------

The classical method for solving linear systems is to factorize A into a product of lower and upper triangular matrices L and U, so that Ax = b can be written as

   LUx = b

We then solve the triangular system Ly = b, and then the triangular system Ux = y, to obtain x. The factorization of A into LU can be done by classical methods such as Cholesky factorization (if the matrix is symmetric positive definite) or LU with pivoting. These methods solve the equation *exactly* (assuming the computations are performed with unbounded precision). However, they also take O(n^3) time, where n is the number of rows/columns of A. This can be prohibitive for large matrices.
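The two triangular solves Ly = b and Ux = y can be sketched as plain forward and back substitution. The 2x2 factors below are a hand-computed illustration, not part of the lecture.

```python
def solve_lower(L, b):
    """Forward substitution for Ly = b (L lower triangular, nonzero diagonal)."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        s = sum(L[i][j] * y[j] for j in range(i))
        y[i] = (b[i] - s) / L[i][i]
    return y

def solve_upper(U, y):
    """Back substitution for Ux = y (U upper triangular, nonzero diagonal)."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x

# Illustrative factorization A = LU for A = [[4, 2], [3, 4]]:
L = [[1.0, 0.0], [0.75, 1.0]]
U = [[4.0, 2.0], [0.0, 2.5]]
b = [8.0, 11.0]
x = solve_upper(U, solve_lower(L, b))   # x is close to [1.0, 2.0]
```

Each triangular solve costs only O(n^2) work; the O(n^3) cost, and the fill-in that destroys sparsity, come from computing the factors L and U in the first place.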
In addition, the factorization of A can require a lot of additional storage for representing L and U, since these tend to be dense even if A is sparse. For these reasons, large sparse systems are solved using iterative methods.

Iterative methods compute approximate solutions to linear equations. This may seem to be a disadvantage compared to direct methods, which compute the exact solution at least in principle. But notice that to arrive at Ax = b, we had to discretize the differential equation, which introduced approximation anyway. In practice, even the differential equation tends to be an abstraction (approximation) of some complex phenomenon we are trying to model. Therefore, approximation is the name of the entire game, so there is no reason to look down on approximate solution techniques for linear equations.

Jacobi method
-------------

The Jacobi method is the simplest iterative method for solving linear systems. It works only for some kinds of matrices, and it is relatively slow compared to modern methods like conjugate gradient, so it is rarely used by itself, except in the context of more complex methods like multigrid.

Consider the system shown below:

   4x + 2y = 8     ---(12)
   3x + 4y = 11

The exact solution is (x=1, y=2). Given an approximation to the solution, we can refine the approximation by using the first equation to refine the value of x, and the second equation to refine the value of y. In other words, we consider the iterative system

   x(i+1) = (8 - 2*y(i))/4     ---(13)
   y(i+1) = (11 - 3*x(i))/4

Starting from the initial guess (x(0)=0, y(0)=0), here are some approximations this process generates:

   i   0     1      2      3      4       5       6       7      ...
   x   0     2      0.625  1.375  0.8594  1.1406  0.9473  1.0527 ...
   y   0     2.75   1.250  2.281  1.7188  2.1055  1.8945  2.0396 ...
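The iteration (13) is easy to run in code; here is a minimal sketch (the iteration count of 50 is an arbitrary choice, made large enough that the iterates settle down):

```python
def jacobi_step(x, y):
    """One Jacobi update from equation (13): each equation refines one unknown."""
    return (8.0 - 2.0 * y) / 4.0, (11.0 - 3.0 * x) / 4.0

# Start from the initial guess (0, 0) and iterate.
x, y = 0.0, 0.0
for i in range(50):
    x, y = jacobi_step(x, y)
print(x, y)   # converges toward the exact solution (1, 2)
```

Note that both new values are computed from the *old* pair (x(i), y(i)); updating x first and then using the new x to update y would instead be the Gauss-Seidel method.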
With a little bit of work, you can show that the general formulation of the Jacobi iteration is the following: given a system Ax = b, set up the iteration

   M*X(i+1) = (M-A)*X(i) + b

where M is the nxn diagonal matrix containing the diagonal of A. This can also be written as

   X(i+1) = X(i) - M^(-1)*A*X(i) + M^(-1)*b

It can be seen that the key operation is the matrix-vector product A*X(i); the rest is lower-order work. Notice also that this iteration can be carried out without modifying the matrix A, which is a big advantage over direct methods.

More modern iterative methods for solving linear systems
--------------------------------------------------------

Faster methods like conjugate gradient and GMRES are widely used. However, all of them involve a matrix-vector product as the key kernel, just as in Jacobi.

6. Conclusions
--------------

The finite-difference method can be used to solve systems of ode's. Forward-Euler and backward-Euler are two classical finite-difference methods, and many other methods are available. The key computation in forward-Euler is an MVM, where the matrix may be sparse. The key computation in backward-Euler is a linear solve Ax = b, where A may be sparse. Sparse linear systems are usually solved using iterative methods, and the key kernel in iterative methods is again an MVM where the matrix is usually sparse.

There is another method, called the finite-element method, that is used to solve partial differential equations. Finite-element methods are more complex than finite-difference methods, but a key step in finite-element methods is again solving sparse linear systems Ax = b. Therefore, implementing efficient dense and sparse MVM is key to good performance in many computational science applications.
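Since sparse MVM is the key kernel identified throughout the lecture, here is a minimal sketch of MVM in compressed sparse row (CSR) format, a standard storage scheme that keeps only the nonzeros of A. The 3x3 matrix is an illustrative example.

```python
def csr_matvec(rowptr, colidx, vals, x):
    """Compute y = A*x where A is stored in CSR form: for row i, the
    nonzeros are vals[rowptr[i]:rowptr[i+1]] in columns
    colidx[rowptr[i]:rowptr[i+1]]."""
    n = len(rowptr) - 1
    y = [0.0] * n
    for i in range(n):                             # rows are independent,
        for k in range(rowptr[i], rowptr[i + 1]):  # so this loop can be
            y[i] += vals[k] * x[colidx[k]]         # distributed across
    return y                                       # processors

# CSR representation of the sparse matrix [[4, 0, 1], [0, 3, 0], [2, 0, 5]]:
rowptr = [0, 2, 3, 5]
colidx = [0, 2, 1, 0, 2]
vals = [4.0, 1.0, 3.0, 2.0, 5.0]
print(csr_matvec(rowptr, colidx, vals, [1.0, 1.0, 1.0]))   # [5.0, 3.0, 7.0]
```

The outer loop over rows has no dependences between iterations, which is why sparse MVM parallelizes naturally: each processor can own a block of rows, with communication needed only for the entries of x that its rows reference.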