CHAPTER 04.08: GAUSS-SEIDEL METHOD: Theory: Part 1 of 2

 

In this segment, we're going to talk about the Gauss-Seidel method of solving simultaneous linear equations, and we're going to talk about the theory in this part. Now, you might be already familiar with when you are solving simultaneous linear equations, if you would write them in the matrix form.  So if you have n simultaneous linear equations, then the matrix form will look like this, where C is your right-hand side, A is your coefficient matrix, and X is the solution vector, or the unknown vector, and you're trying to find out what X is, given what A and C are. You might also be familiar with, already with methods like Gaussian elimination, which are based on forward elimination and back substitution, and Gauss elimination is a direct method, where you go through the steps of forward elimination and back substitution and you get the results. So the question arises that why do we need to learn another method to solve a set of simultaneous linear equations, when Gauss elimination is pretty straightforward, it's a direct method, you do your steps of forward elimination, you do steps of back substitution, and you get your answer. Now, the reason why that is so is because of two main reasons.  One, may be computationally more efficient for large n, because Gauss-Seidel method is an iterative procedure, it's not a direct method of solving a set of simultaneous linear equations, it's an iterative procedure where you're going to guess the solution, and then you're going to start making the solution better and better by following the algorithm for Gauss-Seidel method, so it might be computationally more efficient for large n, because as n become very large, Gauss elimination does take a lot of computational time to solve.  And the second one is that you can control round-off error, because in Gauss elimination, the only way you can control the round-off error is by either by choosing Gauss elimination with partial pivoting, or by increasing the number of bits you use to store your numbers, going from single precision to double precision, or to quad precision, but here, because it's an iterative procedure, you can control the round-off error, because you can stop the procedure as soon as the absolute relative approximate error is less than some pre-specified tolerance.  You don't have that luxury in Gauss elimination. So those are the reasons why we want to talk about Gauss-Seidel method.  So let's go ahead and see that what Gauss-Seidel method is all about.  If you go back and look at the set of simultaneous linear equations which might be given to you in matrix form, where A is the coefficient matrix, X is the solution vector, and C is the right-hand side vector, and what we want to be able to do is we want to be able to find out what X is.  So what Gauss-Seidel method is all about is that if you write down the first equation in the regular form, in the expanded form, this is what I will get, I'll get a11 x1, plus a12 x2, plus all the way up to a1n xn is equal to c1, this will be the first equation which you're going to write, if you go ahead and go and take this expanded form of this . . . of this matrix form.  And what you do in Gauss . . . Gauss-Seidel method is that you basically take the first equation, so you're going to take the first unknown which you have there, and you're going to take everything to the right-hand side.  So what that means is that you're going to take everything to the right-hand side, like you're going to take this to the right-hand side, the next term to the right-hand side, the last term to the right-hand side, and then divided by a11.  So what you are doing is you are rewriting the first equation just in terms of x1.  Now some people might say that, hey, you are taking things to the right-hand side which you don't know, but that's the whole point, that, yes, x1 is unknown, so is x2, x3, all the way up to xn, and we are taking to the right hand side, generally we only take known things to the right-hand side, but the reason why we are doing that is because we are just writing the whole equation just in terms of xi, and since we're going to be assuming the values . . . the initial values of x1, x2, all the way up to xn, it gives a way of being able to figure out a better value of x1. So same thing's going to happen if you had the ith equation, let's suppose, so let's write the equation for the ith equation, so i can be anywhere from 1 less than equal to i less than equal to n, where n is the number of equations which you have.  So the ith equation is going to look like this, it's going to look like ith row, first column, x1, plus all the way up to the ith row, ith column, xi, plus all the way up to ain is equal to x . . . is equal to ci. So what we're trying to do is we're trying to write down the general equation for any equation which you will get out of these n equations, so we're writing the ith equation, where i can be anywhere from 1 to n.  So we're going to do the same thing, because we are writing the equation . . . the ith equation, we're going to take the ith unknown, and write it out by choosing . . . taking everything to the right-hand side, except for this part here, so this will be xn here, what I will get is ci minus ai1 x1, minus all the way up to minus ain xn, divided by aii, that's what I'm going to get. So this is the ith equation, and i can be anywhere from 1 to n, so I can write i is equal to 1 all the way up to n. So if I want to write it in a compact form so that I can algorithmically write a program for it or discuss it in a flow chart, I'll write it like this, summation aij xj, j going from 1 to n, divided by aii.  So all I'm doing is I'm bundling all these terms which I have here, I'm going to bundle them there by saying ai1 x1, ai2 x2, those are all the terms which will be taken care of by this. The only term which is not there is this aii times xi, you don't see aii xi here, because it's right here, so I'll have to say j not equal to i in the summation.  So now you can very well see that how one would be able to take this particular formula and program it in such a fashion that you are able to get the values of xi, having started with some new . . . some old values. So the way the algorithm works is that you're going to start with some initial value of all the xs, and you're going to find the new value of xs by using this particular formula here for the ith equation, and one of the things which you do is that you keep on updating the recent value of xi.  So for example, if you have written the formula for x1, you don't use an x2, you're going to use the new value of x1 which you obtained, so the X vector which you are using for . . . for the process keeps on getting . . . gets updated regularly, as opposed to at the end of the iteration.  So I'm going to write down the algorithm, and that will make it a little bit clearer, but the main thing is to understand that how we have rewritten each of the equations which we have.