CHAPTER 06.03: LINEAR REGRESSION: Example
In this segment, we're going to take an example of linear regression. We're going to see how to find the constants of a linear regression model, and here we're talking about the general linear regression model, which looks like y is equal to a0 plus a1 x. So you are given n data points, let's suppose (x1, y1), (x2, y2), and so on and so forth, and you are trying to do linear regression. Now, the example which we're going to take is of a mousetrap. If you have seen a mousetrap, there's a spring here, and there's an arm here, which allows you to store the energy in the mousetrap. We have done this experiment where we tried to figure out how much torque is required to open up this arm to a certain angle, theta let's suppose. The data is given as follows: you have theta here, given in radians, and torque here, given in Newton meters. At 0.7 radians, the torque which needs to be applied is 0.19; at 0.96, it is 0.21; at 1.13, it is 0.23; at 1.57, it is 0.25; and at 1.92, it is 0.31. So at different angles theta, given in radians, you have to apply this much torque for the arm to open up to that particular angle. What we want to do is find the relationship between torque and angle, and it'll be k1 plus k2 times theta. That is the straight line relationship which we are assuming for this mousetrap spring, between the torque which is required and the angle through which it opens. Now, keep in mind that we don't have k1 equal to 0 in this case, because if you look at a mousetrap spring, when theta is equal to 0 and the spring is totally closed, you will find that there is some residual stress in there, so that's why k1 is not equal to 0.
So we have a general straight line formula which tells us what the relationship between torque and theta is. We want to find it from these five data points. When we conducted the experiment, we had several more data points, but for simplicity, we're just taking five to show you how to do the linear regression. So when I am looking at this particular data, I just need to write down the formulas for k1 and k2 and do the proper substitutions. Let's go ahead and do it right here. We will say k2, which is the slope, is given by n times the summation of theta-i Ti, minus the summation of all the theta-i values times the summation of all the torque values, divided by n times the summation of the squares of the theta values, minus the square of the summation of the theta values. Keep in mind that in the first expression of the denominator you square the theta values first, and then you add them up. In the next expression, what we have is the summation of theta-i, i is equal to 1 to n, where n is the number of points you have, so you add all the theta-i values, and then you take the square of that. Those are two different things. So to get our value of k2 from here, we need to find out what these individual summations are. We know that n is equal to 5, because five data points are given to me. Then I add the theta-i values, from i is equal to 1 to 5, so I get 0.7, plus 0.96, plus 1.13, plus 1.57, plus 1.92. It's simply adding all the theta values which I have, and this summation turns out to be 6.28. That's how you calculate these individual summations. For the summation of Ti, again, you take all the torque values, add them up, and you will get 1.19.
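Written out, the slope formula described above can be sketched as follows. The symbols follow the transcript (theta-i for the angles, Ti for the torques, n for the number of points); the typeset form is a reconstruction of what is spoken, not a copy of the lecture slide.

```latex
k_2 = \frac{n \sum_{i=1}^{n} \theta_i T_i \;-\; \sum_{i=1}^{n} \theta_i \, \sum_{i=1}^{n} T_i}
           {n \sum_{i=1}^{n} \theta_i^{2} \;-\; \left( \sum_{i=1}^{n} \theta_i \right)^{2}}
```

Note the two different terms in the denominator: the first sums the squared angles, while the second squares the sum of the angles.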
Now, you also need the summation of theta-i Ti, because that's in the formula right here. So you need to multiply each theta value by its corresponding torque value, and then add them up. What that means is that, for example, for the first number the theta value is 0.7, and the corresponding torque value is 0.19, so you multiply the two. Then you add the next one, where 0.96 is the theta value and 0.21 is the torque value, and you continue this process to the last one, where 1.92 is the theta value and 0.31 is the torque value. This number turns out to be 1.5822. We still need to do one more summation, and that is the summation, i is equal to 1 to n, of theta-i squared. As I said, this means squaring each theta quantity, and then adding them up. So the first theta value which was given to me was 0.7, and I've got to square it. Then I take the next theta value, which was 0.96, and I square it, and I go on to the last one, which is 1.92, and I square it and add it up. There are two more data points in between which also need to be squared, and that value turns out to be 8.8398. So I have all the summations. What you are seeing is these four summations which I have right here: the summation of the theta-i values, which is the summation of the independent variable; the summation of the dependent variable, which is Ti; the summation of theta-i Ti; and the summation of theta-i squared. I should not call theta the independent variable; I should call it the explanatory variable, because I don't want you to get the impression that it's a causal effect, that theta causes torque, although it does in this case, in that when you open the arm up you need a certain amount of torque to do so.
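The four summations above can be checked with a few lines of Python. This is a minimal sketch using only the five data points quoted in the transcript; the variable names are illustrative choices, not from the lecture.

```python
# Mousetrap data from the transcript: angle in radians, torque in N*m.
theta = [0.7, 0.96, 1.13, 1.57, 1.92]
torque = [0.19, 0.21, 0.23, 0.25, 0.31]

n = len(theta)

# Sum of the angles and sum of the torques.
sum_theta = sum(theta)
sum_torque = sum(torque)

# Multiply each angle by its corresponding torque, then add the products.
sum_theta_torque = sum(t * q for t, q in zip(theta, torque))

# Square each angle first, then add the squares.
sum_theta_sq = sum(t ** 2 for t in theta)

print(f"n                = {n}")
print(f"sum theta_i      = {sum_theta:.4f}")
print(f"sum T_i          = {sum_torque:.4f}")
print(f"sum theta_i*T_i  = {sum_theta_torque:.4f}")
print(f"sum theta_i^2    = {sum_theta_sq:.4f}")
```

The printed sums should agree with the values worked out by hand: 6.28, 1.19, 1.5822, and 8.8398.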
So you have these four summations which you need for the formula itself. Let's go ahead and substitute those into it. I get k2 is equal to n, which is 5, times the summation of theta-i Ti, which is 1.5822, minus the summation of all the theta values, which is 6.28, times the summation of all the torque values, which is 1.19, divided by 5 times the summation of the theta squared values, which is 8.8398, minus the square of the summation of all the theta values, which is 6.28 squared. This value turns out to be 0.09196, and the units of k2 are Newton meters per radian. Now, k1 I'm going to find by using the fact that it is the average value of torque, minus k2 times the average value of theta; that's the formula for k1. The average value of torque is the sum of all the torque values, which is 1.19, divided by 5, because I have five points given to me. From that I subtract k2, which is 0.09196, times theta-bar, which is the sum of all the theta values, 6.28, divided by 5. This turns out to be equal to 0.238 minus 0.09196 times 1.256. I'm writing this step just to show what the average value of torque and the average value of theta are in the relationship. So I get k1 equal to 0.1224, and the units of that are simply Newton meters. Based on this, we have been able to find the regression model for our mousetrap spring: torque is equal to k1 plus k2 theta, with k1 equal to 0.1224 and k2 equal to 0.09196, so torque is equal to 0.1224 plus 0.09196 theta. That is the regression model which you have just obtained to relate torque to theta for the five data points which you had.
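The substitution above can be reproduced in code. The sketch below assumes a hypothetical helper, `fit_line`, that applies the same two formulas (slope from the four summations, intercept from the two averages); the function name and structure are illustrative and not part of the lecture. At full precision the intercept comes out at about 0.12249, which the transcript quotes as 0.1224.

```python
def fit_line(x, y):
    """Least-squares fit of y = k1 + k2 * x; returns (k1, k2)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_xx = sum(xi ** 2 for xi in x)

    # Denominator: n * (sum of squares) minus (square of the sum).
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; slope is undefined")

    k2 = (n * sum_xy - sum_x * sum_y) / denom  # slope
    k1 = sum_y / n - k2 * (sum_x / n)          # intercept = y_bar - k2 * x_bar
    return k1, k2


# Mousetrap data from the transcript: angle in radians, torque in N*m.
theta = [0.7, 0.96, 1.13, 1.57, 1.92]
torque = [0.19, 0.21, 0.23, 0.25, 0.31]

k1, k2 = fit_line(theta, torque)
print(f"k1 = {k1:.5f} N*m")
print(f"k2 = {k2:.5f} N*m/rad")
```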
So now you can use this linear regression formula to find the value of torque at any value of theta of your choice, and see how much torque is required to open up the mousetrap spring to a certain angle.
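As a small illustration of using the model, the sketch below evaluates the fitted line at a chosen angle and compares it with the five measured points. The constants are the values quoted in the transcript; the function name `predict_torque` and the choice of 1.0 radian as a sample angle are assumptions made for this example.

```python
# Regression constants as quoted in the transcript.
K1 = 0.1224   # N*m
K2 = 0.09196  # N*m/rad


def predict_torque(theta_rad):
    """Torque (N*m) predicted by the fitted straight line at a given angle (rad)."""
    return K1 + K2 * theta_rad


# Predicted torque at an angle that was not measured (sample angle, assumed).
print(f"T(1.0 rad) = {predict_torque(1.0):.4f} N*m")

# Compare the model with the measured data points from the transcript.
theta = [0.7, 0.96, 1.13, 1.57, 1.92]
torque = [0.19, 0.21, 0.23, 0.25, 0.31]
for t, measured in zip(theta, torque):
    predicted = predict_torque(t)
    print(f"theta = {t:.2f} rad: measured {measured:.2f}, "
          f"predicted {predicted:.4f}, residual {measured - predicted:+.4f}")
```

Keep in mind that the line was fitted to angles between 0.7 and 1.92 radians, so predictions far outside that range are extrapolations.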