|
In this segment we will talk about how to do curve fitting by using nonlinear regression, especially the exponential model. So, if you look at a typical nonlinear regression model, we are given n data points and we want to best fit a nonlinear curve, which can be any function of x, to explain the data. Now the curve which we are going to draw will not pass through every data point which we have. What we want to do is minimize the distance between the data points which are given to us and the regression curve. So, for example, if we have x sub i comma y sub i, which is our data point right here, this is the observed value of y at this point and this will be the predicted value of y at the same value of x, which is x sub i. What we want to do is take this distance right here, which is the observed value minus the predicted value from the regression curve, and make it as small as possible. Typical nonlinear regression models include the exponential model, power model, saturation growth model, and polynomials of order two or more. Here we are concentrating on the exponential model. So, we have n data points and we want to best fit them to an exponential model, which is y equals a e to the power b x. The y data is given to us as a function of x, and the two constants of the regression model are a and b. So, if we look at this particular graph here, we are showing a possible regression curve right here, and then we have these data points which are given to us on this plot. Now if we take a typical point x sub i comma y sub i, that is the observed value. What we get here on the curve is the predicted value of y, and the difference between the two is what we call the residual: the difference between the observed value and the predicted value. Now what do we want to do with them?
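As a rough illustration of the residual described above, here is a minimal Python sketch. The function names, the sample data, and the trial constants are all hypothetical and chosen only for illustration; they do not come from the lecture.

```python
import math

def exponential_model(x, a, b):
    """Predicted y for the model y = a * e^(b * x)."""
    return a * math.exp(b * x)

def residuals(xs, ys, a, b):
    """Observed value minus predicted value at every data point."""
    return [y - exponential_model(x, a, b) for x, y in zip(xs, ys)]

# Hypothetical data points and trial constants (not from the lecture).
xs = [0.0, 1.0, 2.0]
ys = [2.0, 3.0, 4.0]
print(residuals(xs, ys, a=2.0, b=0.5))
```

Different trial values of a and b give different residuals; the regression picks the pair that makes them small in the least-squares sense.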
We want to make these residuals as small as possible, so we'll follow the same principle as we applied for the linear regression model to figure out what these constants a and b should be. So, following that principle, we find the sum of the squares of the residuals. This is the residual; we square each of the residuals, we add them all up, and that gives us the sum of the squares of the residuals. Now what we want to do is minimize this sum of the squares of the residuals, and the way to do it is to take the derivative of the sum with respect to a and b, which are the constants of the model, and put each derivative equal to zero. Those are the only things which we can change in this expression. y sub i and x sub i are fixed; the only things which you can change are a and b. So how do we take the derivative with respect to a? We apply the chain rule. For example, somebody says, hey, how can you differentiate u squared, where u is a function of a, with respect to a? That's nothing but two u du by da. So that's what's happening here: we're taking the derivative of this with respect to a. So, it would be two times what's inside here, times the derivative of this inside expression with respect to a, and that turns out to be this. And we put that equal to zero. Then we apply the same chain rule formula to take the derivative with respect to b. So again, it will be two times what's inside here, that is this, times the derivative of this quantity with respect to b, which turns out to be this quantity right here. So again, you put that equal to zero. So, if we expand the summations from the previous two expressions, we get two equations, and what you are going to find out is, hey, these are not linear equations; they are nonlinear equations in terms of a and b. So, what you have to figure out here is that we must find a and b by solving the simultaneous nonlinear equations numerically.
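Since the transcript does not show the slide, here is a sketch of what the steps above look like in symbols. The name S_r for the sum of the squares of the residuals is an assumed notation.

```latex
\begin{align*}
S_r &= \sum_{i=1}^{n} \left( y_i - a e^{b x_i} \right)^2 \\
\frac{\partial S_r}{\partial a} &= \sum_{i=1}^{n} 2 \left( y_i - a e^{b x_i} \right) \left( -e^{b x_i} \right) = 0 \\
\frac{\partial S_r}{\partial b} &= \sum_{i=1}^{n} 2 \left( y_i - a e^{b x_i} \right) \left( -a x_i e^{b x_i} \right) = 0
\end{align*}
```

Expanding the summations, and dividing the second equation by a (assuming a is not zero), gives the two simultaneous equations:

```latex
\begin{align*}
-\sum_{i=1}^{n} y_i e^{b x_i} + a \sum_{i=1}^{n} e^{2 b x_i} &= 0 \\
\sum_{i=1}^{n} y_i x_i e^{b x_i} - a \sum_{i=1}^{n} x_i e^{2 b x_i} &= 0
\end{align*}
```

The unknown b sits inside the exponentials, which is why these equations are nonlinear in a and b.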
We're going to do that in the class itself, so what I'm going to introduce to you here is how we can find the constants of a nonlinear regression model such as the exponential model without having to solve nonlinear equations. One of the ways to do it is to transform the data. Keep in mind that we're not transforming the model; the model is still exponential. What we are doing is transforming the data so that we can use simpler formulas to calculate the constants of our regression model. So, let's go through that. This is our regression model, right? What we're going to do is take the natural log of both sides. If I take the natural log of both sides, I get log of y on the left side. The right side is of the form log of u times v, which gives us log of u plus log of v, so we have log of a plus log of e to the power of b x. And log of e to the power of b x is just b x, so we get this. So how does this help us? For log of y I substitute z, for log of a I substitute a naught, and for purposes of convenience, for b I substitute a one. If I do that, this becomes z, that's right here. This becomes a naught, that's right here. And this becomes a one times x. Now clearly recognize that this is a linear model. So, what you are finding out here is that the z versus x data is related linearly; it is not an exponential model. But keep in mind that it is z versus x that is the linear model. And we know how to find a naught and a one for a linear regression model. Once we find a naught and a one, we can get a back by taking the exponential of a naught, because log of a is a naught. So that means a is e to the power of a naught. And just for convenience's sake we said a one is equal to b, which simply means once you find a one, b will be equal to a one.
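The transformation just described can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: the function name is hypothetical, the straight-line constants come from the standard least-squares formulas for a linear fit, and the test data is synthetic, generated from assumed constants a = 2 and b = 0.5.

```python
import math

def fit_exponential_by_transforming_data(xs, ys):
    """Fit y = a * e^(b * x) by linear regression of z = ln(y) on x.

    Every y must be positive, because ln(y) is taken.
    Returns the pair (a, b).
    """
    n = len(xs)
    zs = [math.log(y) for y in ys]           # transform the data: z = ln(y)
    sum_x = sum(xs)
    sum_z = sum(zs)
    sum_xz = sum(x * z for x, z in zip(xs, zs))
    sum_x2 = sum(x * x for x in xs)
    # Straight-line least-squares fit z = a0 + a1 * x.
    a1 = (n * sum_xz - sum_x * sum_z) / (n * sum_x2 - sum_x ** 2)
    a0 = sum_z / n - a1 * sum_x / n
    # Back-substitute: ln(a) = a0 and b = a1.
    return math.exp(a0), a1

# Synthetic data generated from assumed constants a = 2, b = 0.5.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential_by_transforming_data(xs, ys)
print(a, b)
```

Because the synthetic data lies exactly on an exponential curve, the fit should recover the assumed constants up to rounding error.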
So, as we said, we now have this linear regression model of z versus x, and we already know what the linear regression formulas are, which are very simple. They simply require some summations, as for the linear regression model, and then we find a naught and a one. As I said, once we find a one, that's the value of b. Once we find a naught, a is nothing but e to the power of a naught. Let's go ahead and illustrate this through an example. One of the simple examples is that whenever people go to a hospital to get their gallbladder or other internal organs scanned, they will take a radioactive dye. And that's what's going to tell them, hey, whether there's something wrong with any of the internal organs. One of the radioactive dyes which is used is made of the technetium-99m isotope. Its half-life is about 6 hours, and it takes about 24 hours for the radiation levels to go back to whatever we are exposed to in our day-to-day activities. So, to keep things simple, what we have done is give you the relative intensity of radiation as a function of time. Now when we say relative intensity, the intensity is relative to what it was at time t equals zero; that's when the dye was injected into a particular person. That's why this value is 1. And then you can see the relative intensity is decreasing as time goes by, and we have a model of the relative intensity, gamma is equal to A e to the power of lambda t, which is an exponential model. So in this case, what are we trying to find? We are trying to find lambda and we're trying to find A. Those are the two regression constants which we are trying to find for this model. So, what we are doing here is simply taking the tabular data which is given to us and showing it as a plot.
The reason why I showed it as a plot is so that you get a sense of how your relative intensity varies as a function of time. If somebody would look at this graph intuitively, they might think that relative intensity is almost like a linear function of time, which is not the case. So again, I'm going to go through the theory which you already went through, just for the sake of reinforcing what we have found. We have an exponential model, gamma is equal to A e to the power of lambda t. So, gamma versus time is given to us. We take the log of both sides, and that gives us this. Once you take the log of both sides, we say that log of gamma is z, a naught is log of A, which is this one, and for the sake of convenience we say a one is equal to lambda. What we get is z is equal to a naught plus a one t. And that's the linear relationship between z and t which is given here. Keep in mind the relationship is linear between z and t; the relationship between gamma and time is still exponential, that hasn't changed. So, we have a linear regression model right here, z is equal to a naught plus a one t. From the linear regression model, we know what the formulas for a one and a naught are. So the bottom line is that we have to find these one, two, three, four different summations to be able to find out what a one is, and then from those summations we can also find a naught, because z bar is nothing but the average value of z, and t bar is nothing but the average value of the time data which is given to us. So here is the data which is given to us; we have six data points, and that's what I'm showing you here. These are the times which are given to us, and these are the relative radiation intensities which are given to us. Keep in mind that in order to use the linear regression formulas we have to take the log of the gamma values, and that's what we are doing here. And then we must find t i times z i, and then we must find t i squared, so that we can find those summations.
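For reference, the standard straight-line least-squares formulas that use those four summations can be written as below. The slide itself is not in the transcript, so the exact notation here is an assumption; n is the number of data points.

```latex
\begin{align*}
z_i &= \ln \gamma_i \\
a_1 &= \frac{n \sum_{i=1}^{n} t_i z_i - \sum_{i=1}^{n} t_i \sum_{i=1}^{n} z_i}
            {n \sum_{i=1}^{n} t_i^2 - \left( \sum_{i=1}^{n} t_i \right)^2} \\
a_0 &= \bar{z} - a_1 \bar{t}, \qquad
\bar{z} = \frac{1}{n} \sum_{i=1}^{n} z_i, \qquad
\bar{t} = \frac{1}{n} \sum_{i=1}^{n} t_i \\
A &= e^{a_0}, \qquad \lambda = a_1
\end{align*}
```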
So the summation of the time values is 25. We don't find the summation of the gamma values, which is a big mistake students make, because they find the summation of the gamma quantity rather than the z quantity. So we find the summation of the z quantity, the summation of t i times z i, and the summation of t i squared, because those are the summations which are in our formulas for a one and a naught. We have six data points, and the summations are being placed right here to show them to you separately, coming from these tabular values here. What I'm going to do now is simply take these summations and substitute them into the formulas for a one and a naught, and we'll be able to get numbers for our two constants a naught and a one of the z versus time data. So, this is what we get by making the substitution: we get a one equal to minus 0.11505. To find z bar we take the average of the z values, that is the summation of the z values divided by the number of data points, and then we take the average of the time values given to us; that gives us t bar. And that gives us a value of a naught equal to this quantity right here. Since we know that a naught is log of A, which is the substitution we made in order to convert our exponential model to a linear regression model, A will be nothing but e to the power of a naught, which we just found, and that turns out to be 0.99974. Lambda is the same as a one; we just substituted a one for lambda for convenience's sake, so that comes out to be minus 0.11505. So, in fact, we have found the constants of the regression model: this is A, and this is lambda. The resulting model will be this: gamma is equal to A e to the power lambda t. A is this quantity and lambda is this quantity, and that's how we're able to write what the resulting model is.
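As a quick sanity check on the fitted constants quoted above (A = 0.99974, lambda = -0.11505 per hour), one can compare the half-life implied by lambda with the roughly 6-hour half-life mentioned for technetium-99m. This check is not part of the lecture; it is a sketch that uses the standard relation half-life = ln(2) / |lambda|, and the function names are hypothetical.

```python
import math

# Fitted constants quoted in the lecture.
A = 0.99974
LAMBDA = -0.11505  # per hour

def relative_intensity(t_hours, a=A, lam=LAMBDA):
    """Model prediction gamma = a * e^(lam * t)."""
    return a * math.exp(lam * t_hours)

def implied_half_life(lam=LAMBDA):
    """Time for the modeled intensity to halve: ln(2) / |lam|."""
    return math.log(2) / abs(lam)

print("implied half-life (hours):", implied_half_life())
print("relative intensity at t = 0:", relative_intensity(0.0))
print("relative intensity at t = 24 h:", relative_intensity(24.0))
```

The implied half-life comes out close to 6 hours, which is consistent with the value the lecture quotes for the isotope, and the predicted intensity at t = 0 is close to the observed value of 1.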
We have also plotted the data points and the curve for the sake of convenience, and this is how it looks. Visually it looks like it's a good fit, but as we are going to talk about later, just visualizing the data is not enough to figure out whether a particular regression model is a good fit. And that's the end of this segment. |