CHAPTER 06.04: NONLINEAR REGRESSION: Harmonic Decline Curve Model: Transformed Data: Derivation
In this segment, we are going to derive the regression formulas for something called a harmonic decline curve. The harmonic decline curve is one of the three primary curves used by people who are drilling for oil to figure out how long an oil well will last, when to abandon it, and how much oil it will produce per day. We'll do the derivation by using transformed data, and what we mean by transformed data will be quite evident when we go through the derivation itself.
So our problem statement is still the same: you're given n data points, (x1, y1), (x2, y2), and so on up to (xn, yn), and you want to best fit the harmonic decline curve to the data, which is given by y = b / (1 + a x). And why is it called a harmonic decline curve? The decline part is because as x approaches infinity, y goes to 0. We can also see that when x is equal to 0, the value of y is equal to b, which is the initial value. So if we are talking about an oil production well, b will be the initial production which you have from the well itself, and as x approaches infinity, y goes to 0.
What you want to be able to figure out is, let's suppose, what the value of x will be when y reaches some quantity, or what the value of y will be at a particular value of x; those are the kinds of things you want to get from this harmonic decline curve. What we're going to pay our attention to is how we find the constants of the model. We have two constants, a and b, and we want to be able to find them.
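The two questions above (what is y at a given x, and at what x does y reach a given value) can be sketched in a few lines. This is only an illustration: the function names and the constants a = 0.5 and b = 100 are hypothetical, not values from the lecture. The second function comes from solving y = b / (1 + a x) for x.

```python
def harmonic_decline(x, a, b):
    """Predicted value y = b / (1 + a*x) of the harmonic decline model."""
    return b / (1.0 + a * x)


def x_at_value(y, a, b):
    """Solve y = b / (1 + a*x) for x, which gives x = (b/y - 1) / a."""
    return (b / y - 1.0) / a


# Hypothetical constants, chosen only for illustration.
a, b = 0.5, 100.0

y_start = harmonic_decline(0.0, a, b)  # at x = 0 the model returns b
x_half = x_at_value(50.0, a, b)        # x at which y has dropped to 50

print(y_start, x_half)
```

Both functions assume a and b are already known; finding them from data is what the rest of the derivation is about.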
And what we're going to do is use transformed data to be able to do that. How do we go about doing that? We are given these n data pairs, but we're going to transform that data so that we can find the values of a and b. I'm going to take the inverse of both sides of the model, which exchanges the numerator with the denominator. Then 1/y = (1 + a x)/b = 1/b + (a/b) x. So now you can very well see that if I call 1/y to be z, 1/b to be a0, and a/b to be a1, I get z = a0 + a1 x. So z versus x is a linear relationship, which is simply a straight line.
What that means is that I can use my formulas from linear regression to calculate a0 and a1. But a0 and a1 are not the constants of my harmonic decline curve; the constants of my harmonic decline curve are a and b. I can see, though, that there is a relationship between a0 and a1 on one side and a and b on the other. If I am able to find a0 and a1 by using my linear regression formulas, then a0 = 1/b implies b = 1/a0, so that gives me the value of b. How do I find the value of a? Because a/b is a1, a is b times a1, and since b is 1/a0, a turns out to be a1/a0.
So I'm able to find the two constants of the original harmonic decline model from the constants of the linear model. But again, keep in mind that the linear relationship is between z and x, not between y and x, because the relationship between y and x is the nonlinear one of the harmonic decline curve. The only reason we are doing this conversion is so that we can find the values of a0 and a1 by using the linear regression formulas, and then backtrack the values of the original constants, a and b, from the values of a1 and a0.
Now, in order to find the values of z, I have to take the inverse of the y values. That means the zi values are simply zi = 1/yi, so all the y values which are given to you have to be inverted to get the corresponding z values. Once you have done the inversion, you have to find a0 and a1. I'm going to write the formulas for completeness. The formula for a1, coming from linear regression, is a1 = (n Σ xi zi - Σ xi Σ zi) / (n Σ xi^2 - (Σ xi)^2), where every sum runs from i = 1 to n. And a0 is nothing but z-bar minus a1 times x-bar, where z-bar and x-bar are the averages of the zi and xi values. So that's how we will be able to find the values of a0 and a1. Now, keep in mind that what we are doing here is that we are not minimizing the original sum of the square of the residuals.
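The steps above (invert the y values, apply the linear regression formulas to z versus x, then back out a and b) can be sketched as follows. This is a minimal sketch, not code from the lecture: the function name is hypothetical, and the data are hypothetical, noise-free values generated from a = 0.5 and b = 100 so that the fit has a known answer to recover.

```python
def fit_harmonic_transformed(x, y):
    """Fit y = b / (1 + a*x) by linear regression of z = 1/y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    # Transform the data: z_i = 1 / y_i
    z = [1.0 / yi for yi in y]

    # Sums that appear in the linear regression formulas
    sum_x = sum(x)
    sum_z = sum(z)
    sum_xz = sum(xi * zi for xi, zi in zip(x, z))
    sum_xx = sum(xi * xi for xi in x)

    # Slope and intercept of the straight line z = a0 + a1*x
    a1 = (n * sum_xz - sum_x * sum_z) / (n * sum_xx - sum_x ** 2)
    a0 = sum_z / n - a1 * sum_x / n  # z-bar minus a1 times x-bar

    # Back out the constants of the original model
    b = 1.0 / a0
    a = a1 / a0
    return a, b


# Hypothetical noise-free data generated from a = 0.5, b = 100.
x_data = [0.0, 1.0, 2.0, 3.0, 4.0]
y_data = [100.0 / (1.0 + 0.5 * xi) for xi in x_data]

a_fit, b_fit = fit_harmonic_transformed(x_data, y_data)
print(a_fit, b_fit)  # should be close to 0.5 and 100
```

Because the data here lie exactly on the curve, the transformed fit recovers the constants up to rounding; with noisy data it generally would not match the fit obtained by minimizing the residuals in y directly.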
The actual sum of the square of the residuals I should have been finding is Sr = Σ (yi - b / (1 + a xi))^2, with the sum running from i = 1 to n, and I should be minimizing this with respect to a and b to find their values. But we know that if we take this sum of the square of the residuals, which is the observed value minus the predicted value, squared and then added up, and try to minimize it with respect to a and b, then we're going to get two nonlinear equations. The only reason we did the transformation of the data was to be able to find a and b by using the simple linear regression formulas for a0 and a1.
So the actual Sr which you are calculating is Σ (zi - a0 - a1 xi)^2, with i going from 1 to n; that's the sum of the square of the residuals which you are minimizing when you are using the transformed data. Since zi is nothing but 1/yi, a0 is nothing but 1/b, and a1 is nothing but a/b, this is Σ (1/yi - 1/b - (a/b) xi)^2. So there's a difference. The first sum is what you should have minimized if you were going to start from the basics of the sum of the square of the residuals and minimize with respect to the constants of the model, but because of the transformed data, you are actually minimizing the second sum. So again, I want to emphasize this is not the optimal way of doing this, because the best fit would come from minimizing the sum of the square of the residuals in y, not the one in 1/y; the only reason you are doing this is because it's mathematically convenient. And that's the end of this segment.
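To make the difference between the two sums concrete, the sketch below evaluates both of them for the same constants a and b. The function names, the three data points, and the constants are all hypothetical and chosen only for illustration; nothing here is fitted.

```python
def sr_original(x, y, a, b):
    """Sum of squared residuals in y: sum of (y_i - b / (1 + a*x_i))^2."""
    return sum((yi - b / (1.0 + a * xi)) ** 2 for xi, yi in zip(x, y))


def sr_transformed(x, y, a, b):
    """Sum of squared residuals in z = 1/y: sum of (1/y_i - 1/b - (a/b)*x_i)^2."""
    a0 = 1.0 / b
    a1 = a / b
    return sum((1.0 / yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical data and constants, for illustration only.
x_data = [0.0, 1.0, 2.0]
y_data = [100.0, 70.0, 50.0]
a, b = 0.5, 100.0

sr_y = sr_original(x_data, y_data, a, b)
sr_z = sr_transformed(x_data, y_data, a, b)
print(sr_y, sr_z)
```

The two numbers measure different things, residuals in y versus residuals in 1/y, so the constants that minimize one are not in general the constants that minimize the other.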