Let \(y\) be a function of \(x\text{.}\) We have studied in great detail the derivative of \(y\) with respect to \(x\text{,}\) that is, \(\frac{dy}{dx}\text{,}\) which measures the rate at which \(y\) changes with respect to \(x\text{.}\) Consider now \(z=f(x,y)\text{.}\) It makes sense to want to know how \(z\) changes with respect to \(x\) and/or \(y\text{.}\) This section begins our investigation into these rates of change.
Subsection 13.3.1 First-order partial derivatives
Consider the function \(f(x,y) = x^2+2y^2\text{,}\) as graphed in Figure 13.3.1.(a). By fixing \(y=2\text{,}\) we focus our attention on all points on the surface where the \(y\)-value is 2, shown in both Figure 13.3.1.(a) and Figure 13.3.1.(b). These points form a curve in the plane \(y=2\text{:}\) \(z = f(x,2) = x^2+8\text{,}\) which defines \(z\) as a function of just one variable. We can take the derivative of \(z\) with respect to \(x\) along this curve and find equations of tangent lines, etc.
The key notion to extract from this example is: by treating \(y\) as constant (it does not vary) we can consider how \(z\) changes with respect to \(x\text{.}\) In a similar fashion, we can hold \(x\) constant and consider how \(z\) changes with respect to \(y\text{.}\) This is the underlying principle of partial derivatives. We state the formal, limit-based definition first, then show how to compute these partial derivatives without directly taking limits.
Definition 13.3.2. Partial Derivative.
Let \(z=f(x,y)\) be a continuous function on a set \(S\) in \(\mathbb{R}^2\text{.}\)
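The partial derivative of \(f\) with respect to \(x\) is \(f_x(x,y) = \lim_{h\to 0}\frac{f(x+h,y)-f(x,y)}{h}\text{,}\) and the partial derivative of \(f\) with respect to \(y\) is \(f_y(x,y) = \lim_{h\to 0}\frac{f(x,y+h)-f(x,y)}{h}\text{,}\) wherever these limits exist.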
Example 13.3.3 found a partial derivative using the formal, limit-based definition. Using limits is not necessary, though, as we can rely on our previous knowledge of derivatives to compute partial derivatives easily. When computing \(f_x(x,y)\text{,}\) we hold \(y\) fixed — it does not vary. Therefore we can compute the derivative with respect to \(x\) by treating \(y\) as a constant or coefficient.
Just as \(\frac{d}{dx}\big(5x^2\big) = 10x\text{,}\) we compute \(\frac{\partial}{\partial x}\big(x^2y\big) = 2xy\text{.}\) Here we are treating \(y\) as a coefficient.
Just as \(\frac{d}{dx}\big(5^3\big) = 0\text{,}\) we compute \(\frac{\partial}{\partial x}\big(y^3\big) = 0\text{.}\) Here we are treating \(y\) as a constant. More examples will help make this clear.
Example 13.3.4. Finding partial derivatives.
Find \(f_x(x,y)\) and \(f_y(x,y)\) in each of the following.
\(\displaystyle f(x,y) = x^3y^2+ 5y^2-x+7\)
\(\displaystyle f(x,y) = \cos(xy^2)+\sin(x)\)
\(\displaystyle f(x,y) = e^{x^2y^3}\sqrt{x^2+1}\)
Solution.
We have \(f(x,y) = x^3y^2+ 5y^2-x+7\text{.}\) Begin with \(f_x(x,y)\text{.}\) Keep \(y\) fixed, treating it as a constant or coefficient, as appropriate:
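Term by term, the computation can be sketched as follows. The first line treats \(y\) as fixed; the second repeats the process with \(x\) held fixed instead.

\begin{align*}
f_x(x,y) &= 3x^2y^2 - 1,\\
f_y(x,y) &= 2x^3y + 10y\text{.}
\end{align*}

Note that the \(5y^2\) and \(7\) terms contribute nothing to \(f_x\text{,}\) while the \(-x\) and \(7\) terms contribute nothing to \(f_y\text{.}\)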
We have \(f(x,y) = \cos(xy^2)+\sin(x)\text{.}\) Begin with \(f_x(x,y)\text{.}\) We need to apply the Chain Rule with the cosine term; \(y^2\) is the coefficient of the \(x\)-term inside the cosine function.
To find \(f_y(x,y)\text{,}\) note that \(x\) is the coefficient of the \(y^2\) term inside of the cosine term; also note that since \(x\) is fixed, \(\sin(x)\) is also fixed, and we treat it as a constant.
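Carrying out both computations for \(f(x,y) = \cos(xy^2)+\sin(x)\) gives the following. This is a worked sketch in which the inner derivative from the Chain Rule is \(y^2\) when differentiating with respect to \(x\) and \(2xy\) when differentiating with respect to \(y\text{.}\)

\begin{align*}
f_x(x,y) &= -\sin(xy^2)\cdot y^2 + \cos(x) = -y^2\sin(xy^2)+\cos(x),\\
f_y(x,y) &= -\sin(xy^2)\cdot 2xy = -2xy\sin(xy^2)\text{.}
\end{align*}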
Note that when finding \(f_y(x,y)\) we do not have to apply the Product Rule; since \(\sqrt{x^2+1}\) does not contain \(y\text{,}\) we treat it as fixed, and hence it becomes a coefficient of the \(e^{x^2y^3}\) term.
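For the third function, \(f(x,y) = e^{x^2y^3}\sqrt{x^2+1}\text{,}\) the two partial derivatives can be worked out as below. This is a sketch: \(f_x\) needs the Product Rule because both factors contain \(x\text{,}\) whereas \(f_y\) needs only the Chain Rule on the exponential factor.

\begin{align*}
f_x(x,y) &= 2xy^3e^{x^2y^3}\sqrt{x^2+1} + e^{x^2y^3}\frac{x}{\sqrt{x^2+1}},\\
f_y(x,y) &= 3x^2y^2e^{x^2y^3}\sqrt{x^2+1}\text{.}
\end{align*}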
We have shown how to compute a partial derivative, but it may still not be clear what a partial derivative means. Given \(z=f(x,y)\text{,}\) \(f_x(x,y)\) measures the rate at which \(z\) changes as only \(x\) varies: \(y\) is held constant.
Imagine standing in a rolling meadow, then beginning to walk due east. Depending on your location, you might walk up, sharply down, or perhaps not change elevation at all. This is similar to measuring \(z_x\text{:}\) you are moving only east (in the “\(x\)”-direction) and not north/south at all. Going back to your original location, imagine now walking due north (in the “\(y\)”-direction). Perhaps walking due north does not change your elevation at all. This is analogous to \(z_y=0\text{:}\) \(z\) does not change with respect to \(y\text{.}\) We can see that \(z_x\) and \(z_y\) do not have to be the same, or even similar, as it is easy to imagine circumstances where walking east means you walk downhill, though walking north makes you walk uphill.
The following example helps us visualize this more.
Example 13.3.5. Evaluating partial derivatives.
Let \(z=f(x,y)=-x^2-\frac12y^2+xy+10\text{.}\) Find \(f_x(2,1)\) and \(f_y(2,1)\) and interpret their meaning.
Solution.
We begin by computing \(f_x(x,y) = -2x+y\) and \(f_y(x,y) = -y+x\text{.}\) Thus
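Substituting \((x,y)=(2,1)\) into these two formulas gives the values used in the interpretation that follows:

\begin{align*}
f_x(2,1) &= -2(2)+1 = -3,\\
f_y(2,1) &= -1+2 = 1\text{.}
\end{align*}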
It is also useful to note that \(f(2,1) = 7.5\text{.}\) What does each of these numbers mean?
Consider \(f_x(2,1)=-3\text{,}\) along with Figure 13.3.6.(a). If one “stands” on the surface at the point \((2,1,7.5)\) and moves parallel to the \(x\)-axis (i.e., only the \(x\)-value changes, not the \(y\)-value), then the instantaneous rate of change is \(-3\text{.}\) Increasing the \(x\)-value will decrease the \(z\)-value; decreasing the \(x\)-value will increase the \(z\)-value.
Figure 13.3.6. Illustrating the meaning of partial derivatives
Now consider \(f_y(2,1)=1\text{,}\) illustrated in Figure 13.3.6.(b). Moving along the curve drawn on the surface, i.e., parallel to the \(y\)-axis and not changing the \(x\)-values, increases the \(z\)-value instantaneously at a rate of 1. Increasing the \(y\)-value by 1 would increase the \(z\)-value by approximately 1.
Since the magnitude of \(f_x\) is greater than the magnitude of \(f_y\) at \((2,1)\text{,}\) the surface is “steeper” in the \(x\)-direction than in the \(y\)-direction at that point.
Subsection 13.3.2 Tangent Planes
Another way to interpret partial derivatives is in terms of the tangent plane. Consider the graph of a function \(f(x,y)\text{,}\) such as the one in Figure 13.3.1. Setting \(x=a\text{,}\) \(y=b\) defines a point \((a,b,f(a,b))\) on the graph. Through the point \((a,b)\) in the \(xy\)-plane, we have the lines \(x=a+s, y=b\) and \(x=a, y=b+t\text{,}\) parallel to the \(x\)- and \(y\)-axes, respectively (where \(s\) and \(t\) are parameters).
Using the function \(f(x,y)\) we define two vector-valued functions:
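Lifting each of the two lines above onto the surface gives a pair of curves. The names \(\vec{r}_1\) and \(\vec{r}_2\) match those used in the discussion that follows; the component form shown is the natural reading of that construction rather than a quoted formula.

\begin{align*}
\vec{r}_1(s) &= \langle a+s,\, b,\, f(a+s,b)\rangle,\\
\vec{r}_2(t) &= \langle a,\, b+t,\, f(a,b+t)\rangle\text{.}
\end{align*}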
Both vector-valued functions define space curves that lie on the surface \(z=f(x,y)\text{,}\) and these curves intersect at the point \((a,b,f(a,b))\text{,}\) when \(s=t=0\text{.}\)
Now consider computing \(\vec{r}_1'(s)\text{.}\) The first two components of this derivative are found in a straightforward manner: they are \(1\) and \(0\text{,}\) respectively. To find the third component of the derivative, notice that in \(\vec{r}_1(s)\) we vary the \(x\)-component of \(f\) while holding the \(y\)-component constant. Using the Chain Rule and Definition 13.3.2, we find that the third component is \(f_x(a+s,b)\text{.}\) Altogether, we have
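In component form, and with the analogous computation for \(\vec{r}_2\) (where only the \(y\)-input of \(f\) varies), the two derivatives can be written as:

\begin{align*}
\vec{r}_1'(s) &= \langle 1,\, 0,\, f_x(a+s,b)\rangle,\\
\vec{r}_2'(t) &= \langle 0,\, 1,\, f_y(a,b+t)\rangle\text{.}
\end{align*}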
From Section 12.2, we know that \(\vec{r}_1'(0)\) defines a tangent vector to the curve \(\vec{r}_1(s)\) when \(s=0\text{,}\) and similarly, \(\vec{r}_2'(0)\) defines a tangent vector to the curve \(\vec{r}_2(t)\) when \(t=0\text{.}\)
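Evaluating at \(s=0\) and \(t=0\) gives the two tangent vectors explicitly. The labels \(\vec{v}\) and \(\vec{w}\) are assumed here to name \(\vec{r}_1'(0)\) and \(\vec{r}_2'(0)\text{,}\) in that order.

\begin{align*}
\vec{v} &= \vec{r}_1'(0) = \langle 1,\, 0,\, f_x(a,b)\rangle,\\
\vec{w} &= \vec{r}_2'(0) = \langle 0,\, 1,\, f_y(a,b)\rangle\text{.}
\end{align*}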
It seems reasonable that any vector that is tangent to these curves, which lie on our surface, should also be considered tangent to that surface. The vectors \(\vec{v}\) and \(\vec{w}\) are therefore tangent to \(z=f(x,y)\) at \((a,b,f(a,b))\text{,}\) and they are definitely not parallel. From Section 11.6 we know that any two non-parallel vectors at a point define a plane through that point. We also know that taking the cross product of these two vectors gives us a normal vector: the cross product gives us
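As a sketch of this computation, take the two tangent vectors to be \(\langle 1,0,f_x(a,b)\rangle\) and \(\langle 0,1,f_y(a,b)\rangle\text{,}\) crossed in that order. The order is an assumption made here; reversing it only flips the sign of the normal vector and describes the same plane. The normal vector \(\vec{n}\) and the plane through \((a,b,f(a,b))\) are then:

\begin{align*}
\vec{n} &= \langle 1,0,f_x(a,b)\rangle \times \langle 0,1,f_y(a,b)\rangle = \langle -f_x(a,b),\, -f_y(a,b),\, 1\rangle,\\
0 &= -f_x(a,b)(x-a) - f_y(a,b)(y-b) + \big(z-f(a,b)\big)\text{.}
\end{align*}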
It is customary to solve for \(z\) in this equation and make the following definition.
Definition 13.3.7.
Let \(f(x,y)\) be a function whose first-order partial derivatives exist at \((a,b)\text{.}\) The tangent plane to the surface \(z=f(x,y)\) at the point \((a,b,f(a,b))\) is the plane defined by the equation
\begin{equation*}
z = f(a,b)+f_x(a,b)(x-a)+f_y(a,b)(y-b)\text{.}
\end{equation*}
Example 13.3.8. Finding a tangent plane equation.
Find the equation of the tangent plane to the surface \(z=x^2+3y^2\) at \((x,y)=(1,-1)\text{.}\)
Solution.
Our function is \(f(x,y)=x^2+3y^2\text{,}\) and we have \(f(1,-1)=4\text{,}\) so the point on the surface is \((1,-1,4)\text{.}\) The partial derivatives are \(f_x(x,y)=2x\) and \(f_y(x,y)=6y\text{,}\) so \(f_x(1,-1)=2\text{,}\) \(f_y(1,-1)=-6\text{.}\) Using Definition 13.3.7, our plane is given by
\begin{equation*}
z = 4+2(x-1)-6(y+1)\text{.}
\end{equation*}
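Expanding the right-hand side gives an equivalent form that can be easier to check. Substituting \((x,y)=(1,-1)\) recovers \(z=4\text{,}\) the height of the point of tangency, as it should.

\begin{align*}
z &= 4+2x-2-6y-6 = 2x-6y-4,\\
z\big|_{(1,-1)} &= 2(1)-6(-1)-4 = 4\text{.}
\end{align*}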
Notice the similarity between the tangent plane equation in Definition 13.3.7 and the single variable tangent line equation \(y = f(c)+f'(c)(x-c)\text{.}\) As with functions of one variable, this suggests a connection between derivatives and linear approximation. We explore this connection in Section 13.4, where we’ll see that Definition 13.3.7 should be strengthened to require that the partial derivatives of \(f\) be continuous.
Subsection 13.3.3 Second-order partial derivatives
Let \(z=f(x,y)\text{.}\) We have learned to find the partial derivatives \(f_x(x,y)\) and \(f_y(x,y)\text{,}\) which are each functions of \(x\) and \(y\text{.}\) Therefore we can take partial derivatives of them, each with respect to \(x\) and \(y\text{.}\) We define these “second partials” along with the notation, give examples, then discuss their meaning.
Similar definitions hold for \(\frac{\partial^2f}{\partial y^2} = f_{yy}\) and \(\frac{\partial^2f}{\partial x\partial y} = f_{yx}\text{.}\)
The second partial derivatives \(f_{xy}\) and \(f_{yx}\) are mixed partial derivatives.
The notation of second partial derivatives gives some insight into the notation of the second derivative of a function of a single variable. If \(y=f(x)\text{,}\) then \(f''(x) = \frac{d^2 y}{dx^2}\text{.}\) The “\(d^2y\)” portion means “take the derivative of \(y\) twice,” while “\(dx^2\)” means “with respect to \(x\) both times.” When we only know of functions of a single variable, this latter phrase seems silly: there is only one variable to take the derivative with respect to. Now that we understand functions of multiple variables, we see the importance of specifying which variables we are referring to.
Example 13.3.10. Second partial derivatives.
For each of the following, find all six first and second partial derivatives. That is, find \(f_x\text{,}\) \(f_y\text{,}\) \(f_{xx}\text{,}\) \(f_{yy}\text{,}\) \(f_{xy}\) and \(f_{yx}\text{.}\)
\(f(x,y) = e^x\sin(x^2y)\) Because the following partial derivatives get rather long, we omit the extra notation and just give the results. In several cases, multiple applications of the Product and Chain Rules will be necessary, followed by some basic combination of like terms.
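For reference, one way the six results for \(f(x,y) = e^x\sin(x^2y)\) can be written is given below. This is a worked sketch; the two mixed partials are computed independently, \(f_{xy}\) by differentiating \(f_x\) with respect to \(y\) and \(f_{yx}\) by differentiating \(f_y\) with respect to \(x\text{,}\) and they agree.

\begin{align*}
f_x &= e^x\sin(x^2y) + 2xye^x\cos(x^2y),\\
f_y &= x^2e^x\cos(x^2y),\\
f_{xx} &= e^x\sin(x^2y) + 4xye^x\cos(x^2y) + 2ye^x\cos(x^2y) - 4x^2y^2e^x\sin(x^2y),\\
f_{yy} &= -x^4e^x\sin(x^2y),\\
f_{xy} &= x^2e^x\cos(x^2y) + 2xe^x\cos(x^2y) - 2x^3ye^x\sin(x^2y),\\
f_{yx} &= x^2e^x\cos(x^2y) + 2xe^x\cos(x^2y) - 2x^3ye^x\sin(x^2y)\text{.}
\end{align*}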
Notice how in each of the three functions in Example 13.3.10, \(f_{xy} = f_{yx}\text{.}\) Due to the complexity of the examples, this likely is not a coincidence. The following theorem states that it is not.
Theorem 13.3.11. Mixed Partial Derivatives.
Let \(f\) be defined such that \(f_{xy}\) and \(f_{yx}\) are continuous on a set \(S\text{.}\) Then for each point \((x,y)\) in \(S\text{,}\) \(f_{xy}(x,y) = f_{yx}(x,y)\text{.}\)
Finding \(f_{xy}\) and \(f_{yx}\) independently and comparing the results provides a convenient way of checking our work.
Subsection 13.3.4 Understanding Second Partial Derivatives
Now that we know how to find second partials, we investigate what they tell us.
Again we refer back to a function \(y=f(x)\) of a single variable. The second derivative of \(f\) is “the derivative of the derivative,” or “the rate of change of the rate of change.” The second derivative measures how much the derivative is changing. If \(f''(x)\lt 0\text{,}\) then the derivative is getting smaller (so the graph of \(f\) is concave down); if \(f''(x) \gt 0\text{,}\) then the derivative is growing, making the graph of \(f\) concave up.
Now consider \(z=f(x,y)\text{.}\) Similar statements can be made about \(f_{xx}\) and \(f_{yy}\) as could be made about \(f''(x)\) above. When taking derivatives with respect to \(x\) twice, we measure how much \(f_x\) changes with respect to \(x\text{.}\) If \(f_{xx}(x,y)\lt 0\text{,}\) it means that as \(x\) increases, \(f_x\) decreases, and the graph of \(f\) will be concave down in the \(x\)-direction. Using the analogy of standing in the rolling meadow used earlier in this section, \(f_{xx}\) measures whether one’s path is concave up/down when walking due east.
Similarly, \(f_{yy}\) measures the concavity in the \(y\)-direction. If \(f_{yy}(x,y) \gt 0\text{,}\) then \(f_y\) is increasing with respect to \(y\) and the graph of \(f\) will be concave up in the \(y\)-direction. Appealing to the rolling meadow analogy again, \(f_{yy}\) measures whether one’s path is concave up/down when walking due north.
We now consider the mixed partials \(f_{xy}\) and \(f_{yx}\text{.}\) The mixed partial \(f_{xy}\) measures how much \(f_x\) changes with respect to \(y\text{.}\) Once again using the rolling meadow analogy, \(f_{x}\) measures the slope if one walks due east. Looking east, begin walking north (side-stepping). Is the path towards the east getting steeper? If so, \(f_{xy} \gt 0\text{.}\) Is the path towards the east not changing in steepness? If so, then \(f_{xy}=0\text{.}\) A similar thing can be said about \(f_{yx}\text{:}\) consider the steepness of paths heading north while side-stepping to the east.
The following example examines these ideas with concrete numbers and graphs.
Example 13.3.12. Understanding second partial derivatives.
Let \(z=f(x,y)=x^2-y^2+xy\text{.}\) Evaluate the six first and second partial derivatives at \((-1/2,1/2)\) and interpret what each of these numbers means.
Solution.
We find that:
\(f_x(x,y) = 2x+y\text{,}\) \(f_y(x,y) = -2y+x\text{,}\) \(f_{xx}(x,y) = 2\text{,}\) \(f_{yy}(x,y) = -2\) and \(f_{xy}(x,y) = f_{yx}(x,y) = 1\text{.}\) Thus at \((-1/2,1/2)\) we have
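Substituting \(x=-1/2\) and \(y=1/2\) into the formulas above gives the six values interpreted in the rest of this example; the second partials are constant, so they need no substitution.

\begin{align*}
f_x(-1/2,1/2) &= 2(-1/2)+1/2 = -1/2, & f_y(-1/2,1/2) &= -2(1/2)+(-1/2) = -3/2,\\
f_{xx}(-1/2,1/2) &= 2, & f_{yy}(-1/2,1/2) &= -2,\\
f_{xy}(-1/2,1/2) &= 1, & f_{yx}(-1/2,1/2) &= 1\text{.}
\end{align*}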
The slope of the tangent line at \((-1/2, 1/2, -1/4)\) in the direction of \(x\) is \(-1/2\text{:}\) if one moves from that point parallel to the \(x\)-axis, the instantaneous rate of change will be \(-1/2\text{.}\) The slope of the tangent line at this point in the direction of \(y\) is \(-3/2\text{:}\) if one moves from this point parallel to the \(y\)-axis, the instantaneous rate of change will be \(-3/2\text{.}\) These tangent lines are graphed in Figure 13.3.13.(a) and Figure 13.3.13.(b), respectively, where they are drawn as solid lines.
Figure 13.3.13. Understanding the second partial derivatives in Example 13.3.12
Now consider only Figure 13.3.13.(a). Three directed tangent lines are drawn (two are dashed), each in the direction of \(x\text{;}\) that is, each has a slope determined by \(f_x\text{.}\) Note how as \(y\) increases, the slopes of these lines get closer to \(0\text{.}\) Since the slopes are all negative, getting closer to 0 means the slopes are increasing. The slopes given by \(f_x\) are increasing as \(y\) increases, meaning \(f_{xy}\) must be positive.
Since \(f_{xy}=f_{yx}\text{,}\) we also expect \(f_y\) to increase as \(x\) increases. Consider Figure 13.3.13.(b) where again three directed tangent lines are drawn, this time each in the direction of \(y\) with slopes determined by \(f_y\text{.}\) As \(x\) increases, the slopes become less steep (closer to 0). Since these are negative slopes, this means the slopes are increasing.
Thus far we have a visual understanding of \(f_x\text{,}\) \(f_y\text{,}\) and \(f_{xy}=f_{yx}\text{.}\) We now interpret \(f_{xx}\) and \(f_{yy}\text{.}\) In Figure 13.3.13.(a), we see a curve drawn where \(x\) is held constant at \(x=-1/2\text{:}\) only \(y\) varies. This curve is clearly concave down, corresponding to the fact that \(f_{yy}\lt 0\text{.}\) In Figure 13.3.13.(b), we see a similar curve where \(y\) is constant and only \(x\) varies. This curve is concave up, corresponding to the fact that \(f_{xx} \gt 0\text{.}\)
Subsection 13.3.5 Partial Derivatives and Functions of Three Variables
The concepts underlying partial derivatives can be easily extended to more than two variables. We give some definitions and examples in the case of three variables and trust the reader can extend these definitions to more variables if needed.
Definition 13.3.14. Partial Derivatives with Three Variables.
Let \(w=f(x,y,z)\) be a continuous function on a set \(D\) in \(\mathbb{R}^3\text{.}\)
The partial derivative of \(f\) with respect to \(x\) is \(f_x(x,y,z) = \lim_{h\to 0}\frac{f(x+h,y,z)-f(x,y,z)}{h}\text{.}\)
Similar definitions hold for \(f_y(x,y,z)\) and \(f_z(x,y,z)\text{.}\)
By taking partial derivatives of partial derivatives, we can find second partial derivatives of \(f\) with respect to \(z\) then \(y\text{,}\) for instance, just as before.
Example 13.3.15. Partial derivatives of functions of three variables.
For each of the following, find \(f_x\text{,}\) \(f_y\text{,}\) \(f_z\text{,}\) \(f_{xz}\text{,}\) \(f_{yz}\text{,}\) and \(f_{zz}\text{.}\)
We can continue taking partial derivatives of partial derivatives of partial derivatives of …; we do not have to stop with second partial derivatives. These higher order partial derivatives do not have a tidy graphical interpretation; nevertheless, they are not hard to compute and are worthy of some practice.
We do not formally define each higher order derivative, but rather give just a few examples of the notation.
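As an illustration, two representative third-order cases are sketched below. They are chosen here for illustration and follow the same ordering convention as the second partials, where \(\frac{\partial^2f}{\partial x\partial y} = f_{yx}\text{:}\) the subscripts read left to right in the order the derivatives are taken, while the denominator of the fraction reads right to left.

\begin{align*}
\frac{\partial^3 f}{\partial y\,\partial x^2} &= \frac{\partial}{\partial y}\left(\frac{\partial^2 f}{\partial x^2}\right) = f_{xxy},\\
\frac{\partial^3 f}{\partial z\,\partial y\,\partial x} &= \frac{\partial}{\partial z}\left(\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right)\right) = f_{xyz}\text{.}
\end{align*}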
In the previous example we saw that \(f_{xxy} = f_{yxx}\text{;}\) this is not a coincidence. While we do not state this as a formal theorem, as long as each partial derivative is continuous, the order in which the partial derivatives are taken does not matter. For instance, \(f_{xxy} = f_{xyx} = f_{yxx}\text{.}\)
This can be useful at times. Had we known this, the second part of Example 13.3.16 would have been much simpler to compute. Instead of computing \(f_{xyz}\) in the order \(x\text{,}\) \(y\text{,}\) then \(z\text{,}\) we could have applied the order \(z\text{,}\) \(x\text{,}\) then \(y\) (as \(f_{xyz} = f_{zxy}\)). It is easy to see that \(f_z = -\sin(z)\text{;}\) then \(f_{zx}\) and \(f_{zxy}\) are clearly 0, as \(f_z\) does not contain an \(x\) or \(y\text{.}\)
A brief review of this section: partial derivatives measure the instantaneous rate of change of a multivariable function with respect to one variable. With \(z=f(x,y)\text{,}\) the partial derivatives \(f_x\) and \(f_y\) measure the instantaneous rate of change of \(z\) when moving parallel to the \(x\)- and \(y\)-axes, respectively. How do we measure the rate of change at a point when we do not move parallel to one of these axes? What if we move in the direction given by the vector \(\langle 2,1\rangle\text{?}\) Can we measure that rate of change? The answer is, of course, yes, we can. This is the topic of Section 13.6. First, we need to define what it means for a function of two variables to be differentiable.
Exercises 13.3.7 Exercises
Terms and Concepts
1.
What is the difference between a constant and a coefficient?
2.
Given a function \(f(x,y)\text{,}\) explain in your own words how to compute \(f_x\text{.}\)
3.
In the mixed partial derivative \(f_{xy}\text{,}\) which is computed first, \(f_x\) or \(f_y\text{?}\)
\(\displaystyle f_x\)
\(\displaystyle f_y\)
4.
In the mixed partial fraction \(\frac{\partial^2f}{\partial x\partial y}\text{,}\) which is computed first, \(f_x\) or \(f_y\text{?}\)
\(\displaystyle f_x\)
\(\displaystyle f_y\)
Problems
Exercise Group.
In the following exercises, evaluate \(f_x(x,y)\) and \(f_y(x,y)\) at the indicated point.
5.
\(f(x,y) = x^2y-x+2y+3\) at \((1,2)\)
6.
\(f(x,y) = x^3-3x+y^2-6y\) at \((-1,3)\text{.}\)
7.
\(f(x,y) = \sin(y) \cos(x)\) at \((\pi/3,\pi/3)\)
8.
\(f(x,y) = \ln(xy)\) at \((-2,-3)\)
Exercise Group.
In the following exercises, find \(f_x\text{,}\) \(f_y\text{,}\) \(f_{xx}\text{,}\) \(f_{yy}\text{,}\) \(f_{xy}\) and \(f_{yx}\text{.}\)