Section 15.8 Change of Variables in Multiple Integrals

We have seen in Sections 15.3 and 15.7 that switching to a different coordinate system can be a powerful tool. Integrals that are intractable (or even impossible) in one coordinate system can become straightforward in another.

Changes from rectangular coordinates to polar, cylindrical, or spherical coordinates are special cases of a general process known as a change of variables, or transformation. A change of variables should be considered in any situation where we are presented with an integral that is difficult to evaluate in rectangular coordinates.

Our goals in this section are as follows:

Understand how a change of variables affects the area element \(dA\) in a double integral, or the volume element \(dV\) in a triple integral.

Derive a general change of variables formula for multiple integrals that works for any suitable change of coordinates, including the ones we have already seen in Sections 15.3 and 15.7.

Develop some basic guiding principles for knowing when a change of variables should be considered, and how to define the corresponding transformation.

Subsection 15.8.1 Review of substitution techniques

One of the situations that should be covered by our general change of variables formula is that of substitution for a definite integral in one variable, as encountered in Section 6.1, way back in Calculus I. Of course, for a definite integral in one variable, there is only one type of region of integration: a closed interval \([a,b]\text{.}\) For single integrals, our only consideration when making a change of variables is the function being integrated. Recall that substitution — at least, for indefinite integrals — is essentially an attempt to reverse the Chain Rule: given

The formula we seek will be a generalization of this result, with one notable change in perspective: for multiple integrals, it is often the region of integration that creates most of the difficulty, and not the function being integrated. (In one variable, one closed interval is transformed into another, and we apply the Fundamental Theorem of Calculus.) What we will find is that in most cases, we start on the right-hand side of our analogue of Equation (15.8.1), and move to the left.

Changing to polar coordinates can be viewed as the process of writing our rectangular coordinates \((x,y)\) in terms of new variables \(r\) and \(\theta\text{:}\)

\begin{equation*}
x = r\cos\theta \quad y = r\sin\theta\text{.}
\end{equation*}

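We can think of the polar coordinate transformation as a change of variables, where we define new variables in terms of old ones, but we could also think of it as a function from the plane \(\mathbb{R}^2\) to itself. That is, we have a mapping

\begin{equation*}
T:D\subseteq \mathbb{R}^2\to \mathbb{R}^2, \quad T(r,\theta) = (r\cos\theta, r\sin\theta)\text{,}
\end{equation*}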

where \(D\) is some subset of \(\mathbb{R}^2\) (with coordinates labelled by \(r\) and \(\theta\)), and the codomain is \(\mathbb{R}^2\) with usual \((x,y)\) coordinates. As we know from Section 15.3, the polar coordinate transformation \(T\) given above transforms a rectangle such as \(D=[0,3]\times [0,2\pi]\) into a disk — in this case, the set of points \((x,y)\) with \(x^2+y^2\leq 9\text{,}\) as shown in Figure 15.8.2 below.

It is interesting to pause and consider what happens to the four sides of the rectangle \(D\) in the transformation above. (As we’ll see, this particular transformation exhibits some behaviour we usually prefer to avoid!) First, the side with \(r=0\) is collapsed to a single point: the origin. The side with \(r=3\) forms the entire perimeter of the circle. What happens to the sides \(\theta=0\) and \(\theta=2\pi\text{?}\) They both get sent to the line segment from \((0,0)\) to \((3,0)\text{!}\)

These observations let us imagine the transformation as a physical process: first, the left side of the rectangle is shrunk down to a single point, while the right side is simultaneously stretched by a factor of 3. (Vertical lines in between are stretched/shrunk by a factor of \(r\text{,}\) with \(0\leq r\leq 3\text{.}\)) The top of the rectangle is then bent around until it joins with the bottom.

It is perhaps easier to picture the transformation for a domain of the form \([a,b]\times [\alpha,\beta]\text{,}\) with \(0\lt a\lt b\) and \(0\leq \alpha\lt \beta\lt 2\pi\text{.}\) The case \(r\in [1,2]\text{,}\)\(\theta\in [\pi/6, \pi/3]\) is pictured in Figure 15.8.3 below.
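To make this concrete, here is a short worked check for the rectangle \([1,2]\times[\pi/6,\pi/3]\text{,}\) assuming only the polar formulas \(x=r\cos\theta\text{,}\) \(y=r\sin\theta\text{.}\) The four corners are sent to

\begin{equation*}
T\left(1,\tfrac{\pi}{6}\right) = \left(\tfrac{\sqrt{3}}{2},\tfrac12\right), \quad T\left(2,\tfrac{\pi}{6}\right) = \left(\sqrt{3},1\right), \quad T\left(1,\tfrac{\pi}{3}\right) = \left(\tfrac12,\tfrac{\sqrt{3}}{2}\right), \quad T\left(2,\tfrac{\pi}{3}\right) = \left(1,\sqrt{3}\right)\text{.}
\end{equation*}

The sides \(r=1\) and \(r=2\) become arcs of the circles of radius 1 and 2, while the sides \(\theta=\pi/6\) and \(\theta=\pi/3\) become radial line segments. The rectangle has area \(1\cdot\frac{\pi}{6}=\frac{\pi}{6}\text{,}\) whereas its image, a sector of an annulus, has area

\begin{equation*}
\tfrac12\left(2^2-1^2\right)\left(\tfrac{\pi}{3}-\tfrac{\pi}{6}\right) = \tfrac{\pi}{4}\text{,}
\end{equation*}

so area is not preserved: for this rectangle it grows by a factor of \(3/2\text{.}\)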

Subsection 15.8.2 General transformations

Given a region \(R\) in the plane and an integral \(\iint_R f(x,y)\,dA\text{,}\) we will look for a domain \(D\subseteq \mathbb{R}^2\) and a function \(T:D\to \mathbb{R}^2\) of the form

\begin{equation*}
(x,y) = T(u,v) = (x(u,v), y(u,v))
\end{equation*}

that maps \(D\) onto \(R\text{,}\) which can be used to simplify our integral.

Definition 15.8.12 below specifies the properties we require for a function \(T:D\subseteq \mathbb{R}^n\to\mathbb{R}^n\) to be used to define a change of variables. To explain some of those properties, we will need the following definitions.

Definition 15.8.4. The image of a point or set.

Let \(D\subseteq \mathbb{R}^n\) be any subset, and let \(F:D\to \mathbb{R}^m\) be a function. For any point \(\mathbf{x}\in D\text{,}\) the image of \(\mathbf{x}\) under \(F\) is the point \(\mathbf{y}=F(\mathbf{x})\) in the range of \(F\text{.}\)

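For any subset \(C\subseteq D\text{,}\) the image of \(C\) under \(F\) is denoted \(F(C)\) and defined by

\begin{equation*}
F(C) = \{F(\mathbf{x})\in \mathbb{R}^m \,|\, \mathbf{x}\in C\}\text{.}
\end{equation*}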

In other words, \(\mathbf{y}\in F(C)\) if and only if \(\mathbf{y}\) is the image of \(\mathbf{x}\) for some \(\mathbf{x}\in C\text{.}\)

In particular, we denote the range (or image) of \(F\) by \(F(D)\text{.}\)

Definition 15.8.5. One-to-one and onto functions.

Let \(A\subseteq \mathbb{R}^n\text{,}\) let \(B\subseteq \mathbb{R}^m\text{,}\) and let \(T:A\to B\) be a function.

We say that \(T\) is one-to-one if no two points in \(A\) have the same image. That is, for any \(\mathbf{x}_1,\mathbf{x}_2\in A\text{,}\) if \(\mathbf{x}_1\neq \mathbf{x}_2\text{,}\) then \(T(\mathbf{x}_1)\neq T(\mathbf{x}_2)\text{.}\)

We say that \(T\) is onto if the range of \(T\) is \(B\text{;}\) that is, if \(T(A)=B\text{.}\)

A function used for a change of variables is called a transformation. Such functions need to be one-to-one, except possibly on the boundary of their domain, and they need to be continuously differentiable. (See Definition 15.8.12 below.) One of the important properties of a transformation, which we will justify later in this section (see Theorem 15.8.27), is that the boundary of a closed, bounded domain is mapped to the boundary of the range. This observation is key to visualizing the effect of a transformation.

Example 15.8.6.

Let \(D\subseteq \mathbb{R}^2\) be the rectangle defined by \(1\leq u\leq 2\) and \(0\leq v\leq 1\text{.}\) Determine the range of the function \(T:D\to\mathbb{R}^2\) defined by

\begin{equation*}
T(u,v) = (uv, u^2+v^2)\text{.}
\end{equation*}

The function \(T\) is continuously differentiable, since \(x=uv\) and \(y=u^2+v^2\) both have continuous first-order partial derivatives with respect to \(u\) and \(v\text{.}\) Showing that \(T\) is one-to-one is a mess of algebra that we omit here. From these properties, we can conclude that the boundary of \(D\) will be transformed to the boundary of \(T(D)\text{.}\)

Now, let’s see what happens to the boundary of \(D\text{.}\) The boundary consists of four line segments:

The segment \(u=1\text{,}\)\(0\leq v\leq 1\text{.}\)

The segment \(u=2\text{,}\)\(0\leq v\leq 1\text{.}\)

The segment \(1\leq u\leq 2\text{,}\)\(v=0\text{.}\)

The segment \(1\leq u\leq 2\text{,}\)\(v=1\text{.}\)

On the first segment, \(x=v\) and \(y=1+v^2\text{,}\) with \(0\leq v\leq 1\text{.}\) Eliminating the parameter \(v\) gives us the portion of the parabola \(y=1+x^2\) from \((0,1)\) to \((1,2)\text{.}\)

For the second segment we have \(x=2v\) and \(y=4+v^2\text{.}\) This is the part of the parabola \(y=4+\frac14 x^2\) from \((0,4)\) to \((2,5)\text{.}\)

The third segment has \(x=0\) and \(y=u^2\text{,}\) for \(1\leq u\leq 2\text{.}\) This is the portion of the \(y\) axis from \((0,1)\) to \((0,4)\text{.}\)

Finally, the fourth segment is given by \(x=u\text{,}\)\(y=u^2+1\text{,}\) for \(1\leq u\leq 2\text{.}\) This is again the parabola \(y=1+x^2\text{,}\) but this time \(1\leq x\leq 2\text{.}\)

The resulting region is plotted in Figure 15.8.7. Interestingly, two of the four sides of the rectangle bounding \(D\) were mapped to (different portions of) the same curve.

Note also that (without explicitly solving for the inverse function, giving \(u\) and \(v\) as functions of \(x\) and \(y\)) we can see that curves of constant \(x\) in the \(u,v\) plane are hyperbolas, and curves of constant \(y\) are circles.

One other observation is worthy of note: we mentioned above that we will be primarily concerned with finding transformations that can be used to simplify a double integral. Suppose we were given a double integral over \(R=T(D)\text{,}\) as pictured in Figure 15.8.7. We probably wouldn’t even consider a change of variables in this case, unless one was needed for the function being integrated: the region can be described by the inequalities

\begin{equation*}
0\leq x\leq 2, \quad 1+x^2\leq y\leq 4+\tfrac14 x^2\text{.}
\end{equation*}

We already learned how to deal with such regions at the beginning of this Chapter, and in any case, it’s unlikely that anyone looking at this region would come up with the transformation we just considered.

Subsection 15.8.3 The Jacobian of a transformation

We’re ready to move on, and describe the effect of a change of variables on an integral. We begin with an observation from single variable calculus. Consider the definition of a definite integral as a limit of Riemann sums. When we make a change of variables \(x=T(u)\) in a single integral, a partition of \([a,b]\) given by \(a=u_0\lt u_1\lt \cdots \lt u_n=b\) is transformed into a partition \(x_0=T(u_0),x_1=T(u_1), \ldots, x_n=T(u_n)\text{.}\) (As long as \(T'(u)\gt 0\text{,}\) we have \(x_0\lt x_1\lt \cdots \lt x_n\text{.}\))

The transformation affects the size of the subintervals in the partition: from Section 4.4, we know that \(\Delta x_i \approx T'(u_i)\Delta u_i\text{.}\) Thus, the derivative tells us how the size of each subinterval changes under the transformation.

This gives us a way of thinking about the geometric effect of a substitution. In the integral \(\displaystyle\int_{T(a)}^{T(b)}f(x)\,dx\text{,}\) the subintervals in a partition (thought of as the width of the rectangles in a Riemann sum) are stretched/shrunk horizontally by a factor given by the derivative \(T'(u)\) of the transformation function \(T(u)\text{.}\) In the integral \(\displaystyle\int_a^b f(T(u))T'(u)\,du\text{,}\) the derivative \(T'(u)\) is part of the integrand, and therefore our horizontal stretch/shrink becomes a vertical stretch/shrink. Of course, the area of a rectangle changes by the same amount regardless of whether the stretch/shrink is horizontal or vertical.
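As a small illustration (the choices \(T(u)=u^2\text{,}\) \(f(x)=\sqrt{x}\) and \([a,b]=[1,2]\) are made here only to keep the arithmetic short), both integrals can be computed directly:

\begin{equation*}
\int_{T(1)}^{T(2)} f(x)\,dx = \int_1^4 \sqrt{x}\,dx = \tfrac23\left(4^{3/2}-1\right) = \tfrac{14}{3}\text{,}
\end{equation*}

\begin{equation*}
\int_1^2 f(T(u))T'(u)\,du = \int_1^2 \sqrt{u^2}\,(2u)\,du = \int_1^2 2u^2\,du = \tfrac23\left(2^3-1\right) = \tfrac{14}{3}\text{.}
\end{equation*}

Here \(T'(u)=2u\) ranges from 2 to 4 on \([1,2]\text{,}\) so subintervals near \(u=2\) are stretched about twice as much as those near \(u=1\text{.}\)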

Similarly, when we do a change of variables in two or three variables, we need a measure of how the size of each subregion in a partition changes under the change of variables. This measure is given by an object known as the Jacobian.

Definition 15.8.8. The Jacobian of a transformation.

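Let \(D\subseteq \mathbb{R}^2\) be a subset of the plane, described with coordinates \((u,v)\text{.}\) Let \(T:D\subseteq \mathbb{R}^2\to \mathbb{R}^2\) be given by

\begin{equation*}
T(u,v) = (x(u,v), y(u,v))\text{,}
\end{equation*}

where \(x\) and \(y\) have first-order partial derivatives on \(D\text{.}\) The Jacobian of \(T\) is the function \(J_T\) defined by

\begin{equation*}
J_T(u,v) = \det\begin{bmatrix}\dfrac{\partial x}{\partial u}\amp \dfrac{\partial x}{\partial v}\\[1ex] \dfrac{\partial y}{\partial u}\amp \dfrac{\partial y}{\partial v}\end{bmatrix} = \frac{\partial x}{\partial u}\frac{\partial y}{\partial v}-\frac{\partial x}{\partial v}\frac{\partial y}{\partial u}\text{.}
\end{equation*}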

In the case of a transformation \(T:D\subseteq \mathbb{R}^3\to\mathbb{R}^3\text{,}\) with \((x,y,z)=T(u,v,w)\text{,}\) the definition of the Jacobian is similar, except that we need to compute the determinant of a \(3\times 3\) matrix.

If you read Section 14.6 on the definition of the derivative as a matrix of partial derivatives, you probably recognize the matrix whose determinant gives us the Jacobian: it’s the derivative matrix! One therefore could write

\begin{equation*}
J_T(\mathbf{u}) = \det DT(\mathbf{u})
\end{equation*}

for the Jacobian, and this formula is valid in any dimension.

In fact, we can even use this definition for single integrals: a \(1\times 1\) matrix is just a number, and the determinant does nothing to that number, and of course, the derivative of a function of one variable is the same as always.
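For a first illustration of the definition, consider the map with components \(x=uv\) and \(y=u^2+v^2\) (the components consistent with the boundary computations in Example 15.8.6). Taking the Jacobian to be the determinant of the matrix of first-order partial derivatives, a sketch of the computation is

\begin{equation*}
J_T(u,v) = \det\begin{bmatrix} v \amp u\\ 2u \amp 2v\end{bmatrix} = 2v^2-2u^2 = 2(v^2-u^2)\text{.}
\end{equation*}

On the interior of the rectangle \(1\lt u\lt 2\text{,}\) \(0\lt v\lt 1\) we have \(v\lt u\text{,}\) so the Jacobian is negative there, and in particular it is never zero.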

Example 15.8.9.

Compute the Jacobian of the transformation \((x,y)=T(u,v)\) given by

\begin{equation*}
x = 7u-3v \quad y = -4u+2v\text{.}
\end{equation*}
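The computation is short enough to sketch in full, taking the Jacobian to be the determinant of the matrix of first-order partial derivatives of \(x\) and \(y\) with respect to \(u\) and \(v\text{:}\)

\begin{equation*}
J_T(u,v) = \det\begin{bmatrix} 7 \amp -3\\ -4 \amp 2\end{bmatrix} = (7)(2)-(-3)(-4) = 14-12 = 2\text{.}
\end{equation*}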

In this case, the Jacobian is a constant function. This is the case whenever \(x\) and \(y\) are linear functions of \(u\) and \(v\text{,}\) but it is not true in general. We’ll look at the case of linear transformations in more detail after a few more examples.

Example 15.8.10.

Compute the Jacobian of the transformation given by

\begin{equation*}
T(u,v) = \left(\sqrt[3]{u^2v}, \sqrt[3]{uv^2}\right)\text{.}
\end{equation*}

Again, this is a direct application of the definition, but we should be clever about how we compute our partial derivatives. Our transformation defines \(x=\sqrt[3]{u^2v}\) and \(y=\sqrt[3]{uv^2}\text{.}\) If we blindly push forward with the partial derivatives as written, we get a mess. For example, if we get a little too excited, we might apply the Chain Rule to each cube root directly, without simplifying first.
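One way to keep the algebra manageable, sketched here under the assumption that \(u\) and \(v\) are positive so that every power below is defined, is to write the components with fractional exponents before differentiating: \(x=u^{2/3}v^{1/3}\) and \(y=u^{1/3}v^{2/3}\text{.}\) Then

\begin{equation*}
\frac{\partial x}{\partial u} = \tfrac23 u^{-1/3}v^{1/3}, \quad \frac{\partial x}{\partial v} = \tfrac13 u^{2/3}v^{-2/3}, \quad \frac{\partial y}{\partial u} = \tfrac13 u^{-2/3}v^{2/3}, \quad \frac{\partial y}{\partial v} = \tfrac23 u^{1/3}v^{-1/3}\text{,}
\end{equation*}

and the determinant of the matrix of partial derivatives is

\begin{equation*}
\frac{\partial x}{\partial u}\frac{\partial y}{\partial v}-\frac{\partial x}{\partial v}\frac{\partial y}{\partial u} = \tfrac49 - \tfrac19 = \tfrac13\text{.}
\end{equation*}

All powers of \(u\) and \(v\) cancel in each product, so this Jacobian turns out to be constant even though the transformation is not linear.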

As hinted at earlier, the Jacobian is important because it appears in the change of variables formula to come. Its role is analogous to that of the derivative \(g'(u)\) in Equation (15.8.1). We also need the Jacobian to precisely define the type of function that can be used for a change of variables.

Definition 15.8.12. Properties of a transformation.

Let \(D\) and \(E\) be subsets of \(\mathbb{R}^2\text{,}\) with \(D\subseteq \mathbb{R}^2\) described in terms of coordinates \(u,v\text{,}\) and \(E\subseteq \mathbb{R}^2\) described in terms of coordinates \(x,y\text{.}\) Let \(D^\mathsf{o}\) denote the interior of \(D\text{;}\) that is, the set of all non-boundary points of \(D\text{.}\) We say that a function \(T:D\to E\) is a transformation if:

\(T\) is continuously differentiable on \(D^\mathsf{o}\text{.}\)

\(T\) is one-to-one on \(D^\mathsf{o}\text{,}\) and the range of \(T\) is \(E\text{.}\)

The Jacobian of \(T\) does not vanish: \(J_T(u,v)\neq 0\) for all \((u,v)\in D^\mathsf{o}\text{.}\)

When \(D\) is a closed, bounded subset, note that we do not require the conditions of Definition 15.8.12 to hold on the boundary. Each of the three conditions above must hold on the interior of \(D\text{,}\) but may fail on all or part of the boundary. In particular, this is the case for polar, cylindrical, and spherical coordinates:

The polar coordinate transformation \(x=r\cos\theta, y=r\sin\theta\) is only one-to-one if \(r\gt 0\) and \(\theta\) belongs to an interval whose length is less than \(2\pi\text{.}\) Note that \(J_T(r,\theta)\) vanishes at \(r=0\text{.}\)

Of course, we often use a domain such as \(r\in [0,R]\text{,}\) \(\theta\in [0,2\pi]\) to describe a disk centred at the origin. The conditions of Definition 15.8.12 fail at \(r=0\text{,}\) and they also fail because points with \(\theta=0\) get mapped to the same place as points with \(\theta = 2\pi\text{.}\) But these values describe 3 of the 4 sides of the boundary rectangle for our domain, and the conditions are not required to hold on the boundary.

The cylindrical coordinate transformation has exactly the same issues as polar coordinates.

For spherical coordinates, we take \(\rho\geq 0\) and again accept the fact that our transformation is not one-to-one (and the Jacobian is zero) when \(\rho=0\text{.}\) Similarly, we generally allow \(\theta\in [0,2\pi]\) and \(\varphi\in [-\pi/2,\pi/2]\) even though endpoints of these intervals might get sent to the same point.

Before we move on to the change of variables formula, we consider one more example that will help clarify the geometry involved in a change of variables, and that may be familiar to you from a first course in linear algebra.

We saw in Example 15.8.9 that when \(x\) and \(y\) are linear functions of \(u\) and \(v\text{,}\) the Jacobian of the transformation is a constant. What does that constant tell us about the transformation? Here is an example taken from the book Matrix Algebra, by Greg Hartman (who is also the main author of this text).

Example 15.8.13.

Consider the function \(T:\mathbb{R}^2\to \mathbb{R}^2\) given by

\begin{equation*}
T(u,v) = (u+4v, 2u+3v)\text{.}
\end{equation*}

Note that \(T\) is linear in both variables. In fact, if we set \((x,y)=T(u,v)\) and represent points by vectors, replacing \((x,y)\) by \(\vec{x}=\begin{bmatrix}x\\y\end{bmatrix}\) and \((u,v)\) by \(\vec{u}=\begin{bmatrix}u\\v\end{bmatrix}\text{,}\) then we can write this function as the matrix transformation \(\vec{x}=A\vec{u}\text{,}\) where \(A\) is the \(2\times 2\) matrix \(\begin{bmatrix}1\amp4\\2\amp3\end{bmatrix}\text{.}\) That is:

\begin{equation*}
\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}1\amp4\\2\amp3\end{bmatrix}\begin{bmatrix}u\\v\end{bmatrix} = \begin{bmatrix}u+4v\\2u+3v\end{bmatrix}\text{.}
\end{equation*}

To visualize the effect of \(T\text{,}\) plot the vectors representing the four corners of the unit square, before and after they have been multiplied by \(A\text{,}\) where

\begin{equation*}
A = \begin{bmatrix} 1\amp4\\2\amp3\end{bmatrix}\text{.}
\end{equation*}

Solution.

The four corners of the unit square can be represented by the vectors
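Listing the corners counterclockwise from the origin (an ordering chosen here for illustration), the vectors and their images under multiplication by \(A\) work out to

\begin{equation*}
A\begin{bmatrix}0\\0\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}, \quad A\begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}1\\2\end{bmatrix}, \quad A\begin{bmatrix}1\\1\end{bmatrix} = \begin{bmatrix}5\\5\end{bmatrix}, \quad A\begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}4\\3\end{bmatrix}\text{.}
\end{equation*}

The images of \((1,0)\) and \((0,1)\) are simply the columns of \(A\text{,}\) and the image of \((1,1)\) is their sum.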

The unit square and its transformation are graphed in Figure 15.8.14, where the shaped vertices correspond to each other across the two graphs. Note how the square got turned into some sort of quadrilateral (it’s actually a parallelogram). A really interesting thing is how the triangular and square vertices seem to have changed places — it is as though the square, in addition to being stretched out of shape, was flipped.

How does all this relate to Jacobians and change of variables? First note that the derivative of any linear function is (perhaps not so surprisingly) the matrix that defines it: for \(T(u,v) = (u+4v,2u+3v)\text{,}\) we have

\begin{equation*}
DT(u,v) = \begin{bmatrix}1\amp4\\2\amp3\end{bmatrix} = A\text{.}
\end{equation*}

The Jacobian of \(T\) is then the determinant of this matrix:

\begin{equation*}
J_T(u,v) = \det A = 1(3)-4(2)=-5\text{.}
\end{equation*}

Let us make a note of a few key points about Example 15.8.13. First, note that in this case, the derivative matrix (and, as a result, the Jacobian) is constant. (This of course is generally true of the derivative for linear functions.)

What happens when we apply the map \(T\) to the unit square? The value \(J_T(u,v)=-5\) tells us two things:

First, the area of the unit square is increased by a factor of 5.

Second, the transformation \(T\) reverses the orientation of the unit square. This is indicated by the negative value of the determinant. The reversal of orientation is responsible for the “flipping” of the square noticed in the solution above.

The result of performing the transformation \(T\) on the unit square is therefore the following: first, the square is flipped over. Then, the square is stretched out into a parallelogram whose area is 5 times that of the original square.

Let us make a couple more remarks about Example 15.8.13. First, note the need for an absolute value around the determinant, to ensure that the area computed is positive. This absolute value will be needed in our change of variables formula as well.

Second, since our transformation was linear, with constant derivative, the effect on area is the same for any portion of the plane: applying the transformation \(T\) to a closed bounded region \(D\subseteq \mathbb{R}^2\) of area \(A\) will produce a region of area \(5A\text{.}\) For non-linear transformations, the value of the Jacobian (and hence, the effect on area) will vary from point to point.
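The factor of 5 and the change of orientation can both be checked directly from the image of the unit square, using the standard shoelace formula for the signed area of a polygon (a tool brought in here for the check, not one used elsewhere in this section). Taking the corners of the unit square counterclockwise, their images under \(T(u,v)=(u+4v,2u+3v)\) are \((0,0)\text{,}\) \((1,2)\text{,}\) \((5,5)\text{,}\) and \((4,3)\text{,}\) in that order, and

\begin{equation*}
\tfrac12\left[(0\cdot 2-1\cdot 0)+(1\cdot 5-5\cdot 2)+(5\cdot 3-4\cdot 5)+(4\cdot 0-0\cdot 3)\right] = \tfrac12(0-5-5+0) = -5\text{.}
\end{equation*}

The magnitude 5 is the area of the parallelogram, and the negative sign records that the image vertices are traversed clockwise rather than counterclockwise.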

Before we move on, let’s do two more examples, with transformations we’ve already encountered. In these examples, we’ll find that the value of the Jacobian is not a constant.

Example 15.8.15.

Compute the Jacobian for

The polar coordinate transformation

\begin{equation*}
x = r\cos\theta \quad y = r\sin\theta
\end{equation*}

The spherical coordinate transformation

\begin{equation*}
x = r\cos\theta\cos\varphi \quad y = r\sin\theta\cos\varphi \quad z = r\sin\varphi\text{.}
\end{equation*}

Solution.

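Here we’ve defined \(x\) and \(y\) in terms of the coordinates \(r\) and \(\theta\) instead of \(u\) and \(v\text{,}\) but the process is the same:

\begin{equation*}
J_T(r,\theta) = \det\begin{bmatrix}\cos\theta \amp -r\sin\theta\\ \sin\theta \amp r\cos\theta\end{bmatrix} = r\cos^2\theta+r\sin^2\theta = r\text{.}
\end{equation*}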

Interesting. Note that the value of the Jacobian is \(r\text{,}\) which is precisely the correction factor needed in the area element for a double integral when we change from rectangular to polar coordinates. Let’s try the spherical coordinate transformation to see if this was merely a coincidence.

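Although we haven’t defined the Jacobian for a change of coordinates in three variables, the process is exactly the same. We form the derivative of the transformation, given by the matrix of partial derivatives, and compute its determinant. We find:

\begin{equation*}
J_T(r,\theta,\varphi) = \det\begin{bmatrix}\cos\theta\cos\varphi \amp -r\sin\theta\cos\varphi \amp -r\cos\theta\sin\varphi\\ \sin\theta\cos\varphi \amp r\cos\theta\cos\varphi \amp -r\sin\theta\sin\varphi\\ \sin\varphi \amp 0 \amp r\cos\varphi\end{bmatrix} = r^2\cos\varphi\text{.}
\end{equation*}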

We computed the above \(3\times 3\) determinant using a cofactor expansion along the second column. This is once again exactly the correction factor for the volume element in spherical coordinates, as given in Theorem 15.7.21 in Section 15.7.
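As a sanity check on both parts, the Jacobians that follow from the formulas as written in this example, namely \(r\) for the polar map and \(r^2\cos\varphi\) for the spherical map (with \(\varphi\) ranging over \([-\pi/2,\pi/2]\)), can be used as correction factors to recover two familiar results. The radius \(R\) is a symbol introduced here for the check. For the disk of radius \(R\text{,}\)

\begin{equation*}
\int_0^{2\pi}\int_0^R r\,dr\,d\theta = 2\pi\cdot\tfrac12 R^2 = \pi R^2\text{,}
\end{equation*}

and for the ball of radius \(R\text{,}\)

\begin{equation*}
\int_0^{2\pi}\int_{-\pi/2}^{\pi/2}\int_0^R r^2\cos\varphi\,dr\,d\varphi\,d\theta = 2\pi\cdot 2\cdot\tfrac13 R^3 = \tfrac43\pi R^3\text{.}
\end{equation*}

Both agree with the usual area and volume formulas, which is what we would expect if the Jacobian is the right correction factor.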

Subsection 15.8.4 The Change of Variables Formula

It seems that we’re onto something. It is time that we stated the general change of variables formula for multiple integrals. Notice how, as with the derivative \(g'(u)\) in Equation (15.8.1), the Jacobian gives us a measure of how subregions in the domain are stretched or shrunk. It shouldn’t be too surprising, then, that the Jacobian plays the same role in multiple integrals that the derivative does in a single integral.

Theorem 15.8.17. Change of variables formula for double integrals.

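Let \(D\) be a closed, bounded region in the plane, and let \(T:D\subseteq \mathbb{R}^2\to \mathbb{R}^2\) be a transformation. If \(f\) is a continuous, real-valued function on \(T(D)\text{,}\) then

\begin{equation*}
\iint_D f(T(u,v))\,\lvert J_T(u,v)\rvert\,du\,dv = \iint_{T(D)} f(x,y)\,dx\,dy\text{.}
\end{equation*}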

Let \(R\) be the region in the \(x,y\) plane whose boundary is the parallelogram with vertices \((0,0)\text{,}\)\((3,1)\text{,}\)\((1,4)\text{,}\) and \((4,5)\text{.}\)

Determine a rectangular region \(D\) and a transformation \(T:D\to \mathbb{R}^2\) such that \(R=T(D)\text{.}\)

Use the transformation \(T\) and Theorem 15.8.17 to determine the area of \(R\text{.}\)

Solution.

For inspiration, we look to Example 15.8.13. Notice how the transformation defined by the matrix \(A=\begin{bmatrix}1\amp 4\\2\amp 3\end{bmatrix}\) preserves the origin, and sends the points \((1,0)\) and \((0,1)\) to \((1,2)\) and \((4,3)\text{,}\) respectively. In general, the transformation

We check that \(T(0,0)=(0,0)\text{,}\)\(T(1,0)=(3,1)\text{,}\)\(T(0,1)=(1,4)\text{,}\) and \(T(1,1)=(4,5)\text{.}\) The four corners of the unit square are mapped to the four corners of the parallelogram. Since linear transformations map “lines to lines”, we have our transformation.

To use Theorem 15.8.17, we need to compute the Jacobian of our transformation. We have
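A sketch of the remaining computation, with \(T(u,v)=(3u+v,u+4v)\) read off from the corner values checked above (the formula is inferred here, not quoted) and \(D=[0,1]\times[0,1]\text{:}\)

\begin{equation*}
J_T(u,v) = \det\begin{bmatrix}3\amp 1\\ 1\amp 4\end{bmatrix} = 12-1 = 11\text{,}
\end{equation*}

so that, taking \(f(x,y)=1\) in the change of variables formula,

\begin{equation*}
\text{area of } R = \iint_R 1\,dA = \iint_D \lvert J_T(u,v)\rvert\,du\,dv = 11\int_0^1\int_0^1 du\,dv = 11\text{.}
\end{equation*}

As a cross-check, the parallelogram is spanned by the vectors \(\langle 3,1\rangle\) and \(\langle 1,4\rangle\text{,}\) and \(\lvert 3\cdot 4-1\cdot 1\rvert = 11\text{.}\)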

We need to determine a domain for the transformation \(T(u,v)=(u+v,u-v)\) such that the range of \(T\) is \(R\text{.}\) Let’s put \(x=u+v\) and \(y=u-v\) into the equations of our boundary lines, to see what the corresponding lines in the \(u,v\) plane are.

These lines are simply the boundary of the unit square in the \(u,v\) plane. Thus, if we take the domain \(D=[0,1]\times [0,1]\) for \(T\text{,}\) we will have \(R=T(D)\text{,}\) as required.

Let’s try one more example where we’re given some guidance before tackling a general change of variables problem.

Example 15.8.22.

Let \(R\) be the region in the first quadrant bounded by the lines \(y=x\) and \(y=4x\text{,}\) and the hyperbolas \(y=1/x\) and \(y=4/x\text{.}\) Evaluate the integral

\begin{equation*}
\iint_R xy^2\,dA
\end{equation*}

using the change of variables \(x=u/v\text{,}\)\(y=v\text{.}\)

Solution.

First, we note that setting \(y=kx\text{,}\) where \(k\) is a constant, gives us

\begin{equation*}
v = k\frac{u}{v} \quad\Rightarrow\quad u=\frac1k v^2\text{,}
\end{equation*}

while setting \(y=k/x\) gives \(xy=k\text{,}\) or \(u=k\text{.}\) The region \(R\) is therefore the image under the transformation \(T(u,v)=(u/v,v)\) of the region \(D\) bounded by the curves \(u=v^2\) and \(u=\frac14 v^2\text{,}\) and the lines \(u=1, u=4\text{;}\) see Figure 15.8.24.

This is perhaps not the best possible change of variables: the domain \(D\) is not a rectangle. (See Example 15.8.31 below for a change of variables that is more effective for this type of region.) However, it is a region of the type we considered in Section 15.2, so we’re better off than we were with the original region. We have \(1\leq u\leq 4\text{,}\) and the equations \(u=v^2\text{,}\)\(u=\frac14 v^2\) can be re-written (noting that \(v>0\)) as \(v=\sqrt{u}\) and \(v=2\sqrt{u}\text{.}\)

With \(f(x,y)=xy^2\) we have \(f(T(u,v))=\frac{u}{v}\cdot v^2=uv\text{,}\) and the Jacobian is given by
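A sketch of the remaining computation, using only the substitution \(x=u/v\text{,}\) \(y=v\) and the bounds \(1\leq u\leq 4\text{,}\) \(\sqrt{u}\leq v\leq 2\sqrt{u}\) found above. The Jacobian is

\begin{equation*}
J_T(u,v) = \det\begin{bmatrix}\dfrac{1}{v}\amp -\dfrac{u}{v^2}\\[1ex] 0\amp 1\end{bmatrix} = \frac{1}{v}\text{,}
\end{equation*}

which is positive on \(D\) since \(v\gt 0\text{.}\) The integral then becomes

\begin{equation*}
\iint_R xy^2\,dA = \int_1^4\int_{\sqrt{u}}^{2\sqrt{u}} uv\cdot\frac{1}{v}\,dv\,du = \int_1^4 u\left(2\sqrt{u}-\sqrt{u}\right)du = \int_1^4 u^{3/2}\,du = \tfrac25\left(4^{5/2}-1\right) = \tfrac{62}{5}\text{.}
\end{equation*}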

Our next goal is to tackle the following general problem: given a multiple integral over a region \(E\text{,}\) determine a transformation \(T\) with domain \(D\) such that \(T(D)=E\text{,}\) and use it to evaluate the integral. Before attempting a couple of examples, we take a brief detour to consider some technical details that will assist us in understanding the problem.

Recall from Definition 15.8.12 that we require transformations to be one-to-one and onto (see Definition 15.8.5), except possibly on the boundary of their domain.

One of the reasons that we require these properties is that they guarantee that \(T\) has an inverse. If a transformation \(T:D\to E\) is one-to-one and onto, then we can define the inverse mapping \(T^{-1}:E\to D\) according to

\begin{equation*}
T^{-1}(\mathbf{x}) = \mathbf{u} \quad\text{ if and only if }\quad \mathbf{x} = T(\mathbf{u})\text{.}
\end{equation*}

Notice that the onto condition guarantees that the domain of \(T^{-1}\) is all of \(E\text{.}\) When considering a change of variables for a multiple integral over a region \(E\text{,}\) we would ideally like to have a one-to-one and onto mapping from \(D\) to \(E\) to ensure that when we convert to an integral over \(D\text{,}\) each point in \(E\) only gets “counted once”.

For example, consider the mapping \(T(u,v)=(u^2,v)\) defined on \(D=[-1,1]\times [0,1]\text{.}\) (That is, \(x=u^2\) with \(-1\leq u\leq 1\) and \(y=v\text{,}\) with \(0\leq v\leq 1\text{.}\)) The image of \(T\) is the square \(E=[0,1]\times [0,1]\text{,}\) but each point \((x,y)\) with \(x\gt 0\) corresponds to two points \((\pm \sqrt{x},y)\) in \(D\text{,}\) so integrating over \(D\) would be the same as integrating over \(E\) twice!

Next we want to consider differentiability. Recall that a vector-valued function

\begin{equation*}
\mathbf{r}(t) = \langle x(t), y(t)\rangle
\end{equation*}

is continuous if and only if each of the component functions \(x(t), y(t)\) is continuous, and similarly, \(\mathbf{r}(t)\) is differentiable if and only if each of the component functions is differentiable, and

\begin{equation*}
\mathbf{r}'(t) = \langle x'(t), y'(t)\rangle\text{.}
\end{equation*}

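Similarly, a function \(T:D\subset \mathbb{R}^n \to \mathbb{R}^n\) is continuous if and only if each of its components is continuous (as a function of several variables), and (for \(n=2\)) the partial derivatives of \(T\) can be viewed as the vector-valued functions

\begin{equation*}
\mathbf{r}_u(u,v) = \left\langle \frac{\partial x}{\partial u}(u,v), \frac{\partial y}{\partial u}(u,v)\right\rangle \quad \mathbf{r}_v(u,v) = \left\langle \frac{\partial x}{\partial v}(u,v), \frac{\partial y}{\partial v}(u,v)\right\rangle\text{,}
\end{equation*}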

with similar formulas for \(n=3\text{.}\) (For \(n=1\) we have only the single derivative \(T'(u)\text{.}\))

If each of the components of each of the partial derivatives is continuous (that is, if the partial derivative of each of the \(\mathbf{x}\) variables with respect to each of the \(\mathbf{u}\) variables is continuous) we say that \(T\) is \(C^1\text{,}\) or continuously differentiable.

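If a function \(T:D\subset\mathbb{R}^n\to E\subset \mathbb{R}^n\) is \(C^1\text{,}\) then as with real-valued functions, being continuously differentiable implies that \(T\) is differentiable (in the sense of the definition from Section 14.6), and therefore continuous. The derivative of \(T\) is then an \(n\times n\) matrix. For example, when \(n=2\text{,}\) if \(T(u,v) = (x(u,v), y(u,v))\text{,}\) we get

\begin{equation*}
DT(u,v) = \begin{bmatrix}\dfrac{\partial x}{\partial u}(u,v)\amp \dfrac{\partial x}{\partial v}(u,v)\\[1ex] \dfrac{\partial y}{\partial u}(u,v)\amp \dfrac{\partial y}{\partial v}(u,v)\end{bmatrix}\text{.}
\end{equation*}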

Notice that, while the gradients \(\nabla x(u,v), \nabla y(u,v)\) make up the rows of the derivative matrix \(DT(\mathbf{a})\text{,}\) the columns of \(DT(\mathbf{a})\) are the partial derivative vectors \(\mathbf{r}_u\) and \(\mathbf{r}_v\text{.}\)

Given our function \(T:D\subset \mathbb{R}^n\to E\subset \mathbb{R}^n\text{,}\) let us denote by \(DT\) the matrix of partial derivatives, as in Section 14.6. Since the dimension of the domain and range are the same, \(DT\) is a square (\(n\times n\)) matrix, so we can compute its determinant, and this, of course, is the Jacobian, as defined in Definition 15.8.8.

Let’s come back to the change of variables formula. If we let \(d\mathbf{x}\) denote either \(dx\text{,}\) \(dA\text{,}\) or \(dV\text{,}\) depending on whether \(n=1,2\) or \(3\text{,}\) and do the same for \(d\mathbf{u}\text{,}\) the change of variables formula for a transformation \(T:D\to E\) can be written as

\begin{equation*}
\int_D f(T(\mathbf{u}))\,\lvert J_T(\mathbf{u})\rvert\,d\mathbf{u} = \int_E f(\mathbf{x})\,d\mathbf{x}\text{,}
\end{equation*}

where the integral sign represents a single, double, or triple integral, depending on the value of \(n\text{.}\) (So this really is just a generalization of the method of substitution you learned in Calculus I.)

Note that the properties required for \(T\) to be a transformation tell us that every point of \(E\) corresponds to a point in \(D\text{,}\) and integrating over \(D\) is the same as integrating over \(E\text{,}\) once we account for the “stretch factor” of the transformation given by the Jacobian \(J_T(\mathbf{u})\text{.}\) A rigorous proof of the change of variables formula is very difficult, but we will give an argument at the end of this section similar to the one we considered for the polar and spherical coordinate transformations that, although not a complete proof, is at least a plausible explanation.

The general inverse function theorem, which is not stated in most calculus textbooks (probably in part because the statement requires defining the matrix \(DT\) of partial derivatives and explaining what the inverse of a matrix is), states that if \(T:D\to E\) is one-to-one and onto, then \(T^{-1}\) exists, and moreover, if \(T\) is \(C^1\) and \(J_T(\mathbf{u})\neq 0\) for all \(\mathbf{u}\in D\text{,}\) then \(T^{-1}\) is also a \(C^1\) function, and

\begin{equation*}
D(T^{-1})(\mathbf{x}) = \left(DT(\mathbf{u})\right)^{-1}\text{,}\quad\text{where } \mathbf{x}=T(\mathbf{u})\text{.} \tag{15.8.2}
\end{equation*}

A useful consequence of Equation (15.8.2) is obtained by taking the determinant of both sides of the above equation (recall that \(\det(A^{-1}) = 1/\det(A)\) for any invertible matrix \(A\)).

Theorem 15.8.25. The Jacobian of an inverse transformation.

Let \(T:D\to\mathbb{R}^2\) be a one-to-one \(C^1\) mapping with image \(E=T(D)\text{.}\) If \(J_T(\mathbf{u})\neq 0\) for all \(\mathbf{u}\in D\text{,}\) then \(T^{-1}:E\to \mathbb{R}^2\) is a transformation, and the Jacobian of \(T^{-1}\) is given by

\begin{equation*}
J_{T^{-1}}(x,y) = \frac{1}{J_T(u,v)}\text{,}\quad\text{where } (x,y)=T(u,v)\text{.}
\end{equation*}

This result can come in handy in cases where it’s easy to come up with the inverse mapping \(\mathbf{u} = T^{-1}(\mathbf{x})\text{,}\) but hard to solve for \(\mathbf{x}\) in terms of \(\mathbf{u}\) to obtain \(T\text{.}\)

Our last technical detail is a theorem that can be very useful when trying to determine the transformation to use for a change of variables: the boundary of \(E\) must correspond to the boundary of \(D\text{.}\) This is useful because we usually would like \(D\) to be as simple as possible, ideally a rectangle (or box, if \(n=3\)).

Since the sides of the rectangle are given by setting \(u\) or \(v\) equal to a constant, we look at the curves that define the boundary of \(E\text{.}\) If the boundary of \(E\) can be expressed in terms of level curves for two functions \(f(x,y)\) and \(g(x,y)\text{,}\) we can define \(u=f(x,y)\) and \(v=g(x,y)\text{,}\) which allows us to define \(T^{-1}(x,y) = (f(x,y),g(x,y))\text{.}\) From there, we can try to compute \(T\) from \(T^{-1}\text{,}\) which is a matter of solving for \(x\) and \(y\) in terms of \(u\) and \(v\text{.}\)
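As an illustration of this strategy (a sketch, using the region from Example 15.8.22 and a choice of \(u\) and \(v\) made here for the purpose), the region in the first quadrant bounded by \(y=x\text{,}\) \(y=4x\text{,}\) \(y=1/x\) and \(y=4/x\) has boundary curves \(xy=1\text{,}\) \(xy=4\text{,}\) \(y/x=1\) and \(y/x=4\text{.}\) Setting \(u=xy\) and \(v=y/x\) therefore gives the rectangle \(D=[1,4]\times[1,4]\text{,}\) and the determinant of the matrix of partial derivatives of \(T^{-1}(x,y)=(xy,y/x)\) is

\begin{equation*}
J_{T^{-1}}(x,y) = \det\begin{bmatrix} y \amp x\\[0.5ex] -\dfrac{y}{x^2} \amp \dfrac{1}{x}\end{bmatrix} = \frac{y}{x}+\frac{y}{x} = \frac{2y}{x} = 2v\text{.}
\end{equation*}

Since the determinant of an inverse matrix is the reciprocal of the determinant, this gives \(J_T(u,v)=\frac{1}{2v}\) without first solving for \(x\) and \(y\text{.}\) (In this case solving is also possible: \(x=\sqrt{u/v}\) and \(y=\sqrt{uv}\text{.}\))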

Theorem 15.8.27. Transformations preserve the boundary.

Let \(D,E\subset \mathbb{R}^n\) be closed, bounded regions. If \(T:D\to E\) is a transformation, then the boundary of \(E\) is the image under \(T\) of the boundary of \(D\text{;}\) that is, if \(T(\mathbf{u})=\mathbf{x}\) is on the boundary of \(E\text{,}\) then \(\mathbf{u}\) is on the boundary of \(D\text{.}\)

We will prove this result in the case that \(T\) is one-to-one, with \(J_T(\mathbf{u})\neq 0\text{,}\) on all of \(D\text{,}\) including the boundary. Note that if this property fails on some portion of the boundary, this will not affect the integral. For example, if \(n=2\text{,}\) the boundary of \(D\) consists of a finite union of continuous curves, so any portion of the boundary is a continuous curve, and we know that we can neglect the graphs of finitely many continuous curves when carrying out an integral. We begin by first proving a simpler result.

Theorem15.8.28.Transformations are open mappings.

If \(f:A\to B\) is a continuous, one-to-one, and onto mapping from \(A\) to \(B\) with continuous inverse \(f^{-1}:B\to A\text{,}\) then \(f\) maps open sets to open sets. That is, if \(U\subset A\) is an open subset of \(A\text{,}\) then the image \(f(U) = \{f(\mathbf{u})\in B|\mathbf{u}\in U\}\) is an open subset of \(B\text{.}\)

Proof.

Let \(U\subset A\) be open, and let \(\mathbf{x}\in f(U)\text{.}\) We need to show that there exists some \(\delta\gt 0\) such that \(N_\delta(\mathbf{x}) = \{\mathbf{y}\in B| \norm{\mathbf{x}-\mathbf{y}}\lt \delta\}\) is a subset of \(f(U)\text{.}\) (By definition, \(f(U)\) is open if each element of \(f(U)\) has a \(\delta\)-neighbourhood completely contained in \(f(U)\text{.}\)) Since \(f\) is one-to-one and onto, there exists a unique \(\mathbf{v}=f^{-1}(\mathbf{x})\in U\) such that \(f(\mathbf{v})=\mathbf{x}\text{.}\) (We must have \(\mathbf{v}\in U\) since \(f(\mathbf{v})\in f(U)\) and \(f\) is one-to-one.) Since \(U\) is open, there exists an \(\epsilon\gt 0\) such that \(N_\epsilon(\mathbf{v})\subset U\text{.}\)

Now, since \(f^{-1}\) is continuous, there exists a \(\delta\gt 0\) such that if \(\mathbf{y}\in N_\delta(\mathbf{x})\text{,}\) then \(f^{-1}(\mathbf{y})\in N_\epsilon(\mathbf{v})\text{.}\) But if \(f^{-1}(\mathbf{y})\in N_\epsilon(\mathbf{v})\subset U\text{,}\) then \(f(f^{-1}(\mathbf{y}))=\mathbf{y}\in f(U)\text{,}\) by definition of \(f(U)\text{.}\) Thus, \(N_\delta(\mathbf{x})\subset f(U)\text{,}\) which is what we needed to show.

Using Theorem 15.8.28 as a lemma, we can now give a proof of our theorem.

Proof of Theorem 15.8.27.

Let \(T:D\to E\) be the given transformation, which is one-to-one and onto, and such that \(J_T(\mathbf{u})\neq 0\) for all \(\mathbf{u}\in D\text{.}\) Since \(T\) is one-to-one and onto, we can find an inverse function \(T^{-1}:E\to D\text{.}\) Since \(T\) is \(C^1\) and \(J_T(\mathbf{u})\neq 0\) for all \(\mathbf{u}\in D\text{,}\) the inverse function theorem tells us that \(T^{-1}\) must be \(C^1\) on \(E\text{.}\) Since \(T\) and \(T^{-1}\) are both \(C^1\text{,}\) they are differentiable and therefore continuous.

Now, let \(\mathbf{x}\in E\) be a boundary point. We need to show that \(\mathbf{x}\) is the image of a boundary point in \(D\text{.}\) Recall that \(\mathbf{x}\) is a boundary point if and only if every neighbourhood of \(\mathbf{x}\) contains both points in \(E\) and points not in \(E\text{.}\) Let \(\mathbf{u}=T^{-1}(\mathbf{x})\in D\) be the element of \(D\) that is mapped to \(\mathbf{x}\) by \(T\text{.}\) For the sake of contradiction, suppose that \(\mathbf{u}\) is not a boundary point of \(D\text{.}\) Then since \(\mathbf{u}\in D\) it must be an interior point of \(D\text{,}\) and therefore, there exists some \(\delta\gt 0\) such that \(N_\delta(\mathbf{u})\subset D\text{.}\) (That is, there is a neighbourhood of \(\mathbf{u}\) that is completely contained in \(D\text{.}\))

However, since \(T\) satisfies the conditions of Theorem 15.8.28, we know that \(T\) must map open sets to open sets. In particular, since \(N_\delta(\mathbf{u})\) is an open subset of \(D\text{,}\) \(T(N_\delta(\mathbf{u}))\) must be an open subset of \(E\text{.}\) But since \(\mathbf{u}\in N_\delta(\mathbf{u})\text{,}\) we must have \(\mathbf{x} = T(\mathbf{u})\in T(N_\delta(\mathbf{u}))\text{,}\) and thus \(T(N_\delta(\mathbf{u}))\) is an open subset of \(E\) that contains \(\mathbf{x}\text{,}\) which contradicts the fact that \(\mathbf{x}\) is a boundary point. Thus, it must be the case that \(\mathbf{u}\) is a boundary point of \(D\text{.}\)

Note that since \(T^{-1}:E\to D\) is also a transformation with the same properties as \(T\text{,}\) the converse to this result is valid as well: if \(\mathbf{u}\) belongs to the boundary of \(D\text{,}\) then \(T(\mathbf{u})\) belongs to the boundary of \(E\text{.}\)

We will see how Theorems 15.8.27 and 15.8.25 are put to use in the following examples.

Example15.8.29.

Compute \(\displaystyle \iint_E \left(\frac{y^2}{x^4}+\frac{x^2}{y^4}\right)\,dA\text{,}\) where \(E\) is the region bounded by \(y=x^2\text{,}\) \(y=2x^2\text{,}\) \(x=y^2\text{,}\) and \(x=4y^2\text{.}\)

Solution.

The region \(E\) is pictured in Figure 15.8.30 below. We need to find a region \(D\subset \mathbb{R}^2\) and a transformation \(T:D\to \mathbb{R}^2\) whose image is \(E\text{.}\) We use the fact that \(T\) must map the boundary of \(D\) to the boundary of \(E\) as a guideline. In particular, note that since \(T\) is \(C^1\text{,}\) it must map smooth curves to smooth curves by the chain rule. This tells us that the corners of \(E\) must correspond to the corners of \(D\text{,}\) and in particular, that each of the four curves that make up the boundary of \(E\) must come from one of four curves that make up the boundary of \(D\text{.}\) Since we would like the integral over \(D\) to be as simple as possible, we try to find a transformation such that \(D\) is a rectangle.

Since the sides of a rectangle in the \(uv\)-plane are given by either \(u=\text{constant}\) or \(v=\text{constant}\text{,}\) we try to express the boundary of \(E\) in terms of level curves \(u(x,y)=c_1, c_2\) and \(v(x,y)=d_1,d_2\text{.}\) Let’s look at the curves \(y=x^2\) and \(y=2x^2\text{.}\) These both belong to the family of curves \(y=cx^2\text{,}\) or \(\dfrac{y}{x^2}=c\text{,}\) so we set \(u(x,y) = \dfrac{y}{x^2}\text{.}\) The region between these two parabolas is then given by \(1\leq u\leq 2\text{,}\) or \(u\in [1,2]\text{.}\) Similarly, the other two sides of the boundary of \(E\text{,}\) given by \(x=y^2\) and \(x=4y^2\) both belong to the family of curves \(x=dy^2\text{,}\) or \(\dfrac{x}{y^2}=d\text{.}\) This suggests that we take \(v(x,y)=\dfrac{x}{y^2}\text{,}\) with \(1\leq v\leq 4\text{.}\)

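We have now determined a map \(S:E\to D=[1,2]\times [1,4]\) given by

\begin{equation*}
S(x,y) = (u(x,y),v(x,y)) = \left(\frac{y}{x^2},\frac{x}{y^2}\right)\text{.}
\end{equation*}

The Jacobian of this map is

\begin{equation*}
J_S(x,y) = \left\lvert\det\begin{bmatrix}-\dfrac{2y}{x^3} \amp \dfrac{1}{x^2}\\ \dfrac{1}{y^2} \amp -\dfrac{2x}{y^3}\end{bmatrix}\right\rvert = \frac{4}{x^2y^2}-\frac{1}{x^2y^2} = \frac{3}{x^2y^2}\text{,}
\end{equation*}

which is defined and non-zero on all of \(E\text{.}\) This means that \(S=T^{-1}\) for some transformation \(T:D\to E\text{.}\) We can now proceed to compute the integral via change of variables in one of two ways: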

Directly, by solving for \(x\) and \(y\) in terms of \(u\) and \(v\text{,}\) which will give us the transformation \(T\text{.}\)

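From \(u=\dfrac{y}{x^2}\) we get \(y=ux^2\text{,}\) so \(x=vy^2 = vu^2x^4\text{.}\) Since \(x\neq 0\) on \(E\text{,}\) this gives us \(x^{-3} = u^2v\text{,}\) so \(x = u^{-2/3}v^{-1/3}\text{,}\) and thus \(y=ux^2 = u^{-1/3}v^{-2/3}\text{.}\) The transformation \(T\) is thus \(T(u,v) = (u^{-2/3}v^{-1/3},u^{-1/3}v^{-2/3})\text{,}\) and its Jacobian is given by

\begin{equation*}
J_T(u,v) = \left\lvert\det\begin{bmatrix}-\frac{2}{3}u^{-5/3}v^{-1/3} \amp -\frac{1}{3}u^{-2/3}v^{-4/3}\\ -\frac{1}{3}u^{-4/3}v^{-2/3} \amp -\frac{2}{3}u^{-1/3}v^{-5/3}\end{bmatrix}\right\rvert = \frac{4}{9u^2v^2}-\frac{1}{9u^2v^2} = \frac{1}{3u^2v^2}\text{.}
\end{equation*}

Since \(\dfrac{y^2}{x^4}+\dfrac{x^2}{y^4} = u^2+v^2\text{,}\) the change of variables formula gives

\begin{align*}
\iint_E \left(\frac{y^2}{x^4}+\frac{x^2}{y^4}\right)\,dA \amp= \int_1^4\int_1^2 (u^2+v^2)\frac{1}{3u^2v^2}\,du\,dv\\
\amp= \frac{1}{3}\int_1^4\int_1^2\left(\frac{1}{v^2}+\frac{1}{u^2}\right)\,du\,dv\\
\amp= \frac{1}{3}\int_1^4\left(\frac{1}{v^2}+\frac{1}{2}\right)\,dv = \frac{1}{3}\left(\frac{3}{4}+\frac{3}{2}\right) = \frac{3}{4}\text{.}
\end{align*}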

Indirectly, using the fact that \(J_T(u,v) = \dfrac{1}{J_{T^{-1}}(x(u,v),y(u,v))}\text{.}\)

From the above, we have that \(J_{T^{-1}}(x,y) = \frac{3}{x^2y^2}\text{,}\) so \(J_T(u,v) = \frac{1}{3}(x(u,v))^2(y(u,v))^2\text{.}\) From \(u=\dfrac{y}{x^2}\) and \(v=\dfrac{x}{y^2}\text{,}\) we have \(uv = \dfrac{xy}{x^2y^2} = \dfrac{1}{xy}\text{.}\) Thus, \(x^2y^2 = \dfrac{1}{u^2v^2}\text{,}\) so \(J_T(u,v) = \dfrac{1}{3u^2v^2}\) as before. From here we can proceed as above.

Example15.8.31.

Compute \(\displaystyle \iint_E xy \, dA\text{,}\) where \(E\) is the region in the first quadrant bounded by \(y=x\text{,}\) \(y=4x\text{,}\) \(y=1/x\text{,}\) and \(y=2/x\text{.}\)

Solution.

We need to find a region \(D\subset \mathbb{R}^2\) and a transformation \(T:D\to \mathbb{R}^2\) whose image is \(E\text{.}\) This problem is almost identical to the one we solved in Example 15.8.22, where we were given a change of variables whose domain was still somewhat complicated. This time, we look for a transformation with a rectangular domain.

Using the principle that \(T\) must map the boundary of \(D\) to the boundary of \(E\) as above, we set \(u=\dfrac{y}{x}\text{,}\) so that \(1\leq u\leq 4\) gives the region between \(y=x\) and \(y=4x\text{,}\) and \(v=xy\text{,}\) so that \(1\leq v\leq 2\) gives the region between \(y=1/x\) and \(y=2/x\text{.}\) Thus the desired transformation is defined on the rectangle \(D = [1,4]\times [1,2]\) and has an inverse given by \(T^{-1}(x,y) = (y/x,xy)\text{.}\)

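This time we leave the direct method (solving for \(x\) and \(y\) in terms of \(u\) and \(v\)) as an exercise and use the indirect method. The Jacobian of \(T^{-1}\) is given by

\begin{equation*}
J_{T^{-1}}(x,y) = \left\lvert\det\begin{bmatrix}-\dfrac{y}{x^2} \amp \dfrac{1}{x}\\ y \amp x\end{bmatrix}\right\rvert = \left\lvert -\frac{y}{x}-\frac{y}{x}\right\rvert = \frac{2y}{x}\text{,}
\end{equation*}

since \(x\) and \(y\) are both positive on \(E\text{.}\)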
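
One way to finish the computation by the indirect method is sketched below. It uses the value \(J_{T^{-1}}(x,y)=\dfrac{2y}{x}=2u\) (valid in the first quadrant) together with \(J_T = 1/J_{T^{-1}}\text{;}\) the intermediate steps are our own working and should be checked against the direct method left as an exercise.

\begin{align*}
J_T(u,v) \amp= \frac{1}{2u}\text{, so}\\
\iint_E xy\,dA \amp= \int_1^2\int_1^4 v\cdot\frac{1}{2u}\,du\,dv = \frac{1}{2}\left(\int_1^4\frac{du}{u}\right)\left(\int_1^2 v\,dv\right)\\
\amp= \frac{1}{2}(\ln 4)\left(\frac{3}{2}\right) = \frac{3}{2}\ln 2\text{.}
\end{align*}

Because \(D\) is a rectangle and the new integrand is a product of a function of \(u\) and a function of \(v\text{,}\) the double integral splits into two single integrals.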

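Recall that the change of variables formula of Theorem 15.8.17 can be written compactly as

\begin{equation*}
\int_{T(D)} f(\mathbf{x})\,d\mathbf{x} = \int_D f(T(\mathbf{u}))J_T(\mathbf{u})\,d\mathbf{u}\text{,}
\end{equation*}

if we let the symbol \(\int\) stand for a single, double, or triple integral as necessary.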

In practice, we use the formula in one of two ways:

Right-to-left, because it is easier to compute antiderivatives for the function \(f(\mathbf{x})\text{.}\) This is the case with change of variables for single integrals.

Left-to-right, because the domain \(D\) is a simpler region of integration than \(T(D)\text{,}\) as in the examples above, and in the transformations to polar, cylindrical, or spherical coordinates considered earlier. (Of course, we might also get lucky and find that our function simplifies as well!)
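
The two directions can be illustrated with a pair of small examples of our own choosing, assuming the formula is written with the integral over \(T(D)\) on the left and the integral over \(D\) (with the Jacobian) on the right. In the first line, \(T(u)=u^2\) on \(D=[1,2]\text{,}\) so \(J_T(u)=2u\) and \(T(D)=[1,4]\text{;}\) in the second, \(T\) is the polar coordinate transformation on \(D=[0,1]\times[0,2\pi]\text{,}\) with \(J_T(r,\theta)=r\text{.}\)

\begin{align*}
\text{Right-to-left: } \int_1^2 \cos(u^2)\,2u\,du \amp= \int_1^4 \cos x\,dx = \sin 4-\sin 1\text{,}\\
\text{Left-to-right: } \iint_{x^2+y^2\leq 1} e^{-(x^2+y^2)}\,dA \amp= \int_0^{2\pi}\int_0^1 e^{-r^2}\,r\,dr\,d\theta = \pi\left(1-e^{-1}\right)\text{.}
\end{align*}

In the first case the interval barely changes but the integrand becomes easy to antidifferentiate; in the second, the disk becomes a rectangle, and the factor of \(r\) happens to make the integrand easier as well.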

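Let’s consider this formula in the intermediate case of a double integral. If the function \(f\) is positive throughout the region \(E=T(D)\text{,}\) we can interpret the integral on the left as a volume. In terms of Riemann sums, we are adding up volumes of boxes:

\begin{equation*}
\iint_E f(x,y)\,dx\,dy \approx \sum_{i,j} \underbrace{f(x_i,y_j)}_{\text{height}}\,\underbrace{\Delta x\,\Delta y}_{\text{area}}\text{.}
\end{equation*}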

The distortion in area caused by the mapping \(T\) when we move from the region \(D\) in the \(u,v\) plane to the region \(E\) in the \(x,y\) plane is hidden within the \(dx\,dy\) area element in the integral on the left-hand side.

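To ensure that both integrals compute the same volume, the Jacobian is introduced as part of the integrand on the right-hand side to produce a corresponding change in height:

\begin{equation*}
\iint_D f(T(u,v))\,J_T(u,v)\,du\,dv \approx \sum_{i,j} \underbrace{f(T(u_i,v_j))\,J_T(u_i,v_j)}_{\text{height}}\,\underbrace{\Delta u\,\Delta v}_{\text{area}}\text{.}
\end{equation*}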

Appropriately interpreted, the only differences between the integrals on either side are the labelling of the variables, and whether the Jacobian provides a measure of height, or of area, in the calculation of volume.

In general, transformations produce what are called “curvilinear coordinate systems”: the original linear coordinate system in the \(u,v\) plane, with grid lines given by \(u=\text{constant}\) or \(v=\text{constant}\text{,}\) is transformed into a “grid of curves” in the \(x,y\) plane. This is the case, for example, with the polar coordinate transformation, as seen in Figure 15.8.34 below.

For another example, consider the transformation \(T\) given by

\begin{equation*}
(x,y) = T(u,v) = \left(\sqrt[3]{\frac{u}{v}},\sqrt[3]{u^2v}\right)\text{.}
\end{equation*}

A grid in the \(u,v\) plane is transformed to two families of curves: the lines \(u=m\) and \(v=n\text{,}\) where \(m\) and \(n\) are constants, become the curves \(y=\frac{m}{x}\) and \(y=nx^2\text{,}\) respectively. The transformation is pictured in Figure 15.8.35 below.
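
The two families of curves can be checked by working backwards from them. The sketch below assumes \(x,y\gt 0\) and takes the level curves \(xy=m\) and \(y/x^2=n\) as the definition of \(u\) and \(v\text{;}\) the resulting formulas are our own reconstruction from the curves named above.

\begin{align*}
u \amp= xy, \quad v = \frac{y}{x^2} \quad\Rightarrow\quad \frac{u}{v} = x^3, \quad y = \frac{u}{x},\\
T(u,v) \amp= \left(u^{1/3}v^{-1/3},\,u^{2/3}v^{1/3}\right),\\
J_{T^{-1}}(x,y) \amp= \left\lvert\det\begin{bmatrix} y \amp x\\ -\dfrac{2y}{x^3} \amp \dfrac{1}{x^2}\end{bmatrix}\right\rvert = \frac{y}{x^2}+\frac{2y}{x^2} = 3v, \quad\text{so}\quad J_T(u,v) = \frac{1}{3v}\text{.}
\end{align*}

Setting \(u=m\) gives \(xy=m\text{,}\) or \(y=\frac{m}{x}\text{,}\) and setting \(v=n\) gives \(y=nx^2\text{,}\) in agreement with the description of the grid.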

In Figure 15.8.35 we’ve highlighted one of the rectangles in our grid to see how it’s transformed. Imagine now that our grid lines are much finer, coming not from the integer values of \(u\) and \(v\text{,}\) but from a partition of a rectangle \(D\) in the \(u,v\) plane. Zooming in, we’d see that each rectangle in the partition is transformed much like the one above.

Indeed, recall the following philosophy from Section 14.6: the transformation \(T\) maps points in the \(u,v\) plane to points in the \(x,y\) plane. The derivative matrix \(DT(u,v)\) of \(T\) at a point \((u,v)\text{,}\) when viewed as the matrix of a linear transformation, maps (tangent) vectors at the point \((u,v)\) to (tangent) vectors at the point \((x,y)=T(u,v)\text{.}\) (This is a consequence of the Chain Rule.)

Consider a general transformation \(T(u,v) = (x(u,v),y(u,v))\) and a uniform partition of the domain of \(T\text{.}\) At a point \((u_i,v_j)\) in our partition, the lines \(u=u_i\) and \(v=v_j\) can be viewed as parametric curves:

\begin{align*}
\vec{r}_1(t) \amp= \langle t,v_j\rangle, \text{ for } u_i\leq t\leq u_i+\Delta u, \text{ and}\\
\vec{r}_2(t) \amp= \langle u_i,t\rangle, \text{ for } v_j\leq t\leq v_j+\Delta v\text{.}
\end{align*}

The \((i,j)\)-th rectangle, given by \(u_i\leq u\leq u_i+\Delta u\) and \(v_j\leq v\leq v_j+\Delta v\text{,}\) has area \(\Delta u \,\Delta v\text{.}\)

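Viewed another way, this rectangle is a parallelogram spanned by the vectors \(\Delta u\vec{i}\) and \(\Delta v\vec{j}\text{.}\) The area of this parallelogram is given by the determinant of the matrix whose columns are these vectors. Of course, this produces the same area:

\begin{equation*}
\left\lvert\det\begin{bmatrix}\Delta u \amp 0\\ 0 \amp \Delta v\end{bmatrix}\right\rvert = \Delta u\,\Delta v\text{.}
\end{equation*}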

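Now, let’s consider the corresponding region in the \(x,y\) plane. The curves in Figure 15.8.35 above can also be realized as parametric curves. In fact, they are precisely the composition of the curves above with our transformation, if we view \(T\) as a vector-valued function. We have curves

\begin{align*}
\vec{s}_1(t) \amp= T(\vec{r}_1(t)) = \langle x(t,v_j),y(t,v_j)\rangle, \text{ for } u_i\leq t\leq u_i+\Delta u, \text{ and}\\
\vec{s}_2(t) \amp= T(\vec{r}_2(t)) = \langle x(u_i,t),y(u_i,t)\rangle, \text{ for } v_j\leq t\leq v_j+\Delta v\text{,}
\end{align*}

making up two of the four sides of our transformed rectangle.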

Now, \(\vec{s}_1(t)\) and \(\vec{s}_2(t)\) are curves in general, not lines, and the image of our rectangle is no longer rectangular. But for \(\Delta u,\Delta v\) small enough, our curves are approximately linear, and the image of our rectangle is approximately a parallelogram. See Figure 15.8.36.

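We can make linear approximations to vector-valued functions in much the same way as we do for real-valued functions. We have

\begin{equation*}
\vec{s}_1(u_i+\Delta u) \approx \vec{s}_1(u_i)+\vec{s}_1\,'(u_i)\Delta u\text{,}
\end{equation*}

with a similar result for \(\vec{s}_2\text{.}\) This means that we can approximate the area of our transformed rectangle using the parallelogram spanned by the vectors

\begin{align*}
\vec{a} \amp= \vec{s}_1\,'(u_i)\Delta u = \langle x_u(u_i,v_j),y_u(u_i,v_j)\rangle\Delta u, \text{ and}\\
\vec{b} \amp= \vec{s}_2\,'(v_j)\Delta v = \langle x_v(u_i,v_j),y_v(u_i,v_j)\rangle\Delta v\text{.}
\end{align*}

The area of our transformed region is therefore approximated by the area of the parallelogram spanned by the vectors \(\vec{a}\) and \(\vec{b}\text{.}\) With all partial derivatives evaluated at \((u_i,v_j)\text{,}\) this area is

\begin{equation*}
\left\lvert\det\begin{bmatrix} x_u\Delta u \amp x_v\Delta v\\ y_u\Delta u \amp y_v\Delta v\end{bmatrix}\right\rvert = \left\lvert x_uy_v - x_vy_u\right\rvert\Delta u\,\Delta v = J_T(u_i,v_j)\,\Delta u\,\Delta v\text{.}
\end{equation*}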

This is exactly the result we wanted: the area of our transformed rectangle is approximately the area of the original rectangle, multiplied by the Jacobian.
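
As an illustration of how good this approximation is, we can test it on the polar coordinate transformation \(T(r,\theta)=(r\cos\theta,r\sin\theta)\text{,}\) where the exact area of the image of a rectangle is known. The vectors \(\vec{a}\) and \(\vec{b}\) below are the partial derivatives of \(T\) with respect to \(r\) and \(\theta\) at a grid point \((r_i,\theta_j)\text{,}\) scaled by \(\Delta r\) and \(\Delta\theta\text{;}\) the choice of example is ours.

\begin{align*}
\vec{a} \amp= \langle \cos\theta_j,\sin\theta_j\rangle\Delta r, \quad \vec{b} = \langle -r_i\sin\theta_j,r_i\cos\theta_j\rangle\Delta\theta,\\
\text{approximate area} \amp= \left\lvert r_i\cos^2\theta_j+r_i\sin^2\theta_j\right\rvert\Delta r\,\Delta\theta = r_i\,\Delta r\,\Delta\theta,\\
\text{exact area} \amp= \frac{1}{2}\left((r_i+\Delta r)^2-r_i^2\right)\Delta\theta = \left(r_i+\frac{\Delta r}{2}\right)\Delta r\,\Delta\theta\text{.}
\end{align*}

The two differ by \(\frac{1}{2}(\Delta r)^2\Delta\theta\text{,}\) which is negligible compared to \(r_i\,\Delta r\,\Delta\theta\) as the partition becomes finer.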

We can begin to see the change of variables formula by putting this result into the Riemann sum definition of the double integral:

\begin{equation*}
f(x_i,y_j)\,\Delta x\,\Delta y \approx f(T(u_i,v_j))\cdot J_T(u_i,v_j)\,\Delta u\,\Delta v\text{.}
\end{equation*}

This equation should be viewed somewhat skeptically. The area element on the left is that of a rectangle, not the parallelogram we ended up with above. The argument given here is far from a complete proof of Theorem 15.8.17, but the result is true nonetheless. The interested reader is directed to search online, or seek out the advanced calculus section of their library, should they wish to see a proof.