# Vectors and Vector Functions

## Vectors

A vector in the plane.

A vector is a mathematical object which has magnitude like a number, but also direction. For example, if I wanted to describe how a person is driving at any particular time, I would need two numbers: their speed, and some measure of their direction (the angle it makes with true north, for example). These two quantities together constitute the motion vector of that person.

We also use vectors to describe position: for example, to locate an object, it suffices to know how far it is from you, and in what direction.

In three dimensions, three component numbers are required to describe a vector.

Vectors are the bread and butter of multivariable calculus because they pack multiple variables into a single, easily manipulated quantity. Let's look at some more specific examples.

In our first descriptions, we used a coordinate system you may already be familiar with, called "polar coordinates." If you aren't familiar with this system, we will review it shortly. When using Cartesian coordinates, vectors are very familiar. The point (x,y) has a position vector, or, put more simply, is the vector written $\begin{pmatrix}x\\y\end{pmatrix}$. See the illustration at right. Unlike variables which may be numbers, like $a, x,$ etc., vectors are usually written $\vec{w}$ or $\mathbf{w}$. While the latter is common in print and online, we will use the former since it is more easily duplicated in handwriting.

In three dimensional space we use three variables to write down a vector. In Cartesian coordinates, this would look like $\vec{w} = \begin{pmatrix}x\\y\\z\end{pmatrix}$.

We can, if we wish, talk about just the length of the vector - this is its absolute value, or $\left\| \vec{w} \right\|$. For a two-dimensional vector written in Cartesian coordinates, the Pythagorean theorem tells us this is just $\sqrt{x^2 + y^2}$; in three dimensions it is $\sqrt{x^2 + y^2 + z^2}$.

Vectors can be added and subtracted, just like numbers. See the illustration below.

Vectors are added by connecting them head to tail. Vectors are subtracted by reversing heads and tails for the subtracted vector, and then connecting them head to tail.

In Cartesian coordinates, this is particularly easy: $\begin{pmatrix}x\\y\end{pmatrix} + \begin{pmatrix}a\\b\end{pmatrix} = \begin{pmatrix}x+a\\y+b\end{pmatrix}$ and $\begin{pmatrix}x\\y\end{pmatrix} - \begin{pmatrix}a\\b\end{pmatrix} = \begin{pmatrix}x-a\\y-b\end{pmatrix}$.

Multiplication by a number is also simple: the vector's direction is unchanged (or reversed, if the number is negative), but its length is multiplied by the absolute value of that number. Again, in Cartesian coordinates, this becomes simple: $c\begin{pmatrix}a\\b\end{pmatrix} = \begin{pmatrix}ca\\cb\end{pmatrix}$.
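
As a quick sanity check, the componentwise rules above can be verified numerically. Here is a minimal sketch using NumPy (the library choice is ours, for illustration only):

```python
import numpy as np

u = np.array([3.0, 4.0])
w = np.array([1.0, 2.0])

# Addition and subtraction are componentwise
assert np.allclose(u + w, [4.0, 6.0])
assert np.allclose(u - w, [2.0, 2.0])

# Multiplying by a number scales the length but not the direction
c = 2.0
assert np.isclose(np.linalg.norm(c * u), c * np.linalg.norm(u))

# The Pythagorean length of (3, 4) is 5
assert np.isclose(np.linalg.norm(u), 5.0)
```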

The dot product is a measure of the length of one vector in the direction of another.

There are two ways to multiply a vector by another vector. In two dimensions, we can perform only one, which is called the dot product or, in more advanced terminology, the inner product. Given two vectors, this dot product is how far one vector goes in the direction of the other, using the other as the "unit" of length. It is defined $\vec{u} \cdot \vec{w} = \left\|\vec{u}\right\| \left\|\vec{w}\right\| \cos(\theta)$, where $\theta$ is the angle between $\vec{u}$ and $\vec{w}$. As before, this takes a particularly simple form in Cartesian coordinates: $\begin{pmatrix}x\\y\end{pmatrix} \cdot \begin{pmatrix}a\\b\end{pmatrix} = xa+yb$. This expression is similar for the three dimensional case: $\begin{pmatrix}x\\y\\z\end{pmatrix} \cdot \begin{pmatrix}a\\b\\c\end{pmatrix} = xa+yb+zc$.
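
For a concrete check that the two dot-product formulas agree, take $\vec{u}=(1,0)$ and $\vec{w}=(1,1)$, which meet at a 45° angle. A small NumPy sketch (our choice of tool, not the text's):

```python
import numpy as np

u = np.array([1.0, 0.0])
w = np.array([1.0, 1.0])

# The angle between (1, 0) and (1, 1) is 45 degrees
theta = np.pi / 4

# Geometric definition: |u| |w| cos(theta)
geometric = np.linalg.norm(u) * np.linalg.norm(w) * np.cos(theta)

# Cartesian component rule: x*a + y*b
componentwise = u @ w

assert np.isclose(geometric, componentwise)  # both give 1
```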

The cross product of two vectors.
While the dot product can be used in any number of dimensions, there is a special way of multiplying vectors which only exists in three dimensions (although it is a specific case of a more general concept). It is called the cross product and written $\vec{u} \times \vec{w}$. The cross product is defined as having length equal to the area of the parallelogram bordered on two sides by $\vec{u}$ and $\vec{w}$, and direction perpendicular to either vector, with a right handed orientation. This means that if we take our right hand, point the palm in the direction of $\vec{u}$, and bend our knuckles so that they point in the direction of $\vec{w}$, then our thumb is pointing in the direction of $\vec{u} \times \vec{w}$. This definition is not easy to calculate with, and shortly we will introduce a more convenient way of calculating these kinds of products.

While we often write $\vec{w} = \begin{pmatrix}x\\y\\z\end{pmatrix}$ to describe a vector, we often find it more convenient to define three vectors $\vec{i}=\begin{pmatrix}1\\0\\0\end{pmatrix}, \vec{j}=\begin{pmatrix}0\\1\\0\end{pmatrix},$ and $\vec{k}=\begin{pmatrix}0\\0\\1\end{pmatrix}$, unit vectors which point along the positive $x, y,$ and $z$ axes, respectively.

With these vectors, we can write our original vector $\vec{w}=x\vec{i} + y\vec{j} + z\vec{k}$. It means exactly the same thing as $\begin{pmatrix}x\\y\\z\end{pmatrix}$, but is a little friendlier to the printed page. As an additional notational convenience, we often write $\vec{w} = w_1 \vec{i} + w_2 \vec{j} + w_3 \vec{k}$, since keeping track of all the $x$'s and $a$'s would get unwieldy very fast if we are working with several vectors.

As we are about to see, these conveniences can make the calculation of the cross product relatively simple. Hopefully, the student will remember the definition of the determinant of a matrix, because this will make calculating cross products much, much easier. For those who are unfamiliar with the idea, we will review the concept.

For a matrix $\begin{pmatrix}a&b\\c&d\end{pmatrix}$, we define the determinant $\begin{vmatrix}a&b\\c&d\end{vmatrix}$ to be $ad - bc$. To compute the determinant of a larger square matrix, we choose any row and take its leftmost entry, and multiply that number by the determinant of the matrix formed by deleting the row and column containing that entry. From this, we subtract the next entry in our chosen row times the determinant of the matrix formed by deleting THAT entry's row and column. Then we add the next such term, alternating signs, and so on and so forth until we have a complete expansion.

Below is an example of this technique, expanding along the first row.

$\begin{vmatrix}a&b&c\\r&s&t\\x&y&z\end{vmatrix} = a\begin{vmatrix}s&t\\y&z\end{vmatrix} - b \begin{vmatrix}r&t\\x&z\end{vmatrix} + c \begin{vmatrix}r&s\\x&y\end{vmatrix}$

Examine this equation, and make sure you understand how each of the determinants on the right hand side was formed. Once we have reached this point, we can use our definition of a 2×2 determinant to arrive at a value.
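
The expansion rule translates directly into code. The following sketch expands a 3×3 determinant along the first row; NumPy is assumed only to supply a reference value via `np.linalg.det`:

```python
import numpy as np

def det2(p, q, r, s):
    """Determinant of the 2x2 matrix [[p, q], [r, s]]."""
    return p * s - q * r

def det3(m):
    """Expand a 3x3 determinant along the first row, as described above."""
    a, b, c = m[0]
    return (a * det2(m[1][1], m[1][2], m[2][1], m[2][2])
            - b * det2(m[1][0], m[1][2], m[2][0], m[2][2])
            + c * det2(m[1][0], m[1][1], m[2][0], m[2][1]))

m = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])
assert np.isclose(det3(m), np.linalg.det(m))  # both give -3
```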

We now arrive at a means of calculating the cross product of two vectors. It turns out that the definition we gave of cross product amounts to

$\vec{u}\times\vec{w}=\begin{vmatrix}\vec{i}&\vec{j}&\vec{k}\\u_1&u_2&u_3\\w_1&w_2&w_3\end{vmatrix}$

Normally, we don't allow vectors (like $\vec{i}$) to be elements of a matrix or a determinant, but in this case, we make an exception because the formula is so useful and easy to remember. We will encounter a similar exception to the rules when we give an easy way to remember the formula for curl.
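
Expanding that symbolic determinant along the top row yields the familiar component formula for the cross product, which we can check against NumPy's built-in `np.cross` (the library is an assumption of this sketch):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

# Cofactor expansion of the determinant along the row of basis vectors
cross = np.array([u[1] * w[2] - u[2] * w[1],
                  -(u[0] * w[2] - u[2] * w[0]),
                  u[0] * w[1] - u[1] * w[0]])

assert np.allclose(cross, np.cross(u, w))

# The cross product is perpendicular to both factors
assert np.isclose(cross @ u, 0.0)
assert np.isclose(cross @ w, 0.0)
```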

In this course, we will frequently switch between the notations $(x,y,z), x\vec{i}+y\vec{j}+z\vec{k},$ and $\begin{pmatrix}x\\y\\z\end{pmatrix}$ without discussion of the switch: constantly keep in mind that these are all ways of saying the same thing.

## Vector Functions and Vector Fields

An illustration of a vector field. The arrows show the magnitude and direction of the vector at the location of the arrow's base.

Before, we described scalar functions as taking a point in space and ascribing a number to it - for example, $f(x,y,z) = x^2 + y^2 + z^2$. With our new notation, we can describe this as $f(\vec{r}) = \vec{r} \cdot \vec{r}$, which is a scalar function written with vectors.

There is another class of functions, though, which return not numbers but vectors. Mathematicians define a function as anything that maps some set (the "domain") to some set (the "range"). For the functions we have seen so far, the range is the real numbers (scalars).

When the range is a vector space, it can be called a vector function. But the most common types of vector functions are those for which the domain is physical space. In that case, the function is called a vector field.

Vector fields are useful for describing many phenomena, most notably flows, winds, heat dispersion, and electric and magnetic forces. For example, at every point in the ocean, the water is moving with both a speed and a direction. If we have some vector $\vec{r}$ which can be used to describe points in the ocean, we can write a function $f(\vec{r}) = \vec{v}$, which will give us the velocity vector of the water at $\vec{r}$. Very often, we use $\vec{r}$ as a variable vector describing a position or location, $\vec{v}$ for velocity, and later we'll encounter $\vec{a}$ for acceleration.

# Coordinate Systems

The spherical coordinate system.

We're all very familiar with Cartesian coordinates in two dimensions, and once the student has learned the "right hand rule" and committed it to memory, they have learned all there is to know about the Cartesian system in three dimensions.

Hopefully, the student will be familiar with polar coordinates as well, but we shall review them.

In the plane, it is frequently useful to describe shapes, not by their distance along two axes, but by their straight-line distance to a point, and their angle to a fixed line. We use $(r,\theta)$ instead of the usual $(x,y)$, where $r$ is the distance and $\theta$ is the angle.

We can easily move between Cartesian and polar coordinates with these equations:

$r = \sqrt{x^2+y^2}$

$\theta = \arctan{\left(\frac{y}{x}\right)}$

$x=r \cos{(\theta)}$

$y=r \sin{(\theta)}$
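
These conversions are easy to express in code. In the sketch below (a minimal illustration using NumPy), we use `np.arctan2` rather than a bare arctangent of $y/x$, so that points in every quadrant get the correct angle:

```python
import numpy as np

def to_cartesian(r, theta):
    return r * np.cos(theta), r * np.sin(theta)

def to_polar(x, y):
    # arctan2 picks the correct quadrant, unlike a bare arctan(y / x)
    return np.hypot(x, y), np.arctan2(y, x)

# Round trip: polar -> Cartesian -> polar recovers the original point
x, y = to_cartesian(2.0, np.pi / 3)
r, theta = to_polar(x, y)
assert np.isclose(r, 2.0)
assert np.isclose(theta, np.pi / 3)
```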

Cylindrical coordinates are just polar coordinates, with a z-axis. They are frequently useful for describing phenomena which have circular motion in a plane, but are also moving in some direction (like a spiral).

More common than cylindrical coordinates are spherical coordinates, which use distance from a center point together with "latitude and longitude" to describe points in three dimensional space.

In the spherical coordinate system, a point is described with its distance to a central point, $\rho$; the angle $\theta$ made in the $xy$ plane between the point's projection into that plane and the x-axis; and the angle $\phi$ made between the point and the z-axis.

As with polar coordinates, there are equations which allow us to transfer between spherical and Cartesian coordinates.

$\rho = \sqrt{x^2 + y^2 + z^2}$

$\theta = \arctan{\left(\frac{y}{x}\right)}$

$\phi = \arctan{ \left({ \frac{\sqrt{x^2+y^2}}{z} }\right) }$

$x = \rho \sin(\phi) \cos(\theta)$

$y = \rho \sin(\phi) \sin(\theta)$

$z = \rho \cos(\phi)$
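
A similar round trip works for spherical coordinates. In this sketch (again using NumPy), $\phi$ is recovered with an arccosine, which agrees with the arctangent formula when $z>0$ and avoids sign trouble when $z$ is negative:

```python
import numpy as np

def spherical_to_cartesian(rho, theta, phi):
    # theta: angle in the xy-plane from the x-axis; phi: angle from the z-axis
    return (rho * np.sin(phi) * np.cos(theta),
            rho * np.sin(phi) * np.sin(theta),
            rho * np.cos(phi))

def cartesian_to_spherical(x, y, z):
    rho = np.sqrt(x**2 + y**2 + z**2)
    return rho, np.arctan2(y, x), np.arccos(z / rho)

x, y, z = spherical_to_cartesian(2.0, np.pi / 4, np.pi / 3)
rho, theta, phi = cartesian_to_spherical(x, y, z)
assert np.allclose([rho, theta, phi], [2.0, np.pi / 4, np.pi / 3])
```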

# Limits of Multivariate Functions

See the limit article.

The limit of a multivariate function $\lim_{(x,y)\rightarrow(x_0,y_0)} f(x,y)$ is almost identical to the single-variable limit in concept, but complicated by the realization that $(x,y)$ can approach the target point from any direction, not just the left or right hand sides.

The definition is not dissimilar from the single-variable case. The limit of a function is defined so that $\lim_{(x,y) \rightarrow (x_0,y_0)} f(x,y) = L$ if, for any tiny little number $\epsilon>0$ you can think of, there is some distance $\delta$ such that every point within $\delta$ of $(x_0,y_0)$ is taken by the function to a number within $\epsilon$ of $L$.

Nevertheless, the kinds of phenomena we see in the multivariate case are very different. Note that while it is still possible to define a limit from a single direction, we don't bother to do so - unlike the one-variable case, where a one-sided limit told us half of the function's behavior at a point, in higher dimensions such a limit tells us almost nothing. Indeed, the only kind of phenomenon that these kinds of limits are capable of describing is a "point discontinuity," which can be "fixed" just by defining the function to be the limit at that point.

# Partial Derivatives

See the proof page for details.

When dealing with functions of more than one variable, one has to be careful about what a "derivative" is. The operative concept is the partial derivative, which is the derivative of the function with respect to a given argument while holding all the other arguments constant. It is written like a derivative, but with a different "d" character.

If

$f(x,y,z) = \frac{x y^2 \sin{x}}{z}\,$

we have

$\frac{\partial}{\partial x} f(x,y,z) = \frac{y^2 (x \cos{x} + \sin{x})}{z}$
$\frac{\partial}{\partial y} f(x,y,z) = \frac{2 x y \sin{x}}{z}$
$\frac{\partial}{\partial z} f(x,y,z) = \frac{- x y^2 \sin{x}}{z^2}$
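
We can verify these partial derivatives numerically by holding the other arguments constant and applying a central difference in one argument at a time. A minimal sketch (NumPy only; the sample point and step size are our own illustrative choices):

```python
import numpy as np

def f(x, y, z):
    return x * y**2 * np.sin(x) / z

def partial(func, args, i, h=1e-6):
    """Central-difference estimate of the partial derivative in argument i,
    holding the other arguments constant."""
    a = list(args)
    a[i] += h
    hi = func(*a)
    a[i] -= 2 * h
    lo = func(*a)
    return (hi - lo) / (2 * h)

x, y, z = 1.2, 0.7, 2.0
assert np.isclose(partial(f, (x, y, z), 0), y**2 * (x * np.cos(x) + np.sin(x)) / z)
assert np.isclose(partial(f, (x, y, z), 1), 2 * x * y * np.sin(x) / z)
assert np.isclose(partial(f, (x, y, z), 2), -x * y**2 * np.sin(x) / z**2)
```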

Here are some other notations for partial derivatives that one might encounter. We won't use them.

$\frac{\partial}{\partial x} f(x,y,z) = \partial_x f(x,y,z) = f_x (x,y,z)$

One can take higher-order partial derivatives as well, such as

$\frac{\partial^2 f}{\partial x^2}$

or

$\frac{\partial^2 f}{\partial x \partial y}$

which means

$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right)$

A question that arises is: do mixed partial derivatives commute? That is, do we have:

$\frac{\partial^2 f}{\partial x \partial y} = \frac{\partial^2 f}{\partial y \partial x}$

The answer is yes, if the derivatives are continuous. In problems that arise in practice, the derivatives are always continuous, and switching derivative order is a staple of work in multivariate calculus.
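
This symmetry can be checked symbolically. The sketch below assumes the SymPy library is available and uses an arbitrary smooth example function of our own choosing:

```python
import sympy as sp

x, y = sp.symbols('x y')

# An arbitrary smooth example function (our choice, for illustration)
f = sp.exp(x * y) * sp.sin(x + y**2)

fxy = sp.diff(f, x, y)  # differentiate with respect to x, then y
fyx = sp.diff(f, y, x)  # differentiate with respect to y, then x

# The mixed partials agree, as the theorem promises for continuous derivatives
assert sp.simplify(fxy - fyx) == 0
```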

The study of various coordinate systems (sometimes called "curvilinear coordinate systems") is central to the subject of multivariable calculus. You are already familiar with some of the common coordinate systems such as polar coordinates in 2 dimensions, and spherical or cylindrical coordinates in 3 dimensions. In this section we will examine the general topic of coordinate systems, and how one can perform calculations—involving vectors, surfaces, integrals, and so on—directly in specialized coordinate systems. Physical problems can often be solved more easily in a coordinate system that matches the symmetries of the problem.

An extremely important consideration to be aware of is that geometrical or physical phenomena (points, vectors, regions, surfaces, etc.) have a fundamental existence and meaning that is independent of any coordinate system. Coordinate systems simply give us ways to attach numbers to them. Different coordinate systems will attach different numbers to a given vector field, for example, but the vector field has an underlying geometric meaning that does not change. The various vector field operations have an underlying geometrical or physical meaning, and coordinate systems simply let us perform mathematical calculations on them.

You have already seen the notion of "geometrical universality" in the definitions of the dot product and the cross product. They were given purely geometrical definitions that can be visualized without regard to any coordinate system. Then, for any Cartesian coordinate system, these operations were defined in terms of the components of the vectors in that system, and it was shown that those results matched the purely geometrical definitions. What we are going to do next is develop the tools to define vectors, and the dot product and cross product, in any coordinate system at all. Later, we will introduce new operations (the divergence, the curl, and various integrals) and show how to calculate them in any coordinate system.

For example, Stokes' theorem makes a very powerful geometric statement about integrals of various vector fields over very general surfaces and lines. The concepts appearing in the theorem are purely geometric, and therefore independent of the coordinate system. Once we know how to manipulate those concepts (vector fields, integrals, and the "curl" operator) in any coordinate system, we will be able to choose a coordinate system that makes the surface simple. Specifically, we will change from the usual x/y/z Cartesian system to a u/v/w system for which the surface is defined by holding w=0. (Another way of looking at this is that the surface has been "parameterized" in terms of parameters u and v.) Once this is done, we will be able to calculate the curl operator, perform the integration, and prove Stokes' theorem in the u/v/w system.

As another example, Maxwell's equations make statements about the curl and divergence of the electric and magnetic field. While the formulas for curl and divergence are simpler in Cartesian coordinates than in spherical, some problems, such as the electric field in the vicinity of a point charge, have symmetry that makes spherical coordinates more natural. When we work out the divergence theorem in spherical coordinates, we will be able to solve problems of this sort.

We will use "u" and "v" in many of our general examples of alternative coordinate systems in 2 dimensions, and u, v, and w in 3 dimensions. For the very common cases of polar coordinates in 2 dimensions and spherical in 3 dimensions, we will often use the more familiar r/θ or r/θ/φ.

We always have the coordinates of each system representable as functions of the other. For example, the transformation between polar coordinates and Cartesian coordinate is this:

$x = r \cos \theta\,$
$y = r \sin \theta\,$
$r = \sqrt{x^2 + y^2}\,$
$\theta = \tan^{-1} (y/x)\,$

In spherical coordinates:

$x = r \sin \theta \cos \phi\,$
$y = r \sin \theta \sin \phi\,$
$z = r \cos \theta\,$
$r = \sqrt{x^2 + y^2 + z^2}\,$
$\theta = \tan^{-1} (\sqrt{x^2 + y^2} / z)\,$
$\phi = \tan^{-1} (y/x)\,$

## The Jacobian

Much of what happens in multivariable calculus involves the partial derivatives of the coordinates in one system with respect to the coordinates in the other. It is very useful to represent these as a matrix:

$\begin{bmatrix} \displaystyle\frac{\partial x}{\partial r} & \displaystyle\frac{\partial x}{\partial \theta} \\ \\ \displaystyle\frac{\partial y}{\partial r} & \displaystyle\frac{\partial y}{\partial \theta}\end{bmatrix}\,$

Or, going in the other direction:

$\begin{bmatrix} \displaystyle\frac{\partial r}{\partial x} & \displaystyle\frac{\partial r}{\partial y} \\ \\ \displaystyle\frac{\partial \theta}{\partial x} & \displaystyle\frac{\partial \theta}{\partial y}\end{bmatrix}\,$

This is the derivative matrix of the transformation from one coordinate system to the other. It is often called the Jacobian matrix, though some authors use the term "Jacobian" specifically to refer to the matrix's determinant. To avoid any confusion on this point, we will always put square brackets around matrices, and use absolute value signs to denote determinants.

The notation for the Jacobian matrix of u/v/w in terms of x/y/z is written:

$J(u, v, w / x, y, z) = \frac{\partial(u, v, w)}{\partial(x, y, z)} = \begin{bmatrix} \displaystyle\frac{\partial u}{\partial x} & \displaystyle\frac{\partial u}{\partial y} & \displaystyle\frac{\partial u}{\partial z} \\ \\ \displaystyle\frac{\partial v}{\partial x} & \displaystyle\frac{\partial v}{\partial y} & \displaystyle\frac{\partial v}{\partial z} \\ \\ \displaystyle\frac{\partial w}{\partial x} & \displaystyle\frac{\partial w}{\partial y} & \displaystyle\frac{\partial w}{\partial z}\end{bmatrix}$

More generally, it is:

$J(f_1,f_2,...,f_n / x_1,x_2,...,x_n) = \frac{\partial(f_1,f_2,...,f_n)}{\partial(x_1,x_2,...,x_n)} = \begin{bmatrix} \displaystyle\frac{\partial f_1}{\partial x_1} & ... & \displaystyle\frac{\partial f_1}{\partial x_n} \\ ... & & ... \\ \displaystyle\frac{\partial f_n}{\partial x_1} &...&\displaystyle\frac{\partial f_n}{\partial x_n}\end{bmatrix}$

and the determinant is:

$\begin{vmatrix} \frac{\partial f_1}{\partial x_1} & ... & \frac{\partial f_1}{\partial x_n} \\ ... & & ... \\ \frac{\partial f_n}{\partial x_1} &...&\frac{\partial f_n}{\partial x_n}\end{vmatrix}$

For polar coordinates, the Jacobian matrices are:

$J(r, \theta / x, y) = \frac{\partial(r, \theta)}{\partial(x, y)} = \begin{bmatrix} \displaystyle\frac{\partial r}{\partial x} & \displaystyle\frac{\partial r}{\partial y} \\ \\ \displaystyle\frac{\partial \theta}{\partial x} & \displaystyle\frac{\partial \theta}{\partial y} \end{bmatrix}$

and:

$J(x, y / r, \theta) = \frac{\partial(x, y)}{\partial(r, \theta)} = \begin{bmatrix} \displaystyle\frac{\partial x}{\partial r} & \displaystyle\frac{\partial x}{\partial \theta} \\ \\ \displaystyle\frac{\partial y}{\partial r} & \displaystyle\frac{\partial y}{\partial \theta} \end{bmatrix}$

Working these out, we have

$J(x, y / r, \theta) = \begin{bmatrix} \cos \theta & - r \sin \theta \\ \sin \theta & r \cos \theta \end{bmatrix}$

with determinant r, and:

$J(r, \theta / x, y) = \begin{bmatrix} \displaystyle\frac{x}{\sqrt{x^2 + y^2}} & \displaystyle\frac{y}{\sqrt{x^2 + y^2}} \\ \\ \displaystyle\frac{- y}{x^2 + y^2} & \displaystyle\frac{x}{x^2 + y^2} \end{bmatrix} = \begin{bmatrix} \cos \theta & \sin \theta \\ \\ \displaystyle\frac{- \sin \theta}{r} & \displaystyle\frac{\cos \theta}{r} \end{bmatrix}$

with determinant 1/r.

An interesting theorem is that these matrices are inverses of each other. (It follows that their determinants are reciprocals of each other.)
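
Both claims are easy to confirm numerically at a sample point: the two Jacobian matrices worked out above multiply to the identity, and their determinants are r and 1/r. A minimal NumPy sketch (the sample point is arbitrary):

```python
import numpy as np

r, theta = 2.0, 0.7  # an arbitrary sample point

# J(x, y / r, theta), as computed above
J_xy_rtheta = np.array([[np.cos(theta), -r * np.sin(theta)],
                        [np.sin(theta),  r * np.cos(theta)]])
# J(r, theta / x, y), as computed above
J_rtheta_xy = np.array([[np.cos(theta),      np.sin(theta)],
                        [-np.sin(theta) / r, np.cos(theta) / r]])

# The two Jacobian matrices are inverses of each other
assert np.allclose(J_xy_rtheta @ J_rtheta_xy, np.eye(2))

# Their determinants are r and 1/r, reciprocals of each other
assert np.isclose(np.linalg.det(J_xy_rtheta), r)
assert np.isclose(np.linalg.det(J_rtheta_xy), 1 / r)
```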

Exercise: Prove that the matrix product of the J matrices of two sequential coordinate changes is the J matrix of the combined change. That is, if

$f_1,f_2,...,f_n \to g_1,g_2,...,g_n$

and

$g_1,g_2,...,g_n \to h_1,h_2,...,h_n$

are two coordinate changes, so that

$f_1,f_2,...,f_n \to h_1,h_2,...,h_n$

is the result of doing both, then

$\frac{\partial(h_1,h_2,...,h_n)}{\partial(f_1,f_2,...,f_n)} = \frac{\partial(h_1,h_2,...,h_n)}{\partial(g_1,g_2,...,g_n)}\ \ \frac{\partial(g_1,g_2,...,g_n)}{\partial(f_1,f_2,...,f_n)}$

Hint: This is really just the chain rule for partial derivatives. The adding up of terms in the chain rule is the same as the adding up of products in a matrix multiplication.

It follows from this that, if the second change is just the inverse of the first, bringing us back to the original coordinate system, the two matrices are inverses.

The coordinates should always be chosen such that the Jacobian determinant is positive. If it isn't, exchange the order of two of the coordinates. This ensures that the coordinate system is "right handed". (Or more precisely, that it has the same handedness as the Cartesian system.) As an example, the coordinate order (r/θ) is correct for polar coordinates; (θ/r) is not. The order (r/θ/φ) is correct for spherical coordinates; (r/φ/θ) is not.

The Jacobian determinant must not be zero. (Equivalently, the Jacobian matrix must not be singular.) The astute reader will notice that this means that polar coordinates don't work at the origin, and spherical coordinates don't work at the north and south poles.

## Vectors in Arbitrary Coordinate Systems

Having gotten the preliminary concepts out of the way, we next examine what vectors, and vector operations such as the dot product, look like in arbitrary coordinate systems.

A vector has different components in different coordinate systems. (It's still the same vector, of course.) If the components of the vector $\vec{A}$, in the u/v/w coordinate system, are:

$\vec{A} = \begin{bmatrix} A_u \\ A_v \\ A_w \end{bmatrix}$

and, in the standard Cartesian system the components are:

$\vec{A} = \begin{bmatrix} A_x \\ A_y \\ A_z \end{bmatrix}$

then the transformation rule is:

$\begin{bmatrix} A_u \\ A_v \\ A_w \end{bmatrix} = \begin{bmatrix} \displaystyle\frac{\partial u}{\partial x} & \displaystyle\frac{\partial u}{\partial y} & \displaystyle\frac{\partial u}{\partial z} \\ \\ \displaystyle\frac{\partial v}{\partial x} & \displaystyle\frac{\partial v}{\partial y} & \displaystyle\frac{\partial v}{\partial z} \\ \\ \displaystyle\frac{\partial w}{\partial x} & \displaystyle\frac{\partial w}{\partial y} & \displaystyle\frac{\partial w}{\partial z}\end{bmatrix} \begin{bmatrix} A_x \\ A_y \\ A_z \end{bmatrix}$

In other words, the transformation of vector components from x/y/z to u/v/w is just a matrix multiplication by the Jacobian matrix

$\begin{bmatrix}J(u, v, w / x, y, z)\end{bmatrix}$

To go the other way, multiply by the matrix

$\begin{bmatrix}J(x, y, z / u, v, w)\end{bmatrix}$

We make a simplification to avoid having our notation get completely out of control. In keeping with our goal of expressing everything just in terms of the new coordinate system:

• The matrix that we have been calling $[J(x, y, z / u, v, w)]$ we will call just $[J]$. That is, $[J]$ for a coordinate system means [J(Cartesian / the coordinate system under discussion)].
• The matrix that goes the other way, formerly called $[J(u, v, w / x, y, z)]$, we will call $[J]^{-1}$. It's just the inverse of the first matrix.

We will call [J] the "Jacobian of the coordinate system".

The astute reader will notice that this is not a good definition. [J] depends on the choice of the "reference" Cartesian system, so we're not justified in calling it just "the Jacobian". But it turns out that it won't make any difference. We will always get the same answers, no matter what reference system we used. See the exercises.

Therefore:

$\begin{bmatrix} A_u \\ A_v \\ A_w \end{bmatrix} = [J]^{-1} \begin{bmatrix} A_x \\ A_y \\ A_z \end{bmatrix}$

and:

$\begin{bmatrix} A_x \\ A_y \\ A_z \end{bmatrix} = [J] \begin{bmatrix} A_u \\ A_v \\ A_w \end{bmatrix}$

For polar coordinates in two dimensions, we have:

$\begin{bmatrix} A_r \\ A_\theta \end{bmatrix} = \begin{bmatrix}J(r, \theta / x, y)\end{bmatrix} \begin{bmatrix} A_x \\ A_y \end{bmatrix} = \begin{bmatrix} \displaystyle\frac{\partial r}{\partial x} & \displaystyle\frac{\partial r}{\partial y} \\ \\ \displaystyle\frac{\partial \theta}{\partial x} & \displaystyle\frac{\partial \theta}{\partial y}\end{bmatrix} \begin{bmatrix} A_x \\ A_y \end{bmatrix} = \begin{bmatrix} \cos \theta & \sin \theta \\ \\ \displaystyle\frac{- \sin \theta}{r} & \displaystyle\frac{\cos \theta}{r} \end{bmatrix} \begin{bmatrix} A_x \\ A_y \end{bmatrix}$

And, going the other way:

$\begin{bmatrix} A_x \\ A_y \end{bmatrix} = [J] \begin{bmatrix} A_r \\ A_\theta \end{bmatrix} = \begin{bmatrix} \cos \theta & - r \sin \theta \\ \sin \theta & r \cos \theta \end{bmatrix} \begin{bmatrix} A_r \\ A_\theta \end{bmatrix}$

Let's look at some vectors whose components are given in general coordinates—there are some important and striking aspects of this.

In polar coordinates, a "radial basis vector" at any point has components:

$(A_r=1, A_\theta=0)\,$

Its Cartesian components (from the formulas above) are:

$(A_x=\cos \theta, A_y=\sin \theta)\,$

The actual (Cartesian) direction in which the vector points depends on its location. It always points radially outward. Its length is $\sqrt{A_x^2 + A_y^2} = \sqrt{\cos^2 \theta + \sin^2 \theta} = 1\,$. So the radial basis vector is a unit vector.

The "angular basis vector" has components:

$(A_r=0, A_\theta=1)\,$

Its Cartesian components are:

$(A_x=- r \sin \theta, A_y=r \cos \theta)\,$

It always points "laterally", at right angles to the radial direction. More surprisingly, its length is $\sqrt{A_x^2 + A_y^2} = r\,$. The angular basis vector is not a unit vector! It is bigger when farther away from the origin, even though the sum of the squares of its components is 1.[1] In general curvilinear coordinates, the "square root of the sum of the squares of the components" rule is not correct for the length of a vector. The correct rule will be given below.
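
The behavior of the two basis vectors can be confirmed numerically using the Jacobian given above. A short NumPy sketch (the sample point is arbitrary):

```python
import numpy as np

r, theta = 2.0, np.pi / 6  # an arbitrary sample point

# [J] for polar coordinates, from the transformation rule above
J = np.array([[np.cos(theta), -r * np.sin(theta)],
              [np.sin(theta),  r * np.cos(theta)]])

# Radial basis vector (A_r = 1, A_theta = 0), converted to Cartesian components
radial = J @ np.array([1.0, 0.0])
assert np.isclose(np.linalg.norm(radial), 1.0)  # a unit vector

# Angular basis vector (A_r = 0, A_theta = 1)
angular = J @ np.array([0.0, 1.0])
assert np.isclose(np.linalg.norm(angular), r)   # length r, not 1
assert np.isclose(radial @ angular, 0.0)        # perpendicular to the radial direction
```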

### The Dot Product in Arbitrary Coordinates

We need a better formula for the dot product than the rule that works for Cartesian coordinates:

$\vec{A} \cdot \vec{B} = A_x B_x + A_y B_y + A_z B_z\,$

Given two vectors $\vec{A}$ and $\vec{B}$, described in a u/v/w coordinate system, we can find their dot product by converting to Cartesian coordinates and using the simple rule that we know, which is the sum of the pairwise products of the vector components. This means that, at least momentarily, we will have to keep track of the vectors' components in both the Cartesian system, which we have been calling Ax, Ay, and Az, and in the new coordinate system, which we have been calling Au, Av, and Aw. To do the required matrix manipulations, we have to give the components numbers instead of names. Things like $A_{u_1}$, $A_{u_2}$, and $A_{u_3}$ are just too unwieldy.[2] So we will make this simplification:

• The components of a vector in the new coordinate system (formerly Au, Av, and Aw) will just be called A1, A2, and A3.
• The components of a vector in the "reference" Cartesian system will be written with a tilde: $\tilde{A_1}$, $\tilde{A_2}$, and $\tilde{A_3}$.

Once we reformulate the rules for vectors in terms of the coordinate system of interest, we will only use A1, A2, and A3 (or Au, Av, and Aw), and we will never have to look at another tilde again.

We can now calculate the dot product of $\vec{A}$ and $\vec{B}$, from their components and the Jacobian. We know that

$\vec{A} \cdot \vec{B} = \sum_i \tilde{A_i} \tilde{B_i}$

in terms of the components in the reference system. We also have the transformation rule in terms of the Jacobian matrix.

$\tilde{A_i} = \sum_j [J(x, y, z / u, v, w)]_{ij} A_j$

This is just matrix multiplication. The summation is from 1 to 3, or whatever the dimension is.

By our new convention, this is just the Jacobian matrix [J], so

$\tilde{A_i} = \sum_j [J]_{ij} A_j$
$\tilde{B_i} = \sum_k [J]_{ik} B_k$
$\vec{A} \cdot \vec{B} = \sum_i \tilde{A_i} \tilde{B_i} = \sum_{ijk} [J]_{ij} [J]_{ik} A_j B_k$

As promised, no more tildes!

We can combine the two Jacobians into one matrix, and then use that whenever we need a dot product. Let:

$[g]_{jk} = \sum_i [J]_{ij} [J]_{ik} = \sum_i [J]^t_{ji} [J]_{ik}$

That is, [g] is just the matrix product of the transpose of [J] and [J] itself.

$[g]_{jk} = ([J]^t [J])_{jk}\ \ \ or\ \ \ [g] = [J]^t [J]\,$

Then:

$\vec{A} \cdot \vec{B} = \sum_{jk} [g]_{jk} A_j B_k$

This is the rule for the dot product in general coordinates. The matrix [g] is called the metric tensor or just the metric. We will have occasion to use its determinant, which we will call just g. Actually, its square root is more commonly used.

For those who really like equations that express things in terms of matrix multiplication, this could be written as:

$\vec{A} \cdot \vec{B} = \begin{bmatrix} A_1 ... A_N \end{bmatrix} [g] \begin{bmatrix} B_1 \\ ... \\ B_N \end{bmatrix}$

The first vector is written in the form of a row instead of the more traditional column. This gives a 1xN matrix times an NxN matrix times an Nx1 matrix, which is a 1x1 matrix. The single value in that matrix is the dot product.

To review what we have for two popular systems, in 2-dimensional polar coordinates:

$[J] = \begin{bmatrix} \cos \theta & - r \sin \theta \\ \sin \theta & r \cos \theta \end{bmatrix}$
$[J]^{-1} = \begin{bmatrix} \cos \theta & \sin \theta \\ \\ \displaystyle\frac{- \sin \theta}{r} & \displaystyle\frac{\cos \theta}{r} \end{bmatrix}$
$[g] = \begin{bmatrix} 1 & 0 \\ 0 & r^2 \end{bmatrix}$
$\sqrt{g} = r$

For spherical coordinates in 3 dimensions (we will need these soon):

$[J] = \begin{bmatrix} \sin \theta \cos \phi & r \cos \theta \cos \phi & - r \sin \theta \sin \phi \\ \sin \theta \sin \phi & r \cos \theta \sin \phi & r \sin \theta \cos \phi \\ \cos \theta & - r \sin \theta & 0 \end{bmatrix}$
$[J]^{-1} = \begin{bmatrix} \sin \theta \cos \phi & \sin \theta \sin \phi & \cos \theta \\ \\ \displaystyle\frac{\cos \theta \cos \phi}{r} & \displaystyle\frac{\cos \theta \sin \phi}{r} & - \displaystyle\frac{\sin \theta}{r} \\ \\ - \displaystyle\frac{\sin \phi}{r \sin \theta} & \displaystyle\frac{\cos \phi}{r \sin \theta} & 0 \end{bmatrix}$
$[g] = \begin{bmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2 \sin^2 \theta \end{bmatrix}$
$\sqrt{g} = r^2 \sin \theta$
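The spherical metric above can also be checked symbolically. This is a sketch using SymPy (not part of the original notes): build $[J]$ from the coordinate transformation, form $[J]^t [J]$, and simplify.

```python
import sympy as sp

r, theta, phi = sp.symbols('r theta phi', positive=True)

# Spherical -> Cartesian transformation.
x = r * sp.sin(theta) * sp.cos(phi)
y = r * sp.sin(theta) * sp.sin(phi)
z = r * sp.cos(theta)

# Jacobian [J] with respect to (r, theta, phi).
J = sp.Matrix([x, y, z]).jacobian([r, theta, phi])

# [g] = [J]^t [J]; trigonometric simplification collapses it to a
# diagonal matrix with entries 1, r^2, r^2 sin^2(theta).
g = sp.simplify(J.T * J)
print(g)
```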

So, in polar coordinates:

$\vec{A} \cdot \vec{B} = A_1 B_1 + r^2 A_2 B_2$

or, going back to giving the coordinates names:

$\vec{A} \cdot \vec{B} = A_r B_r + r^2 A_\theta B_\theta$

In spherical coordinates:

$\vec{A} \cdot \vec{B} = A_r B_r + r^2 A_\theta B_\theta + r^2 \sin^2 \theta A_\phi B_\phi$

The metric is always a symmetric matrix. (Exercise: Prove this. Use the definition in terms of $[J]^t$ and $[J]$.) For all coordinate systems whose coordinate lines cross at right angles everywhere (which includes all the common systems), the metric is a diagonal matrix: the only nonzero elements are on the main diagonal.

### The Length of a Vector in Arbitrary Coordinates

The length of a vector (sometimes called the "norm") is the square root of the dot product of the vector with itself. Note that this is defined in terms of something that we have defined in a purely geometrical way, independent of any coordinate system. Note also that it is consistent with our earlier definition of the dot product of two vectors as the product of their lengths, times the cosine of the angle between them.

The length of a vector is usually written with a sort of double absolute value sign: $\Vert\vec{A}\Vert$.

In polar coordinates, we have:

$\Vert\vec{A}\Vert = \sqrt{A_r^2 + r^2 A_\theta^2}$

In spherical coordinates:

$\Vert\vec{A}\Vert = \sqrt{A_r^2 + r^2 A_\theta^2 + r^2 \sin^2 \theta A_\phi^2}$
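For a concrete instance, the basis vector $\hat{\theta}$ itself has spherical components $(0, 1, 0)$, so its length is

$\Vert\hat{\theta}\Vert = \sqrt{0 + r^2 \cdot 1^2 + 0} = r$

which is consistent with the geometric fact that moving through a small angle $\Delta\theta$ on a sphere of radius r carries you a distance $r\,\Delta\theta$.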

### The Cross Product in Arbitrary Coordinates

The rule for the cross product in the general u/v/w coordinate system makes use of all the things we have been developing. First, make a vector from the determinant with basis vectors (in the u/v/w system!) across the top, the way we did previously:

$\vec{P}=\begin{vmatrix}\hat{u}&\hat{v}&\hat{w}\\ A_u & A_v & A_w \\ B_u & B_v & B_w \end{vmatrix}$

If the coordinate system is Cartesian, we are done. Otherwise, apply the inverse of [g] to it, and then multiply by $\sqrt{g}$. (Remember that just "g", without brackets, is the determinant of the matrix.)

$\vec{A} \times \vec{B} = \sqrt{g}\ [g]^{-1}\ \vec{P} = \sqrt{g}\ [g]^{-1}\ \begin{vmatrix}\hat{u}&\hat{v}&\hat{w}\\ A_u & A_v & A_w \\ B_u & B_v & B_w \end{vmatrix}$

In the case that [g] is a diagonal matrix, this can be simplified. [g] is a diagonal matrix for any "orthogonal" coordinate system, that is, one in which the coordinate lines always cross at right angles. This is true for all common coordinate systems, including polar, spherical, and cylindrical, so we will frequently make use of this simplification.

If [g] is diagonal, the only nonzero elements are $g_{11}$, $g_{22}$, and $g_{33}$. The only nonzero elements of $[g]^{-1}$ are $\frac{1}{g_{11}}$, $\frac{1}{g_{22}}$, and $\frac{1}{g_{33}}$. In this case, the cross product formula is:

$\vec{A} \times \vec{B} = \sqrt{g_{11}g_{22}g_{33}}\ \begin{vmatrix}\frac{1}{g_{11}}\hat{u}&\frac{1}{g_{22}}\hat{v}&\frac{1}{g_{33}}\hat{w}\\ A_u & A_v & A_w \\ B_u & B_v & B_w \end{vmatrix} = \begin{vmatrix}\sqrt{\frac{g_{22}g_{33}}{g_{11}}}\hat{u}&\sqrt{\frac{g_{11}g_{33}}{g_{22}}}\hat{v}&\sqrt{\frac{g_{11}g_{22}}{g_{33}}}\hat{w}\\ A_u & A_v & A_w \\ B_u & B_v & B_w \end{vmatrix}$

In spherical coordinates, this becomes:

$\vec{A} \times \vec{B} = \begin{vmatrix}r^2 \sin \theta\ \hat{r}&\sin \theta\ \hat{\theta}&\frac{1}{\sin \theta}\hat{\phi}\\ A_r & A_\theta & A_\phi \\ B_r & B_\theta & B_\phi \end{vmatrix}$

Example: Suppose $\vec{A}$ is a radial basis vector:

$\vec{A} = \hat{r} = \begin{bmatrix}1 \\ 0 \\ 0 \end{bmatrix}$

(The length of $\vec{A}$ is 1.) Let $\vec{B}$ be a θ-pointing basis vector:

$\vec{B} = \hat{\theta} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$

(The length of $\vec{B}$ is r.) $\vec{B}$ is a vector at the surface of a sphere of radius r that points "south", tangential to the surface. $\vec{A}$ points outward from the surface, that is, "up". $\vec{A}$ and $\vec{B}$ are orthogonal. (Prove it; calculate $\vec{A} \cdot \vec{B}$.)

By the formula for the cross product, we have:

$\vec{A} \times \vec{B} = \begin{vmatrix}r^2 \sin \theta\ \hat{r}&\sin \theta\ \hat{\theta}&\frac{1}{\sin \theta}\hat{\phi}\\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{vmatrix} = \frac{1}{\sin \theta}\ \hat{\phi} = \begin{bmatrix} 0 \\ 0 \\ \frac{1}{\sin \theta} \end{bmatrix}$

This points "east" at the surface of the sphere of radius r, is perpendicular to the other two vectors, and has length r, which is the product of the lengths of the other two vectors. The geometrical properties of the cross product (perpendicular to each of the given vectors, and of length equal to the product of their lengths times the sine of the angle between them) are obeyed. The cross product, like all other vector operations, is a geometric invariant. It doesn't matter what coordinate system was used to calculate the components—the answer is the same.

### The Divergence in Arbitrary Coordinates

The divergence and curl will be discussed in detail in lecture 6. But, to give a flavor of how the [g] metric matrix is used, we give the definitions here.

Here's the definition of the divergence operator in arbitrary curvilinear coordinates:

$\nabla \cdot \vec{A} = \frac{1}{\sqrt{g}} \sum_i \frac{\partial}{\partial x_i}\ (\sqrt{g}\ A_i) = \sum_i \frac{\partial A_i}{\partial x_i} + \frac{1}{\sqrt{g}}\ \sum_i \frac{\partial \sqrt{g}}{\partial x_i}\ A_i$

(The second form follows from the first due to the product rule for derivatives.) In practice, it's not as forbidding as it looks, for vector fields that are well suited to the particular coordinate system.
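As one worked example (a SymPy sketch; the example field is my own choice, not from the notes): the position field has spherical components $(r, 0, 0)$ and Cartesian form $(x, y, z)$, so its divergence should come out to 3.

```python
import sympy as sp

r, theta, phi = sp.symbols('r theta phi', positive=True)
coords = [r, theta, phi]
sqrt_g = r**2 * sp.sin(theta)  # sqrt(g) for spherical coordinates

# The position field r * r-hat: spherical components (r, 0, 0).
A = [r, 0, 0]

# Divergence: (1 / sqrt(g)) * sum_i d/dx_i (sqrt(g) * A_i).
div = sp.simplify(
    sum(sp.diff(sqrt_g * A_i, x_i) for A_i, x_i in zip(A, coords)) / sqrt_g
)
print(div)  # 3
```

Only the r term survives the sum, giving $\frac{1}{r^2 \sin\theta} \frac{\partial}{\partial r}(r^3 \sin\theta) = 3$, matching the Cartesian computation.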


### The Curl in Arbitrary Coordinates

Here's the definition of the curl operator in arbitrary curvilinear coordinates:

First, to find the curl of a vector field $\vec{A}$, let the matrix [g] operate on $\vec{A}$, yielding $\vec{B}$:

$\vec{B} = [g] \vec{A}$
Special case: If the coordinate system is orthogonal, so that [g] is diagonal, this is very easy:
$B_u = g_{11} A_u\ \ \ B_v = g_{22} A_v\ \ \ B_w = g_{33} A_w$

Then the curl is:

$\nabla \times \vec{A} = \frac{1}{\sqrt{g}} \begin{vmatrix} \hat{u} & \hat{v} & \hat{w} \\ \\ \displaystyle\frac{\partial}{\partial u} & \displaystyle\frac{\partial}{\partial v} & \displaystyle\frac{\partial}{\partial w} \\ \\ B_u & B_v & B_w \end{vmatrix}$

Or, if the coordinate system is orthogonal:

$\nabla \times \vec{A} = \frac{1}{\sqrt{g}} \begin{vmatrix} \hat{u} & \hat{v} & \hat{w} \\ \\ \displaystyle\frac{\partial}{\partial u} & \displaystyle\frac{\partial}{\partial v} & \displaystyle\frac{\partial}{\partial w} \\ \\ g_{11}A_u & g_{22}A_v & g_{33}A_w \end{vmatrix}$
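Here is one check of the recipe (a SymPy sketch; the rotation field is my own choice of example). The Cartesian field $(-y, x, 0)$ has Cartesian curl $(0, 0, 2)$; in spherical components it is simply $(0, 0, 1)$, that is, $\hat{\phi}$.

```python
import sympy as sp

r, theta, phi = sp.symbols('r theta phi', positive=True)
coords = [r, theta, phi]

# Spherical -> Cartesian map, Jacobian, metric, and sqrt(g).
x = r * sp.sin(theta) * sp.cos(phi)
y = r * sp.sin(theta) * sp.sin(phi)
z = r * sp.cos(theta)
J = sp.Matrix([x, y, z]).jacobian(coords)
g = sp.simplify(J.T * J)
sqrt_g = r**2 * sp.sin(theta)

# The rotation field F = (-y, x, 0); its Cartesian curl is (0, 0, 2).
F = sp.Matrix([-y, x, 0])

# Spherical components of F, then B = [g] A.
A = sp.simplify(J.inv() * F)
B = sp.simplify(g * A)

# Expand the determinant with the partials acting on (B_u, B_v, B_w).
curl_sph = sp.Matrix([
    sp.diff(B[2], theta) - sp.diff(B[1], phi),
    sp.diff(B[0], phi) - sp.diff(B[2], r),
    sp.diff(B[1], r) - sp.diff(B[0], theta),
]) / sqrt_g

# Convert back to Cartesian components; the answer should be (0, 0, 2).
curl_cartesian = sp.simplify(J * curl_sph)
print(curl_cartesian.T)
```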

# Problems

## Review Problems

1. Find an antiderivative of $\sin(x)e^{2x} - 1$.

2. Find the Taylor series at 0 of $f(x) = e^{-1/x^2}$ for $x \neq 0$, with $f(0)=0$. What do you notice about this sum?

3. Sketch level curves of z = cos(x) + sin(y) in the xy plane.

4. Describe the level surfaces of $f(x,y,z) = x^2 + y^2 + z^2/4$.

## Main Problems

1. If $\vec{u} = 2\vec{i} - (1/2)\vec{j}$ and $\vec{w}=-\vec{i}+ (1/3)\vec{j}$, find $\vec{u}+\vec{w}$, $\vec{u}-\vec{w}$, and $\vec{u}\cdot\vec{w}$. Moving to three dimensional space now, what can you say about $\vec{u}\times\vec{w}$ before you even compute it? Carry out the computation. Was your prediction correct?

2. Let $\vec{u}=4\vec{i}-2\vec{j}+\vec{k}$. Find a spherical coordinate expression for $\vec{u}$.

3. Let $\vec{F}(r,\theta,\phi) = (-r^{-1})(\vec{i}+\vec{j}+\vec{k})$. Describe $\vec{F}$ in words. Find a formula for $\vec{F}$ in Cartesian coordinates.

4. Find the partial derivatives $\frac{\partial}{\partial x}$ and $\frac{\partial}{\partial y}$ of $3e^{xy}\sin(y) + 2x$.

5. Find the partial derivatives $\partial_x, \partial_y$ and $\partial_z$ of $\frac{x}{y+z} + \frac{y}{x+z}$.

## Challenging Problems

1. Prove that the geometric definition of $\vec{a}\cdot\vec{b}$ matches our coordinate expression.

2. Prove that the geometric definition of $\vec{a}\times\vec{b}$ matches our coordinate expression.

3. Suppose f(x,y) has continuous derivatives of all orders. Prove $\partial_{xy}f = \partial_{yx}f$.

## References

1. Some textbooks use a definition of vector components such that basis vectors always have length 1. Such a treatment imposes the requirement that the coordinate lines be perpendicular everywhere. The formulas for vector operations are more complicated and less elegant when this is done.
2. This is a well known notational problem in multivariable calculus. It is one of the reasons why tensor algebra textbooks are so hard to read.