Analysis, Part IV: Several Variable Differential Calculus
October 17th, 2017
Derivatives in Several Variables
Recall that the definition of the derivative in a single variable is:
$$ f'(x) = \lim\limits_{h \rightarrow 0} \frac{f(x+h) - f(x)}{h} $$
However, this definition does not generalize directly to higher dimensions: the quantity on the right-hand side would be a quotient of vectors. Instead, we note that we can rewrite the above as:
$$ 0 = \lim\limits_{h \rightarrow 0} \frac{|f(x+h) - f(x) - f'(x)h|}{|h|} $$
We generalize the definition as follows.
Definition: Let $E$ be a subset of $\mathbb{R}^n$ and let $f: E\rightarrow \mathbb{R}^m$ be a function. Then we say that $f$ is differentiable at $x_0 \in \text{int}(E)$ if there exists a linear transformation $L: \mathbb{R}^n \rightarrow \mathbb{R}^m$ so that:
$$ 0 = \lim\limits_{h \rightarrow 0} \frac{|f(x_0+h) - f(x_0) - Lh|}{|h|} $$
Because limits are unique, we can show (with some work) that this definition makes sense, i.e. $L$ is unique at each point where it exists; we then write $L = f'(x_0)$ and call it the derivative of $f$ at $x_0$.
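Example: As a quick sanity check against the definition, consider a linear map $f(x) = Ax$ for a fixed $m \times n$ matrix $A$. Taking $L = A$, the numerator vanishes identically:
$$ \frac{|f(x_0+h) - f(x_0) - Ah|}{|h|} = \frac{|Ax_0 + Ah - Ax_0 - Ah|}{|h|} = 0 $$
so $f$ is differentiable everywhere, with $f'(x_0) = A$ at every point $x_0$.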
Partial and Directional Derivatives
Definition: Let $E\subset \mathbb{R}^n$ and let $f$ be a function $f: E \rightarrow \mathbb{R}^m$, with $x_0 \in \text{int}(E)$. Let $v$ be a vector in $\mathbb{R}^n$. Then if the following limit exists, it is called the directional derivative of $f$ in the direction of $v$, denoted $D_vf(x_0)$:
$$ \lim \limits_{t \rightarrow 0} \frac{f(x_0+tv) - f(x_0)}{t} $$
Note that this object is a vector in the codomain $\mathbb{R}^m$, not a linear transformation as the derivative is.
One special type of directional derivative is a partial derivative.
Definition: The partial derivative $\frac{\partial f}{\partial x_j}$, sometimes denoted $f_{x_j}$, is the directional derivative of $f$ in the direction of $e_j$, the $j$th vector in the standard basis for $\mathbb{R}^n$.
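Example: For $f: \mathbb{R}^2 \rightarrow \mathbb{R}$ given by $f(x, y) = x^2y$, plugging $v = e_1$ and $v = e_2$ into the limit above yields
$$ \frac{\partial f}{\partial x}(x, y) = 2xy, \qquad \frac{\partial f}{\partial y}(x, y) = x^2 $$
In other words, a partial derivative differentiates in one variable while holding the others fixed.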
Proposition
Let $E\subset \mathbb{R}^n$ and let $f$ be a function $f: E \rightarrow \mathbb{R}^m$, with $x_0 \in \text{int}(E)$. Let $v$ be a vector in $\mathbb{R}^n$. If $f$ is differentiable at $x_0$, then the directional derivative $D_vf(x_0)$ exists for every $v$, and in particular:
$$ D_vf(x_0) = f'(x_0)v $$
Thus, in particular, if $f$ is differentiable, then we can recover the columns of its derivative. Note that from the above we have:
$$ f_{x_j}(x_0) = f'(x_0)(e_j) $$
The quantity on the right is exactly the $j$th column of $f'(x_0)$. Thus, if a function is differentiable, then its partial derivatives all exist and they completely describe the derivative. However, the converse is not true; there are many examples of functions which have partial derivatives but which are not differentiable, such as the one below.
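Example: The classical counterexample is $f: \mathbb{R}^2 \rightarrow \mathbb{R}$ defined by $f(0, 0) = 0$ and
$$ f(x, y) = \frac{xy}{x^2+y^2} \quad \text{for } (x, y) \ne (0, 0) $$
Since $f$ vanishes on both coordinate axes, $f_x(0,0) = f_y(0,0) = 0$; but $f(x, x) = \frac{1}{2}$ for all $x \ne 0$, so $f$ is not even continuous at the origin, let alone differentiable there.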
At times, $f'(x_0)$ is used to denote the derivative as a linear transformation, while $Df(x_0)$ represents the matrix of the derivative with respect to some basis; in practice, it is harmless to use the two interchangeably. The stronger condition that guarantees the converse is as follows.
Theorem
Let $E\subset \mathbb{R}^n$ and let $f$ be a function $f: E \rightarrow \mathbb{R}^m$, with $x_0 \in \text{int}(E)$. If the partial derivatives of $f$ exist and are continuous in a neighborhood of $x_0$ contained in $E$, then $f$ is differentiable at $x_0$, and the $j$th column of the derivative is exactly the $j$th partial derivative $f_{x_j}(x_0)$.
Alternatively, we could describe the rows of the derivative instead of the columns. Let $f = (f_1, \dots, f_m)$. Then we can write:
$$ f'(x_0) = \begin{bmatrix}\nabla f_1(x_0) \\ \vdots \\ \nabla f_m(x_0)\end{bmatrix} $$
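Example: For $f: \mathbb{R}^2 \rightarrow \mathbb{R}^2$ given by $f(x, y) = (x^2y, \sin(xy))$, the rows of the derivative are the gradients of the two components:
$$ f'(x, y) = \begin{bmatrix} 2xy & x^2 \\ y\cos(xy) & x\cos(xy) \end{bmatrix} $$
All four partial derivatives are continuous on $\mathbb{R}^2$, so by the theorem $f$ is differentiable everywhere.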
The Chain Rule
There is also a suitable analogue of the chain rule.
Theorem (Chain Rule)
Let $E\subset \mathbb{R}^n$ and let $f$ be a function $f: E \rightarrow F \subset \mathbb{R}^m$, with $x_0 \in \text{int}(E)$. Let $g: F \rightarrow \mathbb{R}^p$. Suppose that $f$ is differentiable at $x_0$ and $g$ is differentiable at $f(x_0)$. Then $g\circ f: E\rightarrow \mathbb{R}^p$ is also differentiable at $x_0$, and:
$$ (g\circ f)'(x_0) = g'(f(x_0))\cdot f'(x_0) $$
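Example: Let $f: \mathbb{R} \rightarrow \mathbb{R}^2$ be $f(t) = (\cos t, \sin t)$ and $g: \mathbb{R}^2 \rightarrow \mathbb{R}$ be $g(x, y) = x^2 + y^2$. Then
$$ (g\circ f)'(t) = g'(f(t))\cdot f'(t) = \begin{bmatrix} 2\cos t & 2\sin t \end{bmatrix}\begin{bmatrix} -\sin t \\ \cos t \end{bmatrix} = 0 $$
which is consistent with the fact that $g\circ f \equiv 1$ is constant.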
Clairaut’s Theorem
We similarly define what it means for a function to be twice differentiable.
Definition: Let $E \subset \mathbb{R}^n$ be an open set, and $f: E \rightarrow \mathbb{R}^m$. Then $f$ is continuously differentiable (also written $C^1$) if the partial derivatives all exist and are continuous. $f$ is twice continuously differentiable ($C^2$) if it is continuously differentiable, and the partial derivatives (also being functions from $E$ to $\mathbb{R}^m$) are themselves continuously differentiable. We have $C^2 \subset C^1$.
We can generalize this definition as follows: a function $f$ is $C^k$ if its partial derivatives up to order $k$ all exist and are continuous. The following fact is immensely useful throughout calculus:
Theorem (Clairaut’s Theorem)
Let $E \subset \mathbb{R}^n$ be open, and let $f: E \rightarrow \mathbb{R}^m$ be a twice continuously differentiable function on $E$. Then for each $1\leq i, j \leq n$ and each $x_0 \in E$:
$$ (f_{x_i})_{x_j}(x_0) = (f_{x_j})_{x_i}(x_0) $$
In other words, we can swap the order of differentiation.
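Example: For $f(x, y) = x^3y^2$ we have $f_x = 3x^2y^2$ and $f_y = 2x^3y$, so
$$ (f_x)_y = 6x^2y = (f_y)_x $$
as the theorem predicts; $f$ is a polynomial and hence $C^2$ (indeed $C^\infty$).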
The Inverse & Implicit Function Theorems
Recall that if a function $f:\mathbb{R} \rightarrow \mathbb{R}$ is invertible, differentiable, and $f'(x_0)$ is nonzero, then $f^{-1}$ is differentiable at $f(x_0)$, and:
$$ (f^{-1})'(f(x_0)) = \frac{1}{f'(x_0)} $$
In particular, note that we really only need that $f$ is continuously differentiable; invertibility comes for free. If $f'(x_0)$ is nonzero, then $f'(x_0)$ is either strictly positive or strictly negative, and by the continuity of $f'$, the same is true of $f'$ throughout a small enough neighborhood of $x_0$. In either case, $f$ is strictly monotone near $x_0$, and hence invertible once we pick small enough neighborhoods around $x_0$ and $f(x_0)$, respectively.
The higher-dimensional analogue of this theorem is as follows:
Theorem (Inverse Function Theorem)
Let $E$ be an open subset of $\mathbb{R}^n$, and $f: E \rightarrow \mathbb{R}^n$ be a function which is continuously differentiable on $E$ (i.e. its partial derivatives exist and are continuous). Suppose $x_0 \in E$ and $f'(x_0): \mathbb{R}^n \rightarrow \mathbb{R}^n$ is invertible.
Then, there exists a neighborhood $U \subset E$ of $x_0$ and a neighborhood $V \subset \mathbb{R}^n$ of $f(x_0)$ such that $f$ is a bijection from $U$ to $V$ (i.e. there exists an inverse $f^{-1}: V \rightarrow U$).
Finally, $f^{-1}$ is differentiable at $f(x_0)$, and $(f^{-1})'(f(x_0)) = (f'(x_0))^{-1}$.
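Example: Consider the polar coordinate map $f(r, \theta) = (r\cos\theta, r\sin\theta)$, whose derivative is
$$ f'(r, \theta) = \begin{bmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{bmatrix}, \qquad \det f'(r, \theta) = r $$
At any point with $r \ne 0$, the theorem therefore provides a local inverse, even though $f$ is not globally injective (it is $2\pi$-periodic in $\theta$).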
There is also an analogue for implicit differentiation which follows from the inverse function theorem.
Theorem (Implicit Function Theorem)
Let $E \subset \mathbb{R}^n$ be an open set, and $f: E\rightarrow \mathbb{R}$ be continuously differentiable. Let $y=(y_1, \dots, y_n)$ be a point in $E$ such that $f(y) = 0$ and $f_{x_n}(y) \ne 0$.
Then, there exists an open subset $U$ of $\mathbb{R}^{n-1}$ containing $(y_1, \dots, y_{n-1})$, an open subset $V$ of $E$ containing $y$, and a function $g: U \rightarrow \mathbb{R}$, such that:
$$ g(y_1, \dots, y_{n-1})= y_n $$
Furthermore, near $y$, the zero set of $f$ is exactly the graph of $g$ over $U$; that is, $\{x \in V : f(x) = 0\} = \{(u, g(u)) : u \in U\}$. Finally, $g$ is differentiable with derivative:
$$g_{x_j}(y_1, \dots, y_{n-1}) = -\frac{f_{x_j}(y)}{f_{x_n}(y)}$$
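Example: Take $f(x, y) = x^2 + y^2 - 1$ and a point $(a, b)$ on the unit circle with $b \ne 0$, so that $f_y(a, b) = 2b \ne 0$. The theorem yields a function $g$ with $f(x, g(x)) = 0$ near $a$ (explicitly, $g(x) = \sqrt{1-x^2}$ when $b > 0$), and the formula gives
$$ g'(x) = -\frac{f_x(x, g(x))}{f_y(x, g(x))} = -\frac{2x}{2g(x)} = -\frac{x}{g(x)} $$
matching what we get by differentiating $\sqrt{1-x^2}$ directly.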