Gradient
Revised
17 Jun 2023
[LINEAR APPROXIMATION]
Univariable Calculus (Calculus 1)
\( \begin{aligned} f &= f (x) \\ \Delta f &\approx f'(x) \Delta x \\ \end{aligned} \)
Multivariable Calculus (Calculus 2/3)
\( \begin{aligned} f &= f(x, y, z) \\ \Delta f &\approx f_x \Delta x + f_y \Delta y + f_z \Delta z \\ &= \langle f_x, f_y, f_z \rangle \cdot \langle \Delta x, \Delta y, \Delta z \rangle \\ &= \nabla f \cdot \Delta \mathbf{x} \\ \end{aligned} \)
\( \begin{aligned} \nabla f &= \langle f_x, f_y, f_z \rangle && \text{the gradient vector of} \, f \\ \Delta \mathbf{x} &= \langle \Delta x, \Delta y, \Delta z \rangle && \text{the displacement vector} \\ \end{aligned} \)
Let \((x, y)\) be a point close to \((a, b)\) such that \(\|\langle x - a, y - b \rangle\|\) is small.
\( \begin{aligned} \Delta f &\approx f_x(a, b) \Delta x + f_y(a, b) \Delta y \\ f(x, y) - f(a, b) &\approx f_x(a, b)(x - a) + f_y(a, b)(y - b) \\ f(x, y) &\approx \underbrace{f_x(a, b)}_{m}(x - a) + \underbrace{f_y(a, b)}_{n}(y - b) + \underbrace{f(a, b)}_{c} \\ z &\approx m(x - a) + n(y - b) + c \end{aligned} \)
This is a plane since the equation is linear.
\( \begin{aligned} z &= m(x - a) + n(y - b) + c \\ 0 &= mx - ma + ny - nb + c - z \\ mx + ny - z &= \underbrace{ma + nb - c}_{\text{constant} \, d} \\ d &= mx + ny - z \\ &= \underbrace{\langle m, n, -1 \rangle}_{\text{normal vector} \, \mathbf{n}} \cdot \langle x, y, z \rangle \end{aligned} \)
This tangent plane will approximate the function near \((a, b)\).
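The normal-vector form of the plane can be checked numerically. The sketch below is only an illustration: the surface \(f(x, y) = x^2 + y^2\), the point of tangency \((1, 2)\), and the helper names `plane_z` and `dot` are assumptions, not part of these notes. It confirms that every point on the tangent plane satisfies \(\langle m, n, -1 \rangle \cdot \langle x, y, z \rangle = d\).

```python
# Assumed sample surface z = f(x, y) = x^2 + y^2, touched at (a, b) = (1, 2).
a, b = 1.0, 2.0
m = 2 * a          # f_x(a, b)
n = 2 * b          # f_y(a, b)
c = a**2 + b**2    # f(a, b)

normal = (m, n, -1.0)    # normal vector <m, n, -1>
d = m * a + n * b - c    # the constant d = ma + nb - c


def plane_z(x, y):
    # Height of the tangent plane above (x, y).
    return m * (x - a) + n * (y - b) + c


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


# Each point on the plane should give the same dot product with the normal.
for x, y in [(1.0, 2.0), (0.0, 0.0), (3.5, -1.25)]:
    point = (x, y, plane_z(x, y))
    print(point, dot(normal, point), d)
```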
\( \begin{aligned} \Delta f &\approx \nabla f(a, b, c) \cdot \Delta \mathbf{x} \\ f(x, y, z) - f(a, b, c) &\approx \nabla f(a, b, c) \cdot \langle x - a, y - b, z - c \rangle \\ \end{aligned} \)
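The gradient therefore gives a linear approximation of the change in \(f\) for any small displacement \(\Delta \mathbf{x}\). Because \(\Delta \mathbf{x}\) can point in any direction, we can use it to find rates of change in directions other than the x-, y-, and z-directions.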
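To see how good the linear approximation is, the following sketch compares \(f(a, b) + \nabla f(a, b) \cdot \Delta \mathbf{x}\) with the true value for shrinking displacements. The function \(f(x, y) = x^2 + y^2\), the base point \((1, 2)\), and the names `f`, `grad_f`, and `linear_approx` are assumptions chosen for illustration.

```python
def f(x, y):
    # Sample function (an assumption for illustration).
    return x**2 + y**2


def grad_f(x, y):
    # Hand-computed partials: f_x = 2x, f_y = 2y.
    return (2 * x, 2 * y)


def linear_approx(a, b, x, y):
    # f(x, y) ~ f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b)
    fx, fy = grad_f(a, b)
    return f(a, b) + fx * (x - a) + fy * (y - b)


a, b = 1.0, 2.0
for step in (0.1, 0.01, 0.001):
    x, y = a + step, b + step
    error = abs(f(x, y) - linear_approx(a, b, x, y))
    print(f"step={step}: error={error:.2e}")
```

For this sample function the error is exactly \((\Delta x)^2 + (\Delta y)^2\), so it shrinks about a hundredfold each time the step shrinks tenfold.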
[LOCAL LINEARITY]
Calculus 1
Near \(x=a\), \(f(x)\) is approximately equal to its tangent line.
Calculus 2/3
For a two-variable function \(f(x, y)\), the function is approximately equal to a tangent plane near the point of tangency.
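Local linearity also shows up in level curves: zoomed in far enough around a point where the gradient is nonzero, the level curves of a function such as \(f(x, y) = x^2 + y^2\) look like parallel, evenly spaced straight lines.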
[example]
\( \begin{aligned} f &= f(x, y, z) \\ f(1, 2, 3) &= 10 \\ f_x(1, 2, 3) &= 4 \\ f_y(1, 2, 3) &= -5 \\ f_z(1, 2, 3) &= 6 \\ f(1.5, 2.1, 2.75) &\approx \,\,\, ? && \text{approximate}\\ \end{aligned} \)
\( \begin{aligned} \Delta x &= 1.5 - 1 = 0.5 \\ \Delta y &= 2.1 - 2 = 0.1 \\ \Delta z &= 2.75 - 3 = -0.25 \\ \end{aligned} \)
\( \begin{aligned} \Delta f &\approx f_x \Delta x + f_y \Delta y + f_z \Delta z \\ &= (4)(0.5) + (-5)(0.1) + (6)(-0.25) \\ &= 2 + (-0.5) + (-1.5) \\ &= 0 \\ f(1.5, 2.1, 2.75) &\approx f(1, 2, 3) + \Delta f = 10 + 0 = 10 \\ \end{aligned} \)
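The arithmetic in this example can be reproduced with a few lines of Python. This is a minimal sketch using only the values given above; the variable names are illustrative. Adding \(\Delta f\) to \(f(1, 2, 3) = 10\) gives the estimate of \(f(1.5, 2.1, 2.75)\).

```python
import math

f_at_point = 10.0               # f(1, 2, 3)
gradient = (4.0, -5.0, 6.0)     # (f_x, f_y, f_z) at (1, 2, 3)
base = (1.0, 2.0, 3.0)
target = (1.5, 2.1, 2.75)

# Displacement vector <dx, dy, dz>
displacement = tuple(t - b for t, b in zip(target, base))

# Delta f ~ gradient . displacement
delta_f = sum(g * d for g, d in zip(gradient, displacement))
estimate = f_at_point + delta_f

# Floating-point subtraction (2.1 - 2) is not exact, so compare with a tolerance.
print(math.isclose(delta_f, 0.0, abs_tol=1e-12))
print(math.isclose(estimate, 10.0))
```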
[DEL OPERATOR VECTOR]
Del, \(\nabla\), is a vector whose components are partial-derivative operators.
\( \begin{aligned} \nabla &= \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle \end{aligned} \)
[GRADIENT VECTOR OF A FUNCTION F]
Since \(f\) is a scalar function, applying \(\nabla\) to \(f\) works like multiplying the vector \(\nabla\) by a scalar: each component operator acts on \(f\).
\( \begin{aligned} \nabla f &= \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle f \\ &= \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right\rangle \\ \end{aligned} \)
\(\nabla f(x, y, z)\) is a vector-valued function that gives the gradient vector at each point \((x, y, z)\).
[GRADIENT VECTOR OF A FUNCTION AT A POINT]
\(\nabla f(a, b, c)\) is the gradient at the point \((a, b, c)\)
[example]
\( \begin{aligned} f &= xy^2 - 2yz^2 \\ \nabla f &= \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \right\rangle (xy^2 - 2yz^2) \\ &= \underbrace{\langle y^2, 2xy - 2z^2, -4yz \rangle}_{\text{gradient field}} \\ \end{aligned} \)
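One way to sanity-check a hand-computed gradient is to compare it with central finite differences. The sketch below does this for the example \(f = xy^2 - 2yz^2\); the helper `numerical_grad`, the step size `h`, and the test point \((1, 2, 3)\) are assumptions made for illustration.

```python
def f(x, y, z):
    return x * y**2 - 2 * y * z**2


def grad_f(x, y, z):
    # Partials from the example: <y^2, 2xy - 2z^2, -4yz>
    return (y**2, 2 * x * y - 2 * z**2, -4 * y * z)


def numerical_grad(func, point, h=1e-6):
    # Central differences, perturbing one coordinate at a time.
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((func(*forward) - func(*backward)) / (2 * h))
    return tuple(grad)


point = (1.0, 2.0, 3.0)
print(grad_f(*point))             # analytic gradient at the point
print(numerical_grad(f, point))   # should agree to several decimal places
```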
[DIRECTIONAL DERIVATIVE]
\( \begin{aligned} \frac{\partial f}{\partial \mathbf{u}} &= D_\mathbf{u}f \end{aligned} \)
where \(D_\mathbf{u}\) is the derivative in the \(\mathbf{u}\) direction.
\( \begin{aligned} \mathbf{a} &= (a, b, c) \\ D_\mathbf{u}f(\mathbf{a}) &\approx \frac{f(\mathbf{a} + h\mathbf{u}) - f(\mathbf{a})}{\| h\mathbf{u} \|} \\ &= \frac{\Delta f}{\| h\mathbf{u} \|} \\ &\approx \frac{\nabla f(\mathbf{a}) \cdot (\cancel{h}\mathbf{u})}{\| \cancel{h}\mathbf{u} \|} \\ &= \nabla f(\mathbf{a}) \cdot \frac{\mathbf{u}}{\| \mathbf{u} \|} \\ &= \nabla f(\mathbf{a}) \cdot \mathbf{\hat{u}} \\ \end{aligned} \)
where \(\begin{aligned} \mathbf{\hat{u}} = \frac{\mathbf{u}}{\| \mathbf{u} \|} \end{aligned}\) is the unit vector that describes the direction
[example]
For \(f = xy^2 - 2yz^2\) from the previous example, if we start at \((1, 2, 3)\) and move in the direction toward \((4, 3, 2)\), how fast does the value of \(f\) begin to change?
\( \begin{aligned} \mathbf{u} &= \langle 4, 3, 2 \rangle - \langle 1, 2, 3 \rangle = \langle 3, 1, -1 \rangle \\ \mathbf{\hat{u}} &= \frac{\mathbf{u}}{\| \mathbf{u} \|} = \frac{\langle 3, 1, -1 \rangle}{\sqrt{(3)^2 + (1)^2 + (-1)^2}} = \left\langle \frac{3}{\sqrt{11}}, \frac{1}{\sqrt{11}}, -\frac{1}{\sqrt{11}} \right\rangle \\ \nabla f(1, 2, 3) &= \langle 4, -14, -24 \rangle \\ D_\mathbf{\hat{u}}f(1, 2, 3) &= \langle 4, -14, -24 \rangle \cdot \left\langle \frac{3}{\sqrt{11}}, \frac{1}{\sqrt{11}}, -\frac{1}{\sqrt{11}} \right\rangle \\ &= \frac{12}{\sqrt{11}} - \frac{14}{\sqrt{11}} + \frac{24}{\sqrt{11}} \\ &= \frac{22}{\sqrt{11}} \\ &= 2\frac{11}{\sqrt{11}} \\ &= 2\sqrt{11} \\ \end{aligned} \)
This is the rate of change of \(f\) per unit distance at \((1, 2, 3)\) in the direction toward \((4, 3, 2)\).
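The result \(2\sqrt{11}\) can be cross-checked against the difference quotient that motivates the directional derivative. This is a rough sketch for \(f = xy^2 - 2yz^2\); the function name `directional_derivative` and the step `h = 1e-6` are assumptions, and the quotient is only an approximation of the limit.

```python
import math


def f(x, y, z):
    return x * y**2 - 2 * y * z**2


def grad_f(x, y, z):
    return (y**2, 2 * x * y - 2 * z**2, -4 * y * z)


def directional_derivative(grad, direction):
    # Normalize the direction, then take the dot product with the gradient.
    norm = math.sqrt(sum(c * c for c in direction))
    unit = tuple(c / norm for c in direction)
    return sum(g * u for g, u in zip(grad, unit))


start = (1.0, 2.0, 3.0)
toward = (4.0, 3.0, 2.0)
u = tuple(t - s for t, s in zip(toward, start))   # <3, 1, -1>

exact = directional_derivative(grad_f(*start), u)

# Difference quotient along the unit direction, for comparison.
h = 1e-6
norm_u = math.sqrt(sum(c * c for c in u))
moved = tuple(s + h * c / norm_u for s, c in zip(start, u))
quotient = (f(*moved) - f(*start)) / h

print(exact, 2 * math.sqrt(11), quotient)
```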
\( \begin{aligned} D_\mathbf{\hat{u}}f(a, b) &= \nabla f(a, b) \cdot \mathbf{\hat{u}} \\ &= \| \nabla f(a, b) \| \| \mathbf{\hat{u}} \| \cos\theta \\ &= \| \nabla f(a, b) \| \cos\theta \\ -\| \nabla f(a, b) \| \le D_\mathbf{\hat{u}}f(a, b) &\le \| \nabla f(a, b) \| \\ \end{aligned} \)
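The magnitude \(\| \nabla f(a, b) \|\) is the largest possible rate of change of \(f\) at \((a, b)\).
This occurs when \(\theta = 0\), that is, when \(\mathbf{\hat{u}}\) points in the same direction as \(\nabla f(a, b)\). The opposite direction (\(\theta = \pi\)) gives the most negative rate, \(-\| \nabla f(a, b) \|\).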
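The bound on \(D_\mathbf{\hat{u}}f\) can be illustrated by brute force: sample many unit directions, compute \(\nabla f \cdot \mathbf{\hat{u}}\) for each, and compare the extremes with \(\pm\| \nabla f \|\). The gradient \(\langle 2, 4 \rangle\) used here (for instance, \(f = x^2 + y^2\) at \((1, 2)\)) and the sample count are assumptions for illustration only.

```python
import math

grad = (2.0, 4.0)                 # assumed gradient, e.g. f = x^2 + y^2 at (1, 2)
grad_norm = math.hypot(*grad)

samples = 3600                    # one unit direction every 0.1 degrees
rates = []
for k in range(samples):
    theta = 2 * math.pi * k / samples
    u_hat = (math.cos(theta), math.sin(theta))
    rate = grad[0] * u_hat[0] + grad[1] * u_hat[1]
    rates.append((rate, theta))

best_rate, best_theta = max(rates)     # steepest increase found
worst_rate, _ = min(rates)             # steepest decrease found
grad_angle = math.atan2(grad[1], grad[0])

print(best_rate, grad_norm)       # nearly equal
print(best_theta, grad_angle)     # best direction is close to the gradient's angle
print(worst_rate)                 # close to -grad_norm
```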
Terms
Acknowledgements
Thanks to Prof. Matthew Katz of the Pennsylvania State University.