In an effort to better understand the underlying mathematics behind deep learning algorithms, I have been studying the Jacobian matrix and its applications in differential calculus.

For a multivariable function with a single output, the Jacobian is a matrix whose elements are the partial derivatives of the function, one element for each variable. (For a function with several outputs, the matrix gains one row per output.)
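
In symbols, for a single-output function of $n$ variables, this layout can be written as below. The names $f$, $J$, and $x_1, \dots, x_n$ are notation chosen here for illustration.

$$
J = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} & \dfrac{\partial f}{\partial x_2} & \cdots & \dfrac{\partial f}{\partial x_n} \end{bmatrix}
$$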

Therefore, in order to calculate the Jacobian, we first differentiate the multivariable function with respect to each variable. Let's take the following function as an example.

We want to find its Jacobian.

First, we differentiate with respect to $x$, treating $y$ and $z$ as constants. Using the power rule, we bring the exponent down in front as a coefficient and reduce the exponent by $1$.

Then we differentiate with respect to $y$, so $-2y$ just becomes a constant.

Lastly, we differentiate with respect to $z$, so $\sin$ differentiates to $\cos$.
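
To make the three steps concrete, here is a worked illustration on a stand-in function. The function is an assumption, chosen only because it contains a power of $x$, a $-2y$ term, and a sine of $z$; it may not be the exact function this example was built on.

$$
f(x, y, z) = x^2 - 2y + \sin z
$$

$$
\frac{\partial f}{\partial x} = 2x, \qquad
\frac{\partial f}{\partial y} = -2, \qquad
\frac{\partial f}{\partial z} = \cos z
$$

In each derivative, the terms that do not contain the variable of interest are constants and drop out.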

Once we have all the partial derivatives, we construct the Jacobian matrix with each partial derivative as an element. In this case, because the function has one output and three inputs, we end up with a $1 \times 3$ matrix, which is a row vector. We can refer to it as the Jacobian vector of our original multivariable function.
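
A quick way to sanity-check a hand-derived Jacobian is to compare it against a numerical approximation. The sketch below does this with central finite differences, using only the Python standard library. The function `f` is a stand-in assumed for illustration (a power of $x$, a $-2y$ term, and a sine of $z$), not necessarily the function from this example, and the helper names `jacobian_analytic` and `jacobian_numeric` are hypothetical.

```python
import math


def f(x, y, z):
    # Stand-in function, assumed for illustration only.
    return x**2 - 2 * y + math.sin(z)


def jacobian_analytic(x, y, z):
    # Hand-derived partial derivatives of the stand-in f:
    # df/dx = 2x, df/dy = -2, df/dz = cos(z).
    return [2 * x, -2.0, math.cos(z)]


def jacobian_numeric(func, point, h=1e-6):
    # Central-difference approximation of each partial derivative:
    # nudge one variable at a time while holding the others fixed.
    row = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        row.append((func(*forward) - func(*backward)) / (2 * h))
    return row


point = (1.5, -0.5, 0.3)
analytic = jacobian_analytic(*point)
numeric = jacobian_numeric(f, point)

for name, a, n in zip("xyz", analytic, numeric):
    print(f"d/d{name}: analytic={a:.6f}  numeric={n:.6f}")
```

If the two columns agree to several decimal places, the hand derivation is very likely correct. The step size `h` is a judgment call: too large and the approximation is coarse, too small and floating-point rounding dominates.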