Mathematics
The Covariance Matrix
In this course, we will explore the concept of the covariance matrix, a fundamental tool in statistics and data analysis that captures the variances and covariances of a set of random variables. We will outline what the covariance matrix is, detail its properties, and end with an applied example: how the covariance matrix can be used to derive the optimal portfolio of a set of assets.
We saw in the previous section the concept of covariance between two random variables, $X$ and $Y$. To generalise covariance to an arbitrary number of random variables, we collect the set of pairwise covariances into a single object: the covariance matrix. Suppose that we are working with three random variables, $X$, $Y$ and $Z$. The covariance matrix is defined as $$ K = \begin{pmatrix} \text{Cov}(X, X) & \text{Cov}(X, Y) & \text{Cov}(X, Z)\\ \text{Cov}(Y, X) & \text{Cov}(Y, Y) & \text{Cov}(Y, Z)\\ \text{Cov}(Z, X) & \text{Cov}(Z, Y) & \text{Cov}(Z, Z)\\ \end{pmatrix}. $$ The diagonal elements are the variances of the individual variables, since $\text{Cov}(X, X) = \text{Var}(X)$.
Similarly, we can define the correlation matrix of $X$, $Y$ and $Z$ as $$ P = \begin{pmatrix} \rho_{XX} & \rho_{XY} & \rho_{XZ}\\ \rho_{YX} & \rho_{YY} & \rho_{YZ}\\ \rho_{ZX} & \rho_{ZY} & \rho_{ZZ}\\ \end{pmatrix}. $$ The diagonal elements of the correlation matrix are all 1, as the correlation of a random variable with itself is equal to 1: $$ \rho_{kk} = 1 \hspace{2mm} \forall k. $$
Because $\text{Cov}(X,Y) = \text{Cov}(Y,X)$ for any $X$, $Y$, the covariance and correlation matrices are symmetric, meaning that $$ K^T = K \text{ and } P^T = P. $$
The covariance and correlation matrices have another important property: they are positive semidefinite. A positive semidefinite matrix $A$ obeys the relation $$ \textbf{a}^T \cdot A \cdot \textbf{a} \geq 0 $$ for any vector $\textbf{a}$. For a symmetric matrix, this is equivalent to all of its eigenvalues being non-negative. A positive semidefinite matrix necessarily has a determinant greater than or equal to zero, but the converse does not hold: a matrix with an even number of negative eigenvalues also has a non-negative determinant, so the determinant alone is not a sufficient test. Checking symmetry and the sign of the eigenvalues can be used to determine whether or not a matrix is a valid covariance matrix; a valid correlation matrix must additionally have ones on its diagonal.
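The properties above can be checked numerically. The following is a minimal sketch using NumPy, not a definitive implementation: the simulated data, the helper name `is_valid_covariance`, and the tolerance `tol` are all assumptions introduced for illustration. It estimates $K$ and $P$ from samples, tests symmetry and the eigenvalue condition, and shows a matrix whose determinant is non-negative even though it is not positive semidefinite.

```python
import numpy as np


def is_valid_covariance(matrix, tol=1e-10):
    """Return True if `matrix` is square, symmetric and positive semidefinite.

    `tol` is an illustrative tolerance that absorbs floating-point error.
    """
    matrix = np.asarray(matrix, dtype=float)
    if matrix.ndim != 2 or matrix.shape[0] != matrix.shape[1]:
        return False
    if not np.allclose(matrix, matrix.T, atol=tol):
        return False
    # eigvalsh is intended for symmetric matrices and returns real eigenvalues.
    eigenvalues = np.linalg.eigvalsh(matrix)
    return bool(eigenvalues.min() >= -tol)


# Illustrative data: 500 joint samples of three correlated variables X, Y, Z.
rng = np.random.default_rng(seed=0)
x = rng.normal(size=500)
y = 0.5 * x + rng.normal(size=500)
z = -0.3 * x + 0.2 * y + rng.normal(size=500)
samples = np.vstack([x, y, z])  # each row holds one variable

K = np.cov(samples)       # 3x3 sample covariance matrix
P = np.corrcoef(samples)  # 3x3 sample correlation matrix

print("K is a valid covariance matrix:", is_valid_covariance(K))
print("P is a valid covariance matrix:", is_valid_covariance(P))
print("P has a unit diagonal:", np.allclose(np.diag(P), 1.0))

# The determinant is +1, yet two eigenvalues are negative,
# so a non-negative determinant does not imply positive semidefiniteness.
not_psd = np.diag([1.0, -1.0, -1.0])
print("det >= 0:", np.linalg.det(not_psd) >= 0)
print("valid covariance matrix:", is_valid_covariance(not_psd))
```

Testing the eigenvalues rather than the determinant is the design choice here, since the sign of the determinant only reflects the product of the eigenvalues.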
Example: Mean-Variance Portfolio Optimisation
Suppose we have a set of $N$ assets. Let us define the vector $\textbf{u}$ such that the $i$th element in the vector is equal to the mean return of the $i$th asset, $u_i$. The covariances between the asset returns are described by the covariance matrix $\underline{\Sigma}$. How can we use this information to construct the optimal portfolio of assets to maximise returns while minimising risk? First of all, let us define a target rate of return $r_t$. We will find the portfolio with the minimum variance which meets the target rate of return. Next, let us define a vector for our portfolio, $\textbf{x}$, which will contain the weights of each asset, normalised such that the weights of all assets sum to 1.
Now, we need to construct an objective function which, when minimised, minimises our risk given the constraint that our expected return $r = \textbf{u}^T \cdot \textbf{x}$ is equal to the target return $r_t$. A standard choice for this purpose is the Lagrangian $$ f(\textbf{x}, \lambda_1, \lambda_2) = \textbf{x}^T \cdot \underline{\Sigma} \cdot \textbf{x} - \lambda_1 (\textbf{u}^T \cdot \textbf{x} - r_t) - \lambda_2 (\textbf{1}^T \cdot \textbf{x} - 1), $$ where $\lambda_1$ and $\lambda_2$ are Lagrange multipliers and $\textbf{1}$ is the vector of ones. While this function may look daunting, it is worth keeping in mind that the last two terms are only there to enforce the constraints that the portfolio return should equal the target return, and that the sum of the portfolio weights should equal 1. What is important is that we are minimising $\textbf{x}^T \cdot \underline{\Sigma} \cdot \textbf{x}$, which is the variance of the return of the portfolio with weights $\textbf{x}$.
We find the constrained minimum by locating the stationary point of $f(\textbf{x}, \lambda_1, \lambda_2)$, which satisfies $$ \frac{\partial f}{\partial \textbf{x}} = 0, \hspace{2mm} \frac{\partial f}{\partial \lambda_1} = 0, \hspace{2mm} \frac{\partial f}{\partial \lambda_2} = 0. $$ So, we must compute these partial derivatives.
The partial derivatives with respect to $\lambda_1$ and $\lambda_2$ simply give us our constraints $$ \frac{\partial f}{\partial \lambda_1} = 0 \implies \textbf{u}^T \cdot \textbf{x} = r_t, $$ $$ \frac{\partial f}{\partial \lambda_2} = 0 \implies \textbf{1}^T \cdot \textbf{x} = 1. $$ The partial derivative with respect to $\textbf{x}$ gives us $$ \frac{\partial f}{\partial \textbf{x}} = 2 \textbf{x}^T \underline{\Sigma} - \lambda_1 \textbf{u}^T - \lambda_2 \textbf{1}^T = 0. $$ Taking the transpose and using the symmetry of $\underline{\Sigma}$, this reads $2 \underline{\Sigma} \cdot \textbf{x} - \lambda_1 \textbf{u} - \lambda_2 \textbf{1} = \textbf{0}$.
Together with the two constraints, this system of $N + 2$ linear equations can be expressed as $$ \begin{pmatrix} 2 \underline{\Sigma} & -\textbf{u} & -\textbf{1}\\ \textbf{u}^T & 0 & 0\\ \textbf{1}^T & 0 & 0\\ \end{pmatrix} \begin{pmatrix} \textbf{x}\\ \lambda_1\\ \lambda_2\\ \end{pmatrix} = \begin{pmatrix} \textbf{0}\\ r_t\\ 1\\ \end{pmatrix}. $$ This equation has the form of the matrix equation $\underline{A} \cdot \textbf{v} = \textbf{b}$, where $\underline{A}$ is an $(N+2) \times (N+2)$ matrix. Provided $\underline{A}$ is invertible, we can find the optimal portfolio by computing $\textbf{v} = \underline{A}^{-1} \cdot \textbf{b}$ and reading off the first $N$ elements of $\textbf{v}$ as the weights $\textbf{x}$.
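The linear system above can be assembled and solved numerically. The following is a minimal sketch using NumPy under stated assumptions: the function name `min_variance_portfolio` and the three-asset values for $\textbf{u}$, $\underline{\Sigma}$ and $r_t$ are hypothetical, chosen only to illustrate the block structure. It uses `np.linalg.solve` instead of forming $\underline{A}^{-1}$ explicitly, which is generally the more numerically stable way to solve $\underline{A} \cdot \textbf{v} = \textbf{b}$.

```python
import numpy as np


def min_variance_portfolio(u, sigma, r_t):
    """Solve the (N+2)x(N+2) system for the weights and both multipliers.

    u     : length-N vector of mean returns
    sigma : NxN covariance matrix of returns
    r_t   : target rate of return
    """
    u = np.asarray(u, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    n = u.size
    ones = np.ones(n)

    # Assemble A block by block: [[2*Sigma, -u, -1], [u^T, 0, 0], [1^T, 0, 0]].
    A = np.zeros((n + 2, n + 2))
    A[:n, :n] = 2.0 * sigma
    A[:n, n] = -u
    A[:n, n + 1] = -ones
    A[n, :n] = u
    A[n + 1, :n] = ones

    # Right-hand side b = (0, ..., 0, r_t, 1).
    b = np.zeros(n + 2)
    b[n] = r_t
    b[n + 1] = 1.0

    v = np.linalg.solve(A, b)
    return v[:n], v[n], v[n + 1]


# Hypothetical inputs for three assets (illustrative values only).
u = np.array([0.05, 0.08, 0.12])
sigma = np.array([
    [0.010, 0.002, 0.001],
    [0.002, 0.030, 0.004],
    [0.001, 0.004, 0.060],
])

x, lam1, lam2 = min_variance_portfolio(u, sigma, r_t=0.08)
print("weights:", x)
print("sum of weights:", x.sum())
print("expected return:", u @ x)
print("portfolio variance:", x @ sigma @ x)
```

Note that nothing in this formulation prevents negative weights, which correspond to short positions; ruling them out would need inequality constraints and a different solver.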