Three Short Theorems About Symmetric Matrices
July 2nd, 2017
Matrices are incredibly useful for a multitude of reasons in programming and science in general. For one, a matrix can be used to store two dimensional collections of data, like tables or even images (which are collections of pixels). Arguably more importantly, matrices are synonymous with linear maps. A linear map is a special, “nicely behaved” function of one or more variables which we can do a lot of interesting math with, and in fact there is an entire branch of math which studies the ways in which we can approximate the world of non-linear functions by linear functions – otherwise known as calculus.

I will assume that you know some basic facts about matrices and eigenvalues from now on. A suitable definition of symmetric matrices from first principles is that for any symmetric matrix A, $A = A^T$. In other words, A is symmetric about its diagonal. It is often hard to interpret this abstract property in the physical or geometrical world, but it generally has something to do with symmetry. For example, it appears in the notion of "stress tensors" from physics, because of equal and opposite forces, and in the Hessian matrix from calculus, since mixed partial derivatives can be taken in either order.
The point is that symmetric matrices are useful, and you end up seeing them a lot. In this article, I'm going to explain three of my favorite facts about symmetric matrices, in no particular order. As to how they're all connected: who knows? This post is just a way for me to fit three short, relatively unrelated ideas into one place. Think of it like a bargain deal!
Symmetric matrices have an orthonormal basis of eigenvectors
This is a form of the spectral theorem. We can define an orthonormal basis as a basis consisting only of unit vectors (vectors with magnitude 1) such that any two distinct vectors in the basis are perpendicular to one another (to put it another way, the inner product between any two of them is 0). You might imagine such a basis is useful, because every such basis looks like the standard basis we use in the real world (x, y, z). Let's start by proving this statement.
First, consider the following fact about dot products (inner products). We can represent them with matrix multiplication. In short, the dot product between two column vectors can be computed by making the first of those two a row vector:
$$x \cdot y = x^T y$$
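To see this concretely, here is a minimal sketch (assuming NumPy, which is my addition and not part of the original post) checking that the dot product and the $1 \times 1$ matrix product $x^T y$ agree:

```python
# Minimal sketch, assuming NumPy: the dot product of two column
# vectors equals the 1x1 matrix product x^T y.
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

print(np.dot(x, y))                        # 32.0
print(x.reshape(1, 3) @ y.reshape(3, 1))   # [[32.]], the same number as a 1x1 matrix
```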
This will be crucial to this short proof. Consider two distinct eigenvalues of A, $\lambda_1$ and $\lambda_2$, with corresponding eigenvectors $v_1$ and $v_2$. We will show that they are perpendicular to one another. (Once we have this, we can always pick mutually orthogonal vectors corresponding to the same eigenvalue, and we are also free to rescale every vector to have length 1.) Consider the following expression:
$$(Av_1)^T v_2 - v_1^T A v_2$$
We can simplify this a little bit, knowing that $v_1, v_2$ are eigenvectors and thus $Av_1 = \lambda_1 v_1$ and $Av_2 = \lambda_2 v_2$. Since we can pull scalar constants out of matrix products, the expression becomes
$$\lambda_1 v_1^T v_2 - \lambda_2 v_1^T v_2 = (\lambda_1 - \lambda_2)\,(v_1^T v_2) = (\lambda_1 - \lambda_2)\,(v_1 \cdot v_2)$$
Now, we can use the fact that for any two matrices A and B, $(AB)^T = B^T A^T$. Applying this to $(Av_1)^T$ in the original expression, we arrive at:
$$v_1^T A^T v_2 - v_1^T A v_2$$
But $A = A^T$, since A is symmetric! So this quantity is zero. Comparing with the simplification above, $(\lambda_1 - \lambda_2)(v_1 \cdot v_2) = 0$: either $\lambda_1 - \lambda_2$ is zero (it isn't, since we picked two distinct eigenvalues), or $v_1 \cdot v_2$ is zero. Therefore, the two eigenvectors are perpendicular to one another.
Now, this lessens our workload considerably for a lot of purposes. Our eigenvectors automatically form a natural frame to work with.
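If you want to see the theorem in action numerically, here is a minimal sketch (again assuming NumPy, not something from the original argument) that builds a random symmetric matrix and checks that its eigenvectors are orthonormal:

```python
# Minimal sketch, assuming NumPy: the eigenvectors of a symmetric matrix
# form an orthonormal set, so Q^T Q should be (numerically) the identity.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2                   # symmetrize a random matrix

eigenvalues, Q = np.linalg.eigh(A)  # eigh is specialized to symmetric/Hermitian matrices

print(np.allclose(Q.T @ Q, np.eye(4)))       # True: columns of Q are orthonormal
print(np.allclose(A @ Q, Q * eigenvalues))   # True: each column is an eigenvector
```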
Every square matrix can be written as the sum of a symmetric matrix and a skew-symmetric matrix.
This is a fun, short proof. We define a skew-symmetric matrix as a matrix A where $A^T = -A$; that is, transposing the matrix returns the same matrix but with the sign of every entry flipped. We will build two matrices out of A. First, consider:
$$M = \frac{1}{2}\left(A + A^T\right)$$
What can we say about this matrix? Look at its entry in row i and column j (denoted $M_{ij}$), which is $\tfrac{1}{2}(A_{ij} + A_{ji})$. But since addition is commutative, we also have
$$M_{ji} = \tfrac{1}{2}(A_{ji} + A_{ij}) = M_{ij}$$
Indeed, the matrix reads the same horizontally and vertically; M is symmetric. In the same way, we can define:
$$N = \frac{1}{2}\left(A - A^T\right)$$
By a nearly identical argument, we can show that N is skew-symmetric. And looking at our definitions, we see that $A = M + N$, so we are done.
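Here is the same decomposition spelled out as a quick numerical check (a sketch assuming NumPy, not part of the original argument):

```python
# Minimal sketch, assuming NumPy: split a square matrix A into a symmetric
# part M and a skew-symmetric part N, then verify the three claims.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))

M = (A + A.T) / 2
N = (A - A.T) / 2

print(np.allclose(M, M.T))     # True: M is symmetric
print(np.allclose(N, -N.T))    # True: N is skew-symmetric
print(np.allclose(A, M + N))   # True: the two parts sum back to A
```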
Any square matrix is the product of two symmetric matrices
In order to prove this statement, we have to cover the Jordan normal form, which is an essentially unique way of representing any square matrix (over the complex numbers) in a nice form.
You may remember from linear algebra the concept of diagonalization. Some matrices can be diagonalized, meaning they have a basis of eigenvectors. By applying a change of basis, then, we can arrive at the following equation:
$$M = SDS^{-1}$$
Where D is a diagonal matrix containing the eigenvalues and S is a change of basis matrix whose columns are the corresponding eigenvectors.
We can consider matrices as linear maps, which gives us the geometric intuition of diagonalization: for any diagonalizable matrix, we can pick a different set of axes along which the transformation is simply a scaling of each axis.
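As a concrete illustration (a sketch assuming NumPy; the specific matrix is just an example I picked), we can diagonalize a small matrix and confirm the change-of-basis identity:

```python
# Minimal sketch, assuming NumPy: diagonalize a small matrix and
# verify the identity M = S D S^{-1}.
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, S = np.linalg.eig(M)   # columns of S are eigenvectors
D = np.diag(eigenvalues)

print(np.allclose(M, S @ D @ np.linalg.inv(S)))  # True
```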

It turns out that if you generalize this form, and take “almost diagonal” matrices instead of just diagonal ones, you end up with something called the Jordan normal form, which is unique for every matrix (basically, up to some reordering of “diagonal blocks”):
$$M = SJS^{-1}$$
Like any good generalization, it holds for the original; the Jordan normal form of a diagonalizable matrix is just its diagonal matrix. In particular, symmetric matrices have the following nice Jordan form:
$$M = Q\Lambda Q^{-1}$$
Where $\Lambda$ is the diagonal matrix of eigenvalues and Q is an orthogonal matrix whose columns are the eigenvectors of M, and therefore are, you guessed it, orthogonal! (For an orthogonal matrix, $Q^{-1} = Q^T$.)
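Numerically, this orthogonality is easy to check (again a sketch assuming NumPy): the inverse of Q is just its transpose.

```python
# Minimal sketch, assuming NumPy: for a symmetric M, the eigenvector
# matrix Q is orthogonal, so Q^{-1} = Q^T and M = Q Λ Q^T.
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
M = (B + B.T) / 2

eigenvalues, Q = np.linalg.eigh(M)
Lam = np.diag(eigenvalues)

print(np.allclose(np.linalg.inv(Q), Q.T))   # True: Q is orthogonal
print(np.allclose(M, Q @ Lam @ Q.T))        # True: M = Q Λ Q^T
```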
The actual Jordan normal form looks like:
$$J = \begin{bmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_k \end{bmatrix}$$
Where each $A_i$ is a Jordan block. For each eigenvalue $\lambda$ of (algebraic) multiplicity m, there can be any number of associated Jordan blocks, whose sizes add up to m. Each Jordan block looks like:
$$\begin{bmatrix} \lambda & 1 & & 0 \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda \end{bmatrix}$$
The idea is that each block has its eigenvalue repeated along the diagonal and 1s just above the diagonal. We can use the following trick in our proof:
$$\begin{bmatrix} \lambda & 1 & 0 \\ 0 & \lambda & 1 \\ 0 & 0 & \lambda \end{bmatrix} = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 0 & \lambda \\ 0 & \lambda & 1 \\ \lambda & 1 & 0 \end{bmatrix}$$
The key idea here is that any non-diagonal Jordan block J can be written as J = AB, where A is the exchange matrix above (1s on the anti-diagonal) and B is the remaining factor, and both A and B are symmetric. If a Jordan block happens to be diagonal, it is already symmetric, and we can take A to be the identity. So every Jordan block is the product of two symmetric matrices, and stacking these factorizations block by block shows that the whole Jordan matrix factors as J = AB with A and B both symmetric (a block-diagonal matrix with symmetric blocks is itself symmetric). You can probably see where we're going from here.
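First, though, a quick sanity check of the block factorization (a minimal sketch assuming NumPy, with a concrete value of λ that I picked for illustration):

```python
# Minimal sketch, assuming NumPy: a 3x3 Jordan block J factors as J = P B,
# where P is the exchange matrix (1s on the anti-diagonal) and B = P J.
import numpy as np

lam = 5.0
J = np.array([[lam, 1.0, 0.0],
              [0.0, lam, 1.0],
              [0.0, 0.0, lam]])

P = np.fliplr(np.eye(3))   # 1s on the anti-diagonal; P is symmetric and P @ P = I
B = P @ J                  # since P is its own inverse, J = P @ B

print(np.allclose(P, P.T))     # True: P is symmetric
print(np.allclose(B, B.T))     # True: B is symmetric
print(np.allclose(J, P @ B))   # True: J = P B
```

Now, putting the pieces together: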
$$M = SJS^{-1} = S(AB)S^{-1} = SA\left(S^T (S^T)^{-1}\right)BS^{-1} = \left(SAS^T\right)\left((S^T)^{-1}BS^{-1}\right)$$
The factor on the left is symmetric: taking its transpose gives $(SAS^T)^T = SA^TS^T = SAS^T$, since A is symmetric. By a similar argument, the factor on the right is symmetric as well. So we've proven that any square matrix can be written as the product of two symmetric matrices. (In the real case, a real square matrix can likewise be written as the product of two real symmetric matrices.)
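To close the loop, here is the whole construction in code. This is my own illustration, not from the original post: it assumes SymPy (so the Jordan form is computed exactly), uses a small example matrix with real eigenvalues, and reads the Jordan block boundaries off the superdiagonal of J.

```python
# Minimal sketch, assuming SymPy: factor a square matrix M into a product
# of two symmetric matrices via its Jordan form, following the proof above.
import sympy as sp

M = sp.Matrix([[2, 1, 0],
               [0, 2, 0],
               [0, 0, 3]])

S, J = M.jordan_form()   # M = S * J * S**-1

# Build A: block-diagonal, with an exchange matrix (1s on the anti-diagonal)
# for each Jordan block. A block ends wherever the superdiagonal entry is 0.
n = J.rows
A = sp.zeros(n, n)
start = 0
for i in range(n):
    if i == n - 1 or J[i, i + 1] == 0:      # end of the current block
        size = i - start + 1
        for k in range(size):
            A[start + k, start + size - 1 - k] = 1
        start = i + 1

B = A * J                            # A is its own inverse, so J = A * B

left = S * A * S.T                   # symmetric
right = (S.T) ** -1 * B * S ** -1    # symmetric

assert left.is_symmetric() and right.is_symmetric()
assert sp.simplify(left * right) == M
print("M is the product of two symmetric matrices")
```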