Deriving the Pauli Matrices

Tags: #Mathematics/GeometricAlgebra #Mathematics/LinearAlgebra #Mathematics/Spinor

0 : Overview

This article provides a derivation for the Pauli spin matrices by showing that they arise naturally when looking for a matrix representation of the Clifford algebra ${Cl}_{3}$ (also known as the Algebra of Physical Space or simply 3D (Vanilla) Geometric Algebra).

Prerequisites & Resources

Basic Knowledge of 2D and 3D Geometric Algebra:

Sudgylacmoe's introduction suffices.
Linear and Geometric Algebra by Alan Macdonald provides more depth.

Basic Knowledge of Linear Algebra:

3Blue1Brown's Essence of Linear Algebra suffices.

Basic Knowledge of Complex Numbers:

Welch Labs' Imaginary Numbers are Real suffices.

Basic Knowledge of Quaternions is helpful, but not required.

3Blue1Brown's Visualizing quaternions (4D numbers) with stereographic projection suffices.

1 : Two Dimensions

1.1 : Recap of 2D Geometric Algebra

The 2D geometric algebra ${Cl}_{2}$ consists of multivectors $M_{2}$ with expansions like so:^[1]^[2]

M_{2} = a + b \hat{x} + c \hat{y} + d \hat{x} \hat{y}

The geometric product between vectors is as follows:^[1:1]^[2:1]

\vec{u} \vec{v} := \vec{u} \cdot \vec{v} + \vec{u} \land \vec{v}

The first term is just the standard interior (dot) product between vectors; the second term is the exterior (wedge) product, which is anti-commutative and produces a bivector:

\vec{u} \land \vec{v} = - \vec{v} \land \vec{u} = u v (\sin θ) \hat{x} \hat{y}

Just as a vector is an "oriented line segment", a bivector is an "oriented plane segment". In 2D, there is only the $x y$ -plane, so we have only the $+ \hat{x} \hat{y}$ and $- \hat{x} \hat{y}$ orientations. Since the unit scalar $1$ also only has $+ 1$ and $- 1$ orientations, we call $\hat{x} \hat{y}$ the unit pseudoscalar of 2D G·A.^[1:2]^[2:2]

Note that parallel vectors' geometric product is just the interior product, while perpendicular vectors' geometric product is the exterior product:^[1:3]^[2:3]

{\vec{u}}_{∥} \vec{v} = {\vec{u}}_{∥} \cdot \vec{v}, {\vec{u}}_{⊥} \vec{v} = {\vec{u}}_{⊥} \land \vec{v} .

Since we are working with an orthonormal basis, the geometric product of our basis vectors is just their wedge product, so $\hat{x} \hat{y} = \hat{x} \land \hat{y}$ .

Using anti-commutativity of $\land$ and the fact that $\hat{x}$ is perpendicular to $\hat{y}$ , we have the following:

(\hat{x} \hat{y})^{2} = \hat{x} \hat{y} \hat{x} \hat{y} = - \hat{y} (\hat{x} \hat{x}) \hat{y} = - \hat{y} \hat{y} = - 1

So $\hat{x} \hat{y}$ squares to $- 1$ , just like the imaginary unit $i$ . We shall define $\hat{x} \hat{y} =: \hat{ı}$ for this reason.

Now our multivector looks like a complex number added to a 2D vector:

M_{2} = (a + d \hat{ı}) + (b \hat{x} + c \hat{y})

1.2 : Representation of Complex Numbers

A real number $k$ has a trivial $2 \times 2$ matrix which behaves just like multiplication by $k$ :

k 1 = [\begin{matrix} k & 0 \\ 0 & k \end{matrix}] ⟹ k A = k 1 A

for any $2 \times 2$ matrix $A$. So the identity matrix $1$ is a $2 \times 2$ matrix representation of the scalar $1$. We might wonder: is there a $2 \times 2$ matrix representation for the complex $i$ ? Let's try to find one! We set up the defining equation $i^{2} = - 1$ in terms of $2 \times 2$ matrices, and solve:

\begin{aligned} {[\begin{matrix} a & b \\ c & d \end{matrix}]}^{2} & = - 1 \\ [\begin{matrix} a & b \\ c & d \end{matrix}] [\begin{matrix} a & b \\ c & d \end{matrix}] & = [\begin{matrix} - 1 & 0 \\ 0 & - 1 \end{matrix}] \\ [\begin{matrix} a^{2} + b c & b (a + d) \\ c (a + d) & b c + d^{2} \end{matrix}] & = [\begin{matrix} - 1 & 0 \\ 0 & - 1 \end{matrix}] \end{aligned}

We want real-valued entries, so $a = d = i$ isn't a valid solution to our problem. So instead, we'll try setting $a = d = 0$ :

[\begin{matrix} b c & 0 \\ 0 & b c \end{matrix}] = [\begin{matrix} - 1 & 0 \\ 0 & - 1 \end{matrix}]

Therefore we must have $b = - 1 / c$ . A simple solution is $c = 1$ so that $b = - 1$ . (There are many alternative solutions, but this is the one I'll be using.)

Thus, we have a real-valued matrix representation for $i$ !

{[\begin{matrix} 0 & - 1 \\ 1 & 0 \end{matrix}]}^{2} = - [\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}]

A general complex number $p + q i$ then has this matrix form:

p + q i \mapsto [\begin{matrix} p & - q \\ q & p \end{matrix}]

Here ' $\mapsto$ ' denotes the representation mapping.
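As a quick numerical sanity check of this representation, here is a minimal Python sketch using only the standard library. The helper names `matmul` and `rep` are my own, introduced purely for illustration. It checks that the matrix chosen for $i$ squares to $- 1$ , and that multiplying two representation matrices gives the representation of the product for one sample pair of complex numbers.

```python
def matmul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def rep(z):
    """Real 2x2 matrix representing the complex number z = p + q i."""
    p, q = z.real, z.imag
    return [[p, -q], [q, p]]

J = rep(1j)  # the matrix chosen for i: [[0, -1], [1, 0]]
minus_one = [[-1.0, 0.0], [0.0, -1.0]]
squares_to_minus_one = matmul(J, J) == minus_one

# The map respects products: rep(w) rep(z) equals rep(w z).
w, z = 2 + 3j, -1 + 4j
product_matches = matmul(rep(w), rep(z)) == rep(w * z)

print(squares_to_minus_one, product_matches)
```

The sample values are small integers, so the floating-point comparison is exact here; for arbitrary inputs a tolerance-based comparison would be safer.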

1.3 : Representation of Vectors

Since 2D V·G·A's unit scalar $1$ and pseudoscalar $\hat{ı}$ behave just like the complex $1$ and $i$ , these same representations work here as well! This raises the question: are there $2 \times 2$ matrices for $\hat{x}$ and $\hat{y}$ also?

To begin, let's do a 'sanity check': do we even have enough degrees of freedom in $2 \times 2$ matrices to represent a 2D multivector? A $2 \times 2$ matrix has four independent components, so requires four basis matrices to span the space. We've already used up two by defining matrices for $1$ and $i$ , so we have two degrees of freedom left. This is a great sign, since we only have two objects ( $\hat{x}$ & $\hat{y}$ ) we want to assign matrices to!

In fact, if we can find matrices for $\hat{x}$ and $\hat{y}$ , we'll have a complete basis for $2 \times 2$ real-valued matrices! This would show that 2D G·A is in fact equivalent to our matrix space:

{Cl}_{2} ≅ R^{2 \times 2}

The symbol $≅$ denotes an isomorphism, which you may consider a technical term for "equivalence" in this context.^[3]

With our sanity check out of the way, let's see if we can find the desired matrix representations! We have three key properties of $\hat{x}$ and $\hat{y}$ that our matrices must respect:

\begin{aligned} 1 : & {\hat{x}}^{2} & = 1 \\ 2 : & {\hat{y}}^{2} & = 1 \\ 3 : & \hat{x} \hat{y} & = - \hat{y} \hat{x} = \hat{ı} \end{aligned}

Just as we did for $i$ , we'll write these equations in terms of matrices, and solve them.

Equations (1) & (2):

\begin{aligned} {[\begin{matrix} a & b \\ c & d \end{matrix}]}^{2} & = [\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}] \\ [\begin{matrix} a^{2} + b c & b (a + d) \\ c (a + d) & d^{2} + b c \end{matrix}] & = [\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}] \end{aligned}

This is very similar to what we had for $i$ 's matrix, but now we have $1$ instead of $- 1$ . We need to find two distinct solutions, one for $\hat{x}$ and one for $\hat{y}$ .

The first solution will be similar to that for $i$ . We set $a = d = 0$ to ensure the off-diagonal elements are taken care of. Then we see that $b = 1 / c$ , and we set $c = b = 1$ . Then:

\hat{x} \overset{?}{\mapsto} [\begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix}]

This is still just a hypothesis at this stage — while this matrix squares to $1$ , we don't know if it will work with $\hat{y}$ for equation (3).

For the second solution, we instead set $b = c = 0$ . This also takes care of the off-diagonal elements. We then see we need $a^{2} = d^{2} = 1$ . We can't have $a = d$ , since then we'd just have $a 1$ , which isn't a new matrix! We thus set $a = - d = 1$ instead:

\hat{y} \overset{?}{\mapsto} [\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}]

Finally, we just check equation (3) to verify $\hat{x} \hat{y} = - \hat{y} \hat{x} = \hat{ı}$ holds:

\begin{aligned} [\begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix}] [\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}] & = [\begin{matrix} 0 & - 1 \\ 1 & 0 \end{matrix}] \\ [\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}] [\begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix}] & = [\begin{matrix} 0 & 1 \\ - 1 & 0 \end{matrix}] \end{aligned}

Indeed, it does, so the matrices we found meet all requirements for $\hat{x}$ and $\hat{y}$ !
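The three requirements can also be checked mechanically. The following is a small Python sketch (standard library only); the names `matmul`, `neg`, and `checks` are hypothetical helpers of mine, not anything standard.

```python
def matmul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def neg(A):
    """Negate every entry of a matrix."""
    return [[-e for e in row] for row in A]

ONE = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]    # candidate matrix for x-hat
Y = [[1, 0], [0, -1]]   # candidate matrix for y-hat
J = [[0, -1], [1, 0]]   # matrix found earlier for the pseudoscalar

checks = {
    "x^2 = 1": matmul(X, X) == ONE,
    "y^2 = 1": matmul(Y, Y) == ONE,
    "x y = i": matmul(X, Y) == J,
    "y x = -i": matmul(Y, X) == neg(J),
}
for name, ok in checks.items():
    print(name, ok)
```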

1.4 : Representation of Cl(2)

Now we have a full representation for 2D G·A!

Matrix Representation for 2D G·A

The generic multivector $M_{2}$ from previously then has this representation:

a + b \hat{x} + c \hat{y} + d \hat{ı} \mapsto [\begin{matrix} a + c & b - d \\ b + d & a - c \end{matrix}]

Since we can translate from ${Cl}_{2}$ multivectors to $2 \times 2$ real-valued matrices and back (see the exercise below), we do indeed have an isomorphism between the two spaces:

{Cl}_{2} ≅ R^{2 \times 2}

Exercise: Multivector from a Matrix

We've shown how to find the matrix corresponding to a 2D multivector, but what about going backwards? (An inverse map is required to prove isomorphism.)

Write down formulae for the multivector components in terms of the matrix entries for an arbitrary $2 \times 2$ matrix.

Solution

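Matching entries with the representation above gives $X_{0}^{0} = a + c$ , $X_{1}^{0} = b - d$ , $X_{0}^{1} = b + d$ , and $X_{1}^{1} = a - c$ (upper index for the row, lower index for the column). Solving these four equations:

$\begin{aligned} [\begin{matrix} X_{0}^{0} & X_{1}^{0} \\ X_{0}^{1} & X_{1}^{1} \end{matrix}] & ≅ a + b \hat{x} + c \hat{y} + d \hat{ı} \\ ⟹ & \begin{cases} a & = \frac{1}{2} (X_{0}^{0} + X_{1}^{1}) \\ b & = \frac{1}{2} (X_{1}^{0} + X_{0}^{1}) \\ c & = \frac{1}{2} (X_{0}^{0} - X_{1}^{1}) \\ d & = \frac{1}{2} (X_{0}^{1} - X_{1}^{0}) \end{cases} \end{aligned}$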
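To see the two directions of the map side by side, here is a minimal Python sketch of a round trip. The function names `to_matrix` and `to_multivector` are my own, and the inverse formulas are the ones obtained by solving the four linear equations read off from the matrix entries.

```python
def to_matrix(a, b, c, d):
    """Matrix of the multivector a + b x + c y + d i."""
    return [[a + c, b - d], [b + d, a - c]]

def to_multivector(M):
    """Components (a, b, c, d) recovered from a 2x2 real matrix M."""
    (m00, m01), (m10, m11) = M
    a = (m00 + m11) / 2   # scalar part
    b = (m01 + m10) / 2   # x-hat part
    c = (m00 - m11) / 2   # y-hat part
    d = (m10 - m01) / 2   # pseudoscalar part
    return (a, b, c, d)

components = (1.0, 2.0, 3.0, 4.0)
M = to_matrix(*components)
recovered = to_multivector(M)
print(M, recovered)
```

Since every $2 \times 2$ real matrix yields some $(a, b, c, d)$ this way, and every multivector yields a matrix, the two functions are inverses of each other.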

2 : Three Dimensions

2.1 : Recap of 3D Geometric Algebra

In 3D, we have an extra basis vector, $\hat{z}$ , so a generic multivector $M_{3}$ expands like this:^[1:4]^[2:4]

\begin{aligned} M_{3} = {} & a \\ & + b_{x} \hat{x} + b_{y} \hat{y} + b_{z} \hat{z} \\ & + c_{x y} \hat{x} \hat{y} + c_{y z} \hat{y} \hat{z} + c_{z x} \hat{z} \hat{x} \\ & + d \hat{x} \hat{y} \hat{z} \end{aligned}

I've separated it into the various-grade elements (scalar, vector, bivector, trivector / pseudoscalar) for added clarity.

The geometric product between vectors still works the same way, as do the interior (dot) and exterior (wedge) products. However, $\hat{x} \hat{y}$ is no longer the unit pseudoscalar since we have not only the $x y$ -plane, but also the orthogonal $y z$ and $z x$ planes, and corresponding bivectors $\hat{x} \hat{y}$ , $\hat{y} \hat{z}$ , & $\hat{z} \hat{x}$ . (So a general bivector is not limited to $\pm \hat{x} \hat{y}$ anymore.) We can decompose any plane segment / bivector into a linear combination of the basis bivectors. Each plane is associated to a "normal vector" (the unit vector perpendicular to the plane), so we call bivectors the unit pseudovectors of 3D G·A.^[1:5]^[2:5]

The unit pseudoscalar is instead now $\hat{x} \hat{y} \hat{z}$ , since in 3D we have only the $x y z$ -volume and therefore get only $+ \hat{x} \hat{y} \hat{z}$ or $- \hat{x} \hat{y} \hat{z}$ orientations.^[1:6]^[2:6]

Note that $\hat{x} \hat{y} \hat{z}$ also squares to $- 1$ :

(\hat{x} \hat{y} \hat{z})^{2} = \hat{x} \hat{y} \hat{z} \hat{x} \hat{y} \hat{z} = (\hat{x} \hat{x}) \hat{y} \hat{z} \hat{y} \hat{z} = - \hat{z} \hat{y} \hat{y} \hat{z} = - 1

So, since the 3D pseudoscalar also behaves like $i$ — but we've already used $\hat{ı}$ — I'll denote the unit trivector $I$ .

Returning now to the bivectors, note that they all square to $- 1$ (by the same proof as for $\hat{x} \hat{y}$ ). Since we previously had $\hat{x} \hat{y}$ corresponding to the imaginary unit $i$ , perhaps our three bivectors correspond to the imaginary quaternion units $i, j, k$ ?

Recall the defining equations for quaternions:

\begin{matrix} i^{2} = j^{2} = k^{2} = - 1 \\ i j = - j i = k \\ i j k = - 1 \end{matrix}

We define the following:

\hat{κ} := \hat{y} \hat{z}, \hat{ȷ} := \hat{z} \hat{x}, \hat{ı} := \hat{x} \hat{y} .

Now we verify that $\hat{ı} \hat{ȷ} = - \hat{ȷ} \hat{ı} = \hat{κ}$ :

\begin{aligned} \hat{ı} \hat{ȷ} & = \hat{x} \hat{y} \hat{z} \hat{x} = \hat{y} \hat{z} = \hat{κ} \\ \hat{ȷ} \hat{ı} & = \hat{z} \hat{x} \hat{x} \hat{y} = \hat{z} \hat{y} = - \hat{κ} \end{aligned}

Finally, by associativity of the geometric product, $\hat{ı} \hat{ȷ} \hat{κ} = {\hat{κ}}^{2} = - 1$ . So indeed, the bivectors behave just like the imaginary quaternions!
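These sign manipulations are easy to get wrong by hand, so here is a small Python sketch that automates them. It is an illustration under my own conventions, not a standard library: a basis blade is stored as a pair `(sign, indices)` with `0, 1, 2` standing for $\hat{x}, \hat{y}, \hat{z}$ , and `blade_product` uses only the two rules from the text (swapping distinct basis vectors flips the sign, and each basis vector squares to $1$ ).

```python
def blade_product(p, q):
    """Geometric product of two signed basis blades (sign, indices)."""
    sign = p[0] * q[0]
    idx = list(p[1] + q[1])
    # Bubble sort: every swap of two distinct basis vectors flips the sign.
    changed = True
    while changed:
        changed = False
        for n in range(len(idx) - 1):
            if idx[n] > idx[n + 1]:
                idx[n], idx[n + 1] = idx[n + 1], idx[n]
                sign = -sign
                changed = True
    # Adjacent equal basis vectors square to +1 and drop out.
    out = []
    for e in idx:
        if out and out[-1] == e:
            out.pop()
        else:
            out.append(e)
    return (sign, tuple(out))

def neg(b):
    """Negate a signed blade."""
    return (-b[0], b[1])

x, y, z = (1, (0,)), (1, (1,)), (1, (2,))
i = blade_product(x, y)   # x-hat y-hat
j = blade_product(z, x)   # z-hat x-hat
k = blade_product(y, z)   # y-hat z-hat
minus_one = (-1, ())

squares = [blade_product(b, b) == minus_one for b in (i, j, k)]
ij_is_k = blade_product(i, j) == k
ji_is_minus_k = blade_product(j, i) == neg(k)
ijk_is_minus_one = blade_product(blade_product(i, j), k) == minus_one
print(squares, ij_is_k, ji_is_minus_k, ijk_is_minus_one)
```

The sketch only handles single basis blades, not general multivectors, which is all the quaternion relations require.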

Now our multivector looks like a quaternion added to a vector + pseudoscalar :

\begin{aligned} M_{3} = {} & (a + c_{i} \hat{ı} + c_{j} \hat{ȷ} + c_{k} \hat{κ}) \\ & + (b_{x} \hat{x} + b_{y} \hat{y} + b_{z} \hat{z}) \\ & + d I \end{aligned}

2.2 : Representation of the Pseudoscalar

We already have a representation for $1$ , $\hat{x}$ , $\hat{y}$ , and $\hat{ı}$ , so we'll just keep them the same. How might we find representations for the other elements? To simplify this problem, we notice one key detail — we can write all missing elements as the pseudoscalar multiplied with elements we already have!

\begin{aligned} I & = 1 I \\ \hat{z} & = - I \hat{ı} \\ \hat{ȷ} & = I \hat{y} \\ \hat{κ} & = I \hat{x} \end{aligned}

(The proof of this is left as an exercise to the reader, see the end of this section.)

This helps us a lot! We only need to find one new matrix, that for $I$ , and we get all the rest for free!

What should the matrix for $I$ be? Here we run into a problem — we've already used all the possible real-valued $2 \times 2$ matrices! We've run out of degrees of freedom, so we are forced to extend our matrix space. How should we extend it, though?

We recall that $I$ behaves like the imaginary $i$ , but must be different to the matrix for $\hat{ı}$ we had previously. What other matrices can we think of that behave like $i$ ? A very simple option is just allowing complex entries in the matrix, which gives us this trivial $i$ -like matrix:

I \mapsto [\begin{matrix} i & 0 \\ 0 & i \end{matrix}] = i 1

Thinking about the degrees of freedom, we note that allowing complex entries doubles the (real) degrees of freedom from four to eight — which is precisely the number we need for 3D G·A! This makes it a very elegant "minimal" solution, in some sense, since it extends our space only as much as we need. By contrast, if we'd found $4 \times 4$ real-valued matrices instead, we'd have sixteen degrees of freedom where we need only eight, leaving us with a lot of "bloat".

We have left out one important detail here. The matrix above commutes with all other $2 \times 2$ complex matrices, which would imply that $I$ commutes with everything in 3D G·A. Luckily, this turns out to be true, the proof of which is also left as an exercise (see below).

Having found a matrix for $I$ , we use our equations from earlier to find the rest:

\begin{aligned} \hat{z} = - I \hat{ı} & \mapsto [\begin{matrix} 0 & i \\ - i & 0 \end{matrix}] \\ \hat{ȷ} = I \hat{y} & \mapsto [\begin{matrix} i & 0 \\ 0 & - i \end{matrix}] \\ \hat{κ} = I \hat{x} & \mapsto [\begin{matrix} 0 & i \\ i & 0 \end{matrix}] \end{aligned}

This fills in all of our gaps!
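It is worth confirming that the new matrix for $\hat{z}$ really behaves like a third basis vector. The Python sketch below (standard library only, using the built-in complex type) builds the three matrices from $I$ and checks that $\hat{z}$ squares to $1$ , anti-commutes with $\hat{x}$ and $\hat{y}$ , and that $\hat{x} \hat{y} \hat{z}$ gives back $I$ . The helper names `matmul`, `scale`, and `neg` are assumptions of mine for illustration.

```python
def matmul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def scale(s, A):
    """Multiply every entry of A by the scalar s."""
    return [[s * e for e in row] for row in A]

def neg(A):
    return scale(-1, A)

ONE = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[1, 0], [0, -1]]
XY = [[0, -1], [1, 0]]          # i-hat = x-hat y-hat
I3 = scale(1j, ONE)             # pseudoscalar I

Z = neg(matmul(I3, XY))         # z-hat = -I i-hat
ZX = matmul(I3, Y)              # j-hat = I y-hat
YZ = matmul(I3, X)              # k-hat = I x-hat

z_matches = Z == [[0, 1j], [-1j, 0]]
z_squares_to_one = matmul(Z, Z) == ONE
z_anticommutes = (matmul(X, Z) == neg(matmul(Z, X))
                  and matmul(Y, Z) == neg(matmul(Z, Y)))
xyz_is_I = matmul(XY, Z) == I3
products_consistent = matmul(Z, X) == ZX and matmul(Y, Z) == YZ
print(z_matches, z_squares_to_one, z_anticommutes, xyz_is_I, products_consistent)
```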

Proofs left as an exercise to the reader

If you're somewhat new to geometric algebra, these are some good practice problems. Firstly, prove the following:

\begin{aligned} I & = 1 I \\ \hat{z} & = - I \hat{ı} \\ \hat{ȷ} & = I \hat{y} \\ \hat{κ} & = I \hat{x} \end{aligned}

Secondly, prove that $I$ commutes with everything else.

Remember, since our basis vectors are perpendicular, swapping them is anti-commutative:

{\vec{v}}_{⊥} \vec{u} = {\vec{v}}_{⊥} \land \vec{u} = - \vec{u} \land {\vec{v}}_{⊥} = - \vec{u} {\vec{v}}_{⊥}

And since our basis vectors are unit-length, they square to $1$ , because $\vec{u} \vec{u} = | \vec{u} |^{2}$ . It is very helpful to become familiar with these two facts if you intend to work with geometric algebra, since they come up a lot. (Think of it like learning matrix multiplication for linear algebra.)

Alongside the definitions of $I$ and $\hat{ı}$ , $\hat{ȷ}$ , & $\hat{κ}$ , those two facts are all you need to know. Good luck!

2.3 : Cl(3) and Pauli Representations

Now we have a representation for 3D G·A!

Our Matrix Representation for 3D G·A

$\begin{aligned} 1 & \mapsto [\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}], & \hat{y} & \mapsto [\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}], \\ \hat{x} & \mapsto [\begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix}], & \hat{ı} & \mapsto [\begin{matrix} 0 & - 1 \\ 1 & 0 \end{matrix}], \\ I & \mapsto [\begin{matrix} i & 0 \\ 0 & i \end{matrix}], & \hat{ȷ} & \mapsto [\begin{matrix} i & 0 \\ 0 & - i \end{matrix}], \\ \hat{κ} & \mapsto [\begin{matrix} 0 & i \\ i & 0 \end{matrix}], & \hat{z} & \mapsto [\begin{matrix} 0 & i \\ - i & 0 \end{matrix}] . \end{aligned}$

We also see that we again have an isomorphism, now between 3D G·A and the space of $2 \times 2$ complex-valued matrices:

{Cl}_{3} ≅ C^{2 \times 2}

The rules for converting between matrices and multivectors are the same as in the 2D case, but now the coefficients are complex numbers, so the real or imaginary parts then yield the specific components we desire.

The Pauli matrices, denoted by $σ_{x}$ , $σ_{y}$ , and $σ_{z}$ , are the matrices we've found for $\hat{x}$ , $\hat{y}$ , and $\hat{z}$ here, up to a relabelling and a sign! Slightly frustratingly, however, Pauli's $σ_{z}$ is our matrix for $\hat{y}$ , and his $σ_{y}$ is the negative of our matrix for $\hat{z}$ .^[4]

The reason I find this frustrating is that it causes the 2D G·A matrices to correspond to the $\hat{x}$ and $\hat{z}$ subspace of 3D G·A, rather than the $\hat{x}$ and $\hat{y}$ subspace you'd expect. (Presumably Pauli didn't start from the 2D perspective like we did.)

The Pauli representation then looks like so:

Pauli Matrix Representation for 3D G·A

$\begin{aligned} 1 & := [\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}], & σ_{z} & := [\begin{matrix} 1 & 0 \\ 0 & - 1 \end{matrix}], \\ σ_{x} & := [\begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix}], & σ_{z x} & := [\begin{matrix} 0 & 1 \\ - 1 & 0 \end{matrix}], \\ σ_{x y z} & := [\begin{matrix} i & 0 \\ 0 & i \end{matrix}], & σ_{x y} & := [\begin{matrix} i & 0 \\ 0 & - i \end{matrix}], \\ σ_{y z} & := [\begin{matrix} 0 & i \\ i & 0 \end{matrix}], & σ_{y} & := [\begin{matrix} 0 & - i \\ i & 0 \end{matrix}] . \end{aligned}$

Here we've denoted products of matrices by combining their indices.
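As a cross-check of this table, the sketch below compares the standard Pauli matrices with the matrices found earlier for $\hat{x}$ , $\hat{y}$ , and $\hat{z}$ , and recomputes the product entries. It is plain Python with hypothetical helper names (`matmul`, `neg`); nothing here comes from a physics library.

```python
def matmul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def neg(A):
    return [[-e for e in row] for row in A]

# Standard Pauli matrices.
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

# Matrices found in this article for the basis vectors.
X = [[0, 1], [1, 0]]
Y = [[1, 0], [0, -1]]
Z = [[0, 1j], [-1j, 0]]

relabelling = {
    "sigma_x = X": SX == X,
    "sigma_z = Y": SZ == Y,
    "sigma_y = -Z": SY == neg(Z),
}
products = {
    "sigma_zx": matmul(SZ, SX) == [[0, 1], [-1, 0]],
    "sigma_xy": matmul(SX, SY) == [[1j, 0], [0, -1j]],
    "sigma_yz": matmul(SY, SZ) == [[0, 1j], [1j, 0]],
    "sigma_xyz": matmul(matmul(SX, SY), SZ) == [[1j, 0], [0, 1j]],
}
print(relabelling, products)
```

Note that both conventions give the same pseudoscalar matrix $i 1$ , so the relabelling preserves orientation.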

This completes the derivation of the Pauli matrices!

Conclusion and Spinor Discussion

As I have hopefully shown, the Pauli matrices are not just arbitrary, and in fact come about naturally when trying to find matrix representations of geometric algebras.

This might make you wonder — what is the connection between geometric algebra and spinors? (After all, the Pauli matrices are supposed to be operators that act on spinors.^[4:1])

The answer, it turns out, is pretty much everything! The 'key idea' in geometric algebra is that vectors are not just objects, but linear transformations (via the geometric product). These linear transformations can, of course, act on other vectors, but they can also act on spinors, represented by two-component column tuples:^[5]

\hat{x} ψ \mapsto [\begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix}] [\begin{matrix} ψ^{0} \\ ψ^{1} \end{matrix}]

These spinors are called Pauli spinors since they work with Pauli vectors.^[5:1]

Matrices can be constructed from Kronecker Products of column tuples with row tuples, and we can analogously think of (multi)-vectors as being constructed from tensor products of spinors with (dual) cospinors.^[5:2] In a sense, that is what geometric algebra does. It recognises that it is not spin-1 vectors that are the "fundamental objects", but instead spin-1/2 spinors. This allows the interpretation of vectors as linear maps (both on spinors and other vectors), which gives rise to the geometric product!

No wonder geometric algebra simplifies so much of physics — we've been treating vectors as "fundamental" this whole time, when it should have been spinors! Mathematics rewards those who listen to her.

Appendix

Thanks for reading, and have a nice day!


[1] A. Macdonald, Linear and Geometric Algebra, reprint. CreateSpace, 2012.
[2] sudgylacmoe, A Swift Introduction to Geometric Algebra.
[3] Wikipedia, Isomorphism.
[4] Wikipedia, Pauli matrices.
[5] eigenchris, Spinors for Beginners.