Quantum Computing

Shortcut to this page: ntrllog.netlify.app/quantum

Notes provided by Professor Negin Forouzesh (CSULA) and Quantum Computing for Computer Scientists

Apparently complex numbers are used a lot in quantum mechanics. So this will start off with some light review of complex numbers and linear algebra.

Complex Numbers (Algebraically)

A complex number `c` is an expression `c=a+bi` where `a,b in RR`. `a` is referred to as the real part of `c` and `b` is the imaginary part of `c`.

With this in mind, we can also define a complex number as an ordered pair of real numbers: `c=(a,b)`.

Addition and Subtraction

Let `c_1=(a_1,b_1)` and `c_2=(a_2,b_2)` be two complex numbers.

`c_1+c_2=(a_1,b_1)+(a_2,b_2)=(a_1+a_2,b_1+b_2)`

`c_1-c_2=(a_1,b_1)-(a_2,b_2)=(a_1-a_2,b_1-b_2)`

Multiplication

`c_1xxc_2=(a_1,b_1)xx(a_2,b_2)=(a_1a_2-b_1b_2,a_1b_2+a_2b_1)`

This comes from the same idea as FOIL.

`c_1xxc_2=(a_1+b_1i)xx(a_2+b_2i)`

`=a_1a_2+a_1b_2i+b_1ia_2+b_1b_2i^2`

`=a_1a_2+a_1b_2i+b_1ia_2-b_1b_2`

`=a_1a_2-b_1b_2+a_1b_2i+a_2b_1i`

`=(a_1a_2-b_1b_2)+(a_1b_2+a_2b_1)i`

Division

`c_1/c_2=\frac{(a_1,b_1)}{(a_2,b_2)}=(a_1a_2+b_1b_2)/(a_2^2+b_2^2)+(a_2b_1-a_1b_2)/(a_2^2+b_2^2)i`

Let `(x,y)=\frac{(a_1,b_1)}{(a_2,b_2)}` be the quotient of two complex numbers.

From this, we get

`(a_1,b_1)=(x,y)xx(a_2,b_2)=(a_2x-b_2y,a_2y+b_2x)`

So we have

`(1): a_1=a_2x-b_2y`

`(2): b_1=a_2y+b_2x`

We can solve for `x` by multiplying `(1)` by `a_2` and multiplying `(2)` by `b_2`:

`a_1a_2=a_2^2x-b_2a_2y`

`b_1b_2=a_2b_2y+b_2^2x`

`a_1a_2+b_1b_2=a_2^2x-b_2a_2y+a_2b_2y+b_2^2x`

`a_1a_2+b_1b_2=a_2^2x+b_2^2x`

`a_1a_2+b_1b_2=x(a_2^2+b_2^2)`

`x=\frac{a_1a_2+b_1b_2}{a_2^2+b_2^2}`

We can solve for `y` by multiplying `(1)` by `b_2` and multiplying `(2)` by `-a_2`:

`a_1b_2=a_2b_2x-b_2^2y`

`-b_1a_2=-a_2^2y-a_2b_2x`

`a_1b_2-b_1a_2=a_2b_2x-b_2^2y-a_2^2y-a_2b_2x`

`a_1b_2-b_1a_2=-b_2^2y-a_2^2y`

`a_1b_2-b_1a_2=-y(b_2^2y+a_2^2)`

`\frac{a_1b_2-b_1a_2}{a_2^2+b_2^2}=-y`

`y=\frac{a_2b_1-a_1b_2}{a_2^2+b_2^2}`

Conjugation

Changing the sign of only the imaginary part of a complex number is called conjugation. Formally, if `c=a+bi`, then the conjugate of `c` is

`bar c=a-bi`.

Modulus

The modulus of a complex number `c=a+bi` is

`+sqrt(a^2+b^2)`

and is denoted as `abs(c)`. It's the "length" of a complex number when thinking about it geometrically.

The modulus is a generalization of the absolute value operation for real numbers.

`abs(a)=+sqrt(a^2)`

Complex Numbers (Geometrically)

As we've seen, we can represent complex numbers as ordered pairs. This naturally lends itself to being graphable on a plane.

Cartesian Representation

For example, the complex number `2+3i` would look like

The complex plane (where the real numbers sit on the horizontal axis and the imaginary numbers sit on the vertical axis) is also called the Argand plane.

So we can think of complex numbers as vectors on the complex plane.

Notice that if we calculate the length of the vector, we get the modulus. So the "length" of a complex number `c=a+bi` is `|c|=+sqrt(a^2+b^2)`.

Polar Representation

To convert a complex number from Cartesian coordinates to polar coordinates, we need `rho` and `theta`:

`rho=sqrt(a^2+b^2)`

`theta=tan^(-1)(b/a)`

To go back from polar to Cartesian:

`a=rho cos(theta)`

`b=rho sin(theta)`

From this, we have that a complex number `c=a+bi` can be written as

`c=rhocos(theta)+rhosin(theta)i`

`=rho(cos(theta)+isin(theta))`

Euler's formula states that `e^(itheta)=cos(theta)+isin(theta)`. Using this, we can rewrite a complex number in exponential form as

`c=rhoe^(itheta)`

Multiplication

Let `c_1=(rho_1,theta_1)` and `c_2=(rho_2,theta_2)`.

`c_1xxc_2=(rho_1,theta_1)xx(rho_2,theta_2)=(rho_1rho_2,theta_1+theta_2)`

Complex Vector Spaces

A complex vector space is a set of vectors where each element of a vector is a complex number.

For example, `CC^4` is the set of `4`-element vectors and an element of `CC^4` would look like this:

`[[6-4i],[7+3i],[4.2-8.1i],[-3i]]`

Just like with regular vectors, adding complex vectors is the same.

`CC^n` with vector addition is an Abelian group, which means:

`CC^n` is closed under vector addition
vector addition is associative
`CC^n` has an identity element: the zero vector
every vector in `C^n` has an inverse
vector addition is commutative

(The first four properties are what makes a group and the last property is what makes a group Abelian.)

And so is scalar multiplication.

`CC^n` with scalar multiplication (and vector addition) is a complex vector space, which means:

`1*V=V`
`c_1*(c_2*V)=(c_1xxc_2)*V`
`c*(V+W)=c*V+c*W`
`(c_1+c_2)*V=c_1*V+c_2*V`

`CC^(mxxn)` is the set of `mxxn` matrices, and it is also a complex vector space.

Matrix addition, scalar multiplication, and the transpose is the same as regular matrices.

For a complex matrix `A`,

the conjugate of `A`, denoted `bar A`, is obtained by conjugating each element of `A`
the adjoint of `A`, denoted `A^†`, is obtained by transposing and conjugating `A`

Basis

A set of complex vectors `B` is a basis of a complex vector space if

every vector can be written as a linear combination of vectors from `B`
`B` is linearly independent

Dimension

The dimension of a complex vector space is the number of elements in a basis of the space.

Inner Products

For two vectors `V_1=[[r_0],[r_1],[vdots],[r_n]]` and `V_2=[[r_0'],[r_1'],[vdots],[r_n']]`, the inner product is defined as

`(:V_1,V_2:)=V_1^T***V_2=sum_(j=0)^nr_jr_j'`

For two matrices `A,B in RR^(nxxn)`, the inner product is defined as

`(:A,B:)=Trace(A^T***B)`

where the trace of a matrix is the sum of the diagonal elements of the matrix.

For complex vectors, the inner product is defined as

`(:V_1,V_2:)=V_1^†***V_2`

For two matrices `A,B in CC^(nxxn)`, the inner product is defined as

`(:A,B:)=Trace(A^†***B)`

Norm

The norm (length) of a vector `V` is defined as

`abs(V)=sqrt((:V,V:))`

Orthogonal/Orthonormal

Two vectors are orthogonal if their inner product is equal to `0`.

A basis is orthogonal if all the vectors in the basis are pairwise orthogonal to each other.

`(:V_j,V_k:)=0` for `j ne k`

An orthogonal basis is orthonormal if all the vectors in the basis is of norm `1`.

`(:V_j,V_k:)={(1,if j=k),(0,if j ne k):}`

Trig

In `RR^3`, the inner product `(:V,V':)` is the same as

`abs(V)abs(V')costheta`

where `theta` is the angle between `V` and `V'`.

Eigenvalues and Eigenvectors

For a matrix `A in CC^(nxxn)`, if there is a number `c in CC` and a vector `V ne 0` such that

`AV=c cdotV`

then `c` is an eigenvalue of `A` and `V` is an eigenvector of `A` associated with `c`.

Hermitian Matrices

A complex matrix `A` is hermitian if `A^†=A`.

Unitary Matrices

A matrix is unitary if the matrix is invertible, and its inverse is its adjoint.

`U***U^†=U^†***U=I`

A matrix `A` is invertible if there exists `A^(-1)` such that

`A***A^(-1)=A^(-1)***A=I`

`I` is the identity matrix, which means only `1`s along the diagonal and `0`s everywhere else.

Unitary matrices preserve inner products: `(:UV,UV':)=(:V,V':)`.

`(:UV,UV':)`

`=(UV)^†***UV'`

`=V^†U^†***UV'`

`=V^†***I***V'`

`=V^†***V'`

`=(:V,V':)`

Unitary matrices also preserve norms: `abs(UV)=abs(V)`

`abs(UV)`

`=sqrt((:UV,UV:))`

`=sqrt((:V,V:))`

`=abs(V)`

Tensor Product

For vectors,

`[[x],[y]]ox[[a],[b],[c]]=[[x cdot [[a],[b],[c]]],[y cdot [[a],[b],[c]]]]`

For matrices,

`[[a_(0,0),a_(0,1)],[a_(1,0),a_(1,1)]]ox[[b_(0,0),b_(0,1),b_(0,2)],[b_(1,0),b_(1,1),b_(1,2)],[b_(2,0),b_(2,1),b_(2,2)]]=[[a_(0,0) cdot [[b_(0,0),b_(0,1),b_(0,2)],[b_(1,0),b_(1,1),b_(1,2)],[b_(2,0),b_(2,1),b_(2,2)]], a_(0,1) cdot [[b_(0,0),b_(0,1),b_(0,2)],[b_(1,0),b_(1,1),b_(1,2)],[b_(2,0),b_(2,1),b_(2,2)]]], [a_(1,0) cdot [[b_(0,0),b_(0,1),b_(0,2)],[b_(1,0),b_(1,1),b_(1,2)],[b_(2,0),b_(2,1),b_(2,2)]], a_(1,1) cdot [[b_(0,0),b_(0,1),b_(0,2)],[b_(1,0),b_(1,1),b_(1,2)],[b_(2,0),b_(2,1),b_(2,2)]]]]`

`=[[a_(0,0)xxb_(0,0),a_(0,0)xxb_(0,1),a_(0,0)xxb_(0,2),a_(0,1)xxb_(0,0),a_(0,1)xxb_(0,1),a_(0,1)xxb_(0,2)],[a_(0,0)xxb_(1,0),a_(0,0)xxb_(1,1),a_(0,0)xxb_(1,2),a_(0,1)xxb_(1,0),a_(0,1)xxb_(1,1),a_(0,1)xxb_(1,2)],[a_(0,0)xxb_(2,0),a_(0,0)xxb_(2,1),a_(0,0)xxb_(2,2),a_(0,1)xxb_(2,0),a_(0,1)xxb_(2,1),a_(0,1)xxb_(2,2)],[a_(1,0)xxb_(0,0),a_(1,0)xxb_(0,1),a_(1,0)xxb_(0,2),a_(1,1)xxb_(0,0),a_(1,1)xxb_(0,1),a_(1,1)xxb_(0,2)],[a_(1,0)xxb_(1,0),a_(1,0)xxb_(1,1),a_(1,0)xxb_(1,2),a_(1,1)xxb_(1,0),a_(1,1)xxb_(1,1),a_(1,1)xxb_(1,2)],[a_(1,0)xxb_(2,0),a_(1,0)xxb_(2,1),a_(1,0)xxb_(2,2),a_(1,1)xxb_(2,0),a_(1,1)xxb_(2,1),a_(1,1)xxb_(2,2)]]`

Quantum Architecture

Moving away from linear algebra for a bit (heh), we'll look at the components of quantum computers.

Bits and Qubits

A bit is a unit of information describing a classical system that has only two states (a two-dimensional classical system). Typically these states can be thought of as "on" and "off" or "true" and "false" and are represented by `0` and `1`.

We denote the states as `|0:)` and `|1:)`, which can be represented as matrices `[[1],[0]]` and `[[0],[1]]` respectively. (Think of it as having a `1` in the `0^text(th)` index, so that represents the `|0:)` state and having a `1` in the `1^text(st)` index, so that represents the `|1:)` state.)

The `|` with the `:)` is called Dirac ket notation.

A qubit is a quantum bit. So it is a unit of information describing a two-dimensional quantum system. We can also represent a qubit using a matrix, but instead of just `0`s and `1`s, the entries are complex numbers. So `[[c_0],[c_1]]` where `abs(c_0)^2+abs(c_1)^2=1` represents a qubit.

`abs(c_0)^2` is the probability that the qubit is in state `|0:)` and `abs(c_1)^2` is the probability that the qubit is in state `|1:)`. In a sense, a qubit is in both states at the same time until it is measured. Once a qubit is measured, it collapses to become a bit, which at that point, means it is definitively either in state `|0:)` or `|1:)`.

Since the bits `|0:)` and `|1:)` are a basis of `CC^2`, any qubit can be written as

`[[c_0],[c_1]]=c_0 cdot [[1],[0]]+c_1 cdot [[0],[1]]=c_0``|0:)+c_1``|1:)`

In general, an arbitrary state `|psi:)` is a linear combination of states `|x_0:),``|x_1:),...,``|x_(n-1):)` by complex weights `c_0,c_1,...,c_(n-1)`. These weights are called complex amplitudes.

Any state can be represented by an element of `CC^n` as

`|psi:)``|->[[c_0],[c_1],[vdots],[c_(n-1)]]`

Let's consider the complex vector `V=[[5+3i],[6i]]`. We can turn this complex vector into a qubit by normalizing it, that is, dividing each entry by the length of the vector (recall that the sum of the square of the entries has to equal `1` for it to be a qubit).

The length of the vector is

`abs(V)=sqrt((:V,V:))=sqrt([5-3i,-6i][[5+3i],[6i]])=sqrt(34+36)=sqrt(70)`

So the qubit is

`V/sqrt(70)=[[(5+3i)/sqrt(70)],[(6i)/sqrt(70)]]=(5+3i)/sqrt(70)``|0:)``+(6i)/sqrt(70)``|1:)`

After measuring the qubit, the probability that it is found in state `|0:)` is `34/70` (`abs(c_0)^2`) and the probability that it is found in state `|1:)` is `36/70` (`abs(c_1)^2`).

`|psi:)` is a superposition of the states `|x_0:),``|x_1:),...,``|x_(n-1):)`. That is, `|psi:)` represents a qubit as being simultaneously in all `{x_0,x_1,...,x_(n-1)}` states.

The probability that a qubit is at point `x_i` is

`p(x_i)=abs(c_i)^2/abs(|psi:)^2=abs(c_i)^2/(sum_jabs(c_j)^2)`

`[[0],[1],[0],[0]]`

The tensor product is implicitly understood, so it can also be denoted as `|01:)`.

If we pair up all the possible qubits, we get

`|00:)``=[[1],[0],[0],[0]]`

`|01:)``=[[0],[1],[0],[0]]`

`|10:)``=[[0],[0],[1],[0]]`

`|11:)``=[[0],[0],[0],[1]]`

Again, the indexes idea works here:

There's a `1` in the `0^text(th)` index so that represents "`0`".
There's a `1` in the `1^text(st)` index so that represents "`1`".
There's a `1` in the `2^text(nd)` index so that represents "`2`".
There's a `1` in the `3^text(rd)` index so that represents "`3`".

Classical Gates

Gates are operators that manipulate bits. They take in bits as input and outputs one bit. The classical gates are pretty much just the boolean operators. We can represent gates with matrices.

NOT Gate

NOT of `|0:)` equals `|1:)` and NOT of `|1:)` equals `|0:)`. The matrix representation is

`[[0,1],[1,0]]`

We can confirm that the matrix above is indeed the matrix for the NOT operator.

`[[0,1],[1,0]][[1],[0]]=[[0],[1]]`

NOT of `|0:)` equals `|1:)`

`[[0,1],[1,0]][[0],[1]]=[[1],[0]]`

NOT of `|1:)` equals `|0:)`

AND Gate

The matrix representation of the AND gate is

`[[1,1,1,0],[0,0,0,1]]`

We can confirm that the matrix above is indeed the matrix for the AND operator.

`[[1,1,1,0],[0,0,0,1]][[1],[0],[0],[0]]=[[1],[0]]`

AND `|00:)` equals `|0:)`

`[[1,1,1,0],[0,0,0,1]][[0],[1],[0],[0]]=[[1],[0]]`

AND `|01:)` equals `|0:)`

`[[1,1,1,0],[0,0,0,1]][[0],[0],[1],[0]]=[[1],[0]]`

AND `|10:)` equals `|0:)`

`[[1,1,1,0],[0,0,0,1]][[0],[0],[0],[1]]=[[0],[1]]`

AND `|11:)` equals `|1:)`

OR Gate

The matrix representation of the OR gate is

`[[1,0,0,0],[0,1,1,1]]`

NAND Gate

NAND is "not and". The matrix representation of the NAND gate is (unsurprisingly)

`[[0,0,0,1],[1,1,1,0]]`

Sequential Operations

There's another way to derive the matrix for the NAND gate. NAND is the result of performing the AND operation followed by the NOT operation. We can sequentially apply the operations one by one, but we can also multiply their matrices to get the matrix that produces the same result.

`NOT *** AND = [[0,1],[1,0]] *** [[1,1,1,0],[0,0,0,1]] = [[0,0,0,1],[1,1,1,0]]`

If the input of one operation depends on the output of another operation, then those are sequential operations.

Parallel Operations

If two operations are working with different bits and their inputs and outputs don't depend on each other, then those operations are parallel. With parallel operations, we perform tensor multiplication between them.

For example, consider one of DeMorgan's laws:

`not(not P ^^ not Q) = P vv Q`

The matrix representation of this is

`NOT *** AND *** (NOT otimes NOT) = OR`

The two NOT operators in `not P ^^ not Q` are acting on different bits, so they are parallel.

The outermost NOT operator can only be performed after the AND operator, so they are sequential.

Reversible Gates

In the quantum world, only reversible operations are allowed. Reversible means that you can look at the output and know what the original input was.

For example, the NOT operation is reversible because if the output is `|0:)`, we know the input was `|1:)` (and vice versa). The AND operation is not reversible because if the output is `|0:)`, we don't know if the input was `|00:)`, `|01:)`, or `|10:)`.

A computer with reversible circuits and programs would (theoretically) not use any energy. Let's see how we get to this conclusion.

Erasing information (as opposed to writing information) causes energy loss and heat. Landauer showed this, so this idea has become known as Landauer's principle.

Writing information is a reversible process. Imagine there's a blank chalkboard. If we write something and someone else comes in and erases it (i.e. reverses the writing) then the board gets returned to its original state.

On the other hand, erasing information is an irreversible process. If there's already writing on the chalkboard and we erase it, someone else coming in would have no idea what was originally written on the chalkboard, i.e., they would not know how to reverse the erasing to bring the chalkboard back to its original state.

When we erase the chalkboard, the chalk particles get dispersed into the air, which is analogous to energy being dispersed. So erasing information causes energy loss.

So a computer with irreversible circuits uses energy whereas a computer with reversible circuits doesn't.

Controlled-NOT (CNOT) Gate

Another reversible gate is the controlled-NOT gate. It has two inputs and two outputs.

The top input is called the control bit because it controls what the output will be. The black filled circle indicates that the input to the left of it is the control bit. If the control bit is `|0:)`, then the bottom output stays the same.

The inputs are `|x:)=``|0:)`, `|y:)=``|0:)` and the outputs are `|x:)=``|0:)`, `|y:)=``|0:)`.

If the control bit is `|1:)`, then the bottom output is calculated depending on what type of operation the gate is. In the CNOT gate, the bottom output is the exclusive or of `|x:)` and `|y:)`.

The inputs are `|x:)=``|1:)`, `|y:)=``|0:)` and the outputs are `|x:)=``|1:)`, `|y:)=``|1:)`.

The CNOT gate takes `|x,y:)` to `|x,x oplus y:)`.

The matrix representation of the CNOT gate is

`[[1,0,0,0],[0,1,0,0],[0,0,0,1],[0,0,1,0]]`

There's another way to derive and interpret the matrix representation.

`{:[00,01,10,11]:}`

`{:[00],[01],[10],[11]:}[[1,0,0,0],[0,1,0,0],[0,0,0,1],[0,0,1,0]]`

(ignore the bad alignment; idk how to do this with AsciiMath)

The numbers to the left of the matrix can be seen as the inputs to the gate. So `00` is `|x:)=``|0:),``|y:)=``|0:)` and `01` is `|x:)=``|0:),``|y:)=``|1:)`. The `1` in the matrix means that whatever column the `1` is in is the output.

So for `|x:)=``|1:),``|y:)=``|1:)`, the output is `|x:)=``|1:),``|y:)=``|0:)`, which is consistent with the behavior of the CNOT gate.

The CNOT gate is reversible because it can be reversed by itself (meaning that we apply the CNOT operation twice). Here's what that looks like:

Informally, the construction of this gate can be thought of as:

The top output of the CNOT gate is just a copy of the top input. The bottom output of the CNOT gate is the top input `oplus` the bottom input.

So if we were to attach the CNOT gate again, then the top output is just a copy of the top input. And the bottom output is the top input `oplus` the bottom input (the bottom input is `|x oplus y:)`).

`x oplus x` equals `0` and `0 oplus y` equals `y`, so the final output is always `y`.

Toffoli Gate

The Toffoli gate is another reversible gate. It has three inputs and three outputs.

This operation takes `|x,y,z:)` to `|x,y,z oplus (x ^^ y):)`.

The matrix representation of the Toffoli gate is

`{:[000,001,010,011,100,101,110,111]:}`

`{:[000],[001],[010],[011],[100],[101],[110],[111]:}[[1,0,0,0,0,0,0,0],[0,1,0,0,0,0,0,0],[0,0,1,0,0,0,0,0],[0,0,0,1,0,0,0,0],[0,0,0,0,1,0,0,0],[0,0,0,0,0,1,0,0],[0,0,0,0,0,0,0,1],[0,0,0,0,0,0,1,0]]`

Fredkin Gate

The Fredkin gate is another reversible gate. It has three inputs and three outputs. For this gate, if `|x:)=``|0:)`, then all the outputs are the same as the input, i.e., `|y':)=``|y:)` and `|z':)=``|z:)`. If `|x:)=``|1:)`, then the outputs are reversed, i.e., `|y':)=``|z:)` and `|z':)=``|y:)`.

The `xx` represents a swap gate.

Mathematically, `|0,y,z:) mapsto ``|0,y,z:)` and `|1,y,z:) mapsto ``|1,z,y:)`.

The matrix representation of the Fredkin gate is

`{:[000,001,010,011,100,101,110,111]:}`

`{:[000],[001],[010],[011],[100],[101],[110],[111]:}[[1,0,0,0,0,0,0,0],[0,1,0,0,0,0,0,0],[0,0,1,0,0,0,0,0],[0,0,0,1,0,0,0,0],[0,0,0,0,1,0,0,0],[0,0,0,0,0,0,1,0],[0,0,0,0,0,1,0,0],[0,0,0,0,0,0,0,1]]`

Quantum Gates

(The reversible gates are also quantum gates, but idk, we'll have a separate quantum gates section.)

Hadamard Gate

The Hadamard gate is a gate that puts a qubit into a superposition where both states `|0:)` and `|1:)` are equally likely. The matrix representation is

`[[1/sqrt(2),1/sqrt(2)],[1/sqrt(2),-1/sqrt(2)]]`

This is what it looks like to apply the Hadamard matrix to a qubit in state `|0:)`:

`H``|0:)``=[[1/sqrt(2),1/sqrt(2)],[1/sqrt(2),-1/sqrt(2)]][[1],[0]]`

`=[[1/sqrt(2)],[1/sqrt(2)]]`

`=1/sqrt(2)[[1],[0]]+1/sqrt(2)[[0],[1]]`

`=1/sqrt(2)``|0:)``+1/sqrt(2)``|1:)`

This means that the qubit has a `(1/sqrt(2))^2=1/2` chance to be in state `|0:)` or `|1:)` when measured.

Suppose we're in a two-qubit system `|q_1q_0:)` and both qubits start in state `|0:)`. This is what it looks like to apply the Hadamard gate to `q_0`:

`|0:)otimes H``|0:)`

`=``|0:)``otimes1/sqrt(2)``|0:)``+1/sqrt(2)``|1:)`

`=1/sqrt(2)``|00:)``+1/sqrt(2)``|01:)`

Pauli Matrices

The matrices

`X=[[0,1],[1,0]]`

`Y=[[0,-i],[i,0]]`

`Z=[[1,0],[0,-1]]`

are called the Pauli matrices. Apparently, they're used everywhere in quantum computing.

Bloch Sphere

Similar to how a complex number can be visualized as an arrow on a 2D plane, a qubit can be visualized as an arrow in a sphere.

Let `|psi:)=c_0``|0:)+c_1``|1:)` where `abs(c_0)^2+abs(c_1)^2=1` be a qubit.

Rewriting it in polar exponential form (using `c_0=r_0e^(iphi_0)` and `c_1=r_1e^(iphi_1)`), we get

`|psi:)``=r_0e^(iphi_0)``|0:)``+r_1e^(iphi_1)``|1:)`

It turns out that if we multiply a quantum state by a (complex) number, the state doesn't change. So we can multiply both sides by `e^(-iphi_0)` to get

`e^(-iphi_0)``|psi:)``=e^(-iphi_0)(``r_0e^(iphi_0)``|0:)``+r_1e^(iphi_1)``|1:))`

`=r_0``|0:)``+r_1e^(iphi_1-iphi_0)``|1:)`

`=r_0``|0:)``+r_1e^(i(phi_1-phi_0))``|1:)`

`=r_0``|0:)``+r_1e^(iphi)``|1:)`

(let `phi=(phi_1-phi_0)`)

To simplify this further, we'll take a step back for a bit. Since this is a qubit, we have `abs(c_0)^2+abs(c_1)^2=1`. Which means we have

`abs(r_0e^(iphi_0))^2+abs(r_1e^(iphi_1))^2=1`

`abs(r_0)^2abs(e^(iphi_0))^2+abs(r_1)^2abs(e^(iphi_1))^2=1`

Note that `e^(-iphi)` has norm `1`.

`e^(-iphi)=cos(phi)+isin(-phi)=cos(phi)-isin(phi)`

(Recall that the norm/modulus of a complex number is `sqrt(a^2+b^2)`.)

`abs(e^(-iphi))=sqrt(cos^2(phi)+(-sin(phi))^2)=sqrt(cos^2(phi)+sin^2(phi))=sqrt(1)=1`

So we end up with

`abs(r_0)^2+abs(r_1)^2=1`

`r_0^2+r_1^2=1`

We can substitute `r_0=cos(theta)` and `r_1=sin(theta)` and go back to simplifying:

`|psi:)``=r_0``|0:)``+r_1e^(iphi)``|1:)`

`=cos(theta)``|0:)``+e^(iphi)sin(theta)``|1:)`

All of this is to show that a qubit can be written in terms of `2` parameters: `phi` and `theta`.

`0 le phi lt 2pi` and `0 le theta lt pi/2` are enough to cover all possible qubits.

The north pole of the Bloch sphere corresponds to `|0:)` and the south pole corresponds to `|1:)`. If the qubit is pointing more north, then it is more likely to collapse to `|0:)` when it is measured.

`theta` controls the chance that the qubit collapses to `|0:)` or to `|1:)`.

If we rotate the qubit around the `z` axis, we are performing a phase change.

`phi` controls the qubit's phase.

The `Y` Pauli matrix flips the Bloch sphere `180^@` about the `y` axis.

Quantum Theory

Now we can really get into the quantum stuff. You know, the stuff like Schrödinger's cat and Heisenberg's uncertainty principle. Well, not these two concepts specifically, but the general ideas behind these concepts, like things not being predictable; things being in two states at the same time; and all that.

Classical Deterministic Systems

(The book uses marbles and vertices, but it's Halloween season as I'm writing this, so mazes and people instead!) Imagine we had a maze with `6` rooms.

There are currently `6` people in room `0`, `2` people in room `1`, and so on. The arrows point to the next room that everyone will be moving to. So after following the arrows once (which we'll refer to as a "time click"), all `6` people in room `0` will be moving to room `5`, and all `10` people in room `5` will be moving to room `2`, and so on.

We can represent this transition as a matrix:

`[[0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0],[0, 1, 0, 0, 0, 1],[0, 0, 0, 1, 0, 0],[0, 0, 1, 0, 0, 0],[1, 0, 0, 0, 1, 0]]`

(This matrix can be interpreted as: the room indicated by the row index is receiving from the room indicated by the column index. So room `4` (row index `4`) is receiving from room `2` (column index `2`)).

In my head, I'm calling this a "transition matrix". It can be officially referred to as the adjacency matrix.

If we represent the current state of the maze with the vector `[[6],[2],[1],[5],[3],[10]]`, we can multiply it by the matrix to find out what the next state of the maze will be:

`[[0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0],[0, 1, 0, 0, 0, 1],[0, 0, 0, 1, 0, 0],[0, 0, 1, 0, 0, 0],[1, 0, 0, 0, 1, 0]][[6],[2],[1],[5],[3],[10]]=[[0],[0],[12],[5],[1],[9]]`

For example, this means that there will be `12` people in room `2`, which we can manually confirm is true.

If we want to get the state of the maze after two time clicks, we repeat the process but with the new current state.

`[[0, 0, 0, 0, 0, 0],[0, 0, 0, 0, 0, 0],[0, 1, 0, 0, 0, 1],[0, 0, 0, 1, 0, 0],[0, 0, 1, 0, 0, 0],[1, 0, 0, 0, 1, 0]][[0],[0],[12],[5],[1],[9]]`

Let `M` be the transition matrix. If `X` is the state of the system at time `t`, then `Y` is the state of the system at time `t+1`.

`Y[i]=(MX)[i]=sum_(k=0)^5M[i,k]X[k]`

In words, this is saying that the number of objects at vertex `i` is equal to the sum of all the objects that are on vertices with edges connecting to vertex `i`.

Probabilistic Systems

Let's consider a different type of maze with `3` rooms and only `1` person.

For example, if the person is in room `0`, they have a `1/3` chance of choosing to go to room `1` and a `2/3` chance of choosing to go to room `2`.

`[[0,1/6,5/6],[1/3,1/2,1/6],[2/3,1/3,0]]`

Notice that each row sums up to `1` and each column sums up to `1`. This is a doubly stochastic matrix.

Suppose we came late and we don't know where the person started. Let's say there's a `1/6` chance they're in room `0`, a `1/6` chance they're in room `1`, and a `2/3` chance they're in room `2`, i.e., the current state of the maze can be represented as `[[1/6],[1/6],[2/3]]`.

We can find out the probabilities for which room they might be in after one time click by multiplying the transition matrix by the current state:

`[[0,1/6,5/6],[1/3,1/2,1/6],[2/3,1/3,0]][[1/6],[1/6],[2/3]]=[[21/36],[9/36],[6/36]]`

So after following the arrows once, there's a `21/36` chance the person is in room `0`, a `9/36` chance the person is in room `1`, and a `6/36` chance the person is in room `2`.

Just for fun, let's take a look at the tranpose of the transition matrix:

`[[0,1/3,2/3],[1/6,1/2,1/3],[5/6,1/6,0]]`

Notice that this is the same system but with the arrows reversed. It's like going back in time (by one time click) or reversing the footage.

The transpose of the transition matrix takes a state from time `t` to `t-1`.

Probabilistic Double-Slit Experiment

(The book uses bullets and targets, but bullets kinda seem like they don't really make sense in this example.) Suppose we're playing a carnival game where we're throwing baseballs through one of two holes. Behind the holes is a mechanism that will then shoot the baseballs at one of three targets. (Assume we're good enough that we always successfully throw the baseballs into a hole, and that the mechanism always hits a target.)

At position `0`, the baseball has a `1/2` chance to go through hole `2`, which then has a `1/3` chance of hitting target `6`.

Let's say it takes one time click for a baseball to reach the hole and another time click for the baseball to hit a target. The transition matrix is

`[[0,0,0,0,0,0,0,0],[1/2,0,0,0,0,0,0,0],[1/2,0,0,0,0,0,0,0],[0,1/3,0,1,0,0,0,0],[0,1/3,0,0,1,0,0,0],[0,1/3,1/3,0,0,1,0,0],[0,0,1/3,0,0,0,1,0],[0,0,1/3,0,0,0,0,1]]`

As we saw earlier in the deterministic maze, we can get the state of the system after two time clicks by calculating the next state twice. Another way to get the state of the system after two time clicks is to instead multiply the transition matrix by itself twice:

`[[0,0,0,0,0,0,0,0],[1/2,0,0,0,0,0,0,0],[1/2,0,0,0,0,0,0,0],[0,1/3,0,1,0,0,0,0],[0,1/3,0,0,1,0,0,0],[0,1/3,1/3,0,0,1,0,0],[0,0,1/3,0,0,0,1,0],[0,0,1/3,0,0,0,0,1]][[0,0,0,0,0,0,0,0],[1/2,0,0,0,0,0,0,0],[1/2,0,0,0,0,0,0,0],[0,1/3,0,1,0,0,0,0],[0,1/3,0,0,1,0,0,0],[0,1/3,1/3,0,0,1,0,0],[0,0,1/3,0,0,0,1,0],[0,0,1/3,0,0,0,0,1]]`

`=[[0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0],[1/6,1/3,0,1,0,0,0,0],[1/6,1/3,0,0,1,0,0,0],[1/3,1/3,1/3,0,0,1,0,0],[1/6,0,1/3,0,0,0,1,0],[1/6,0,1/3,0,0,0,0,1]]`

If the baseball starts at position `0`, then the state of the baseball is `[[1],[0],[0],[0],[0],[0],[0],[0]]`. After two time clicks, the state of the baseball is

`[[0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0],[1/6,1/3,0,1,0,0,0,0],[1/6,1/3,0,0,1,0,0,0],[1/3,1/3,1/3,0,0,1,0,0],[1/6,0,1/3,0,0,0,1,0],[1/6,0,1/3,0,0,0,0,1]][[1],[0],[0],[0],[0],[0],[0],[0]]=[[0],[0],[0],[1/6],[1/6],[1/3],[1/6],[1/6]]`

Notice that the probability of the baseball hitting target `5` is `1/3` because there is a `1/6+1/6=1/3` chance of hitting target `5`.

Quantum Systems

The difference between quantum systems and classical probabilistic systems is that quantum systems deal with complex numbers as probabilities while classical probabilistic systems use real numbers as probabilities.

Probabilities in quantum systems are calculated as the square of the modulus of the complex number, i.e., `0 le abs(c)^2 le 1`.

The fact that complex numbers are used as probabilities is significant because complex probabilities can decrease when added together. Real-numbered probabilities can only increase when added together.

For real numbers `0 le p_1,p_2 le 1`, we have that `p_1+p_2gep_1` and `p_1+p_2gep_2`.

For complex numbers `c_1,c_2`, it is not necessarily true that `abs(c_1+c_2)^2 ge abs(c_1)^2` and that `abs(c_1+c_2)^2 ge abs(c_2)^2`.

For example, if `c_1=5+3i` and `c_2=-3-2i`, then

`abs(c_1)^2=(sqrt(5^2+3^2))^2=34`

`abs(c_2)^2=(sqrt((-3)^2+(-2)^2))^2=13`

but

`abs(c_1+c_2)^2=abs(2-i)^2=5`

which is less than both `34` and `13`.

The fact that complex numbers can cancel each other out when added is referred to as interference.

An example of a quantum system could look like this:

for which the corresponding transition matrix is

`U=[[1/sqrt(2),1/sqrt(2),0],[(-i)/sqrt(2),i/sqrt(2),0],[0,0,i]]`

Notice that this is a unitary matrix, meaning that its inverse is its adjoint.

`U***U^†=[[1/sqrt(2),1/sqrt(2),0],[(-i)/sqrt(2),i/sqrt(2),0],[0,0,i]][[1/sqrt(2),i/sqrt(2),0],[1/sqrt(2),-i/sqrt(2),0],[0,0,-i]]=[[1,0,0],[0,1,0],[0,0,1]]=I`

Also notice that the modulus squared of all the entries results in a doubly stochastic matrix:

`abs(U[i,j])^2=[[1/2,1/2,0],[1/2,1/2,0],[0,0,1]]`

Similar to how we saw in the classical system,

the transition matrix `U` takes a state from time `t` to `t+1`
the adjoint of the transition matrix `U^†` takes a state from time `t` to `t-1`
- for a state `V`, `V mapsto UV mapsto U^†UV=IV=V`
- if we perform an operation, we can "undo" the operation to get back to the original state before the operation

Quantum Double-Slit Experiment

Instead of baseballs and targets, we will now look at photons and measuring devices. We will also deal with complex probabilities, but the setup is still the same as the classical one.

Note that `abs(1/sqrt(2))^2=1/2` and `abs((+-1+-i)/sqrt(6))^2=1/3`.

The adjacency matrix is

The modulus squared of the matrix is

`[[0,0,0,0,0,0,0,0],[1/2,0,0,0,0,0,0,0],[1/2,0,0,0,0,0,0,0],[0,1/3,0,1,0,0,0,0],[0,1/3,0,0,1,0,0,0],[0,1/3,1/3,0,0,1,0,0],[0,0,1/3,0,0,0,1,0],[0,0,1/3,0,0,0,0,1]]`

This is actually the exact same transition matrix as the transition matrix for the baseball game after one time click. (Which, if we think about it, shouldn't be too surprising since the probabilities in both systems are the same.)

But let's see what the transition matrix is after two time clicks.

`[[0,0,0,0,0,0,0,0],[1/sqrt(2),0,0,0,0,0,0,0],[1/sqrt(2),0,0,0,0,0,0,0],[0,(-1+i)/sqrt(6),0,1,0,0,0,0],[0,(-1-i)/sqrt(6),0,0,1,0,0,0],[0,(1-i)/sqrt(6),(-1+i)/sqrt(6),0,0,1,0,0],[0,0,(-1-i)/sqrt(6),0,0,0,1,0],[0,0,(1-i)/sqrt(6),0,0,0,0,1]][[0,0,0,0,0,0,0,0],[1/sqrt(2),0,0,0,0,0,0,0],[1/sqrt(2),0,0,0,0,0,0,0],[0,(-1+i)/sqrt(6),0,1,0,0,0,0],[0,(-1-i)/sqrt(6),0,0,1,0,0,0],[0,(1-i)/sqrt(6),(-1+i)/sqrt(6),0,0,1,0,0],[0,0,(-1-i)/sqrt(6),0,0,0,1,0],[0,0,(1-i)/sqrt(6),0,0,0,0,1]]`

`=[[0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0],[(-1+i)/sqrt(12),(-1+i)/sqrt(6),0,1,0,0,0,0],[(-1-i)/sqrt(12),(-1-i)/sqrt(6),0,0,1,0,0,0],[0,(1-i)/sqrt(6),(-1+i)/sqrt(6),0,0,1,0,0],[(-1-i)/sqrt(12),0,(-1-i)/sqrt(6),0,0,0,1,0],[(1-i)/sqrt(12),0,(1-i)/sqrt(6),0,0,0,0,1]]`

In terms of the modulus squared:

`=[[0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0],[0,0,0,0,0,0,0,0],[1/6,1/3,0,1,0,0,0,0],[1/6,1/3,0,0,1,0,0,0],[0,1/3,1/3,0,0,1,0,0],[1/6,0,1/3,0,0,0,1,0],[1/6,0,1/3,0,0,0,0,1]]`

This is actually almost the exact same transition matrix as the transition matrix for the baseball game after two time clicks. The only difference is row index `5`, column index `0`.

So what happened here?

There's a `1/sqrt(2)((1-i)/sqrt(6))+1/sqrt(2)((-1+i)/sqrt(6))=(1-i)/sqrt(12)+(-1+i)/sqrt(12)=0/sqrt(12)=0` chance of hitting measurement device `5`.

This is a seemingly strange phenomenon: even though there are two ways for a photon to reach measurement device `5`, there will be no photons measured by device `5`. Interference at its finest!

Let the state of a quantum system be given by `X=[[c_0],[c_1],[vdots],[c_(n-1)]]`.

It is wrong to say that the probability of a photon being in position `k` is `abs(c_k)^2`. It is correct to say that the photon in state `X` is in all positions `c_0,c_1,...c_(n-1)` simultaneously. (Recall superposition!)

The photon passes through both slits simultaneously and cancels itself out when it exits both slits.

The probabilities come into play when we perform a measurement, which causes the photon to collapse to position `k` with probability `abs(c_k)^2`.

Spin

Electrons "spin". I think there are infinitely many "directions" it could be spinning, but when we measure an electron, it can only be found spinning "up" or "down".

Refer to the Stern-Gerlach experiment.

The generic state for the spin is

`|psi:)=c_0``|uarr:)+c_1``|darr:)`

State Transitions

Suppose we have a start state `|psi:)``=[[c_0],[c_1],[vdots],[c_(n-1)]]` and an end state `|psi':)``=[[c'_0],[c'_1],[vdots],[c'_(n-1)]]`. How likely is it that the start state will end up in the end state? We can measure this by looking at the transition amplitude. To calculate the transition amplitude, we calculate the inner product.

We start by getting the adjoint of the end state. This state is called a bra, and it is denoted as `(:psi'|`.

`(:psi'|``=|:psi':)^†=[bar (c'_0), bar (c'_1), ..., bar (c'_(n-1))]`

The transition amplitude is then calculated as

`(:psi'|psi:)=[bar (c'_0), bar (c'_1), ..., bar (c'_(n-1))][[c_0],[c_1],[vdots],[c_(n-1)]]`

`=bar (c'_0)xxc_0+bar (c'_1)xxc_1+...+bar (c'_(n-1))xxc_(n-1)`

It is possible for a transition amplitude to be `0`. This means that it is impossible for a state to transition to another state, i.e., the states are orthogonal to each other. For example, if an electron is in spin up before being measured, it will never change to spin down after being measured.

This relates to the dot product between two orthogonal vectors being `0`.

Let `{``|b_0:)``, ``|b_1:)``, ..., ``|b_(n-1):)``}` be an orthonormal basis representing a maximal list of mutually exclusive end states. We can write a state `|psi:)` in the basis as

`|psi:)=c_0``|b_0:)+c_1``|b_1:)+...+c_(n-1)``|b_(n-1):)`

Quantum Entanglement

Consider a two-particle system where the first particle can only be either `x_0` or `x_1` and the second particle can only be either `y_0` or `y_1`.

Suppose we have a state

`|psi:)=``|x_0:)otimes``|y_1:)+``|x_1:)otimes``|y_1:)`

Let's see if we can write `|psi:)` as the tensor product of two states coming from the two subsystems. In other words, let's see if we can rewrite this state as the tensor product of the first particle with the second particle. In math, let's see if we can find `c_0,c_1,c'_0,c'_1` such that `|psi:)=(c_0``|x_0:)+c_1``|x_1:))otimes(c'_0``|y_0:)+c'_1``|y_1:))`.

Well, we can rewrite it as

`|psi:)=(``|x_0:)+``|x_1:))otimes``|y_1:)`

Suppose `x_0=0` (so `x_1` must be `1`) and `y_1=1`. This gives us the state

`|psi:)=``|x_0:)otimes``|y_1:)+``|x_1:)otimes``|y_1:)`

`=``|01:)+``|11:)`

We can see that if the rightmost particle has value `1`, then the other particle can either be `0` or `1`; we don't know for sure. This is what it means to for a state to be separable: knowing the value of one of the particles does not give us information about the value of the other particle.

Let's now consider the state

`|psi:)=``|x_0:)otimes``|y_0:)+``|x_1:)otimes``|y_1:)`

We can see that it's not possible to rewrite this as a tensor product of `x`s and `y`s. So this state is entangled.

Suppose `x_0=0` (so `x_1` must be `1`) and `y_0=0` (so `y_1` must be `1`). This gives us the state

`|psi:)=``|x_0:)otimes``|y_0:)+``|x_1:)otimes``|y_1:)`

`=``|00:)+``|11:)`

We can see that if any of the particles is `0`, then the other particle must be `0`. Likewise, if one of the particles is `1`, then the other particle has to be `1`. This is what it means to for a state to be entangled: knowing the value of one of the particles gives us full information about the value of the other particle.

`=1``|x_0:)otimes``|y_0:)+0``|x_0:)otimes``|y_1:)+0``|x_1:)otimes``|y_0:)+1``|x_1:)otimes``|y_1:)`

Then we would have

`|psi:)=(c_0``|x_0:)+c_1``|x_1:))otimes(c'_0``|y_0:)+c'_1``|y_1:))`

`=c_0c'_0``|x_0:)otimes``|y_0:)+c_0c'_1``|x_0:)otimes``|y_1:)+c_1c'_0``|x_1:)otimes``|y_0:)+c_1c'_1``|x_1:)otimes``|y_1:)`

Comparing coefficients, we have that

`c_0c'_0=1`

`c_1c'_1=1`

`c_1c'_0=0`

`c_0c'_1=0`

But there are no values for `c_0,c_1,c'_0,c'_1` that can satisfy all four equations.

So `|psi:)` cannot be rewritten as the tensor product of the two subsystems.