- A vector space $\set{V}$ is a set of objects (vectors, denoted $\ket a$, $\ket b$, etc.) that is closed under two operations:
- vector addition: $$\ket a + \ket b = \ket b+\ket a\in\set{V}\qquad\text{(commutativity)}$$ $$\ket a + (\ket b + \ket c) = (\ket a+\ket b)+\ket c\qquad\text{(associativity)}$$
- multiplication by scalars (usually real or complex numbers) $\lambda$, $\mu$, etc., which satisfies: $$\lambda(\ket a + \ket b) = \lambda\ket a+\lambda\ket b$$ $$\lambda(\mu\ket a) = (\lambda\mu)\ket a$$ $$(\lambda+\mu)\ket a = \lambda\ket a + \mu\ket a$$ The set of scalars is assumed to contain a unit scalar, denoted by $1$, with the property $1\ket a=\ket a$.
- Every vector space contains a (unique) null or zero vector $\ket0$ such that for all $\ket a$ in $\set{V}$, $$\ket a + \ket0=\ket a.$$
- For every vector $\ket a$ in $\set{V}$ there is an additive inverse, temporarily denoted $\ket{{-}a}$, such that $$\ket a+\ket{{-}a} = \ket0.$$ When the scalars are real or complex numbers, the axioms above imply that the additive inverse can be constructed as $\ket{{-}a}=(-1)\ket a$. Thus, we can drop the pedantic notation $\ket{{-}a}$ in favour of $-\ket a$.

Shankar 1.1, Riley 8.1

- A set of vectors $\{\ket{a_i},\; i=1, 2, \dots, N\}$ is said to be *linearly independent* if the equation $$\sum_{i=1}^N\lambda_i\ket{a_i}=\ket0$$ is satisfied only if every $\lambda_i=0$. Otherwise the vectors are *linearly dependent*.
- If in a vector space there is a set of $N$ linearly independent vectors but no set of $N+1$ linearly independent vectors, the space is said to be $N$-dimensional. In this course we denote the dimensionality by a superscript: $\set{V}^N$.
- For an $N$-dimensional vector space, a set of $N$ linearly independent vectors is called a *basis*; the base vectors $\ket{a_i}$ are said to *span* $\set{V}^N$. Given a basis $\basis{a_i}$, any vector $\ket b$ can be expressed as a linear combination of the $\ket{a_i}$: $$ \ket b = \sum_{i=1}^Nb_i\ket{a_i},$$ where the expansion coefficients (or *components*) $b_i$ are unique for a given $\ket b$ and choice of basis.

Shankar 1.1, Riley 8.1.1

- Given a basis $\basis{a_i}$, the list of components $b_i$ specifies $\ket b$ completely, and is said to be a *representation* of the abstract vector $\ket b$: $$\ket b\longrightarrow\colvec bn.$$ The arrow "$\longrightarrow$" can be pronounced "is represented by". This is just like representing a Cartesian 3-vector $\vec b$ by a list of its Cartesian components, $\displaystyle\begin{pmatrix}b_x\\b_y\\b_z\end{pmatrix}$; here the notation $\vec b$ (like $\ket b$) denotes the vector without making reference to the coordinate system, whereas the components $b_x,b_y,b_z$ depend on the choice of axes, i.e., on the choice of basis vectors.
- Note that the basis vectors $\ket{a_i}$ have a particularly simple representation: $$\ket{a_1}\longrightarrow\begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix},\quad \ket{a_2}\longrightarrow\begin{pmatrix}0\\1\\\vdots\\0\end{pmatrix},\quad\dots,\quad \ket{a_N}\longrightarrow\begin{pmatrix}0\\0\\\vdots\\1\end{pmatrix}.$$
- In a given representation, if $\ket w=\lambda\ket b + \mu\ket c$ then $w_i = \lambda b_i+\mu c_i$; i.e., $$\ket w \longrightarrow \begin{pmatrix} \lambda b_1+\mu c_1\\\lambda b_2+\mu c_2\\\vdots\\\lambda b_N+\mu c_N \end{pmatrix}.$$

Shankar 1.1, Riley 8.3

- The scalar product (if defined for a particular vector space
$\set{V}$) is a generalization of the dot product $\vec
a\cdot\vec b$ defined for Cartesian vectors. It is a scalar
function of two vectors $\ket a,\ket b\in\set{V}$, written $\braket
ab$, with the following properties:
- If $\ket w=\lambda\ket b+\mu\ket c$, then $$\braket aw =\lambda\braket ab+\mu\braket ac;$$ i.e., the function $\braket aw$ is linear in the second argument.
- $\braket ab=\conj{\braket ba}$. Here we are assuming a *complex* vector space; for a real vector space, the condition would be simply $\braket ab=\braket ba$.
- In this course we also require $$\braket aa\ge 0,$$ where $\braket aa=0$ if and only if $\ket a=\ket0$.

- From the two properties above, it follows that the scalar product is "*conjugate linear*" (or "*antilinear*") in its first argument: if $\ket w=\lambda\ket b+\mu\ket c$, then $\braket wa=\conj\lambda\braket ba+\conj\mu\braket ca$.
- A vector space with a scalar product is called an *inner product space*; "inner product" is just another name for the scalar product.
- Once an inner product is defined, the *norm* (or "length") of a vector $\ket b$ can be defined as $\norm b\equiv\sqrt{\!\braket bb}\ge0$.
- If $\norm b=1$, we say that $\ket b$ is *normalized to unity*; we could also say that $\ket b$ is a *unit vector*.
- Two important inequalities that follow from the properties of the scalar product are:
  **Schwarz's inequality:** $$\abs{\braket ab} \le \norm a\norm b,$$ which is the generalization of $\abs{\vec a\cdot\vec b}=ab\abs{\cos\theta}\le ab$ for Cartesian vectors.
  **The triangle inequality:** if $\ket c=\ket a+\ket b$, then $$\norm c\le\norm a+\norm b.$$
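Both inequalities are easy to verify numerically. The following sketch (not part of the printed notes; the two complex vectors are arbitrary examples) uses `numpy`, where `np.vdot` conjugates its first argument and so implements $\braket ab=\sum_i\conj{a_i}b_i$:

```python
import numpy as np

# Two arbitrary complex vectors in C^3 (chosen only for illustration)
a = np.array([1.0 + 2.0j, 0.5, -1.0j])
b = np.array([2.0, -1.0 + 1.0j, 3.0])

# Scalar product <a|b> and norms ||a|| = sqrt(<a|a>)
inner = np.vdot(a, b)                  # np.vdot conjugates its first argument
norm_a = np.sqrt(np.vdot(a, a).real)
norm_b = np.sqrt(np.vdot(b, b).real)

# Schwarz's inequality: |<a|b>| <= ||a|| ||b||
assert abs(inner) <= norm_a * norm_b

# Triangle inequality: ||a + b|| <= ||a|| + ||b||
norm_sum = np.sqrt(np.vdot(a + b, a + b).real)
assert norm_sum <= norm_a + norm_b
```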

Shankar 1.2, Riley 8.1.2, 8.1.3

- Given the existence of a scalar product, the orthogonality of
two vectors $\ket a,\ket b$ can be defined by the condition
$\braket ab=0$. A set of vectors $\kete i \in \set{V}^N$ that
have unit norm and are mutually orthogonal is called
an
**orthonormal set**. The vectors of an orthonormal set satisfy $$\braket{e_i}{e_j} = \delta_{ij} = \begin{cases} 1\quad\text{if $i=j$}\\0\quad\text{if $i\ne j$.} \end{cases}$$ If the orthonormal set contains $N$ unit vectors $\kete i$, they form an *orthonormal basis* for $\set{V}^N$.
- The components $b_i$ of a vector $\ket b$ with respect to an orthonormal basis $\basis{e_i}$ are given by the simple expressions $b_i=\braket{e_i}b$, analogous to $b_x = \vec i\cdot\vec b$, etc., for the components of a Cartesian vector. The scalar product $\braket bc$ also takes a simple form, $$\braket bc=\sum_i\conj{b_i}c_i=\rowvec bn\colvec cn,$$ where the product of row and column vectors follows the usual rule for matrix multiplication. (Note that these simple formulas for $b_i$ and $\braket bc$ do **not** hold for a general, non-orthonormal basis.)

**Bras and kets.** The symbols $\ket a,\ket b,\dots$ we have used for abstract vectors have been called **kets** by Dirac. Their representations (see above) are column vectors. For each ket $\ket b$ there is a corresponding **bra** vector, $\bra b$, which is represented by the row vector $(\conj{b_1}, \conj{b_2}, \dots,\conj{b_N})$ in the preceding formula for the scalar product. The set of bra vectors is also a vector space, which is distinct from the space $\set{V}^N$ that the kets $\ket b$ belong to; the vector space of bras is said to be **dual** to $\set{V}^N$. The distinction between these two vector spaces should be clear from the representation of kets by column vectors and bras by row vectors. Similarly, the one-to-one correspondence between kets $\ket b$ and bras $\bra b$ should also be apparent from this representation: $$\ket b\longrightarrow\colvec bn\equiv\vec b,\qquad \bra b\longrightarrow\rowvec bn\equiv\vec b^\dagger;$$ i.e., the row vector representing a bra is the Hermitian conjugate (conjugate transpose) of the column vector representing the ket, and vice versa. The scalar product $\braket bc$ can be regarded as a combination of the bra $\bra b$ with the ket $\ket c$ to form the "bra[c]ket" $\braket bc$; this explains the silly (but useful) names.

**Gram-Schmidt orthogonalization** is a general method for constructing an orthonormal basis $\basis{e_i}$ from a given non-orthonormal basis $\basis{a_i}$ for $\set{V}^N$. We proceed as follows:
- Let our first unit vector be $\kete 1=\ket{a_1}/\norm{a_1}$.
- Let $\kete 2 = C_2(\ket{a_2}-\kete 1\braket{e_1}{a_2})$, so that $\braket{e_1}{e_2}=0$. (You should check this!) Choose $C_2$ so that $\kete 2$ is normalized to unity: $$\braket{e_2}{e_2} = \abs{C_2}^2\left(\braket{a_2}{a_2}-\abs{\braket{a_2}{e_1}}^2\right)=1,$$ which fixes the value of $\abs{C_2}$; the phase can be chosen arbitrarily.
- On the $j$th step, let $$\kete j=C_j\left(\ket{a_j}-\sum_{i=1}^{j-1}\kete i\braket{e_i}{a_j}\right),$$ so that $\braket{e_i}{e_j}=0$ for $i\lt j$. Again, we can find the value of $\abs{C_j}$ that makes $\braket{e_j}{e_j}=1$.
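The steps above translate directly into code. This is a minimal `numpy` sketch (the starting basis is an arbitrary example, and the phases of the $C_j$ are taken to be real and positive):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize the columns of `vectors` by the procedure above:
    subtract from each |a_j> its projections onto the |e_i> already built,
    then normalize the remainder to unity (this fixes |C_j|)."""
    es = []
    for a in vectors.T:                    # loop over columns |a_j>
        v = a.astype(complex)
        for e in es:
            v = v - e * np.vdot(e, a)      # remove the component <e_i|a_j> |e_i>
        norm = np.sqrt(np.vdot(v, v).real)
        es.append(v / norm)
    return np.column_stack(es)

# A non-orthonormal (but linearly independent) basis of C^3, for illustration
A = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
E = gram_schmidt(A)

# Check orthonormality of the result: E† E = I
assert np.allclose(E.conj().T @ E, np.eye(3))
```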

Shankar 1.2, 1.3.1, Riley 8.1.2

- A linear operator $\op A$ is a vector-valued function of a
vector, written $\op A\ket v$, which has the property of
  **linearity**: $$\op A(\lambda\ket a+\mu\ket b) = \lambda(\op A\ket a)+\mu(\op A\ket b)$$ where $\lambda,\mu$ are scalars and $\ket a,\ket b$ are vectors in $\set{V}$. In this course we suppose that $\op A\ket a\in\set{V}$ if $\ket a\in\set{V}$.
- Simple examples of operators are
  - the *identity operator* (or *unit operator*) $\op 1$, which has no effect when applied to a vector: $\op1\ket b=\ket b$ for all $\ket b\in\set{V}$.
  - the *zero operator* $\op0$, which gives the null vector $\ket0$ when applied to any vector: $\op0\ket b=\ket0$ for all $\ket b\in\set{V}$.

- The **sum** of two operators, $\op A+\op B$, is defined by its action $$(\op A + \op B)\ket b = \op A\ket b+\op B\ket b$$ for all $\ket b\in\set{V}$. Multiplication of an operator $\op A$ by a scalar $\lambda$ is defined by $$(\lambda\op A)\ket b = \lambda(\op A\ket b).$$
- The **product** of two operators, $\op A\op B$, is defined by $$(\op A\op B)\ket b = \op A(\op B\ket b);$$ i.e., $\op B$ is applied first, then $\op A$. From this definition you should be able to show that the product is *associative*: $(\op A\op B)\op C = \op A(\op B\op C)$. Note, however, that the operator product need not be *commutative*: $\op A\op B\ne\op B\op A$, in general.
- If there exists an operator $\op B$ such that $\op B\op A=\op1$, then $\op B\equiv\op A^{-1}$ is called the *inverse* of $\op A$. Not all operators have an inverse!
- More examples of operators:
  - The **outer product** of two vectors $\ket a$ and $\ket c$, written $\ketbra ca$, is defined to be an *operator* whose action on vector $\ket b$ is given by $$(\ketbra ca)\ket b = \ket c\braket ab = (\braket ab)\ket c;$$ the factor $\braket ab$ is a scalar, of course.
  - Given an orthonormal basis $\basis{e_i}$ for a vector space, one can define **projection operators** $\op P_i=\ketbra{e_i}{e_i}$ that extract from a vector $\ket b=\sum_jb_j\kete j$ the part that is "parallel" to $\kete i$: $$\op P_i\ket b = \ketbra{e_i}{e_i}\left({\textstyle\sum_jb_j\kete j}\right) = b_i\kete i.$$ Note that all information about $b_j$ for $j\ne i$ is lost in the process of projection, so $\op P_i$ is an example of an operator that has no inverse. By summing both sides of the last equation over $i$ one obtains the result $$\left({\textstyle\sum_i\op P_i}\right)\ket b = \sum_ib_i\kete i = \ket b;$$ that is, we have $\sum_i\op P_i =\op1$, or $$\sum_i\outer{e_i}{e_i}=\op1.$$ This simple (but extremely useful) formula expresses the fact that the orthonormal vectors $\basis{e_i}$ span $\set{V}$. It is called the **completeness relation** or a **resolution of unity**. Some further properties of projection operators are $$\op P_i\op P_j = \begin{cases} \op P_i\quad&\text{if $j=i$}\\ \op0\quad&\text{if $j\ne i$.} \end{cases}$$


- If $\ket c=\op A\ket b$, projection on to vector $\kete i$ of an orthonormal basis extracts the component $c_i=\braket{e_i}{c}=\brae i(\op A\ket b)$. If we expand $\ket b$ in terms of the basis vectors, $\ket b=\sum_jb_j\kete j$, this gives $$c_i = \sum_j \brae i\op A\kete jb_j = \sum_jA_{ij}b_j\,, $$ where $A_{ij}=\brae i\op A\kete j$. We recognize this as the rule for multiplying a column vector by a matrix $\mat A$ with elements $A_{ij}$. Thus, the equation $$\ket c = \op A\ket b$$ has the representation $$\colvec cn = \matrix An\colvec bn\quad\text{or, more briefly,}\quad\vec c=\mat A\vec b.$$
**Aside:**
- Note that column $j$ of matrix $\mat A$ gives the components of $\op A\kete j$. If the columns of $\mat A$ are not linearly independent, there will be a combination of the vectors $\op A\kete j$ that satisfies $\sum_j u_j\op A\kete j=\ket0$, where not all of the $u_j$ are zero. This could also be written $\op A\ket u = \ket0$, where $\ket u=\sum_j u_j\kete j$.
- An important consequence of this is that if the equation
$\op A\ket b=\ket c$ has a solution, the solution $\ket b$
will not be
  *unique*: $\ket b+\alpha\ket u$ with $\alpha\ne0$ will be a different but equally acceptable solution. Accordingly, the operator $\op A$ has no inverse.
- Alternatively, note that if the columns of $\mat A$ are linearly dependent, then $\det\mat A=0$, so that the inverse of the *matrix* doesn't exist.

- Examples of matrix representations of operators:
- The **identity**: $\brae i\op1\kete j = \delta_{ij}$. Thus, $$\op1\longrightarrow\diagmat 111\equiv\mat I$$ is the unit matrix.
- The **outer product** $\ketbra ca$ has matrix elements $\braket{e_i}c\braket a{e_j}=c_i\conj{a_j}$: $$\ketbra ca\longrightarrow \begin{pmatrix} c_1\conj{a_1}&c_1\conj{a_2}&\dots&c_1\conj{a_N}\\ c_2\conj{a_1}&c_2\conj{a_2}&\dots&c_2\conj{a_N}\\ \vdots&\vdots&\ddots&\vdots\\ c_N\conj{a_1}&c_N\conj{a_2}&\dots&c_N\conj{a_N} \end{pmatrix}\equiv\vec c\,\vec a^\dagger.$$
- The matrix representation of the **projection operator** $\op P_1=\kete 1\brae 1$ is therefore $$\op P_1\longrightarrow\diagmat 100\equiv\mat P_1\,.$$ Similarly, the matrix $\mat P_i$ will be zero everywhere except for a single $1$ at position $(i,i)$ on the diagonal.
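The matrix formulas above can be checked directly with `numpy`; in this sketch (the vectors are arbitrary examples) `np.outer(c, a.conj())` builds the matrix $\vec c\,\vec a^\dagger$ of the outer product $\ketbra ca$:

```python
import numpy as np

c = np.array([1.0j, 2.0, 0.0])
a = np.array([1.0, -1.0j, 2.0])

# Outer product |c><a|  ->  matrix with elements c_i * conj(a_j)
Q = np.outer(c, a.conj())
assert np.allclose(Q, c[:, None] * a.conj()[None, :])

# Projection operators P_i = |e_i><e_i| for the standard basis of C^3
N = 3
P = [np.outer(np.eye(N)[:, i], np.eye(N)[:, i]) for i in range(N)]

# P_i P_j = P_i if i == j, else the zero operator
assert np.allclose(P[0] @ P[0], P[0])
assert np.allclose(P[0] @ P[1], np.zeros((N, N)))

# Completeness relation: sum_i P_i = 1
assert np.allclose(sum(P), np.eye(N))
```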


Shankar 1.2, 1.3.1, Riley 8.1.2

- For a linear operator $\op A$, we define its adjoint (or Hermitian conjugate) to be the operator $\op A^\dagger$ for which $$\bra u\op A^\dagger\ket v = \conj{\bra v\op A\ket u}$$ for all $\ket u,\ket v\in\set{V}$. Taking $\ket u$ and $\ket v$ to be orthonormal basis vectors $\kete i$ and $\kete j$ shows at once that the matrix elements of $\op A^\dagger$ are $$A^\dagger{}_{ij} = \brae i\op A^\dagger\kete j =\conj{\brae j\op A\kete i} = \conj{A_{ji}}\,,$$ i.e., the matrix $\mat A^\dagger$ is the conjugate transpose (Hermitian conjugate) of the matrix $\mat A$.
- You should be able to show that:
- $(\op A\op B){}^\dagger=\op B^\dagger\op A^\dagger$;
- $(\lambda\op A)^\dagger=\conj\lambda\op A^\dagger$ (where $\lambda$ is a scalar);
- if $\op Q=\ketbra ca$ then $\op Q^\dagger=\ketbra ac$.

- An operator $\op A$ is said to be **self-adjoint** (or **Hermitian**) if $\op A=\op A^\dagger$, so that $$\bra u\op A\ket v=\conj{\bra v\op A\ket u}$$ for all $\ket u,\ket v\in\set{V}$. From this, the matrix elements of $\op A$ satisfy $A_{ij}=\conj{A_{ji}}$: the matrix $\mat A=\mat A^\dagger$ is Hermitian.
- Note that in a *real* vector space (i.e., one in which the scalars are real numbers), the matrix of a self-adjoint operator will be real and symmetric, $A_{ij}=A_{ji}\in\set{R}$.

**Exercise:**
- Show that if $\op A=\op A^\dagger$, then $\bra u\op A^2\ket u\ge0$ for all $\ket u\in\set{V}$.
  **Hint:** Write $\op A^2=\op A\op1\op A$ and make use of the completeness relation $\op1=\sum_i\ketbra{e_i}{e_i}$.

- An operator $\op U$ is said to be **unitary** if $\op U\op U^\dagger=\op U^\dagger\op U=\op1$, i.e. if $\op U^{-1}=\op U^\dagger$. The matrix representing a unitary operator therefore satisfies $\mat U\mat U^\dagger=\mat U^\dagger\mat U=\mat I$.

**Exercise:**
- Prove that unitary operators preserve the scalar product of two vectors; that is, if $\ket{a'}=\op U\ket a$ and $\ket{b'}=\op U\ket b$, then $\braket{a'}{b'}=\braket ab$.
- Hence show that the norm of a vector is preserved by $\op U$; i.e., $\norm{a'}=\norm a$.

- For a *real* vector space, a unitary operator can be represented by a real, orthogonal matrix, $\mat O$, so that $\mat O\tp{\mat O}=\mat I$, where $\tp{\mat O}$ denotes the transpose of $\mat O$.
- The columns of a unitary matrix can be regarded as an orthonormal set of vectors. The same is true of the rows.
  **Proof (for columns):** From each of the columns $j$ of the matrix $\mat U$, construct a vector $\ket{u_j}\equiv\sum_iU_{ij}\kete i$. Then $$\braket{u_k}{u_j} = \sum_i \conj{U_{ik}}U_{ij} =\sum_iU^\dagger{}_{ki}U_{ij} = \delta_{kj}\,,$$ which proves that the vectors $\basis{u_j}$ form an orthonormal set.
  **Exercise:** Prove the orthogonality of the rows.
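A quick numerical illustration of these properties (the particular matrix, a rotation combined with a phase, is an arbitrary example):

```python
import numpy as np

theta = 0.7
# An explicitly unitary 2x2 matrix: a real rotation times a diagonal phase
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]]) @ np.diag([1.0, np.exp(0.3j)])

# U U† = U† U = I
assert np.allclose(U.conj().T @ U, np.eye(2))
assert np.allclose(U @ U.conj().T, np.eye(2))

# Columns form an orthonormal set: <u_k|u_j> = delta_kj
for j in range(2):
    for k in range(2):
        inner = np.vdot(U[:, k], U[:, j])
        assert np.isclose(inner, 1.0 if j == k else 0.0)
```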

- We can show that a **change of basis** can be effected by means of a **unitary matrix**:
  - If we have **two** orthonormal bases $\basis{e_i}$ and $\basis{e'_i}$, we can use either to expand an arbitrary vector: $$\ket b=\sum_jb'_j\ket{e'_j}=\sum_kb_k\kete k.$$ Projecting both sides onto $\ket{e'_i}$ gives $$b'_i=\sum_k\braket{e'_i}{e_k}b_k\equiv \sum_kU_{ik}b_k;$$ the matrix of the transformation is $\mat U$ with elements $U_{ik}=\braket{e'_i}{e_k}$.
  - We now show that $\mat U$ is **unitary**. First note that $U^\dagger{}_{kj} = \conj{U_{jk}} = \conj{\braket{e'_j}{e_k}}=\braket{e_k}{e'_j}$, so that $$ (\mat U\mat U^\dagger)_{ij} =\sum_kU_{ik}U^\dagger{}_{kj}=\sum_k\braket{e'_i}{e_k}\braket{e_k}{e'_j} =\braket{e'_i}{e'_j}=\delta_{ij}.$$ Here we have used the completeness relation $\sum_k\ketbra{e_k}{e_k}=\op1$ to do the sum on $k$.

**Aside on active and passive transformations.** Just as in the case of a rotation, a given unitary transformation with matrix elements $U_{ik}$ can be interpreted in two ways. As we have just seen, the equation $b'_i=\sum_kU_{ik}b_k$ expresses the components of $\ket b$ with respect to one basis, $\basis{e'_i}$, in terms of the components of $\ket b$ with respect to another basis, $\basis{e_i}$. Here the vector $\ket b$ is the same in each case, but the "axes" have been changed; this corresponds to the *passive view* of rotations, met before in PHYS10672. But the same *equation*, $b'_i=\sum_kU_{ik}b_k$, has a different interpretation when the quantities $b'_i$ are instead the components of a *different* vector $\ket{b'}$ with respect to the original basis, $\basis{e_i}$. In this case $$\ket{b'}=\op U\ket b,\quad\text{i.e.,}\quad\ket{b'}\ne\ket b,$$ and the coefficients $U_{ik}$ are the matrix elements of the unitary operator $\op U$ relating the two vectors, $U_{ik}=\brae i\op U\kete k$. This corresponds to the *active view* of rotations, also met in PHYS10672. Both interpretations are regularly needed in physics, but it should always be clear which one is being used.

- The **transformation of the matrix elements of operators** to a new representation (basis) is almost as straightforward as the transformation of the components of a vector. In the new representation with basis vectors $\basis{e'_i}$ we have $$A'_{ij} = \bra{e'_i}\op A\ket{e'_j} =\bra{e'_i}\op1\op A\op1\ket{e'_j} =\sum_{k,l}\braket{e'_i}{e_k}\brae k\op A\kete l\braket{e_l}{e'_j},$$ where we have used the completeness relation twice. Noting that $\braket{e'_i}{e_k}=U_{ik}$ and $\braket{e_l}{e'_j}=U^\dagger{}_{lj}$ we find $$\mat A' = \mat U\mat A\mat U^\dagger.$$
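A useful numerical check of the basis-change formulas is that the scalar $\bra b\op A\ket b$ is the same in either representation. In this `numpy` sketch (the new basis and the matrix $\mat A$ are arbitrary examples), the new basis vectors $\ket{e'_i}$ are the columns of $\mat V$, so $U_{ik}=\braket{e'_i}{e_k}$ gives $\mat U=\mat V^\dagger$:

```python
import numpy as np

# Old basis: standard basis of C^2.  New basis |e'_i>: columns of V,
# an orthonormal pair (a rotation, chosen only for illustration).
th = 0.4
V = np.array([[np.cos(th), -np.sin(th)],
              [np.sin(th),  np.cos(th)]])

# U_ik = <e'_i|e_k>  ->  U = V†
U = V.conj().T

A = np.array([[1.0, 2.0j], [0.5, -1.0]])   # matrix of operator A in old basis
b = np.array([1.0, 1.0j])                  # components of |b> in old basis

# Components and matrix elements in the new basis
b_new = U @ b                              # b' = U b
A_new = U @ A @ U.conj().T                 # A' = U A U†

# <b|A|b> is basis-independent, so the two representations must agree
assert np.isclose(np.vdot(b, A @ b), np.vdot(b_new, A_new @ b_new))
```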

**Revision of your existing knowledge of eigenvalue problems.** Note that you are expected to be able to *apply* this knowledge, as you have met many examples already in PHYS10071, PHYS10672 (principal axes of symmetric tensors) and PHYS20401 (small oscillations problems).

- The **eigenvalue equation** for an operator $\op A$ is the equation $$\op A\ket u=a\ket u,$$ where $a$ is a scalar, the **eigenvalue**, and $\ket u\ne\ket 0$ is the **eigenvector**. Solving an eigenvalue problem means finding $\{a_i\}$, the set of possible values for $a$, and also the corresponding eigenvectors $\basis{u_i}$.
- Using the matrix representation for an operator on $\set{V}^N$, the eigenvalue equation becomes $\mat A\vec u=a\vec u$, or $$(\mat A - a\mat I)\vec u = \vec0\,.$$ Since $\vec u\ne\vec0$, this implies that the columns of $\mat A-a\mat I$ are linearly dependent. [If you want to check this last point, it's helpful to note that the last equation can be written $\sum_ju_j\left[\sum_i\vec e_i(A_{ij}-a\delta_{ij})\right]=\vec0$; the quantities in square brackets are the columns of $\mat A-a\mat I$, regarded as column vectors.]
- Thus, for there to be a nontrivial (i.e., non-*zero*) solution of this system of $N$ linear equations, the determinant of the matrix of coefficients must vanish: $$\det(\mat A-a\mat I) = 0.$$ This is a polynomial equation of degree $N$ in $a$, and is often called the **characteristic equation** or (less often, these days) the **secular equation**. By the Fundamental Theorem of Algebra, a polynomial equation of degree $N$ has exactly $N$ roots, $a_i\in\set{C}$. If a root is repeated $m>1$ times, we say that the eigenvalue has **degeneracy** $m$, or that its **multiplicity** is $m$.
- Once the eigenvalues are known, the eigenvectors can be found by solving the equations $(\mat A-a_i\mat I)\vec u=\vec0$ for $\vec u$. Eigenvectors can be determined only up to a multiplicative constant: i.e., if $\vec u$ is an eigenvector, then $C\vec u$ is an equally good solution of the eigenvalue equation, provided $C\ne0$, but $C\vec u$ is *not* regarded as being an eigenvector that is distinct from $\vec u$.
- Distinct eigenvalues ($a_i\ne a_j$) always correspond to distinct eigenvectors. But if a particular eigenvalue is degenerate, with degeneracy $m>1$, we can say only that there may be up to $m$ linearly independent eigenvectors belonging to this eigenvalue. Thus, the total number of linearly-independent eigenvectors of an $N\times N$ matrix is always less than or equal to $N$.
- The Fundamental Theorem of Algebra, applied to the polynomial $P(a)=\det(\mat A-a\mat I)$, also tells us that $P(a)$ can be *factorized* in the form $$\det(\mat A-a\mat I)=\prod_{i=1}^N(a_i-a).$$ By setting $a=0$ on each side, we obtain $$\det\mat A=\prod_{i=1}^Na_i\,{;}$$ the *determinant* of a matrix is equal to the product of its eigenvalues. By comparing coefficients of $a^{N-1}$ on each side, one can also show (with more effort) that $$\Tr\mat A=\sum_{i=1}^Na_i\,{;}$$ the *trace* of a matrix (i.e., the sum of its diagonal elements) equals the sum of its eigenvalues. Note that these last two, important results are always valid, regardless of whether there are exactly $N$ or fewer than $N$ linearly-independent eigenvectors. They can be a useful check on your calculation of the eigenvalues of a matrix.
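The determinant and trace checks are easy to try for yourself; here is a `numpy` sketch (the non-Hermitian matrix is an arbitrary example):

```python
import numpy as np

# An arbitrary (non-Hermitian) complex matrix, for illustration
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0j, 1.0],
              [3.0, 0.0, -1.0]])

eigvals = np.linalg.eigvals(A)

# det A = product of eigenvalues; Tr A = sum of eigenvalues
assert np.isclose(np.linalg.det(A), np.prod(eigvals))
assert np.isclose(np.trace(A), np.sum(eigvals))
```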

**Eigenvalues and eigenvectors of Hermitian operators,** $\op A=\op A^\dagger$

- Suppose that $\ket{u_i}$ and $\ket{u_k}$ are eigenvectors
belonging to eigenvalues $a_i$ and $a_k$. Then, starting
from $$\op A\ket{u_i}=a_i\ket{u_i},$$ we can take the
scalar product with $\ket{u_k}$ to obtain
$$\bra{u_k}a_i\ket{u_i} =\bra{u_k}\op A\ket{u_i}
=\conj{\bra{u_i}\op A\ket{u_k}}
=\conj{\bra{u_i}a_k\ket{u_k}}
=\conj{a_k}\braket{u_k}{u_i},$$
which gives
$$(a_i-\conj{a_k})\braket{u_k}{u_i}=0.$$
We can draw two conclusions from this:
  - By setting $k=i$ and noting that $\braket{u_i}{u_i}>0$, we find $a_i-\conj{a_i}=0$; the eigenvalues of an Hermitian operator are **real numbers**.
  - If $a_i\ne a_k$, the factor $(a_i-\conj{a_k})$ is nonzero. [Remember, all of the eigenvalues are *real*, so $\conj{a_k}=a_k$.] We deduce that the other factor is zero: $\braket{u_k}{u_i}=0$. Thus, the eigenvectors belonging to distinct eigenvalues are **orthogonal**.

- Less obviously (we won't prove it), in the case of an eigenvalue with degeneracy $m$, there will be **exactly** $m$ linearly independent eigenvectors. (Compare this with the general case, $\op A\ne\op A^\dagger$, where the number of linearly-independent eigenvectors may be *less* than $m$.) The Gram-Schmidt process can be applied to these $m$ eigenvectors to obtain an orthonormal set of $m$ eigenvectors belonging to the same eigenvalue.
- Thus, the $N$ normalized eigenvectors of an Hermitian operator $\op A$ on $\set{V}^N$ form an orthonormal basis for $\set{V}^N$, or, in the case of degeneracy, they can be *chosen* so that they do.
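These facts are exactly what `numpy.linalg.eigh`, the solver designed for Hermitian matrices, guarantees: real eigenvalues and orthonormal eigenvectors. A minimal sketch (the matrix is an arbitrary example, made Hermitian by construction):

```python
import numpy as np

# A Hermitian matrix built as M + M†, so that A = A† (for illustration)
M = np.array([[1.0, 2.0j, 0.0],
              [0.5, 1.0, -1.0],
              [1.0j, 0.0, 2.0]])
A = M + M.conj().T
assert np.allclose(A, A.conj().T)

# eigh returns real eigenvalues and the eigenvectors as columns of Ueig
eigvals, Ueig = np.linalg.eigh(A)

assert np.all(np.isreal(eigvals))                    # eigenvalues are real
assert np.allclose(Ueig.conj().T @ Ueig, np.eye(3))  # eigenvectors orthonormal
```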

**Eigenvalues and eigenvectors of unitary operators,** $\op U^{-1}=\op U^\dagger$. If $\op U$ is unitary and the eigenvalue equation is $$\op U\ket{v_j}=\omega_j\ket{v_j},$$ then:
- The eigenvalues may be real or complex, but all have unit modulus, $\mod{\omega_j}=1$.
- If $\omega_j\ne\omega_k$, then $\braket{v_j}{v_k}=0$; the eigenvectors belonging to distinct eigenvalues are **orthogonal**.
- A unitary operator on $\set V^N$ has exactly $N$ linearly-independent eigenvectors $\basis{v_j}$ and they can be chosen so that they form an orthonormal basis for $\set V^N$. (As in the case of an Hermitian operator, "choice" arises only in the case of degeneracy.)
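A simple concrete case: the cyclic permutation matrix is real and orthogonal (hence unitary), and its eigenvalues are the cube roots of unity, all of unit modulus. A `numpy` check (the matrix is an arbitrary example):

```python
import numpy as np

# The 3x3 cyclic permutation matrix is real orthogonal, hence unitary
U = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 0.0, 0.0]])
assert np.allclose(U @ U.T, np.eye(3))

# Its eigenvalues are the cube roots of unity: complex, but |omega_j| = 1
omega = np.linalg.eigvals(U)
assert np.allclose(np.abs(omega), 1.0)
```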

**"Diagonalization" of Hermitian and unitary operators**
- Solving the eigenvalue problem for an operator $\op A$ that is either Hermitian or unitary is often called **diagonalization**. The reason for the name is that the matrix of $\op A$ is diagonal in the representation given by the orthonormal set of eigenvectors, $\ket{u_i}$: $$\bra{u_i}\op A\ket{u_j} = \bra{u_i}a_j\ket{u_j} = a_j\delta_{ij}\,;$$ i.e., in this representation, $$\op A\longrightarrow\diagmat{a_1}{a_2}{a_N}\equiv\mat A^\text{diag}.$$
- The unitary transformation from basis $\basis{e_i}$ to the basis of eigenvectors, $\basis{u_j}$, is given by $$\mat A^\text{diag}=\mat T^\dagger\mat A\mat T,$$ where $T_{ij}=\braket{e_i}{u_j}$; that is, the $j$th column of $\mat T$ is the $j$th eigenvector of $\mat A$.
- By using the eigenfunctions $\basis{u_i}$ as the basis, we
can also write any Hermitian or unitary operator in a
particularly simple, general form. Starting from the
eigenvalue equation $$\op A\ket{u_i}=a_i\ket{u_i},$$ we take
the outer product of each side with $\ket{u_i}$ (i.e., we
multiply both sides on the right by $\bra{u_i}$) and then
sum over $i$: $$\sum_i\op A\ketbra{u_i}{u_i} =
\sum_ia_i\ketbra{u_i}{u_i}.$$ But
$\sum_i\ketbra{u_i}{u_i}=\op1$ (the completeness relation
for $\basis{u_i}$), so $$\op A =
\sum_ia_i\ketbra{u_i}{u_i}.$$ This is called the
**spectral representation** of the operator $\op A$.
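Both $\mat A^\text{diag}=\mat T^\dagger\mat A\mat T$ and the spectral representation can be verified numerically; in this `numpy` sketch (the Hermitian matrix is an arbitrary example), the columns of `T` are the eigenvectors:

```python
import numpy as np

# A Hermitian matrix, for illustration
A = np.array([[2.0, 1.0j], [-1.0j, 3.0]])
eigvals, T = np.linalg.eigh(A)           # columns of T are the eigenvectors

# T† A T is diagonal, with the eigenvalues on the diagonal
assert np.allclose(T.conj().T @ A @ T, np.diag(eigvals))

# Spectral representation: A = sum_i a_i |u_i><u_i|
A_spec = sum(eigvals[i] * np.outer(T[:, i], T[:, i].conj()) for i in range(2))
assert np.allclose(A_spec, A)
```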

**Aside** [not needed for the examples]: **Diagonalization of commuting operators**
- Operators $\op A$ and $\op B$ are said to **commute** if $\op A\op B=\op B\op A$.
- Let $\ket u$ be an eigenvector of $\op B$, so that $\op B\ket u=b\ket u$. Then if $\op A$ commutes with $\op B$, we can show that $\op A\ket u$ is *also* an eigenvector of $\op B$, corresponding to the same eigenvalue $b$: $$\op B\left(\op A\ket u\right) = \op A\left(\op B\ket u\right) = b\op A\ket u\,,$$ where we have used $\op A\op B=\op B\op A$ in the first step and the eigenvalue equation $\op B\ket u=b\ket u$ in the second step. Two cases arise:
  - If the eigenvalue $b$ is **nondegenerate**, $\op A\ket u$ must be simply proportional to $\ket u$: $$\op A\ket u=a\ket u\,;$$ this shows that $\ket u$ is also an eigenvector of $\op A$; the coefficient of proportionality, $a$, is the eigenvalue.
  - Suppose instead that $b$ has degeneracy $m>1$ and that the corresponding eigenvectors are $\{\ket{u_i},i=1,\,\dots,m\}$. The vector $\op A\ket{u_i}$ is an eigenvector of $\op B$ with eigenvalue $b$, so it must be possible to express it as a linear combination of the vectors $\{\ket{u_i},i=1,\,\dots,m\}$: $$\op A\ket{u_i}=\sum_{k=1}^ma_{ki}\ket{u_k}\,;$$ the notation for the coefficients has been chosen so that $a_{ki}=\bra{u_k}\op A\ket{u_i}$. Now consider the eigenvalue problem $\op A\ket v=a\ket v$, where $\ket v$ is a vector in the $m$-dimensional subspace spanned by the vectors $\{\ket{u_i},i=1,\,\dots,m\}$. By making the expansion $\ket v=\sum_{i=1}^mv_i\ket{u_i}$, the eigenvalue problem can be expressed in matrix form: $$\sum_{j=1}^ma_{ij}\,v_j=a\,v_i\,;$$ if you are in any doubt about this, it would be a good exercise to check it. But the last equation is simply a standard $m\times m$ matrix eigenvalue problem, which can be solved to obtain the eigenvalues and eigenvectors of $\op A$ that are -- at the same time -- eigenvectors of $\op B$ with eigenvalue $b$. Thus, if they commute, the operators $\op A$ and $\op B$ can be diagonalized simultaneously.
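Simultaneous diagonalization can be seen in a small example. In this `numpy` sketch (the pair of commuting matrices is an arbitrary example; $\mat B$ has a degenerate eigenvalue, while the eigenvalues of $\mat A$ are nondegenerate, so diagonalizing $\mat A$ must automatically diagonalize $\mat B$):

```python
import numpy as np

# Two commuting Hermitian matrices (B has the degenerate eigenvalue 1)
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])
B = np.diag([1.0, 1.0, 2.0])
assert np.allclose(A @ B, B @ A)        # [A, B] = 0

# Diagonalize A; its eigenvectors (columns of T) are simultaneous
# eigenvectors of B, so T† B T comes out diagonal as well
avals, T = np.linalg.eigh(A)
B_in_A_basis = T.conj().T @ B @ T
assert np.allclose(B_in_A_basis, np.diag(np.diag(B_in_A_basis)))
```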


- Because we know how to add and multiply operators, we can build up simple polynomial expressions such as $\op A^2$, $2\op A^3+\op1$, etc., which are *functions* of the operator $\op A$. Perhaps it is possible to extend this idea to functions defined via a power series, $$f(\op A)=\sum_{r=0}^\infty f_r\op A^r\,,\quad\text{where}\quad\op A^0\equiv\op1.$$ To be more specific, suppose that $\op A$ is an operator on $\set V$ with eigenvalues $\{a_i\}$ and eigenvectors $\basis{u_i}$ that span $\set V$; this will certainly be the case if $\op A$ is Hermitian or unitary. To find out how $f(\op A)$ acts on a general vector $\ket b=\sum_ib_i\ket{u_i}$, it would be enough to know its action on $\ket{u_i}$: $$f(\op A)\ket{u_i} =\sum_{r=0}^\infty f_r\op A^r\ket{u_i} =\sum_{r=0}^\infty f_r\,a_i^r\ket{u_i} =f(a_i)\ket{u_i}.$$ For this to give a definite result for every $\ket{u_i}$, the series for $f(a_i)$ should converge for each $a_i$. Clearly, then, we can't say whether $f(\op A)$ is well defined without knowing the spectrum of $\op A$ (its eigenvalues) and the radius of convergence of the power series.
- One example of a function that can be very usefully extended to operators is the exponential function,
operators is the exponential function,
$$f(z)=\exp(z)=\sum_{r=0}^\infty\frac{z^r}{r!}\,,$$ which has no
singularities in the finite complex plane (it is an
**entire function**), so that its Taylor series about $z=0$ converges for all finite $z$. Now, if $\op A$ is an operator on $\set V^N$, its eigenvalues will be finite and $$f(\op A)=\exp(\op A)=\sum_{r=0}^\infty\frac{\op A^r}{r!}$$ will have a well-defined action on every $\ket{u_i}$: $$\exp(\op A)\ket{u_i}=\exp(a_i)\ket{u_i}\,;$$ we see that $\ket{u_i}$ is an eigenvector of $\exp(\op A)$ with eigenvalue $e^{a_i}$. Therefore, in the representation in which $\op A$ is diagonal, $\exp(\op A)$ is also diagonal: $$\exp(\op A)\longrightarrow\diagmat{e^{a_1}}{e^{a_2}}{e^{a_N}}.$$

**Examples (not examinable):** The exponential function of an operator is often met in advanced quantum mechanics. In quantum mechanics, the state of a system is represented by a ket $\ket{\psi(t)}$, which obeys the time-dependent Schroedinger equation $$i\hbar\pdby{}t\ket{\psi(t)}=\op H\ket{\psi(t)}.$$ This is a first-order differential equation whose solution is $$\ket{\psi(t)}=\exp[-i\op Ht/\hbar]\ket{\psi(0)}\equiv\op U_t\ket{\psi(0)},\quad\text{where $\op U_t\equiv\exp[-i\op Ht/\hbar]$}$$ and $\ket{\psi(0)}$ is the state of the system at some initial time $t=0$. Note that if $\ket{\psi(0)}$ happens to be an energy eigenstate with energy $E$, the time dependence of $\ket{\psi(t)}$ reduces to a scalar factor $\exp[-iEt/\hbar]$; this is a result that you have already met for the time dependence of an energy eigenfunction.
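The claim that the series and the eigenvalue definitions of $\exp(\op A)$ agree can be checked numerically. This `numpy` sketch (the small Hermitian matrix is an arbitrary example) computes $\exp(\mat A)$ in the basis where $\mat A$ is diagonal and compares it with a truncated power series:

```python
import numpy as np

# A small Hermitian matrix, for illustration
A = np.array([[1.0, 0.5], [0.5, -1.0]])

# exp(A) via the eigenbasis: exp(A) = T diag(e^{a_i}) T†
avals, T = np.linalg.eigh(A)
expA = T @ np.diag(np.exp(avals)) @ T.conj().T

# exp(A) via the power series sum_r A^r / r! (truncated; converges fast here)
series = np.zeros_like(A)
term = np.eye(2)                 # A^0 / 0!
for r in range(1, 30):
    series = series + term
    term = term @ A / r          # next term A^r / r!

assert np.allclose(expA, series)
```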

We can call the operator $\op U_t\equiv\exp[-i\op Ht/\hbar]$ the *time-development operator*. It is easy to verify that the time-development operator is *unitary*: $$\op U_t{}^\dagger\op U_t=e^{i\op Ht/\hbar}\,e^{-i\op Ht/\hbar}=\op1.$$ Of course, it must satisfy an equation of motion that is consistent with the time-dependent Schroedinger equation for $\ket{\psi(t)}$, namely $$i\hbar\pdby{\op U_t}t=\op H\op U_t\,;$$ you can check that this holds by directly differentiating the series for $\exp[-i\op Ht/\hbar]$.

- The exponential function also appears in the formula for the
partition function of a quantum system: $$Z=\sum_ie^{-\beta
E_i},$$ where $E_i$ are the energy levels (eigenvalues of the
Hamiltonian, $\hat H$) and $\beta=1/(\kT)$. Thus, $$e^{-\beta
E_i} = \bra{u_i}\exp(-\beta\op H)\ket{u_i},$$ where
$\ket{u_i}$ is an energy eigenfunction. So the partition
function is the sum of the diagonal elements (i.e.,
the
*trace*) of the matrix of $\exp(-\beta\op H)$: $$Z = \sum_i\bra{u_i}\exp(-\beta\op H)\ket{u_i}=\Tr[\exp(-\beta\op H)].$$ This is quite a useful formula, because the trace is independent of the choice of basis. Even if we can't calculate the eigenvectors and eigenvalues of $\op H$ exactly, we may still be able to approximate the sum $$Z = \Tr[\exp(-\beta\op H)]=\sum_j\brae j\exp(-\beta\op H)\kete j$$ by making an appropriate choice of basis, $\basis{e_i}$. [Too vague. Example might be where $\op H=\op H_0+\lambda\op H_1$ and $\basis{e_i}$ are eigenvectors of $\op H_0$. Possible then to expand $Z(\lambda)$ (or, more usefully, $F(\lambda)-F(0)=-\kT\log[Z(\lambda)/Z(0)]$) in powers of $\lambda$. But a specific example would be more suitable for year 4.]
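The equality $Z=\sum_ie^{-\beta E_i}=\Tr[\exp(-\beta\op H)]$ can be illustrated with a toy two-level system (the Hamiltonian and the value of $\beta$ here are arbitrary examples):

```python
import numpy as np

# A toy Hermitian Hamiltonian, for illustration
H = np.array([[1.0, 0.2], [0.2, 2.0]])
beta = 1.5

# Z from the energy levels (eigenvalues of H) ...
E, T = np.linalg.eigh(H)
Z_levels = np.sum(np.exp(-beta * E))

# ... equals the trace of exp(-beta H) evaluated in the original basis,
# since the trace is independent of the choice of basis
exp_mbH = T @ np.diag(np.exp(-beta * E)) @ T.conj().T
Z_trace = np.trace(exp_mbH)

assert np.isclose(Z_levels, Z_trace)
```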