PHYS20672 Summary 5

Definition of a vector space

  1. A vector space $\set{V}$ is a set of objects (vectors, denoted $\ket a$, $\ket b$, etc.) that is closed under two operations: addition of vectors, $\ket a+\ket b$, and multiplication of a vector by a scalar, $\lambda\ket a$; in both cases the result is again a vector in $\set{V}$.
  2. Every vector space contains a (unique) null or zero vector $\ket0$ such that for all $\ket a$ in $\set{V}$, $$\ket a + \ket0=\ket a.$$
  3. For every vector $\ket a$ in $\set{V}$ there is an additive inverse, temporarily denoted $\ket{{-}a}$, such that $$\ket a+\ket{{-}a} = \ket0.$$ When the scalars are real or complex numbers, the axioms above imply that the additive inverse can be constructed as $\ket{{-}a}=(-1)\ket a$. Thus, we can drop the pedantic notation $\ket{{-}a}$ in favour of $-\ket a$.

Shankar 1.1, Riley 8.1

Linear independence

  1. A set of vectors $\{\ket{a_i},\; i=1, 2, \dots, N\}$ is said to be linearly independent if the equation $$\sum_{i=1}^N\lambda_i\ket{a_i}=\ket0$$ is satisfied only if every $\lambda_i=0$. Otherwise the vectors are linearly dependent.
  2. If in a vector space there is a set of $N$ linearly independent vectors but no set of $N+1$ linearly independent vectors, the space is said to be $N$-dimensional. In this course we denote the dimensionality by a superscript: $\set{V}^N$.
  3. For an $N$-dimensional vector space, a set of $N$ linearly independent vectors is called a basis; the base vectors $\ket{a_i}$ are said to span $\set{V}^N$. Given a basis $\basis{a_i}$, any vector $\ket b$ can be expressed as a linear combination of the $\ket{a_i}$: $$ \ket b = \sum_{i=1}^Nb_i\ket{a_i},$$ where the expansion coefficients (or components) $b_i$ are unique for a given $\ket b$ and choice of basis.
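The linear-independence condition can be checked numerically: stack the vectors $\ket{a_i}$ as the columns of a matrix, and the set is linearly independent exactly when that matrix has full rank. A minimal sketch in Python/NumPy (the vectors here are illustrative choices, not taken from the notes):

```python
import numpy as np

# Three vectors in C^3, stacked as the columns of a matrix.
a1 = np.array([1, 0, 1j])
a2 = np.array([0, 1, 0])
a3 = np.array([1, 1, 1j])   # a3 = a1 + a2, so the set is linearly dependent

M = np.column_stack([a1, a2, a3])

# The set is linearly independent iff the only solution of
# sum_i lambda_i |a_i> = |0> is lambda_i = 0, i.e. iff rank(M) = 3.
print(np.linalg.matrix_rank(M))   # 2: only two of the three are independent

# Dropping a3 leaves a linearly independent pair.
print(np.linalg.matrix_rank(np.column_stack([a1, a2])))   # 2
```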

Shankar 1.1, Riley 8.1.1


  1. Given a basis $\basis{a_i}$, the list of components $b_i$ specifies $\ket b$ completely, and is said to be a representation of the abstract vector $\ket b$: $$\ket b\longrightarrow\colvec bn.$$ The arrow "$\longrightarrow$" can be pronounced "is represented by". This is just like representing a Cartesian 3-vector $\vec b$ by a list of its Cartesian components, $\displaystyle\begin{pmatrix}b_x\\b_y\\b_z\end{pmatrix}$; here the notation $\vec b$ (like $\ket b$) denotes the vector without making reference to the coordinate system, whereas the components $b_x,b_y,b_z$ depend on the choice of axes, i.e., on the choice of basis vectors.
  2. Note that the basis vectors $\ket{a_i}$ have a particularly simple representation: $$\ket{a_1}\longrightarrow\begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix},\quad \ket{a_2}\longrightarrow\begin{pmatrix}0\\1\\\vdots\\0\end{pmatrix},\quad\dots,\quad \ket{a_N}\longrightarrow\begin{pmatrix}0\\0\\\vdots\\1\end{pmatrix}.$$
  3. In a given representation, if $\ket w=\lambda\ket b + \mu\ket c$ then $w_i = \lambda b_i+\mu c_i$; i.e., $$\ket w \longrightarrow \begin{pmatrix} \lambda b_1+\mu c_1\\\lambda b_2+\mu c_2\\\vdots\\\lambda b_N+\mu c_N \end{pmatrix}.$$

Shankar 1.1, Riley 8.3

Scalar product

  1. The scalar product (if defined for a particular vector space $\set{V}$) is a generalization of the dot product $\vec a\cdot\vec b$ defined for Cartesian vectors. It is a scalar function of two vectors $\ket a,\ket b\in\set{V}$, written $\braket ab$, with the following properties: (i) $\braket ab=\conj{\braket ba}$; (ii) it is linear in its second argument, so that if $\ket w=\lambda\ket b+\mu\ket c$ then $\braket aw=\lambda\braket ab+\mu\braket ac$; (iii) $\braket aa\ge0$, with equality only if $\ket a=\ket0$. The first two properties imply that if $\ket w=\lambda\ket b+\mu\ket c$, then $$\braket wa=\conj\lambda\braket ba+\conj\mu\braket ca,$$ i.e., the scalar product is "conjugate linear" (or "antilinear") in its first argument.
  2. A vector space with a scalar product is called an inner product space; "inner product" is just another name for the scalar product.
  3. Once an inner product is defined, the norm (or "length") of a vector $\ket b$ can be defined as $\norm b\equiv\sqrt{\!\braket bb}\ge0$.
  4. If $\norm b=1$, we say that $\ket b$ is normalized to unity; we could also say that $\ket b$ is a unit vector.
  5. Two important inequalities that follow from the properties of the scalar product are the Cauchy-Schwarz inequality, $|\braket ab|\le\norm a\,\norm b$, and the triangle inequality, $\norm{a+b}\le\norm a+\norm b$, where $\ket{a+b}\equiv\ket a+\ket b$. For the proofs, see Examples 6.51.
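Both the Cauchy-Schwarz and triangle inequalities are easy to check numerically. A sketch in Python/NumPy with random complex vectors (note that `np.vdot` conjugates its first argument, matching the conjugate-linearity of $\braket ab$ in $\ket a$):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two random complex vectors in C^4.
a = rng.standard_normal(4) + 1j * rng.standard_normal(4)
b = rng.standard_normal(4) + 1j * rng.standard_normal(4)

# <a|b> = sum_i conj(a_i) b_i ; np.vdot conjugates its first argument.
inner = np.vdot(a, b)
norm = lambda v: np.sqrt(np.vdot(v, v).real)

# Cauchy-Schwarz inequality: |<a|b>| <= ||a|| ||b||
print(abs(inner) <= norm(a) * norm(b))   # True
# Triangle inequality: ||a + b|| <= ||a|| + ||b||
print(norm(a + b) <= norm(a) + norm(b))  # True
```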

Shankar 1.2, Riley 8.1.2, 8.1.3

Orthonormal bases

  1. Given the existence of a scalar product, the orthogonality of two vectors $\ket a,\ket b$ can be defined by the condition $\braket ab=0$. A set of vectors $\kete i \in \set{V}^N$ that have unit norm and are mutually orthogonal is called an orthonormal set. The vectors of an orthonormal set satisfy $$\braket{e_i}{e_j} = \delta_{ij} = \begin{cases} 1\quad\text{if $i=j$}\\0\quad\text{if $i\ne j$.} \end{cases}$$ If the orthonormal set contains $N$ unit vectors $\kete i$, they form an orthonormal basis for $\set{V}^N$.
  2. The components $b_i$ of a vector $\ket b$ with respect to an orthonormal basis $\basis{e_i}$ are given by the simple expressions $b_i=\braket{e_i}b$, analogous to $b_x = \vec i\cdot\vec b$, etc., for the components of a Cartesian vector. The scalar product $\braket bc$ also takes a simple form, $$\braket bc=\sum_i\conj{b_i}c_i=\rowvec bn\colvec cn,$$ where the product of row and column vectors follows the usual rule for matrix multiplication. (Note that these simple formulas for $b_i$ and $\braket bc$ do not hold for a general, non-orthonormal basis.)
  3. Bras and kets. The symbols $\ket a,\ket b,\dots$ we have used for abstract vectors were called kets by Dirac. Their representations (see above) are column vectors. For each ket $\ket b$ there is a corresponding bra vector, $\bra b$, which is represented by the row vector $(\conj{b_1}, \conj{b_2}, \dots,\conj{b_N})$ in the preceding formula for the scalar product. The set of bra vectors is also a vector space, distinct from the space $\set{V}^N$ that the kets $\ket b$ belong to; the vector space of bras is said to be dual to $\set{V}^N$. That the two spaces are distinct should be clear from the representation of kets by column vectors and bras by row vectors. Similarly, the one-to-one correspondence between kets $\ket b$ and bras $\bra b$ should also be apparent from this representation: $$\ket b\longrightarrow\colvec bn\equiv\vec b,\qquad \bra b\longrightarrow\rowvec bn\equiv\vec b^\dagger;$$ i.e., the row vector representing a bra is the Hermitian conjugate (conjugate transpose) of the column vector representing the ket, and vice versa. The scalar product $\braket bc$ can be regarded as a combination of the bra $\bra b$ with the ket $\ket c$ to form the "bra[c]ket" $\braket bc$; this explains the silly (but useful) names.
  4. Gram-Schmidt orthogonalization is a general method for constructing an orthonormal basis $\basis{e_i}$ from a given non-orthonormal basis $\basis{a_i}$ for $\set{V}^N$. We start with $\kete1=\ket{a_1}/\norm{a_1}$; then, for each $j=2,3,\dots,N$ in turn, we subtract from $\ket{a_j}$ its projections onto the unit vectors already constructed, $$\ket{e'_j}=\ket{a_j}-\sum_{i=1}^{j-1}\kete i\braket{e_i}{a_j}\,,$$ and normalize the remainder to obtain $\kete j=\ket{e'_j}/\norm{e'_j}$. At each step, $\ket{a_j}$ is linearly independent of the unit vectors $\kete i$ with $i\lt j$, as the latter were constructed using only the vectors $\ket{a_i}$ with $i\lt j$; hence $\ket{e'_j}\ne\ket0$ and the normalization is always possible. The process therefore terminates at step $j=N$, after the construction of $\kete N$. The existence of the Gram-Schmidt process proves that an orthonormal basis can be found for any finite-dimensional inner-product space.
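The Gram-Schmidt procedure translates directly into code. A minimal sketch in Python/NumPy for complex vectors (the starting basis below is an illustrative choice; `np.vdot` supplies the complex conjugation in $\braket{e_i}{a_j}$):

```python
import numpy as np

def gram_schmidt(vectors):
    """Build an orthonormal set from a linearly independent list of vectors.

    At step j, subtract from |a_j> its projections <e_i|a_j>|e_i> onto the
    previously constructed unit vectors, then normalize the remainder.
    """
    es = []
    for a in vectors:
        r = a.astype(complex)
        for e in es:
            r = r - np.vdot(e, r) * e   # np.vdot conjugates its first argument
        es.append(r / np.sqrt(np.vdot(r, r).real))
    return es

basis = gram_schmidt([np.array([1.0, 1j, 0.0]),
                      np.array([1.0, 0.0, 1.0]),
                      np.array([0.0, 1.0, 1.0])])

# Check orthonormality: <e_i|e_j> = delta_ij.
G = np.array([[np.vdot(ei, ej) for ej in basis] for ei in basis])
print(np.allclose(G, np.eye(3)))   # True
```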

Shankar 1.2, 1.3.1, Riley 8.1.2

Linear operators

  1. A linear operator $\op A$ is a vector-valued function of a vector, written $\op A\ket v$, which has the property of linearity: $$\op A(\lambda\ket a+\mu\ket b) = \lambda(\op A\ket a)+\mu(\op A\ket b)$$ where $\lambda,\mu$ are scalars and $\ket a,\ket b$ are vectors in $\set{V}$. In this course we suppose that $\op A\ket a\in\set{V}$ if $\ket a\in\set{V}$.
  2. Simple examples of operators are
  3. The sum of two operators, $\op A+\op B$, is defined by its action $$(\op A + \op B)\ket b = \op A\ket b+\op B\ket b$$ for all $\ket b\in\set{V}$. Multiplication of an operator $\op A$ by a scalar $\lambda$ is defined by $$(\lambda\op A)\ket b = \lambda(\op A\ket b).$$
  4. The product of two operators, $\op A\op B$, is defined by $$(\op A\op B)\ket b = \op A(\op B\ket b);$$ i.e., $\op B$ is applied first, then $\op A$. From this definition you should be able to show that the product is associative: $(\op A\op B)\op C = \op A(\op B\op C)$. Note, however, that the operator product need not be commutative: $\op A\op B\ne\op B\op A$, in general.
  5. If there exists an operator $\op B$ such that $\op B\op A=\op1$, then $\op B\equiv\op A^{-1}$ is called the inverse of $\op A$. Not all operators have an inverse!
  6. More examples of operators:
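As a quick numerical aside (not the examples referred to above), the associativity and non-commutativity of the operator product can be seen directly with matrices; the $2\times2$ matrices below are illustrative choices:

```python
import numpy as np

# Two simple operators on a 2-dimensional space, given by their matrices.
A = np.array([[0, 1],
              [0, 0]])
B = np.array([[0, 0],
              [1, 0]])

AB = A @ B   # equals [[1, 0], [0, 0]]
BA = B @ A   # equals [[0, 0], [0, 1]]
print(np.array_equal(AB, BA))   # False: A B != B A in general

# The product is associative, though: (A B) C = A (B C).
C = np.array([[1, 2],
              [3, 4]])
print(np.array_equal((A @ B) @ C, A @ (B @ C)))   # True
```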

Matrix representation of operators

  1. If $\ket c=\op A\ket b$, projection on to vector $\kete i$ of an orthonormal basis extracts the component $c_i=\braket{e_i}{c}=\brae i(\op A\ket b)$. If we expand $\ket b$ in terms of the basis vectors, $\ket b=\sum_jb_j\kete j$, this gives $$c_i = \sum_j \brae i\op A\kete jb_j = \sum_jA_{ij}b_j\,, $$ where $A_{ij}=\brae i\op A\kete j$. We recognize this as the rule for multiplying a column vector by a matrix $\mat A$ with elements $A_{ij}$. Thus, the equation $$\ket c = \op A\ket b$$ has the representation $$\colvec cn = \matrix An\colvec bn\quad\text{or, more briefly,}\quad\vec c=\mat A\vec b.$$
  2. Aside:
  3. Examples of matrix representations of operators:
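The construction $A_{ij}=\brae i\op A\kete j$ can also be checked numerically. A sketch in Python/NumPy, where the linear operator is an arbitrary illustrative map on $\mathbb{C}^3$ and the basis is the standard one:

```python
import numpy as np

N = 3
# An arbitrary linear operator on C^3 (an illustrative choice).
op = lambda v: np.array([2*v[0] + 1j*v[2], v[1] - v[0], 3*v[2]])

# Orthonormal basis: the standard basis vectors |e_i>.
e = np.eye(N)

# Matrix elements A_ij = <e_i| A |e_j>.
A = np.array([[np.vdot(e[i], op(e[j])) for j in range(N)] for i in range(N)])

# The representation of |c> = A|b> is the matrix-vector product A b.
b = np.array([1.0, 2.0, 3.0])
print(np.allclose(A @ b, op(b)))   # True
```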

Shankar 1.2, 1.3.1, Riley 8.1.2

Adjoint operators (Hermitian conjugates)

  1. For a linear operator $\op A$, we define its adjoint (or Hermitian conjugate) to be the operator $\op A^\dagger$ for which $$\bra u\op A^\dagger\ket v = \conj{\bra v\op A\ket u}$$ for all $\ket u,\ket v\in\set{V}$. Taking $\ket u$ and $\ket v$ to be orthonormal basis vectors $\kete i$ and $\kete j$ shows at once that the matrix elements of $\op A^\dagger$ are $$A^\dagger{}_{ij} = \brae i\op A^\dagger\kete j =\conj{\brae j\op A\kete i} = \conj{A_{ji}}\,,$$ i.e., the matrix $\mat A^\dagger$ is the conjugate transpose (Hermitian conjugate) of the matrix $\mat A$.
  2. You should be able to show that: $(\op A^\dagger)^\dagger=\op A$; $(\lambda\op A)^\dagger=\conj\lambda\,\op A^\dagger$; $(\op A+\op B)^\dagger=\op A^\dagger+\op B^\dagger$; and $(\op A\op B)^\dagger=\op B^\dagger\op A^\dagger$ (note the reversed order).
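The defining relation $\bra u\op A^\dagger\ket v=\conj{\bra v\op A\ket u}$ is easy to verify numerically when $\mat A^\dagger$ is taken to be the conjugate transpose. A sketch in Python/NumPy with arbitrary random complex matrices:

```python
import numpy as np

rng = np.random.default_rng(1)

# An arbitrary complex matrix and its adjoint (conjugate transpose).
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A_dag = A.conj().T

u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)

# <u| A^dagger |v>  versus  conj(<v| A |u>)
lhs = np.vdot(u, A_dag @ v)
rhs = np.conj(np.vdot(v, A @ u))
print(np.isclose(lhs, rhs))   # True

# (A B)^dagger = B^dagger A^dagger (note the reversed order).
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
print(np.allclose((A @ B).conj().T, B.conj().T @ A.conj().T))   # True
```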

Hermitian operators

  1. An operator $\op A$ is said to be self-adjoint (or Hermitian) if $\op A=\op A^\dagger$, so that $$\bra u\op A\ket v=\conj{\bra v\op A\ket u}$$ for all $\ket u,\ket v\in\set{V}$. From this, the matrix elements of $\op A$ satisfy $A_{ij}=\conj{A_{ji}}$: the matrix $\mat A=\mat A^\dagger$ is Hermitian.
  2. Note that in a real vector space (i.e., one in which the scalars are real numbers), the matrix of a self-adjoint operator will be real/symmetric, $A_{ij}=A_{ji}\in\set{R}$.
  3. Exercise:

Unitary operators

  1. An operator $\op U$ is said to be unitary if $\op U\op U^\dagger=\op U^\dagger\op U=\op1$, i.e. if $\op U^{-1}=\op U^\dagger$. The matrix representing a unitary operator therefore satisfies $\mat U\mat U^\dagger=\mat U^\dagger\mat U=\mat I$.
  2. Exercise:
  3. For a real vector space, a unitary operator can be represented by a real, orthogonal matrix, $\mat O$, so that $\mat O\tp{\mat O}=\mat I$, where $\tp{\mat O}$ denotes the transpose of $\mat O$.
  4. The columns of a unitary matrix can be regarded as an orthonormal set of vectors. The same is true of the rows.
  5. We can show that a change of basis can be effected by means of a unitary matrix: if $\basis{e_i}$ and $\basis{e'_i}$ are two orthonormal bases for $\set{V}^N$, the components of a vector $\ket b$ with respect to the new basis are $$b'_i=\braket{e'_i}b=\sum_k\braket{e'_i}{e_k}\braket{e_k}b=\sum_kU_{ik}b_k\,,$$ where $U_{ik}=\braket{e'_i}{e_k}$. The matrix $\mat U$ with these elements is unitary, since $\sum_kU_{ik}\conj{U_{jk}}=\sum_k\braket{e'_i}{e_k}\braket{e_k}{e'_j}=\braket{e'_i}{e'_j}=\delta_{ij}$.
  6. Aside on active and passive transformations. Just as in the case of a rotation, a given unitary transformation with matrix elements $U_{ik}$ can be interpreted in two ways. As we have just seen, the equation $b'_i=\sum_kU_{ik}b_k$ expresses the components of $\ket b$ with respect to one basis, $\basis{e'_i}$, in terms of the components of $\ket b$ with respect to another basis, $\basis{e_i}$. Here the vector $\ket b$ is the same in each case, but the "axes" have been changed; this corresponds to the passive view of rotations, met before in PHYS10672. But the same equation, $b'_i=\sum_kU_{ik}b_k$, has a different interpretation when the quantities $b'_i$ are instead the components of a different vector $\ket{b'}$ with respect to the original basis, $\basis{e_i}$. In this case $$\ket{b'}=\op U\ket b,\quad\text{i.e.,}\quad\ket{b'}\ne\ket b,$$ and the coefficients $U_{ik}$ are the matrix elements of the unitary operator $\op U$ relating the two vectors, $U_{ik}=\brae i\op U\kete k$. This corresponds to the active view of rotations, also met in PHYS10672. Both interpretations are regularly needed in physics, but it should always be clear which one is being used.
  7. The transformation of the matrix elements of operators to a new representation (basis) is almost as straightforward as the transformation of the components of a vector. In the new representation with basis vectors $\basis{e'_i}$ we have $$A'_{ij} = \bra{e'_i}\op A\ket{e'_j} =\bra{e'_i}\op1\op A\op1\ket{e'_j} =\sum_{k,l}\braket{e'_i}{e_k}\brae k\op A\kete l\braket{e_l}{e'_j},$$ where we have used the completeness relation twice. Noting that $\braket{e'_i}{e_k}=U_{ik}$ and $\braket{e_l}{e'_j}=U^\dagger{}_{lj}$ we find $$\mat A' = \mat U\mat A\mat U^\dagger.$$
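The orthonormality of the rows and columns of a unitary matrix, and a basis-independent consequence of $\mat A'=\mat U\mat A\mat U^\dagger$, can be checked numerically. A sketch in Python/NumPy, where the unitary matrix is generated (an illustrative trick) from the QR decomposition of a random complex matrix:

```python
import numpy as np

rng = np.random.default_rng(2)

# A random unitary matrix: the Q factor of a complex QR decomposition
# has orthonormal columns.
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
U, _ = np.linalg.qr(M)

# U U^dagger = U^dagger U = 1: rows and columns are orthonormal sets.
print(np.allclose(U @ U.conj().T, np.eye(3)))   # True
print(np.allclose(U.conj().T @ U, np.eye(3)))   # True

# Transforming an operator to the new basis: A' = U A U^dagger.
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A_new = U @ A @ U.conj().T

# A change of basis leaves basis-independent quantities alone, e.g. the trace.
print(np.isclose(np.trace(A_new), np.trace(A)))   # True
```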

Eigenvalue problems

  1. Revision of your existing knowledge of eigenvalue problems. Note that you are expected to be able to apply this knowledge, as you have met many examples already in PHYS10071, PHYS10672 (principal axes of symmetric tensors) and PHYS20401 (small oscillations problems).
  2. Eigenvalues and eigenvectors of Hermitian operators, $\op A=\op A^\dagger$: the eigenvalues are real; eigenvectors belonging to distinct eigenvalues are orthogonal; and the eigenvectors can be chosen to form an orthonormal basis for $\set{V}^N$.
  3. Eigenvalues and eigenvectors of unitary operators, $\op U^{-1}=\op U^\dagger$. If $\op U$ is unitary and the eigenvalue equation is $$\op U\ket{v_j}=\omega_j\ket{v_j},$$ then: the eigenvalues have unit modulus, $|\omega_j|=1$; eigenvectors belonging to distinct eigenvalues are orthogonal; and the eigenvectors can be chosen to form an orthonormal basis for $\set{V}^N$. For the proof of the first two statements, see Examples 6.58.
  4. "Diagonalization" of Hermitian and unitary operators
  5. Aside [not needed for the examples]: Diagonalization of commuting operators -- Operators $\op A$ and $\op B$ are said to commute if $\op A\op B=\op B\op A$.
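The key eigenproperties of Hermitian and unitary operators can be illustrated numerically. A sketch in Python/NumPy (the matrices are random illustrative examples; `np.linalg.eigh` is NumPy's routine for Hermitian eigenproblems):

```python
import numpy as np

rng = np.random.default_rng(3)

# A random Hermitian matrix: H = M + M^dagger.
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = M + M.conj().T

# eigh returns real eigenvalues and an orthonormal set of
# eigenvectors (the columns of V).
evals, V = np.linalg.eigh(H)
print(np.all(np.isreal(evals)))                 # True: real eigenvalues
print(np.allclose(V.conj().T @ V, np.eye(4)))   # True: orthonormal eigenvectors

# In the eigenbasis H is diagonal: V^dagger H V = diag(evals).
print(np.allclose(V.conj().T @ H @ V, np.diag(evals)))   # True

# A unitary matrix built as exp(iH) has eigenvalues of unit modulus.
U = V @ np.diag(np.exp(1j * evals)) @ V.conj().T
omegas = np.linalg.eigvals(U)
print(np.allclose(np.abs(omegas), 1.0))         # True
```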

Functions of operators

  1. Because we know how to add and multiply operators, we can build up simple polynomial expressions such as $\op A^2$, $2\op A^3+\op1$, etc., which are functions of the operator $\op A$. Perhaps it is possible to extend this idea to functions defined via a power series, $$f(\op A)=\sum_{r=0}^\infty f_r\op A^r\,,\quad\text{where}\quad\op A^0\equiv\op1.$$ To be more specific, suppose that $\op A$ is an operator on $\set V$ with eigenvalues $\{a_i\}$ and eigenvectors $\basis{u_i}$ that span $\set V$; this will certainly be the case if $\op A$ is Hermitian or unitary. To find out how $f(\op A)$ acts on a general vector $\ket b=\sum_ib_i\ket{u_i}$, it would be enough to know its action on $\ket{u_i}$: $$f(\op A)\ket{u_i} =\sum_{r=0}^\infty f_r\op A^r\ket{u_i} =\sum_{r=0}^\infty f_r\,a_i^r\ket{u_i} =f(a_i)\ket{u_i}.$$ For this to give a definite result for every $\ket{u_i}$, the series for $f(a_i)$ should converge for each $a_i$. Clearly, then, we can't say whether $f(\op A)$ is well defined without knowing the spectrum of $\op A$ (its eigenvalues) and the radius of convergence of the power series.
  2. One example of a function that can be very usefully extended to operators is the exponential function, $$f(z)=\exp(z)=\sum_{r=0}^\infty\frac{z^r}{r!}\,,$$ which has no singularities in the finite complex plane (it is an entire function), so that its Taylor series about $z=0$ converges for all finite $z$. Now, if $\op A$ is an operator on $\set V^N$, its eigenvalues will be finite and $$f(\op A)=\exp(\op A)=\sum_{r=0}^\infty\frac{\op A^r}{r!}$$ will have a well-defined action on every $\ket{u_i}$: $$\exp(\op A)\ket{u_i}=\exp(a_i)\ket{u_i}\,;$$ we see that $\ket{u_i}$ is an eigenvector of $\exp(\op A)$ with eigenvalue $e^{a_i}$. Therefore, in the representation in which $\op A$ is diagonal, $\exp(\op A)$ is also diagonal: $$\exp(\op A)\longrightarrow\diagmat{e^{a_1}}{e^{a_2}}{e^{a_N}}.$$
  3. Examples (not examinable):
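As a further (equally non-examinable) numerical sketch, the power-series definition of $\exp(\op A)$ can be compared with the diagonal-representation result $\mathrm{diag}(e^{a_1},\dots,e^{a_N})$ transformed back to the original basis; the matrix below is a random Hermitian example:

```python
import numpy as np

rng = np.random.default_rng(4)

# A random Hermitian matrix, so its eigenvectors span the space.
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = M + M.conj().T

# exp(A) from the eigen-decomposition: the diagonal matrix of e^{a_i},
# transformed back to the original basis.
a, V = np.linalg.eigh(A)
expA_diag = V @ np.diag(np.exp(a)) @ V.conj().T

# exp(A) from a truncated power series sum_r A^r / r!.
expA_series = np.zeros_like(A)
term = np.eye(3, dtype=complex)   # A^0 = identity
for r in range(1, 50):
    expA_series = expA_series + term
    term = term @ A / r           # term becomes A^r / r!

print(np.allclose(expA_diag, expA_series))   # True
```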