"

4 ORTHOGONALIZATION: THE GRAM-SCHMIDT PROCEDURE

A basis u_1, \ldots, u_n is said to be orthogonal if u_i^Tu_j = 0 when i \neq j. If in addition, ||u_i||_2=1, we say that the basis is orthonormal.

Example 1: An orthonormal basis in \mathbb{R}^2. The collection of vectors \{u_1, u_2\}, with

    \begin{align*} u_1 =\frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad & u_2 = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \end{align*}

forms an orthonormal basis of \mathbb{R}^2.

4.1. What is orthogonalization?

Orthogonalization refers to a procedure that finds an orthonormal basis of the span of given vectors.

Given vectors a_1, \cdots, a_k \in \mathbb{R}^n, an orthogonalization procedure computes vectors q_1, \cdots, q_r \in \mathbb{R}^n such that

    \begin{align*} S:= \mathbf{span}\{a_1, \cdots, a_k\} = \mathbf{span}\{q_1, \cdots, q_r\}, \end{align*}

where r is the dimension of S, and

    \begin{align*} q_i^Tq_j = 0 \hspace{0.05in} (i \neq j), q_i^Tq_i = 1, \hspace{0.05in} 1\leq i, j \leq r. \end{align*}

That is, the vectors (q_1, \cdots, q_r) form an orthonormal basis for the span of the vectors (a_1, \cdots, a_k).

4.2. Basic step: projection on a line

A basic step in the procedure consists in projecting a vector on a line passing through zero. Consider the line

    \begin{align*} L(q) := \{tq: t\in \mathbb{R}\}, \end{align*}

where q \in \mathbb{R}^n is given, and normalized (||q||_2=1).

The projection of a given point a \in \mathbb{R}^n on the line is a vector v located on the line, that is closest to a (in Euclidean norm). This corresponds to a simple optimization problem:

    \begin{align*} \min\limits_{t} ||a-tq||_2 \end{align*}

The vector a_{proj}:= t^*q, where t^* is the optimal value, is referred to as the projection of a on the line L(q). As seen here, the solution of this simple problem has a closed-form expression:

    \begin{align*} a_{\text{proj}} = (q^Ta)q. \end{align*}

Note that the vector x can now be written as a sum of its projection and another vector that is orthogonal to the projection:

    \begin{align*} a = (a - a_{\text{proj}}) + a_{\text{proj}} = (a - (q^Ta)q) + (q^Ta)q. \end{align*}

where (a - a_{proj}) = (a - (q^Ta)q) and a_{proj} = (q^Ta)q are orthogonal. The vector a = (a - a_{proj}) can be interpreted as the result of removing the component of a along q.

4.3. Gram-Schmidt procedure

The Gram-Schmidt procedure is a particular orthogonalization algorithm. The basic idea is to first orthogonalize each vector w.r.t. previous ones; then normalize result to have norm one.

Case when the vectors are independent

Let us assume that the vectors a_1, \cdots, a_n are linearly independent. The GS algorithm is as follows.

Gram-Schmidt procedure:

    1. set \tilde{q_1} = a_1.
    2. normalize: set q_1 = \tilde{q_1}/ ||\tilde{q_1}||_2.
    3. remove component of q_1 in a_2: set \tilde{q_2} = a_2 - (a_2^Tq_1)q_1.
    4. normalize: set q_2 = \tilde{q_2}/ ||\tilde{q_2}||_2.
    5. remove components of q_1, q_2 in a_3: set  \tilde{q_3} = a_3 - (a_3^Tq_1)q_1 - (a_3^Tq_2)q_2.
    6. normalize: set q_3 = \tilde{q_3}/ ||\tilde{q_3}||_2.
    7. etc.
The image of the left shows the GS procedure applied to the case of two vectors in two dimensions. We first set the first vector to be a normalized version of the first vector a_1. Then we remove the component of a_2 along the direction q_1. The difference is the (un-normalized) direction  \tilde{q_2} , which becomes q_2 after normalization. At the end of the process, the vectors q_1, q_2 have both unit length and are orthogonal to each other.

The GS process is well-defined, since at each step \tilde{q_i} \neq 0 (otherwise this would contradict the linear independence of the a_i‘s).

General case: the vectors may be dependent

It is possible to modify the algorithm to allow it to handle the case when the a_i‘s are not linearly independent. If at step i, we find \tilde{q_i} = 0 , then we directly jump at the next step.

Modified Gram-Schmidt procedure:

  1. set r=0.
  2. for i = 1, \cdots, n:
    1. set \tilde{q} = a_i - \sum_{j=1}^{r}(q_j^Ta_i)q_j.
    2. if \tilde{q}\neq 0, r = r+1; q_r = \tilde{q}/||\tilde{q}||_2.

On exit, the integer r is the dimension of the span of the vectors a_1, \cdots, a_k.

License

Icon for the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License

Linear Algebra and Applications Copyright © 2023 by VinUiversity is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, except where otherwise noted.