2 SCALAR PRODUCT, NORMS AND ANGLES

Definition

The scalar product (or, inner product, or dot product) between two vectors x,y \in \mathbb{R}^n is the scalar denoted x^Ty, and defined as

    \begin{align*} x^T y = \sum\limits_{i=1}^{n} x_i y_i \end{align*}

The motivation for our notation above will come later when we define the matrix-matrix product. The scalar product is also sometimes denoted \langle x,y \rangle, a notation that originates in physics.
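As a minimal sketch of the definition above, the scalar product can be computed directly from the sum formula. The function name `dot` and the sample vectors are illustrative choices, not part of the course material; only the Python standard library is assumed.

```python
def dot(x, y):
    """Scalar product x^T y of two vectors given as equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    # Sum of the element-wise products x_i * y_i for i = 1, ..., n.
    return sum(xi * yi for xi, yi in zip(x, y))


# Illustrative vectors (hypothetical data): 1*4 + 2*5 + 3*6 = 32
print(dot([1, 2, 3], [4, 5, 6]))
```

Note that the sum is symmetric in its arguments, so `dot(x, y)` and `dot(y, x)` return the same value.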

Orthogonality

We say that two vectors x,y \in \mathbb{R}^n are orthogonal if x^Ty=0.

Example 1: The two vectors in \mathbb{R}^3:

     \begin{align*} x = \begin{pmatrix} 1 \\ 1 \\ 3 \end{pmatrix}, \quad y = \begin{pmatrix} 4 \\ -1 \\ -1 \end{pmatrix} \end{align*}

are orthogonal, since

    \begin{align*} x^T y= \underbrace{1 \times 4}_{x_1 \times y_1} + \underbrace{1 \times(-1)}_{x_2 \times y_2} + \underbrace{3 \times (-1)}_{x_3 \times y_3} = 0 . \end{align*}
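The orthogonality test in Example 1 can be checked numerically. The sketch below is one possible way to do it; the helper names `dot` and `is_orthogonal`, and the tolerance `tol`, are assumptions introduced here for illustration. A tolerance is used because, with floating-point entries, a scalar product that is zero in exact arithmetic may come out as a very small non-zero number.

```python
def dot(x, y):
    """Scalar product x^T y of two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


def is_orthogonal(x, y, tol=1e-12):
    """Return True if x^T y is zero up to the (assumed) tolerance tol."""
    return abs(dot(x, y)) <= tol


# The two vectors of Example 1.
x = [1, 1, 3]
y = [4, -1, -1]

print(dot(x, y))            # 4 - 1 - 3 = 0
print(is_orthogonal(x, y))  # True
```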

2.2. Norms

Definition

Measuring the size of a scalar value is unambiguous — we just take the magnitude (absolute value) of the number. However, when we deal with higher dimensions and try to define the notion of size, or length, of a vector, we are faced with many possible choices. These choices are encapsulated in the notion of norm.

Norms are real-valued functions that satisfy a basic set of rules that any sensible notion of size should obey. You can consult the formal definition of a norm here. The norm of a vector v is usually denoted ||v||.

2.3. Three popular norms

In this course, we focus on the following three popular norms for a vector x \in \mathbb{R}^n:

The Euclidean norm (or l_2-norm):

    \begin{align*} ||x||_2 := \sqrt{\sum\limits_{i=1}^{n} x_i^2} = \sqrt{x^T x}, \end{align*}

corresponds to the usual notion of distance in two or three dimensions. The set of points with equal l_2-norm is a circle (in 2D), a sphere (in 3D), or a hyper-sphere in higher dimensions.

The l_1-norm:

    \begin{align*} ||x||_1 := \sum\limits_{i=1}^{n} |x_i|, \end{align*}

corresponds to the distance traveled on a rectangular grid to go from one point to another.

The l_\infty-norm:

    \begin{align*} ||x||_\infty := \max\limits_{1\leq i\leq n} |x_i|, \end{align*}

is useful in measuring peak values.

Examples:

  • A given vector will in general have different "lengths" under different norms. For example, the vector x = [1, -2, 3]^T yields ||x||_2 = \sqrt{14} \approx 3.7417, ||x||_1 = 6, and ||x||_\infty = 3.
  • Sample standard deviation.
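The three norms can be computed directly from their definitions. The following is a minimal sketch using only the Python standard library; the function names `norm2`, `norm1` and `norm_inf` are illustrative and not taken from the course.

```python
import math


def norm2(x):
    """Euclidean (l_2) norm: square root of the sum of squared entries."""
    return math.sqrt(sum(xi * xi for xi in x))


def norm1(x):
    """l_1 norm: sum of the absolute values of the entries."""
    return sum(abs(xi) for xi in x)


def norm_inf(x):
    """l_infinity norm: largest absolute value among the entries."""
    return max(abs(xi) for xi in x)


x = [1, -2, 3]
print(round(norm2(x), 4))  # sqrt(14), about 3.7417
print(norm1(x))            # 1 + 2 + 3 = 6
print(norm_inf(x))         # 3
```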

2.4. Cauchy-Schwarz inequality

The Cauchy-Schwarz inequality allows us to bound the scalar product of two vectors in terms of their Euclidean norms.

Theorem: Cauchy-Schwarz inequality

For any two vectors x,y \in \mathbb{R}^n, we have

    \begin{align*} x^T y \leq ||x||_2 ||y||_2. \end{align*}

The above inequality is an equality if and only if x, y are collinear and point in the same direction (that is, one is a non-negative multiple of the other). In other words:

    \begin{align*} \max\limits_{x: ||x||_2\leq 1} x^T y = ||y||_2, \end{align*}

with optimal x given by x^* = y/||y||_2 if y is non-zero.

For a proof, see here. The Cauchy-Schwarz inequality can be generalized to other norms, using the concept of dual norm.
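A quick numerical check can make the theorem concrete. The sketch below verifies the inequality on one pair of vectors and then evaluates x^T y at the candidate maximizer x^* = y/||y||_2. The vectors and the helper names `dot` and `norm2` are hypothetical choices made for illustration; a check on one example is of course not a proof.

```python
import math


def dot(x, y):
    """Scalar product x^T y of two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm2(x):
    """Euclidean norm of x."""
    return math.sqrt(dot(x, x))


# Illustrative vectors (hypothetical data).
x = [1, 1, 3]
y = [4, -1, 2]

lhs = dot(x, y)             # 4 - 1 + 6 = 9
rhs = norm2(x) * norm2(y)   # sqrt(11) * sqrt(21), about 15.2
print(lhs <= rhs)           # True

# Candidate maximizer over the unit ball: x* = y / ||y||_2.
x_star = [yi / norm2(y) for yi in y]
print(math.isclose(norm2(x_star), 1.0))            # x* has unit norm
print(math.isclose(dot(x_star, y), norm2(y)))      # x*^T y equals ||y||_2
```

Any other x with ||x||_2 at most 1 gives a scalar product no larger than ||y||_2, which is what the maximization statement expresses.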

2.5. Angles between vectors

When neither of the vectors x, y is zero, we can define the corresponding angle as \theta such that

    \begin{align*} \cos\theta = \frac{x^Ty}{||x||_{2} \, ||y||_{2}} \end{align*}

Applying the Cauchy-Schwarz inequality above to (x,y) and (x,-y) we see that indeed the number above is in [-1,1].

The notion above generalizes the usual notion of angle between two directions in two dimensions, and is useful in measuring the similarity (or, closeness) between two vectors. When the two vectors are orthogonal, that is, x^T y=0, we do obtain that their angle is \theta = 90^\circ.

See also: Similarity of two documents.
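The angle formula of this section can be sketched in a few lines. The function name `angle_degrees` is an assumption made for illustration. The cosine is clamped to [-1, 1] before calling `math.acos`, since rounding errors can push the computed ratio slightly outside that interval even though Cauchy-Schwarz guarantees it lies inside in exact arithmetic.

```python
import math


def dot(x, y):
    """Scalar product x^T y of two equal-length sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


def norm2(x):
    """Euclidean norm of x."""
    return math.sqrt(dot(x, x))


def angle_degrees(x, y):
    """Angle between two non-zero vectors, in degrees."""
    nx, ny = norm2(x), norm2(y)
    if nx == 0 or ny == 0:
        raise ValueError("the angle is undefined when a vector is zero")
    cos_theta = dot(x, y) / (nx * ny)
    # Guard against floating-point rounding just outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Orthogonal vectors from Example 1: the angle is approximately 90 degrees.
print(angle_degrees([1, 1, 3], [4, -1, -1]))

# Two directions in the plane (hypothetical data): approximately 45 degrees.
print(angle_degrees([1, 0], [1, 1]))
```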

License

Linear Algebra and Applications Copyright © 2023 by VinUniversity is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, except where otherwise noted.
