Optimal set of least-squares via SVD
Theorem: optimal set of ordinary least-squares
The optimal set of the OLS problem can be expressed as where |
Proof: The following proof relies on the SVD of , and the rotational invariance of the Euclidean norm.
Optimal value of the problem
Using the SVD we can find the optimal set to the least-squares optimization problem
Indeed, if is an SVD of
, the problem can be written
where we have exploited the fact that the Euclidean norm is invariant under the orthogonal transformation . With
, and
, and changing the variable
to
, we express the above as
Expanding the terms, and using the partitioned notations ,
, we obtain
Since is invertible, we can reduce the first term in the objective to zero with the choice
. Hence the optimal value is
We observe that the optimal value is zero if and only if , which is exactly the same as
.
Optimal set
Let us detail the optimal set for the problem. The variable is partly determined, via its first
components:
. The remaining
variables contained in
are free, as
does not appear in the objective function of the above problem.
Thus, optimal points are of the form , with
,
, and
free.
To express this in terms of the original SVD of , we observe that
means that
where is partitioned as
, with
and
. Similarly, the vector
can be expressed as
, with
formed with the first
columns of
. Thus, any element
in the optimal set is of the form
where . (We will soon explain the acronym appearing in the subscript.) The free components
correspond to the degrees of freedom allowed to by the nullspace of
.
Minimum-norm optimal point
The particular solution to the problem, , is the minimum-norm solution, in the sense that it is the element of
that has the smallest Euclidean norm. This is best understood in the space of
-variables.
Indeed, the particular choice corresponds to the element in the optimal set that has the smallest Euclidean norm. Indeed, the norm of
is the same as that of its rotated version,
. the first
elements in
are fixed, and since
, we see that the minimal norm is obtained with
.
Optimal set via the pseudo-inverse
The matrix , which appears in the expression of the particular solution
mentioned above, is nothing else than the pseudo-inverse of
, which is denoted
. Indeed, we can express the pseudo-inverse in terms of the SVD as
With this convention, the minimum-norm optimal point is . Recall that the last
columns of
form a basis for the nullspace of
. Hence the optimal set of the problem is
When is full column rank
, the optimal set reduces to a singleton (a set with only one element), as the nullspace is
. The unique optimal point expresses as