APPLICATION: DATA VISUALIZATION BY PROJECTION ON A LINE

Alicia Y. Tsai; Authors: Laurent EI Ghaoui; Contributors: Phan Hoang, Tony Tin, Ha To Tam, Nguyen Van Kep, Le Dinh Nam, Juliette Decugis, Nguyen Quoc Cuong.; Giuseppe C. Calafiore

7 APPLICATION: DATA VISUALIZATION BY PROJECTION ON A LINE

7.1. Senate voting data

In this section, we are focussing on a data set containing the votes of US Senators. This dataset can be represented as a collection of $n =100$ vectors $x_j$ , $j = 1, \cdots, n$ in $\mathbb{R}^m$ , with $m=645$ the number of bills, and $n=100$ the number of Senators. Thus, $x_j$ contains all the votes of Senator $j$ , and the $i$ -th component of $x_j$ contains the vote of that Senator on bill $i$ .

Senate voting matrix: This image shows the votes of the $n=100$ Senators in the 2004-2006 US Senate, for a total of $m=645$ bills. “Yes” votes are represented as $1$ ‘s, “No” as $-1$ ‘s, and the other votes are recorded as $0$ . Each row represents the votes of a single Senator, and each column contains the votes of all Senators for a particular bill. The vectors $x_j$ , $j=1, \cdots, m =645$ can be read as rows in the picture.

7.2. Visualization of high-dimensional data via projection

As seen in the picture above, simply plotting the raw data is often not very informative.

We can try to visualize the data set, by projecting each data point (each row or column of the matrix) on (say) a one-, two- or three-dimensional space. Each ‘‘view’’ corresponds to a particular projection, that is, a particular one-, two- or three-dimensional subspace on which we choose to project the data. Let us detail what it means to project on a one-dimensional set, that is, on a line.

Projecting on a line allows to assign a single number, or ‘‘score’’, to each data point, via a scalar product. We choose a (normalized) direction $u \in \mathbb{R}^m$ , and a scalar $v \in \mathbb{R}$ . This corresponds to the affine ‘‘scoring’’ function $f: \mathbb{R}^m \rightarrow \mathbb{R}$ , which, to a generic data point $x \in \mathbb{R}^m$ , assigns the value

$\begin{align*}f(x) = u^Tx+v\end{align*}$

We thus obtain a vector of values $f \in \mathbb{R}^n$ , with components $f_j = u^Tx_j + v$ , $j = 1, \cdots, n$ . It is often useful to center these scores around zero. This can be done by choosing v such that

$\begin{align*}0 = \sum\limits_{j=1}^{n} (u^Tx_j + v) = u^T \left(\sum\limits_{j=1}^{n} x_j \right) +n\cdot v\end{align*}$

The zero-mean condition implies $v = u^T \widehat{x}$ , where

$\begin{align*}\widehat{x}:=\frac{1}{n} \sum\limits_{j=1}^{n} x_j \in \mathbb{R}^m\end{align*}$

is the vector of sample averages of the different data points.

The vector $\widehat{x}$ can be interpreted as the ‘‘average response’’ across data points (the average vote across Senators in our running example). The values of our scoring function can now be expressed as

$\begin{align*}f(x) = u^T(x-\widehat{x})\end{align*}$

In order to be able to compare the relative merits of different directions, we can assume, without loss of generality, that the direction vector u is normalized (so that $||u||_2=1$ ).

Note that our definition of $f(x)$ above is consistent with the idea of projecting the data points $x_i - \widehat{x}$ on the line passing through the origin and with normalized direction $u$ . Indeed, the component of $x_i - \widehat{x}$ on the line is $u^T(x_i - \widehat{x})$ .

In the Senate voting example above, a particular projection (that is, a direction in $\mathbb{R}^m$ ) corresponds to assigning a ‘‘score’’ to each Senator, and thus represents all the Senators as a single value on a line. We will project the data along a vector in the ‘‘bill’’ space, which is $\mathbb{R}^m$ . That is, we are going to form linear combinations of the bills, so that the $m=642$ votes for each Senator are reduced to a single number, or ‘‘score’’. Since we centered our data, the average score (across Senators) is zero.

7.3. Examples

Projection in a random direction

Scores obtained with random direction: This image shows the values of the projections of the Senator’s votes $x_j - \widehat{x}$ (that is, with average across Senators removed) on a (normalized) ‘‘random bill’’ direction. Each score is a number given by $u_{rand}^T(x_j - \widehat{x})$ , with $u_{rand}$ a random vector (normalized to have unit Euclidean norm). We show the party affiliation, with the green color corresponding to the Independent Senator Jeffords. This projection shows no obvious structure; in particular, it does not reveal the party affiliation.

Projection on the ‘‘all-ones’’ vector

Clearly, not all directions are ‘‘good’’, in the sense of producing informative plots. Here, we discuss a general principle that allows us to choose an ‘‘informative’’ direction. But for this data set, a good guess could be to choose the direction that corresponds to the ‘‘average bill’’. That is, we choose the direction $u$ to be the parallel to the vector of ones in $\mathbb{R}^m$ , scaled appropriately so that its Euclidean norm is one.



	Scores obtained with all-ones direction: This image shows the values of the projections of the Senator’s votes $x_j - \widehat{x}$ (that is, with average across Senators removed) on a (normalized) ‘‘average bill’’ direction. Each score is a number given by $u_{avg}^T(x_j - \widehat{x})$ , with $u_{avg}$ a vector with elements all equal to $1/\sqrt{m}$ (it is normalized to unit Euclidean norm). This projection reveals clearly the party affiliation of each senator, although the information was not available to the projection algorithm.
	The interpretation is that the behavior of senators on an ‘‘average bill’’ almost fully determines her or his party affiliation. Note the range of the plot (about $[-6;6]$ , which is higher than that of the random vector (about $[-1;1]$ ), allowing a better spread of scores.