# PCA and alternating regression {#alt-pca}
```{r setup}
#| include: false
source("_common.R")
```
Let $X_1, \ldots, X_n \in \mathbb{R}^p$ be the observations for which we
want to compute the corresponding PCA. Without loss of generality we can
always assume that
$$
\frac{1}{n} \sum_{i=1}^n X_i \ = \ (0, \ldots, 0)^\top \, ,
$$
so that the sample covariance matrix $S_n$ is
$$
S_n \ = \ \frac{1}{n-1} \, \sum_{i=1}^n X_i \, X_i^\top \, .
$$
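As a quick sanity check, the following sketch uses simulated data (the
objects `x`, `xc` and `s_n` are introduced here purely for illustration)
to verify that this formula agrees with R's `cov()`:
```{r}
set.seed(123)
n <- 200
p <- 5
x <- matrix(rnorm(n * p), n, p)               # rows are X_1, ..., X_n
xc <- scale(x, center = TRUE, scale = FALSE)  # centered observations
s_n <- crossprod(xc) / (n - 1)                # (1/(n-1)) sum_i X_i X_i^T
max(abs(s_n - cov(x)))                        # essentially zero
```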
We saw in class that if $B \in \mathbb{R}^{p \times k}$ has in its
columns the eigenvectors of $S_n$ associated with its $k$ largest
eigenvalues, then
$$
\frac{1}{n} \, \sum_{i=1}^n \left\| X_i - P( L_{B}, X_i
) \right\|^2 \ \le \
\frac{1}{n} \, \sum_{i=1}^n \left\| X_i - P( L, X_i
) \right\|^2 \, ,
$$
for any $k$-dimensional linear subspace $L \subset \mathbb{R}^p$,
where $P( L, X)$ denotes the orthogonal projection of $X$ onto the
subspace $L$, $P( L_{B}, X) = {B} {B}^\top X$ (whenever ${B}$ is chosen
so that ${B}^\top {B} = I$), and $L_{B}$ denotes the subspace spanned by
the columns of $B$.
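The following sketch, which reuses the simulated `xc` and `s_n` from
above (the helper `proj_err` is ours), illustrates this optimality by
comparing the average squared reconstruction error of the top-$k$
eigenvector subspace with that of a random $k$-dimensional subspace:
```{r}
k <- 2
b <- eigen(s_n)$vectors[, 1:k]            # orthonormal, top-k eigenvectors
proj_err <- function(basis, dat) {
  fitted <- dat %*% basis %*% t(basis)    # rows are P(L_B, X_i) = B B^T X_i
  mean(rowSums((dat - fitted)^2))         # (1/n) sum_i || X_i - P(L_B, X_i) ||^2
}
set.seed(456)
q <- qr.Q(qr(matrix(rnorm(p * k), p, k))) # a random orthonormal basis
c(eigenvectors = proj_err(b, xc), random = proj_err(q, xc))
```
The eigenvector basis should give the smaller of the two errors.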
We will now show that, instead of computing the spectral decomposition of
$S_n$, principal components can also be obtained via a sequence of
"alternating least squares" problems. To fix ideas we will consider the
case $k=1$, but the method extends directly to arbitrary values of
$k$.
When $k=1$ we need to solve the following problem:
\begin{equation}
\min_{\left\| a \right\|=1, v \in \mathbb{R}^n} \
\sum_{i=1}^n \left\| X_i - a \, v_i \right\|^2,
(\#eq:pca1)
\end{equation}
where $v = (v_1, \ldots, v_n)^\top$. In general, for arbitrary $k$, the
corresponding problem is
$$
\min_{ A^\top A = I, \; v_1, \ldots, v_n \in \mathbb{R}^k} \
\sum_{i=1}^n \left\| X_i - A \, v_i \right\|^2 \, .
$$
The objective function in Equation \@ref(eq:pca1) can also be written
as
\begin{equation}
\sum_{i=1}^n \sum_{j=1}^p \left( X_{i,j} - a_j \, v_i \right)^2 \, , (\#eq:pca2)
\end{equation}
and hence, for a given vector $a$, the minimizing values of
$v_1, \ldots, v_n$ in Equation \@ref(eq:pca2) can be found by solving $n$
separate least squares problems:
$$
v_\ell \, = \, \arg \, \min_{d \in \mathbb{R}}
\sum_{j=1}^p \left( X_{\ell,j} - a_j \, d \right)^2 \, , \qquad
\ell = 1, \ldots, n \, .
$$
Similarly, for given values $v_1, \ldots, v_n$, the entries of $a$ can
be found by solving $p$ separate least squares problems:
$$
a_r \, = \, \arg \, \min_{d \in \mathbb{R}}
\sum_{i=1}^n \left( X_{i, r} - d \, v_i \right)^2 \, , \qquad
r = 1, \ldots, p \, .
$$
We can then set $a \leftarrow a / \| a \|$ and iterate: compute new
$v$'s, then a new $a$, and so on, until the iterates stabilize.
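A minimal sketch of the full iteration for $k = 1$ (our own
implementation; the name `alt_pca1` and the arguments `max_it` and `tol`
are arbitrary), again using the simulated `xc` and `s_n`, checks that the
resulting direction matches the leading eigenvector of $S_n$ up to sign:
```{r}
alt_pca1 <- function(dat, max_it = 100, tol = 1e-10) {
  a <- rnorm(ncol(dat))
  a <- a / sqrt(sum(a^2))                       # random starting direction
  for (it in seq_len(max_it)) {
    v <- drop(dat %*% a) / sum(a^2)             # v-step: n least squares fits
    a_new <- drop(crossprod(dat, v)) / sum(v^2) # a-step: p least squares fits
    a_new <- a_new / sqrt(sum(a_new^2))         # normalize
    if (sum((a_new - a)^2) < tol) break
    a <- a_new
  }
  a_new
}
set.seed(789)
a_hat <- alt_pca1(xc)
e1 <- eigen(s_n)$vectors[, 1]  # leading eigenvector of S_n
abs(sum(a_hat * e1))           # close to 1: same direction up to sign
```
Note that both updates are available in closed form, so no explicit call
to `lm()` is needed inside the loop.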