normalize.Rd
Normalize the columns of a projection matrix. For each eigenvector, swap the signs of the vector elements if the first entry is negative. See "Details" for more information.
normalize(B, d)
B | A projection matrix: often the matrix of the left singular vectors given by the Singular Value Decomposition of a data matrix or Grammian. |
---|---|
d | The number of columns of B. |
A matrix of the eigenvectors or left singular vectors in B
transformed to be the left singular vectors of the original data matrix.
This function is designed to reconstruct the original first d
left singular vectors of a data matrix from the first d
eigenvectors of the Grammian of that data matrix. Basically, after the
data matrix has been centred, the left singular vectors of that data
matrix and the left singular vectors of the Grammian of that data matrix
are equal up to a sign. This function reverses that sign so that the two
sets of singular vectors are equal.
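The sign convention can be illustrated with a minimal sketch. It assumes the rule is exactly the one stated in the description (flip a column when its first entry is negative). The helper name flip_signs is hypothetical and is not the package's own normalize() implementation; the simulated matrix X is likewise only for illustration.

```r
# Hypothetical helper (not the package's normalize()): flip the sign of every
# column whose first entry is negative.
flip_signs <- function(B, d = ncol(B)) {
  # Keep the first d columns, preserving the matrix shape even when d = 1
  B <- B[, seq_len(d), drop = FALSE]
  # +1 for columns whose first entry is non-negative, -1 for columns to flip
  signs <- ifelse(B[1, ] < 0, -1, 1)
  # Multiply column j by signs[j]
  sweep(B, 2, signs, `*`)
}

set.seed(1)
X <- scale(matrix(rnorm(50), nrow = 10, ncol = 5))

V     <- svd(X)$v   # one valid set of singular vectors
V_neg <- -V         # an equally valid set with every sign reversed

# After applying the convention, both sets collapse to the same matrix
A1 <- flip_signs(V, d = 2)
A2 <- flip_signs(V_neg, d = 2)
```

Because each singular vector is only defined up to a sign, fixing the sign of the first entry is one simple way to make two decompositions directly comparable.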
Consider the internal workings of the aespca function. This
"sign flipping" changes the eigenvectors of xtx into the
left singular vectors of scale(X, center = TRUE, scale = TRUE).
Instead of calculating the Grammian, regularising it (by adding some small
\(\lambda\) value to the diagonal), taking the SVD of the regularized
Grammian, and extracting the first \(d\) eigenvectors, why don't we just
extract the first \(d\) singular vectors directly from the scaled data
matrix itself? The regularisation effect only inflates the singular- or
eigen-values anyway, so it has no effect on the singular vectors in any
way. Moreover, the aespca function does not even call for
the eigen-values at all, so this whole process is superfluous. The only
wrinkle is adapting the lars.lsa and aespca functions to only operate on
the data matrix.
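The claim that regularisation leaves the vectors untouched can be checked numerically. The sketch below is only an illustration under stated assumptions: the Gramian is taken to be crossprod(X) on simulated data, and lambda = 0.1 is an arbitrary value, not one used by aespca. Which of the svd() outputs (u or v) matches the eigenvectors depends on how the Gramian is formed; with crossprod(X) it is v.

```r
set.seed(2)
X <- scale(matrix(rnorm(200), nrow = 40, ncol = 5))

G      <- crossprod(X)               # Gramian, assumed here to be t(X) %*% X
lambda <- 0.1                        # arbitrary ridge value for illustration
G_reg  <- G + lambda * diag(ncol(X)) # regularised Gramian

e_raw <- eigen(G, symmetric = TRUE)
e_reg <- eigen(G_reg, symmetric = TRUE)

# The eigenvalues are shifted by lambda ...
max_value_gap <- max(abs(e_reg$values - (e_raw$values + lambda)))

# ... while the eigenvectors agree up to a sign per column
max_vector_gap <- max(abs(abs(e_raw$vectors) - abs(e_reg$vectors)))

# The same vectors (up to sign) come straight from the SVD of the data matrix
v_direct       <- svd(X)$v
max_direct_gap <- max(abs(abs(e_reg$vectors) - abs(v_direct)))
```

All three gaps should be at the level of floating-point error, which is consistent with skipping the Gramian and working from the scaled data matrix directly.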
Furthermore, the lars function can take in the
full data, instead of just a Grammian. As an enhancement, we should either
update our copy of the lars function in lars.lsa, or make a
call to the exported lars function. ENHANCEMENT.
# DO NOT CALL THIS FUNCTION DIRECTLY.
# Use AESPCA_pVals() instead