Efficient implementation of covariance updates
How do I implement the following efficiently?
$X \in \mathbb{R}^{N \times p}$, $y \in \mathbb{R}^N$, $w \in \mathbb{R}^{p}$, $p \gg N$

$$f(X, y, w) = \sum_{k \,:\, |w_k| > 0} (x_j, x_k)\, w_k = (\tilde{X}[:, j],\, \tilde{X})\, \tilde{w}$$
```
for i in n_iterations
    for j in p
        f(X, y, w)
    end
end
```
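To pin down what the inner step has to compute, here is a plain NumPy sketch of the naive version: for a given column `j`, sum $(x_j, x_k)\,w_k$ over the features with non-zero weight. The quoted expression does not actually involve `y`, so it is left out; the names `f_naive` and `run_naive` are just for illustration.

```python
import numpy as np

def f_naive(X, w, j):
    """Naive inner-loop quantity: sum over k with w[k] != 0 of <x_j, x_k> * w[k].

    Every inner product is recomputed from scratch, so one call
    costs O(N * nnz(w)); the double loop below repeats it p times
    per outer iteration.
    """
    active = np.flatnonzero(w)                    # indices k with |w_k| > 0
    return X[:, active].T.dot(X[:, j]).dot(w[active])

def run_naive(X, w, n_iterations):
    """The double loop from the pseudocode above, written out in Python."""
    N, p = X.shape
    for i in range(n_iterations):
        for j in range(p):
            val = f_naive(X, w, j)
            # ... the coordinate update of w[j] would use `val` here ...
    return w
```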
Background:

- `w` starts out dense but gets sparser as `i -> n_iterations`
- if `w[j] == 0` it will stay zero
- $(\tilde{X}[:, j], \tilde{X})$ is recalculated for each iteration of the outer loop, but it cannot be cached from the beginning since a $p \times p$ matrix is too large to fit in memory (see the sketch after this list)
- the implementation will be in Cython
- cblas `ddot` could be used for the scalar product
- current implementation (indexing of cached values is buggy)
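Below is a minimal NumPy/SciPy sketch of one way to respect the memory constraint; it is not the buggy implementation referenced above. The idea: cache the inner-product column $X^T x_j$ lazily, only up to a fixed budget, drop cached columns once their feature becomes inactive (it stays inactive), and fall back to on-the-fly dot products otherwise. `cov_update`, `gram_cache`, and `max_cached_cols` are made-up names; `scipy.linalg.blas.ddot` stands in for the cblas `ddot` the Cython version would call.

```python
import numpy as np
from scipy.linalg.blas import ddot

def cov_update(X, w, j, gram_cache, max_cached_cols):
    """Compute sum over k with w[k] != 0 of <x_j, x_k> * w[k].

    gram_cache maps a feature index j to the vector X.T @ X[:, j].
    Columns are added lazily and only while the cache holds fewer
    than `max_cached_cols` entries, so memory stays bounded by
    max_cached_cols * p instead of p * p.
    """
    active = np.flatnonzero(w)
    if j in gram_cache:
        return gram_cache[j][active].dot(w[active])
    if len(gram_cache) < max_cached_cols:
        gram_cache[j] = X.T.dot(X[:, j])          # cache this column's inner products
        return gram_cache[j][active].dot(w[active])
    # Cache full: compute on the fly, one BLAS ddot per active feature
    # (the Cython version would call cblas ddot here).
    return sum(ddot(X[:, j], X[:, k]) * w[k] for k in active)

def run_cached(X, w, n_iterations, max_cached_cols=512):
    """Outer/inner loop with lazy caching and pruning of dead features."""
    gram_cache = {}
    for i in range(n_iterations):
        for j in range(X.shape[1]):
            if w[j] == 0.0:
                continue                          # "if w[j] == 0 it will stay zero"
            val = cov_update(X, w, j, gram_cache, max_cached_cols)
            # ... the coordinate update of w[j] would use `val` here ...
        # Drop cached columns of features whose weight has hit zero;
        # they can never re-enter, so the cache only shrinks over time.
        for k in [k for k in gram_cache if w[k] == 0.0]:
            del gram_cache[k]
    return w
```

The budget `max_cached_cols` is a hypothetical knob: with `w` dense at the start, caching every column is exactly the $p \times p$ problem, so the cap keeps memory bounded until sparsity kicks in and the pruning step frees entries.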