hyperspy_ml_algorithms.IncrementalSVD
- class hyperspy_ml_algorithms.IncrementalSVD(n_components=None, num_chunks=None)
Bases: object
Incremental (streaming) SVD estimator (no centering).
Computes a plain SVD incrementally by feeding data in batches via partial_fit. After all batches have been processed, the learned components and singular values are available as attributes.
Uses the algorithm of Ross et al. (2008): each new batch is stacked with the previous top-k subspace (scaled by singular values), then a rank-k truncated SVD is computed on the stacked matrix to merge the new data into the existing subspace. No mean centering is ever applied, so the decomposition is an SVD, not PCA.
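The merge step described above can be sketched in plain NumPy. This is a minimal illustration of the stacking idea, not the library's implementation: merge_batch is a hypothetical helper name, and the data is built with rank 3 so that a rank-3 truncation discards nothing and the result can be compared with a one-shot SVD.

```python
import numpy as np


def merge_batch(components, singular_values, X_chunk, k):
    """Hypothetical helper: fold one batch into a rank-k subspace (no centering)."""
    if components is None:
        # First batch: nothing to merge with yet.
        stacked = X_chunk
    else:
        # Previous subspace, scaled by its singular values, stacked on the new rows.
        stacked = np.vstack([singular_values[:, None] * components, X_chunk])
    _, S, Vt = np.linalg.svd(stacked, full_matrices=False)
    # Keep only the top-k right singular vectors and singular values.
    return Vt[:k], S[:k]


rng = np.random.default_rng(0)
# Rank-3 data (assumption made for this sketch so truncation is lossless).
X = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 50))

components, singular_values = None, None
for chunk in np.array_split(X, 4):
    components, singular_values = merge_batch(components, singular_values, chunk, k=3)

# Compare against the singular values of a single SVD over all rows.
S_full = np.linalg.svd(X, compute_uv=False)[:3]
matches_full_svd = np.allclose(singular_values, S_full)
```

When the data has more than k significant directions, each truncation drops some energy, so the streamed result is in general an approximation of the full SVD rather than an exact match.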
- Parameters:
- n_components : int or None, default None
Number of singular components to compute. If None, defaults to min(n_samples, n_features) on the first batch.
- num_chunks : int or None, default None
Number of chunks to split the data into when calling fit(). If None, a heuristic is used based on the data size. Ignored when using partial_fit directly.
- Attributes:
- components_ : array, shape (n_components, n_features)
Right singular vectors (rows are components).
- singular_values_ : array, shape (n_components,)
Singular values in descending order.
- explained_variance_ : array, shape (n_components,)
Variance explained by each component (S² / N).
- explained_variance_ratio_ : array, shape (n_components,)
Fraction of top-k variance captured by each component (S² / sum(S²)).
- mean_ : array, shape (n_features,)
Mean vector (always zeros; no centering is applied).
- n_samples_seen_ : int
Total number of samples processed across all partial_fit calls.
- noise_variance_ : float
Mean of discarded singular values squared, divided by n_samples_seen_ (if any singular values were discarded).
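The variance-related attributes follow directly from the singular values. The sketch below reproduces the formulas with NumPy under several assumptions that the text above does not confirm: N is taken to be n_samples_seen_, "singular values squared" is read as squaring each discarded value before averaging, a one-shot SVD stands in for the estimator's internal state, and the 0.0 fallback when nothing is discarded is a placeholder.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 8))
k = 3                      # number of retained components
n_samples = X.shape[0]     # assumed to play the role of n_samples_seen_

# One-shot SVD used for illustration only; singular values come back descending.
S = np.linalg.svd(X, compute_uv=False)
kept, discarded = S[:k], S[k:]

explained_variance = kept ** 2 / n_samples                 # S^2 / N
explained_variance_ratio = kept ** 2 / np.sum(kept ** 2)   # S^2 / sum(S^2)

# Assumed reading: mean of the squared discarded values, divided by N.
# The 0.0 fallback is a placeholder, not documented behaviour.
noise_variance = np.mean(discarded ** 2) / n_samples if discarded.size else 0.0
```

Because the ratio is normalised by the retained components only, it sums to 1 over the top k and does not report the share of the total variance in the data.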
Examples
>>> import numpy as np
>>> from hyperspy_ml_algorithms import IncrementalSVD
>>> rng = np.random.default_rng(42)
>>> X = rng.standard_normal((200, 50))
>>> est = IncrementalSVD(n_components=3)
>>> for chunk in np.array_split(X, 4):
...     est.partial_fit(chunk)
>>> components = est.components_.T  # shape (n_features, n_components)
>>> scores = est.transform(X)  # shape (n_samples, n_components)
- __init__(n_components=None, num_chunks=None)
Methods
- __init__([n_components, num_chunks])
- fit(X[, y])
Fit the incremental SVD model to X.
- fit_transform(X[, y])
Fit the model and transform X.
- partial_fit(X_chunk[, y])
Fit one batch without centering (plain incremental SVD).
- transform(X)
Project X onto the learned components.
Attributes
- mean_
Mean vector (always zeros; no centering is applied).
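Since mean_ is always zero, projecting onto the learned components presumably reduces to a single matrix product. The sketch below illustrates that with NumPy alone; it assumes transform(X) is equivalent to X @ components_.T, which the summary above suggests but does not state, and it uses a one-shot SVD as a stand-in for a fitted estimator.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((200, 50))
k = 3

U, S, Vt = np.linalg.svd(X, full_matrices=False)
components = Vt[:k]          # stand-in for est.components_, shape (k, n_features)

# Assumed equivalent of est.transform(X): no mean is subtracted first.
scores = X @ components.T    # shape (n_samples, k)

# For an exact SVD the scores equal the left singular vectors scaled by S.
scores_from_factors = U[:, :k] * S[:k]
projection_matches = np.allclose(scores, scores_from_factors)
```

With PCA the same product would be applied to X minus the column means, which is the practical difference between the two decompositions.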