hyperspy_ml_algorithms.SVDPCA

class hyperspy_ml_algorithms.SVDPCA(n_components=None, svd_solver='auto', centre=None, auto_transpose=True, svd_flip=True, u_based_decision=True)

Bases: object

SVD-based PCA estimator.

Performs Principal Component Analysis using Singular Value Decomposition.

Parameters:
n_components : int or None, default None

Number of components to keep. If None, keep all components.

svd_solver : {"auto", "full", "arpack", "randomized"}, default "auto"

SVD solver to use. See svd_solve() for details.

centre : {None, "signal", "features", False}, default None

Centering strategy:

  • None or False: no centering.

  • "signal": center along the signal axis (each sample’s signal values are centered around zero).

  • "features": center the features (each feature is centered to have zero mean across samples). This is the standard PCA centering (equivalent to sklearn’s PCA).
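The centering options can be illustrated with plain NumPy. This is a minimal sketch rather than the estimator's own code: it assumes the data is laid out as (n_samples, n_features), and the variable names are made up for the illustration.

```python
import numpy as np

# Toy data; the (n_samples, n_features) layout is an assumption.
rng = np.random.RandomState(0)
X = rng.random_sample((6, 4))

# "features": subtract each feature's mean across samples (column means).
X_features = X - X.mean(axis=0, keepdims=True)

# "signal": subtract each sample's own mean across its signal values (row means).
X_signal = X - X.mean(axis=1, keepdims=True)

# None or False: the data is used unchanged.
X_none = X
```

After "features" centering every column averages to zero, and after "signal" centering every row does.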

auto_transpose : bool, default True

If True, automatically transposes the data to boost performance.

svd_flip : bool, default True

If True, adjusts the signs of the components and scores so that, for each component, the entry with the largest absolute value is always positive. This makes the otherwise arbitrary signs of the SVD output deterministic.

u_based_decision : bool, default True

If True, use the columns of u as the basis for sign flipping. Otherwise, use the rows of v.
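The sign convention described for svd_flip and u_based_decision can be sketched in NumPy as follows. The helper flip_signs is hypothetical and written only for this illustration; it shows the general technique and is not taken from the library.

```python
import numpy as np

def flip_signs(u, vt, u_based_decision=True):
    """Hypothetical helper: make the signs of an SVD deterministic."""
    k = u.shape[1]
    if u_based_decision:
        # Sign of the largest-magnitude entry in each column of u.
        rows = np.argmax(np.abs(u), axis=0)
        signs = np.sign(u[rows, np.arange(k)])
    else:
        # Sign of the largest-magnitude entry in each row of vt.
        cols = np.argmax(np.abs(vt), axis=1)
        signs = np.sign(vt[np.arange(k), cols])
    # Flipping a column of u together with the matching row of vt
    # leaves the product u @ diag(s) @ vt unchanged.
    return u * signs, vt * signs[:, np.newaxis]

rng = np.random.RandomState(0)
X = rng.random_sample((8, 5))
u, s, vt = np.linalg.svd(X, full_matrices=False)

u_f, vt_f = flip_signs(u, vt, u_based_decision=True)
u_g, vt_g = flip_signs(u, vt, u_based_decision=False)
```

Either choice reconstructs the same matrix; the flag only decides which factor is guaranteed to have a positive dominant entry.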

Attributes:
components_ : ndarray of shape (n_features, n_components)

Principal axes in feature space, representing the directions of maximum variance in the data. The components are sorted by decreasing singular values.

singular_values_ : ndarray of shape (n_components,)

Singular values corresponding to each component.

explained_variance_ : ndarray of shape (n_components,)

The amount of variance explained by each component, computed as σᵢ² / N, where σᵢ is the i-th singular value. When centre is "signal" or "features" (mean-centered data), this is the variance explained by each component, consistent with PCA. When centre is None or False (no centering), it is the mean squared contribution of each component, i.e. a measure of signal energy rather than variance.

explained_variance_ratio_ : ndarray of shape (n_components,)

Fraction of variance explained by each component.

mean_ : ndarray or None

Per-feature empirical mean, estimated from the training data. None if centre is None or False.
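The σᵢ² / N quantity described for explained_variance_ can be checked with plain NumPy. The sketch below makes two assumptions: N is the number of samples (rows), and the ratio is normalised by the sum over all components. Neither is confirmed by the text above.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.random_sample((50, 6))
N = X.shape[0]  # assumption: N is the number of samples (rows)

# Feature-centered case: sigma_i**2 / N is a variance.
Xc = X - X.mean(axis=0)
s = np.linalg.svd(Xc, compute_uv=False)
explained_variance = s**2 / N

# Assumed normalisation: divide by the sum over all components.
explained_variance_ratio = explained_variance / explained_variance.sum()

# Uncentered case: the same formula measures mean squared signal ("energy").
s_raw = np.linalg.svd(X, compute_uv=False)
energy = s_raw**2 / N
```

With centering, the values sum to the total per-feature population variance (divisor N). Without centering, they sum to the mean squared norm of the rows, which also includes the contribution of the mean.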

Examples

>>> import numpy as np
>>> from hyperspy_ml_algorithms import SVDPCA
>>> rng = np.random.RandomState(42)
>>> X = rng.random((77, 13))
>>> est = SVDPCA(n_components=5)
>>> est.fit(X)
SVDPCA(n_components=5, ...)
>>> est.components_.shape
(5, 13)
>>> scores = est.transform(X)
>>> scores.shape
(77, 5)

__init__(n_components=None, svd_solver='auto', centre=None, auto_transpose=True, svd_flip=True, u_based_decision=True)

Methods

__init__([n_components, svd_solver, centre, ...])

fit(X[, y])

Fit the model with X.

fit_transform(X[, y])

Fit the model with X and apply dimensionality reduction.

transform(X)

Apply dimensionality reduction to X.
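As a rough picture of what fitting and transforming involve in SVD-based PCA generally, the following NumPy sketch projects feature-centered data onto the leading right singular vectors. It is a generic illustration under that centering assumption, not the estimator's implementation, and the local names (axes, scores) are chosen for the example only.

```python
import numpy as np

rng = np.random.RandomState(42)
X = rng.random_sample((77, 13))
k = 5  # number of components to keep

# "Fit": center the features and take a thin SVD.
mean = X.mean(axis=0)
u, s, vt = np.linalg.svd(X - mean, full_matrices=False)
axes = vt[:k]  # leading principal axes, one per row

# "Transform": project the centered data onto the kept axes.
scores = (X - mean) @ axes.T

# The same scores follow directly from the SVD factors.
scores_alt = u[:, :k] * s[:k]
```

Both routes give a (77, 5) array of scores, which is why fitting and transforming the same data can share a single decomposition.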