Linear discriminant analysis python

sklearn.discriminant_analysis.LinearDiscriminantAnalysis

class sklearn.discriminant_analysis.LinearDiscriminantAnalysis(solver='svd', shrinkage=None, priors=None, n_components=None, store_covariance=False, tol=0.0001, covariance_estimator=None)

Linear Discriminant Analysis.

A classifier with a linear decision boundary, generated by fitting class conditional densities to the data and using Bayes’ rule.

The model fits a Gaussian density to each class, assuming that all classes share the same covariance matrix.

The fitted model can also be used to reduce the dimensionality of the input by projecting it to the most discriminative directions, using the transform method.

New in version 0.17: LinearDiscriminantAnalysis.

solver {‘svd’, ‘lsqr’, ‘eigen’}, default=’svd’

Solver to use, possible values:

  • ‘svd’: Singular value decomposition (default). Does not compute the covariance matrix, so this solver is recommended for data with a large number of features.
  • ‘lsqr’: Least squares solution. Can be combined with shrinkage or a custom covariance estimator.
  • ‘eigen’: Eigenvalue decomposition. Can be combined with shrinkage or a custom covariance estimator.

Changed in version 1.2: solver="svd" now has experimental Array API support. See the Array API User Guide for more details.
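As a rough illustration of the solver choices above, the following sketch fits the same data with each solver and checks which fitted models expose scalings_ (the attribute that transform relies on). The synthetic dataset and its generator settings are assumptions made only for this example.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic 3-class data; the generator settings are illustrative assumptions.
X, y = make_classification(
    n_samples=200, n_features=5, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Fit one model per solver on identical data.
fitted = {
    solver: LinearDiscriminantAnalysis(solver=solver).fit(X, y)
    for solver in ("svd", "lsqr", "eigen")
}

# scalings_ is only set by the 'svd' and 'eigen' solvers.
has_scalings = {name: hasattr(model, "scalings_") for name, model in fitted.items()}
print(has_scalings)
```

All three solvers can be used for classification; only ‘svd’ and ‘eigen’ are expected to support dimensionality reduction.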

shrinkage ‘auto’ or float, default=None

Shrinkage parameter, possible values:

  • None: no shrinkage (default).
  • ‘auto’: automatic shrinkage using the Ledoit-Wolf lemma.
  • float between 0 and 1: fixed shrinkage parameter.

Leave this as None if covariance_estimator is used. Note that shrinkage works only with the ‘lsqr’ and ‘eigen’ solvers.
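A minimal sketch of the shrinkage options, assuming a small-sample, high-dimensional dataset (the sizes and random data below are made up for illustration). It also checks that combining shrinkage with the ‘svd’ solver is rejected.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Illustrative assumption: 30 samples with 40 features, so the empirical
# within-class covariance matrix is singular and shrinkage is useful.
rng = np.random.default_rng(0)
n_per_class, n_features = 15, 40
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_per_class, n_features)),
    rng.normal(0.5, 1.0, size=(n_per_class, n_features)),
])
y = np.repeat([0, 1], n_per_class)

# Automatic (Ledoit-Wolf) shrinkage and a fixed shrinkage value.
auto_lda = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto").fit(X, y)
fixed_lda = LinearDiscriminantAnalysis(solver="lsqr", shrinkage=0.5).fit(X, y)

# Shrinkage is not supported by the 'svd' solver.
try:
    LinearDiscriminantAnalysis(solver="svd", shrinkage="auto").fit(X, y)
    svd_accepts_shrinkage = True
except NotImplementedError:
    svd_accepts_shrinkage = False
print(svd_accepts_shrinkage)
```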

priors array-like of shape (n_classes,), default=None

The class prior probabilities. By default, the class proportions are inferred from the training data.
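The sketch below contrasts inferred and user-supplied priors. The toy data and the prior values [0.9, 0.1] are assumptions chosen only to make the difference visible in the fitted priors_ attribute.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Toy data: 3 samples of class 0 and 5 samples of class 1 (illustrative only).
X = np.array(
    [[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2], [2, 3], [3, 3]],
    dtype=float,
)
y = np.array([0, 0, 0, 1, 1, 1, 1, 1])

# Default: priors are the class proportions in the training data.
default_lda = LinearDiscriminantAnalysis().fit(X, y)

# Explicit priors override the proportions seen in y.
custom_lda = LinearDiscriminantAnalysis(priors=[0.9, 0.1]).fit(X, y)

print(default_lda.priors_, custom_lda.priors_)
```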

n_components int, default=None

store_covariance bool, default=False

If True, explicitly compute the weighted within-class covariance matrix when solver is ‘svd’. The matrix is always computed and stored for the other solvers.

tol float, default=1.0e-4

Absolute threshold for a singular value of X to be considered significant, used to estimate the rank of X. Dimensions whose singular values are non-significant are discarded. Only used if solver is ‘svd’.

covariance_estimator covariance estimator, default=None

If not None, covariance_estimator is used to estimate the covariance matrices instead of relying on the empirical covariance estimator (with potential shrinkage). The object should have a fit method and a covariance_ attribute like the estimators in sklearn.covariance. If None, the shrinkage parameter drives the estimate.

Leave this as None if shrinkage is used. Note that covariance_estimator works only with the ‘lsqr’ and ‘eigen’ solvers.
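As a hedged example, a custom covariance estimator such as sklearn.covariance.OAS can be passed in place of shrinkage. The random data below is an assumption for illustration; the second part checks that setting both shrinkage and covariance_estimator is rejected at fit time.

```python
import numpy as np
from sklearn.covariance import OAS
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Illustrative two-class data: 40 samples, 10 features.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(20, 10)),
    rng.normal(1.0, 1.0, size=(20, 10)),
])
y = np.repeat([0, 1], 20)

# Use the OAS estimator for the per-class covariance matrices.
oas_lda = LinearDiscriminantAnalysis(
    solver="lsqr", covariance_estimator=OAS()
).fit(X, y)

# Only one of shrinkage and covariance_estimator may be set.
try:
    LinearDiscriminantAnalysis(
        solver="lsqr", shrinkage=0.1, covariance_estimator=OAS()
    ).fit(X, y)
    both_allowed = True
except ValueError:
    both_allowed = False
print(both_allowed)
```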

intercept_ ndarray of shape (n_classes,)

covariance_ array-like of shape (n_features, n_features)

Weighted within-class covariance matrix. It corresponds to sum_k prior_k * C_k where C_k is the covariance matrix of the samples in class k . The C_k are estimated using the (potentially shrunk) biased estimator of covariance. If solver is ‘svd’, only exists when store_covariance is True.
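The formula sum_k prior_k * C_k can be checked numerically. This sketch assumes the iris dataset and the ‘lsqr’ solver without shrinkage, and rebuilds the matrix from biased per-class covariances; it also shows the store_covariance behaviour of the ‘svd’ solver.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis(solver="lsqr").fit(X, y)

# Rebuild sum_k prior_k * C_k using the biased covariance of each class.
manual = sum(
    prior * np.cov(X[y == label].T, bias=True)
    for prior, label in zip(lda.priors_, lda.classes_)
)
matches = np.allclose(lda.covariance_, manual)

# With the 'svd' solver, covariance_ exists only if store_covariance=True.
svd_plain = LinearDiscriminantAnalysis(solver="svd").fit(X, y)
svd_stored = LinearDiscriminantAnalysis(solver="svd", store_covariance=True).fit(X, y)
print(matches, hasattr(svd_plain, "covariance_"), hasattr(svd_stored, "covariance_"))
```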

explained_variance_ratio_ ndarray of shape (n_components,)

Percentage of variance explained by each of the selected components. If n_components is not set then all components are stored and the sum of explained variances is equal to 1.0. Only available when eigen or svd solver is used.
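A short sketch, assuming the iris dataset (3 classes, 4 features), of how explained_variance_ratio_ looks when n_components is left unset, and of its absence for the ‘lsqr’ solver.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# n_components is not set, so all discriminant components are kept.
svd_lda = LinearDiscriminantAnalysis(solver="svd").fit(X, y)
ratio = svd_lda.explained_variance_ratio_
print(ratio.shape, ratio.sum())

# The attribute is not available for the 'lsqr' solver.
lsqr_lda = LinearDiscriminantAnalysis(solver="lsqr").fit(X, y)
lsqr_has_ratio = hasattr(lsqr_lda, "explained_variance_ratio_")
```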

means_ array-like of shape (n_classes, n_features)

priors_ array-like of shape (n_classes,)

scalings_ array-like of shape (rank, n_classes - 1)

Scaling of the features in the space spanned by the class centroids. Only available for ‘svd’ and ‘eigen’ solvers.

xbar_ array-like of shape (n_features,)

Overall mean. Only present if solver is ‘svd’.
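To show how xbar_ and scalings_ fit together, this sketch reproduces the ‘svd’ projection by hand on the iris dataset. It relies on the assumption that, for this solver, the projection is the centered data multiplied by scalings_; treat it as an illustration of the attributes rather than a description of the library internals.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis(solver="svd").fit(X, y)

# With priors inferred from the data, xbar_ matches the overall feature mean.
overall_mean_matches = np.allclose(lda.xbar_, X.mean(axis=0))

# Center the data, apply the scalings, and keep the first two components.
manual_projection = (X - lda.xbar_) @ lda.scalings_
projection_matches = np.allclose(manual_projection[:, :2], lda.transform(X))
print(overall_mean_matches, projection_matches)
```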

classes_ array-like of shape (n_classes,)

n_features_in_ int

Number of features seen during fit.

feature_names_in_ ndarray of shape (n_features_in_,)

Names of features seen during fit. Defined only when X has feature names that are all strings.

See also: QuadraticDiscriminantAnalysis (Quadratic Discriminant Analysis).

>>> import numpy as np
>>> from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
>>> X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
>>> y = np.array([1, 1, 1, 2, 2, 2])
>>> clf = LinearDiscriminantAnalysis()
>>> clf.fit(X, y)
LinearDiscriminantAnalysis()
>>> print(clf.predict([[-0.8, -1]]))
[1]

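  • decision_function(X): Apply decision function to an array of samples.
  • fit(X, y): Fit the Linear Discriminant Analysis model.
  • fit_transform(X[, y]): Fit to data, then transform it.
  • get_feature_names_out([input_features]): Get output feature names for transformation.
  • get_metadata_routing(): Get metadata routing of this object.
  • get_params([deep]): Get parameters for this estimator.
  • predict(X): Predict class labels for samples in X.
  • score(X, y[, sample_weight]): Return the mean accuracy on the given test data and labels.
  • set_params(**params): Set the parameters of this estimator.
  • set_score_request(*[, sample_weight]): Request metadata passed to the score method.
  • transform(X): Project data to maximize class separation.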

decision_function(X)

Apply decision function to an array of samples.

The decision function is equal (up to a constant factor) to the log-posterior of the model, i.e. log p(y = k | x). In a binary classification setting this instead corresponds to the difference log p(y = 1 | x) - log p(y = 0 | x). See Mathematical formulation of the LDA and QDA classifiers.

Parameters : X array-like of shape (n_samples, n_features)

Array of samples (test vectors).

Returns : C ndarray of shape (n_samples,) or (n_samples, n_classes)

Decision function values related to each class, per sample. In the two-class case, the shape is (n_samples,), giving the log likelihood ratio of the positive class.
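The two output shapes, and the binary log-odds interpretation, can be checked with a small sketch. The overlapping two-class Gaussian data is an assumption chosen so that the probabilities stay away from 0 and 1 and the comparison is numerically stable.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Binary case: overlapping classes (illustrative random data).
rng = np.random.default_rng(0)
X_bin = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 2)),
    rng.normal(1.0, 1.0, size=(100, 2)),
])
y_bin = np.repeat([0, 1], 100)
binary_lda = LinearDiscriminantAnalysis().fit(X_bin, y_bin)

scores = binary_lda.decision_function(X_bin)  # shape (n_samples,)

# Compare with log p(y = 1 | x) - log p(y = 0 | x).
log_proba = binary_lda.predict_log_proba(X_bin)
log_odds = log_proba[:, 1] - log_proba[:, 0]

# Multiclass case: one column of scores per class.
X_iris, y_iris = load_iris(return_X_y=True)
multi_scores = LinearDiscriminantAnalysis().fit(X_iris, y_iris).decision_function(X_iris)
print(scores.shape, multi_scores.shape)
```

In the binary case a positive score corresponds to predicting the second entry of classes_.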

fit(X, y)

Fit the Linear Discriminant Analysis model.

Changed in version 0.19: store_covariance has been moved to main constructor.

Changed in version 0.19: tol has been moved to main constructor.

y array-like of shape (n_samples,)

Returns : self object

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X .

Parameters : X array-like of shape (n_samples, n_features)

y array-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_params dict

Additional fit parameters.

Returns : X_new ndarray of shape (n_samples, n_features_new)
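A minimal sketch, assuming the iris dataset, showing that fit_transform gives the same result as calling fit and then transform on the same data.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# One-step fit and projection.
one_step = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

# Equivalent two-step version.
two_step = LinearDiscriminantAnalysis(n_components=2).fit(X, y).transform(X)

same_result = np.allclose(one_step, two_step)
print(one_step.shape, same_result)
```

Note that y is required here in practice, because the projection is supervised.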

get_feature_names_out(input_features=None)

Get output feature names for transformation.

The output feature names will be prefixed by the lowercased class name. For example, if the transformer outputs 3 features, then the output feature names are: ["class_name0", "class_name1", "class_name2"].

Parameters : input_features array-like of str or None, default=None

Only used to validate feature names with the names seen in fit .

Returns : feature_names_out ndarray of str objects

Transformed feature names.

get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns : routing MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters : deep bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns : params dict

Parameter names mapped to their values.

predict(X)

Predict class labels for samples in X.

Parameters : X of shape (n_samples, n_features)

The data matrix for which we want to get the predictions.

Returns : y_pred ndarray of shape (n_samples,)

Vector containing the class labels for each sample.

predict_log_proba(X)

Estimate log probability.

Parameters : X array-like of shape (n_samples, n_features)

Returns : C ndarray of shape (n_samples, n_classes)

Estimated log probabilities.

predict_proba(X)

Estimate probability.

Parameters : X array-like of shape (n_samples, n_features)

Returns : C ndarray of shape (n_samples, n_classes)

Estimated probabilities.
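The two probability outputs described above can be related with a short sketch, again assuming the iris dataset: each row of probabilities sums to 1, the log-probabilities exponentiate back to the probabilities, and the most probable class agrees with predict.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
lda = LinearDiscriminantAnalysis().fit(X, y)

proba = lda.predict_proba(X)          # shape (n_samples, n_classes)
log_proba = lda.predict_log_proba(X)  # same shape, log scale

rows_sum_to_one = np.allclose(proba.sum(axis=1), 1.0)
consistent = np.allclose(np.exp(log_proba), proba)

# Columns follow the order of lda.classes_.
argmax_labels = lda.classes_[np.argmax(proba, axis=1)]
same_as_predict = np.array_equal(argmax_labels, lda.predict(X))
print(rows_sum_to_one, consistent, same_as_predict)
```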

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters : X array-like of shape (n_samples, n_features)

y array-like of shape (n_samples,) or (n_samples, n_outputs)

sample_weight array-like of shape (n_samples,), default=None

Returns : score float

Mean accuracy of self.predict(X) w.r.t. y .
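As a sketch of what score computes, the example below compares it with accuracy_score applied to the predictions. The iris dataset and the 70/30 split are assumptions for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

lda = LinearDiscriminantAnalysis().fit(X_train, y_train)

# score is the mean accuracy of predict on the given data.
mean_accuracy = lda.score(X_test, y_test)
manual_accuracy = accuracy_score(y_test, lda.predict(X_test))
print(mean_accuracy, manual_accuracy)
```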

set_output(*, transform=None)

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters : transform {"default", "pandas"}, default=None

Configure output of transform and fit_transform.

  • "default": Default output format of a transformer
  • "pandas": DataFrame output
  • None: Transform configuration is unchanged

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters : **params dict

Returns : self estimator instance
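The nested parameter naming can be illustrated with a small pipeline. The step names "scale" and "lda" are arbitrary labels chosen for this sketch.

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Step names "scale" and "lda" are hypothetical labels for this example.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("lda", LinearDiscriminantAnalysis()),
])

# Update parameters of the nested estimator via <component>__<parameter>.
pipe.set_params(lda__solver="lsqr", lda__shrinkage="auto")

# get_params(deep=True) exposes the same nested names.
nested = pipe.get_params(deep=True)
print(nested["lda__solver"], nested["lda__shrinkage"])

# On a standalone estimator, plain parameter names are used.
lda = LinearDiscriminantAnalysis().set_params(tol=1e-3)
```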

set_score_request(*, sample_weight='$UNCHANGED$')

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config ). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

  • True : metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
  • False : metadata is not requested and the meta-estimator will not pass it to score .
  • None : metadata is not requested, and the meta-estimator will raise an error if the user provides it.
  • str : metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default ( sklearn.utils.metadata_routing.UNCHANGED ) retains the existing request. This allows you to change the request for some parameters and not others.

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a pipeline.Pipeline . Otherwise it has no effect.

Parameters : sample_weight str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED

Metadata routing for sample_weight parameter in score .

Returns : self object

transform(X)

Project data to maximize class separation.

Parameters : X array-like of shape (n_samples, n_features)

Returns : X_new ndarray of shape (n_samples, n_components) or (n_samples, min(rank, n_components))

Transformed data. In the case of the ‘svd’ solver, the shape is (n_samples, min(rank, n_components)).
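A hedged sketch of transform on the iris dataset (4 features, 3 classes). It relies on two behaviours that are stated here as assumptions about current scikit-learn releases: n_components may not exceed min(n_features, n_classes - 1), and transform is not implemented for the ‘lsqr’ solver.

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)  # 4 features, 3 classes

# Keep a single discriminant direction.
projected = LinearDiscriminantAnalysis(n_components=1).fit(X, y).transform(X)
print(projected.shape)

# Asking for more than min(n_features, n_classes - 1) = 2 components fails.
try:
    LinearDiscriminantAnalysis(n_components=3).fit(X, y)
    too_many_allowed = True
except ValueError:
    too_many_allowed = False

# The 'lsqr' solver can classify but cannot project.
try:
    LinearDiscriminantAnalysis(solver="lsqr").fit(X, y).transform(X)
    lsqr_transforms = True
except NotImplementedError:
    lsqr_transforms = False
print(too_many_allowed, lsqr_transforms)
```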
