sklearn.preprocessing.Imputer

Warning

DEPRECATED

class sklearn.preprocessing.Imputer(**kwargs)

Imputation transformer for completing missing values.

Read more in the User Guide.

Parameters
missing_values : integer or “NaN”, optional (default=“NaN”)

The placeholder for the missing values. All occurrences of missing_values will be imputed. For missing values encoded as np.nan, use the string value “NaN”.

strategy : string, optional (default=“mean”)

The imputation strategy.

  • If “mean”, then replace missing values using the mean along the axis.

  • If “median”, then replace missing values using the median along the axis.

  • If “most_frequent”, then replace missing values using the most frequent value along the axis.
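The three strategies can be illustrated with a minimal pure-Python sketch for a single column. This is not the library's implementation: the helper name column_fill_value is hypothetical, and breaking ties for “most_frequent” by taking the smallest value is an assumption of the sketch.

```python
import math
from collections import Counter
from statistics import mean, median


def column_fill_value(column, strategy="mean"):
    """Return the fill value for one column, ignoring NaN entries."""
    observed = [v for v in column if not math.isnan(v)]
    if strategy == "mean":
        return mean(observed)
    if strategy == "median":
        return median(observed)
    if strategy == "most_frequent":
        # Assumption of this sketch: ties go to the smallest value.
        counts = Counter(observed)
        top = max(counts.values())
        return min(v for v, c in counts.items() if c == top)
    raise ValueError(f"unknown strategy: {strategy!r}")


column = [1.0, 2.0, 2.0, float("nan"), 7.0]
for strategy in ("mean", "median", "most_frequent"):
    fill = column_fill_value(column, strategy)
    # Replace only the missing entries; observed values are untouched.
    imputed = [fill if math.isnan(v) else v for v in column]
    print(strategy, imputed)
```

For this column the mean strategy fills the gap with 3.0, while the median and most-frequent strategies both fill it with 2.0.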

axis : integer, optional (default=0)

The axis along which to impute.

  • If axis=0, then impute along columns.

  • If axis=1, then impute along rows.
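The difference between the two axis settings can be sketched in plain Python with the mean strategy. The function impute_mean is a hypothetical helper written for illustration, not part of scikit-learn; it treats row-wise imputation as column-wise imputation on the transpose.

```python
import math
from statistics import mean


def impute_mean(X, axis=0):
    """Fill NaN entries with the mean along columns (axis=0) or rows (axis=1)."""
    if axis == 1:
        # Row-wise imputation is column-wise imputation on the transpose.
        transposed = [list(col) for col in zip(*X)]
        filled = impute_mean(transposed, axis=0)
        return [list(row) for row in zip(*filled)]
    columns = [list(col) for col in zip(*X)]
    fills = [mean([v for v in col if not math.isnan(v)]) for col in columns]
    return [[fills[j] if math.isnan(v) else v for j, v in enumerate(row)]
            for row in X]


nan = float("nan")
X = [[1.0, 2.0],
     [nan, 4.0],
     [5.0, nan]]

by_columns = impute_mean(X, axis=0)  # each gap gets its column's mean
by_rows = impute_mean(X, axis=1)     # each gap gets its row's mean
print(by_columns)
print(by_rows)
```

With axis=0 both gaps are filled with 3.0 (the mean of each column); with axis=1 the gaps are filled with 4.0 and 5.0 (the only observed value in each affected row).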

verbose : integer, optional (default=0)

Controls the verbosity of the imputer.

copy : boolean, optional (default=True)

If True, a copy of X will be created. If False, imputation will be done in-place whenever possible. Note that, in the following cases, a new copy will always be made, even if copy=False:

  • If X is not an array of floating values;

  • If X is sparse and missing_values=0;

  • If axis=0 and X is encoded as a CSR matrix;

  • If axis=1 and X is encoded as a CSC matrix.

Attributes
statistics_ : array of shape (n_features,)

The imputation fill value for each feature if axis == 0.

Notes

  • When axis=0, columns which only contained missing values at fit are discarded upon transform.

  • When axis=1, an exception is raised if there are rows for which it is not possible to fill in the missing values (e.g., because they only contain missing values).
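The axis=0 note explains why the transformed array can have fewer columns than the input. A minimal sketch of that behavior, assuming the mean strategy; fit_column_means and transform_drop_empty are hypothetical helpers, not library functions:

```python
import math
from statistics import mean


def fit_column_means(X):
    """Return one mean per column, or None for a column with no observed values."""
    stats = []
    for col in zip(*X):
        observed = [v for v in col if not math.isnan(v)]
        stats.append(mean(observed) if observed else None)
    return stats


def transform_drop_empty(X, stats):
    """Fill NaNs with the fitted means and drop columns that had no statistic."""
    keep = [j for j, s in enumerate(stats) if s is not None]
    return [[stats[j] if math.isnan(row[j]) else row[j] for j in keep]
            for row in X]


nan = float("nan")
X = [[1.0, nan, 3.0],
     [nan, nan, 5.0]]

stats = fit_column_means(X)            # the middle column has nothing to average
X_new = transform_drop_empty(X, stats)
print(stats)
print(X_new)
```

Here the middle column contains only missing values, so it is dropped and the result has two columns instead of three.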

Methods

fit(self, X[, y])

Fit the imputer on X.

fit_transform(self, X[, y])

Fit to data, then transform it.

get_params(self[, deep])

Get parameters for this estimator.

set_params(self, **params)

Set the parameters of this estimator.

transform(self, X)

Impute all missing values in X.

__init__(*args, **kwargs)

DEPRECATED: Imputer was deprecated in version 0.20 and will be removed in 0.22. Import impute.SimpleImputer from sklearn instead.
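A minimal migration sketch using sklearn.impute.SimpleImputer, assuming scikit-learn 0.20 or later and NumPy are installed. The arrays X_train and X_test are made-up illustration data. Note that SimpleImputer takes np.nan directly rather than the string “NaN”, and has no axis parameter: it always imputes column-wise.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Column-wise mean imputation, the equivalent of Imputer(axis=0).
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")

# Illustration data only.
X_train = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, 6.0]])
X_test = np.array([[np.nan, 2.0], [6.0, np.nan], [7.0, 6.0]])

imputer.fit(X_train)                  # learns one mean per column
X_filled = imputer.transform(X_test)  # fills gaps with the learned means

print(imputer.statistics_)
print(X_filled)
```

The fill values are learned from X_train only, so the same statistics are reused when transforming new data.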

fit(self, X, y=None)

Fit the imputer on X.

Parameters
X : {array-like, sparse matrix}, shape (n_samples, n_features)

Input data, where n_samples is the number of samples and n_features is the number of features.

Returns
self : Imputer

fit_transform(self, X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters
X : numpy array of shape [n_samples, n_features]

Training set.

y : numpy array of shape [n_samples]

Target values.

Returns
X_new : numpy array of shape [n_samples, n_features_new]

Transformed array.

get_params(self, deep=True)

Get parameters for this estimator.

Parameters
deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
params : mapping of string to any

Parameter names mapped to their values.

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
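The <component>__<parameter> form can be illustrated with a small pipeline. This sketch assumes scikit-learn 0.20 or later and uses SimpleImputer in place of the deprecated Imputer; the step names "imputer" and "scaler" are arbitrary choices made for this example.

```python
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([
    ("imputer", SimpleImputer(strategy="mean")),
    ("scaler", StandardScaler()),
])

# "imputer" is the step name chosen above; "strategy" is that step's parameter.
returned = pipe.set_params(imputer__strategy="median")

# get_params(deep=True) exposes nested parameters under the same naming scheme.
print(pipe.get_params()["imputer__strategy"])
```

set_params returns the estimator itself, so calls can be chained.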

Returns
self

transform(self, X)

Impute all missing values in X.

Parameters
X : {array-like, sparse matrix}, shape = [n_samples, n_features]

The input data to complete.