lux.samplers.ImportanceSampler

class lux.samplers.ImportanceSampler(classifier, predict_proba, indstance_to_explain, min_generate_samples, process_input=None, categorical=None)

A transformer class for generating synthetic data using importance sampling based on SHAP values.

Parameters:

classifier : object

The classifier used for generating SHAP values.

predict_proba : callable

A function returning probability estimates for samples.

indstance_to_explain : array-like of shape (n_features,)

The instance used to explain the synthetic sample creation process.

min_generate_samples : int

The minimum number of synthetic samples to generate.

process_input : callable, default=None

Function used to process the input data before synthetic samples are generated.
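The two callable parameters can be illustrated with plain functions. The sketch below is only an assumption about their general shape: `toy_predict_proba` and `toy_process_input` are hypothetical names, and this reference does not state which arguments `ImportanceSampler` actually passes to them.

```python
import math


def toy_predict_proba(samples):
    """Hypothetical stand-in: return [P(class 0), P(class 1)] for each sample."""
    probabilities = []
    for row in samples:
        score = sum(row)                     # toy linear score
        p1 = 1.0 / (1.0 + math.exp(-score))  # logistic squashing into (0, 1)
        probabilities.append([1.0 - p1, p1])
    return probabilities


def toy_process_input(samples):
    """Hypothetical preprocessing step: coerce every value to float."""
    return [[float(value) for value in row] for row in samples]


rows = toy_process_input([[1, -1], [2, 0]])
probs = toy_predict_proba(rows)
```

Each row of `probs` sums to 1, which is the usual contract for a probability-estimate function. In practice a fitted classifier's own `predict_proba` method would typically be passed instead of a hand-written function.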

__init__(classifier, predict_proba, indstance_to_explain, min_generate_samples, process_input=None, categorical=None)

A transformer class for generating synthetic data using importance sampling based on SHAP values.

Methods

__init__(classifier, predict_proba, ...[, ...])

A transformer class for generating synthetic data using importance sampling based on SHAP values.

fit(X[, y])

Fits the transformer by calculating SHAP values for the given dataset.

fit_transform(X[, y])

Fit to data, then transform it.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

set_output(*[, transform])

Set output container.

set_params(**params)

Set the parameters of this estimator.

transform(X[, y])

Transforms the dataset by generating synthetic samples based on SHAP values.

fit(X, y=None)

Fits the transformer by calculating SHAP values for the given dataset.

Parameters:

X : array-like of shape (n_samples, n_features)

The input data for which SHAP values are to be calculated.

y : array-like of shape (n_samples,), default=None

The target values. This parameter is not used and is only present to adhere to the scikit-learn transformer interface.

Returns:

self: object

The fitted transformer instance.
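Because `fit` returns the instance itself, calls can be chained. The sketch below shows only that calling convention. `ToySampler` is a hypothetical stand-in class, not the library's implementation, and it computes no SHAP values.

```python
class ToySampler:
    """Hypothetical stand-in that only mimics the fit/transform calling convention."""

    def fit(self, X, y=None):
        # y is accepted for interface compatibility and deliberately ignored.
        self.n_features_ = len(X[0])
        return self  # returning self is what allows chained calls

    def transform(self, X, y=None):
        # Placeholder: a real sampler would append synthetic rows here.
        return [list(row) for row in X]


data = [[0.1, 0.2], [0.3, 0.4]]
sampler = ToySampler()
result = sampler.fit(data).transform(data)
```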

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

X : array-like of shape (n_samples, n_features)

Input samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_params : dict

Additional fit parameters.

Returns

X_new : ndarray of shape (n_samples, n_features_new)

Transformed array.

get_metadata_routing()

Get metadata routing of this object.

See the metadata routing section of the scikit-learn User Guide for how the routing mechanism works.

Returns

routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params : dict

Parameter names mapped to their values.
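The mapping returned by `get_params` follows the scikit-learn convention that every constructor argument is stored as an attribute of the same name. The following is a minimal sketch of that idea using a hypothetical `ToyEstimator`; it is not the inherited implementation and it ignores nested estimators.

```python
import inspect


class ToyEstimator:
    """Hypothetical estimator whose constructor arguments are stored verbatim."""

    def __init__(self, min_generate_samples=10, process_input=None):
        self.min_generate_samples = min_generate_samples
        self.process_input = process_input

    def get_params(self, deep=True):
        # Read parameter names from the constructor signature, then look up
        # the attribute of the same name on the instance.
        # `deep` is accepted but unused: this toy has no nested estimators.
        names = list(inspect.signature(self.__init__).parameters)
        return {name: getattr(self, name) for name in names}


params = ToyEstimator(min_generate_samples=50).get_params()
```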

set_output(*, transform=None)

Set output container.

See the scikit-learn example "Introducing the set_output API" for a demonstration of how to use the API.

Parameters

transform : {“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

New in scikit-learn version 1.4: “polars” option was added.

Returns

self : estimator instance

Estimator instance.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params : dict

Estimator parameters.

Returns

self : estimator instance

Estimator instance.
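The `<component>__<parameter>` naming scheme can be illustrated by splitting each key at its first double underscore. This is a simplified sketch of the convention, not the inherited code. The names `sampler` and `verbose` are hypothetical examples.

```python
def split_param_name(name):
    """Split 'component__parameter' at the first double underscore."""
    component, delimiter, parameter = name.partition("__")
    if not delimiter:
        return None, name  # a plain parameter of the estimator itself
    return component, parameter


def group_params(params):
    """Group keyword arguments by the component they address."""
    grouped = {}
    for name, value in params.items():
        component, parameter = split_param_name(name)
        grouped.setdefault(component, {})[parameter] = value
    return grouped


grouped = group_params({"sampler__min_generate_samples": 100, "verbose": True})
```

Keys grouped under `None` belong to the outer object; every other group would be forwarded to the named component, where the same splitting can be repeated for deeper nesting.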

transform(X, y=None)

Transforms the dataset by generating synthetic samples based on SHAP values.

Parameters:

X : array-like of shape (n_samples, n_features)

The input data to be transformed.

y : array-like of shape (n_samples,), default=None

The target values. This parameter is not used and is only present to adhere to the scikit-learn transformer interface.

Returns:

transformed_data: array-like of shape (n_samples_new, n_features)

The transformed dataset containing the original samples along with the generated synthetic samples.
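This reference does not describe how the synthetic samples are produced, so the following is only a generic sketch of importance-weighted resampling, not the library's algorithm. It assumes one made-up weight per row (real SHAP values are per feature and would have to be aggregated first), and the names `toy_importance_sample`, `scale` and `seed` are hypothetical.

```python
import random


def toy_importance_sample(X, importances, n_new, scale=0.1, seed=0):
    """Append n_new perturbed copies of rows drawn in proportion to importances."""
    rng = random.Random(seed)
    # Rows with larger importance weights are drawn more often.
    chosen = rng.choices(X, weights=importances, k=n_new)
    # Add small Gaussian noise so each synthetic row differs from its source row.
    synthetic = [[value + rng.gauss(0.0, scale) for value in row] for row in chosen]
    # Originals first, synthetic rows after, matching the return description.
    return [list(row) for row in X] + synthetic


X = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
importances = [0.1, 0.1, 0.8]  # made-up per-row weights, not real SHAP values
augmented = toy_importance_sample(X, importances, n_new=5)
```

The result keeps the feature count unchanged and grows only along the sample axis, which is why the documented return shape is (n_samples_new, n_features).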