lux.samplers.ImportanceSampler

class lux.samplers.ImportanceSampler(classifier, predict_proba, indstance_to_explain, min_generate_samples, process_input=None, categorical=None)

A transformer class for generating synthetic data using importance sampling based on SHAP values.

Parameters:

classifier: object: The classifier used for generating SHAP values.
predict_proba: callable: A function returning probability estimates for samples.
indstance_to_explain: array-like of shape (n_features,): The instance used for explaining the synthetic sample creation process.
min_generate_samples: int: The minimum number of synthetic samples to generate.
process_input: callable, default=None: Function applied to process the input data before synthetic samples are generated.
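
As a rough illustration of the two callable parameters, the sketch below defines a hypothetical `toy_predict_proba` and `toy_process_input` in plain Python. It assumes, which the reference above does not state, that the probability function receives a list of rows and returns one probability per class for each row (the scikit-learn convention), and that the preprocessing function maps rows to rows. The function names and the made-up linear model are invented for the example.

```python
import math


def toy_predict_proba(rows):
    """Hypothetical stand-in: return [P(class 0), P(class 1)] for each row."""
    probabilities = []
    for row in rows:
        score = 0.8 * row[0] - 0.5 * row[1]  # made-up linear score
        p1 = 1.0 / (1.0 + math.exp(-score))  # logistic squashing into (0, 1)
        probabilities.append([1.0 - p1, p1])
    return probabilities


def toy_process_input(rows):
    """Hypothetical preprocessing step: coerce every value to float."""
    return [[float(value) for value in row] for row in rows]


rows = toy_process_input([["1", "2"], [3, 0]])
proba = toy_predict_proba(rows)
```

Callables of this general shape are what would be passed as `predict_proba` and `process_input`; the exact input and output types the sampler expects should be checked against the library itself.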

Methods

`__init__`(classifier, predict_proba, ...[, ...])	A transformer class for generating synthetic data using importance sampling based on SHAP values.
`fit`(X[, y])	Fits the transformer by calculating SHAP values for the given dataset.
`fit_transform`(X[, y])	Fit to data, then transform it.
`get_metadata_routing`()	Get metadata routing of this object.
`get_params`([deep])	Get parameters for this estimator.
`set_output`(*[, transform])	Set output container.
`set_params`(**params)	Set the parameters of this estimator.
`transform`(X[, y])	Transforms the dataset by generating synthetic samples based on SHAP values.

fit(X, y=None)

Fits the transformer by calculating SHAP values for the given dataset.

Parameters:

X: array-like of shape (n_samples, n_features): The input data for which SHAP values are to be calculated.
y: array-like of shape (n_samples,), default=None: The target values. This parameter is not used; it is present only to adhere to the scikit-learn transformer interface.

Returns:

self: object: The fitted transformer instance.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters

X: array-like of shape (n_samples, n_features): Input samples.
y: array-like of shape (n_samples,) or (n_samples, n_outputs), default=None: Target values (None for unsupervised transformations).
**fit_params: dict: Additional fit parameters.

Returns

X_new: ndarray of shape (n_samples, n_features_new): Transformed array.
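
The "fit, then transform" behaviour can be sketched with a toy transformer written in plain Python. `ToyCenterer` is a hypothetical class invented for this example and is not part of lux or scikit-learn; it only mirrors the pattern in which `fit_transform` delegates to `fit` followed by `transform`.

```python
class ToyCenterer:
    """Hypothetical transformer that subtracts the per-column mean."""

    def fit(self, X, y=None):
        n_rows = len(X)
        # Learn one mean per column; zip(*X) iterates over columns.
        self.means_ = [sum(column) / n_rows for column in zip(*X)]
        return self  # returning self allows method chaining

    def transform(self, X):
        return [[v - m for v, m in zip(row, self.means_)] for row in X]

    def fit_transform(self, X, y=None, **fit_params):
        # Fit on X (and y, if given), then transform the same X.
        return self.fit(X, y, **fit_params).transform(X)


X = [[1.0, 10.0], [3.0, 30.0]]
chained = ToyCenterer().fit(X).transform(X)
combined = ToyCenterer().fit_transform(X)
```

Both calls produce the same result, which is why `fit_transform` is usually treated as a convenience shortcut.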

get_metadata_routing()

Get metadata routing of this object.

Please check the scikit-learn User Guide on metadata routing to see how the routing mechanism works.

Returns

routing: MetadataRequest: A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep: bool, default=True: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params: dict: Parameter names mapped to their values.
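
A minimal sketch of the `get_params` convention follows, assuming the common approach of reading parameter names from the constructor signature. `ToyEstimator`, `Inner`, and `Outer` are hypothetical classes written for this example; the real implementation lives in scikit-learn's base estimator and performs additional checks.

```python
import inspect


class ToyEstimator:
    """Hypothetical base class mimicking the get_params convention."""

    def get_params(self, deep=True):
        params = {}
        for name in inspect.signature(type(self).__init__).parameters:
            if name == "self":
                continue
            value = getattr(self, name)
            params[name] = value
            # With deep=True, nested estimators add "<name>__<param>" keys.
            if deep and hasattr(value, "get_params"):
                for sub_name, sub_value in value.get_params(deep=True).items():
                    params[f"{name}__{sub_name}"] = sub_value
        return params


class Inner(ToyEstimator):
    def __init__(self, depth=3):
        self.depth = depth


class Outer(ToyEstimator):
    def __init__(self, inner=None, min_generate_samples=10):
        self.inner = inner
        self.min_generate_samples = min_generate_samples


inner = Inner(depth=5)
outer = Outer(inner=inner)
shallow = outer.get_params(deep=False)
deep = outer.get_params(deep=True)
```

With `deep=False` only the top-level constructor arguments are returned; with `deep=True` the nested object's parameters appear as well, under the key `inner__depth`.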

set_output(*, transform=None)

Set output container.

See the scikit-learn example "Introducing the set_output API" for an example of how to use the API.

Parameters

transform: {“default”, “pandas”, “polars”}, default=None: Configure output of transform and fit_transform.

“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged

New in scikit-learn version 1.4: the “polars” option was added.

Returns

self: estimator instance: Estimator instance.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params: dict: Estimator parameters.

Returns

self: estimator instance: Estimator instance.
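
The `<component>__<parameter>` naming scheme can be illustrated with a small plain-Python sketch. `ToySettable`, `ToyScaler`, and `ToyPipeline` are hypothetical classes invented here; the sketch skips the validation of parameter names that scikit-learn performs and shows only how a double-underscore key is routed to a nested object.

```python
class ToySettable:
    """Hypothetical base class mimicking the set_params convention."""

    def set_params(self, **params):
        for key, value in params.items():
            name, delimiter, sub_key = key.partition("__")
            if delimiter:
                # "<component>__<parameter>": forward to the nested object.
                getattr(self, name).set_params(**{sub_key: value})
            else:
                # Plain name: set the attribute on this object.
                setattr(self, name, value)
        return self  # returning self allows method chaining


class ToyScaler(ToySettable):
    def __init__(self, factor=1.0):
        self.factor = factor


class ToyPipeline(ToySettable):
    def __init__(self, scaler, n_samples=10):
        self.scaler = scaler
        self.n_samples = n_samples


pipeline = ToyPipeline(scaler=ToyScaler())
returned = pipeline.set_params(scaler__factor=2.0, n_samples=50)
```

After the call, the nested scaler's `factor` and the pipeline's own `n_samples` are both updated, and the returned object is the pipeline itself.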

transform(X, y=None)

Transforms the dataset by generating synthetic samples based on SHAP values.

Parameters:

X: array-like of shape (n_samples, n_features): The input data to be transformed.
y: array-like of shape (n_samples,), default=None: The target values. This parameter is not used; it is present only to adhere to the scikit-learn transformer interface.

Returns:

transformed_data: array-like of shape (n_samples_new, n_features): The transformed dataset containing the original samples along with the generated synthetic samples.
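
To make the output contract concrete (original rows followed by generated rows, so n_samples_new is at least n_samples), here is a toy stand-in in plain Python. `ToyJitterSampler` is a hypothetical class and does NOT implement the SHAP-based importance sampling of `ImportanceSampler`: it merely adds Gaussian noise to randomly chosen rows, and the `seed` argument and the row ordering are assumptions made for the example.

```python
import random


class ToyJitterSampler:
    """Hypothetical stand-in; not the SHAP-based algorithm of ImportanceSampler."""

    def __init__(self, min_generate_samples, seed=0):
        self.min_generate_samples = min_generate_samples
        self.seed = seed

    def fit(self, X, y=None):
        # Placeholder for the real fitting step (no SHAP values are computed).
        self.n_features_ = len(X[0])
        return self

    def transform(self, X, y=None):
        rng = random.Random(self.seed)
        synthetic = []
        for _ in range(self.min_generate_samples):
            base = rng.choice(X)  # pick an existing row to perturb
            synthetic.append([v + rng.gauss(0.0, 0.1) for v in base])
        # Original rows first, synthetic rows appended after them.
        return [list(row) for row in X] + synthetic


X = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
out = ToyJitterSampler(min_generate_samples=4).fit(X).transform(X)
```

The result keeps the three input rows and appends four synthetic ones, so the number of columns is unchanged while the number of rows grows.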