ceml.sklearn¶

ceml.sklearn.counterfactual¶

class ceml.sklearn.counterfactual.SklearnCounterfactual(model, **kwds)¶

Bases: ceml.model.counterfactual.Counterfactual, abc.ABC

Base class for computing a counterfactual of a sklearn model.

The SklearnCounterfactual class can compute counterfactuals of sklearn models.

Parameters: model (object) – The sklearn model that is used for computing the counterfactual.

model¶

An instance of a sklearn model.

Type: object

mymodel¶

Rebuild model.

Type: instance of ceml.model.ModelWithLoss

Note

The class SklearnCounterfactual cannot be instantiated because it contains an abstract method.

compute_counterfactual(x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='auto', optimizer_args=None, return_as_dict=True, done=None)¶

Computes a counterfactual of a given input x.

Parameters

x (numpy.ndarray) – The data point x whose prediction has to be explained.
y_target (int or float) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if the cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

If no regularization is used (regularization=None), C is ignored.

The default is 1.0.
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.

Use “auto” if you do not know what optimizer to use - a suitable optimizer is chosen automatically.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

Some models (see paper) support the use of mathematical programs for computing counterfactuals. In this case, you can use the option “mp” - please read the documentation of the corresponding model for further information.

The default is “auto”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.

If done is None, the output/prediction of the counterfactual must match y_target exactly.

The default is None.

Note

In case of a regression it might not always be possible to achieve a given output/prediction exactly.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.
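The handling of a list-valued C described above can be sketched as a simple loop. This is a minimal illustration under assumptions, not ceml's implementation: `first_accepted_counterfactual` and `solve_for_strength` are hypothetical names, the latter standing in for a single optimizer run at one regularization strength.

```python
def first_accepted_counterfactual(solve_for_strength, C, done):
    """Try each regularization strength in order; return the first accepted result.

    solve_for_strength(c) -> (x_cf, y_cf) is a hypothetical stand-in for one
    optimizer run; done(y_cf) -> bool decides whether the prediction is accepted.
    """
    strengths = C if isinstance(C, (list, tuple)) else [C]
    for c in strengths:
        x_cf, y_cf = solve_for_strength(c)
        if done(y_cf):
            return c, x_cf, y_cf
    raise Exception("No counterfactual found")


# Toy 1-D setting: the "model" predicts class 1 iff x > 2, and the fake solver
# moves x less the stronger the regularization is.
X_ORIG = 0.0


def toy_solver(c):
    x_cf = X_ORIG + 3.0 / c
    return x_cf, int(x_cf > 2.0)


used_c, x_cf, y_cf = first_accepted_counterfactual(
    toy_solver, C=[10.0, 2.0, 1.0, 0.1], done=lambda y: y == 1
)
print(used_c, x_cf, y_cf)
```

The remaining value 0.1 is never tried because the search stops at the first strength whose result is accepted by `done`.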

abstract rebuild_model(model)¶

Rebuilds a sklearn model.

Converts a sklearn model into a ceml.model.ModelWithLoss instance so that we have a model-specific cost function and can compute the derivative with respect to the input.

Parameters: model – The sklearn model that is used for computing the counterfactual.
Returns: The wrapped model
Return type: ceml.model.ModelWithLoss

ceml.sklearn.plausibility¶

ceml.sklearn.plausibility.prepare_computation_of_plausible_counterfactuals(X, y, gmms, projection_mean_sub=None, projection_matrix=None, density_thresholds=None)¶

Performs all steps of computing a plausible counterfactual explanation that are independent of the concrete sample. Because computing a plausible counterfactual involves a considerable amount of work that does not depend on the concrete sample we want to explain, it makes sense to precompute as much as possible and avoid redundant computations.

Parameters

X (numpy.ndarray) – Data points.
y (numpy.ndarray) – Labels of data points X. Assumed to be [0, 1, 2, …].
gmms (list(int)) – List of class dependent Gaussian Mixture Models (GMMs).
projection_mean_sub (numpy.ndarray, optional) –
The negative bias of the affine preprocessing.

The default is None.
projection_matrix (numpy.ndarray, optional) –
The projection matrix of the affine preprocessing.

The default is None.
density_thresholds (float, optional) –
Density threshold at which we consider a counterfactual to be plausible.

If no density threshold is specified (density_thresholds is set to None), the median density of the samples X is chosen as a threshold.

The default is None.

Returns

All precomputable quantities needed for the computation of plausible counterfactuals.

Return type

dict
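The default threshold rule (the median density of the samples) can be illustrated with a one-dimensional Gaussian mixture. This is a sketch under assumptions: the helper names, the mixture parameters and the samples are made up for illustration, and it does not reproduce how ceml evaluates its GMMs.

```python
import math
from statistics import median


def gmm_density(x, weights, means, variances):
    """Density of a 1-D Gaussian mixture at x."""
    return sum(
        w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
        for w, m, v in zip(weights, means, variances)
    )


def median_density_threshold(samples, weights, means, variances):
    """Median of the mixture density over the given samples."""
    return median(gmm_density(x, weights, means, variances) for x in samples)


# Hypothetical class-specific mixture with two components, plus a few samples.
weights, means, variances = [0.5, 0.5], [0.0, 4.0], [1.0, 1.0]
samples = [-0.5, 0.0, 0.3, 3.8, 4.1, 9.0]

threshold = median_density_threshold(samples, weights, means, variances)
# A point counts as plausible here if its density reaches the threshold.
plausible = [x for x in samples if gmm_density(x, weights, means, variances) >= threshold]
print(threshold, plausible)
```

With a median threshold, roughly half of the samples fall below it by construction, so the outlier at 9.0 is rejected along with the lower-density points.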

ceml.sklearn.decisiontree¶

class ceml.sklearn.decisiontree.DecisionTreeCounterfactual(model, **kwds)¶

Bases: ceml.sklearn.counterfactual.SklearnCounterfactual, ceml.sklearn.decisiontree.PlausibleCounterfactualOfDecisionTree

Class for computing a counterfactual of a decision tree model.

See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.

compute_all_counterfactuals(x, y_target, features_whitelist=None, regularization='l1')¶

Computes all counterfactuals of a given input x.

Parameters

model (a sklearn.tree.DecisionTreeClassifier or sklearn.tree.DecisionTreeRegressor instance.) – The decision tree model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction is supposed to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or callable, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
You can use your own custom penalty function by setting regularization to a callable that can be called on a potential counterfactual and returns a scalar.

If regularization is None, no regularization is used.

The default is “l1”.

Returns

List of all counterfactuals.

Return type

list(np.array)

Raises

TypeError – If an invalid argument is passed to the function.
ValueError – If no counterfactual exists.
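Because a decision tree partitions the input space into axis-aligned regions, the set of counterfactuals can be enumerated leaf by leaf instead of being searched with an optimizer. The following is a minimal sketch of that idea, not ceml's implementation: the nested-dict tree format and all function names are hypothetical, the split rule (go left if x[feature] <= threshold) follows the sklearn convention, and `eps` is assumed to be smaller than any gap between thresholds.

```python
def leaf_paths(node, path=()):
    """Yield (conditions, value) per leaf; a condition is (feature, threshold, go_left)."""
    if "value" in node:
        yield list(path), node["value"]
        return
    f, t = node["feature"], node["threshold"]
    yield from leaf_paths(node["left"], path + ((f, t, True),))
    yield from leaf_paths(node["right"], path + ((f, t, False),))


def predict(node, x):
    """Follow the splits until a leaf is reached."""
    while "value" not in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["value"]


def all_counterfactuals(tree, x, y_target, eps=1e-6):
    """One candidate per leaf predicting y_target, sorted by l1 distance to x."""
    candidates = []
    for conditions, value in leaf_paths(tree):
        if value != y_target:
            continue
        x_cf = list(x)
        for f, t, go_left in conditions:
            if go_left and x_cf[f] > t:
                x_cf[f] = t            # smallest move that satisfies x[f] <= t
            elif not go_left and x_cf[f] <= t:
                x_cf[f] = t + eps      # just past the threshold so that x[f] > t
        candidates.append(x_cf)
    return sorted(candidates, key=lambda c: sum(abs(a - b) for a, b in zip(c, x)))


# Hypothetical toy tree with two leaves that predict class 1.
tree = {
    "feature": 0, "threshold": 5.0,
    "left": {"feature": 1, "threshold": 0.0,
             "left": {"value": 1}, "right": {"value": 0}},
    "right": {"feature": 1, "threshold": 2.0,
              "left": {"value": 1}, "right": {"value": 0}},
}
x = [3.0, 1.0]
counterfactuals = all_counterfactuals(tree, x, y_target=1)
print(counterfactuals)
```

The first entry of the sorted list is the closest candidate under the l1 penalty; a features whitelist could be added by skipping leaves whose path would require changing a feature outside the whitelist.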

compute_counterfactual(x, y_target, features_whitelist=None, regularization='l1', C=None, optimizer=None, return_as_dict=True)¶

Computes a counterfactual of a given input x.

Parameters

model (a sklearn.tree.DecisionTreeClassifier or sklearn.tree.DecisionTreeRegressor instance.) – The decision tree model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction is supposed to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or callable, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
You can use your own custom penalty function by setting regularization to a callable that can be called on a potential counterfactual and returns a scalar.

If regularization is None, no regularization is used.

The default is “l1”.
C (None) –
Not used - is always None.

The only reason for including this parameter is to match the signature of other ceml.sklearn.counterfactual.SklearnCounterfactual children.
optimizer (None) –
Not used - is always None.

The only reason for including this parameter is to match the signature of other ceml.sklearn.counterfactual.SklearnCounterfactual children.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.

The default is True.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

rebuild_model(model)¶

Rebuilds a sklearn.tree.DecisionTreeClassifier or sklearn.tree.DecisionTreeRegressor model.

Does nothing.

Parameters: model (instance of sklearn.tree.DecisionTreeClassifier or sklearn.tree.DecisionTreeRegressor) – The sklearn decision tree model.
Returns
Return type: None

Note

In contrast to many other SklearnCounterfactual instances, we do not rebuild the model because gradients are neither needed nor computable for a decision tree. The set of counterfactuals is computed without a “common” optimization algorithm such as Nelder-Mead.

ceml.sklearn.decisiontree.decisiontree_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', return_as_dict=True, done=None, plausibility=None)¶

Computes a counterfactual of a given input x.

Parameters

model (a sklearn.tree.DecisionTreeClassifier or sklearn.tree.DecisionTreeRegressor instance.) – The decision tree model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or callable, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
You can use your own custom penalty function by setting regularization to a callable that can be called on a potential counterfactual and returns a scalar.

If regularization is None, no regularization is used.

The default is “l1”.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) – Not used.
plausibility (dict, optional) –
If set to a valid dictionary (see ceml.sklearn.plausibility.prepare_computation_of_plausible_counterfactuals()), a plausible counterfactual (as proposed in Artelt et al. 2020) is computed. Note that in this case, all other parameters are ignored.

If plausibility is None, the closest counterfactual is computed.

The default is None.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.

ceml.sklearn.knn¶

class ceml.sklearn.knn.KNN(model, dist='l2', **kwds)¶

Bases: ceml.model.model.ModelWithLoss

Class for rebuilding/wrapping the sklearn.neighbors.KNeighborsClassifier and sklearn.neighbors.KNeighborsRegressor classes.

The KNN class rebuilds a sklearn knn model.

Parameters

model (instance of sklearn.neighbors.KNeighborsClassifier or sklearn.neighbors.KNeighborsRegressor) – The knn model.
dist (str or callable, optional) –
Computes the distance between a prototype and a data point.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
You can use your own custom distance function by setting dist to a callable that can be called on a data point and returns a scalar.

The default is “l2”.

Note: dist must not be None.

X¶

The training data set.

Type: numpy.array

y¶

The ground truth of the training data set.

Type: numpy.array

dist¶

The distance function.

Type: callable

Raises: TypeError – If model is not an instance of sklearn.neighbors.KNeighborsClassifier or sklearn.neighbors.KNeighborsRegressor

get_loss(y_target, pred=None)¶

Creates and returns a loss function.

Builds a cost function where we penalize the minimum distance to the nearest prototype which is consistent with the target y_target.

Parameters

y_target (int) – The target class.
pred (callable, optional) –
A callable that maps an input to an input. E.g. using the ceml.optim.input_wrapper.InputWrapper class.

If pred is None, no transformation is applied to the input before passing it into the loss function.

The default is None.

Returns

Initialized cost function. Target label is y_target.

Return type

ceml.backend.jax.costfunctions.TopKMinOfListDistCost
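The pred argument maps an input to an input. One plausible use is to let the optimizer work on the whitelisted features only while all other features stay fixed at their original values. The sketch below illustrates that idea with a hypothetical helper `make_input_wrapper`; it is not the ceml.optim.input_wrapper.InputWrapper implementation.

```python
def make_input_wrapper(x_orig, features_whitelist=None):
    """Return a function that expands a reduced vector to a full input.

    Only the whitelisted positions are taken from the reduced vector; all
    other features keep their value from x_orig.
    """
    if features_whitelist is None:
        features_whitelist = list(range(len(x_orig)))

    def expand(z):
        x = list(x_orig)
        for value, idx in zip(z, features_whitelist):
            x[idx] = value
        return x

    return expand


x_orig = [1.0, 2.0, 3.0]
expand = make_input_wrapper(x_orig, features_whitelist=[0, 2])
x_full = expand([10.0, 30.0])
print(x_full)
```

A loss built on top of such a wrapper would evaluate `loss(expand(z))`, so feature 1 can never change no matter what the optimizer proposes.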

predict(x)¶

Note

This function is a placeholder only.

This function does not predict anything and just returns the given input.

class ceml.sklearn.knn.KnnCounterfactual(model, dist='l2', **kwds)¶

Bases: ceml.sklearn.counterfactual.SklearnCounterfactual

Class for computing a counterfactual of a knn model.

See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.

rebuild_model(model)¶

Rebuilds a sklearn.neighbors.KNeighborsClassifier or sklearn.neighbors.KNeighborsRegressor model.

Converts a sklearn.neighbors.KNeighborsClassifier or sklearn.neighbors.KNeighborsRegressor instance into a ceml.sklearn.knn.KNN instance.

Parameters: model (instance of sklearn.neighbors.KNeighborsClassifier or sklearn.neighbors.KNeighborsRegressor) – The sklearn knn model.
Returns: The wrapped knn model.
Return type: ceml.sklearn.knn.KNN

ceml.sklearn.knn.knn_generate_counterfactual(model, x, y_target, features_whitelist=None, dist='l2', regularization='l1', C=1.0, optimizer='nelder-mead', optimizer_args=None, return_as_dict=True, done=None)¶

Computes a counterfactual of a given input x.

Parameters

model (a sklearn.neighbors.KNeighborsClassifier or sklearn.neighbors.KNeighborsRegressor instance.) – The knn model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
dist (str or callable, optional) –
Computes the distance between a prototype and a data point.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
You can use your own custom distance function by setting dist to a callable that can be called on a data point and returns a scalar.

The default is “l2”.

Note: dist must not be None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

C is ignored if no regularization is used (regularization=None).

The default is 1.0.
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.desc_to_optim() for details.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

The default is “nelder-mead”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.

If done is None, the output/prediction of the counterfactual must match y_target exactly.

The default is None.

Note

In case of a regression it might not always be possible to achieve a given output/prediction exactly.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

ceml.sklearn.linearregression¶

class ceml.sklearn.linearregression.LinearRegression(model, **kwds)¶

Bases: ceml.model.model.ModelWithLoss

Class for rebuilding/wrapping the sklearn.linear_model.base.LinearModel class

The LinearRegression class rebuilds a linear regression model from a given weight vector and intercept.

Parameters: model (instance of sklearn.linear_model.base.LinearModel) – The linear regression model (e.g. sklearn.linear_model.LinearRegression or sklearn.linear_model.Ridge).

w¶

The weight vector (a matrix if we have a multi-dimensional output).

Type: numpy.ndarray

b¶

The intercept/bias (a vector if we have a multi-dimensional output).

Type: numpy.ndarray

dim¶

Dimensionality of the input data.

Type: int

get_loss(y_target, pred=None)¶

Creates and returns a loss function.

Build a squared-error cost function where the target is y_target.

Parameters

y_target (float) – The target value.
pred (callable, optional) –
A callable that maps an input to the output (regression).

If pred is None, the class method predict is used for mapping the input to the output (regression).

The default is None.

Returns

Initialized squared-error cost function. Target is y_target.

Return type

ceml.backend.jax.costfunctions.SquaredError

predict(x)¶

Predict the output of a given input.

Computes the regression on a given input x.

Parameters: x (numpy.ndarray) – The input x whose output is going to be predicted.
Returns: An array containing the predicted output.
Return type: jax.numpy.array
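For a single-output model, the prediction and the squared-error cost described above reduce to a few lines. The sketch below uses plain Python lists instead of jax arrays, and the function names are hypothetical; it only illustrates the arithmetic, not the ceml classes.

```python
def predict(w, b, x):
    """Linear regression output w . x + b for a single-output model."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def squared_error_loss(w, b, y_target):
    """Return a function x -> (predict(x) - y_target)^2."""
    def loss(x):
        return (predict(w, b, x) - y_target) ** 2
    return loss


w, b = [2.0, -1.0], 0.5
loss = squared_error_loss(w, b, y_target=4.5)
print(predict(w, b, [1.0, 3.0]), loss([1.0, 3.0]), loss([3.0, 2.0]))
```

The loss is zero exactly at inputs whose prediction equals the target, which is what a counterfactual of a regression model has to achieve.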

class ceml.sklearn.linearregression.LinearRegressionCounterfactual(model, **kwds)¶

Bases: ceml.sklearn.counterfactual.SklearnCounterfactual, ceml.optim.cvx.MathematicalProgram, ceml.optim.cvx.ConvexQuadraticProgram

Class for computing a counterfactual of a linear regression model.

See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.

rebuild_model(model)¶

Rebuild a sklearn.linear_model.base.LinearModel model.

Converts a sklearn.linear_model.base.LinearModel into a ceml.sklearn.linearregression.LinearRegression.

Parameters: model (instance of sklearn.linear_model.base.LinearModel) – The sklearn linear regression model (e.g. sklearn.linear_model.LinearRegression or sklearn.linear_model.Ridge).
Returns: The wrapped linear regression model.
Return type: ceml.sklearn.linearregression.LinearRegression

ceml.sklearn.linearregression.linearregression_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='mp', optimizer_args=None, return_as_dict=True, done=None)¶

Computes a counterfactual of a given input x.

Parameters

model (a sklearn.linear_model.base.LinearModel instance.) – The linear regression model (e.g. sklearn.linear_model.LinearRegression or sklearn.linear_model.Ridge) that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (float) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

C is ignored if no regularization is used (regularization=None).

The default is 1.0.
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.

Linear regression supports the use of mathematical programs for computing counterfactuals - set optimizer to “mp” for using a convex quadratic program for computing the counterfactual. Note that in this case the hyperparameter C is ignored.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

The default is “mp”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.

If done is None, the output/prediction of the counterfactual must match y_target exactly.

The default is None.

Note

It might not always be possible to achieve a given output/prediction exactly.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.
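To build intuition for why a mathematical program suits linear regression: with an l2 penalty and a single-output model, the closest input that hits the target exactly is the projection of x onto the hyperplane w . x + b = y_target, which has a closed form. The sketch below uses that closed form as a stand-in technique; it is not the convex quadratic program that ceml solves, and `closest_l2_counterfactual` is a hypothetical name.

```python
def closest_l2_counterfactual(w, b, x, y_target, features_whitelist=None):
    """Smallest l2 change to x such that w . x + b == y_target.

    Only whitelisted features are changed; raises if none of them has a
    nonzero weight, since the prediction cannot be moved in that case.
    """
    idx = range(len(x)) if features_whitelist is None else features_whitelist
    w_free = [0.0] * len(x)
    for i in idx:
        w_free[i] = w[i]
    norm_sq = sum(wi * wi for wi in w_free)
    if norm_sq == 0.0:
        raise ValueError("No counterfactual: whitelisted features have zero weight")
    residual = y_target - (sum(wi * xi for wi, xi in zip(w, x)) + b)
    step = residual / norm_sq
    return [xi + step * wi for xi, wi in zip(x, w_free)]


w, b, x = [2.0, -1.0], 0.5, [1.0, 3.0]
x_cf_all = closest_l2_counterfactual(w, b, x, y_target=4.5)
x_cf_first_only = closest_l2_counterfactual(w, b, x, y_target=4.5, features_whitelist=[0])
print(x_cf_all, x_cf_first_only)
```

Restricting the whitelist to feature 0 forces a larger change in that one feature, because feature 1 can no longer contribute to moving the prediction.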

ceml.sklearn.lvq¶

class ceml.sklearn.lvq.CQPHelper(mymodel, x_orig, y_target, indices_other_prototypes, features_whitelist=None, regularization='l1', optimizer_args=None, **kwds)¶

Bases: ceml.optim.cvx.ConvexQuadraticProgram

class ceml.sklearn.lvq.LVQ(model, dist='l2', **kwds)¶

Bases: ceml.model.model.ModelWithLoss

Class for rebuilding/wrapping the sklearn_lvq.GlvqModel, sklearn_lvq.GmlvqModel, sklearn_lvq.LgmlvqModel, sklearn_lvq.RslvqModel, sklearn_lvq.MrslvqModel and sklearn_lvq.LmrslvqModel classes.

The LVQ class rebuilds a sklearn-lvq lvq model.

Parameters

model (instance of sklearn_lvq.GlvqModel, sklearn_lvq.GmlvqModel, sklearn_lvq.LgmlvqModel, sklearn_lvq.RslvqModel, sklearn_lvq.MrslvqModel or sklearn_lvq.LmrslvqModel) – The lvq model.
dist (str or callable, optional) –
Computes the distance between a prototype and a data point.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
You can use your own custom distance function by setting dist to a callable that can be called on a data point and returns a scalar.

The default is “l2”.

Note: dist must not be None.

prototypes¶

The prototypes.

Type: numpy.array

labels¶

The labels of the prototypes.

Type: numpy.array

dist¶

The distance function.

Type: callable

model¶

The original sklearn-lvq model.

Type: object

model_class¶

The class of the sklearn-lvq model.

Type: class

dim¶

Dimensionality of the input data.

Type: int

Raises: TypeError – If model is not an instance of sklearn_lvq.GlvqModel, sklearn_lvq.GmlvqModel, sklearn_lvq.LgmlvqModel, sklearn_lvq.RslvqModel, sklearn_lvq.MrslvqModel or sklearn_lvq.LmrslvqModel

get_loss(y_target, pred=None)¶

Creates and returns a loss function.

Builds a cost function where we penalize the minimum distance to the nearest prototype which is consistent with the target y_target.

Parameters

y_target (int) – The target class.
pred (callable, optional) –
A callable that maps an input to an input. E.g. using the ceml.optim.input_wrapper.InputWrapper class.

If pred is None, no transformation is applied to the input before putting it into the loss function.

The default is None.

Returns

Initialized cost function. Target label is y_target.

Return type

ceml.backend.jax.costfunctions.MinOfListDistCost
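The cost described above (the distance to the nearest prototype that carries the target label) can be sketched as follows. The function names are hypothetical, the squared Euclidean distance is an assumption, and unlike the dist parameter above the `dist` callable in this sketch takes both points explicitly; this is not the MinOfListDistCost implementation.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def min_dist_to_target_prototypes(prototypes, labels, y_target, dist=squared_l2):
    """Return a cost x -> min distance from x to any prototype labeled y_target."""
    targets = [p for p, label in zip(prototypes, labels) if label == y_target]
    if not targets:
        raise ValueError("No prototype with the requested label")

    def cost(x):
        return min(dist(x, p) for p in targets)

    return cost


prototypes = [[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]]
labels = [0, 1, 1]
cost = min_dist_to_target_prototypes(prototypes, labels, y_target=1)
print(cost([0.0, 1.0]), cost([4.0, 0.0]))
```

Minimizing this cost pulls the input toward whichever target-class prototype is currently nearest, which is why it is combined with a regularizer that keeps the result close to the original input.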

predict(x)¶

Note

This function is a placeholder only.

This function does not predict anything and just returns the given input.

class ceml.sklearn.lvq.LvqCounterfactual(model, dist='l2', cqphelper=<class 'ceml.sklearn.lvq.CQPHelper'>, **kwds)¶

Bases: ceml.sklearn.counterfactual.SklearnCounterfactual, ceml.optim.cvx.MathematicalProgram, ceml.optim.cvx.DCQP

Class for computing a counterfactual of a lvq model.

See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.

rebuild_model(model)¶

Rebuilds a sklearn_lvq.GlvqModel, sklearn_lvq.GmlvqModel, sklearn_lvq.LgmlvqModel, sklearn_lvq.RslvqModel, sklearn_lvq.MrslvqModel or sklearn_lvq.LmrslvqModel model.

Converts a sklearn_lvq.GlvqModel, sklearn_lvq.GmlvqModel, sklearn_lvq.LgmlvqModel, sklearn_lvq.RslvqModel, sklearn_lvq.MrslvqModel or sklearn_lvq.LmrslvqModel instance into a ceml.sklearn.lvq.LVQ instance.

Parameters: model (instance of sklearn_lvq.GlvqModel, sklearn_lvq.GmlvqModel, sklearn_lvq.LgmlvqModel, sklearn_lvq.RslvqModel, sklearn_lvq.MrslvqModel or sklearn_lvq.LmrslvqModel) – The sklearn-lvq lvq model.
Returns: The wrapped lvq model.
Return type: ceml.sklearn.lvq.LVQ

solve(x_orig, y_target, regularization, features_whitelist, return_as_dict, optimizer_args)¶

Approximately solves the DCQP by using the penalty convex-concave procedure.

Parameters: x0 (numpy.ndarray) – The initial data point for the penalty convex-concave procedure - this could be anything, however a “good” initial solution might lead to a better result.

ceml.sklearn.lvq.lvq_generate_counterfactual(model, x, y_target, features_whitelist=None, dist='l2', regularization='l1', C=1.0, optimizer='auto', optimizer_args=None, return_as_dict=True, done=None)¶

Computes a counterfactual of a given input x.

Parameters

model (a sklearn_lvq.GlvqModel, sklearn_lvq.GmlvqModel, sklearn_lvq.LgmlvqModel, sklearn_lvq.RslvqModel, sklearn_lvq.MrslvqModel or sklearn_lvq.LmrslvqModel instance.) –
The lvq model that is used for computing the counterfactual.

Note: Only lvq models from sklearn-lvq are supported.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
dist (str or callable, optional) –
Computes the distance between a prototype and a data point.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
You can use your own custom distance function by setting dist to a callable that can be called on a data point and returns a scalar.

The default is “l2”.

Note: dist must not be None.
regularization (str or callable, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

C is ignored if no regularization is used (regularization=None).

The default is 1.0.
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

Use “auto” if you do not know what optimizer to use - a suitable optimizer is chosen automatically.

The default is “auto”.

Learning vector quantization supports the use of mathematical programs for computing counterfactuals - set optimizer to “mp” for using a convex quadratic program (G(M)LVQ) or a DCQP (otherwise) for computing the counterfactual. Note that in this case the hyperparameter C is ignored. Because the DCQP is a non-convex problem, we are not guaranteed to find the best solution (it might even happen that we do not find a solution at all) - we use the penalty convex-concave procedure for approximately solving the DCQP.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) – Not used.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.
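The reason a convex program is available for the plain Euclidean case is that "x is at least as close to the target prototype p_i as to another prototype p_j" is a linear constraint in x: expanding ||x - p_i||^2 <= ||x - p_j||^2 gives 2 (p_j - p_i) . x <= ||p_j||^2 - ||p_i||^2. The sketch below builds these constraints; the function names are hypothetical, squared Euclidean distance is assumed, and for simplicity every other prototype is treated as a competitor. It does not reproduce ceml's CQPHelper.

```python
def closer_to_target_constraints(prototypes, target_idx):
    """Linear constraints (a, c), read as a . x <= c, that hold iff x is at least
    as close (squared Euclidean) to prototypes[target_idx] as to every other one."""
    p_i = prototypes[target_idx]
    constraints = []
    for j, p_j in enumerate(prototypes):
        if j == target_idx:
            continue
        a = [2.0 * (pj - pi) for pj, pi in zip(p_j, p_i)]
        c = sum(v * v for v in p_j) - sum(v * v for v in p_i)
        constraints.append((a, c))
    return constraints


def satisfies(constraints, x):
    """Check all constraints a . x <= c at the point x."""
    return all(sum(ai * xi for ai, xi in zip(a, x)) <= c for a, c in constraints)


prototypes = [[0.0, 0.0], [4.0, 0.0]]
constraints = closer_to_target_constraints(prototypes, target_idx=1)
print(constraints, satisfies(constraints, [3.0, 0.0]), satisfies(constraints, [1.0, 0.0]))
```

For these two prototypes the single constraint reduces to x[0] >= 2, the perpendicular bisector between them. Minimizing an l1 or l2 penalty subject to such linear constraints is a convex problem.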

ceml.sklearn.models¶

ceml.sklearn.models.generate_counterfactual(model, x, y_target, features_whitelist=None, dist='l2', regularization='l1', C=1.0, optimizer='auto', optimizer_args=None, return_as_dict=True, done=None)¶

Computes a counterfactual of a given input x.

Parameters

model (object) – The sklearn model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
dist (str or callable, optional) –
Computes the distance between a prototype and a data point.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
You can use your own custom distance function by setting dist to a callable that can be called on a data point and returns a scalar.

The default is “l2”.

Note: dist must not be None.

Note

Only needed if model is an LVQ or KNN model.
regularization (str or callable, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

C is ignored if no regularization is used (regularization=None).

The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.

Use “auto” if you do not know what optimizer to use - a suitable optimizer is chosen automatically.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

The default is “auto”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.

If done is None, the output/prediction of the counterfactual must match y_target exactly.

The default is None.

Note

In case of a regression it might not always be possible to achieve a given output/prediction exactly.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

ValueError – If model contains an unsupported model.
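
The behaviour of a list-valued C (try each value in order and return the first counterfactual that is found) can be sketched as follows. This is a minimal illustration and not CEML's implementation: first_accepted and toy_solve are hypothetical names, and the one-dimensional toy problem exists only to make the loop observable.

```python
def first_accepted(c_values, solve, accept):
    """Try each regularization strength in order and stop at the first success."""
    if isinstance(c_values, (int, float)):
        c_values = [c_values]  # a single value behaves like a one-element list
    for c in c_values:
        x_cf, y_cf = solve(c)
        if accept(y_cf):
            return c, x_cf, y_cf
    raise Exception("No counterfactual found")


tried = []  # records which values of C were actually used


def toy_solve(c):
    """Hypothetical solver: minimizes c * |x| + (x - 2)**2 for original input x = 0."""
    tried.append(c)
    x_cf = max(0.0, 2.0 - c / 2.0)   # closed-form minimizer of the toy objective
    y_cf = 1 if x_cf >= 1.0 else 0   # toy classifier: class 1 iff x >= 1
    return x_cf, y_cf


# With done=None semantics, the prediction must match the target (here: 1) exactly.
c_used, x_cf, y_cf = first_accepted([4.0, 1.0, 0.1], toy_solve, lambda y: y == 1)
```

In this toy run the strong penalty C=4.0 keeps the solution at the original input, C=1.0 yields an accepted counterfactual, and C=0.1 is never tried.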

ceml.sklearn.naivebayes¶

class ceml.sklearn.naivebayes.GaussianNB(model, **kwds)¶

Bases: ceml.model.model.ModelWithLoss

Class for rebuilding/wrapping the sklearn.naive_bayes.GaussianNB class.

The GaussianNB class rebuilds a Gaussian naive Bayes model from a given set of parameters (priors, means and variances).

Parameters: model (instance of sklearn.naive_bayes.GaussianNB) – The Gaussian naive Bayes model.

class_priors¶

Class-dependent priors.

Type: numpy.ndarray

means¶

Class- and feature-dependent means.

Type: numpy.ndarray

variances¶

Class- and feature-dependent variances.

Type: numpy.ndarray

dim¶

Dimensionality of the input data.

Type: int

is_binary¶

True if model is a binary classifier, False otherwise.

Type: boolean

get_loss(y_target, pred=None)¶

Creates and returns a loss function.

Builds a negative log-likelihood cost function where the target is y_target.

Parameters

y_target (int) – The target class.
pred (callable, optional) –
A callable that maps an input to the output (class probabilities).

If pred is None, the class method predict is used for mapping the input to the output (class probabilities).

The default is None.

Returns

Initialized negative-log-likelihood cost function. Target label is y_target.

Return type

ceml.backend.jax.costfunctions.NegLogLikelihoodCost
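
To make the idea of a negative log-likelihood cost with a fixed target concrete, here is a minimal sketch in plain Python. It is not the ceml.backend.jax implementation: make_nll_loss and toy_pred are hypothetical names, and the toy predictor stands in for the model's predict method.

```python
import math


def make_nll_loss(pred, y_target):
    """Build a loss x -> -log p(y_target | x) from a callable returning class probabilities."""
    def loss(x):
        return -math.log(pred(x)[y_target])
    return loss


def toy_pred(x):
    """Hypothetical binary predictor: logistic function of the first feature."""
    p1 = 1.0 / (1.0 + math.exp(-x[0]))
    return [1.0 - p1, p1]


loss = make_nll_loss(toy_pred, y_target=1)
loss_at_boundary = loss([0.0])   # both classes equally likely here
loss_inside_class = loss([2.0])  # class 1 is more likely here
```

The loss shrinks as the predicted probability of the target class grows, which is what drives a candidate input towards the requested class.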

predict(x)¶

Predict the output of a given input.

Computes the class probabilities for a given input x.

Parameters: x (numpy.ndarray) – The input x that is going to be classified.
Returns: An array containing the class probabilities.
Return type: jax.numpy.array
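
How class probabilities follow from the stored priors, means and variances can be sketched without any library. This is an illustration of the Gaussian naive Bayes formula only; the function name gaussian_nb_proba and the example parameters are assumptions and do not mirror the JAX-based implementation of this class.

```python
import math


def gaussian_nb_proba(x, class_priors, means, variances):
    """Class probabilities of a Gaussian naive Bayes model for a single input x."""
    log_joint = []
    for prior, mu, var in zip(class_priors, means, variances):
        # log p(y) + sum over features of log N(x_i | mu_i, var_i)
        total = math.log(prior)
        for xi, m, v in zip(x, mu, var):
            total += -0.5 * math.log(2.0 * math.pi * v) - (xi - m) ** 2 / (2.0 * v)
        log_joint.append(total)
    # Normalize in a numerically stable way (log-sum-exp trick).
    top = max(log_joint)
    weights = [math.exp(value - top) for value in log_joint]
    norm = sum(weights)
    return [w / norm for w in weights]


priors = [0.5, 0.5]
means = [[0.0, 0.0], [2.0, 2.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

p_middle = gaussian_nb_proba([1.0, 1.0], priors, means, variances)
p_near_class0 = gaussian_nb_proba([0.0, 0.0], priors, means, variances)
```

A point halfway between the two class means receives equal probabilities, while a point at the mean of class 0 is assigned mostly to class 0.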

class ceml.sklearn.naivebayes.GaussianNbCounterfactual(model, **kwds)¶

Bases: ceml.sklearn.counterfactual.SklearnCounterfactual, ceml.optim.cvx.MathematicalProgram, ceml.optim.cvx.SDP, ceml.optim.cvx.DCQP

Class for computing a counterfactual of a Gaussian naive Bayes model.

See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.

rebuild_model(model)¶

Rebuild a sklearn.naive_bayes.GaussianNB model.

Converts a sklearn.naive_bayes.GaussianNB into a ceml.sklearn.naivebayes.GaussianNB.

Parameters: model (instance of sklearn.naive_bayes.GaussianNB) – The sklearn Gaussian naive Bayes model.
Returns: The wrapped Gaussian naive Bayes model.
Return type: ceml.sklearn.naivebayes.GaussianNB

solve(x_orig, y_target, regularization, features_whitelist, return_as_dict, optimizer_args)¶

Approximately solves the DCQP by using the penalty convex-concave procedure.

Parameters: x0 (numpy.ndarray) – The initial data point for the penalty convex-concave procedure - this could be anything, however a “good” initial solution might lead to a better result.

ceml.sklearn.naivebayes.gaussiannb_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='auto', optimizer_args=None, return_as_dict=True, done=None)¶

Computes a counterfactual of a given input x.

Parameters

model (a sklearn.naive_bayes.GaussianNB instance.) – The Gaussian naive Bayes model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

C is ignored if no regularization is used (regularization=None).

The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

Use “auto” if you do not know what optimizer to use - a suitable optimizer is chosen automatically.

The default is “auto”.

Gaussian naive Bayes supports the use of mathematical programs for computing counterfactuals - set optimizer to “mp” for using a semi-definite program (binary classifier) or a DCQP (otherwise) for computing the counterfactual. Note that in this case the hyperparameter C is ignored. Because the DCQP is a non-convex problem, we are not guaranteed to find the best solution (it might even happen that we do not find a solution at all) - we use the penalty convex-concave procedure for approximately solving the DCQP.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) – Not used.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.
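
The effect of features_whitelist (only the listed dimensions may differ from the original input) can be shown with a short sketch. The helper apply_whitelist is hypothetical and only demonstrates the constraint; how CEML enforces it internally (for example through a gradient mask or an input wrapper) is not shown here.

```python
def apply_whitelist(x_orig, x_candidate, features_whitelist=None):
    """Keep candidate values only for whitelisted features; reset all others."""
    if features_whitelist is None:
        return list(x_candidate)  # all features may change
    allowed = set(features_whitelist)
    return [
        cand if index in allowed else orig
        for index, (orig, cand) in enumerate(zip(x_orig, x_candidate))
    ]


x_orig = [1.0, 2.0, 3.0]
x_candidate = [9.0, 8.0, 7.0]

restricted = apply_whitelist(x_orig, x_candidate, features_whitelist=[0, 2])
unrestricted = apply_whitelist(x_orig, x_candidate)
```

With the whitelist [0, 2], feature 1 keeps its original value, so the resulting explanation only talks about features the user is able to change.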

ceml.sklearn.lda¶

class ceml.sklearn.lda.Lda(model, **kwds)¶

Bases: ceml.model.model.ModelWithLoss

Class for rebuilding/wrapping the sklearn.discriminant_analysis.LinearDiscriminantAnalysis class.

The Lda class rebuilds an LDA model from the given parameters.

Parameters: model (instance of sklearn.discriminant_analysis.LinearDiscriminantAnalysis) – The lda model.

class_priors¶

Class-dependent priors.

Type: numpy.ndarray

means¶

Class-dependent means.

Type: numpy.ndarray

sigma_inv¶

Inverted covariance matrix.

Type: numpy.ndarray

dim¶

Dimensionality of the input data.

Type: int

Raises: TypeError – If model is not an instance of sklearn.discriminant_analysis.LinearDiscriminantAnalysis

get_loss(y_target, pred=None)¶

Creates and returns a loss function.

Builds a negative log-likelihood cost function where the target is y_target.

Parameters

y_target (int) – The target class.
pred (callable, optional) –
A callable that maps an input to the output (class probabilities).

If pred is None, the class method predict is used for mapping the input to the output (class probabilities).

The default is None.

Returns

Initialized negative-log-likelihood cost function. Target label is y_target.

Return type

ceml.backend.jax.costfunctions.NegLogLikelihoodCost

predict(x)¶

Predict the output of a given input.

Computes the class probabilities for a given input x.

Parameters: x (numpy.ndarray) – The input x that is going to be classified.
Returns: An array containing the class probabilities.
Return type: jax.numpy.array
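
The role of the attributes class_priors, means and sigma_inv in the prediction can be sketched as follows. This is a plain-Python illustration of the LDA posterior with one inverse covariance matrix shared by all classes; lda_proba and the example parameters are assumptions, not the JAX-based code of this class.

```python
import math


def lda_proba(x, class_priors, means, sigma_inv):
    """Class probabilities of an LDA model with a shared inverse covariance matrix."""
    dim = len(x)
    scores = []
    for prior, mu in zip(class_priors, means):
        diff = [xi - mi for xi, mi in zip(x, mu)]
        # Squared Mahalanobis distance diff^T * sigma_inv * diff.
        maha = sum(
            diff[i] * sigma_inv[i][j] * diff[j]
            for i in range(dim)
            for j in range(dim)
        )
        # The Gaussian normalization constant is identical for all classes and cancels.
        scores.append(math.log(prior) - 0.5 * maha)
    top = max(scores)
    weights = [math.exp(score - top) for score in scores]
    norm = sum(weights)
    return [w / norm for w in weights]


priors = [0.5, 0.5]
means = [[0.0, 0.0], [2.0, 0.0]]
sigma_inv = [[1.0, 0.0], [0.0, 1.0]]

p_boundary = lda_proba([1.0, 0.0], priors, means, sigma_inv)
p_class1 = lda_proba([2.0, 0.0], priors, means, sigma_inv)
```

Because the covariance is shared, the quadratic terms in x cancel between classes, which is why the decision boundaries of LDA are hyperplanes.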

class ceml.sklearn.lda.LdaCounterfactual(model, **kwds)¶

Bases: ceml.sklearn.counterfactual.SklearnCounterfactual, ceml.optim.cvx.MathematicalProgram, ceml.optim.cvx.ConvexQuadraticProgram, ceml.optim.cvx.PlausibleCounterfactualOfHyperplaneClassifier

Class for computing a counterfactual of an LDA model.

See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.

rebuild_model(model)¶

Rebuild a sklearn.discriminant_analysis.LinearDiscriminantAnalysis model.

Converts a sklearn.discriminant_analysis.LinearDiscriminantAnalysis into a ceml.sklearn.lda.Lda.

Parameters: model (instance of sklearn.discriminant_analysis.LinearDiscriminantAnalysis) – The sklearn lda model - note that store_covariance must be set to True.
Returns: The wrapped lda model.
Return type: ceml.sklearn.lda.Lda

ceml.sklearn.lda.lda_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='mp', optimizer_args=None, return_as_dict=True, done=None, plausibility=None)¶

Computes a counterfactual of a given input x.

Parameters

model (a sklearn.discriminant_analysis.LinearDiscriminantAnalysis instance.) – The lda model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

C is ignored if no regularization is used (regularization=None).

The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.

Linear discriminant analysis supports the use of mathematical programs for computing counterfactuals - set optimizer to “mp” for using a convex quadratic program for computing the counterfactual. Note that in this case the hyperparameter C is ignored.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

The default is “mp”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) – Not used.
plausibility (dict, optional) –
If set to a valid dictionary (see ceml.sklearn.plausibility.prepare_computation_of_plausible_counterfactuals()), a plausible counterfactual (as proposed in Artelt et al. 2020) is computed. Note that in this case, all other parameters are ignored.

If plausibility is None, the closest counterfactual is computed.

The default is None.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.
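
For intuition about the "closest counterfactual" of a classifier with a hyperplane decision boundary, the following sketch covers one special case that has a closed form: a binary decision rule w·x + b > 0 and squared Euclidean (l2) distance. It is not the convex quadratic program that CEML solves; the function name, the margin parameter and the example values are assumptions.

```python
def closest_point_across_hyperplane(x, w, b, margin=1e-6):
    """Smallest l2 change that moves x to the side where w.x + b >= margin."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    if score >= margin:
        return list(x)  # already classified as the target class
    norm_sq = sum(wi * wi for wi in w)
    # Move along the normal vector w, the shortest path to the hyperplane.
    step = (margin - score) / norm_sq
    return [xi + step * wi for xi, wi in zip(x, w)]


w, b = [1.0, 1.0], -2.0
x = [0.0, 0.0]

x_cf = closest_point_across_hyperplane(x, w, b)
score_cf = sum(wi * xi for wi, xi in zip(w, x_cf)) + b
```

A mathematical program is needed once the problem leaves this special case, for instance with l1 regularization, a features whitelist or more than two classes.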

ceml.sklearn.qda¶

class ceml.sklearn.qda.Qda(model, **kwds)¶

Bases: ceml.model.model.ModelWithLoss

Class for rebuilding/wrapping the sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis class.

The Qda class rebuilds a QDA model from the given parameters.

Parameters: model (instance of sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis) – The qda model.

class_priors¶

Class-dependent priors.

Type: numpy.ndarray

means¶

Class-dependent means.

Type: numpy.ndarray

sigma_inv¶

Class-dependent inverted covariance matrices.

Type: numpy.ndarray

dim¶

Dimensionality of the input data.

Type: int

is_binary¶

True if model is a binary classifier, False otherwise.

Type: boolean

Raises: TypeError – If model is not an instance of sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis

get_loss(y_target, pred=None)¶

Creates and returns a loss function.

Builds a negative log-likelihood cost function where the target is y_target.

Parameters

y_target (int) – The target class.
pred (callable, optional) –
A callable that maps an input to the output (class probabilities).

If pred is None, the class method predict is used for mapping the input to the output (class probabilities).

The default is None.

Returns

Initialized negative-log-likelihood cost function. Target label is y_target.

Return type

ceml.backend.jax.costfunctions.NegLogLikelihoodCost

predict(x)¶

Predict the output of a given input.

Computes the class probabilities for a given input x.

Parameters: x (numpy.ndarray) – The input x that is going to be classified.
Returns: An array containing the class probabilities.
Return type: jax.numpy.array

class ceml.sklearn.qda.QdaCounterfactual(model, **kwds)¶

Bases: ceml.sklearn.counterfactual.SklearnCounterfactual, ceml.optim.cvx.MathematicalProgram, ceml.optim.cvx.SDP, ceml.optim.cvx.DCQP

Class for computing a counterfactual of a qda model.

See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.

rebuild_model(model)¶

Rebuild a sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis model.

Converts a sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis into a ceml.sklearn.qda.Qda.

Parameters: model (instance of sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis) – The sklearn qda model - note that store_covariance must be set to True.
Returns: The wrapped qda model.
Return type: ceml.sklearn.qda.Qda

solve(x_orig, y_target, regularization, features_whitelist, return_as_dict, optimizer_args)¶

Approximately solves the DCQP by using the penalty convex-concave procedure.

Parameters: x0 (numpy.ndarray) – The initial data point for the penalty convex-concave procedure - this could be anything, however a “good” initial solution might lead to a better result.
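
Why the initial point matters can be seen in a toy version of the convex-concave procedure. The sketch below applies the basic procedure (without constraints and without the penalty terms used for a DCQP) to the one-dimensional difference-of-convex function f(x) = x^4 - 2x^2; the function, its split into g(x) = x^4 and h(x) = 2x^2, and all names are assumptions chosen for illustration.

```python
import math


def convex_concave_procedure(x0, iterations=50):
    """Minimize f(x) = x**4 - 2*x**2 by repeatedly linearizing the concave part."""
    x = x0
    for _ in range(iterations):
        slope = 4.0 * x  # derivative of h(x) = 2*x**2 at the current iterate
        # Convex subproblem: minimize x**4 - slope * x.
        # Setting the derivative 4*x**3 - slope to zero gives x = cbrt(slope / 4).
        x = math.copysign(abs(slope / 4.0) ** (1.0 / 3.0), slope)
    return x


from_positive_start = convex_concave_procedure(0.2)
from_negative_start = convex_concave_procedure(-0.2)
from_zero_start = convex_concave_procedure(0.0)
```

Starting at 0.2 or -0.2 leads to the minimizers 1 and -1, whereas starting exactly at 0 stays at 0, a stationary point that is not a minimum. This mirrors the remark that a "good" initial solution might lead to a better result.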

ceml.sklearn.qda.qda_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='auto', optimizer_args=None, return_as_dict=True, done=None)¶

Computes a counterfactual of a given input x.

Parameters

model (a sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis instance.) – The qda model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

C is ignored if no regularization is used (regularization=None).

The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

The default is “auto”.

Quadratic discriminant analysis supports the use of mathematical programs for computing counterfactuals - set optimizer to “mp” for using a semi-definite program (binary classifier) or a DCQP (otherwise) for computing the counterfactual. Note that in this case the hyperparameter C is ignored. Because the DCQP is a non-convex problem, we are not guaranteed to find the best solution (it might even happen that we do not find a solution at all) - we use the penalty convex-concave procedure for approximately solving the DCQP.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) – Not used.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.

ceml.sklearn.pipeline¶

class ceml.sklearn.pipeline.PipelineCounterfactual(model, **kwds)¶

Bases: ceml.sklearn.counterfactual.SklearnCounterfactual

Class for computing a counterfactual of a sklearn pipeline model.

See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.

build_loss(regularization, x_orig, y_target, pred, grad_mask, C, input_wrapper)¶

Build a loss function.

Overrides the build_loss method from base class ceml.sklearn.counterfactual.SklearnCounterfactual.

Parameters

regularization (str or ceml.costfunctions.costfunctions.CostFunction) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.
x_orig (numpy.array) – The original input whose prediction has to be explained.
y_target (int or float) – The requested output.
pred (callable) –
A callable that maps an input to the output.

If pred is None, the class method predict is used for mapping the input to the output.
grad_mask (numpy.array) – Gradient mask determining which dimensions can be used.
C (float or list(float)) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

C is ignored if no regularization is used (regularization=None).
input_wrapper (callable) – Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).

Returns

Initialized cost function. Target is set to y_target.

Return type

ceml.costfunctions.costfunctions.CostFunction
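
The interplay of regularization, C and the loss can be sketched in a few lines. This is a simplified illustration, not the build_loss implementation: build_cost is a hypothetical helper, grad_mask and input_wrapper are left out, and the weighting (C multiplies the regularization term) is an assumption based on C being described as the regularization strength.

```python
def l1_penalty(x, x_orig):
    """Sum of absolute deviations from the original input."""
    return sum(abs(a - b) for a, b in zip(x, x_orig))


def l2_penalty(x, x_orig):
    """Sum of squared deviations from the original input."""
    return sum((a - b) ** 2 for a, b in zip(x, x_orig))


def build_cost(regularization, x_orig, loss, C=1.0):
    """Combine a loss with an optional regularizer: C * penalty(x) + loss(x)."""
    if regularization is None:
        return loss  # no regularization: C is ignored
    penalty = {"l1": l1_penalty, "l2": l2_penalty}[regularization]
    return lambda x: C * penalty(x, x_orig) + loss(x)


def toy_loss(x):
    """Hypothetical loss that is zero when the first feature equals 3."""
    return (x[0] - 3.0) ** 2


x_orig = [0.0, 0.0]
x = [2.0, -1.0]

cost_l1 = build_cost("l1", x_orig, toy_loss, C=2.0)(x)
cost_l2 = build_cost("l2", x_orig, toy_loss, C=2.0)(x)
cost_none = build_cost(None, x_orig, toy_loss, C=2.0)(x)
```

A larger C keeps the counterfactual closer to the original input at the risk of not reaching the target, which is why a list of values for C can be useful.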

compute_counterfactual(x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='auto', optimizer_args=None, return_as_dict=True, done=None)¶

Computes a counterfactual of a given input x.

Parameters

x (numpy.ndarray) – The data point x whose prediction has to be explained.
y_target (int or float) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if the cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

If no regularization is used (regularization=None), C is ignored.

The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.

Use “auto” if you do not know what optimizer to use - a suitable optimizer is chosen automatically.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

Some models (see paper) support the use of mathematical programs for computing counterfactuals. In this case, you can use the option “mp” - please read the documentation of the corresponding model for further information.

The default is “auto”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.

If done is None, the output/prediction of the counterfactual must match y_target exactly.

The default is None.

Note

In case of a regression it might not always be possible to achieve a given output/prediction exactly.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.

rebuild_model(model)¶

Rebuild a sklearn.pipeline.Pipeline model.

Converts a sklearn.pipeline.Pipeline into a ceml.sklearn.pipeline.PipelineModel.

Parameters: model (instance of sklearn.pipeline.Pipeline) – The sklearn pipeline model.
Returns: The wrapped pipeline model.
Return type: ceml.sklearn.pipeline.PipelineModel

class ceml.sklearn.pipeline.PipelineModel(models, **kwds)¶

Bases: ceml.model.model.ModelWithLoss

Class for rebuilding/wrapping the sklearn.pipeline.Pipeline class

The PipelineModel class rebuilds a pipeline model from a given list of sklearn models.

Parameters: models (list(object)) – Ordered list of all sklearn models in the pipeline.

models¶

Ordered list of all sklearn models in the pipeline.

Type: list(object)

get_loss(y_target, pred=None)¶

Creates and returns a loss function.

Builds a cost function where the target is y_target.

Parameters

y_target (int or float) – The requested output.
pred (callable, optional) –
A callable that maps an input to the output.

If pred is None, the class method predict is used for mapping the input to the output.

The default is None.

Returns

Initialized cost function. Target is set to y_target.

Return type

ceml.costfunctions.costfunctions.CostFunction

predict(x)¶

Predicts the output of a given input.

Computes the prediction of a given input x.

Parameters: x (numpy.ndarray) – The input x.
Returns: Output of the pipeline (might be a scalar or something higher-dimensional).
Return type: numpy.array
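
Conceptually, predicting with an ordered list of models means feeding the output of each step into the next one. The sketch below shows this with plain callables standing in for the wrapped sklearn models; pipeline_predict, scale and classify are hypothetical names, and in a real sklearn pipeline the intermediate steps transform the data while the final step predicts.

```python
def pipeline_predict(steps, x):
    """Apply each step in order; the output of one step is the input of the next."""
    for step in steps:
        x = step(x)
    return x


def scale(x):
    """Hypothetical preprocessing step: subtract 1 and divide by 2 per feature."""
    return [(xi - 1.0) / 2.0 for xi in x]


def classify(z):
    """Hypothetical final step: class 1 iff the scaled features sum to more than 0."""
    return 1 if sum(z) > 0.0 else 0


steps = [scale, classify]
prediction_a = pipeline_predict(steps, [3.0, 1.0])
prediction_b = pipeline_predict(steps, [0.0, 0.0])
```

A counterfactual for a pipeline is searched in the original input space, so every candidate has to pass through all preprocessing steps before the final prediction is compared with y_target.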

ceml.sklearn.pipeline.pipeline_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='nelder-mead', optimizer_args=None, return_as_dict=True, done=None)¶

Computes a counterfactual of a given input x.

Parameters

model (a sklearn.pipeline.Pipeline instance.) – The pipeline model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

C is ignored if no regularization is used (regularization=None).

The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.

Use “auto” if you do not know what optimizer to use - a suitable optimizer is chosen automatically.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

The default is “nelder-mead”.

Some models (see paper) support the use of mathematical programs for computing counterfactuals. In this case, you can use the option “mp” - please read the documentation of the corresponding model for further information.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.

If done is None, the output/prediction of the counterfactual must match y_target exactly.

The default is None.

Note

In case of a regression it might not always be possible to achieve a given output/prediction exactly.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.
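
For regression pipelines, a done callable that accepts a range of predictions is often more practical than an exact match. The following sketch shows the acceptance logic described above; within_tolerance and is_accepted are hypothetical helpers and the numbers are made up for illustration.

```python
def within_tolerance(y_target, tol):
    """Build a done-style callable that accepts predictions close to y_target."""
    return lambda y_pred: abs(y_pred - y_target) <= tol


def is_accepted(y_pred, y_target, done=None):
    """Acceptance rule: exact match if done is None, otherwise ask the callable."""
    if done is None:
        return y_pred == y_target
    return done(y_pred)


done = within_tolerance(25.0, tol=0.1)

exact_match_required = is_accepted(24.97, 25.0)          # done is None
accepted_with_tolerance = is_accepted(24.97, 25.0, done)
rejected_with_tolerance = is_accepted(25.5, 25.0, done)
```

A callable such as done = within_tolerance(25.0, tol=0.1) would then be passed as the done argument, so that any prediction between 24.9 and 25.1 counts as a valid counterfactual.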

ceml.sklearn.randomforest¶

class ceml.sklearn.randomforest.EnsembleVotingCost(models, y_target, input_wrapper=None, epsilon=0, **kwds)¶

Bases: ceml.costfunctions.costfunctions.CostFunction

Loss function of an ensemble of models.

The loss is the negative fraction of models that predict the correct output.

Parameters

models (list(object)) – List of models
y_target (int, float or a callable that returns True if a given prediction is accepted.) – The requested prediction.
input_wrapper (callable, optional) –
Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).

The default is None.

score_impl(x)¶: Implementation of the loss function.
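
The loss described above (the negative fraction of models that predict the requested output) can be written down directly. The sketch below is not the score_impl implementation: the models are plain callables rather than fitted sklearn estimators, the epsilon and input_wrapper parameters are left out, and all names are assumptions.

```python
def ensemble_voting_cost(models, y_target, x):
    """Negative fraction of ensemble members whose prediction is accepted."""
    # y_target may be a fixed value or a callable that accepts or rejects a prediction.
    accept = y_target if callable(y_target) else (lambda y: y == y_target)
    votes = sum(1 for model in models if accept(model(x)))
    return -votes / len(models)


# Three hypothetical decision stumps standing in for the trees of a forest.
models = [
    lambda x: int(x[0] > 0.0),
    lambda x: int(x[1] > 0.0),
    lambda x: int(x[0] + x[1] > 1.0),
]

cost_one_vote = ensemble_voting_cost(models, 1, [1.0, -1.0])
cost_all_votes = ensemble_voting_cost(models, 1, [1.0, 1.0])
cost_callable = ensemble_voting_cost(models, lambda y: y >= 1, [1.0, 1.0])
```

The cost is piecewise constant in x, so it provides no useful gradient; this is consistent with the gradient-free default optimizer “nelder-mead” used for random forests.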

class ceml.sklearn.randomforest.RandomForest(model, **kwds)¶

Bases: ceml.model.model.ModelWithLoss

Class for rebuilding/wrapping the sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor class.

Parameters: model (instance of sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor) – The random forest model.
Raises: TypeError – If model is not an instance of sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor

get_loss(y_target, input_wrapper=None)¶

Creates and returns a loss function.

Parameters

y_target (int, float or a callable that returns True if a given prediction is accepted.) – The requested prediction.
input_wrapper (callable) – Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).

Returns

Initialized loss function. The target output is y_target.

Return type

ceml.sklearn.randomforest.EnsembleVotingCost
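To make the role of input_wrapper more concrete, here is a minimal sketch of a wrapper that re-inserts the excluded features before the model is evaluated. The helper make_input_wrapper is hypothetical and only illustrates the idea described above; how ceml builds its wrapper internally is not shown in this documentation.

```python
def make_input_wrapper(x_orig, features_whitelist):
    """Build a wrapper that expands a reduced input to the full input space.

    The optimizer only sees the whitelisted features; all other features
    are filled in from the original input `x_orig`.
    """
    def wrapper(z):
        x = list(x_orig)  # copy, so the original input is not modified
        for value, idx in zip(z, features_whitelist):
            x[idx] = value
        return x
    return wrapper


# Only features 0 and 2 may change; feature 1 stays fixed at 2.0.
wrapper = make_input_wrapper([1.0, 2.0, 3.0], [0, 2])
full_input = wrapper([9.0, 8.0])
```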

predict(x)¶

Predict the output of a given input.

Computes the class label (classifier) or the predicted value (regressor) of a given input x.

Parameters: x (numpy.ndarray) – The input x that is going to be classified.
Returns: Prediction.
Return type: int or float

class ceml.sklearn.randomforest.RandomForestCounterfactual(model, **kwds)¶

Bases: ceml.sklearn.counterfactual.SklearnCounterfactual

Class for computing a counterfactual of a random forest model.

See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.

build_loss(regularization, x_orig, y_target, pred, grad_mask, C, input_wrapper)¶: Build the (non-differentiable) cost function: Regularization + Loss

compute_counterfactual(x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='nelder-mead', optimizer_args=None, return_as_dict=True, done=None)¶

Computes a counterfactual of a given input x.

Parameters

x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if the cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

If no regularization is used (regularization=None), C is ignored.

The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

The default is “nelder-mead”.

Note

The cost function of a random forest model is not differentiable, so a gradient-based optimization algorithm cannot be used.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.

If done is None, the output/prediction of the counterfactual must match y_target exactly.

The default is None.

Note

In case of a regression it might not always be possible to achieve a given output/prediction exactly.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.
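The behaviour described for a list-valued C (try each value in order and stop at the first success) can be sketched as follows. The function first_successful and the solve callable are hypothetical stand-ins; solve is assumed to return None when no counterfactual is found for a given regularization strength.

```python
def first_successful(c_values, solve):
    """Try each regularization strength in order; return the first success.

    `solve(c)` is a hypothetical callable that returns a counterfactual
    or None if none was found for the strength `c`.
    """
    if not isinstance(c_values, (list, tuple)):
        c_values = [c_values]  # a single float is treated as a one-item list
    for c in c_values:
        result = solve(c)
        if result is not None:
            return c, result  # remaining values of C are not tried
    raise Exception("No counterfactual found")


tried = []

def toy_solve(c):
    # Pretend that only sufficiently weak regularization succeeds.
    tried.append(c)
    return [0.0, 1.0] if c <= 0.1 else None

used_c, x_cf = first_successful([1.0, 0.1, 0.01], toy_solve)
```

In this toy run the value 0.01 is never evaluated, because 0.1 already yields a counterfactual.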

rebuild_model(model)¶

Rebuilds a sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor model.

Converts a sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor instance into a ceml.sklearn.randomforest.RandomForest instance.

Parameters: model (instance of sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor) – The sklearn random forest model.
Returns: The wrapped random forest model.
Return type: ceml.sklearn.randomforest.RandomForest

ceml.sklearn.randomforest.randomforest_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='nelder-mead', optimizer_args=None, return_as_dict=True, done=None)¶

Computes a counterfactual of a given input x.

Parameters

model (a sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor instance.) – The random forest model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

C is ignored if no regularization is used (regularization=None).

The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

The default is “nelder-mead”.

Note

The cost function of a random forest model is not differentiable, so a gradient-based optimization algorithm cannot be used.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) – Not used.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.
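The note on the optimizer parameter can be illustrated with a toy example: the loss of a tree-based model is piecewise constant, so a numerical gradient is zero almost everywhere and carries no search direction. The sketch below uses a single hypothetical decision stump and a simple grid search as a stand-in for a derivative-free method; it is not Nelder-Mead and not the code used by ceml.

```python
def stump_loss(x, threshold=0.5):
    """Loss of a single decision stump: 0 on the target side, 1 otherwise."""
    return 0.0 if x > threshold else 1.0


def central_difference(f, x, h=1e-6):
    """Numerical estimate of the derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def grid_search(f, start, stop, steps):
    """Derivative-free search: evaluate f on a grid and keep the best point."""
    best_x, best_val = start, f(start)
    for i in range(1, steps + 1):
        x = start + (stop - start) * i / steps
        val = f(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val


# The gradient estimate is zero away from the threshold ...
gradient_estimate = central_difference(stump_loss, 0.2)
# ... while a derivative-free search still finds a point with zero loss.
best_x, best_loss = grid_search(stump_loss, 0.0, 1.0, 10)
```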

ceml.sklearn.isolationforest¶

class ceml.sklearn.isolationforest.IsolationForest(model, **kwds)¶

Bases: ceml.model.model.ModelWithLoss

Class for rebuilding/wrapping the sklearn.ensemble.IsolationForest class.

Parameters: model (instance of sklearn.ensemble.IsolationForest) – The isolation forest model.
Raises: TypeError – If model is not an instance of sklearn.ensemble.IsolationForest

get_loss(y_target, input_wrapper=None)¶

Creates and returns a loss function.

Parameters

y_target (int) – The target class - either +1 or -1
input_wrapper (callable) – Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).

Returns

Initialized loss function. Target label is y_target.

Return type

ceml.sklearn.isolationforest.IsolationForestCost

predict(x)¶

Predict the output of a given input.

Computes the class label of a given input x.

Parameters: x (numpy.ndarray) – The input x that is going to be classified.
Returns: Prediction.
Return type: int

class ceml.sklearn.isolationforest.IsolationForestCost(models, y_target, input_wrapper=None, epsilon=0, **kwds)¶

Bases: ceml.costfunctions.costfunctions.CostFunction

Loss function of an isolation forest.

The loss is the negative averaged length of the decision paths.

Parameters

models (list(object)) – List of decision trees.
y_target (int) – The requested prediction - either -1 or +1.
input_wrapper (callable, optional) –
Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).

The default is None.

score_impl(x)¶: Implementation of the loss function.
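A minimal sketch of the quantity named above, the negative averaged length of the decision paths, is given below. The trees are represented by hypothetical callables that return the path length of an input; how ceml extracts path lengths from the fitted trees and how y_target and epsilon enter the loss is not covered here.

```python
def negative_mean_path_length(path_length_fns, x):
    """Negative average decision path length of x over all trees.

    `path_length_fns` is assumed to be a list of callables, one per tree,
    each returning the length of the decision path of x in that tree.
    """
    lengths = [path_length(x) for path_length in path_length_fns]
    return -sum(lengths) / len(lengths)


# Three toy trees with fixed path lengths for illustration.
toy_trees = [lambda x: 2, lambda x: 4, lambda x: 6]
loss_value = negative_mean_path_length(toy_trees, [0.0, 0.0])
```

In an isolation forest, outliers tend to have short decision paths, so minimizing this loss moves the input towards regions the forest regards as inliers.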

class ceml.sklearn.isolationforest.IsolationForestCounterfactual(model, **kwds)¶

Bases: ceml.sklearn.counterfactual.SklearnCounterfactual

Class for computing a counterfactual of an isolation forest model.

See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.

compute_counterfactual(x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='nelder-mead', optimizer_args=None, return_as_dict=True, done=None)¶

Computes a counterfactual of a given input x.

Parameters

x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if the cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

If no regularization is used (regularization=None), C is ignored.

The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.desc_to_optim() for details.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

The default is “nelder-mead”.

Note

The cost function of an isolation forest model is not differentiable, so a gradient-based optimization algorithm cannot be used.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.

If done is None, the output/prediction of the counterfactual must match y_target exactly.

The default is None.

Note

In case of a regression it might not always be possible to achieve a given output/prediction exactly.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.

rebuild_model(model)¶

Rebuilds a sklearn.ensemble.IsolationForest model.

Converts a sklearn.ensemble.IsolationForest into a ceml.sklearn.isolationforest.IsolationForest.

Parameters: model (instance of sklearn.ensemble.IsolationForest) – The sklearn isolation forest model.
Returns: The wrapped isolation forest model.
Return type: ceml.sklearn.isolationforest.IsolationForest

ceml.sklearn.isolationforest.isolationforest_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='nelder-mead', optimizer_args=None, return_as_dict=True)¶

Computes a counterfactual of a given input x.

Parameters

model (a sklearn.ensemble.IsolationForest instance.) – The isolation forest model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int) – The requested prediction of the counterfactual - either -1 or +1.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

C is ignored if no regularization is used (regularization=None).

The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.desc_to_optim() for details.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

The default is “nelder-mead”.

Note

The cost function of an isolation forest model is not differentiable, so a gradient-based optimization algorithm cannot be used.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.

The default is True.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.
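The two return formats controlled by return_as_dict can be sketched as follows. The helper pack_result is hypothetical; only the keys 'x_cf', 'y_cf' and 'delta' and the order of the triple are taken from the description above, and the toy values are made up for illustration.

```python
def pack_result(x_cf, y_cf, delta, return_as_dict=True):
    """Return the result either as a dictionary or as a triple."""
    if return_as_dict:
        return {"x_cf": x_cf, "y_cf": y_cf, "delta": delta}
    return x_cf, y_cf, delta


# Toy values standing in for the output of a counterfactual computation.
as_dict = pack_result([0.5, 1.0], 1, [0.5, 0.0])
as_triple = pack_result([0.5, 1.0], 1, [0.5, 0.0], return_as_dict=False)

# The triple can be unpacked directly.
x_cf, y_cf, delta = as_triple
```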

ceml.sklearn.softmaxregression¶

class ceml.sklearn.softmaxregression.SoftmaxCounterfactual(model, **kwds)¶

Bases: ceml.sklearn.counterfactual.SklearnCounterfactual, ceml.optim.cvx.MathematicalProgram, ceml.optim.cvx.ConvexQuadraticProgram, ceml.optim.cvx.PlausibleCounterfactualOfHyperplaneClassifier

Class for computing a counterfactual of a softmax regression model.

See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.

rebuild_model(model)¶

Rebuilds a sklearn.linear_model.LogisticRegression model.

Converts a sklearn.linear_model.LogisticRegression into a ceml.sklearn.softmaxregression.SoftmaxRegression.

Parameters: model (instance of sklearn.linear_model.LogisticRegression) – The sklearn softmax regression model.
Returns: The wrapped softmax regression model.
Return type: ceml.sklearn.softmaxregression.SoftmaxRegression

class ceml.sklearn.softmaxregression.SoftmaxRegression(model, **kwds)¶

Bases: ceml.model.model.ModelWithLoss

Class for rebuilding/wrapping the sklearn.linear_model.LogisticRegression class.

The SoftmaxRegression class rebuilds a softmax regression model from a given weight vector and intercept.

Parameters: model (instance of sklearn.linear_model.LogisticRegression) – The softmax regression model.

w¶

The weight vector (a matrix if we have more than two classes).

Type: numpy.ndarray

b¶

The intercept/bias (a vector if we have more than two classes).

Type: numpy.ndarray

dim¶

Dimensionality of the input data.

Type: int

is_multiclass¶

True if model is a multiclass classifier (more than two classes), False otherwise.

Type: boolean

Raises: TypeError – If model is not an instance of sklearn.linear_model.LogisticRegression

get_loss(y_target, pred=None)¶

Creates and returns a loss function.

Builds a negative-log-likelihood cost function where the target is y_target.

Parameters

y_target (int) – The target class.
pred (callable, optional) –
A callable that maps an input to the output (class probabilities).

If pred is None, the method predict of this class is used for mapping the input to the output (class probabilities).

The default is None.

Returns

Initialized negative-log-likelihood cost function. Target label is y_target.

Return type

ceml.backend.jax.costfunctions.NegLogLikelihoodCost

predict(x)¶

Predict the output of a given input.

Computes the class probabilities for a given input x.

Parameters: x (numpy.ndarray) – The input x that is going to be classified.
Returns: An array containing the class probabilities.
Return type: jax.numpy.array
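The relation between the attributes w and b, the class probabilities returned by predict and the negative-log-likelihood loss can be sketched in plain Python. This is a simplified illustration, not the jax-based implementation of the class; the function names softmax_predict and neg_log_likelihood are hypothetical.

```python
import math


def softmax_predict(w, b, x):
    """Class probabilities of a softmax regression model.

    `w` holds one weight vector per class, `b` one intercept per class.
    """
    scores = [sum(wi * xi for wi, xi in zip(row, x)) + bi
              for row, bi in zip(w, b)]
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def neg_log_likelihood(probs, y_target):
    """Negative log-likelihood of the target class."""
    return -math.log(probs[y_target])


# Toy model with three classes and two input features.
w = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
b = [0.0, 0.0, 0.0]
probs = softmax_predict(w, b, [0.0, 0.0])
loss_value = neg_log_likelihood(probs, 1)
```

For the all-zero input every class receives the same score, so each probability is 1/3 and the loss equals log(3).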

ceml.sklearn.softmaxregression.softmaxregression_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='mp', optimizer_args=None, return_as_dict=True, done=None, plausibility=None)¶

Computes a counterfactual of a given input x.

Parameters

model (a sklearn.linear_model.LogisticRegression instance.) –
The softmax regression model that is used for computing the counterfactual.

Note: model.multi_class must be set to multinomial.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

If features_whitelist is None, all features can be used.

The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.

Supported values:
- l1: Penalizes the absolute deviation.
- l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.

If regularization is None, no regularization is used.

The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.

C is ignored if no regularization is used (regularization=None).

The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.

Softmax regression supports the use of mathematical programs for computing counterfactuals - set optimizer to “mp” for using a convex quadratic program for computing the counterfactual. Note that in this case the hyperparameter C is ignored.

As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.

The default is “mp”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.

The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.

The default is True.
done (callable, optional) – Not used.
plausibility (dict, optional) –
If set to a valid dictionary (see ceml.sklearn.plausibility.prepare_computation_of_plausible_counterfactuals()), a plausible counterfactual (as proposed in Artelt et al. 2020) is computed. Note that in this case, all other parameters are ignored.

If plausibility is None, the closest counterfactual is computed.

The default is None.

Returns

A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.

(x_cf, y_cf, delta) : triple if return_as_dict is False

Return type

dict or triple

Raises

Exception – If no counterfactual was found.
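To give an intuition for the mathematical program behind the "mp" option, consider the simplest case: a two-class linear model with a single decision hyperplane and an l2 penalty. Minimizing the squared distance to x subject to one linear constraint then has a closed-form solution, sketched below. This is an illustrative special case under those assumptions, not the program solved by ceml; the function name and the margin parameter are hypothetical.

```python
def closest_point_across_hyperplane(w, b, x, margin=1e-6):
    """Smallest l2 change to x such that w.x + b >= margin.

    Solves: minimize ||x_cf - x||^2 subject to w.x_cf + b >= margin.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    if score >= margin:
        return list(x)  # x already satisfies the constraint
    norm_sq = sum(wi * wi for wi in w)
    step = (margin - score) / norm_sq
    # Move along the normal vector w until the constraint becomes active.
    return [xi + step * wi for wi, xi in zip(w, x)]


# Hyperplane x1 + x2 - 1 = 0; the origin lies on the negative side.
w, b = [1.0, 1.0], -1.0
x = [0.0, 0.0]
x_cf = closest_point_across_hyperplane(w, b, x)
new_score = sum(wi * xi for wi, xi in zip(w, x_cf)) + b
```

With more than two classes the target class has to win against every other class, which yields several linear constraints and generally requires a numerical solver.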

ceml.sklearn.utils¶

ceml.sklearn.utils.build_regularization_loss(regularization, x, input_wrapper=None)¶

Build a regularization loss.

Parameters

regularization (str, ceml.costfunctions.costfunctions.CostFunction or None) –
Description of the regularization, instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.

See ceml.sklearn.utils.desc_to_regcost() for a list of supported descriptions.

If no regularization is requested, an instance of ceml.backend.jax.costfunctions.costfunctions.DummyCost is returned. This cost function always outputs zero, no matter what the input is.
x (numpy.ndarray) – The original input from which we do not want to deviate much.
input_wrapper (callable, optional) –
Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).

If input_wrapper is None, the input is passed without any modifications.

The default is None.

Returns

An instance of ceml.costfunctions.costfunctions.CostFunction or the user defined, callable, regularization.

Return type

callable

Raises

TypeError – If regularization has an invalid type.

ceml.sklearn.utils.desc_to_dist(desc)¶

Converts a description of a distance metric into a jax.numpy function.

Supported descriptions:

l1: l1-norm

l2: l2-norm

Parameters: desc (str) – Description of the distance metric.
Returns: The distance function implemented as a jax.numpy function.
Return type: callable
Raises: ValueError – If desc contains an invalid description.

ceml.sklearn.utils.desc_to_regcost(desc, x, input_wrapper)¶

Converts a description of a regularization into a jax.numpy function.

Supported descriptions:

l1: l1-regularization

l2: l2-regularization

Parameters

desc (str) – Description of the regularization.
x (numpy.ndarray) – The original input from which we do not want to deviate much.
input_wrapper (callable) – Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).

Returns

The regularization function implemented as a jax.numpy function.

Return type

callable

Raises

ValueError – If desc contains an invalid description.
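The mapping from a description to a regularization function can be sketched in plain Python. The sketch follows the wording used for the regularization parameter (l1 penalizes the absolute deviation, l2 the squared deviation) and raises ValueError for unknown descriptions. It omits input_wrapper and does not use jax.numpy; the function name make_regularization is hypothetical.

```python
def make_regularization(desc, x_orig):
    """Return a function that penalizes the deviation from `x_orig`."""
    if desc == "l1":
        # Sum of absolute deviations.
        return lambda x: sum(abs(a - b) for a, b in zip(x, x_orig))
    if desc == "l2":
        # Sum of squared deviations.
        return lambda x: sum((a - b) ** 2 for a, b in zip(x, x_orig))
    raise ValueError(f"Invalid description: {desc}")


x_orig = [1.0, 2.0]
l1_cost = make_regularization("l1", x_orig)([2.0, 0.0])
l2_cost = make_regularization("l2", x_orig)([2.0, 0.0])

# An unknown description is rejected.
try:
    make_regularization("l3", x_orig)
    rejected = False
except ValueError:
    rejected = True
```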