ceml.sklearn
ceml.sklearn.counterfactual
- class ceml.sklearn.counterfactual.SklearnCounterfactual(model, **kwds)
Bases:
ceml.model.counterfactual.Counterfactual
,abc.ABC
Base class for computing a counterfactual of a sklearn model.
The
SklearnCounterfactual
class can compute counterfactuals of sklearn models.
- Parameters
model (object) – The sklearn model that is used for computing the counterfactual.
- model
An instance of a sklearn model.
- Type
object
- mymodel
Rebuild model.
- Type
instance of
ceml.model.ModelWithLoss
Note
The class
SklearnCounterfactual
cannot be instantiated because it contains an abstract method.
- compute_counterfactual(x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='auto', optimizer_args=None, return_as_dict=True, done=None)
Computes a counterfactual of a given input x.
- Parameters
x (numpy.ndarray) – The data point x whose prediction has to be explained.
y_target (int or float) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or
ceml.costfunctions.costfunctions.CostFunction
, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of
ceml.costfunctions.costfunctions.CostFunction
(or ceml.costfunctions.costfunctions.CostFunctionDifferentiable
if the cost function is differentiable) or None if no regularization is requested.
If regularization is None, no regularization is used.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.
If no regularization is used (regularization=None), C is ignored.
The default is 1.0.
optimizer (str or instance of
ceml.optim.optimizer.Optimizer
, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See
ceml.optim.optimizer.prepare_optim()
for details.
Use “auto” if you do not know what optimizer to use - a suitable optimizer is chosen automatically.
As an alternative, we can use any (custom) optimizer that is derived from the
ceml.optim.optimizer.Optimizer
class.
Some models (see paper) support the use of mathematical programs for computing counterfactuals. In this case, you can use the option “mp” - please read the documentation of the corresponding model for further information.
The default is “auto”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.
If done is None, the output/prediction of the counterfactual must match y_target exactly.
The default is None.
Note
In case of a regression it might not always be possible to achieve a given output/prediction exactly.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
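The interplay of C, done and the raised exception described above can be sketched in plain Python. This is only an illustration of the documented behaviour, not ceml's implementation: `search_over_C` and `toy_solver` are hypothetical names, and the toy solver simply assumes that a larger C keeps the candidate closer to the original input.

```python
def search_over_C(solve, C, y_target, done=None):
    """Try each regularization strength in turn; return the first accepted result.

    `solve(c)` is a hypothetical callable that returns a pair (x_cf, y_cf).
    """
    # A scalar C is treated like a one-element list.
    strengths = C if isinstance(C, list) else [C]
    # Without a `done` callable, the prediction must match y_target exactly.
    accept = done if done is not None else (lambda y_cf: y_cf == y_target)

    for c in strengths:
        x_cf, y_cf = solve(c)
        if accept(y_cf):
            # Stop at the first accepted counterfactual; later values are not tried.
            return {"x_cf": x_cf, "y_cf": y_cf, "C": c}
    raise Exception("No counterfactual found")


def toy_solver(c):
    # Assumption for illustration: a strong penalty keeps the candidate small,
    # so the toy prediction only flips to 1 once c is small enough.
    x_cf = [1.0 / c]
    y_cf = 1 if x_cf[0] >= 2.0 else 0
    return x_cf, y_cf


result = search_over_C(toy_solver, C=[1.0, 0.5, 0.1], y_target=1)
print(result)
```

Here the value 0.1 is never tried, because 0.5 already yields an accepted counterfactual.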
- abstract rebuild_model(model)
Rebuilds a sklearn model.
Converts a sklearn model into a ceml.model.ModelWithLoss instance so that we have a model-specific cost function and can compute the derivative with respect to the input.
- Parameters
model – The sklearn model that is used for computing the counterfactual.
- Returns
The wrapped model
- Return type
ceml.model.ModelWithLoss
ceml.sklearn.plausibility
- ceml.sklearn.plausibility.prepare_computation_of_plausible_counterfactuals(X, y, gmms, projection_mean_sub=None, projection_matrix=None, density_thresholds=None)
Computes all steps that are independent of a concrete sample when computing a plausible counterfactual explanation. Because computing a plausible counterfactual involves a considerable amount of computation that does not depend on the concrete sample we want to explain, it makes sense to precompute as much as possible (reducing redundant computations).
- Parameters
X (numpy.ndarray) – Data points.
y (numpy.ndarray) – Labels of data points X. Assumed to be [0, 1, 2, …].
gmms (list(int)) – List of class dependent Gaussian Mixture Models (GMMs).
projection_mean_sub (numpy.ndarray, optional) –
The negative bias of the affine preprocessing.
The default is None.
projection_matrix (numpy.ndarray, optional) –
The projection matrix of the affine preprocessing.
The default is None.
density_thresholds (float, optional) –
Density threshold at which we consider a counterfactual to be plausible.
If no density threshold is specified (density_thresholds is set to None), the median density of the samples X is chosen as a threshold.
The default is None.
- Returns
All precomputable quantities needed for the computation of plausible counterfactuals.
- Return type
dict
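The default threshold rule (median density of the samples when no threshold is given) can be illustrated with a small standard-library sketch. The function names and the density values are made up for illustration, and treating "plausible" as "density at least as large as the threshold" is an assumption rather than something the description above states.

```python
from statistics import median


def default_density_threshold(densities):
    """Fallback threshold: the median density of the given samples."""
    return median(densities)


def is_plausible(density, threshold):
    # Assumption: a point counts as plausible if its density reaches the threshold.
    return density >= threshold


# Hypothetical densities of five training samples under a class-wise GMM.
sample_densities = [0.02, 0.10, 0.05, 0.40, 0.07]
threshold = default_density_threshold(sample_densities)
print(threshold, is_plausible(0.10, threshold), is_plausible(0.05, threshold))
```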
ceml.sklearn.decisiontree
- class ceml.sklearn.decisiontree.DecisionTreeCounterfactual(model, **kwds)
Bases:
ceml.sklearn.counterfactual.SklearnCounterfactual
,ceml.sklearn.decisiontree.PlausibleCounterfactualOfDecisionTree
Class for computing a counterfactual of a decision tree model.
See parent class
ceml.sklearn.counterfactual.SklearnCounterfactual
.
- compute_all_counterfactuals(x, y_target, features_whitelist=None, regularization='l1')
Computes all counterfactuals of a given input x.
- Parameters
model (a
sklearn.tree.DecisionTreeClassifier
or sklearn.tree.DecisionTreeRegressor
instance.) – The decision tree model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction is supposed to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or callable, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
You can use your own custom penalty function by setting regularization to a callable that can be called on a potential counterfactual and returns a scalar.
If regularization is None, no regularization is used.
The default is “l1”.
- Returns
List of all counterfactuals.
- Return type
list(np.array)
- Raises
TypeError – If an invalid argument is passed to the function.
ValueError – If no counterfactual exists.
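To make the role of a callable regularization more concrete, the following sketch ranks a few candidate counterfactuals by different penalties. It is a plain-Python illustration, not ceml code: the helper names (`l1_penalty`, `l2_penalty`, `weighted_l1_penalty`) and the candidate points are hypothetical. Each helper returns a callable that takes a potential counterfactual and returns a scalar, which is the contract described above.

```python
def l1_penalty(x_orig):
    """'l1': sum of absolute deviations from the original input."""
    return lambda x_cf: sum(abs(c - o) for c, o in zip(x_cf, x_orig))


def l2_penalty(x_orig):
    """'l2': sum of squared deviations from the original input."""
    return lambda x_cf: sum((c - o) ** 2 for c, o in zip(x_cf, x_orig))


def weighted_l1_penalty(x_orig, weights):
    """Custom penalty: changing some features is more expensive than others."""
    return lambda x_cf: sum(w * abs(c - o) for w, c, o in zip(weights, x_cf, x_orig))


x_orig = [1.0, 2.0]
candidates = [[1.0, 5.0], [2.0, 3.0], [1.5, 2.0]]

by_l1 = sorted(candidates, key=l1_penalty(x_orig))
by_l2 = sorted(candidates, key=l2_penalty(x_orig))
# Make changes to feature 0 ten times as costly as changes to feature 1.
by_custom = sorted(candidates, key=weighted_l1_penalty(x_orig, [10.0, 1.0]))
print(by_l1[0], by_custom[0])
```

With the custom weights, the candidate that leaves feature 0 untouched becomes the cheapest one, although it is the most distant under plain l1.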
- compute_counterfactual(x, y_target, features_whitelist=None, regularization='l1', C=None, optimizer=None, return_as_dict=True)
Computes a counterfactual of a given input x.
- Parameters
model (a
sklearn.tree.DecisionTreeClassifier
or sklearn.tree.DecisionTreeRegressor
instance.) – The decision tree model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction is supposed to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or callable, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
You can use your own custom penalty function by setting regularization to a callable that can be called on a potential counterfactual and returns a scalar.
If regularization is None, no regularization is used.
The default is “l1”.
C (None) –
Not used - is always None.
The only reason for including this parameter is to match the signature of other
ceml.sklearn.counterfactual.SklearnCounterfactual
children.
optimizer (None) –
Not used - is always None.
The only reason for including this parameter is to match the signature of other
ceml.sklearn.counterfactual.SklearnCounterfactual
children.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.
The default is True.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- rebuild_model(model)
Rebuilds a
sklearn.tree.DecisionTreeClassifier
or sklearn.tree.DecisionTreeRegressor
model.
Does nothing.
- Parameters
model (instance of
sklearn.tree.DecisionTreeClassifier
or sklearn.tree.DecisionTreeRegressor
) – The sklearn decision tree model.
- Returns
- Return type
None
Note
In contrast to many other
SklearnCounterfactual
instances, we do not rebuild the model because we neither need nor can compute gradients in a decision tree. We compute the set of counterfactuals without using a “common” optimization algorithm like Nelder-Mead.
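One gradient-free way to obtain counterfactuals of a decision tree is to enumerate the leaves that predict the target and, for each such leaf, move x just far enough to satisfy every split on the path to it. The sketch below illustrates that idea on a hand-written toy tree; it is not taken from ceml's source, and the leaf representation (a list of `(feature, op, threshold)` conditions plus a prediction), the function names and the margin `eps` are all assumptions made for the example.

```python
def closest_point_in_leaf(x, path, eps=1e-5):
    """Change x as little as possible so that it satisfies every split on `path`."""
    x_cf = list(x)
    for feature, op, threshold in path:
        if op == "<=" and x_cf[feature] > threshold:
            x_cf[feature] = threshold
        elif op == ">" and x_cf[feature] <= threshold:
            # Strict inequality: step slightly past the threshold.
            x_cf[feature] = threshold + eps
    return x_cf


def l1_distance(a, b):
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def all_counterfactuals(leaves, x, y_target):
    """Return one candidate per target leaf, sorted by l1 distance to x."""
    candidates = [closest_point_in_leaf(x, path)
                  for path, prediction in leaves if prediction == y_target]
    if not candidates:
        raise ValueError("No counterfactual exists")
    return sorted(candidates, key=lambda x_cf: l1_distance(x_cf, x))


# Toy tree with three leaves: (conditions on the path, predicted label).
toy_leaves = [
    ([(0, "<=", 2.0), (1, "<=", 0.0)], 1),
    ([(0, "<=", 2.0), (1, ">", 0.0)], 0),
    ([(0, ">", 2.0)], 1),
]

x = [1.5, 3.0]  # falls into the second leaf, so the tree predicts 0
counterfactuals = all_counterfactuals(toy_leaves, x, y_target=1)
closest = counterfactuals[0]
print(closest)
```

Because each leaf corresponds to an axis-aligned box, clipping every coordinate independently gives the closest point in that box for both the l1 and the l2 penalty (up to the small margin `eps`).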
- ceml.sklearn.decisiontree.decisiontree_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', return_as_dict=True, done=None, plausibility=None)
Computes a counterfactual of a given input x.
- Parameters
model (a
sklearn.tree.DecisionTreeClassifier
or sklearn.tree.DecisionTreeRegressor
instance.) – The decision tree model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or callable, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
You can use your own custom penalty function by setting regularization to a callable that can be called on a potential counterfactual and returns a scalar.
If regularization is None, no regularization is used.
The default is “l1”.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) – Not used.
plausibility (dict, optional.) –
If set to a valid dictionary (see
ceml.sklearn.plausibility.prepare_computation_of_plausible_counterfactuals()
), a plausible counterfactual (as proposed in Artelt et al. 2020) is computed. Note that in this case, all other parameters are ignored.
If plausibility is None, the closest counterfactual is computed.
The default is None.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
ceml.sklearn.knn
- class ceml.sklearn.knn.KNN(model, dist='l2', **kwds)
Bases:
ceml.model.model.ModelWithLoss
Class for rebuilding/wrapping the
sklearn.neighbors.KNeighborsClassifier
and sklearn.neighbors.KNeighborsRegressor
classes.
The
KNN
class rebuilds a sklearn knn model.
- Parameters
model (instance of
sklearn.neighbors.KNeighborsClassifier
or sklearn.neighbors.KNeighborsRegressor
) – The knn model.
dist (str or callable, optional) –
Computes the distance between a prototype and a data point.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
You can use your own custom distance function by setting dist to a callable that can be called on a data point and returns a scalar.
The default is “l2”.
Note: dist must not be None.
- X
The training data set.
- Type
numpy.array
- y
The ground truth of the training data set.
- Type
numpy.array
- dist
The distance function.
- Type
callable
- Raises
TypeError – If model is not an instance of
sklearn.neighbors.KNeighborsClassifier
or sklearn.neighbors.KNeighborsRegressor
- get_loss(y_target, pred=None)
Creates and returns a loss function.
Builds a cost function where we penalize the minimum distance to the nearest prototype which is consistent with the target y_target.
- Parameters
y_target (int) – The target class.
pred (callable, optional) –
A callable that maps an input to an input. E.g. using the
ceml.optim.input_wrapper.InputWrapper
class.
If pred is None, no transformation is applied to the input before passing it into the loss function.
The default is None.
- Returns
Initialized cost function. Target label is y_target.
- Return type
ceml.backend.jax.costfunctions.TopKMinOfListDistCost
- predict(x)
Note
This function is a placeholder only.
This function does not predict anything and just returns the given input.
- class ceml.sklearn.knn.KnnCounterfactual(model, dist='l2', **kwds)
Bases:
ceml.sklearn.counterfactual.SklearnCounterfactual
Class for computing a counterfactual of a knn model.
See parent class
ceml.sklearn.counterfactual.SklearnCounterfactual
.
- rebuild_model(model)
Rebuilds a
sklearn.neighbors.KNeighborsClassifier
or sklearn.neighbors.KNeighborsRegressor
model.
Converts a
sklearn.neighbors.KNeighborsClassifier
or sklearn.neighbors.KNeighborsRegressor
instance into a
ceml.sklearn.knn.KNN
instance.
- Parameters
model (instance of
sklearn.neighbors.KNeighborsClassifier
or sklearn.neighbors.KNeighborsRegressor
) – The sklearn knn model.
- Returns
The wrapped knn model.
- Return type
ceml.sklearn.knn.KNN
- ceml.sklearn.knn.knn_generate_counterfactual(model, x, y_target, features_whitelist=None, dist='l2', regularization='l1', C=1.0, optimizer='nelder-mead', optimizer_args=None, return_as_dict=True, done=None)
Computes a counterfactual of a given input x.
- Parameters
model (a
sklearn.neighbors.KNeighborsClassifier
or sklearn.neighbors.KNeighborsRegressor
instance.) – The knn model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
dist (str or callable, optional) –
Computes the distance between a prototype and a data point.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
You can use your own custom distance function by setting dist to a callable that can be called on a data point and returns a scalar.
The default is “l2”.
Note: dist must not be None.
regularization (str or
ceml.costfunctions.costfunctions.CostFunction
, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of
ceml.costfunctions.costfunctions.CostFunction
(or ceml.costfunctions.costfunctions.CostFunctionDifferentiable
if your cost function is differentiable) or None if no regularization is requested.
If regularization is None, no regularization is used.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.
C is ignored if no regularization is used (regularization=None).
The default is 1.0.
optimizer (str or instance of
ceml.optim.optimizer.Optimizer
, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See
ceml.optim.optimizer.desc_to_optim()
for details.
As an alternative, we can use any (custom) optimizer that is derived from the
ceml.optim.optimizer.Optimizer
class.
The default is “nelder-mead”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.
If done is None, the output/prediction of the counterfactual must match y_target exactly.
The default is None.
Note
In case of a regression it might not always be possible to achieve a given output/prediction exactly.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
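The two return formats controlled by return_as_dict can be sketched as follows. `build_result` is a hypothetical helper written for this illustration, and the sign convention for delta (the change that has to be added to the original input to obtain the counterfactual) is an assumption; the description above only says that delta holds the changes to the original input.

```python
def build_result(x_orig, x_cf, y_cf, return_as_dict=True):
    """Package a counterfactual either as a dictionary or as a triple."""
    # Assumed sign convention: x_orig + delta == x_cf.
    delta = [c - o for o, c in zip(x_orig, x_cf)]
    if return_as_dict:
        return {"x_cf": x_cf, "y_cf": y_cf, "delta": delta}
    return x_cf, y_cf, delta


as_dict = build_result([1.0, 2.0], [1.5, 2.0], y_cf=1)
x_cf, y_cf, delta = build_result([1.0, 2.0], [1.5, 2.0], y_cf=1, return_as_dict=False)
print(as_dict["delta"], delta)
```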
ceml.sklearn.linearregression
- class ceml.sklearn.linearregression.LinearRegression(model, **kwds)
Bases:
ceml.model.model.ModelWithLoss
Class for rebuilding/wrapping the
sklearn.linear_model.base.LinearModel
class.
The
LinearRegression
class rebuilds a linear regression model from a given weight vector and intercept.
- Parameters
model (instance of
sklearn.linear_model.base.LinearModel
) – The linear regression model (e.g.
sklearn.linear_model.LinearRegression
or sklearn.linear_model.Ridge
).
- w
The weight vector (a matrix if we have a multi-dimensional output).
- Type
numpy.ndarray
- b
The intercept/bias (a vector if we have a multi-dimensional output).
- Type
numpy.ndarray
- dim
Dimensionality of the input data.
- Type
int
- get_loss(y_target, pred=None)
Creates and returns a loss function.
Builds a squared-error cost function where the target is y_target.
- Parameters
y_target (float) – The target value.
pred (callable, optional) –
A callable that maps an input to the output (regression).
If pred is None, the class method predict is used for mapping the input to the output (regression).
The default is None.
- Returns
Initialized squared-error cost function. Target is y_target.
- Return type
ceml.backend.jax.costfunctions.SquaredError
- predict(x)
Predict the output of a given input.
Computes the regression on a given input x.
- Parameters
x (numpy.ndarray) – The input x whose output is going to be predicted.
- Returns
An array containing the predicted output.
- Return type
jax.numpy.array
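What the wrapper computes can be written down in a few lines: the prediction is the affine map w·x + b, and the loss returned by get_loss is the squared error between that prediction and y_target. The sketch below uses plain Python lists instead of jax arrays and covers only the single-output case; the function names and the numbers are illustrative and not part of ceml.

```python
def predict(w, b, x):
    """Linear regression output for a single-output model: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def squared_error_loss(y_target, pred):
    """Return a loss function x -> (pred(x) - y_target)^2."""
    def loss(x):
        return (pred(x) - y_target) ** 2
    return loss


w, b = [2.0, -1.0], 0.5
loss = squared_error_loss(3.0, lambda x: predict(w, b, x))

print(predict(w, b, [1.0, 1.0]))  # 2 - 1 + 0.5
print(loss([1.0, 1.0]))           # squared gap between 1.5 and the target 3.0
print(loss([2.0, 1.5]))           # this input hits the target exactly
```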
- class ceml.sklearn.linearregression.LinearRegressionCounterfactual(model, **kwds)
Bases:
ceml.sklearn.counterfactual.SklearnCounterfactual
,ceml.optim.cvx.MathematicalProgram
,ceml.optim.cvx.ConvexQuadraticProgram
Class for computing a counterfactual of a linear regression model.
See parent class
ceml.sklearn.counterfactual.SklearnCounterfactual
.
- rebuild_model(model)
Rebuilds a
sklearn.linear_model.base.LinearModel
model.
Converts a
sklearn.linear_model.base.LinearModel
into a
ceml.sklearn.linearregression.LinearRegression
.
- Parameters
model (instance of
sklearn.linear_model.base.LinearModel
) – The sklearn linear regression model (e.g.
sklearn.linear_model.LinearRegression
or sklearn.linear_model.Ridge
).
- Returns
The wrapped linear regression model.
- Return type
ceml.sklearn.linearregression.LinearRegression
- ceml.sklearn.linearregression.linearregression_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='mp', optimizer_args=None, return_as_dict=True, done=None)
Computes a counterfactual of a given input x.
- Parameters
model (a
sklearn.linear_model.base.LinearModel
instance.) – The linear regression model (e.g.
sklearn.linear_model.LinearRegression
or sklearn.linear_model.Ridge
) that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (float) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or
ceml.costfunctions.costfunctions.CostFunction
, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of
ceml.costfunctions.costfunctions.CostFunction
(or ceml.costfunctions.costfunctions.CostFunctionDifferentiable
if your cost function is differentiable) or None if no regularization is requested.
If regularization is None, no regularization is used.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.
C is ignored if no regularization is used (regularization=None).
The default is 1.0.
optimizer (str or instance of
ceml.optim.optimizer.Optimizer
, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See
ceml.optim.optimizer.prepare_optim()
for details.
Linear regression supports the use of mathematical programs for computing counterfactuals - set optimizer to “mp” for using a convex quadratic program for computing the counterfactual. Note that in this case the hyperparameter C is ignored.
As an alternative, we can use any (custom) optimizer that is derived from the
ceml.optim.optimizer.Optimizer
class.
The default is “mp”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.
If done is None, the output/prediction of the counterfactual must match y_target exactly.
The default is None.
Note
It might not always be possible to achieve a given output/prediction exactly.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
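For a single-output linear model with a squared (l2) penalty, the closest counterfactual even has a closed form, which makes it a convenient way to see what features_whitelist does. The sketch below is a plain-Python illustration and not the convex program that the “mp” option solves in ceml (which also covers the l1 penalty); the function name and the numbers are hypothetical. It moves x along the whitelisted part of w by exactly the amount needed to hit y_target.

```python
def linear_counterfactual_l2(w, b, x, y_target, features_whitelist=None):
    """Minimum-l2 change of x such that w . x_cf + b == y_target.

    Only features listed in `features_whitelist` may change (all if None).
    """
    allowed = range(len(x)) if features_whitelist is None else features_whitelist
    # Direction of change: w restricted to the whitelisted features.
    direction = [w[i] if i in allowed else 0.0 for i in range(len(x))]
    norm_sq = sum(d * d for d in direction)
    if norm_sq == 0.0:
        # None of the allowed features influences the prediction.
        raise Exception("No counterfactual found")

    residual = y_target - (sum(wi * xi for wi, xi in zip(w, x)) + b)
    step = residual / norm_sq
    delta = [step * d for d in direction]
    x_cf = [xi + di for xi, di in zip(x, delta)]
    return x_cf, delta


w, b, x = [3.0, 4.0], 1.0, [1.0, 1.0]  # current prediction: 3 + 4 + 1 = 8

x_cf_all, delta_all = linear_counterfactual_l2(w, b, x, y_target=33.0)
x_cf_one, delta_one = linear_counterfactual_l2(w, b, x, y_target=33.0,
                                               features_whitelist=[1])
print(x_cf_all, x_cf_one)
```

Restricting the whitelist to feature 1 forces the whole change onto that feature, so the counterfactual ends up further away from x than the unrestricted one.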
ceml.sklearn.lvq
- class ceml.sklearn.lvq.CQPHelper(mymodel, x_orig, y_target, indices_other_prototypes, features_whitelist=None, regularization='l1', optimizer_args=None, **kwds)
- class ceml.sklearn.lvq.LVQ(model, dist='l2', **kwds)
Bases:
ceml.model.model.ModelWithLoss
Class for rebuilding/wrapping the
sklearn_lvq.GlvqModel
,sklearn_lvq.GmlvqModel
,sklearn_lvq.LgmlvqModel
,sklearn_lvq.RslvqModel
,sklearn_lvq.MrslvqModel
and sklearn_lvq.LmrslvqModel
classes.
The
LVQ
class rebuilds a sklearn-lvq lvq model.
- Parameters
model (instance of
sklearn_lvq.GlvqModel
,sklearn_lvq.GmlvqModel
,sklearn_lvq.LgmlvqModel
,sklearn_lvq.RslvqModel
,sklearn_lvq.MrslvqModel
or sklearn_lvq.LmrslvqModel
) – The lvq model.
dist (str or callable, optional) –
Computes the distance between a prototype and a data point.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
You can use your own custom distance function by setting dist to a callable that can be called on a data point and returns a scalar.
The default is “l2”.
Note: dist must not be None.
- prototypes
The prototypes.
- Type
numpy.array
- labels
The labels of the prototypes.
- Type
numpy.array
- dist
The distance function.
- Type
callable
- model
The original sklearn-lvq model.
- Type
object
- model_class
The class of the sklearn-lvq model.
- Type
class
- dim
Dimensionality of the input data.
- Type
int
- Raises
TypeError – If model is not an instance of
sklearn_lvq.GlvqModel
,sklearn_lvq.GmlvqModel
,sklearn_lvq.LgmlvqModel
,sklearn_lvq.RslvqModel
,sklearn_lvq.MrslvqModel
or sklearn_lvq.LmrslvqModel
- get_loss(y_target, pred=None)
Creates and returns a loss function.
Builds a cost function where we penalize the minimum distance to the nearest prototype which is consistent with the target y_target.
- Parameters
y_target (int) – The target class.
pred (callable, optional) –
A callable that maps an input to an input. E.g. using the
ceml.optim.input_wrapper.InputWrapper
class.
If pred is None, no transformation is applied to the input before putting it into the loss function.
The default is None.
- Returns
Initialized cost function. Target label is y_target.
- Return type
ceml.backend.jax.costfunctions.MinOfListDistCost
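The cost described above (distance from the input to the nearest prototype that carries the target label) can be sketched without any dependencies. This is an illustration of the idea only, not the jax implementation referenced in the return type; the names `min_of_list_dist_cost` and `squared_l2`, the two-argument form of `dist`, and the prototypes are assumptions made for the example.

```python
def squared_l2(p, x):
    """Squared Euclidean distance between a prototype p and a data point x."""
    return sum((pi - xi) ** 2 for pi, xi in zip(p, x))


def min_of_list_dist_cost(prototypes, labels, y_target, dist=squared_l2):
    """Build a cost: distance from x to the nearest prototype labelled y_target."""
    targets = [p for p, label in zip(prototypes, labels) if label == y_target]
    if not targets:
        raise ValueError("No prototype with the requested target label")

    def cost(x):
        return min(dist(p, x) for p in targets)
    return cost


prototypes = [[0.0, 0.0], [4.0, 0.0], [0.0, 3.0]]
labels = [0, 1, 1]
cost = min_of_list_dist_cost(prototypes, labels, y_target=1)

print(cost([0.0, 0.0]))  # nearest class-1 prototype is [0, 3]
print(cost([4.0, 0.0]))  # sits exactly on a class-1 prototype
```

Minimizing this cost pulls the input towards the closest prototype of the target class, which is what makes the prediction of a prototype-based model flip.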
- predict(x)
Note
This function is a placeholder only.
This function does not predict anything and just returns the given input.
- class ceml.sklearn.lvq.LvqCounterfactual(model, dist='l2', cqphelper=<class 'ceml.sklearn.lvq.CQPHelper'>, **kwds)
Bases:
ceml.sklearn.counterfactual.SklearnCounterfactual
,ceml.optim.cvx.MathematicalProgram
,ceml.optim.cvx.DCQP
Class for computing a counterfactual of a lvq model.
See parent class
ceml.sklearn.counterfactual.SklearnCounterfactual
.
- rebuild_model(model)
Rebuilds a
sklearn_lvq.GlvqModel
,sklearn_lvq.GmlvqModel
,sklearn_lvq.LgmlvqModel
,sklearn_lvq.RslvqModel
,sklearn_lvq.MrslvqModel
or sklearn_lvq.LmrslvqModel
model.
Converts a
sklearn_lvq.GlvqModel
,sklearn_lvq.GmlvqModel
,sklearn_lvq.LgmlvqModel
,sklearn_lvq.RslvqModel
,sklearn_lvq.MrslvqModel
or sklearn_lvq.LmrslvqModel
instance into a
ceml.sklearn.lvq.LVQ
instance.
- Parameters
model (instance of
sklearn_lvq.GlvqModel
,sklearn_lvq.GmlvqModel
,sklearn_lvq.LgmlvqModel
,sklearn_lvq.RslvqModel
,sklearn_lvq.MrslvqModel
or sklearn_lvq.LmrslvqModel
) – The sklearn-lvq lvq model.
- Returns
The wrapped lvq model.
- Return type
ceml.sklearn.lvq.LVQ
- solve(x_orig, y_target, regularization, features_whitelist, return_as_dict, optimizer_args)
Approximately solves the DCQP by using the penalty convex-concave procedure.
- Parameters
x0 (numpy.ndarray) – The initial data point for the penalty convex-concave procedure - this could be anything, however a “good” initial solution might lead to a better result.
- ceml.sklearn.lvq.lvq_generate_counterfactual(model, x, y_target, features_whitelist=None, dist='l2', regularization='l1', C=1.0, optimizer='auto', optimizer_args=None, return_as_dict=True, done=None)
Computes a counterfactual of a given input x.
- Parameters
model (a
sklearn_lvq.GlvqModel
,sklearn_lvq.GmlvqModel
,sklearn_lvq.LgmlvqModel
,sklearn_lvq.RslvqModel
,sklearn_lvq.MrslvqModel
or sklearn_lvq.LmrslvqModel
instance.) –
The lvq model that is used for computing the counterfactual.
Note: Only lvq models from sklearn-lvq are supported.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
dist (str or callable, optional) –
Computes the distance between a prototype and a data point.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
You can use your own custom distance function by setting dist to a callable that can be called on a data point and returns a scalar.
The default is “l2”.
Note: dist must not be None.
regularization (str or callable, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of
ceml.costfunctions.costfunctions.CostFunction
(or ceml.costfunctions.costfunctions.CostFunctionDifferentiable
if your cost function is differentiable) or None if no regularization is requested.
If regularization is None, no regularization is used.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.
C is ignored if no regularization is used (regularization=None).
The default is 1.0.
optimizer (str or instance of
ceml.optim.optimizer.Optimizer
, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See
ceml.optim.optimizer.prepare_optim()
for details.
As an alternative, we can use any (custom) optimizer that is derived from the
ceml.optim.optimizer.Optimizer
class.
Use “auto” if you do not know what optimizer to use - a suitable optimizer is chosen automatically.
The default is “auto”.
Learning vector quantization supports the use of mathematical programs for computing counterfactuals - set optimizer to “mp” for using a convex quadratic program (G(M)LVQ) or a DCQP (otherwise) for computing the counterfactual. Note that in this case the hyperparameter C is ignored. Because the DCQP is a non-convex problem, we are not guaranteed to find the best solution (it might even happen that we do not find a solution at all) - we use the penalty convex-concave procedure for approximately solving the DCQP.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) – Not used.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
ceml.sklearn.models
- ceml.sklearn.models.generate_counterfactual(model, x, y_target, features_whitelist=None, dist='l2', regularization='l1', C=1.0, optimizer='auto', optimizer_args=None, return_as_dict=True, done=None)
Computes a counterfactual of a given input x.
- Parameters
model (object) – The sklearn model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
dist (str or callable, optional) –
Computes the distance between a prototype and a data point.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
You can use your own custom distance function by setting dist to a callable that can be called on a data point and returns a scalar.
The default is “l2”.
Note: dist must not be None.
Note
Only needed if model is an LVQ or KNN model.
regularization (str or callable, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable), or None if no regularization is requested.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.
C is ignored if no regularization is used (regularization=None).
The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optimizer.optimizer.desc_to_optim() for details.
Use “auto” if you do not know what optimizer to use - a suitable optimizer is chosen automatically.
As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.
The default is “auto”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.
If done is None, the output/prediction of the counterfactual must match y_target exactly.
The default is None.
Note
In case of a regression it might not always be possible to achieve a given output/prediction exactly.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
ValueError – If model is not a supported model type.
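For regression models an exact match with y_target is often out of reach, so a tolerance-based acceptance test is a natural use of done. The following is a minimal sketch in plain Python. The names make_tolerance_check and tol are hypothetical, and the assumption that done receives the prediction of the candidate counterfactual as its only argument is inferred from the description above, not confirmed against the library.

```python
def make_tolerance_check(y_target, tol=0.5):
    """Build a predicate accepting any prediction within tol of y_target."""
    def done(y_pred):
        # Accept the counterfactual if its prediction is close enough.
        return abs(y_pred - y_target) <= tol
    return done


done = make_tolerance_check(25.0, tol=0.5)
print(done(25.3))  # within the tolerance
print(done(26.0))  # too far from the target
```

A predicate of this kind could then be passed as done (or as a callable y_target) when an exact regression output cannot be reached.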
ceml.sklearn.naivebayes
- class ceml.sklearn.naivebayes.GaussianNB(model, **kwds)
Bases:
ceml.model.model.ModelWithLoss
Class for rebuilding/wrapping the sklearn.naive_bayes.GaussianNB class.
The GaussianNB class rebuilds a Gaussian naive Bayes model from a given set of parameters (priors, means and variances).
- Parameters
model (instance of sklearn.naive_bayes.GaussianNB) – The Gaussian naive Bayes model.
- class_priors
Class-dependent priors.
- Type
numpy.ndarray
- means
Class- and feature-dependent means.
- Type
numpy.ndarray
- variances
Class- and feature-dependent variances.
- Type
numpy.ndarray
- dim
Dimensionality of the input data.
- Type
int
- is_binary
True if model is a binary classifier, False otherwise.
- Type
boolean
- get_loss(y_target, pred=None)
Creates and returns a loss function.
Builds a negative-log-likelihood cost function where the target is y_target.
- Parameters
y_target (int) – The target class.
pred (callable, optional) –
A callable that maps an input to the output (class probabilities).
If pred is None, the class method predict is used for mapping the input to the output (class probabilities)
The default is None.
- Returns
Initialized negative-log-likelihood cost function. Target label is y_target.
- Return type
ceml.backend.jax.costfunctions.NegLogLikelihoodCost
- predict(x)
Predict the output of a given input.
Computes the class probabilities for a given input x.
- Parameters
x (numpy.ndarray) – The input x that is going to be classified.
- Returns
An array containing the class probabilities.
- Return type
jax.numpy.array
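The class probabilities returned by predict can be illustrated with a small standalone sketch. It assumes the standard Gaussian naive Bayes posterior (class prior times a product of per-feature normal densities, normalized over all classes) and uses plain Python instead of the library's JAX backend. The function name gaussian_nb_proba is hypothetical and not part of ceml.

```python
import math


def gaussian_nb_proba(x, class_priors, means, variances):
    """Return one posterior probability per class for the input x."""
    log_joint = []
    for prior, mu, var in zip(class_priors, means, variances):
        # log p(y) + sum_i log N(x_i | mu_i, var_i)
        value = math.log(prior)
        for xi, m, v in zip(x, mu, var):
            value += -0.5 * math.log(2.0 * math.pi * v) - (xi - m) ** 2 / (2.0 * v)
        log_joint.append(value)
    # Normalize in a numerically stable way (shift by the largest log value).
    top = max(log_joint)
    weights = [math.exp(value - top) for value in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


probs = gaussian_nb_proba(
    x=[0.1, -0.2],
    class_priors=[0.5, 0.5],
    means=[[0.0, 0.0], [3.0, 3.0]],
    variances=[[1.0, 1.0], [1.0, 1.0]],
)
print(probs)
```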
- class ceml.sklearn.naivebayes.GaussianNbCounterfactual(model, **kwds)
Bases:
ceml.sklearn.counterfactual.SklearnCounterfactual, ceml.optim.cvx.MathematicalProgram, ceml.optim.cvx.SDP, ceml.optim.cvx.DCQP
Class for computing a counterfactual of a gaussian naive bayes model.
See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.
- rebuild_model(model)
Rebuild a sklearn.naive_bayes.GaussianNB model.
Converts a sklearn.naive_bayes.GaussianNB into a ceml.sklearn.naivebayes.GaussianNB.
- Parameters
model (instance of sklearn.naive_bayes.GaussianNB) – The sklearn Gaussian naive Bayes model.
- Returns
The wrapped Gaussian naive Bayes model.
- Return type
ceml.sklearn.naivebayes.GaussianNB
- solve(x_orig, y_target, regularization, features_whitelist, return_as_dict, optimizer_args)
Approximately solves the DCQP by using the penalty convex-concave procedure.
- Parameters
x0 (numpy.ndarray) – The initial data point for the penalty convex-concave procedure - this could be anything, however a “good” initial solution might lead to a better result.
- ceml.sklearn.naivebayes.gaussiannb_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='auto', optimizer_args=None, return_as_dict=True, done=None)
Computes a counterfactual of a given input x.
- Parameters
model (a sklearn.naive_bayes.GaussianNB instance) – The Gaussian naive Bayes model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable), or None if no regularization is requested.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.
C is ignored if no regularization is used (regularization=None).
The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.
As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.
Use “auto” if you do not know what optimizer to use - a suitable optimizer is chosen automatically.
The default is “auto”.
Gaussian naive Bayes supports the use of mathematical programs for computing counterfactuals - set optimizer to “mp” for using a semi-definite program (binary classifier) or a DCQP (otherwise) for computing the counterfactual. Note that in this case the hyperparameter C is ignored. Because the DCQP is a non-convex problem, we are not guaranteed to find the best solution (it might even happen that we do not find a solution at all) - we use the penalty convex-concave procedure for approximately solving the DCQP.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) – Not used.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
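The behaviour of a list-valued C (try each strength in order, return the first counterfactual found, raise if none is found) can be sketched as follows. This is only an illustration in plain Python: first_counterfactual and solve are hypothetical names, and the convention that a failed attempt returns None is an assumption made for the sketch, not the library's internal contract.

```python
def first_counterfactual(solve, C):
    """Try each regularization strength in order and stop at the first success.

    solve is a hypothetical callable mapping a strength to a counterfactual,
    or to None if no counterfactual was found for that strength.
    """
    strengths = C if isinstance(C, (list, tuple)) else [C]
    for c in strengths:
        x_cf = solve(c)
        if x_cf is not None:
            # Remaining values of C are not tried.
            return c, x_cf
    raise Exception("No counterfactual found")


# Toy solver: only succeeds once the regularization is weak enough.
toy_solve = lambda c: [1.0 - c] if c <= 0.1 else None
used_c, x_cf = first_counterfactual(toy_solve, [1.0, 0.1, 0.01])
print(used_c, x_cf)
```

Ordering the list from strong to weak regularization favours counterfactuals that stay close to the original input.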
ceml.sklearn.lda
- class ceml.sklearn.lda.Lda(model, **kwds)
Bases:
ceml.model.model.ModelWithLoss
Class for rebuilding/wrapping the sklearn.discriminant_analysis.LinearDiscriminantAnalysis class.
The Lda class rebuilds an lda model from the given parameters.
- Parameters
model (instance of sklearn.discriminant_analysis.LinearDiscriminantAnalysis) – The lda model.
- class_priors
Class-dependent priors.
- Type
numpy.ndarray
- means
Class-dependent means.
- Type
numpy.ndarray
- sigma_inv
Inverted covariance matrix.
- Type
numpy.ndarray
- dim
Dimensionality of the input data.
- Type
int
- Raises
TypeError – If model is not an instance of
sklearn.discriminant_analysis.LinearDiscriminantAnalysis
- get_loss(y_target, pred=None)
Creates and returns a loss function.
Builds a negative-log-likelihood cost function where the target is y_target.
- Parameters
y_target (int) – The target class.
pred (callable, optional) –
A callable that maps an input to the output (class probabilities).
If pred is None, the class method predict is used for mapping the input to the output (class probabilities)
The default is None.
- Returns
Initialized negative-log-likelihood cost function. Target label is y_target.
- Return type
ceml.backend.jax.costfunctions.NegLogLikelihoodCost
- predict(x)
Predict the output of a given input.
Computes the class probabilities for a given input x.
- Parameters
x (numpy.ndarray) – The input x that is going to be classified.
- Returns
An array containing the class probabilities.
- Return type
jax.numpy.array
- class ceml.sklearn.lda.LdaCounterfactual(model, **kwds)
Bases:
ceml.sklearn.counterfactual.SklearnCounterfactual, ceml.optim.cvx.MathematicalProgram, ceml.optim.cvx.ConvexQuadraticProgram, ceml.optim.cvx.PlausibleCounterfactualOfHyperplaneClassifier
Class for computing a counterfactual of a lda model.
See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.
- rebuild_model(model)
Rebuild a sklearn.discriminant_analysis.LinearDiscriminantAnalysis model.
Converts a sklearn.discriminant_analysis.LinearDiscriminantAnalysis into a ceml.sklearn.lda.Lda.
- Parameters
model (instance of sklearn.discriminant_analysis.LinearDiscriminantAnalysis) – The sklearn lda model. Note that store_covariance must be set to True.
- Returns
The wrapped lda model.
- Return type
ceml.sklearn.lda.Lda
- ceml.sklearn.lda.lda_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='mp', optimizer_args=None, return_as_dict=True, done=None, plausibility=None)
Computes a counterfactual of a given input x.
- Parameters
model (a sklearn.discriminant_analysis.LinearDiscriminantAnalysis instance) – The lda model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable), or None if no regularization is requested.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.
C is ignored if no regularization is used (regularization=None).
The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.
Linear discriminant analysis supports the use of mathematical programs for computing counterfactuals - set optimizer to “mp” for using a convex quadratic program for computing the counterfactual. Note that in this case the hyperparameter C is ignored.
As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.
The default is “mp”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) – Not used.
plausibility (dict, optional) –
If set to a valid dictionary (see ceml.sklearn.plausibility.prepare_computation_of_plausible_counterfactuals()), a plausible counterfactual (as proposed in Artelt et al. 2020) is computed. Note that in this case, all other parameters are ignored.
If plausibility is None, the closest counterfactual is computed.
The default is None.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
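Because the result is either a dictionary or a triple depending on return_as_dict, calling code may want to normalize both shapes. The helper below is a minimal sketch using only the keys documented above (‘x_cf’, ‘y_cf’, ‘delta’); the name unpack_counterfactual is hypothetical and not part of ceml.

```python
def unpack_counterfactual(result):
    """Return (x_cf, y_cf, delta) for either documented result shape."""
    if isinstance(result, dict):
        # Shape returned when return_as_dict is True.
        return result["x_cf"], result["y_cf"], result["delta"]
    # Shape returned when return_as_dict is False: a triple.
    x_cf, y_cf, delta = result
    return x_cf, y_cf, delta


as_dict = {"x_cf": [1.0, 2.5], "y_cf": 1, "delta": [0.0, -0.5]}
as_triple = ([1.0, 2.5], 1, [0.0, -0.5])
print(unpack_counterfactual(as_dict) == unpack_counterfactual(as_triple))
```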
ceml.sklearn.qda
- class ceml.sklearn.qda.Qda(model, **kwds)
Bases:
ceml.model.model.ModelWithLoss
Class for rebuilding/wrapping the sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis class.
The Qda class rebuilds a qda model from the given parameters.
- Parameters
model (instance of sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis) – The qda model.
- class_priors
Class-dependent priors.
- Type
numpy.ndarray
- means
Class-dependent means.
- Type
numpy.ndarray
- sigma_inv
Class-dependent inverted covariance matrices.
- Type
numpy.ndarray
- dim
Dimensionality of the input data.
- Type
int
- is_binary
True if model is a binary classifier, False otherwise.
- Type
boolean
- Raises
TypeError – If model is not an instance of
sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis
- get_loss(y_target, pred=None)
Creates and returns a loss function.
Builds a negative-log-likelihood cost function where the target is y_target.
- Parameters
y_target (int) – The target class.
pred (callable, optional) –
A callable that maps an input to the output (class probabilities).
If pred is None, the class method predict is used for mapping the input to the output (class probabilities)
The default is None.
- Returns
Initialized negative-log-likelihood cost function. Target label is y_target.
- Return type
ceml.backend.jax.costfunctions.NegLogLikelihoodCost
- predict(x)
Predict the output of a given input.
Computes the class probabilities for a given input x.
- Parameters
x (numpy.ndarray) – The input x that is going to be classified.
- Returns
An array containing the class probabilities.
- Return type
jax.numpy.array
- class ceml.sklearn.qda.QdaCounterfactual(model, **kwds)
Bases:
ceml.sklearn.counterfactual.SklearnCounterfactual, ceml.optim.cvx.MathematicalProgram, ceml.optim.cvx.SDP, ceml.optim.cvx.DCQP
Class for computing a counterfactual of a qda model.
See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.
- rebuild_model(model)
Rebuild a sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis model.
Converts a sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis into a ceml.sklearn.qda.Qda.
- Parameters
model (instance of sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis) – The sklearn qda model. Note that store_covariance must be set to True.
- Returns
The wrapped qda model.
- Return type
ceml.sklearn.qda.Qda
- solve(x_orig, y_target, regularization, features_whitelist, return_as_dict, optimizer_args)
Approximately solves the DCQP by using the penalty convex-concave procedure.
- Parameters
x0 (numpy.ndarray) – The initial data point for the penalty convex-concave procedure - this could be anything, however a “good” initial solution might lead to a better result.
- ceml.sklearn.qda.qda_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='auto', optimizer_args=None, return_as_dict=True, done=None)
Computes a counterfactual of a given input x.
- Parameters
model (a sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis instance) – The qda model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable), or None if no regularization is requested.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.
C is ignored if no regularization is used (regularization=None).
The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.
As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.
The default is “auto”.
Quadratic discriminant analysis supports the use of mathematical programs for computing counterfactuals - set optimizer to “mp” for using a semi-definite program (binary classifier) or a DCQP (otherwise) for computing the counterfactual. Note that in this case the hyperparameter C is ignored. Because the DCQP is a non-convex problem, we are not guaranteed to find the best solution (it might even happen that we do not find a solution at all) - we use the penalty convex-concave procedure for approximately solving the DCQP.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) – Not used.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
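The effect of features_whitelist (only the listed dimensions may change) can be sketched in a few lines. This is an illustration in plain Python and not the library's implementation; apply_whitelist is a hypothetical helper that resets every non-whitelisted feature of a candidate to its original value.

```python
def apply_whitelist(x_orig, x_candidate, features_whitelist=None):
    """Keep candidate values only for whitelisted feature indices."""
    if features_whitelist is None:
        # No restriction: every feature may change.
        return list(x_candidate)
    allowed = set(features_whitelist)
    return [
        cand if i in allowed else orig
        for i, (orig, cand) in enumerate(zip(x_orig, x_candidate))
    ]


x_orig = [1.0, 2.0, 3.0]
x_candidate = [9.0, 8.0, 7.0]
print(apply_whitelist(x_orig, x_candidate, features_whitelist=[0, 2]))
```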
ceml.sklearn.pipeline
- class ceml.sklearn.pipeline.PipelineCounterfactual(model, **kwds)
Bases:
ceml.sklearn.counterfactual.SklearnCounterfactual
Class for computing a counterfactual of a sklearn pipeline model.
See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.
- build_loss(regularization, x_orig, y_target, pred, grad_mask, C, input_wrapper)
Build a loss function.
Overrides the build_loss method of the base class ceml.sklearn.counterfactual.SklearnCounterfactual.
- Parameters
regularization (str or ceml.costfunctions.costfunctions.CostFunction) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable), or None if no regularization is requested.
x_orig (numpy.array) – The original input whose prediction has to be explained.
y_target (int or float) – The requested output.
pred (callable) –
A callable that maps an input to the output.
If pred is None, the class method predict is used for mapping the input to the output.
grad_mask (numpy.array) – Gradient mask determining which dimensions can be used.
C (float or list(float)) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.
C is ignored if no regularization is used (regularization=None).
input_wrapper (callable) – Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).
- Returns
Initialized cost function. Target is set to y_target.
- Return type
- compute_counterfactual(x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='auto', optimizer_args=None, return_as_dict=True, done=None)
Computes a counterfactual of a given input x.
- Parameters
x (numpy.ndarray) – The data point x whose prediction has to be explained.
y_target (int or float) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if the cost function is differentiable), or None if no regularization is requested.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.
If no regularization is used (regularization=None), C is ignored.
The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.
Use “auto” if you do not know what optimizer to use - a suitable optimizer is chosen automatically.
As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.
Some models (see paper) support the use of mathematical programs for computing counterfactuals. In this case, you can use the option “mp” - please read the documentation of the corresponding model for further information.
The default is “auto”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.
If done is None, the output/prediction of the counterfactual must match y_target exactly.
The default is None.
Note
In case of a regression it might not always be possible to achieve a given output/prediction exactly.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
- rebuild_model(model)
Rebuild a sklearn.pipeline.Pipeline model.
Converts a sklearn.pipeline.Pipeline into a ceml.sklearn.pipeline.PipelineModel.
- Parameters
model (instance of sklearn.pipeline.Pipeline) – The sklearn pipeline model.
- Returns
The wrapped pipeline model.
- Return type
ceml.sklearn.pipeline.PipelineModel
- class ceml.sklearn.pipeline.PipelineModel(models, **kwds)
Bases:
ceml.model.model.ModelWithLoss
Class for rebuilding/wrapping the sklearn.pipeline.Pipeline class.
The PipelineModel class rebuilds a pipeline model from a given list of sklearn models.
- Parameters
models (list(object)) – Ordered list of all sklearn models in the pipeline.
- models
Ordered list of all sklearn models in the pipeline.
- Type
list(object)
- get_loss(y_target, pred=None)
Creates and returns a loss function.
Builds a cost function where the target is y_target.
- Parameters
y_target (int or float) – The requested output.
pred (callable, optional) –
A callable that maps an input to the output.
If pred is None, the class method predict is used for mapping the input to the output.
The default is None.
- Returns
Initialized cost function. Target is set to y_target.
- Return type
- predict(x)
Predicts the output of a given input.
Computes the prediction of a given input x.
- Parameters
x (numpy.ndarray) – The input x.
- Returns
Output of the pipeline (might be a scalar or something higher-dimensional).
- Return type
numpy.array
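Conceptually, predicting with an ordered list of models means feeding the output of each stage into the next one. The sketch below shows that idea in plain Python under the assumption that every stage can be treated as a simple callable; pipeline_predict and the two toy stages are hypothetical and do not reflect how PipelineModel wraps the individual sklearn models internally.

```python
def pipeline_predict(stages, x):
    """Apply each stage in order, passing the output on to the next stage."""
    out = x
    for stage in stages:
        out = stage(out)
    return out


# Toy pipeline: a scaling step followed by a threshold classifier.
scale = lambda x: [value / 10.0 for value in x]
classify = lambda x: 1 if sum(x) > 1.0 else 0

print(pipeline_predict([scale, classify], [4.0, 8.0]))
```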
- ceml.sklearn.pipeline.pipeline_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='nelder-mead', optimizer_args=None, return_as_dict=True, done=None)
Computes a counterfactual of a given input x.
- Parameters
model (a sklearn.pipeline.Pipeline instance) – The pipeline model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable), or None if no regularization is requested.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, all values in C are tried and as soon as a counterfactual is found, this counterfactual is returned and no other values of C are tried.
C is ignored if no regularization is used (regularization=None).
The default is 1.0
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.
Use “auto” if you do not know what optimizer to use - a suitable optimizer is chosen automatically.
As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.
The default is “nelder-mead”.
Some models (see paper) support the use of mathematical programs for computing counterfactuals. In this case, you can use the option “mp” - please read the documentation of the corresponding model for further information.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.
If done is None, the output/prediction of the counterfactual must match y_target exactly.
The default is None.
Note
In case of a regression it might not always be possible to achieve a given output/prediction exactly.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
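The two built-in regularizers and the role of C can be made concrete with a short sketch. The penalties follow the descriptions above (l1: absolute deviation, l2: squared deviation). How the penalty is combined with the model loss is an assumption here: the sketch simply adds C times the penalty to the loss value, and the names l1_penalty, l2_penalty and regularized_cost are hypothetical.

```python
def l1_penalty(x, x_orig):
    """Sum of absolute deviations from the original input."""
    return sum(abs(a - b) for a, b in zip(x, x_orig))


def l2_penalty(x, x_orig):
    """Sum of squared deviations from the original input."""
    return sum((a - b) ** 2 for a, b in zip(x, x_orig))


def regularized_cost(loss_value, x, x_orig, regularization="l1", C=1.0):
    """Combine a model loss with a distance penalty (assumed form: loss + C * penalty)."""
    if regularization is None:
        # C is ignored if no regularization is used.
        return loss_value
    penalties = {"l1": l1_penalty, "l2": l2_penalty}
    return loss_value + C * penalties[regularization](x, x_orig)


x_orig = [0.0, 0.0]
x = [1.0, 2.0]
print(l1_penalty(x, x_orig), l2_penalty(x, x_orig))
print(regularized_cost(2.0, x, x_orig, regularization="l1", C=0.5))
```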
ceml.sklearn.randomforest
- class ceml.sklearn.randomforest.EnsembleVotingCost(models, y_target, input_wrapper=None, epsilon=0, **kwds)
Bases:
ceml.costfunctions.costfunctions.CostFunction
Loss function of an ensemble of models.
The loss is the negative fraction of models that predict the correct output.
- Parameters
models (list(object)) – List of models
y_target (int, float or a callable that returns True if a given prediction is accepted.) – The requested prediction.
input_wrapper (callable, optional) –
Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).
The default is None.
- score_impl(x)
Implementation of the loss function.
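The voting loss described above (the negative fraction of models whose prediction is accepted) can be written down in a few lines. This is a sketch in plain Python, not the library's score_impl: models are treated as plain callables, the epsilon parameter from the signature is not modelled, and the name ensemble_voting_cost is hypothetical.

```python
def ensemble_voting_cost(models, y_target, x):
    """Negative fraction of models whose prediction for x is accepted."""
    # y_target may be a fixed value or a callable that accepts a prediction.
    accept = y_target if callable(y_target) else (lambda y: y == y_target)
    votes = sum(1 for model in models if accept(model(x)))
    return -votes / len(models)


# Three toy "trees": two vote for class 1, one votes for class 0.
toy_models = [lambda x: 1, lambda x: 1, lambda x: 0]
print(ensemble_voting_cost(toy_models, 1, [0.0]))
```

The loss reaches its minimum of -1 when every model in the ensemble produces an accepted prediction.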
- class ceml.sklearn.randomforest.RandomForest(model, **kwds)
Bases:
ceml.model.model.ModelWithLoss
Class for rebuilding/wrapping the sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor class.
- Parameters
model (instance of sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor) – The random forest model.
- Raises
TypeError – If model is not an instance of sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor
- get_loss(y_target, input_wrapper=None)
Creates and returns a loss function.
- Parameters
y_target (int, float or a callable that returns True if a given prediction is accepted.) – The requested prediction.
input_wrapper (callable) – Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).
- Returns
Initialized loss function. The target output is y_target.
- Return type
- predict(x)
Predict the output of a given input.
Computes the class label of a given input x.
- Parameters
x (numpy.ndarray) – The input x that is going to be classified.
- Returns
Prediction.
- Return type
int or float
- class ceml.sklearn.randomforest.RandomForestCounterfactual(model, **kwds)
Bases:
ceml.sklearn.counterfactual.SklearnCounterfactual
Class for computing a counterfactual of a random forest model.
See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.
- build_loss(regularization, x_orig, y_target, pred, grad_mask, C, input_wrapper)
Build the (non-differentiable) cost function: Regularization + Loss
- compute_counterfactual(x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='nelder-mead', optimizer_args=None, return_as_dict=True, done=None)
Computes a counterfactual of a given input x.
- Parameters
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.DifferentiableCostFunction if the cost function is differentiable) or None if no regularization is requested.
If regularization is None, no regularization is used.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, the values in C are tried one after another; as soon as a counterfactual is found, it is returned and no further values of C are tried.
If no regularization is used (regularization=None), C is ignored.
The default is 1.0.
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.
As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.
The default is “nelder-mead”.
Note
The cost function of a random forest model is not differentiable, so a gradient-based optimization algorithm cannot be used.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.
If done is None, the output/prediction of the counterfactual must match y_target exactly.
The default is None.
Note
In case of a regression it might not always be possible to achieve a given output/prediction exactly.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
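The role of done can be illustrated with a small sketch of the acceptance check. This only mirrors the documented behaviour (exact match when done is None, otherwise the callable decides); the helper name `is_accepted` and the tolerance of 0.5 are hypothetical, and the library's internal check may differ.

```python
def is_accepted(y_cf, y_target, done=None):
    """Return True if the prediction y_cf counts as reaching the target."""
    if done is None:
        # Documented default: the prediction must match y_target exactly
        return y_cf == y_target
    return bool(done(y_cf))


# Hypothetical regression target with a tolerance of +/- 0.5
y_target = 25.0
done = lambda y: abs(y - y_target) <= 0.5

exact = is_accepted(24.8, y_target)                # exact match required
tolerant = is_accepted(24.8, y_target, done=done)  # tolerance-based acceptance
```

For a regressor, passing such a tolerance-based callable avoids the situation described in the note above, where an exact target value may not be reachable.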
- rebuild_model(model)
Rebuilds a sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor model.
Converts a sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor instance into a ceml.sklearn.randomforest.RandomForest instance.
- Parameters
model (instance of sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor) – The sklearn random forest model.
- Returns
The wrapped random forest model.
- Return type
ceml.sklearn.randomforest.RandomForest
- ceml.sklearn.randomforest.randomforest_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='nelder-mead', optimizer_args=None, return_as_dict=True, done=None)
Computes a counterfactual of a given input x.
- Parameters
model (a sklearn.ensemble.RandomForestClassifier or sklearn.ensemble.RandomForestRegressor instance) – The random forest model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.
If regularization is None, no regularization is used.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, the values in C are tried one after another; as soon as a counterfactual is found, it is returned and no further values of C are tried.
C is ignored if no regularization is used (regularization=None).
The default is 1.0.
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.
As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.
The default is “nelder-mead”.
Note
The cost function of a random forest model is not differentiable, so a gradient-based optimization algorithm cannot be used.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) – Not used.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
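The behaviour of a list-valued C can be sketched on a one-dimensional toy problem. Everything here is an assumption made for illustration: the objective C * |x - x_orig| + max(0, 2 - x), the grid-search solver `solve_toy`, and the acceptance rule x >= 2 merely stand in for the real cost function, optimizer and model.

```python
def solve_toy(C, x_orig=0.0):
    """Grid-search minimizer of C * |x - x_orig| + max(0, 2 - x) on [0, 4]."""
    grid = [i / 100 for i in range(401)]
    return min(grid, key=lambda x: C * abs(x - x_orig) + max(0.0, 2.0 - x))


def first_counterfactual(C, solve, accept):
    """Try each regularization strength in turn; return (C, x_cf) for the first hit."""
    C_values = C if isinstance(C, (list, tuple)) else [C]
    for c in C_values:
        x_cf = solve(c)
        if accept(x_cf):
            return c, x_cf
    raise Exception("No counterfactual found")


# Strong regularization keeps x at x_orig; a weaker one lets it reach the target region
used_C, x_cf = first_counterfactual([10.0, 1.5, 0.5, 0.1], solve_toy, lambda x: x >= 2.0)
```

In this toy setting the two largest values of C pin the candidate to the original input, so the search moves on until a weaker regularization yields an accepted candidate; the last value in the list is never tried.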
ceml.sklearn.isolationforest
- class ceml.sklearn.isolationforest.IsolationForest(model, **kwds)
Bases:
ceml.model.model.ModelWithLoss
Class for rebuilding/wrapping the sklearn.ensemble.IsolationForest class.
- Parameters
model (instance of sklearn.ensemble.IsolationForest) – The isolation forest model.
- Raises
TypeError – If model is not an instance of sklearn.ensemble.IsolationForest.
- get_loss(y_target, input_wrapper=None)
Creates and returns a loss function.
- Parameters
y_target (int) – The target class - either +1 or -1
input_wrapper (callable) – Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).
- Returns
Initialized loss function. Target label is y_target.
- Return type
- predict(x)
Predict the output of a given input.
Computes the class label of a given input x.
- Parameters
x (numpy.ndarray) – The input x that is going to be classified.
- Returns
Prediction.
- Return type
int
- class ceml.sklearn.isolationforest.IsolationForestCost(models, y_target, input_wrapper=None, epsilon=0, **kwds)
Bases:
ceml.costfunctions.costfunctions.CostFunction
Loss function of an isolation forest.
The loss is the negative averaged length of the decision paths.
- Parameters
models (list(object)) – List of decision trees.
y_target (int) – The requested prediction - either -1 or +1.
input_wrapper (callable, optional) –
Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).
The default is None.
- score_impl(x)
Implementation of the loss function.
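The idea of a loss based on the averaged decision path length can be sketched with toy trees. The tuple-based tree encoding and the helper names are assumptions for illustration only; the sketch also counts raw splits and omits any leaf-size correction that an actual isolation forest implementation may apply, as well as the dependence on y_target.

```python
def path_length(tree, x):
    """Number of splits x passes before reaching a leaf of a toy tree."""
    depth = 0
    while tree is not None:
        feature, threshold, left, right = tree
        tree = left if x[feature] <= threshold else right
        depth += 1
    return depth


def negative_mean_path_length(trees, x):
    """Negative average of the path lengths of x over all trees."""
    return -sum(path_length(t, x) for t in trees) / len(trees)


# Toy trees: (feature index, threshold, left subtree, right subtree); None marks a leaf
tree_a = (0, 0.0, None, (1, 1.0, None, None))
tree_b = (1, 5.0, (0, 2.0, (1, 0.0, None, None), None), None)

loss_inlier = negative_mean_path_length([tree_a, tree_b], [1.0, 0.5])
loss_outlier = negative_mean_path_length([tree_a, tree_b], [-3.0, 9.0])
```

A point that is isolated after few splits has short paths and therefore a higher (less negative) loss than a point that sits deep in the trees.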
- class ceml.sklearn.isolationforest.IsolationForestCounterfactual(model, **kwds)
Bases:
ceml.sklearn.counterfactual.SklearnCounterfactual
Class for computing a counterfactual of an isolation forest model.
See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.
- compute_counterfactual(x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='nelder-mead', optimizer_args=None, return_as_dict=True, done=None)
Computes a counterfactual of a given input x.
- Parameters
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x. Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.DifferentiableCostFunction if the cost function is differentiable) or None if no regularization is requested.
If regularization is None, no regularization is used.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, the values in C are tried one after another; as soon as a counterfactual is found, it is returned and no further values of C are tried.
If no regularization is used (regularization=None), C is ignored.
The default is 1.0.
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.desc_to_optim() for details.
As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.
The default is “nelder-mead”.
Note
The cost function of an isolation forest model is not differentiable, so a gradient-based optimization algorithm cannot be used.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) –
A callable that returns True if a counterfactual with a given output/prediction is accepted and False otherwise.
If done is None, the output/prediction of the counterfactual must match y_target exactly.
The default is None.
Note
In case of a regression it might not always be possible to achieve a given output/prediction exactly.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
- rebuild_model(model)
Rebuilds a sklearn.ensemble.IsolationForest model.
Converts a sklearn.ensemble.IsolationForest into a ceml.sklearn.isolationforest.IsolationForest.
- Parameters
model (instance of sklearn.ensemble.IsolationForest) – The sklearn isolation forest model.
- Returns
The wrapped isolation forest model.
- Return type
ceml.sklearn.isolationforest.IsolationForest
- ceml.sklearn.isolationforest.isolationforest_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='nelder-mead', optimizer_args=None, return_as_dict=True)
Computes a counterfactual of a given input x.
- Parameters
model (a sklearn.ensemble.IsolationForest instance) – The isolation forest model that is used for computing the counterfactual.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int) – The requested prediction of the counterfactual - either -1 or +1.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.
If regularization is None, no regularization is used.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, the values in C are tried one after another; as soon as a counterfactual is found, it is returned and no further values of C are tried.
C is ignored if no regularization is used (regularization=None).
The default is 1.0.
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.desc_to_optim() for details.
As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.
The default is “nelder-mead”.
Note
The cost function of an isolation forest model is not differentiable, so a gradient-based optimization algorithm cannot be used.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.
The default is True.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
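The interplay between features_whitelist and an input_wrapper can be sketched as follows. The helper `make_input_wrapper` is hypothetical and not part of the documented API; it only illustrates the described idea that an optimizer works on the whitelisted dimensions while the remaining features are filled in from the original input before the model or cost function is evaluated.

```python
def make_input_wrapper(x_orig, features_whitelist):
    """Return a function that expands a reduced vector back to full dimension."""
    def wrapper(z):
        x = list(x_orig)  # copy, so the original input is never modified
        for value, idx in zip(z, features_whitelist):
            x[idx] = value
        return x
    return wrapper


x_orig = [5.0, 3.0, 1.5, 0.2]

# Only features 0 and 2 may change; the optimizer sees a 2-dimensional problem
wrap = make_input_wrapper(x_orig, features_whitelist=[0, 2])
x_full = wrap([6.1, 4.0])
```

Features outside the whitelist keep their original values, so the corresponding entries of the reported changes stay zero under this scheme.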
ceml.sklearn.softmaxregression
- class ceml.sklearn.softmaxregression.SoftmaxCounterfactual(model, **kwds)
Bases:
ceml.sklearn.counterfactual.SklearnCounterfactual, ceml.optim.cvx.MathematicalProgram, ceml.optim.cvx.ConvexQuadraticProgram, ceml.optim.cvx.PlausibleCounterfactualOfHyperplaneClassifier
Class for computing a counterfactual of a softmax regression model.
See parent class ceml.sklearn.counterfactual.SklearnCounterfactual.
- rebuild_model(model)
Rebuilds a sklearn.linear_model.LogisticRegression model.
Converts a sklearn.linear_model.LogisticRegression into a ceml.sklearn.softmaxregression.SoftmaxRegression.
- Parameters
model (instance of sklearn.linear_model.LogisticRegression) – The sklearn softmax regression model.
- Returns
The wrapped softmax regression model.
- Return type
ceml.sklearn.softmaxregression.SoftmaxRegression
- class ceml.sklearn.softmaxregression.SoftmaxRegression(model, **kwds)
Bases:
ceml.model.model.ModelWithLoss
Class for rebuilding/wrapping the sklearn.linear_model.LogisticRegression class.
The SoftmaxRegression class rebuilds a softmax regression model from a given weight vector and intercept.
- Parameters
model (instance of sklearn.linear_model.LogisticRegression) – The softmax regression model.
- w
The weight vector (a matrix if we have more than two classes).
- Type
numpy.ndarray
- b
The intercept/bias (a vector if we have more than two classes).
- Type
numpy.ndarray
- dim
Dimensionality of the input data.
- Type
int
- is_multiclass
True if model is a multiclass classifier (more than two classes), False if it is a binary classifier.
- Type
boolean
- Raises
TypeError – If model is not an instance of sklearn.linear_model.LogisticRegression.
- get_loss(y_target, pred=None)
Creates and returns a loss function.
Builds a negative-log-likelihood cost function where the target is y_target.
- Parameters
y_target (int) – The target class.
pred (callable, optional) –
A callable that maps an input to the output (class probabilities).
If pred is None, the predict method of this class is used for mapping the input to the output (class probabilities).
The default is None.
- Returns
Initialized negative-log-likelihood cost function. Target label is y_target.
- Return type
ceml.backend.jax.costfunctions.NegLogLikelihoodCost
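A negative-log-likelihood cost with a fixed target class can be sketched in a few lines. This is a plain-Python illustration under the assumption that the prediction is a list of class probabilities; the function name `neg_log_likelihood` is hypothetical and the sketch does not reproduce the JAX-based cost class named above.

```python
import math


def neg_log_likelihood(probs, y_target):
    """Negative log of the probability assigned to the target class."""
    return -math.log(probs[y_target])


# The loss is small when the target class already has high probability
loss_confident = neg_log_likelihood([0.05, 0.9, 0.05], y_target=1)
loss_unsure = neg_log_likelihood([0.4, 0.3, 0.3], y_target=1)
```

Minimizing this cost pushes the input towards regions where the model assigns a higher probability to y_target.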
- predict(x)
Predict the output of a given input.
Computes the class probabilities for a given input x.
- Parameters
x (numpy.ndarray) – The input x that is going to be classified.
- Returns
An array containing the class probabilities.
- Return type
jax.numpy.array
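How class probabilities follow from a weight matrix w and an intercept b can be sketched with a standard softmax. The function name `softmax_predict`, the use of plain lists instead of arrays, and the example weights are assumptions for illustration; the class itself returns a jax.numpy array.

```python
import math


def softmax_predict(W, b, x):
    """Class probabilities softmax(W x + b) for a single input x."""
    scores = [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b_k
              for row, b_k in zip(W, b)]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical three-class model on two features (one weight row per class)
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
probs = softmax_predict(W, b, [2.0, 0.5])
predicted_class = probs.index(max(probs))
```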
- ceml.sklearn.softmaxregression.softmaxregression_generate_counterfactual(model, x, y_target, features_whitelist=None, regularization='l1', C=1.0, optimizer='mp', optimizer_args=None, return_as_dict=True, done=None, plausibility=None)
Computes a counterfactual of a given input x.
- Parameters
model (a sklearn.linear_model.LogisticRegression instance) –
The softmax regression model that is used for computing the counterfactual.
Note: model.multi_class must be set to multinomial.
x (numpy.ndarray) – The input x whose prediction has to be explained.
y_target (int or float or a callable that returns True if a given prediction is accepted.) – The requested prediction of the counterfactual.
features_whitelist (list(int), optional) –
List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.
If features_whitelist is None, all features can be used.
The default is None.
regularization (str or ceml.costfunctions.costfunctions.CostFunction, optional) –
Regularizer of the counterfactual. Penalty for deviating from the original input x.
Supported values:
l1: Penalizes the absolute deviation.
l2: Penalizes the squared deviation.
regularization can be a description of the regularization, an instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.CostFunctionDifferentiable if your cost function is differentiable) or None if no regularization is requested.
If regularization is None, no regularization is used.
The default is “l1”.
C (float or list(float), optional) –
The regularization strength. If C is a list, the values in C are tried one after another; as soon as a counterfactual is found, it is returned and no further values of C are tried.
C is ignored if no regularization is used (regularization=None).
The default is 1.0.
optimizer (str or instance of ceml.optim.optimizer.Optimizer, optional) –
Name/Identifier of the optimizer that is used for computing the counterfactual. See ceml.optim.optimizer.prepare_optim() for details.
Softmax regression supports the use of mathematical programs for computing counterfactuals: set optimizer to “mp” to compute the counterfactual with a convex quadratic program. Note that in this case the hyperparameter C is ignored.
As an alternative, we can use any (custom) optimizer that is derived from the ceml.optim.optimizer.Optimizer class.
The default is “mp”.
optimizer_args (dict, optional) –
Dictionary for overriding the default hyperparameters of the optimization algorithm.
The default is None.
return_as_dict (boolean, optional) –
If True, returns the counterfactual, its prediction and the needed changes to the input as a dictionary. If False, the results are returned as a triple.
The default is True.
done (callable, optional) – Not used.
plausibility (dict, optional) –
If set to a valid dictionary (see ceml.sklearn.plausibility.prepare_computation_of_plausible_counterfactuals()), a plausible counterfactual (as proposed in Artelt et al. 2020) is computed. Note that in this case, all other parameters are ignored.
If plausibility is None, the closest counterfactual is computed.
The default is None.
- Returns
A dictionary where the counterfactual is stored in ‘x_cf’, its prediction in ‘y_cf’ and the changes to the original input in ‘delta’.
(x_cf, y_cf, delta) : triple if return_as_dict is False
- Return type
dict or triple
- Raises
Exception – If no counterfactual was found.
ceml.sklearn.utils
- ceml.sklearn.utils.build_regularization_loss(regularization, x, input_wrapper=None)
Build a regularization loss.
- Parameters
regularization (str, ceml.costfunctions.costfunctions.CostFunction or None) –
Description of the regularization, instance of ceml.costfunctions.costfunctions.CostFunction (or ceml.costfunctions.costfunctions.DifferentiableCostFunction if your cost function is differentiable) or None if no regularization is requested.
See ceml.sklearn.utils.desc_to_regcost() for a list of supported descriptions.
If no regularization is requested, an instance of ceml.backend.jax.costfunctions.costfunctions.DummyCost is returned. This cost function always outputs zero, no matter what the input is.
x (numpy.array) – The original input from which we do not want to deviate much.
input_wrapper (callable, optional) –
Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).
If input_wrapper is None, the input is passed without any modifications.
The default is None.
- Returns
An instance of ceml.costfunctions.costfunctions.CostFunction or the user-defined, callable regularization.
- Return type
callable
- Raises
TypeError – If regularization has an invalid type.
- ceml.sklearn.utils.desc_to_dist(desc)
Converts a description of a distance metric into a jax.numpy function.
Supported descriptions:
l1: l1-norm
l2: l2-norm
- Parameters
desc (str) – Description of the distance metric.
- Returns
The distance function implemented as a jax.numpy function.
- Return type
callable
- Raises
ValueError – If desc contains an invalid description.
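The mapping from a description string to a distance function can be sketched as below. This is a plain-Python stand-in, not the library's jax.numpy implementation; the name `desc_to_dist_sketch` is hypothetical, and the l2 case is assumed here to be the (non-squared) Euclidean norm of the difference.

```python
import math


def desc_to_dist_sketch(desc):
    """Map 'l1' or 'l2' to a distance function on two equal-length sequences."""
    if desc == "l1":
        return lambda x, y: sum(abs(a - b) for a, b in zip(x, y))
    if desc == "l2":
        return lambda x, y: math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    raise ValueError(f"Invalid distance description: {desc!r}")


d1 = desc_to_dist_sketch("l1")([0.0, 0.0], [3.0, 4.0])
d2 = desc_to_dist_sketch("l2")([0.0, 0.0], [3.0, 4.0])
```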
- ceml.sklearn.utils.desc_to_regcost(desc, x, input_wrapper)
Converts a description of a regularization into a jax.numpy function.
Supported descriptions:
l1: l1-regularization
l2: l2-regularization
- Parameters
desc (str) – Description of the regularization.
x (numpy.array) – The original input from which we do not want to deviate much.
input_wrapper (callable) – Converts the input (e.g. if we want to exclude some features/dimensions, we might have to include these missing features before applying any function to it).
- Returns
The regularization function implemented as a jax.numpy function.
- Return type
callable
- Raises
ValueError – If desc contains an invalid description.
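A regularization cost bound to an original input x, with an optional input_wrapper, can be sketched as follows. The name `desc_to_regcost_sketch` and the list-based arithmetic are assumptions; the l2 case follows the description used elsewhere in this reference (squared deviation), and the real function returns a jax.numpy-based callable.

```python
def desc_to_regcost_sketch(desc, x, input_wrapper=None):
    """Return a cost z -> penalty for deviating from the original input x."""
    # Without a wrapper the candidate is used as-is
    wrap = input_wrapper if input_wrapper is not None else (lambda z: z)
    if desc == "l1":
        return lambda z: sum(abs(a - b) for a, b in zip(wrap(z), x))
    if desc == "l2":
        return lambda z: sum((a - b) ** 2 for a, b in zip(wrap(z), x))
    raise ValueError(f"Invalid regularization description: {desc!r}")


x = [1.0, 2.0, 3.0]
cost_l1 = desc_to_regcost_sketch("l1", x)([1.5, 2.0, 1.0])
cost_l2 = desc_to_regcost_sketch("l2", x)([1.5, 2.0, 1.0])

# Hypothetical wrapper: feature 1 is frozen at its original value
frozen = lambda z: [z[0], 2.0, z[1]]
cost_wrapped = desc_to_regcost_sketch("l1", x, input_wrapper=frozen)([1.5, 1.0])
```

With the wrapper in place, the cost is evaluated on the full-dimensional point even though the optimizer only supplies the free features.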