ceml.optim

ceml.optim.input_wrapper

class ceml.optim.input_wrapper.InputWrapper(features_whitelist, x_orig, **kwds)

Bases: object

Class for wrapping an input.

The InputWrapper class wraps an input to hide some of its dimensions/features from subsequent methods.

Parameters
  • features_whitelist (list(int)) –

    A non-empty list of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

    If features_whitelist is None, all features can be used.

  • x_orig (numpy.array) – The original input that is going to be wrapped - this is the input whose prediction has to be explained.

Raises

ValueError – If features_whitelist is an empty list.

complete(x)

Completing a given input.

Adds the fixed/hidden dimensions from the original input to the given input.

Parameters

x (array_like) – The input to be completed.

Returns

The completed input.

Return type

numpy.array

extract_from(x)

Extracts the whitelisted dimensions from a given input.

Parameters

x (array_like) – The input to be processed.

Returns

The extracted input - only whitelisted features/dimensions are kept.

Return type

numpy.array
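The relationship between extract_from and complete can be illustrated with a small stand-in class. This is only a sketch of the documented behaviour, not the ceml implementation: the class name InputWrapperSketch and its internals are assumptions made for illustration.

```python
import numpy as np


class InputWrapperSketch:
    """Hypothetical stand-in mirroring the documented behaviour (not ceml code)."""

    def __init__(self, features_whitelist, x_orig):
        if features_whitelist is not None and len(features_whitelist) == 0:
            raise ValueError("features_whitelist must not be an empty list")
        self.features_whitelist = features_whitelist
        self.x_orig = np.asarray(x_orig, dtype=float)

    def extract_from(self, x):
        # Keep only the whitelisted dimensions.
        x = np.asarray(x, dtype=float)
        if self.features_whitelist is None:
            return x
        return x[self.features_whitelist]

    def complete(self, x):
        # Start from the original input and overwrite the whitelisted entries.
        x = np.asarray(x, dtype=float)
        if self.features_whitelist is None:
            return x
        x_full = self.x_orig.copy()
        x_full[self.features_whitelist] = x
        return x_full


wrapper = InputWrapperSketch([0, 2], x_orig=[1.0, 2.0, 3.0])
reduced = wrapper.extract_from([1.0, 2.0, 3.0])  # only features 0 and 2
restored = wrapper.complete([10.0, 30.0])        # feature 1 is taken from x_orig
```

In this sketch, an optimizer would only ever see the reduced vector, and complete restores the hidden feature from x_orig before the model is evaluated.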

ceml.optim.optimizer

class ceml.optim.optimizer.BFGS(**kwds)

Bases: ceml.optim.optimizer.Optimizer

BFGS optimization algorithm.

Note

The BFGS optimization algorithm is a Quasi-Newton method.

init(f, f_grad, x0, tol=None, max_iter=None)

Initializes all parameters.

Parameters
  • f (callable) – The objective that is minimized.

  • f_grad (callable) – The gradient of the objective.

  • x0 (numpy.array) – The initial value of the unknown variable.

  • tol (float, optional) –

    Tolerance for termination.

    tol=None is equivalent to tol=0.

    The default is None.

  • max_iter (int, optional) –

    Maximum number of iterations.

    If max_iter is None, the default value of the particular optimization algorithm is used.

    Default is None.
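Gradient-based optimizers take both an objective f and its gradient f_grad. The sketch below shows what such a pair might look like and checks the gradient against central finite differences before it is passed on. The objective, the helper numerical_grad and the assumption that f returns a float and f_grad returns a numpy.array are all illustrative and not taken from ceml.

```python
import numpy as np

# Hypothetical objective: f(x) = ||x - target||^2
target = np.array([1.0, -2.0])


def f(x):
    return float(np.sum((x - target) ** 2))


def f_grad(x):
    # Analytic gradient of f.
    return 2.0 * (x - target)


def numerical_grad(func, x, h=1e-6):
    # Central finite differences, one coordinate at a time.
    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[i] = h
        grad[i] = (func(x + step) - func(x - step)) / (2.0 * h)
    return grad


x0 = np.zeros(2)
max_abs_error = float(np.max(np.abs(f_grad(x0) - numerical_grad(f, x0))))
```

A small max_abs_error suggests that f_grad is consistent with f, which is worth verifying because a wrong gradient usually makes quasi-Newton methods stall or diverge.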

class ceml.optim.optimizer.ConjugateGradients(**kwds)

Bases: ceml.optim.optimizer.Optimizer

Conjugate gradients optimization algorithm.

init(f, f_grad, x0, tol=None, max_iter=None)

Initializes all parameters.

Parameters
  • f (callable) – The objective that is minimized.

  • f_grad (callable) – The gradient of the objective.

  • x0 (numpy.array) – The initial value of the unknown variable.

  • tol (float, optional) –

    Tolerance for termination.

    tol=None is equivalent to tol=0.

    The default is None.

  • max_iter (int, optional) –

    Maximum number of iterations.

    If max_iter is None, the default value of the particular optimization algorithm is used.

    Default is None.

class ceml.optim.optimizer.NelderMead(**kwds)

Bases: ceml.optim.optimizer.Optimizer

Nelder-Mead optimization algorithm.

Note

The Nelder-Mead algorithm is a gradient-free optimization algorithm.

init(f, x0, tol=None, max_iter=None)

Initializes all parameters.

Parameters
  • f (callable) – The objective that is minimized.

  • x0 (numpy.array) – The initial value of the unknown variable.

  • tol (float, optional) –

    Tolerance for termination.

    tol=None is equivalent to tol=0.

    The default is None.

  • max_iter (int, optional) –

    Maximum number of iterations.

    If max_iter is None, the default value of the particular optimization algorithm is used.

    Default is None.

class ceml.optim.optimizer.Optimizer(**kwds)

Bases: abc.ABC

Abstract base class of an optimizer.

All optimizers must be derived from the Optimizer class.

Note

Any class derived from Optimizer has to implement the abstract methods init, __call__ and is_grad_based.
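The three abstract methods can be pictured with a self-contained stand-in. OptimizerSketch and GradientDescentSketch are hypothetical names; the sketch does not import ceml, and the assumption that __call__ returns the final point is not confirmed by this reference.

```python
from abc import ABC, abstractmethod

import numpy as np


class OptimizerSketch(ABC):
    """Hypothetical stand-in for the abstract interface (not ceml code)."""

    @abstractmethod
    def init(self, *args, **kwds):
        """Store the objective, starting point and stopping criteria."""

    @abstractmethod
    def is_grad_based(self):
        """Return True if the algorithm needs a gradient."""

    @abstractmethod
    def __call__(self):
        """Run the optimization and return the final point (assumed contract)."""


class GradientDescentSketch(OptimizerSketch):
    """Fixed-step gradient descent, used only to illustrate the interface."""

    def __init__(self, step_size=0.1, default_max_iter=1000):
        self.step_size = step_size
        self.default_max_iter = default_max_iter

    def init(self, f, f_grad, x0, tol=None, max_iter=None):
        self.f = f
        self.f_grad = f_grad
        self.x0 = np.asarray(x0, dtype=float)
        self.tol = 0.0 if tol is None else tol  # tol=None behaves like tol=0
        self.max_iter = self.default_max_iter if max_iter is None else max_iter

    def is_grad_based(self):
        return True

    def __call__(self):
        x = self.x0.copy()
        for _ in range(self.max_iter):
            step = self.step_size * self.f_grad(x)
            x = x - step
            if np.linalg.norm(step) <= self.tol:
                break
        return x


optimizer = GradientDescentSketch()
optimizer.init(f=lambda x: float(np.sum((x - 3.0) ** 2)),
               f_grad=lambda x: 2.0 * (x - 3.0),
               x0=np.array([0.0]), tol=1e-9)
x_min = optimizer()
```

Separating construction (algorithm settings) from init (problem data) mirrors the signatures documented here, where every optimizer class takes only **kwds and receives the objective later.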

class ceml.optim.optimizer.Powell(**kwds)

Bases: ceml.optim.optimizer.Optimizer

Powell optimization algorithm.

Note

The Powell algorithm is a gradient-free optimization algorithm.

init(f, x0, tol=None, max_iter=None)

Initializes all parameters.

Parameters
  • f (callable) – The objective that is minimized.

  • x0 (numpy.array) – The initial value of the unknown variable.

  • tol (float, optional) –

    Tolerance for termination.

    tol=None is equivalent to tol=0.

    The default is None.

  • max_iter (int, optional) –

    Maximum number of iterations.

    If max_iter is None, the default value of the particular optimization algorithm is used.

    Default is None.

ceml.optim.optimizer.is_optimizer_grad_based(optim)

Determines whether a specific optimization algorithm (specified by a description or an instance in optim) needs a gradient.

Supported descriptions:

  • nelder-mead: Gradient-free Nelder-Mead optimizer (also called downhill simplex method)

  • powell: Gradient-free Powell optimizer

  • bfgs: BFGS optimizer

  • cg: Conjugate gradients optimizer

Parameters

optim (str or instance of ceml.optim.optimizer.Optimizer) – Description of the optimization algorithm or an instance of ceml.optim.optimizer.Optimizer.

Returns

True if the optimization algorithm needs a gradient, False otherwise.

Return type

bool

Raises
  • ValueError – If optim contains an invalid description.

  • TypeError – If optim is neither a string nor an instance of ceml.optim.optimizer.Optimizer.
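The documented behaviour amounts to a lookup on the description string plus a type check. The following sketch reproduces that logic without importing ceml; the function name, the lookup table and the duck-typed check for an is_grad_based method are assumptions made for illustration.

```python
# Which of the supported descriptions need a gradient, as listed above.
GRAD_BASED_BY_DESCRIPTION = {
    "nelder-mead": False,
    "powell": False,
    "bfgs": True,
    "cg": True,
}


def is_grad_based_sketch(optim):
    """Hypothetical re-implementation of the documented contract."""
    if isinstance(optim, str):
        if optim not in GRAD_BASED_BY_DESCRIPTION:
            raise ValueError(f"Invalid optimizer description: {optim!r}")
        return GRAD_BASED_BY_DESCRIPTION[optim]
    # Stand-in for isinstance(optim, Optimizer): accept anything that
    # exposes an is_grad_based method.
    if callable(getattr(optim, "is_grad_based", None)):
        return bool(optim.is_grad_based())
    raise TypeError("optim must be a string or an optimizer instance")


needs_gradient = {name: is_grad_based_sketch(name) for name in GRAD_BASED_BY_DESCRIPTION}
```

A check like this is useful before calling prepare_optim, since a gradient-based description without f_grad is documented to raise a ValueError.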

ceml.optim.optimizer.prepare_optim(optim, f, x0, f_grad=None, tol=None, max_iter=None)

Creates and initializes an optimization algorithm (instance of ceml.optim.optimizer.Optimizer) specified by a description of the algorithm.

Supported descriptions:

  • nelder-mead: Gradient-free Nelder-Mead optimizer (also called downhill simplex method)

  • powell: Gradient-free Powell optimizer

  • bfgs: BFGS optimizer

  • cg: Conjugate gradients optimizer

Parameters
  • optim (str or instance of ceml.optim.optimizer.Optimizer) – Description of the optimization algorithm or an instance of ceml.optim.optimizer.Optimizer.

  • f (instance of ceml.costfunctions.costfunctions.CostFunction or callable) – The objective that has to be minimized.

  • x0 (numpy.array) – The initial value of the unknown variable.

  • f_grad (callable, optional) –

    The gradient of the objective.

    If f_grad is None, no gradient is used. Note that some optimization algorithms require a gradient!

    The default is None.

  • tol (float, optional) –

    Tolerance for termination.

    tol=None is equivalent to tol=0.

    The default is None.

  • max_iter (int, optional) –

    Maximum number of iterations.

    If max_iter is None, the default value of the particular optimization algorithm is used.

    Default is None.

Returns

An instance of ceml.optim.optimizer.Optimizer

Return type

callable

Raises
  • ValueError – If optim contains an invalid description, or if no gradient is specified but optim describes a gradient-based optimization algorithm.

  • TypeError – If optim is neither a string nor an instance of ceml.optim.optimizer.Optimizer.

ceml.optim.ga

class ceml.optim.ga.EvolutionaryOptimizer(population_size=100, select_by_fitness=0.5, mutation_prob=0.1, mutation_scaling=4.0, **kwds)

Bases: ceml.optim.optimizer.Optimizer

Evolutionary/Genetic optimization algorithm.

Note

This genetic algorithm is a gradient-free optimization algorithm.

This implementation encodes an individual as a numpy.array - if you want to use a different representation, you have to derive a new class from this class and reimplement all relevant methods.

Parameters
  • population_size (int) –

    The size of the population

    The default is 100

  • select_by_fitness (float) –

    The fraction of individuals that is selected according to their fitness.

    The default is 0.5

  • mutation_prob (float) –

    The probability that an offspring is mutated.

    The default is 0.1

  • mutation_scaling (float) –

    Standard deviation of the normal distribution for mutating features.

    The default is 4.0

compute_fitness(x)

Computes the fitness of a given individual x.

Parameters

x (numpy.array) – The representation of the individual.

crossover(x0, x1)

Produces an offspring from the individuals x0 and x1.

Note

This method implements single-point crossover. If you want to use a different crossover strategy, you have to derive a new class from this one and reimplement the method crossover.

Parameters
  • x0 (numpy.array) – The representation of the first individual.

  • x1 (numpy.array) – The representation of the second individual.

Returns

The representation of the offspring created from x0 and x1.

Return type

numpy.array
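Single-point crossover picks one cut position and joins the head of one parent with the tail of the other. The sketch below shows the idea on numpy arrays; the function name, the explicit rng argument and the choice to always take the head from x0 are assumptions, not details of the ceml implementation.

```python
import numpy as np


def single_point_crossover(x0, x1, rng):
    """Illustrative single-point crossover for 1-D arrays of length >= 2."""
    x0 = np.asarray(x0)
    x1 = np.asarray(x1)
    # Cut point in 1..d-1, so that both parents contribute at least one feature.
    point = int(rng.integers(1, x0.shape[0]))
    return np.concatenate([x0[:point], x1[point:]])


rng = np.random.default_rng(0)
child = single_point_crossover(np.zeros(5), np.ones(5), rng)
```

With an all-zeros and an all-ones parent, the child is a run of zeros followed by a run of ones, which makes the cut position directly visible.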

init(f, x0, tol=None, max_iter=None)

Initializes all remaining parameters.

Parameters
  • f (callable) – The objective that is minimized.

  • x0 (numpy.array) – The initial value of the unknown variable.

  • tol (float, optional) –

    Tolerance for termination.

    tol=None is equivalent to tol=0.

    The default is None.

  • max_iter (int, optional) –

    Maximum number of iterations.

    If max_iter is None, the default value of the particular optimization algorithm is used.

    Default is None.

mutate(x)

Mutates a given individual x.

Parameters

x (numpy.array) – The representation of the individual.

Returns

The representation of the mutated individual x.

Return type

numpy.array
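The roles of mutation_prob and mutation_scaling can be sketched as follows. This is one plausible reading of the parameter descriptions, assuming the whole individual is perturbed with Gaussian noise once it has been selected for mutation; the function name and the rng argument are hypothetical.

```python
import numpy as np


def mutate_sketch(x, mutation_prob=0.1, mutation_scaling=4.0, rng=None):
    """Illustrative Gaussian mutation; returns a new array."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    # With probability 1 - mutation_prob the individual is left unchanged.
    if rng.random() >= mutation_prob:
        return x.copy()
    # Otherwise add zero-mean normal noise with standard deviation mutation_scaling.
    return x + rng.normal(loc=0.0, scale=mutation_scaling, size=x.shape)


rng = np.random.default_rng(0)
x = np.array([1.0, 2.0, 3.0])
unchanged = mutate_sketch(x, mutation_prob=0.0, rng=rng)
mutated = mutate_sketch(x, mutation_prob=1.0, rng=rng)
```

A large mutation_scaling relative to the feature ranges favours exploration, while a small one keeps offspring close to their parents.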

select_candidates(fitness)

Selects the fittest individuals from the current population for producing offspring.

Parameters

fitness (list(float)) – Fitness of the individuals.

Returns

The selected individuals.

Return type

list(numpy.array)
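Selection by fitness can be pictured as keeping the best fraction of the population. In the sketch below the population is passed explicitly, and a lower fitness value is assumed to be better because the optimizers here minimize an objective; both choices, as well as the function name, are assumptions rather than documented behaviour.

```python
import numpy as np


def select_fittest(population, fitness, select_by_fitness=0.5):
    """Illustrative truncation selection: keep the best fraction."""
    n_keep = max(1, int(len(population) * select_by_fitness))
    # Assumption: lower fitness is better (the objective is minimized).
    order = np.argsort(fitness)
    return [population[i] for i in order[:n_keep]]


population = [np.array([0.0]), np.array([1.0]), np.array([2.0]), np.array([3.0])]
fitness = [3.0, 1.0, 4.0, 2.0]
selected = select_fittest(population, fitness)
```

Here the individuals with fitness 1.0 and 2.0 are kept, in that order.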

validate(x)

Validates a given individual x.

This method checks whether a given individual is valid (in the sense that the feature characteristics are valid) and, if not, makes it valid by changing some of its features.

Note

This implementation is equivalent to the identity function. The input is returned without any changes - we do not restrict the input space! If you want to make some restrictions on the input space, you have to derive a new class from this one and reimplement the method validate.

Parameters

x (numpy.array) – The representation of the individual x.

Returns

The representation of the validated individual.

Return type

numpy.array
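Restricting the input space means overriding validate in a subclass. The sketch below uses a stand-in base class with the documented identity behaviour and a subclass that clips every feature to a box; both class names and the box constraint are hypothetical, and a real subclass would derive from EvolutionaryOptimizer instead.

```python
import numpy as np


class IdentityValidateSketch:
    """Stand-in for the documented default: validate returns x unchanged."""

    def validate(self, x):
        return x


class BoxValidateSketch(IdentityValidateSketch):
    """Repairs an individual by clipping each feature to [lower, upper]."""

    def __init__(self, lower, upper):
        self.lower = np.asarray(lower, dtype=float)
        self.upper = np.asarray(upper, dtype=float)

    def validate(self, x):
        return np.clip(np.asarray(x, dtype=float), self.lower, self.upper)


repaired = BoxValidateSketch(lower=[0.0, 0.0], upper=[1.0, 10.0]).validate([-0.5, 12.0])
```

Clipping is only one possible repair strategy; rounding integer features or re-sampling invalid entries would fit the same hook.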

ceml.optim.cvx

class ceml.optim.cvx.ConvexQuadraticProgram(**kwds)

Bases: abc.ABC

Base class for a convex quadratic program - for computing counterfactuals.

epsilon

“Small” non-negative number for relaxing strict inequalities.

Type

float

build_solve_opt(x_orig, y, features_whitelist=None, mad=None)

Builds and solves the convex quadratic optimization problem.

Parameters
  • x_orig (numpy.ndarray) – The original data point.

  • y (int or float) – The requested prediction of the counterfactual - e.g. a class label.

  • features_whitelist (list(int), optional) –

    List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

    If features_whitelist is None, all features can be used.

    The default is None.

  • mad (numpy.ndarray, optional) –

    Weights for the weighted Manhattan distance.

    If mad is None, the Euclidean distance is used.

    The default is None.

Returns

The solution of the optimization problem.

If no solution exists, None is returned.

Return type

numpy.ndarray

class ceml.optim.cvx.DCQP(**kwds)

Bases: object

Class for a difference-of-convex-quadratic program (DCQP) - for computing counterfactuals.

\[\underset{\vec{x} \in \mathbb{R}^d}{\min} \vec{x}^\top Q_0 \vec{x} + \vec{q}^\top \vec{x} + c - \vec{x}^\top Q_1 \vec{x} \quad \text{s.t. } \vec{x}^\top A0_i \vec{x} + \vec{x}^\top \vec{b_i} + r_i - \vec{x}^\top A1_i \vec{x} \leq 0 \; \forall\,i\]

pccp

Implementation of the penalty convex-concave procedure for approximately solving the DCQP.

Type

instance of ceml.optim.cvx.PenaltyConvexConcaveProcedure

epsilon

“Small” non-negative number for relaxing strict inequalities.

Type

float

build_program(model, x_orig, y_target, Q0, Q1, q, c, A0_i, A1_i, b_i, r_i, features_whitelist=None, mad=None)

Builds the DCQP.

Parameters
  • model (object) – The model that is used for computing the counterfactual - must provide a method predict.

  • x_orig (numpy.ndarray) – The data point x_orig whose prediction has to be explained.

  • y_target (int or float) – The requested prediction of the counterfactual - e.g. a class label.

  • Q0 (numpy.ndarray) – The matrix Q_0 of the DCQP.

  • Q1 (numpy.ndarray) – The matrix Q_1 of the DCQP.

  • q (numpy.ndarray) – The vector q of the DCQP.

  • c (float) – The constant c of the DCQP.

  • A0_i (list(numpy.ndarray)) – List of matrices A0_i of the DCQP.

  • A1_i (list(numpy.ndarray)) – List of matrices A1_i of the DCQP.

  • b_i (list(numpy.ndarray)) – List of vectors b_i of the DCQP.

  • r_i (list(float)) – List of constants r_i of the DCQP.

  • features_whitelist (list(int), optional) –

    List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

    If features_whitelist is None, all features can be used.

    The default is None.

  • mad (numpy.ndarray, optional) –

    Weights for the weighted Manhattan distance.

    If mad is None, the Euclidean distance is used.

    The default is None.
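To make the roles of Q0, Q1, q, c, A0_i, A1_i, b_i and r_i concrete, the following sketch evaluates the DCQP objective and constraints from the formula above at a given point. It only evaluates the expressions and does not solve anything; the function names and the toy matrices are assumptions for illustration.

```python
import numpy as np


def dcqp_objective(x, Q0, Q1, q, c):
    # x^T Q0 x + q^T x + c - x^T Q1 x
    return float(x @ Q0 @ x + q @ x + c - x @ Q1 @ x)


def dcqp_constraints(x, A0_i, A1_i, b_i, r_i):
    # One value per constraint: x^T A0_i x + x^T b_i + r_i - x^T A1_i x
    return [float(x @ A0 @ x + x @ b + r - x @ A1 @ x)
            for A0, A1, b, r in zip(A0_i, A1_i, b_i, r_i)]


x = np.array([1.0, 2.0])
objective_value = dcqp_objective(x, Q0=np.eye(2), Q1=np.zeros((2, 2)),
                                 q=np.array([1.0, 0.0]), c=2.0)
constraint_values = dcqp_constraints(x, A0_i=[np.eye(2)], A1_i=[2.0 * np.eye(2)],
                                     b_i=[np.array([0.0, 1.0])], r_i=[-1.0])
is_feasible = all(value <= 0.0 for value in constraint_values)
```

Each list argument holds one entry per constraint, so A0_i, A1_i, b_i and r_i must all have the same length.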

solve(x0, tao=1.2, tao_max=100, mu=1.5)

Approximately solves the DCQP by using the penalty convex-concave procedure.

Parameters
  • x0 (numpy.ndarray) – The initial data point for the penalty convex-concave procedure. This can be any point, although a “good” initial solution might lead to a better result.

  • tao (float, optional) –

    Hyperparameter - see paper for details.

    The default is 1.2

  • tao_max (float, optional) –

    Hyperparameter - see paper for details.

    The default is 100

  • mu (float, optional) –

    Hyperparameter - see paper for details.

    The default is 1.5
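In the standard formulation of the penalty convex-concave procedure, the penalty weight on constraint violations starts at tao, is multiplied by mu after every iteration, and is capped at tao_max. Assuming ceml follows that scheme (this reference does not spell it out), the defaults produce the schedule sketched below; the function name and n_updates are hypothetical.

```python
def penalty_schedule(tao=1.2, tao_max=100, mu=1.5, n_updates=15):
    """Penalty weights under the assumed update tao <- min(mu * tao, tao_max)."""
    values = [tao]
    for _ in range(n_updates):
        tao = min(mu * tao, tao_max)
        values.append(tao)
    return values


schedule = penalty_schedule()
```

Under this assumption, a larger mu tightens the constraints faster, while tao_max bounds the penalty so that the subproblems stay numerically well behaved.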

class ceml.optim.cvx.MathematicalProgram(**kwds)

Bases: object

Base class for a mathematical program.

class ceml.optim.cvx.PenaltyConvexConcaveProcedure(model, Q0, Q1, q, c, A0_i, A1_i, b_i, r_i, features_whitelist=None, mad=None, **kwds)

Bases: object

Implementation of the penalty convex-concave procedure for approximately solving a DCQP.

class ceml.optim.cvx.SDP(**kwds)

Bases: abc.ABC

Base class for a semi-definite program (SDP) - for computing counterfactuals.

epsilon

“Small” non-negative number for relaxing strict inequalities.

Type

float

build_solve_opt(x_orig, y, features_whitelist=None)

Builds and solves the SDP.

Parameters
  • x_orig (numpy.ndarray) – The original data point.

  • y (int or float) – The requested prediction of the counterfactual - e.g. a class label.

  • features_whitelist (list(int), optional) –

    List of feature indices (dimensions of the input space) that can be used when computing the counterfactual.

    If features_whitelist is None, all features can be used.

    The default is None.

Returns

The solution of the optimization problem.

If no solution exists, None is returned.

Return type

numpy.ndarray