API documentation of the XLEMOO framework

class XLEMOO.LEMOO.CrossOverOP[source]

An abstract class that defines the general interface for a crossover operator.

abstract do(population: Population, mating_pop_ids: List[int]) → ndarray[source]

The crossover operator should have a ‘do’ method as specified in the abstract class.

Parameters:

population (Population) – A population of solutions.
mating_pop_ids (List[int]) – The indices selected for mating.

Returns:

The offspring individuals resulting from applying the crossover operator.

Return type:

np.ndarray
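
A concrete operator subclasses the abstract interface and implements do. The following is a minimal sketch and not part of XLEMOO: CrossOverOPSketch and ListPopulation are hypothetical stand-ins so that the example runs without the framework, individuals are plain lists instead of NumPy arrays, and it is an assumption that a population exposes its decision vectors through an individuals attribute.

```python
import random
from abc import ABC, abstractmethod
from typing import List


class ListPopulation:
    """Hypothetical stand-in for a Population; holds decision vectors as lists."""

    def __init__(self, individuals: List[List[float]]):
        self.individuals = individuals


class CrossOverOPSketch(ABC):
    """Stand-in mirroring the documented interface of CrossOverOP."""

    @abstractmethod
    def do(self, population: ListPopulation, mating_pop_ids: List[int]) -> List[List[float]]:
        ...


class SinglePointCrossover(CrossOverOPSketch):
    """Pairs consecutive mating indices and swaps the tails of each pair."""

    def __init__(self, seed: int = 0):
        self._rng = random.Random(seed)

    def do(self, population, mating_pop_ids):
        offspring = []
        # Walk the selected indices two at a time; an unpaired last index is skipped.
        for a, b in zip(mating_pop_ids[0::2], mating_pop_ids[1::2]):
            p1, p2 = population.individuals[a], population.individuals[b]
            # Cut point strictly inside the vector (needs at least two variables).
            cut = self._rng.randint(1, len(p1) - 1)
            offspring.append(p1[:cut] + p2[cut:])
            offspring.append(p2[:cut] + p1[cut:])
        return offspring


pop = ListPopulation([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0]])
children = SinglePointCrossover(seed=1).do(pop, mating_pop_ids=[0, 1, 2, 3])
```

Each pair of parents yields two children, so four mating indices produce four offspring of the same length as the parents.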

class XLEMOO.LEMOO.DummyPopulation(individuals: ndarray, problem: MOProblem)[source]: Used for testing.

class XLEMOO.LEMOO.EAParams(population_size: int, cross_over_op: CrossOverOP, mutation_op: MutationOP, selection_op: SelectionOP, population_init_design: str, iterations_per_cycle: int)[source]

A data class to store and pass parameter values related to the Darwinian mode of a LEMOO method.

Parameters:

population_size (int) – The size of the population to be evolved.
cross_over_op (CrossOverOP) – The crossover operator.
mutation_op (MutationOP) – The mutation operator.
selection_op (SelectionOP) – The selection operator.
population_init_design (str) – Initialization strategy of the population. Should be ‘Random’ for random sampling or ‘LHSDesign’ for Latin hypercube sampling.
iterations_per_cycle (int) – How many times a population is evolved in a Darwinian mode before switching to a learning mode. Only relevant when a LEMOO method is run using the run_iterations method.

class XLEMOO.LEMOO.ElitismOP[source]

An abstract class that defines the general interface for an elitism operator.

abstract do(pop1: Population, pop1_fitness: ndarray, pop2: Population, pop2_fitness: ndarray) → ndarray[source]

The elitism operator should have a ‘do’ method as specified in the abstract class. The elitism operator compares two populations and returns the best individuals from them, according to their fitness values.

Parameters:

pop1 (Population) – The first population.
pop1_fitness (np.ndarray) – The fitness values in the first population. It is assumed the fitnesses are related to the individuals in the population by index.
pop2 (Population) – The second population.
pop2_fitness (np.ndarray) – The fitness values in the second population. It is assumed the fitnesses are related to the individuals in the population by index.

Returns:

The elite individuals resulting from applying the elitism operator.

Return type:

np.ndarray
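
One possible elitism strategy is to merge both populations and keep the best individuals. The sketch below is an illustration only, not the operator shipped with XLEMOO: elitism_sketch is a hypothetical function, plain lists replace Population and NumPy arrays, and it assumes that a smaller fitness value is better and that as many individuals are kept as there are in the first population.

```python
from typing import List, Sequence


def elitism_sketch(
    pop1: Sequence[List[float]],
    pop1_fitness: Sequence[float],
    pop2: Sequence[List[float]],
    pop2_fitness: Sequence[float],
) -> List[List[float]]:
    """Keep the len(pop1) best individuals out of both populations.

    Assumption: a smaller fitness value is better.
    """
    # Fitness values are matched to individuals by index, as the interface assumes.
    merged = list(zip(list(pop1_fitness) + list(pop2_fitness), list(pop1) + list(pop2)))
    merged.sort(key=lambda pair: pair[0])
    return [individual for _, individual in merged[: len(pop1)]]


elite = elitism_sketch(
    pop1=[[0.1, 0.2], [0.9, 0.9]],
    pop1_fitness=[0.3, 1.8],
    pop2=[[0.0, 0.1], [0.5, 0.5]],
    pop2_fitness=[0.1, 1.0],
)
```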

class XLEMOO.LEMOO.LEMParams(use_ml: bool, use_darwin: bool, fitness_indicator: Callable[[ndarray, Optional[ndarray]], ndarray], ml_probe: int, darwin_probe: int, ml_threshold: float, darwin_threshold: float, total_iterations: int)[source]

A data class to store and pass general parameter values of a LEMOO method.

Parameters:

use_ml (bool) – Whether to engage in a learning mode or not.
use_darwin (bool) – Whether to engage in a Darwinian mode or not.
fitness_indicator (Callable[[np.ndarray, Optional[np.ndarray]], np.ndarray]) – A fitness function that accepts a 2d numpy array that represents the population individuals in the objective space of the problem. Optionally, the decision variable values may also be passed for each population individual.
ml_probe (int) – The maximum number of iterations a learning mode is executed when a threshold is not reached. Only relevant when a LEMOO method is executed using the run method.
darwin_probe (int) – Like ml_probe but for a Darwinian mode.
ml_threshold (float) – The relative improvement of the best population individual’s fitness expected before switching out of a learning mode. E.g., a threshold of 1.05 means that executing a learning mode stops when the best population individual has improved by 5% when compared to the previous population’s best individual.
darwin_threshold (float) – Like ml_threshold but for a Darwinian mode.
total_iterations (int) – Overall maximum number of iterations to be run. Only relevant when the run_iterations method of a LEMOO model is used.
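
As an illustration of the fitness_indicator signature, the sketch below defines a hypothetical indicator that sums the objective values of each individual. It is not a function provided by XLEMOO, and it uses nested lists in place of 2D NumPy arrays to stay dependency-free.

```python
from typing import List, Optional, Sequence


def sum_of_objectives(
    objectives: Sequence[Sequence[float]],
    variables: Optional[Sequence[Sequence[float]]] = None,
) -> List[float]:
    """Return one fitness value per individual: the sum of its objective values.

    The optional decision variable values are accepted to match the documented
    signature, but this particular indicator does not use them.
    """
    return [sum(row) for row in objectives]


fitness = sum_of_objectives([[1.0, 2.0], [0.5, 0.25]])
```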

class XLEMOO.LEMOO.MLModel[source]

An abstract class that defines the general interface for a machine learning model used in a LEMOO method’s learning mode.

abstract fit(X: ndarray, Y: ndarray)[source]

The machine learning model should have a ‘fit’ method which allows training the model based on data.

Parameters:

X (np.ndarray) – Training samples. E.g., n-dimensional vectors of real-valued floats.
Y (np.ndarray) – The targets. E.g., a 1-dimensional vector of 0s and 1s for binary classification.

abstract predict(X: ndarray) → ndarray[source]

The machine learning model should have a ‘predict’ method that can be used to predict, e.g., the class of a sample or samples in binary classification.

Parameters:: X (np.ndarray) – Sample or samples a class should be predicted for.
Returns:: The predicted classes for the samples.
Return type:: np.ndarray
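
Any object offering fit and predict in this shape can act as the learning-mode model. The following toy classifier is a sketch under stated assumptions: MLModelSketch is a hypothetical stand-in for the abstract class, samples are lists rather than NumPy arrays, and the model only looks at the first feature and assumes that class 1 samples have the smaller values there.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class MLModelSketch(ABC):
    """Stand-in mirroring the documented fit/predict interface."""

    @abstractmethod
    def fit(self, X: Sequence[Sequence[float]], Y: Sequence[int]):
        ...

    @abstractmethod
    def predict(self, X: Sequence[Sequence[float]]) -> List[int]:
        ...


class FirstFeatureThreshold(MLModelSketch):
    """Predicts class 1 when the first feature is below a learned threshold."""

    def __init__(self):
        self.threshold = None

    def fit(self, X, Y):
        ones = [x[0] for x, y in zip(X, Y) if y == 1]
        zeros = [x[0] for x, y in zip(X, Y) if y == 0]
        # Place the threshold halfway between the two class means.
        self.threshold = (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2
        return self

    def predict(self, X):
        return [1 if x[0] < self.threshold else 0 for x in X]


model = FirstFeatureThreshold().fit(
    X=[[0.1, 5.0], [0.2, 3.0], [0.8, 1.0], [0.9, 2.0]],
    Y=[1, 1, 0, 0],
)
predictions = model.predict([[0.3, 0.0], [0.7, 0.0]])
```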

class XLEMOO.LEMOO.MLParams(H_split: float, L_split: float, ml_model: MLModel, instantiation_factor: float, generation_lookback: int, ancestral_recall: int, unique_only: bool, iterations_per_cycle: int)[source]

A data class to store and pass parameter values related to the learning mode of a LEMOO method.

Parameters:

H_split (float) – The splitting ratio of ‘high performing’ population individuals. E.g., a H_split of 0.10 means that 10% of the best performing population individuals are labeled as high performing during a learning mode.
L_split (float) – Same as H_split, but for the ‘low performing’ population individuals.
ml_model (MLModel) – The machine learning model to be used in a learning mode.
instantiation_factor (float) – A multiplier used to determine how many new population individuals are instantiated in a learning mode after hypothesis forming. E.g., a factor of 2.0 means that 2.0*N_population new population individuals are instantiated based on the learned hypothesis, where N_population is the size of the population in the LEMOO method.
generation_lookback (int) – How many older generations to consider in a learning mode. E.g., a lookback of 5 means that the 5 most recent populations are considered when forming a hypothesis.
ancestral_recall (int) – This is like generation_lookback, but considers a specific number of the oldest populations. E.g., a recall of 5 will consider the five first populations.
unique_only (bool) – Whether to consider unique population individuals only when learning a hypothesis.
iterations_per_cycle (int) – How many times a population is ‘evolved’ in a learning mode before switching to a Darwinian mode. A good default is 1. Only relevant when a LEMOO method is run using the run_iterations method.
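
The effect of H_split, L_split, and instantiation_factor can be sketched with a small calculation. The helper split_high_low is hypothetical; it assumes that a smaller fitness value is better and that group sizes are rounded down but never below one, which may differ from how the framework rounds.

```python
from typing import List, Sequence, Tuple


def split_high_low(
    fitness: Sequence[float], h_split: float, l_split: float
) -> Tuple[List[int], List[int]]:
    """Return the indices of the high and low performing individuals.

    Assumptions: smaller fitness is better; sizes are rounded down, minimum one.
    """
    order = sorted(range(len(fitness)), key=lambda i: fitness[i])  # best first
    n_high = max(1, int(h_split * len(fitness)))
    n_low = max(1, int(l_split * len(fitness)))
    return order[:n_high], order[-n_low:]


fitness = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 1.0]
high_ids, low_ids = split_high_low(fitness, h_split=0.2, l_split=0.2)

# An instantiation_factor of 2.0 with a population of 10 asks for 20 new individuals.
n_new = int(2.0 * len(fitness))
```

With ten individuals and both splits set to 0.2, two individuals are labeled high performing and two low performing; the remaining six are not used as training examples in this sketch.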

class XLEMOO.LEMOO.MutationOP[source]

An abstract class that defines the general interface for a mutation operator.

abstract do(offsprings: ndarray) → ndarray[source]

The mutation operator should have a ‘do’ method as specified in the abstract class.

Parameters:: offsprings (np.ndarray) – The offspring (or individuals from a population) to be mutated.
Returns:: The mutated offspring.
Return type:: np.ndarray

class XLEMOO.LEMOO.PastGeneration(individuals: ndarray, objectives_fitnesses: ndarray, fitness_fun_values: ndarray)[source]

A helper data class representing a past generation with the individuals (decision space) and their corresponding objective fitness values and fitness function values.

Parameters:

individuals (np.ndarray) – The individuals of a population in the decision variable space.
objectives_fitnesses (np.ndarray) – The individuals of a population in the objective function space.
fitness_fun_values (np.ndarray) – The fitness function values of each individual.

class XLEMOO.LEMOO.SelectionOP[source]

An abstract class that defines the general interface for a selection operator.

abstract do(pop: Population, fitness: ndarray) → List[int][source]

The selection operator should have a ‘do’ method as specified in the abstract class.

Parameters:

pop (Population) – The population the selection operator is applied to.
fitness (np.ndarray) – The fitness of the individuals in the population. It is assumed the fitnesses are related to the individuals in the population by index.

Returns:

A list of indices indicating which individuals in the population have been selected.

Return type:

List[int]
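
A selection operator only has to return indices into the population. The sketch below shows binary tournament selection as one possibility; it is not necessarily the operator used with XLEMOO. The function tournament_select is hypothetical, takes only the fitness values for brevity, and assumes that a smaller fitness value is better.

```python
import random
from typing import List, Sequence


def tournament_select(fitness: Sequence[float], n_select: int, seed: int = 0) -> List[int]:
    """Pick n_select indices, each the better of two randomly drawn candidates.

    Assumption: a smaller fitness value is better.
    """
    rng = random.Random(seed)
    selected = []
    for _ in range(n_select):
        a = rng.randrange(len(fitness))
        b = rng.randrange(len(fitness))
        selected.append(a if fitness[a] <= fitness[b] else b)
    return selected


mating_pop_ids = tournament_select([3.0, 1.0, 2.0, 0.5], n_select=6, seed=42)
```

The returned list may contain the same index more than once, which lets strong individuals mate several times.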

class XLEMOO.LEMOO.LEMOO(problem: MOProblem, lem_params: LEMParams, ea_params: EAParams, ml_params: MLParams)[source]

A class to define LEMOO models.

Parameters:

problem (MOProblem) – The multiobjective optimization problem to be solved as defined in the DESDEO framework.
lem_params (LEMParams) – A dataclass with parameters relevant to the LEM part of the LEMOO method. See the dataclass’ documentation for additional details.
ea_params (EAParams) – A dataclass with parameters relevant to the Darwinian mode of the LEMOO method. See the dataclass’ documentation for additional details.
ml_params (MLParams) – A dataclass with parameters relevant to the learning mode of the LEMOO method. See the dataclass’ documentation for additional details.

current_ml_model

The current machine learning model employed in the learning mode.

Type:: MLModel

_population

The current population of solutions.

Type:: Union[None, SurrogatePopulation]

_best_fitness_fun_value

The current best fitness value found.

Type:: float

_generation_history

A list to keep track of the population histories during the execution of the LEMOO model.

Type:: List[PastGeneration]

add_population_to_history(individuals: Optional[ndarray] = None, objectives_fitnesses: Optional[ndarray] = None) → None[source]

Add a population to the history of the LEMOO model.

Parameters:

individuals (Optional[np.ndarray], optional) – The decision variables values of the population individuals. Defaults to None.
objectives_fitnesses (Optional[np.ndarray], optional) – The corresponding objective function values of the population individuals. Defaults to None.

Note

If both arguments are None, then the current population in the LEMOO model is added to the history.

check_condition_best(n_lookback: int, threshold: float) → bool[source]

Check whether the Darwin termination criterion is met in the past n_lookback iterations.

Return True and update the current best value if the condition is met; otherwise return False.

Parameters:

n_lookback (int) – How many generations to look back to.
threshold (float) – The relative improvement expected in regard to the best fitness value. E.g., a threshold of 1.05 means a 5% improvement is expected.

Returns:

True if the threshold is met. False otherwise.

Return type:

bool
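
The relative-improvement test can be sketched as follows. This is an illustration of the idea, not the framework’s code: check_condition_sketch is a hypothetical function, it assumes that fitness is minimized and strictly positive so that improvement is measured as the ratio of the old best to the new best, and it leaves out the update of the stored best value.

```python
from typing import Sequence


def check_condition_sketch(
    best_per_generation: Sequence[float],
    current_best: float,
    n_lookback: int,
    threshold: float,
) -> bool:
    """True if the best value in the last n_lookback generations improves on
    current_best by at least the given ratio.

    Assumptions: fitness is minimized and strictly positive.
    """
    recent = best_per_generation[-n_lookback:]
    candidate = min(recent)
    return current_best / candidate >= threshold


history_best = [1.00, 0.99, 0.97, 0.94]

# 1.00 / 0.94 is about 1.064, which meets a threshold of 1.05.
met = check_condition_sketch(history_best, current_best=1.00, n_lookback=2, threshold=1.05)

# Looking at the first three generations only: 1.00 / 0.97 is about 1.031.
not_met = check_condition_sketch(history_best[:3], current_best=1.00, n_lookback=1, threshold=1.05)
```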

collect_n_past_generations(n: int, ancestral_recall: int = 0, unique_only=False)[source]

Collect the n past generations into a single numpy array for easier handling. Returns the collected individuals, objective fitness values, and fitness function values.

Parameters:: n (int) – number of past generations to collect.
Returns:: A tuple with the collected individuals, objective fitness values, and fitness function values. Each array is 2-dimensional.
Return type:: Tuple[np.ndarray, np.ndarray, np.ndarray]
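
How the three arguments might interact is sketched below. The semantics are an assumption based on the parameter descriptions of MLParams (n most recent generations, plus ancestral_recall oldest generations, optionally keeping only unique individuals); PastGenerationSketch and collect_sketch are hypothetical names, and lists replace NumPy arrays.

```python
from typing import List, NamedTuple, Sequence, Tuple


class PastGenerationSketch(NamedTuple):
    """Hypothetical stand-in for PastGeneration."""

    individuals: List[List[float]]
    objectives_fitnesses: List[List[float]]
    fitness_fun_values: List[float]


def collect_sketch(
    history: Sequence[PastGenerationSketch],
    n: int,
    ancestral_recall: int = 0,
    unique_only: bool = False,
) -> Tuple[List[List[float]], List[List[float]], List[float]]:
    total = len(history)
    # Indices of the oldest and the most recent generations, without double counting.
    oldest = set(range(min(ancestral_recall, total)))
    newest = set(range(max(total - n, 0), total))
    individuals, objectives, fitness = [], [], []
    seen = set()
    for idx in sorted(oldest | newest):
        gen = history[idx]
        for x, f, v in zip(gen.individuals, gen.objectives_fitnesses, gen.fitness_fun_values):
            key = tuple(x)
            if unique_only and key in seen:
                continue  # skip decision vectors that were already collected
            seen.add(key)
            individuals.append(x)
            objectives.append(f)
            fitness.append(v)
    return individuals, objectives, fitness


def make(xs: List[float]) -> PastGenerationSketch:
    """Build a one-variable toy generation whose fitness equals the variable."""
    return PastGenerationSketch([[x] for x in xs], [[2 * x] for x in xs], list(xs))


history = [make([0.0, 1.0]), make([2.0, 3.0]), make([4.0, 5.0]), make([0.0, 6.0])]
individuals, objectives, fitness = collect_sketch(history, n=1, ancestral_recall=1, unique_only=True)
```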

darwinian_mode() → None[source]: Execute Darwinian mode. The size of the population can vary.

initialize_population() → None[source]: Use the defined initialization design to init a new population, add the initial population to the history of populations.

learning_mode() → None[source]: Execute learning mode.

reset_generation_history() → None[source]: Reset the population history.

reset_population() → None[source]: Reset the current population. Do not add the new population to history.

run() → Dict[source]

Run the LEMOO model. Switching between the Darwinian mode and learning mode happens when the fitness of the best population individual has improved past a threshold or when a maximum number of iterations has been executed in a mode.

Returns:

A dictionary with counters indicating how many times the Darwinian and learning modes have been executed, and the total number of iterations.

Return type:

Dict

run_iterations() → Dict[source]

Run the LEMOO model. The Darwinian mode and learning mode are always executed for a set number of iterations. Thresholds are ignored.

Returns:

A dictionary with counters indicating how many times the Darwinian and learning modes have been executed, and the total number of iterations.

Return type:

Dict
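
The fixed-schedule control flow can be pictured roughly as below. This is a sketch under assumptions, not the framework’s implementation: the function name, the callback arguments, and the dictionary keys are hypothetical, and it assumes that one total iteration consists of one Darwinian cycle followed by one learning cycle.

```python
from typing import Callable, Dict


def run_iterations_sketch(
    darwin_step: Callable[[], None],
    learning_step: Callable[[], None],
    darwin_per_cycle: int,
    ml_per_cycle: int,
    total_iterations: int,
    use_darwin: bool = True,
    use_ml: bool = True,
) -> Dict[str, int]:
    # Counter names are illustrative; the real dictionary keys may differ.
    counters = {"darwin_mode": 0, "learning_mode": 0, "total_iterations": 0}
    while counters["total_iterations"] < total_iterations:
        if use_darwin:
            for _ in range(darwin_per_cycle):
                darwin_step()
                counters["darwin_mode"] += 1
        if use_ml:
            for _ in range(ml_per_cycle):
                learning_step()
                counters["learning_mode"] += 1
        counters["total_iterations"] += 1
    return counters


calls = []
counters = run_iterations_sketch(
    darwin_step=lambda: calls.append("d"),
    learning_step=lambda: calls.append("m"),
    darwin_per_cycle=3,
    ml_per_cycle=1,
    total_iterations=2,
)
```

Here the two iterations_per_cycle values (from EAParams and MLParams) fix the schedule, so the thresholds and probe values play no role.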

update_best_fitness() → bool[source]

Find the best fitness value in the current population and update the value stored in self._best_fitness_fun_value if the found fitness is better.

Returns:: True if the best fitness was updated, otherwise False.
Return type:: bool

update_population(new_individuals: ndarray) → None[source]

Replace the current population of the LEMOO model with a new one.

Parameters:: new_individuals (np.ndarray) – The new population individuals in the decision variable space.

XLEMOO.ruleset_interpreter.extract_slipper_rules(classifier: SlipperClassifier) → Tuple[List[Dict[Tuple[str, str], str]], List[float]][source]

Given a trained SlipperClassifier, extracts the trained rules alongside the weight for each rule. The rules are returned in a list of dictionaries. Each rule is represented by one dictionary. Each dictionary is of the format:

{(“feature_name”, “comparison_op”): “value”} where comparison_op can be “<”, “<=”, “>”, or “>=”.

The weight represents the importance of each rule. The feature names are expected to be of the format “x_i” where ‘i’ is zero-indexed (first feature is ‘x_0’ etc.).
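
For illustration, a return value in the described format could look like the data below. The concrete rules and weights are made up, and describe is a hypothetical helper that turns the string-based format into numeric triples.

```python
from typing import Dict, List, Tuple

Rule = Dict[Tuple[str, str], str]

# Two made-up rules with their weights, matched by index.
rules: List[Rule] = [
    {("x_0", ">="): "0.25", ("x_0", "<"): "0.75"},
    {("x_2", "<="): "1.5"},
]
weights: List[float] = [0.8, 0.2]


def describe(rule: Rule) -> List[Tuple[int, str, float]]:
    """Convert a rule into (feature index, comparison operator, value) triples."""
    return [
        (int(name.split("_")[1]), op, float(value))
        for (name, op), value in rule.items()
    ]


first = describe(rules[0])
second = describe(rules[1])
```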

XLEMOO.ruleset_interpreter.instantiate_rules(rules: Dict[Tuple[str, str], str], n_features: int, feature_limits: List[Tuple[float, float]], n_samples: int) → ndarray[source]

Takes rules and instantiates them, producing n_samples new decision variable vectors that conform to the rules. If there are no rules for a variable, a random value is generated for that variable between its limits. Note that when rules define a range for a variable, that variable’s value is generated randomly within that range.

Parameters:

rules (Rules) – Should be a dict with the following structure: {(“feature_name”, “comparison_op”): “value”} where comparison_op can be “<”, “<=”, “>”, or “>=”. The feature names are expected to be formatted as “x_i” where ‘i’ is zero indexed (i.e., x_0, x_1, x_2, etc.).
n_features (int) – Number of features to instantiate based on the rules provided.
feature_limits (List[Tuple[float, float]]) – 2D array, each row corresponds to a decision variable. The first column has the lower limits for each variable and the second the upper limit.
n_samples (int) – How many samples to generate based on the rules provided.

Returns:

The new samples generated based on the provided rules in a 2D array.

Return type:

np.ndarray
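
The sampling idea described above can be sketched in a few lines. This is a simplified illustration and not the library function: instantiate_rules_sketch is a hypothetical name, the seed argument is added for reproducibility, strict and non-strict comparisons are treated alike, contradictory rules are not detected, and a nested list is returned instead of a 2D NumPy array.

```python
import random
from typing import Dict, List, Tuple

Rule = Dict[Tuple[str, str], str]


def instantiate_rules_sketch(
    rules: Rule,
    n_features: int,
    feature_limits: List[Tuple[float, float]],
    n_samples: int,
    seed: int = 0,
) -> List[List[float]]:
    rng = random.Random(seed)
    # Start from the feature limits and tighten them with every rule.
    lows = [lo for lo, _ in feature_limits]
    highs = [hi for _, hi in feature_limits]
    for (name, op), value in rules.items():
        i = int(name.split("_")[1])  # "x_3" -> 3
        bound = float(value)
        if op in (">", ">="):
            lows[i] = max(lows[i], bound)
        else:  # "<" or "<="
            highs[i] = min(highs[i], bound)
    # Features without rules are sampled between their original limits.
    return [
        [rng.uniform(lows[i], highs[i]) for i in range(n_features)]
        for _ in range(n_samples)
    ]


samples = instantiate_rules_sketch(
    rules={("x_0", ">="): "0.25", ("x_0", "<"): "0.75"},
    n_features=2,
    feature_limits=[(0.0, 1.0), (-1.0, 1.0)],
    n_samples=50,
)
```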

XLEMOO.ruleset_interpreter.instantiate_ruleset_rules(rules: List[Dict[Tuple[str, str], str]], weights: List[float], n_features: int, feature_limits: List[Tuple[float, float]], n_samples: int) → ndarray[source]

Instantiate samples according to a rule set. Instantiates in total approximately n_samples new samples according to the rules and feature limits. If for some feature there are no rules, then only the feature limits are used. The feature limits will override rules if there is a conflict. The given weights dictate how large a fraction of n_samples is generated for each rule. It is assumed that each rule in the supplied list has its weight at the same index in the argument weights.

Parameters:

rules (List[Rules]) – The rules contained in the rule set. See ‘instantiate_rules’.
weights (List[float]) – The weights for each rule in the rule set. It is assumed that the weight at index i corresponds to the weight of rules at index i in ‘rules’.
n_features (int) – Number of features in each sample to be instantiated.
feature_limits (List[Tuple[float, float]]) – Pairs representing the lower and upper bounds of each feature.
n_samples (int) – Approximately how many new samples to generate in total.

Returns:

A 2D array with all the new generated samples. If a list of samples per rule is desired, see ‘_instantiate_ruleset_rules’.

Return type:

np.ndarray
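
Why the total is only approximately n_samples can be seen from a weight-proportional allocation. The sketch below is one plausible way to distribute the samples and is an assumption, not the documented algorithm: samples_per_rule is a hypothetical helper that expects non-negative weights, normalizes them by their sum, and rounds each share independently.

```python
from typing import List, Sequence


def samples_per_rule(weights: Sequence[float], n_samples: int) -> List[int]:
    """Share n_samples between rules in proportion to their weights.

    Assumption: weights are non-negative and do not all equal zero.
    """
    total = sum(weights)
    return [round(n_samples * w / total) for w in weights]


exact = samples_per_rule([0.5, 0.3, 0.2], n_samples=100)
approximate = samples_per_rule([1.0, 1.0, 1.0], n_samples=10)
```

In the second call each rule receives three samples, so nine samples are generated in total instead of ten.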

class XLEMOO.tree_interpreter.TreePath[source]

A custom type describing a typed dictionary to store tree paths.

Parameters:

rules (List) – A list with each entry containing three elements: 1. feature index (int); 2. comparison operator (str), either ‘gte’ (greater or equal) or ‘lt’ (less than); 3. threshold value (float).
samples (int) – The number of samples that reached the endpoint of the path.
impurity (float) – Impurity of the last node of the path.
classification (int) – The final classification predicted by the path.

XLEMOO.tree_interpreter.find_all_paths(tree: sklearn.tree) → List[TreePath][source]

Find and return all decision paths of a decision tree. Currently supports only decision trees in the style present in the sklearn package.

Parameters:: tree (sklearn.tree) – A trained decision tree.
Returns:: A list of all the paths of the decision tree.
Return type:: List[TreePath]

XLEMOO.tree_interpreter.instantiate_tree_rules(paths: List[TreePath], n_features: int, feature_limits: List[Tuple[float, float]], n_samples: int, desired_classification: int) → ndarray[source]

Given a list of TreePaths, instantiates a number of new samples that adhere to the rules and result in a specified classification when classified by the tree from which the TreePaths have been generated.

Parameters:

paths (List[TreePath]) – See the description in the docstring of ‘find_all_paths’.
n_features (int) – Number of input features.
feature_limits (List[Tuple[float, float]]) – The lower and upper limits of each input feature.
n_samples (int) – Number of new samples to be instantiated for each path.
desired_classification (int) – The classification each considered path should lead to.

Returns:

A 3D numpy array. The first dimension is the number of paths in paths that result in the desired classification. The second dimension is the number of desired samples. The third dimension holds the instantiated feature values. If no rule has been specified in a path for some feature, that feature’s value is set to a random value residing between its limits.

Return type:

np.ndarray
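
The shape of the result and the handling of ‘gte’ and ‘lt’ rules can be sketched as follows. This is a simplified illustration rather than the library code: instantiate_tree_rules_sketch is a hypothetical name, paths are plain dictionaries with the TreePath keys, the seed argument is added for reproducibility, and a nested list stands in for the 3D NumPy array.

```python
import random
from typing import Dict, List, Tuple


def instantiate_tree_rules_sketch(
    paths: List[Dict],
    n_features: int,
    feature_limits: List[Tuple[float, float]],
    n_samples: int,
    desired_classification: int,
    seed: int = 0,
) -> List[List[List[float]]]:
    rng = random.Random(seed)
    result = []
    for path in paths:
        if path["classification"] != desired_classification:
            continue  # only paths ending in the desired class are instantiated
        lows = [lo for lo, _ in feature_limits]
        highs = [hi for _, hi in feature_limits]
        for feature, op, threshold in path["rules"]:
            if op == "gte":
                lows[feature] = max(lows[feature], threshold)
            else:  # "lt"
                highs[feature] = min(highs[feature], threshold)
        result.append(
            [[rng.uniform(lows[i], highs[i]) for i in range(n_features)] for _ in range(n_samples)]
        )
    return result


paths = [
    {"rules": [(0, "lt", 0.5), (1, "gte", 2.0)], "samples": 12, "impurity": 0.0, "classification": 1},
    {"rules": [(0, "gte", 0.5)], "samples": 8, "impurity": 0.1, "classification": 0},
]
new_samples = instantiate_tree_rules_sketch(
    paths, n_features=3, feature_limits=[(0.0, 1.0), (0.0, 4.0), (-1.0, 1.0)],
    n_samples=5, desired_classification=1,
)
```

Only the first path ends in class 1, so the result has the shape (1, 5, 3): one path, five samples, three features.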