queens.utils package#

Utils.

Modules containing utilities used throughout QUEENS.

Submodules#

queens.utils.ascii_art module#

ASCII art module.

print_banner(message='QUEENS', output_width=81)[source]#

Print banner.

Parameters:
  • message (str) – Message in banner

  • output_width (int) – Terminal output width

print_banner_and_description(output_width=81)[source]#

Print banner and the description.

Parameters:

output_width (int) – Terminal output width

print_bmfia_acceleration(output_width=81)[source]#

Print BMFIA rocket.

Parameters:

output_width (int) – Terminal output width

print_centered_multiline(string, output_width=81)[source]#

Center every line of a multiline text.

Parameters:
  • string (str) – String to be printed

  • output_width (int) – Terminal output width

print_centered_multiline_block(string, output_width=81)[source]#

Print a multiline text in the center as a block.

Parameters:
  • string (str) – String to be printed

  • output_width (int) – Terminal output width

print_classification()[source]#

Print like a sir, as the iterator is classy(fication).

print_crown(output_width=81)[source]#

Print crown.

Parameters:

output_width (int) – Terminal output width

print_points_iterator(output_width=81)[source]#

Print points iterator.

Parameters:

output_width (int) – Terminal output width

queens.utils.classifier module#

Classifiers for use in convergence classification.

class ActiveLearningClassifier(n_params, classifier_obj, batch_size, active_sampler_obj=None)[source]#

Bases: Classifier

Active learning classifier wrapper.

n_params#

number of parameters of the solver

Type:

int

classifier_obj#

classifier, e.g. sklearn.svm.SVC

Type:

obj

active_sampler_obj#

query strategy from skactiveml.pool, e.g. UncertaintySampling

is_active = True#
train(x_train, y_train)[source]#

Train the underlying _clf classifier.

Parameters:
  • x_train (np.array) – array with training samples, size: (n_samples, n_params)

  • y_train (np.array) – vector with corresponding training labels, size: (n_samples)

Returns:

query_idx (np.array) – sample indices in x_train to query next

class Classifier(n_params, classifier_obj)[source]#

Bases: object

Classifier wrapper.

n_params#

number of parameters of the solver

Type:

int

classifier_obj#

classifier, e.g. sklearn.svm.SVC

Type:

obj

is_active = False#
load(path, file_name)[source]#

Load pickled classifier.

Parameters:
  • path (str) – Path to the stored classifier

  • file_name (str) – File name without suffix

predict(x_test)[source]#

Perform prediction on given parameter combinations.

Parameters:

x_test (np.array) – array of parameter combinations (n_samples, n_params)

Returns:

y_test – prediction value or vector (n_samples)

train(x_train, y_train)[source]#

Train the underlying _clf classifier.

Parameters:
  • x_train – array with training samples, size: (n_samples, n_params)

  • y_train – vector with corresponding training labels, size: (n_samples)

queens.utils.cli_utils module#

Command Line Interface utils collection.

build_html_coverage_report()[source]#

Build html coverage report.

cli_logging(func)[source]#

Decorator to create logger for CLI function.

Parameters:

func (function) – Function that is to be decorated

gather_metadata_and_write_to_csv(*args, **kwargs)[source]#
get_cli_options(args)[source]#

Get input file path, output directory and debug from args.

Parameters:

args (list) – cli arguments

Returns:
  • input_file (Path) – Path object to input file

  • output_dir (Path) – Path object to the output directory

  • debug (bool) – True if debug mode is to be used

inject_template_cli(*args, **kwargs)[source]#
input_to_script_cli(*args, **kwargs)[source]#
print_greeting_message(*args, **kwargs)[source]#
print_pickle_data_cli(*args, **kwargs)[source]#
remove_html_coverage_report()[source]#

Remove html coverage report files.

str_to_bool(value)[source]#

Convert string to boolean for cli commands.

Parameters:

value (str) – String to convert to a bool

Returns:

bool – Boolean value of the string
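
A minimal sketch of parsing boolean CLI flags; the accepted spellings are an assumption, not verified against the implementation:

    # Hedged sketch: parse CLI-style boolean strings.
    from queens.utils.cli_utils import str_to_bool

    debug = str_to_bool("true")    # -> True (assumed accepted spelling)
    verbose = str_to_bool("False")  # -> False (assumed accepted spelling)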

queens.utils.collection_utils module#

Utils to collect data during iterative processes.

class CollectionObject(*field_names)[source]#

Bases: object

Collection object which stores data.

This object can be indexed by iteration i: collection_object[i] but also using the collected fields collection_object.field1.

add(**field_names_and_values)[source]#

Add data to the object.

This function can be called with one or multiple fields, i.e.: collection_object.add(field1=value1) or collection_object.add(field1=value1, field2=value2). An error is raised if one tries to add data to a field for a new iteration before all fields are filled for the current iteration.
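
A minimal usage sketch based on the behavior described above (field names and values are illustrative):

    # Collect two fields over two iterations, then access by field or by index.
    from queens.utils.collection_utils import CollectionObject

    collection = CollectionObject("loss", "step_size")
    collection.add(loss=0.5, step_size=0.10)  # iteration 0
    collection.add(loss=0.3, step_size=0.05)  # iteration 1

    print(collection.loss)  # all collected values of the field "loss"
    print(collection[1])    # all fields of iteration 1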

classmethod create_collection_object_from_dict(data_dict)[source]#

Create collection item from dict.

Parameters:

data_dict (dict) – Dictionary with values to be stored in this object

Returns:

collection_object – Collection object created from dict

items()[source]#

Items of the current object.

This allows the object to be used like a dict.

Returns:

dict_items – Items of the collection object

keys()[source]#

Keys, i.e. field names of the current object.

This allows the object to be used like a dict.

Returns:

dict_keys – Keys of the collection object

to_dict()[source]#

Create a dictionary from the collection object.

Returns:

dict – Dictionary with all data

values()[source]#

Values of the current object.

This allows the object to be used like a dict.

Returns:

dict_values – Values of the collection object

queens.utils.config_directories module#

Configuration of folder structure of QUEENS experiments.

base_directory()[source]#

Hold all queens related data.

create_directory(dir_path)[source]#

Create a directory either local or remote.

current_job_directory(experiment_dir, job_id)[source]#

Directory of the latest submitted job.

Parameters:
  • experiment_dir (Path) – Experiment directory

  • job_id (str) – Job ID of the current job

Returns:

job_dir (Path) – Path to the current job directory.

experiment_directory(experiment_name)[source]#

Directory for data of a specific experiment on the computing machine.

Parameters:

experiment_name (str) – Experiment name

experiments_base_directory()[source]#

Hold all experiment data on the computing machine.

job_dirs_in_experiment_dir(experiment_dir)[source]#

Get job directories in experiment_dir.

Parameters:

experiment_dir (pathlib.Path, str) – Path with the job dirs

Returns:

job_directories (list) – List with job_dir paths

queens.utils.exceptions module#

Custom exceptions.

exception CLIError[source]#

Bases: QueensException

QUEENS exception for CLI input.

exception FileTypeError[source]#

Bases: QueensException

Exception for wrong file types.

exception InvalidOptionError[source]#

Bases: QueensException

Custom error class for invalid options during QUEENS runs.

classmethod construct_error_from_options(valid_options, desired_option, additional_message='')[source]#

Construct invalid option error from the valid and desired options.

Parameters:
  • valid_options (lst) – List of valid option keys

  • desired_option (str) – Key of desired option

  • additional_message (str, optional) – Additional message to pass (default is an empty string)

Returns:

InvalidOptionError
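
A short sketch of how this classmethod might be used; the option names are illustrative:

    # Build and raise a descriptive error for an unknown option key.
    from queens.utils.exceptions import InvalidOptionError

    valid_options = ["monte_carlo", "lhs"]
    desired = "mc"
    if desired not in valid_options:
        raise InvalidOptionError.construct_error_from_options(
            valid_options, desired, additional_message="Check the 'method' section."
        )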

exception QueensException[source]#

Bases: Exception

QUEENS exception.

exception SubprocessError[source]#

Bases: QueensException

Custom error class for the QUEENS subprocess wrapper.

classmethod construct_error_from_command(command, command_output, error_message, additional_message='')[source]#

Construct a Subprocess error from a command and its outputs.

Parameters:
  • command (str) – Command used that raised the error

  • command_output (str) – Command output

  • error_message (str) – Error message of the command

  • additional_message (str, optional) – Additional message to pass

Returns:

SubprocessError

queens.utils.experimental_data_reader module#

Module to read experimental data.

class ExperimentalDataReader(data_processor=None, output_label=None, coordinate_labels=None, time_label=None, file_name_identifier=None, csv_data_base_dir=None)[source]#

Bases: object

Reader for experimental data.

output_label#

Label that marks the output quantity in the csv file

Type:

str

coordinate_labels#

List of column-wise coordinate labels in csv files

Type:

lst

time_label#

Name of the time variable in csv file

Type:

str

file_name#

File name of experimental data

Type:

str

base_dir#

Path to base directory containing experimental data

Type:

Path

data_processor#

data processor for experimental data

Type:

DataProcessor

get_experimental_data()[source]#

Load experimental data.

Returns:
  • y_obs_vec (np.array) – Column-vector of model outputs which correspond row-wise to observation coordinates

  • experimental_coordinates (np.array) – Matrix with observation coordinates. One row corresponds to one coordinate point

  • time_vec (np.array) – Unique vector of observation times

  • experimental_data_dict (dict) – Dictionary containing the experimental data

  • time_label (str) – Name of the time variable in csv file

  • coordinate_labels (lst) – List of column-wise coordinate labels in csv files

queens.utils.fcc_utils module#

From config create utils.

check_for_reference(obj_description)[source]#

Check if another uninitialized object is referenced.

Indicated by a keyword that ends with ‘_name’. Sub-dictionaries are also checked.

Parameters:

obj_description (dict) – Description of the object

Returns:

bool – True, if another uninitialized object is referenced.

from_config_create_iterator(config, global_settings)[source]#

Create main iterator for queens run from config.

A bottom-up approach is used to create all objects from the description. First, the objects that do not require any other, so far uninitialized objects are initialized. These objects are then inserted into the descriptions of the objects that reference them, and the step is repeated. This process continues until the main iterator (indicated by the ‘method’ keyword) is initialized.

Parameters:
  • config (dict) – Description of the queens run

  • global_settings (GlobalSettings) – settings of the QUEENS experiment including its name and the output directory

Returns:

new_obj (iterator) – Main queens iterator with all initialized objects.
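
An illustrative sketch of the referencing convention described above; the type names and keys are hypothetical, not verified QUEENS options:

    # Keys ending in "_name" reference other, not yet initialized objects.
    # Objects without open references are created first (bottom up), until
    # the main iterator under the "method" keyword can be initialized.
    config = {
        "my_interface": {"type": "direct_python_interface", "function": "rosenbrock"},
        "my_model": {"type": "simulation_model", "interface_name": "my_interface"},
        "method": {"type": "monte_carlo", "model_name": "my_model", "num_samples": 100},
    }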

from_config_create_object(obj_description, global_settings=None, parameters=None)[source]#

Create object from description.

Parameters:
  • obj_description (dict) – Description of the object

  • global_settings (GlobalSettings) – settings of the QUEENS experiment including its name and the output directory

  • parameters (obj, optional) – Parameters object

Returns:

obj – Initialized object

insert_new_obj(config, new_obj_key, new_obj)[source]#

Insert initialized object in other object descriptions.

Parameters:
  • config (dict) – Description of queens run, or sub dictionary

  • new_obj_key (str) – Key of initialized object

  • new_obj (obj) – Initialized object

Returns:

bool – True, if another uninitialized object is referenced.

queens.utils.fd_jacobian module#

Calculate finite difference based approximation of Jacobian.

Note

Implementation is heavily based on the scipy.optimize._numdiff module. We do NOT support complex scheme ‘cs’ and sparsity.

The motivation behind this reimplementation is to enable the parallel computation of all function values required for the finite difference scheme.

In theory, when computing the Jacobian of a function at a specific position via a specific finite difference scheme, all positions where the function needs to be evaluated (the perturbed positions) are known immediately, because they do not depend on each other. The evaluation of the function at these perturbed positions can consequently be done in a perfectly (embarrassingly) parallel fashion.

Most implementations of finite difference based approximations do not exploit this inherent potential for parallel evaluation, because for cheap functions the communication overhead is too high. For expensive functions, however, exploiting it yields a significant speed-up.
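
A rough sketch of the intended split between position generation, (potentially parallel) evaluation, and Jacobian assembly; the exact array layouts returned by get_positions are assumptions, not verified API behavior:

    import numpy as np

    from queens.utils.fd_jacobian import fd_jacobian, get_positions

    def f(x):
        # Toy vector-valued function.
        return np.array([x[0] ** 2, x[0] * x[1]])

    x0 = np.array([1.0, 2.0])
    positions, deltas = get_positions(x0, method="2-point", rel_step=None, bounds=(-np.inf, np.inf))

    # All perturbed positions are known up front, so they could be handed to
    # parallel workers; here they are evaluated sequentially for brevity.
    f0 = f(x0)
    f_perturbed = np.array([f(p) for p in np.vstack(positions)])

    # use_one_sided is documented as informative only for the 3-point method.
    jacobian = fd_jacobian(f0, f_perturbed, np.vstack(deltas), use_one_sided=None, method="2-point")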

compute_step_with_bounds(x0, method, rel_step, bounds)[source]#

Compute step sizes of finite difference scheme adjusted to bounds.

Parameters:
  • x0 (ndarray, shape(n,)) – Point at which the derivative shall be evaluated

  • method (string) –

    {‘3-point’, ‘2-point’}, optional

    Finite difference method to use:
    • ’2-point’ - use the first order accuracy forward or backward difference

    • ’3-point’ - use central difference in interior points and the second order accuracy forward or backward difference near the boundary

    • ’cs’ - use a complex-step finite difference scheme. This is an additional option in the scipy.optimize library. But: NOT IMPLEMENTED in QUEENS

  • rel_step (None or array_like, optional) – Relative step size to use. The absolute step size is computed as h = rel_step * sign(x0) * max(1, abs(x0)), possibly adjusted to fit into the bounds. For method=’3-point’ the sign of h is ignored. If None (default) then step is selected automatically, see Notes.

  • bounds (tuple of array_like, optional) – Lower and upper bounds on independent variables. Defaults to no bounds. Each bound must match the size of x0 or be a scalar, in the latter case the bound will be the same for all variables. Use it to limit the range of function evaluation.

Returns:
  • h (numpy.ndarray) – Adjusted step sizes

  • use_one_sided (numpy.ndarray of bool) – Whether to switch to one-sided scheme due to closeness to bounds. Informative only for 3-point method

fd_jacobian(f0, f_perturbed, dx, use_one_sided, method)[source]#

Calculate finite difference approximation of Jacobian of f at x0.

The necessary function evaluations have been pre-calculated and are supplied via f0 and the f_perturbed vector. Each row in f_perturbed corresponds to a function evaluation. The shape of f_perturbed depends heavily on the chosen finite difference scheme (method), and therefore the pre-calculation of f_perturbed and dx has to be consistent with the requested method.

Supported methods:

  • ‘2-point’: a one sided scheme by definition

  • ‘3-point’: more exact but needs twice as many function evaluations

Note: The implementation is supposed to remain very close to scipy._numdiff.approx_derivative.

Parameters:
  • f0 (ndarray) – Function value at x0, f0=f(x0)

  • f_perturbed (ndarray) – Perturbed function values

  • dx (ndarray) – Deltas of the input variables

  • use_one_sided (ndarray of bool) – Whether to switch to one-sided scheme due to closeness to bounds; informative only for 3-point method

  • method (str) – Which scheme was used to calculate the perturbed function values and deltas

Returns:

J_transposed.T (np.array) – Jacobian of the underlying model at x0.

get_positions(x0, method, rel_step, bounds)[source]#

Compute all positions needed for the finite difference approximation.

The Jacobian is defined for a vector-valued function at a given position.

Note: The implementation is supposed to remain very close to scipy._numdiff.approx_derivative.

Parameters:
  • x0 (np.array) – Position or sample at which the Jacobian shall be computed.

  • method (str) – Finite difference method that is used to compute the Jacobian.

  • rel_step (float) – Finite difference step size.

  • bounds (tuple of array_like, optional) – Lower and upper bounds on independent variables. Defaults to no bounds. Each bound must match the size of x0 or be a scalar, in the latter case the bound will be the same for all variables. Use it to limit the range of function evaluation.

Returns:
  • additional_positions (list of numpy.ndarray) – List with additional stencil positions that are necessary to calculate the finite difference approximation to the gradient

  • delta_positions (list of numpy.ndarray) – Delta between positions used to approximate Jacobian

queens.utils.gpf_utils module#

Utils for GPflow.

extract_block_diag(array, block_size)[source]#

Extract block diagonals of square 2D Array.

Parameters:
  • array (np.ndarray) – Square 2D array

  • block_size (int) – Block size

Returns:

3D Array containing block diagonals
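
A small sketch of the expected behavior; the stacking order of the returned blocks is an assumption:

    import numpy as np

    from queens.utils.gpf_utils import extract_block_diag

    array = np.arange(16.0).reshape(4, 4)
    blocks = extract_block_diag(array, block_size=2)  # assumed shape: (2, 2, 2)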

init_scaler(unscaled_data)[source]#

Initialize StandardScaler and scale data.

Standardize features by removing the mean and scaling to unit variance

\(scaled\_data = \frac{unscaled\_data - mean}{std}\)

Parameters:

unscaled_data (np.ndarray) – Unscaled data

Returns:
  • scaler (StandardScaler) – Standard scaler

  • scaled_data (np.ndarray) – Scaled data

set_transform_function(data, transform)[source]#

Set transform function.

Parameters:
  • data (gpf.Parameter) – Data to be transformed

  • transform (tfp.bijectors.Bijector) – Transform function

Returns:

gpf.Parameter with transform

queens.utils.import_utils module#

Import utils.

class LazyLoader(module_name)[source]#

Bases: object

Lazy loader for modules that take long to load.

Inspired by https://stackoverflow.com/a/78312617

get_module_attribute(path_to_module, function_or_class_name)[source]#

Load function from python file by path.

Parameters:
  • path_to_module (Path | str) – “Path” to file

  • function_or_class_name (str) – Name of the function

Returns:

function or class – Function or class from the module

get_module_class(module_options, valid_types, module_type_specifier='type')[source]#

Return module class defined in config file.

Parameters:
  • module_options (dict) – Module options

  • valid_types (dict) – Dict of valid types with corresponding module paths and class names

  • module_type_specifier (str) – Specifier for the module type

Returns:

module_class (class) – Class from the module

queens.utils.injector module#

Injector module.

The module supplies functions to inject parameter values into a template text file.

inject(params, template_path, output_file, strict=True)[source]#

Function to insert parameters into file template and write to file.

Parameters:
  • params (dict) – Dict with parameters to inject

  • template_path (str, Path) – Path to template

  • output_file (str, Path) – Name of output file with injected parameters

  • strict (bool) – Raises exception if mismatch between provided and required parameters

inject_in_template(params, template, output_file, strict=True)[source]#

Function to insert parameters into file template and write to file.

Parameters:
  • params (dict) – Dict with parameters to inject

  • template (str) – Template string

  • output_file (str, Path) – Name of output file with injected parameters

  • strict (bool) – Raises exception if mismatch between provided and required parameters

render_template(params, template, strict=True)[source]#

Function to insert parameters into a template.

Parameters:
  • params (dict) – Dict with parameters to inject

  • template (str) – Template file as string

  • strict (bool) – Raises exception if required parameters from the template are missing

Returns:

str – injected template
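
A minimal sketch, assuming jinja2-style placeholders (jinja2 templates are referenced elsewhere in this package):

    from queens.utils.injector import render_template

    template = "density = {{ rho }}\nviscosity = {{ mu }}"
    print(render_template({"rho": 1000.0, "mu": 1e-3}, template))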

queens.utils.input_to_script module#

Convert input file to python script.

class QueensPythonCode[source]#

Bases: object

Class to create python script.

imports#

list with the necessary imports

Type:

list

run_iterator#

list the run commands

Type:

list

load_results#

commands to load the results

Type:

list

global_settings_context#

commands to create the context

Type:

list

code#

list of code lines

Type:

list

parameters#

list of code lines for the parameters setup

Type:

list

global_settings#

list with all the global settings commands

Type:

list

extern_imports#

imports loaded with external python module

Type:

list

create_main#

True if the script should contain a main function

Type:

bool

static create_code_section(code_list, comment=None, indent_level=0)[source]#

Create python code section from a list of code lines.

Parameters:
  • code_list (list) – list with code lines

  • comment (str, optional) – comment for this code section

  • indent_level (int, optional) – indent of this code block

Returns:

str – code section

generate_code()[source]#

Generate the python code for the QUEENS run.

Returns:

str – python code

generate_script()[source]#

Format python code using black.

Returns:

str – formatted python code

class VariableName(name)[source]#

Bases: object

Dummy class to differentiate between variable names and strings.

assign_variable_value(variable_name, value)[source]#

Create code to assign value.

Parameters:
  • variable_name (str) – name of the variable

  • value (str) – value to assign

Returns:

str – code line

create_initialization_call(obj_description, python_code)[source]#

Create an initialization call.

Parameters:
  • obj_description (dict) – keyword arguments for the object

  • python_code (QueensPythonCode) – object to store the code in

Returns:

str – code “class_name(argument1=value1,…)”

create_initialization_call_from_class_and_arguments(class_name, arguments)[source]#

Create an initialization call.

Parameters:
  • class_name (str) – name of the class to initialize

  • arguments (dict) – keyword arguments for the object

Returns:

str – code class_name(argument1=value1,…)

create_script_from_input_file(input_file, output_dir, script_path=None)[source]#

Create script from input file.

Keep in mind that this does not work with jinja2 templates, only with plain QUEENS input files.

Parameters:
  • input_file (pathlib.Path) – Input file path

  • output_dir (pathlib.Path) – Path to write the QUEENS run results

  • script_path (pathlib.Path, optional) – Path for the python script

dict_replace_infs(dictionary_to_modify)[source]#

Replace infs in nested dictionaries with float(inf).

The solution originates from https://stackoverflow.com/a/60776516

Parameters:

dictionary_to_modify (dict) – dictionary to modify

Returns:

dictionary – modified dictionary

from_config_create_fields_code(random_field_preprocessor_options, python_code)[source]#

Create code to preprocess random fields.

Parameters:
  • random_field_preprocessor_options (dict) – random field description

  • python_code (QueensPythonCode) – object to store the code in

from_config_create_parameters(parameters_options, python_code)[source]#

Create a QUEENS parameter object from config.

Parameters:
  • parameters_options (dict) – Parameters description

  • python_code (QueensPythonCode) – object to store the code in

from_config_create_script(config, output_dir)[source]#

Create a python script from input file.

Parameters:
  • config (dict) – Description of the QUEENS run

  • output_dir (pathlib.Path) – output directory

Returns:

str – python script for QUEENS

get_module_class(module_options, valid_types, code, module_type_specifier='type')[source]#

Return module class defined in config file.

Parameters:
  • module_options (dict) – Module options

  • valid_types (dict) – Dict of valid types with corresponding module paths and class names

  • code (QueensPythonCode) – Object to store the code in

  • module_type_specifier (str) – Specifier for the module type

Returns:
  • module_class (class) – Class from the module

  • module_attribute (str) – Name of the class

insert_new_obj(config, new_obj_key, new_obj, python_code)[source]#

Insert new object to the script.

Note that this implementation deviates from the one in fcc_utils.

Parameters:
  • config (dict) – Description of queens run, or sub dictionary

  • new_obj_key (str) – Key of initialized object

  • new_obj (obj) – Initialized object

  • python_code (QueensPythonCode) – object to store the code in

Returns:

config (dict) – modified problem description

list_replace_infs(list_to_modify)[source]#

Replace infs in nested list with float(inf).

The solution originates from https://stackoverflow.com/a/60776516

Parameters:

list_to_modify (list) – list to modify

Returns:

list – modified list

stringify(obj)[source]#

Wrap string in quotes for the source code.

Parameters:

obj (obj) – object for the code

Returns:

str – string version of the object

queens.utils.io_utils module#

Utils for input/output handling.

load_input_file(input_file_path)[source]#

Load inputs from file by path.

Parameters:

input_file_path (Path) – Path to the input file

Returns:

dict – Options in the input file.

load_result(path_to_result_file)[source]#

Load QUEENS results.

Parameters:

path_to_result_file (Path) – Path to results

Returns:

dict – Results

read_file(file_path)[source]#

Function to read in a file.

Parameters:

file_path (str, Path) – Path to file

Returns:

file (str) – read in file as string

to_dict_with_standard_types(obj)[source]#

Convert dictionaries to dictionaries with python standard types only.

Parameters:

obj (dict) – Dictionary to convert

Returns:

dict – Dictionary with standard types

write_to_csv(output_file_path, data, delimiter=',')[source]#

Write a simple csv file.

Write data out to a CSV file. Nothing fancy at the moment: no header line or index column is supported, just pure data.

Parameters:
  • output_file_path (Path obj) – Path to the file the data should be written to

  • data (np.array) – Data that should be written to the csv file.

  • delimiter (optional, str) – Delimiter to separate individual data. Defaults to comma delimiter.
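
A short usage sketch (the file name is illustrative):

    from pathlib import Path

    import numpy as np

    from queens.utils.io_utils import write_to_csv

    data = np.array([[1.0, 2.0], [3.0, 4.0]])
    write_to_csv(Path("samples.csv"), data)  # plain values, no header or index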

queens.utils.iterative_averaging_utils module#

Iterative averaging utils.

class ExponentialAveraging(coefficient)[source]#

Bases: IterativeAveraging

Exponential averaging.

\(x^{(0)}_{avg}=x^{(0)}\)

\(x^{(j)}_{avg}= \alpha x^{(j-1)}_{avg}+(1-\alpha)x^{(j)}\)

Is also sometimes referred to as exponential smoothing.

coefficient#

Coefficient in (0,1) for the average.

Type:

float

average_computation(new_value)[source]#

Compute the exponential average.

Parameters:

new_value (float or np.array) – New value to update the average.

Returns:

current_average (np.array) – Returns the current average
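
A small sketch using the update_average method from the base class (the coefficient value is illustrative):

    from queens.utils.iterative_averaging_utils import ExponentialAveraging

    averager = ExponentialAveraging(coefficient=0.9)
    for value in [1.0, 2.0, 1.5, 1.8]:
        current = averager.update_average(value)
    print(current)  # exponentially smoothed value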

class IterativeAveraging[source]#

Bases: object

Base class for iterative averaging schemes.

current_average#

Current average value.

Type:

np.array

new_value#

New value for the averaging process.

Type:

np.array

rel_l1_change#

Relative change in L1 norm of the average value.

Type:

float

rel_l2_change#

Relative change in L2 norm of the average value.

Type:

float

abstract average_computation(new_value)[source]#

Here the averaging approach is implemented.

update_average(new_value)[source]#

Compute the actual average.

Parameters:

new_value (np.array) – New observation for the averaging

Returns:

Current average value

class MovingAveraging(num_iter_for_avg)[source]#

Bases: IterativeAveraging

Moving averages.

\(x^{(j)}_{avg}=\frac{1}{k}\sum_{i=0}^{k-1}x^{(j-i)}\)

where \(k-1\) is the number of values from previous iterations that are used

num_iter_for_avg#

Number of samples in the averaging window

Type:

int

data#

data used to compute the average

Type:

np.ndarray

average_computation(new_value)[source]#

Compute the moving average.

Parameters:

new_value (float or np.array) – New value to update the average

Returns:

average (np.array) – The current average

class PolyakAveraging[source]#

Bases: IterativeAveraging

Polyak averaging.

\(x^{(j)}_{avg}=\frac{1}{j+1}\sum_{i=0}^{j}x^{(i)}\)

iteration_counter#

Number of samples.

Type:

float

sum_over_iter#

Sum over all samples.

Type:

np.array

average_computation(new_value)[source]#

Compute the Polyak average.

Parameters:

new_value (float or np.array) – New value to update the average

Returns:

current_average (np.array) – Returns the current average

l1_norm(vector, averaged=False)[source]#

Compute the L1 norm of the vector.

Parameters:
  • vector (np.array) – Vector

  • averaged (bool) – If enabled, the norm is divided by the number of components

Returns:

norm (float) – L1 norm of the vector

l2_norm(vector, averaged=False)[source]#

Compute the L2 norm of the vector.

Parameters:
  • vector (np.array) – Vector

  • averaged (bool) – If enabled the norm is divided by the square root of the number of components

Returns:

norm (float) – L2 norm of the vector

relative_change(old_value, new_value, norm)[source]#

Compute the relative change of the old and new value for a given norm.

Parameters:
  • old_value (np.array) – Old values

  • new_value (np.array) – New values

  • norm (func) – Function to compute a norm

Returns:

Relative change

queens.utils.jax_minimize_wrapper module#

A collection of helper functions for optimization with JAX.

Taken from https://gist.github.com/slinderman/24552af1bdbb6cb033bfea9b2dc4ecfd

minimize(fun, x0, method=None, args=(), bounds=None, constraints=(), tol=None, callback=None, options=None)[source]#

A simple wrapper for scipy.optimize.minimize using JAX.

Parameters:
  • fun – The objective function to be minimized, written in JAX code so that it is automatically differentiable. It is of type `fun: x, *args -> float`, where x is a PyTree and args is a tuple of the fixed parameters needed to completely specify the function.

  • x0 – Initial guess represented as a JAX PyTree.

  • args – tuple, optional. Extra arguments passed to the objective function and its derivative. Must consist of valid JAX types; e.g. the leaves of the PyTree must be floats.

The remainder of the keyword arguments are inherited from scipy.optimize.minimize, and their descriptions are copied here for convenience.

  • method – str or callable, optional. Type of solver. Should be one of:

  • ‘Nelder-Mead’ (see here)

  • ‘Powell’ (see here)

  • ‘CG’ (see here)

  • ‘BFGS’ (see here)

  • ‘Newton-CG’ (see here)

  • ‘L-BFGS-B’ (see here)

  • ‘TNC’ (see here)

  • ‘COBYLA’ (see here)

  • ‘SLSQP’ (see here)

  • ‘trust-constr’(see here)

  • ‘dogleg’ (see here)

  • ‘trust-ncg’ (see here)

  • ‘trust-exact’ (see here)

  • ‘trust-krylov’ (see here)

  • custom - a callable object (added in version 0.14.0), see below for description.

    If not given, the solver is chosen to be one of BFGS, L-BFGS-B, or SLSQP, depending on whether the problem has constraints or bounds.

Parameters:
  • bounds – sequence or Bounds, optional.

Bounds on variables for L-BFGS-B, TNC, SLSQP, Powell, and trust-constr methods. There are two ways to specify the bounds:

  1. Instance of Bounds class.

  2. Sequence of (min, max) pairs for each element in x. None is used to specify no bound.

Note that in order to use bounds you will need to manually flatten them in the same order as your inputs x0.

Parameters:
  • constraints

    {Constraint, dict} or List of {Constraint, dict}, optional Constraints definition (only for COBYLA, SLSQP and trust-constr). Constraints for ‘trust-constr’ are defined as a single object or a list of objects specifying constraints to the optimization problem. Available constraints are:

    • LinearConstraint

    • NonlinearConstraint

    Constraints for COBYLA, SLSQP are defined as a list of dictionaries, each with the fields:

    type : str
      Constraint type: ‘eq’ for equality, ‘ineq’ for inequality.

    fun : callable
      The function defining the constraint.

    jac : callable, optional
      The Jacobian of fun (only for SLSQP).

    args : sequence, optional
      Extra arguments to be passed to the function and Jacobian.

    Equality constraint means that the constraint function result is to be zero whereas inequality means that it is to be non-negative. Note that COBYLA only supports inequality constraints.

    Note that in order to use constraints you will need to manually flatten them in the same order as your inputs x0.

  • tol – float, optional Tolerance for termination. For detailed control, use solver-specific options.

  • options

    dict, optional A dictionary of solver options. All methods accept the following generic options:

    maxiter : int
      Maximum number of iterations to perform. Depending on the method each iteration may use several function evaluations.

    disp : bool
      Set to True to print convergence messages.

    For method-specific options, see show_options().

  • callback

    callable, optional Called after each iteration. For ‘trust-constr’ it is a callable with the signature:

    callback(xk, OptimizeResult state) -> bool

    where xk is the current parameter vector represented as a PyTree, and state is an OptimizeResult object with the same fields as the ones from the return. If callback returns True, the algorithm execution is terminated.

    For all the other methods, the signature is:

    `callback(xk)`

    where xk is the current parameter vector, represented as a PyTree.

Returns:
  • res – The optimization result represented as an OptimizeResult object. Important attributes are: x, the solution array, represented as a JAX PyTree; success, a Boolean flag indicating if the optimizer exited successfully; and message, which describes the cause of the termination.

  • See `scipy.optimize.OptimizeResult` for a description of other attributes.
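
A minimal sketch of a call to this wrapper (the import path is taken from the module name above):

    import jax.numpy as jnp

    from queens.utils.jax_minimize_wrapper import minimize

    def loss(x):
        # Simple quadratic, automatically differentiable by JAX.
        return jnp.sum((x - 1.0) ** 2)

    result = minimize(loss, x0=jnp.zeros(3), method="BFGS")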

queens.utils.logger_settings module#

Logging in QUEENS.

class LogFilter(level)[source]#

Bases: Filter

Filters (lets through) all messages with level <= LEVEL.

level#

Logging level

filter(record)[source]#

Filter the logging record.

Parameters:

record (LogRecord obj) – Logging record object

Returns:

LogRecord obj – Filtered logging record

class NewLineFormatter(fmt=None, datefmt=None, style='%', validate=True, *, defaults=None)[source]#

Bases: Formatter

Formatter splitting multiline messages into single line messages.

A logged message that consists of more than one line (contains a newline character) is split into multiple single-line messages that all have the same format. Without this, the overall format of the logging is broken for multiline messages.

format(record)[source]#

Override format function.

Parameters:

record (LogRecord obj) – Logging record object

Returns:

formatted_message (str) – Logged message in supplied format split into single lines

log_init_args(method)[source]#

Log arguments of __init__ method.

Parameters:

method (obj) – __init__ method

Returns:

wrapper (func) – Decorated __init__ method

reset_logging()[source]#

Reset loggers.

This is only needed during testing, as otherwise the loggers are not destroyed, resulting in the same output multiple times. This is taken from:

https://stackoverflow.com/a/56810619

setup_basic_logging(log_file_path, logger=None, debug=False)[source]#

Setup basic logging.

Parameters:
  • log_file_path (Path) – Path to the log-file

  • logger (logging.Logger) – Logger instance that should be set up

  • debug (bool) – Indicates debug mode and controls level of logging
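
A short sketch of setting up logging for a run (the file name is illustrative):

    from pathlib import Path

    from queens.utils.logger_settings import setup_basic_logging

    setup_basic_logging(Path("queens_run.log"), debug=False)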

setup_cli_logging(debug=False)[source]#

Set up logging for CLI utils.

Parameters:

debug (bool) – Indicates debug mode and controls level of logging

setup_cluster_logging()[source]#

Setup cluster logging.

setup_file_handler(logger, log_file_path)[source]#

Set up a file handler.

Parameters:
  • logger (logging.logger) – Logger object to add the stream handler to

  • log_file_path (pathlib.Path) – Path of the logging file

setup_logger(logger=None, debug=False)[source]#

Set up the main QUEENS logger.

Parameters:
  • logger (logging.Logger) – Logger instance that should be set up

  • debug (bool) – Indicates debug mode and controls level of logging

Returns:

logging.logger – QUEENS logger object

setup_stream_handler(logger)[source]#

Set up a stream handler.

Parameters:

logger (logging.logger) – Logger object to add the stream handler to

queens.utils.mcmc_utils module#

Collection of utils for Markov Chain Monte Carlo algorithms.

mh_select(log_acceptance_probability, current_sample, proposed_sample)[source]#

Perform Metropolis-Hastings selection.

The Metropolis-Hastings algorithm is used in Markov Chain Monte Carlo (MCMC) methods to accept or reject a proposed sample based on the log of the acceptance probability. This function compares the acceptance probability with a random number between 0 and 1 to decide if each proposed sample should replace the current sample. If the random number is smaller than the acceptance probability, the proposed sample is accepted. The function further checks whether the log_acceptance_probability is finite. If it is infinite or NaN, the function will not accept the respective proposed sample.

Parameters:
  • log_acceptance_probability (np.array) – Logarithm of the acceptance probability for each sample. This represents the log of the ratio of the probability densities of the proposed sample to the current sample.

  • current_sample (np.array) – The current sample values from the MCMC chain.

  • proposed_sample (np.array) – The proposed sample values to be considered for acceptance.

Returns:
  • selected_samples (np.array) – The sample values selected after the Metropolis-Hastings step. If the proposed sample is accepted, it will be returned; otherwise, the current sample is returned.

  • bool_idx (np.array) – A boolean array indicating whether each proposed sample was accepted (True) or rejected (False).
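
A small sketch of a selection step; the array shapes (one row per chain) are an assumption:

    import numpy as np

    from queens.utils.mcmc_utils import mh_select

    log_alpha = np.log(np.array([0.9, 0.1]))  # acceptance probabilities for two chains
    current = np.array([[0.0], [1.0]])
    proposed = np.array([[0.5], [1.5]])
    selected, accepted = mh_select(log_alpha, current, proposed)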

tune_scale_covariance(scale_covariance, accept_rate)[source]#

Adjust the covariance scaling factor based on the acceptance rate.

This function tunes the covariance scaling factor used in Metropolis-Hastings or similar MCMC algorithms based on the observed acceptance rate of proposed samples. The goal is to maintain an acceptance rate within the range of 20% to 50%, which is considered optimal for many MCMC algorithms. The covariance scaling factor is adjusted according to the following rules:

Variance adaptation factor as a function of the acceptance rate:

  • acceptance rate < 0.001: scale × 0.1

  • acceptance rate < 0.05: scale × 0.5

  • acceptance rate < 0.2: scale × 0.9

  • acceptance rate > 0.5: scale × 1.1

  • acceptance rate > 0.75: scale × 2

  • acceptance rate > 0.95: scale × 10

Reference: [1]: pymc-devs/pymc

Parameters:
  • scale_covariance (float or np.array) – The current covariance scaling factor for the proposal distribution.

  • accept_rate (float or np.array) – The observed acceptance rate of the proposed samples. This value should be between 0 and 1.

Returns:

np.array – The updated covariance scaling factor adjusted according to the acceptance rate.
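
For example, following the rules above, an acceptance rate of 0.03 falls in the "< 0.05" bracket and halves the scale:

    from queens.utils.mcmc_utils import tune_scale_covariance

    new_scale = tune_scale_covariance(scale_covariance=1.0, accept_rate=0.03)  # -> 0.5 per the table above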

queens.utils.metadata module#

Metadata objects.

class SimulationMetadata(job_id, inputs, job_dir)[source]#

Bases: object

Simulation metadata object.

This object holds metadata, times code sections, and exports them to YAML.

job_id#

Id of the job

Type:

int

inputs#

Parameters for this job

Type:

dict

file_path#

Path to export the metadata

Type:

pathlib.Path

timestamp#

Timestamp of the object creation

Type:

str

outputs#

Results obtained by the simulation

Type:

tuple

times#

Wall times of code sections

Type:

dict

export()[source]#

Export the object to human readable format.

time_code(code_section_name)[source]#

Time a code section.

This method allows us to time not only the runtime of the simulation, but also subparts.

Parameters:

code_section_name (string) – Name for this code section

to_dict()[source]#

Create dictionary from object.

Returns:

dict – Dictionary of the metadata object

get_metadata_from_experiment_dir(experiment_dir)[source]#

Get metadata from experiment_dir.

To keep memory usage limited, this is implemented as a generator.

Parameters:

experiment_dir (pathlib.Path, str) – Path with the job dirs

Yields:

metadata (dict) – metadata of a job

write_metadata_to_csv(experiment_dir, csv_path=None)[source]#

Gather and write job metadata to csv.

Parameters:
  • experiment_dir (pathlib.Path, str) – Path with the job dirs

  • csv_path (pathlib.Path, str) – Path to export the csv file

queens.utils.numpy_utils module#

Numpy array utils.

add_nugget_to_diagonal(matrix, nugget_value)[source]#

Add a small value to diagonal of matrix.

The nugget value is only added to diagonal entries that are smaller than the nugget value.

Parameters:
  • matrix (np.ndarray) – Matrix

  • nugget_value (float) – Small nugget value to be added

Returns:

matrix (np.ndarray) – Manipulated matrix

at_least_2d(arr)[source]#

View input array as array with at least two dimensions.

Parameters:

arr (np.ndarray) – Input array

Returns:

arr (np.ndarray) – View of input array with at least two dimensions

at_least_3d(arr)[source]#

View input array as array with at least three dimensions.

Parameters:

arr (np.ndarray) – Input array

Returns:

arr (np.ndarray) – View of input array with at least three dimensions

safe_cholesky(matrix, jitter_start_value=1e-10)[source]#

Numerically stable Cholesky decomposition.

Compute the Cholesky decomposition of a matrix. Numeric stability is increased by sequentially adding a small term to the diagonal of the matrix.

Parameters:
  • matrix (np.ndarray) – Matrix to be decomposed

  • jitter_start_value (float) – Starting value to be added to the diagonal

Returns:

low_cholesky (np.ndarray) – Lower-triangular Cholesky factor of matrix
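
A small sketch on a nearly singular matrix, where plain np.linalg.cholesky may fail:

    import numpy as np

    from queens.utils.numpy_utils import safe_cholesky

    matrix = np.array([[1.0, 1.0], [1.0, 1.0]])  # singular up to round-off
    lower = safe_cholesky(matrix)  # jitter is added to the diagonal as needed
    print(lower @ lower.T)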

queens.utils.path_utils module#

Path utilities for QUEENS.

check_if_path_exists(path, error_message='')[source]#

Check if a path exists.

Parameters:
  • path (str) – “Path” to be checked

  • error_message (str,optional) – If an additional message is desired

Returns:

bool – True if the path exists, False otherwise.

Raises:

FileNotFoundError – If the path does not exist.

create_folder_if_not_existent(path)[source]#

Create folder if not existent.

Parameters:

path (PosixPath) – Path to be created

Returns:

path_obj (PosixPath) – Path object

is_empty(paths)[source]#

Check whether paths is empty.

Parameters:

paths (str, Path, list) – (list of) path like objects

relative_path_from_queens(relative_path)[source]#

Create relative path from queens/.

As an example, to create queens/queens/folder/file.A, call relative_path_from_queens(“queens/folder/file.A”).

Parameters:

relative_path (str) – “Path” starting from queens/

Returns:

PosixPath – Absolute path to the file

relative_path_from_source(relative_path)[source]#

Create relative path from queens/queens/.

As an example, to create queens/queens/folder/file.A, call relative_path_from_source(“folder/file.A”).

Parameters:

relative_path (str) – “Path” starting from queens/queens/

Returns:

PosixPath – Absolute path to the file

queens.utils.pdf_estimation module#

Kernel density estimation (KDE).

Estimation of the probability density function based on samples from the distribution.

estimate_bandwidth_for_kde(samples, min_samples, max_samples, kernel='gaussian')[source]#

Estimate optimal bandwidth for kde of pdf.

Parameters:
  • samples (np.ndarray) – Samples for which to estimate pdf

  • min_samples (float) – Smallest value

  • max_samples (float) – Largest value

  • kernel (str,optional) – Kernel type

Returns:

float – Estimate for optimal kernel_bandwidth

estimate_pdf(samples, kernel_bandwidth, support_points=None, kernel='gaussian')[source]#

Estimate pdf using kernel density estimation.

Parameters:
  • samples (np.array) – Samples for which to estimate pdf

  • kernel_bandwidth (float) – Kernel width to use in kde

  • support_points (np.array) – Points where to evaluate pdf

  • kernel (str, optional) – Kernel type

Returns:

np.ndarray, np.ndarray – pdf_estimate at support points
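
A sketch of the two-step workflow; the min/max arguments are taken to be the smallest and largest sample values per the descriptions above, and the return order of estimate_pdf is assumed from the signature:

    import numpy as np

    from queens.utils.pdf_estimation import estimate_bandwidth_for_kde, estimate_pdf

    samples = np.random.default_rng(42).normal(size=(200, 1))
    bandwidth = estimate_bandwidth_for_kde(samples, samples.min(), samples.max())
    pdf_estimate, support = estimate_pdf(samples, bandwidth)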

queens.utils.pickle_utils module#

Utils to handle pickle files.

load_pickle(file_path)[source]#

Load a pickle file directly from path.

Parameters:

file_path (Path) – Path to pickle-file

Returns:

Data (dict)

print_pickled_data(file_path)[source]#

Print a table of the data within a pickle file.

Only goes one layer deep for dicts. This is similar to python -m pickle file_path, but wraps it in a single command with fancy printing.

Parameters:

file_path (Path) – Path to pickle-file

queens.utils.plot_outputs module#

Collection of plotting capabilities for probability distributions.

plot_cdf(cdf_estimate, support_points, bayes=False)[source]#

Create cdf plot based on passed data.

Parameters:
  • cdf_estimate (dict) – Estimate of cdf at supporting points

  • support_points (np.array) – Supporting points

  • bayes (bool) – Do we want to plot confidence intervals

plot_icdf(icdf_estimate, bayes=False)[source]#

Create icdf plot based on passed data.

Parameters:
  • icdf_estimate (dict) – Estimate of icdf at supporting points

  • bayes (bool) – Do we want to plot confidence intervals

plot_pdf(pdf_estimate, support_points, bayes=False)[source]#

Create pdf plot based on passed data.

Parameters:
  • pdf_estimate (dict) – Estimate of pdf at supporting points

  • support_points (np.array) – Supporting points

  • bayes (bool) – Do we want to plot confidence intervals

queens.utils.pool_utils module#

Pool utils.

create_pool(number_of_workers)[source]#

Create pathos Pool from number of workers.

Parameters:

number_of_workers (int) – Number of parallel evaluations

Returns:

pathos multiprocessing pool
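
A short sketch; pathos pools provide a map interface similar to multiprocessing:

    from queens.utils.pool_utils import create_pool

    pool = create_pool(number_of_workers=4)
    results = pool.map(lambda x: x * x, range(8))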

queens.utils.print_utils module#

Print utils.

get_str_table(name, print_dict, use_repr=False)[source]#

Function to get table to be used in __str__ methods.

Parameters:
  • name (str) – Object name

  • print_dict (dict) – Dict containing labels and values to print

  • use_repr (bool, opt) – If true, use repr() function to obtain string representations of objects

Returns:

str – Table to print
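
A typical use inside a __str__ method (the names are illustrative):

    from queens.utils.print_utils import get_str_table

    print(get_str_table("MyIterator", {"num_samples": 100, "seed": 42}))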

queens.utils.process_outputs module#

Collection of utility functions for post-processing.

do_processing(output_data, output_description)[source]#

Do actual processing of output.

Parameters:
  • output_data (dict) – Dictionary containing model output

  • output_description (dict) – Dictionary describing desired output quantities

Returns:

dict – Dictionary with processed results

estimate_bandwidth_for_kde(samples, min_samples, max_samples)[source]#

Estimate optimal bandwidth for kde of pdf.

Parameters:
  • samples (np.array) – Samples for which to estimate pdf

  • min_samples (float) – Smallest value

  • max_samples (float) – Largest value

Returns:

float – Estimate for optimal kernel_bandwidth

estimate_cdf(output_data, support_points, bayesian)[source]#

Compute estimate of CDF based on provided sampling data.

Parameters:
  • output_data (dict) – Dictionary with output data

  • support_points (np.array) – Points where to evaluate cdf

  • bayesian (bool) – Compute confidence intervals etc.

Returns:

cdf – Dictionary with cdf estimates

estimate_cov(output_data)[source]#

Estimate covariance based on standard unbiased estimator.

Parameters:

output_data (dict) – Dictionary with output data

Returns:

numpy.array – Unbiased covariance estimate

estimate_icdf(output_data, bayesian)[source]#

Compute estimate of inverse CDF based on provided sampling data.

Parameters:
  • output_data (dict) – Dictionary with output data

  • bayesian (bool) – Compute confidence intervals etc.

Returns:

icdf – Dictionary with icdf estimates

estimate_mean(output_data)[source]#

Estimate mean based on standard unbiased estimator.

Parameters:

output_data (dict) – Dictionary with output data

Returns:

float – Unbiased mean estimate

estimate_pdf(output_data, support_points, bayesian)[source]#

Compute estimate of PDF based on provided sampling data.

Parameters:
  • output_data (dict) – Dictionary with output data

  • support_points (np.array) – Points where to evaluate pdf

  • bayesian (bool) – Compute confidence intervals etc.

Returns:

pdf – Dictionary with pdf estimates

estimate_result_interval(output_data)[source]#

Estimate interval of output data.

Estimate interval of output data and add small margins.

Parameters:

output_data (dict) – Dictionary with output data

Returns:

list – Output interval

estimate_var(output_data)[source]#

Estimate variance based on standard unbiased estimator.

Parameters:

output_data (dict) – Dictionary with output data

Returns:

float – Unbiased variance estimate

perform_kde(samples, kernel_bandwidth, support_points)[source]#

Estimate pdf using kernel density estimation.

Parameters:
  • samples (np.array) – Samples for which to estimate pdf

  • kernel_bandwidth (float) – Kernel width to use in kde

  • support_points (np.array) – Points where to evaluate pdf

Returns:

np.array – pdf_estimate at support points

process_outputs(output_data, output_description, input_data=None)[source]#

Process output from QUEENS models.

Parameters:
  • output_data (dict) – Dictionary containing model output

  • output_description (dict) – Dictionary describing desired output quantities

  • input_data (np.array) – Array containing model input

Returns:

dict – Dictionary with processed results

write_results(processed_results, file_path)[source]#

Write results to pickle file.

Parameters:
  • processed_results (dict) – Dictionary with results

  • file_path (str, Path) – Path to pickle file to write results to

queens.utils.pymc module#

Collection of utility functions and classes for PyMC.

class PymcDistributionWrapper(logpdf, logpdf_gradients=None)[source]#

Bases: Op

Op class for Data conversion.

This PymcDistributionWrapper class is a wrapper for PyMC Distributions in QUEENS.

logpdf#

The log-pdf function

Type:

fun

logpdf_gradients#

The function to evaluate the gradient of the log-pdf

Type:

fun

logpdf_grad#

Wrapper for the gradient function of the log-pdf

Type:

obj

R_op(inputs: list[Variable], eval_points: Variable | list[Variable]) → list[Variable][source]#

Construct a graph for the R-operator.

This method is primarily used by Rop. For more information, see pymc documentation for the method.

Parameters:
  • inputs (list[Variable]) – The input variables for the R operator.

  • eval_points (Union[Variable, list[Variable]]) – Should have the same length as inputs. Each element of eval_points specifies the value of the corresponding input at the point where the R-operator is to be evaluated.

Returns:

list[Variable]

grad(inputs, output_grads)[source]#

Get gradient and multiply with upstream gradient.

itypes: Sequence['Type'] | None = [TensorType(float64, shape=(None, None))]#
otypes: Sequence['Type'] | None = [TensorType(float64, shape=(None,))]#
perform(_node, inputs, output_storage, params=None)[source]#

Call outside pdf function.

class PymcGradientWrapper(gradient_func)[source]#

Bases: Op

Op class for Data conversion.

This Class is a wrapper for the gradient of the distributions in QUEENS.

gradient_func#

The function to evaluate the gradient of the pdf

Type:

fun

R_op(inputs: list[Variable], eval_points: Variable | list[Variable]) → list[Variable][source]#

Construct a graph for the R-operator.

This method is primarily used by Rop. For more information, see pymc documentation for the method.

Parameters:
  • inputs (list[Variable]) – The input variables for the R operator.

  • eval_points (Union[Variable, list[Variable]]) – Should have the same length as inputs. Each element of eval_points specifies the value of the corresponding input at the point where the R-operator is to be evaluated.

Returns:

list[Variable]

itypes: Sequence['Type'] | None = [TensorType(float64, shape=(None, None))]#
otypes: Sequence['Type'] | None = [TensorType(float64, shape=(None, None))]#
perform(_node, inputs, output_storage, _params=None)[source]#

Evaluate the gradient.

from_config_create_pymc_distribution(distribution, name, explicit_shape)[source]#

Create PyMC distribution object from queens distribution.

Parameters:
  • distribution (obj) – Queens distribution object

  • name (str) – name of random variable

  • explicit_shape (int) – Explicit shape parameter for distribution dimension

Returns:

random_variable – Random variable, distribution object in pymc format

from_config_create_pymc_distribution_dict(parameters, explicit_shape)[source]#

Get random variables in pymc distribution format.

Parameters:
  • parameters (obj) – Parameters object

  • explicit_shape (int) – Explicit shape parameter for distribution dimension

Returns:

pymc distribution list

queens.utils.random_process_scaler module#

Utils for data scaling.

class IdentityScaler[source]#

Bases: Scaler

The identity scaler.

fit(x_mat)[source]#

Fit/calculate the scaling based on the input samples.

Parameters:

x_mat (np.array) – Data matrix that should be standardized

inverse_transform_grad_mean(grad_mean, *_args)[source]#

Conduct the inverse scaling of the mean gradient.

Parameters:

grad_mean (np.array) – gradient of the transformed mean function

Returns:

transformed_grad (np.array) – Inversely transformed gradient of the mean function

inverse_transform_grad_var(grad_var, *_args)[source]#

Conduct the inverse scaling of the variance gradient.

Parameters:

grad_var (np.array) – gradient of the transformed variance function

Returns:

transformed_grad (np.array) – Inversely transformed gradient of the variance function

inverse_transform_mean(x_mat)[source]#

Conduct the inverse scaling transformation on the data matrix.

Parameters:

x_mat (np.array) – Data matrix that should be standardized

Returns:

transformed_data (np.array) – Transformed data-array

inverse_transform_std(x_mat)[source]#

Conduct the inverse scaling.

Parameters:

x_mat (np.array) – Data matrix that should be standardized

Returns:

transformed_data (np.array) – Transformed data-array

transform(x_mat)[source]#

Conduct the scaling transformation on the data matrix.

Parameters:

x_mat (np.array) – Data matrix that should be standardized

Returns:

transformed_data (np.array) – Transformed data-array

class Scaler[source]#

Bases: object

Base class for general scaling classes.

The purpose of these classes is the scaling of training data.

mean#

Mean-values of the data-matrix (column-wise).

Type:

np.array

standard_deviation#

Standard deviation of the data-matrix (per column).

Type:

np.array

abstract fit(x_mat)[source]#

Fit/calculate the scaling based on the input samples.

Parameters:

x_mat (np.array) – Data matrix that should be standardized

abstract inverse_transform_mean(x_mat)[source]#

Conduct the inverse transformation for the mean.

Parameters:

x_mat (np.array) – Data matrix that should be standardized

abstract inverse_transform_std(x_mat)[source]#

Conduct the inverse transformation.

Parameters:

x_mat (np.array) – Data matrix that should be standardized

abstract transform(x_mat)[source]#

Conduct the scaling transformation on the input samples.

Parameters:

x_mat (np.array) – Data matrix that should be standardized

class StandardScaler[source]#

Bases: Scaler

Scaler for standardization of data.

In case a stochastic process is trained on the scaled data, inverse rescaling is implemented to recover the correct mean and standard deviation prediction for the posterior process.

fit(x_mat)[source]#

Fit/calculate the scaling based on the input samples.

Parameters:

x_mat (np.array) – Data matrix that should be standardized

inverse_transform_grad_mean(grad_mean, standard_deviation_input)[source]#

Conduct the inverse scaling of the mean gradient.

Parameters:
  • grad_mean (np.array) – gradient of the transformed mean function

  • standard_deviation_input (float) – standard deviation of the input data

Returns:

transformed_grad (np.array) – Inversely transformed gradient of the mean function

inverse_transform_grad_var(grad_var, var, trans_var, input_standard_deviation)[source]#

Conduct the inverse scaling of the variance gradient.

Parameters:
  • grad_var (np.array) – gradient of the transformed variance

  • var (np.array) – variance of the untransformed data

  • trans_var (np.array) – variance of the transformed data

  • input_standard_deviation (float) – standard deviation of the input data

Returns:

transformed_grad (np.array) – Inversely transformed gradient of the variance function

inverse_transform_mean(x_mat)[source]#

Conduct the inverse scaling transformation on the data matrix.

Parameters:

x_mat (np.array) – Data matrix that should be standardized

Returns:

transformed_data (np.array) – Transformed data-array

inverse_transform_std(x_mat)[source]#

Conduct the inverse scaling transformation.

The data is transformed based on the standard deviation.

Parameters:

x_mat (np.array) – Data matrix that should be standardized

Returns:

transformed_data (np.array) – Transformed data-array

transform(x_mat)[source]#

Conduct the scaling transformation on the data matrix.

Parameters:

x_mat (np.array) – Data matrix that should be standardized

Returns:

transformed_data (np.array) – Transformed data-array
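
A small round-trip sketch for the standard scaler:

    import numpy as np

    from queens.utils.random_process_scaler import StandardScaler

    scaler = StandardScaler()
    x_mat = np.array([[1.0], [2.0], [3.0]])
    scaler.fit(x_mat)
    x_scaled = scaler.transform(x_mat)
    x_recovered = scaler.inverse_transform_mean(x_scaled)  # back to the original scale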

queens.utils.remote_build module#

Utils to build QUEENS on a remote resource.

queens.utils.remote_operations module#

Module supplying functions to conduct operations on a remote resource.

class RemoteConnection(host, remote_python, remote_queens_repository, user=None, gateway=None)[source]#

Bases: Connection

This class is a wrapper around fabric’s Connection class.

remote_python#

Path to Python with installed (editable) QUEENS (see remote_queens_repository)

Type:

str

remote_queens_repository#

Path to the QUEENS source code on the remote host

Type:

str, Path

build_remote_environment(package_manager='mamba')[source]#

Build remote QUEENS environment.

Parameters:

package_manager (str, optional) – Package manager used for the creation of the environment (“mamba” or “conda”)

copy_to_remote(source, destination, verbose=True, exclude=None, filters=None)[source]#

Copy files or folders to remote.

Parameters:
  • source (str, Path, list) – paths to copy

  • destination (str, Path) – destination relative to host

  • verbose (bool) – True for verbose output

  • exclude (str, list) – patterns to exclude

  • filters (str) – filters for rsync

create_remote_directory(remote_directory)[source]#

Make a directory (including parents) on the remote host.

Parameters:

remote_directory (Path, str) – path of the directory that will be created

get_free_local_port()[source]#

Get a free port on localhost.

get_free_remote_port()[source]#

Get a free port on remote host.

open()[source]#

Initiate the SSH connection.

open_port_forwarding(local_port=None, remote_port=None)[source]#

Open port forwarding.

Parameters:
  • local_port (int) – free local port

  • remote_port (int) – free remote port

Returns:
  • local_port (int) – used local port

  • remote_port (int) – used remote port

run_function(func, *func_args, wait=True, **func_kwargs)[source]#

Run a Python function remotely over an SSH connection.

Parameters:
  • func (Function) – function that is executed

  • func_args – Positional arguments passed to func via functools.partial

  • wait (bool) – Flag to decide whether to wait for the result of the function

  • func_kwargs – Keyword arguments passed to func via functools.partial

Returns:

return_value (obj) – Return value of function

start_cluster(workload_manager, dask_cluster_kwargs, dask_cluster_adapt_kwargs, experiment_dir)[source]#

Start a Dask cluster remotely over an SSH connection.

Parameters:
  • workload_manager (str) – Workload manager (“pbs” or “slurm”) on the cluster

  • dask_cluster_kwargs (dict) – collection of keyword arguments to be forwarded to the Dask cluster

  • dask_cluster_adapt_kwargs (dict) – collection of keyword arguments to be forwarded to the Dask cluster’s adapt method

  • experiment_dir (str) – directory holding all data of the QUEENS experiment on the remote host

Returns:

return_value (obj) – Return value of function

sync_remote_repository()[source]#

Synchronize local and remote QUEENS source files.
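
Based on the signatures documented above, a typical session might look as follows. This is a hedged sketch: host, user, and paths are placeholders, and greet stands in for any function to be executed remotely.

   from queens.utils.remote_operations import RemoteConnection

   def greet():
       # Placeholder for an arbitrary function to run on the remote host
       return "hello from the remote host"

   # Host, user, and paths below are placeholders
   connection = RemoteConnection(
       host="cluster.example.com",
       remote_python="/home/user/miniconda3/envs/queens/bin/python",
       remote_queens_repository="/home/user/queens",
       user="user",
   )
   connection.open()                    # initiate the SSH connection
   connection.sync_remote_repository()  # synchronize the QUEENS sources
   result = connection.run_function(greet, wait=True)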

get_port()[source]#

Get free port.

Returns:

int – free port

queens.utils.rsync module#

Rsync utils.

assemble_rsync_command(source, destination, archive=False, exclude=None, filters=None, verbose=True, rsh=None, host=None, rsync_options=None)[source]#

Assemble rsync command.

Parameters:
  • source (str, Path, list) – paths to copy

  • destination (str, Path) – destination relative to host

  • archive (bool) – use the archive option

  • exclude (str, list) – patterns to exclude

  • filters (str) – filters for rsync

  • verbose (bool) – True for verbose output

  • rsh (str) – remote ssh command

  • host (str) – host where to copy the files to

  • rsync_options (list) – additional rsync options

Returns:

str – command to run rsync

rsync(source, destination, archive=True, exclude=None, filters=None, verbose=True, rsh=None, host=None, rsync_options=None)[source]#

Run rsync command.

Parameters:
  • source (str, Path, list) – paths to copy

  • destination (str, Path) – destination relative to host

  • archive (bool) – use the archive option

  • exclude (str, list) – patterns to exclude

  • filters (str) – filters for rsync

  • verbose (bool) – True for verbose output

  • rsh (str) – remote ssh command

  • host (str) – host where to copy the files to

  • rsync_options (list) – additional rsync options
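
A short usage sketch based on the documented signatures; host and paths are placeholders:

   from queens.utils.rsync import assemble_rsync_command, rsync

   # Inspect the command that would be executed
   command = assemble_rsync_command(
       source="results/",
       destination="/scratch/experiment",
       archive=True,
       exclude=["*.tmp", "*.log"],
       host="user@cluster.example.com",
   )
   print(command)

   # Or run the copy directly with the same options
   rsync("results/", "/scratch/experiment", host="user@cluster.example.com")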

queens.utils.run_subprocess module#

Wrapped functions of the subprocess stdlib module.

run_subprocess(command, raise_error_on_subprocess_failure=True, additional_error_message=None, allowed_errors=None)[source]#

Run a system command outside of the Python script.

Parameters:
  • command (str) – command that will be run in the subprocess

  • raise_error_on_subprocess_failure (bool, optional) – raise an error on subprocess failure instead of only warning; defaults to True

  • additional_error_message (str, optional) – additional error message to be displayed

  • allowed_errors (lst, optional) – list of strings to be removed from the error message

Returns:
  • process_returncode (int) – code for success of subprocess

  • process_id (int) – unique process id assigned to the subprocess on the computing machine

  • stdout (str) – standard output content

  • stderr (str) – standard error content
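
A minimal usage sketch based on the documented signature and return values:

   from queens.utils.run_subprocess import run_subprocess

   # Unpack the four documented return values
   returncode, process_id, stdout, stderr = run_subprocess(
       "echo hello",
       additional_error_message="echo failed unexpectedly",
   )
   print(returncode, stdout)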

start_subprocess(command)[source]#

Start subprocess.

Parameters:

command (str) – command that will be run in the subprocess

Returns:

process (subprocess.Popen) – subprocess object

queens.utils.smc_utils module#

Collection of utility functions and classes for SMC algorithms.

class StaticStateSpaceModel(likelihood_model, data=None, prior=None)[source]#

Bases: StaticModel

Model needed for the particles library implementation of SMC.

likelihood_model#

Log-likelihood function.

Type:

object

n_sims#

Number of model calls.

Type:

int

loglik(theta, t=None)[source]#

Log-likelihood function for the particles SMC implementation.

Parameters:
  • theta (obj) – Samples at which to evaluate the likelihood

  • t (int) – time (if set to None, the full log-likelihood is returned)

Returns:

The log likelihood

logpyt(theta, t)[source]#

Log-likelihood of Y_t, given parameter and previous datapoints.

Parameters:
  • theta (dict-like) – theta[‘par’] is a ndarray containing the N values for parameter par

  • t (int) – time

numpy_to_particles_array(samples)[source]#

Convert numpy arrays to particles objects.

The particles library uses np.ndarrays with homemade variable dtypes. This method converts plain numpy arrays back to the particles library type.

Parameters:

samples (np.ndarray) – Samples

Returns:

np.ndarray with homemade dtype – Particle variables object

particles_array_to_numpy(theta)[source]#

Convert particles objects to numpy arrays.

The particles library uses np.ndarrays with homemade variable dtypes. We need to convert these into plain numpy arrays to work with QUEENS.

Parameters:

theta (np.ndarray with homemade dtype) – Particle variables object

Returns:

np.ndarray – Numpy array of the particles
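
For illustration only (not the QUEENS implementation): the particles library represents samples as numpy structured arrays with one named field per parameter, and numpy can flatten such arrays to a plain (n_samples, n_params) array:

   import numpy as np
   from numpy.lib import recfunctions

   # A structured array in the style used by the particles library
   theta = np.zeros(3, dtype=[("x1", float), ("x2", float)])
   theta["x1"] = [0.1, 0.2, 0.3]
   theta["x2"] = [1.0, 2.0, 3.0]

   # Flatten to a plain (n_samples, n_params) array
   samples = recfunctions.structured_to_unstructured(theta)
   print(samples.shape)  # (3, 2)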

calc_ess(weights)[source]#

Calculate the Effective Sample Size (ESS) from the given weights.

The Effective Sample Size (ESS) is a measure used to assess the quality of a set of weights by indicating how many independent samples would be required to achieve the same level of information as the current weighted samples. This is computed using the exp-log trick to improve numerical stability.

Parameters:

weights (np.array) – An array of weights, typically representing the importance weights of samples in a weighted sampling scheme.

Returns:

float – The Effective Sample Size (ESS) as calculated from the provided weights.
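
Assuming the usual definition ESS = (sum of weights)^2 / (sum of squared weights), the exp-log trick amounts to evaluating both sums in log space. A sketch of the idea:

   import numpy as np
   from scipy.special import logsumexp

   def ess_sketch(weights):
       # ESS = (sum w)^2 / (sum w^2), computed in log space for stability
       log_weights = np.log(weights)
       return np.exp(2 * logsumexp(log_weights) - logsumexp(2 * log_weights))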

temper_factory(temper_type)[source]#

Return the appropriate tempering function based on the specified type.

The tempering function can be used for transitioning between different log-probability density functions in various probabilistic models.

Parameters:

temper_type (str) – Type of the tempering function to return. Valid options are:

  • “bayes” – Returns the Bayes tempering function

  • “generic” – Returns the generic tempering function

Returns:

function – The corresponding tempering function based on temper_type.

Raises:

ValueError – If temper_type is not one of the valid options (“bayes”, “generic”).

temper_logpdf_bayes(log_prior, log_like, tempering_parameter=1.0)[source]#

Bayesian tempering function.

It transitions from the prior to the posterior (= likelihood * prior). Special cases are:

  • tempering parameter = 0.0:

    We interpret this as “disregard contribution of the likelihood”. Therefore, return just log_prior.

  • log_prior or log_like = +inf:

    Prohibit this case. The reasoning is that (+inf + -inf) is ambiguous. We know that -inf is likely to occur, e.g. in uniform priors. On the other hand, +inf is rather unlikely to be a reasonable value. Therefore, we chose to exclude it here.

Parameters:
  • log_prior (np.array) – Array containing the values of the log-prior distribution at sample points

  • log_like (np.array) – Array containing the values of the log-likelihood at sample points

  • tempering_parameter (float) – Tempering parameter for resampling
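
Away from the special cases, the transition described above presumably corresponds to the standard power posterior, log_prior + tempering_parameter * log_like. A sketch under that assumption, with the +inf checks omitted:

   import numpy as np

   def temper_logpdf_bayes_sketch(log_prior, log_like, tempering_parameter=1.0):
       # Bridge from the prior (parameter 0) to the posterior (parameter 1)
       if tempering_parameter == 0.0:
           return log_prior  # disregard the likelihood contribution
       return log_prior + tempering_parameter * np.asarray(log_like)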

temper_logpdf_generic(logpdf0, logpdf1, tempering_parameter=1.0)[source]#

Perform generic tempering between two log-probability density functions.

This function performs a linear interpolation between two log-probability density functions based on a tempering parameter. The tempering parameter determines the weight given to each log-probability density function in the transition from the initial distribution (logpdf0) to the goal distribution (logpdf1).

The function handles the following scenarios:

  • tempering parameter = 0.0:

    We interpret this as “disregard contribution of the goal pdf”. Therefore, return logpdf0.

  • tempering parameter = 1.0:

We interpret this as “we are fully transitioned”, so the contribution of the initial distribution is ignored and logpdf1 is returned.

  • logpdf0 or logpdf1 = +inf:

    Prohibit this case. The reasoning is that (+inf + -inf) is ambiguous. We know that -inf is likely to occur, e.g., in uniform distributions. On the other hand, +inf is rather unlikely to be a reasonable value. Therefore, we chose to exclude it here.

Parameters:
  • logpdf0 (float or np.array) – Logarithm of the probability density function of the initial distribution.

  • logpdf1 (float or np.array) – Logarithm of the probability density function of the goal distribution.

  • tempering_parameter (float) – Parameter between 0 and 1 that controls the interpolation between logpdf0 and logpdf1. A value of 0.0 corresponds to logpdf0, while a value of 1.0 corresponds to logpdf1.

Returns:

float or np.array – The tempered log-probability density function based on the tempering_parameter.

Raises:

ValueError – If either logpdf0 or logpdf1 is positive infinity (+inf).
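
The linear interpolation described above reduces to a convex combination of the two log-densities. A minimal sketch, again with the +inf checks omitted:

   import numpy as np

   def temper_logpdf_generic_sketch(logpdf0, logpdf1, tempering_parameter=1.0):
       # Convex combination of initial and goal log-density
       if tempering_parameter == 0.0:
           return logpdf0  # disregard the goal pdf
       if tempering_parameter == 1.0:
           return logpdf1  # fully transitioned
       t = tempering_parameter
       return (1.0 - t) * np.asarray(logpdf0) + t * np.asarray(logpdf1)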

queens.utils.sobol_sequence module#

Collection of utility functions and classes for Sobol sequences.

sample_sobol_sequence(dimension, number_of_samples, parameters, randomize=False, seed=None)[source]#

Generate samples from Sobol sequence.

Parameters:
  • dimension (int) – Dimensionality of the sequence. Max dimensionality is 21201.

  • number_of_samples (int) – number of samples to generate in the parameter space

  • parameters (Parameters) – parameters object defining the true distribution of the samples

  • randomize (bool, optional) – If True, use LMS+shift scrambling, i.e. randomize the sequence. Otherwise, no scrambling is done. Default is False.

  • seed (SeedType, optional) – If seed is an int or None, a new numpy.random.Generator is created using np.random.default_rng(seed). If seed is already a Generator instance, then the provided instance is used.

Returns:

samples (np.ndarray) – Sobol sequence quasi Monte Carlo samples for the parameter distribution
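
The dimension cap of 21201 and the LMS+shift scrambling option match SciPy’s Sobol generator, so the sampler presumably builds on it. A sketch of the underlying unit-cube sampling (QUEENS additionally maps such samples through the distributions defined by the Parameters object):

   from scipy.stats import qmc

   # Quasi-random samples on the unit cube; scramble=True enables
   # LMS+shift scrambling, seed makes the randomization reproducible
   sobol = qmc.Sobol(d=2, scramble=True, seed=42)
   unit_samples = sobol.random(n=128)  # shape (128, 2), values in [0, 1)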

queens.utils.start_dask_cluster module#

Main module to start a dask jobqueue cluster.

parse_arguments(unparsed_args)[source]#

Parse arguments passed via command line call.

queens.utils.tensorflow_utils module#

Utils related to tensorflow and friends.

configure_keras(tf_keras)[source]#

Configure tf keras.

Parameters:

tf_keras (module) – The tf.keras module to configure

configure_tensorflow(tensorflow)[source]#

Configure tensorflow.

Parameters:

tensorflow (module) – The module to configure

queens.utils.valid_options_utils module#

Helper functions for valid options and switch analogy.

check_if_valid_options(valid_options, desired_options, error_message='')[source]#

Check if the desired option(s) is/are in valid_options.

Raises InvalidOptionError if invalid options are present.

Parameters:
  • valid_options (lst,dict) – List of valid option keys or dict with valid options as keys

  • desired_options (str, lst(str), dict) – Key(s) of desired options

  • error_message (str, optional) – Error message in case the desired option cannot be found

get_option(options_dict, desired_option, error_message='')[source]#

Get option desired_option from options_dict.

The options_dict maps option keys to their values; note that the values can also be functions. If the desired option is not found, an error is raised.

Parameters:
  • options_dict (dict) – Dictionary with valid options and their value

  • desired_option (str) – Desired method key

  • error_message (str, optional) – Custom error message to be used if the desired_option is not found. Defaults to an empty string.

Returns:

Value of the desired_option
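
A short usage sketch of the switch analogy, reusing the tempering functions documented earlier as a dispatch table:

   from queens.utils.smc_utils import temper_logpdf_bayes, temper_logpdf_generic
   from queens.utils.valid_options_utils import check_if_valid_options, get_option

   # Values of the options dict may be plain data or callables
   tempering_functions = {
       "bayes": temper_logpdf_bayes,
       "generic": temper_logpdf_generic,
   }
   check_if_valid_options(tempering_functions, "bayes")  # passes silently
   temper = get_option(
       tempering_functions, "bayes", error_message="Unknown tempering type."
   )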