probeye#

This package provides a transparent and easy-to-use framework for solving parameter estimation problems (i.e., inverse problems) primarily via sampling methods in a characteristic two-step approach.

  1. In the first step, the problem at hand is defined in a solver-independent fashion, i.e., without specifying which computational means are supposed to be utilized for finding a solution.

  2. In the second step, the problem definition is handed over to a user-selected solver, that finds a solution to the problem. The currently supported solvers focus on Bayesian methods for posterior sampling.

Due to the broad variety of existing inference engine packages, probeye does not contain self-written implementations of solvers but merely interfaces with existing ones. It currently provides interfaces with emcee for MCMC sampling and with dynesty for nested sampling. It also provides two point-estimate solvers for maximum likelihood as well as maximum a-posteriori estimates based on scipy.

The parameter estimation problems probeye aims at are problems that are centered around forward models that are computationally expensive (e.g., parameterized finite element models), and the corresponding observations of which are not particularly numerous (typically around tens or hundreds of experiments). Such problems are often encountered in engineering problems where simulation models are calibrated based on laboratory tests, which are - due to their relatively high costs - not available in high numbers.

The source code of probeye is jointly developed by Bundesanstalt für Materialforschung und -prüfung (BAM) and Netherlands Organisation for applied scientific research (TNO) for calibrating parameterized physics-based models and quantifying uncertainties in the obtained parameter estimates.

Installation#

The source code is entirely written in Python and can be installed either from PyPI or directly from the latest version on GitHub. Compatible Python versions (at least tested ones) are 3.6 up to 3.9.

pip (PyPI)#

Since probeye is frequently updated on the Python package index (PyPI), you can install the most recent stable version simply by running pip from the command line:

pip install probeye

Please note that there are several dependencies which might take some time to download and install, if they are not installed in your environment yet.

pip (GitHub)#

If you want to install the package directly from the latest version on GitHub (for example when this version is not published on PyPI yet), just clone the repository and install the package locally by running from the root of the repository the following command:

pip install .
pip install -e .[tests,lint_type_checks,docs]
pip install --user ".[tests,lint_type_checks,docs]"

Please note that there are several dependencies which might take some time to download and install, if they are not installed in your environment yet.

Examples#

Note

The examples are selected to illustrate how to use various probeye functionalities. First time users are advised to start with the linear regression example.

Simple linear regression example

Simple linear regression example

Linear regression with 1D-correlation

Linear regression with 1D-correlation

Simple bridge model with time-space correlation

Simple bridge model with time-space correlation

Simple linear regression example#

The model equation is y = ax + b with a, b being the model parameters, while the likelihood model is based on a normal zero-mean additive model error distribution with the standard deviation to infer. The problem is solved via maximum likelihood estimation as well as via sampling using emcee.

First, let’s import the required functions and classes for this example.

# third party imports
import numpy as np
import matplotlib.pyplot as plt

# local imports (problem definition)
from probeye.definition.inverse_problem import InverseProblem
from probeye.definition.forward_model import ForwardModelBase
from probeye.definition.distribution import Normal, Uniform
from probeye.definition.sensor import Sensor
from probeye.definition.likelihood_model import GaussianLikelihoodModel

# local imports (problem solving)
from probeye.inference.scipy.solver import MaxLikelihoodSolver
from probeye.inference.emcee.solver import EmceeSolver
from probeye.inference.dynesty.solver import DynestySolver

# local imports (inference data post-processing)
from probeye.postprocessing.sampling_plots import create_pair_plot
from probeye.postprocessing.sampling_plots import create_posterior_plot
from probeye.postprocessing.sampling_plots import create_trace_plot

We start by generating a synthetic data set from a known linear model to which we will add some noise. Afterwards, we will pretend to have forgotten the parameters of this ground-truth model and will instead try to recover them just from the data. The slope (a) and intercept (b) of the ground truth model are set to be:

# ground truth
a_true = 2.5
b_true = 1.7

Now, let’s generate a few data points that we contaminate with a Gaussian error:

# settings for data generation
n_tests = 50
seed = 1
mean_noise = 0.0
std_noise = 0.5

# generate the data
np.random.seed(seed)
x_test = np.linspace(0.0, 1.0, n_tests)
y_true = a_true * x_test + b_true
y_test = y_true + np.random.normal(loc=mean_noise, scale=std_noise, size=n_tests)

Let’s take a look at the data that we just generated:

plt.plot(x_test, y_test, "o", label="generated data points")
plt.plot(x_test, y_true, label="ground-truth model")
plt.title("Data vs. ground truth")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.show()
Data vs. ground truth

Until this point, we didn’t use probeye at all, since we just generated some data. In a normal use case, we wouldn’t have to generate our data of course. Instead, it would be provided to us, for example as the result of some test series. As the first step in any calibration problem, one needs to have a parameterized model (in probeye such a model is called ‘forward model’) of which one assumes that it is able to describe the data at hand. In this case, if we took a look at the blue data points in the plot above without knowing the orange line, we might expect a simple linear model. It is now our job to describe this model within the probeye-framework. This is done by defining our own specific model class:

class LinearModel(ForwardModelBase):
    def interface(self):
        self.parameters = ["a", "b"]
        self.input_sensors = Sensor("x")
        self.output_sensors = Sensor("y", std_model="sigma")

    def response(self, inp: dict) -> dict:
        x = inp["x"]
        m = inp["a"]
        b = inp["b"]
        return {"y": m * x + b}

First, note that this model class is based on the probeye class ‘ForwardModelBase’. While this is a requirement, the name of the class can be chosen freely. As you can see, this class has an ‘interface’ and a ‘response’ method. In the ‘interface’ method we define that our model has two parameters, ‘a’ and ‘b’, next to one input and one output sensors, called ‘x’ and ‘y’ respectively. Keeping this interface in mind, let’s now take a look at the ‘response’ method. This method describes the actual forward model evaluation. The method takes one dictionary as an input and returns one dictionary as its output. The input dictionary ‘inp’ will have the keys ‘a’, ‘b’ and ‘x’ because of the definitions given in self.interface. Analogously, the returned dictionary must have the key ‘y’, because we defined an output sensor with the name ‘y’. Note that the entire interface of the ‘response’ method is described by the ‘interface’ method. Parameters and input sensors will be contained in the ‘inp’ dictionary, while the output sensors must be contained in the returned dictionary.

After we now have defined our forward model, we can set up the inverse problem itself. This always begins by initializing an object form the InverseProblem-class, and adding all of the problem’s parameters with priors that reflect our current best guesses of what the parameter’s values might look like. Please check out the ‘Components’-part of this documentation to get more information on the arguments seen below. However, most of the code should be self-explanatory.

# initialize the problem (the print_header=False is only set to avoid the printout of
# the probeye header which is not helpful here)
problem = InverseProblem("Linear regression with Gaussian noise", print_header=False)

# add the problem's parameters
problem.add_parameter(
    "a",
    tex="$a$",
    info="Slope of the graph",
    prior=Normal(mean=2.0, std=1.0),
)
problem.add_parameter(
    "b",
    info="Intersection of graph with y-axis",
    tex="$b$",
    prior=Normal(mean=1.0, std=1.0),
)
problem.add_parameter(
    "sigma",
    domain="(0, +oo)",
    tex=r"$\sigma$",
    info="Standard deviation, of zero-mean Gaussian noise model",
    prior=Uniform(low=0.0, high=0.8),
)

As the next step, we need to add our experimental data the forward model and the likelihood model. Note that the order is important and should not be changed.

# experimental data
problem.add_experiment(
    name="TestSeries_1",
    sensor_data={"x": x_test, "y": y_test},
)

# forward model
problem.add_forward_model(LinearModel("LinearModel"), experiments="TestSeries_1")

# likelihood model
problem.add_likelihood_model(
    GaussianLikelihoodModel(experiment_name="TestSeries_1", model_error="additive")
)

Now, our problem definition is complete, and we can take a look at its summary:

# print problem summary
problem.info(print_header=True)
2023-08-02 14:21:18.891 | INFO     | # ================================================================================================ # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.892 | INFO     | #                                                                                                  # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.892 | INFO     | #                                            dP                                                    # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.892 | INFO     | #                                            88                                                    # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.892 | INFO     | #                  88d888b. 88d888b..d8888b. 88d888b. .d8888b. dP    dP .d8888b.                   # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.892 | INFO     | #                  88'  `88 88'     88'  `88 88'  `88 88ooood8 88    88 88ooood8                   # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.892 | INFO     | #                  88.  .88 88      88.  .88 88.  .88 88.      88.  .88 88.                        # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.892 | INFO     | #                  88Y888P' dP      `88888P' 88Y8888' `88888P' `8888P88 `88888P'                   # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.892 | INFO     | #                  88                                                .88                           # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.892 | INFO     | #                  dP                                            d8888P                            # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.893 | INFO     | #                                                                                                  # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.893 | INFO     | # ================================================================================================ # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.893 | INFO     | #                                                                                                  # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.893 | INFO     | #        Version 3.0.4 - A general framework for setting up parameter estimation problems.         # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.893 | INFO     | #                                                                                                  # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.893 | INFO     | # ================================================================================================ # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:18.896 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.896 | INFO     | Problem summary: Linear regression with Gaussian noise                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.896 | INFO     | ======================================================                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.896 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.896 | INFO     | Forward models                                                                                       | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.896 | INFO     | ---------------------------------------------------------                                            | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.896 | INFO     |  Model name   | Global parameters   | Local parameters                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.897 | INFO     | --------------+---------------------+--------------------                                            | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.897 | INFO     |  LinearModel  | a, b                | a, b                                                           | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.897 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.897 | INFO     | Priors                                                                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.897 | INFO     | -----------------------------------------------------------------------------                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.897 | INFO     |  Prior name    | Global parameters            | Local parameters                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.897 | INFO     | ---------------+------------------------------+------------------------------                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.897 | INFO     |  a_normal      | a, mean_a, std_a             | a, mean_a, std_a                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.897 | INFO     |  b_normal      | b, mean_b, std_b             | b, mean_b, std_b                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.898 | INFO     |  sigma_uniform | sigma, low_sigma, high_sigma | sigma, low_sigma, high_sigma                         | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.898 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.898 | INFO     | Parameter overview                                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.898 | INFO     | ---------------------------------------------------------------------------------------              | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.898 | INFO     |  Parameter type/role   | Parameter names                                     |   Count               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.898 | INFO     | -----------------------+-----------------------------------------------------+---------              | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.898 | INFO     |  Model parameters      | a, b                                                |       2               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.898 | INFO     |  Prior parameters      | mean_a, std_a, mean_b, std_b, low_sigma, high_sigma |       6               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.898 | INFO     |  Likelihood parameters | sigma                                               |       1               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.898 | INFO     |  Const parameters      | mean_a, std_a, mean_b, std_b, low_sigma, high_sigma |       6               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.899 | INFO     |  Latent parameters     | a, b, sigma                                         |       3               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.899 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.899 | INFO     | Parameter explanations                                                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.899 | INFO     | ---------------------------------------------------------------------                                | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.899 | INFO     |  Name       | Short explanation                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.899 | INFO     | ------------+--------------------------------------------------------                                | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.899 | INFO     |  mean_a     | Normal prior's parameter for latent parameter 'a'                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.899 | INFO     |  std_a      | Normal prior's parameter for latent parameter 'a'                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.899 | INFO     |  a          | Slope of the graph                                                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.900 | INFO     |  mean_b     | Normal prior's parameter for latent parameter 'b'                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.900 | INFO     |  std_b      | Normal prior's parameter for latent parameter 'b'                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.900 | INFO     |  b          | Intersection of graph with y-axis                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.900 | INFO     |  low_sigma  | Uniform prior's parameter for latent parameter 'sigma'                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.900 | INFO     |  high_sigma | Uniform prior's parameter for latent parameter 'sigma'                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.900 | INFO     |  sigma      | Standard deviation, of zero-mean Gaussian noise model                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.900 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.900 | INFO     | Constant parameters                                                                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.900 | INFO     | ----------------------                                                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.900 | INFO     |  Name       |   Value                                                                                | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.901 | INFO     | ------------+---------                                                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.901 | INFO     |  mean_a     |     2                                                                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.901 | INFO     |  std_a      |     1                                                                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.901 | INFO     |  mean_b     |     1                                                                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.901 | INFO     |  std_b      |     1                                                                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.901 | INFO     |  low_sigma  |     0                                                                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.901 | INFO     |  high_sigma |     0.8                                                                                | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.901 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.901 | INFO     | Theta interpretation                                                                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.902 | INFO     | +---------------------------+                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.902 | INFO     | |  Theta  |    Parameter    |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.902 | INFO     | |  index  |      name       |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.902 | INFO     | |---------------------------|                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.902 | INFO     | |      0 --> a              |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.902 | INFO     | |      1 --> b              |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.902 | INFO     | |      2 --> sigma          |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.902 | INFO     | +---------------------------+                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.902 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.902 | INFO     | Added experiments                                                                                    | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.903 | INFO     | --------------------------------------------------                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.903 | INFO     |  Name         | Sensor values   | Forward model                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.903 | INFO     | --------------+-----------------+-----------------                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.903 | INFO     |  TestSeries_1 | x (50 elements) | LinearModel                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.903 | INFO     |               | y (50 elements) |                                                                    | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.903 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.903 | INFO     | Added likelihood models                                                                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.903 | INFO     | ---------------------------------------------------------------                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.903 | INFO     |  Name         | Parameters   | Target sensors   | Experiment                                         | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.903 | INFO     | --------------+--------------+------------------+--------------                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.904 | INFO     |  TestSeries_1 | sigma        | y                | TestSeries_1                                       | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:18.904 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978

After the problem definition comes the problem solution. There are different solver one can use, but we will just demonstrate how to use two of them: the scipy-solver, which merely provides a point estimate based on a maximum likelihood optimization, and the emcee solver, which is a MCMC-sampling solver. Let’s begin with the scipy-solver:

# this is for using the scipy-solver (maximum likelihood estimation)
scipy_solver = MaxLikelihoodSolver(problem, show_progress=False)
max_like_data = scipy_solver.run()
2023-08-02 14:21:18.906 | INFO     | Solving problem via maximum likelihood estimation                                                    | probeye.inference.scipy.solver:_run_ml_or_map:452
2023-08-02 14:21:18.906 | INFO     | Using start values:                                                                                  | probeye.inference.scipy.solver:_run_ml_or_map:460
2023-08-02 14:21:18.906 | INFO     | a      = 2.0                                                                                         | probeye.subroutines:print_dict_in_rows:728
2023-08-02 14:21:18.907 | INFO     | b      = 1.0                                                                                         | probeye.subroutines:print_dict_in_rows:728
2023-08-02 14:21:18.907 | INFO     | sigma  = 0.4                                                                                         | probeye.subroutines:print_dict_in_rows:728
2023-08-02 14:21:18.907 | INFO     | Starting optimizer (using Nelder-Mead)                                                               | probeye.inference.scipy.solver:_run_ml_or_map:464
2023-08-02 14:21:18.907 | INFO     | No solver options specified                                                                          | probeye.inference.scipy.solver:_run_ml_or_map:469
2023-08-02 14:21:18.918 | INFO     |                                                                                                      | probeye.inference.scipy.solver:summarize_point_estimate_results:356
2023-08-02 14:21:18.919 | INFO     | Results of maximum likelihood estimation                                                             | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:18.919 | INFO     | =====================================                                                                | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:18.919 | INFO     | Optimization terminated successfully.                                                                | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:18.919 | INFO     | -------------------------------------                                                                | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:18.919 | INFO     | Number of iterations:           77                                                                   | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:18.919 | INFO     | Number of function evaluations: 140                                                                  | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:18.919 | INFO     | -------------------------------------                                                                | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:18.920 | INFO     | a_opt     = [2.70335337] (start = 2.0)                                                               | probeye.inference.scipy.solver:summarize_point_estimate_results:382
2023-08-02 14:21:18.920 | INFO     | b_opt     = [1.58558399] (start = 1.0)                                                               | probeye.inference.scipy.solver:summarize_point_estimate_results:382
2023-08-02 14:21:18.920 | INFO     | sigma_opt = [0.48107272] (start = 0.4)                                                               | probeye.inference.scipy.solver:summarize_point_estimate_results:382
2023-08-02 14:21:18.920 | INFO     |                                                                                                      | probeye.inference.scipy.solver:summarize_point_estimate_results:383

All solver have in common that they are first initialized, and then execute a run-method, which returns its result data in the format of an arviz inference-data object (except for the scipy-solver). Let’s now take a look at the emcee-solver.

# this is for using the emcee-solver (MCMC sampling)
emcee_solver = EmceeSolver(problem, show_progress=False)
inference_data = emcee_solver.run(n_steps=2000, n_initial_steps=200)
2023-08-02 14:21:18.923 | INFO     | Solving problem using emcee sampler with 200 + 2000 samples and 20 walkers                           | probeye.inference.emcee.solver:run:178
2023-08-02 14:21:18.923 | INFO     | No additional options specified                                                                      | probeye.inference.emcee.solver:run:186
2023-08-02 14:21:40.518 | INFO     | Sampling of the posterior distribution completed: 2000 steps and 20 walkers.                         | probeye.inference.emcee.solver:run:255
2023-08-02 14:21:40.519 | INFO     | Total run-time (including initial sampling): 21s.                                                    | probeye.inference.emcee.solver:run:259
2023-08-02 14:21:40.519 | INFO     |                                                                                                      | probeye.inference.emcee.solver:run:260
2023-08-02 14:21:40.519 | INFO     | Summary of sampling results (emcee)                                                                  | probeye.inference.emcee.solver:run:261
2023-08-02 14:21:40.526 | INFO     |          mean    median    sd    5%    95%                                                           | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:21:40.527 | INFO     | -----  ------  --------  ----  ----  -----                                                           | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:21:40.527 | INFO     | a        2.69      2.69  0.23  2.31   3.06                                                           | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:21:40.527 | INFO     | b        1.59      1.59  0.14  1.37   1.82                                                           | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:21:40.527 | INFO     | sigma    0.50      0.50  0.05  0.43   0.60                                                           | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:21:40.527 | INFO     |                                                                                                      | probeye.inference.emcee.solver:run:267

Finally, we want to plot the results we obtained. To that end, probeye provides some post-processing routines, which are mostly based on the arviz-plotting routines.

# this is optional, since in most cases we don't know the ground truth
true_values = {"a": a_true, "b": b_true, "sigma": std_noise}

# this is an overview plot that allows to visualize correlations
pair_plot_array = create_pair_plot(
    inference_data,
    emcee_solver.problem,
    true_values=true_values,
    focus_on_posterior=True,
    show_legends=True,
    title="Sampling results from emcee-Solver (pair plot)",
)
Sampling results from emcee-Solver (pair plot)
/home/docs/checkouts/readthedocs.org/user_builds/probeye/envs/latest/lib/python3.8/site-packages/arviz/utils.py:184: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  numba_fn = numba.jit(**self.kwargs)(self.function)
# this is a posterior plot, without including priors
post_plot_array = create_posterior_plot(
    inference_data,
    emcee_solver.problem,
    true_values=true_values,
    title="Sampling results from emcee-Solver (posterior plot)",
)
Sampling results from emcee-Solver (posterior plot), $a$, $b$, $\sigma$
# trace plots are used to check for "healthy" sampling
trace_plot_array = create_trace_plot(
    inference_data,
    emcee_solver.problem,
    title="Sampling results from emcee-Solver (trace plot)",
)
Sampling results from emcee-Solver (trace plot), $a$, $a$, $b$, $b$, $\sigma$, $\sigma$

Total running time of the script: ( 0 minutes 29.354 seconds)

Linear regression with 1D-correlation#

The n data points (y1, y2, …, yn) generated for this example are sampled from an n-variate normal distribution with mean values given by yi = a * xi + b with a, b being the model parameters and x1, x2, …, xi, …, xn being predefined spatial x-coordinates ranging from 0 to 1. The data points (y1, y2, …, yn) are not independent but correlated in x. The corresponding covariance matrix is defined based on an exponential correlation function parameterized by the constant standard deviation sigma of the n-variate normal distribution and a correlation length l_corr. Hence, the full model has four parameters a, b, sigma, l_corr, all of which are inferred in this example using maximum likelihood estimation as well as sampling via emcee.

First, let’s import the required functions and classes for this example.

# third party imports
import numpy as np
import matplotlib.pyplot as plt
from tripy.utils import correlation_function
from tripy.utils import correlation_matrix

# local imports (problem definition)
from probeye.definition.inverse_problem import InverseProblem
from probeye.definition.forward_model import ForwardModelBase
from probeye.definition.distribution import Normal, Uniform
from probeye.definition.sensor import Sensor
from probeye.definition.likelihood_model import GaussianLikelihoodModel
from probeye.definition.correlation_model import ExpModel

# local imports (problem solving)
from probeye.inference.scipy.solver import MaxLikelihoodSolver
from probeye.inference.emcee.solver import EmceeSolver

# local imports (inference data post-processing)
from probeye.postprocessing.sampling_plots import create_pair_plot
from probeye.postprocessing.sampling_plots import create_posterior_plot
from probeye.postprocessing.sampling_plots import create_trace_plot

We start by generating a synthetic data set from a known linear model to which we will add correlated noise. Afterwards, we will pretend to have forgotten the parameters of this ground-truth model and will instead try to recover them just from the data. The slope (a) and intercept (b) of the ground truth model are set to be:

# ground truth
a_true = 2.5
b_true = 1.7

Now, let’s generate a few data points that we contaminate with a Gaussian error:

# settings for the data generation
n_experiments = 3
n_points = 50
seed = 1
std_noise = 0.5
l_corr = 0.05

# first create the true values without an error model; these 'true' values will be the
# mean values for sampling from a multivariate normal distribution that accounts for the
# intended correlation
np.random.seed(seed)
x_test = np.linspace(0.0, 1.0, n_points)
y_true = a_true * x_test + b_true

# assemble the spatial covariance matrix
x_test_as_column_matrix = x_test.reshape((n_points, -1))
f_corr = lambda a: correlation_function(d=a, correlation_length=l_corr)
cov = std_noise**2 * correlation_matrix(x_test_as_column_matrix, f_corr)

# now generate the noisy test data including correlations; we assume here that
# there are n_experiments test series
data_dict = {}
for i in range(n_experiments):
    y_test_i = np.random.multivariate_normal(mean=y_true, cov=cov)
    data_dict[f"Test_{i}"] = y_test_i
    plt.scatter(x_test, y_test_i, label=f"measured data (test {i+1})", s=10, zorder=10)
# finish the plot
plt.plot(x_test, y_true, label="true model", c="black")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.show()
plot correlation 1D

Until this point, we didn’t use probeye at all, since we just generate some data. In a normal use case, we wouldn’t have to generate our data of course. Instead, it would be provided to us, for example as the result of some test series. As the first step in any calibration problem, one needs to have a parameterized model (in probeye such a model is called ‘forward model’) of which one assumes that it is able to describe the data at hand. In this case, if we took a look at the blue data points in the plot above without knowing the orange line, we might expect a simple linear model. It is now our job to describe this model within the probeye-framework. This is done by defining our own specific model class:

class LinearModel(ForwardModelBase):
    def interface(self):
        self.parameters = ["a", "b"]
        self.input_sensors = Sensor("x")
        self.output_sensors = Sensor("y", std_model="sigma")

    def response(self, inp: dict) -> dict:
        x = inp["x"]
        m = inp["a"]
        b = inp["b"]
        return {"y": m * x + b}

First, note that this model class is based on the probeye class ‘ForwardModelBase’. While this is a requirement, the name of the class can be chosen freely. As you can see, this class has a ‘interface’ and a ‘response’ method. In the ‘interface’ method we define that our model has two parameters, ‘a’ and ‘b’, next to one input and one output sensors, called ‘x’ and ‘y’ respectively. Keeping this interface in mind, let’s now take a look at the ‘response’ method. This method describes the actual forward model evaluation. The method takes one dictionary as an input and returns one dictionary as its output. The input dictionary ‘inp’ will have the keys ‘a’, ‘b’ and ‘x’ because of the definitions given in self.interface. Analogously, the returned dictionary must have the key ‘y’, because we defined an output sensor with the name ‘y’. Note that the entire interface of the ‘response’ method is described by the ‘interface’ method. Parameters and input sensors will be contained in the ‘inp’ dictionary, while the output sensors must be contained in the returned dictionary.

After we now have defined our forward model, we can set up the inverse problem itself. This always begins by initializing an object form the InverseProblem-class, and adding all of the problem’s parameters with priors that reflect our current best guesses of what the parameter’s values might look like. Please check out the ‘Components’-part of this documentation to get more information on the arguments seen below. However, most of the code should be self-explanatory.

# initialize the inverse problem with a useful name
problem = InverseProblem("Linear regression with 1D correlation", print_header=False)

# add the problem's parameters
problem.add_parameter(
    "a",
    tex="$a$",
    info="Slope of the graph",
    prior=Normal(mean=2.0, std=1.0),
)
problem.add_parameter(
    "b",
    info="Intersection of graph with y-axis",
    tex="$b$",
    prior=Normal(mean=1.0, std=1.0),
)
problem.add_parameter(
    "sigma",
    domain="(0, +oo)",
    tex=r"$\sigma$",
    info="Standard deviation, of zero-mean Gaussian noise model",
    prior=Uniform(low=0.0, high=0.8),
)
problem.add_parameter(
    "l_corr",
    domain="(0, +oo)",
    tex=r"$l_\mathrm{corr}$",
    info="Correlation length of correlation model",
    prior=Uniform(low=0.0, high=0.2),
)

As the next step, we need to add our experimental data the forward model and the likelihood model. Note that the order is important and should not be changed.

# experimental data
for exp_name, y_test_i in data_dict.items():
    problem.add_experiment(
        name=exp_name,
        sensor_data={"x": x_test, "y": y_test_i},
    )

# add the forward model to the problem
problem.add_forward_model(LinearModel("LinearModel"), experiments=[*data_dict.keys()])

# likelihood model
for exp_name in data_dict:
    likelihood_model = GaussianLikelihoodModel(
        experiment_name=exp_name,
        model_error="additive",
        correlation=ExpModel(x="l_corr"),
    )
    problem.add_likelihood_model(likelihood_model)

Now, our problem definition is complete, and we can take a look at its summary:

# print problem summary
problem.info(print_header=True)
2023-08-02 14:21:47.461 | INFO     | # ================================================================================================ # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.461 | INFO     | #                                                                                                  # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.462 | INFO     | #                                            dP                                                    # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.462 | INFO     | #                                            88                                                    # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.462 | INFO     | #                  88d888b. 88d888b..d8888b. 88d888b. .d8888b. dP    dP .d8888b.                   # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.462 | INFO     | #                  88'  `88 88'     88'  `88 88'  `88 88ooood8 88    88 88ooood8                   # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.462 | INFO     | #                  88.  .88 88      88.  .88 88.  .88 88.      88.  .88 88.                        # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.462 | INFO     | #                  88Y888P' dP      `88888P' 88Y8888' `88888P' `8888P88 `88888P'                   # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.462 | INFO     | #                  88                                                .88                           # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.462 | INFO     | #                  dP                                            d8888P                            # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.462 | INFO     | #                                                                                                  # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.462 | INFO     | # ================================================================================================ # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.463 | INFO     | #                                                                                                  # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.463 | INFO     | #        Version 3.0.4 - A general framework for setting up parameter estimation problems.         # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.463 | INFO     | #                                                                                                  # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.463 | INFO     | # ================================================================================================ # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:21:47.465 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.465 | INFO     | Problem summary: Linear regression with 1D correlation                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.465 | INFO     | ======================================================                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.466 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.466 | INFO     | Forward models                                                                                       | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.466 | INFO     | ---------------------------------------------------------                                            | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.466 | INFO     |  Model name   | Global parameters   | Local parameters                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.466 | INFO     | --------------+---------------------+--------------------                                            | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.466 | INFO     |  LinearModel  | a, b                | a, b                                                           | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.466 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.466 | INFO     | Priors                                                                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.466 | INFO     | ------------------------------------------------------------------------------------                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.467 | INFO     |  Prior name     | Global parameters               | Local parameters                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.467 | INFO     | ----------------+---------------------------------+---------------------------------                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.467 | INFO     |  a_normal       | a, mean_a, std_a                | a, mean_a, std_a                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.467 | INFO     |  b_normal       | b, mean_b, std_b                | b, mean_b, std_b                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.467 | INFO     |  sigma_uniform  | sigma, low_sigma, high_sigma    | sigma, low_sigma, high_sigma                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.467 | INFO     |  l_corr_uniform | l_corr, low_l_corr, high_l_corr | l_corr, low_l_corr, high_l_corr                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.467 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.467 | INFO     | Parameter overview                                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.467 | INFO     | ---------------------------------------------------------------------------------------------------------------- | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.467 | INFO     |  Parameter type/role   | Parameter names                                                              |   Count | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.468 | INFO     | -----------------------+------------------------------------------------------------------------------+--------- | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.468 | INFO     |  Model parameters      | a, b                                                                         |       2 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.468 | INFO     |  Prior parameters      | mean_a, std_a, mean_b, std_b, low_sigma, high_sigma, low_l_corr, high_l_corr |       8 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.468 | INFO     |  Likelihood parameters | sigma, l_corr                                                                |       2 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.468 | INFO     |  Const parameters      | mean_a, std_a, mean_b, std_b, low_sigma, high_sigma, low_l_corr, high_l_corr |       8 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.468 | INFO     |  Latent parameters     | a, b, sigma, l_corr                                                          |       4 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.468 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.468 | INFO     | Parameter explanations                                                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.468 | INFO     | -----------------------------------------------------------------------                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.468 | INFO     |  Name        | Short explanation                                                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.469 | INFO     | -------------+---------------------------------------------------------                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.469 | INFO     |  mean_a      | Normal prior's parameter for latent parameter 'a'                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.469 | INFO     |  std_a       | Normal prior's parameter for latent parameter 'a'                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.469 | INFO     |  a           | Slope of the graph                                                                    | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.469 | INFO     |  mean_b      | Normal prior's parameter for latent parameter 'b'                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.469 | INFO     |  std_b       | Normal prior's parameter for latent parameter 'b'                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.469 | INFO     |  b           | Intersection of graph with y-axis                                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.469 | INFO     |  low_sigma   | Uniform prior's parameter for latent parameter 'sigma'                                | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.469 | INFO     |  high_sigma  | Uniform prior's parameter for latent parameter 'sigma'                                | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.469 | INFO     |  sigma       | Standard deviation, of zero-mean Gaussian noise model                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.470 | INFO     |  low_l_corr  | Uniform prior's parameter for latent parameter 'l_corr'                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.470 | INFO     |  high_l_corr | Uniform prior's parameter for latent parameter 'l_corr'                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.470 | INFO     |  l_corr      | Correlation length of correlation model                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.470 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.470 | INFO     | Constant parameters                                                                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.470 | INFO     | -----------------------                                                                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.470 | INFO     |  Name        |   Value                                                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.470 | INFO     | -------------+---------                                                                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.470 | INFO     |  mean_a      |     2                                                                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.470 | INFO     |  std_a       |     1                                                                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.471 | INFO     |  mean_b      |     1                                                                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.471 | INFO     |  std_b       |     1                                                                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.471 | INFO     |  low_sigma   |     0                                                                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.471 | INFO     |  high_sigma  |     0.8                                                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.471 | INFO     |  low_l_corr  |     0                                                                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.471 | INFO     |  high_l_corr |     0.2                                                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.471 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.472 | INFO     | Theta interpretation                                                                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.472 | INFO     | +---------------------------+                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.472 | INFO     | |  Theta  |    Parameter    |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.472 | INFO     | |  index  |      name       |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.472 | INFO     | |---------------------------|                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.472 | INFO     | |      0 --> a              |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.472 | INFO     | |      1 --> b              |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.472 | INFO     | |      2 --> sigma          |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.472 | INFO     | |      3 --> l_corr         |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.472 | INFO     | +---------------------------+                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.473 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.473 | INFO     | Added experiments                                                                                    | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.473 | INFO     | --------------------------------------------                                                         | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.473 | INFO     |  Name   | Sensor values   | Forward model                                                            | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.473 | INFO     | --------+-----------------+-----------------                                                         | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.473 | INFO     |  Test_0 | x (50 elements) | LinearModel                                                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.473 | INFO     |         | y (50 elements) |                                                                          | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.473 | INFO     |  Test_1 | x (50 elements) | LinearModel                                                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.473 | INFO     |         | y (50 elements) |                                                                          | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.473 | INFO     |  Test_2 | x (50 elements) | LinearModel                                                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.474 | INFO     |         | y (50 elements) |                                                                          | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.474 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.474 | INFO     | Added likelihood models                                                                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.474 | INFO     | ----------------------------------------------------------                                           | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.474 | INFO     |  Name   | Parameters    | Target sensors   | Experiment                                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.474 | INFO     | --------+---------------+------------------+--------------                                           | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.474 | INFO     |  Test_0 | l_corr, sigma | y                | Test_0                                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.474 | INFO     |  Test_1 | l_corr, sigma | y                | Test_1                                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.474 | INFO     |  Test_2 | l_corr, sigma | y                | Test_2                                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:21:47.474 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978

After the problem definition comes the problem solution. There are different solver one can use, but we will just demonstrate how to use two of them: the scipy-solver, which merely provides a point estimate based on a maximum likelihood optimization, and the emcee solver, which is a MCMC-sampling solver. Let’s begin with the scipy-solver:

# this is for using the scipy-solver (maximum likelihood estimation)
scipy_solver = MaxLikelihoodSolver(problem, show_progress=False)
max_like_data = scipy_solver.run()
2023-08-02 14:21:47.478 | INFO     | Solving problem via maximum likelihood estimation                                                    | probeye.inference.scipy.solver:_run_ml_or_map:452
2023-08-02 14:21:47.479 | INFO     | Using start values:                                                                                  | probeye.inference.scipy.solver:_run_ml_or_map:460
2023-08-02 14:21:47.479 | INFO     | a       = 2.0                                                                                        | probeye.subroutines:print_dict_in_rows:728
2023-08-02 14:21:47.479 | INFO     | b       = 1.0                                                                                        | probeye.subroutines:print_dict_in_rows:728
2023-08-02 14:21:47.480 | INFO     | sigma   = 0.4                                                                                        | probeye.subroutines:print_dict_in_rows:728
2023-08-02 14:21:47.480 | INFO     | l_corr  = 0.1                                                                                        | probeye.subroutines:print_dict_in_rows:728
2023-08-02 14:21:47.480 | INFO     | Starting optimizer (using Nelder-Mead)                                                               | probeye.inference.scipy.solver:_run_ml_or_map:464
2023-08-02 14:21:47.480 | INFO     | No solver options specified                                                                          | probeye.inference.scipy.solver:_run_ml_or_map:469
2023-08-02 14:21:47.568 | INFO     |                                                                                                      | probeye.inference.scipy.solver:summarize_point_estimate_results:356
2023-08-02 14:21:47.568 | INFO     | Results of maximum likelihood estimation                                                             | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:47.568 | INFO     | =====================================                                                                | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:47.568 | INFO     | Optimization terminated successfully.                                                                | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:47.568 | INFO     | -------------------------------------                                                                | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:47.568 | INFO     | Number of iterations:           131                                                                  | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:47.568 | INFO     | Number of function evaluations: 225                                                                  | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:47.569 | INFO     | -------------------------------------                                                                | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:21:47.569 | INFO     | a_opt      = [2.50521064] (start = 2.0)                                                              | probeye.inference.scipy.solver:summarize_point_estimate_results:382
2023-08-02 14:21:47.569 | INFO     | b_opt      = [1.57413752] (start = 1.0)                                                              | probeye.inference.scipy.solver:summarize_point_estimate_results:382
2023-08-02 14:21:47.569 | INFO     | sigma_opt  = [0.45072398] (start = 0.4)                                                              | probeye.inference.scipy.solver:summarize_point_estimate_results:382
2023-08-02 14:21:47.570 | INFO     | l_corr_opt = [0.05469275] (start = 0.1)                                                              | probeye.inference.scipy.solver:summarize_point_estimate_results:382
2023-08-02 14:21:47.570 | INFO     |                                                                                                      | probeye.inference.scipy.solver:summarize_point_estimate_results:383

All solver have in common that they are first initialized, and then execute a run-method, which returns its result data in the format of an arviz inference-data object (except for the scipy-solver). Let’s now take a look at the emcee-solver.

emcee_solver = EmceeSolver(problem, show_progress=False)
inference_data = emcee_solver.run(n_steps=2000, n_initial_steps=200)
2023-08-02 14:21:47.574 | INFO     | Solving problem using emcee sampler with 200 + 2000 samples and 20 walkers                           | probeye.inference.emcee.solver:run:178
2023-08-02 14:21:47.574 | INFO     | No additional options specified                                                                      | probeye.inference.emcee.solver:run:186
2023-08-02 14:22:31.892 | INFO     | Sampling of the posterior distribution completed: 2000 steps and 20 walkers.                         | probeye.inference.emcee.solver:run:255
2023-08-02 14:22:31.892 | INFO     | Total run-time (including initial sampling): 44s.                                                    | probeye.inference.emcee.solver:run:259
2023-08-02 14:22:31.892 | INFO     |                                                                                                      | probeye.inference.emcee.solver:run:260
2023-08-02 14:22:31.893 | INFO     | Summary of sampling results (emcee)                                                                  | probeye.inference.emcee.solver:run:261
2023-08-02 14:22:31.900 | INFO     |           mean    median    sd    5%    95%                                                          | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:22:31.901 | INFO     | ------  ------  --------  ----  ----  -----                                                          | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:22:31.901 | INFO     | a         2.48      2.48  0.30  2.01   2.99                                                          | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:22:31.901 | INFO     | b         1.57      1.58  0.18  1.26   1.86                                                          | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:22:31.901 | INFO     | sigma     0.51      0.50  0.07  0.42   0.65                                                          | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:22:31.901 | INFO     | l_corr    0.07      0.07  0.03  0.04   0.12                                                          | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:22:31.901 | INFO     |                                                                                                      | probeye.inference.emcee.solver:run:267

Finally, we want to plot the results we obtained. To that end, probeye provides some post-processing routines, which are mostly based on the arviz-plotting routines.

# this is optional, since in most cases we don't know the ground truth
true_values = {"a": a_true, "b": b_true, "sigma": std_noise, "l_corr": l_corr}

# this is an overview plot that allows to visualize correlations
pair_plot_array = create_pair_plot(
    inference_data,
    emcee_solver.problem,
    true_values=true_values,
    focus_on_posterior=True,
    show_legends=False,
    title="Sampling results from emcee-Solver (pair plot)",
)
Sampling results from emcee-Solver (pair plot)
# this is a posterior plot, without including priors
post_plot_array = create_posterior_plot(
    inference_data,
    emcee_solver.problem,
    true_values=true_values,
    title="Sampling results from emcee-Solver (posterior plot)",
)
Sampling results from emcee-Solver (posterior plot), $a$, $b$, $\sigma$, $l_\mathrm{corr}$
# trace plots are used to check for "healthy" sampling
trace_plot_array = create_trace_plot(
    inference_data,
    emcee_solver.problem,
    title="Sampling results from emcee-Solver (trace plot)",
)
Sampling results from emcee-Solver (trace plot), $a$, $a$, $b$, $b$, $\sigma$, $\sigma$, $l_\mathrm{corr}$, $l_\mathrm{corr}$

Total running time of the script: ( 0 minutes 48.704 seconds)

Simple bridge model with time-space correlation#

A bridge (modeled as a simply supported beam) is equipped at multiple positions with a deflection sensor. All sensors record a time series of deflection while cars with different weights and velocities cross the bridge. Correlation is assumed in both space and time. The goal of the inference is to estimate the bridge’s bending stiffness ‘EI’. Next to ‘EI’ there are three other parameters to infer: the additive model error std. deviation ‘sigma’, the temporal correlation length ‘l_corr_t’ and the spatial corr. length l_corr_x. Hence, four parameters in total, all of which are inferred in this example using maximum likelihood estimation as well as sampling via emcee.

First, let’s import the required functions and classes for this example.

# third party imports
import numpy as np
import matplotlib.pyplot as plt
from tripy.base import MeasurementSpaceTimePoints
from tripy.utils import correlation_function

# local imports (problem definition)
from probeye.definition.inverse_problem import InverseProblem
from probeye.definition.forward_model import ForwardModelBase
from probeye.definition.distribution import Normal, Uniform
from probeye.definition.likelihood_model import GaussianLikelihoodModel
from probeye.definition.correlation_model import ExpModel
from probeye.definition.sensor import Sensor
from probeye.subroutines import len_or_one
from probeye.subroutines import HiddenPrints

# local imports (problem solving)
from probeye.inference.scipy.solver import MaxLikelihoodSolver
from probeye.inference.emcee.solver import EmceeSolver

# local imports (inference data post-processing)
from probeye.postprocessing.sampling_plots import create_pair_plot
from probeye.postprocessing.sampling_plots import create_posterior_plot
from probeye.postprocessing.sampling_plots import create_trace_plot

The bridge that is considered in this example should have a length of ‘L_bridge’, while the positions of the ‘ns’ deflection sensors it is equipped with are given by the list ‘x_sensors’. Note that the bridge is modeled as a 1D bridge, where the coordinates of the ‘ns’ sensors are measured from the side of the bridge where the cars enter the bridge. All length measurements are stated in meters. For the following computations we also need the gravitational constant, which is also given below. Finally, we need a true value for the bridge’s bending stiffness ‘EI_true’.

# relevant length measurements; feel free to add more sensors by adding the respective
# positions in the x_sensors-list
L_bridge = 100.0  # [m]
x_sensors = [49, 51, 55]

# other relevant constants
g = 9.81  # [m/s**2]

# 'true' value of EI
EI_true = 1.0  # [10^10 Nm^2]

We will now begin with artificially generating our data. To that end, let’s define three experiments, that is three scenarios where a car with a certain mass is crossing the bridge at a certain velocity. Feel free to add more experiments here.

# definition of the experiments
experiments_def = {
    "Experiment_1": {
        "car_mass_kg": 3000.0,
        "car_speed_m/s": 2.5,
        "plot_color": "black",
    },
    "Experiment_2": {
        "car_mass_kg": 5000.0,
        "car_speed_m/s": 10,
        "plot_color": "blue",
    },
    "Experiment_3": {
        "car_mass_kg": 10000.0,
        "car_speed_m/s": 5.0,
        "plot_color": "red",
    },
}

Let’s now define the parameters required to define the time-space correlated noise that we need to generate our synthetic test data.

# 'true' value of noise sd, and its uniform prior parameters
sigma = 0.01
low_sigma = 0.0
high_sigma = 0.2

# 'true' value of spatial correlation length, and its uniform prior parameters
l_corr_x = 10.0  # [m]
low_l_corr_x = 1.0  # [m]
high_l_corr_x = 25.0  # [m]

# 'true' value of temporal correlation length, and its uniform prior parameters
l_corr_t = 1.0  # [s]
low_l_corr_t = 0.0  # [s]
high_l_corr_t = 5.0  # [s]

# settings for the data generation
ns = len(x_sensors)
dt = 0.5  # [s]
seed = 1

We’ll continue with defining the forward model. Note however, that we are just defining it as a class, but we do not add it to the problem yet. The reason for defining it now is that we will use it to artificially generate our test data.

class BridgeModel(ForwardModelBase):
    def interface(self):
        self.parameters = ["L", "EI"]
        self.input_sensors = [Sensor("v"), Sensor("t"), Sensor("F")]
        self.output_sensors = []
        for i, x_i in enumerate(x_sensors):
            self.output_sensors.append(
                Sensor(name=f"y{i + 1}", x=x_i, std_model="sigma")
            )

    @staticmethod
    def beam_deflect(x_sensor, x_load, L_in, F_in, EI_in):
        """Convenience method used by self.response during a for-loop."""
        y = np.zeros(len_or_one(x_load))
        for i, x_load_i in enumerate(x_load):
            if x_sensor <= x_load_i:
                b = L_in - x_load_i
                x = x_sensor
            else:
                b = x_load_i
                x = L_in - x_sensor
            y[i] = -(F_in * b * x) / (6 * L_in * EI_in) * (L_in**2 - b**2 - x**2)
        return y

    def response(self, inp: dict) -> dict:
        v_in = inp["v"]
        t_in = inp["t"]
        L_in = inp["L"]
        F_in = inp["F"]
        EI_in = inp["EI"] * 1e10
        response = {}
        x_load = v_in * t_in
        for os in self.output_sensors:
            response[os.name] = self.beam_deflect(os.x, x_load, L_in, F_in, EI_in)
        return response

Next to the forward model, we need to define two correlation functions in order to be able to generate the data. Please note that this is just required for the data generation. It is not required for setting up the probeye framework. If we had real data, we would not have to do this.

def correlation_func_space(d):
    return correlation_function(d, correlation_length=l_corr_x)


def correlation_func_time(d):
    return correlation_function(d, correlation_length=l_corr_t)

At this point, we have prepared all ingredients for generating our synthetic data. This code block is a bit longer, since we have to account for the non-trivial time-space correlation structure (we use tripy for this).

# initialize the bridge model
bridge_model = BridgeModel("BridgeModel")

# for reproducible results
np.random.seed(seed)

# create and plot data for each experiment
data_dict = {}  # type: dict
for j, (exp_name, exp_dict) in enumerate(experiments_def.items()):

    # load experimental setting
    v = exp_dict["car_speed_m/s"]
    F = exp_dict["car_mass_kg"] * g  # type: ignore

    # compute the 'true' deflections for each sensor which will serve as mean
    # values; note that the values are concatenated to a long vector
    t_end = L_bridge / v  # type: ignore
    t = np.arange(0, t_end, dt)
    if t[-1] != t_end:
        t = np.append(t, t[-1] + dt)
    nt = len(t)
    inp_1 = {"v": v, "t": t, "L": L_bridge, "F": F, "EI": EI_true}
    mean_dict = bridge_model.response(inp_1)
    mean = np.zeros(ns * nt)
    for ii, mean_vector in enumerate([*mean_dict.values()]):
        mean[ii::ns] = mean_vector

    # compute the covariance matrix using tripy
    cov_compiler = MeasurementSpaceTimePoints()
    cov_compiler.add_measurement_space_points(
        coord_mx=[[x] for x in x_sensors],
        standard_deviation=sigma,
        group="space",
    )
    cov_compiler.add_measurement_time_points(coord_vec=t, group="time")
    with HiddenPrints():  # this prevents printing of info messages from tripy
        cov_compiler.add_measurement_space_within_group_correlation(
            group="space", correlation_func=correlation_func_space
        )
        cov_compiler.add_measurement_time_within_group_correlation(
            group="time", correlation_func=correlation_func_time
        )
    # note here that the rows/columns have the reference order:
    # y1(t1), y2(t1), y3(t1), ..., y1(t2), y2(t2), y3(t2), ....
    cov = cov_compiler.compile_covariance_matrix()

    # generate the experimental data and add it to the problem
    y_test = np.random.multivariate_normal(mean=mean, cov=cov)
    y1 = y_test[0::ns]
    y2 = y_test[1::ns]

    # save the data for later
    data_dict[exp_name] = {"t": t, "v": v, "F": F}
    for ii in range(ns):
        data_dict[exp_name][f"y{ii + 1}"] = y_test[ii::ns]

    # first sensor
    c = exp_dict["plot_color"]
    plt.plot(t, mean[0::ns], "-", label=f"y1 (true, {exp_name})", color=c)
    plt.scatter(t, y1, marker="o", label=f"y1 (sampled, {exp_name})", c=c)

    # second sensor
    plt.plot(t, mean[1::ns], "--", label=f"y2 (true, {exp_name})", color=c)
    plt.scatter(t, y2, marker="x", label=f"y2 (sampled, {exp_name})", c=c)

# finish and show the plot
plt.title("Data of first two sensors for all experiments")
plt.xlabel("t [s]")
plt.ylabel("deflection [m]")
plt.legend(fontsize=8)
plt.tight_layout()
plt.show()
Data of first two sensors for all experiments

At this point we have some data to calibrate our model against. Hence, we can set up the inverse problem itself. This always begins by initializing an object form the InverseProblem-class and adding all of the problem’s parameters with priors that reflect our current best guesses of what the parameter’s values might look like. Please check out the ‘Components’-part of this documentation to get more information on the arguments seen below. However, most of the code should be self-explanatory.

# initialize the inverse problem with a useful name
problem = InverseProblem(
    "Simple bridge model with time-space correlation", print_header=False
)

# add all parameters to the problem
problem.add_parameter(
    "EI",
    domain="(0, +oo)",
    tex="$EI$",
    info="Bending stiffness of the beam [Nm^2]",
    prior=Normal(mean=0.9 * EI_true, std=0.25 * EI_true),
)
problem.add_parameter(
    "L", "model", tex="$L$", info="Length of the beam [m]", value=L_bridge
)
problem.add_parameter(
    "sigma",
    domain="(0, +oo)",
    tex=r"$\sigma$",
    info="Std. dev, of 0-mean noise model",
    prior=Uniform(low=low_sigma, high=high_sigma),
)
problem.add_parameter(
    "l_corr_x",
    domain="(0, +oo)",
    tex=r"$l_\mathrm{corr,x}$",
    info="Spatial correlation length of correlation model",
    prior=Uniform(low=low_l_corr_x, high=high_l_corr_x),
)
problem.add_parameter(
    "l_corr_t",
    domain="(0, +oo)",
    tex=r"$l_\mathrm{corr,t}$",
    info="Temporal correlation length of correlation model",
    prior=Uniform(low=low_l_corr_t, high=high_l_corr_t),
)

As the next step, we need to add our experimental data the forward model and the likelihood model. Note that the order is important and should not be changed.

# experimental data
for exp_name, data in data_dict.items():

    sensor_values_vtF = {"v": data["v"], "t": data["t"], "F": data["F"]}
    sensor_values_x = {f"x{k + 1}": x_sensors[k] for k in range(ns)}
    sensor_values_y = {f"y{k + 1}": data[f"y{k + 1}"] for k in range(ns)}
    sensor_values = {**sensor_values_vtF, **sensor_values_x, **sensor_values_y}

    problem.add_experiment(
        name=exp_name,
        sensor_data=sensor_values,
    )

# add the forward model to the problem
problem.add_forward_model(bridge_model, experiments=[*data_dict.keys()])

# likelihood models
for exp_name in problem.experiments.keys():
    loglike = GaussianLikelihoodModel(
        experiment_name=exp_name,
        model_error="additive",
        correlation=ExpModel(x="l_corr_x", t="l_corr_t"),
    )
    problem.add_likelihood_model(loglike)

Now, our problem definition is complete, and we can take a look at its summary:

# print problem summary
problem.info(print_header=True)
2023-08-02 14:22:36.734 | INFO     | # ================================================================================================ # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.734 | INFO     | #                                                                                                  # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.734 | INFO     | #                                            dP                                                    # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.734 | INFO     | #                                            88                                                    # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.734 | INFO     | #                  88d888b. 88d888b..d8888b. 88d888b. .d8888b. dP    dP .d8888b.                   # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.734 | INFO     | #                  88'  `88 88'     88'  `88 88'  `88 88ooood8 88    88 88ooood8                   # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.734 | INFO     | #                  88.  .88 88      88.  .88 88.  .88 88.      88.  .88 88.                        # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.735 | INFO     | #                  88Y888P' dP      `88888P' 88Y8888' `88888P' `8888P88 `88888P'                   # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.735 | INFO     | #                  88                                                .88                           # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.735 | INFO     | #                  dP                                            d8888P                            # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.735 | INFO     | #                                                                                                  # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.735 | INFO     | # ================================================================================================ # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.735 | INFO     | #                                                                                                  # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.735 | INFO     | #        Version 3.0.4 - A general framework for setting up parameter estimation problems.         # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.735 | INFO     | #                                                                                                  # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.735 | INFO     | # ================================================================================================ # | probeye.subroutines:print_probeye_header:620
2023-08-02 14:22:36.738 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.739 | INFO     | Problem summary: Simple bridge model with time-space correlation                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.739 | INFO     | ================================================================                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.739 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.739 | INFO     | Forward models                                                                                       | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.739 | INFO     | ---------------------------------------------------------                                            | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.739 | INFO     |  Model name   | Global parameters   | Local parameters                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.739 | INFO     | --------------+---------------------+--------------------                                            | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.739 | INFO     |  BridgeModel  | L, EI               | L, EI                                                          | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.740 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.740 | INFO     | Priors                                                                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.740 | INFO     | --------------------------------------------------------------------------------------------------   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.740 | INFO     |  Prior name       | Global parameters                     | Local parameters                         | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.740 | INFO     | ------------------+---------------------------------------+---------------------------------------   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.740 | INFO     |  EI_normal        | EI, mean_EI, std_EI                   | EI, mean_EI, std_EI                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.740 | INFO     |  sigma_uniform    | sigma, low_sigma, high_sigma          | sigma, low_sigma, high_sigma             | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.740 | INFO     |  l_corr_x_uniform | l_corr_x, low_l_corr_x, high_l_corr_x | l_corr_x, low_l_corr_x, high_l_corr_x    | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.740 | INFO     |  l_corr_t_uniform | l_corr_t, low_l_corr_t, high_l_corr_t | l_corr_t, low_l_corr_t, high_l_corr_t    | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.740 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.741 | INFO     | Parameter overview                                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.741 | INFO     | --------------------------------------------------------------------------------------------------------------------------------------- | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.741 | INFO     |  Parameter type/role   | Parameter names                                                                                     |   Count | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.741 | INFO     | -----------------------+-----------------------------------------------------------------------------------------------------+--------- | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.741 | INFO     |  Model parameters      | EI, L                                                                                               |       2 | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.741 | INFO     |  Prior parameters      | mean_EI, std_EI, low_sigma, high_sigma, low_l_corr_x, high_l_corr_x, low_l_corr_t, high_l_corr_t    |       8 | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.741 | INFO     |  Likelihood parameters | sigma, l_corr_x, l_corr_t                                                                           |       3 | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.741 | INFO     |  Const parameters      | mean_EI, std_EI, L, low_sigma, high_sigma, low_l_corr_x, high_l_corr_x, low_l_corr_t, high_l_corr_t |       9 | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.741 | INFO     |  Latent parameters     | EI, sigma, l_corr_x, l_corr_t                                                                       |       4 | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.741 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.742 | INFO     | Parameter explanations                                                                               | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.742 | INFO     | ---------------------------------------------------------------------------                          | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.742 | INFO     |  Name          | Short explanation                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.742 | INFO     | ---------------+-----------------------------------------------------------                          | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.742 | INFO     |  mean_EI       | Normal prior's parameter for latent parameter 'EI'                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.742 | INFO     |  std_EI        | Normal prior's parameter for latent parameter 'EI'                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.742 | INFO     |  EI            | Bending stiffness of the beam [Nm^2]                                                | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.742 | INFO     |  L             | Length of the beam [m]                                                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.742 | INFO     |  low_sigma     | Uniform prior's parameter for latent parameter 'sigma'                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.743 | INFO     |  high_sigma    | Uniform prior's parameter for latent parameter 'sigma'                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.743 | INFO     |  sigma         | Std. dev, of 0-mean noise model                                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.743 | INFO     |  low_l_corr_x  | Uniform prior's parameter for latent parameter 'l_corr_x'                           | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.743 | INFO     |  high_l_corr_x | Uniform prior's parameter for latent parameter 'l_corr_x'                           | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.743 | INFO     |  l_corr_x      | Spatial correlation length of correlation model                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.743 | INFO     |  low_l_corr_t  | Uniform prior's parameter for latent parameter 'l_corr_t'                           | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.743 | INFO     |  high_l_corr_t | Uniform prior's parameter for latent parameter 'l_corr_t'                           | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.743 | INFO     |  l_corr_t      | Temporal correlation length of correlation model                                    | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.744 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.744 | INFO     | Constant parameters                                                                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.744 | INFO     | -------------------------                                                                            | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.744 | INFO     |  Name          |   Value                                                                             | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.744 | INFO     | ---------------+---------                                                                            | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.744 | INFO     |  mean_EI       |    0.9                                                                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.744 | INFO     |  std_EI        |    0.25                                                                             | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.744 | INFO     |  L             |  100                                                                                | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.744 | INFO     |  low_sigma     |    0                                                                                | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.744 | INFO     |  high_sigma    |    0.2                                                                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.745 | INFO     |  low_l_corr_x  |    1                                                                                | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.745 | INFO     |  high_l_corr_x |   25                                                                                | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.745 | INFO     |  low_l_corr_t  |    0                                                                                | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.745 | INFO     |  high_l_corr_t |    5                                                                                | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.745 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.745 | INFO     | Theta interpretation                                                                                 | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.745 | INFO     | +---------------------------+                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.745 | INFO     | |  Theta  |    Parameter    |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.745 | INFO     | |  index  |      name       |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.745 | INFO     | |---------------------------|                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.746 | INFO     | |      0 --> EI             |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.746 | INFO     | |      1 --> sigma          |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.746 | INFO     | |      2 --> l_corr_x       |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.746 | INFO     | |      3 --> l_corr_t       |                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.746 | INFO     | +---------------------------+                                                                        | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.746 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.746 | INFO     | Added experiments                                                                                    | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.746 | INFO     | ---------------------------------------------------                                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.746 | INFO     |  Name         | Sensor values    | Forward model                                                     | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.746 | INFO     | --------------+------------------+-----------------                                                  | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.747 | INFO     |  Experiment_1 | v  ( 1 element)  | BridgeModel                                                       | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.747 | INFO     |               | t  (81 elements) |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.747 | INFO     |               | F  ( 1 element)  |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.747 | INFO     |               | x1 ( 1 element)  |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.747 | INFO     |               | x2 ( 1 element)  |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.747 | INFO     |               | x3 ( 1 element)  |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.747 | INFO     |               | y1 (81 elements) |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.747 | INFO     |               | y2 (81 elements) |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.747 | INFO     |               | y3 (81 elements) |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.747 | INFO     |  Experiment_2 | v  ( 1 element)  | BridgeModel                                                       | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.748 | INFO     |               | t  (21 elements) |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.748 | INFO     |               | F  ( 1 element)  |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.748 | INFO     |               | x1 ( 1 element)  |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.748 | INFO     |               | x2 ( 1 element)  |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.748 | INFO     |               | x3 ( 1 element)  |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.748 | INFO     |               | y1 (21 elements) |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.748 | INFO     |               | y2 (21 elements) |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.748 | INFO     |               | y3 (21 elements) |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.748 | INFO     |  Experiment_3 | v  ( 1 element)  | BridgeModel                                                       | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.748 | INFO     |               | t  (41 elements) |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.749 | INFO     |               | F  ( 1 element)  |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.749 | INFO     |               | x1 ( 1 element)  |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.749 | INFO     |               | x2 ( 1 element)  |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.749 | INFO     |               | x3 ( 1 element)  |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.749 | INFO     |               | y1 (41 elements) |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.749 | INFO     |               | y2 (41 elements) |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.749 | INFO     |               | y3 (41 elements) |                                                                   | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.749 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.749 | INFO     | Added likelihood models                                                                              | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.749 | INFO     | ----------------------------------------------------------------------------                         | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.750 | INFO     |  Name         | Parameters                | Target sensors   | Experiment                            | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.750 | INFO     | --------------+---------------------------+------------------+--------------                         | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.750 | INFO     |  Experiment_1 | l_corr_x, l_corr_t, sigma | y1, y2, y3       | Experiment_1                          | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.750 | INFO     |  Experiment_2 | l_corr_x, l_corr_t, sigma | y1, y2, y3       | Experiment_2                          | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.750 | INFO     |  Experiment_3 | l_corr_x, l_corr_t, sigma | y1, y2, y3       | Experiment_3                          | probeye.definition.inverse_problem:info:978
2023-08-02 14:22:36.750 | INFO     |                                                                                                      | probeye.definition.inverse_problem:info:978

After the problem definition comes the problem solution. There are different solver one can use, but we will just demonstrate how to use two of them: the scipy-solver, which merely provides a point estimate based on a maximum likelihood optimization, and the emcee solver, which is a MCMC-sampling solver. Let’s begin with the scipy-solver:

# this is for using the scipy-solver (maximum likelihood estimation)
scipy_solver = MaxLikelihoodSolver(problem, show_progress=False)
max_like_data = scipy_solver.run()
2023-08-02 14:22:36.755 | INFO     | Solving problem via maximum likelihood estimation                                                    | probeye.inference.scipy.solver:_run_ml_or_map:452
2023-08-02 14:22:36.756 | INFO     | Using start values:                                                                                  | probeye.inference.scipy.solver:_run_ml_or_map:460
2023-08-02 14:22:36.756 | INFO     | EI        = 0.9                                                                                      | probeye.subroutines:print_dict_in_rows:728
2023-08-02 14:22:36.756 | INFO     | sigma     = 0.1                                                                                      | probeye.subroutines:print_dict_in_rows:728
2023-08-02 14:22:36.756 | INFO     | l_corr_x  = 13.0                                                                                     | probeye.subroutines:print_dict_in_rows:728
2023-08-02 14:22:36.756 | INFO     | l_corr_t  = 2.5                                                                                      | probeye.subroutines:print_dict_in_rows:728
2023-08-02 14:22:36.756 | INFO     | Starting optimizer (using Nelder-Mead)                                                               | probeye.inference.scipy.solver:_run_ml_or_map:464
2023-08-02 14:22:36.756 | INFO     | No solver options specified                                                                          | probeye.inference.scipy.solver:_run_ml_or_map:469
2023-08-02 14:22:38.956 | INFO     |                                                                                                      | probeye.inference.scipy.solver:summarize_point_estimate_results:356
2023-08-02 14:22:38.956 | INFO     | Results of maximum likelihood estimation                                                             | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:22:38.956 | INFO     | =====================================                                                                | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:22:38.957 | INFO     | Optimization terminated successfully.                                                                | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:22:38.957 | INFO     | -------------------------------------                                                                | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:22:38.957 | INFO     | Number of iterations:           227                                                                  | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:22:38.957 | INFO     | Number of function evaluations: 384                                                                  | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:22:38.957 | INFO     | -------------------------------------                                                                | probeye.inference.scipy.solver:summarize_point_estimate_results:359
2023-08-02 14:22:38.957 | INFO     | EI_opt       = [0.99503857] (start = 0.9)                                                            | probeye.inference.scipy.solver:summarize_point_estimate_results:382
2023-08-02 14:22:38.958 | INFO     | sigma_opt    = [0.00956326] (start = 0.1)                                                            | probeye.inference.scipy.solver:summarize_point_estimate_results:382
2023-08-02 14:22:38.958 | INFO     | l_corr_x_opt = [8.69766034] (start = 13.0)                                                           | probeye.inference.scipy.solver:summarize_point_estimate_results:382
2023-08-02 14:22:38.958 | INFO     | l_corr_t_opt = [0.98924203] (start = 2.5)                                                            | probeye.inference.scipy.solver:summarize_point_estimate_results:382
2023-08-02 14:22:38.958 | INFO     |                                                                                                      | probeye.inference.scipy.solver:summarize_point_estimate_results:383

All solver have in common that they are first initialized, and then execute a run-method, which returns its result data in the format of an arviz inference-data object (except for the scipy-solver). Let’s now take a look at the emcee-solver.

emcee_solver = EmceeSolver(problem, show_progress=False)
inference_data = emcee_solver.run(n_steps=2000, n_initial_steps=200)
2023-08-02 14:22:38.963 | INFO     | Solving problem using emcee sampler with 200 + 2000 samples and 20 walkers                           | probeye.inference.emcee.solver:run:178
2023-08-02 14:22:38.963 | INFO     | No additional options specified                                                                      | probeye.inference.emcee.solver:run:186
2023-08-02 14:27:20.652 | INFO     | Sampling of the posterior distribution completed: 2000 steps and 20 walkers.                         | probeye.inference.emcee.solver:run:255
2023-08-02 14:27:20.652 | INFO     | Total run-time (including initial sampling): 4m41s.                                                  | probeye.inference.emcee.solver:run:259
2023-08-02 14:27:20.652 | INFO     |                                                                                                      | probeye.inference.emcee.solver:run:260
2023-08-02 14:27:20.653 | INFO     | Summary of sampling results (emcee)                                                                  | probeye.inference.emcee.solver:run:261
2023-08-02 14:27:20.660 | INFO     |             mean    median    sd    5%    95%                                                        | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:27:20.660 | INFO     | --------  ------  --------  ----  ----  -----                                                        | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:27:20.660 | INFO     | EI          1.00      1.00  0.02  0.97   1.02                                                        | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:27:20.661 | INFO     | sigma       0.01      0.01  0.00  0.01   0.01                                                        | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:27:20.661 | INFO     | l_corr_x    8.97      8.88  1.19  7.18  11.06                                                        | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:27:20.661 | INFO     | l_corr_t    1.04      1.03  0.14  0.84   1.30                                                        | probeye.inference.emcee.solver:emcee_summary:137
2023-08-02 14:27:20.661 | INFO     |                                                                                                      | probeye.inference.emcee.solver:run:267

Finally, we want to plot the results we obtained. To that end, probeye provides some post-processing routines, which are mostly based on the arviz-plotting routines.

# this is optional, since in most cases we don't know the ground truth
true_values = {
    "EI": EI_true,
    "sigma": sigma,
    "l_corr_x": l_corr_x,
    "l_corr_t": l_corr_t,
}

# this is an overview plot that allows to visualize correlations
pair_plot_array = create_pair_plot(
    inference_data,
    emcee_solver.problem,
    true_values=true_values,
    focus_on_posterior=True,
    show_legends=False,
    title="Sampling results from emcee-Solver (pair plot)",
)
Sampling results from emcee-Solver (pair plot)
# this is a posterior-focused plot, without including priors
post_plot_array = create_posterior_plot(
    inference_data,
    emcee_solver.problem,
    true_values=true_values,
    title="Sampling results from emcee-Solver (posterior plot)",
)
Sampling results from emcee-Solver (posterior plot), $EI$, $\sigma$, $l_\mathrm{corr,x}$, $l_\mathrm{corr,t}$
# trace plots are used to check for "healthy" sampling
trace_plot_array = create_trace_plot(
    inference_data,
    emcee_solver.problem,
    title="Sampling results from emcee-Solver (trace plot)",
)
Sampling results from emcee-Solver (trace plot), $EI$, $EI$, $\sigma$, $\sigma$, $l_\mathrm{corr,x}$, $l_\mathrm{corr,x}$, $l_\mathrm{corr,t}$, $l_\mathrm{corr,t}$

Total running time of the script: ( 4 minutes 48.620 seconds)

Motivation#

Parameter estimation is a common task in many engineering and physics-focused disciplines. Typically, a parameterized model is proposed to explain a certain phenomenon, and its parameters are calibrated using some experimental data sets. Oftentimes, these models are then used in a design-context, where they predict the phenomenon’s behavior for a situation that is not covered by experiments. However, the designer might not have been involved in obtaining the parameter estimates she is using to evaluate her design case. Maybe these estimates lead to peculiar results in the considered design case, and should therefore be checked on their validity. Maybe there has been made a mistake. This is typically the point where it gets tricky. How can the parameter estimates in question be reproduced? What was the underlying model, which assumptions have been made, was there data that was excluded from the fit? What about the uncertainty of the estimates? Questions like these are often difficult to answer, even if the person or the people who conducted the calibration is/are still available.

Probeye was intended to address these reproducibility problems by providing an interface for defining parameter estimation problems in a way that can be translated into an ontology-based problem definition. This kind of tool-independent definition, that includes the model structure (forward models, parameters, priors, etc.) as well as the used experimental data is based on rdf-triples and can be saved as the fingerprint of the problem.

Next to the problem definition, also the inference data, i.e., the data generated by a solver while solving the problem should be saved in an ontology-based manner. Think for example about the parameter samples generate during an MCMC run. In order to do so, probeye comes with an integrated interface to different inference engines such as emcee or dynesty.

Mathematics#

In the following, a brief overview with respect to the inverse problems that can be described when using probeye should be outlined. This requires to define the statistical models (the data generating processes) that can be build from a given deterministic simulation model, as it is assumed to be the case when using this package.

Let us assume that \mathbf{y}_e\in\mathbb{R}^d, d\in\mathbb{N} denotes the observed data (observations) of some process that depends on a set of controllable input data which we summarize as \mathbf{x}_e\in\mathbb{R}^{d_x}, d_x\in\mathbb{N}. In probeye it is assumed, that the observations \mathbf{y}_e are realizations of a data generating process that can be described by a multivariate normal distribution that depends on the input data \mathbf{x}_e. If \mathbf{Y}_e denotes the corresponding random variable, this can be expressed as

(1)#\mathbf{Y}_e \sim \mathcal{N}(\bm{\mu}(\mathbf{x}_e), \bm{\Sigma}(\mathbf{x}_e)).

Here, \bm\mu(\mathbf{x}_e)\in\mathbb{R}^d and \bm\Sigma(\mathbf{x}_e)\in\mathbb{R}^{d \times d} refer to the mean vector and the covariance matrix respectively. Next to the previously described observations, we assume that a deterministic forward model \mathbf{y} is available that describes the mean vector \bm{\mu} of the data generating process, that is

\bm{\mu}(\mathbf{x}_e) = \mathbf{y}(\mathbf{x}_e,\bm\theta_y)

where \bm\theta_y\in\mathbb{R}^n, n\in\mathbb{N} is the model parameter vector. Since \bm\mu is given by the forward model response, the different data generating processes that can be described in probeye only differ in the definition of the covariance matrix \bm{\Sigma}. The latter depends on the assumed sub-structure of the data generation process. In this context, two general options are supported. A data generation process with an additive or a multiplicative model prediction error. In the first case, Expression (1) is specified to

(2)#\mathbf{Y}_e = \mathbf{y}(\mathbf{x}_e,\bm\theta_y) + \mathbf{E}_\mathrm{model}(\mathbf{x}_e,\bm{\theta}_\ell) + \mathbf{E}_\mathrm{meas}

Here, \mathbf{E}_\mathrm{model}(\mathbf{x}_e,\bm{\theta}_\ell) \sim \mathcal{N}(\mathbf{0}, \bm{\Sigma}_\mathrm{model}(\mathbf{x}_e,\bm{\theta}_\ell)) denotes a Gaussian model for the prediction error, introducing additional latent parameters \bm\theta_\ell\in\mathbb{R}^m, m\in\mathbb{N}, while \mathbf{E}_\mathrm{meas} \sim \mathcal{N}(\mathbf{0}, \mathrm{diag}(\sigma_m^2)), \sigma_m\in\mathbb{R}_+ describes an independent and identical distributed (i.i.d.) measurement error. The alternative to the additive model prediction error is a multiplicative one. In this case, Expression (1) is specified to

(3)#\mathbf{Y}_e =  \mathbf{K}_\mathrm{model}(\mathbf{x}_e,\bm{\theta}_\ell)\mathbf{y}(\mathbf{x}_e,\bm\theta_y) + \mathbf{E}_\mathrm{meas}

where \mathbf{K}_\mathrm{model}(\mathbf{x}_e,\bm{\theta}_\ell) \sim \mathcal{N}(\mathbf{1}, \bm{\Sigma}_\mathrm{model}(\mathbf{x}_e,\bm{\theta}_\ell)) denotes the unit-mean prediction error while \mathbf{E}_\mathrm{meas} \sim \mathcal{N}(\mathbf{0}, \mathrm{diag}(\sigma_m^2)) describes the measurement error as defined before. Both data generation processes, Equations (2) and (3), describe a random variable following a multivariate normal distribution as given by (1) where the covariance matrix is given by

\bm{\Sigma}(\mathbf{x}_e) = \begin{cases}
    \bm{\Sigma}_\mathrm{model}(\mathbf{x}_e,\bm{\theta}_\ell) + \mathrm{diag}(\sigma_m^2)  & \text{(additive)} \\
    \mathrm{diag}(\mathbf{y}(\mathbf{x}_e,\bm\theta_y))\bm{\Sigma}_\mathrm{model}(\mathbf{x}_e,\bm{\theta}_\ell)\mathrm{diag}(\mathbf{y}(\mathbf{x}_e,\bm\theta_y)) + \mathrm{diag}(\sigma_m^2) & \text{(multiplicative).}
    \end{cases}

Once the mean vector \bm\mu(\mathbf{x}_e) and the covariance matrix \bm{\Sigma}(\mathbf{x}_e) are determined, the likelihood of the statistical model can be evaluated via

\ell(\bm{x}_e,\bm\theta_y,\bm\theta_\ell) = \frac{\exp\left(-\frac{1}{2}(\bm{y}_e - \bm\mu(\bm{x}_e))^T\bm\Sigma(\bm{x}_e)^{-1}(\bm{y}_e - \bm\mu(\bm{x}_e))\right)}{\sqrt{(2\pi)^d\det\bm\Sigma(\bm{x}_e)}}.

Components#

In order to provide a valid definition of an inverse problem (i.e., a parameter estimation problem) using probeye, four main ingredients (or components, as they are called here) are required.

  1. Parameters

  2. Experiments

  3. Forward models

  4. Likelihood models

These four components have to be defined by the user in a way of adding them to a problem instance. Consequently, the base structure of the corresponding probeye-code looks like this:

from probeye.definition import InverseProblem

# initialize a problem instance
problem = InverseProblem("MyProblem")

# add the four components
problem.add_parameter(...)
problem.add_experiment(...)
problem.add_forward_model(...)
problem.add_likelihood_model(...)

Of course the dots in parenthesis (...) still need to be further specified (according to the problem at hand), but the fundamental structure of how to define an inverse problem is given by the code block above. It should be pointed out that of each component multiple instances can be added - for example five parameters, two forward models, one hundred experiments and ten likelihood models - but at least one instance of each is required to obtain a valid problem definition. Also the order of adding those components should look like above. So, at first the parameters are added, then the experiments followed by the forward models and the likelihood models are added at last. Each of these components is explained in more detail in the following sections.

Parameters#

In probeye, an inverse problem is understood as a parameter estimation problem. Hence, it comes at no surprise that one needs to define at least one parameter that should be inferred. After initializing an inverse problem, adding the parameters to the problem is the natural next step. In principle, you could also add the experiments first, but it is recommended to begin with the parameters, because problem definitions are more readable like that.

Latent and constant parameters#

Generally, two kinds of parameters are distinguished in probeye: latent and constant parameters. This property is also referred to as the parameter’s role. Latent parameters are parameters that should be inferred, while constant parameters have a pre-defined value, and will hence not be inferred in the inference step. Earlier, it was pointed out that the definition of at least one parameter is required for a valid problem definition. Now we should state more precisely: the definition of at least one latent parameter is required for a valid problem definition.

A typical definition of a latent parameter looks like this:

problem.add_parameter(
    "a",
    prior=Normal("mean"=1.0, "std"=2.0),
    tex="$a$",
    info="Slope of the fitted function",
)

And a typical definition of a constant parameter looks like this:

problem.add_parameter(
    "sigma_meas",
    value=0.1,
    info="Standard deviation of measurement error",
)

As one can see, the definition of either a latent or a constant parameter is triggered by using the prior or the value keyword argument in the add_parameter-method. The value keyword argument can be a scalar like in the example above or a vector, for example value=np.array([0.9, -0.3]). The prior keyword argument on the other hand has to be given as a prior object that is initialized with its specific parameter values. Possible prior options are currently Normal, MultivariateNormal, LogNormal, TruncNormal, Uniform, Weibull and SampleBased. More information on the priors and their parameters is given in this section below.

Finally, it should be pointed out that it is possible to give a very short definition of a latent parameter by neither specifying the prior nor the const keyword argument. Examples could look like this:

problem.add_parameter("a")
problem.add_parameter("b", domain="(0, 1]")

In both of these cases an uninformative prior is assumed, meaning a prior that is constant over its domain. Note however, that internally, the uninformative prior is not a proper prior like the conventional prior classes, but just a flag stating that the corresponding parameter is a latent parameter without a prior. These types of latent parameters can only be used for maximum likelihood estimations. When using a sampling-based solver, it is required to specify a proper prior.

A parameter’s name and type#

Each parameter (latent and constant) must have a name and a type. The parameter’s name, which is given by the first argument in the add_parameter-method, must be unique in the scope of the problem, i.e., no other parameter can have the same name. This name is also referred to as the parameter’s global name.

The parameter’s type on the other hand, states where the parameter appears in the problem definition. There are three possible types model, prior and likelihood. A parameter of type model appears in one the problem’s forward models, while a parameter of type prior will be used in the definition of some latent parameter’s prior. Finally, a parameter of type likelihood will appear in one of the problem’s likelihood models. The specification of the prior type is optional. At the beginning of probeye’s development they used to be stated explicitly in the add_parameter-method as the second positional argument. But today this is not necessary anymore because: if the type is not given, it will be determined automatically.

Prior definition of latent parameters#

As described above, when defining a latent parameter, one has to provide a prior object that is initialized with the prior’s parameters and their values. The following table provides the currently implemented options.

Prior type

Prior parameters

Comments

Normal

mean, std

Gaussian or normal distribution where mean refers to the mean and std to the standard deviation.

MultivariateNormal

mean, cov

Multivariate normal distribution where mean refers to the mean and cov to the covariance matrix.

LogNormal

mean, std

Log-normal distribution where mean refers to the mean and std is the standard deviation on the log-scale.

TruncNormal

mean, std, low, high

Truncated normal distribution. Same as for “normal”, while low and high refer to the lower and upper bound respectively.

Uniform

low, high

Uniform distribution where low is the lower and high is the upper bound. Note that these bounds are inclusive.

Weibull

scale, shape

Weibull distribution. Check out the scipy-documentation for more information on the parameters.

SampleBased

samples

Gaussian kernel density estimate based on a given vector of samples.

It should be pointed out that it is also possible to use a latent parameter as a prior parameter. The following example may illustrate that.

problem.add_parameter(
    "mean_a",
    prior=Uniform("low"=-1.0, "high"=1.0),
    tex="r$\mu_a$",
    info="Mean parameter of a's prior",
)
problem.add_parameter(
    "a",
    prior=Normal("mean"="mean_a", "std"=2.0),
    tex="$a$",
    info="Slope of the fitted function",
)

Note that instead of providing a numeric value for a’s mean parameter, the name (hence a string) of the previously defined latent parameter mean_a is provided. It is important in this example that mean_a is defined before a.

A latent parameter’s domain#

Sometimes, the value of a latent parameter should stay in certain bounds. For example, if a parameter appears in the denominator of a fraction, it cannot assume the value zero. One measure to address such situations is to define the parameter’s prior in a way that its domain does not contain problematic values. However, during sampling-procedures it is still possible that values outside of a prior’s domain are proposed, and hence evaluated. To prevent that, one can define a latent parameter’s domain via the domain argument when adding it to the problem. This would look like this:

problem.add_parameter(
    "gamma",
    domain="(0, 1)",
    prior=("uniform", {"low": 0.0, "high": 1.0}),
)

Here, the domain of gamma is specified to an open interval from zero to one. Other valid strings for the domain argument are for example "[0, 1]" for a closed interval, "(0, 1]" or "[0, 1)" for half-closed intervals, or "(-oo, oo)" for a domain from minus to plus infinity. Other variations are of course possible. For a multivariate parameter, the definition looks very similar as shown by the following example.

problem.add_parameter(
    "mb",
    dim=2,
    domain="(-oo, +oo) (-oo, +oo)",
    prior=MultivariateNormal(
        "mean"=np.array([0.0, 0.0]),
        "cov"=np.array([[1.0, 0.0], [0.0, 1.0]]),
    ),
)

So in this case, the domain-string is simply a concatenation of domain-strings for a 1D-interval. Note that for multidimensional parameter, also a dim-argument is required, that specifies the parameter’s dimensionality. If a latent parameter is added to a problem without specifying its domain, it is assumed that there are no restrictions. So, in the code block above, the domain-specification would actually be unnecessary since this domain would also have been assumed if no domain was specified.

The tex and info arguments#

Each parameter can (but does not have to) have a tex and an info attribute. While the tex attribute is used for plotting, the info string is used when calling a problems info-method problem.info() printing some information on the defined problem. Even if not required, it is recommended to define both of these attributes for each parameter added to the problem.

Experiments#

The experiments that are added to an inverse problem are the containers of the experimentally recorded data that is used to calibrate the problem’s latent parameters with. But they also will contain input data for the forward model like initial or boundary conditions that are needed to simulate a specific experiment. To add an experiment in probeye, the code looks like this:

problem.add_experiment(
    name="TestSeries_Aug12_2018",
    sensor_data={
        'y1': np.array([1.12321, 0.37320, 0.14189, -0.22992, -0.04648]),
        'y2': np.array([0.20105, 1.61940, 0.33614,  0.53154,  0.04718]),
        'y3': np.array([2.68936, 0.29683, 1.10388,  0.81638,  1.48964]),
        'time': np.array([0.0, 0.25, 0.5 , 0.75, 1.0])
        'offset': 1.29,
    },
)

The first argument (here: "TestSeries_Aug12_2018") is a unique name of the experiment. The second argument states the actual measurement data, i.e., the values that have been recorded by the experiment’s sensors. Those values can be given as scalars (float, int) or as vectors in form of tuples or numpy arrays. It is not required that all entries have the same length. Note however, that these arrays have to be one-dimensional and cannot be of higher dimension. The keys of the sensor_data-dictionary will be referenced later in the forward model’s definition, as it will be discussed in the next section.

Forward models#

The forward model is a parameterized simulation model (for example a finite element model) the predictions of which should be compared against some experimental data. The forward model’s parameters are typically the parameters which are of primary interest within the stated problem. It should be pointed out that many inverse problems might contain only one forward model, but it is also possible to set up a problem that contains multiple forward models.

In probeye, a forward model is a function that has two kinds of arguments: input sensors and parameters, see also the figure below. While input sensors refer to specific experimental data that is required to simulate it (for example certain initial or boundary conditions, load settings, etc.), parameters refer to the forward model’s parameters. Once all input sensors and parameters are provided, the forward model computes a result that it returns via its output sensors.

_images/forward_model.png

In order to add a forward model to an inverse problem, two steps are required. At first, the forward model has to be defined as a Python class. This definition is done by setting up a new model class (that can have an arbitrary name) which is based on the probeye-class ForwardModelBase. This class must have both an interface-method, which defines the forward model’s parameters, input sensors and output sensors, and it must have a response-method, which describes a forward model evaluation. The response-method has only one input, which is a dictionary that contains both the input sensors and the parameters. The method will then perform some computations and returns its results in form of a dictionary of the forward model’s output sensors. For a simple linear model, such a definition could look like this:

from probeye.definition.forward_model import ForwardModelBase
from probeye.definition.sensor import Sensor

class LinearModel(ForwardModelBase):
    def interface(self):
        self.parameters = ["a", "b", "offset"]
        self.input_sensors = Sensor("time")
        self.output_sensors = [
            Sensor("y1", x=0.0, std_model="sigma_1"),
            Sensor("y2", x=0.5, std_model="sigma_2"),
            Sensor("y3", x=1.0, std_model="sigma_3"),
        ]

    def response(self, inp: dict) -> dict:
        t = inp["time"]
        a = inp["a"]
        b = inp["b"]
        offset = inp["offset"]
        response = dict()
        for os in self.output_sensors:
            response[os.name] = a * os.x + b * t + offset
        return response

After the forward model has been defined, it must be added to the problem. For the example shown above, this would look like this:

# add the forward model to the problem
problem.add_forward_model(
    LinearModel("LinearModel"), experiments=["TestSeries_Aug12_2018"]
)

Next to the forward model instance (initialized with an arbitrary name), which is provided as the first argument of the add_forward_model-method, it is necessary to provide a list of experiments (which have been added previously) that are described by the forward model. Note that the stated experiments must have all of the forward model’s input and output sensor names as keys in the sensor_data dictionary. This means, the experiments need to contain all the data required or simulated by the forward model.

The interface method#

The interface-method defines three attributes of the user defined forward model. The forward model’s parameters, its input sensors and output sensors. The parameters (self.parameters) define which parameters defined within the problem scope are used by the forward model. These parameters can be latent or constant ones. Parameters are given as a list of strings or as a single string, if the model only uses a single parameter. It is also possible to use another name for a globally defined parameter within the forward model, a local parameter name. This can be achieved by providing a one-element dictionary containing the global and local name instead of a single string of the global name. The following code lines show examples of how the (self.parameters) attribute can be set.

self.parameters = "m"
self.parameters = ["m"]
self.parameters = ["m", "b"]
self.parameters = [{"m": "a"}, "b"]

In the last example, the globally defined parameter m will be known as a within the scope of the forward model. Even though this option exists, it is recommended to not use local names if not necessary, since it might be more confusing than helpful.

The definition of the input and output sensors (self.input_sensors, self.output_sensors) is done by providing a list of Sensor-objects (or a single Sensor-object if only one sensor is assigned). In their most basic definition, sensors are just objects with a name-attribute. For example Sensor("x") creates a Sensor-object with the name attribute "x". These sensor-names will refer to specific data stored in the experiments added to the problem in the next step. For the forward model’s input sensors, this very basic sensor type is already sufficient. By providing the input sensors (with their name-attribute) one is essentially just naming all of the forward model’s input channels that are not parameters.

When defining the forward model’s output sensors, more information must be provided. Each output sensor still requires a name attribute, which refers to specific experimental data the forward model’s output will be compared against. But it must also contain the definition of the global parameter, that describes the model error scatter in the considered output sensor. In the definition of the forward model above, the first output sensor of the forward model is defined as

Sensor("y1", x=0.0, std_model="sigma_1")

Here, the standard deviation of the model prediction error (std_model) is described by the global parameter sigma_1. Additionally, the output sensors are assigned a positional-attribute, here x, which is referred to in the response-method. Note that the only required argument for an output sensor is the name (here, y1) and the model error standard deviation parameter std_model.

The response method#

The forward model’s response method is its computing method. It describes how given parameters and inputs are processed to provide the forward model’s prediction in terms of the output sensors. The only input of this method is a dictionary inp which contains as keys all (local) parameter names, as well as all input sensor names as defined in the self.parameters- self.input_sensors-attribute in the interface-method. In the example given above, those keys are be "time", "a", "b", "offset". The values to those keys are either parameter constants, experimental data or - in the case of latent parameters - values chosen by the used solver during the inference step.

The computed result of the forward model must be put into a dictionary when returned. The keys of this dictionary must be the names of the forward model’s output sensors as defined in the interface-method.

Likelihood models#

The last component to be added to an inverse problem is the likelihood model (or several likelihood models). The likelihood model’s purpose is to compute the likelihood (more precisely the log-likelihood) of a given choice of parameter values by comparing the forward model’s predictions (using the given parameter values) with the experimental data. In this section, only likelihood models are considered that do not account for correlations. In such a framework, the addition of a likelihood model to the inverse problem (referring to the examples shown above) could look like this:

problem.add_likelihood_model(
    GaussianLikelihoodModel(
        experiment_name="TestSeries_Aug12_2018",
        model_error="additive",
    )
)

The interface is fairly simple. The experiment_name argument states the name of the experiment this likelihood model refers to. Note that each likelihood model refers to exactly one experiment. Hence, when several experiments are defined, the same number of likelihood models is required. The model_error can be defined as "additive" or "multiplicative", depending on the requested error model.

Correlation#

_images/correlation_definition.png

Ontology#

_images/ontology_visual.png

API reference#

This page documents the modules, classes, and functions provided by the probeye package.

probeye.definition#

inverse_problem#

class InverseProblem(name, use_default_logger=True, log_level='INFO', log_file=None, print_header=True)[source]#

Bases: object

This class provides a general framework for defining an inverse problem (more specifically, a parameter estimation problem) without specifying or providing any computational means for solving the problem.

Parameters
  • name (str) – This is the name of the problem and has only descriptive value, for example when working with several inverse problems.

  • use_default_logger (bool) – When True, the logger will be set up with some useful default values. Otherwise, no logger configurations are applied and a logger can be defined outside of the problem definition.

  • log_level (str) – The log-level used by the default logger for printing to std out. This argument is intended for quickly controlling the amount of logging output the user sees on the screen when running probeye.

  • log_file (Optional[str]) – Path to the log-file, if the logging-stream should be printed to file. If None is given, no logging-file will be created.

  • print_header (bool) – If True, a probeye header is logged when an instance of this class is created. Otherwise, the header will not be logged.

add_experiment(name, sensor_data)[source]#

Adds a single experiment to the inverse problem. An experiment is simply a collection of measured data which was produced by one event. The measured data is given in form of a dictionary (sensor_data).

Parameters
  • name (str) – The name of the experiment, e.g. “Exp_20May.12”. If an experiment with a similar name has already been added, it will be overwritten and a warning will be thrown.

  • sensor_data (dict) – The keys are the sensor’s names (for example ‘strain_gauge_1’) and the values are the measured values.

add_forward_model(forward_model, experiments)[source]#

Adds a forward model to the inverse problem. Note that multiple forward models can be added to one problem.

Parameters
  • forward_model (probeye.definition.forward_model.ForwardModelBase) – Defines the forward model. Check out forward_model.py to see a template for the forward model definition. The user will then have to derive his own forward model from that base class. Examples can be found in the package directory tests/integration_tests.

  • experiments (Union[str, list]) – A single string or a list of strings that represents the names of the problem’s experiments that are described by forward_model.

add_likelihood_model(likelihood_model)[source]#

Adds a likelihood model to the inverse problem. Note that a single problem can have multiple likelihood models. It is assumed that all likelihood models of a problem are mutually independent. Note that before adding a likelihood model, all the experiments the likelihood model should refer to must have already been added to the InverseProblem.

Parameters

likelihood_model (probeye.definition.likelihood_model.GaussianLikelihoodModel) – The general likelihood model object, see likelihood.py. This likelihood model is general in that sense, that it is merely descriptive without providing any computational means. The general likelihood model will be translated into a likelihood model with computational means in the solver that will later be used to ‘solve’ the inverse problem.

add_parameter(name, prm_type='not defined', dim=1, domain='(-oo, +oo)', value=None, prior=None, info='No explanation provided', tex=None)[source]#

Adds a parameter (‘const’ or ‘latent’) to the inverse problem. For more information, check out the Parameters.add_parameter method.

Parameters
change_constant(prm_name, new_value)[source]#

Changes the value of a ‘const’-parameter, i.e. a constant parameter of the inverse problem. Note that constants cannot be modified in any other way.

Parameters
  • prm_name (str) – The name of the ‘const’-parameter whose value should be changed.

  • new_value (Union[int, float]) – The new value that prm_name should assume.

change_parameter_info(prm_name, new_info=None, new_tex=None)[source]#

Changes the info-string and/or the tex-string of a given parameter. This task can also be done directly since the ‘info’ and ‘tex’ attributes of the parameters are not protected. However, to do so, one needs to know some of the internal structure. With this function, one does not need to know anything about the internals. So, it is more a convenience function (which however might not be needed very often).

Parameters
  • prm_name (str) – The name of the parameter whose info/tex-string should be changed.

  • new_info (Optional[str]) – The new string for the explanation of parameter prm_name.

  • new_tex (Optional[str]) – The new string for the parameter’s tex-representation.

change_parameter_role(prm_name, value=None, prior=None, domain='(-oo, +oo)')[source]#

Performs the necessary tasks to change a parameter’s role in the problem definition. A parameter’s role can either be changed from ‘const’ to ‘latent’ or from ‘latent’ to ‘const’. Note that parameter roles cannot be modified in any other way.

Parameters
  • prm_name (str) – The name of the parameter whose role should be changed.

  • value (Optional[Union[int, float]]) – If the new role is ‘const’, the corresponding value has to be specified by this argument.

  • prior (Optional[probeye.definition.distribution.ProbabilityDistribution]) – If the new role is ‘latent’, this argument has to be given as a 2-tuple. Check out the explanations in self.add_parameter for more information on this argument.

  • domain (str) – The parameter’s domain (i.e., values it may assume). Note that this argument is only considered for latent parameter, but not for a constant.

change_parameter_type(prm_name, new_type)[source]#

Changes the type of a parameter, i.e., ‘model’, ‘prior’ or ‘likelihood’. Note that parameter types cannot be modified in any other way.

Parameters
  • prm_name (str) – The name of the parameter whose type should be changed.

  • new_type (str) – The new type, either ‘model’, ‘prior’ or ‘likelihood’.

check_parameter_domains(theta)[source]#

Checks whether the given values of the latent parameters are within their specified domains.

Parameters

theta (numpy.ndarray) – A numeric vector or tensor, which contains the current values of all latent parameters.

Returns

  • True if all values given by theta are within their specified domains.

  • Otherwise, False is returned.

Return type

bool

check_problem_consistency()[source]#

Conducts various checks to make sure the problem definition does not contain any inconsistencies.

property constant_prms: List[str]#

Provides constant_prms attribute.

property constant_prms_dict: dict#

Provides constant_prms_dict attribute.

get_experiment_names(forward_model_names=None, sensor_names=None, experiment_names=None)[source]#

Extracts the names of all experiments which refer to a forward model from a given list and/or which contain a set of given sensors. The experiments are extracted from a given set of experiments.

Parameters
  • forward_model_names (Optional[Union[str, List[str]]]) – The names of the forward model the experiments should refer to. This means, to be selected, an experiment must refer to one of those fwd model names.

  • sensor_names (Optional[Union[str, List[str]]]) – The names of the sensors the experiments should should contain. To be selected, an experiment must contain all of the sensors stated in this list.

  • experiment_names (Optional[Union[str, List[str]]]) – The names of the experiments to sub-select from. If None is given, then all experiments of the problem will be used.

Returns

The names of the sub-selected experiments.

Return type

relevant_experiment_names

get_parameters(theta, prm_def)[source]#

Extracts the numeric values for given parameters that have been defined within the inverse problem. The numeric values are extracted either from the latent parameter vector theta or from the constant parameters of the problem.

Parameters
  • theta (numpy.ndarray) – A numeric parameter vector passed to the loglike and logprior method. Which parameters these numbers refer to can be checked by calling the method self.theta_explanation() once the problem is set up.

  • prm_def (dict) – Defines which parameters to extract. The keys of this dictionary are the global parameter names, while the values are the local parameter names. In most cases global and local names will be identical, but sometimes it is convenient to define a local parameter name, e.g. in the forward model.

Returns

Contains <local parameter name> : <(global) parameter value> pairs. If a parameter is scalar, its value will be returned as a float. In case of a vector-valued parameter, its value will be returned as a np.ndarray.

Return type

prms

get_theta_names(tex=False, components=False)[source]#

Returns the parameter names of the latent parameter vector theta in the corresponding order (either in tex- or conventional format).

Parameters
  • tex (bool) – If True, the TeX-names of the parameters will be returned, otherwise the global names as they are used in the code will be returned.

  • components (bool) – If True, parameters with dimension > 1 are returned component-wise; e.g., if ‘alpha’ is a 2D parameter, it will be returned as ‘alpha_1’ and ‘alpha_2’. If False, only ‘alpha’ would be returned.

Returns

List of strings with the parameter names appearing in theta.

Return type

theta_names

info(tablefmt='presto', check_consistency=True, print_header=False, return_string=False)[source]#

Logs an overview of the problem definition and returns the generated string.

Parameters
  • tablefmt (str) – An argument for the tabulate function defining the style of the generated table. Check out tabulate’s documentation for more info.

  • check_consistency (bool) – When True, a consistency check is performed before printing the explanations on theta. When False, this check is skipped.

  • print_header (bool) – When True, the probeye header is printed before printing the problem information. Otherwise, the header is not printed.

  • return_string (bool) – When True, the constructed string is returned. Otherwise it is just logged without it being returned.

Returns

  • The constructed string providing the problem’s definition if ‘return_string’

  • was set to True. Otherwise, None is returned.

Return type

Optional[str]

property latent_prms: List[str]#

Provides latent_prms attribute.

property latent_prms_dims: List[int]#

Provides latent_prms_dims attribute.

property likelihood_prms: List[str]#

Provides likelihood_prms attribute.

property model_prms: List[str]#

Provides model_prms attribute.

property n_constant_prms: int#

Provides n_constant_prms attribute.

property n_latent_prms: int#

Provides n_latent_prms attribute.

property n_latent_prms_dim: int#

Provides n_latent_prms_dim attribute.

property n_likelihood_prms: int#

Provides n_likelihood_prms attribute.

property n_model_prms: int#

Provides n_model_prms attribute.

property n_prior_prms: int#

Provides n_prior_prms attribute.

property n_prms: int#

Provides n_prms attribute.

property prior_prms: List[str]#

Provides prior_prms attribute.

property priors: dict#

Provides the problem’s prior-dictionary which is derived dynamically from the latent parameters in the self.parameters dictionary. The keys are the priors names, while the values are the prior-objects.

property prms: List[str]#

Provides prms attribute.

remove_parameter(prm_name)[source]#

Removes a parameter (‘const’ or ‘latent’) from inverse problem.

Parameters

prm_name (str) – The name of the parameter to be removed.

theta_explanation(check_consistency=True)[source]#

Returns a string describing how the theta-vector, which is the numeric latent parameter vector that is given to the likelihood and prior methods, is interpreted with respect to the problem’s parameters. The printout will tell you which parameter is connected to which index of theta.

Parameters

check_consistency (bool) – When True, a consistency check is performed before printing the explanations on theta. When False, this check is skipped.

Returns

The constructed string.

Return type

s

forward_model#

class ForwardModelBase(name, *args, **kwargs)[source]#

Bases: object

This class serves as a base class for any forward model. When you want to define a specific forward model, you need to derive your own class from this one, and then define the ‘__call__’ method. The latter essentially describes the model function mapping the model input to the output.

Parameters
  • name (str) – The name of the forward model. Must be unique among all forward model’s names within a considered InverseProblem.

  • args – Additional positional arguments that might be passed to the forward model when it is initialized.

  • kwargs – Additional keyword arguments that might be passed to the forward model when it is initialized.

_check_std_definitions()[source]#

Checks if the forward model’s output sensors share a common model error standard deviation parameter. The result is written to self.sensors_share_std_model.

_evaluate_interface()[source]#

Sets the attributes prms_def, prms_dim, input_sensors and output_sensors. This method is called during initialization.

connect_experimental_data_to_sensors(exp_name, sensor_data)[source]#

Connects the experimental data from an experiments to the corresponding sensors of the forward model. Note that sensor-objects are essentially dictionaries, so the connection is established by adding the ‘exp_name’ as key to the respective sensor-(dict)-object with the measurements as the dict-values. This method is called in the solvers before starting an inference routine.

Parameters
  • exp_name (str) – The name of the experiment the ‘sensor_values’ are coming from.

  • sensor_data (dict) – Keys are the sensor names (like “x” or “y”) and values are either floats, integers or numpy-ndarrays representing the measured values.

property input_channel_names: List[str]#

Provides input_channel_names attribute.

property input_sensor: probeye.definition.sensor.Sensor#

Returns the 1st input sensor. Intended for models with only one onf them.

property input_sensor_dict: dict#

Returns dict with input sensor names as keys and sensor objects as values.

property input_sensor_names: List[str]#

Provides input_sensor_names attribute.

interface()[source]#

This method must be overwritten by the user. It is used to explicitly define the forward model’s parameters, input and output sensors. Check out the integration tests to see examples.

property n_output_sensors: int#

Provides number of output_sensors as an attribute.

property output_sensor: probeye.definition.sensor.Sensor#

Returns the 1st output sensor. Intended for models with only one onf them.

property output_sensor_names: List[str]#

Provides input_sensor_names attribute.

prepare_experimental_inputs_and_outputs()[source]#

This method prepares the experimental-data-collection over the forward model’s input and output sensors. This is done in an own method here for efficiency reasons. Without this method, the loops over the input and output sensors would be repeated in each evaluation of the forward model. This method is called in the solvers before starting an inference routine. It sets the two general attributes ‘self.input_from_experiments’ and ‘self.output_from_experiment’.

response(inp)[source]#

Evaluates the model response and provides computed results for all of the model’s output sensors. This method must be overwritten by the user.

Parameters

inp (dict) – Contains both the exp. input data and the model’s parameters. The keys are the names, and the values are their numeric values.

Returns

Contains the model response (value) for each output sensor, referenced by the output sensor’s name (key).

Return type

dict

property sensor_names: List[str]#

Provides a list of all sensor names as an attribute.

likelihood_model#

class GaussianLikelihoodModel(experiment_name, model_error, measurement_error=None, correlation=None)[source]#

Bases: object

This class describes a Gaussian (i.e., normal) likelihood model in general terms. It contains information such as the likelihood model’s latent parameters, its scope with respect to a given experiment, the sensors it considers, error model specifics as well as information on possible correlations to be considered.

Parameters
  • experiment_name (str) – The name of the experiment the likelihood model refers to. Note that each likelihood model refers to exactly one experiment.

  • model_error (str) – Either ‘additive’ or ‘multiplicative’. This argument defines whether an additive or a multiplicative model prediction error should be assumed.

  • measurement_error (Optional[str]) – If True, next to the model error, a normal, zero-mean i.i.d. measurement error is assumed to be present.

  • correlation (Optional[probeye.definition.correlation_model.CorrelationModel]) –

property correlation_variables: list#

Shortens the access of the correlation model’s correlation variables.

determine_output_lengths()[source]#

Sets the self.output_lengths dictionary. This dict contains information on the length of the returned values of the likelihood model’s forward model in the likelihood model’s experiment. A simple example for an uncorrelated case could look like this (note that the ‘:’-character is the key for the full response): {‘:’: {‘total’: 202, ‘increments’: [101, 101], ‘names’: [‘y1’, ‘y2’]}} This is interpreted as follows: for the likelihood’s experiment, the forward model’s output dictionary will eventually be translated into a vector holding 202 values, where the first 101 belong to output sensor ‘y1’ and the following 101 values belong to output sensor. In a correlated case, the created dict will additionally contain the lengths of the correlation variables, e.g.: {‘:’: {‘total’: 12, ‘increments’: [6, 6], ‘names’: [‘y1’, ‘y2’]},

‘t’: {‘total’: 2, ‘increments’: [1, 1], ‘names’: [‘y1’, ‘y2’]}, ‘x’: {‘total’: 12, ‘increments’: [6, 6], ‘names’: [‘y1’, ‘y2’]}}

The ‘t’ and ‘x’ entries are interpreted as the ‘t’-correlation vector having length 2 and the ‘x’-correlation vector having length 12, while the remaining information is interpreted analogously as described before.

parameter#

class ParameterProperties(prm_dict)[source]#

Bases: object

Describes relevant properties of a (‘latent’ or ‘const’) parameter. Objects from this class are associated with the parameter’s name in the dictionary class ‘Parameters’, see above. The use of this class as opposed to a standard dictionary allows convenient auto-completion while coding.

Parameters

prm_dict (dict) – The keys are ‘index’, ‘dim’, ‘type’, ‘role’, ‘prior’, ‘value’, ‘info’ and ‘tex’, while the values are the corresponding values of these properties. See also the explanations in InverseProblem.__init__() for more detailed information.

changed_copy(index=None, dim=None, domain=None, type=None, prior=None, value=None, info=None, tex=None)[source]#

Convenience method that simplifies changing the attributes of a ParameterProperties object based on creating a new instance. The reason for this approach is that some of the attributes are private, and cannot (or at least should not) be changed directly from outside.

See the explanations in InverseProblem.__init__() for more detailed information on the arguments.

Parameters
  • index (Optional[int]) –

  • dim (Optional[int]) –

  • domain (Optional[Union[tuple, List[tuple]]]) –

  • type (Optional[str]) –

  • prior (Optional[Union[list, tuple]]) –

  • value (Optional[Union[int, float, numpy.ndarray]]) –

  • info (Optional[str]) –

  • tex (Optional[str]) –

Return type

probeye.definition.parameter.ParameterProperties

check_consistency()[source]#

Checks the defined attributes in both isolated checks (each attribute is checked without considering others) and cross-checks, where the combination of attributes is checked on consistency.

property dim: int#

Access self._dim from outside via self.dim.

property domain: Union[tuple, list]#

Access self._domain from outside via self.domain.

property index: int#

Access self._index from outside via self.index.

property index_end: int#

Adds a pseudo-attribute self.index_end, which allows a convenient access to the (not-inclusive) end index in the parameter vector.

property is_const: bool#

Adds a pseudo-attribute self.is_const, which allows a convenient check on whether a parameter is constant or not.

property is_latent: bool#

Adds a pseudo-attribute self.is_latent, which allows a convenient check on whether a parameter is latent or not.

property prior: Optional[Union[tuple, list]]#

Access self._prior from outside via self.prior.

property role: str#

Adds a pseudo-attribute self.role, which allows a convenient check on whether a parameter is latent or not.

static translate_domain_string(domain_string)[source]#

Translate a domain string like “(0, 1]” into a list of ScalarInterval objects.

Parameters

domain_string (str) – A string like “(0, 1]” or “[0, 1] [0, 1]” defining the domain of a (possibly vector-valued) parameter.

Returns

List of ScalarInterval objects derived form ‘domain_string’.

Return type

intervals

property type: str#

Access self._type from outside via self.type.

property value: Union[int, float]#

Access self._value from outside via self.value.

class Parameters[source]#

Bases: dict

The main parameter ‘library’. In this dictionary, all of the problem’s parameters are stored. The parameter’s names are the keys, and the associated values are ParameterProperties-objects, see below.

add_parameter(prm_name, prm_type='not defined', dim=1, domain='(-oo, +oo)', value=None, prior=None, info='No explanation provided', tex=None)[source]#

Adds a parameter (‘const’ or ‘latent’) to the Parameters-object. The main functionality of this method is to distinguish between the two types (‘const’ and ‘latent’) and in creating the prior-object and adding the prior-parameters when adding a latent param. to the problem.

Parameters
  • prm_name (str) – The name of the parameter which should be added to the problem.

  • prm_type (str) – Either ‘model’ (for a model parameter), ‘prior’ (for a prior parameter) or ‘likelihood’ (for a likelihood parameter).

  • dim (Optional[int]) – The parameter’s dimension.

  • domain (str) – The parameter’s domain (i.e., values it may assume). Note that this argument is only considered for latent parameter, but not for a constant.

  • value (Optional[Union[int, float, tuple, numpy.ndarray]]) – If the added parameter is a ‘const’-parameter, the corresponding value has to be specified by this argument.

  • prior (Optional[probeye.definition.distribution.ProbabilityDistribution]) – If the added parameter is a ‘latent’-parameter, this argument has to be given as a 2-tuple. The first element (a string) defines the prior-type (will be referenced in inference routines). The 2nd element must be a dictionary stating the prior’s parameters as keys and their numeric values as values or the name of a pre-defined parameter within the problem scope. An example for a normal prior: (‘normal’, {‘loc’: 0.0, ‘scale’: 1.0}). In order to define the prior’s parameters, check out the prior definitions in priors.py.

  • info (str) – Short explanation on the added parameter.

  • tex (Optional[str]) – The TeX version of the parameter’s name, for example r’$eta$’ for a parameter named ‘beta’.

confirm_that_parameter_does_not_exists(prm_name)[source]#

Checks if a parameter, given by its name, exists among the currently defined parameters. An error is raised when the given parameter does already exist.

Parameters

prm_name (str) – A global parameter name.

confirm_that_parameter_exists(prm_name)[source]#

Checks if a parameter, given by its name, exists among the currently defined parameters. An error is raised when the given parameter does not exist yet.

Parameters

prm_name (str) – A global parameter name.

const_parameter_values(tablefmt='presto')[source]#

Returns a string providing the values of the defined ‘const’-parameters.

Parameters

tablefmt (str) – An argument for the tabulate function defining the style of the generated table. Check out tabulate’s documentation for more info.

Returns

This string describes a nice table with the names and values of the constant parameters of the problem.

Return type

prm_string

property constant_prms: List[str]#

Access the names of all ‘const’-parameters as an attribute.

property constant_prms_dict: dict#

Access the names and values of all ‘const’-param. as an attribute.

property latent_prms: List[str]#

Access the names of all ‘latent’-parameters as an attribute.

property latent_prms_dims: List[int]#

Access the individual dimensions of the latent parameters.

property likelihood_prms: List[str]#

Access the names of all ‘likelihood’-parameters as an attribute.

property model_prms: List[str]#

Access the names of all ‘model’-parameters as an attribute.

property n_constant_prms: int#

Access the number of all ‘const’-parameters as an attribute.

property n_latent_prms: int#

Access the number of all ‘latent’-parameters as an attribute.

property n_latent_prms_dim: int#

Access the combined dimension of all latent parameters. This number is the number of elements in the theta vector.

property n_likelihood_prms: int#

Access the number of all ‘likelihood’-parameters as an attribute.

property n_model_prms: int#

Access the number of all ‘model’-parameters as an attribute.

property n_prior_prms: int#

Access the number of all ‘prior’-parameters as an attribute.

property n_prms: int#

Access the number of all parameters as an attribute.

overview(tablefmt='presto')[source]#

Returns a string providing an overview of the defined parameters.

Parameters

tablefmt (str) – An argument for the tabulate function defining the style of the generated table. Check out tabulate’s documentation for more info.

Returns

This string describes a nice table with some essential information on the parameters of the problem.

Return type

prm_string

parameter_explanations(tablefmt='presto')[source]#

Returns a string providing short explanations on the defined parameters.

Parameters

tablefmt (str) – An argument for the tabulate function defining the style of the generated table. Check out tabulate’s documentation for more info.

Returns

This string describes a nice table with short explanations on the parameters of the problem.

Return type

prm_string

property prior_prms: List[str]#

Access the names of all ‘prior’-parameters as an attribute.

property prms: List[str]#

Access the names of all parameters as an attribute.

property value_dict: dict#

Returns a dict with the parameter names as keys and their numeric values as values. A parameter will only have a value if it is a constant. For latent parameters the dictionary-value will be None

class ScalarInterval(lower_bound, upper_bound, lower_bound_included, upper_bound_included)[source]#

Bases: object

Describes a one-dimensional interval. Used for the domain-definition of parameters.

Parameters
  • lower_bound (float) – The lower bound of the interval (if the interval is [a, b], this here is a).

  • upper_bound (float) – The upper bound of the interval (if the interval is [a, b], this here is b).

  • lower_bound_included (bool) – Defines if the lower bound is included in the interval.

  • upper_bound_included (bool) – Defines if the upper bound is included in the interval.

check_bounds_inc_inc(value)[source]#

Checks if a given value is within the specified bounds (where both bounds are included).

Parameters

value (Union[int, float]) – The given scalar value.

Return type

True, if the value is within its bounds, otherwise False is returned.

check_bounds_inc_ninc(value)[source]#

Checks if a given value is within the specified bounds (where only the lower bound is included).

Parameters

value (Union[int, float]) – The given scalar value.

Return type

True, if the value is within its bounds, otherwise False is returned.

check_bounds_ninc_inc(value)[source]#

Checks if a given value is within the specified bounds (where only the upper bound is included).

Parameters

value (Union[int, float]) – The given scalar value.

Return type

True, if the value is within its bounds, otherwise False is returned.

check_bounds_ninc_ninc(value)[source]#

Checks if a given value is within the specified bounds (where only the upper bound is included).

Parameters

value (Union[int, float]) – The given scalar value.

Return type

True, if the value is within its bounds, otherwise False is returned.

prior#

class PriorBase(ref_prm, prms_def, name, dist)[source]#

Bases: object

Template class for prior definitions. Note that the main motivation of how this class is implemented was to avoid storing any numeric values for any of the priors parameters within the prior object.

Parameters
  • ref_prm (str) – The name of the latent parameter the prior refers to.

  • prms_def (Union[str, dict, List[Union[str, dict]]]) – A list of strings, or list of one-element-dicts defining the prior’s parameter names. For example [‘mean_a’, ‘std_a’] or {‘mean_a’: ‘std_a’, ‘std_a’: ‘std_a’}. The latter example is the notation for the use of global and local names, which should not be necessary for the definition of prior-parameters. A special case is the uninformative prior (see below) which hasn’t got an parameters. So, in this case prms_def might also be None.

  • name (str) – Defining the priors name.

  • dist (probeye.definition.distribution.ProbabilityDistribution) – The probability distribution that describes the prior.

property prior_type: str#

Dynamically accesses the prior’s distribution type from its dist-object.

distribution#

class LogNormal(mean, std)[source]#

Bases: probeye.definition.distribution.ProbabilityDistribution

Log-normal probability distribution (univariate). For more information check out https://en.wikipedia.org/wiki/Log-normal_distribution.

Parameters
  • mean (Union[int, float, str]) – The mean value of the distribution on the log-scale. Either a number, or a string that describes the name of the parameter that defines the value.

  • std (Union[int, float, str]) – The standard deviation of the distribution on the log-scale. Either a number, or a string that describes the name of the parameter that defines the value.

class MultivariateNormal(mean, cov)[source]#

Bases: probeye.definition.distribution.ProbabilityDistribution

Normal or Gaussian probability distribution (multivariate). For more information check out https://en.wikipedia.org/wiki/Multivariate_normal_distribution.

Parameters
  • mean (Union[numpy.ndarray, str]) – The mean value of the distribution. Either a vector, or a string that describes the name of the parameter that defines the value.

  • cov (Union[numpy.ndarray, str]) – The covariance matrix of the distribution. Either an array, or a string that describes the name of the parameter that defines the value.

class Normal(mean, std)[source]#

Bases: probeye.definition.distribution.ProbabilityDistribution

Normal or Gaussian probability distribution (univariate). For more information check out https://en.wikipedia.org/wiki/Normal_distribution.

Parameters
  • mean (Union[int, float, str]) – The mean value of the distribution. Either a number, or a string that describes the name of the parameter that defines the value.

  • std (Union[int, float, str]) – The standard deviation of the distribution. Either a number, or a string that describes the name of the parameter that defines the value.

class ProbabilityDistribution(dist_type)[source]#

Bases: object

Base class for the different (specific) probability distributions defined below. All of these classes have in common that they merely describe the respective distribution without providing any computing routines. The latter are overloaded when the problem is handed over to a solver.

Parameters

dist_type (str) – The type of the distribution. For example ‘normal’ or ‘uniform’.

static _plot(x, y, ax, color, rotate, adjusted_left, adjusted_right, label)[source]#

Basic plotting function for plotting the distributions probability density function (pdf).

Parameters
  • x (Union[numpy.ndarray, list, tuple, float, int]) – The values on the (un-rotated) x-axis of the pdf-plot.

  • y (Union[numpy.ndarray, list, tuple, float, int]) – The values on the (un-rotated) y-axis of the pdf-plot.

  • ax (plt.Axes) – The axis object to plot the prior-pdf on.

  • color (str) – The line-color of the prior-pdf’s graph.

  • rotate (bool) – If True, the x- and y-axis are switched. This is required, for example, in a pair-plot with two parameters. Here, the histogram to the right is rotated by 90 degrees.

  • adjusted_left (Union[int, float]) – Left limit of (un-rotated) x-axis. Typically adjusted with a margin.

  • adjusted_right (Union[int, float]) – Right limit of (un-rotated) x-axis. Typically adjusted with a margin.

  • label (str) – The label used in the legend of the plot for the plotted pdf.

plot(primary_var, ax, prms, x=None, n_points=200, color='darkorange', rotate=False, label='pdf')[source]#

Template for plotting method which plots the prior-pdf to a given axis object.

Parameters
  • primary_var (str) – The reference parameter of the distribution. For example, if one considers a normal distribution with a density called f, which one would evaluate via f(x, mean=0, std=1), then x would be the reference parameter.

  • ax (plt.Axes) – The axis object to plot the prior-pdf on.

  • prms (Parameters) – The parameters of the problem at hand. Essentially a dictionary. But the values are ParameterProperties-objects.

  • x (Optional[numpy.ndarray]) – The points where the prior-pdf should be evaluated at. If None is given, x will be derived from the x-limits of the given ax-object.

  • n_points (int) – The number of points of the prior-pdf graph. Only effective when x is None.

  • color (str) – The line-color of the prior-pdf’s graph.

  • rotate (bool) – If True, the x- and y-axis are switched. This is required, for example, in a pair-plot with two parameters. Here, the histogram to the right is rotated by 90 degrees.

  • label – The label used in the legend of the plot for the plotted pdf.

class SampleBased(samples)[source]#

Bases: probeye.definition.distribution.ProbabilityDistribution

Probability distribution defined via a number of samples (univariate). For more information check out https://en.wikipedia.org/wiki/Sampling_distribution.

Parameters

samples (Union[numpy.ndarray, str]) – The samples, the distribution is based on. Either a vector, or a string that describes the name of the parameter that defines the value.

class TruncNormal(mean, std, low, high)[source]#

Bases: probeye.definition.distribution.ProbabilityDistribution

Truncated normal or Gaussian probability distribution (univariate). For more information check out https://en.wikipedia.org/wiki/Truncated_normal_distribution.

Parameters
  • mean (Union[int, float, str]) – The mean value of the distribution. Either a number, or a string that describes the name of the parameter that defines the value.

  • std (Union[int, float, str]) – The standard deviation of the distribution. Either a number, or a string that describes the name of the parameter that defines the value.

  • low (Union[int, float, str]) – The lower bound of the distribution. Either a number, or a string that describes the name of the parameter that defines the value.

  • high (Union[int, float, str]) – The upper bound of the distribution. Either a number, or a string that describes the name of the parameter that defines the value.

class Uniform(low, high)[source]#

Bases: probeye.definition.distribution.ProbabilityDistribution

Uniform probability distribution (univariate) with bounds included. For more information check out https://en.wikipedia.org/wiki/Continuous_uniform_distribution.

Parameters
  • low (Union[int, float, str]) – The lower bound of the distribution. Either a number, or a string that describes the name of the parameter that defines the value.

  • high (Union[int, float, str]) – The upper bound of the distribution. Either a number, or a string that describes the name of the parameter that defines the value.

class Uninformative[source]#

Bases: probeye.definition.distribution.ProbabilityDistribution

Represents a univariate uniform distribution with bounds “at” +/- infinity. This distribution (whose internal purpose is that of a tag) is automatically assigned to a parameter that is added to an inverse problem that is neither assigned a prior nor a value. For example, problem.add_parameter(“a”).

class Weibull(scale, shape)[source]#

Bases: probeye.definition.distribution.ProbabilityDistribution

Two-parameter Weibull distribution (univariate). For more information check out https://en.wikipedia.org/wiki/Weibull_distribution.

Parameters
  • scale (Union[int, float, str]) – The scale-parameter of the distribution. Either a number, or a string that describes the name of the parameter that defines the value.

  • shape (Union[int, float, str]) – The shape-parameter of the distribution. Either a number, or a string that describes the name of the parameter that defines the value.

sensor#

class Sensor(name, measurand='not defined', unit='not defined', x=None, y=None, z=None, coords=None, order=('x', 'y', 'z'), std_model='not defined')[source]#

Bases: dict

Base class for an input or output sensor of the forward model. In its simplest form an instance of this class is just a dictionary with a ‘name’ attribute. Additional attributes for the measured quality (measurand) and the corresponding unit can be defined as well. If the sensors position(s) are important, they can be defined as attributes. Further attributes can be defined by the user by creating new classes derived from this one. If the sensor is used as an output sensor, the parameters that describe the statistics of the error in this sensor must be given. Moreover, a sensor objects points to the experimental data it refers to. For that purpose, the sensor class is derived from the dictionary class (so, essentially, a sensor is a dictionary with additional attributes). The keys of an output sensor are the experiment’s names in which some data for this sensor was collected. Consequently, the values are the measured values of the sensor in the respective experiment.

Parameters
  • name (str) – The name of the sensor, for example ‘T_in’, ‘v_x’ or just ‘y’. Note that the sensor’s name must be unique among all the sensors of the forward model the sensor was defined on. The name of the sensor does not have to be unique across multiple forward models though. So, for example, if there are two forward models defined in an inverse problem, they can both have a sensor named ‘y’.

  • measurand (str) – Defines what the sensor measures, for example ‘temperature’ or ‘deflection’. Note that this is optional information, which is currently not used elsewhere in the code.

  • unit (str) – Defines what unit is associated with the sensor’s measurements, for example ‘mm’ for a deflection or ‘K’ for a temperature.

  • std_model (str) – The name of the globally defined parameter that describes the standard deviation of the model prediction error in this sensor. If multiple output sensors are defined on a forward model, these sensors can have different parameters that describe their model prediction error..

  • x (Optional[Union[float, int, numpy.ndarray]]) – x-coordinate(s) of the sensor. When given, the coords- argument must be None. Usually, this value will be a scalar (for example if the sensor represents a point-like sensor like a strain gauge), but the value can also be a vector. This can make sense, for example, for an optical measurement system, which tracks the deflections of multiple points on a loaded structure. If next to ‘x’ also ‘y’ and/or ‘z’ are given, they must all have the same length.

  • y (Optional[Union[float, int, numpy.ndarray]]) – y-coordinate(s) of the sensor. When given, the coords- argument must be None. Check out the explanations of ‘x’ for more information.

  • z (Optional[Union[float, int, numpy.ndarray]]) – z-coordinate(s) of the sensor. When given, the coords-argument must be None. Check out the explanations of ‘x’ for more information.

  • coords (Optional[numpy.ndarray]) – Some or all of the coordinates x, y, z concatenated as an array. Each row corresponds to a constant coordinate, for example the first row might contain all values for the x-coordinate of all points. Which row corresponds to which coordinate is defined via the order-argument. When ‘coords’ given, the arguments ‘x’, ‘y’ and ‘z’ must be None.

  • order (Tuple[str, ...]) – Only relevant when ‘coords’ is given. Defines which row in ‘coords’ corresponds to which coordinate. For example, order=(‘x’, ‘y’, ‘z’) means that the 1st row of ‘coords’ contains x-coordinates while the 2nd row contains y-coords and the 3rd row contains the z-coordinates.

property order: List[str]#

Provides read-access to privat attribute self._order.

property x: Optional[numpy.ndarray]#

Provides x-coords as attribute without copying them from coords.

property y: Optional[numpy.ndarray]#

Provides y-coords as attribute without copying them from coords.

property z: Optional[numpy.ndarray]#

Provides z-coords as attribute without copying them from coords.

probeye.inference#

solver#

class Solver(problem, seed=None, show_progress=True)[source]#

Bases: object

Base class for the different solvers (inference engines).

Parameters
  • problem (InverseProblem) – Describes the inverse problem including e.g. parameters and data.

  • seed (Optional[int]) – Random state used for random number generation.

  • show_progress (bool) – When True, the progress of a solver routine will be shown (for example as a progress-bar) if such a feature is available. Otherwise, the progress will not shown.

_translate_experiments()[source]#

Translate the inverse problem’s experiments as needed for this solver.

_translate_forward_models()[source]#

Translate the inverse problem’s forward models as needed for this solver.

_translate_likelihood_models()[source]#

Translate the inverse problem’s likelihood models as needed for this solver.

_translate_parameters()[source]#

Translate the inverse problem’s parameters as needed for this solver.

scipy-solver#

class MaxLikelihoodSolver(problem, seed=None, show_progress=True)[source]#

Bases: probeye.inference.scipy.solver.ScipySolver

Solver for a maximum likelihood estimation. This class is separate from ScipySolver so that its main function can be triggered by a ‘run’-method. For information on the arguments see Solver.

Parameters
  • problem (InverseProblem) –

  • seed (Optional[int]) –

  • show_progress (bool) –

run(x0_dict=None, x0_prior='mean', x0_default=1.0, true_values=None, method='Nelder-Mead', solver_options=None)[source]#

Triggers a maximum likelihood estimation. For more information on the arguments check out probeye.inference.scipy.solver._run_ml_or_map().

Parameters
  • x0_dict (Optional[dict]) –

  • x0_prior (str) –

  • x0_default (float) –

  • true_values (Optional[dict]) –

  • method (str) –

  • solver_options (Optional[dict]) –

Return type

scipy.optimize.OptimizeResult

class MaxPosteriorSolver(problem, seed=None, show_progress=True)[source]#

Bases: probeye.inference.scipy.solver.ScipySolver

Solver for maximum a-posteriori estimation. This class is separate from ScipySolver so that its main function can be triggered by a ‘run’-method. For information on the arguments see Solver.

Parameters
  • problem (InverseProblem) –

  • seed (Optional[int]) –

  • show_progress (bool) –

run(x0_dict=None, x0_prior='mean', x0_default=1.0, true_values=None, method='Nelder-Mead', solver_options=None)[source]#

Triggers a maximum a-posteriori estimation. For more information on the args check out probeye.inference.scipy.solver._run_ml_or_map().

Parameters
  • x0_dict (Optional[dict]) –

  • x0_prior (str) –

  • x0_default (float) –

  • true_values (Optional[dict]) –

  • method (str) –

  • solver_options (Optional[dict]) –

Return type

scipy.optimize.OptimizeResult

class ScipySolver(problem, seed=None, show_progress=True)[source]#

Bases: probeye.inference.solver.Solver

Solver based on scipy and numpy for an InverseProblem. The ScipySolver contains the methods for log-prior and log-likelihood evaluation. For information on the arguments see Solver.

Parameters
  • problem (InverseProblem) –

  • seed (Optional[int]) –

  • show_progress (bool) –

_run_ml_or_map(x0_dict=None, x0_prior='mean', x0_default=1.0, true_values=None, method='Nelder-Mead', solver_options=None, use_priors=True)[source]#

Finds values for an InverseProblem’s latent parameters that maximize the problem’s likelihood or likelihood * prior function. The used method is scipy’s minimize function from the optimize submodule.

Parameters
  • x0_dict (Optional[dict]) – Contains the start values for each latent variable. Via this arg the user can explicitly specify a start value for the optimization.

  • x0_prior (str) – If x0_dict is not given, the start values will be derived from the priors, either using the ‘mean’ or ‘median’ value. If x0_dict is given, this argument has no effect. Valid values are ‘mean’ and ‘median’.

  • x0_default (float) – For uninformative priors, no mean or median value is defined. In those cases, the default_x0 value will be used as start value. If x0_dict is given, this argument has no effect.

  • true_values (Optional[dict]) – Defines ‘true’ parameter values. Keys are the parameter names and values are the ‘true’ values. They are only used to print them next to the inferred parameter values from the optimization run.

  • method (str) – Defines the algorithm used by scipy.optimize.minimize. See the documentation of this scipy method to see all the options.

  • solver_options (Optional[dict]) – Options passed to scipy.optimize.minimize under the ‘options’ keyword arg. See the documentation of this scipy method to see available options.

  • use_priors (bool) – When True, the priors are included in the objective function (MAP). Otherwise, the priors are not included (ML).

Returns

An object returns by scipy’s minimize function containing the optimization results. The parameter vector that optimizes the likelihood function can be requested via ‘minimize_results.x’.

Return type

minimize_results

_translate_experiments()[source]#

Translate the inverse problem’s experiments as needed for this solver.

_translate_forward_models()[source]#

Translate the inverse problem’s forward models as needed for this solver.

_translate_likelihood_models()[source]#

Translate the inverse problem’s likelihood models as needed for this solver.

_translate_parameters()[source]#

Translate the inverse problem’s parameters as needed for this solver.

evaluate_model_response(theta, forward_model, experiment_name)[source]#

Evaluates the model response for each forward model for the given parameter vector theta and the given experiments.

Parameters
  • theta (numpy.ndarray) – A numeric vector for which the model responses should be evaluated. Which parameters these numbers refer to can be checked by calling self.theta_explanation() once the problem is set up.

  • forward_model (ForwardModelBase) – The forward model that should be evaluated.

  • experiment_name (str) – The experiment, the forward model should be evaluated for.

Returns

  • model_response_vector – Vector of the model responses (concatenated over output sensors).

  • residuals_vector – Vector of the model residuals (concatenated over output sensors).

Return type

Tuple[numpy.ndarray, numpy.ndarray]

get_start_values(x0_dict=None, x0_prior='mean', x0_default=1.0)[source]#

Derives the start values for the maximum likelihood optimization run. For an explanation of the arguments, see self.run_max_likelihood.

Returns

  • x0 – A numeric vector with the derived start values in the order of InverseProblem.get_theta_names().

  • x0_dict – Keys are the latent parameters, while the keys are their start values.

Parameters
  • x0_dict (Optional[dict]) –

  • x0_prior (str) –

  • x0_default (float) –

Return type

Tuple[numpy.ndarray, dict]

loglike(theta)[source]#

Evaluates the log-likelihood function of the problem at theta.

Parameters

theta (numpy.ndarray) – A numeric vector for which the log-likelihood function should be evaluated. Which parameters these numbers refer to can be checked by calling self. theta_explanation() once the problem is set up.

Returns

The evaluated log-likelihood function for the given theta-vector.

Return type

ll

logprior(theta)[source]#

Evaluates the log-prior function of the problem at theta.

Parameters

theta (numpy.ndarray) – A numeric vector for which the log-likelihood function should be evaluated. Which parameters these numbers refer to can be checked by calling self. theta_explanation() once the problem is set up.

Returns

The evaluated log-prior function for the given theta-vector.

Return type

lp

sample_from_prior(prm_name, size)[source]#

Generates random samples from a parameter’s prior distribution and returns the generated samples.

Parameters
  • prm_name (str) – The name of the parameter the prior is associated with.

  • size (int) – The number of random samples to be drawn.

Return type

The generated samples.

summarize_point_estimate_results(minimize_results, true_values, x0_dict, estimate_type='maximum likelihood estimation')[source]#

Prints a summary of the results of the maximum likelihood estimation. For an explanation of the arguments, check out the docstring of the self.run_max_likelihood-method.

Parameters
  • minimize_results (scipy.optimize.OptimizeResult) –

  • true_values (Optional[dict]) –

  • x0_dict (dict) –

  • estimate_type (str) –

emcee-solver#

class EmceeSolver(problem, seed=None, show_progress=True)[source]#

Bases: probeye.inference.scipy.solver.ScipySolver

Provides emcee-sampler which is a pure-Python implementation of Goodman & Weare’s Affine Invariant Markov chain Monte Carlo (MCMC) Ensemble sampler. For more information, check out https://emcee.readthedocs.io/en/stable/.

Parameters
  • problem (InverseProblem) – Describes the inverse problem including e.g. parameters and data.

  • seed (Optional[int]) – Random state used for random number generation.

  • show_progress (bool) – When True, the progress of a solver routine will be shown (for example as a progress-bar) if such a feature is available. Otherwise, the progress will not shown.

emcee_summary(posterior_samples, true_values=None)[source]#

Computes and prints a summary of the posterior samples containing mean, median, standard deviation, 5th percentile and 95th percentile. Note, that this method was based on code from the taralli package: https://gitlab.com/tno-bim/taralli.

Parameters
  • posterior_samples (numpy.ndarray) – The generated samples in an array with as many columns as there are latent parameters, and n rows, where n = n_chains * n_steps.

  • true_values (Optional[dict]) – True parameter values, if known.

Returns

  • Keys are the different statistics ‘mean’, ‘median’, ‘sd’ (standard

  • deviation), ‘q05’ and ‘q95’ (0.05- and 0.95-quantile). The values are

  • dictionaries with the parameter names as keys and the respective statistics

  • as values.

Return type

dict

run(n_walkers=20, n_steps=1000, n_initial_steps=100, true_values=None, **kwargs)[source]#

Runs the emcee-sampler for the InverseProblem the EmceeSolver was initialized with and returns the results as an arviz InferenceData obj.

Parameters
  • n_walkers (int) – Number of walkers used by the estimator.

  • n_steps (int) – Number of steps to run.

  • n_initial_steps (int) – Number of steps for initial (burn-in) sampling.

  • true_values (Optional[dict]) – True parameter values, if known.

  • kwargs – Additional key-word arguments channeled to emcee.EnsembleSampler.

Returns

Contains the results of the sampling procedure.

Return type

inference_data

dynesty-solver#

class DynestySolver(problem, seed=None, show_progress=True)[source]#

Bases: probeye.inference.scipy.solver.ScipySolver

A static and dynamic nested parameter estimator. It facilitates the use of the python package dynesty. The default is set to a static parameter estimator.

_Note:_ For full details on the dynesty library see:

https://dynesty.readthedocs.io/en/latest/index.html.

Parameters
  • problem (InverseProblem) – Describes the inverse problem including e.g. parameters and data.

  • seed (Optional[int]) – Random state used for random number generation.

  • show_progress (bool) – When True, the progress of a solver routine will be shown (for example as a progress-bar) if such a feature is available. Otherwise, the progress will not shown.

get_summary(posterior_samples, true_values=None)[source]#

Computes and prints a summary of the posterior samples containing mean, median, standard deviation, 5th percentile and 95th percentile. Note, that this method was based on code from the taralli package: https://gitlab.com/tno-bim/taralli.

Parameters
  • posterior_samples (numpy.ndarray) – The generated samples in an array with as many columns as there are latent parameters, and n rows, where n = n_chains * n_steps.

  • true_values (Optional[dict]) – True parameter values, if known.

Returns

  • Keys are the different statistics ‘mean’, ‘median’, ‘sd’ (standard

  • deviation), ‘q05’ and ‘q95’ (0.05- and 0.95-quantile). The values are

  • dictionaries with the parameter names as keys and the respective statistics

  • as values.

Return type

dict

prior_transform(theta)[source]#

Evaluates the ppf of the prior distributions at theta.

Parameters

theta (numpy.ndarray) – A numeric vector for which the ppf should be evaluated. Which parameters these numbers refer to can be checked by calling self. theta_explanation() once the problem is set up.

Returns

The vector of quantiles for each prior distribution at theta.

Return type

qs

run(estimation_method='dynamic', nlive=250, true_values=None, **kwargs)[source]#

Runs the dynesty-sampler for the InverseProblem the DynestySolver was initialized with and returns the results as an arviz InferenceData obj.

Parameters
  • estimation_method (str) – “dynamic” or “static”

  • nlive (int) – number of live points

  • true_values (Optional[dict]) – True parameter values, if known.

  • kwargs – Additional key-word arguments channeled to emcee.EnsembleSampler.

Returns

Contains the results of the sampling procedure.

Return type

inference_data or dynesty sampler

probeye.postprocessing#

sampling_plots#

create_pair_plot(inference_data, problem, plot_with='arviz', plot_priors=True, focus_on_posterior=True, kind='kde', figsize=None, inches_per_row=2.0, inches_per_col=2.0, textsize=10, title_size=14, title=None, true_values=None, show_legends=True, show=True, **kwargs)[source]#

Creates a pair-plot for the given inference data.

Parameters
  • inference_data (arviz.data.inference_data.InferenceData) – Contains the results of the sampling procedure.

  • problem (InverseProblem) – The inverse problem the inference data refers to.

  • plot_with (str) – Defines the python package the plot will be generated with. Options are: {‘arviz’, ‘seaborn’, ‘matplotlib’}.

  • plot_priors (bool) – If True, the prior-distributions are included in the marginal subplots. Otherwise the priors are not shown.

  • focus_on_posterior (bool) – If True, the marginal plots will focus on the posteriors, i.e., the range of the horizontal axis will adapt to the posterior. This might result in just seeing a fraction of the prior distribution (if they are included). If False, the marginal plots will focus on the priors, which will have a broader x-range. If plot_priors=False, this argument has no effect on the generated plot.

  • kind (str) – Type of plot to display (‘scatter’, ‘kde’ and/or ‘hexbin’).

  • figsize (Optional[tuple]) – Defines the size of the generated plot in inches. If None is chosen, the figsize will be derived automatically by using inches_per_row and inches_per_col.

  • inches_per_row (Union[int, float]) – If figsize is None, this will specify the inches per row in the subplot-grid. This argument has no effect if figsize is specified.

  • inches_per_col (Union[int, float]) – If figsize is None, this will specify the inches per column in the subplot-grid. This argument has no effect if figsize is specified.

  • textsize (Union[int, float]) – Defines the font size in the default unit.

  • title_size (Union[int, float]) – Defines the font size of the figures title if ‘title’ is given.

  • title (Optional[str]) – The title of the figure.

  • true_values (Optional[dict]) – Used for plotting ‘true’ parameter values. Keys are the parameter names and values are the values that are supposed to be shown in the marginal plots.

  • show_legends (bool) – If True, legends are shown in the marginal plots. Otherwise no legends are included in the plot.

  • show (bool) – When True, the show-method is called after creating the plot. Otherwise, the show-method is not called. The latter is useful, when the plot should be further processed.

  • kwargs – Additional keyword arguments passed to arviz’ pairplot function.

Returns

The array of subplots of the created plot.

Return type

axs

create_posterior_plot(inference_data, problem, plot_with='arviz', kind='hist', figsize=None, inches_per_row=3.0, inches_per_col=2.5, textsize=10, title_size=14, title=None, hdi_prob=0.95, true_values=None, show=True, **kwargs)[source]#

Creates a posterior-plot for the given inference data.

Parameters
  • inference_data (arviz.data.inference_data.InferenceData) – Contains the results of the sampling procedure.

  • problem (InverseProblem) – The inverse problem the inference data refers to.

  • plot_with (str) – Defines the python package the plot will be generated with. Options are: {‘arviz’, ‘seaborn’, ‘matplotlib’}.

  • kind (str) – Type of plot to display (‘kde’ or ‘hist’).

  • figsize (Optional[tuple]) – Defines the size of the generated plot in inches. If None is chosen, the figsize will be derived automatically by using inches_per_row and inches_per_col.

  • inches_per_row (Union[int, float]) – If figsize is None, this will specify the inches per row in the subplot-grid. This argument has no effect if figsize is specified.

  • inches_per_col (Union[int, float]) – If figsize is None, this will specify the inches per column in the subplot-grid. This argument has no effect if figsize is specified.

  • textsize (Union[int, float]) – Defines the font size in the default unit.

  • title_size (Union[int, float]) – Defines the font size of the figures title if ‘title’ is given.

  • title (Optional[str]) – The title of the figure.

  • hdi_prob (float) – Defines the highest density interval. Must be a number between 0 and 1.

  • true_values (Optional[dict]) – Used for plotting ‘true’ parameter values. Keys are the parameter names and values are the values that are supposed to be shown in the marginal plots.

  • show (bool) – When True, the show-method is called after creating the plot. Otherwise, the show-method is not called. The latter is useful, when the plot should be further processed.

  • kwargs – Additional keyword arguments passed to arviz’ plot_posterior function.

Returns

The array of subplots of the created plot.

Return type

axs

create_trace_plot(inference_data, problem, plot_with='arviz', kind='trace', figsize=None, inches_per_row=2.0, inches_per_col=3.0, textsize=10, title_size=14, title=None, show=True, **kwargs)[source]#

Creates a trace-plot for the given inference data.

Parameters
  • inference_data (arviz.data.inference_data.InferenceData) – Contains the results of the sampling procedure.

  • problem (InverseProblem) – The inverse problem the inference data refers to.

  • plot_with (str) – Defines the python package the plot will be generated with. Options are: {‘arviz’, ‘seaborn’, ‘matplotlib’}.

  • kind (str) – Allows to choose between plotting sampled values per iteration (“trace”) and rank plots (“rank_bar”, “rank_vlines”).

  • figsize (Optional[tuple]) – Defines the size of the generated plot in inches. If None is chosen, the figsize will be derived automatically by using inches_per_row and inches_per_col.

  • inches_per_row (Union[int, float]) – If figsize is None, this will specify the inches per row in the subplot-grid. This argument has no effect if figsize is specified.

  • inches_per_col (Union[int, float]) – If figsize is None, this will specify the inches per column in the subplot-grid. This argument has no effect if figsize is specified.

  • textsize (Union[int, float]) – Defines the font size in the default unit.

  • title_size (Union[int, float]) – Defines the font size of the figures title if ‘title’ is given.

  • title (Optional[str]) – The title of the figure.

  • show (bool) – When True, the show-method is called after creating the plot. Otherwise, the show-method is not called. The latter is useful, when the plot should be further processed.

  • kwargs – Additional keyword arguments passed to arviz’ plot_trace function.

Returns

The array of subplots of the created plot.

Return type

axs

For contributors#

All contributions are welcome, feel free to create a pull request on GitHub. Please note that we are using black as a format-checker and mypy as a static type-checker to pre-check any pushes to a pull-request. If your code is not in line black/mypy, no unit tests will be run.

Documentation#

We use sphinx & ReadTheDocs for documenting.

Test your changes locally:

  • Install the documentation dependencies (setup.cfg: docs).

  • Local build: from the docs directory execute make html (on Windows: .\make.bat html).

Testing#

We use pytest for testing.