{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Simple linear regression\n\nA simple linear regression example with two model parameters and one noise parameter.\n\nThe model equation is y = a * x + b with a, b being the model parameters and the\nnoise model is a normal zero-mean distribution with the std. deviation to infer.\nThe problem is solved via sampling using emcee and pyro.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Import what we will need for this example.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import matplotlib.pyplot as plt\nimport numpy as np\n\nfrom probeye.definition.forward_model import ForwardModelBase\nfrom probeye.definition.inference_problem import InferenceProblem\nfrom probeye.definition.noise_model import NormalNoiseModel\nfrom probeye.definition.sensor import Sensor\nfrom probeye.inference.emcee_.solver import EmceeSolver"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "We start by generating a synthetic data set from a known linear model. Later we\nwill pretend to forgot about the parameters of this ground truth model and will try\nto recover them from the data. The slope and intercept of the ground truth model:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "a_true = 2.5\nb_true = 1.7"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Generate a few data points that we contaminate with a Gaussian error:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "n_tests = 50\nseed = 1\n\nnp.random.seed(seed)\nx_test = np.linspace(0.0, 1.0, n_tests)\ny_true = a_true * x_test + b_true\nsigma_noise = 0.5\ny_test = y_true + np.random.normal(loc=0, scale=sigma_noise, size=n_tests)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Visualize our data points\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "plt.plot(x_test, y_test, \"o\")\nplt.xlabel(\"x\")\nplt.ylabel(\"y\")\nplt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Define our parametrized linear model:\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "class LinearModel(ForwardModelBase):\n    def response(self, inp):\n        # this method *must* be provided by the user\n        x = inp[\"x\"]\n        m = inp[\"m\"]\n        b = inp[\"b\"]\n        response = {}\n        for os in self.output_sensors:\n            response[os.name] = m * x + b\n        return response\n\n    def jacobian(self, inp):\n        # this method *can* be provided by the user; if not provided\n        # the jacobian will be approximated by finite differences\n        x = inp[\"x\"]  # vector\n        one = np.ones((len(x), 1))\n        jacobian = {}\n        for os in self.output_sensors:\n            # partial derivatives must only be stated for the model\n            # parameters; all other input must be flagged by None;\n            # note: partial derivatives must be given as column vectors\n            jacobian[os.name] = {\"x\": None, \"m\": x.reshape(-1, 1), \"b\": one}\n        return jacobian"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Define the inference problem.\nInitialize the inference problem with a useful name; note that the\nname will only be stored as an attribute of the InferenceProblem and\nis not important for the problem itself; can be useful when dealing\nwith multiple problems\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "problem = InferenceProblem(\"Linear regression with normal noise\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Add all parameters to the problem; the first argument states the\nparameter's global name (here: 'a', 'b' and 'sigma'); the second\nargument defines the parameter type (three options: 'model' for\nparameter's of the forward model, 'prior' for prior parameters and\n'noise' for parameters of the noise model); the 'info'-argument is a\nshort description string used for logging, and the tex-argument gives\na tex-string of the parameter used for plotting; finally, the prior-\nargument specifies the parameter's prior; note that this definition\nof a prior will result in the initialization of constant parameters of\ntype 'prior' in the background\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "problem.add_parameter(\n    \"a\",\n    \"model\",\n    tex=\"$a$\",\n    info=\"Slope of the graph\",\n    prior=(\"normal\", {\"loc\": 2.0, \"scale\": 1.0}),\n)\nproblem.add_parameter(\n    \"b\",\n    \"model\",\n    info=\"Intersection of graph with y-axis\",\n    tex=\"$b$\",\n    prior=(\"normal\", {\"loc\": 1.0, \"scale\": 1.0}),\n)\nproblem.add_parameter(\n    \"sigma\",\n    \"noise\",\n    tex=r\"$\\sigma$\",\n    info=\"Std. dev, of 0-mean noise model\",\n    prior=(\"uniform\", {\"low\": 0.1, \"high\": 0.8}),\n)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Add the forward model to the problem; note that the first positional\nargument [{'a': 'm'}, 'b'] passed to LinearModel defines the forward\nmodel's parameters by name via a list with elements structured like\n{<global parameter name>: <local parameter name>}; a global name is a\nname introduced by problem.add_parameter, while a local name is a name\nused in the response-method of the forward model class (see the class\nLinearModel above); note that the use of the local parameter name 'm'\nfor the global parameter 'a' is added here only to highlight the\npossibility of this feature; it is not necessary at all here; whenever\nforward model's parameter has a similar local and global name (which\nshould be the case most of the times), one doesn't have to use the\nverbose notation  {<global parameter name>: <local parameter name>}\nbut can instead just write the parameter's (global=local) name, like\nit is done with the forward model's parameter 'b' below\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "isensor = Sensor(\"x\")\nosensor = Sensor(\"y\")\nlinear_model = LinearModel([{\"a\": \"m\"}, \"b\"], [isensor], [osensor])\nproblem.add_forward_model(\"LinearModel\", linear_model)\n\n# add the noise model to the problem\nproblem.add_noise_model(NormalNoiseModel(prms_def={\"sigma\": \"std\"}, sensors=osensor))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Add test data to the Inference Problem\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "problem.add_experiment(\n    \"TestSeries_1\",\n    fwd_model_name=\"LinearModel\",\n    sensor_values={isensor.name: x_test, osensor.name: y_test},\n)\n\n# give problem overview\nproblem.info()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Estimate the parameters using `emcee`\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "emcee_solver = EmceeSolver(problem, show_progress=True)\ninference_data = emcee_solver.run_mcmc(n_walkers=20, n_steps=2_000, n_initial_steps=200)"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.8.6"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}