---
jupytext:
  text_representation:
    extension: .md
    format_name: myst
kernelspec:
  display_name: Python 3
  language: python
  name: python3
---

# Simulation based calibration

This example demonstrates how to use the `SBC` class for simulation-based calibration, supporting both PyMC and Bambi models.

```{jupyter-execute}

from arviz_plots import plot_ecdf_pit, style
import numpy as np
import simuk
style.use("arviz-variat")
```

::::::{tab-set}
:class: full-width

:::::{tab-item} PyMC
:sync: pymc

First, define a PyMC model. In this example, we will use the centered eight schools model.

```{jupyter-execute}

import pymc as pm

data = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0])
sigma = np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0])

with pm.Model() as centered_eight:
    mu = pm.Normal('mu', mu=0, sigma=5)
    tau = pm.HalfCauchy('tau', beta=5)
    theta = pm.Normal('theta', mu=mu, sigma=tau, shape=8)
    y_obs = pm.Normal('y', mu=theta, sigma=sigma, observed=data)
```

Pass the model to the SBC class, set the number of simulations to 100, and run the simulations. This process may take
some time since the model runs multiple times (100 in this example).

```{jupyter-execute}

sbc = simuk.SBC(centered_eight,
    num_simulations=100,
    sample_kwargs={'draws': 25, 'tune': 50})

sbc.run_simulations();
```

To compare the prior and posterior distributions, we will plot the results from the simulations,
using the ArviZ function `plot_ecdf_pit`.
We expect a uniform distribution, the gray envelope corresponds to the 94% credible interval.

```{jupyter-execute}

plot_ecdf_pit(sbc.simulations,
              pc_kwargs={'col_wrap':4},
              plot_kwargs={"xlabel":False},
)
```

:::::

:::::{tab-item} Bambi
:sync: bambi

Now, we define a Bambi Model.

```{jupyter-execute}

import bambi as bmb
import pandas as pd

x = np.random.normal(0, 1, 200)
y = 2 + np.random.normal(x, 1)
df = pd.DataFrame({"x": x, "y": y})
bmb_model = bmb.Model("y ~ x", df)
```

Pass the model to the `SBC` class, set the number of simulations to 100, and run the simulations.
This process may take some time, as the model runs multiple times

```{jupyter-execute}

sbc = simuk.SBC(bmb_model,
    num_simulations=100,
    sample_kwargs={'draws': 25, 'tune': 50})

sbc.run_simulations();
```

To compare the prior and posterior distributions, we will plot the results from the simulations.
We expect a uniform distribution, the gray envelope corresponds to the 94% credible interval.

```{jupyter-execute}
plot_ecdf_pit(sbc.simulations)
```

:::::

:::::{tab-item} Numpyro
:sync: numpyro

We define a Numpyro Model, we use the centered eight schools model.

```{jupyter-execute}
import numpyro
import numpyro.distributions as dist
from jax import random
from numpyro.infer import NUTS
y = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0])
sigma = np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0])
def eight_schools_cauchy_prior(J, sigma, y=None):
    mu = numpyro.sample("mu", dist.Normal(0, 5))
    tau = numpyro.sample("tau", dist.HalfCauchy(5))
    with numpyro.plate("J", J):
        theta = numpyro.sample("theta", dist.Normal(mu, tau))
    numpyro.sample("y", dist.Normal(theta, sigma), obs=y)
# We use the NUTS sampler
nuts_kernel = NUTS(eight_schools_cauchy_prior)
```

Pass the model to the `SBC` class, set the number of simulations to 100, and run the simulations. For numpyro model, 
we pass in the ``data_dir`` parameter.

```{jupyter-execute}
sbc = simuk.SBC(nuts_kernel,
    sample_kwargs={"num_warmup": 50, "num_samples": 75},
    num_simulations=100,
    data_dir={"J": 8, "sigma": sigma, "y": y},
)
sbc.run_simulations()
```

To compare the prior and posterior distributions, we will plot the results.
We expect a uniform distribution, the gray envelope corresponds to the 94% credible interval.

```{jupyter-execute}
plot_ecdf_pit(sbc.simulations, 
              pc_kwargs={'col_wrap':4},
              plot_kwargs={"xlabel":False}
)
```