Adventures with PreliZ

October 2, 2024 · 2 min read

Software engineer


import preliz as pz

warning

WARNING (pytensor.tensor.blas): Using NumPy C-API based implementation for BLAS functions.

I'll start with continuous distributions, specifically the beta. It is a good place to begin because when its parameters $\alpha$ and $\beta$ both equal 1, it reduces to a uniform distribution over its support. The beta distribution has the probability density function

$$f(x\mid\alpha,\beta)=\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}\left(1-x\right)^{\beta-1}$$

where $\Gamma(\cdot)$ is the Gamma function, and $\alpha$ and $\beta$ are parameters we can adjust. Below I have set $\alpha=\beta=1$, which gives a uniform distribution over the support of the distribution (the interval from 0 to 1 on the x-axis).

pz.Beta(alpha=1, beta=1).plot_pdf(pointinterval=True)

<Axes: >
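As a quick sanity check on the uniform claim, here is a minimal sketch that evaluates the density formula above by hand using only the standard library. The helper name `beta_pdf` is my own and is not part of PreliZ.

```python
import math


def beta_pdf(x, alpha, beta):
    """Beta density at x in (0, 1), written term by term from the formula above."""
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * x ** (alpha - 1) * (1 - x) ** (beta - 1)


# With alpha = beta = 1 both exponents are zero and the Gamma ratio is 1,
# so the density takes the same value everywhere on the support.
flat_values = [beta_pdf(x, 1, 1) for x in (0.1, 0.5, 0.9)]
print(flat_values)
```

Every entry comes out as 1, which is the flat line in the plot.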

Another choice of parameters for the beta distribution is shown below. The "box plot" under the curve shows the 50% highest density interval (thicker line) and the 95% highest density interval (thinner line). Another cool thing about the beta distribution is that as $\alpha\rightarrow 0$ and $\beta\rightarrow 0$, the density piles up at the two endpoints, 0 and 1, so in the limit we get a Bernoulli distribution.

pz.Beta(alpha=2, beta=5).plot_pdf(pointinterval=True)

<Axes: >
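One way to see the Bernoulli limit numerically is through the moments. The sketch below assumes $\alpha$ and $\beta$ shrink together while the ratio $\alpha/(\alpha+\beta)$ stays fixed at some $p$ (my assumption, chosen for illustration). It uses the standard beta mean and variance formulas; the helper `beta_mean_var` is hypothetical and not a PreliZ function.

```python
def beta_mean_var(alpha, beta):
    """Mean and variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    mean = alpha / total
    var = alpha * beta / (total ** 2 * (total + 1))
    return mean, var


p = 0.3  # illustrative value, held fixed while alpha + beta shrinks
bernoulli_var = p * (1 - p)

for scale in (10.0, 1.0, 0.1, 0.001):
    mean, var = beta_mean_var(p * scale, (1 - p) * scale)
    print(f"alpha+beta={scale:<6} mean={mean:.3f} var={var:.4f}")

# Variance for a very small alpha + beta, to compare with p * (1 - p).
_, small_scale_var = beta_mean_var(p * 1e-6, (1 - p) * 1e-6)
print(small_scale_var, bernoulli_var)
```

The mean stays at $p$ while the variance climbs toward $p(1-p)$. A variable on $[0, 1]$ with mean $p$ can only reach that variance by putting all of its mass on 0 and 1, which is exactly a Bernoulli distribution.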

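One of the coolest things about PreliZ is that you can determine the parameters of a distribution by specifying an interval and the probability mass it should contain. pz.maxent then finds the maximum entropy distribution of that family that satisfies the constraint.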

pz.maxent(distribution=pz.Beta(), lower=0.3, upper=0.8, mass=0.8)

(Beta(alpha=3.3, beta=2.8), <Axes: >)

So what does this mean exactly? Well, if I had data between 0.3 and 0.8, I could choose a well-informed prior that puts most of its mass (here 80%) on that range but still assigns some probability to values beyond it. To compare, look at what happens when the mass is set to 0.6.

pz.maxent(distribution=pz.Beta(), lower=0.3, upper=0.8, mass=0.8)
pz.maxent(distribution=pz.Beta(), lower=0.3, upper=0.8, mass=0.6)

(Beta(alpha=1.49, beta=1.4), <Axes: >)

Comparing the two distributions shows that the one with mass 0.6 in the interval $[0.3, 0.8]$ places more probability on values in $[0, 0.3]$ than the distribution with mass 0.8.
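To check that comparison without relying on the plots, here is a rough sketch that integrates the beta density numerically with a midpoint rule. It plugs in the rounded parameters printed above, so the masses should land close to 0.8 and 0.6 rather than exactly on them. The helpers `beta_pdf` and `interval_mass` are my own and are not part of PreliZ.

```python
import math


def beta_pdf(x, alpha, beta):
    """Beta density at x in (0, 1)."""
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * x ** (alpha - 1) * (1 - x) ** (beta - 1)


def interval_mass(alpha, beta, lower, upper, steps=20_000):
    """Midpoint-rule approximation of P(lower <= X <= upper)."""
    width = (upper - lower) / steps
    heights = (beta_pdf(lower + (i + 0.5) * width, alpha, beta) for i in range(steps))
    return sum(heights) * width


# Requested mass -> rounded (alpha, beta) reported by pz.maxent above.
fits = {0.8: (3.3, 2.8), 0.6: (1.49, 1.4)}

results = {}
for mass, (alpha, beta) in fits.items():
    results[mass] = {
        "inside": interval_mass(alpha, beta, 0.3, 0.8),
        "below": interval_mass(alpha, beta, 0.0, 0.3),
    }
    print(mass, results[mass])
```

The "below" entry is the probability of drawing a value in $[0, 0.3]$, and it is larger for the fit with mass 0.6.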

pz.maxent(distribution=pz.Normal(), lower=0.3, upper=0.8, mass=0.6)

(Normal(mu=0.55, sigma=0.297), <Axes: >)
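For the normal, the result above can be reproduced by hand. The sketch below rests on two assumptions of mine: the entropy of a normal grows with sigma, and the widest normal that still puts 60% of its mass in the interval is the one centered on it. It uses only `statistics.NormalDist` from the standard library and does not show how PreliZ solves the problem internally.

```python
from statistics import NormalDist

lower, upper, mass = 0.3, 0.8, 0.6

# Center the normal on the interval.
mu = (lower + upper) / 2

# For a centered interval, the upper bound sits at the (0.5 + mass / 2) quantile.
z = NormalDist().inv_cdf(0.5 + mass / 2)
sigma = (upper - mu) / z

# Confirm that the fitted normal puts the requested mass in the interval.
fitted = NormalDist(mu, sigma)
check = fitted.cdf(upper) - fitted.cdf(lower)
print(round(mu, 3), round(sigma, 3), round(check, 3))
```

The values agree with the mu=0.55 and sigma=0.297 reported by pz.maxent.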

pz.BetaScaled(
    alpha=2,
    beta=5,
    lower=300,
    upper=500,
).plot_pdf(pointinterval=True)

<Axes: >
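The scaled beta can be read as the ordinary beta stretched from $[0, 1]$ onto $[\text{lower}, \text{upper}]$. The sketch below assumes that linear mapping and shows where the mean and mode of the distribution plotted above end up. The helper `to_scaled` is hypothetical and not a PreliZ function.

```python
alpha, beta, lower, upper = 2, 5, 300, 500


def to_scaled(x):
    """Map a point x on [0, 1] onto [lower, upper] (assumed linear rescaling)."""
    return lower + (upper - lower) * x


# Mean and mode of the unit-interval Beta(alpha, beta); the mode formula
# holds when alpha > 1 and beta > 1.
unit_mean = alpha / (alpha + beta)
unit_mode = (alpha - 1) / (alpha + beta - 2)

scaled_mean = to_scaled(unit_mean)
scaled_mode = to_scaled(unit_mode)
print(scaled_mean, scaled_mode)
```

Under that assumption the peak of the curve should sit near 340 and the mean near 357, with the shape otherwise matching the Beta(2, 5) plot from earlier.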