
Adventures with PreliZ

Andy Maloney
Software engineer


import preliz as pz
warning

WARNING (pytensor.tensor.blas): Using NumPy C-API based implementation for BLAS functions.

I'll start by looking at continuous variables, specifically the beta distribution. It is a great one to start with because when its parameters $\alpha$ and $\beta$ both equal 1, we get a uniform distribution over the support. The beta distribution has the mathematical form

$$f(x\mid\alpha,\beta)=\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}x^{\alpha-1}\left(1-x\right)^{\beta-1}$$

where $\Gamma(\cdot)$ is the gamma function, and $\alpha$ and $\beta$ are parameters we can adjust. Below I have set $\alpha=\beta=1$, which gives us a uniform distribution over the support (x-axis) of the distribution.

pz.Beta(alpha=1, beta=1).plot_pdf(pointinterval=True)

<Axes: >

<Figure size 640x480 with 1 Axes>
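The flat line in the plot can also be checked straight from the formula. Here is a minimal sketch using only the Python standard library; the helper name beta_pdf is mine and is not part of PreliZ.

```python
import math


def beta_pdf(x, alpha, beta):
    """Beta density evaluated directly from the formula in the text."""
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * x ** (alpha - 1) * (1 - x) ** (beta - 1)


# With alpha = beta = 1 the normalising constant is 1 and both powers
# have exponent 0, so the density should be 1 at every point in (0, 1).
flat = [beta_pdf(x, 1, 1) for x in (0.1, 0.5, 0.9)]
print(flat)
```

Every value comes out as 1, which is the uniform density on the unit interval.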

Another choice of parameters for the beta distribution is shown below. The "box plot" shows the 50% highest density interval (thicker line) and the 95% highest density interval (thinner line). Another cool thing about the beta distribution is that as $\alpha\rightarrow 0$ and $\beta\rightarrow 0$, the probability mass piles up at the endpoints 0 and 1, so in the limit we get a Bernoulli distribution.

pz.Beta(alpha=2, beta=5).plot_pdf(pointinterval=True)

<Axes: >

<Figure size 640x480 with 1 Axes>
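The Bernoulli limit mentioned above can be made a little more concrete. The sketch below is my own check, not something PreliZ provides: it takes the symmetric case $\alpha=\beta$ and estimates, with a simple midpoint rule, how much probability is left in the middle of the support as both parameters shrink. The helper names and the choice of the interval (0.1, 0.9) are arbitrary.

```python
import math


def beta_pdf(x, alpha, beta):
    """Beta density evaluated directly from its formula."""
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * x ** (alpha - 1) * (1 - x) ** (beta - 1)


def middle_mass(a, lo=0.1, hi=0.9, n=10_000):
    """Midpoint-rule estimate of P(lo < x < hi) for Beta(a, a)."""
    width = (hi - lo) / n
    total = sum(beta_pdf(lo + (i + 0.5) * width, a, a) for i in range(n))
    return total * width


# Shrink alpha = beta towards 0 and watch the middle of the support empty out.
masses = {a: middle_mass(a) for a in (1.0, 0.1, 0.01)}
for a, m in masses.items():
    print(f"alpha = beta = {a}: mass in (0.1, 0.9) ~ {m:.3f}")
```

The mass in the middle drops as the parameters get smaller, which means the rest is crowding towards 0 and 1. In this symmetric case the two endpoints share the mass equally, so the limit looks like a Bernoulli distribution with probability 0.5.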

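One of the coolest things about PreliZ is that you can determine the parameters of a distribution by specifying an interval and the probability mass it should contain. maxent then returns the maximum entropy distribution of that family that satisfies the constraint.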

pz.maxent(distribution=pz.Beta(), lower=0.3, upper=0.8, mass=0.8)

(Beta(alpha=3.3, beta=2.8), <Axes: >)

<Figure size 640x480 with 1 Axes>

So what does this mean exactly? Well, if I had data between 0.3 and 0.8, I could choose a well-informed prior that covers the data but still assigns some probability to values beyond that range. To compare, look at what happens when the mass is set to 0.6.

pz.maxent(distribution=pz.Beta(), lower=0.3, upper=0.8, mass=0.8)
pz.maxent(distribution=pz.Beta(), lower=0.3, upper=0.8, mass=0.6)

(Beta(alpha=1.49, beta=1.4), <Axes: >)

<Figure size 640x480 with 1 Axes>

Comparing the two distributions shows that the one with mass 0.6 in the interval $[0.3, 0.8]$ assigns higher probability to values in $[0, 0.3]$ than the distribution with mass 0.8.
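To put rough numbers on that comparison, here is a small standard-library sketch that integrates both fitted densities numerically. It uses the rounded parameters printed above, so the masses only come out approximately equal to 0.8 and 0.6. The helper names are mine, not PreliZ functions.

```python
import math


def beta_pdf(x, alpha, beta):
    """Beta density evaluated directly from its formula."""
    norm = math.gamma(alpha + beta) / (math.gamma(alpha) * math.gamma(beta))
    return norm * x ** (alpha - 1) * (1 - x) ** (beta - 1)


def beta_mass(alpha, beta, lo, hi, n=20_000):
    """Midpoint-rule estimate of P(lo < x < hi) for Beta(alpha, beta)."""
    width = (hi - lo) / n
    total = sum(beta_pdf(lo + (i + 0.5) * width, alpha, beta) for i in range(n))
    return total * width


# Requested mass -> rounded (alpha, beta) reported by maxent in the post.
fits = {0.8: (3.3, 2.8), 0.6: (1.49, 1.4)}

results = {}
for target, (a, b) in fits.items():
    inside = beta_mass(a, b, 0.3, 0.8)  # mass in the requested interval
    below = beta_mass(a, b, 0.0, 0.3)   # mass to the left of it
    results[target] = (inside, below)
    print(f"mass={target}: inside ~ {inside:.3f}, below 0.3 ~ {below:.3f}")
```

The distribution fitted with mass 0.6 should show noticeably more probability below 0.3 than the one fitted with mass 0.8.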

pz.maxent(distribution=pz.Normal(), lower=0.3, upper=0.8, mass=0.6)

(Normal(mu=0.55, sigma=0.297), <Axes: >)

<Figure size 640x480 with 1 Axes>
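The normal case can be reasoned out by hand. The entropy of a normal grows with sigma, so I would expect the maximum entropy answer to be the widest normal that still keeps 60% of its mass in the interval, centred on the interval's midpoint. That is my own reading rather than something taken from the PreliZ documentation. Here is a minimal sketch with the standard library's statistics.NormalDist:

```python
from statistics import NormalDist

lower, upper, mass = 0.3, 0.8, 0.6

# Centre the normal on the interval; by symmetry each tail then holds
# (1 - mass) / 2, so the upper bound sits at the 0.5 + mass / 2 quantile.
mu = (lower + upper) / 2
z = NormalDist().inv_cdf(0.5 + mass / 2)
sigma = (upper - lower) / 2 / z

fitted = NormalDist(mu, sigma)
inside = fitted.cdf(upper) - fitted.cdf(lower)
print(f"mu = {mu:.2f}, sigma = {sigma:.3f}, mass inside = {inside:.3f}")
```

This gives mu = 0.55 and a sigma of about 0.297, in line with the parameters maxent printed above.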

pz.BetaScaled(
    alpha=2,
    beta=5,
    lower=300,
    upper=500,
).plot_pdf(pointinterval=True)

<Axes: >

<Figure size 640x480 with 1 Axes>
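As far as I understand it, BetaScaled is the same beta distribution stretched from the unit interval onto [lower, upper]. Assuming that plain shift-and-scale relationship holds, the sketch below maps the mean and mode of Beta(2, 5) onto the 300 to 500 range. The helper to_scaled is hypothetical and only for illustration.

```python
alpha, beta = 2, 5
lower, upper = 300, 500


def to_scaled(z):
    """Map a value z in [0, 1] onto the [lower, upper] support."""
    return lower + (upper - lower) * z


# Mean and mode of the ordinary Beta(alpha, beta) on [0, 1].
unit_mean = alpha / (alpha + beta)
unit_mode = (alpha - 1) / (alpha + beta - 2)  # valid when alpha, beta > 1

scaled_mean = to_scaled(unit_mean)
scaled_mode = to_scaled(unit_mode)
print(f"mean ~ {scaled_mean:.1f}, mode ~ {scaled_mode:.1f}")
```

Under that assumption the peak of the plotted density should sit near 340, with the mean a little to the right of it because of the long right tail.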