For a given pair of values, μ and Σ, in the sample, we can generate a simulated dataset.
The size of the simulated dataset is arbitrary, but should be large enough to generate a smooth distribution of P(A|B) and P(B|A).
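As a rough sketch of that simulation step for a single (μ, Σ) pair: the text here does not define the events A and B, so the code below uses purely hypothetical ones (A is "first coordinate above zero", B is "second coordinate above zero"). It also assumes a two-dimensional normal model. The function names, the example values of μ and Σ, and the dataset size are all illustrative assumptions.

```python
import math
import random


def sample_bivariate_normal(mu, sigma, n, rng):
    """Draw n points from a 2-D normal with mean mu and 2x2 covariance sigma."""
    (s11, s12), (_, s22) = sigma
    # Cholesky factor of the 2x2 covariance matrix, written out by hand.
    l11 = math.sqrt(s11)
    l21 = s12 / l11
    l22 = math.sqrt(s22 - l21 * l21)
    samples = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        samples.append((mu[0] + l11 * z1, mu[1] + l21 * z1 + l22 * z2))
    return samples


def conditional_frequencies(samples):
    """Estimate P(A|B) and P(B|A) for the hypothetical events A: x > 0, B: y > 0."""
    n_a = sum(1 for x, _ in samples if x > 0)
    n_b = sum(1 for _, y in samples if y > 0)
    n_ab = sum(1 for x, y in samples if x > 0 and y > 0)
    return n_ab / n_b, n_ab / n_a


rng = random.Random(0)            # fixed seed so the run is repeatable
mu = (0.0, 0.0)                   # assumed example mean
sigma = ((1.0, 0.5), (0.5, 1.0))  # assumed example covariance
samples = sample_bivariate_normal(mu, sigma, 20_000, rng)
p_a_given_b, p_b_given_a = conditional_frequencies(samples)
```

Repeating this for every (μ, Σ) pair in the sample gives one P(A|B) and one P(B|A) per pair. A larger simulated dataset reduces the noise in each of those estimates.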
The shape parameters α and β can be thought of as prior observations that I’ve made (or imagined).
This got me thinking: just how good is Cassandra Brown?
A beta prior is denoted Beta(α, β), and its two shape parameters, α and β, determine what it looks like.
I like to think of priors in terms of what kind of information they represent.
A prior and likelihood are said to be conjugate when the resulting posterior distribution is the same type of distribution as the prior.
This means that if you have binomial data, you can use a beta prior to obtain a beta posterior.
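Here is a minimal sketch of that conjugate update. With a Beta(α, β) prior and binomial data, the posterior is Beta(α + successes, β + failures), which is why the shape parameters read naturally as prior observations. The function names, the prior values, and the counts are made up for illustration.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Return the posterior Beta shape parameters after observing binomial data."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if successes < 0 or failures < 0:
        raise ValueError("counts must be non-negative")
    # Conjugacy: the observed counts are simply added to the prior "pseudo-counts".
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


prior = (2.0, 2.0)  # hypothetical prior: two imagined successes, two imagined failures
posterior = beta_binomial_update(*prior, successes=7, failures=3)
posterior_mean = beta_mean(*posterior)
```

With these made-up numbers the posterior is Beta(9, 5). Its mean of 9/14 sits between the prior mean of 0.5 and the observed success rate of 0.7.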