# Marginal distribution


In probability theory, given a joint probability density function of two parameters x and y, the marginal distribution of x is the probability distribution of x after information about y has been averaged out. From a Bayesian probability perspective, the joint probability density can be regarded as a joint inference about the true values of the two parameters, and the marginal distribution of (say) x as our inference about x after the uncertainty about y has been averaged out. In this case, y is treated as a nuisance parameter.

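For continuous probability densities, the marginal probability density function of x can be written as $m_y(x)$, such that

$$
m_y(x) = \int p(x,y)\,dy = \int c(x|y)\,p_y(y)\,dy
$$

where $p(x,y)$ gives the joint distribution of x and y, $c(x|y)$ gives the conditional distribution for x given y, and $p_y(y)$ denotes the marginal density of y. Note that the marginal distribution has the form of an expectation: it is the conditional density $c(x|y)$ averaged over the distribution of y.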
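As a rough illustration of the integral over y, the sketch below approximates a marginal density numerically. The joint density $p(x,y) = x + y$ on the unit square, the function names, and the midpoint-rule step count are all assumptions chosen for the example; none of them come from the article. For this density the exact marginal is $x + 1/2$, which gives something to compare against.

```python
def joint_density(x, y):
    """Hypothetical joint density p(x, y) = x + y on the unit square, zero elsewhere."""
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        return x + y
    return 0.0


def marginal_density_x(x, n=1000):
    """Approximate the integral of p(x, y) over y in [0, 1] with the midpoint rule."""
    h = 1.0 / n  # width of each sub-interval in y
    return h * sum(joint_density(x, (i + 0.5) * h) for i in range(n))


# Compare the numerical marginal with the exact value x + 1/2 at a few points.
for x in (0.0, 0.3, 1.0):
    print(x, marginal_density_x(x), x + 0.5)
```

The midpoint rule is used here only because it is short to write; any quadrature routine would serve the same purpose.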

For a discrete probability mass function, the marginal probability for the kth value of x can be written as $p_k$, such that

$$
p_k = \sum_j p_{k,j}
$$

where the index j runs over all values of the discrete variable y. With k fixed and $p_{k,j}$ regarded as a matrix, this amounts to summing over all columns in the kth row. Similarly, the marginal mass function for y is obtained by summing over all rows in a particular column. Once $p_k$ has been determined in this way for every k, the set of $p_k$ constitutes the discrete probability mass function for the relevant discrete values of x, here calculated as a marginal mass function from the original joint probability mass function.
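The row and column sums described above can be sketched in a few lines. The joint table below is a made-up example (two values of x, three values of y), and the function names are hypothetical; exact fractions are used so that the totals can be checked without rounding error.

```python
from fractions import Fraction

# Hypothetical joint pmf: joint[k][j] is the probability of the kth x value
# together with the jth y value. Rows index x, columns index y.
joint = [
    [Fraction(1, 10), Fraction(2, 10), Fraction(1, 10)],
    [Fraction(3, 10), Fraction(1, 10), Fraction(2, 10)],
]


def marginal_x(p):
    """Marginal pmf of x: sum across the columns of each row."""
    return [sum(row) for row in p]


def marginal_y(p):
    """Marginal pmf of y: sum down the rows of each column."""
    return [sum(col) for col in zip(*p)]


px = marginal_x(joint)  # [2/5, 3/5]
py = marginal_y(joint)  # [2/5, 3/10, 3/10]
print(px, py, sum(px), sum(py))
```

Both marginals sum to one, as they must if the joint table itself sums to one.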