# Difference between revisions of "Marginal distribution"

m |
DavidB4-bot (Talk | contribs) (→top: Spelling/Grammar Check, typos fixed: For example → For example,) |

(2 intermediate revisions by 2 users not shown) | |||

Line 1: | Line 1: | ||

− | In [[probability theory]], given a [[joint probability density function]] of two parameters or variables ''x'' and ''y'', the '''marginal distribution''' of ''x'' is the [[probability density function]] of ''x'' after information about ''y'' has been averaged out. For example from a [[Bayesian probability]] perspective, when doing [[parameter estimation]] we can consider the [[joint probability density function]] as a joint inference which characterizes our uncertainty about the true values of the two parameters, and the marginal distribution of (say) ''x'' as our inference about ''x'' after the uncertainty about ''y'' had been averaged out. We can say that, in this case, we are considering ''y'' as a [[nuisance parameter]]. | + | In [[probability theory]], given a [[joint probability density function]] of two parameters or variables ''x'' and ''y'', the '''marginal distribution''' of ''x'' is the [[probability density function]] of ''x'' after information about ''y'' has been averaged out. For example, from a [[Bayesian probability]] perspective, when doing [[parameter estimation]] we can consider the [[joint probability density function]] as a joint inference which characterizes our uncertainty about the true values of the two parameters, and the marginal distribution of (say) ''x'' as our inference about ''x'' after the uncertainty about ''y'' had been averaged out. We can say that, in this case, we are considering ''y'' as a [[nuisance parameter]]. |

− | For a continuous [[probability density function]] (pdf), an associated marginal pdf can be written as ''m''<sub>''y''</sub>(''x''). Such that | + | For a continuous [[probability density function]] (pdf), an associated marginal pdf can be written as ''m''<sub>''y''</sub>(''x''|''I''). Such that |

− | ::<math>m_{y}(x) = \int_y p(x,y) \, dy = \int_y | + | ::<math>m_{y}(x|I) = \int_y p(x,y|I) \, dy = \int_y p(x|yI) \, p(y|I) \, dy </math> |

− | where ''p''(''x'',''y'') gives the [[joint probability density function]] of ''x'' and ''y'', and '' | + | where ''p''(''x'',''y''|I) gives the [[joint probability density function]] of ''x'' and ''y'', and ''p''(''x''|''yI'') gives the [[conditional probability density function]] for ''x'' given ''y''. The second integral was formulated by use of the [[Bayesian product rule]]. Note that the marginal distribution has the form of an [[expectation value]] with respect to the [[prior distribution]] of y. This marginalization procedure readily generalizes for an arbitrary number of variables or parameters. Also note that all densities are considered conditioned on common background or prior information ''I''. |

For a discrete [[probability mass function]] (pmf), the marginal probability for x<sub>k</sub> can be written as ''p''<sub>''k''</sub> Such that | For a discrete [[probability mass function]] (pmf), the marginal probability for x<sub>k</sub> can be written as ''p''<sub>''k''</sub> Such that | ||

Line 13: | Line 13: | ||

where the ''j'' index spans all indices of the discrete ''y''. The notation ''p''<sub>''kj''</sub> here means the joint probability value when ''x'' has the value ''x''<sub>k</sub> and ''y'' has the value ''y''<sub>j</sub> while ''p''<sub>''k|j''</sub> here references the conditional probability value for ''x''<sub>k</sub> for y fixed at the value ''y''<sub>j</sub>. With ''k'' fixed in the above summation and ''p''<sub>''k'',''j''</sub> considered as a matrix, this can be thought of as summing over all columns in the k<sup>th</sup> row. Similarly, the marginal mass function for ''y''<sub>j</sub> (say ''q''<sub>''j''</sub>) can be computed by summing over all rows in column ''j''. When all of the ''p''<sub>''k''</sub> are determined this way for all k, this set of ''p''<sub>''k''</sub> constitute the pmf for the all relevant discrete values of ''x'', in this particular case calculated as a marginal mass function from an original joint probability mass function. | where the ''j'' index spans all indices of the discrete ''y''. The notation ''p''<sub>''kj''</sub> here means the joint probability value when ''x'' has the value ''x''<sub>k</sub> and ''y'' has the value ''y''<sub>j</sub> while ''p''<sub>''k|j''</sub> here references the conditional probability value for ''x''<sub>k</sub> for y fixed at the value ''y''<sub>j</sub>. With ''k'' fixed in the above summation and ''p''<sub>''k'',''j''</sub> considered as a matrix, this can be thought of as summing over all columns in the k<sup>th</sup> row. Similarly, the marginal mass function for ''y''<sub>j</sub> (say ''q''<sub>''j''</sub>) can be computed by summing over all rows in column ''j''. When all of the ''p''<sub>''k''</sub> are determined this way for all k, this set of ''p''<sub>''k''</sub> constitute the pmf for the all relevant discrete values of ''x'', in this particular case calculated as a marginal mass function from an original joint probability mass function. | ||

− | [[Category: | + | [[Category:Mathematics]] |

## Latest revision as of 09:05, 27 July 2016

In probability theory, given a joint probability density function of two parameters or variables *x* and *y*, the **marginal distribution** of *x* is the probability density function of *x* after information about *y* has been averaged out. For example, from a Bayesian probability perspective, when doing parameter estimation we can regard the joint probability density function as a joint inference that characterizes our uncertainty about the true values of the two parameters, and the marginal distribution of (say) *x* as our inference about *x* after the uncertainty about *y* has been averaged out. In this case, we say that we are treating *y* as a nuisance parameter.

For a continuous probability density function (pdf), the associated marginal pdf can be written as *m*_{y}(*x*|*I*), such that

$$m_{y}(x|I) = \int_y p(x,y|I) \, dy = \int_y p(x|yI) \, p(y|I) \, dy$$

where *p*(*x*,*y*|*I*) gives the joint probability density function of *x* and *y*, and *p*(*x*|*yI*) gives the conditional probability density function for *x* given *y*. The second integral follows from the Bayesian product rule, *p*(*x*,*y*|*I*) = *p*(*x*|*yI*) *p*(*y*|*I*). Note that the marginal distribution has the form of an expectation value of *p*(*x*|*yI*) with respect to the prior distribution of *y*, *p*(*y*|*I*). This marginalization procedure readily generalizes to an arbitrary number of variables or parameters. Note also that all densities are conditioned on common background or prior information *I*.
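As a rough numerical illustration of the marginalization integral, the sketch below averages a conditional density over a prior using the trapezoidal rule. The Gaussian model (a standard normal prior on *y*, and *x* given *y* normal with mean *y* and unit variance) and all function names are assumptions chosen for this example only; they do not come from the article. Under that assumed model the marginal of *x* is known in closed form (normal with mean 0 and variance 2), which gives something to compare against.

```python
import math


def normal_pdf(z, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(z - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def prior_y(y):
    """Assumed prior p(y|I): standard normal (illustrative choice)."""
    return normal_pdf(y, 0.0, 1.0)


def conditional_x_given_y(x, y):
    """Assumed conditional p(x|yI): normal with mean y, variance 1."""
    return normal_pdf(x, y, 1.0)


def marginal_x(x, y_min=-10.0, y_max=10.0, n=2001):
    """Approximate m_y(x|I) = integral of p(x|yI) p(y|I) dy by the trapezoidal rule."""
    h = (y_max - y_min) / (n - 1)
    total = 0.0
    for i in range(n):
        y = y_min + i * h
        # End points get half weight in the trapezoidal rule.
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * conditional_x_given_y(x, y) * prior_y(y)
    return total * h


if __name__ == "__main__":
    x = 0.5
    print("numerical marginal:", marginal_x(x))
    print("closed form N(0, 2):", normal_pdf(x, 0.0, 2.0))
```

The integration limits and grid size are arbitrary defaults that happen to be ample for this smooth, rapidly decaying integrand; a different joint density would need its own choice of range and resolution.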

For a discrete probability mass function (pmf), the marginal probability for *x*_{k} can be written as *p*_{k}, such that

$$p_{k} = \sum_{j} p_{kj} = \sum_{j} p_{k|j} \, q_{j}$$

where the *j* index spans all indices of the discrete *y* and *q*_{j} denotes the marginal probability of *y*_{j}. The notation *p*_{kj} here means the joint probability value when *x* has the value *x*_{k} and *y* has the value *y*_{j}, while *p*_{k|j} is the conditional probability value for *x*_{k} with *y* fixed at the value *y*_{j}. With *k* fixed in the above summation and *p*_{kj} considered as a matrix, this can be thought of as summing over all columns in the *k*^{th} row. Similarly, the marginal mass function for *y*_{j}, *q*_{j}, can be computed by summing over all rows in column *j*. When the *p*_{k} are determined this way for all *k*, the set of *p*_{k} constitutes the pmf for all relevant discrete values of *x*, in this case calculated as a marginal mass function from an original joint probability mass function.
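The row and column sums described above can be sketched in a few lines of Python. The 2-by-3 joint table below is made-up data for illustration, with rows indexed by *k* (values of *x*) and columns by *j* (values of *y*); the variable names are likewise assumptions of this example.

```python
# Hypothetical joint pmf p_kj: rows are x_k, columns are y_j; entries sum to 1.
joint = [
    [0.10, 0.20, 0.10],
    [0.30, 0.10, 0.20],
]

# Marginal pmf of x: for each fixed row k, sum over all columns j.
p = [sum(row) for row in joint]

# Marginal pmf of y: for each fixed column j, sum over all rows k.
q = [sum(col) for col in zip(*joint)]

# Conditional probabilities p_{k|j} = p_kj / q_j (every q_j is nonzero here).
cond = [[joint[k][j] / q[j] for j in range(len(q))] for k in range(len(p))]

# The same marginal of x, recovered as a q-weighted average of the conditionals.
p_from_cond = [
    sum(cond[k][j] * q[j] for j in range(len(q))) for k in range(len(p))
]

if __name__ == "__main__":
    print("p_k:", p)
    print("q_j:", q)
    print("p_k via conditionals:", p_from_cond)
```

Up to floating-point rounding, both routes give the same marginal for *x*, and each marginal sums to 1.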