To realise the ambition of integrating possibilistic and p-box arithmetic described on this website, there are two reciprocal questions to resolve:
What do specified moments, i.e., the mean and variance, imply about an uncertain number?
How can we compute the mean and variance, or range of these moments, for a specified uncertain number?
An uncertain number is an interval, a possibility distribution, a probability distribution, a p-box, a data set consisting of (zero-variance) intervals, a Dempster-Shafer structure composed of intervals, or a data set having uncertain numbers as elements. The differences among the various kinds of uncertain numbers, especially with respect to the calculation of the upper bound on variance, are perhaps surprisingly subtle.
What do mean & variance imply about an interval range?
Nothing. Specifying mean and variance does not constrain the range of an uncertain number.
What do mean & variance imply about a possibility distribution?
<<>>
What do mean & variance imply about a p-box?
Risk Calc's meanvariance function, which is based on the classical Chebyshev inequality (Allen 1990, page 79), and the associated mmmv and unimmmv functions developed from it using additional information about the range and mode, answer this question. The algorithm is available in downs.cpp. The pictures below illustrate the implications.
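The code in downs.cpp is not reproduced here, but the pointwise Chebyshev envelope underlying such a construction can be sketched as follows (the function name and interface are illustrative, not Risk Calc's API):

```python
def chebyshev_pbox(mu, var, x):
    """Pointwise bounds [F_lo(x), F_hi(x)] on the distribution function of
    any random variable with mean mu and variance var, from the Chebyshev
    inequality P(|X - mu| >= k) <= var / k**2."""
    if x < mu:
        return 0.0, min(1.0, var / (mu - x) ** 2)
    if x > mu:
        return max(0.0, 1.0 - var / (x - mu) ** 2), 1.0
    return 0.0, 1.0  # at the mean itself the CDF is unconstrained

# For mean 0 and variance 1, at x = 2 the CDF is at least 0.75:
# chebyshev_pbox(0, 1, 2) -> (0.75, 1.0)
```

Evaluating these bounds over a grid of x values traces out the two edges of the Chebyshev p-box; the mmmv and unimmmv refinements additionally exploit the range and mode to pinch the edges further.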
What does an interval range imply about the mean & variance of a random variable?
The bounds on the mean are the same as the interval range itself. The bounds on the variance are zero and the square of the interval's width divided by four; that is, for an interval range [a, b], the variance is surely in the interval [0, (b−a)²/4].
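In code, these bounds are immediate (a minimal sketch; the function name is ours, not Risk Calc's):

```python
def interval_moments(a, b):
    """Bounds on the moments of a random variable known only to lie in
    [a, b]: the mean can be anywhere in the interval, and the variance is
    at most (b - a)**2 / 4, attained by the 50/50 two-point distribution
    concentrated on the two endpoints."""
    return (a, b), (0.0, (b - a) ** 2 / 4.0)
```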
What does a possibility distribution imply about the mean & variance?
<<>>
What does a p-box imply about the mean & variance?
By construction, a p-box may carry explicit information about the moments. The question here is what the left and right edges of a p-box, its bounding distribution functions, imply about the mean and variance. The bounds on the mean are found by computing the means of the left and right distributions separately; these form an interval that is always the best possible bound on the mean given only the p-box edges. The bounds on the variance are attained by distributions that follow one edge of the p-box up to a jump point and the other edge thereafter; the jump point can be identified by optimisation.
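As a sketch of this computation, assume the two edges are discretised into n equiprobable quantiles, with the left edge giving the smaller quantile at each level; the function name, and the endpoint scan used for the minimum, are our own simplifications:

```python
import statistics

def pbox_moment_bounds(left_q, right_q):
    """left_q[i] and right_q[i] are the quantiles of the left (upper-CDF)
    and right (lower-CDF) edges at level (i + 0.5)/n, so left_q[i] <=
    right_q[i].  Returns bounds on the mean and variance over all
    distributions whose quantile functions lie between the two edges."""
    n = len(left_q)
    mean_lo = statistics.fmean(left_q)   # left edge has the smaller quantiles
    mean_hi = statistics.fmean(right_q)

    # Maximum variance: follow the left edge up to a jump point k, then
    # the right edge; scan every possible jump point.
    var_hi = max(statistics.pvariance(left_q[:k] + right_q[k:])
                 for k in range(n + 1))

    # Minimum variance: squeeze the quantile function toward a constant c,
    # clamped between the edges; scanning c over the edge quantiles is an
    # approximation (a finer search over c would tighten it).
    def clamped_var(c):
        return statistics.pvariance(
            [min(max(c, lo), hi) for lo, hi in zip(left_q, right_q)])
    var_lo = min(clamped_var(c) for c in left_q + right_q)

    return (mean_lo, mean_hi), (var_lo, var_hi)
```

For a degenerate p-box whose edges coincide, both moment intervals collapse to the precise mean and variance, as they should.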
What does a collection of zero-variance intervals imply about the mean & variance?
The bounds on the mean are given by the means of the left endpoints and of the right endpoints respectively. Elementary interval statistics provides algorithms for computing bounds on the variance of data sets containing intervals. See <<Marco's Github site>>.
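A brute-force sketch of such an algorithm for small data sets follows; exact and efficient algorithms are in the interval-statistics literature, and the fixed-point step used for the minimum is our own shortcut:

```python
import itertools
import statistics

def interval_data_moments(data):
    """data is a list of (a, b) intervals, each an imprecisely measured
    point value.  Returns bounds on the sample mean and on the population
    variance over all point configurations consistent with the data."""
    mean_lo = statistics.fmean(a for a, _ in data)
    mean_hi = statistics.fmean(b for _, b in data)

    # Variance is convex in the data points, so its maximum over the box
    # of configurations is attained at a vertex: brute-force enumeration
    # (exponential in len(data) -- fine for a sketch only).
    var_hi = max(statistics.pvariance(v) for v in itertools.product(*data))

    # Minimum variance: pull every point as close as possible to a common
    # value c; at the optimum, c equals the mean of the clamped values,
    # so iterate that as a fixed point.
    def clamped(c):
        return [min(max(c, a), b) for a, b in data]
    c = statistics.fmean(statistics.fmean(ab) for ab in data)
    for _ in range(100):
        c = statistics.fmean(clamped(c))
    var_lo = statistics.pvariance(clamped(c))

    return (mean_lo, mean_hi), (var_lo, var_hi)
```

Note that whenever all the intervals share a common point, the lower variance bound is zero, since every datum could have been that one value.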
What does a Dempster-Shafer structure imply about the mean & variance?
Although the mean for a Dempster-Shafer structure consisting of interval focal elements is the same as the mean of a collection of zero-variance intervals, the variance is decidedly different for these two kinds of uncertain number. The reason is that the intervals composing the focal elements are not necessarily zero-variance, that is to say, they may represent distributions rather than imprecisely known point values. A Dempster-Shafer structure is a mixture of intervals. The variance of a mixture is ∑ wᵢσᵢ² + ∑ wᵢμᵢ² − (∑ wᵢμᵢ)², where the μᵢ and σᵢ² are the means and variances of the respective focal elements, and the wᵢ are the associated basic probability masses. The variance for each interval [a, b] is [0, (b−a)²/4]. This formulation has many repeated uncertain quantities, so its computation is a bit messy, but the result will generally be wider than the analogous variance for the intervals interpreted to have zero variance.
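To make the messiness concrete, here is a sketch that bounds the mixture variance from the formula above, treating each μᵢ ∈ [aᵢ, bᵢ] and σᵢ² ∈ [0, (bᵢ−aᵢ)²/4] as if they varied independently; because it ignores the dependence between a focal element's mean and its largest attainable variance, the upper bound is conservative. The function name and representation are ours:

```python
import itertools

def ds_variance_bounds(focal):
    """focal is a list of ((a, b), w) pairs: interval focal elements with
    basic probability masses summing to 1.  Bounds the variance of the
    mixture when each focal element may represent any distribution on its
    interval (mean in [a, b], variance in [0, (b - a)**2 / 4])."""
    ws = [w for _, w in focal]

    def mix_part(mus):  # sum w*mu^2 - (sum w*mu)^2, the between-group spread
        m = sum(w * mu for w, mu in zip(ws, mus))
        return sum(w * mu * mu for w, mu in zip(ws, mus)) - m * m

    # Upper bound: within-group variances at their maxima, plus the
    # between-group part, which is convex in the mus and so is maximised
    # at a vertex of the box of possible means (brute-force enumeration).
    var_hi = sum(w * (b - a) ** 2 / 4.0 for (a, b), w in focal) + max(
        mix_part(mus) for mus in itertools.product(*(ab for ab, _ in focal)))

    # Lower bound: within-group variances zero, and the means pulled as
    # close together as possible (clamp to a common value, fixed point).
    c = sum(w * (a + b) / 2.0 for (a, b), w in focal)
    for _ in range(100):
        c = sum(w * min(max(c, a), b) for (a, b), w in focal)
    var_lo = mix_part([min(max(c, a), b) for (a, b), _ in focal])

    return var_lo, var_hi
```

Compare this with the zero-variance treatment of the same intervals: the within-group term ∑ wᵢσᵢ² vanishes there, so the Dempster-Shafer bounds come out wider, as the text predicts.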
It might be helpful to read Bill Huber's discussion of the variance of mixtures on StackExchange, and the Applied Probability and Statistics blog post on the variance of a mixture.