Research

We welcome collaborations and discussions with independent researchers. Below is a list of the research and development issues surrounding confidence boxes that we feel need addressing. The list includes questions about theory, but also about computational algorithms and their use in practice. This list will evolve as the subject matures.

Can prior information, perhaps encoded as p-boxes computed using robust Bayes methods, be incorporated into calculations involving c-boxes in a way that preserves the confidence interpretation?
How can constraint information about distributions, other than merely their family shape, be incorporated into the inference?
How can we handle computations involving c-boxes when we cannot assume independence among the arguments? Do the Fréchet and other dependency convolution formulas of probability bounds analysis work in the same convenient way that the independence formula works?
1. There are, in fact, two issues here: dependence among the theta values (which is analogous to stochastic dependence among random variables), but also dependence in the data on which the inferences are based. Surely, the first arises all the time, as in estimating both mu and sigma from iid normal deviates.
What is the effect of repetitions of c-boxes in calculations? Is there any consequence on the confidence interpretation? Presumably, this is just an example of dependence among theta values (see the note under the dependence question above).
Are the calculations overly conservative (too wide) in practice? Are there situations in which the conservatism is stronger?

1. Michael says yes. In particular, if you use a fuzzy (consonant) structure that is conservative with respect to a (precise) confidence distribution (CD) instead of that CD, you'll get puffiness. And if you use a p-box that is conservative with respect to the consonant structure rather than that consonant structure itself, you'll create more puffiness. A is conservative with respect to B if and only if the confidence in any set under A is no higher than the confidence under B, and the plausibility under A is no lower than the plausibility under B. (See the pictures in "12 fuzzyinfogap.ppt".)
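The definition of "conservative" above can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: a structure is discretized as a list of focal intervals with masses, "confidence" is read as the mass of focal intervals lying entirely inside a set, and "plausibility" as the mass of focal intervals that touch it. The names `confidence`, `plausibility` and `widen`, and the example numbers, are hypothetical and do not come from any existing library.

```python
# A structure is a list of ((lower, upper), mass) focal intervals.
# This discretization is an assumption made for illustration only.

def confidence(structure, lo, hi):
    """Mass of the focal intervals that lie entirely inside [lo, hi]."""
    return sum(m for (a, b), m in structure if lo <= a and b <= hi)

def plausibility(structure, lo, hi):
    """Mass of the focal intervals that intersect [lo, hi]."""
    return sum(m for (a, b), m in structure if a <= hi and b >= lo)

def widen(structure, pad):
    """Pad every focal interval on both sides, which yields a puffier structure."""
    return [((a - pad, b + pad), m) for (a, b), m in structure]

# Hypothetical structure B with four equal-mass focal intervals.
B = [((0.0, 1.0), 0.25), ((1.0, 2.0), 0.25),
     ((2.0, 3.0), 0.25), ((3.0, 4.0), 0.25)]
A = widen(B, 0.5)  # A is conservative with respect to B

query = (0.0, 1.9)
conf_A, conf_B = confidence(A, *query), confidence(B, *query)
pl_A, pl_B = plausibility(A, *query), plausibility(B, *query)

print("confidence   A, B:", conf_A, conf_B)
print("plausibility A, B:", pl_A, pl_B)
```

For this query set the confidence under A drops and the plausibility under A rises relative to B, so the gap between the two bounds widens. That widening is the puffiness described above.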

We need expanded theory, or perhaps just computational work-arounds, to escape the limitation that multiplication and division of c-boxes only preserve the confidence interpretation when the factors don't straddle zero. A similar limitation constrains the use of the Fréchet (ρ,τ-)convolution under the theorem of Frank, Nelsen and Schweizer.

1. Min-ge Xie's counterexample says we can't simply multiply or divide c-boxes when both straddle zero.
2. Michael attributes this outcome to "seed space scrambling". There is a legitimate confidence structure in the numerical result, but when we try to plot it, we end up scrambling it. Michael promises (May 2015) a paper on this soon. What we're seeing is that the answer is not a CD, nor even a p-box, but rather a Dempster-Shafer structure (DSS), or actually a random set (RS), that cannot be plotted nicely. However, it can still be represented (approximately or conservatively) in a computer, and more to the point, you can still harvest confidence intervals from it because it has the confidence interpretation.
  1. You may be able to transform the answer to a (necessarily puffy) p-box to make it pretty, but this would surely lose information. You can also keep it as a DSS/RS which will allow you to continue using the result in calculations without loss of information, but it is really ugly and hard to display.
3. This seed space scrambling happens whenever the transformation is non-monotone. Examples include Min-ge Xie's multiplication or division of structures that both straddle zero, and the square of a structure that straddles zero.
4. In fact, the use of c-boxes to propagate confidence about being within some interval is not affected by this. Only the use of c-boxes to derive confidence intervals is affected.
5. For instance, if we are interested in the probability of being in some 'danger' region of parameter space (where the missile blows up prematurely or the species goes extinct), then we can compute the confidence associated with that event, and the seed scrambling doesn't matter to us. On the other hand, if we want to estimate the value of some parameter with a confidence interval (or a c-box), then we have to go back to the original seed space, select a confidence level and a range on the seed-space axis, and project it through the monotone function.
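The squaring case mentioned above can be sketched in a few lines. This is not the seed-space argument itself, only an illustration under assumptions: the structure is discretized into equal-mass focal intervals, and the function names and numbers are hypothetical. It shows that mapping interval endpoints fails for a non-monotone function when the interval straddles zero, and that bounds on the confidence of a 'danger' event can still be read off once each focal interval is mapped to its exact image.

```python
def endpoint_square(a, b):
    """Naive endpoint mapping; only valid when the map is monotone on [a, b]."""
    return tuple(sorted((a * a, b * b)))

def square_image(a, b):
    """Exact image of [a, b] under x -> x*x, including the straddle-zero case."""
    lo = 0.0 if a <= 0.0 <= b else min(a * a, b * b)
    return (lo, max(a * a, b * b))

def event_bounds(structure, lo, hi):
    """Lower and upper bounds on the mass assigned to the event [lo, hi]."""
    lower = sum(m for (a, b), m in structure if lo <= a and b <= hi)
    upper = sum(m for (a, b), m in structure if a <= hi and b >= lo)
    return lower, upper

# Endpoint mapping loses the part of the image near zero.
naive = endpoint_square(-1.0, 2.0)  # (1.0, 4.0), misses values below 1
exact = square_image(-1.0, 2.0)     # (0.0, 4.0)

# Hypothetical structure whose focal intervals straddle zero.
structure = [((-1.0, 0.0), 0.25), ((-0.5, 0.5), 0.25),
             ((0.0, 1.0), 0.25), ((1.0, 2.0), 0.25)]

# Push each focal interval through the non-monotone map.
squared = [(square_image(a, b), m) for (a, b), m in structure]

# Hypothetical 'danger' event: the squared quantity is at least 1.
danger_lower, danger_upper = event_bounds(squared, 1.0, float("inf"))

print("naive vs exact image:", naive, exact)
print("bounds on danger event:", danger_lower, danger_upper)
```

Note that the focal intervals (-1, 0) and (0, 1) map to the same image, so the left-to-right ordering of the original structure is lost after squaring. The event bounds are unaffected by that loss of ordering.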

How can c-boxes be used in deconvolutions and backcalculations?
We need some good example applications that illustrate c-boxes in important problems to show their advantages or disadvantages compared to traditional approaches that ignore inferential uncertainty as well as other frequentist and Bayesian strategies that characterize inferential uncertainty in different ways. We think applications to reliability calculations in engineering and to the "number needed to treat (NNT)" quotients in medicine would be interesting.
C-boxes should be implemented in Risk Calc so as to make the methods conveniently accessible to analysts in a high-level programming environment. This implementation effort should include our robust Bayes library which includes complementary approaches.
