Where do the consonant structures / possibility distributions come from? What empirical information would support different possibility distributions for the same quantity, and how should they be aggregated? What empirical data, assumptions, or evidence justifies a consonant structure as opposed to a non-consonant one?
There are multiple avenues to construct these performance-based structures for statistical inference:
direct schemes for converting information and uncertainty encoded in other structures as suggested by Dominik Hose (2022),
strategies for parameters of discrete distributions suggested by Alexander Wimbush (2023; cf. Balch 2020),
generic approaches building on random sets suggested by Ryan Martin (<<>>),
approaches using likelihood ratios developed by Michael Balch (2020), and
a brute-force strategy (also suggested by Balch) based on comparing the observed distance of data from a hypothetical distribution to the typical distance of random samples from that distribution (https://sites.google.com/site/cboxbinomial).
These approaches are introduced and illustrated below.
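As a concrete sketch of the brute-force strategy, the snippet below computes a possibility contour for a binomial rate θ. It is an illustration under assumed choices, not Balch's exact procedure: the distance measure |k − nθ| and the Monte Carlo comparison (rather than exact tail probabilities) are our simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(12345)

def possibility(theta, k_obs, n, reps=5000):
    """Brute-force possibility of rate theta given k_obs successes in n trials.

    Compares the observed distance of the data from the hypothesized
    binomial(n, theta) distribution to the typical distances of random
    samples drawn from that same distribution.
    """
    d_obs = abs(k_obs - n * theta)            # observed distance from the hypothesis
    ks = rng.binomial(n, theta, size=reps)    # random samples under the hypothesis
    # fraction of samples at least as far from the hypothesis as the data
    return float(np.mean(np.abs(ks - n * theta) >= d_obs))

# the contour peaks (at possibility one) at the observed rate k_obs / n
# and falls toward zero for rates that make the data atypical
```

Evaluating `possibility` over a grid of θ values traces out the consonant structure.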
Hose (2022) argues that several consonant structures can be constructed immediately from basic knowledge or other uncertainty characterizations (see the examples in the illustration below). An interval, for instance, immediately implies the consonant structure that is zero for all values below the left endpoint of the interval or above its right endpoint, and one for all values inside the interval. A nonnegative quantity about which we know only the mean m has a consonant structure that is zero for negative values and min(1, m/v) for positive values v (ibid., page 76f). More generally, for any quantity for which both the mean m and standard deviation s are known, the expression min(1, s²/(v−m)²) defines a consonant structure. Finally, the dunno logical value is the consonant structure for the interval [0,1], and the anynumber (completely unknown real) value is the structure for the interval (−∞, ∞), or perhaps its closure in the extended reals [−∞, ∞]. Hose also describes ways to obtain consonant structures directly from Dempster−Shafer structures on the real line, belief functions, p-boxes, and simple knowledge about any stochastically dominating function using his unfortunately named Imprecise-Probability-to-Possibility Transform (ibid., 71ff).
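These elementary constructions are easy to compute. The sketch below (the function names are ours, chosen for illustration) encodes the interval, known-mean, and known-mean-and-standard-deviation structures as possibility functions over values v:

```python
import numpy as np

def interval_structure(v, lo, hi):
    """Possibility one inside [lo, hi], zero outside."""
    v = np.asarray(v, dtype=float)
    return np.where((lo <= v) & (v <= hi), 1.0, 0.0)

def mean_structure(v, m):
    """Nonnegative quantity with known mean m: zero for v < 0, min(1, m/v) for v > 0."""
    v = np.asarray(v, dtype=float)
    with np.errstate(divide="ignore"):
        pos = np.minimum(1.0, m / v)
    return np.where(v < 0, 0.0, np.where(v == 0, 1.0, pos))

def mean_sd_structure(v, m, s):
    """Known mean m and standard deviation s: min(1, s**2 / (v - m)**2)."""
    v = np.asarray(v, dtype=float)
    with np.errstate(divide="ignore"):
        return np.minimum(1.0, s**2 / (v - m)**2)
```

Each function returns possibility values in [0, 1]; plotting them over a grid of v reproduces the structures described above.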
Do we have some examples where we can demonstrate a computational advantage to getting the consonant structure from a p-box if one already has the p-box? I believe Ander said we have some. If so, let's add some graphical illustrations below Dominik's slide and maybe append the following text (or an abbreviation of it) to the paragraph above. If not, is it clear whether or not it is ever advantageous to compute the consonant structure from a p-box?
We can always make consonant structures from p-boxes, and p-boxes can come from many different empirical situations:
Distribution-free p-boxes (specified by statistics like mean, mode, variance, range, etc., and properties like symmetry, unimodality, etc., from literature reports or opinions),
P-boxes from imprecise measurements (under maximum likelihood, method of matching moments, or Bayesian analyses),
Composition of c-boxes through sampling models (assumed shape plus iid sample data),
Distribution-free c-boxes (iid sample data),
Distributional p-boxes (assumed distribution shapes with assumed interval parameters),
Envelopes of alternative precise distributions,
P-boxes from previous model or calculation results or simplifying Dempster-Shafer structures or credal sets,
Confidence bands (e.g., KS bands at some confidence level),
Bounds on probability densities,
Coverage intervals.
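For instance, a confidence-band p-box can be computed directly from sample data. The sketch below uses the Dvoretzky–Kiefer–Wolfowitz inequality to put a distribution-free band around the empirical CDF; the function name and the choice of the DKW critical value (rather than exact KS quantiles) are our illustrative assumptions.

```python
import numpy as np

def dkw_pbox(data, alpha=0.05):
    """Distribution-free confidence band around the empirical CDF.

    Returns sorted data values with lower and upper CDF bounds that hold
    simultaneously with confidence 1 - alpha (DKW inequality).
    """
    x = np.sort(np.asarray(data, dtype=float))
    n = x.size
    d = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))   # DKW critical distance
    ecdf_hi = np.arange(1, n + 1) / n              # ECDF just after each data point
    ecdf_lo = np.arange(0, n) / n                  # ECDF just before each data point
    lower = np.clip(ecdf_hi - d, 0.0, 1.0)         # lower bound on F at each x
    upper = np.clip(ecdf_lo + d, 0.0, 1.0)         # upper bound on F at each x
    return x, lower, upper
```

The pair of bounding step functions is the p-box; as n grows, d shrinks and the band tightens around the empirical distribution.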
Some of these situations simultaneously support consonant structures as well. Do we always get the same structures if we first go to p-boxes and then to consonant structures? Or is there ratcheting?
An illustration of the various ways to obtain consonant structures from Dominik Hose's defense of his dissertation in May 2022 <<we need permission from Dominik to use this figure; also we'd like a better image>>
Wimbush (2023) shows that such structures can be generated for inferring the parameters of discrete distributions, based on a generalisation of the approach Balch proposed for the binomial distribution, including generalized binomials for different stopping rules, Poisson, negative binomial, hypergeometric, and <<geometric>> distributions.
<<Ryan's approach>>
<<both Balch strategies>>
References
Balch, Michael Scott (2012). Mathematical foundations for a theory of confidence structures. International Journal of Approximate Reasoning 53(7): 1003-1019. https://www.sciencedirect.com/science/article/pii/S0888613X12000746
Balch, Michael Scott (2020). New two-sided confidence intervals for binomial inference derived using Walley's imprecise posterior likelihood as a test statistic. International Journal of Approximate Reasoning 123: 77-98. https://doi.org/10.1016/j.ijar.2020.05.005
Hose, Dominik (2022). Possibilistic Reasoning with Imprecise Probabilities: Statistical Inference and Dynamic Filtering. D 93 Diss. Universität Stuttgart, Shaker Verlag, Düren. https://www.shaker.de/de/content/catalogue/index.asp?lang=de&ID=8&ISBN=978-3-8440-8721-5
Wimbush, Alexander (2023). Propagation of Epistemic Uncertainty Through Medical Diagnostic Algorithms. Dissertation, University of Liverpool. <<>>
WOODPILE
The text below should be integrated in the text above, or omitted if it cannot be used somewhere.
Stochastic dominance
Numerical hedges
See Hedges.
Coverage intervals
A consonant structure / possibility distribution can be specified in terms of one or more interval ranges, each associated with a probability that the uncertain number will be within the interval.
<<Dominik?>>
Wimbush's sliced-normal estimators
<<Alex?>>
P-box coverage intervals
[Text and figures from RAMAS Constructor Synthesizing Information about Uncertain Variables]
You can also specify some interval ranges within which the uncertain number is known to lie with a certain probability.
The page initially shows three rows for inputs. In each row you can enter the left and right bounds of an interval within which the uncertain value is known to sometimes lie. In the fourth field of the row, indicate the probability that the uncertain number has a value inside that interval. By clicking on the third entry field of each row, you can select “exactly”, “no more than” or “no less than” to describe how the probability relates to the interval. Failing to select one of the three choices is equivalent to selecting “exactly”. As you fill in all the rows with probabilistic statements, a new row for inputs becomes visible. Up to ten rows can be used to characterize coverages for the uncertain number.
The probabilities need not add up to one and, typically, they won’t because they represent coverages that constrain the same probability mass in different ways. However, because they are probabilities, no value can be smaller than zero or larger than one. You can specify the probabilities as precise scalar values or as intervals. Use an interval when you are unsure about the probability to use.
You can specify the potential range of the uncertain number by giving an interval and indicating that it covers with probability no less than or exactly equal to 1. If you have such a statement, it will be convenient to enter it first in the top row on the Coverage pages. There should be at most only one such statement.
Each right bound must be no smaller than the corresponding left bound. If they are the same value, you are saying that some portion of the probability mass is known to lie at that value. It is possible to enter logically inconsistent statements. For instance, if you say that the interval [1,2] has 60% of the mass and that [3,4] also has 60%, you have created a logical impossibility. (Note, on the other hand, that such probabilities would be entirely possible if the intervals were [1,3] and [2,4].) Because it is easy to inadvertently make mutually inconsistent specifications, you should be careful about the inputs you make. Such inconsistencies may result in the upper and lower bounds crossing each other. If this happens, the software will give you a warning. However, it is also possible that an inconsistency does not result in the bounds crossing.
There are three kinds of coverage statements that can be made. They are constraints on a random variable X of the following forms:
P(X ∈ [x1, x2]) ≥ [p1, p2] (“no less than”),
P(X ∈ [x1, x2]) ≤ [p1, p2] (“no greater than”), or
P(X ∈ [x1, x2]) = [p1, p2] (“exactly”).
where P denotes the probability of an event and the subscripted symbols denote the numerical values entered on the Coverages page. Note that, for a “no less than” statement, the lower bound on the probability is not controlling, so only p2 is used. Likewise, for a “no more than”, only the lower bound probability p1 is important.
Given a suite of such statements, the calculation must find upper and lower bounds on the cumulative probability distribution function for X. This is a linear programming problem. For each bound on probability at every possible value of X, there is a set of equations and inequalities that must be satisfied. The objective function is to maximize or minimize the probability at that x-value. Following Kozine and Utkin (2004), the set of all x-values is the range from the smallest to the largest x-values mentioned in any of the intervals of the constraints. The procedure used here extends their work by allowing for all three types of (in)equalities, provides error checking for impossible sets of constraints, and is also valid for cases where the coverage intervals are not nested in both x and p.
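As an illustrative sketch (not the implementation described above), the linear program can be set up with off-the-shelf tools. To keep the bookkeeping simple, this version restricts probability mass to the interval endpoints, which sacrifices some tightness, and follows the convention above that only p2 binds a "no less than" statement and only p1 binds a "no greater than" statement. The example statements are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

# hypothetical coverage statements: (x1, x2, relation, p1, p2)
statements = [
    (0.0, 10.0, "exactly", 1.0, 1.0),        # overall range
    (2.0, 4.0, "no less than", 0.4, 0.5),    # only p2 = 0.5 binds
    (6.0, 9.0, "exactly", 0.3, 0.3),
]

# candidate support points: all interval endpoints mentioned in the statements
xs = sorted({e for s in statements for e in (s[0], s[1])})
n = len(xs)

A_ub, b_ub = [], []
A_eq, b_eq = [[1.0] * n], [1.0]              # probability masses sum to one
for x1, x2, rel, p1, p2 in statements:
    ind = [1.0 if x1 <= x <= x2 else 0.0 for x in xs]
    if rel == "no less than":                # P >= p2, rewritten as -P <= -p2
        A_ub.append([-v for v in ind]); b_ub.append(-p2)
    elif rel == "no greater than":           # P <= p1
        A_ub.append(ind); b_ub.append(p1)
    else:                                    # "exactly": p1 <= P <= p2
        A_ub.append([-v for v in ind]); b_ub.append(-p1)
        A_ub.append(ind); b_ub.append(p2)

def cdf_bounds(x):
    """Bounds on P(X <= x) implied by the coverage statements."""
    c = np.array([1.0 if xi <= x else 0.0 for xi in xs])
    lo = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
    hi = linprog(-c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
    return lo.fun, -hi.fun

# e.g. cdf_bounds(5.0) gives (0.5, 0.7) under these statements
```

Sweeping x across the range and collecting the bounds traces out the lower and upper cumulative distribution functions of the p-box.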
This figure depicts an example of the output produced by three coverage statements.
As you replicate the entries on this input page on your computer, the fields will turn yellow to remind you that inputs need to be justified. You may notice that the p-box at the top of the display changes with individual keystrokes, rather than only when you press the Enter key or tab out of an input field. When you fill up the third row of inputs, a fourth row will appear. You can have up to ten rows.