Defining Aleatory and Epistemic Uncertainty
Joshua Kaizer
US Nuclear Regulatory Commission
7:00 PM BST, Friday, 1 May 2020
On Fri, May 1, 2020 at 3:14 PM Edoardo Patelli wrote:
Hi Joshua,
I really enjoyed your talk.
Your definition of epistemic uncertainty makes sense to me (also rethinking your final slide showing the reduction of epistemic uncertainty to zero).
This is because we would have a simple and intuitive definition of aleatory and epistemic uncertainty: what you can reduce versus irreducible uncertainty. In this respect, when we collect data, there is also an uncertainty associated with it (including the precision of the sensor, discretization, etc.), but since this part cannot be reduced or avoided, it should be considered and treated as aleatory uncertainty.
Now, I am not saying how to model those uncertainties. I just want to point out that if a distinction between aleatory and epistemic uncertainty is needed, your interpretation is really good.
How we select the best approach to model those uncertainties is a different problem. I am not convinced about aleatory -> probabilistic methods vs epistemic -> imprecise probability.
For instance, numerical precision seems to be irreducible and therefore an aleatory uncertainty, but an interval is the most appropriate approach for this type of uncertainty.
Just to add a bit more confusion.
Scott, I know you disagree.
Regards,
Edoardo
On Fri, May 1, 2020 Joshua Kaizer replied:
Edoardo,
Thanks for the comments.
I agree that the next step is determining how to model those uncertainties. As a matter of fact, one thing that started me down this path was trying to determine a “complete set of errors”. In other words, what are all the uncertainties we should consider in modeling and simulation?
I wrote a paper on this (in the final stage of review now), and what I came up with was 13 errors that I argue are a “complete set”. In other words, any error in the simulation must be contained in those 13 errors. I feel like what I did was more a cheap mathematical trick than amazing insight… but I think it works.
Based on your comments, where my mind is going next is that we may have multiple epistemic and aleatory uncertainties in those 13 errors. I am not sure if each error has an epistemic and an aleatory form… or if some of the errors are epistemic and others are aleatory, but at least now I know some of the questions to ask.
Thanks for the comment and if you think of anything else, please let me know… josh
On Fri, May 1, 2020 at 12:37 PM Scott Ferson interjected:
Dear Edo and Josh, and Marco and Ullrika (added to the conversation as I know they care about this):
I am thrilled that Edo and Josh connect and agree, but, as Edo notes in his email below, I am a little horrified by what y'all are saying. To fill in Marco and Ullrika, Josh gave a Zoom presentation the other evening that suggested definitions for epistemic and aleatory uncertainty. I should let Josh say what he suggested, but I think he was basically saying that e.u. is measured by a norm of the error between an estimate and the true value, and the a.u. by a norm of something very similar to entropy.
To argue a bit with Edo, I think measurement precision surely can be reduced. Just buy or build a better sensor that has more precision. Likewise, if numerical precision refers to the precision that you achieve in numerical calculations which is determined by mesh or step size, number of replications, etc., then that precision is also reducible by tightening the resolution or increasing reps. Am I misunderstanding you, Edo?
I sympathize with Edo's not being convinced about
aleatory uncertainty -> probability
epistemic uncertainty -> imprecise probability
which I understand to be the claim in this blue paragraph:
Uncertainty analysis is an active area of research, and there are raging controversies within it representing different philosophies about the nature of uncertainty. The debate between Bayesians and frequentists in probability theory is almost as old as the debate which underlies it between subjectivism and objectivism. Not everyone agrees about what the best method is in any given circumstance. At the same time, however, there seems to be a consensus emerging within engineering which is mostly disinterested in the controversies and is more concerned about the practical calculations that are possible and what their results mean in terms of objective reality that can be measured and assessed (Oberkampf and Roy 2010; Bernardini and Tonon 2010; Helton et al. 2010; Utkin 2004; Oberkampf et al. 2004; 2000; Möller et al. 2003; Nikolaidis and Haftka 2001; Kozine and Filimonov 2000; Cipra 2000; Ferson 1996; Ferson and Ginzburg 1996; inter alia; cf. Der Kiureghian and Ditlevsen 2009; contra Vick 2002; Hazelrigg 1996). This view holds that aleatory uncertainty should be propagated by the traditional methods of probability theory but that epistemic uncertainty may require another method that does not confuse incertitude with variability by requiring that every possibility be associated with a probability that it occurs. Many popular technologies for uncertainty propagation are essentially Monte Carlo shells that re-run some calculation many times while randomly varying input values according to specified distributions. If the calculation is deterministic, this is called a Monte Carlo simulation. If the calculation being re-run is itself stochastic, then the simulation is called a second-order or two-dimensional Monte Carlo simulation in which the parameters of distributions used in the calculation can be randomly varied. While everyone agrees that such a shell is useful when all the uncertainties are aleatory, the emerging consensus holds that it is decidedly insufficient to account for epistemic uncertainty, and it dangerously confounds the two kinds of uncertainty if it tries to account for both epistemic and aleatory uncertainty simultaneously.
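A minimal sketch, in Python with invented inputs, of the kind of Monte Carlo shell the paragraph describes: the outer loop re-draws the distribution parameters and the inner loop re-runs the calculation with values drawn from those distributions. The model, parameter ranges, and threshold here are placeholders, not anything from the report.

    # Sketch of a second-order (two-dimensional) Monte Carlo shell.
    # The model and its inputs are illustrative placeholders.
    import numpy as np

    rng = np.random.default_rng(1)

    def model(x, y):
        # stand-in deterministic calculation
        return x + y

    n_outer = 200   # samples of the distribution parameters
    n_inner = 1000  # samples of the variable values given those parameters

    outer_results = []
    for _ in range(n_outer):
        # outer loop: vary the parameters of the input distributions
        mu_x = rng.normal(10.0, 1.0)    # uncertain mean of X
        sd_y = rng.uniform(0.5, 2.0)    # uncertain spread of Y
        # inner loop: vary the inputs according to those distributions
        x = rng.normal(mu_x, 2.0, n_inner)
        y = rng.normal(0.0, sd_y, n_inner)
        z = model(x, y)
        outer_results.append(np.mean(z > 12.0))  # e.g. a failure probability

    # Each outer iteration yields one estimate of P(output > 12); the spread
    # of these estimates reflects the parameter (second-order) uncertainty.
    print(min(outer_results), max(outer_results))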
But the claim in the blue paragraph is nevertheless true, isn't it? I mean, do you think it is okay to model the sum A+B of two ranges A and B about which we only know bounds the way Laplace and about half of risk analysts today do it? As the triangular distribution that arises from the convolution of two uniform distributions over the A and B ranges? Is it reasonable to assume independence just because you don't know what the dependence is? Is it acceptable to use maximum entropy distributions to represent a class of distributions that are specified by a few constraints? If you agree these are not okay, then what methods should we use for these cases? Or do you maybe deny these situations actually exist, like Tony O'Hagan sometimes does?
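To make the A+B point concrete, here is a small Python sketch with made-up ranges. The interval sum refuses to say anything beyond the bounds, while the Laplace-style treatment (uniforms plus independence) produces a triangular distribution that declares the extreme but perfectly possible values to be nearly impossible.

    # Sum of two quantities known only to lie in ranges A and B.
    # Illustrative ranges; not from the thread.
    import numpy as np

    A = (2.0, 4.0)   # all we know: A is somewhere in [2, 4]
    B = (1.0, 3.0)   # all we know: B is somewhere in [1, 3]

    # Interval arithmetic: the sum is somewhere in [3, 7]; nothing more is claimed.
    interval_sum = (A[0] + B[0], A[1] + B[1])

    # Laplace-style treatment: uniforms on A and B, assumed independent,
    # convolved by Monte Carlo -> a triangular distribution peaked at 5.
    rng = np.random.default_rng(0)
    s = rng.uniform(*A, 100_000) + rng.uniform(*B, 100_000)

    print("interval sum:", interval_sum)
    print("P(sum > 6.5) under the uniform+independence assumptions:",
          np.mean(s > 6.5))
    # The convolution says values near 6.5 or 7 are almost impossible,
    # even though nothing in the stated information rules them out.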
We are currently challenged to fill in the green table below:
In practical terms, we need to identify methods that alone, or in combination, tackle the basic challenges of uncertainty quantification to specify a working calculus that can deal with aleatory and epistemic uncertainty in a comprehensive and integrated manner. For each of several mathematical problems outlined below, the report will recommend UQ methods to fill in this table:
                    Intrusive method           Black-box method
                    (new implementations)      (legacy codes)
  Simple problems
  Messy problems
There will likely be multiple recommendations per cell. For example, there might be a method that can be used if a guaranteed, bounding result is needed, and another (cheaper) method that might be better if an approximative result will suffice. The table distinguishes Simple and Messy problems. Messy problems may be large in size or they may have repeated variables, or they may involve dependencies other than independence. What makes a problem messy rather than simple may differ for different methods and for different mathematical problems. Maybe there will be different answers if we are considering different problems (arithmetic, logic and fault trees, differential equations or finite-element simulations, magnitude comparisons, backcalculations, empirical planning, engineering control, decision making, validation, estimation, updating).
Do we know the answers or not?
Cheers,
Scott
On Fri, May 1, 2020 at 5:43 PM Ullrika Sahlin offered:
Here is a simple answer from me.
Aleatory – relative frequencies
Epistemic – subjective probability
Any caveats about the assessment – turn the subjective probability into an imprecise one.
When we gather lots of knowledge and feel very confident in our assessment, the imprecise goes towards a precise subjective probability.
This is in line with well-established models for statistical inference which allow us to integrate data and expert judgement.
Cheers
Ullrika
On Fri, May 1, 2020 at 6:44 PM Joshua Kaizer replied:
I think Scott gave a fair summary of the proposed definitions for epistemic uncertainty and aleatory uncertainty.
I also think that you can build a better sensor and “reduce” the aleatory uncertainties, but I think you need to be very clear about what is being reduced where.
If we are given a specific sensor, that sensor has some “true” aleatory uncertainty. That uncertainty cannot be reduced for that sensor. We can create another sensor that has a smaller aleatory uncertainty, so we can reduce the aleatory uncertainty in our measurement, but we can only do so by changing something in the system.
Thus, in a broad sense aleatory uncertainty is “reducible”, but only if I can swap out one thing that has a higher aleatory uncertainty with another that has a lower aleatory uncertainty. I don’t think you can change the aleatory uncertainty of those “things” themselves. Note, here I am talking about the “true” aleatory uncertainty of a sensor, and not what we think that aleatory uncertainty is. We can get a better estimate of the true aleatory uncertainty of a sensor. However, this can only be done by getting more information about the sensor (i.e., reducing the epistemic uncertainty) and “updating” what we believe the aleatory uncertainty of the sensor to be. We are not changing the true aleatory uncertainty of the sensor, we are just changing our estimate of that uncertainty to get closer and closer to that truth.
Mostly, I agree with the sentiments expressed in the blue paragraph, at least in as much as I can understand them. However, I feel like I can only agree with those statements because of what I understand epistemic and aleatory uncertainty to be. But, from my reading, it’s hard to tell how others define epistemic and aleatory… and so I wonder just what we are arguing about. Is the argument really over a difference of what people believe is true (e.g., I think “A” is true, you think “not A” is true) or is the argument due to the fact that none of us have really expressed ourselves fully because we are using concepts which are not well defined?
Finally, I would argue that the definitions proposed were proposed so that statements such as the one in the blue paragraph could be said more clearly. To me, it’s one thing to say “don’t confound epistemic and aleatory” and it’s another to actually define them so you can clearly see that one uncertainty is independent from the other.
… josh
On Fri, May 1, 2020 at 7:14 PM Ullrika rejoined:
Aleatory uncertainty is what environmental risk assessment refers to as “variability”, while epistemic uncertainty is referred to as “uncertainty” due to lack of knowledge. There is, in my understanding, no debate around the meaning of aleatory and epistemic.
The question is how to model and express it.
For example, we can change (reduce or increase) our uncertainty about variability, but variability is dependent on how we choose to model it. In statistics, we can introduce covariates to explain some variation; this is the same as adding more components expressing variability into our model, and thereby transferring what we see as epistemic uncertainty into aleatory uncertainty. Thus, aleatory uncertainty is not going to change given more data, only our uncertainty about the aleatory uncertainty.
Leave the choice on how to model and express aleatory and epistemic uncertainty to the assessors, as long as they can motivate their choices and do what they are supposed to do. There are several ways to do this, and what is the best way will vary from case to case and from field to field. At the end of the day, the big battle is about getting the assessors to distinguish between aleatory and epistemic uncertainty. When we are there (which we are not yet in most cases), we can refine the discussion.
cheerio
Ullrika (I will try to be silent)
On Sat, May 2, 2020 at 4:11 PM Scott Ferson wrote:
Scott makes interstitial replies to Josh and Ullrika in blue below. He hopes the discussion continues.
On Fri, May 1, 2020 at 6:44 PM Kaizer, Joshua wrote:
I think Scott gave a fair summary of the proposed definitions for epistemic uncertainty and aleatory uncertainty.
I also think that you can build a better sensor and “reduce” the aleatory uncertainties,
What? No. How? No.
but I think you need to be very clear about what is being reduced where.
If we are given a specific sensor, that sensor has some “true” aleatory uncertainty.
No. You seem to be switching the words around. The sensor itself doesn't have an aleatory uncertainty, true or not. You might say that a sensor has an epistemic uncertainty, which would relate to the plus-or-minus range, or precision, or sensitivity it has when measuring something, but the sensor itself does not have an aleatory uncertainty. It is the population it is sensing that has aleatory uncertainty. Unless you are talking about some arcane situation in which the sensor has a stochastic element for some reason that adds noise to the values that it reports. But in that case, the wiggliness of the reported values that arise from this source would be epistemic in nature since it blurs our ability to know the true value of any measurand. In such a case, you could reduce it by multiplying the number of sensors and averaging their reported values. If the wiggliness is intrinsic to each sensor, then you can even reduce the uncertainty by repeating the measurement with the same sensor and averaging the results. In either case, it is a kind of epistemic uncertainty. Reducible and relating to what can be known...epistemic.
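A tiny numerical illustration of that averaging point, under an assumed Gaussian-noise sensor model (the true value and noise level are invented): the standard error of the averaged readings shrinks like 1/sqrt(n), which is exactly the reducible, epistemic behaviour being described.

    # Repeated measurements of a single fixed measurand with a noisy sensor.
    # The true value and noise level are invented for illustration.
    import numpy as np

    rng = np.random.default_rng(42)
    true_value = 7.3        # fixed, unknown-to-us measurand
    sensor_noise_sd = 0.5   # intrinsic wiggliness of the sensor's readings

    for n in (1, 10, 100, 1000):
        readings = true_value + rng.normal(0.0, sensor_noise_sd, n)
        estimate = readings.mean()
        # standard error of the mean shrinks like 1/sqrt(n)
        stderr = sensor_noise_sd / np.sqrt(n)
        print(f"n={n:5d}  estimate={estimate:6.3f}  approx. +/- {1.96*stderr:.3f}")
    # More repetitions -> a tighter estimate of the same true value:
    # this component of the uncertainty is reducible, i.e. epistemic.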
Note that this particular flavour of epistemic uncertainty can be modeled by a distribution. (Am I contradicting myself somewhere?) It is not the only example. The inferential uncertainty that arises from small sample sizes can also be modeled by distributions that are fashioned into c-boxes. Just because this particular flavour of epistemic uncertainty can be modeled by a distribution doesn't mean that all epistemic uncertainties can be. We of the "emerging consensus" believe that a lot of epistemic uncertainty has the form of bias, or rather, bias of imperfectly known size and direction.
That uncertainty cannot be reduced for that sensor. We can create another sensor that has a smaller aleatory uncertainty, so we can reduce the aleatory uncertainty in our measurement, but we can only do so by changing something in the system.
Okay, I can no longer tell whether I agree with you or not. If you can "create another sensor that has a smaller aleatory uncertainty", then I think you must be using the word 'aleatory' to refer to what the rest of us are calling 'epistemic'. You can only reduce the aleatory uncertainty by changing something in the population.
But, in any case, going back to Edo's original statement, I don't think you would use the word precision or imprecision to refer to the component that is irreducible. It is the aleatory uncertainty, the variability, that comes, not from the sensor design, but from the real world that the sensor is, uh, sensing.
Geesh, maybe I am coming to the point of dividing ice from snow. When you look really closely, it is hard to keep track of the two kinds of uncertainty, and the designations become relative rather than absolute. For instance, suppose Josh's sensor has both flavours of epistemic uncertainty, that is, both the stochastic and the bias flavours. Then, irrespective of the population of measurands the sensor might be called on to assess, there is a sense in which one might call the first flavour 'aleatory' and the second 'epistemic', even though they are both epistemic relative to the overall performance of the sensor. This sense comes from considering the statistical ensemble that we are talking about. If we are talking about the ensemble of possible measurements of a single measurand that has a fixed value, then the first flavour describes aleatory uncertainty over that ensemble. But unless you record all those repeated measurements, and even if you do, you are usually only really interested in the overall measurement of that particular measurand and in the population from which it came. That is a different ensemble.
Thus, in a broad sense aleatory uncertainty is “reducible”, but only if I can swap out one thing that has a higher aleatory uncertainty with another that has a lower aleatory uncertainty.
Still not buying that at all.
I don’t think you can change the aleatory uncertainty of those “things” themselves. Note, here I am talking about the “true” aleatory uncertainty of a sensor, and not what we think that aleatory uncertainty is.
What? Stop saying that. Huh?
We can get a better estimate of the true aleatory uncertainty of a sensor.
I am getting dizzy.
I know I have no room to point fingers, but the abstractness of our statements is where engineers tend to give up on the philosophy. As Ullrika says, what matters is how you model it.
However, this can only be done by getting more information about the sensor (i.e., reducing the epistemic uncertainty) and “updating” what we believe the aleatory uncertainty of the sensor to be. We are not changing the true aleatory uncertainty of the sensor, we are just changing our estimate of that uncertainty to get closer and closer to that truth.
Mostly, I agree with the sentiments expressed in the blue paragraph, at least in as much as I can understand them. However, I feel like I can only agree with those statements because of what I understand epistemic and aleatory uncertainty to be. But, from my reading, it’s hard to tell how others define epistemic and aleatory… and so I wonder just what we are arguing about. Is the argument really over a difference of what people believe is true (e.g., I think “A” is true, you think “not A” is true) or is the argument due to the fact that none of us have really expressed ourselves fully because we are using concepts which are not well defined?
Maybe we need some concrete examples or thought experiments that simplify or anchor the discussion.
Finally, I would argue that the definitions proposed were proposed so that statements such as the one in the blue paragraph could be said more clearly. To me, it’s one thing to say “don’t confound epistemic and aleatory” and it’s another to actually define them so you can clearly see that one uncertainty is independent from the other.
… josh
The definitions proposed by you? I don't think we have them in this email thread yet. Could you spell them out for us a bit?
Maybe don't say independent here. Technical meaning. Would 'distinct' suffice?
But I take your point, and it is a fair one with which I agree.
I am not sure that we require mathematical definitions, but it wouldn't hurt to have consistent definitions if they are rich enough to capture the linguistic import of the words or most of it. But it would definitely help to have some simple examples that make the distinctions and the complexities clear.
Ullrika wrote:
Aleatory uncertainty is what environmental risk assessment refers to as “variability”, while epistemic uncertainty is referred to as “uncertainty” due to lack of knowledge. There is, in my understanding, no debate around the meaning of aleatory and epistemic.
Yes, I agree these definitions are perfectly clear. But that doesn't mean that it is always easy to distinguish the two in practice. See dividing ice and snow above. And the old timers will be able to come up with examples in which one becomes the other. Like when you want to estimate the mean of a population, then the aleatory uncertainty of the values across the population blends into inferential uncertainty about the mean parameter, which is epistemic. Think of it this way: if you sample every member of the population, you reduce the epistemic uncertainty you have about the population mean to zero.
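Here is a quick sketch of that last sentence, assuming a small, made-up finite population: the spread across the population stays put, but the standard error of the estimated mean carries the finite-population correction sqrt((N - n)/(N - 1)) and collapses to zero once every member has been sampled.

    # Sampling without replacement from a finite population of made-up values.
    import numpy as np

    rng = np.random.default_rng(3)
    population = rng.normal(50.0, 12.0, 500)   # the whole (finite) population
    N = population.size
    sigma = population.std()                   # aleatory spread, fixed

    for n in (10, 100, 250, 500):
        sample = rng.choice(population, size=n, replace=False)
        # standard error of the sample mean with finite-population correction
        fpc = np.sqrt((N - n) / (N - 1))
        se = sigma / np.sqrt(n) * fpc
        print(f"n={n:3d}  mean estimate={sample.mean():6.2f}  std. error={se:5.2f}")
    # At n = N the standard error is exactly zero: the epistemic uncertainty
    # about the population mean is gone, but the population's variability is not.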
The question is how to model and express it.
Exactly.
For example, we can change (reduce or increase) our uncertainty about variability, but variability is dependent on how we choose to model it. In statistics, we can introduce covariates to explain some variation; this is the same as adding more components expressing variability into our model, and thereby transferring what we see as epistemic uncertainty into aleatory uncertainty. Thus, aleatory uncertainty is not going to change given more data, only our uncertainty about the aleatory uncertainty.
Oh, that was pretty confusing, but I guess so. Certainly you can also re-define what variability you're talking about too, such as by conditioning.
Leave the choice on how to model and express aleatory and epistemic uncertainty to the assessors, as long as they can motivate their choices and do what they are supposed to do. There are several ways to do this, and what is the best way will vary from case to case and from field to field.
Hmm. I don't know about this. This sounds a bit anarchic to me. There are ways to do it wrong. And we should help them to recognise these failings when they occur. I'm not saying that it is time to write down procedures in stone, but I do think it is high time to start reeling in some of the diversity and, well, shenanigans. A thousand flowers have been blooming, and maybe it is time to identify best practices, before diversity hardens into disparate schools of thought that can no longer talk to each other.
At the end of the day, the big battle is about getting the assessors to distinguish between aleatory and epistemic uncertainty. When we are there (which we are not yet in most cases), we can refine the discussion.
Really? Even die-hard Bayesians recognise and distinguish aleatory and epistemic. Distinguishing a. and e. is definitely old hat at SRA, and (I think) it is also generally accepted at ESREL. Nozer for a long time now, and Michael Levine, and even Tony O'Hagan say that a. and e. are different. They just think they should be modeled with the same objects (probability distributions) under the same theory (probability theory).
cheerio
Ullrika (I will try to be silent)
No need. Pontification and argumentation is fun for the whole family.
They say that education is the telling of smaller and smaller lies. Research, similarly, is a sequence of realisations of tinier and tinier exceptions to broad theses. To rescue my argument from my own confusion, what I want to say is this:
The words a. and e. are mostly useful in simple situations where the distinction is relative. Moreover, distinguishing them in the details can be like dividing ice from snow. For broad statements, they are fine, but they can get muddled together if you look closely. And sometimes one turns into the other!
The more operationally significant considerations are three-fold:
Can you reduce the uncertainty and, if so, how can you reduce it?
How should you model the uncertainty (interval, distribution, dependent distribution, p-box, fuzzy number) and why is that reasonable?
Is it legitimate to mix one kind of uncertainty with another, and how should you do that?
Although I have just said that there is a particular flavour of epistemic uncertainty that represents aleatory uncertainty in different ensembles, and for that reason may be modeled with distributions, I still think it is fair to say--as we did in the blue paragraph below--that, generally speaking, the unqualified phrase 'epistemic uncertainty' is referring to interval uncertainty. Maybe I'm wrong or this is just too simplistic or facile. Interval uncertainty is the case where there may be a distribution in there but you just don't know whether there is or not, or, even if you think there is a distribution in there, you know so little about it that you still end up with a vacuous p-box, that is, an interval.
Maybe specifying the model you intend to use for the uncertainty is the most straightforward and least ambiguous way to say what kind of uncertainty it is. If you use an interval, it is purely epistemic. If you use a distribution, you are treating it as variation (whether that variation turns out to be aleatory or epistemic). If you use a p-box, you're saying it has both.
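One bare-bones way to see that mapping in code, representing a p-box simply as lower and upper bounds on a CDF over a grid (a homemade sketch, not any particular p-box library): a precise distribution collapses the two bounds onto each other, while an interval gives the vacuous box whose bounds jump only at its endpoints.

    # Crude p-box sketch: a pair of CDF bounds evaluated on a grid.
    # Hand-rolled for illustration; real p-box software does much more.
    import numpy as np
    from scipy.stats import norm

    xs = np.linspace(0.0, 10.0, 501)

    # A precise distribution: lower and upper CDF bounds coincide.
    dist_lo = norm.cdf(xs, loc=5.0, scale=1.0)
    dist_hi = dist_lo.copy()

    # An interval [3, 7]: the vacuous p-box.
    # Upper bound: the value could be as low as 3, so the CDF may reach 1 there.
    # Lower bound: the value could be as high as 7, so the CDF may stay 0 until then.
    lo_end, hi_end = 3.0, 7.0
    interval_hi = (xs >= lo_end).astype(float)
    interval_lo = (xs >= hi_end).astype(float)

    # The width between the bounds is one rough indicator of the epistemic part.
    print("max CDF gap, precise distribution:", np.max(dist_hi - dist_lo))        # 0.0
    print("max CDF gap, interval [3, 7]:     ", np.max(interval_hi - interval_lo))  # 1.0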
Cheers,
Scott
Joshua Kaizer wrote at 1:09 PM on 3 May 2020:
I really like the quote about “smaller and smaller lies” because I am starting to feel that way. Even in my own thinking, I am seeing the simplifications that aren’t quite true. So I have focused on constructing a set of examples. My goal here is to reach some level of consistency on definitions.
(I thought of extending the sensor example… but I feel like that was a bit too complex for starting out. However, I feel like we should come back to it after we have some set of definitions that make sense).
I hand you a fair six-sided die and ask if you want to gamble. If you roll some subset of the numbers, you win; if you roll the other subset, I win (I am not concerned about what those subsets are right now). What I am focused on is what we know about the die.
Even though we know the true probability measure of the die, we don’t know the outcome of each roll since there is some inherent variability. I am calling this true inherent variability aleatory uncertainty. Aleatory uncertainty is a measure of the true inherent variability of a given system. (I like thinking of it as a norm on the true probability measure.) When we determine whether we should gamble under the given rules of who wins what, we need to model the die. Based on that model, we can determine if this is a game we should play or not. If we know the aleatory uncertainty of the die, we can model the die to make that decision.
Now, let’s go back and say that instead of handing you a fair six-sided die, I just hand you a six-sided die whose probability measure you don’t know. We can still use the concept of aleatory uncertainty, as that die still has a true probability measure; however, we don’t know what that probability measure is. We can guess at a probability measure, but we recognize that there is a difference between our guess at a probability measure and the true probability measure. I am calling the difference between our guess at the probability measure of the die and the true probability measure of the die epistemic uncertainty. Epistemic uncertainty is a measure of the difference between our guess at a probability measure of a system and that system’s true probability measure.
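One possible reading of these two definitions in code, with the “norms” chosen arbitrarily for illustration (Shannon entropy as the functional of the true measure, total variation distance as the gap between the guessed and true measures); Josh may well have different norms in mind.

    # Die example: one possible reading of the proposed definitions.
    # Entropy and total-variation distance are illustrative choices of "norm".
    import numpy as np

    true_pmf  = np.full(6, 1/6)                                  # fair die's true measure
    guess_pmf = np.array([0.25, 0.15, 0.15, 0.15, 0.15, 0.15])   # our (wrong) guess

    # "Aleatory uncertainty": a functional of the true measure itself,
    # here Shannon entropy, which is maximal for the fair die.
    aleatory = -np.sum(true_pmf * np.log2(true_pmf))

    # "Epistemic uncertainty": a distance between our guess and the truth,
    # here total variation distance; it goes to zero as the guess improves.
    epistemic = 0.5 * np.sum(np.abs(guess_pmf - true_pmf))

    print(f"aleatory (entropy of true measure): {aleatory:.3f} bits")
    print(f"epistemic (TV distance, guess vs true): {epistemic:.3f}")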
This example will become much more complex, but I wanted to keep it simple for now and ask: “Do people agree with the way we are using epistemic and aleatory?” Or is there disagreement, or did I make any mistakes?... josh
Ullrika Sahlin wrote at 1:35 PM on 3 May 2020:
This example is what I also use when teaching. Note that the probability model for the die can be interpreted as a model of relative frequencies. I can be uncertain about these. I can choose to express my uncertainty with subjective probabilities. If so, I have the subjective-probability-on-relative-frequencies model put forward by Apostolakis in Science in 1990. A lot of time has passed since then, and we should unite on the concepts but not be stubborn about approaches taken in different applications. For example, I could choose to express my uncertainty about the relative-frequency model for the die using plain intervals (note: it is plausible I would do this, depending on how uncertain I feel I am) or a fuzzy number (note: I wouldn’t do this, since I don’t understand the fuzzy thing).
For me, the problem is that we still teach Monte Carlo simulation by telling the students to fit a distribution to data. Having done only that, they don’t have any quantified epistemic uncertainty to propagate, and they will keep on working with aleatory uncertainty only. This is then taken as a reason that probability as a measure for epistemic uncertainty “doesn’t work”. Instead, we have to teach them Bayesian hierarchical modelling, which means probabilistic models of both variables and parameters, and take it from there. So the books written by Aven and Zio saying that the Bayesian approach is useful when you have little data are wrong; one should always use Bayesian models if one wants to quantify epistemic uncertainty by probability.
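To tie this back to the die, here is a minimal Bayesian sketch (a conjugate Dirichlet-multinomial model with invented roll counts): the posterior over the six face probabilities carries the epistemic uncertainty and tightens as rolls accumulate, while the face probabilities themselves describe the aleatory variability of the die.

    # Minimal "subjective probability on relative frequencies" sketch for a die:
    # Dirichlet prior over the six face probabilities, updated by observed rolls.
    # The roll counts are invented for illustration.
    import numpy as np

    prior = np.ones(6)                       # flat Dirichlet prior over frequencies
    counts = np.array([3, 5, 4, 6, 2, 10])   # hypothetical observed roll counts
    posterior = prior + counts               # conjugate update

    rng = np.random.default_rng(7)
    samples = rng.dirichlet(posterior, size=20_000)   # epistemic: plausible dice

    # Epistemic statement: how sure are we that face 6 comes up more often than 1/6?
    print("P(p6 > 1/6 | data):", np.mean(samples[:, 5] > 1/6))

    # Each sampled row is one candidate relative-frequency model (the aleatory part);
    # with more rolls the posterior narrows, but the die's variability does not.
    print("posterior mean for face 6:", posterior[5] / posterior.sum())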
If one is using another measure for epistemic uncertainty, then there needs to be a coherent principle for coming up with the expressions for epistemic uncertainty (i.e., from experts or from inference from data). Measures for epistemic uncertainty that pass that stress test of how to actually quantify given whatever information is available (as opposed to just discussing propagation, which many do) are worth considering. There are several out there.
Ullrika (sorry if I sound upset, it is not my intention)