Public presentations by Scott Ferson
False confidence: when satellites go bump in the sky
December 2022, Tampa, Florida
Society for Risk Analysis annual meeting
Background: In 1978, when there were fewer than 500 satellites in orbit, Kessler and Cour-Palais warned of a tipping point, now dubbed the Kessler Syndrome, which occurs when satellites and space debris are numerous enough that a collision could cause a cascade in which each collision generates more space debris that increases the likelihood of further collisions. There are currently over 5000 active satellites in orbit around the Earth, and many thousands more are planned for the coming decade. There are also innumerable objects in fields of space debris, including old satellites, used rocket stages and other hardware, and explosion and collision fragments such as paint chips, all traveling at hypervelocities; each is a space bullet capable of destroying a satellite or vehicle that crosses its path. Each such body larger than about 5 cm in diameter is tracked by aerospace engineers, but the probabilistic analyses used to compute the risk of collisions exhibit what satellite navigators call "probability dilution", which causes gross misestimation of risk and can make satellites seem safe even when they are on a direct collision course.
Approach: Probability dilution turns out to be a special case of "false confidence", in which false hypotheses are assigned high (posterior) probability. All traditional probabilistic analyses, both frequentist and Bayesian, are susceptible to false confidence. Indeed, the False Confidence Theorem shows that every additive belief measure is susceptible to the problem, but, unfortunately, it does not say which hypotheses might be problematic. Thus, even though many or most analyses will be correct and reliable, it can be difficult to foretell which analyses are sound and which judgments based on probabilities are reliable. In light of this, we can employ non-probabilistic approaches to calculating risks that use inference procedures which cannot exhibit false confidence because they are valid (sensu Martin and Liu), meaning that their violation probability is controlled by the stated confidence level.
Results: Inferences based on nonadditive plausibility functions with Martin-Liu validity support an approach to satellite conjunction analysis that is reliable in a way that aerospace engineers require. (Interestingly, confidence intervals have this validity but confidence distributions do not.)
Management/Policy Implications: Satellites serve numerous critical functions in communication, meteorology, navigation, geopositioning, intelligence gathering, remote sensing of Earth resources, and space exploration. Even a few collisions could cause service disruptions and significant financial losses. A cascade or chain reaction of collisions could have catastrophic impacts on the half-trillion-dollar space economy and cause major disruptions in the human societies on Earth that rely on it. The numbers of satellites and space debris bodies in orbit are growing exponentially, dramatically increasing the risk of a cascade of hypervelocity collisions that could generate thousands or millions of space bullets, potentially rendering near-Earth space unnavigable. More generally, the phenomenon of false confidence, which can accompany seemingly straightforward analyses using either frequentist or Bayesian methodologies, means that strategies must be developed to avoid dangerous conclusions that mislabel perilous situations as safe.
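The dilution phenomenon itself is easy to reproduce numerically. The following one-dimensional caricature in R (with invented values for the hard-body radius and the navigation uncertainty; it is not an operational conjunction analysis) shows the computed collision probability falling as the estimate of the miss distance gets noisier, even for a dead-centre conjunction:

    # Minimal sketch of probability dilution: the estimated miss distance d is
    # uncertain, d ~ Normal(dhat, sigma), and a collision occurs if the true
    # miss distance lies within the combined hard-body radius R (made-up numbers).
    collision.prob <- function(sigma, dhat = 0, R = 5)   # R in metres, dhat = best-estimate miss distance
      pnorm(R, dhat, sigma) - pnorm(-R, dhat, sigma)

    sigmas <- c(1, 10, 100, 1000)        # navigation uncertainty (metres), hypothetical
    round(collision.prob(sigmas), 4)     # roughly 1.0000 0.3829 0.0399 0.0040
    # The computed probability shrinks as sigma grows, so worse data make an
    # actual collision course look 'safe', which is the dilution described above.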
Monte Carlo simulation and probability bounds analysis in R with hardly any data
December 2022, Tampa, Florida
Workshop at the Society for Risk Analysis annual meeting
This full-day workshop features hands-on examples worked in R on your own laptop, from raw data to final decision. The workshop introduces and compares Monte Carlo simulation and probability bounds analysis for developing probabilistic risk analyses when little or no empirical data are available. You can use your laptop to work the examples, or just follow along if you prefer. The examples illustrate the basic problems risk analysts face: not having much data to estimate inputs, not knowing the distribution shapes, not knowing their correlations, and not even being sure about the model form. Monte Carlo models will be parameterized using the method of matching moments and other common strategies. Probability bounds will be developed from both large and small data sets, from data with non-negligible measurement uncertainty, and from published summaries that lack data altogether. The workshop explains how to avoid common pitfalls in risk analyses, including the multiple instantiation problem, unjustified independence assumptions, the repeated variable problem, and what to do when there’s little or no data. The numerical examples will be developed into fully probabilistic estimates useful for quantitative decisions and other risk-informed planning. Emphasis will be placed on the interpretation of results and on how defensible decisions can be made even when little information is available. The presentation style will be casual and interactive. Participants will receive handouts of the slides and electronic files with software for the examples.
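As a small preview of the moment-matching step mentioned above, the R sketch below (with invented data; the workshop's own examples and files will differ) chooses lognormal parameters so that the fitted input distribution reproduces the sample mean and standard deviation:

    # Method of matching moments for a lognormal input, hypothetical data
    x <- c(2.1, 3.7, 1.9, 5.2, 2.8, 4.4, 3.1, 2.5)
    m <- mean(x); s <- sd(x)

    sdlog   <- sqrt(log(1 + (s/m)^2))    # lognormal sigma from the coefficient of variation
    meanlog <- log(m) - sdlog^2/2        # lognormal mu so the mean matches

    y <- rlnorm(10000, meanlog, sdlog)   # Monte Carlo sample from the fitted input
    c(sample.mean = m, mc.mean = mean(y), sample.sd = s, mc.sd = sd(y))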
How to handle extremely rare events in a meaningful way
25 July 2022, Liverpool
Invited workshop for the Centre for Doctoral Training in GREEN (Growing skills for Reliable Economic Energy from Nuclear)
This very brief introduction to the history and science of risk analysis, as it emerged in the nuclear energy industry, recounts its successes as well as its dramatic failures, highlights its fundamental advances as well as its missteps as it has grown to be widely used across science and engineering, and considers both its necessity and its fundamental limitations as an essential social tool. It explains how one can compute the risks of failure of complex systems made of components that hardly ever fail and for which failure data are virtually nonexistent.
Combating indubiety and the crisis of trust: engineering needs a new approach to uncertainty
14 April 2021, Stony Brook, New York
Department of Technology and Society, Stony Brook University
(Zoom recording, password: Hz=wrp0z)
The modern world brings connectedness unparalleled in the history of humanity, but we are not prepared for this world. Research in risk communication reveals that conventional approaches to science education are insufficient today. We are susceptible to fake news, dubious science, and clever advertisers. Our votes can be swayed by duplicitous politicians and terrorist shocks. The number of people who believe the Earth is flat has grown exponentially via the web. This gullibility becomes dangerous in our hyperconnected world, and it is a problem for our society as serious as illiteracy was in previous centuries.
At the same time, we are at a crossroads in our scientific appreciation of uncertainty. The traditional view that there is only one kind of uncertainty and that probability theory is its calculus leads in practice to quantitative results that are misleading and often misconstrued. An emerging alternative view, however, entails a richer mathematical concept of uncertainty and a broader framework for uncertainty analysis, and admits a kind of uncertainty that is not handled by traditional Laplacian probability measures.
The way to be trusted more is to stop lying—and to stop fooling ourselves—that we know more than we do. For this honesty, a restructured approach to uncertainty is essential for many modern challenges in engineering, especially those in situations where relevant data are sparse, imprecise, or unreliable. Likewise, making uncertainty analysis pervasive is one of several strategies that can be useful in cultivating and teaching healthy skepticism without falling into the trap of losing trust in everything.
Say what?
12 March 2021, Liverpool, United Kingdom
ViCE: Virtual Conference on Epistemic Uncertainty in Engineering
(video)
We are at a crossroads in our scientific appreciation of uncertainty. The traditional view is that there is only one kind of uncertainty and that probability theory with Bayes' rule is its calculus. But some engineers hold that, in practice, the quantitative results of traditional probabilistic models are often misconstrued and sometimes demonstrably misleading.
By relaxing a single axiom of traditional (von Neumann–Morgenstern) utility theory that assumes the decision maker can always decide which of any two nonidentical decision choices would be preferred, traditional decision theory devolves to a version entailing a richer concept of uncertainty and a broader framework for uncertainty analysis. The resulting theory admits a kind of uncertainty that is not handled by traditional Laplacian probability measures and might therefore be called non-Laplacian uncertainty. This non-Laplacian view argues that different kinds of uncertainty must be propagated differently through simulations, reliability and risk analyses, calculations for robust design, and other computations.
We suggest that these two views can be unified into a modern pragmatic approach to uncertainty quantification. Many, and perhaps most, practical calculations involving uncertainty may be well handled with traditional probability theory, implemented as standard applications of Bayes' rule and Monte Carlo simulations. But there are special cases involving epistemic uncertainty where it is difficult or impossible to fully specify probabilities or other measured quantities precisely, where a non-Laplacian approach can be useful. Such a unified approach would make practical solutions easier for engineering and physics-based models, and the inferences drawn from such models under this view would be more defensible.
Remarkably, this emerging consensus parallels the historical resolution of the initially extreme controversy in mathematics that produced the present balance between Euclidean and non-Euclidean approaches that forms modern geometry.
Customizing diagnostic medical testing
Alexander Wimbush*, Nicholas Gray, Marco De Angelis, Louis Clearkin, Scott Ferson
December 2020, Society for Risk Analysis annual meeting
Rationale. Diagnostic testing is used for several different purposes in medicine, including epidemiology, counselling patients, public health surveys, blood screening, pharmacology, etc. Diagnostic algorithms (programmed test sequences) are usually developed by experts without formal quantitative analysis. This process is slow and often yields suboptimal algorithms that do not account for the reliabilities of the underlying tests and are poorly suited to different diagnostic purposes. Because misdiagnosis increases morbidity and mortality by up to 30%, improvement is needed.
Methods. Customizing diagnostic algorithms requires searching over testing topologies to find algorithms that come as close as possible to meeting constraints (e.g., prediction reliability) and that optimize some criterion (e.g., minimizing average cost per patient). We developed software to find optimal testing topologies using genetic algorithms that assess the performances of randomly assembled or evolving testing topologies. Medical surety can be measured by sensitivity, specificity, PPV, NPV, AUC, Youden's index, or diagnostic odds ratio.
Results & discussion. An optimal test topology can maximize a selected criterion of surety, cost, time, or medical invasiveness, while meeting constraints on the other criteria, with generally different answers for different medical purposes. In order to guarantee the constraints have been met, the inherent uncertainties in diagnostic testing parameters, i.e., sensitivity and specificity based on limited empirical sampling, are propagated through calculations using robust Bayes methods.
Conclusions & implications. Using this approach to design medical test algorithms should improve public health by allowing more efficient allocation of public health resources from quantitatively optimized testing strategies and by allowing doctors and patients to make better health decisions because of full knowledge of the concomitant uncertainties in the diagnostic process. It also provides a way to use damaged or poor tests yet maintain medical standards by redesigning the algorithm to account for their lower reliability.
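The kind of bounding that the robust propagation above is meant to deliver can be sketched in a few lines of R; the interval endpoints and the prevalence below are invented, and the brute-force grid stands in for the robust Bayes machinery of the actual software:

    # Bounds on predictive values induced by interval uncertainty about
    # sensitivity and specificity (hypothetical numbers, illustration only)
    se <- seq(0.85, 0.95, length.out = 101)   # interval for sensitivity
    sp <- seq(0.90, 0.98, length.out = 101)   # interval for specificity
    p  <- 0.02                                # assumed disease prevalence

    g   <- expand.grid(se = se, sp = sp)
    ppv <- with(g, se * p / (se * p + (1 - sp) * (1 - p)))
    npv <- with(g, sp * (1 - p) / (sp * (1 - p) + (1 - se) * p))

    rbind(PPV = range(ppv), NPV = range(npv))   # bounds induced by the intervals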
Error propagation in shape analysis
31 January 2020, Liverpool, United Kingdom
Risk Institute Winter School on Machine Learning in Image Processing
Shape analysis is a special problem in feature extraction from images often used to recognise outlines, tracks, silhouettes, signatures, symbols, ordered points, etc. Quantitative methods that can characterise shapes do not comprehensively address measurement uncertainties arising from blurred or missing imagery, registration errors, pixel resolution, ambiguity about landmarks, and deformation of non-rigid objects by stretching, twisting or drooping during image capture. Such imprecision can be characterised using interval methods and projected through intervalized elliptic Fourier analysis and statistical methods such as t-tests and discriminations. The approach also allows us to assess quantitatively how close one shape is to another and to display an array of shapes that are as close to a given shape as a test shape is.
Risk and uncertainty about biological populations
18 December 2019, Liverpool
Ocean Clean-up Symposium, Institute for Risk and Uncertainty
(video)
Risk assessors are beginning to appreciate the need to include ecological processes in their assessment models. The need arises because ecological systems have an inherent complexity that can completely erase the effects of an impact or greatly magnify it, depending on the life histories of the biological species involved. This complexity can also delay the consequence of an impact or alter its expression in other ways. Three central themes have emerged in ecological risk assessment:
1) Variability versus incertitude. Natural biological systems fluctuate in time and space, partially due to interactions we understand, but substantially due to various factors that we cannot foresee. The variability of ecological patterns and processes, and our incertitude about them, prevent us from making precise, deterministic estimates of the effects of environmental impacts. Because of this, comprehensive impact assessment requires a probabilistic language of risk that recognizes variability and incertitude, yet permits quantitative statements of what can be predicted. The emergence of this risk language has been an important development in applied ecology over the last decade. A risk-analytic endpoint is a natural summary that can integrate disparate impacts on a biological system.
2) Population-level assessment. In the past, assessments were conducted at the level of the individual organism, or, in the case of toxicity impacts, even at the level of tissues or enzyme function. To justify costly decisions about remediation and mitigation, biologists are often asked “So what?” questions that demand predictions about the consequences of impacts on higher levels of biological organization. Management plans require predictions of the consequent effects on biological populations and ecological communities. Our scientific understanding of community and ecosystem ecology is very limited, however, and quantitative predictions, even in terms of risks, for complex systems would require vastly more data and mechanistic knowledge than are usually available. Extrapolating the results of individual-level impacts to potential effects on the ecosystem may simply be beyond the current scientific capacity of ecology, which still lacks wide agreement about even fundamental equations governing predator-prey interactions. How can we satisfy the desire for ecological relevance when we are limited by our understanding of how ecosystems actually work? As a practical matter, focusing on populations, meta-populations (assemblages of distinct local populations), and short food chains may be a workable compromise between the organism and ecosystem levels. Risk assessment at the population level requires the combination of several technical tools including demographic models, potentially with explicit age, stage or geographic structure, and methods for probabilistic uncertainty propagation, which are usually implemented with Monte Carlo simulation. Meta-populations and short food chains are likely to be at the frontier of what we can address with scientifically credible models over the next decade.
3) Cumulative attributable risk. Assessments should focus on the change in risk due to a particular impact. The risk that a population declines to, say, 50% of its current abundance in the next 50 years is sometimes substantial whether it is impacted by anthropogenic activity or not. Only the potential change in risk, not the risk itself, should be attributed to impact. On the other hand, for environmental protection to be effective, remediation and mitigation must be designed with reference to the cumulative risks suffered by an ecological system from impacts and from all the various stresses present cumulated through time.
Why people are so bad with probabilities
3 December 2019, Rio de Janeiro
Invited talk for Congresso ABRISCO (Associação Brasileira de Análise de Risco, Segurança de Processos e Confiabilidade)
Analysts commonly believe the public has trouble understanding risk analyses. The failures of risk communication efforts are usually ascribed to the public’s ignorance, irrationality, lack of quantitative skills, or its mistrust of industry or government. But the problem may not be the public’s fault. Humans have been wired by evolution to reckon with uncertainty and make decisions in ways that are fundamentally different from probability theory. Several of the most important biases of probability perception and decision making recognized by psychologists can be interpreted as evolutionarily adaptive strategies for reacting to risks and lack of information about the world. There are good reasons why people (i) routinely underestimate some risks and overestimate others, (ii) are insensitive to prior probabilities, (iii) always ask how bad it could be and ignore how unlikely that outcome is. Understanding these issues will allow risk communicators to formulate more effective strategies.
Answering the ‘so what?’ questions about environmental pollution
2 December 2019, Rio de Janeiro
Congresso ABRISCO (Associação Brasileira de Análise de Risco, Segurança de Processos e Confiabilidade)
Chemical contaminants in aquatic and terrestrial ecosystems such as leachates from mine tailings, agricultural fertilizers and pesticides, manufacturing by-products, and combustion residues can deliver anthropogenic toxicants that adversely affect plants and animals, including humans. The consequences of these effects are determined by natural ecological processes which are inherently complex. They can completely erase the effects of an impact or greatly magnify it, depending on the life histories of the biological species involved. This complexity can also delay the consequence of an impact or alter its expression in other ways. Moreover, natural biological systems fluctuate in time and space, often due to factors such as weather that we cannot predict. Our scientific understanding of ecosystem ecology is itself very limited, and quantitative predictions for such systems would require vastly more data and mechanistic knowledge than are usually available. So the complexity and variability of these natural systems and our lack of knowledge about them prevent us from making precise estimates. In particular, extrapolating the results of individual-level impacts observed in toxicology laboratories to effects at the ecosystem level may simply be beyond the current scientific capacity of ecology. Thus, toxicity assessments at the level of the individual organism or below often cannot answer basic “so what?” questions. What does it mean if some fish die because of a contaminant that otherwise wouldn’t, or their reproduction is reduced? Cannot the natural resilience of a population allow it to rebound? Can we be sure there will be any noticeable impact at all on the population as a whole? As a practical matter, ecological risk analysis must focus on populations and short food chains as a useful compromise between relevance and tractability. Such assessments can form the basis of a regulatory framework designed to protect environmental resources, and likewise inform polluting industries about how they should provision for uncertainties in their consideration of remediation strategies.
What about the other humans?
26 November 2019, Glasgow, Scotland
Human Reliability & Intelligent Autonomous Systems
Human factors research is usually aimed at the humans who make mistakes in operations, such as the workers, line operators, yeomen, technicians and other people who keep the system running. But what about the people who designed the systems and the analyses, such as the planners, designers, programmers, and engineers who set up the system? Their mistakes can be even more consequential, yet there are few checks to prevent them other than peer review, and there are surprisingly few ergonomic studies that consider their error rates and contributory factors. We argue for three schemes, whose wide adoption is long overdue, by which design-time errors can be avoided or mitigated: [1] abandonment of calculation environments such as spreadsheets and tools that interact with designers in limited ways (such as visualisation only) or are otherwise hard to review, [2] adoption of programming languages that natively recognise units and dimensions and automatically check expressions for soundness and consistency, and [3] deployment of automatic uncertainty and sensitivity analyses performed in the background, concomitant with all base calculations, that can inform quantitative results.
Prediction and decision making from bad data
26 September 2019, Hannover, Germany
Keynote address at 29th International European Safety and Reliability Conference (ESREL), Leibniz Universität Hannover
Engineering has entered a new phase in which ad hoc data collection plays an ever more important role in the planning, development/construction, operation, and decommissioning of structures and processes. Intellectual attention has largely focused on exciting new sensing technologies, and on the prospects and challenges of 'big data'. A critical issue that has received less attention is the need for new data analysis techniques that can handle what we might call bad data: data that do not obey the assumptions required for a planned analysis. Most widely used statistical methods, and essentially all machine learning techniques, are limited in application to situations in which their input data are (i) precise, (ii) abundant, and (iii) characterised by specific properties such as linearity, independence, completeness, balance, or being distributed according to a named or particular distribution.
Although statistical techniques have been developed for situations in which some of these requirements can be relaxed, the techniques often still make assumptions about the data that may be untenable in practice. For instance, methods to handle missing data may assume the data are missing at random, which is rarely true when sensors fail under stress. Of course, even in the age of big data, we may have small data sets for rare events such as those associated with tiny failure rates, unusual natural events, crime/terror incidents, uncommon diseases, etc. Although many statistical methods allow for small sample sizes, they generally require data to be representative of the underlying population, which can be hard to guarantee. Moreover, not all uncertainty has to do with small sample sizes. Poor or variable precision, missing values, non-numerical information, unclear relevance, dubious provenance, contamination by outliers, errors and lies are just a few of the other causes that give us bad data.
We review the surprising answers to a few questions about bad data:
How can we handle data that is incomplete, unbalanced, or has missing or censored values?
When investing in sensors, when are more sensors preferable to more precise sensors?
What can be done with ludicrously small data sets, like n=8, or n=2, or even n=1?
What if the data are clearly not collected randomly?
Can bad data be combined with good data? When shouldn’t they be combined?
When can increasing the number of sensors counterintuitively increase uncertainty?
Analyses can be conducted along a spectrum of increasing robustness, from assumption-laden to assumption-free. Software tools are needed to track the assumptions we make in data analyses and to automatically characterise the robustness of the estimations and conclusions we draw from them.
Mobility decision support system: integrating risks and costs for personal and social decision making
15 July 2019, Liverpool
Invited talk for Transportation Information and Safety (ICTIS)
We describe a planned mobility decision support system developed and maintained by open-source software co-creation to advise both regional planners and individual travellers. The system employs stochastic optimisation to identify optimal modes, schedules and routes for travel from pre-computed risk maps that account for various costs of travel including (1) risk of death and injury for the traveller, passengers, pedestrians, and other travellers, (2) environmental costs in terms of likely emissions of vehicle exhaust, NOx, hydrocarbons, particulate matter, greenhouse gases during the trip, and the attributable ecological impacts associated with habitat destruction and dissection from infrastructure construction and maintenance, and (3) economic costs of the trip given the route, schedule, mode, and vehicle, but also the indirect economic costs associated with traffic congestion delays, health impacts from injuries and pollution, environmental degradation, and infrastructural investment and maintenance. The system facilitates distributed optimal decision making by leveraging stochastic optimisation and blockchain accounting with strong encryption to protect personal privacy. The risk maps are created by both generic models and local models developed for particular regions using local expertise and regional data. Individual travellers making use of the smart phone app create a feedback stream of data relevant for the decision engine and transportation science generally. Encouraged under a citizen science program, data streams from hospitals, insurers, police, government bodies, and other contributors will also inform the decision engine about local conditions. Network research partners develop local data sets and data streams, and the local risk models that take account of local laws, customs, conventions within each country or region. Everyone can contribute to the data sets and models used to create risk maps, so the result is a truly co-created system. The system incorporates mechanisms to minimize improper uses by individuals or governments such as vandalism, rumour-mongering, advertising, espionage, and warfare.
Keywords: trip planning; citizen science; distributed decision making; stochastic optimization; risk maps; risk appetites
Distributed decision making with poor data
Nick Gray*, Noémie Le Carrer, Robert Birch, Edoardo Patelli, Scott Ferson
7 May 2019, Cape Town, South Africa
We describe a web-based mobility decision support system using stochastic optimisation to identify optimal modes, schedules and routes for travel from risk maps that account for various costs of travel including (1) injury risks for passengers and pedestrians, (2) environmental costs from emissions and habitat destruction attributable to vehicles and roads, and (3) economic costs including indirect costs of traffic delays, injuries, environmental degradation, and infrastructure investments. The system is an open-source co-creation with data collection from citizen science participation and the use by individual travellers of smart phone apps based on Google Maps and similar mapping resources. Like Wikipedia, the system is populated by individual experts who can update each other’s work. Scientists in both the developed and developing worlds are fashioning risk models using available data to create risk/uncertainty maps that integrate economic consequences, environmental impacts, and injury risks to people. These maps are used to plan individual trips and mobility options in light of the travellers’ personal preferences, risk tolerances and appetites, and trip purpose, which are elicited via the app and users’ past choices. The maps are also used by regional planners to evaluate possible infrastructure improvements, evacuation plans, and investments, policies and regulations for transportation. The accessibility of the engine and data is the system’s strength but also its vulnerability, so provisions are necessary to protect it from improper use. It does not archive personal information about individual travellers. It uses strong anonymisation which constrains statistical risks of re-identification or other disclosure of personal information about users. The system also incorporates mechanisms to detect, minimize and mitigate other improper uses by individuals or governments such as vandalism, rumour-mongering, advertising, espionage, and warfare.
Safer, greener and cheaper transportation in the developing world
TBD
Transportation in the developing world is extraordinarily costly in terms of risks to human lives, direct and indirect economic expenditures, and environmental impacts. For instance, fifty million traffic injuries and about 1.3 million deaths occur each year on the world’s roads, but the vast majority (93%) are in low- and middle-income countries, even though they have only about half the world’s vehicles. Demographic trends suggest that vehicle ownership in developing countries will increase sharply in the future, portending even higher rates of injuries and environmental and economic impacts. The poorest countries often have the worst rates, but middle-income countries have higher total numbers of injuries and environmental impacts simply because they have more cars. Road fatalities are the number one risk for people between 14 and 46 years of age, who are the most economically productive population segment. Medical bills, lost output, and vehicle damage from accidents alone cost about 5% of GDP in middle- and low-income countries. Further adverse economic consequences arise from traffic congestion, which is steadily worsening in major cities, including in the developing world. Likewise, environmental impacts from vehicle emissions and the attributable environmental destruction from vehicle and infrastructure construction and maintenance are substantial and cumulative. The current, relatively lower rates in the developed world are a recent phenomenon. Fatality rates in the United Kingdom, for instance, were three times larger fifty years ago. These lower injury rates and lower emissions are due to decades of engineering planning and research, often by trial and error, and they were won at the cost of the injuries and impacts across those years. So is it necessary that countries in the developing world experience the human carnage and environmental damage before their injury and impact rates decline? Can they leapfrog over these costs?
Computing with uncertainty: creating a tool to handle uncertain numbers in computations
Nick Gray, Marco De Angelis and Scott Ferson
TBD
Many scientists and engineers work with legacy computer codes that do not take full account of uncertainties. Because analysts are typically unwilling to rewrite their codes, various simple strategies have been used to remedy the problem, such as elaborate sensitivity studies or wrapping the program in a Monte Carlo loop. These approaches treat the program like a black box because users consider it uneditable. However, whenever it is possible to look inside the source code, it is better characterized as a “crystal box”. Strategies are needed that automatically translate original source into code with appropriate uncertainty representations and propagation algorithms.
We have developed an uncertainty compiler for this purpose. It handles the specifications of input uncertainties and inserts calls to an object-oriented library of “intrusive” uncertainty quantification (UQ) algorithms. We use ANTLR, a parser/lexer generator, and Python to translate original code into UQ code in the same language. In theory, the approach could work with any computer language. We currently support Python and are working to handle FORTRAN, C, and MATLAB languages.
A very useful extension to the uncertainty compiler is to automatically detect repetition of uncertain inputs within mathematical expressions. Uncertainty analyses are sensitive to repeated inputs. For example, if A, B and C are intervals, AB+AC yields wider bounds than A(B+C), because the dependence of the two A’s is ignored. Similar problems beset all uncertainty methods, including step-wise Monte Carlo analyses. In practice, the optimal answer can be easily computed if the expression can be rearranged into a form that contains only one occurrence of each uncertain parameter. It is easy to detect such repetitions in the uncertainty compiler and issue appropriate warnings to the user.
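The effect is easy to demonstrate with a toy interval type in R (this is only an illustration, not the compiler's library; the endpoints are arbitrary):

    # Naive interval arithmetic showing the repeated-variable effect
    int     <- function(lo, hi) c(lo = lo, hi = hi)
    int.add <- function(x, y) int(x[["lo"]] + y[["lo"]], x[["hi"]] + y[["hi"]])
    int.mul <- function(x, y) {                       # makes no assumption about signs
      p <- c(x[["lo"]] * y[["lo"]], x[["lo"]] * y[["hi"]],
             x[["hi"]] * y[["lo"]], x[["hi"]] * y[["hi"]])
      int(min(p), max(p))
    }

    A <- int(1, 2); B <- int(2, 3); C <- int(-1, 1)

    int.add(int.mul(A, B), int.mul(A, C))   # A*B + A*C: ignores that both A's are the same, gives [0, 8]
    int.mul(A, int.add(B, C))               # A*(B + C): single use of A, gives the tighter [1, 8]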
Even more useful would be to automatically simplify mathematical expressions with such repetitions in a way that reduces the repetitions of parameters containing uncertainty. Although this problem is known to be NP-hard in general, software strategies can be designed to find expressions with fewer repetitions of the same variable, and even partial solutions allow improved calculations. The needed simplification is very different from that normally sought in computer algebra (‘simple’ expressions have only a single instance of each uncertain parameter but they can be arbitrarily complex in other respects).
We are exploring a strategy that repeatedly applies mathematical identities that reduce the number of appearances of uncertain parameters. There are many such reducing templates. The approach is to parse an expression into a binary tree, and search for matches with a reducing template in each subtree. The search is iterated over all the templates and over all subtrees, and it is repeated until no further reduction occurs. To shorten the list of reducing templates, the matching algorithms automatically test multiple rearrangements of the subtree that are implied by associativity and commutativity of basic operators.
Bayes' rule in medical diagnostics: implications for kindergarteners' cooties
Nick Gray, Marco De Angelis and Scott Ferson
TBD
The problem of classification is very general and arises in many fields, spanning structural health in engineering, supervised learning in computer science, and patient diagnosis in medicine. Tests on which classifications rest are sometimes imperfect, yielding false alarms, undetected threats and other misclassifications. For instance, medical practitioners commonly diagnose a patient’s health condition by employing a medical test which is not by itself definitive, but has some statistical probability of revealing the true health state. Naively interpreting the result from a medical test can therefore lead to an incorrect assessment for a patient’s true health condition because of the possibility of false-positive and false-negative disease detections. Bayes’ rule is commonly used to estimate the actual chance a patient is sick given the results from the medical test, from the statistical characteristics of the test used and the underlying prevalence of the disease.
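As an illustration of the traditional calculation at issue, the following minimal R sketch applies Bayes' rule to made-up test characteristics; the sensitivity, specificity and prevalence values are hypothetical:

    sens <- 0.90      # P(test positive | sick), hypothetical
    spec <- 0.95      # P(test negative | well), hypothetical
    prev <- 0.01      # P(sick), hypothetical prevalence

    post.positive <- sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    post.negative <- (1 - sens) * prev / ((1 - sens) * prev + spec * (1 - prev))

    c(P.sick.given.positive = post.positive,   # about 0.15 despite the 'good' test
      P.sick.given.negative = post.negative)   # about 0.001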
Winkler and Smith have argued that the traditional application of Bayes’ rule in medical counseling is inappropriate and represents a “confusion in the medical decision-making literature”. They propose in its place a radically different formulation that makes special use of the information about the test results for new patients, although not their actual disease status. The test results are used to update the estimates of the test’s sensitivity and specificity and the underlying prevalence of the disease, and thus improve the test asymptotically as the test is applied. Remarkably, Bayesians do not seem to have a means within their theory to determine whether the traditional approach or the Winkler and Smith approach is correct. Their reasoning would apply generally, beyond medicine, to all diagnostic and classification problems in science and engineering.
We criticize this approach, and argue that it allows for a test for the mythical childhood disease “cooties” such that, whenever it says someone has cooties, the test itself becomes more reliable, even though there is no independent gold-standard determination of whether one has cooties or not. Indeed, the test also appears to become more reliable if it says one does not have cooties. Such behavior reveals a logical reductio ad absurdum that proves untenability of the approach.
Keywords: diagnostics, sparse data, Bayes rule, false positives, prevalence, cooties.
References
Leonard, T., and J.S.J. Hsu, Bayesian Methods: An Analysis for Statisticians and Interdisciplinary Researchers (Cambridge University Press, 1999).
Mossman, D., and J.O. Berger, Medical Decision Making 21, 498 (2001).
Seidenfeld, T., and L. Wasserman, The Annals of Statistics 21, 1139 (1993).
Walley, P., Statistical Reasoning with Imprecise Probabilities (Chapman and Hall, London, 1991).
Walley, P., Journal of the Royal Statistical Society, Series B 58, 3 (1996).
Walley, P., L. Gurrin, and P. Barton, The Statistician 45, 457 (1996).
Winkler, R.L., and J.E. Smith, Medical Decision Making 24, 654 (2004).
Cognitive biases arise from conflating epistemic and aleatory uncertainty
21 February 2019, Berlin
International Conference on Uncertainty in Risk Analysis, Bundesinstitut für Risikobewertung (BfR) and European Food Safety Authority (EFSA)
(video)
Decision scientists and psychometricians have described many cognitive biases over the last several decades, which are widely considered to be manifestations of human irrationality about risks and decision making. These phenomena include probability distortion, neglect of probability, loss aversion, ambiguity aversion and the Ellsberg Paradox, hyperbolic discounting, among others. We suggest that all these and perhaps other biases arise from the interplay between distinct special-purpose processors within the multicameral human brain whose existence is implied by recent clinical and neuroimaging evidence. Although these phenomena are usually presumed to be misperceptions or cognitive illusions, we describe the evolutionary significance of these phenomena in humans and other species, and we place them in their biological context where they do not appear to be failings of the human brain but rather evolutionary adaptations. Apparent paradoxes arise when psychometricians attempt to interpret human behaviors against the inappropriate norm of the theory of probability, which turns out to be an overly precise calculus of uncertainty when in reality the different mental processors give contradictory results. This view of the psychological and neurological evidence also suggests why risk communication efforts so often dramatically fail and how they might be substantially improved. For instance, it now seems clear that what risk analysts call epistemic uncertainty (i.e., lack of knowledge or ambiguity) and aleatory uncertainty (variation or stochasticity) should not be rolled up into one mathematical probabilistic concept in risk assessments, but they instead require an analysis that distinguishes them and keeps them separate in a way that respects the cognitive functions within the decision makers to whom risk communications are directed.
Communicating probability with natural frequencies and the equivalent binomial count
Scott Ferson, Jason O'Rawe, Michael Balch
22 February 2019, Berlin
International Conference on Uncertainty in Risk Analysis, Bundesinstitut für Risikobewertung (BfR) and European Food Safety Authority (EFSA)
(video)
Risk communication strategies for expressing a probability presume the probability is precisely characterized as a real number. In practice, however, such probabilities can often only be estimated from data limited in abundance and precision. Likewise, risk analyses often yield imprecisely specified probabilities because of measurement error, small sample sizes, and model uncertainty. Under the theory of confidence structures, the probability of an event estimated from binary data with k successes out of n trials is associated with a particular structure that has the form of a p-box, i.e., bounds on a cumulative distribution function. When n is large, this structure approximates the beta distribution obtained by a Bayesian analysis under a binomial sampling model and Jeffreys prior, and asymptotically it tends to the frequentist estimate k/n. But when n is small, it is imprecise and cannot be approximated by any single distribution. Confidence structures emphasize the importance of n to the reliability of the estimate. If n is large, the probability estimate is more reliable than if n is small. A probability resulting from a risk analysis can be approximated by a confidence structure corresponding to some values of k and n. Thus we can characterize the probability with a terse, natural-language expression of the form “k out of n”, where k and n are nonnegative integers and 0≤k≤n. We call this an equivalent binomial count, and argue that it condenses both the probability and uncertainty about that probability into a form that psychometry suggests will be intelligible to humans. Gigerenzer calls such integer pairs “natural frequencies” because humans appear to natively understand their implications, including what the size of n says about the reliability of the probability estimate. We describe data collected via Amazon Mechanical Turk showing that humans correctly interpret these expressions.
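A minimal R sketch of such a structure, assuming the usual c-box form for a binomial probability in which the CDF is bounded by Beta(k, n-k+1) and Beta(k+1, n-k) distributions (our reading; the exact construction in the confidence-structure literature may differ in detail), and comparing it with the Jeffreys-prior posterior mentioned above:

    k <- 2; n <- 10                          # e.g., the expression '2 out of 10'
    p <- seq(0, 1, by = 0.001)

    cdf.hi <- pbeta(p, k, n - k + 1)         # upper bound on the CDF of the probability
    cdf.lo <- pbeta(p, k + 1, n - k)         # lower bound on the CDF

    plot(p, cdf.hi, type = "l", xlab = "event probability",
         ylab = "cumulative probability")
    lines(p, cdf.lo)
    lines(p, pbeta(p, k + 0.5, n - k + 0.5), lty = 2)  # Jeffreys posterior for comparison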
Naked expert elicitations of probabilities of rare events
December 2018, New Orleans, Louisiana
Society for Risk Analysis annual meeting
There are basically two approaches to estimating probability without actual data: expert elicitation (i.e., guessing), and disaggregation into constituent parts whose probabilities are easier to estimate (i.e., breaking into subproblems). When the latter approach is no longer workable, analysts must resort to the former and rely on expert opinion and estimation. But how should we characterize probabilities of events that are so rare that they have never been observed? By what principles can such characterizations be projected in probabilistic analyses? Sometimes elaborate elicitation strategies are employed to estimate rare-event probabilities, but the results are often expressed as probabilities with no indication of the uncertainty associated with the estimate. How might analysts model expert opinions about event probabilities of the form “1 in 10 million”, “about 1 in 1000”, or “it’s never seen in over 100 years of observation”, so they can be used in calculations that account for rather than ignore epistemic uncertainty? Several strategies provide partial solutions, addressing significant digits, hedged expressions, order of magnitude, precision overstatement bias, and uncertainty about the Bayesian prior. For instance, presumably the assertion that an event has a probability of 1 in 1,000 would include probability values as low as 0.5 in 1000, and as large as 1.5 in 1000. Linguistic analysis reveals a simple scheme to decode approximator words such as ‘about’, ‘around’, and ‘at least’ in natural-language expressions. Robust Bayes analysis can account for uncertainty about the prior. These strategies can be combined in a coherent probabilistic analysis that minimally captures the express epistemic uncertainty implied by common utterances from experts. The analysis is broadly acceptable under both Bayesian and frequentist interpretations of probability, and it distinguishes epistemic and aleatory uncertainties. The result represents a lower bound on the final uncertainty.
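For instance, the significant-digit reading of a stated figure can be decoded mechanically; the small R sketch below is only a guess at the simplest such scheme, treating a stated value as the set of probabilities that would round to it:

    # Decode a stated 'value in per' expression into an interval of probabilities
    decode <- function(value, per) {
      half <- 0.5 * 10^floor(log10(value))   # half of the last significant place
      c(lo = (value - half) / per, hi = (value + half) / per)
    }

    decode(1, 1000)      # '1 in 1000'  ->  [0.0005, 0.0015]
    decode(3, 100)       # '3 in 100'   ->  [0.025, 0.035]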
Monte Carlo simulation and probability bounds analysis in R with hardly any data
2 and 6 December 2018, New Orleans, Louisiana
Workshop at the Society for Risk Analysis annual meeting
This full-day workshop features hands-on examples worked in R on your own laptop, from raw data to final decision. The workshop introduces and compares Monte Carlo simulation and probability bounds analysis for developing probabilistic risk analyses when little or no empirical data are available. You can use your laptop to work the examples, or just follow along if you prefer. The examples illustrate the basic problems risk analysts face: not having much data to estimate inputs, not knowing the distribution shapes, not knowing their correlations, and not even being sure about the model form. Monte Carlo models will be parameterized using the method of matching moments and other common strategies. Probability bounds will be developed from both large and small data sets, from data with non-negligible measurement uncertainty, and from published summaries that lack data altogether. The workshop explains how to avoid common pitfalls in risk analyses, including the multiple instantiation problem, unjustified independence assumptions, the repeated variable problem, and what to do when there’s little or no data. The numerical examples will be developed into fully probabilistic estimates useful for quantitative decisions and other risk-informed planning. Emphasis will be placed on the interpretation of results and on how defensible decisions can be made even when little information is available. The presentation style will be casual and interactive. Participants will receive handouts of the slides and electronic files with software for the examples.
Engineering design, validation and predictive capability under aleatory and epistemic uncertainty
8 October 2018, Udine, Italy
For stochastic models that make predictions in the form of probability distributions or other structures that express predictive uncertainty, validation must contend with observations that may be sparse or imprecise, or both. The predictive capability of these models, which determines what we can reliably infer from them, is assessed by whether and how closely the model can be shown to yield predictions conforming with available empirical observations beyond those data used in the model calibration process. Although a validation match between the model and data can be easier to establish when the predictions or observations are uncertain, the model’s predictive capability is degraded by either kind of uncertainty. Measures used for validation and estimating predictive capability should not confuse variability with lack of knowledge, but rather integrate these two kinds of uncertainties (sometimes denoted ‘aleatory’ and ‘epistemic’) in a way that leads to meaningful statements about the fit of the model to data and the reliability of predictions it generates. Engineering design under epistemic uncertainty must incorporate backcalculation and controlled backcalculation as fundamental operators that untangle expressions involving uncertainty.
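As an indication of what backcalculation does, the following R sketch (illustrative only, with made-up numbers) finds the widest interval B whose sum with a known uncertain term A is guaranteed to stay within a required envelope C, a containment that naive interval subtraction of C minus A would not provide:

    A <- c(lo = 2, hi = 5)                    # known uncertain term (hypothetical)
    C <- c(lo = 10, hi = 20)                  # required constraint on A + B

    B <- c(lo = C[["lo"]] - A[["lo"]], hi = C[["hi"]] - A[["hi"]])
    if (B[["lo"]] > B[["hi"]]) stop("no interval B can satisfy the constraint")
    B                                         # [8, 15]

    # check: the forward sum A + B stays within C
    c(A[["lo"]] + B[["lo"]], A[["hi"]] + B[["hi"]])   # [10, 20], inside C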
Non-Laplacian uncertainty: practical consequences of an ugly paradigm shift about how we handle not knowing
17-21 September 2018, Compiègne, France
Keynote address, Belief and Soft Methods in Probability and Statistics
The relaxation of the completeness axiom of subjective expected utility theory leads to a non-Laplacian kind of uncertainty commonly known as ignorance. Taking account of how this differs from the uncertainty of probability theory will have broad and important implications in engineering, statistics, and medicine. However, the change can be shallow in the sense that practices need not be radically transformed during the shift.
Engineering Day at SIPTA Summer School
27 July 2018, Oviedo, Spain
Day-long workshop at the 8th Summer School on Imprecise Probabilities: Theory and Applications
We are at a crossroads in our scientific appreciation of uncertainty. The traditional view is that there is only one kind of uncertainty and that probability theory is its calculus. This view leads in practice to quantitative results that are often misconstrued and demonstrably misleading. An emerging alternative view, however, entails a richer mathematical concept of uncertainty and a broader framework for uncertainty analysis. The concept admits a kind of uncertainty that is not handled by traditional Laplacian probability measures. The “engineering” day will discuss this non-Laplacian view that different kinds of uncertainty must be propagated differently through simulations, reliability and risk analyses, calculations for robust design, and other computations. The modern approach makes practical solutions easier for engineering and physics-based models, and the inferences drawn from such models under this view are more defensible. Topics include
aleatory versus epistemic uncertainty (variability v. incertitude),
probability boxes to characterise imprecise random numbers,
integrating available ancillary knowledge to improve estimates,
confidence structures generalising Walley’s Imprecise Beta Model,
why bounding probabilities is not always sufficient,
handling dependence among input variables,
sensitivity analysis,
validation and predictive capability,
engineering design via backcalculation,
spacecraft mission analysis and early design, and
satellite conjunction analysis,
depending on time and the interests of participants. We will use a convenient implementation in R of interval arithmetic and probability bounds analysis to illustrate several numerical examples.
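For a taste of those examples, the sketch below (hypothetical numbers, not the workshop's own materials) builds a simple probability box for a normal variable whose mean is known only to an interval and reads off bounds on an exceedance probability:

    mu <- c(9, 11)          # interval estimate of the mean (made-up)
    s  <- 1                 # standard deviation, assumed known

    x <- seq(5, 15, by = 0.01)
    cdf.hi <- pnorm(x, mu[1], s)     # upper bound on the CDF (smallest mean)
    cdf.lo <- pnorm(x, mu[2], s)     # lower bound on the CDF (largest mean)

    plot(x, cdf.hi, type = "l", ylab = "cumulative probability")
    lines(x, cdf.lo)

    # bounds on P(X > 12) induced by the interval mean
    c(lo = 1 - pnorm(12, mu[1], s), hi = 1 - pnorm(12, mu[2], s))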
Indubiety and the crisis of trust in the age of fake news
12 July 2018, Daresbury, United Kingdom
Invited talk, 3-4 pm, Science and Technology Facilities Council
The modern world brings us communication and connectedness unparalleled in the history of humanity, but we are not prepared for this world. Recent research in risk communication has revealed the underlying psychometric reasons that conventional approaches to science education are insufficient today. We are susceptible to fake news, dubious science, and clever advertisers. Our votes can be swayed by duplicitous politicians and terrorist shocks. More and more adults subscribe to irrational conspiracy theories. The number of people who believe the Earth is flat is increasing exponentially via the web. This gullibility becomes dangerous in our hyperconnected world, and it is a problem for our society as serious as illiteracy was in previous centuries. We argue that pervasive uncertainty analysis is one of several strategies that can be useful in cultivating—and teaching—healthy skepticism without falling into the trap of losing trust in everything.
Even if: the power of bounding analysis to settle discrepancies in risk assessment
Scott Ferson, Daniel Rozell, and Lev Ginzburg
13 June 2018, Amsterdam, Netherlands
Risk & Uncertainty Conference
Background/objective Risk analyses undertaken to characterise the effects of industrial and other commercial activities on the environment are often especially contentious, perhaps because they involve conflicting world views about human dominion versus stewardship of the planet and its resources, and also because the underlying stakes are understood in terms of short-term economic growth and jobs against environmental preservation and human health considerations over the long term. Typically, the debate revolves around multiple technical issues that are quantified imprecisely and expressed in often vague language of possibility. It is hard to discern whether one issue or multiple issues in combination would mitigate or overwhelm other considerations. We sought to determine whether bounding analysis could be useful in settling disputes in cases where there are discrepancies among stakeholders about which issues might be at play and important in a risk analysis.
Method/approach We developed quantitative tools consisting of simple but stochastic models and their user-friendly software implementations with graphical outputs. These tools were used to mediate debate among disparate stakeholders in environmental assessments. Here we review two case studies employing these tools. The first case study concerns assessment of the ecological impacts of anadromous fish populations from cooling water intake structures necessary for nuclear power generation. The second case study concerns the environmental impacts of hydraulic fracturing (fracking) associated with hydrocarbon exploration and production. We introduced the quantitative tools to contextualise the possible consequences of theoretical phenomena and unseen effects. Discussants were allowed to suggest assumptions and parameter values in live, collaborative experiments quantitatively exploring the possible magnitudes of the various effects and phenomena under debate.
Findings Before using the tools, discussants were debating phenomena that could theoretically alter impacts but are hard to assess because of uncertainty about conditions and parameters. For example, biologists working for the nuclear power industry touted the importance of compensation (density dependence) in population biology of fishes as a mechanism by which ecological impacts might be absorbed without deleterious effects at the population level. But compensation is a nonlinearity that is notoriously difficult to characterise with sparse data. Likewise, stakeholders often worry and ask about the “unseen” or “unknown” effects of an impact, or the aggregate effects of many unseen impacts. These concerns are often befuddling to other stakeholders (and perhaps to analysts) who don’t understand, as a practical matter, how to account for something that is unknown and unknowable because it is unseen. The tools allowed even relatively thorny debates to be at least partially resolved by open discussion and, importantly, quantitative calculations based on bounding assessments. Bounding impacts, including risks expressed probabilistically, can be remarkably effective for settling scientific debates when relevant information is sparse and difficult to obtain. Open access to simple but stochastic models implemented in accessible software can be key to resolving long-standing technical disagreements.
Discussion/conclusion There may be little that can be done when competing interests engage in duplicity or outright lying in debate, but even in good faith different schools of thought may have legitimate concerns whose complexity prevents them from recognising the commonalities and general quantitative agreement they share with each other. These findings have implications for other contentious risk assessments such as those concerning global climate change.
Indubiety and the crisis of trust in the age of fake news
12 June 2018, Amsterdam, Netherlands
Risk & Uncertainty Conference
The modern world brings us communication and connectedness unparalleled in the history of humanity, but we are not prepared for this world. Recent research in risk communication has revealed the underlying psychometric reasons that conventional approaches to science education are insufficient today. We are susceptible to fake news, dubious science, and clever advertisers. Our votes can be swayed by duplicitous politicians and terrorist shocks. More and more adults subscribe to irrational conspiracy theories. The number of people who believe the Earth is flat is increasing exponentially via the web. This gullibility becomes dangerous in our hyperconnected world, and it is a problem for our society as serious as illiteracy was in previous centuries. We argue that pervasive uncertainty analysis is one of several strategies that can be useful in cultivating—and teaching—healthy skepticism without falling into the trap of losing trust in everything.
Liverpool Institute for Risk and Uncertainty
TBD
Founded in 2011, the University of Liverpool's Institute for Risk and Uncertainty (Risk Institute) represents a unique national centre of multidisciplinary excellence in risk analysis and uncertainty quantification and modelling, including related fields in reliability engineering, risk communication, and planning and design for robustness, resilience and sustainability. It brings together forty affiliated members of academic staff from engineering, psychology, environmental science, medicine, finance, management, computer science, physics, and mathematics, drawn from ten departments across all three faculties of the University. The Risk Institute hosts a Centre for Doctoral Training in Quantification and Management of Risk & Uncertainty in Complex Systems & Environments under funding from EPSRC and ESRC (2014-2023), and it has supported over eighty PhD students so far. The research at the Risk Institute addresses industrial and societal needs, at both local and global scales, which arise from the rapidly growing complexity of systems of various kinds, their environment and associated risks and uncertainties. A feature of the Risk Institute is its extraordinarily strong industrial involvement. Research and training involve dozens of industrial partners as well as many academic collaborators from around the globe. Our multidisciplinary approach to all projects is key to translating and communicating risk and uncertainty across the boundaries of disciplines and to bridging the divide between mathematical details and straightforward explanation in ordinary human language. The Risk Institute's strengths include Bayesian analysis, imprecise probabilities, advanced techniques for Monte Carlo simulation, and communication of risks and uncertainty.
Validation and predictive capability of imperfect models with imprecise data
16 May 2018, Minneapolis, Minnesota
Keynote address to the ASME 2018 V&V Symposium
Many sophisticated models in engineering today incorporate randomness or stochasticity and make predictions in the form of probability distributions or other structures that express predictive uncertainty. Validation of such models must contend with observations that are often sparse or imprecise, or both. The predictive capability of these models, which determines what we can reliably infer from them, is assessed by whether and how closely the model can be shown to yield predictions conforming with available empirical observations beyond those data used in the model calibration process. Interestingly, a validation match between the model and data can be easier to establish when the predictions or observations are uncertain, but the model’s predictive capability is degraded by either kind of uncertainty. It is critical that measures used for validation and estimating predictive capability not confuse variability with lack of knowledge, but rather integrate these two kinds of uncertainties (sometimes denoted ‘aleatory’ and ‘epistemic’) in a way that leads to meaningful statements about the fit of the model to data and the reliability of predictions it generates.
Florianopolis
How do we handle extremely rare events in a meaningful way?
6 March 2018, Manchester, United Kingdom
Keynote address, EPSRC HubNet Risk Day, The University of Manchester
How should we estimate probabilities of events that are extremely rare? Sometimes such probabilities are elicited from experts, but the results are often expressed with no indication of the uncertainty associated with the estimate. Is it reasonable to evaluate fault trees using expert opinions about event probabilities of the form “1 in 10⁷”, “about 1 in 1000”, or “it’s never seen in over 100 years of observation”? A combination of strategies including linguistic decoding and robust Bayes analysis can account for uncertainties about rare-event probabilities in a way that is more comprehensive than was previously possible.
Validation of scientific and engineering models with imprecise data
3 January 2018, Boca Raton, Florida
Florida Atlantic University, Department of Ocean and Mechanical Engineering
Validation of scientific and engineering models must contend with observations that are usually both sparse and imprecise. The predictive capability of these models, which determines what we can reliably infer from them, is assessed by whether and how closely the model can be shown to yield predictions conforming with available empirical observations beyond those data used in the model calibration process. The methods of probability bounds analysis allow us to compute constraints on estimates of probabilities and probability distributions that are guaranteed to be correct even when some model assumptions are relaxed or removed entirely. This allows models to better represent the variability in the natural world as well as our imprecision about it. Interestingly, a match between the model and data can sometimes be easier to establish when the predictions are uncertain because of ambiguities about the model structure or when empirical measurements are imprecise, but the resulting predictive capability is degraded by both phenomena. One might hope to define a scalar metric that assesses in some overall sense the dissimilarity between predictions and observations. But it is more informative and useful to distinguish predictions and observations in two senses, one concerned with epistemic uncertainty and one concerned with aleatory uncertainty.
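To make the idea of comparing uncertain predictions with sparse observations concrete, the minimal Python sketch below scores the mismatch as the area between a model's predictive cumulative distribution and the empirical distribution of the validation data. It is only an illustration with invented numbers and function names, not the measures advocated in the talk.

```python
# A minimal, illustrative sketch (not the measures described in the talk):
# score the mismatch between a stochastic model's predictions and sparse
# observations as the area between the predictive CDF and the empirical CDF.
# All names and numbers here are assumptions made for the example.
import numpy as np

def ecdf(samples):
    """Return a function evaluating the empirical CDF of `samples`."""
    x = np.sort(np.asarray(samples, dtype=float))
    return lambda t: np.searchsorted(x, t, side="right") / x.size

def area_between_cdfs(predictions, observations, grid_points=2000):
    """Approximate the integral of |F_model - F_data| on a uniform grid."""
    lo = min(np.min(predictions), np.min(observations))
    hi = max(np.max(predictions), np.max(observations))
    grid = np.linspace(lo, hi, grid_points)
    gap = np.abs(ecdf(predictions)(grid) - ecdf(observations)(grid))
    return gap.mean() * (hi - lo)          # uniform-grid approximation

rng = np.random.default_rng(1)
preds = rng.normal(10.0, 2.0, size=10_000)     # model's predictive distribution
obs = np.array([9.1, 12.4, 10.8, 8.7, 11.9])   # sparse empirical observations
print(area_between_cdfs(preds, obs))           # units of the predicted quantity
```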
Title not yet decided: the value of procrastination in risk analysis
13 December 2017, Arlington, Virginia
Plenary address to the Society for Risk Analysis annual meeting
I haven't had time to write this abstract yet, but I'm sure it'll mention how useful procrastination can be in daily life. People in business call it the "second mover advantage", but I think it's more evocative to just say "the second mouse gets the cheese". Waiting allows us to accumulate more knowledge relevant to the decision. Procrastination has many other advantages too. I plan to make a list of the ones I can find easily online, but I haven't gotten around to it. But I remember my favorite of these advantages is this: "eventually they stop asking". Things do get done, though mostly because procrastination uses fear of missing a deadline as a motivator. Compressing the time scale for work efficiently marshals our energies and focuses our efforts. When I get a good graduate student, I want to explore something else about procrastination. I haven't had a chance to work this all out yet, but I believe there is a deep connection between risk communication and procrastination. Here's the background: The human brain has a probability sense, but it also has an ambiguity detector in the amygdala which is more ancient evolutionarily. Procrastination is one of its main responses. The ambiguity detector is what causes us to freeze when we hear a possibly threatening rustle in the bushes. Our probability sense is often countermanded by the ambiguity detector, and this conflict seems to explain a lot of why it is so hard to communicate risks and uncertainties. The ambiguity detector wants to know how bad it could be, and it is not much comforted by how unlikely that outcome is. The conflict between these two senses seems to explain several of the notorious biases, irrationalities and paradoxes that humans exhibit in making decisions in risky or uncertain settings, including probability distortion, loss aversion, hyperbolic discounting, and the Ellsberg paradox, among others.
Quick Bayes offers performance guarantees and easy risk communication
Scott Ferson and Jason O'Rawe
11 December 2017, Arlington, Virginia
Society for Risk Analysis annual meeting
Quick Bayes is a variant of robust Bayesian analysis that is especially convenient for risk analysts because it does not require them to choose a prior distribution when no prior information is available (the noninformative case). In repeated use, the quantitative results from Quick Bayes exhibit frequentist coverage properties consistent with Neyman confidence intervals at arbitrary confidence levels, which conventional Bayesian analyses generally lack. These coverage properties mean that results from Quick Bayes exhibit guaranteed statistical performance that is especially attractive to engineers and policymakers. The numerical results from Quick Bayes can be matched to Gigerenzer's natural frequencies for easy and intuitive communication to decision makers and the lay public. We illustrate the application of the Quick Bayes approach in the context of fault tree analysis in which we can characterize an event probability estimated from an imperfectly specified fault tree with a terse, natural-language expression of the form “k out of n”, where 0 ≤ k ≤ n. These natural frequencies condense both the probability and analyst's epistemic uncertainty about the probability into a form that psychometric research suggests will be intelligible to humans. Preliminary evidence collected via crowd-sourced science shows that humans natively understand the implications of these natural frequencies, including what the size of n says about the reliability of the probability estimate.
Monte Carlo and probability bounds analysis in R with hardly any data
10 & 14 December 2017, Arlington, Virginia
Society for Risk Analysis annual meeting
This revamped full-day workshop features hands-on examples worked in R on your own laptop, from raw data to final decision. The workshop introduces and compares Monte Carlo simulation and probability bounds analysis for developing probabilistic risk analyses when little or no empirical data are available. You can use your laptop to work the examples, or just follow along if you prefer. The examples illustrate the basic problems risk analysts face: not having much data to estimate inputs, not knowing the distribution shapes, not knowing their correlations, and not even being sure about the model form. Monte Carlo models will be parameterized using the method of matching moments and other common strategies. Probability bounds will be developed from both large and small data sets, from data with non-negligible measurement uncertainty, and from published summaries that lack data altogether. The workshop explains how to avoid common pitfalls in risk analyses, including the multiple instantiation problem, unjustified independence assumptions, repeated variable problem, and what to do when there’s little or no data. The numerical examples will be developed into fully probabilistic estimates useful for quantitative decisions and other risk-informed planning. Emphasis will be placed on the interpretation of results and on how defensible decisions can be made even when little information is available. The presentation style will be casual and interactive. Participants will receive handouts of the slides and on-line access to software for the examples.
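As a flavour of the kind of exercise the workshop walks through (the workshop itself uses R; the Python sketch below, with invented data and a lognormal shape chosen only for illustration, is not the workshop's material), an input can be parameterized by the method of matching moments and then propagated by Monte Carlo simulation:

```python
# Illustrative sketch (not the workshop's own code): parameterize a Monte Carlo
# input by the method of matching moments and propagate it through a toy model.
# The data values, variable names, and the lognormal shape are assumptions.
import numpy as np

data = np.array([2.1, 3.4, 2.8, 5.0, 3.1])        # tiny data set for one input
m, v = data.mean(), data.var(ddof=1)

# Lognormal whose mean and variance match the sample moments
sigma2 = np.log(1.0 + v / m**2)
mu = np.log(m) - sigma2 / 2.0

rng = np.random.default_rng(42)
n = 100_000
x = rng.lognormal(mean=mu, sigma=np.sqrt(sigma2), size=n)   # moment-matched input
y = rng.uniform(0.5, 1.5, size=n)                           # second, independent input
exposure = x * y                                            # toy risk model

print(np.mean(exposure), np.percentile(exposure, 95))       # summary statistics
```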
Probability bounds analysis: getting something from hardly anything
1 November 2017, Aalto University, Espoo, Finland
Society for Risk Analysis (SRA) Nordic Chapter Conference
This three-hour workshop introduces probability bounds analysis for developing fully probabilistic risk analyses when little or no empirical data are available. The presentation will be casual and interactive, and the workshop will feature hands-on examples worked in R on your own laptop, from raw data to final decision. You can work the examples yourself, or just follow along if you prefer. The examples illustrate the basic problems risk analysts face: not having much data to estimate inputs, not knowing the distribution shapes, not knowing their correlations, and not being sure about the model form. Probability boxes will be developed from both large and small data sets, from data with non-negligible measurement uncertainty, and from published summaries that lack data altogether. Emphasis will be placed on the interpretation of results and on how defensible decisions can be made even when little information is available.
Optimal design of testing algorithms
2 October 2017, Liverpool, United Kingdom
Risk in Medicine https://sites.google.com/site/riskinmedicine
Diagnostic algorithms in medicine are usually developed by panels of experts without any formal optimization, a process that is slow and often yields suboptimal results that are neither adapted for the different diagnostic purposes nor flexible under conditions that alter the reliabilities of the various tests. Customizing an algorithm requires searching over test topologies to find an algorithm that comes as close as possible to meeting given constraints (e.g., a required level of predictive reliability) and that optimizes given criteria (e.g., minimizing average cost per patient). In order to guarantee the constraints have been met, the inherent uncertainties in diagnostic testing parameters will be propagated through calculations of test predictive values using robust Bayes methods.
Bayes’ rule in medical counseling: implications for kindergarteners’ cooties
22 September 2017, University of Liverpool
Bayes Days https://sites.google.com/site/bayesdays
Medical practitioners commonly diagnose a patient’s health condition by employing a medical test which is not by itself definitive, but has some statistical probability of revealing the true health state. Naively interpreting the result from a medical test can therefore lead to an incorrect assessment of a patient’s true health condition because of the possibility of false-positive and false-negative disease detections. Bayes’ rule is commonly used to estimate the actual chance a patient is sick given the results from the medical test, from the statistical characteristics of the test used and the underlying prevalence of the disease. However, Winkler and Smith have argued that the traditional application of Bayes’ rule in medical counseling is inappropriate and represents a “confusion in the medical decision-making literature”. They propose in its place a radically different formulation that makes special use of the information about the test results for new patients. Remarkably, Bayesians do not seem to have a means within their theory to determine whether the traditional approach or the Winkler and Smith approach is correct.
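For readers unfamiliar with the traditional calculation being debated, here is a minimal worked example of Bayes' rule applied to a single test result; the sensitivity, specificity, and prevalence values are hypothetical and the example is not drawn from the talk.

```python
# A minimal worked example (not from the talk) of the traditional application
# of Bayes' rule in medical counseling. All numbers are hypothetical.
sensitivity = 0.95   # P(test positive | sick)
specificity = 0.98   # P(test negative | well)
prevalence  = 0.001  # P(sick) in the tested population

p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
p_sick_given_pos = sensitivity * prevalence / p_pos

print(f"P(positive test)        = {p_pos:.4f}")
print(f"P(sick | positive test) = {p_sick_given_pos:.3f}")   # ~0.045 despite a 'good' test
```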
Estimating rare-event probabilities without data
6−10 August 2017, Vienna, Austria
12th International Conference on Structural Safety & Reliability
It is important to be able to estimate risks for complex engineered systems with no performance histories such as spacecraft of new design or biological control strategies using novel genetic constructs that have never existed before. When operating in a new extreme environment like outer space, even an off-the-shelf component with familiar properties may exhibit new behaviors. Likewise, releasing organisms with altered genes can theoretically lead to horizontal transfer of genetic material across species, which is more likely in large populations. There are essentially two approaches to estimating probabilities when there is virtually no data: expert elicitation (i.e., guessing), and disaggregation into constituent parts whose probabilities are easier to estimate (i.e., breaking the problem into subproblems). When the latter approach is no longer workable, analysts must resort to the former and rely on expert opinion and estimation.
How should we characterize probabilities of events that are so rare that they have never been observed? By what principles can such characterizations be projected in probabilistic analyses such as risk assessments or reliability calculations? Sometimes elaborate elicitation strategies are employed to estimate rare-event probabilities, but the results are often expressed as total probabilities, simple numbers from the interval [0,1] of the real line, with no indication of the uncertainty associated with the estimate. If the epistemic uncertainty from this exercise is recorded at all, it is usually either in the form of ranges of probability estimates from different informants which requires a cumbersome sensitivity study to address, or embodied as aggregated probability distributions, which conflates epistemic uncertainty with aleatory uncertainty.
Traditionally, fault tree analysts have suggested that inputs are likely good only to an order of magnitude, and yet they use point estimates for them in computations. This reduces the entire analysis to a back-of-an-envelope calculation of unknown reliability. But how should analysts model expert opinions about event probabilities of the form “1 in 10⁷”, “about 1 in 1000”, or “it’s never seen in over 100 years of observation”, so they can be used in calculations that account for rather than ignore epistemic uncertainties? Several strategies provide partial solutions, addressing significant digits, hedged expressions, order of magnitude, precision overstatement bias, and uncertainty about the Bayesian prior. For instance, presumably the assertion that an event has a probability of 1 in 1,000 would include probability values as low as 0.5 in 1000, and as large as 1.5 in 1000. Linguistic analysis reveals a simple scheme to decode approximator words such as ‘about’, ‘around’, and ‘at least’ in natural-language expressions. Robust Bayes analysis can account for uncertainty about the prior. These strategies can be combined in a coherent probabilistic analysis that minimally captures the epistemic uncertainty expressed in common utterances from experts. The analysis is broadly acceptable under both Bayesian and frequentist interpretations of probability, and it distinguishes epistemic and aleatory uncertainties. The result represents a lower bound on the final uncertainty.
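A hedged sketch of what such decoding might look like in code is given below; the particular decoding rules (the ±50% reading of "about" taken from the example above, and a rule-of-three reading of "never seen in N years") and the tiny fault tree are assumptions for illustration, not the talk's actual scheme.

```python
# Hedged sketch (not the talk's actual scheme): decode a few natural-language
# probability statements into intervals and push them through a single AND
# gate assuming independence. The +/-50% reading of 'about' matches the
# example in the abstract; the rule-of-three reading of 'never seen in N
# years' is an extra assumption made only for this illustration.
def about(p, rel=0.5):
    """'about p': allow values from (1-rel)*p to (1+rel)*p."""
    return (max(0.0, (1 - rel) * p), min(1.0, (1 + rel) * p))

def never_seen_in(years):
    """'never seen in N years of observation': roughly [0, 3/N] per year."""
    return (0.0, 3.0 / years)

def and_indep(a, b):
    """AND gate under independence; endpoints suffice because p*q is monotone."""
    return (a[0] * b[0], a[1] * b[1])

e1 = about(1 / 1000)       # "about 1 in 1000"  ->  (0.0005, 0.0015)
e2 = never_seen_in(100)    # "it's never seen in over 100 years"
print(and_indep(e1, e2))   # interval for the probability that both events occur
```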
Accounting for doubt about the model in risk and uncertainty analyses
4 August 2017, Gaithersburg, Maryland
National Institute of Standards and Technology
Analysts usually construct a model and then act as though it correctly represents the state of the world. This understates the uncertainty associated with the model’s predictions, because it fails to express the analyst’s uncertainty that the model itself might be in error. Several approaches have been proposed to account for model uncertainty within a probabilistic assessment, although each has serious limitations. In the context of risk analysis and uncertainty modeling, it is often possible to project uncertainty about X (which may be aleatory, epistemic or both) through a function f to characterize the uncertainty about Y = f(X) even though f itself has not been precisely characterized. Although the general strategies for quantitatively expressing and projecting model uncertainty through mathematical calculations seem either dubious or quite crude, there are a variety of special cases where methods to handle model uncertainty are rather well developed and available solutions are comprehensive.
Model uncertainty: conservative propagation through polynomial regressions with unknown structure
31 July 2017, Baltimore, Maryland
2017 Joint Statistical Meetings
How can we project uncertainty about X through a function to characterize the uncertainty about its output Y when the function itself has not been precisely characterized? We describe some special cases where this problem has been solved comprehensively, and consider a new case with polynomial functions. When evidence of a relationship between variables has been condensed into regression analyses, a simple convolution using regression statistics allows us to reconstruct the scatter of points processed in the original regression model, but regression analysis does not necessarily select a model that actually reflects how data were generated. What if we do not know which order polynomial should have been used in the regression analysis? We describe a simple and inexpensive projection approach that yields conservative characterizations no matter what polynomial actually generated the data. The result represents the uncertainty induced in Y owing to the underlying uncertainty about X, and the model uncertainty about which degree polynomial is correct, contingent on the presumption that a polynomial model of some order is appropriate. The results appear to be useful for risk analysis.
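The sketch below is a crude stand-in (not the conservative projection method described in the abstract) that conveys the basic idea: fit several candidate polynomial orders and envelope their predictions, so the band for Y reflects doubt about which order is right. The data and the candidate orders are invented for the example.

```python
# Illustrative stand-in (an assumption-laden sketch, not the speaker's method):
# when the correct polynomial order is unknown, fit several candidate orders
# and envelope their predictions, giving a band for Y that reflects that doubt.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 4, 25)
y = 1.0 + 0.5 * x - 0.3 * x**2 + rng.normal(0, 0.4, x.size)   # hypothetical data

orders = [1, 2, 3, 4]                       # candidate polynomial degrees
fits = [np.polyfit(x, y, d) for d in orders]

x_new = np.linspace(0, 4, 101)
preds = np.array([np.polyval(c, x_new) for c in fits])        # one row per model
lower, upper = preds.min(axis=0), preds.max(axis=0)           # model-uncertainty envelope

print(lower[:3], upper[:3])
```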
Statistics for imprecise data: the key for growing the IP community
Marco de Angelis, Scott Ferson and Luke Green
14 July 2017, Lugano, Switzerland
Tenth International Symposium on Imprecise Probability: Theories and Applications [spotlight poster]
As a discipline, the theory of imprecise probabilities may be pricing itself out of the market in the sense that its complexity, computational burden, and the mathematical sophistication required for nontrivial applications are prohibitive in many subject domains. For the discipline to grow, it is essential to foster broad interest and use across science and engineering. This will involve recruiting a class of users who may not develop methods but who will apply them in their routine work. Their applications give evidence of the utility of the imprecise probabilities approach and its underlying philosophy. This implies that someone who sees imprecision in a data set but who lacks special training in uncertainty quantification or imprecise probabilities should be able to apply convenient algorithms for basic statistics.
When the data set has imprecision, computing statistics can be challenging. For example, for data in the form of intervals, using naïve interval analysis yields results with inflated uncertainty because of repetitions of variables in the formulas. Moreover, finding optimal bounds on many basic statistics is an NP-hard problem whose difficulty grows with the size of the data set. It is practically impossible to solve these problems for large data sets with a simple sampling strategy, such as Monte Carlo, in which the formula for the variance is treated like a black box evaluated for many possible configurations of the data points within their respective intervals.
Over the last century, statistics has focused on developing methods for analyses in which data sample size is limiting. But not all uncertainty in data has to do with small sample sizes. Although most statistical analyses today ignore the uncertainty reported by laboratories and empiricists as interval measurement uncertainty, it is hardly true that this is always because the uncertainty is negligibly small. We believe it may instead be due to the lack of friendly software to handle it. We announce and describe a software library that is intended to provide convenient access to basic statistics for interval and censored data. The library of algorithms is being used to develop on-line and stand-alone software for analyzing data sets containing imprecision as well as sampling uncertainty. The algorithms in the library require users to make fewer dubious assumptions about the data set than currently popular methods for handling data censoring, missingness, and lack of independence. The library currently supports methods to compute over two dozen measures of location, dispersion and distribution shape, including arithmetic, geometric and harmonic means and the median (but not the mode), variance, confidence intervals, histograms, and several inferential methods for linear and logistic regressions, t-tests, F-tests, and outlier detection. We show the accuracy of the proposed rigorous approaches via numerical comparisons between them and other bounding techniques like global optimisation, and with other traditional statistical methods for handling censored data.
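As a small illustration of why such a library is needed (this sketch is not the announced library), bounding even elementary statistics of interval data ranges from trivial to combinatorial: the mean has exact, easy bounds, whereas the variance bound below is found by brute force over endpoint configurations, which only works because the hypothetical data set is tiny.

```python
# Minimal illustrative sketch (not the announced library) of statistics on
# interval data. The mean of interval data has exact, easy bounds; variance is
# harder, and here the upper bound is found by brute force over endpoint
# configurations, which is exact only because the data set is tiny.
import itertools
import numpy as np

data = [(1.0, 1.4), (2.1, 2.9), (0.8, 1.1), (3.0, 3.6)]   # hypothetical interval data
lo = np.array([a for a, _ in data])
hi = np.array([b for _, b in data])

mean_bounds = (lo.mean(), hi.mean())        # exact: the mean is monotone in each value

var_max = 0.0
var_min_vertex = np.inf
for vertex in itertools.product(*data):     # all 2^n endpoint configurations
    v = np.var(vertex, ddof=1)
    var_max = max(var_max, v)               # exact upper bound (variance is convex)
    var_min_vertex = min(var_min_vertex, v) # only an over-estimate of the true minimum

print("mean in", mean_bounds)
print("variance at most", var_max, "(vertex minimum:", var_min_vertex, ")")
```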
Dependence among probability boxes in fault trees
Scott Ferson and Kari Sentz
14 July 2017, Lugano, Switzerland
Tenth International Symposium on Imprecise Probability: Theories and Applications [spotlight poster]
We consider the simple problem of projecting p-boxes in fault trees, for which there are many real practical uses, and we restrict ourselves to only forward uncertainty propagation problems, in which the characterizations of the probabilities of the leaf events are projected up the tree to evaluate the probability of the top event. Boole first considered the problem of projecting probabilities through the logical operations of AND and OR. Although his treatment is often criticized as cavalier or even wrong, he did realize the consequence of making no assumption about the dependence among the events, and he presaged the inequalities that modern probabilists attribute to Fréchet. Hailperin generalized and extended Boole's ideas to consider interval-valued estimates of probabilities in Boolean expressions, which are equivalent to fault trees, and showed that the necessary calculations to find optimal solutions generally require linear programming. This uncertainty logic was reconceived by Kozine and Filimonov in terms of coherent lower previsions. In this theory, the lower prevision of a binary event X is its real-valued lower probability P̲(X), so the characterization of the uncertain logical value is via an interval, namely [P̲(X), P̄(X)], where the upper probability is P̄(X) = 1 − P̲(NOT X).
Introducing distributions or sets of distributions into fault trees as characterizations of leaf nodes creates subtleties in how dependencies should be handled that were not encountered in the interval-probability approaches of Hailperin or Kozine and Filimonov. For instance, assuming that the modeled leaf events are stochastically independent does not guarantee that the information sets about the probabilities associated with those events are also independent. There are two levels at which dependence is a concern: among the events themselves, and among the data sets used to characterize them. One can have different assumptions about dependence at each level, and one will get different answers for each combination of assumptions.
We argue that a particular variant of probabilistic logic is an uncertainty calculus appropriate for fault tree analysis in which the events are binary (failed or not-failed) but our information about their failure states is probabilistic and imprecise. We consider five logical operations: AND, OR, NOT, COND (i.e., the probability P(B|A) from marginals P(A) and P(B)), and EQUIV (material implication), each under eight models of dependence (no-assumptions, independent, perfect, opposite, mutually exclusive, positive, negative, and Pearson-correlated). This uncertainty calculus generalizes Boolean logic, probability theory (both with and without independence assumptions), and Kleene's strong logic of indeterminacy, which alternative calculi sometimes fail to do. For instance, fuzzy logic generalizes Boolean logic, but it does not generalize probabilistic logic, which is usually needed for fault tree analyses. We describe efficient computational algorithms based on simplification strategies from interval analysis and reliability theory, combined with Monte Carlo simulation, that can be used to evaluate many practical fault trees.
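For concreteness, the sketch below shows two of the dependence models named above (independence and the no-assumptions Fréchet case) applied to interval-valued leaf probabilities. It is a simplified illustration with invented numbers, not the authors' algorithms or software.

```python
# Hedged sketch of two of the dependence models mentioned above, applied to
# interval probabilities: independence and the no-assumptions (Frechet) case.
# A simplified illustration, not the authors' algorithms or software.
def and_frechet(a, b):
    """P(A and B) with no dependence assumption (Frechet bounds)."""
    return (max(0.0, a[0] + b[0] - 1.0), min(a[1], b[1]))

def or_frechet(a, b):
    """P(A or B) with no dependence assumption (Frechet bounds)."""
    return (max(a[0], b[0]), min(1.0, a[1] + b[1]))

def and_indep(a, b):
    return (a[0] * b[0], a[1] * b[1])

def or_indep(a, b):
    return (1 - (1 - a[0]) * (1 - b[0]), 1 - (1 - a[1]) * (1 - b[1]))

A = (0.2, 0.3)          # interval estimate of P(leaf event A)
B = (0.1, 0.4)          # interval estimate of P(leaf event B)
print("AND:", and_indep(A, B), "independent vs", and_frechet(A, B), "no assumption")
print("OR: ", or_indep(A, B), "independent vs", or_frechet(A, B), "no assumption")
```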
Communicating uncertainty about probability: equivalent binomial count
Jason O’Rawe, Michael Balch and Scott Ferson
19−21 June 2017, Lisbon, Portugal
Society for Risk Analysis - Europe [poster]
Most strategies for the basic risk communication problem of expressing the probability of a well defined event presume the probability is precisely characterized as a real number. In practice, however, such probabilities can often only be estimated from data limited in abundance and precision. Likewise, risk analyses often yield imprecisely specified probabilities because of measurement error, small sample sizes, model uncertainty, and demographic uncertainty from estimating continuous variables from discrete data. Under the theory of confidence structures, the binomial probability of an event estimated from binary data with k successes out of n trials is associated with a particular structure that has the form of a p-box, i.e., bounds on a cumulative distribution function. When n is large, this structure approximates the beta distribution obtained by Bayesians under a binomial sampling model and the Jeffreys prior, and asymptotically it approximates the scalar frequentist estimate k/n. But when n is small, it is imprecise and cannot be approximated by any single distribution because of demographic uncertainty. These confidence structures make apparent the importance of the size of n to the reliability of the estimate. If n is large, the probability estimate is more reliable than if n is small. When a risk analysis yields a result in the form of a precise distribution or imprecise p-box for an event’s probability, we can approximate the result with a confidence structure corresponding to a binomial probability estimated for some values of k and n. Thus we can characterize the event probability from the risk analysis with a terse, natural-language expression of the form “k out of n”, where k and n are nonnegative integers and 0≤k≤n. We call this the equivalent binomial count, and argue that it condenses both the probability and uncertainty about that probability into a form that psychometry suggests will be intelligible to humans. Gigerenzer calls such integer pairs “natural frequencies” because humans appear to natively understand their implications, including what the size of n says about the reliability of the probability estimate. We describe preliminary data collected with Amazon Mechanical Turk that appears to show that humans correctly understand these expressions.
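A brief sketch of how such an equivalent binomial count behaves is given below. It assumes one common way of writing the binomial confidence structure, as the pair of beta distributions Beta(k, n-k+1) and Beta(k+1, n-k), whose tail quantiles reproduce the classical Clopper-Pearson interval; treat that specific form, and the example counts, as assumptions rather than a statement of the authors' construction.

```python
# Hedged sketch of the kind of structure described above. It assumes the
# binomial confidence structure can be bracketed by Beta(k, n-k+1) and
# Beta(k+1, n-k); reading off a two-sided interval at a given confidence level
# then reproduces the classical Clopper-Pearson interval. The counts are
# invented, and the exact form is an assumption for this illustration.
from scipy.stats import beta

def binomial_interval(k, n, confidence=0.95):
    alpha = 1.0 - confidence
    lower = 0.0 if k == 0 else beta.ppf(alpha / 2, k, n - k + 1)
    upper = 1.0 if k == n else beta.ppf(1 - alpha / 2, k + 1, n - k)
    return lower, upper

print(binomial_interval(2, 10))    # "2 out of 10": wide interval, small n
print(binomial_interval(20, 100))  # same proportion, larger n: narrower interval
```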
Diagnostic medical testing: one size does not fit all
19−21 June 2017, Lisbon, Portugal
Society for Risk Analysis - Europe [poster]
Diagnostic testing is used for several different purposes in medicine, including counseling patients, public health surveys, blood screening, pharmacology, etc. Diagnostic algorithms (programmed test sequences) are usually developed by panels of experts without any formal quantitative analysis. This process is slow and often yields suboptimal results that are not adapted for very different diagnostic purposes and do not account for reliabilities of the various underlying tests. Customizing a diagnostic algorithm requires searching over testing topologies to find an algorithm that comes as close as possible to meeting given constraints (e.g., limiting average cost per patient) and that optimizes given criteria (e.g., maximizing prediction reliability). In order to guarantee the constraints have been met, the inherent uncertainties in diagnostic testing parameters (i.e., sensitivity and specificity based on limited empirical sampling) can be propagated through calculations using robust Bayes methods. This approach can also handle potential correlations and dependencies among tests and across test performance statistics. The resulting test topology can maximize the selected criterion while meeting the constraints, with generally different answers for different medical purposes. Using such an approach to design medical test algorithms will improve public health by allowing a more efficient allocation of public health resources from quantitatively optimized testing strategies and by allowing doctors and patients to make better health decisions because of full knowledge of the concomitant uncertainties in the diagnostic process.
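The sketch below is a deliberately simplified stand-in for the robust Bayes propagation mentioned above: it merely bounds a single test's positive predictive value when sensitivity, specificity, and prevalence are known only as intervals. The numbers are hypothetical.

```python
# A simplified stand-in (not the authors' robust Bayes machinery) showing how
# interval uncertainty in a test's sensitivity, specificity, and the disease
# prevalence propagates to bounds on the positive predictive value. Because
# PPV is monotone in each input, evaluating at the interval endpoints gives
# exact bounds here. All numbers are hypothetical.
def ppv(sens, spec, prev):
    return sens * prev / (sens * prev + (1 - spec) * (1 - prev))

sens = (0.90, 0.97)    # interval estimate from limited validation data
spec = (0.95, 0.99)
prev = (0.005, 0.02)

lower = ppv(sens[0], spec[0], prev[0])
upper = ppv(sens[1], spec[1], prev[1])
print(f"PPV between {lower:.3f} and {upper:.3f}")
```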
Quick Bayes: statistical performance and risk communication with natural frequencies
12 June 2017, Cambridge, United Kingdom
Cambridge Risk and Uncertainty Conference
Comprehensive risk analyses do not always fit neatly within a purely probabilistic framework with precisely characterized distributions of known shapes with well understood dependence functions. In such cases, an imprecise probabilities approach may be more convenient for risk analysis. This may be true when empirical observations have non-negligible imprecision or data censoring, when correlations and dependence among the input variables have not been empirically well studied, or when there is dispute about the appropriate prior distribution or even the proper mathematical structure of the risk model. For example, when relevant data sets are very small for some input variable, the choice of the prior distribution can become a contentious issue because it will often have a significant effect on the results. In most cases, the residual epistemic uncertainty that such imprecision and doubts represent must be handled by ancillary sensitivity studies, which are sometimes cumbersome to deploy and usually hard to summarize for decision makers. We describe a generalized Bayes approach, which we dub Quick Bayes because of its convenience for analysts, that can sidestep debate about choice of prior distribution at least in the noninformative case and also account for measurement imprecision and data censoring, uncertainty about intervariable dependence, and even bounded doubt about mathematical structure. Quick Bayes is a kind of robust Bayes analysis with canonical features that give it frequentist coverage properties that are desirable to engineers and policymakers alike. We illustrate the application of the approach in the context of fault tree analysis associated with an anti-malaria program using genetically engineered mosquitos currently under consideration. In repeated use, the quantitative results exhibit coverage properties like Neyman confidence intervals at arbitrary confidence levels. These coverage properties mean that results from Quick Bayes exhibit a guaranteed statistical performance that is especially attractive to engineers, which traditional Bayesian analyses do not generally have. The numerical results from Quick Bayes can be matched to natural frequencies sensu Gigerenzer for easy and intuitive communication to policy makers and the lay public. Thus, we can characterize an event probability estimated by an imperfectly specified fault tree, perhaps even with missing nodes, with a terse, natural-language expression of the form “k out of n”, where k and n are nonnegative integers and 0≤k≤n. These natural frequencies condense both the probability and residual epistemic uncertainty about that probability into a form that psychometric theory suggests will be intelligible to humans. Evidence recently collected from crowd-sourced science appears to show that humans natively understand the implications of natural frequencies, including what the size of n says about the reliability of the probability estimate.
Combating indubiety
May 2017, Liverpool
Pint of Science
Indubiety (or gullibility) is an impediment to our society as serious as illiteracy was in previous centuries. We are susceptible to fake news, dubious science, and clever advertisers. Our votes can be swayed by duplicitous politicians and terrorist shocks. More and more adults subscribe to irrational conspiracy theories. The number of people who believe the Earth is flat is growing exponentially via the web. Scott suggests ways that people can learn to be less gullible and more discerning.
Compliance with confidence: fashioning risk governance policies in the face of uncertainty
1−3 March 2017, Venice, Italy
SRA Policy Forum: Risk Governance for Key Enabling Technologies
Sometimes policies and rules issued by regulatory authorities turn out to be self-defeating in that they induce undesirable behaviors within the regulated communities. We consider an example involving regulatory sampling guidance for occupational health and safety in the United States originally developed in the 1970s but still in use. This guidance presumes regulated industries collect measurements of workers' toxicant exposures compared against an occupational exposure limit. Because increasing sample size can only lead to a higher chance of exceeding this limit, no companies ever want to collect more than the minimum number of samples required. In practice, many companies are only required to collect a single sample measurement. These policies are hard to defend, yet precedent and fairness demand they continue to be used and applied to new factories and even to new industries where there may be a clear need for much wider data collection to demonstrate worker health and safety. We suggest that full accounting of uncertainty can encourage sampling beyond regulatory minima. Regulation should be consistent with risk analysis that distinguishes between variability and incertitude, and compliance determinations should take account of both. For example, when sample data are summarized as prediction intervals or probability boxes that inherently express both kinds of uncertainty, the breadth of incertitude narrows as more sample data are collected. The results can more clearly demonstrate compliance as the combined uncertainty decreases, even if some samples exceed the limit. We further suggest a tiered scheme dividing regulated entities according to the scrutiny they need, with tiers for compliant facilities, new facilities, and completely new industries, using outlier detection, equivalence testing, and tolerance analysis to show compliance.
Durham
Computing risks with confidence
12 December 2016, San Diego, California [poster]
Society for Risk Analysis annual meeting
Confidence boxes (c-boxes) characterize distributions from sample data with or without distribution shape assumptions. They handle small sample size, but also censoring and imprecision about sample values, and demographic uncertainty that arises from estimating continuous variables from discrete data. C-boxes encode confidence intervals for parameters at every confidence level. They can be projected through mathematical calculations and composed with probability distributions, and the computational results also encode confidence (or prediction) intervals at all levels of confidence.
Epistemic uncertainty in agent-based modeling
Scott Ferson and Kari Sentz
12 December 2016, San Diego, California [poster]
Society for Risk Analysis annual meeting
Agent-based modeling is widely used in biology, ecology, medicine, public health, terrorism and warfare modeling, and many other disciplines. Traditional approaches to handling uncertainty in agent-based models employ Monte Carlo methods that randomly sample parameters to determine whether and how a behavior or interaction rule is realized by an individual agent. A simulation of all agents thereby represents a single realization from among many possible scenarios, and simulations with many replications are used to reveal differential probabilities and the likelihoods of extreme results. Unfortunately, Monte Carlo is a poor way to project epistemic uncertainty through a complex model, and it is an unsatisfying scheme for representing the uncertainty about volitional choices of agents. Adding epistemic uncertainty to agent-based models properly requires the ability to (1) specify agent attributes and other quantities as intervals, probability distributions, or p-boxes, (2) similarly characterize stochastic drivers imprecisely, and (3) execute behavior rules in a way that respects uncertainty in their conditional clauses. When uncertainty makes the truth value of the conditional clause of any rule unclear, the simulation should hold that the rule both fires and does not fire. This may result in subsequent uncertainties elsewhere in the simulation including the status of attributes of agents, even perhaps whether an agent exists or not. Such epistemic uncertainty must be projected through any simulation realization. New software can advance agent-based modeling to uncover a more comprehensive picture of the effects of epistemic uncertainty, which can be vastly more important than aleatory uncertainty.
Monte Carlo and probability bounds analysis in R with hardly any data
11 & 15 December 2016, San Diego, California
Society for Risk Analysis annual meeting
8:30 am − 5:30 pm
This revamped full-day workshop features hands-on examples worked in R on your own laptop, from raw data to final decision. The workshop introduces and compares Monte Carlo simulation and probability bounds analysis for developing probabilistic risk analyses when little or no empirical data are available. You can use your laptop to work the examples, or just follow along if you prefer. The examples illustrate the basic problems risk analysts face: not having much data to estimate inputs, not knowing the distribution shapes, not knowing their correlations, and not even being sure about the model form. Monte Carlo models will be parameterized using the method of matching moments and other common strategies. Probability bounds will be developed from both large and small data sets, from data with non-negligible measurement uncertainty, and from published summaries that lack data altogether. The workshop explains how to avoid common pitfalls in risk analyses, including the multiple instantiation problem, unjustified independence assumptions, repeated variable problem, and what to do when there’s little or no data. The numerical examples will be developed into fully probabilistic estimates useful for quantitative decisions and other risk-informed planning. Emphasis will be placed on the interpretation of results and on how defensible decisions can be made even when little information is available. The presentation style will be casual and interactive. Participants will receive handouts of the slides and on-line access to software for the examples.
Epistemic uncertainty in agent-based modeling
Scott Ferson and Kari Sentz
17 June 2016, Bochum, Germany
Keynote presentation, 7th International Workshop on Reliable Engineering Computing (REC 2016)
Traditional approaches to handling uncertainty in agent-based models employ Monte Carlo methods to randomly sample parameters and probabilistically determine whether and how a behavior or interaction rule is realized by an individual agent. A simulation of all agents thereby represents a single realization from among many possible scenarios, and simulations with many replications are used to reveal differential probabilities and the likelihoods of extreme results. Unfortunately, Monte Carlo is a poor way to project epistemic uncertainty through a complex model, and it is an unsatisfying scheme for representing the uncertainty about volitional choices of agents. Adding epistemic uncertainty to agent-based models properly requires the ability to (1) characterize stochastic drivers imprecisely, (2) specify agent attributes and other quantities as intervals, probability distributions, or p-boxes, and (3) execute behavior rules in a way that respects uncertainty in their conditional clauses. When uncertainty makes the truth value of the conditional clause of any rule unclear, the simulation should hold that the rule both fires and does not fire. This may result in subsequent uncertainties elsewhere in the simulation including the status of attributes of agents, even perhaps whether an agent exists or not. These facilities advance agent-based modeling to uncover a more comprehensive picture of the effects of epistemic uncertainty, which can be vastly more important than aleatory uncertainty. We compare this approach with traditional simulation using only Monte Carlo methods to reveal the differences between these two approaches to uncertainty. Keywords: agent-based models, epistemic uncertainty; Monte Carlo simulation, p-boxes, intervals
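A conceptual sketch of point (3), not the software described in the talk, is shown below: when an agent attribute is interval-valued, a behaviour rule's conditional clause can come out true, false, or indeterminate, and an indeterminate rule must be treated as both firing and not firing. All names and thresholds are invented.

```python
# Conceptual sketch only (not the software described in the talk): a behavior
# rule whose conditional clause is evaluated on an interval-valued agent
# attribute can come out true, false, or indeterminate, and an indeterminate
# rule must be treated as both firing and not firing.
from dataclasses import dataclass

@dataclass
class Agent:
    energy: tuple          # interval-valued attribute, e.g. (low, high)

def rule_fires(agent, threshold=5.0):
    """Rule: 'if energy < threshold then forage'. Returns True, False, or None."""
    lo, hi = agent.energy
    if hi < threshold:
        return True          # condition certainly holds
    if lo >= threshold:
        return False         # condition certainly fails
    return None              # indeterminate: simulate both branches

for a in [Agent((2.0, 4.0)), Agent((6.0, 9.0)), Agent((3.5, 7.0))]:
    print(a.energy, "->", rule_fires(a))
```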
Non-Laplacian uncertainty and why your simulations need to tend to it today
10 March 2015, Stony Brook, New York
Institute for Advanced Computational Science, Stony Brook University
We are at a crossroads in our scientific appreciation of uncertainty. The traditional view is that there is only one kind of uncertainty and that probability theory is its calculus. This view has created several paradoxes that have befuddled decision theory about why humans prefer particular options when selecting among possible choices. The traditional view also leads to quantitative results that are often misconstrued and demonstrably misleading. An emerging alternative view, however, entails a richer mathematical concept of uncertainty and a broader framework for uncertainty analysis. The concept admits a kind of uncertainty that is not handled by traditional Laplacian probability measures. The modern approach makes practical solutions easier for several engineering and other physics-based models, and the inferences drawn from such models under this view are more reliable and resolve several long-standing paradoxes. We review the mathematical, decision-theoretic and even neurological reasons that suggest it is often useful to distinguish kinds of uncertainty, including what can be called non-Laplacian uncertainty.
Non-Laplacian uncertainty
12 December 2015, Stony Brook, New York
A Celebration of the Scientific Career of Professor Lev Ginzburg
Computing with confidence
9 December 2015, Arlington, Virginia
Society for Risk Analysis annual meeting
Bayesian posterior distributions can be propagated through subsequent calculations, so they are useful in risk analyses and engineering. However, the interpretation of these distributions and the results they yield has no necessary connection to the empirical world when they are specified according to subjectivist principles. In contrast, traditional Neyman confidence intervals are useful in risk analysis and engineering because they offer a guarantee of statistical performance through repeated use. However, it is difficult to employ them consistently in follow-on analyses and assessments because they cannot be readily propagated through mathematical calculations. Balch has proposed confidence structures (c-boxes) which generalize confidence distributions and provide an interpretation by which confidence intervals at any confidence level can be specified for a parameter of interest. C-boxes can be used in calculations using the standard methods of probability bounds analysis and yield results that also admit the confidence interpretation. Thus, analysts using them can now literally compute with confidence. The calculation and use of c-boxes are illustrated with a set of several challenge problems involving parametric and nonparametric statistical estimation using sample data. The results include imprecise characterizations analogous to posterior distributions and posterior predictive distributions, and also structures that can be used to compute tolerance intervals at any probability levels. Simulations demonstrate the degree of conservatism of the results. The c-box approach is contrasted with statistical estimation using traditional maximum likelihood and Bayesian methods where possible. Keywords: confidence structures, c-box, confidence intervals, tolerance intervals, probability bounds analysis, maximum likelihood, Bayesian estimation.
Monte Carlo simulation and probability bounds analysis in R with hardly any data
6 & 10 December 2015, Arlington, Virginia
Society for Risk Analysis annual meeting
8:30 am − 4:30 pm
This revamped full-day workshop features hands-on examples worked in R on your own laptop, from raw data to final decision. The workshop introduces and compares Monte Carlo simulation and probability bounds analysis for developing probabilistic risk analyses when little or no empirical data are available. You can use your laptop to work the examples, or just follow along if you prefer. The examples illustrate the basic problems risk analysts face: not having much data to estimate inputs, not knowing the distribution shapes, not knowing their correlations, and not even being sure about the model form. Monte Carlo models will be parameterized using the method of matching moments and other common strategies. Probability bounds will be developed from both large and small data sets, from data with non-negligible measurement uncertainty, and from published summaries that lack data altogether. The workshop explains how to avoid common pitfalls in risk analyses, including the multiple instantiation problem, unjustified independence assumptions, repeated variable problem, and what to do when there’s little or no data. The numerical examples will be developed into fully probabilistic estimates useful for quantitative decisions and other risk-informed planning. Emphasis will be placed on the interpretation of results and on how defensible decisions can be made even when little information is available. The presentation style will be casual and interactive. Participants will receive handouts of the slides and a CD with software and data sets for the examples.
Sensitivity analysis of probabilistic models
7−8 November 2015, Stony Brook, New York [poster]
Workshop: Sensitivity, Error and Uncertainty Quantification for Atomic, Plasma and Material Data
Stony Brook University Institute for Advanced Computational Science
In probabilistic calculations, analysts routinely assume (i) probabilities and probability distributions can be precisely specified, (ii) variables are all independent of one another or at worst linearly correlated with well-known coefficients, and (iii) model structure is known without error. For the most part, these assumptions are made for the sake of mathematical convenience, rather than with any empirical justification. And, until recently, these assumptions were pretty much necessary in order to get any answer at all. New methods now allow us to compute bounds on estimates of probabilities and probability distributions that are guaranteed to be correct even when one or more of the assumptions is relaxed or removed completely. In many cases, the results obtained are the best possible bounds, which means that tightening them would require additional empirical information. We present an overview of probability bounds as a comprehensive tool for sensitivity analysis when both epistemic and aleatory uncertainties are present.
Computing with confidence
7−8 November 2015, Stony Brook, New York [poster]
Workshop: Sensitivity, Error and Uncertainty Quantification for Atomic, Plasma and Material Data
Stony Brook University Institute for Advanced Computational Science
C-boxes (confidence boxes) are imprecise generalizations of confidence distributions (such as Student's t distribution) which encode frequentist confidence intervals at every possible confidence level. They are analogous to Bayesian posteriors and posterior predictive distributions in that they characterize the uncertainty about distributions and their parameters estimated from sparse or imprecise sample data. However, c-boxes have a purely frequentist interpretation that makes them useful in physics-based modeling because they offer a guarantee of statistical performance which Bayesian approaches do not provide. Unlike confidence intervals which cannot usually be used in mathematical calculations, c-boxes can be propagated through mathematical expressions using the ordinary machinery of probability bounds analysis, and this allows analysts to compute with confidence, both figuratively and literally. The results of these calculations with c-boxes also encode confidence intervals at every confidence level for model outputs for which no direct observations have been made. C-boxes account for the inferential uncertainty that comes from empirical observations, including the effect of very small sample sizes, but also the effects of any imprecision in the data and uncertainty from trying to characterize a continuous parameter from discrete data observations. C-boxes have been derived both for nonparametric problems in which the shape of the underlying distribution from which the data were randomly drawn is unknown, and for parametric problems where the underlying distribution is known to come from a particular family. We compare c-boxes to traditional maximum likelihood and Bayesian estimators, and demonstrate their application in problems where the traditional approaches provide no answer at all.
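As a reminder of the precise starting point that c-boxes generalize, the sketch below shows how a Student's t confidence distribution for a normal mean encodes a confidence interval at every level, which can simply be read off as quantiles; the sample values are invented and the code is only illustrative, not the presenters' own.

```python
# Minimal sketch (an illustration, not the presenters' code) of a confidence
# distribution for the mean of a normal sample: the Student's t construction
# encodes a confidence interval at every level, read off by taking quantiles.
# C-boxes generalize this idea to imprecise and discrete data.
import numpy as np
from scipy.stats import t

data = np.array([9.8, 10.4, 9.5, 10.9, 10.1])   # hypothetical sample
n, xbar, s = data.size, data.mean(), data.std(ddof=1)

def mean_confidence_interval(level):
    alpha = 1.0 - level
    q = t.ppf([alpha / 2, 1 - alpha / 2], df=n - 1)
    return tuple(xbar + q * s / np.sqrt(n))

for level in (0.50, 0.90, 0.99):
    print(level, mean_confidence_interval(level))
```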
Validation of scientific and engineering models with imprecise data
14 September 2015, Atlanta, Georgia
Structural Engineering, Mechanics and Materials Seminar Series
Georgia Institute of Technology
Probabilistic uncertainty analyses commonly make assumptions for the sake of convenience. For instance, analysts routinely assume most or all variables are independent of one another without empirical justification for this assumption, or even in the face of evidence to the contrary. They also often assume probabilities and probability distributions can be precisely specified even when relevant empirical data sets are very small. In the past, such assumptions have been necessary in order to get any answer at all. New methods now allow us to compute bounds on estimates of probabilities and probability distributions that are guaranteed to be correct even when one or more of these assumptions is relaxed or removed entirely.
Validation of scientific and engineering models must contend with observations that are usually both sparse and imprecise. The predictive capability of these models, which determines what we can reliably infer from them, is assessed by whether and how closely the model can be shown to yield predictions conforming with available empirical observations beyond those data used in the model calibration process. The new methods allow models to better represent the variability in the natural world as well as our imprecision about it. Interestingly, a match between the model and data can sometimes be easier to establish when the predictions are uncertain because of ambiguities about the model structure or when empirical measurements are imprecise, but the resulting predictive capability is degraded by both phenomena. One might hope to define a scalar metric that assesses in some overall sense the dissimilarity between predictions and observations. But it is more informative and useful to distinguish predictions and observations in two senses, one concerned with epistemic uncertainty and one concerned with aleatory uncertainty.
Error propagation in shape analyses with or without landmarks
22 July 2015, Arlington, Virginia
International Symposium on Forensic Science Error Management, NISTIFS-CR2-06
Shape analysis is often required in crime scene pattern matching. For instance, elliptic Fourier analysis can be used in analyzing outlines, tracks, signatures, silhouettes, symbols, ordered points, etc. This multivariate shape analysis decomposes geometric patterns into numerical coefficients that can be normalized to optionally remove effects such as rotation, size or magnification, translation, registration, orientation, resolution or pixel density, etc., which makes the coefficients suitable for statistical analysis. Interestingly, elliptic Fourier analysis can also be applied directly to landmark data if the points are endowed or assigned some order (which is always possible). The ‘contour’ in this case is simply a list of the landmark coordinates. The analysis does not require the point locations it uses as input to be contiguous or to form a connected or closed shape. Thus, the method can be applied to landmark sequences, arbitrarily complex outlines, closed contours, even self-intersecting tracks in two or three dimensions, and it can completely capture the shape if sufficiently many harmonics are used in the decomposition.
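To give a rough sense of the decomposition (this is a crude illustration, not a production elliptic Fourier implementation and not the speaker's code), the sketch below treats the ordered points of a closed outline as samples of x(t) and y(t), with the point index as the curve parameter, and approximates a few harmonic coefficients by discrete sums; the outline itself is a made-up test shape.

```python
# Crude illustrative approximation (not a production elliptic Fourier code):
# given a closed outline traversed as N ordered points, treat the point index
# as the curve parameter and approximate the harmonic coefficients of x(t)
# and y(t) by simple discrete sums. The test shape is invented.
import numpy as np

def harmonic_coefficients(x, y, n_harmonics=8):
    """Approximate Fourier coefficients (a_n, b_n) for x and (c_n, d_n) for y."""
    N = len(x)
    k = np.arange(N)
    coeffs = []
    for n in range(1, n_harmonics + 1):
        c, s = np.cos(2 * np.pi * n * k / N), np.sin(2 * np.pi * n * k / N)
        coeffs.append(((2 / N) * (x * c).sum(), (2 / N) * (x * s).sum(),
                       (2 / N) * (y * c).sum(), (2 / N) * (y * s).sum()))
    return coeffs

# Hypothetical outline: an ellipse with a small third-harmonic perturbation
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
x = 3.0 * np.cos(theta) + 0.3 * np.cos(3 * theta)
y = 1.5 * np.sin(theta)

for n, (a, b, c, d) in enumerate(harmonic_coefficients(x, y, 4), start=1):
    print(f"harmonic {n}: a={a:+.3f} b={b:+.3f} c={c:+.3f} d={d:+.3f}")
```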
This method is one of a family of techniques that reduce complex shape information into forms suitable for multivariate t-tests, outlier detection, classifications, statistical discriminations, and other quantitative characterizations. It induces a mathematical metric across the entire space of possible closed shapes that allows us to assess quantitatively how close one shape is to another. It would, for example, allow one to display an array of shapes that are as close to a test shape as a given shape is. Such an array can therefore be used to demonstrate the fidelity of a putative match in a purely visual and intuitive way that might be understood by jurors without appeal to numerical statistics.
There are inescapable measurement uncertainties in the capture of contours and landmarks to be analyzed via elliptic Fourier analysis and similar techniques. If contour information is encoded manually by a human using a digitizer, the achievable measurement precision depends in large part on the care and skill of the operator. Even if done by machine, as is common in image analysis, the theoretical pixel-level precision cannot usually be obtained in practice. The tracing algorithm can often be calibrated for good, consistent performance, and its measurement precision can then be assessed empirically. There are other sources of uncertainty arising from errors of registration and deformation of non-rigid objects by stretching, twisting or drooping during the measurement.
Current implementations of elliptic Fourier analysis do not characterize these measurement errors as uncertainty about the harmonic coefficients and thus cannot project them through subsequent statistical calculations. We show how such imprecision can be characterized using interval boxes or ellipsoids and how the uncertainty can be projected through intervalized elliptic Fourier analysis and intervalized statistical tests such as t-tests and discriminations. There are various ways to visualize the uncertainty. We explore how the resulting uncertainty about shape, and about the decisions that depend on matching shapes, can be visualized.
KEYWORDS: shape analysis; elliptic Fourier analysis; landmark analysis; registration error; non-rigid shapes; deformation of shape evidence; coordinate error; multivariate analysis of shape; outlines; tracks; signatures; silhouettes; symbols; ordered points; shape metric; flower of shapes; shape match fidelity; visualizing error in shape; rotation invariance; size invariance; magnification invariance; translation invariance; registration invariance; orientation invariance; resolution invariance; pixel-density invariance
Uncertainty quantification with hard problems
18 February 2015, Stony Brook, New York
Stony Brook Astronomy Program, Department of Physics and Astronomy, Stony Brook University
1:30 pm, penthouse (4th floor) conference room, Earth and Space Sciences Building
Accounting for uncertainty in model predictions can be the difference between prudent analysis and wishful thinking. Although traditional methods of error and uncertainty analysis are useful, they typically require untenable or unjustified assumptions in many real-world applications that involve both epistemic and aleatory uncertainties. Methods are needed that can relax these assumptions and thereby avoid confounding the different kinds of uncertainty. Such methods reflect what is actually known, and what is not known, about the underlying system. Uncertainty quantification can be difficult for problems that involve both kinds of uncertainty in strongly nonlinear models with many repeated uncertain variables. However, a wide array of strategies can be employed that, in combination, can provide computationally feasible solutions for many practical problems.
Verified calculation with uncertain numbers: how to avoid pretending we know more than we do
9-12 December 2014, Orlando, Florida
Keynote address, IEEE Symposium on Computational Intelligence for Engineering Solutions CIES'14
Applications of probabilistic uncertainty analyses commonly make assumptions for the sake of convenience. For instance, analysts routinely assume most or all variables are independent of one another, without empirical justification for this assumption, or even in the face of evidence to the contrary. They also often assume probabilities and probability distributions can be precisely specified even when relevant empirical data sets are very small. In the past, such assumptions have been necessary in order to get any answer at all. New methods now allow us to compute bounds on estimates of probabilities and probability distributions that are guaranteed to be correct even when one or more of these assumptions is relaxed or removed entirely. In many cases, the results obtained are the best possible bounds, which means that tightening them would require additional empirical information. This talk will present an overview of probability bounds analysis, as a computationally practical implementation of imprecise probabilities that combines ideas from both interval analysis and probability theory and sidesteps the limitations of each. It introduces probability boxes (p-boxes), briefly explains the numerous ways they can arise when empirical information is sparse, and illustrates their use in uncertainty modeling with a series of applications spanning environmental, structural and electrical engineering.
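As a small, self-contained illustration of the kind of bounding involved (a sketch with assumed marginals, not the talk's own software), the distribution of a sum of two variables can be bounded without any assumption about their dependence:

```python
# Sketch: pointwise bounds on the CDF of Z = X + Y when the dependence
# between X and Y is completely unknown (Frechet/Makarov-style bounds).
# The marginals and grid here are illustrative; using a finite grid only
# loosens the bounds, so they remain valid.
import numpy as np
from scipy import stats

def sum_cdf_bounds(Fx, Fy, x_grid, z):
    """Bounds on P(X + Y <= z) over all possible dependence structures."""
    s = Fx(x_grid) + Fy(z - x_grid)
    lower = max(float(np.max(s)) - 1.0, 0.0)
    upper = min(float(np.min(s)), 1.0)
    return lower, upper

x_grid = np.linspace(-10.0, 10.0, 2001)
Fx = stats.norm(0.0, 1.0).cdf      # assumed marginal for X
Fy = stats.norm(1.0, 2.0).cdf      # assumed marginal for Y
for z in (0.0, 1.0, 4.0):
    lo, hi = sum_cdf_bounds(Fx, Fy, x_grid, z)
    print(f"P(X+Y <= {z}) lies in [{lo:.3f}, {hi:.3f}]")
```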
[cancelled...sorry]
9 and 11 December 2014, Denver, Colorado
Society for Risk Analysis annual meeting
Uncertainty
22 October 2014, Bethesda, Maryland
National Science Advisory Board for Biosecurity
This briefing introduces the two basic forms of uncertainty and explains why they cannot be treated with the same methods. It discusses the notion of an 'uncertain number' and describes the fundamental differences between deterministic estimates and assessments that fully account for stochasticity (variability) and partial ignorance (incertitude) about inputs and model structure. It also explains how modern uncertainty quantification can be used as a quality assurance check on probabilistic models and how it can be made more comprehensive than a sensitivity study could possibly be.
Validation of uncertain ecological models with imprecise data
15-19 September 2014, Lausanne, Switzerland
Centre Interfacultaire Bernoulli, École polytechnique fédérale de Lausanne
[workshop organizer] (workshop, program)
The predictive capability of ecological models, which determines what we can reliably infer from them, is assessed by whether and how closely the model can be shown to yield predictions conforming with available empirical observations beyond those data used in the model calibration process. Realistic ecological models usually incorporate stochasticity to mimic the variability in the natural world, which means that their predictions are often expressed as probability distributions or similarly uncertain numbers. Validation of these models must also contend with data that are usually sparse and often imprecise. But this stochasticity and imprecision complicate the validation process considerably. A match between the model and data can sometimes be easier to establish when the predictions are uncertain because of ambiguities about the model structure or when empirical measurements are imprecise, but the resulting predictive capability is degraded by both phenomena. One might hope to define a scalar metric that assesses in some overall sense the dissimilarity between predictions and observations. But there may be cases in which it would be more informative and useful to distinguish predictions and observations in two senses, say, one concerned with epistemic uncertainty and one concerned with aleatory uncertainty. This workshop will investigate the appropriate accounting that is needed for conducting proper validations and estimating predictive capabilities of ecological models.
Computing with confidence using imprecise structures
10 September 2014, Piscataway, New Jersey
Department of Statistics and Biostatistics, Rutgers University
Like Student's t distribution, confidence distributions encode frequentist confidence intervals for a parameter at any confidence level. They characterize inferential uncertainty about parameters estimated from sparse or imprecise sample data, just like bootstrap distributions or Bayesian posterior distributions, but they enjoy a frequentist guarantee of statistical performance that makes them useful in risk and uncertainty analyses. Unfortunately, confidence distributions do not exist for many inference problems, and they are not intended to be propagated through mathematical calculations. However, imprecise generalizations of confidence distributions, which we call 'c-boxes', can be propagated through mathematical expressions using the ordinary machinery of probability bounds analysis, and the results also offer the same statistical guarantee. C-boxes allow analysts to literally as well as figuratively compute with confidence.
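For a flavor of the construction, one well-known case (sketched here under the assumption of a binomial sampling model) represents the c-box for a success probability, after observing k successes in n trials, by a pair of beta distributions whose envelope reproduces Clopper–Pearson confidence intervals at every confidence level:

```python
# Sketch (assuming a binomial sampling model): the c-box for the probability
# of success after observing k successes in n trials can be represented by a
# pair of beta distributions; its envelope reproduces Clopper-Pearson
# confidence intervals at every confidence level.
from scipy import stats

def binomial_cbox_interval(k, n, confidence=0.95):
    """Two-sided confidence interval extracted from the binomial c-box."""
    alpha = 1.0 - confidence
    lower = stats.beta(k, n - k + 1).ppf(alpha / 2) if k > 0 else 0.0
    upper = stats.beta(k + 1, n - k).ppf(1 - alpha / 2) if k < n else 1.0
    return lower, upper

print(binomial_cbox_interval(k=3, n=10))   # roughly (0.067, 0.652)
```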
Accounting for doubt about the model in risk and uncertainty analyses
3 September 2014, Los Alamos, New Mexico
Los Alamos National Laboratory
Information Science and Technology Seminar Speaker Series
Analysts usually construct a model and then act as though it correctly represents the state of the world. This understates the uncertainty associated with the model’s predictions, because it fails to express the analyst’s uncertainty that the model itself might be in error. Several approaches have been proposed to account for model uncertainty within a probabilistic assessment, including what-if studies, stochastic mixtures, Bayesian model averaging, generalized method of moments, probability bounds analysis, robust Bayes analyses, and imprecise probabilities. Although each approach has advantages that make it attractive in some situations, each also has serious limitations. For example, several approaches require the analyst to explicitly enumerate all the possible competing models. This might sometimes be reasonable, but the uncertainty will often be more profound and there might be possible models of which the analyst is not even aware. Although Bayesian model averaging and stochastic mixture strategies are considered by many to be the state of the art in accounting for model uncertainty in probabilistic assessments, numerical examples show that both approaches actually tend to erase model uncertainty rather than truly propagate it through calculations. In contrast, probability bounds analysis, robust Bayes methods, and imprecise probability methods can be used even if the possible models cannot be explicitly enumerated and, moreover, they do not average away the uncertainty but propagate it fully through calculations. In the context of risk analysis and uncertainty modeling, it is often possible to project uncertainty about X (which may be aleatory, epistemic or both) through a function f to characterize the uncertainty about Y = f(X) even though f itself has not been precisely characterized. Although the general strategies for quantitatively expressing and projecting model uncertainty through mathematical calculations seem either dubious or quite crude, there are a variety of special cases where methods to handle model uncertainty are rather well developed and available solutions are both comprehensive and subtle.
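A toy numerical contrast (not one of the talk's own examples) between averaging and enveloping two candidate models makes the point: the average collapses the disagreement into a single distribution, while the envelope preserves it as bounds:

```python
# Toy illustration (not from the talk): two candidate models for a quantity.
# A stochastic mixture / model average collapses them to one distribution,
# while an envelope (p-box style) keeps the disagreement as bounds.
import numpy as np
from scipy import stats

grid = np.linspace(-5, 15, 401)
F1 = stats.norm(2.0, 1.0).cdf(grid)    # model 1's predictive CDF
F2 = stats.norm(8.0, 2.0).cdf(grid)    # model 2's predictive CDF

average = 0.5 * F1 + 0.5 * F2          # equal-weight model average
lower_env = np.minimum(F1, F2)         # envelope: lower CDF bound
upper_env = np.maximum(F1, F2)         # envelope: upper CDF bound

# The average asserts a single probability that the quantity is below 5,
# whereas the envelope honestly reports a range.
i = np.searchsorted(grid, 5.0)
print("P(quantity < 5), averaged:", round(float(average[i]), 3))
print("P(quantity < 5), enveloped:",
      round(float(lower_env[i]), 3), "to", round(float(upper_env[i]), 3))
```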
How to compute with confidence
3 September 2014, Los Alamos, New Mexico
Los Alamos National Laboratory
Confidence boxes ("c-boxes") are imprecise generalizations of traditional confidence distributions, which, like Student's t distribution, encode frequentist confidence intervals for parameters of interest at every confidence level. They are analogous to Bayesian posterior distributions in that they characterize the inferential uncertainty about distribution parameters estimated from sparse or imprecise sample data, but they have a purely frequentist interpretation that makes them useful in engineering because they offer a guarantee of statistical performance through repeated use. Unlike confidence intervals which cannot usually be used in mathematical calculations, c-boxes can be propagated through mathematical expressions using the ordinary machinery of probability bounds analysis, and this allows analysts to compute with confidence, both figuratively and literally, because the results also have the same confidence interpretation. For instance, they can be used to compute probability boxes for both prediction and tolerance distributions. Confidence boxes can be computed in a variety of ways directly from random sample data. There are c-boxes both for parametric problems (where the family of the underlying distribution from which the data were randomly generated is known to be normal, lognormal, exponential, binomial, Poisson, etc.), and for nonparametric problems in which the shape of the underlying distribution is unknown. Confidence boxes account for the uncertainty about a parameter that comes from the inference from observations, including the effect of small sample size, but also the effects of imprecision in the data and demographic uncertainty which arises from trying to characterize a continuous parameter from discrete data observations.
The first step is the hardest: specifying input distributions
Jason O’Rawe, Scott Ferson, Masatoshi Sugeno, Kevin Shoemaker, Michael Balch, Jimmy Goode
23 July 2014, Liverpool, United Kingdom
Institute for Risk and Uncertainty, University of Liverpool
10 am, Mason Bibby Common Room, Harrison Hughes Building
A fundamental task in probabilistic risk analysis is selecting an appropriate distribution or other characterization with which to model each input variable within the risk calculation. Currently, many different and often incompatible approaches for selecting input distributions are commonly used, including the method of matching moments and similar distribution fitting strategies, maximum likelihood estimation, Bayesian methods, maximum entropy criterion, among others. We compare and contrast six traditional methods and six recently proposed methods for their usefulness in risk analysis in specifying the marginal inputs to be used in probabilistic assessments. We apply each method to a series of challenge problems involving synthetic data, taking care to compare only analogous outputs from each method. We contrast the use of constraint analysis and conditionalization as alternative techniques to account for relevant information, and we compare criteria based on either optimization or performance to interpret empirical evidence in selecting input distributions. Despite the wide variety of available approaches for addressing this problem, none of the methods seems to suffice to handle all four kinds of uncertainty that risk analysts must routinely face: sampling uncertainty arising because the entire relevant population cannot be measured, mensurational uncertainty arising from the inability to measure quantities with infinite precision, demographic uncertainty arising when continuous parameters must be estimated from discrete data, and model structure uncertainty arising from doubt about the prior or the underlying data-generating process.
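As a tiny concrete example of how commonly used selection methods can disagree (synthetic data unrelated to the challenge problems in this study), fitting a gamma input distribution to the same small sample by matching moments and by maximum likelihood typically yields different answers:

```python
# Sketch with synthetic data: fit a gamma input distribution to a small
# sample by (a) matching moments and (b) maximum likelihood; the two
# commonly used methods generally disagree, especially for small n.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
sample = rng.gamma(shape=2.0, scale=3.0, size=8)   # tiny synthetic data set

# (a) method of matching moments: mean = k*theta, variance = k*theta^2
m, v = sample.mean(), sample.var(ddof=1)
k_mom, theta_mom = m**2 / v, v / m

# (b) maximum likelihood (location fixed at zero)
k_mle, _, theta_mle = stats.gamma.fit(sample, floc=0)

print("moments: shape=%.2f scale=%.2f" % (k_mom, theta_mom))
print("MLE:     shape=%.2f scale=%.2f" % (k_mle, theta_mle))
```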
Computing with confidence (figuratively and literally)
22 July 2014, Liverpool, United Kingdom
Institute for Risk and Uncertainty, University of Liverpool
10 am, Mason Bibby Common Room, Harrison Hughes Building
Confidence boxes ("c-boxes") are imprecise generalizations of traditional confidence distributions, which, like Student's t distribution, encode frequentist confidence intervals for parameters of interest at every confidence level. They are analogous to Bayesian posterior distributions in that they characterize the inferential uncertainty about distribution parameters estimated from sparse or imprecise sample data, but they have a purely frequentist interpretation that makes them useful in engineering because they offer a guarantee of statistical performance through repeated use. Unlike confidence intervals which cannot usually be used in mathematical calculations, c-boxes can be propagated through mathematical expressions using the ordinary machinery of probability bounds analysis, and this allows analysts to compute with confidence, both figuratively and literally, because the results also have the same confidence interpretation. For instance, they can be used to compute probability boxes for both prediction and tolerance distributions. Confidence boxes can be computed in a variety of ways directly from random sample data. There are c-boxes both for parametric problems (where the family of the underlying distribution from which the data were randomly generated is known to be normal, lognormal, exponential, binomial, Poisson, etc.), and for nonparametric problems in which the shape of the underlying distribution is unknown. Confidence boxes account for the uncertainty about a parameter that comes from the inference from observations, including the effect of small sample size, but also the effects of imprecision in the data and demographic uncertainty which arises from trying to characterize a continuous parameter from discrete data observations.
What do biological anthropology, clinical experience and recent neuroimagery evidence suggest about communicating risk?
21 July 2014, Liverpool, United Kingdom
Institute for Risk and Uncertainty, University of Liverpool
2 pm, Mason Bibby Common Room, Harrison Hughes Building
Risk communication is notoriously difficult, and many attempts have been disastrous failures. In this informal presentation, we discuss traditional views in science and engineering about risk communication and the conventional wisdom of professional risk communicators, and we contrast these with the emerging synthesis of recent neuroimagery with clinical evidence on the efficacy of risk communication strategies considered in the context of biological anthropology, which we call the Montauk guidance. We suggest that risk communication efforts fail so badly because they set up contradictions across calculations within our multicameral brains that conflate probabilities and ambiguity, or confuse risk calculations with the balancing of interpersonal trust. We also summarize three avenues of our current research in this area. (1) Equivalent binomial counts can be used to communicate both a risk and its uncertainty by converting the results of a quantitative risk assessment into a natural-language expression of the form "k out of n". Gigerenzer calls such expressions "natural frequencies" because humans seem to have a native facility for understanding their implications. (2) Risk communicators should employ linguistic hedges such as 'about', 'nearly' or 'more than' in numerical expressions in the same way that human language speakers use these words. We have quantitatively characterized the implications of these words in linguistic surveys administered via Amazon Mechanical Turk or captured via ludic elicitations (what von Ahn calls "games with a purpose") on Facebook. (3) Stereotype attitude mapping helps to forecast the likely perspectives of different stakeholders with respect to their opinions about where the burden of proof falls, tolerance of evidential disputes, and attitudes about uncertainty, in order to see risks through the eyes of those stakeholders. Their respective attitudes seem to explain why exactly the same data can lead to completely different conclusions by different stakeholders, or even by a single stakeholder at different times.
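For avenue (1), one plausible construction of an equivalent binomial count, sketched here under an assumed normal approximation and not necessarily the procedure used in this research, chooses n so that the sampling uncertainty of "k out of n" roughly matches the assessed uncertainty:

```python
# Rough sketch (an assumption, not the talk's method): express a risk
# estimate p with an assessed 95% uncertainty half-width h as "k out of n",
# choosing n so that the binomial sampling uncertainty of k/n roughly
# matches h.  Uses a normal approximation purely for illustration.

def equivalent_binomial_count(p, half_width):
    n = max(1, round(1.96**2 * p * (1 - p) / half_width**2))
    k = round(p * n)
    return k, n

# e.g. a risk assessed as 0.02 with a 95% half-width of about 0.01
k, n = equivalent_binomial_count(0.02, 0.01)
print(f"about {k} out of {n}")   # prints 'about 15 out of 753'
```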
Where probability paradoxes come from
Scott Ferson, Jason O'Rawe, Jack Siegrist, Christian Luhmann
21 July 2014, Liverpool, United Kingdom
Institute for Risk and Uncertainty, University of Liverpool
10 am, Mason Bibby Common Room, Harrison Hughes Building
Decision scientists and psychometricians have documented a long list of cognitive “biases and heuristics” over the last several decades, which are widely considered to be manifestations of human irrationality about risks and decision making. These phenomena include neglect of probability, loss aversion, ambiguity aversion and the Ellsberg Paradox, hyperbolic discounting, and the two-dimensionality of risk perception described by Slovic. We suggest that all these and perhaps other biases arise from the interplay between distinct special-purpose processors within the multicameral human brain whose existence is implied by recent clinical and neuroimaging evidence. Although these phenomena are usually presumed to be misperceptions or cognitive illusions, we describe the evolutionary significance of these phenomena in humans and other species, and we place them in their biological context where they do not appear to be failings of the human brain but rather evolutionary adaptations. Apparent paradoxes arise when psychometricians attempt to interpret human behaviors against the inappropriate norm of the theory of probability, which turns out to be an overly precise calculus of uncertainty when in reality the different mental processors give contradictory results. This view of the psychological and neurological evidence also suggests why risk communication efforts so often dramatically fail and how they might be substantially improved. For instance, it now seems clear that what risk analysts call epistemic uncertainty (i.e., lack of knowledge or ambiguity) and aleatory uncertainty (variation or stochasticity) should not be rolled up into one mathematical probabilistic concept in risk assessments, but they instead require an analysis that distinguishes them and keeps them separate in a way that respects the cognitive functions within the decision makers to whom risk communications are directed.
Computing with confidence: imprecise posteriors and predictive distributions
Scott Ferson, Jason O’Rawe and Michael Balch
16 July 2014, Liverpool, United Kingdom
4 pm, Track D, Second International Conference on Vulnerability and Risk Analysis and Management (ICVRAM2014) and Sixth International Symposium on Uncertainty Modelling and Analysis (ISUMA2014)
Confidence structures (c-boxes) are imprecise generalizations of confidence distributions. They encode frequentist confidence intervals, at every confidence level, for parameters of interest and thereby characterize the inferential uncertainty about distribution parameters estimated from sparse or imprecise sample data. They have a purely frequentist interpretation that makes them useful in engineering because they offer a guarantee of statistical performance through repeated use. Unlike traditional confidence intervals which cannot usually be propagated through mathematical calculations, c-boxes can be used in calculations using the standard methods of probability bounds analysis and yield results that also admit the same confidence interpretation. This means that analysts using them can now literally compute with confidence. We provide formulas for c-boxes in several important problems including parametric and nonparametric statistical estimation from random sample data. The results are imprecise characterizations analogous to posterior distributions and posterior predictive distributions. We contrast this c-box approach to statistical estimation using traditional maximum likelihood and Bayesian methods.
Model uncertainty in risk analysis: conservative projection methods for polynomial regressions with unknown structure
26 May 2014, Chicago, Illinois
Keynote address, Reliability Engineering Conference REC 2014
How can we project uncertainty about X (which may be aleatory, epistemic or both) through a function f to characterize the uncertainty about Y = f(X) when f itself has not been precisely characterized? Although the general problem of how to quantitatively express and project model uncertainty through mathematical calculations in a risk analysis can be addressed by only a few strategies, all of which seem either dubious or quite crude, there are a variety of special cases where methods to handle model uncertainty are rather well developed and available solutions are both comprehensive and subtle. For instance, uncertainty about the shapes of probability distributions can be captured as credal sets or p-boxes (Levi 1980, Walley 1991, Ferson et al. 2003). Likewise, uncertainty about the stochastic dependencies between distributions can be projected using Kolmogorov–Fréchet bounding (Williamson and Downs 1991, Ferson et al. 2004). Numerical experiments suggest that there is also another special case of model uncertainty that can be addressed fairly robustly: when evidence of the statistical relationship between variables has been condensed into regression analyses. Suppose that the function f has been characterized from sparse sample data (Xi, Yi) via statistical regression. A straightforward convolution using regression statistics allows us to reconstruct the scatter of points processed in the original regression model, but it is well known that regression analysis does not necessarily select a model that actually reflects how data were generated (Adams 1991). What if we do not know which order polynomial should have been used in the regression analysis? It turns out that the default reconstruction has conservative characteristics no matter what polynomial actually generated the data. We observe three facts: (1) models of all orders yield conservative characterizations of the variance of Y, (2) models of all orders yield reasonably conservative characterizations of the tail risks of Y, and (3) the envelope of resulting predicted Y distributions expressed as a p-box is conservative in all respects (i.e., it is effectively sure to enclose the real distribution of Y). This p-box seems to represent the model uncertainty induced in Y owing to the underlying uncertainty about X, and the model uncertainty about which degree polynomial is correct, contingent on the presumption that a polynomial model of some order is appropriate. These observations suggest a very simple and inexpensive strategy for computing conservative bounds on Y. Moreover, these expressions of uncertainty do not seem overly conservative. It does not matter which order a prior regression analysis may have used, so it is possible to obtain appropriately conservative estimates of tail risks for Y = f(X) whatever model was used.
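The flavor of these numerical experiments can be sketched as follows (illustrative code, not the study's own): fit polynomials of several orders to the same sparse sample, simulate predictive values of Y from each fitted model, and envelope the resulting empirical distribution functions:

```python
# Illustrative sketch: fit polynomials of orders 1..4 to sparse (x, y) data,
# simulate predictive Y values for each fitted model (adding residual noise),
# and envelope the resulting empirical CDFs as crude bounds on Y.
import numpy as np

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 4, size=12))
y = 1.0 + 0.5 * x**2 + rng.normal(0, 1.0, size=x.size)   # true model unknown to analyst

x_new = rng.uniform(0, 4, size=20000)                    # uncertain input X
grid = np.linspace(-5, 20, 501)
cdfs = []
for order in (1, 2, 3, 4):
    coef = np.polyfit(x, y, order)
    resid = y - np.polyval(coef, x)
    sigma = resid.std(ddof=order + 1)                    # residual spread
    y_sim = np.polyval(coef, x_new) + rng.normal(0, sigma, size=x_new.size)
    cdfs.append(np.searchsorted(np.sort(y_sim), grid, side="right") / y_sim.size)

cdfs = np.array(cdfs)
lower, upper = cdfs.min(axis=0), cdfs.max(axis=0)        # p-box-like envelope
i = np.searchsorted(grid, 10.0)
print("P(Y <= 10) across orders:", np.round(cdfs[:, i], 3))
print("envelope:", round(float(lower[i]), 3), "to", round(float(upper[i]), 3))
```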
Protecting patient privacy while preserving medical information for research
Gang Xiang, Jason O’Rawe, Vladik Kreinovich, Janos Hajagos, and Scott Ferson
26 May 2014, Chicago, Illinois
Reliability Engineering Conference REC 2014
Data collected during health care delivery and public health surveys possess a great deal of information that could be used in biomedical and epidemiological research. Access to these data, however, is usually restricted because of the private nature of most personal health records. Simple strategies to anonymize data do not protect privacy. For example, dropping explicit identifiers such as name, social security number and address makes the data appear anonymous, but the remaining attributes can often be used to re-identify individuals. Golle (2006, "Revisiting the uniqueness of simple demographics in the US population", in Proc. of the 5th ACM Workshop on Privacy in Electronic Society) showed that 63% of the US population can be uniquely identified by gender, five-digit ZIP code, and date of birth, and those three attributes often exist in health data put in open archives. This information is widely distributed, so it can easily be obtained and used for re-identification. Techniques for statistical disclosure control have been developed to ensure privacy, but they often fail to preserve the truthfulness of the information, so unreliable results may be released. In generalization-based anonymization approaches, there is information loss due to attribute generalization, and existing techniques do not provide sufficient control for maintaining data utility. We need methods that protect both the privacy of individuals and the integrity of the statistical relationships in the data. The problem is that there is an inherent tradeoff between these. Protecting privacy always loses information. However, for a given anonymization strategy, there are often multiple ways of masking the data that meet the disclosure risk criteria provided. This can be exploited to choose the solution that best preserves statistical information while still meeting the disclosure risk criteria. We are developing an integrated software system that provides solutions for managers of data sets so they can minimize disclosure risks while maximizing data informativeness. To overcome the computational challenges associated with subsequent statistical analyses, we selected the shapes of the intervals used in the anonymization to exploit reduced-complexity algorithms available for those analyses.
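The kind of interval-based generalization involved can be sketched roughly as follows (a simplified k-anonymity-style illustration, not the project's software or its actual masking strategy):

```python
# Minimal sketch (not the project's software): generalize an exact attribute
# (here, age) into intervals so that every released record shares its
# generalized value with at least k-1 others.  Purely illustrative data.
import numpy as np

def generalize_to_intervals(values, k):
    """Release [min, max] intervals over sorted groups of size at least k
    instead of the exact values."""
    order = np.argsort(values)
    sorted_vals = np.asarray(values)[order]
    n = len(values)
    released = [None] * n
    start = 0
    while start < n:
        end = start + k
        if n - end < k:                 # fold any short remainder into this group
            end = n
        chunk = sorted_vals[start:end]
        interval = (int(chunk.min()), int(chunk.max()))
        for idx in order[start:end]:
            released[idx] = interval
        start = end
    return released

ages = [23, 24, 29, 31, 35, 36, 41, 58]
print(generalize_to_intervals(ages, k=3))
```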
Why uncertainty theorists are befuddled
25 March 2014, Compiègne, France
LC14, Scientific and Technical Communication in English, 13h, Salle A100, Centre Benjamin Franklin, Université de Technologie de Compiègne
Mathematical theories about uncertainty are currently in a tumult. Some argue that the theory of probability invented by Laplace is not sufficiently rich to properly characterize pure incertitude (also called 'epistemic uncertainty', 'ambiguity', or simply 'ignorance'). We may be in a revolutionary period leading to a new non-Laplacian view of uncertainty similar to the revolution in mathematics 150 years ago that led to non-Euclidean geometry, which enormously enriched geometry and broadened its applications. Like the earlier revolution, the debate today turns on whether a single axiom should be regarded as always or only sometimes true. Evidence from neuroscience, psychology and anthropology suggests that humans are wired to distinguish at least two kinds of uncertainty, yet the traditional view of uncertainty originated by Laplace conflates the two and, in doing so, creates what appear to be cognitive “biases and heuristics”. These biases are often considered to be manifestations of human irrationality about risks and decision making, but they may in fact be normal and evolutionarily advantageous reactions to risks and uncertainties in the natural environment.
Finding out about ‘about’: traditional and ludic quantification of the meanings of hedges in numeric expressions
Scott Ferson, Jason O’Rawe, Jimmie Goode, James Mickley, Christian Luhmann, William McGill, and Jack Siegrist
8 February 2014, Compiègne, France
Heudiasyc Decision and Image team, Université de Technologie de Compiègne
Eliciting numerical values requires us to decode words commonly used to express or modify numbers in natural language. These words include ‘about’, ‘around’, ‘almost’, ‘exactly’, ‘below’, ‘order of’, etc., which are collectively known as approximators or numerical hedges. We used Amazon Mechanical Turk to identify the implications of various approximators common in English. We can use the results both to decode the uncertainty in what a respondent says, and also to encode uncertainty uncovered by a risk assessment to express its results to a human being in his own language. The investigation strategies generalize easily to languages other than English.
Where probability paradoxes come from
21 January 2014, Compiègne, France
Labex Maitrise des Systèmes des Systèmes Technologiques, Université de Technologie de Compiègne
(announcement, slides, video, photos)
Decision scientists and psychometricians have documented a long list of cognitive “biases and heuristics” over the last several decades, which are widely considered to be manifestations of human irrationality about risks and decision making. These phenomena include loss aversion, ambiguity aversion and the Ellsberg Paradox, hyperbolic discounting, and others. We suggest that many if not most of these mistakes arise from the interplay between distinct special-purpose uncertainty processors within the multicameral human brain whose existence is implied by recent clinical and neuroimaging evidence. The psychological and neurological evidence suggests that epistemic uncertainty (i.e., lack of knowledge or ambiguity) and aleatory uncertainty (variation or stochasticity) should not be rolled up into one mathematical concept in risk assessment, but require a two-dimensional view that respects biological realities within the decision-maker.
How to communicate risks: implications from psychometry and dual uncertainties
3 January 2014, Stony Brook, New York
Health Sciences Center, Stony Brook University Hospital
Specifying input distributions: no method solves all problems
Jason O’Rawe, Scott Ferson, Masatoshi Sugeno, Kevin Shoemaker, Michael Balch, Jimmy Goode
9 December 2013, Baltimore, Maryland
Society for Risk Analysis annual meeting
A fundamental task in probabilistic risk analysis is selecting an appropriate distribution or other characterization with which to model each input variable within the risk calculation. Currently, many different and often incompatible approaches for selecting input distributions are commonly used, including the method of matching moments and similar distribution fitting strategies, maximum likelihood estimation, Bayesian methods, maximum entropy criterion, among others. We compare and contrast six traditional methods and six recently proposed methods for their usefulness in risk analysis in specifying the marginal inputs to be used in probabilistic assessments. We apply each method to a series of challenge problems involving synthetic data, taking care to compare only analogous outputs from each method. We contrast the use of constraint analysis and conditionalization as alternative techniques to account for relevant information, and we compare criteria based on either optimization or performance to interpret empirical evidence in selecting input distributions. Despite the wide variety of available approaches for addressing this problem, none of the methods seems to suffice to handle all four kinds of uncertainty that risk analysts must routinely face: sampling uncertainty arising because the entire relevant population cannot be measured, mensurational uncertainty arising from the inability to measure quantities with infinite precision, demographic uncertainty arising when continuous parameters must be estimated from discrete data, and model structure uncertainty arising from doubt about the prior or the underlying data-generating process.
Mixing good data with bad
Kevin Shoemaker, Jack Siegrist, and Scott Ferson
9 December 2013, Baltimore, Maryland
Society for Risk Analysis annual meeting
Data sets have different qualities. Some data are collected with careful attention to proper protocols and careful measurement using highly precise instruments. In contrast, some data are hastily collected by sloppy or unmotivated people with bad instruments or shoddy protocols under uncontrolled conditions. Statistical methods make it possible to formally combine these two kinds of data in a single analysis. But is it always a good idea to do so? Interval statistics is one convenient method that accounts for the different qualities of data in an analysis. High quality data have tighter intervals and poor quality data have wider intervals, and the two can be legitimately pooled using interval statistics, but it appears that it is not always advisable for an analyst to combine good data with bad. We describe examples showing that, under some circumstances, including more data without regard for its quality unnecessarily increases the amount of uncertainty in the final output of an analysis. Ordinarily, statistical judgment would frown on throwing away any data, but as demonstrated by these examples, it seems clearly advantageous sometimes to ignore this judgment. More data does not always lead to more statistical power, and increasing the precision of measurements sometimes provides a decidedly more efficient return on research effort. This result is highly intuitive even though these examples imply a notion of negative information, which traditional Bayesian analyses do not allow.
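The underlying arithmetic can be sketched in a few lines (with purely illustrative numbers): the mean of interval data is itself an interval, and pooling in a very wide, poor-quality datum can widen it:

```python
# Sketch with illustrative numbers: interval-valued data pooled via an
# interval mean.  Adding a very wide (poor-quality) measurement can make
# the pooled estimate *less* precise than leaving it out.
import numpy as np

def interval_mean(intervals):
    """Interval mean: [mean of lower endpoints, mean of upper endpoints]."""
    los, his = zip(*intervals)
    return (float(np.mean(los)), float(np.mean(his)))

good = [(9.8, 10.2), (9.9, 10.3), (10.0, 10.4), (9.7, 10.1)]   # tight intervals
bad  = (5.0, 15.0)                                             # sloppy measurement

print("good data only:     ", interval_mean(good))        # width 0.4
print("good plus bad datum:", interval_mean(good + [bad]))  # width 2.32
```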
Probabilistic risk analysis with hardly any data
Scott Ferson and Kevin Shoemaker
8 and 12 December 2013, Baltimore, Maryland
[day-long workshop, given twice] Society for Risk Analysis continuing education program
This tutorial explains how you can develop a fully probabilistic risk analysis even though there may be very little empirical data available on which to base the analysis. The talks are organized around the basic problems that risk analysts face: not knowing the input distributions, not knowing their correlations, not being sure about the model itself, or even which variables should be considered. Possible strategies include traditional approximative methods and recent robust and bounding methods. Numerical examples are given that illustrate the use of various methods including traditional moment propagation, PERT, maximum entropy, Fermi estimates, uniformity principle, probability bounds analysis, Bayesian model averaging and the old work horse, sensitivity analysis. All of the approaches can be used to develop a fully probabilistic estimate useful for screening decisions and other planning. The advantages and drawbacks of the various approaches are examined. The discussion addresses how defensible decisions can be made even when little information is available, and when one should break down and collect more empirical data and, in that case, what data to collect. When properly formulated, a probabilistic risk analysis reveals what can be inferred from available information and characterizes the reliability of those inferences. In cases where the available information is insufficient to reach unambiguous conclusions, best-possible bounding probabilistic risk analysis provides a compelling argument for further empirical research and data collection.
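As a taste of the simplest of these strategies, traditional moment propagation can be sketched in a few lines (under an assumed independence of the inputs, with illustrative numbers):

```python
# Sketch of traditional moment propagation for independent inputs:
# means and variances of a sum and a product, without specifying full
# distributions.  Formulas assume independence; numbers are illustrative.
def sum_moments(mx, vx, my, vy):
    return mx + my, vx + vy                       # E[X+Y], Var[X+Y]

def product_moments(mx, vx, my, vy):
    mean = mx * my                                # E[XY] = E[X]E[Y]
    var = vx * vy + vx * my**2 + vy * mx**2       # Var[XY] under independence
    return mean, var

mx, vx = 10.0, 4.0     # e.g. an exposure duration (mean, variance)
my, vy = 2.0, 0.25     # e.g. an intake rate (mean, variance)
print("sum:    ", sum_moments(mx, vx, my, vy))
print("product:", product_moments(mx, vx, my, vy))
```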
Verified computation with uncertain numbers
31 October 2013, El Paso, Texas
Invited talk, Constraint Programming and Decision Making Workshop
University of Texas at El Paso
Uncertainty propagation methods have often been disjoint and incomplete. Intervals alone cannot generally account for functional or stochastic dependence among variables, so propagations of interval uncertainty risk exploding to triviality. The dream of a workable ‘probabilistic arithmetic’, imagined by many people, seems unachievable in practice. Whenever probability theory has been used to make calculations, analysts have routinely made untenable assumptions that ignore doubts about the model structure, the shape or precision of distribution specifications, and the character of stochastic dependence among variables. Until recently, such assumptions without any empirical justification have been common—even in relatively sophisticated and high-profile assessments such as risk analyses for space expeditions—because alternative methods that did not require these assumptions had not been available. New methods now allow us to compute often best-possible bounds on estimates of probabilities and probability distributions that are guaranteed to be correct even when one or more of the assumptions is relaxed or removed. This talk will present an overview of probability bounds analysis, as a computationally practical implementation of imprecise probabilities, that combines ideas from both interval analysis and probability theory to sidestep the limitations of each.
Computing with confidence
Scott Ferson, Michael Balch, Kari Sentz and Jack Siegrist
2 July 2013, Compiègne, France
International Symposium for Imprecise Probability: Theories and Applications
Université de Technologie de Compiègne
Traditional confidence intervals are useful in engineering because they offer a guarantee of statistical performance through repeated use. However, it is difficult to employ them consistently in analyses and assessments because it is not clear how to propagate them through mathematical calculations. Confidence structures (c-boxes) generalize confidence distributions and provide an interpretation by which confidence intervals at any confidence level can be specified for a parameter of interest. C-boxes can be used in calculations using the standard methods of probability bounds analysis and yield results that also admit the confidence interpretation. Thus analysts using them can now literally compute with confidence. We illustrate the calculation and use of c-boxes for some elementary inference problems and describe R functions to compute them and some Monte Carlo simulations demonstrating the coverage performance of the c-boxes and calculations based on them.
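The kind of coverage check described can be sketched as follows (in Python rather than the R functions mentioned, reusing the binomial c-box construction noted earlier in this listing):

```python
# Sketch (in Python, not the R functions described): Monte Carlo check that
# 95% intervals taken from the binomial c-box cover the true probability at
# least 95% of the time, i.e. the frequentist performance guarantee.
import numpy as np
from scipy import stats

def cbox_interval(k, n, confidence=0.95):
    a = 1.0 - confidence
    lo = stats.beta(k, n - k + 1).ppf(a / 2) if k > 0 else 0.0
    hi = stats.beta(k + 1, n - k).ppf(1 - a / 2) if k < n else 1.0
    return lo, hi

rng = np.random.default_rng(42)
p_true, n, trials = 0.3, 15, 2000
covered = 0
for _ in range(trials):
    k = rng.binomial(n, p_true)
    lo, hi = cbox_interval(k, n)
    covered += (lo <= p_true <= hi)
print("empirical coverage:", covered / trials)   # should be at least 0.95
```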
Psychometry of probability and decision
28 February 2013, Selden, New York
Mathematics Department, Suffolk County Community College
Epistemic uncertainty, which is the result of poor measurements and incomplete knowledge, is often distinguished from aleatory uncertainty, which is caused by stochasticity or variability in the world. As recognized at least since Keynes and Knight, humans treat these two kinds of uncertainty very differently. When given simple frequency information, humans appear to make risk calculations in a manner consistent with probability theory and Bayesian norms, but the presence of even small epistemic uncertainty disrupts these calculations and produces decisions that typically focus exclusively on the worst-case outcomes, ignoring the available probability information. There are, in fact, many similar cognitive “biases and heuristics” that have been described by decision scientists and psychometricians over the last several decades, which are widely considered manifestations of human irrationality about risks and decision making.
Recent clinical and neuroimaging evidence suggests that humans are endowed with at least two special-purpose uncertainty processors in the multicameral human brain. One of these processors is devoted to risk calculations while the other handles detection and processing of epistemic uncertainty. These processors are localized in different parts of the brain and are mediated by different chemical systems that are separately activated by the format of sensory input. When both processors are activated, they can give conflicting resolutions, but the brain appears to often give priority to considerations of incertitude over variability.
We explore the effect that these competing processors have on perception and cognition of uncertainty and suggest that several famous paradoxes in probability and decision making may arise because of the interplay between these mental processors. These phenomena include loss aversion, ambiguity aversion and the Ellsberg Paradox, hyperbolic discounting, the two-dimensionality of risk perception, and others. Although these phenomena are usually presumed to be biases or cognitive illusions, we describe the adaptive significance of these phenomena in humans and other species and place them in an evolutionary context where they do not appear to be failings of the human brain but rather adaptations. The psychological and neurological evidence suggests that epistemic and aleatory uncertainty should not be rolled up into one mathematical concept in risk assessment, but require distinct approaches that respect biological realities within the decision-maker.
Statistical inference in imprecise probabilities
9 January 2013, Guangzhou, People's Republic of China
Imprecise probability workshop, Institute of Logic and Cognition, Sun Yat-Sen University
Imprecise probabilities arise naturally whenever there is epistemic uncertainty about statistical information such as data censoring, missing values, demographic uncertainty from using discrete data to estimate continuous parameters, uncertainty about the shape of the distribution from which data are sampled or the stationarity of the sampling process, unknown dependence among input parameters, imprecisely specified distribution parameters, or constraint information without sample data. Imprecise probabilities also arise naturally whenever any of the strict assumptions commonly used in statistical inference is relaxed, such as in robust Bayes or Bayesian sensitivity analysis, or when likelihood functions are shallow and all parameter values in a neighborhood of large likelihood are taken to be plausible characterizations rather than selecting only the value with the maximal likelihood. Expressing these imprecise probabilities as probability boxes (i.e., bounds on distribution functions) enables analysts to project these various sources of uncertainty in rigorous and quantitative calculations.
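For one of the listed sources, interval-censored data, the corresponding p-box can be sketched directly from the data (illustrative values): the upper bound on the distribution function comes from the left endpoints and the lower bound from the right endpoints:

```python
# Sketch: an empirical p-box (bounds on the CDF) from interval-censored
# observations.  The upper bound treats every value as its left endpoint,
# the lower bound as its right endpoint.  Data are illustrative.
import numpy as np

def empirical_pbox(intervals, grid):
    left = np.sort([lo for lo, hi in intervals])
    right = np.sort([hi for lo, hi in intervals])
    n = len(intervals)
    upper = np.searchsorted(left, grid, side="right") / n    # upper CDF bound
    lower = np.searchsorted(right, grid, side="right") / n   # lower CDF bound
    return lower, upper

data = [(1.2, 1.9), (2.0, 2.4), (2.2, 3.5), (0.8, 1.6), (3.0, 3.2)]
grid = np.linspace(0, 4, 9)
lower, upper = empirical_pbox(data, grid)
for g, lo, up in zip(grid, lower, upper):
    print(f"F({g:.1f}) lies in [{lo:.2f}, {up:.2f}]")
```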
Traditional and ludic quantification of the meanings of hedges in numeric expressions
Scott Ferson, Jason O’Rawe, Jimmie Goode, James Mickley, Christian Luhmann, William McGill, and Jack Siegrist
12 December 2012, San Francisco, California
Society for Risk Analysis annual meeting
An important part of processing elicited numerical inputs is an ability to quantitatively decode natural-language words that are commonly used to express or modify numerical values, such as about, around, almost, exactly, nearly, below, at least, order of, etc., which are collectively known as hedges. Figuring out the quantitative implications of hedges is essential to understanding what a patient is reporting when he says, for example, that he has had a headache for about 7 days, and to translating the patient's complaint into a numerical interval or other uncertain number. We have collected numerical information about hedges from direct questionnaires distributed to students, but this method of interrogation is inefficient, and tiresome or even dizzying for the informant. Fixed questions in static questionnaires are also highly susceptible to psychometric artifacts from cognitive biases such as anchoring. We describe a research strategy based on von Ahn's use of games for harvesting human intelligence to determine the quantitative meaning of hedges. Artfully designed games can elicit information as a side-effect of play so that the elicitation process is enjoyable to participants who will thus play longer and share more of their intelligence. From several possible variants, we describe two games implemented on Facebook that can be used to elicit the quantitative meanings of hedge words. The first game, What Can You Say?, is essentially a direct translation of the research question. The second game, Liar!, may be more ludicly compelling because it evokes cheater detection skills in players.
Probabilistic risk analysis with hardly any data
Scott Ferson and Dan Rozell
9 and 13 December 2012, San Francisco, California
[day-long workshop, given twice] Society for Risk Analysis continuing education program
This tutorial explains how you can develop a fully probabilistic risk analysis even though there may be very little empirical data available on which to base the analysis. The talks are organized around the basic problems that risk analysts face: not knowing the input distributions, not knowing their correlations, not being sure about the model itself, or even which variables should be considered. Possible strategies include traditional approximative methods and recent robust and bounding methods. Numerical examples are given that illustrate the use of various methods including traditional moment propagation, PERT, maximum entropy, Fermi estimates, uniformity principle, probability bounds analysis, Bayesian model averaging and the old work horse, sensitivity analysis. All of the approaches can be used to develop a fully probabilistic estimate useful for screening decisions and other planning. The advantages and drawbacks of the various approaches are examined. The discussion addresses how defensible decisions can be made even when little information is available, and when one should break down and collect more empirical data and, in that case, what data to collect. When properly formulated, a probabilistic risk analysis reveals what can be inferred from available information and characterizes the reliability of those inferences. In cases where the available information is insufficient to reach unambiguous conclusions, best-possible bounding probabilistic risk analysis provides a compelling argument for further empirical research and data collection.
Risk analysis from an evolutionary perspective
13 July 2012, Auckland, New Zealand
Social Values Intersecting Risk Analysis, International speaker, University of Auckland, New Zealand
Many risk analysts believe that the public is often irrational about risks and perhaps incapable of understanding the careful calculations of probabilities about adverse events undertaken in risk assessments. But the truth may be that it is the risk analysts who are delusional about how risks should be computed and expressed. Psychometricians have cataloged a large variety of “paradoxes” and “biases” exhibited by humans, but they have not noticed that many of these supposed misconceptions are explicable and justified by the need to distinguish different kinds of uncertainty. Humans are in fact wired by evolution to process incertitude (epistemic uncertainty) separately and differently from variability (aleatory uncertainty), so risk assessments that conflate these two kinds of uncertainty, as almost all assessments today do, will generally be unintelligible or irrelevant to the public. Risk analysts also seem to be in the dark about what risk analysis as a practice really is. Ecologists recognize that communication is not merely the "sharing of information" but rather a behavior by a speaker intended to evoke a reaction in a listener. Likewise, psychologists recognize that the purpose of reasoning is not to uncover truth; rather, reasoning is a mechanism by which we craft arguments that will be compelling to a listener. Risk analysts may believe themselves to be disinterested bringers of truth, but people implicitly know that risk communicators are trying to get them to do something. And they also know that analysts’ numbers could be wrong, and that they could be lying. With this perspective, it is entirely reasonable to consider the risks presented by risk analysts to be lower bounds on the true risks. Thus, risk analysts who complain that the public doesn't understand risks because people act as if the risks are larger than they are reported to be may themselves be the ones who do not understand what is happening in the practice of risk communication broadly.
Biology and evolution of risk and uncertainty perception
17 July 2012, Sydney, New South Wales, Australia
International speaker, Frontiers of Risk Analysis, SRA on Campus, University of Sydney
Many risk analysts believe that the public is often irrational about risks and perhaps incapable of understanding the careful calculations of probabilities about adverse events undertaken in risk assessments. But the truth may be that it is the risk analysts who are delusional about how risks should be computed and expressed. Probabilists have cataloged a large variety of “paradoxes” and “biases” exhibited by humans, but they have not noticed that many of these supposed misconceptions are explicable and justified by the need to distinguish different kinds of uncertainty. Humans are in fact wired by evolution to process incertitude (epistemic uncertainty) separately and differently from variability (aleatory uncertainty), so risk assessments that conflate these two kinds of uncertainty, as almost all assessments today do, will generally be unintelligible or irrelevant to the public. Risk analysts also seem to be in the dark about what risk analysis as a practice really is. Ecologists recognize that communication is not merely the "sharing of information" but rather a behavior by a speaker intended to evoke a reaction in a listener. Likewise, psychologists recognize that the purpose of reasoning is not to uncover truth; rather, reasoning is a mechanism by which we craft arguments that will be compelling to a listener. Risk analysts may believe themselves to be disinterested bringers of truth, but people implicitly know that risk communicators are trying to get them to do something. And they also know that analysts’ numbers could be wrong, and that they could be lying. With this perspective, it is entirely reasonable to consider the risks presented by risk analysts to be lower bounds on the true risks. Thus, risk analysts who complain that the public doesn't understand risks because people act as if the risks are larger than they are reported to be may themselves be the ones who do not understand what is happening in the practice of risk communication broadly.
Factoring out bias and overconfidence: advanced bias correction in risk analysis
Scott Ferson, Jack Siegrist, Michael Balch and Adam Finkel
6 December 2011, Charleston, South Carolina
Society for Risk Analysis annual meeting
Numerical estimates produced by experts and lay people alike are commonly biased as a result of self-interest on the part of the persons making the estimates. There is also empirical evidence that expressions of uncertainty are much smaller than justified. Simple scaling, shifting or inflating corrections are widely used to account for such biases and overconfidence, but better distributional information is usually available, and fully using this information can yield corrected estimates that properly express uncertainty. Corrections can be made in two distinct ways. First, predictions can be convolved with an empirical distribution or p-box of observed errors (from data quality or validation studies) to add uncertainty about predictions associated with model error. Second, predictions can be deconvolved to remove some of the uncertainty about predictions associated with the measurement protocol. In both of these cases, the structure of errors can be characterized as a distribution or p-box with arbitrary complexity. We illustrate the requisite calculations to make these corrections with numerical examples. We conclude (1) the notion of 'bias' should be understood more generally in risk analysis to reflect both location and uncertainty width, (2) self-interest bias and understatement of uncertainty are common, large in magnitude, and should not be neglected, (3) convolution can be used to inflate uncertainty to counteract human psychology, and (4) deconvolution can be used to remove some of the uncertainty associated with measurement errors.
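The first of the two corrections can be sketched in a few lines (illustrative distributions only, and assuming an additive error model):

```python
# Sketch (illustrative distributions only): the convolution correction.
# Prediction samples are combined with samples of observed model error
# (from data-quality or validation studies) so the corrected estimate
# carries the extra uncertainty.
import numpy as np

rng = np.random.default_rng(0)
predictions = rng.lognormal(mean=0.0, sigma=0.3, size=100_000)   # model output
errors = rng.normal(loc=0.1, scale=0.5, size=100_000)            # observed prediction errors

corrected = predictions + errors          # additive error model (an assumption)

for name, s in (("raw", predictions), ("corrected", corrected)):
    print(f"{name:9s} median={np.median(s):.2f}  95th pct={np.percentile(s, 95):.2f}")
```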
Probability paradoxes explained by the second uncertainty processor
Jack Siegrist, Scott Ferson, Christian Luhmann and Lev Ginzburg
5 December 2011, Charleston, South Carolina
Society for Risk Analysis annual meeting
Epistemic uncertainty, which is the result of poor measurements and incomplete knowledge, is often distinguished from aleatory uncertainty, which is caused by stochasticity or variability in the world. As recognized at least since Keynes and Knight, humans treat these two kinds of uncertainty very differently. When given simple frequency information, humans appear to make risk calculations in a manner consistent with probability theory and Bayesian norms, but the presence of even small epistemic uncertainty disrupts these calculations and produces decisions that typically focus exclusively on the worst-case outcomes, ignoring the available probability information. There are, in fact, many similar cognitive “biases and heuristics” that have been described by decision scientists and psychometricians over the last several decades, which are widely considered manifestations of human irrationality about risks and decision making.
Recent clinical and neuroimaging evidence suggests that humans are endowed with at least two special-purpose uncertainty processors in the multicameral human brain. One of these processors is devoted to risk calculations while the other handles detection and processing of epistemic uncertainty. These processors are localized in different parts of the brain and are mediated by different chemical systems that are separately activated by the format of sensory input. When both processors are activated, they can give conflicting resolutions, but the brain appears to often give priority to considerations of incertitude over variability.
We explore the effect that these competing processors have on perception and cognition of uncertainty and suggest that several famous paradoxes in probability and decision making may arise because of the interplay between these mental processors. These phenomena include loss aversion, ambiguity aversion and the Ellsberg Paradox, hyperbolic discounting, the two-dimensionality of risk perception, and others. Although these phenomena are usually presumed to be biases or cognitive illusions, we describe the adaptive significance of these phenomena in humans and other species and place them in an evolutionary context where they do not appear to be failings of the human brain but rather adaptations. The psychological and neurological evidence suggests that epistemic and aleatory uncertainty should not be rolled up into one mathematical concept in risk assessment, but require distinct approaches that respect biological realities within the decision-maker.
Probabilistic risk analysis with hardly any data
Scott Ferson and Dan Rozell
4 and 8 December 2011, Charleston, South Carolina
[day-long workshop, given twice] Society for Risk Analysis continuing education program
This tutorial explains how you can develop a fully probabilistic risk analysis even though there may be very little empirical data available on which to base the analysis. The talks are organized around the basic problems that risk analysts face: not knowing the input distributions, not knowing their correlations, not being sure about the model itself, or even which variables should be considered. Possible strategies include traditional approximative methods and recent robust and bounding methods. Numerical examples are given that illustrate the use of various methods including traditional moment propagation, PERT, maximum entropy, Fermi estimates, uniformity principle, probability bounds analysis, Bayesian model averaging and the old work horse, sensitivity analysis. All of the approaches can be used to develop a fully probabilistic estimate useful for screening decisions and other planning. The advantages and drawbacks of the various approaches are examined. The discussion addresses how defensible decisions can be made even when little information is available, and when one should break down and collect more empirical data and, in that case, what data to collect. When properly formulated, a probabilistic risk analysis reveals what can be inferred from available information and characterizes the reliability of those inferences. In cases where the available information is insufficient to reach unambiguous conclusions, best-possible bounding probabilistic risk analysis provides a compelling argument for further empirical research and data collection.
Verified computation with probabilities
2 August 2011, Boulder, Colorado
Keynote address, IFIP/NIST Working Conference on Uncertainty Quantification in Scientific Computing
National Institute for Standards and Technology
(slides)
Interval analysis is often offered as the method for verified computation, but the pessimism in the old saw that "interval analysis is the mathematics of the future, and always will be" is perhaps justified by the impracticality of interval bounding as an approach to projecting uncertainty in real-world problems. Intervals cannot account for dependence among variables, so propagations commonly explode to triviality. Likewise, the dream of a workable 'probabilistic arithmetic', which has been imagined by many people, seems similarly unachievable. Even in sophisticated applications such as nuclear power plant risk analyses, whenever probability theory has been used to make calculations, analysts have routinely assumed (i) probabilities and probability distributions can be precisely specified, (ii) most or all variables are independent of one another, and (iii) model structure is known without error. For the most part, these assumptions have been made for the sake of mathematical convenience, rather than with any empirical justification. And, until recently, these or essentially similar assumptions were pretty much necessary in order to get any answer at all. New methods now allow us to compute bounds on estimates of probabilities and probability distributions that are guaranteed to be correct even when one or more of the assumptions is relaxed or removed. In many cases, the results obtained are the best possible bounds, which means that tightening them would require additional empirical information. This talk will present an overview of probability bounds analysis, as a computationally practical implementation of imprecise probabilities that combines ideas from both interval analysis and probability theory to sidestep the limitations of each.
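To make the dependence problem concrete, here is a minimal Python sketch (not from the talk) of why naive interval propagation can explode to triviality: interval subtraction cannot recognize that its two operands are the same quantity.
```python
# Minimal sketch: interval subtraction with no dependence information.
# Because the arithmetic cannot see that both operands are the same
# quantity, x - x is not zero but a vacuously wide interval.

def interval_sub(a, b):
    """Worst-case bounds on a - b when nothing is known about dependence."""
    return (a[0] - b[1], a[1] - b[0])

x = (0.0, 1.0)
print(interval_sub(x, x))   # (-1.0, 1.0): the true value, 0, is lost in the blow-up
```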
Probability paradoxes explained by the second uncertainty processor
Scott Ferson, Jack Siegrist and Lev Ginzburg
10 June 2011, Yorktown Heights, New York
IBM, Thomas J. Watson Research Center
Epistemic uncertainty, which is the result of poor measurements and incomplete knowledge, is often distinguished from aleatory uncertainty, which is caused by stochasticity or variability in the world. As recognized at least since Keynes and Knight, humans treat these two kinds of uncertainty very differently. When given simple frequency information, humans appear to make risk calculations in a manner consistent with probability theory and Bayesian norms, but the presence of even small epistemic uncertainty disrupts these calculations and produces decisions that typically focus exclusively on the worst-case outcomes, ignoring the available probability information. There are, in fact, many similar cognitive “biases and heuristics” that have been described by decision scientists and psychometricians over the last several decades, which are widely considered manifestations of human irrationality about risks and decision making.
Recent clinical and neuroimaging evidence suggests that humans are endowed with at least two special-purpose uncertainty processors in the multicameral human brain. One of these processors is devoted to risk calculations while the other handles detection and processing of epistemic uncertainty. These processors are localized in different parts of the brain and are mediated by different chemical systems that are separately activated by the format of sensory input. When both processors are activated, they can give conflicting resolutions, but the brain often appears to give priority to considerations of incertitude over variability.
We explore the effect that these competing processors have on perception and cognition of uncertainty and suggest that several famous paradoxes in probability and decision making may arise because of the interplay between these mental processors. These phenomena include loss aversion, ambiguity aversion and the Ellsberg Paradox, hyperbolic discounting, the two-dimensionality of risk perception, and others. Although these phenomena are usually presumed to be biases or cognitive illusions, we describe the adaptive significance of these phenomena in humans and other species and place them in an evolutionary context where they do not appear to be failings of the human brain but rather adaptations. The psychological and neurological evidence suggests that epistemic and aleatory uncertainty should not be rolled up into one mathematical concept in risk assessment, but require distinct approaches that respect biological realities within the decision-maker.
Probability boxes: quality assurance for probabilistic risk analyses
August 2011, Hampton, Virginia
National Institute of Aerospace Summer Design Institute
Even sophisticated applications of probabilistic risk analyses often assume that probabilities and probability distributions are precisely known although data are sparse, or that most variables are mutually independent, or that lack of knowledge of dependence justifies independence. In general, they often assume that uncertainty about model structure can be safely neglected. For the most part, these assumptions have been made for the sake of mathematical convenience, rather than with any empirical justification. And, until recently, these or essentially similar assumptions were pretty much necessary in order to get any quantitative answer at all. New methods including probability bounds analysis, which is the calculus for probability boxes, now allow us to compute bounds on estimates of probabilities and probability distributions that are guaranteed to be correct even when one or more of the assumptions is relaxed or removed. In many cases, the results obtained are the best possible, which means that tightening them would require additional empirical information. Probability bounds analysis is a computationally practical implementation of the theory of imprecise probabilities, which combines ideas from both bounding analysis and probability theory to sidestep the limitations of each. It allows probability boxes to be used in a variety of applications including QMU assessments, screening risk assessments, and quality assurance reviews of traditional one- or two-dimensional Monte Carlo analyses. The methods are so efficient that they can be deployed in a scheme we call ‘quiet doubt’ in which uncertainty analyses are conducted automatically and pervasively whenever numerical calculations are done, in a way that is similar to how modern word processors automatically do spell checking.
Beyond probability: a pragmatic approach to uncertainty quantification in engineering
4 May 2011, Williamsburg, Virginia
First NASA Statistical Engineering Symposium
(slides)
Statistically detecting clustering for rare events
Jack Siegrist, Scott Ferson, Jimmy Goode, and Roger Grimson
13 April 2011, Hyattsville, Maryland
International Conference on Vulnerability and Risk Analysis and Management (ICVRAM) and International Symposium on Uncertainty Modeling and Analysis (ISUMA)
Cluster detection is considered to be essential for recognizing design flaws and cryptic common-mode or common-cause dependencies among events such as component failures, but when such events are rare, the uncertainty inherent in sparse datasets makes statistical analysis challenging. Traditional statistical tests for detecting clustering assume asymptotically large sample sizes and are therefore not applicable when data are sparse—as they generally are for rare events. We describe several new statistical methods that can be used to detect clustering of rare events in ordered cells. The new tests employ exact methods based on combinatorial formulations so that they yield exact p-values and cannot violate their nominal Type I error rates like the traditional tests do. As a result, the new tests are reliable whatever the size of the data set, and are especially useful when data sets are extremely small. We characterize the relative statistical power of the new tests under different kinds of clustering mechanisms and data set configurations.
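As a rough illustration of what "exact" means here, the sketch below (a hypothetical brute-force test, not one of the new tests in the paper) enumerates every equally likely placement of a handful of events among ordered cells and computes the exact null probability that some cell is at least as crowded as observed.
```python
# Hypothetical brute-force illustration: the exact p-value for the statistic
# "largest number of events in any one cell", under a null model that places
# each event uniformly and independently among the ordered cells.  Complete
# enumeration is feasible only for tiny data sets, which is exactly the
# rare-event setting.

from itertools import product
from collections import Counter

def exact_pvalue_max_cell(n_cells, events):
    """Exact P(max cell count >= observed maximum) under uniform placement."""
    observed = max(Counter(events).values())
    total = extreme = 0
    for assignment in product(range(n_cells), repeat=len(events)):
        total += 1
        if max(Counter(assignment).values()) >= observed:
            extreme += 1
    return extreme / total

# Five events in eight ordered cells, three of them landing in the same cell
print(exact_pvalue_max_cell(8, [2, 2, 2, 5, 7]))
```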
Statistical inference under two structurally different approaches to interval uncertainty
Scott Ferson and Jack Siegrist
11 April 2011, Hyattsville, Maryland
International Conference on Vulnerability and Risk Analysis and Management (ICVRAM) and International Symposium on Uncertainty Modeling and Analysis (ISUMA)
Two broadly different approaches have been proposed for handling data that contain non-negligible interval uncertainty from censoring, plus-minus digital readouts, and other sources of measurement imprecision or incertitude. Modeling interval data with uniform distributions over their ranges allows relatively straightforward calculation of sample statistics, but does not guarantee these estimates will approach the parameters of the actual distributions, even for asymptotically many random samples. In contrast, modeling interval data as bounds on possible values yields corresponding bounds on sample statistics that are easier to interpret although often more difficult to calculate. We illustrate the approaches in estimating descriptive statistics, empirical distribution functions, and best-fit distributions. Statistical inference under the bounding approach generally yields a class of decisions under the theory of imprecise probabilities. In contrast, the uniforms approach will yield a unique decision (up to indifference), although this decision cannot be said to be implied by the data alone because it depends on ancillary assumptions that may not be tenable.
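As a small illustration of the contrast, the sketch below (with made-up interval data, and for the sample mean only) shows that the uniforms approach yields one number while the bounding approach yields a range.
```python
# Small sketch with hypothetical interval data, contrasting the two
# treatments for the sample mean only.

data = [(1.0, 3.0), (2.0, 2.5), (0.5, 4.0)]   # hypothetical interval measurements

# Uniforms approach: each datum is a uniform distribution over its range,
# so the expected sample mean is the mean of the interval midpoints.
mean_uniform = sum((lo + hi) / 2 for lo, hi in data) / len(data)

# Bounding approach: the sample mean is itself only known to lie between
# the mean of the lower endpoints and the mean of the upper endpoints.
mean_bounds = (sum(lo for lo, _ in data) / len(data),
               sum(hi for _, hi in data) / len(data))

print(mean_uniform)   # 2.1666...
print(mean_bounds)    # (1.1666..., 3.1666...)
```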
Uncertainty arithmetic on Excel spreadsheets: add-in for intervals, distributions and p-boxes
Scott Ferson, James Mickley and William McGill
11 April 2011, Hyattsville, Maryland
International Conference on Vulnerability and Risk Analysis and Management (ICVRAM) and International Symposium on Uncertainty Modeling and Analysis (ISUMA)
Despite their limitations as a platform for calculations, Microsoft Excel spreadsheets enjoy widespread use throughout much of engineering and science, and they have emerged as a lingua franca for computations in some quarters. Given their ubiquity, it would be useful if Excel spreadsheets could express uncertainty in inputs and propagate uncertainty through calculations. We describe an add-in for Microsoft Excel that supports arithmetic on uncertain numbers, which include intervals, probability distributions, and p-boxes (i.e., bounds on probability distributions). The software enables native calculations in Excel with these objects and ordinary scalar (real) numbers. The add-in supports basic arithmetic operations (+, −, ×, ÷, ^, min, max), standard mathematical functions (exp, sqrt, atan, etc.), and Excel-style cell referencing for both function arguments and uncertain number results. Graphical depictions of uncertain numbers are created automatically. Using function overloading, the standard Excel syntax is extended for uncertain numbers so that the software conducts uncertainty analyses almost automatically and does not require users to learn entirely new conventions or special-purpose techniques.
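The sketch below suggests, in Python rather than in the add-in itself and with a hypothetical Interval class standing in for its uncertain numbers, how operator overloading lets uncertain values mix with ordinary scalars in familiar formula syntax; it illustrates the idea only and is not the add-in's implementation.
```python
# Illustrative sketch only: operator overloading lets a hypothetical
# uncertain-number class, here a bare interval, participate in ordinary
# formula-style arithmetic alongside scalars.

class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    def _coerce(self, x):
        return x if isinstance(x, Interval) else Interval(x, x)
    def __add__(self, other):
        o = self._coerce(other)
        return Interval(self.lo + o.lo, self.hi + o.hi)
    __radd__ = __add__
    def __mul__(self, other):
        o = self._coerce(other)
        p = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(p), max(p))
    __rmul__ = __mul__
    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

dose = Interval(1.2, 1.8)     # an uncertain input, like a cell holding [1.2, 1.8]
print(2 * dose + 0.5)         # [2.9, 4.1]; the formula reads like ordinary arithmetic
```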
Risk analysis with hardly any data
Scott Ferson and Lev Ginzburg
6 June 2011, Wilmington, Delaware
DuPont Crop Protection Statistics and Modeling Group
What is the right model?
22 February 2011, Ottawa, Ontario, Canada
“Better models – better assessments: The use of models in plant health and biotechnology risk assessment” Symposium on using models to assess biotechnology risk, Canadian Food Inspection Agency
Probability bounds analysis
Scott Ferson and Lev Ginzburg
19 November 2010, Cincinnati, Ohio
Centers for Disease Control, The National Institute for Occupational Safety and Health
This day-long presentation will introduce probability bounds analysis for conducting health and safety assessments when relevant empirical data are sparse and germane scientific understanding is deficient about the underlying physical and biological mechanisms at play. We review methods and software tools to make calculations despite these uncertainties, and discuss the psychological and practical issues involved in decision making based on the necessarily imperfect results.
Probability bounds analysis
Workshop on the Assessment and Communication of Risk and Uncertainties…
22 January 2010, Santa Monica, California
Worst-case analysis and interval bounding analysis are useful when incertitude (epistemic uncertainty) is the only kind of uncertainty present. However, not all uncertainty is incertitude, and perhaps most uncertainty that analysts face isn’t incertitude. In fact, in some contexts, incertitude may be entirely negligible. If so, standard probability theory is perfectly sufficient to model the system in this context. However, some analysts face non-negligible incertitude. Handling it with standard probability theory requires assumptions that may not be tenable, including randomness, unbiasedness, homoscedasticity, uniformity, and independence. In such situations, it can be useful to know what difference the incertitude might make in the outcomes, and this can be discovered by bounding probabilities. The results are mathematically equivalent to the results of sensitivity studies, and should be no more controversial.
We're doing something wrong
29 September 2009, Wellington, New Zealand
Keynote address, Australian and New Zealand Society for Risk Analysis
Wishful thinking is a common and understandable reaction to severe uncertainty, but it is not a helpful one in such a context. Much of what is offered as serious analyses is in fact little different from wishful thinking. Qualitative analysis has several limitations that argue against its use in regulatory and other settings. A quantitative and fully numerical approach that accounts for uncertainty with bounding sidesteps the kinds of problems that afflict qualitative approaches and offers a workable alternative to ignoring or applying wishful thinking.
Epistemic and aleatory uncertainty in engineering design
Scott Ferson, W. Troy Tucker and Lev Ginzburg
2009, Langley, Virginia
National Aeronautics and Space Administration, NASA Langley
Probability bounds and sensitivity analysis
Scott Ferson and Lev Ginzburg
22-23 February 2009, Gainesville, Florida
Engineering Risk Control and Optimization Conference, University of Florida
Whenever probability theory has been used to make calculations, analysts have routinely assumed (i) probabilities and probability distributions can be precisely specified, (ii) variables are all independent of one another or linearly correlated with well known coefficients, and (iii) model structure is known without error. For the most part, these assumptions have been made for the sake of mathematical convenience, rather than with any empirical justification. And, until now, these assumptions were pretty much necessary in order to get any answer at all. New methods now allow us to compute bounds on estimates of probabilities and probability distributions that are guaranteed to be correct even when one or more of the assumptions is relaxed or removed completely. In many cases, the results obtained are the best possible bounds, which means that tightening them would require additional empirical information. This talk will present an overview of probability bounds analysis and sensitivity analysis when both epistemic and aleatory uncertainties are present.
Two structurally different approaches to interval data
10 December 2008, Boston, Massachusetts
Society for Risk Analysis annual meeting
Data sets whose values contain interval uncertainty arise from various kinds of censoring, intermittent measurements, missing data, plus-or-minus digital readouts, data binning, and intentional data blurring for privacy and security reasons. Two broadly different approaches to such data have been proposed for situations in which the uncertainty about the values cannot reasonably be neglected. The first approach, which can trace its roots to Laplace’s principle of insufficient reason, models the interval uncertainty of each datum with a uniform distribution over its range. It allows relatively straightforward calculation of sample statistics. However, it does not necessarily have good statistical properties. In particular, it cannot guarantee that estimates computed from such models will approach the parameters of the actual distribution from which the data were drawn even when there are asymptotically many random samples. The second approach is completely different from the first and models the interval uncertainty of each datum solely in terms of the bounds on the possible value, which corresponds not to any single distribution but rather to a class of distributions all having support over the interval’s range. This approach is motivated under the theory of imprecise probabilities so it has a much more recent heritage. Although calculation of even basic descriptive sample statistics such as the variance is generally computationally difficult under this approach, it nevertheless has several interpretational advantages. The differences and advantages of each approach are illustrated with problems such as estimating descriptive statistics, the empirical distribution function, and best-fit normal distributions.
Verified computation with probabilities
3 October 2008, El Paso, Texas
Plenary address, Scientific Computing, Computer Arithmetic and Verified Numerical Computations (SCAN'08)
Population-level ecorisk analysis
16-17 July 2008, Setauket, New York
U.S. Army Corps of Engineers workshop at Applied Biomathematics
We review the basic issues, modeling problems, approaches to solutions, and software tools available for population-level analysis of ecological risk. Stochastic models focusing on scalar, age- and stage-structured, and spatially explicit population models are considered for single species. These models can be extended to multiple species in foodweb communities organized by Lotka-Volterra or ratio-dependent trophic interactions.
Population-level ecorisk analysis
4 June 2008, Bar Harbor, Maine
Plenary tutorial, Society for Environmental Toxicology and Chemistry, North Atlantic Chapter
This day-long tutorial will introduce the risk language for characterizing population-level ecological impacts in single and multispecies assessments, species classification schemes, and predictive models. The models include age- and stage- structure, nonlinear density dependence, temporal variability and catastrophes, ecotoxicology, spatial structure and habitat trends. Methods that can be used when data are sparse are emphasized. The tutorial will review data requirements and selecting a model, other resources for population modeling, and possible approaches to validation.
How to measure a degree of mismatch between probability models, p-boxes, etc.: a decision theory-motivated utility-based approach
W. Troy Tucker, Luc Longpré and Scott Ferson
20 May 2008, New York City
North American Fuzzy Information Processing Society
Probability boxes as info-gap models
Scott Ferson and W. Troy Tucker
20 May 2008, New York, New York
Special Session on Generalized Information Theory at NAFIPS’08
Just as an interval bounds an uncertain real value, a probability box bounds an uncertain cumulative distribution function. It can express doubt about the shape of a probability distribution, the distribution parameters, the nature of intervariable dependence, or some other aspect of model uncertainty. Probability bounds analysis rigorously projects probability boxes through mathematical expressions. Ben-Haim’s info-gap decision theory is a non-probabilistic decision theory that can address poorly characterized and even unbounded uncertainty. It bases decisions on optimizing robustness to failure rather than expected utility. Nested probability boxes can be used to define info-gap models for probability distributions, and probability bounds analysis provides a ready calculus for the calculations needed for an info-gap analysis involving probabilistic uncertainty.
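A brief sketch of the nesting idea follows; the particular family used here (normal CDFs whose mean may deviate from a nominal value by at most the horizon of uncertainty h) is assumed only for illustration and is neither Ben-Haim's model nor the one developed in the talk.
```python
# Hedged sketch of nested p-boxes indexed by an info-gap horizon h.
# The normal family with an h-wide band of admitted means is an assumption
# made purely for illustration.

from scipy.stats import norm

def pbox_bounds(x, nominal_mean=0.0, sd=1.0, h=0.5):
    """Pointwise CDF bounds over all means within h of the nominal mean."""
    upper = norm.cdf(x, loc=nominal_mean - h, scale=sd)  # smallest admitted mean
    lower = norm.cdf(x, loc=nominal_mean + h, scale=sd)  # largest admitted mean
    return lower, upper

# As h grows, the bounds widen and each p-box contains the previous one.
for h in (0.0, 0.5, 1.0):
    print(h, pbox_bounds(1.0, h=h))
```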
Accounting for epistemic and aleatory uncertainty in early system design
Scott Ferson, W. Troy Tucker, Christian J.J. Paredis and Lev Ginzburg
1 April 2008, Langley, Virginia
NASA Langley, National Aeronautics and Space Administration
We introduce p-boxes and probability bounds analysis for use in the early phases of the design of spacecraft. We also demonstrate software tools implementing these structures and analyses and explain how uncertainty can be communicated among members of the design team under the ICE system.
Propagating uncertainties in modeling nonlinear dynamic systems
Joshua A. Enszer, Youdong Lin, Scott Ferson, George F. Corliss, and Mark A. Stadtherr
20 February 2008, Savannah, Georgia
Reliability in Engineering Conference 2008
Engineering analysis and design problems, either static or dynamic, frequently involve uncertain parameters and inputs. Propagating these uncertainties through a complex model to determine their effect on system states and outputs can be a challenging problem, especially for dynamic models. In this work, we demonstrate the use of Taylor model methods for propagating uncertainties through nonlinear ODE models. We concentrate on uncertainties whose distribution is not known precisely, but can be represented by a probability box (p-box), and show how to use p-boxes in the context of Taylor models. This allows us to obtain p-box representations of the uncertainties in the state variables and outputs of a nonlinear ODE model. Examples are used to demonstrate the potential of this approach for studying the effect of uncertainties with imprecise probability distributions.
Validation of uncertain predictions against uncertain observations
Scott Ferson, William Oberkampf and Lev Ginzburg
20 February 2008, Savannah, Georgia
Reliability in Engineering Conference 2008
Validation is the assessment of the match between a model’s predictions and any empirical observations relevant to those predictions. This comparison is straightforward when the data and predictions are deterministic, but is complicated when either or both are expressed in terms of uncertain numbers (i.e., intervals, probability distributions, p-boxes, or more general imprecise probability structures). There are two obvious ways such comparisons might be conceptualized. Validation could measure the discrepancy between the shapes of the uncertain numbers representing prediction and data, or it could characterize the differences between realizations drawn from the respective uncertain numbers. When both prediction and data are represented with probability distributions, comparing shapes would seem to be the most intuitive choice because it sidesteps the issue of stochastic dependence between the prediction and the data values which would accompany a comparison between realizations. However, when prediction and observation are represented as intervals, comparing their shapes seems overly strict as a measure for validation. Intuition demands that the measure of mismatch between two intervals be zero whenever the intervals overlap at all. Thus, overlapping intervals are in perfect agreement even though they may have very different shapes. The unification between these two concepts relies on defining the validation measure between prediction and data as the shortest possible distance given the imprecision about the distributions and their dependencies.
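For the interval case specifically, that intuition can be captured in a few lines (a toy illustration, not the unified measure developed in the talk).
```python
# Toy illustration of the interval intuition: mismatch between two intervals
# is zero whenever they overlap, and otherwise equals the gap between them.

def interval_mismatch(a, b):
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return max(0.0, a_lo - b_hi, b_lo - a_hi)

print(interval_mismatch((0, 2), (1, 5)))   # 0.0  (overlapping intervals agree)
print(interval_mismatch((0, 2), (3, 5)))   # 1.0  (shortest distance between them)
```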
Probability bounds analysis and imprecise probabilities
23 January 2008, Albuquerque, New Mexico
Computer Science Research Institute, Sandia National Laboratories
Whenever probability theory has been used to make calculations, analysts have routinely assumed (i) probabilities and probability distributions can be precisely specified, (ii) variables are all independent of one another, and (iii) model structure is known without error. For the most part, these assumptions have been made for the sake of mathematical convenience, rather than with any empirical justification. And, until now, these assumptions were pretty much necessary in order to get any answer at all. New methods now allow us to compute bounds on estimates of probabilities and probability distributions that are guaranteed to be correct even when one or more of the assumptions is relaxed or removed. In many cases, the results obtained are the best possible bounds, which means that tightening them would require additional empirical information. This talk will present an overview of p-boxes (probability bounds analysis), imprecise probability methods, and the modeling of dependence among epistemic uncertain variables.
Validation of a probabilistic model with sparse data
Scott Ferson, W. Troy Tucker and Lev R. Ginzburg
11 December 2007, Society for Risk Analysis, San Antonio, Texas
Validation is the assessment of the match between a model’s predictions and any empirical observations relevant to those predictions. This comparison is fairly straightforward when the model is deterministic, but is complicated when the model is making probabilistic (i.e., distributional) predictions. We suggest a validation metric for probabilistic models that works for single or multiple observations. The data may exhibit substantial variability and measurement uncertainty, and possibly even statistical trends. The metric is a function of the area between the predicted distribution and the empirical distribution function for the observations. This metric retains the original scale on which the predictions and the observations are expressed so it is more intuitive than competing metrics that are expressed on abstract probabilistic scales. The approach can be used even when the model is itself so complex and expensive to run that it can only generate a small number of realizations from its prediction distribution. It can also synthesize evidence of the conformance between the model and the data into a single measure even when multiple predictions are simultaneously made on different scales or even different dimensions (such as, for instance, temperature and conductivity). A (classical) hypothesis test for the match between predictions and observations can identify statistically significant disagreement between theory and data. The properties of the proposed validation metric are examined and illustrated with an application to an engineering performance assessment problem.
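A rough sketch of an area metric of this general kind is given below; the grid, the normal prediction, and the observations are all assumed for illustration, and this is not the authors' exact formulation.
```python
# Rough sketch: area between a predicted CDF and the empirical CDF of the
# observations, computed on an assumed grid.

import numpy as np
from scipy.stats import norm

def area_metric(pred_cdf, observations, grid):
    """Area between a predicted CDF and the empirical CDF of the observations."""
    data = np.sort(np.asarray(observations))
    ecdf = np.searchsorted(data, grid, side="right") / data.size
    return np.trapz(np.abs(pred_cdf(grid) - ecdf), grid)

obs = [9.3, 10.1, 10.8, 11.5]                         # hypothetical observations
grid = np.linspace(5.0, 16.0, 2001)
print(area_metric(lambda x: norm.cdf(x, loc=10, scale=1), obs, grid))
```
Because the integration runs over the original measurement axis, the result is expressed in the same units as the predictions and observations, which is the property the abstract emphasizes.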
Risk analysis: rough but ready tools for calculations under variability and uncertainty
16 July 2007, Prague, Czech Republic
Plenary tutorial, International Symposium on Imprecise Probabilities and their Applications
Risk analysis is widely used in many disciplines to quantify risks or expectations in the face of pervasive variability and profound uncertainty about both natural and engineered systems. Although most analyses today are still based on point estimates, awkward qualitative assessments, or probabilistic calculations employing unwarranted assumptions, the methods of imprecise probability hold great promise for allowing analysts to develop quantitative models that make use of the knowledge and data that are available but do not require untenable or unjustified assumptions or simplifications. This tutorial will introduce some of the methods that are easiest to make calculations with, including probability bounds analysis, Dempster-Shafer evidence theory, and interval statistics, and will show how they can be used to address the basic problems that risk analysts face: not knowing the input distributions, not knowing their correlations, not being sure about the model itself, or even which variables should be considered. We suggest that these tools constitute a practical uncertainty arithmetic (and logic) that can be widely deployed for lots of applications. Of course, not all problems can be well solved by these relatively crude methods. Examples requiring fuller analyses with the methods of imprecise probability are described.
What can be known when uncertainty pervades observations, models, and decision rules
<<>> 2007, Stony Brook, New York
Department of Preventive Medicine, Health Sciences Center, Stony Brook University, New York
RAMAS workshop
Scott Ferson and H. Resit Akçakaya
12 February 2007, Vancouver, British Columbia
Environment Canada, user workshop on uncertainty in population viability analysis
This day-long tutorial will introduce the risk language for characterizing population-level ecological impacts in single and multispecies assessments, species classification schemes, and predictive models. The models include age- and stage- structure, nonlinear density dependence, temporal variability and catastrophes, ecotoxicology, spatial structure and habitat trends. Methods that can be used when data are sparse are emphasized. The tutorial will review data requirements and selecting a model, other resources for population modeling, and possible approaches to validation.
Propagating epistemic and aleatory uncertainty in nonlinear dynamic models
Scott Ferson, Youdong Lin, George F. Corliss and Mark A. Stadtherr
16 December 2006, Delray Beach, Florida
Fourth International Workshop on Taylor Methods
Engineering analysis and design problems frequently involve uncertain parameters and inputs. Propagating these uncertainties through a complex model to determine their effects on system states and outputs can be a challenging problem, especially for dynamic models. Lin and Stadtherr recently described the implementation of a new validating solver "VSPODE" for parametric ordinary differential equations (ODEs). Using this software, it is possible to obtain a Taylor-model representation (i.e., a Taylor polynomial function and an interval remainder bound) for the state variables and outputs in terms of the uncertain quantities. We give numerical examples to illustrate how these Taylor models can be used to propagate uncertainty about inputs through a system of nonlinear ODEs. We show that the approach can handle cases in which the uncertainty is represented by interval ranges, by probability distributions, or even by a set of possible cumulative probability distribution functions bounded by a pair of monotonically increasing functions (a "p-box"). The last case is useful when only partial information is available about the probability distributions, as is often the case when measurement error is non-negligible or in the early phases of engineering design when system features and properties have not yet been selected.
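As a much simpler stand-in for the Taylor-model machinery, the sketch below propagates a p-box through an ODE whose solution happens to be monotone in its single uncertain parameter (exponential decay, dx/dt = -kx); the lognormal envelope used for k is assumed purely for illustration, and none of this is VSPODE itself.
```python
# Hedged sketch, not VSPODE: for dx/dt = -k*x with x(0) = x0, the solution
# x(t) = x0*exp(-k*t) is monotone (decreasing) in k, so CDF bounds on k map
# directly into CDF bounds on x(t).  The lognormal envelope for k is assumed.

import numpy as np
from scipy.stats import lognorm

t, x0 = 2.0, 1.0
k_cdf_hi = lognorm(s=0.1, scale=0.9).cdf   # upper bound on the CDF of k
k_cdf_lo = lognorm(s=0.1, scale=1.1).cdf   # lower bound on the CDF of k

x_grid = np.linspace(0.05, 0.95, 5)
k_needed = -np.log(x_grid / x0) / t        # x(t) <= x exactly when k >= k_needed
x_cdf_lo = 1.0 - k_cdf_hi(k_needed)        # lower bound on P(x(t) <= x)
x_cdf_hi = 1.0 - k_cdf_lo(k_needed)        # upper bound on P(x(t) <= x)
print(np.round(x_cdf_lo, 3))
print(np.round(x_cdf_hi, 3))
```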
Strategies for risk communication: the Montauk guidance
Scott Ferson and W. Troy Tucker
6 December 2006, Baltimore, Maryland
Society for Risk Analysis annual meeting
The Montauk symposium on risk communication focused on what psychometric risk perception research, neuroscience, and the evolutionary social sciences are revealing about how we understand and respond to risks. One result of the symposium is a guidance document for practical risk communication and a research agenda organized around the need to understand the evolved mental calculi humans use to evaluate uncertainty and make decisions in the face of risk. This guidance and agenda are not endorsed or subscribed to by all symposium participants; however, we believe they provide a basis for interdisciplinary collaboration and progress towards a common language and goals.
PRA methods and tools, including Monte Carlo methods
3 October 2006, Waco, Texas
Baylor University, Society of Toxicology, Gulf Coast Chapter workshop on probabilistic risk assessment
We introduce probability (ensembles and distributions), selecting distributions, default distributions, fitting distributions (method of matching moments, maximum likelihood, regression), empirical distributions, dependence, number of replications to use, confidence interval for percentile, confidence interval for whole distribution, Latin hypercube sampling to accelerate simulation, multiple instantiation, model uncertainty, backcalculation, steps, limits and pitfalls of Monte Carlo, second-order Monte Carlo, and probability bounds analysis.
Uncertainty, sensitivity and validation
28 July 2006, Stanford, California
4pm, Building 200, Room 002, Stanford University Center for Turbulence Research (CTR) Summer Program 2006
An honest assessment of the uncertainty in calculations and model predictions may be the only difference between prudent analysis and mere wishful thinking. Although traditional methods of error analysis and uncertainty assessments are useful, they typically require untenable or unjustified assumptions. Methods are needed that can relax these assumptions to reflect what is actually known and what is not known about the underlying system. Analysts in many fields draw a careful distinction between epistemic uncertainty and aleatory uncertainty. The latter comes from variability across time or space, heterogeneity within a population, and other sources of stochasticity, and it is commonly modeled with the methods of probability theory. The former arises from measurement imprecision, residual scientific ignorance about model structure, and other forms of incertitude. Many analysts are coming to believe that alternative methods must be used to fully account for epistemic uncertainty and to properly distinguish it from aleatory uncertainty. Considerations of these two kinds of uncertainty have suggested new approaches to some of the fundamental tasks in model building, including uncertainty propagation and especially the treatment of model uncertainty, but also sensitivity analysis and validation exercises. There are some strategies that can be used even for extremely complex models that have high-dimensional inputs and require long calculation times. For example, Monte Carlo techniques and the Cauchy deviate method have errors determined only by the number of replications, rather than the dimensionality of the problem. The former can project probabilistic uncertainty and the latter projects interval-like incertitude. For models that are so complex that very few runs can be computed, Kolmogorov-Smirnov confidence procedures can assess the sampling uncertainty associated with having few replications. Neglect of model uncertainty, which is often the elephant in the living room, is especially egregious in modeling. Analysts usually construct a model and then act as though it correctly represents the world. This understates the uncertainty associated with the model’s predictions, because it fails to express that the model might be in error. Standard methods recommended to account for model uncertainty have serious deficiencies, and some tend to erase the uncertainty rather than truly propagate it through calculations. Alternative strategies will be discussed.
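One of the simplest of these devices can be written down directly; the sketch below uses a generic Dvoretzky-Kiefer-Wolfowitz band with made-up model runs, as a stand-in for the Kolmogorov-Smirnov confidence procedures mentioned above rather than the talk's own procedure.
```python
# Generic Dvoretzky-Kiefer-Wolfowitz confidence band: bounds how far the
# empirical CDF of a handful of runs can lie from the true output CDF.

import numpy as np

def dkw_band(samples, alpha=0.05):
    """Empirical CDF of the samples with a (1 - alpha) distribution-free band."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    eps = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))   # DKW half-width
    ecdf = np.arange(1, n + 1) / n
    return x, np.clip(ecdf - eps, 0.0, 1.0), np.clip(ecdf + eps, 0.0, 1.0)

runs = [3.1, 2.7, 3.6, 2.9, 3.3]   # hypothetical outputs from five expensive model runs
print(dkw_band(runs))
```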
2006
ASME workshop, Uncertainty Representation in Robust and Reliability-based Design, Philadelphia, PA
2006
Sandia National Laboratories Validation Challenge Workshop, Albuquerque, New Mexico
Reliable calculation of probabilities
26 August 2005, Copenhagen, Denmark
Second Scandinavian Workshop on Interval Methods and Their Applications
Technical University of Denmark, København
A variety of practical computational problems arise in risk and safety assessments, forensic statistics and decision analyses in which the probability of some event or proposition E is to be estimated from the probabilities of a finite list of related subevents or propositions F, G, H, ... . When the probabilities of the subevents are known only poorly from crude estimations, or the dependencies among the subevents are unknown, one cannot use traditional methods of fault/event tree analysis to estimate the probability of the top event. Representing probability estimates as interval ranges on [0,1] has been suggested as a way to address these sources of uncertainty. Interval operations are given that can be used to compute rigorous bounds on the probability of the top event which, it turns out, account for both imprecision about subevent probabilities and uncertainty about their dependencies at the same time.
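For conjunctions and disjunctions such interval operations reduce to the classical Boole-Fréchet bounds; a small worked instance with hypothetical interval estimates is sketched below.
```python
# Small worked instance (hypothetical numbers) using the classical
# Boole-Frechet bounds: given interval estimates of P(F) and P(G) and no
# knowledge of their dependence, these formulas rigorously bound
# P(F and G) and P(F or G).

def and_bounds(p, q):
    (p_lo, p_hi), (q_lo, q_hi) = p, q
    return (max(0.0, p_lo + q_lo - 1.0), min(p_hi, q_hi))

def or_bounds(p, q):
    (p_lo, p_hi), (q_lo, q_hi) = p, q
    return (max(p_lo, q_lo), min(1.0, p_hi + q_hi))

pF, pG = (0.2, 0.4), (0.3, 0.5)     # interval estimates of two subevent probabilities
print(and_bounds(pF, pG))           # (0.0, 0.4)
print(or_bounds(pF, pG))            # (0.3, 0.9)
```
The resulting intervals simultaneously reflect the imprecision of the subevent probabilities and the unknown dependence between them, which is the point made in the abstract.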
2005
Joint Regulatory Science Forum IV, Risk Assessment Data: Why, What and How, Canberra, Australia
Estimating environmental risks in the face of uncertainty (squeezing blood from a stone)
1 August 2005, Canberra, Australia
Office of the Gene Technology Regulator
2005
Shortcourse on Probabilistic Modeling, PRA Center of CNY, Washington, DC
2005
Probabilistic Risk Assessment workshop, SETAC, Baltimore, Maryland, and East Lansing, Michigan
What Monte Carlo cannot do
3 August 2005, Hobart, Tasmania
CSIRO Division of Marine Research, Hobart, Tasmania, Australia
Looking outside the toolbox: alternative approaches that can advance SPS risk assessment
10 August 2005, Washington, DC
International Conference on Sanitary and Phytosanitary (SPS) Risk Assessment Methodology
"Optimization of SPS Regulatory Tool-Box"
Environmental contamination in ecological systems
9 June 2005, Burlington, Vermont
Keynote address, Society for Environmental Toxicology and Chemistry, North Atlantic Chapter
Many human activities introduce chemical contaminants into the natural environment. Manufacturing by-products, agricultural fertilizers and pesticides, leachates from mine tailings, combustion residues, waste and effluent streams deliver anthropogenic toxicants and other chemicals into aquatic and terrestrial ecosystems. Planning mitigation and remediation strategies and designing systems for minimum environmental impact require clear assessment of the nature, magnitude and consequence of the impacts of these contaminants. Risk assessors are beginning to appreciate the need to include ecological processes in their assessment models. The need arises because ecological systems have an inherent complexity that can completely erase the effects of an impact or greatly magnify it, depending on the life histories of the biological species involved. This complexity can also delay the consequence of an impact or alter its expression in other ways. Three central themes have emerged in ecological risk assessment:
1) Variability versus incertitude. Natural biological systems fluctuate in time and space, partially due to interactions we understand, but substantially due to various factors that we cannot foresee. The variability of ecological patterns and processes, and our incertitude about them, prevent us from making precise, deterministic estimates of the effects of environmental impacts. Because of this, comprehensive impact assessment requires a probabilistic language of risk that recognizes variability and incertitude, yet permits quantitative statements of what can be predicted. The emergence of this risk language has been an important development in applied ecology over the last decade. A risk-analytic endpoint is a natural summary that can integrate disparate impacts on a biological system.
2) Population-level assessment. In the past, assessments were conducted at the level of the individual organism, or, in the case of toxicity impacts, even at the level of tissues or enzyme function. To justify costly decisions about remediation and mitigation, biologists are often asked “So what?” questions that demand predictions about the consequences of impacts on higher levels of biological organization. Management plans require predictions of the consequent effects on biological populations and ecological communities. Our scientific understanding of community and ecosystem ecology is very limited, however, and quantitative predictions, even in terms of risks, for complex systems would require vastly more data and mechanistic knowledge than are usually available. Extrapolating the results of individual-level impacts to potential effects on the ecosystem may simply be beyond the current scientific capacity of ecology, which still lacks wide agreement about even fundamental equations governing predator-prey interactions. How can we satisfy the desire for ecological relevance when we are limited by our understanding of how ecosystems actually work? As a practical matter, focusing on populations, meta-populations (assemblages of distinct local populations), and short food chains may be a workable compromise between the organism and ecosystem levels. Risk assessment at the population level requires the combination of several technical tools including demographic models, potentially with explicit age, stage or geographic structure, and methods for probabilistic uncertainty propagation, which are usually implemented with Monte Carlo simulation. Meta-populations and short food chains are likely to be at the frontier of what we can address with scientifically credible models over the next decade.
3) Cumulative attributable risk. Assessments should focus on the change in risk due to a particular impact. The risk that a population declines to, say, 50% of its current abundance in the next 50 years is sometimes substantial whether it is impacted by anthropogenic activity or not. Only the potential change in risk, not the risk itself, should be attributed to impact. On the other hand, for environmental protection to be effective, remediation and mitigation must be designed with reference to the cumulative risks suffered by an ecological system from impacts and from all the various stresses present cumulated through time.
The elephant in the living room: what to do about model uncertainty
July 2005, Austin, Texas
Analysts usually construct a model and then act as though it correctly represents the state of the world. This understates the uncertainty associated with the model’s predictions, because it fails to express the analyst’s uncertainty that the model itself might be in error. Several approaches have been proposed to account for model uncertainty within a probabilistic assessment, including what-if analyses, stochastic mixtures, Bayesian model averaging, probability bounds analysis, robust Bayes analyses, and imprecise probabilities. Although each approach has advantages that make it attractive in some situations, each also has serious limitations. For example, several approaches require the analyst to explicitly enumerate all the possible competing models. This might sometimes be reasonable, but the uncertainty will often be more profound and there might be possible models of which the analyst is not even aware. Although Bayesian model averaging and stochastic mixture strategies are considered by many to be the state of the art in accounting for model uncertainty in probabilistic assessments, numerical examples show that both approaches actually tend to erase model uncertainty rather than truly propagate it through calculations. In contrast, probability bounds analysis, robust Bayes methods, and imprecise probability methods can be used even if the possible models cannot be explicitly enumerated and, moreover, they do not average away the uncertainty but propagate it fully through calculations.
When model uncertainty is ignored in a risk assessment, analysts may be overly confident in and thus misled by the results obtained. Probabilistic risk assessments based on Monte Carlo methods typically propagate model uncertainty by randomly choosing among possible models, which treats it in the same fashion as parameter uncertainty. Like the duck-hunting statisticians who shot above and below a bird and declared a hit, this Monte Carlo approach averages the available models, and can produce an aggregate model supported by no theory whatever. The approach represents this uncertainty in the choice of models by their mixture and the resulting answers can be dramatically wrong. We propose an alternative method that can comprehensively represent and propagate model uncertainty based on the idea of the envelope of distributions corresponding to different models. The central feature of this strategy is that it does not average together mutually incompatible models. What it provides are bounds on the resulting distribution from the risk assessment. This method is comprehensive in that it can handle the uncertainty associated with all identifiable models. It cannot, however, anticipate the true surprise of completely unrecognized mechanisms, although it may be more forgiving in such circumstances. We describe software that implements this strategy and illustrate its use in numerical examples. We also contrast the strategy with other possible approaches to the problem, including Bayesian and other kinds of model averaging.
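A toy numerical contrast (with assumed models, not taken from the talk) makes the point: averaging two incompatible models manufactures intermediate probabilities that neither model supports, while enveloping them preserves the disagreement as bounds.
```python
# Toy contrast with assumed models: model A predicts an output near 1,
# model B predicts an output near 10, and they cannot both be right.

import numpy as np
from scipy.stats import norm

x = np.linspace(-5.0, 20.0, 1001)
cdf_A = norm.cdf(x, loc=1, scale=1)
cdf_B = norm.cdf(x, loc=10, scale=1)

mixture = 0.5 * cdf_A + 0.5 * cdf_B        # model averaging: one precise CDF
envelope_lo = np.minimum(cdf_A, cdf_B)     # bounds containing whichever model is true
envelope_hi = np.maximum(cdf_A, cdf_B)

i = np.searchsorted(x, 5.5)
# The mixture asserts P(output <= 5.5) is about 0.5, a statement neither model
# makes; the envelope reports only that it lies between about 0 and about 1.
print(round(mixture[i], 2), round(envelope_lo[i], 6), round(envelope_hi[i], 6))
```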
LANL ESA-WR workshop problems
7 June 2005, Los Alamos, New Mexico
Los Alamos National Laboratory
(announcement )
Introduction to imprecise probabilities
6 June 2005, Los Alamos, New Mexico
Los Alamos National Laboratory
(announcement )
Varying correlation coefficients cannot account for uncertainty about dependence, but comprehensive methods exist
2005, Los Alamos National Laboratory
Sensitivity Analysis of Model Output (Kenneth M. Hanson and François M. Hemez, eds.)
Although risk analyses often still assume independence among input variables as a matter of mathematical convenience, most analysts recognize that inter-variable dependencies can sometimes have a substantial impact on computational results. In the face of epistemic uncertainty about dependencies, analysts sometimes employ a sensitivity study in which the correlation coefficient is varied between plausible values. This strategy is insufficient to explore the possible range of results, however, as can be shown by simple examples. Fortunately, comprehensive bounds on convolutions of probability distributions (or even bounds thereon) can be obtained using simple formulas that are computationally cheaper than Monte Carlo methods. We review the use of these formulas under variously restrictive assumptions about dependence, from no assumption at all, to specified sign of the dependence, to a particular dependence function. This work is supported by contract 19094 with Sandia National Laboratories.
Impacts of power generation and delivery on ecological systems
4-7 May 2005, Potsdam, Germany
German-American Frontiers of Engineering Symposium
Many serious ecological and environmental impacts are by-products of electrical power generation and delivery systems. In aquatic environments, these include leachate toxicity from coal mining and combustion residues, and effects of water impoundments and cooling water intake structures such as entrainment, impingement, thermal impacts, impediments to fish migration, and alteration of aquatic habitats. There are likewise many terrestrial impacts, including habitat loss (power line right-of-ways consume more habitat than highways), and interference with migration and dispersal pathways from pipelines and power lines. There are also a variety of incidental impacts associated with power generation facilities. For instance, avicides intended to disperse flocks of birds congregating on warm concrete structures of nuclear power plants enter the food chain and kill hawks and endangered owls.
Engineering minimal-impact designs, as well as planning for mitigation and remediation strategies, requires clear assessment of the nature, magnitude and consequence of these impacts. Engineers are beginning to appreciate the need to include ecological processes in their assessment models. The need arises because ecological systems have an inherent complexity that can completely erase the effects of an impact or greatly magnify it, depending on the life histories of the biological species involved. This complexity can also delay the consequence of an impact or alter its expression in other ways. For instance, after a lake’s population of bluegills was devastated by the heavy metal selenium leached from ash settling ponds, a demographic model of the species revealed that, if selenium poisoning were stopped, the population could recover to pre-impact abundances within two years. However, the increased abundance would be unevenly distributed among age groups and, following this temporary recovery, there would be a population crash to levels even lower than those originally caused by the selenium. If this crash were not forecast in advance, its unanticipated occurrence would have caused considerable consternation among managers, regulators and the interested public. It is important to predict the ecological consequences to understand the nature and duration of biological recovery from toxicological insults. Without the understanding provided by the ecological analysis, the population decline would probably have been completely misinterpreted as the failure of the mitigation program.
It is essential that we have a way to synthesize effects from multiple kinds of impacts and integrate effects from different plants and structures that are distributed across the environment. Assessments of ecological impacts are complicated by two issues: what we know about ecology and what we don’t know, i.e., the uncertainty about our models and their parameters. The state of the art in assessment of ecological impacts from energy generation has three foci:
Variability versus incertitude. Natural biological systems fluctuate in time and space, partially due to interactions we understand, but substantially due to various factors that we cannot foresee. The variability of ecological patterns and processes, and our incertitude about them, prevent us from making precise, deterministic estimates of the effects of environmental impacts. Because of this, comprehensive impact assessment requires a probabilistic language of risk that recognizes variability and incertitude, yet permits quantitative statements of what can be predicted. The emergence of this risk language has been an important development in applied ecology over the last decade. A risk-analytic endpoint is a natural summary that can integrate disparate impacts on a biological system.
Population-level assessment. In the past, assessments were conducted at the level of the individual organism, or, in the case of toxicity impacts, even at the level of tissues or enzyme function. To justify costly decisions about remediation and mitigation, biologists are often asked “so what?” questions that demand predictions about the consequences of impacts on higher levels of biological organization. Management plans require predictions of the consequent effects on ecological populations and communities. Our scientific understanding of community and ecosystem ecology is very limited, however, and quantitative predictions, even in terms of risks, for complex systems would require vastly more data and mechanistic knowledge than are usually available. Extrapolating the results of individual-level impacts to potential effects on the ecosystem may simply be beyond the current scientific capacity of ecology, which still lacks wide agreement about even fundamental equations governing predator-prey interactions. How can we satisfy the desire for ecological relevance when we are limited by our understanding of how ecosystems actually work? As a practical matter, focusing on populations and meta-populations (assemblages of distinct local populations) may be a workable compromise between the organism and ecosystem levels. Risk assessment at the population level requires the combination of several technical tools including demographic models, potentially with explicit age, stage or geographic structure, and methods for probabilistic uncertainty propagation, which are usually implemented with Monte Carlo simulation. Meta-populations are likely to be at the frontier of what we can address with scientifically credible models over the next decade.
Cumulative attributable risk. Assessments should focus on the change in risk due to a particular impact. The risk that a population declines to, say, 50% of its current abundance in the next 50 years is sometimes substantial whether it is impacted by anthropogenic activity or not. Only the potential change in risk, not the risk itself, should be attributed to impact. On the other hand, for environmental protection to be effective, remediation and mitigation must be designed with reference to the cumulative risks suffered by an ecological system from impacts and from all the various stresses present cumulated through time.
The principle of ravnoprochnost (равнопрочность, equi-sturdiness) in engineering holds that system design should be balanced, rather than an uneven mixture of gold-plated, hyperdesigned parts with shoddy, unreliable parts. This principle has an analog in uncertainty analysis: it is wasteful to expend too much effort in estimating some pieces of an assessment with great precision if the other pieces of the assessment have large uncertainties that cannot be reduced. The old joke illustrating this principle suggests that we have “half the solution” to the question of how many angels can dance on the head of a pin because we can now estimate the surface area of the head of a pin. In the context of assessing the ecological impacts of electricity generation and transmission, the principle of ravnoprochnost says we should focus attention on the components of the assessment where uncertainty has the largest consequence for the reliability of our final predictions. In complex scenarios of energy by-products causing ecological impacts, the phenomena needing attention first may be the ecological, toxicological, chemical, or physico-mechanical phenomena. In some important situations, ecological phenomena will form the crucial focus.
Overdriving the headlights: empirical data limit risk analyses
Scott Ferson, W. Troy Tucker, and Lev R. Ginzburg
<<>> 2005, Boston, Massachusetts
Society for Risk Analysis, North East Chapter
Even though there may be little relevant empirical information, Monte Carlo simulation requires an analyst to select a precise statistical distribution for every variable in an assessment. Moreover, even when there is no information about correlations among the variables, analysts must make some assumptions about their dependencies. Typically, analysts assume independence even between variables that are mechanistically related. By making assumptions merely for the sake of mathematical convenience that do not have empirical justification, risk assessments based on Monte Carlo simulation yield results that cannot be considered reliable. Although many have argued that two-dimensional or second-order probabilistic risk assessments can account for uncertainties about distribution shapes and parameters, and dependencies and model structure, it is easy to show that the results obtained from such analyses can be grossly misleading. Probability bounds analysis, on the other hand, allows an analyst to relax inappropriately precise statements about statistical distributions as well as untenable independence assumptions. It bounds the probabilistic results in a rigorous way and characterizes their reliability. It can even comprehensively account for many kinds of model uncertainty that may attend a risk calculation. Several numerical examples involving real-world ecological and human health risk calculations are described and graphically contrasted with results from precise Monte Carlo simulations and second-order simulations.
Risk perception and the problems we make for ourselves
W. Troy Tucker, Scott Ferson, and Lev R. Ginzburg
<<>> 2005, Boston, Massachusetts
Society for Risk Analysis, North East Chapter
The failures of risk communication are often blamed on public ignorance of technical issues or mistrust of industry or government. We suggest that often neither ignorance nor mistrust is fundamentally responsible for the difficulty. Instead, humans seem wired by natural evolution to use a mental calculus for reckoning risk and making decisions that can be substantially different from probability theory. We suggest that several important biases of risk perception recognized by psychometricians can be interpreted as adaptive strategies for responding to incertitude, variation, and multiple dimensions of risk. In particular, we deduce evolutionary reasons why (i) people routinely misestimate risks, (ii) people are insensitive to prior probabilities, (iii) the notion of independence is so difficult to correctly interpret, and (iv) people concentrate on the worst case (and ignore how unlikely it is). If these biases are fundamental to human perception and not removable by general education or specific training, perhaps risk analysts should make their calculations and arguments more natural, interesting, and compelling to humans. We describe such an approach to risk assessment and communication based on a practical review of recent findings in evolutionary psychology and neurobiology. Implications for medical decision making in the context of uncertainty are explored.
Uncertainty in process models: good ideas and wishful thinking
29 October 2004, Stony Brook, New York
Marine Sciences Research Center, State University of New York
An honest assessment of the uncertainty in calculations may be the only difference between prudent analysis and wishful thinking. Although traditional methods of error analysis and uncertainty assessments are useful, they typically require untenable or unjustified assumptions. Methods are needed that can relax these assumptions to reflect what is actually known and what is not known about the underlying system. Probability bounds analysis is a comprehensive approach that can relax many of the fundamental assumptions in a probabilistic risk analysis. It is based on interval estimates of probabilities, or interval bounds on distribution functions, and uses rigorous propagation methods to ensure the uncertainty about risks is not underestimated. When information is abundant, it produces answers that are identical to the answers obtained by traditional Monte Carlo simulation and other (precise) probabilistic methods. The approach has now been used in a wide variety of scientific and engineering contexts. Examples include endangered species classification, design of water control structures, and global climate change models.
Introduction to using imprecise probabilities in risk analysis
27 July 2004, Lugano, Switzerland
ISIPTA Summer School on Imprecise Probabilities
This full-day tutorial workshop introduces five practical and quantitative approaches to risk analysis based on the notions of interval-valued probabilities and imprecisely specified probability distributions. The simplest approach uses the idea of interval probability, in which the probability of an event can be specified as an interval of possible values rather than only as a precise one. The idea, dating from George Boole, provides a convenient way to assess the reliability of fault-tree risk analyses. This idea is generalized by probability bounds analysis, which propagates constraints on a distribution function through mathematical operations, and Dempster-Shafer theory which recognizes that uncertainty attending any real-world measurement may not allow an analyst to distinguish between events in empirical evidence. These approaches are related to robust Bayes (aka Bayesian sensitivity) methods, in which an analyst can relax the requirement that the prior distribution and likelihood function must be precisely specified. The most general approach comes from the theory of imprecise probabilities in which uncertainty is represented by closed, convex sets of probability distributions.
These five approaches redress, or comprehensively solve, several major deficiencies of Monte Carlo simulations and of standard probability theory in risk assessments. For instance, it is almost always difficult, if not impossible, to completely characterize precise distributions of all the variables in a risk assessment, or the multivariate dependencies among the variables. As a result, in the practical situations where empirical data are limiting, analysts are often forced to make assumptions that can result in assessments that are arbitrarily over-specified and therefore misleading. In practice, the assumptions typically made in these situations, such as independence, (log)normality of distributions, and linear relationships, can understate the risks of adverse events. The assumptions therefore fail to be "protective" in the risk-analytic sense. By relaxing the need to make such unjustified or untenable assumptions, the five approaches based on interval or imprecise probabilities can restore an appropriate degree of conservativism to the analysis.
More fundamentally, it can be argued that probability theory has an inadequate model of ignorance because it uses equiprobability as a model for incertitude and thus cannot distinguish uniform risk from pure lack of knowledge. In most practical risk assessments, some uncertainty is epistemic rather than aleatory, that is, it is incertitude rather than variability. For example, uncertainty about the shape of a probability distribution and most other instances of model uncertainty are typically epistemic. Treating incertitude as though it were variability is even worse than overspecification because it confounds epistemic and aleatory uncertainty and leads to risk conclusions that are simply wrong. The five approaches based on interval and imprecise probabilities allow an analyst to keep these kinds of uncertainty separate and treat them differently as necessary to maintain the interpretation of risk as the frequency of adverse outcomes.
The five approaches also make backcalculations possible and practicable in risk assessments. Backcalculation is required to compute cleanup goals, remediation targets and performance standards from available knowledge and constraints about uncertain variables. The needed calculations are notoriously difficult with standard probabilistic methods and cannot be done at all with straightforward Monte Carlo simulation, except by approximate, trial-and-error strategies.
Although the five approaches arose from distinct scholarly traditions and have many important differences, the tutorial emphasizes that they share a commonality of purpose and employ many of the same ideas and methods. They can be viewed as complementary, and they constitute a single perspective on risk analysis that is sharply different from both traditional worst-case and standard probabilistic approaches. Each approach is illustrated with a numerical case study and summarized by a checklist of reasons to use, and not to use, the approach.
The presentation style will be casual and interactive. Participants will receive a CD of some demonstration software and the illustrations used during the tutorial.
Overview of topics
What's missing from Monte Carlo?
Correlations are special cases of dependencies
Probability theory has an inadequate model of ignorance
Model uncertainty is epistemic rather than aleatory in nature
Backcalculation cannot be done with Monte Carlo methods
Interval probability
Conjunction and disjunction (ANDs and ORs)
Fréchet case (no assumption about dependence)
Mathematical programming solution
Case study 1: fault-tree for a pressurized tank system
Why and why not use interval probability
Robust Bayes and Bayesian sensitivity analysis
Bayes' rule and the joy of conjugate pairs
Dogma of Ideal Precision
Classes of priors and classes of likelihoods
Robustness and escaping subjectivity
Case study 2: extinction risk and conservation of pinnipeds
Why and why not use robust Bayes
Dempster-Shafer theory
Indistinguishability in evidence
Belief and plausibility
Convolution via the Cartesian product
Case study 3: reliability of dike construction
Case study 4: human health risk from ingesting PCB-contaminated waterfowl
Why and why not use Dempster-Shafer theory
Probability bounds analysis
Marrying interval analysis and probability theory
Fréchet case in convolutions
Case study 5: environmental exposure of wild mink to mercury contamination and of birds to an agricultural insecticide
Backcalculation
Case study 6: planning cleanup for selenium contamination in San Francisco Bay
Why and why not use probability bounds analysis
Imprecise probabilities
Comparative probabilities
Closed convex sets of probability distributions
Multifurcation of the concept of independence
Case study 7: medical diagnosis
Why and why not use imprecise probabilities
When the best estimate isn't good enough: interval probabilities and p-boxes
15 June 2004, Hampton, Virginia
NASA/NIA workshop Uncertainty Characterization in Systems Analysis
An honest assessment of the uncertainty in calculations may be the only difference between prudent analysis and wishful thinking. Although traditional methods of error analysis and uncertainty assessments are useful, they typically require untenable or unjustified assumptions. In critical risk assessments, such as for high-investment missions, nuclear safety, and human health risk, methods are needed that can relax these assumptions to reflect what is actually known and what is not known about the underlying system. Probability bounds analysis is a comprehensive approach that can relax many of the fundamental assumptions in a probabilistic risk analysis. It is based on interval estimates of probabilities, or interval bounds on distribution functions, and uses rigorous propagation methods to ensure the uncertainty about risks is not underestimated. When information is abundant, it produces answers that are identical to the answers obtained by traditional Monte Carlo simulation and other (precise) probabilistic methods. Probability bounds analysis is computationally more convenient than comparable methods such as Dempster-Shafer theory or more general imprecise probability approaches. The approach has now been used in a wide variety of scientific and engineering contexts.
<<BELOW THIS POINT, THIS WEBPAGE IS UNDER CONSTRUCTION>>
2004
Improved Approaches for Long-term Pesticide Risks to Birds and Mammals, CSL, York, UK
2003
Pfizer Research Global Headquarters, New London, Connecticut
2003
European Commission workshop on pesticides, EUPRA, Bilthoven, Netherlands
2002
Risk-based Decision Making in Water Resources X, United Engineering Foundation, Santa Barbara
2002
CSIRO workshop on ecological risk assessment for genetically modified organisms, Canberra, Australia
2002
Setting Priorities and Making Decisions for Conservation Risk Management, NCEAS, Santa Barbara
Don't open that envelope: solutions to the Sandia problems using probability boxes
Scott Ferson and Janos Hajagos
6 August 2002, Albuquerque, New Mexico
Epistemic Uncertainty Workshop, Sandia National Laboratory
2002
Workshop on Novel Approaches to Uncertainty Quantification, Los Alamos National Laboratory
2002
SETAC workshop, Uncertainty Analysis for Ecological Risks of Pesticides, Pensacola, Florida
Beyond Point Estimates: Risk Assessment Using Interval, Fuzzy and Probabilistic Arithmetic
December 2001, Seattle, Washington
[day-long workshop] Society for Risk Analysis continuing education workshop
2001
Los Alamos National Laboratory, Los Alamos, New Mexico
2001
USDA–ESA–SRA Workshop on Invasive Species, Las Cruces, New Mexico
2001
Uncertainty Quantification Seminar, Sandia National Laboratories, Albuquerque, New Mexico
2001
November 2001, Baltimore, Maryland
Society for Environmental Toxicology and Chemistry session "Coping with Uncertainty in Fate/Exposure Models"
2000
Risk-based Decision Making in Water Resources IX, United Engineering Foundation, Santa Barbara
2000
IUCN Workshop on Precautionary Principle and Wildlife Conservation, Cambridge, UK
Wishful thinking and prudent assumptions
11 February 2000, Arlington, Virginia
Society for Risk Analysis Forum "Uncertainty: its nature, analytical treatment and interpretation"
Uncertainty: its nature, analytical treatment and interpretation
10−11 February 2000, Arlington, Virginia
[workshop organizer] Society for Risk Analysis Forum
This two-day forum will bring together theorists and practitioners in risk analysis, policy making, philosophy and computer science to address the emerging issues about what it means to admit we're unsure. The questions to be addressed include
What's the real difference between wishful thinking and a prudent assumption?
Is it essential to distinguish uncertainty from variability and ambiguity?
Can Bayesian rationality be consistent with democratic decision making?
Is probability theory the only consistent calculus for handling uncertainty?
What are the practical implications of epistemological constraints?
When is a decision defensible even if the data it's based on are incomplete?
Do bright-line definitions of "adverse effects" thwart optimal decision making?
The conference is intended as a forum for discussion and debate, rather than merely a teaching workshop. The forum will be focused by short presentations by invited speakers and will allow ample time for open-format discussion and debate among all participants.
1999
Massachusetts Department of Environmental Protection, Boston, Massachusetts
1999
U.S. Environmental Protection Agency’s National Center for Statistics and the Environment, Seattle
1999
Harvard Center for Risk Analysis, Harvard School of Public Health, Cambridge, Massachusetts
RAMAS Risk Calc: a microcomputer environment for probabilistic arithmetic
22 August 1999, Washington, D.C.
PSA'99 International Topical Meeting on Probabilistic Safety Assessment: Risk-informed Performance-based Regulation in the New Millennium
<<abstract available>>
Quality assurance for probabilistic safety assessments
22 August 1999, Washington, D.C.
PSA'99 International Topical Meeting on Probabilistic Safety Assessment: Risk-informed Performance-based Regulation in the New Millennium
The conditional or hypothetical nature of many probabilistic safety assessments (PSAs) often precludes scientific validation in the strict sense. It is therefore all the more important in such cases to take extreme care in the model construction and calculation phases to ensure the reasonableness and accuracy of the assessment. We describe quality assurance tools for reviewing the computations involved in a PSA. These tools include (i) checks on the well-formedness and feasibility of constructs used within the PSA, and (ii) algorithms for propagating uncertainty about inputs and model structure through the PSA.
Biological invasion
12 March 1999, Lincoln, Rhode Island
Rhode Island Natural History Survey conference “Challenges & Opportunities Facing Rhode Island’s Biodiversity”
Biological invasion consists of dispersal and establishment of a species into new habitats. Invasions include diverse processes such as range expansion, spread of epidemics, and introductions of exotics, all of which can induce irreversible changes in an ecological community. The pattern of invasion is determined by how fast the species can disperse and the extent of the invadable habitat. Dispersal is often modeled with percolation theory, as exemplified by the well known “Game of Life” often played on computer screens. We learn from such models that when offspring invade only adjacent areas, the invasion is rather slow because the advance is limited by the perimeter length of the range. When offspring can invade distant areas, however, then the invasion can be much faster because it is limited by the total abundance of the population. Despite their extreme complexity, the frontiers of invasions are not fractals as some have suggested. However, measuring the complexity of the frontier may let us directly compute the relative chance for establishment of an invasion. Invasion patterns are reviewed for laboratory examples and several famous cases, including the starlings in North America, killer bees in South and Central America, and muskrat in Europe. For exotics and pest species, we are especially concerned with the chances of the population becoming large. We call the probability that a population grows to large abundances the “risk of population explosion”. Estimating this risk is the symmetrically opposite problem to estimating the risk of extinction, and similar mathematical methods can be employed for both problems.
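To make the contrast concrete, the following toy simulation is a minimal sketch, not the percolation models discussed in the talk; the grid size, colonization probability, and dispersal rules are illustrative assumptions. It contrasts adjacent-only dispersal, whose advance is limited by the perimeter of the occupied range, with long-distance dispersal, whose advance is limited only by total abundance.
    import numpy as np

    rng = np.random.default_rng(1)

    def invade(steps, n=201, local=True, p=0.5):
        """Toy invasion from a single founder on an n-by-n grid."""
        grid = np.zeros((n, n), dtype=bool)
        grid[n // 2, n // 2] = True
        counts = []
        for _ in range(steps):
            if local:
                # Offspring can colonize only the four adjacent cells, so new
                # occupancy is confined to the perimeter of the current range.
                nbr = np.zeros_like(grid)
                nbr[1:, :] |= grid[:-1, :]
                nbr[:-1, :] |= grid[1:, :]
                nbr[:, 1:] |= grid[:, :-1]
                nbr[:, :-1] |= grid[:, 1:]
                new = nbr & ~grid & (rng.random(grid.shape) < p)
            else:
                # Each occupied cell sends one propagule to a random cell
                # anywhere, so growth is limited only by total abundance.
                k = int(grid.sum())
                targets = rng.integers(0, n, size=(k, 2))
                new = np.zeros_like(grid)
                new[targets[:, 0], targets[:, 1]] = True
                new &= ~grid
            grid |= new
            counts.append(int(grid.sum()))
        return counts

    # Occupied area grows roughly quadratically with adjacent-only dispersal,
    # but roughly exponentially (until the grid fills) with distant dispersal.
    print(invade(30, local=True)[-1], invade(30, local=False)[-1])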
Environmental impact analysis at the population level
27 January 1999, Yokohama, Japan
Institute of Environmental Science and Technology, University of Yokohama
Ecological risk assessment based on extinction distributions
28 January 1999, Yokohama, Japan
International Japan Science and Technology Corporation Workshop of Risk Evaluation and Management of Chemicals
Many researchers now agree that an ecological risk assessment should be a probabilistic forecast of effects at the level of the population. The emerging consensus has two essential themes: (i) individual-level effects are less important for ecological management, and (ii) deterministic models cannot adequately portray the environmental stochasticity that is ubiquitous in nature. It is important to resist the temptation to reduce a probabilistic analysis to a scalar summary based on the mean. An assessment of the full distribution of risks will be the most comprehensive and flexible endpoint. There are two ways to visualize a distributional risk assessment of a chemical’s impact on a population. The first is to display, side by side, the two risk distributions arising from separate simulations with and without the impact but alike in every other respect. Alternatively, one can display the risk of differences between population trajectories with and without impact but alike in every other respect. Like all scientific forecasts, an ecological risk assessment requires appropriate uncertainty propagation. This can be accomplished by using a mixture of interval analysis and Monte Carlo simulation techniques.
Constrained mathematics and the repeated variable problem
9 December 1998, Phoenix, Arizona
Society for Risk Analysis annual meeting
<<abstract available>>
Uncertainty analysis via Monte Carlo: does it deliver what it promises?
8 December 1998, Phoenix, Arizona
Society for Risk Analysis annual meeting
Monte Carlo analysis has been proposed as an approach to overcome some of the weaknesses of traditional deterministic estimates in quantitative risk assessments. Traditional approaches use a combination of moderate, conservative and worst-case point estimates in their calculations and, as a result, have uncontrolled levels of conservativism for final risk estimates that are difficult to compare. With a Monte Carlo approach, analysts combine entire distributions for each parameter and create a much more comprehensive evaluation of the uncertainty of the risk estimate. But, just like the traditional point estimate, Monte Carlo analysis depends on the accuracy of the inputs and the model specification. Relying on incomplete information or unjustified conventions can lead to over- or underestimation of risks. We discuss a case study in which the Monte Carlo result based on best available input distributions and default assumptions grossly underestimated the true potential risks. In this case, the traditional point estimate actually may be a better characterization of the overall situation. Although this case may be unusual, it suggests that a Monte Carlo analysis by itself cannot deliver on its promise of a fully comprehensive characterization of risks.
Why humans are so bad at interpreting probabilities
Scott Ferson and Lev R. Ginzburg
7 December 1998, Phoenix, Arizona
Society for Risk Analysis annual meeting
It is widely recognized that the lay public has difficulty in grasping the meaning of risk analyses. The sometimes spectacular failures of risk communication strategies are often blamed on the public’s ignorance of technical issues or its mistrust of industry or government. We suggest, however, that it is neither ignorance nor mistrust that is fundamentally responsible for the difficulty. Instead, humans seem to have been wired by natural evolution to use a mental calculus for reckoning uncertainty and making decisions that is substantially different from probability theory. We suggest that several of the most important biases of probability perception that have been recognized by psychometricians can be interpreted as highly adaptive strategies for responding to variation and risk. Given that humans evolved in a strongly autocorrelated natural environment that heavily rewarded pattern recognition skills and often punished indecision more sternly than it did a suboptimal decision, it is easy to deduce evolutionary reasons why it is that (i) people routinely underestimate risks, (ii) people are insensitive to prior probabilities, (iii) the notion of independence is so difficult to correctly interpret, and (iv) people always ask how bad it could be (and ignore how unlikely that outcome is). If these biases and misconceptions are fundamental in human perception and not removable by general education or even specific training, perhaps it is incumbent on risk analysts to make calculations and arguments in a way that is more natural for humans, and that yields results that are interesting and compelling to them. We describe some of the properties and features of such an approach to risk assessment and communication.
1998
Department of Ecology and Evolution, State University of New York at Stony Brook, New York
The role of population- and ecosystem-level risk analysis in addressing impacts under §316(b) of the Clean Water Act
Lev Ginzburg and Scott Ferson
23 September 1998, Berkeley Springs, West Virginia
We address three issues germane to a practical definition of “adverse impact” and the choice of technical tools needed to address it.
1 Variability and uncertainty. Natural populations and ecosystems fluctuate in time and space, partially due to interactions we understand, but substantially due to various factors which we cannot foresee. The variability of ecological patterns and processes, as well as our uncertainty about them, prevent us from making precise, deterministic estimates of the effects of environmental impacts. Because of this, comprehensive impact assessment requires a language of risk which recognizes variability and uncertainty, yet permits quantitative statements of what can be predicted. The emergence of this risk language has been an important development in applied ecology over the last two decades.
2 Cumulative attributable risk. For regulation to be fair, it should focus on the change in risk due to a particular impact. The risk that a population declines to, say, 50% of its current abundance in the next 50 years is sometimes substantial whether it is impacted by anthropogenic activity or not. Only the potential change in risk, not the risk itself, should be attributed to impact. On the other hand, for environmental protection to be effective, regulation must be expressed in terms of cumulative risks suffered by a population or ecosystem from impacts and from all the various agents present cumulated through time.
3 Food chains. Although assessments under §316 (b) have generally considered only effects on single populations, public and regulatory concern with ecosystem-level responses has increased substantially. However, attitudes vary about precisely what ecosystem-level response is and how it should be characterized. Currently, the “watershed” concept and “watershed-compatible” techniques organize much of the effort at EPA, although the ideas are perhaps more relevant for water chemistry than for ecology. Likewise, there are many competing ideas about “emergent properties” of ecosystems which are not reducible to properties of underlying populations. However, our scientific understanding of community and ecosystem ecology is quite limited, and, quantitative predictions, even in terms of risks, for complex systems would require vastly more data than are usually available. How can we satisfy the desire for a holistic ecosystem approach when we are limited by our understanding of how whole systems actually work? As a practical matter, focusing on food chains may be a workable compromise between the population and ecosystem levels. Food chain models are general enough to permit assessment of indirect trophic effects and biomagnification yet have data requirements that are not so much larger than for single population assessments as to be unmanageable. In any case, food chains are likely to be at the frontier of what we can address by scientifically credible models for the next decade or two.
Risk assessment at the population or higher level requires the combination of several technical tools including demographic and trophic models, potentially with explicit age, stage or geographic structure, and methods for probabilistic uncertainty propagation, which are usually implemented with Monte Carlo simulation. Over the last fifteen years, EPRI has facilitated the use of these tools by sponsoring the development of the RAMAS® library.
1998
Waterways Experiment Station Risk Assessment Modeling Workshop, New Orleans, Louisiana
1998
U.S. Army Corps of Engineers conference on ecological risk, San Diego, California
Probability bounds analysis (why Laplace was wrong)
6 May 1998, Durham, North Carolina
Decision Analysis Seminar, Fuqua School of Business, Duke University
Whenever probability theory has been used to make calculations, analysts have routinely assumed (i) probabilities and probability distributions can be precisely specified, (ii) variables are all independent of one another, and (iii) model structure is known without error. For the most part, these assumptions have been made for the sake of mathematical convenience, rather than with any empirical justification. And, until now, these assumptions were pretty much necessary in order to get any answer at all. New methods now allow us to compute bounds on estimates of probabilities and probability distributions that are guaranteed to be correct even when one or more of the assumptions is relaxed or removed. In many cases, the results obtained are the best possible bounds, which means that tightening them would require additional empirical information.
Why Laplace was wrong: using copulas to bound convolutions when marginals are known
26 October 1997, Dallas, Texas
INFORMS, Decision Analysis Society session on copula applications
Session SD5, Reunion G
Williamson's algorithms that compute bounds on the distributions of arithmetic operations on random numbers when only marginal distributions are known also work when the marginals can only be bounded. They therefore constitute a probability bounds analysis that renders obsolete the maximum entropy criterion for selection of input distributions.
Reliable calculation of probabilities
29 July 1997, Albuquerque, New Mexico
High Consequence Operations Safety Symposium, Sandia National Laboratories
<<abstract available>>
Detecting rare event clusters
Scott Ferson and Kwisung Hwang
30 July 1997, Albuquerque, New Mexico
High Consequence Operations Safety Symposium, Sandia National Laboratories
Cluster detection is considered to be essential in many environmental and epidemiological studies. Likewise, it can be very important in engineering studies for recognizing design flaws and cryptic common-mode or common-cause dependencies among rare events such as component failures. How can we tell whether a town's incidence of childhood cancers is significantly greater than the national average? How can we tell whether an airline's crash history is significantly worse than what one would expect given the industry's performance? The answers to these questions require some statistical method to detect clustering among rare events. However, traditional statistical tests for detecting clustering assume asymptotically large sample sizes and are therefore not applicable when data are sparse, as they generally are for rare events. In fact, simulation studies show that the Type I error rates for traditional tests such as chi-square or the log-likelihood ratio (G-test) are routinely much larger than their nominal levels when applied to small data sets. As a result, these tests can seriously overestimate the evidence for clustering and thus cause more alarm than is warranted. In other cases, traditional cluster tests can fail to detect clusters that can be shown by other methods to be statistically significant. Thus, the traditional approaches will provide an inefficient review of the available data. It is difficult to anticipate whether the traditional test will overestimate or underestimate the probability of clustering for a particular data set. Moreover, these tests are sensitive to a specific kind of deviation from randomness and may not provide the most appropriate measure of clustering from a specific mechanism. We describe eight new statistical methods, implemented in a convenient software package, that can be used to detect clustering of rare events in structured environments. Because the new tests employ exact methods based on combinatorial formulations, they yield exact P-values and cannot violate their nominal Type I error rates like the traditional tests do. As a result, the new tests are reliable whatever the size of the data set, and are especially useful when data sets are extremely small. By design, these tests are sensitive to different aspects of clusters and should be useful in discerning not only the fact of clustering but also something about the processes that generated the clustering. We characterize the relative statistical power of the new tests under different kinds of clustering mechanisms and data set configurations. The new statistical tests, along with several traditional and Monte Carlo tests for clustering, have been implemented in convenient graphical software for Windows operating systems which should be useful to risk and safety analysts for detecting clustering of rare events in data sets even when the available data are sparse. This work was supported by SBIR grant R44GM49521 from the National Institute of General Medical Sciences of the National Institutes of Health.
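For flavor, here is a minimal sketch of the kind of exact small-sample calculation involved; it is not one of the eight tests described in the talk, and the counts and rates are hypothetical. It uses exact Poisson and binomial tail probabilities in place of asymptotic chi-square approximations.
    from scipy.stats import poisson, binom

    # Exact upper-tail P-value for 4 observed childhood cancer cases in a town
    # whose population implies an expected count of 1.2 under the national rate.
    observed, expected = 4, 1.2
    print(poisson.sf(observed - 1, expected))   # P(X >= 4), about 0.034, no asymptotics

    # Exact binomial tail for a hypothetical airline with 3 crashes in 20,000
    # flights when the industry-wide rate is 1 crash per 20,000 flights.
    print(binom.sf(2, 20000, 1 / 20000))        # P(X >= 3), about 0.08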
Do you always need a full blown Monte Carlo analysis?
16 July 1997, Boston, Massachusetts
IBC conference Ecological Risk Assessment
After much controversy, the U.S. EPA is now poised to accept, and may soon require, probabilistic analyses as part of all ecological risk assessments. While Monte Carlo is often touted as the only practical approach to use, there are other methods that may sometimes be much better. Although often simpler and easier to parameterize than Monte Carlo methods, they permit fully comprehensive probabilistic conclusions.
Determining population viability is a time-dependent risk analysis problem
Scott Ferson, Lev R. Ginzburg and Paolo Inchausti
10 December 1996, New Orleans, Louisiana
Society for Risk Analysis annual meeting
We performed an ecological <<abstract available>>
Computational alternative to second-order Monte Carlo simulation for propagation of variability and uncertainty in exposure assessments
Scott Ferson and Robert C. Lee
9 December 1996, New Orleans, Louisiana
Society for Risk Analysis annual meeting
Much recent interest has been focused on the use of second-order simulation methods (i.e., "two-dimensional Monte Carlo" or "nested loop Monte Carlo") for propagation of parameter uncertainties in risk assessment models. This method is computationally intensive and is only suitable for parametric uncertainty about an input's distribution. When the distribution family itself is unknown, this method can become very cumbersome. Although second-order methods are intended to distinguish stochastic variability from uncertainty associated with the state of knowledge, they treat both with the same methods and so often yield results that cannot be justified by appeal to empirical information. Additionally, it is difficult to assess model uncertainty, to perform backcalculations, and to incorporate temporal changes in simulations using this method. Probability bounds arithmetic is an alternative method that addresses many of these shortcomings. An example is presented based on modifications to an EPA model used to assess toxicant exposure from waste incinerator emissions. The specific exposure pathway is ingestion of 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) through home-grown beef. Results are compared to a previously published second-order Monte Carlo analysis of this exposure scenario. The probability bounds method results in a wider range of uncertainty associated with exposure than the Monte Carlo method. Limitations of the probability bounds method are discussed. Recommendations are made regarding stochastic modeling approaches to complex environmental problems based on this comparison.
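The following minimal sketch illustrates the nested structure of a second-order (two-dimensional) Monte Carlo simulation; the exposure model, distributions, and parameter values are hypothetical stand-ins, not the EPA TCDD beef-ingestion model analyzed in the talk.
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical exposure model: dose = concentration * intake / bodyweight.
    # Outer loop: epistemic uncertainty about distribution parameters.
    # Inner loop: aleatory variability among exposed individuals.
    n_outer, n_inner = 200, 2000
    p95 = []
    for _ in range(n_outer):
        mu_conc = rng.normal(0.0, 0.3)        # uncertain mean of log-concentration
        sd_intake = rng.uniform(0.1, 0.4)     # uncertain spread of intake
        conc = rng.lognormal(mu_conc, 0.5, n_inner)
        intake = rng.lognormal(0.0, sd_intake, n_inner)
        bodyweight = rng.normal(70.0, 10.0, n_inner)
        dose = conc * intake / bodyweight
        p95.append(np.percentile(dose, 95))   # variability summary for this outer draw
    # The spread of the 95th percentile across outer draws expresses parameter
    # uncertainty, but it depends entirely on the assumed parameter distributions.
    print(np.percentile(p95, [5, 50, 95]))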
1996
Population viability analysis workshop on the Sonoran pronghorn antelope, Phoenix, Arizona
1996
Workshop on information representation in decision systems, NIST, Gaithersburg, Maryland
Reliable calculation of probabilities: accounting for small sample size and model uncertainty
20-23 October 1996, Gaithersburg, Maryland
Intelligent Systems: A Semiotic Perspective, NIST, October 1996
A variety of practical computational problems arise in risk and safety assessments, forensic statistics and decision analyses in which the probability of some event or proposition E is to be estimated from the probabilities of a finite list of related subevents or propositions F,G,H,.... In practice, the analyst’s knowledge may be incomplete in two ways. First, the probabilities of the subevents may be imprecisely known from statistical estimations, perhaps based on very small sample sizes. Second, relationships among the subevents may be known imprecisely. For instance, there may be only limited information about their stochastic dependencies. Representing probability estimates as interval ranges on [0,1] has been suggested as a way to address the first source of imprecision. A suite of AND, OR and NOT operators defined with reference to the classical Fréchet inequalities permit these probability intervals to be used in calculations that address the second source of imprecision, in many cases, in a best possible way. Using statistical confidence intervals as inputs unravels the closure properties of this approach however. One solution is to characterize each probability as a nested stack of intervals for all possible levels of statistical confidence, from a point estimate (0% confidence) to the entire unit interval (100% confidence). The corresponding logical operations implied by convolutive application of the logical operators for every possible pair of confidence intervals reduces by symmetry to a manageably simple level-wise iteration. The resulting logical calculus can be implemented in software that allows users to compute comprehensive and often level-wise best possible bounds on probabilities for logical functions of events.
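As a minimal sketch of the interval logic described above (the failure probabilities here are hypothetical), conjunction and disjunction under the Fréchet inequalities, which assume nothing about the dependence between the events, can be computed directly from the interval endpoints:
    def and_frechet(a, b):
        """Conjunction of interval probabilities under no dependence assumption."""
        return (max(0.0, a[0] + b[0] - 1.0), min(a[1], b[1]))

    def or_frechet(a, b):
        """Disjunction of interval probabilities under no dependence assumption."""
        return (max(a[0], b[0]), min(1.0, a[1] + b[1]))

    def not_event(a):
        """Negation of an interval probability."""
        return (1.0 - a[1], 1.0 - a[0])

    # Hypothetical subevent probabilities known only as intervals:
    pump_fails = (0.01, 0.03)
    valve_fails = (0.02, 0.05)
    print(and_frechet(pump_fails, valve_fails))   # (0.0, 0.03): both fail
    print(or_frechet(pump_fails, valve_fails))    # (0.02, 0.08): at least one fails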
1996
Workshop on risk assessment, Savannah River Ecological Laboratory, Aiken, South Carolina
<<>>
1996, Kingston, Rhode Island
Department of Biological Sciences, University of Rhode Island
<<abstract available>>
Ecology is required to answer the so what question of ecotoxicology
16 February 1996, Lake Buena Vista, Florida
New Techniques in Risk Assessment: State-of-the-Art Methods and Future Directions
Even though justification of regulatory decisions and management plans requires predicting effects of contaminants on populations and ecological communities, the available bioassay techniques measure the impact of environmental contamination on biochemical function, histopathology, development, reproduction or mortality. New approaches are described to extrapolate results on organismal effects to population-level consequences.
Fuzzy arithmetic in screening-level and full probabilistic assessments
14 February 1996, Lake Buena Vista, Florida
Simulation Modeling and Computer-aided Risk Assessment: Assays and Expert Systems
Fuzzy arithmetic is a refinement of interval analysis based on the theory of fuzzy sets. It permits comprehensive uncertainty propagation in situations for which data are very limited yet potential consequences are grave. Risk Calc™ is Windows™-based software that provides state-of-the-art computational facilities for using fuzzy arithmetic in risk analysis.
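A minimal sketch of how such fuzzy arithmetic works (generic alpha-cut arithmetic with assumed triangular inputs, not the Risk Calc implementation): a fuzzy number is a nested stack of intervals indexed by membership level, and fuzzy arithmetic is ordinary interval arithmetic applied level by level.
    import numpy as np

    def tri_cuts(lo, mode, hi, levels):
        """Alpha-cut intervals of a triangular fuzzy number at given membership levels."""
        return [(lo + a * (mode - lo), hi - a * (hi - mode)) for a in levels]

    levels = np.linspace(0.0, 1.0, 11)
    A = tri_cuts(1.0, 2.0, 3.0, levels)    # "about 2"
    B = tri_cuts(4.0, 5.0, 7.0, levels)    # "about 5"

    # Level-by-level interval arithmetic:
    A_plus_B = [(a[0] + b[0], a[1] + b[1]) for a, b in zip(A, B)]
    A_times_B = [(min(a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]),
                  max(a[0]*b[0], a[0]*b[1], a[1]*b[0], a[1]*b[1]))
                 for a, b in zip(A, B)]

    print(A_plus_B[0], A_plus_B[-1])    # support [5, 10] narrows to core [7, 7]
    print(A_times_B[0], A_times_B[-1])  # support [4, 21] narrows to core [10, 10]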
Hybrid processing of stochastic and subjective uncertainty data
6 December 1995, Honolulu, Hawaii
Society for Risk Analysis annual meeting
Uncertainty analyses typically recognize separate stochastic and subjective sources of uncertainty, but do not systematically combine the two, although a large amount of data used in analyses is partly stochastic and partly subjective. We have developed methodology for mathematically combining stochastic and subjective data uncertainty, based on new "hybrid number" approaches. The methodology can be utilized in conjunction with various traditional techniques, such as PRA (probabilistic risk assessment) and risk analysis decision support. Hybrid numbers have been previously examined as a potential method to represent combinations of stochastic and subjective information, but mathematical processing has been impeded by the requirements inherent in the structure of the numbers, e.g., there was no known way to multiply hybrids. In this paper, we will demonstrate methods for calculating with hybrid numbers that avoid the difficulties. By formulating a hybrid number as a probability distribution that is only fuzzily known, or alternatively as a random distribution of fuzzy numbers, methods are demonstrated for the full suite of arithmetic operations, permitting complex mathematical calculations. It will be shown how information about relative subjectivity (the ratio of subjective to stochastic knowledge about a particular datum) can be incorporated. Techniques are also developed for conveying uncertainty information visually, so that the stochastic and subjective constituents of the uncertainty, as well as the ratio of knowledge about the two, are readily apparent. The techniques demonstrated have the capability to process uncertainty information for independent, uncorrelated data, and for some types of dependent and correlated data. Example applications are suggested, illustrative problems are worked, and graphical results are given. [See Cooper et al. 1996. Risk Analysis 16: 785-791]
Hybrid arithmetic
Scott Ferson and Lev Ginzburg
20 September 1995, College Park, Maryland
Symposium on Uncertainty Modeling and Analysis, North American Fuzzy Information Processing Society
Kaufmann’s formulation of hybrid numbers, which simultaneously express fuzzy and probabilistic uncertainty, allows addition and subtraction, but offers no obvious way to do multiplication, division or other operations. We describe another, more comprehensive formulation for hybrid numbers that allows the full suite of arithmetic operations, permitting them to be incorporated into complex mathematical calculations. There are two complementary approaches to computing with these hybrid numbers. The first is extremely efficient and yields theoretically optimal results in many circumstances. The second more general approach is based on Monte Carlo simulation using intervals or fuzzy numbers rather than scalar numbers.
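A rough sketch of the second, more general approach (the inputs here are hypothetical): each Monte Carlo replicate carries an interval rather than a scalar through the calculation, so randomness and imprecision remain distinguishable in the result.
    import numpy as np

    rng = np.random.default_rng(42)

    n = 10000
    flow = rng.lognormal(mean=1.0, sigma=0.4, size=n)   # stochastic input
    conc = (0.2, 0.5)                                   # input known only to an interval

    # Interval arithmetic inside each replicate (both factors are positive here,
    # so the product's endpoints are just the products of the endpoints):
    load_lo = flow * conc[0]
    load_hi = flow * conc[1]

    # The result is a random sample of intervals; summaries come out as intervals too.
    print(np.percentile(load_lo, 95), np.percentile(load_hi, 95))   # bounds on the 95th percentile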
Quality assurance for Monte Carlo risk assessment
18 September 1995, College Park, Maryland
Symposium on Uncertainty Modeling and Analysis, North American Fuzzy Information Processing Society
Three major problems inhibit the routine use of Monte Carlo methods in risk and uncertainty analyses:
(1) correlations and dependencies are often ignored,
(2) input distributions are usually not available, and
(3) mathematical structure of the model is questionable.
Most practitioners acknowledge the limitations induced by these problems, yet rarely employ sensitivity studies or other methods to assess their consequences. This paper reviews several computational methods that can be used to check a risk assessment for the presence of certain kinds of fundamental modeling mistakes, and to assess the possible error that could arise when variables are incorrectly assumed to be independent or when input distributions are incompletely specified.
1995
Expert working group meeting on the status of marine turtles, Honolulu, Hawaii
1995
Uncertainty analysis for ecological risk assessment, SETAC, Vancouver Island, British Columbia
1995
SETAC workshop on uncertainty analysis for ecological risk, Pellston, Michigan
1995
Canadian Department of Fisheries and Oceans, Winnipeg, Manitoba
1995
U.S. Fish and Wildlife Service, Amherst, Massachusetts
Independence assumptions in probabilistic risk assessments
Scott Ferson and Lev Ginzburg
6 December 1994, Baltimore, Maryland
Society for Risk Analysis annual meeting
<<abstract available>>
Integrating objective and subjective uncertainties in ecological risk: an example of the spotted owl
Lev R. Ginzburg, Scott Ferson and Lloyd R. Goldwasser
6 December 1994, Baltimore, Maryland
Society for Risk Analysis Symposium on Exposures to Wildlife
We distinguish two sources of uncertainty in analyzing risks of extinction. The first is stochastic variability caused by fluctuations in the environment and variation among individuals. The second is measurement error induced by incomplete sampling and empirical imprecision. We argue that the two kinds of uncertainty require different treatments when uncertainty is propagated through models of population dynamics. We use published data on the Olympic Peninsula population of the northern spotted owl (Strix occidentalis caurina) to illustrate the importance of this difference. Using recently proposed quantitative criteria for extinction threat, this owl population appears either to be threatened if the current number of owl breeding territories is 200, or to be non-threatened if this number is closer to 300. Taking measurement error into account strongly increases the uncertainty in this determination. Although we can be reasonably confident that this population will persist at least for the next 30−40 years, reliable risk evaluations over a 100-year time scale are probably impossible.
Naive Monte Carlo methods yield dangerous underestimates of tail probabilities
12 July 1994, Albuquerque, New Mexico
High Consequence Operations Safety Symposium, Sandia National Laboratories
Extreme-event probabilities (i.e., the tails) of a statistical distribution resulting from probabilistic risk analysis can depend strongly on dependencies among the variables involved in the calculation. Although well known techniques exist for incorporating correlation into analyses, in practice they are often neglected on account of a paucity of information about joint distributions. Furthermore, certain forms of dependency that are not adequately measured by simple correlation must perforce be omitted from such assessments. Two general techniques may be used to compute conservative estimates of the tails in the context of ignorance about correlation and dependency among the variables. The first is based on a variance maximization/minimization trick. It is compatible with existing Monte Carlo methods, but it accounts only for linear dependencies. The second is based on Fréchet inequalities, and, although incompatible with Monte Carlo methods, it guarantees conservative estimates of tail probabilities no matter what dependence structure exists among the variables in the analysis.
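A small worked illustration of the first technique, with hypothetical bivariate-normal inputs so that only linear dependence is in play: since Var(X+Y) = σx² + σy² + 2ρσxσy, letting ρ range over [−1, 1] brackets the tail probability of the sum.
    from math import sqrt
    from scipy.stats import norm

    mx, sx = 10.0, 2.0        # hypothetical inputs
    my, sy = 5.0, 3.0
    threshold = 22.0          # extreme event: X + Y exceeds 22

    sd = {"rho = -1": abs(sx - sy),
          "independent": sqrt(sx**2 + sy**2),
          "rho = +1": sx + sy}

    for label, s in sd.items():
        # Tail probability of the (normal) sum for each assumed correlation
        print(label, norm.sf(threshold, loc=mx + my, scale=s))
    # The independence assumption gives a tail of about 0.026, but the tail could
    # be as large as about 0.08 under perfect positive dependence.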
Interactive microcomputer software for fuzzy arithmetic
Scott Ferson and Rüdiger Kuhn
12 July 1994, Albuquerque, New Mexico
High Consequence Operations Safety Symposium, Sandia National Laboratories
We describe a new microcomputer implementation of fuzzy arithmetic for the Microsoft Windows operating system based on the strict definition of fuzzy numbers (i.e., fuzzy sets that are both convex and normal). Fuzzy numbers can be specified by a user as intervals in the form [a,b] or [a±b], as triangular numbers in the form [a,b,c], as trapezoidal numbers in the form [a,b,c,d], or as an arbitrary list of points describing the fuzzy membership function. A fuzzy number can also be synthesized as the consensus of a list of intervals. The software supports the full complement of standard operators and functions, including +, −, ×, ÷, >, ≥, <, ≤, equality comparison, variable assignment, additive and multiplicative deconvolution, maximum, minimum, power, exponential, natural and common logarithm, square root, integer part, sine, cosine, and arc tangent. The basic logical operations (and, or, not) are interpreted in both traditional boolean and fuzzy-logical contexts. Also several new functions for fuzzy operands are introduced, including least and greatest possible value, breadth of uncertainty, most possible interval, interval of specified possibility, and possibility level of a scalar. All operations and functions are transparently supported for pure or mixed expressions involving scalars, intervals, and fuzzy numbers. Expressions are evaluated as they are entered and the resulting values automatically displayed graphically. The software also accepts and propagates (arbitrarily named) units embedded with the numbers in expressions and checks that their dimensions conform in additive and comparison operations. The software features fundamental programming constructs for conditional execution and looping, strings and basic string operators, hypertext help, script execution, user-defined display windows, and an extensive library of examples that illustrate the properties of fuzzy arithmetic.
1994
U.S. Environmental Protection Agency, Office of Toxic Substances, Washington, DC
Dependency bounds on probabilistic risk assessments
Scott Ferson, Neal Oden and Thomas F. Long
8 December 1993, Savannah, Georgia
Society for Risk Analysis annual meeting
Proponents of Monte Carlo analysis argue that the central tendencies of distributions estimated in probabilistic risk assessments are insensitive to the presence of correlations among the variables involved in the estimation. The distribution tails, however, can be quite sensitive to such correlations and other dependencies among the input variables. As a consequence, the estimated probabilities of extreme events, which may have special significance in risk analyses, can be in error by several orders of magnitude if these correlations are ignored. We review a new method which estimates bounds on distributions expressed as cumulative distribution functions when correlations among the input variables are unknown. The approach redresses one of the weaknesses of Monte Carlo methods, which require considerable empirical information to estimate correlations among input variables. The approach allows estimation of upper and lower bounds on the distributions of sums, differences, products, quotients, minima, maxima, logarithms, and powers of specified random distributions. Remarkably, its algorithm is computationally cheaper than comparable Monte Carlo methods. As an example of its use, we recompute a recent assessment of lifetime cancer risks from benzene contamination in soil that was originally based on Monte Carlo analysis.
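The flavor of such dependency bounds can be sketched numerically; this grid search is a crude stand-in for the method reviewed in the talk, and the lognormal marginals are hypothetical. For any dependence whatsoever between X and Y, the CDF of X + Y is bracketed by the classical Fréchet-style bounds, and restricting the search to a finite grid only loosens the bounds without invalidating them.
    import numpy as np
    from scipy.stats import lognorm

    FX = lognorm(s=0.5, scale=1.0).cdf     # hypothetical marginal for X
    FY = lognorm(s=0.8, scale=2.0).cdf     # hypothetical marginal for Y
    grid = np.linspace(0.0, 60.0, 3001)

    def cdf_bounds(z):
        """Bounds on P(X + Y <= z) with no assumption about dependence:
        lower(z) = sup_x max(FX(x) + FY(z - x) - 1, 0)
        upper(z) = inf_x min(FX(x) + FY(z - x), 1)"""
        s = FX(grid) + FY(z - grid)
        return max(float(s.max()) - 1.0, 0.0), min(float(s.min()), 1.0)

    for z in (2.0, 5.0, 10.0):
        print(z, cdf_bounds(z))   # an envelope that no dependence structure can escape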
1993
ASTM Symposium on Ecotoxicology and Risk Assessment, Atlanta, Georgia
1993
Southern California Edison, Environmental Research Issue Seminar, Los Angeles, California
1992
Northeast Pacific Salmon and Coho Workshop, Boise, Idaho
Uncertainty propagation: temporal variability vs. measurement error
Scott Ferson and Lev Ginzburg
10 December 1991, Baltimore, Maryland
Society for Risk Analysis annual meeting
Ecological risk analysis has been strongly criticized for failing to express the uncertainty of its estimates and for ignoring uncertainty when making assessments. We have previously described and implemented a general Monte Carlo approach for estimating species extinction risks that treats stochastic temporal fluctuation of process parameters (RAMAS). While the approach handles an important source of uncertainty, another significant source, measurement error, cannot be addressed with the same methods. Since measurement error in ecology can often be 50% or more of the measured magnitude, the standard practice of treating values as if they were known with certainty can result in grossly misleading conclusions. We apply possibility theory (which is a generalization of interval analysis) to the existing methodology for estimating quasi-extinction risks of biological populations. The resulting extinction risks, which reflect both measurement error and intrinsic temporal variability in the vital rates and abundance measures, are therefore considerably more useful in planning conservation strategies. The methodology will be widely applicable in ecological risk analysis in general, including environmental impact assessments and human health risk analyses.
1991
Center for Remote Sensing, University of Delaware, Newark, Delaware
1991
ESA Symposium on Ecological Modeling and Natural Areas, San Antonio, Texas
1991
IBM Academic Computing Conference, Dallas, Texas
1990
Microcomputers in Ecological Modeling, Intecol, Yokohama, Japan
1990
C-CUE Conference at Carnegie Mellon University, Pittsburgh, Pennsylvania
1988
NSF Workshop on Modeling Methods in Biological Resource Management
1988
Los Angeles County Natural History Museum, Los Angeles, California
1987
Biology Department, San Diego State University, San Diego, California
Ohio Chapter of The Wildlife Society, Ohio State University, Columbus, OH (October 2000)
Annual meeting of the Society for Conservation Biology, Missoula, Montana (June 2000)
"Uncertainty: Its Nature, Analytical Treatment and Interpretation", Society for Risk Analysis forum, Arlington, Washington (2000)
"Beyond Point Estimates: Risk Assessment Using Interval, Fuzzy and Probabilistic Arithmetic", Society for Risk Analysis workshop, Arlington, Virginia (2000)
Annual meeting of the Society for Conservation Biology, College Park, Maryland (June 1999)
Tools for population viability analysis. Society for Conservation Biology, College Park, MD (June 1999)
Biology Department, Middle East Technical University, Ankara, Turkey (May 1999)
"Beyond Point Estimates: Risk Assessment Using Interval, Fuzzy and Probabilistic Arithmetic", Society for Risk Analysis workshop, Atlanta, Georgia (1999)
Massachusetts Department of Environmental Protection, Boston, Massachusetts (1999)
U.S. Environmental Protection Agency's National Center for Statistics and the Environment, Seattle, Washington (1999)
Harvard Center for Risk Analysis, Harvard School of Public Health, Cambridge, Massachusetts (1999)
Institute of Environmental Science and Technology, University of Yokohama, Japan (1999)
International JST Workshop of Risk Evaluation and Management of Chemicals, Yokohama, Japan (1999)
Annual meeting of the Society for Conservation Biology, Sydney, Australia (July 1998)
"Wrangling Variability and Uncertainty", Society of Risk Analysis forum, Arlington, Virginia (1998)
"Beyond Point Estimates: Risk Assessment Using Interval, Fuzzy and Probabilistic Arithmetic", Society for Risk Analysis workshop, Phoenix, Arizona (1998)
Waterways Experiment Station Risk Assessment Modeling Workshop, New Orleans, Louisiana (1998)
Decision Analysis Seminar, Fuqua School of Business, Duke University, Durham, North Carolina (1998)
U.S. Army Corps of Engineers conference on ecological risk, San Diego, California (1998)
Beyond point estimates: risk assessment using interval, fuzzy and probabilistic arithmetic. Washington, DC (December 1997)
Tools for ecological and human health risk analyses. Setauket, NY (April 1997)
Decision Analysis Society session on copula applications, Dallas, Texas (1997)
High Consequence Operations Safety Symposium, Albuquerque, New Mexico (1997)
IBC conference on ecological risk assessment, Boston, Massachusetts (1997)
Interval analysis and fuzzy arithmetic for environmental risk assessment. New Orleans, LA (December 1996)
Ecological risk assessment for red cockaded woodpecker. Fort Polk, LA (February 1996)
Interval analysis and fuzzy arithmetic for environmental risk assessment. Society for Risk Analysis, Honolulu, HI (December 1995)
Ecological risk analysis: methods and software. Winnipeg, Canada (March 1995)
Tools for ecological risk assessment: ecotoxicology and endangered species. Irving, TX (March 1995)
Ecological risk assessment. Setauket, NY (January 1995)
Beyond point estimates: risks using interval analysis and fuzzy arithmetic. Society for Risk Analysis, Baltimore, MD (December 1994)
Ecological risk analysis. Environment Canada, Ottawa, Canada (November 1994)
Ecological risk analysis: methods and software. Dupont Corporate Remediation, Wilmington, DE (September 1994)
Software tools for population viability analysis. AIBS Annual Meeting, Knoxville, TN (August 1994)
Annual meeting of the Ecological Society of America, Knoxville, Tennessee (August 1994)
Tools for risk assessment: RAMAS software library. Raleigh, NC (February 1994)
Ecological risk analysis: meeting present and future environmental regulation. Sponsored by EPRI and Army Corps of Engineers, Setauket, NY (August 1993)
Workshop on software tools for population viability analysis. Ecological Society of America Annual Meeting, Madison, WI (July 1993)
Annual meeting of the Ecological Society of America, Madison, Wisconsin (July 1993)
Ecological risk assessment: methodology and software. Utility Water Act Group, EPRI, Palo Alto, CA (February 1993)
The use of exemplar species in ecological studies. Setauket, NY (October 1992)
Ecological risk analysis at the population level. International Society for Ecological Modelling, Honolulu, HI (August 1992)
Workshop on software tools for population viability analysis. AIBS Annual Meeting, Honolulu, HI (August 1992)
43rd annual AIBS meeting, University of Hawaii, Honolulu, Hawaii (August 1992)
Ecological risk analysis at the population level. Setauket, NY (October 1991)
Modeling ecotones, boundaries and transition zones. ISEM Annual Meeting, Toronto, Ontario (August 1989)
Additional Abstracts
Regan, H.M., B.E. Sample and S. Ferson. Comparison of deterministic and probabilistic calculation of ecological soil screening levels. Environmental Toxicology and Chemistry <<>>.
The U.S. EPA is sponsoring development of Ecological Soil Screening Levels (Eco-SSLs) for terrestrial wildlife. These are intended to be used to identify chemicals of potential ecological concern at Superfund sites. Eco-SSLs represent concentrations of contaminants in soils that are believed to be protective of ecological receptors. An exposure model, based on soil and food ingestion rates, and the relationship between the concentrations of contaminants in soil and food, has been developed for estimation of wildlife Eco-SSLs. It is important to understand how conservative and protective these values are, how parameterization of the model influences the resulting Eco-SSL, and how the treatment of uncertainty impacts results. Eco-SSLs were calculated for meadow voles (Microtus pennsylvanicus) and northern short-tailed shrews (Blarina brevicauda) for lead and DDT using deterministic and probabilistic methods. Conclusions obtained include: use of central-tendency point estimates may result in hazard quotients much larger than one; a Monte Carlo approach also leads to hazard quotients that can be substantially larger than one; if no hazard quotients larger than one are allowed, any probabilistic approach is identical to a worst-case approach; the larger the uncertainty about inputs, the smaller the Eco-SSL must be. This is the inherent cost of uncertainty.
Ferson, S., L.R. Ginzburg and H.R. Akçakaya. Whereof one cannot speak: when input distributions are unknown. [accepted for publication by Risk Analysis, but never published; available on line].
One of the major criticisms of probabilistic risk assessment is that the requisite input distributions are often not available. Several approaches to this problem have been suggested, including creating a library of standard empirically fitted distributions, employing maximum entropy criteria to synthesize distributions from a priori constraints, and even using 'default' inputs such as the triangular distribution. Since empirical information is often sparse, analysts commonly must make assumptions to select the input distributions without empirical justification. This practice diminishes the credibility of the assessment and any decisions based on it. There is no absolute necessity, however, of assuming particular shapes for input distributions in probabilistic risk assessments. It is possible to make the needed calculations using inputs specified only as bounds on probability distributions. We describe such bounds for a variety of circumstances where empirical information is extremely limited, and illustrate how these bounds can be used in computations to represent uncertainty about input distributions far more comprehensively than is possible with current approaches.
Ferson, S. Checking for errors in calculations and software: dimensional balance and conformance of units. Accountability in Research: Policies and Quality Assurance 8: 261-279.
Although there has always been a general awareness that mathematical expressions must make dimensional sense in terms of the units involved, it is very easy to make simple mistakes in quantitative work that result in profound and potentially dangerous errors. Such errors are ubiquitous in modern research, as can be seen by reviewing government publications where dimensional errors persist despite peer and public review. Software methods have recently become available for checking calculations, equations, algorithms and programs for dimensional soundness. Correctness depends on conformance at two levels: balance of dimensions and agreement among units. Error at either level can now be detected automatically by software. Disagreement among units can even be automatically corrected by software procedures. These software tools can be used to check for errors in calculations and software source code in a way that is similar to using a spelling or grammar checker for text.
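A minimal sketch of the idea (a toy checker, not the software described in the article): represent each quantity's dimensions as a vector of exponents, combine exponents under multiplication, and refuse addition or comparison unless the exponents agree.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Quantity:
        value: float
        dims: tuple   # exponents of (length, mass, time), e.g. (1, 0, -1) for speed

        def __add__(self, other):
            if self.dims != other.dims:
                raise ValueError(f"dimension mismatch: {self.dims} vs {other.dims}")
            return Quantity(self.value + other.value, self.dims)

        def __mul__(self, other):
            return Quantity(self.value * other.value,
                            tuple(a + b for a, b in zip(self.dims, other.dims)))

    metre = Quantity(1.0, (1, 0, 0))
    ten_seconds = Quantity(10.0, (0, 0, 1))
    speed = Quantity(3.0, (1, 0, -1))            # 3 m/s
    distance = speed * ten_seconds               # 30 m
    print((distance + metre).value)              # 31.0: both are lengths, so dimensions balance
    # distance + ten_seconds                     # would raise ValueError: metres vs seconds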
Spencer, M., N.S. Fisher, W.-X. Wang, S. Ferson. 2001. Temporal variability and ignorance in Monte Carlo contaminant bioaccumulation models: a case study with selenium in Mytilus edulis. Risk Analysis 21: 383-394.
Although the parameters for contaminant bioaccumulation models most likely vary over time, lack of data makes it impossible to quantify this variability. As a consequence, Monte Carlo models of contaminant bioaccumulation often treat all parameters as having fixed true values that are unknown. This can lead to biased distributions of predicted contaminant concentrations. This article demonstrates this phenomenon with a case study of selenium accumulation in the mussel Mytilus edulis in San Francisco Bay. "Ignorance-only" simulations (in which phytoplankton and bioavailable selenium concentrations are constant over time, but sampled from distributions of field measurements taken at different times), which an analyst might be forced to use due to lack of data, were compared with "variability and ignorance" simulations (sampling phytoplankton and bioavailable selenium concentrations each month). It was found that ignorance-only simulations may underestimate or overestimate the median predicted contaminant concentration at any time, relative to variability and ignorance simulations. However, over a long enough time period (such as the complete seasonal cycle in a seasonal model), treating temporal variability as if it were ignorance at least gave a range of predicted concentrations that enclosed the range predicted by explicit treatment of temporal variability. Comparing the temporal variability in field data with that predicted by simulations may indicate whether the right amount of temporal variability is being included in input variables. Sensitivity analysis combined with biological knowledge suggests which parameters might make important contributions to temporal variability. Temporal variability is potentially more complicated to deal with than other types of stochastic variability, because of the range of time scales over which parameters may vary.
Burgman, M.A., D.R. Breininger, B.W. Duncan and S. Ferson. Setting reliability bounds on habitat suitability indices. Ecological Applications <<>>.
We expressed quantitative and qualitative uncertainties in suitability index functions as triangular distributions and used the mechanics of fuzzy numbers to solve for the distribution of uncertainty around the habitat suitability indices derived from them. We applied this approach to a habitat model for the Florida Scrub-Jay. The results demonstrate that priorities and decisions associated with management and assessment of ecological systems may be influenced by an explicit consideration of the reliability of the indices.
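A sketch of the level-wise mechanics may help; the triangular numbers and the geometric-mean combination below are illustrative placeholders, not the published Florida Scrub-Jay model. At each membership level alpha, a triangular fuzzy number collapses to an interval, and interval arithmetic propagates the uncertainty level by level.

import numpy as np

# Alpha-cut arithmetic for triangular fuzzy numbers (Python sketch).
# The indices and the geometric-mean combination rule are hypothetical.

def alpha_cut(tri, alpha):
    lo, mode, hi = tri
    return (lo + alpha * (mode - lo), hi - alpha * (hi - mode))

def fuzzy_geometric_mean(tris, alphas=np.linspace(0, 1, 11)):
    """Level-wise bounds on the geometric mean of fuzzy suitability indices.
    The geometric mean is increasing in each argument, so the interval
    endpoints suffice at each level."""
    out = []
    for a in alphas:
        los, his = zip(*(alpha_cut(t, a) for t in tris))
        out.append((a, np.prod(los) ** (1 / len(tris)),
                       np.prod(his) ** (1 / len(tris))))
    return out

indices = [(0.2, 0.4, 0.7), (0.5, 0.6, 0.9), (0.3, 0.5, 0.6)]   # hypothetical
for a, lo, hi in fuzzy_geometric_mean(indices):
    print(f"alpha = {a:.1f}:  HSI in [{lo:.3f}, {hi:.3f}]")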
Crutchfield, J. and S. Ferson. 2000. Predicting recovery of a fish population after heavy metal impacts. Environmental Science and Policy 3: S183-S189.
Bluegill sunfish, Lepomis macrochirus, in part of Hyco Reservoir (North Carolina) were decimated by toxicological and developmental effects of selenium leached from coal ash settling ponds during 1970-1980. Bluegill are especially sensitive to elevated concentrations of the heavy metal, and near-complete recruitment failure of zero-year olds was observed. To predict the potential recovery after cessation of heavy metal contamination, a demographic model was created for the bluegill population based on data collected from on-going biological monitoring at the lake. The model included density dependence and used Monte Carlo methods to analyze the effects of natural environmental variability. The life history of the species suggests that once selenium poisoning stopped, the population could recover to pre-impact abundances within two years, although the increased abundance would be unevenly distributed among age groups. However, following this increase in abundance, we predicted a population crash due to the time-delayed effects of selenium on the population resulting from the strong nonlinearity of density dependence in this species. The sharp increase in population size itself precipitates the crash which, if not forecast in advance, could cause considerable concern among managers, regulators and the interested public. This example shows that it can be important to predict ecological consequences to understand the nature and duration of biological recovery from toxicological insults. Without the understanding provided by the ecological analysis, the population decline would probably be completely misinterpreted as the failure of the mitigation program.
Akçakaya, H.R., S. Ferson, M. Burgman, D. Keith, G. Mace and C. Todd. 2000. Making consistent IUCN classifications under uncertainty. Conservation Biology 14: 1001-1013.
The World Conservation Union (IUCN) defined a set of categories for conservation status supported by decision rules based on thresholds of parameters such as distributional range, population size, population history, and risk of extinction. These rules have received international acceptance and have become one of the most important decision tools in conservation biology because of their wide applicability, objectivity, and simplicity of use. The input data for these rules are often estimated with considerable uncertainty due to measurement error, natural variation, and vagueness in definitions of parameters used in the rules. Currently, no specific guidelines exist for dealing with uncertainty. Interpretation of uncertain data by different assessors may lead to inconsistent classifications because attitudes toward uncertainty and risk may have an important influence on the classification of threatened species. We propose a method of dealing with uncertainty that can be applied to the current IUCN criteria without altering the rules, thresholds, or intent of these criteria. Our method propagates the uncertainty in the input parameters and assigns the evaluated species either to a single category (as the current criteria do) or to a range of plausible categories, depending on the nature and extent of uncertainties.
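The flavor of the proposal can be conveyed with a small sketch; the thresholds and interval estimates below are illustrative placeholders, not the actual IUCN criteria. A rule applied to interval inputs can come out definitely satisfied, definitely not satisfied, or indeterminate, and the indeterminate cases are what generate a range of plausible categories.

# Three-valued evaluation of a threshold rule with interval inputs (Python sketch).
# Thresholds and estimates are illustrative, not the IUCN criteria.

def below(x, t):     # is interval x surely / surely not / possibly below t?
    lo, hi = x
    return True if hi < t else False if lo >= t else None

def above(x, t):
    lo, hi = x
    return True if lo > t else False if hi <= t else None

def both(a, b):      # three-valued AND
    if a is False or b is False:
        return False
    return True if (a is True and b is True) else None

pop_size = (1800, 3200)   # interval estimate of mature individuals
decline  = (0.25, 0.45)   # interval estimate of proportional decline

meets = both(below(pop_size, 2500), above(decline, 0.20))
print({True: "meets criterion", False: "does not meet criterion",
       None: "indeterminate: plausibly meets criterion"}[meets])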
Fortin, M.-J., R.J. Olson, S. Ferson, L. Iverson, D. Levine, K. Buteras, V. Klemas and C. Hunsaker. 2000. Detecting boundaries and gradients associated with ecotones. Landscape Ecology 15: 453-466.
Ecotones are inherent transitional features of landscapes and play more than one functional role in ecosystem dynamics. The delineation of ecotones and environmental boundaries is therefore an important step in land-use management planning. The delineation of ecotones depends on the phenomenon of interest and the statistical methods used, as well as the spatial and temporal resolution of the data available. In the context of delineating wetland and riparian ecosystems, various data types (field data, remotely sensed data) can be used to delineate ecotones. Methodological issues related to their detection need to be addressed, however, so that their management and monitoring can yield useful information about their dynamics and functional roles in ecosystems. The aim of this paper is to review boundary detection methods. Because the most appropriate methods to detect and characterize boundaries depend on the spatial resolution and the measurement type of the data, a wide range of approaches is presented, including GIS, remote sensing and statistical methods.
Goldwasser, L., L. Ginzburg and S. Ferson. 2000. Variability and measurement error in extinction risk analysis: the northern spotted owl on the Olympic Peninsula. Quantitative Methods for Conservation Biology, S. Ferson and M. Burgman (eds.), Springer-Verlag, New York.
We distinguish two sources of uncertainty in analyzing risks of extinction. The first is stochastic variability caused by fluctuations in the environment and variation among individuals. The second is measurement error induced by incomplete sampling and empirical imprecision. We argue that the two kinds of uncertainty require different treatments when uncertainty is propagated through models of population dynamics. We use published data on the Olympic Peninsula population of the northern spotted owl (Strix occidentalis caurina) to illustrate the importance of this difference. Using recently proposed quantitative criteria for extinction threat, this owl population appears either to be threatened if the current number of owl breeding territories is 200, or to be non-threatened if this number is closer to 300. Taking measurement error into account strongly increases the uncertainty in this determination. Although we can be reasonably confident that this population will persist at least for the next 30−40 years, reliable risk evaluations over a 100-year time scale are probably impossible.
Ferson, S., H.R. Akçakaya and A. Dunham. 1999. Using fuzzy intervals to represent measurement error and scientific uncertainty in endangered species classification. Real World Application of Fuzzy Logic and Soft Computing, R.N. Davé and T. Sudkamp (eds.), Proceedings of the 18th International Conference of NAFIPS, IEEE, Piscataway, New Jersey.
Although fuzzy numbers (including fuzzy intervals) are often used to capture semantic ambiguity, they are also useful to represent and propagate measurement error. In this application, a classification scheme used by international authorities for assigning biological species into categories of relative endangerment is generalized to accept intervals and triangular or trapezoidal fuzzy numbers as inputs representing empirical estimates of unknown quantities. Non-traditional definitions for fuzzy magnitude comparisons and logical operations were required but, otherwise, standard fuzzy arithmetic was used. A defuzzification step, which explicitly reveals the analyst's attitudes regarding evidence, can condense the result from the fuzzified classification scheme to a single category. But this step is not required and may be counterproductive.
Ferson, S. 1999. Ecological risk assessment based on extinction probabilities of populations. Proceedings of the Second International Workshop on Risk Evaluation and Management of Chemicals, J. Nakanishi (ed.), Japan Science and Technology Corporation, Yokohama.
Many researchers now agree that an ecological risk assessment should be a probabilistic forecast of effects at the level of the population. The emerging consensus has two essential themes: (i) individual-level effects are less important for ecological management, and (ii) deterministic models cannot adequately portray the environmental stochasticity that is ubiquitous in nature. It is important to resist the temptation to reduce a probabilistic analysis to a scalar summary based on the mean. An assessment of the full distribution of risks will be the most comprehensive and flexible endpoint. There are two ways to visualize a distributional risk assessment of a chemical's impact on a population. The first is to display, side by side, the two risk distributions arising from separate simulations with and without the impact but alike in every other respect. Alternatively, one can display the risk of differences between population trajectories with and without impact but alike in every other respect. Like all scientific forecasts, an ecological risk assessment requires appropriate uncertainty propagation. This can be accomplished by using a mixture of interval analysis and Monte Carlo simulation techniques.
Ferson, S. and S. Donald. 1998. Probability bounds analysis, pp. 1203-1208 in Probabilistic Safety Assessment and Management, A. Mosleh and R.A. Bari (eds.), Springer-Verlag, New York.
Probabilistic risk assessments almost always demand more information about the statistical distributions and dependencies of input variables than is available empirically. For instance, to use Monte Carlo simulation, one generally needs to specify the particular shapes and exact parameters for all the input variables. Imperfect understanding of how common-cause or common-mode mechanisms induce correlations or complicated dependencies among the variables typically forces analysts to assume independence even if they suspect otherwise. Probability bounds analysis allows assessors to sidestep both uncertainty about the precise specifications of input variables and imperfect information about the correlation and dependency structure among the variables to compute rigorous bounds on the resulting risks.
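The sort of bound involved can be sketched with a coarse discretization of the classical Fréchet-style (Makarov) bounds on the distribution of a sum of two random variables whose marginals are known but whose dependence is not. This is an illustration of the idea only, not the algorithm used in any particular software, and the two lognormal inputs are hypothetical.

import numpy as np
from scipy.stats import lognorm

# Discretized bounds on P(X + Y <= z) with known marginals and unknown
# dependence (Python sketch).  The finite grid makes the bounds slightly
# looser, so they remain valid.

def cdf_bounds_of_sum(Fx, Fy, xs, zs):
    lower, upper = [], []
    for z in zs:
        fx, fy = Fx(xs), Fy(z - xs)
        lower.append(np.max(np.maximum(fx + fy - 1.0, 0.0)))
        upper.append(np.min(np.minimum(fx + fy, 1.0)))
    return np.array(lower), np.array(upper)

Fx = lognorm(s=0.5, scale=1.0).cdf          # hypothetical marginal for X
Fy = lognorm(s=0.8, scale=2.0).cdf          # hypothetical marginal for Y
xs = np.linspace(0.0, 40.0, 4001)           # grid over which the sup/inf is taken
zs = np.array([2.0, 5.0, 10.0])

lb, ub = cdf_bounds_of_sum(Fx, Fy, xs, zs)
for z, lo, hi in zip(zs, lb, ub):
    print(f"P(X+Y <= {z:4.1f}) lies in [{lo:.3f}, {hi:.3f}] whatever the dependence")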
Hwang, K. and S. Ferson. 1998. Detecting rare event clustering in very sparse data sets, pp. 1235-1240 in Probabilistic Safety Assessment and Management, A. Mosleh and R.A. Bari (eds.), Springer-Verlag, New York.
Traditional statistical methods to detect clustering are inappropriate for the sparse data available for rare events because they assume large sample sizes. These methods can either underestimate or overestimate the evidence for clustering, and it is hard to predict the direction of the error. Moreover, the traditional statistics are very coarse indicators of non-random pattern and are not especially designed to detect clustering per se. We describe eight new statistical tests, implemented in a convenient software package, that can be used to detect clustering of rare events in structured environments. Because the new tests employ combinatorial formulations, they are exact methods. As a result, the new tests are reliable whatever the size of the data set, and are especially useful when data sets are extremely small. By design, these tests are sensitive to different aspects of clusters and should be useful in discerning not only the fact of clustering but also something about the processes that generated the clustering.
Ferson, S. and T.F. Long. 1997. Deconvolution can reduce uncertainty in risk analyses. Risk Assessment: Measurement and Logic, M. Newman and C. Strojan (eds.), Ann Arbor Press.
The operations implemented in familiar software packages such as @Risk, Crystal Ball and Analytica routinely estimate convolutions of random variables. In comprehensive risk analyses, however, deconvolutions should be used whenever justified because they will often yield narrower distributions containing less overall uncertainty. Furthermore, in some circumstances, failing to use deconvolutions when appropriate can lead to conclusions that are insufficiently protective. Deconvolutions reduce uncertainty by taking into account knowledge about the underlying relationship among the variables. Although they are almost never used in current risk analyses, situations which justify their use arise frequently in practical situations. Both probabilistic and traditional worst-case or bounding-estimate risk analyses can benefit from appropriate use of deconvolution. We outline when deconvolutions can be used and how they are computed in both analytical settings.
Ferson, S. 1997. Probability bounds analysis software. Computing in Environmental Resource Management. Proceedings of the Conference, A. Gertler (ed.), Air and Waste Management Association and the U.S. Environmental Protection Agency, Pittsburgh, Pennsylvania. pp. 669-678.
Probabilistic risk assessments almost always require more information about the statistical distributions and dependencies of input variables than is empirically available. For instance, to use Monte Carlo simulation, one generally needs to specify the particular shapes and precise parameters for all the input variables even if relevant data are very sparse. Moreover, imperfect understanding of how common-cause or common-mode mechanisms induce correlations or complicated dependencies among the variables typically forces analysts to assume independence even if they suspect otherwise. Most practitioners acknowledge the limitations induced by these problems, yet rarely employ sensitivity studies or second-order simulations to assess their possible significance because they would be computationally prohibitive. Probability bounds analysis allows assessors to sidestep both uncertainty about the precise specifications of input variables and imperfect information about the correlation and dependency structure among the variables. Probability bounds analysis is therefore an important tool for establishing quality assurance of probabilistic risk assessments, yet until now this tool has not been accessible to the practicing risk analyst because of a lack of convenient software. The package Risk Calc implements probability bounds analysis under the Windows operating system. The interface is convenient and natural for quantitative risk assessors. It offers a powerful array of standard functions, which are extensible by means of user-written programs. The software permits calculations that are vastly less expensive computationally than alternative approaches. For instance, a second-order Monte Carlo simulation that required two weeks to compute on a microcomputer can be replaced by a calculation with probability bounds analysis in Risk Calc that takes only seconds. The features and uses of Risk Calc are outlined.
Ferson, S. 1996. What Monte Carlo methods cannot do. Human and Ecological Risk Assessment 2:990-1007.
Although extremely flexible and obviously useful for many risk assessment problems, Monte Carlo methods have four significant limitations that risk analysts should keep in mind. (1) Like most methods based on probability theory, Monte Carlo methods are data-intensive. Consequently, they usually cannot produce results unless a considerable body of empirical information has been collected, or unless the analyst is willing to make several assumptions in the place of such empirical information. (2) Although appropriate for handling variability and stochasticity, Monte Carlo methods cannot be used to propagate partial ignorance under any frequentist interpretation of probability. (3) Monte Carlo methods cannot be used to conclude that exceedance risks are no larger than a particular level. (4) Finally, Monte Carlo methods cannot be used to effect deconvolutions to solve backcalculation problems such as often arise in remediation planning. This paper reviews a series of ten exemplar problems in risk analysis for which classical Monte Carlo methods yield an incorrect answer.
Ferson, S. and L.R. Ginzburg. 1996. Different methods are needed to propagate ignorance and variability. Reliability Engineering and Systems Safety 54:133-144.
There are two kinds of uncertainty. One kind arises as variability resulting from heterogeneity or stochasticity. The other arises as partial ignorance resulting from systematic measurement error or subjective (epistemic) uncertainty. As most researchers recognize, variability and ignorance should be treated separately in risk analyses. Although a second-order Monte Carlo simulation is commonly employed for this task, this approach often requires unjustified assumptions and may be inappropriate in some circumstances. We argue that the two kinds of uncertainty should be propagated through mathematical expressions with different calculation methods. Basically, interval analysis should be used to propagate ignorance, and probability theory should be used to propagate variability. We demonstrate how using an inappropriate method can yield erroneous results. We also show how ignorance and variability can be represented simultaneously and manipulated in a coherent analysis that does not confound the two forms of uncertainty and distinguishes what is known from what is assumed.
Ginzburg, L.R., C. Janson, and S. Ferson. 1996. Judgment under uncertainty: evolution may not favor a probabilistic calculus. Behavioral and Brain Sciences 19: 24f.
The environment in which humans evolved is strongly positively autocorrelated in space and time. Probabilistic judgements based on the assumption of independence may not yield evolutionarily adaptive behavior. A number of 'faults' of human reasoning are not faulty under fuzzy arithmetic, a non-probabilistic calculus of reasoning under uncertainty that may be closer to that underlying human decision making.
Ferson, S. 1996. Reliable calculation in probabilistic logic: accounting for small sample size and model uncertainty. Intelligent Systems: A Semiotic Perspective, NIST, Gaithersburg, Maryland. 115-121.
A variety of practical computational problems arise in risk and safety assessments, forensic statistics and decision analyses in which the probability of some event or proposition E is to be estimated from the probabilities of a finite list of related subevents or propositions F,G,H,.... In practice, the analyst's knowledge may be incomplete in two ways. First, the probabilities of the subevents may be imprecisely known from statistical estimations, perhaps based on very small sample sizes. Second, relationships among the subevents may be known imprecisely. For instance, there may be only limited information about their stochastic dependencies. Representing probability estimates as interval ranges on [0,1] has been suggested as a way to address the first source of imprecision. A suite of AND, OR and NOT operators defined with reference to the classical Fréchet inequalities permits these probability intervals to be used in calculations that address the second source of imprecision, in many cases, in a best possible way. Using statistical confidence intervals as inputs unravels the closure properties of this approach, however. One solution is to characterize each probability as a nested stack of intervals for all possible levels of statistical confidence, from a point estimate (0% confidence) to the entire unit interval (100% confidence). The corresponding logical operations implied by convolutive application of the logical operators for every possible pair of confidence intervals reduce by symmetry to a manageably simple level-wise iteration. The resulting logical calculus can be implemented in software that allows users to compute comprehensive and often level-wise best possible bounds on probabilities for logical functions of events.
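The basic interval operations follow directly from the Fréchet inequalities; a minimal sketch of that part of the calculus is given below with hypothetical interval inputs (the confidence-stack refinement is omitted).

# Interval AND / OR / NOT under unknown dependence (Python sketch).

def p_and(a, b):
    """Bounds on P(A and B) when the dependence between A and B is unknown."""
    return (max(0.0, a[0] + b[0] - 1.0), min(a[1], b[1]))

def p_or(a, b):
    """Bounds on P(A or B) when the dependence between A and B is unknown."""
    return (max(a[0], b[0]), min(1.0, a[1] + b[1]))

def p_not(a):
    return (1.0 - a[1], 1.0 - a[0])

F = (0.10, 0.25)     # hypothetical interval estimate for P(F)
G = (0.30, 0.40)     # hypothetical interval estimate for P(G)

print("P(F and G) in", p_and(F, G))    # approximately (0.00, 0.25)
print("P(F or G)  in", p_or(F, G))     # approximately (0.30, 0.65)
print("P(not F)   in", p_not(F))       # approximately (0.75, 0.90)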
Ferson, S., L.R. Ginzburg and R.A. Goldstein. 1995. Inferring ecotoxicological risk from toxicity bioassays. Water, Air and Soil Pollution 90:71-82.
Results from toxicological bioassays can express the likely impact of environmental contamination on biochemical function, histopathology, development, reproduction and survivorship. However, justifying environmental regulatory decisions and management plans requires predictions of the consequent effects on ecological populations and communities. Although extrapolating the results of toxicity bioassays to potential effects on the ecosystem may be beyond the current scientific capacity of ecology, it is possible to make detailed forecasts at the level of a population. We give examples in which toxicological impacts are either magnified or diminished by population-dynamic phenomena and argue that ecological risk assessments should be conducted at a level no lower than the population. Although methods recently proposed by EPA acknowledge that ecological risk evaluations should reflect population-level effects, they adopt approaches from human health risk analysis that focus on individuals.
Ferson, S. 1995. Quality assurance for Monte Carlo risk assessments. Proceedings of the 1995 Joint ISUMA/NAFIPS Symposium on Uncertainty Modeling and Analysis, IEEE Computer Society Press, Los Alamitos, California, pp. 14-19.
Three major problems inhibit the routine use of Monte Carlo methods in risk and uncertainty analyses: (1) correlations and dependencies are often ignored, (2) input distributions are usually not available, and (3) the mathematical structure of the model is questionable. Most practitioners acknowledge the limitations induced by these problems, yet rarely employ sensitivity studies or other methods to assess their consequences. This paper reviews several computational methods that can be used to check a risk assessment for the presence of certain kinds of fundamental modeling mistakes, and to assess the possible error that could arise when variables are incorrectly assumed to be independent or when input distributions are incompletely specified.
Ferson, S. and L. Ginzburg. 1995. Hybrid arithmetic. Proceedings of the 1995 Joint ISUMA/NAFIPS Symposium on Uncertainty Modeling and Analysis, IEEE Computer Society Press, Los Alamitos, California, pp. 619-623.
Kaufmann's formulation of hybrid numbers, which simultaneously express fuzzy and probabilistic uncertainty, allows addition and subtraction, but offers no obvious way to do multiplication, division or other operations. We describe another, more comprehensive formulation for hybrid numbers that allows the full suite of arithmetic operations, permitting them to be incorporated into complex mathematical calculations. There are two complementary approaches to computing with these hybrid numbers. The first is extremely efficient and yields theoretically optimal results in many circumstances. The second more general approach is based on Monte Carlo simulation using intervals or fuzzy numbers rather than scalar numbers.
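The second, simulation-based approach can be sketched in a few lines; the dose calculation and all input values below are hypothetical. The probabilistic input is sampled as usual while the imprecise input is carried through each replicate as an interval, so the result is a pair of distributions bounding the answer; a fuzzy input would simply be a stack of such intervals, one per membership level.

import numpy as np

# Monte Carlo simulation with an interval-valued input (Python sketch).
rng = np.random.default_rng(1)

conc = rng.lognormal(mean=0.0, sigma=0.5, size=10_000)   # variability (mg/L)
intake = (1.0, 2.5)                                      # ignorance (L/day)

# Multiplying by a positive interval maps endpoints to endpoints
dose_lo = conc * intake[0]
dose_hi = conc * intake[1]

for p in (50, 95):
    print(f"{p}th percentile of dose lies in "
          f"[{np.percentile(dose_lo, p):.2f}, {np.percentile(dose_hi, p):.2f}] mg/day")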
Ferson, S. and M. Burgman. 1995. Correlations, dependency bounds and extinction risks. Biological Conservation 73:101-105.
Methods for estimating bounds on extinction risks are described for cases where correlations among the parameters in a quantitative population viability analysis are unknown, and more generally where the forms of statistical dependence among the model's parameters are unknown. An example using Leadbeater's possum shows that making incorrect assumptions about correlations and dependencies may lead to severe over-estimation or under-estimation of extinction risks.
Ferson, S. 1995. Using approximate deconvolution to estimate cleanup targets in probabilistic risk analyses, pages 239-248 in Hydrocarbon Contaminated Soils, P. Kostecki (ed). Amherst Scientific Press, Amherst, Massachusetts.
Although deconvolution seems to be virtually unknown among practicing risk analysts, it will likely play an important role in backcalculating the soil cleanup goals necessary to satisfy remediation requirements. Many algorithms for computing deconvolutions have been described, but most are numerically unstable and extremely sensitive to noise. Even modern nonlinear methods which have considerably improved performance characteristics can be unwieldy in the kinds of uses required in risk analysis. A simple but robust approximate method to estimate deconvolutions is described and illustrated with a numerical example.
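A moment-based illustration, much simpler than the approximate algorithm described here, shows what a deconvolution buys: if total = part + remainder with the two terms independent, the unknown part has mean(total) minus mean(remainder) and variance var(total) minus var(remainder), whereas naively "subtracting" the distributions by ordinary convolution would add the variances and overstate the spread. All values below are hypothetical.

import math

# Moment deconvolution versus naive subtraction (Python sketch); hypothetical values.
mean_total, sd_total = 12.0, 4.0      # target (e.g., allowable dose) distribution
mean_rem,   sd_rem   = 5.0,  3.0      # known background contribution

mean_part = mean_total - mean_rem
var_part = sd_total**2 - sd_rem**2
if var_part < 0:
    raise ValueError("inconsistent inputs: no such distribution exists")
sd_part = math.sqrt(var_part)
sd_naive = math.sqrt(sd_total**2 + sd_rem**2)   # what plain convolution gives

print(f"deconvolved part:  mean {mean_part:.1f}, sd {sd_part:.2f}")
print(f"naive subtraction: mean {mean_part:.1f}, sd {sd_naive:.2f}  (too wide)")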
Ferson, S. and T.F. Long. 1995. Conservative uncertainty propagation in environmental risk assessments. Environmental Toxicology and Risk Assessment, Third Volume, ASTM STP 1218, J.S. Hughes, G.R. Biddinger and E. Mones (eds.), American Society for Testing and Materials, Philadelphia, pp. 97-110.
Toxicological risk analysis is at a crossroads. The traditional approach using worst-case analysis is widely regarded as fundamentally flawed since it yields conclusions that are often strongly biased and presumed hyperconservative. Probabilistic analysis using Monte Carlo simulation can yield overly optimistic conclusions when used without information about the correlation structure among variables. What is needed is a conservative methodology that makes no assumptions unwarranted by empirical evidence. To be conservative means both that the estimated risk is not systematically lower than the actual impact and that the uncertainty around the estimate is not narrower than justified by the available data. An appropriate methodology is dependency bounds analysis, which computes bounds on the distributions for arithmetic operations on random variables when only their marginal distributions are known. Used in a risk analysis, it yields conservative results because it does not depend on knowledge about the correlation structure among all of the variables used in the analysis.
Ferson, S. 1994. Naive Monte Carlo methods yield dangerous underestimates of tail probabilities. Proceedings of the High Consequence Operations Safety Symposium, Sandia National Laboratories, SAND94-2364, J.A. Cooper (ed.), pp. 507-514.
Extreme-event probabilities (i.e., the tails) of a statistical distribution resulting from probabilistic risk analysis can depend strongly on dependencies among the variables involved in the calculation. Although well known techniques exist for incorporating correlation into analyses, in practice they are often neglected on account of a paucity of information about joint distributions. Furthermore, certain forms of dependency that are not adequately measured by simple correlation must perforce be omitted from such assessments. Two general techniques may be used to compute conservative estimates of the tails in the context of ignorance about correlation and dependency among the variables. The first is based on a variance maximization/minimization trick. It is compatible with existing Monte Carlo methods, but it accounts only for linear dependencies. The second is based on Fréchet inequalities, and, although incompatible with Monte Carlo methods, it guarantees conservative estimates of tail probabilities no matter what dependence structure exists among the variables in the analysis.
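The first technique rests on a one-line identity: for a sum of two variables with known marginal standard deviations but unknown correlation r in [-1, 1], var(X + Y) = var(X) + var(Y) + 2 r sd(X) sd(Y), so the extreme variances occur at r = -1 and r = +1. A short numerical check with hypothetical values:

import math

sd_x, sd_y = 2.0, 3.0
var_min = sd_x**2 + sd_y**2 - 2 * sd_x * sd_y     # r = -1
var_max = sd_x**2 + sd_y**2 + 2 * sd_x * sd_y     # r = +1
print(f"sd(X+Y) ranges from {math.sqrt(var_min):.1f} to {math.sqrt(var_max):.1f}")

As the abstract notes, this brackets only linear dependence; the Fréchet-based technique is needed to guarantee conservative tail estimates under arbitrary dependence.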
Ferson, S. and R. Kuhn. 1994. Interactive microcomputer software for fuzzy arithmetic. Proceedings of the High Consequence Operations Safety Symposium, Sandia National Laboratories, SAND94-2364, J.A. Cooper (ed.).
We describe a new microcomputer implementation of fuzzy arithmetic for the Microsoft Windows operating system based on the strict definition of fuzzy numbers (i.e., fuzzy sets that are both convex and normal). Fuzzy numbers can be specified by a user as intervals in the form [a,b] or [a±b], as triangular numbers in the form [a,b,c], as trapezoidal numbers in the form [a,b,c,d], or as an arbitrary list of points describing the fuzzy membership function. A fuzzy number can also be synthesized as the consensus of a list of intervals. The software supports the full complement of standard operators and functions, including +, −, ×, ÷, >, ≥, <, ≤, equality comparison, variable assignment, additive and multiplicative deconvolution, maximum, minimum, power, exponential, natural and common logarithm, square root, integer part, sine, cosine, and arc tangent. The basic logical operations (and, or, not) are interpreted in both traditional boolean and fuzzy-logical contexts. Also several new functions for fuzzy operands are introduced, including least and greatest possible value, breadth of uncertainty, most possible interval, interval of specified possibility, and possibility level of a scalar. All operations and functions are transparently supported for pure or mixed expressions involving scalars, intervals, and fuzzy numbers. Expressions are evaluated as they are entered and the resulting values automatically displayed graphically. The software also accepts and propagates (arbitrarily named) units embedded with the numbers in expressions and checks that their dimensions conform in additive and comparison operations. The software features fundamental programming constructs for conditional execution and looping, strings and basic string operators, hypertext help, script execution, user-defined display windows, and an extensive library of examples that illustrate the properties of fuzzy arithmetic.
Ferson, S. 1994. Using fuzzy arithmetic in Monte Carlo simulation of fishery populations. Management Strategies for Exploited Fish Populations, G. Kruse et al. (eds.), proceedings of the International Symposium on Management Strategies for Exploited Fish Populations, Anchorage, 1992, Alaska Sea Grant College Program, AK-SG-93-02, pp. 595-608.
Traditional methods of population projection have made two manifestly false assumptions. The first is that vital rates are constant in time and the second is that they are known with certainty. The first assumption can be removed by using a Monte Carlo approach that treats stochastic temporal fluctuation of vital parameters and estimates population decline risks and recovery chances. While this approach handles an important source of uncertainty, another significant source, measurement error, cannot be addressed with the same methods. Since measurement error in fisheries management can often be fifty percent or more of the measurement magnitude, the standard practice of treating values as if they were known with certainty can result in grossly misleading conclusions. Possibility theory (a generalization of interval analysis) together with the existing methodology yields a risk analysis of population growth that reflects both measurement error and temporal variability in the vital rates and abundance measures. This analysis should be more useful in planning management strategies because it characterizes the reliability of the predictions made.
Ferson, S. and R. Kuhn. 1992. Propagating uncertainty in ecological risk analysis using interval and fuzzy arithmetic. Computer Techniques in Environmental Studies IV, P. Zannetti (ed.), Elsevier Applied Science, London, pp. 387-401.
Ecological risk analysis has been strongly criticized for failing to express the reliability of its estimates and for ignoring uncertainty when making assessments. There are probabilistic methods for projecting uncertainty through calculations but they are often not used because they require a great deal of data to estimate probability distributions and cross correlations. Since measurement error in ecology can often be 50% or more of the measured magnitude, the standard practice of treating values as if they were known with certainty can result in grossly misleading conclusions. We propose to apply possibility theory (which is a refinement of simple interval analysis) to propagate uncertainty. The proposed methodology is logically consistent, computationally inexpensive and does not require extensive sampling effort. In comparison with analogous probabilistic methods, it produces conservative estimates of uncertainty in the final answer, which may be especially attractive in many situations. We have developed a command-based software package that performs fuzzy-arithmetical operations necessary for possibilistic projection of uncertainty. The methodology should be widely applicable in ecological risk analysis, including environmental impact assessments and human health risk analyses, to provide an estimate of the reliability of an assessment.
Ferson, S., R. Akçakaya, L. Ginzburg and M. Krause. 1991. Use of RAMAS to estimate ecological risk. EPRI Technical Report EN-7176. Electric Power Research Institute, Palo Alto, California.
The software system RAMAS was used to assess the population-level ecological risks associated with anthropogenic mortality in two species of fish. RAMAS facilitated comparison of the effects of fishing and entrainment/impingement mortality on Hudson River striped bass populations. The highest likely mortality levels associated with power generation did not yield increases in risk of overall population decline as large as did the pressure from sport fishing alone (33" limit, 5/day). Qualitative differences associated with the life stages affected by these industries account for most of the variation observed. Simulations performed under a range of assumptions about density-dependent parameters for the striped bass population gave similar conclusions. However, strengthening density dependence decreased the probability of quasi-extinction slightly. Density-dependent stochastic demographic modeling of a bluegill population in a selenium (Se)-affected power plant cooling lake in North Carolina revealed intrinsic cycling of population abundance. This cycling increases the risk of quasi-extinction in natural as well as anthropogenically impacted populations. The dynamics of bluegills affected by Se contrasts sharply with that of the undisturbed fish. Continuation of the Se discharge will most likely result in the suppression of the affected bluegill population; however, the bluegills could recover to natural abundance levels within two or three generations if released from this mortality pressure.
Symposium at the Society for Risk Analysis annual meeting
The Balance of Nature: Can Toxicologists and Ecologists Come to Consensus?
Chair: S. Ferson. The subject of the proposed symposium is the difference in perspective between toxicologists and ecologists and how it affects the evaluation of risks from contaminant effects on natural communities. The consequences of homeostasis in organisms and of other stability mechanisms in natural communities are interpreted differently by the two disciplines. Toxicologists often view these mechanisms as a cushion against toxicological insults. Ecologists, on the other hand, are more likely to view the balance of nature as an unstable one, which can be destroyed by toxicant effects. The goals of the symposium are to explore the differences between the two communities and to illustrate ways to resolve disagreements to achieve a synthetic assessment of ecotoxicological risks. The speakers and their titles are:
Todd S. Bridges, USACE Waterways Experiment Station, "Straining the gnat and swallowing the camel: The importance of scale and uncertainty in assessing ecological risk"
Beth McGee, University of Maryland, "Gleaning ecological risks from sediment toxicity tests"
Sara Hoover, Golder Associates, "Competing issues in contaminated site risk assessment: a case study"
Bruce K. Hope, Oregon Department of Environmental Quality, "A new diet of worms: ecologists, toxicologists, and regulators"
Symposium at the Society for Risk Analysis annual meeting
Backcalculating cleanup goals
Chair: S. Ferson. Risks from environmental contaminants are estimated with a risk equation involving contaminant concentration and other factors. Because the intent of environmental remediation is to ensure that these risks are not intolerably large, some way is needed to backcalculate from tolerance limits on risk mandated by regulators to the allowable environmental concentration for the contaminant. It is now well known that the approach used in deterministic assessments of simply inverting the risk equation to compute the cleanup goal does not work in a probabilistic assessment. Several approaches sidestepping the underlying mathematical problems have been proposed by various authors. The goal of this symposium is to review the needs of the regulatory community and examine whether and how the recent methodological work satisfies these needs. The speakers and their titles are:
Ted Simon, U.S. Environmental Protection Agency, Region 4, "Calculating cleanup levels with Monte Carlo: regulatory concerns and perspective"
Matthew Butcher and Rob Pastorok, Exponent Environmental, and Brad Sample, CH2M Hill, "Setting soil screening levels for wildlife at Superfund sites"
Robert Fares and K.G. Symms, Environmental Standards, Inc., "Back-calculation of soil cleanup levels utilizing Monte Carlo techniques"
David Myers and Scott Ferson, Applied Biomathematics, "How to satisfy multiple constraints on cleanup goals in a probabilistic assessment"
Surprising dynamics during ecological recovery after heavy metal contamination
Scott Ferson and John Crutchfield
The population of bluegill sunfish Lepomis macrochirus in part of a lake in North Carolina was decimated by toxicological and developmental effects of selenium leached from ash settling ponds. To forecast the potential recovery after cessation of heavy metal contamination, a demographic model was created for the bluegill population based on data collected from on-going biological monitoring at the lake. The model included density dependence which is known to be an important aspect of the life history of this species and used Monte Carlo methods to analyze the effect of natural environmental variability. The life history of the species revealed by analysis of the population model suggests that, if selenium poisoning were stopped, the population could recover to pre-impact abundances within two years. The increased abundance would be unevenly distributed among age groups, however. Following this increase in abundance, the biology predicts a population crash, especially among older year classes (which are prized by sportfishermen). This crash is due to the time-delayed effects of selenium on the population resulting from the strong non-linearity of density dependence in this species. The sharp increase in population size itself precipitates the crash. If this crash were not forecast in advance, its unanticipated occurrence could cause considerable consternation among managers, regulators and the interested public. This example shows that it can be important to predict ecological consequences to understand the nature and duration of biological recovery from toxicological insults. Without the understanding provided by the ecological analysis, the population decline would probably be completely misinterpreted as the failure of the mitigation program.
Ecological risk assessment based on extinction distributions
Many researchers now agree that an ecological risk assessment should be a probabilistic forecast of effects at the level of the population. The emerging consensus has two essential themes: (i) individual-level effects are less important for ecological management, and (ii) deterministic models cannot adequately portray the environmental stochasticity that is ubiquitous in nature. It is important to resist the temptation to reduce a probabilistic analysis to a scalar summary based on the mean. An assessment of the full distribution of risks will be the most comprehensive and flexible endpoint. There are two ways to visualize a distributional risk assessment of a chemical's impact on a population. The first is to display, side by side, the two risk distributions arising from separate simulations with and without the impact but alike in every other respect. Alternatively, one can display the risk of differences between population trajectories with and without impact but alike in every other respect. Like all scientific forecasts, an ecological risk assessment requires appropriate uncertainty propagation. This can be accomplished by using a mixture of interval analysis and Monte Carlo simulation techniques.
Robust risk analysis: what can we be sure about?
Four major problems inhibit the routine use of Monte Carlo methods in risk and uncertainty analyses:
correlations and dependencies are often ignored,
input distributions are usually not available,
mathematical structure of the model is questionable, and
there are no practical methods to back-calculate solutions for constraints on probabilistic equations.
Most practitioners acknowledge the limitations induced by these problems, yet rarely employ sensitivity studies or other methods to assess their consequences. Recently developed techniques employing probability bounds analysis can be used to obtain (often optimal) bounds on risk calculations that do not require false or unjustified assumptions. The techniques therefore yield a comprehensive risk assessment without the need for precise information about inter-variable dependencies, and even without selection of specific input distributions for those variables. While Monte Carlo techniques for propagating model uncertainty are useful when it can be reduced to parametric uncertainty, a probability bounds approach is far more general and is useful whenever model uncertainty can be reduced to distributional uncertainty. Finally, the new techniques provide convenient solutions to probabilistic equations involving constraints such as are needed to compute the distribution of environmental concentrations that are sure to yield a distribution of doses specified by multiple constraints.
Probabilistic screening assessment of ecological risks
To be practically useful in regulatory settings, screening assessments must have two properties: they must be conservative and they must be easy to perform. Most analysts presume that making an assessment probabilistic makes it relatively more complex and costly and will therefore force it into a higher-tier assessment. But the requirements of conservatism and ease do not in themselves preclude an assessment from being probabilistic. We describe an example of a screening assessment for ecological risks to a biological population that is fully probabilistic. The Malthusian model of population growth (the simplest model possible) can be written as a first-passage problem and solved for the probability that the population declines to any given level within a given time horizon. The formulation uses five quantities: three describing the population, (i) growth rate, (ii) variation in the growth rate, and (iii) initial population size, and two determined by the interests of the analyst, (iv) the crossing threshold and (v) the time horizon. This formulation permits the evaluation of risks to a population from toxicological or other impacts whenever they can be expressed as changes in any of these quantities. There are only three quantities for which empirical estimates are needed, and these estimates can be made conservative with respect to temporal trending and uncertainty in a completely straightforward way. Therefore, a fully probabilistic assessment of ecological risks at the population level may be conducted in screening assessments rather than necessarily being relegated to higher tiers.
What to do about model uncertainty
Matthew Butcher and Scott Ferson
When model uncertainty is ignored in a risk assessment, analysts may be overly confident in and thus misled by the results obtained. Probabilistic risk assessments based on Monte Carlo methods typically propagate model uncertainty by randomly choosing among possible models, which treats it in the same fashion as parameter uncertainty. Like the duck-hunting statisticians who shot above and below a bird and declared a hit, this Monte Carlo approach averages the available models, and can produce an aggregate model supported by no theory whatever. The approach represents this uncertainty in the choice of models by their mixture and the resulting answers can be dramatically wrong. We propose an alternative method that can comprehensively represent and propagate model uncertainty based on the idea of the envelope of distributions corresponding to different models. The central feature of this strategy is that it does not average together mutually incompatible models. What it provides are bounds on the resulting distribution from the risk assessment. This method is comprehensive in that it can handle the uncertainty associated with all identifiable models. It cannot, however, anticipate the true surprise of completely unrecognized mechanisms, although it may be more forgiving in such circumstances. We describe software that implements this strategy and illustrate its use in numerical examples. We also contrast the strategy with other possible approaches to the problem, including Bayesian and other kinds of model averaging.
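A minimal sketch of the enveloping idea follows; the three "models" below are just hypothetical output distributions, not any assessment's actual candidates. Rather than mixing the rival outputs, the strategy keeps the pointwise lower and upper envelopes of their distribution functions as bounds on the answer.

import numpy as np
from scipy.stats import norm, lognorm, gamma

# Envelope of the output CDFs from rival models (Python sketch); hypothetical models.
models = [norm(loc=10, scale=2).cdf,
          lognorm(s=0.4, scale=9).cdf,
          gamma(a=9, scale=1.2).cdf]

x = np.linspace(0, 30, 601)
cdfs = np.array([m(x) for m in models])
lower_env = cdfs.min(axis=0)          # lower bound on the CDF
upper_env = cdfs.max(axis=0)          # upper bound on the CDF

i = np.searchsorted(x, 15.0)          # bound an exceedance probability
print(f"P(output > 15) lies between {1 - upper_env[i]:.3f} and {1 - lower_env[i]:.3f}")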
Cleanup goals in a probabilistic assessment
Scott Ferson, David S. Myers, and Matthew Butcher
The ecological risk from a contaminant is estimated with a risk equation involving the contaminant's environmental concentration and other factors. Because the intent of environmental remediation is to ensure that these risks are not intolerably large, we need some way to backcalculate from constraints on risk mandated by regulation or prudence to the allowable environmental concentration for the contaminant. It is now well known that the approach used in deterministic assessments of simply inverting the risk equation to compute the cleanup goal does not work in a probabilistic assessment. Several approaches that sidestep the underlying mathematical problems have been proposed by various authors, but all are strongly limited in their generality. We present for the first time simple and efficient methods to compute cleanup goals that satisfy multiple simultaneous criteria in the context of a probabilistic assessment. The approach can be used with multiple receptors and with arbitrarily many constraints on percentiles or moments of the target risk. This approach uses probability bounds analysis to characterize concentration distributions that satisfy the constraints. The calculations yield two kinds of bounds on concentration: a 'core' and 'shell'. If the concentration distribution is entirely inside the core, then the result surely obeys the prescribed constraints. If the concentration distribution is anywhere outside the shell, then the result certainly fails to comply with the prescribed constraints. If the concentration distribution is outside the core but inside the shell, then compliance must be determined by a forward calculation. Although the core is essentially comparable to the screening level familiar from deterministic assessments, the shell cannot be similarly analogized with an action level.
Population-level assessments with almost no data
Risks to biological populations are among the most relevant endpoints in environmental assessment. However, in most tiered assessment systems, ecological risks are usually considered only at the highest tiers, where monitoring costs are greatest and analyses are most complex. We show how a probabilistic population-level assessment can be conducted at the screening level where answers must be both conservative and cheap to obtain. The Malthusian model of population growth can be expressed as a first-passage problem and solved for the probability that the population declines to any given level within a given time horizon. The formulation uses five quantities, three of which describe the population: (i) average rate of population growth, (ii) variation in the growth rate, and (iii) initial population size. Two quantities represent the interests of the analyst: (iv) the crossing threshold and (v) the time horizon. This formulation permits the evaluation of additional risks to a population from toxicological impacts whenever they can be expressed as changes in any of these quantities. This means that population-level evaluation of ecological risks may be conducted in screening assessments rather than necessarily being relegated to higher tiers. There are of course many ecological phenomena that are ignored in this simple formulation, including age or spatial structure, density dependence, dispersal, windows of recruitment, trophic interactions, demographic stochasticity, and autocorrelation in environmental conditions. However, it is generally possible to bound the effects of such phenomena and make conservative calculations as needed for a screening assessment. Comparisons with more comprehensive population assessments that use extensive empirical information demonstrate that the results of the screening-level assessments bound the best estimates of population risks.
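A Monte Carlo sketch of the screening calculation is given below with hypothetical parameter values; the approach described here solves the corresponding first-passage problem analytically rather than by simulation.

import numpy as np

# Probability that a Malthusian population with lognormally varying annual
# growth falls below a threshold within a time horizon (Python sketch).
rng = np.random.default_rng(7)

n0        = 1000.0       # initial population size
mu, sigma = 0.01, 0.20   # mean and sd of the log annual growth rate
threshold = 250.0        # quasi-extinction threshold
horizon   = 50           # years
reps      = 20_000

log_n = np.full(reps, np.log(n0))
hit = np.zeros(reps, dtype=bool)
for _ in range(horizon):
    log_n += rng.normal(mu, sigma, size=reps)
    hit |= log_n <= np.log(threshold)

print(f"P(decline below {threshold:.0f} within {horizon} yr) is about {hit.mean():.3f}")

A toxicological impact expressed as, say, a reduction in mu or an inflation of sigma can be inserted directly, and pessimistic choices for the three empirical quantities keep the screening answer protective.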
Can we know the iceberg from its tip? Censored distributions in risk assessment
Matthew Butcher, Timothy Barry, and Scott Ferson
December 1999, Atlanta, Georgia
Society for Risk Analysis Annual Meeting
Data censoring occurs when empirical information about a quantity is limited to knowing only that its value is less than (or greater than) some threshold. Censored distributions commonly arise when observed chemical concentrations contain results that are reported as non-detects. There are several methods available for estimating the underlying distribution from censored data sets. These methods approximate the underlying distribution based on assumptions about its shape, but their performance degrades as the proportion of non-detects grows to dominate the sample. We describe a method for incorporating all the available data from a censored data set in a risk assessment, without imposing any assumptions on them. We show that even in the cases of highly censored distributions, the information contained in the detected values is still useful, because it is these values that contribute to the highest doses or exposures. In conjunction with the information known from the censored observations (number of samples and their detection limit), these data may be used to define a probability region, in place of an estimate of a single distribution. The method is sufficiently general to be applied to any source of measurement error, and therefore may be extended to the more difficult cases of multiple detection limits within a data set and the inclusion of uncertainty provided by the analytical laboratory for each concentration. Software for this method is described, and numerical examples involving radon exposures are provided.
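The bounding idea can be sketched without any distributional assumptions; the concentrations, detection limit, and counts below are hypothetical. Each non-detect is treated as the interval from zero to the detection limit, so the empirical distribution function is known only to lie between two step functions.

import numpy as np

# Bounds on the empirical CDF when non-detects are known only to lie below a
# detection limit (Python sketch); data are hypothetical.
detects = np.array([0.8, 1.3, 2.1, 4.6, 7.9])    # measured concentrations
n_nondetect = 7                                   # results reported as "< DL"
dl = 0.5                                          # detection limit

def ecdf(values, x):
    return np.mean(values[:, None] <= x, axis=0)

upper_data = np.concatenate([detects, np.zeros(n_nondetect)])     # non-detects -> 0
lower_data = np.concatenate([detects, np.full(n_nondetect, dl)])  # non-detects -> DL

x = np.array([0.25, 0.5, 1.0, 5.0])
print("x:                    ", x)
print("P(C <= x) is at least ", ecdf(lower_data, x))
print("and at most           ", ecdf(upper_data, x))

The two bounding functions coincide at and above the detection limit, which is why the detected values stay informative about the upper tail no matter how heavily the sample is censored.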
Ecosystem models for ecological risk analysis: from single species to communities
To justify regulatory and mitigation decisions, toxicologists are often asked the "so what?" questions that demand predictions about the population or even ecosystem response to contamination. RAMAS Ecotoxicology and RAMAS Ecosystem are microcomputer software packages specifically created to assist toxicologists in answering such questions by extrapolating effects on organisms observed in bioassays to their eventual population-level consequences. The software provides a shell from which users can construct their own models for projecting toxicity effects through the complex filters of demography, density dependence and ecological interactions in food chains. It allows various standard choices about low-dose response models (probit, etc.), which vital parameters are affected by the toxicant, the magnitudes and variabilities of these impacts, and species-specific life history descriptions. During the calculations, the software distinguishes between measurement error and stochastic variability. It forecasts the expected risks of population declines resulting from toxicity of the contaminant and provides estimates of the reliability of these expectations in the face of empirical uncertainty. This risk-analytic endpoint is a natural summary that integrates disparate impacts on biological functions over many organization levels. Where applicable, the software automatically performs consistency tests to check that the input conforms to statistical assumptions and is dimensionally coherent. Parameterizations have already been prepared for several vertebrate and invertebrate species for use in assessments of soil or sediment contamination.
Reliable calculation of probabilities
A variety of practical computational problems arise in risk and safety assessments, forensic statistics and decision analyses in which the probability of some event or proposition E is to be estimated from the probabilities of a finite list of related subevents or propositions F, G, H, ... . When the probabilities of the subevents are known only poorly from crude estimations, or the dependencies among the subevents are unknown, one cannot use traditional methods of fault/event tree analysis to estimate the probability of the top event. Representing probability estimates as interval ranges on [0,1] has been suggested as a way to address these sources of uncertainty. Interval operations are given that can be used to compute rigorous bounds on the probability of the top event which, it turns out, account for both imprecision about subevent probabilities and uncertainty about their dependencies at the same time.
Total exposure does not equal average daily exposure times days exposed
Suppose we are interested in estimating the probability distribution among exposed individuals of their total exposures over some time period. Using simple convolution (i.e., what @Risk or Crystal Ball does) with the distribution of toxicant concentrations and the distribution of individuals' bodyweights leads to an answer whose variance can be grossly overestimated if exposures are iterated over the time period. The reason is that this calculation assumes that the magnitude of every exposure event through time is the same for an individual. In other words, if a person is given a small exposure once, then he will always experience exposures of the same size over the entire time period. This approach fails to appreciate that the toxicant concentration encountered may be different for different exposure events. When time periods are long, as they are for lifetime exposures needed in cancer risk assessments, the resulting error can be substantial. Even if the temporal autocorrelation among sequential exposures for an individual is exceedingly high (e.g., 0.99), sufficiently many iterations will eventually overwhelm the autocorrelation. The simplistic calculation is appropriate only if exposure events are perfectly correlated (which seems unlikely in most practical cases). Nevertheless, this approach has been almost universally used ever since exposure assessments began to be conducted within the probabilistic framework. Several examples of recent assessments that have made this mistake will be reviewed. In most cases, the effect of the error is to very strongly overestimate the chance of large lifetime exposures. Simple approximations are described that can be used to improve the estimate for the distribution of total exposure but do not require a full description of the autocorrelation function or an elaborate simulation of event-to-event variation in exposures.
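A small numerical illustration of the difference follows; the event magnitudes are hypothetical lognormal draws and a year of daily exposure events is assumed.

import numpy as np

# Total exposure from iterated events: independent draws versus assuming each
# person's first exposure repeats every time (Python sketch); hypothetical data.
rng = np.random.default_rng(3)
n_events, n_people = 365, 10_000

events = rng.lognormal(mean=0.0, sigma=1.0, size=(n_people, n_events))

total_independent = events.sum(axis=1)        # a fresh draw for every event
total_correlated  = events[:, 0] * n_events   # first draw repeated n_events times

for name, total in [("independent events  ", total_independent),
                    ("perfectly correlated", total_correlated)]:
    print(f"{name}: 95th percentile of total exposure = {np.percentile(total, 95):,.0f}")

The upper percentiles of the perfectly correlated version come out far larger, which is the overestimation of the chance of large total exposures described above.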
Combining toxicant kinetics, population dynamics and trophic interactions
Scott Ferson, Matthew Spencer, W.-X. Wang, Nick Fisher and Lev Ginzburg
The chemistry and kinetics of an environmental toxicant and the population dynamics and food chain relationships among the species exposed to the toxicant are all fundamental phenomena for the questions we pose in ecological risk analysis. Yet the disciplines that address these phenomena have developed almost completely separately from each other. As a consequence it is not at all clear how we should combine models of toxicant kinetics, population dynamics and food chain relations. Models of toxicant kinetics usually assume that trophic interactions and population dynamics are so slow as to be constant. Models of population dynamics usually assume trophic interactions and especially toxicant kinetics to be so fast as to be equilibrial. In real-world systems, such assumptions are not always reasonable. When the assumptions must be abandoned and we have to consider all three phenomena simultaneously, what compromises are necessary and practical? We describe case studies involving heavy metal accumulation in marine and freshwater food chain systems of zooplankton, copepods and bivalves using the new software RAMAS Ecotoxicology and RAMAS Ecosystem.
Detecting rare event clusters when data are extremely sparse
Scott Ferson and Kwisung Hwang
Detection of clustering among rare events can be very important in recognizing engineering design flaws and cryptic common-mode or common-cause dependencies among rare events such as component failures. However, traditional statistical tests for clustering assume asymptotically large sample sizes. Simulation studies show that with small data sets the Type I error rates for traditional tests such as chi-square or the likelihood ratio can be much larger than nominal levels. Moreover, these tests are sensitive to a specific kind of deviation from randomness and may not provide the most appropriate measure of clustering in a particular circumstance. We describe five new statistical tests, implemented in a convenient software package, that can be used to detect clustering of rare events in structured environments. Because the formulations employ combinatorial expressions, they yield exact P-values and can therefore be used even when data sets are extremely small. These new statistical methods allow risk and safety analysts to detect clustering of rare events in data sets of the size usually encountered in practice. We characterize the relative statistical power of the tests under different kinds of clustering mechanisms and data set configurations. This work was supported by SBIR grant R44GM49521 from the National Institutes of Health.
Ecotoxicology: how to bring ecology and toxicology together
Matthew Spencer, Scott Ferson and Lev Ginzburg
Models of the kinetics of environmental toxicants generally assume that ecological processes such as population growth and predation can be ignored. Models that concentrate instead on the ecological processes typically assume that toxicant concentrations are constant through time and ignore toxicant kinetics. Clearly, the discipline of ecotoxicology will require us to consider both kinds of phenomena simultaneously, and to focus on their interaction through time. We describe a software shell running under Windows operating systems in which users can build and evaluate their own models to make probabilistic forecasts of the impacts of environmental toxicants on ecosystems consisting of food chains with several species, or single-species systems with age or stage structure. The shell of RAMAS Ecosystem uses state-of-the-art second-order Monte Carlo simulation and supports a rich array of risk-analytic summaries of the results. Users can choose among several different models of dose-response functions (such as Weibull, probit, logit, etc.), toxicant kinetics (first-order, constant, constant-environment), density dependence (ceiling, logistic, Ricker, etc.), and trophic interactions (Lotka-Volterra, ratio-dependent, Holling type II). The user can specify parameters as scalar numbers, intervals to represent measurement error (e.g., [10,15] mg per liter), or one of twenty named statistical distributions to represent temporal variability (e.g., lognormal(10,1) mg per liter). The software performs automatic unit conversions and checking of dimensional consistency. We illustrate its use with a study of the effect of an organophosphate on a food chain consisting of worms, sparrows and hawks. The results of the simulation appear to mirror what actually happens in such systems and would not have been expected from study of either the toxicant kinetics or the ecological dynamics separately.
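For reference, the quantal dose-response forms named above have standard textbook expressions; the parameters in the sketch below are hypothetical and the software's own parameterizations may differ:

    import numpy as np
    from scipy.stats import norm

    def probit(dose, a, b):
        # Fraction affected = Phi(a + b * log10(dose))
        return norm.cdf(a + b * np.log10(dose))

    def logit(dose, a, b):
        return 1.0 / (1.0 + np.exp(-(a + b * np.log(dose))))

    def weibull(dose, a, b):
        return 1.0 - np.exp(-np.exp(a + b * np.log(dose)))

    doses = np.array([0.1, 1.0, 10.0])   # mg per liter, hypothetical
    print(probit(doses, 0.0, 1.5))
    print(logit(doses, 0.0, 1.5))
    print(weibull(doses, -1.0, 1.0))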
Checking the computational integrity of probabilistic risk analyses
Many have argued that probabilistic risk analyses should not be widely required because they would simply be too difficult for regulators to review on a routine basis. I describe a series of elementary checks that can be employed with modest effort by a reviewer to test (1) dimensional soundness of the expressions used in the calculations, (2) feasibility of the correlation structure among the input variables, and (3) quantitative plausibility of the computed risk estimates. The first check uses dimension and units analysis such as is currently available in several software implementations. The second check tests for positive semi-definiteness of the matrix of correlations among the input distributions of random numbers. The third check uses both interval analysis on the ranges of the input variables and moment constraint propagation on their first and second moments (i.e. their means and variances) to determine whether the reported risks are possible within the stated assumptions of the model. With software to perform the checks themselves, a reviewer need only transcribe the input variables (both the characterized distributions and their units) and the mathematical expression(s). The automated analyses can reveal many if not most of the serious qualitative and quantitative errors likely to occur in probabilistic risk analyses, and should therefore relieve the reviewer of much of his or her burden. Several examples taken from the recent risk analysis literature will be used to illustrate how errors, ranging from the simple to the profound, could have been caught by applying these straightforward checks.
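The second check is straightforward to automate; a minimal sketch follows (the matrix is a hypothetical example of an infeasible set of pairwise correlations):

    import numpy as np

    def is_feasible_correlation(R, tol=1e-10):
        # A claimed correlation matrix must be symmetric, have a unit diagonal,
        # and be positive semi-definite (no eigenvalue materially below zero).
        R = np.asarray(R, dtype=float)
        return (np.allclose(R, R.T)
                and np.allclose(np.diag(R), 1.0)
                and np.linalg.eigvalsh(R).min() >= -tol)

    R_bad = [[ 1.0,  0.9, -0.9],
             [ 0.9,  1.0,  0.9],
             [-0.9,  0.9,  1.0]]
    print(is_feasible_correlation(R_bad))   # False: no three variables can have these correlations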
Cancer clusters: how to be sure you have one
Kwisung Hwang, Scott Ferson and Roger Grimson
How much should we worry about ten more cases of childhood cancers in a city than would be expected from its population size and the background incidence rate? Until now, it has been difficult or impossible to compute P-values for putative disease clusters when the data set is small (as it usually is for cancer). Traditional statistical tests for clustering assume asymptotically large sample sizes and are therefore not strictly applicable when data are sparse. Numerical studies show, in fact, that widely used tests such as chi-square routinely and strongly overestimate the evidence for clustering. Thus, they can cause more alarm than is warranted. We describe several new statistical methods, implemented in a convenient software package, that can be used to compute exact P-values for clustering. These new methods can be used whatever the size of the data set, and are especially useful when data sets are extremely small. They provide tools to public health researchers and epidemiologists that, for the first time, have wide applicability for detecting clustering and other epidemiologic patterns in data sets of the size usually encountered in practice. We describe the relative statistical power of these tests under different kinds of clustering mechanisms and data set configurations. This work was supported by SBIR grant R44GM49521 from the National Institutes of Health.
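To convey the flavor of an exact calculation, the sketch below computes an exact Poisson tail probability for a single count against a known background expectation; this is a generic illustration, not one of the specific tests in the package:

    from scipy.stats import poisson

    def exact_excess_pvalue(observed, expected):
        # Exact upper-tail P-value of observing at least `observed` cases when the
        # background expectation (from population size and incidence) is `expected`.
        return poisson.sf(observed - 1, expected)

    # Fourteen childhood cancer cases where roughly four were expected:
    print(exact_excess_pvalue(14, 4.2))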
Reliable probabilities when sample sizes are small and the model is uncertain
A risk analyst's knowledge can be incomplete in two ways. First, the probabilities of the inputs may be imprecisely known from statistical estimations, perhaps based on very small sample sizes. Second, dependency relationships among the inputs may be known imprecisely. Representing probability estimates as interval ranges on [0,1] has been suggested as a way to address the first source of imprecision. A suite of logical operators (AND, OR and NOT) defined by the classical Fréchet inequalities permit these probability intervals to be used in calculations that address the second source of imprecision, in many cases, in a best possible way. One can employ statistical confidence intervals to estimate the inputs, but doing so introduces the question of which confidence level should be used, and unravels the closure properties of this approach. A solution to this problem is to characterize each probability as a nested stack of intervals for all possible levels of statistical confidence, from a point estimate (0% confidence) to the entire unit interval (100% confidence). The implied calculation reduces by symmetry to a manageably simple computational problem. The resulting logical calculus can be implemented in software that allows users to compute comprehensive and often best-possible bounds on probabilities for logical functions of events. It yields reliable computations in the sense that answers are sure to enclose the true probabilities. The approach is illustrated with an example problem from forensic statistics. This work was supported by SBIR grant (R43ES06857) from the National Institutes of Health.
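A minimal sketch of the nested-interval idea, assuming Clopper-Pearson intervals are used at each confidence level (the abstract does not name a particular interval procedure):

    from scipy.stats import beta

    def clopper_pearson(k, n, conf):
        # Two-sided Clopper-Pearson interval for a binomial probability.
        a = 1.0 - conf
        lo = 0.0 if k == 0 else beta.ppf(a / 2, k, n - k + 1)
        hi = 1.0 if k == n else beta.ppf(1 - a / 2, k + 1, n - k)
        return lo, hi

    # Nested stack of intervals for one input probability (3 events in 20 trials),
    # from the point estimate at 0% confidence to the whole unit interval at 100%.
    k, n = 3, 20
    for conf in (0.0, 0.5, 0.9, 0.95, 0.99, 1.0):
        if conf == 0.0:
            print(conf, (k / n, k / n))
        elif conf == 1.0:
            print(conf, (0.0, 1.0))
        else:
            print(conf, clopper_pearson(k, n, conf))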
Six methods to use when information is limited (as it always is)
Scott Ferson and Dwayne R.J. Moore
Several different methods for uncertainty propagation might be used in a risk analysis when there is little empirical information. Because the methods have different underlying axioms and assumptions about their input, they may be appropriate in different analytical situations. We compared six methods:
Monte Carlo analysis using conventional assumptions,
probabilistic analysis based on the maximum entropy criterion,
second-order (Monte Carlo) analysis,
probability bounds analysis,
fuzzy arithmetic, and
interval or worst case analysis.
In applying these six methods to case studies, we found that they can yield substantially different results and that it may often be instructive to use more than one method in an analysis. Each approach has its theoretical advantages, but which is best in a particular circumstance depends on what and how much empirical information is available. Standard probabilistic approaches (methods 1, 2 and 3) cannot handle non-statistical uncertainty such as doubt about the proper form of the model. The bounding approaches (methods 4, 5 and 6) are not the most powerful choices when very detailed information is available. We rank the six methods in terms of their conservatism, ease of use, data requirements, necessary assumptions, and appropriateness for use in screening and higher-tier assessments. Although Monte Carlo analysis is often touted as the only practical approach for uncertainty propagation, we conclude that other methods may sometimes be more appropriate. This work was supported in part by SBIR grant R43ES06857 from the National Institutes of Health.
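A toy comparison of two of the approaches, using hypothetical inputs (methods 1 and 6 from the list above):

    import numpy as np

    rng = np.random.default_rng(1)

    # Dose = concentration * intake / bodyweight, each input known only as a range.
    conc = (1.0, 3.0)      # mg/kg
    intake = (0.05, 0.2)   # kg/day
    bw = (60.0, 90.0)      # kg

    # Method 6: interval (worst-case) analysis -- guaranteed to enclose, but wide.
    dose_lo = conc[0] * intake[0] / bw[1]
    dose_hi = conc[1] * intake[1] / bw[0]

    # Method 1: Monte Carlo with conventional assumptions (uniform, independent).
    n = 100_000
    doses = rng.uniform(*conc, n) * rng.uniform(*intake, n) / rng.uniform(*bw, n)

    print((dose_lo, dose_hi), np.quantile(doses, 0.95))

The Monte Carlo 95th percentile necessarily falls inside the worst-case interval; how much narrower it is reflects the extra, and possibly unwarranted, distributional assumptions.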
Quality assurance for an environmental risk assessment
Sunil Donald and Scott Ferson
Although a probabilistic risk assessment using Monte Carlo techniques is widely regarded as the most comprehensive kind of uncertainty analysis, in current practice there are usually several assumptions that accompany an assessment that have not been justified by empirical information but are made for the sake of mathematical convenience. For example, an analyst typically specifies precise statistical distributions to be used as inputs to the model even though evidence supporting the particular choices may be fairly sparse. In some cases, there is controversy about the form of the model itself, including the nature of the dependencies among the variables and even the mathematical functions that tie them together. Simple methods based on the notion of interval probabilities have recently been described that can be used to incorporate these kinds of uncertainty into the analysis in a way that does not confound an analyst's subjective uncertainty with the true stochasticity of the system. These methods allow one to directly compute bounds on the probabilities of adverse events and therefore provide estimates of the reliability of a probabilistic risk assessment. We performed a quality assurance assessment using these methods on a previously reported human health risk analysis from multiple-pathway exposures to chemical contamination at an industrial facility by a complex mixture of PAHs and dioxins. The results quantify the reliability of conclusions drawn from the analysis about exposures and the need for remediation.
Probability bounds analysis
Scott Ferson and Sunil Donald
Probabilistic risk assessments almost always seem to demand more information about the statistical distributions and dependencies of input variables than is available empirically. For instance, to use Monte Carlo simulation, one generally needs to specify the particular shapes and precise parameters for all the input variables even if relevant data are very sparse. Moreover, imperfect understanding of how common-cause or common-mode mechanisms induce correlations or complicated dependencies among the variables typically forces analysts to assume independence even if they suspect otherwise. Most practitioners acknowledge the limitations induced by these problems, yet rarely employ sensitivity studies or second-order simulations to assess their possible significance because they would be computationally prohibitive. Probability bounds analysis allows assessors to sidestep both uncertainty about the precise specifications of input variables and imperfect information about the correlation and dependency structure among the variables. However, this approach has not been accessible to risk analysts because of a lack of convenient software. We describe probability bounds analysis software for Windows operating systems. The software has a graphical interface that is convenient and natural for quantitative risk assessors and permits calculations that are vastly less expensive computationally than alternative approaches. For instance, a second-order Monte Carlo simulation that required several days to compute on a microcomputer can be replaced by a calculation with probability bounds analysis that takes only seconds. A probability distribution or bounds on a probability distribution (p-bounds) can be specified interactively according to what empirical information is available. For instance, the parameters of the distributions can be given as precise numbers or as intervals. If the shape of the underlying distribution is not known, but some parameters such as the mean, mode, variance, etc. can be specified (or given as intervals), the software will construct bounds that are guaranteed to enclose the distribution subject to whatever constraints are specified. The software supports the full complement of standard operators and functions, including +, -, *, /, <, <=, >, >=, equality comparison, variable assignment, maximum, minimum, power, exponential, natural and common logarithm, square root, integer part, sine, cosine, arc tangent, and the basic logical operations (and, or, not). Binary operations are computed according to what can be assumed about the dependence between the operands. The software supports operations (i) under an assumption of independence, (ii) assuming the operands are positively dependent, (iii) assuming the operands are negatively dependent, or (iv) without any assumption whatever about the dependence between the operands. All operations and functions are transparently supported for pure or mixed expressions involving scalars, intervals, probability distributions and p-bounds. Expressions are evaluated as they are entered and the resulting values automatically displayed graphically. The software also accepts and propagates units embedded with the numbers in expressions. It checks that dimensions conform in additive and comparison operations so that trying, for instance, to add meters and seconds generates an error message. The software will also perform conversions automatically when needed to interpret input and complete calculations.
For instance, it will correctly interpret inputs such as an interval with one endpoint given in feet and the other in meters, or a probability distribution with its mean given in kilograms and its standard deviation in grams, and it will produce the correct answer when a distribution in units of days is convolved with a distribution in units of hours. The software also supports basic programming constructs for conditional execution (if), looping (while), blocking (begin-end), string operations, user-defined functions and procedures, script execution, user-defined display windows, hypertext help, and an extensive library of examples that illustrate the properties of uncertainty arithmetic. This work was supported by funding from the National Institutes of Health, Southern California Edison and the Electric Power Research Institute.
Statistically detecting clustering in very sparse data sets
Kwisung Hwang, Scott Ferson and Roger Grimson
Cluster detection is considered to be essential in many environmental and epidemiological studies. Likewise, it can be very important in engineering studies for recognizing design flaws and cryptic common-mode or common-cause dependencies among rare events such as component failures. How can we tell whether a town's incidence of childhood cancers is significantly greater than the national average? How can we tell whether an airline's crash history is significantly worse than what one would expect given the industry's performance? The answers to these questions require some statistical method to detect clustering among rare events. However, traditional statistical tests for detecting clustering assume asymptotically large sample sizes and are therefore not applicable when data are sparse, as they generally are for rare events. In fact, simulation studies show that the Type I error rates for traditional tests such as chi-square or the log-likelihood ratio (G-test) are routinely much larger than their nominal levels when applied to small data sets. As a result, these tests can seriously overestimate the evidence for clustering and thus cause more alarm than is warranted. In other cases, traditional cluster tests can fail to detect clusters that can be shown by other methods to be statistically significant. Thus, the traditional approaches will provide an inefficient review of the available data. It is difficult to anticipate whether the traditional test will overestimate or underestimate the probability of clustering for a particular data set. Moreover, these tests are sensitive to a specific kind of deviation from randomness and may not provide the most appropriate measure of clustering from a specific mechanism. We describe eight new statistical methods, implemented in a convenient software package, that can be used to detect clustering of rare events in structured environments. Because the new tests employ exact methods based on combinatorial formulations, they yield exact P-values and cannot violate their nominal Type I error rates like the traditional tests do. As a result, the new tests are reliable whatever the size of the data set, and are especially useful when data sets are extremely small. By design, these tests are sensitive to different aspects of clusters and should be useful in discerning not only the fact of clustering but also something about the processes that generated the clustering. We characterize the relative statistical power of the new tests under different kinds of clustering mechanisms and data set configurations. The new statistical tests, along with several traditional and Monte Carlo tests for clustering, have been implemented in convenient graphical software for Windows operating systems which should be useful to risk and safety analysts for detecting clustering of rare events in data sets even when the available data are sparse. This work was supported by SBIR grant R44GM49521 from the National Institute of General Medical Sciences of the National Institutes of Health.
Why probability is insufficient for handling uncertainty in risk analysis
Risk analysts commonly distinguish variability resulting from heterogeneity or stochasticity from incertitude (partial ignorance) resulting from systematic measurement error or subjective uncertainty. In almost all risk assessments, both forms of uncertainty are present, although their respective magnitudes can vary widely. Most analysts now agree that variability and incertitude should be treated separately in risk analyses so that planners can design effective risk management strategies and future empirical efforts. My claim is much stronger: the two kinds of uncertainty require entirely different propagation calculi. Some approach that does not make unjustified assumptions about randomness and independence should be used to propagate incertitude, and probability theory (with appropriate assumptions) should be used to propagate variability. Using conventional probabilistic methods for incertitude leads to results that are clearly erroneous, at least in terms of the goals of risk analysis. As the number of variables increases, or the extrapolation time grows, the problem becomes more and more severe. It is possible to represent and manipulate both incertitude and variability simultaneously in a coherent analysis that does not confound the two forms of uncertainty and distinguishes what is known from what is assumed, although it does not appear that two-dimensional Monte Carlo is sufficient to accomplish this.
Fat chance and slim chance: distribution-free probabilistic risk analysis
Uncertainty analysis is essential in engineering, yet traditional methods sometimes require untenable assumptions. For instance, Monte Carlo simulation requires one to specify distribution shapes and dependencies precisely. A practical alternative, probability bounds analysis (PBA), allows one to make calculations without making unwarranted assumptions about distributions or dependence. With PBA, one can obtain quality assurance for probabilistic risk calculations. PBA extends both “robust Bayesian methods” from statistics and “automatically verified computations” from computer science.
Techniques for visualizing uncertainty and perception of risk
Scott Ferson and Lev Ginzburg
We describe risk imaging software technology that decomposes risk into two elements: (1) the frequency of each kind of harm associated with a hazard, and (2) the adversity of each of those harms. In most risk analyses, considerable uncertainty exists regarding the actual magnitude of both the frequency and the adversity of each potential harm. For example, small sample size and statistical bias contribute to uncertainty about the frequency. In the case of adversity, differences in opinion, measurement uncertainty and choice of dimensions lead to uncertainty. Because different kinds of harm are measured along often incompatible dimensions, we quantify adversity on a scale obtained by ordination. The method we have developed then bounds estimates of frequency and adversity using quantitative uncertainty techniques. Risk from the hazard is imaged as an area circumscribed by the uncertainty bounds characterizing all of the harms. We refer to this area as a risk profile of the hazard. Different individuals and groups respond to uncertainty and risk differently, and the risk profile can be further focused to reflect particular attitudes and visualize particular perceptions of the risk. To do this, we specify values for attitude parameters. These attitudes include the overall importance of uncertainty, the meaning of disagreements between measurements or opinions, and the meaning of absence of evidence. Different values specified for these attitude parameters result in different visualizations of risk as perceived by the individual or group. These alternative risk visualizations can be contrasted and compared across management choices or across different risk perceivers to facilitate communication and decision making. To illustrate the methodology, published clinical trial data are imaged. We discuss the withdrawal of Vioxx as a miscalculation arising from a misunderstanding of uncertainty perception by the FDA.
Computing cleanup targets is complicated (but possible) in probabilistic risk assessments
Scott Ferson, W. Troy Tucker and David Myers
The deterministic expressions traditionally used in risk assessments are being replaced with probabilistic expressions that represent natural heterogeneity in populations and variability among possible exposure scenarios. This important advance has led to certain complications. So long as only point estimates were used in quantitative risk assessments, it was straightforward to solve for the environmental concentration that, if not exceeded, would ensure that resulting doses were below tolerable limits. Now that the mathematical equations involve probability distributions, however, it is not possible to simply rearrange the equations to solve for these concentrations. Because determining cleanup goals for remediation is often the primary purpose of a risk assessment, convenient and reliable methods are needed to untangle the probabilistic calculations involved in modern risk assessments. We present simple and efficient methods to compute cleanup goals that satisfy multiple simultaneous criteria expressed in terms of limits on resultant exposures or risks in the context of a probabilistic assessment. The approach can be used with arbitrarily many constraints on percentiles of the target risk. The approach characterizes concentration distributions that satisfy constraints prescribed by the regulator and the calculations yield two kinds of bounds on concentration: a ‘core’ and ‘shell’. If the concentration distribution is entirely inside the core, then the result surely obeys the prescribed constraints. If the concentration distribution is anywhere outside the shell, then the result certainly fails to comply with the prescribed constraints. If the concentration distribution is outside the core but inside the shell, then compliance must be determined by a forward calculation. The core is essentially comparable to a screening level familiar from deterministic assessments, but the shell is not similarly analogous to an action level, but rather to a frontier delimiting clearly unacceptable distributions.
Basic statistics for imprecise data
Scott Ferson and Vladik Kreinovich
Risk analysts conscientiously distinguish between epistemic and aleatory uncertainty when the assessment demands it. This conscientiousness about the distinction should extend to data handling and statistical methods as well, but this has generally not been the case. For instance, guidance promulgated by national and international standards agencies for handling uncertainty in empirical data does not adequately distinguish between the two forms of uncertainty and tends to confound them together. Likewise, because most of the statistical methods developed over the last century have focused on assessing and projecting sampling uncertainty (which arises from having observed only a subset of a population) and neglect measurement uncertainty (which arises from imperfect mensuration, censoring or missing values), almost all of the statistical methods widely employed in risk analysis presume that individual measurements are infinitely precise, even though this is rarely justifiable. One reason for the ubiquity of the assumption is the difficulty of making the necessary calculations that account for, and distinguish between, variability and imprecision in empirical data. For instance, computing the variance of a collection of interval data is generally an NP-hard computational problem. Nevertheless, some progress has been made. Within the last five years, for example, workable algorithms for computing basic descriptive statistics for interval data have been developed. We describe software implementing these and other algorithms for computing basic descriptive and inferential statistics for interval data. We also review the growing literature on censoring, missing values, interval data, and robust statistical methods that carefully make the distinction between epistemic and aleatory uncertainty.
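A minimal sketch of two of these computations for a tiny, hypothetical interval data set (the enumeration below is exponential in the sample size, which is why the general problem is hard):

    import itertools
    import numpy as np

    data = [(2.1, 2.4), (3.0, 3.0), (1.8, 2.6), (2.9, 3.3)]   # interval measurements
    lo = np.array([a for a, b in data])
    hi = np.array([b for a, b in data])

    # The mean of interval data is itself an interval: [mean of lows, mean of highs].
    mean_bounds = (lo.mean(), hi.mean())

    # The largest possible sample variance is attained at a vertex of the box of
    # possible data sets (variance is convex in the data), so for small n we can
    # simply enumerate the 2**n vertices.
    max_var = max(np.var(np.where(bits, hi, lo), ddof=1)
                  for bits in itertools.product([0, 1], repeat=len(data)))

    print(mean_bounds, max_var)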
Modeling correlation and dependence among intervals
Scott Ferson and Vladik Kreinovich
This note introduces the notion of dependence among intervals to account for observed or theoretical constraints on the relationships among uncertain inputs in mathematical calculations. We define dependence as any restriction on the possible pairings of values within respective intervals and define nondependence as the degenerate case of no restrictions (which we carefully distinguish from independence in probability theory). Traditional interval calculations assume nondependence, but alternative assumptions are possible, including several which might be practical in engineering settings that would lead to tighter enclosures on arithmetic functions of intervals. We give best possible formulas for addition of intervals under several of these dependencies. We also suggest some potentially useful models of correlation, which are single-parameter families of dependencies, often ranging from the identity dependence (u=v) representing maximal correlation, through nondependence, to opposite dependence (1-u=v) representing maximally negative correlation.
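A minimal sketch of interval addition under three of these assumptions, using the parametrization in the abstract (u and v are the relative positions of the operands within their respective intervals); the numerical inputs are illustrative:

    def add_nondependence(a, b):
        # No restriction on how values pair up: [a1+b1, a2+b2].
        return a[0] + b[0], a[1] + b[1]

    def add_identity(a, b):
        # Identity dependence u = v; for addition the same endpoints are attained.
        return a[0] + b[0], a[1] + b[1]

    def add_opposite(a, b):
        # Opposite dependence 1 - u = v: one operand is high when the other is low.
        s1, s2 = a[0] + b[1], a[1] + b[0]
        return min(s1, s2), max(s1, s2)

    a, b = (1.0, 3.0), (10.0, 12.0)
    print(add_nondependence(a, b))   # (11.0, 15.0)
    print(add_opposite(a, b))        # (13.0, 13.0): a much tighter enclosure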
Calculating a soil EPC when sampling and exposures are non-random
Harlee Strauss, Scott Ferson, John Lortie, Richard McGrath, Susan Svirsky
The exposure point concentration (EPC) is intended to represent the average concentration of soil with which a receptor comes into contact. For EPA risk assessments, the EPC is the 95% upper confidence limit for the mean of data collected from random sampling. Important assumptions underlying the EPC are that sampling is random and that an individual comes into contact with the contaminated soil in a random way across the exposure area (or site) under evaluation. In practice, the random sampling assumption is often violated if no adjustment is made for samples that are collected in a non-random manner, such as programs intended to define areas of elevated concentrations. The random exposure assumption may be violated when considering recreational activities such as hunting, fishing, birdwatching, and hiking, where there may be preferential areas or paths used by individuals, or avoidance of areas that are difficult to access because of dense vegetation or obstructions. In order to maintain consistency with the assumptions of random sampling and random exposure within an exposure area, we developed a methodology that included area-weighting to account for non-random sampling patterns and use-weighting to account for preferential use of certain areas within an exposure area. This EPC methodology was applied to PCBs in the floodplain soil of the Housatonic River in Massachusetts. The use-weighting system was based on information about ecological habitat and thus potential accessibility and attractiveness of the area. The area-weighting system also required information about habitat types as they were indicative of the topographic and hydrologic factors that governed the deposition of PCBs in the floodplain. The 95% UCL of the mean was calculated from area- and use-weighted data by generalizing a bootstrap resampling procedure that accounts for skewness.
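A simplified sketch of a weighted bootstrap UCL; the concentrations and weights below are hypothetical, and the skewness adjustment described in the abstract is omitted:

    import numpy as np

    rng = np.random.default_rng(1)

    conc = np.array([0.5, 1.2, 0.8, 6.0, 0.9, 2.3, 0.4, 1.1])             # samples (mg/kg)
    weights = np.array([0.20, 0.05, 0.10, 0.05, 0.15, 0.15, 0.20, 0.10])  # area*use weights

    def weighted_bootstrap_ucl(x, w, n_boot=10_000, level=0.95):
        # Percentile bootstrap of the weighted mean: resample indices with
        # probabilities proportional to the weights and take the 95th percentile
        # of the resampled means.
        w = w / w.sum()
        n = len(x)
        means = np.array([x[rng.choice(n, size=n, replace=True, p=w)].mean()
                          for _ in range(n_boot)])
        return np.quantile(means, level)

    print(weighted_bootstrap_ucl(conc, weights))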
Propagating epistemic and aleatory uncertainty in nonlinear dynamic models
Youdong Lin, Mark A. Stadtherr, Scott Ferson and George F. Corliss
Engineering analysis and design problems frequently involve uncertain parameters and inputs. Propagating these uncertainties through a complex model to determine their effects on system states and outputs can be a challenging problem, especially for dynamic models. Lin and Stadtherr recently described the implementation of a new validating solver "VSPODE" for parametric ordinary differential equations (ODEs). Using this software, it is possible to obtain a Taylor-model representation (i.e., a Taylor polynomial function and an interval remainder bound) for the state variables and outputs in terms of the uncertain quantities. We give numerical examples to illustrate how these Taylor models can be used to propagate uncertainty about inputs through a system of nonlinear ODEs. We show that the approach can handle cases in which the uncertainty is represented by interval ranges, by probability distributions, or even by a set of possible cumulative probability distribution functions bounded by a pair of monotonically increasing functions (a "p-box"). The last case is useful when only partial information is available about the probability distributions, as is often the case when measurement error is non-negligible or in the early phases of engineering design when system features and properties have not yet been selected.
The tradeoff between measurement precision and sample size: should we get more or better data?
Scott Ferson and Vladik Kreinovich
One intuitively expects there to be a tradeoff between precision and sample size of measurements. For instance, one might spend a unit of additional measurement resources either to increase the number of samples or to improve the precision of the individual samples. Many practitioners apparently believe, however, that the tradeoff always favors increasing the number of samples over improving their precision. This belief is understandable given that most of the statistical literature of the last century has focused on assessing and projecting sampling uncertainty, and has in comparison neglected the problem of assessing and projecting measurement uncertainty. The belief is nevertheless mistaken, as can easily be shown by straightforward numerical examples. Consider, for example, the problem of conservatively estimating an exposure point concentration (EPC) from sparse and imprecise data. We might use an upper confidence limit on the mean to account for the sampling uncertainty associated with having made only a few measurements. This value is affected by the sample size, but, if the calculation also accounts for the imprecision of the values in a reasonable way, it is also affected by the measurement precision. Using recent algorithms to compute basic statistics for interval data sets, we consider the EPC and describe a nonlinear tradeoff between precision and sample size. This nonlinearity means the optimal investment of empirical resources between increasing sampling and improving precision depends on the quantitative details of the problem. We describe how an analyst can plan an optimal empirical design.
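One straightforward numerical sketch of the tradeoff follows; the conservative UCL below simply applies the usual t-based limit to the upper endpoints of the interval data, as an illustration rather than the interval-statistics algorithm itself, and the data are hypothetical:

    import numpy as np
    from scipy.stats import t

    def conservative_ucl(hi, level=0.95):
        # 95% UCL of the mean computed from the upper endpoints of interval data.
        hi = np.asarray(hi, dtype=float)
        n = len(hi)
        return hi.mean() + t.ppf(level, n - 1) * hi.std(ddof=1) / np.sqrt(n)

    rng = np.random.default_rng(1)
    true = rng.lognormal(0.0, 0.5, size=20)       # underlying concentrations

    ucl_few_imprecise = conservative_ucl(true[:10] * 1.5)   # 10 samples, +/- 50% error
    ucl_more_imprecise = conservative_ucl(true * 1.5)       # 20 samples, +/- 50% error
    ucl_few_precise = conservative_ucl(true[:10] * 1.05)    # 10 samples, +/- 5% error

    print(ucl_few_imprecise, ucl_more_imprecise, ucl_few_precise)

Whether the extra samples or the better precision lowers the conservative UCL more depends on the numbers, which is the nonlinearity of the tradeoff.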
Quasi-extinction risk in a wood frog (Rana sylvatica) metapopulation under environmental contamination by PCBs
W. Troy Tucker, Michael E. Thompson, John P. Lortie, Douglas J. Fort, Susan Svirsky, and Scott Ferson
A stochastic population model projecting wood frog population trends into the future and computing the risk of population decline was constructed using vital rate information from the literature and abundances derived from studies of 27 vernal pools in western Massachusetts. The model was age- and sex-structured with yearly time steps, and both demographic and environmental stochasticity were incorporated. The model was spatially explicit and frogs were allowed to disperse between ponds as a function of distance. The impact of PCBs on this wood frog population was assessed by comparing population projections from a base population model, i.e., a wood frog population not impacted by PCBs, with projections from population models that included the effect of PCBs on population vital rates. Both a non-declining and a declining base population were simulated. Parameterizations included the effect of PCBs on initial population size and combinations of low and high estimates of the proportion of malformed frogs that subsequently died or became reproductively unfit due to PCB exposure. The impacts of PCBs were derived from vernal pool and laboratory studies in the study area and from literature sources. Based upon this modeling effort, PCBs appear to increase the risk of population decline and quasi-extinction at all levels for wood frogs at the site.
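A toy sketch of the kind of stochastic projection involved: a two-age-class, female-only model with hypothetical vital rates, whereas the actual model is age- and sex-structured, spatially explicit and parameterized from the field and literature data described above:

    import numpy as np

    rng = np.random.default_rng(1)

    def decline_risk(pcb_survival_penalty=0.0, years=50, runs=2000, n0=(500.0, 100.0)):
        declines = 0
        for _ in range(runs):
            n = np.array(n0)
            for _ in range(years):
                f = rng.lognormal(np.log(3.0), 0.3)                               # fecundity
                sj = np.clip(rng.normal(0.15, 0.03) - pcb_survival_penalty, 0.0, 1.0)
                sa = np.clip(rng.normal(0.45, 0.05) - pcb_survival_penalty, 0.0, 1.0)
                n = np.array([[0.0, f], [sj, sa]]) @ n                            # one year
            declines += n.sum() < 0.5 * sum(n0)                                   # 50% decline
        return declines / runs

    print(decline_risk(0.0), decline_risk(0.05))   # baseline vs PCB-impacted risk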
Model validation, calibration and predictive capability
Scott Ferson and William L. Oberkampf
The theoretical and mathematical issues involved in model validation, calibration, and predictive capability touch the foundations of science. As the capability of computational simulation continues to dramatically increase, these issues must be addressed more directly and clearly in science and engineering. Sandia National Laboratories’ Validation Challenge Problems were designed to require most of the key issues be addressed. We describe an analysis of the thermal challenge problem that does not assume that the experimental data given in the problem are obtained without experimental measurement uncertainty. Instead, we assume a model for the experimental measurement uncertainty, along with the appropriate model constants and use the material characterization data provided to calibrate estimates for thermal conductivity and volumetric heat capacity as a regression function of temperature. The calibration includes the unit-to-unit variability of the samples tested, and reflects the limited number of samples. We then use the one-dimensional, unsteady, thermal model provided to make predictions for both the ensemble and accreditation experiments. As part of these predictions, we address the uncertainty introduced from the incompatibility between the assumption of constant thermal properties in the thermal model as compared to the temperature-dependent trend exposed in the material characterization data. In comparing the predictions and experimental data for the ensemble and accreditation cases, we construct validation metrics to quantify the disagreement between the two sets of data. Predictions are then made for the regulatory compliance condition, taking into account the temperature-dependent properties, the validation metric results, and the extrapolative nature of the prediction. Separate predictions are made for each of the three quantities of experimental data (low, medium, and high), as well as with and without experimental measurement uncertainty. Each of the predictions includes an assessment of satisfying the probabilistic requirement for the regulatory condition.
Communicating uncertainty about probability: equivalent binomial count
Jason O’Rawe, Michael Balch and Scott Ferson
Most strategies for the basic risk communication problem of expressing the probability of a well defined event presume the probability is precisely characterized as a real number. In practice, however, such probabilities can often only be estimated from data limited in abundance and precision. Likewise, risk analyses often yield imprecisely specified probabilities because of measurement error, small sample sizes, model uncertainty, and demographic uncertainty from estimating continuous variables from discrete data. Under the theory of confidence structures, the binomial probability of an event estimated from binary data with k successes out of n trials is associated with a particular structure that has the form of a p-box, i.e., bounds on a cumulative distribution function. When n is large, this structure approximates the beta distribution obtained by Bayesians under a binomial sampling model and Jeffreys prior, and asymptotically it approximates the scalar frequentist estimate k/n. But when n is small, it is imprecise and cannot be approximated by any single distribution because of demographic uncertainty. These confidence structures make apparent the importance of the size of n to the reliability of the estimate. If n is large, the probability estimate is more reliable than if n is small. When a risk analysis yields a result in the form of a precise distribution or imprecise p-box for an event’s probability, we can approximate the result with a confidence structure corresponding to a binomial probability estimated for some values of k and n. Thus we can characterize the event probability from the risk analysis with a terse, natural-language expression of the form “k out of n”, where k and n are nonnegative integers and 0≤k≤n. We call this the equivalent binomial count, and argue that it condenses both the probability and uncertainty about that probability into a form that psychometry suggests will be intelligible to humans. Gigerenzer calls such integer pairs “natural frequencies” because humans appear to natively understand their implications, including what the size of n says about the reliability of the probability estimate.
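One simple way to back out an equivalent binomial count from a probability estimate summarized by a mean and a variance is to moment-match a Jeffreys-style beta posterior; this is an assumed heuristic for illustration, not necessarily the construction used in the confidence-structure theory:

    def equivalent_binomial_count(m, v):
        # Match mean m and variance v to a Beta(k + 1/2, n - k + 1/2) posterior,
        # for which alpha + beta = n + 1, and report the implied "k out of n".
        s = m * (1.0 - m) / v - 1.0          # alpha + beta of the matching beta
        n = max(0, round(s - 1.0))
        k = min(n, max(0, round(m * s - 0.5)))
        return k, n

    # A fairly tight estimate maps to a large n; a vague one maps to a small n.
    print(equivalent_binomial_count(0.2, 0.0016))   # about "19 out of 98"
    print(equivalent_binomial_count(0.25, 0.01))    # about "4 out of 17"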
Title: Planning cleanup strategies in the context of a probabilistic assessment
Authors: W. Troy Tucker and Scott Ferson
Abstract: Determining the cleanup goals for remediation is often the primary purpose of a risk assessment. This chapter presents simple and efficient methods to compute cleanup goals that satisfy multiple simultaneous criteria in the context of a probabilistic assessment. The approach can be used with multiple receptors and with arbitrarily many constraints on percentiles of the target risk. This approach uses probability bounds analysis to characterize concentration distributions that satisfy the constraints. The calculations yield two kinds of bounds on concentration: a 'core' and 'shell'. If the concentration distribution is entirely inside the core, then the result surely obeys the prescribed constraints. If the concentration distribution is anywhere outside the shell, then the result certainly fails to comply with the prescribed constraints. If the concentration distribution is outside the core but inside the shell, then compliance must be determined by a forward calculation. Although the core is essentially comparable to the screening level familiar from deterministic assessments, the shell cannot be similarly analogized with an action level. Remediation strategies differ in their effects on the resulting concentration distribution. For instance, hotspot removal tends to truncate the high values of the distribution, whereas sparging may tend to reduce the distribution more evenly as a multiplicative transformation. The effects of different kinds of remediations are compared to estimate the effort required for each to attain a cleanup goal.
Title: Probabilistic risk assessments with practically no data
Authors: Scott Ferson and W. Troy Tucker
Abstract: Using probability bounds analysis, fully probabilistic risk assessments can be developed even when minimal specific information is available about the conditions at a site. The approach allows the analyst to avoid making restrictive assumptions that are not justified by what is known empirically, such as precise shapes of probability distributions or the details of the intervariable dependencies. But neither does it require extensive, extraordinary or specific expert judgment to develop the assessment. Instead, a bounding approach is employed that makes only weak assumptions and yields, in turn, conservative bounds on the distributions of endpoints of interest. Thus, a screening-level assessment can be done that is fully probabilistic in that it describes bounds on the probabilities of all possible outcomes. The approach is illustrated with applications to several assessments: a travel time problem for a hydrocarbon contaminant in groundwater, mercury exposure in wild mink, and total daily intake of dioxin for waterfowl.
Title: Extinction risk in a wood frog (Rana sylvatica) metapopulation under environmental contamination by PCBs
Authors: W. Troy Tucker, Michael E. Thompson, John P. Lortie, Douglas J. Fort, Susan Svirsky, and Scott Ferson
Abstract: A stochastic population model projecting wood frog population trends into the future and computing the risk of population decline was constructed using vital rate information from the literature and abundances derived from studies of 27 vernal pools in western Massachusetts. The model was age- and sex-structured with yearly time steps, and both demographic and environmental stochasticity were incorporated. The model was spatially explicit and frogs were allowed to disperse between ponds as a function of distance. The impact of PCBs on this wood frog population was assessed by comparing population projections from a base population model, i.e., a wood frog population not impacted by PCBs, with projections from population models that included the effect of PCBs on population vital rates. Both a non-declining and a declining base population were simulated. Parameterizations included the effect of PCBs on initial population size and combinations of low and high estimates of the proportion of malformed frogs that subsequently died or became reproductively unfit due to PCB exposure. The impacts of PCBs were derived from vernal pool and laboratory studies in the study area and from literature sources. Based upon this modeling effort, PCBs appear to increase the risk of population decline and extinction.
Title: Calculating the EPC when data are sparse and imprecise and both data and exposures are non-randomly sampled
Authors: Harlee Strauss and Scott Ferson
Abstract: The exposure point concentration (EPC) should represent the average concentration with which an individual comes into contact in an exposure area. The EPC is often modeled as the 95% upper confidence limit on the mean of concentration data. Underlying assumptions are that sampling is random and that individuals encounter concentrations in the exposure area randomly. Often, however, random sampling is violated by “hotspot” or other idiosyncratic sampling schemes. And exposures may also not be random if some parts of the exposure area are more attractive or accessible than others. We developed a methodology based on area-weighting to account for non-random sampling patterns and use-weighting to account for preferential use of certain areas within an exposure area. This methodology was applied in a human health risk assessment of PCBs in the floodplain soil of the Housatonic River in Massachusetts.
HOW TO SATISFY MULTIPLE CONSTRAINTS ON CLEANUP GOALS IN A PROBABILISTIC ASSESSMENT
David S. Myers and Scott Ferson
Determining the cleanup goals for remediation is often the primary purpose of a risk assessment. We present simple and efficient methods to compute cleanup goals that satisfy multiple simultaneous criteria in the context of a probabilistic assessment. The approach can be used with multiple receptors and with arbitrarily many constraints on percentiles or moments of the target risk. This approach uses probability bounds analysis to characterize concentration distributions that satisfy the constraints. The calculations yield two kinds of bounds on concentration: a ‘core’ and ‘shell’. If the concentration distribution is entirely inside the core, then the result surely obeys the prescribed constraints. If the concentration distribution is anywhere outside the shell, then the result certainly fails to comply with the prescribed constraints. If the concentration distribution is outside the core but inside the shell, then compliance must be determined by a forward calculation. Although the core is essentially comparable to the screening level familiar from deterministic assessments, the shell cannot be similarly analogized with an action level.
Combining toxicant kinetics, population dynamics and trophic interactions S. Ferson, M. Spencer, W.-X. Wang, N. Fisher and L. Ginzburg, Applied Biomathematics and State University of New York at Stony Brook. The chemistry and kinetics of an environmental toxicant and the population dynamics and food chain relationships among the species exposed to the toxicant are all fundamental phenomena for the questions we pose in ecological risk analysis. Yet the disciplines that address these phenomena have developed almost completely separately from each other. As a consequence it is not at all clear how we should combine models of toxicant kinetics, population dynamics and food chain relations. Models of toxicant kinetics usually assume that trophic interactions and population dynamics are so slow as to be constant. Models of population dynamics usually assume trophic interactions and especially toxicant kinetics to be so fast as to be equilibrial. In real-world systems, such assumptions are not always reasonable. When the assumptions must be abandoned and we have to consider all three phenomena simultaneously, what compromises are necessary and practical? We describe case studies involving heavy metal accumulation in marine and freshwater food chain systems of zooplankton, copepods and bivalves using the new software RAMAS Ecotoxicology.
Total Exposure Does Not Equal Average Daily Exposure Times Days Exposed S. Ferson, Applied Biomathematics. Suppose we are interested in estimating the probability distribution among exposed individuals of their total exposures over some time period. Using simple convolution (i.e., what @Risk or Crystal Ball does) with the distribution of toxicant concentrations and the distribution of individuals’ bodyweights leads to an answer whose variance can be grossly overestimated if exposures are iterated over the time period. The reason is that this calculation assumes that the magnitude of every exposure event through time is the same for an individual. In other words, if a person is given a small exposure once, then he will always experience exposures of the same size over the entire time period. This approach fails to appreciate that the toxicant concentration encountered may be different for different exposure events. When time periods are long, as they are for lifetime exposures needed in cancer risk assessments, the resulting error can be substantial. Even if the temporal autocorrelation among sequential exposures for an individual is exceedingly high (e.g., 0.99), sufficiently many iterations will eventually overwhelm the autocorrelation. The simplistic calculation is appropriate only if exposure events are perfectly correlated (which seems unlikely in most practical cases). Nevertheless, this approach has been almost universally used since exposure assessments have been conducted within the probabilistic framework. Several examples of recent assessments that have made this mistake will be reviewed. In most cases, the effect of the error is to very strongly overestimate the chance of large lifetime exposures. Simple approximations are described that can be used to improve the estimate for the distribution of total exposure but do not require a full description of the autocorrelation function or an elaborate simulation of event-to-event variation in exposures.
Ecotoxicology: how to bring ecology and toxicology together. Matthew Spencer and Scott Ferson, Applied Biomathematics, Lev Ginzburg, State University of New York. Models of the kinetics of environmental toxicants generally assume that ecological processes such as population growth and predation can be ignored. Models that concentrate instead on the ecological processes typically assume that toxicant concentrations are constant through time and ignore toxicant kinetics. Clearly, the discipline of ecotoxicology will require us to consider both kinds of phenomena simultaneously, and to focus on their interaction through time. We describe a software shell running under Windows operating systems in which users can build and evaluate their own models to make probabilistic forecasts of the impacts of environmental toxicants on ecosystems consisting of food chains with several species, or single-species systems with age or stage structure. The shell uses state-of-the-art second-order Monte Carlo simulation and supports a rich array of risk-analytic summaries of the results. Users can choose among several different models of dose-response functions (such as Weibull, probit, logit, etc.), toxicant kinetics (first-order, constant, constant-environment), density dependence (ceiling, logistic, Ricker, etc.), and trophic interactions (Lotka-Volterra, ratio-dependent, Holling type II). The user can specify parameters as scalar numbers, intervals to represent measurement error (e.g., [10,15] mg per liter), or one of twenty named statistical distributions to represent temporal variability (e.g., lognormal(10,1) mg per liter). The software performs automatic unit conversions and checking of dimensional consistency. We illustrate its use with a study of the effect of an organophosphate on a food chain consisting of worms, sparrows and hawks. The results of the simulation appear to mirror what actually happens in such systems and would not have been expected from study of either the toxicant kinetics or the ecological dynamics separately.
Surprising dynamics during ecological recovery after heavy metal contamination. Ferson, S.* Applied Biomathematics, Setauket, NY, Crutchfield, J., Carolina Power & Light, Raleigh, NC. The population of bluegill sunfish Lepomis macrochirus in part of a lake in North Carolina was decimated by toxicological and developmental effects of selenium leached from ash settling ponds. To forecast the potential recovery after cessation of heavy metal contamination, a demographic model was created for the bluegill population based on data collected from on-going biological monitoring at the lake. The model included density dependence which is known to be an important aspect of the life history of this species and used Monte Carlo methods to analyze the effect of natural environmental variability. The life history of the species revealed by analysis of the population model suggests that, if selenium poisoning were stopped, the population could recover to pre-impact abundances within two years. The increased abundance would be unevenly distributed among age groups, however. Following this increase in abundance, the biology predicts a population crash, especially among older year classes (which are prized by sportfishermen). This crash is due to the time-delayed effects of selenium on the population resulting from the strong non-linearity of density dependence in this species. The sharp increase in population size itself precipitates the crash. If this crash were not forecast in advance, its unanticipated occurrence could cause considerable consternation among managers, regulators and the interested public. This example shows that it can be important to predict ecological consequences to understand the nature and duration of biological recovery from toxicological insults. Without the understanding provided by the ecological analysis, the population decline would probably be completely misinterpreted as the failure of the mitigation program.
The Importance of Temporal Variability in Ecotoxicological Risk Assessment.
M. Spencer, S. Ferson, and L.R. Ginzburg. University of Sheffield, Applied Biomathematics, and the State University of New York at Stony Brook. We use a case study of the effects of PCBs on a marine food chain involving phytoplankton and copepods to illustrate the necessity of accounting for temporal variability in an ecotoxicological risk assessment. Bioaccumulation models usually assume parameter values and population sizes are constant through time. However, fluctuations in uptake rates, population growth, and predator-prey interactions can strongly affect an assessment's predictions. Risk assessment at the level of the ecosystem, where the influences of weather, successional processes and biotic interactions often dominate, will generally involve several temporal phenomena proceeding on different time scales. It is essential to account for these phenomena in some way if one hopes to obtain useful estimates of the risks to the ecosystem. In the phytoplankton-copepod case study, we used computer simulation to make probabilistic estimates of the risks of population decline as a consequence of the presence of PCBs. The simulation employed second-order Monte Carlo techniques to account for both measurement error and variability through time. We found that simply ignoring the time-dependent processes can produce grossly incorrect results. In some cases, there can be overestimates and, in others, there can be underestimates. Because of the inconsistency of the bias, risk assessments that ignore temporal variability will not even be useful for ranking relative risks. As is typical in risk assessments, precise empirical information about the uptake rates, toxicities, vital rates and the interactions among the species was not available for this case study. Nevertheless, we were able to compute conservative estimates of the final risks with a risk-analytic trick of treating temporal variability as measurement error. The resulting estimates would at least be useful for screening assessments. Our experience suggests this may be a generally useful strategy when empirical information is very sparse.
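A minimal sketch of the second-order structure of such a simulation; the toy growth model and parameter values below are illustrative, not the phytoplankton-copepod model of the case study:

    import numpy as np

    rng = np.random.default_rng(1)

    def one_trajectory(r_mean, toxic_effect, years=20, n0=1000.0):
        # Inner loop: temporal variability (environmental fluctuation in growth)
        # plus a toxicant-dependent reduction in the growth rate.
        n = n0
        for _ in range(years):
            n *= np.exp(rng.normal(r_mean, 0.15) - toxic_effect)
        return n

    # Outer loop: measurement error about the parameters themselves.
    outer = 2000
    declines = 0
    for _ in range(outer):
        r_mean = rng.normal(0.05, 0.02)          # uncertain mean growth rate
        toxic_effect = rng.uniform(0.01, 0.08)   # uncertain PCB effect
        declines += one_trajectory(r_mean, toxic_effect) < 500.0   # fell to half

    print(declines / outer)   # risk of decline, integrating both kinds of uncertainty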
The role of population- and ecosystem-level risk analysis
in addressing impacts under §316(b) of the Clean Water Act
Lev Ginzburg and Scott Ferson
Applied Biomathematics
Abstract.
We address three issues germane to a practical definition of “adverse impact” and the choice of technical tools needed to address it.
1 Variability and uncertainty. Natural populations and ecosystems fluctuate in time and space, partially due to interactions we understand, but substantially due to various factors which we cannot foresee. The variability of ecological patterns and processes, as well as our uncertainty about them, prevent us from making precise, deterministic estimates of the effects of environmental impacts. Because of this, comprehensive impact assessment requires a language of risk which recognizes variability and uncertainty, yet permits quantitative statements of what can be predicted. The emergence of this risk language has been an important development in applied ecology over the last two decades.
2 Cumulative attributable risk. For regulation to be fair, it should focus on the change in risk due to a particular impact. The risk that a population declines to, say, 50% of its current abundance in the next 50 years is sometimes substantial whether it is impacted by anthropogenic activity or not. Only the potential change in risk, not the risk itself, should be attributed to impact. On the other hand, for environmental protection to be effective, regulation must be expressed in terms of the cumulative risks suffered by a population or ecosystem from all the impacts and agents present, accumulated through time. (A minimal numerical sketch of attributable risk follows point 3 below.)
3 Food chains. Although assessments under §316(b) have generally considered only effects on single populations, public and regulatory concern with ecosystem-level responses has increased substantially. However, attitudes vary about precisely what an ecosystem-level response is and how it should be characterized. Currently, the “watershed” concept and “watershed-compatible” techniques organize much of the effort at EPA, although the ideas are perhaps more relevant for water chemistry than for ecology. Likewise, there are many competing ideas about “emergent properties” of ecosystems that are not reducible to properties of underlying populations. However, our scientific understanding of community and ecosystem ecology is quite limited, and quantitative predictions for complex systems, even in terms of risks, would require vastly more data than are usually available. How can we satisfy the desire for a holistic ecosystem approach when we are limited by our understanding of how whole systems actually work? As a practical matter, focusing on food chains may be a workable compromise between the population and ecosystem levels. Food chain models are general enough to permit assessment of indirect trophic effects and biomagnification, yet their data requirements are not so much larger than those of single-population assessments as to be unmanageable. In any case, food chains are likely to be at the frontier of what we can address by scientifically credible models for the next decade or two.
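The attributable-risk idea in point 2 can be made concrete with a minimal numerical sketch: the quantity attributed to the impact is the difference between the risk of decline with and without the impact, not the impacted risk itself. The random-walk abundance model and every parameter value below are illustrative assumptions only.

# Hypothetical sketch of attributable risk (point 2 above); parameters invented.
import numpy as np

rng = np.random.default_rng(3)

def decline_risk(mu, sigma=0.20, n_years=50, n_reps=5000, frac=0.5):
    """Monte Carlo probability that abundance ever falls to `frac` of its
    starting value within `n_years`, for a random-walk log-abundance model.
    logN is log abundance relative to the starting abundance (so it starts at 0)."""
    logN = np.cumsum(rng.normal(mu, sigma, (n_reps, n_years)), axis=1)
    return np.mean(logN.min(axis=1) < np.log(frac))

baseline = decline_risk(mu=0.00)     # no anthropogenic impact
impacted = decline_risk(mu=-0.01)    # impact assumed to depress mean growth slightly

print(f"baseline risk of 50% decline in 50 years: {baseline:.2f}")
print(f"impacted risk:                            {impacted:.2f}")
print(f"attributable risk (difference):           {impacted - baseline:.2f}")

With these assumed values the baseline risk is already substantial, as the abstract notes; only the increment over that baseline is fairly attributed to the impact.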
Risk assessment at the population or higher level requires the combination of several technical tools including demographic and trophic models, potentially with explicit age, stage or geographic structure, and methods for probabilistic uncertainty propagation, which are usually implemented with Monte Carlo simulation. Over the last fifteen years, EPRI has facilitated the use of these tools by sponsoring the development of the RAMAS® library.
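As an illustration of the food-chain focus in point 3 and the trophic models mentioned above, the following hypothetical sketch couples a prey population to a consumer whose growth is depressed by a contaminant that biomagnifies up the chain. It is not the RAMAS implementation, and every parameter value is invented.

# Minimal, hypothetical two-level food-chain sketch with biomagnification
# (not the RAMAS library): contaminant burden transfers from prey to consumer
# with a biomagnification factor and reduces the consumer's growth.
prey, cons = 500.0, 50.0          # assumed starting abundances
r_prey, K_prey = 1.0, 1000.0      # assumed prey growth rate and carrying capacity
attack, convert, mort = 0.002, 0.5, 0.4
conc_prey, bmf = 1.0, 3.0         # contaminant in prey and biomagnification factor
tox_slope = 0.05                  # assumed loss of consumer growth per unit burden

for t in range(30):
    eaten = attack * prey * cons
    burden = conc_prey * bmf                   # consumer body burden via the diet
    tox = max(0.0, 1.0 - tox_slope * burden)   # fraction of consumer growth retained
    prey = max(prey + r_prey * prey * (1 - prey / K_prey) - eaten, 0.0)
    cons = max(cons + convert * eaten * tox - mort * cons, 0.0)
    if t % 5 == 0:
        print(f"year {t:2d}: prey {prey:7.1f}, consumer {cons:6.1f}")

Even this toy model shows why the data burden of a food-chain assessment is only modestly larger than that of a single-population assessment: a handful of trophic parameters is added to the demographic ones.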
To justify regulatory and mitigation decisions, toxicologists are often asked the “so what?” questions that demand predictions about the population or even ecosystem response to contamination. RAMAS Ecotoxicology is microcomputer software specifically created to help toxicologists answer such questions by extrapolating effects on organisms observed in bioassays to their eventual population-level consequences. It provides a software shell from which users can construct their own models for projecting toxicity effects through the complex filters of demography, density dependence and ecological interactions in food chains. It allows various standard choices about low-dose response models (probit, etc.), which vital parameters are affected by the toxicant, the magnitudes and variabilities of these impacts, and species-specific life history descriptions. During the calculations, the software distinguishes between measurement error and stochastic variability. It forecasts the expected risks of population declines resulting from toxicity of the contaminant and provides estimates of the reliability of these expectations in the face of empirical uncertainty. This risk-analytic endpoint is a natural summary that integrates disparate impacts on biological functions over many organization levels. Where applicable, the software automatically performs consistency tests to check that the input conforms to statistical assumptions and is dimensionally coherent. Parameterizations have already been prepared for several vertebrate and invertebrate species for use in assessments of soil or sediment contamination.
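A minimal sketch of the kind of extrapolation described here, though not RAMAS Ecotoxicology's actual algorithms: a probit dose-response model reduces an organism-level survival rate, and a stochastic projection converts that reduction into a risk of population decline. The dose, vital rates and variability below are hypothetical.

# Hypothetical bioassay-to-population extrapolation (not the RAMAS algorithms):
# a probit dose-response lowers survival; Monte Carlo projection turns the
# lowered survival into a risk of population decline. All numbers invented.
import math
import numpy as np

rng = np.random.default_rng(5)

def probit_mortality(dose, ld50=10.0, slope=1.5):
    """Probit model: probability of death as a function of dose (log10 scale)."""
    z = slope * (math.log10(dose) - math.log10(ld50))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

dose = 5.0                      # assumed exposure concentration
s_control = 0.6                 # assumed annual survival without the toxicant
s_exposed = s_control * (1.0 - probit_mortality(dose))

f, n_years, n_reps, N0 = 1.1, 20, 2000, 500.0   # fecundity, horizon, replicates, start

def decline_risk(s):
    """Probability that abundance ever falls to half its starting value."""
    N = np.full(n_reps, N0)
    fell = np.zeros(n_reps, dtype=bool)
    for _ in range(n_years):
        lam = s * (1.0 + f * rng.lognormal(0.0, 0.1, n_reps))  # crude stochastic growth
        N = N * lam
        fell |= N < 0.5 * N0
    return fell.mean()

print(f"risk of 50% decline: control {decline_risk(s_control):.2f}, "
      f"exposed {decline_risk(s_exposed):.2f}")

The contrast between the control and exposed risks is the population-level “so what?” answer that a bioassay mortality fraction alone cannot supply.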