Automatic uncertainty analysis and Quiet Doubt

Our grand intention is to make uncertainty analysis useful for people who don't know anything about uncertainty analysis. There are six main features of software that would help to do this.

(1) Overloading of functions and operators for uncertain arguments,

(2) Nativization of intervals,

(3) Quiet doubt,

(4) “About” modifiers,

(5) Smart dependence tracking (contra myth #11), and

(6) Help in understanding/interpreting output.

Are there any other features? We might say that these six features implement "automatic uncertainty analysis", but I'm itching for a catchier phrase!

The first feature is the most critical, of course. The idea is that users shouldn't have to change their formulas at all to be able to operate on uncertain numbers. That means that +, -, *, /, ^, min, max, abs, sqrt, exp, log, etc. all work seamlessly on uncertain numbers and scalar numbers alike. It seems that we've worked out the details so that (1) is now done, at least conceptually. I guess we still need to do the grunt work of porting the code so that each function/operation is actually implemented. Which of the functions still need that grunt work?
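To make the overloading idea concrete, here is a minimal sketch in Python rather than in the add-in's own code. It assumes the simplest kind of uncertain number, a plain interval, and the names Interval, lift and profit are made up for the illustration. The point is only that the formula inside profit is written once and works unchanged on scalars and on intervals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical stand-in for an uncertain number: a closed interval [lo, hi]."""
    lo: float
    hi: float

    @staticmethod
    def lift(x):
        # Treat a plain scalar as a degenerate interval [x, x].
        return x if isinstance(x, Interval) else Interval(x, x)

    def __add__(self, other):
        o = Interval.lift(other)
        return Interval(self.lo + o.lo, self.hi + o.hi)

    __radd__ = __add__  # scalar + interval

    def __sub__(self, other):
        o = Interval.lift(other)
        return Interval(self.lo - o.hi, self.hi - o.lo)

    def __rsub__(self, other):
        # scalar - interval
        return Interval.lift(other) - self

    def __mul__(self, other):
        o = Interval.lift(other)
        products = (self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi)
        return Interval(min(products), max(products))

    __rmul__ = __mul__  # scalar * interval


def profit(price, cost, units):
    # The user's formula, untouched: it never mentions intervals.
    return (price - cost) * units


scalar_answer = profit(10, 4, 3)                   # ordinary arithmetic
uncertain_answer = profit(Interval(9, 11), 4, 3)   # same formula, interval input
```

Division, powers, min, max, sqrt and the rest would follow the same pattern; they are left out here only to keep the sketch short.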

The idea of feature (2) is that users should be able to express intervals (as two endpoints either explicitly or as expressions, plus-minus intervals, or sigdig intervals) as conveniently as possible. This feature is almost fully done now, given James' recent breakthrough on nativizing intervals. Of course, I'd like to be able to use "[1,2]" inside an expression too, rather than needing "i(1,2)", but I guess I can't have eggs in my beer.
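As a rough illustration of the three input styles, here is a sketch of a parser in Python. It is not how the add-in does it. It assumes that a sigdig interval means plus or minus half a unit in the last digit the user typed, it handles numeric endpoints only (not expressions), and the function names are hypothetical.

```python
import re
from decimal import Decimal


def sigdig_interval(text):
    """Interval implied by the digits as typed (assumed: half a unit in the last place)."""
    d = Decimal(text)
    half_unit = Decimal(1).scaleb(d.as_tuple().exponent) / 2
    return (d - half_unit, d + half_unit)


def parse_uncertain(text):
    """Return (lo, hi) for '[a, b]', 'm +- h', 'm ± h', or a bare number."""
    s = text.strip()

    # Explicit endpoints: [1, 2]
    m = re.fullmatch(r"\[\s*(.+?)\s*,\s*(.+?)\s*\]", s)
    if m:
        return (Decimal(m.group(1)), Decimal(m.group(2)))

    # Plus-minus form: 3 +- 0.5 or 3 ± 0.5
    m = re.fullmatch(r"(.+?)\s*(?:\+-|±)\s*(.+)", s)
    if m:
        mid, half = Decimal(m.group(1)), Decimal(m.group(2))
        return (mid - half, mid + half)

    # Otherwise read the significant digits of the scalar itself.
    return sigdig_interval(s)


endpoints = parse_uncertain("[1,2]")
plus_minus = parse_uncertain("3 +- 0.5")
from_digits = parse_uncertain("12.3")   # bounds 12.25 and 12.35 under the stated assumption
```

Decimal is used so that the digits are read exactly as typed; a binary float would lose the distinction between "12.3" and "12.30".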

The idea of quiet doubt (3) is that the Excel add-in should do the whole uncertainty analysis without any special inputs from the user at all. You'll recall we had this loose idea that the add-in should enable people who don't know squat about uncertainty analysis (and didn't do anything special to their spreadsheet) to nevertheless be able to ask Excel how sure it is about their output, or that maybe Excel would even volunteer this information somehow. Just as Word automatically corrects your spelling as you type, the add-in would automatically handle uncertainty propagation under its breath, so to speak. Quiet doubt (trademark!) could work by having the software capture every Excel input, including each scalar, and establish its uncertainty. It would be as though Excel placed sigdig brackets around every scalar a user entered and went ahead and propagated them automatically. Answers could be expressed as intervals or, perhaps better, as scalars with the correct number of significant digits. If, as might commonly happen, there are zero significant digits in the output, I suppose we could display an interval or a modified scalar (see below). We haven't heretofore developed the add-in in the direction of quiet doubt, or even thought about it, but maybe it actually wouldn't be too hard to support given where we are. It wouldn't necessarily mean that every input would have to become a full-blown uncertain number. The digits from scalar numeric inputs could be read on the fly as needed. I guess every formula would have to be evaluated by UC though, obviously. But, again, the results would be computed on the fly, and the display in the function line wouldn't have to change at all, right? I'd like us to think about this. Would it be useful? Can it be done? How much would it take to do this?
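Here is a small sketch of what quiet doubt might amount to, in Python rather than inside Excel. Everything in it is an assumption made for illustration: the sigdig bracket is taken to be half a unit in the last typed digit, and an output digit counts as significant only if both endpoints of the result agree when rounded to that many digits. The function names are hypothetical.

```python
from decimal import Context, Decimal


def implied(text):
    """Sigdig bracket around a scalar as typed (assumed: half a unit in the last place)."""
    d = Decimal(text)
    half_unit = Decimal(1).scaleb(d.as_tuple().exponent) / 2
    return (d - half_unit, d + half_unit)


def sub(a, b):
    return (a[0] - b[1], a[1] - b[0])


def mul(a, b):
    products = [x * y for x in a for y in b]
    return (min(products), max(products))


def justified_digits(lo, hi, max_digits=12):
    """Count leading significant digits on which both endpoints agree after rounding."""
    n = 0
    while n < max_digits and Context(prec=n + 1).plus(lo) == Context(prec=n + 1).plus(hi):
        n += 1
    return n


def display(result):
    """Show a scalar with only the justified digits, or the interval if there are none."""
    lo, hi = result
    n = justified_digits(lo, hi)
    if n == 0:
        return f"[{lo}, {hi}]"
    return str(Context(prec=n).plus(lo))


# The user typed 2.50 and 4.00 and the formula =A1*B1; nothing else was asked of them.
product = mul(implied("2.50"), implied("4.00"))
shown = display(product)

# The user typed 5 and 4 and the formula =A1-B1; no digit of the answer is justified.
difference = sub(implied("5"), implied("4"))
shown_as_interval = display(difference)
```

The second case is the zero-significant-digits situation mentioned above, where the sketch falls back to showing the interval.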

Given James' breakthrough with native intervals, it may be the right time to ask again about the feasibility of the modifiers such as "about", "around", "near", etc. Are we going to require parentheses, so the user types "about(6)"? It'd be nice not to have to. Maybe we're asking too much of the Excel cell input parser. Maybe we should be talking about a wizard or pop-up dialog that gets a value from a user. Certainly in the dialog you could have a user type in almost any string at all, or even type some stuff and check some boxes. When he clicks OK, the wizard formats something with proper syntax to put into the cell. Yan originally started with such a dialog, but it fell away somewhere. Maybe we should go back to that. What would the dialog look like? It'd certainly have an immediate-mode picture of the value you were creating. That picture could be generated on the fly, but I'm not sure where we'd store the user's dialog inputs. Would they have to be appended to the literal string or something? I don't think you can go back from an interval to a modified expression, so I think you certainly have to save what the user originally specified. We should think about this issue, both the feasibility of cell-based modifiers typed in directly by the user, and the feasibility of a fancier and more flexible dialog for specifying modifiers and other uncertain number formats. I guess the dialog idea kind of goes against the plan to be as unobtrusive as possible, but it might have advantages too.
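To show that both spellings could in principle be accepted, and that the user's original text can ride along with the computed interval, here is a sketch in Python. It says nothing about what the Excel cell parser can actually do. The widths attached to each modifier are invented placeholders, not our definitions, and the names ModifiedValue and parse_modifier are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Placeholder relative half-widths, made up purely so the sketch runs.
HYPOTHETICAL_SPREAD = {"about": 0.10, "around": 0.20, "near": 0.05}


@dataclass(frozen=True)
class ModifiedValue:
    source: str     # exactly what the user typed, kept so it can be shown or edited later
    modifier: str
    value: float
    lo: float
    hi: float


# Accepts "about 6" as well as "about(6)".
_PATTERN = re.compile(
    r"\s*(about|around|near)\s*(?:\(\s*([^()]+?)\s*\)|([^()\s]+))\s*",
    re.IGNORECASE,
)


def parse_modifier(text: str) -> Optional[ModifiedValue]:
    """Return a ModifiedValue, or None if the text is not a modified number."""
    m = _PATTERN.fullmatch(text)
    if m is None:
        return None
    modifier = m.group(1).lower()
    try:
        value = float(m.group(2) or m.group(3))
    except ValueError:
        return None
    half = abs(value) * HYPOTHETICAL_SPREAD[modifier]
    return ModifiedValue(text, modifier, value, value - half, value + half)


bare = parse_modifier("about 6")
with_parens = parse_modifier("about(6)")
```

Because the record keeps the source string, there is never a need to work backwards from the interval to the modified expression.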

James has been implementing the automatic handling of dependence with smart dependence tracking (5). If we could find situations in which it would be proper to infer positive dependence (which spans the range between independence and perfect dependence), I would be thrilled. That would be way cool. Naively, it seems that somehow conflating independence and perfect dependence would be how 'positive' dependence could be inferred. Our factory-setting presumption that new uncertain constructions are independent of each other should be subject to revocation by the user. Maybe on the ribbon or someplace, the user should be able to uncheck a box that says 'Assume independence in new constructs'. If he does so, the software reverts to Risk Calc's behavior of using Fréchet for everything unless specifically told to do otherwise, which would be by the user explicitly calling for some dependence with syntax markers around the operators, or by the work of the BOB conventions.
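The difference between these dependence assumptions is easiest to see in the simplest possible case, the probability that two events both occur. The sketch below is Python and is only an analogy for what the tracking does with full uncertain numbers; the function name and the option strings are hypothetical. It uses the standard Fréchet bounds, and it treats positive dependence as the range from the independent answer up to the perfectly dependent one.

```python
def conjunction(a, b, dependence="frechet"):
    """Bounds (lo, hi) on P(A and B) given P(A)=a, P(B)=b and a dependence assumption."""
    independent = a * b
    perfect = min(a, b)
    if dependence == "independent":
        return (independent, independent)
    if dependence == "perfect":
        return (perfect, perfect)
    if dependence == "positive":
        # Anything between independence and perfect dependence.
        return (independent, perfect)
    if dependence == "frechet":
        # No assumption at all about dependence.
        return (max(0.0, a + b - 1.0), perfect)
    raise ValueError(f"unknown dependence assumption: {dependence}")


assume_independence = True   # stands in for the proposed ribbon checkbox
default = "independent" if assume_independence else "frechet"
answer = conjunction(0.5, 0.5, default)
```

With both probabilities at 0.5, independence gives the single value 0.25, positive dependence gives the range from 0.25 to 0.5, and Fréchet gives the range from 0 to 0.5, so inferring positive dependence where it is warranted would halve the width of the answer in this case.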

We haven't really even thought about (6), but the exercise of writing the manuscript made it clear to me that we'll need to do something in this direction, or risk befuddling and scaring the bejesus out of users when they start getting hit in the face with intervals or, even more inexplicably, p-boxes. I suppose the output sigdigs scheme mentioned above would be a start. Or we could go the output modifier route. We might use compound modifiers. For instance, the output C6 in the manuscript might be described as "Variable around 20, ranging from 1 to 42, with mean between 18 and 23". Maybe the modifiers could be user-configurable, so we might also include the median or distribution family if the user asked us to. I'm not sure what else might help a naive user, but it's probably a lot more important than we've previously thought. Jack should tell us how to explain these beasts to users.
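A compound modifier like the one quoted for C6 could be assembled from a handful of summary numbers. The sketch below is Python, with a hypothetical function name and argument list. It assumes the central value, range and mean bounds have already been extracted from the uncertain number, which is the hard part and is not shown, and the optional extras stand in for the user-configurable pieces.

```python
def describe(central, lo, hi, mean_lo, mean_hi, extras=None):
    """Build a plain-language description of an uncertain output.

    extras is an optional mapping such as {"median": (low, high)} for the
    additional summaries a user has asked to see.
    """
    parts = [
        f"Variable around {central:g}",
        f"ranging from {lo:g} to {hi:g}",
        f"with mean between {mean_lo:g} and {mean_hi:g}",
    ]
    for name, (extra_lo, extra_hi) in (extras or {}).items():
        parts.append(f"{name} between {extra_lo:g} and {extra_hi:g}")
    return ", ".join(parts)


c6_text = describe(20, 1, 42, 18, 23)
```

Called with the C6 numbers, this reproduces the sentence quoted above word for word.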
