Generally, don't send emails about this collaboration. If your instinct is to send an email, make a posting instead on the Say page, at least in addition to an email. Posting to Say gives our discussions a permanence we need. But you don't need to announce suggested changes in the text. If you've got an idea or some language that ought to be in the paper, go ahead and put it in the paper yourself. That's what makes you an author. If it's finished prose, highlight it in blue for a while so people can see you added it. If it needs discussion and you want people to notice it and comment on it, use red highlighting. Any elements that are placeholders that should eventually be replaced or re-worded should be enclosed in <<double angle brackets>>, so we cannot miss them when editing. People will know you've modified a page if they've subscribed to it (which you do by toggling with the F key while looking at a page). Not everyone subscribes to every page, but everyone will see your new text when they reload the page. And you don't need to annotate or sign your changes because they'll be apparent in the revision history.
Whenever appropriate, annotate messages intended to spur discussion with the author and addressee using the Roman style, so write things like <<Scott says to Jack: What do you mean?>> or <<Scott says: I think we should do this.>>. This is especially useful if you intersperse elements in a discussion within the text. Use red highlighting to indicate something that needs attention from the group, and blue to indicate something you've changed that you think no one should object to or have to weigh in on. Other colors, including background colors, may also be used. Any text or material that is temporary should be enclosed in <<double angle brackets>>, so it cannot slip by us in editing later.
Title:
Algorithms for computing the transformed mean and transformed variance from a p-box
Outline:
Section I: Introduction
1) Importance/motivation for developing algorithms for computing the transformed mean and transformed variance from a p-box.
In probabilistic risk assessments, analysts usually assume statistical distributions for each of the variables in the risk expression. Of course, such assumptions are often difficult to justify when empirical data are sparse. For instance, it may be implausible that the distribution family is known empirically. This can be a serious problem for analysts because the choice of distribution family may strongly influence the results of the analysis (Bukowski et al. 1995).
To avoid committing to untenable assumptions about the input variables, many authors have suggested propagating the means and variances of the variables through the risk expressions as a crude form of risk analysis (e.g., Reckhow and Chapra 1983; Slob 1994; Wiwatanadate and Claycamp 2000). This approach is sometimes called first-order error analysis and is widely used for making risk estimates. In traditional probability theory, this approach is called moment propagation and is considered a fundamental part of mathematical statistics (see, for example, Wilks 1962).
Bukowski, J., L. Korn, and D. Wartenberg. 1995. Correlated inputs in quantitative risk assessments: the effects of distributional shape. Risk Analysis 15: 215-219.
Reckhow, K.H. and S.C. Chapra. 1983. Engineering Approaches for Lake Management. Volume 1: Data Analysis and Empirical Modeling. Butterworth Publishers, Boston.
Slob, W. 1994. Uncertainty analysis in multiplicative models. Risk Analysis 14: 571-576.
Wilks, S.S. 1962. Mathematical Statistics. John Wiley & Sons, New York.
Wiwatanadate, P. and H.G. Claycamp. 2000. Exact propagation of uncertainties in multiplicative models. Human and Ecological Risk Assessment 6: 355-368.
2) Mathematical formulations of the problems
Given a p-box B = pbox(u, d, shape, ml, mh, vl, vh) and a function t(x) whose derivatives t’(x) and t’’(x) each have constant sign on the interval [lower_x, upper_x], compute mean(t(B)) = [lower_E, upper_E] and var(t(B)) = [lower_V, upper_V].
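As a rough illustration of the objects in this formulation, the following Python sketch slices a p-box into n equal-probability intervals and computes bounds on mean(t(B)) that ignore the mean and variance information entirely. It assumes that t is increasing and that the p-box is bounded. The names discretize_pbox, naive_mean_bounds, left_quantile and right_quantile are hypothetical helpers introduced only for this example; they are not part of the pbox(...) constructor above.

```python
import math


def discretize_pbox(left_quantile, right_quantile, n):
    """Slice a p-box into n equal-probability intervals (outward-directed).

    left_quantile(p) and right_quantile(p) are hypothetical callables giving
    the smallest and largest possible value of the p-th quantile.
    """
    intervals = []
    for i in range(n):
        lo = left_quantile(i / n)          # smallest value slice i can take
        hi = right_quantile((i + 1) / n)   # largest value slice i can take
        intervals.append((lo, hi))
    return intervals


def naive_mean_bounds(intervals, t):
    """Bounds on the mean of t(x) for increasing t, using only the intervals."""
    n = len(intervals)
    lower = sum(t(lo) for lo, _ in intervals) / n
    upper = sum(t(hi) for _, hi in intervals) / n
    return lower, upper


# Example: a p-box enclosed between uniform(0, 1) and uniform(1, 2).
slices = discretize_pbox(lambda p: p, lambda p: 1 + p, 4)
lower_E, upper_E = naive_mean_bounds(slices, math.exp)
print(slices)
print(lower_E, upper_E)
```

Bounds obtained this way are generally wider than necessary because they do not use ml, mh, vl or vh; tightening them is the purpose of the algorithms outlined in the following sections.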
3) Existing approaches (if any) and their insufficiencies.
Section II: Algorithms for computing bounds on transformed mean
1) Algorithm description.
We have an algorithm that computes EXACT bounds on the transformed mean E given the following conditions:
- untransformed minimum m
- untransformed maximum M
- untransformed mean [lower_mu, upper_mu]
- a set of order statistics <c_i, [lower_o_i, upper_o_i]> (The intervals [lower_o_i, upper_o_i] should satisfy the no-subset property, i.e., no interval [lower_o_i, upper_o_i] is a subset of another interval [lower_o_j, upper_o_j]. The set of order statistics provided by p-box B’s attributes u and d satisfies this property.)
This means that the algorithm can compute EXACT bounds on the transformed mean E from p-box B’s attributes u, d, ml, and mh (m, M, and the pairs <c_i, [lower_o_i, upper_o_i]> can be derived from u and d, and lower_mu and upper_mu are exactly ml and mh, respectively). However, p-box B’s attributes shape, vl, and vh are not yet used.
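The no-subset property can be checked directly. The Python sketch below is one possible check, under the assumption that "subset" is read as strict nesting (one interval lying in the interior of another); that reading is an assumption about the intended definition, and the function name satisfies_no_subset is hypothetical.

```python
def satisfies_no_subset(intervals):
    """Return True if no interval lies strictly inside another one.

    intervals is a list of (lower, upper) pairs. Strict nesting means
    lower_j < lower_i and upper_i < upper_j for some i != j.
    """
    for i, (lo_i, hi_i) in enumerate(intervals):
        for j, (lo_j, hi_j) in enumerate(intervals):
            if i != j and lo_j < lo_i and hi_i < hi_j:
                return False
    return True


# Intervals whose lower and upper endpoints are both non-decreasing pass.
print(satisfies_no_subset([(0, 1), (0.5, 2), (1, 3)]))
# A strictly nested interval fails.
print(satisfies_no_subset([(0, 3), (1, 2)]))
```

Under this reading, any family of intervals whose lower endpoints and upper endpoints are both sorted in the same order passes the check, which is the situation for intervals read off the bounds u and d of a p-box.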
Find: transformed mean
Algorithm for computing the upper bound of the transformed mean
Step 1: Discretize the p-box B into a sequence of n intervals.
Step 2: If <<condition>>, we simply compute <<formula>>; otherwise we continue with the following steps.
Step 3: We check each k from 1 to n until we find the one satisfying <<condition>>.
Step 4: Once we find such a k, we compute a based on <<equation>>, i.e., <<equation>>.
Step 5: Once we find such k and a, we compute <<formula>>.
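The formulas for these steps are not shown in this draft, so the following Python sketch is only one way a search of this kind could look and should not be read as the authors' procedure. It assumes that t is convex and increasing, that the n intervals carry probability 1/n each, and that their lower and upper endpoints are both sorted in non-decreasing order. Under those assumptions the largest mean of t is reached by keeping values at their upper endpoints, moving the lowest intervals down to their lower endpoints one at a time, and placing a single value a strictly inside its interval so that the untransformed mean equals its upper bound. The name upper_transformed_mean and its parameters are hypothetical.

```python
def upper_transformed_mean(intervals, t, mean_hi):
    """Upper bound on the mean of t(x_i), assuming t is convex and increasing.

    intervals: list of (lower, upper) pairs, endpoints sorted non-decreasingly.
    mean_hi:   upper bound on the untransformed mean (mh in the p-box).
    """
    n = len(intervals)
    budget = n * mean_hi  # largest admissible value of sum(x_i)
    if sum(lo for lo, _ in intervals) > budget:
        raise ValueError("mean bound is inconsistent with the intervals")

    xs = [hi for _, hi in intervals]   # start with every x_i at its upper endpoint
    excess = sum(xs) - budget          # amount by which the mean bound is violated
    for k, (lo, hi) in enumerate(intervals):
        if excess <= 0:
            break                      # mean constraint already satisfied
        cut = min(hi - lo, excess)     # lower x_k as far as needed or allowed
        xs[k] = hi - cut               # the last k reached gets the interior value a
        excess -= cut
    return sum(t(x) for x in xs) / n


# Example with t(x) = x**2 on three intervals and mean at most 2.5.
print(upper_transformed_mean([(0, 2), (1, 3), (2, 4)], lambda x: x * x, 2.5))
```

Lowering the smallest values first loses the least, because the slope of a convex increasing t is smallest there.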
Algorithm for computing the lower bound of the transformed mean
Step 1: Discretize the p-box B into a sequence of n intervals.
Step 2: If <<condition>>, we simply compute <<formula>>; otherwise we continue with the following steps.
Step 3: We sort the 2n endpoints into a sequence <<notation>>. By this sorting, we also know, for each <<symbol>>,
- the index of the original interval of which <<symbol>> is one of the endpoints (we denote this index as <<symbol>>);
- whether <<symbol>> comes from a lower endpoint or an upper endpoint (we denote <<symbol>> to indicate that <<symbol>> comes from a lower endpoint and <<symbol>> to indicate that <<symbol>> comes from an upper endpoint).
Step 4: We compute <<symbol>> as <<formula>>, and <<symbol>> as 0.
Step 5: For each k from 1 to 2n-1, we compute <<symbol>> and <<symbol>> as follows:
- if <<condition>> (i.e., <<symbol>> comes from a lower endpoint), we compute <<formula>>;
- if <<condition>> (i.e., <<symbol>> comes from an upper endpoint), we compute <<formula>>.
We do this computation until we find the k satisfying <<condition>>.
Step 6: Once we find such a k, we select the values of the <<symbol>>’s as follows:
- if <<condition>>, we select <<value>>;
- if <<condition>>, we select <<value>>.
We count the number of <<symbol>>’s in the first case as <<symbol>>, and the number of <<symbol>>’s in the second case as <<symbol>>. For the rest of the <<symbol>>’s, we select their values as <<formula>>.
Step 7: We compute <<formula>> based on the selected values of the <<symbol>>’s.
2) Proof of algorithm’s correctness
3) Analysis of algorithm’s time complexity (in terms of big-O)
Section III: Algorithms for computing bounds on transformed variance
1) Algorithm description.
We have two p-boxized algorithms for computing bounds on the transformed variance V; the final result is the intersection of the results obtained from these two algorithms.
a) The first algorithm calculates bounds on transformed variance based on the conditions
- untransformed minimum m
- untransformed maximum M
- untransformed mean [lower_mu, upper_mu]
- untransformed standard deviation sigma [lower_sigma, upper_sigma]
- and also the computed interval of transformed mean as [lower_E, upper_E]
Here, m and M could be derived from p-box B’s u and d; lower_mu and upper_mu are exactly B’s ml and mh respectively; and lower_sigma and upper_sigma could be derived from B’s vl and vh respectively.
b) The second algorithm calculates bounds on transformed variance based on the conditions
- untransformed minimum m
- untransformed maximum M
- untransformed mean [lower_mu, upper_mu]
- a set of order statistics <c_i, [lower_o_i, upper_o_i]>
- and also the computed interval of transformed mean as [lower_E, upper_E]
Here, m, M, and the pairs <c_i, [lower_o_i, upper_o_i]> can be derived from p-box B’s u and d; lower_mu and upper_mu are exactly B’s ml and mh, respectively.
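Two small pieces of bookkeeping in this section can be sketched in Python: converting the variance bounds vl and vh into standard deviation bounds, and intersecting the intervals returned by the two algorithms. Representing intervals as (lower, upper) tuples and the names sigma_bounds and intersect are assumptions made for this example only.

```python
import math


def sigma_bounds(vl, vh):
    """Standard deviation bounds from variance bounds (square root is monotone)."""
    return (math.sqrt(vl), math.sqrt(vh))


def intersect(a, b):
    """Intersection of two intervals given as (lower, upper) tuples."""
    lower = max(a[0], b[0])
    upper = min(a[1], b[1])
    if lower > upper:
        # Two valid enclosures of the same quantity must overlap.
        raise ValueError("intervals do not overlap")
    return (lower, upper)


# Example: combine two hypothetical enclosures of the transformed variance.
print(sigma_bounds(4.0, 9.0))
print(intersect((0.5, 2.0), (1.0, 3.0)))
```

Because each algorithm returns an interval guaranteed to contain the transformed variance, their intersection is also guaranteed to contain it and is never wider than either input.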
2) Proof of algorithm’s correctness
3) Analysis of algorithm’s time complexity (in terms of big-O)
Section IV: Numerical experiments
Algorithms are implemented in R.
Numerical experiments will be performed; results will be illustrated and execution times will be measured to demonstrate the performance of the developed algorithms.
Section V: Conclusion