Post date: Feb 23, 2014 9:46:53 PM
Mohamed, Sébastien:
I still haven't received Mohamed's file to compute the Birnbaum and JRI measures. Mohamed, you said a few days ago you'd sent it before and that you'd resend it "immediately". Maybe we're having an email problem?
Okay, the email I sent you 9 days ago about trouble evaluating the Birnbaum importance measures may have been exaggerated. I've re-evaluated them and depicted the results below. Note that they are all strictly positive small numbers with moderate uncertainty, and interesting differences which perhaps can be made into a story for the paper (cf. the green and red plots). Note, however, that these results are based on k,n numbers for the original c-boxes that I just made up. I don't have that file that Mohamed used specifying the values he wanted to use.
These calculations assume that interval Boolean expressions can be solved by merely evaluating the corners of the input hyperintervals. This is certainly not generally true for arbitrary arithmetic expressions, and I had not expected it to be true for Boolean expressions, but it seems to be suggested by the Cooper et al. (2000) paper on unateness in Boolean expressions. (Mohamed, I gave you a hard copy of this paper.) Although I don't really understand the paper, it seems to say the function's extrema will be at these corners. While this is still an NP-hard problem in the number of variables, it's easy for small and moderately sized problems. And the bridge problem turns out to be very easy. Sébastien seems to agree that evaluating corners is all that's necessary for Boolean problems. If so, then it makes our computational problems easy again. Using the "constrained mathematics" described the Cooper et al. paper, the evaluations might in fact be practical for considerably larger problems.
For the variable a, I guess the expression for the Birnbaum importance measure is
c(1-b(d+e))+de(1-c-b+2bc)
where b,c,d, and e represent the intervals that are the slices from their respective c-boxes. (Sébastien, you'll note that we get this expression EITHER as the partial derivative of reliability with respect to a, OR as the difference of reliabilities when a is either zero or one, which confirms the equivalence that Mohamed mentioned was proved. I guess this is yet another amazing simplicity of Boolean expressions.)
I had previously believed that, because the Birnbaum measures are not monotone in each variable, correctly computing them with independent c-boxes as arguments would require forming a four-dimensional Cartesian product with horizontal slices from each of the c-boxes, and evaluating the uncertainty implied by each cell in that four-dimensional product as an interval problem. Because each interval problem has repetitions and non-positive functions, the solution would require subinterval reconstitution involving partitions of the respective intervals and another, nested four-dimensional Cartesian product of calculations. This would have been a nightmare.
Instead, we can approach the calculations like this: We use a Monte Carlo sampling strategy to replace the outer Cartesian product. This is possible because we're assuming the variables are independent. From each of the four c-boxes, we then randomly sample an interval (as a horizontal cut). Because the function for the Birnbaum measure for a is monotone only in b and c, we cannot use the trick that worked so well for evaluating the reliability. However, if we just need to evaluate corners, then we only need 32 evaluations of the expression to solve each interval problem.
The calculations in R that produced these graphs are archived in this email in red below. They can easily be modified to conform to Mohamed's input specifications, and they are pretty quick.
Note that, because we're using Monte Carlo simulation in the outer loop, the evaluation is not strictly rigorous, although it can be made arbitrarily good by increasing the number of Monte Carlo replications. The repetitions mean that the function cannot be decomposed into a sequence of binary operations. If you want a rigorous calculation, you would need to slice the c-boxes up into some number of intervals, each of which completely enclosed a corresponding part of the c-box. Even using 20 such intervals as a bare minimum discretization would mean you'd have to solve 160 thousand interval problems in order to get a pretty coarse enclosure for a single variable. But the result would be guaranteed in a way that the Monte Carlo solution is not.
I'm not sure I believe that any Boolean function can be evaluated so easily. In particular, I don't know whether the JRI functions will be solvable this way. What are those JRI formulas? Maybe Sébastien can tell us whether and why they will be solvable by evaluating just the corners, but I needed to do some test calculations to convince myself it was even plausible for the Birnbaum formulas. The graph below shows some testing of various strategies to compute an interval problem. All of the plotted results are really intervals; but I show some of them as triangles to more easily distinguish them. This testing is to check that evaluating corners is sufficient to find the true range of uncertainty. No matter how I varied the inputs within [0,1], it seemed to be true that the corners approach (shown in bright green) gave the correct answer, squeezed between the inner bounds found via a Monte Carlo sampling strategy (red) and the outer bounds found via subinterval reconstitution (SIR, shown in black). The various rearrangement strategies (gray, olive, teal, blue) were considerably weaker, and suggested values outside [0,1]. (Note, however, that they do not violate Scott's law of probabilities since they are not larger than two!)
The test calculations that produced this picture are also archived in this email, in blue below.
Are we back on track?
Cheers,
Scott
Reference
Cooper, J.A., S. Ferson and D.K. Cooper. 2000. Constrained mathematics for calculating logical safety and reliability probabilities with uncertain inputs. Journal of System Safety 36: 23-29. Draft version at http://www.osti.gov/scitech/servlets/purl/3219
#################################################################################
# checking only endpoints seems to work, but I don't quite understand why, nor can I believe it will work generally
#################################################################################
#################################################################################
# infrastructural functions
many = 10000 # increase for more accuracy
precise <- function(b) if (length(b)==many) TRUE else FALSE
leftside <- function(b) if (precise(b)) return(b) else return(b[1:many])
rightside <- function(b) if (precise(b)) return(b) else return(b[(many+1):(2*many)])
sampleindices = function() round(runif(many)*(many-1) + 1)
L <- function(x,random=TRUE) if (!random) leftside(x) else leftside(x)[sampleindices()]
R <- function(x,random=TRUE) if (!random) rightside(x) else rightside(x)[sampleindices()]
env <- function(x,y) if ((precise(x) && precise(y))) c(x,y) else stop('env error') # serves as env, but only works for precise distributional inputs
beta <- function(v,w) if ((v==0) && (w==0)) env(rep(0,many),rep(1,many)) else if (v==0) rep(0,many) else if (w==0) rep(1,many) else sort(rbeta(many, v, w))
kn <- function(k,n) return(env(beta(k, n-k+1), beta(k+1, n-k))) # c-box for Bernoulli parameter: x[i] ~ Bernoulli(parameter), x[i] is either 0 or 1, n = length(x), k = sum(x)
breadth <- function(b) if (precise(b)) 0 else mean(sort(rightside(b))-sort(leftside(b)))
endpoints <- function(b,side, which=1:many) if (side) return(rightside(b)[which]) else return(leftside(b)[which])
plotbox <- function(b,new=TRUE,col='blue',lwd=2,xlim=range(b[is.finite(b)]),ylim=c(0,1),xlab='',ylab='Probability',...) {
edf <- function (x, col, lwd, ...) {
n <- length(x)
s <- sort(x)
if (2000<n) {s = c(s[1],s[1:(n/5)*5],s[n]); n <- length(s);} # subsetting for the plot to every 5th value
lines(c(s[[1]],s[[1]]),c(0,1/n),lwd=lwd,col=col,...)
for (i in 2:n) lines(c(s[[i-1]],s[[i]],s[[i]]),c(i-1,i-1,i)/n,col=col,lwd=lwd,...)
}
b = ifelse(b==-Inf, xlim[1] - 10, b)
b = ifelse(b==Inf, xlim[2] + 10, b)
if (new) plot(NULL, xlim=xlim, ylim=ylim, xlab=xlab, ylab=ylab)
if (length(b) < many) edf(b,col,lwd) else
edf(c(min(b),max(b),b[1:min(length(b),many)]),col,lwd)
if (many < length(b)) edf(c(min(b),max(b),b[(many+1):length(b)]),col,lwd)
}
#################################################################################
# functions specifically for the bridge calculations
#
# These calculation concern the reliability of a bridge with structure
#
# +---a----+----c---+
# | | |
# ------| e |-------
# | | |
# +---b----+----d---+
#
# where the reliabilities of the five (unrepairable) components are
# given as independent uncertain numbers. This calculation uses
# a combination of monotonicity and Monte Carlo methods to
# take account of independence assumptions and sidestep
# the problem of the repeated variables.
#
# The Boolean function is (a&c)|(b&d)|(a&e&d)|(b&e&c) which
# implies an arithmetic function 1+(ac-1)(1-bd)(1-ade)(1-bec),
# after simplification. Replacing squares (because a*a = a in
# Boolean systmes), yields the function
#
# ac+bd+ade+bce-abce-acde-abde-bcde-abcd+2abcde
#
# which gives the reliability of the system in terms of probabilities
# a, b, c, d, e of its component parts.
#
a = kn(58,60) # Mohamed, what should these guys be?
b = kn(59,60)
c = kn(57,60)
d = kn(60,60)
e = kn(54,60)
a = kn(158,160) # Mohamed, what should these guys be?
b = kn(59,60)
c = kn(5,10)
d = kn(6,6)
e = kn(5,6)
BridgeI = function(a,b,c,d,e) (a*c+b*d+a*d*e+b*c*e-a*b*c*e-a*c*d*e-a*b*d*e-b*c*d*e-a*b*c*d+2*a*b*c*d*e)
bridgeI = function(a,b,c,d,e) return(env(BridgeI(L(a),L(b),L(c),L(d),L(e)), BridgeI(R(a),R(b),R(c),R(d),R(e))))
#################################################################################
# compute Birnbaum measures generically, without requiring knowledge about monotone variables
birnbaum <- function(a,b,c,d,e, F) {
rr = sampleindices()
A0 = endpoints(a, 0, which=rr)
A1 = endpoints(a, 1, which=rr)
rr = sampleindices()
B0 = endpoints(b, 0, which=rr)
B1 = endpoints(b, 1, which=rr)
rr = sampleindices()
C0 = endpoints(c, 0, which=rr)
C1 = endpoints(c, 1, which=rr)
rr = sampleindices()
D0 = endpoints(d, 0, which=rr)
D1 = endpoints(d, 1, which=rr)
rr = sampleindices()
E0 = endpoints(e, 0, which=rr)
E1 = endpoints(e, 1, which=rr)
# do all 32 corners so we don't have to figure out which variables the functions are monotone in; in general, you'd have to compute 2^(#dims) corners
F0 = F(A0,B0,C0,D0,E0)
F1 = F(A0,B0,C0,D0,E1)
F2 = F(A0,B0,C0,D1,E0)
F3 = F(A0,B0,C0,D1,E1)
F4 = F(A0,B0,C1,D0,E0)
F5 = F(A0,B0,C1,D0,E1)
F6 = F(A0,B0,C1,D1,E0)
F7 = F(A0,B0,C1,D1,E1)
F8 = F(A0,B1,C0,D0,E0)
F9 = F(A0,B1,C0,D0,E1)
F10 = F(A0,B1,C0,D1,E0)
F11 = F(A0,B1,C0,D1,E1)
F12 = F(A0,B1,C1,D0,E0)
F13 = F(A0,B1,C1,D0,E1)
F14 = F(A0,B1,C1,D1,E0)
F15 = F(A0,B1,C1,D1,E1)
F16 = F(A1,B0,C0,D0,E0)
F17 = F(A1,B0,C0,D0,E1)
F18 = F(A1,B0,C0,D1,E0)
F19 = F(A1,B0,C0,D1,E1)
F20 = F(A1,B0,C1,D0,E0)
F21 = F(A1,B0,C1,D0,E1)
F22 = F(A1,B0,C1,D1,E0)
F23 = F(A1,B0,C1,D1,E1)
F24 = F(A1,B1,C0,D0,E0)
F25 = F(A1,B1,C0,D0,E1)
F26 = F(A1,B1,C0,D1,E0)
F27 = F(A1,B1,C0,D1,E1)
F28 = F(A1,B1,C1,D0,E0)
F29 = F(A1,B1,C1,D0,E1)
F30 = F(A1,B1,C1,D1,E0)
F31 = F(A1,B1,C1,D1,E1)
f0 = pmin(F0,F1,F2,F3,F4,F5,F6,F7,F8,F9,F10,F11,F12,F13,F14,F15,F16,F17,F18,F19,F20,F21,F22,F23,F24,F25,F26,F27,F28,F29,F30,F31)
f1 = pmax(F0,F1,F2,F3,F4,F5,F6,F7,F8,F9,F10,F11,F12,F13,F14,F15,F16,F17,F18,F19,F20,F21,F22,F23,F24,F25,F26,F27,F28,F29,F30,F31)
env(f0,f1)
}
par(mfrow=c(2,3))
# Birnbaum function for variable a
F <- function(a,b,c,d,e) return(c-b*c*e-b*c*d+d*e-c*d*e-b*d*e+2*b*c*d*e)
fa = birnbaum(a,b,c,d,e,F)
plotbox(fa)
title('Birnbaum for variable a')
# Birnbaum function for variable b
F = function(a,b,c,d,e) (d+c*e-a*c*e-a*d*e-c*d*e-a*c*d+2*a*c*d*e)
fb = birnbaum(a,b,c,d,e,F)
plotbox(fb)
title('Birnbaum for variable b')
# Birnbaum function for variable c
F = function(a,b,c,d,e) (a+b*e-a*b*e-a*d*e-b*d*e-a*b*d+2*a*b*d*e)
fc = birnbaum(a,b,c,d,e,F)
plotbox(fc)
title('Birnbaum for variable c')
# Birnbaum function for variable d
F = function(a,b,c,d,e) (b+a*e-a*c*e-a*b*e-b*c*e-a*b*c+2*a*b*c*e)
fd = birnbaum(a,b,c,d,e,F)
plotbox(fd)
title('Birnbaum for variable d')
# Birnbaum function for variable e
F = function(a,b,c,d,e) (a*d+b*c-a*b*c-a*c*d-a*b*d-b*c*d+2*a*b*c*d)
fe = birnbaum(a,b,c,d,e,F)
plotbox(fe)
title('Birnbaum for variable e')
#################################################################################
# possibly faster way to compute Birnbaum measure for variable a
F <- function(A,B,C,D,E) return(C - B * C * E - B * C * D + D * E - C * D * E - B * D * E + 2 * B * C * D * E)
for (s in 0:1) {
A = 0
B = endpoints(b, !s, sampleindices()) # monotone decreasing
C = endpoints(c, s, sampleindices()) # monotone increasing
rr = sampleindices()
D0 = endpoints(d, 0, rr)
D1 = endpoints(d, 1, rr)
rr = sampleindices()
E0 = endpoints(e, 0, rr)
E1 = endpoints(e, 1, rr)
# do all four corners; in general, you'd have to compute 2^(#dims) corners
F0 = F(A,B,C,D0,E0)
F1 = F(A,B,C,D0,E1)
F2 = F(A,B,C,D1,E0)
F3 = F(A,B,C,D1,E1)
if (s==0) f0 = pmin(F0,F1,F2,F3)
if (s==1) f1 = pmax(F0,F1,F2,F3)
}
fa2 = env(f0,f1)
plotbox(fa2, col='red')
#################################################################################
# compute Birnbaum measure for variable b
F = function(a,b,c,d,e) (d+c*e-a*c*e-a*d*e-c*d*e-a*c*d+2*a*c*d*e)
for (s in 0:1) {
A = endpoints(a, !s, sampleindices()) # monotone decreasing
B = 0
rr = sampleindices()
C0 = endpoints(c, 0, rr)
C1 = endpoints(c, 1, rr)
D = endpoints(d, s, sampleindices()) # monotone increasing
rr = sampleindices()
E0 = endpoints(e, 0, rr)
E1 = endpoints(e, 1, rr)
# do all four corners; in general, you'd have to compute 2^(#dims) corners
F0 = F(A,B,C0,D,E0)
F1 = F(A,B,C0,D,E1)
F2 = F(A,B,C1,D,E0)
F3 = F(A,B,C1,D,E1)
if (s==0) f0 = pmin(F0,F1,F2,F3)
if (s==1) f1 = pmax(F0,F1,F2,F3)
}
fb2 = env(f0,f1)
plotbox(fb2,new=FALSE,col='green')
title('a (red) and b (green)')
////////////////////////////////////////////////////////////////////////////////
// make up some interval values to play with
clear
function rand() {r__=random; return env(r__,random); }
a=rand(); b=rand(); c=rand(); d=rand(); e=rand()
//a;b;c;d;e
a = [ 0.03375244, 0.4723816]
b = [ 0.6900725, 0.9585024]
c = [ 0.7711181, 0.9777833]
d = [ 0.3543395, 0.6956177]
e = [ 0.09057617, 0.6003418]
////////////////////////////////////////////////////////////////////////////////
// "naive" calculation of the Birnbaum measure for the variables a
func F() return C - B * C * E - B * C * D + D * E - C * D * E - B * D * E + 2 * B * C * D * E
B = b
C = c
D = d
E = e
naive = F()
////////////////////////////////////////////////////////////////////////////////
// subinterval reconstitution is slow but rigorous
run brute
//Brute force interval library has been loaded
func map2() return C * (1 - B * ($1 + $2)) + $1 * $2 * (1 - C - B + 2 * B * C)
B = left(b)
C = right(c)
R1 = sir2(d,e,50)
B = right(b)
C = left(c)
R2 = sir2(d,e,50)
R = env(R1,R2)
R // [ 0.11755425, 0.7023882]
[ 0.1175542, 0.7023882]
////////////////////////////////////////////////////////////////////////////////
// Monte Carlo searching yields an inner estimate
function try() {
A = specific(a)
B = specific(b)
C = specific(c)
D = specific(d)
E = specific(e)
return C - B*C*E - B*C*D + D * E * (1 - C - B + 2 * B * C)
}
mc = try()
for i = 1 to 1000 do mc = env(mc, try())
////////////////////////////////////////////////////////////////////////////////
// algebraic rearrangements are rigorous, but limited
a0 = tri(c + d*e - b*c*e - c*d*e - b*d*e - b*c*d + 2*b*c*d*e)
a1 = tri(c - b*c*e - b*c*d + d * e * (1 -c - b + 2 * b * c))
a2 = tri(c * (1 - b * e - b*d) + d * e * (1 - c - b + 2 * b * c))
a3 = tri(c * (1 - b * (d + e)) + d * e * (1 - c - b + 2 * b * c))
a4 = tri(c - b*c*e + d * (e * (1 -c - b + 2 * b * c) - b*c))
a5 = tri(c - b*c*d + e * (d * (1 -c - b + 2 * b * c) - b*c))
a;b;c;d;e
[ 0.03375244, 0.4723816]
[ 0.6900725, 0.9585024]
[ 0.7711181, 0.9777833]
[ 0.3543395, 0.6956177]
[ 0.09057617, 0.6003418]
////////////////////////////////////////////////////////////////////////////////
// checking endpoints seems to work, but I don't quite understand why,
// nor can I believe it will work generally
func endpoint() if ($2==0) return left($1) else return right($1)
procedure assignall() begin
A = endpoint(a,$1)
B = endpoint(b,$2)
C = endpoint(c,$3)
D = endpoint(d,$4)
E = endpoint(e,$5)
end
// get an anchor for the env function
assignall(0,0,0,0,0)
f = F()
// search corners, but don't repeat for the variable a
i = 0 // for i=0 to 0 do
for j=0 to 1 do for k=0 to 1 do for l=0 to 1 do for m=0 to 1 do begin
assignall(i,j,k,l,m)
ff = F()
f = env(f,ff)
end
ends = f
////////////////////////////////////////////////////////////////////////////////
// combine endpoint checking with monotonicity
assignall(0,0,0,0,0)
f = F()
for n = 0 to 1 do begin
B = endpoint(b, n) // monotone increasing
C = endpoint(c, not n) // monotone decreasing
for l = 0 to 1 do for m = 0 to 1 do begin
D = endpoint(d,l)
E = endpoint(e,m)
ff = F()
f = env(f,ff)
end
end
monends = f
////////////////////////////////////////////////////////////////////////////////
// summarize and compare output intervals
show mc in red
show R
show a0; show a1 in olive; show a2 in blue; show a3 in green; show a4 in teal; show a5 in gray
ff = tri(monends)
show ff in green
range(monends)
[ 0.1258789, 0.699457]
range(ends)
[ 0.1258789, 0.699457]
R
[ 0.1175542, 0.7023882]
mc
[ 0.1605074, 0.6508642]
range(a5);range(a4);range(a3);range(a2);range(a1);range(a0)
[ -0.4162429, 1.059946]
[ -0.4354022, 1.149603]
[ -0.2326928, 1.267755]
[ -0.2326928, 1.267755]
[ -0.439358, 1.331206]
[ -1.185823, 1.894515]
naive
[ -1.185823, 1.894515]