We believe we know how to compute bounds on a distribution of a sum from p-boxes for the addends under the assumption that the addends are perfectly dependent on one another, or “comonotonic”. We discretize the two p-boxes into intervals and add together only the intervals from the same discretization level. In the R library, this would be this function:
perfectadd.pbox <- function(a,b) {
cu = a@u + b@u
cd = a@d + b@d
pbox(u = cu, d = cd)
}
where the u-array is the vector of quantile bounds corresponding to the left edge of the p-box; the d-array corresponds to the right edge.
Operationally, we’ve generalized this function to handle the other basic operations. We found that we needed to add sorting steps to the function to handle multiplication and division. And subtraction and division require the cross-combination of the u- and d-arrays, rather than combing the two u-arrays together and the d-arrays together. The R library currently uses this function:
perfectconv.pbox <- function(a,b,op='+') {
if (op %in% c('-', '/')) {
cu = do.call(op, list(a@u, b@d))
cd = do.call(op, list(a@d, b@u))
} else {
cu = do.call(op, list(a@u, b@u))
cd = do.call(op, list(a@d, b@d))
}
scu = sort(cu + 0) # adding zero convert Booleans to 0 or 1 when op is ‘<’
scd = sort(cd + 0)
if (all(cu == scu) && all(cd == scd))
pbox(u=scu,d=scd,dids=paste(a@dids,b@dids),bob=a@bob) else pbox(u=scu,d=scd,dids=paste(a@dids,b@dids))
}
We’ve been using this algorithm or something like it for many years, although we’ve never proved that it is generally correct, even for addition. We want to prove that our algorithm is correct for addition, and likewise check its validity for other operations.
More critically, we know that this algorithm does not work correctly for multiplication (or division) when operands include both negative and positive values. The study below shows by sampling distributions from p-boxes and multiplying them under the assumption of perfect dependence that the algorithm works when both operands are strictly positive, or both are strictly negative, but it does not work if one but not the other is negative, or if both straddle zero so they both contain both positive and negative numbers. To simplify the example, the study uses distributional p-boxes which contain only uniform distributions.
# test perfect multiplication
tpm <- function(a1,a2, b1,b2) {
# a1 and a2 are the interval bounds on the min and max of a uniform distribution
# b1 and b2 are the interval bounds on the min and max of a uniform distribution
a <<- U(a1,a2)
b <<- U(b1,b2)
ab <<- perfectconv.pbox(a,b,'*')
r <<- interval(a1,a2)*interval(b1,b2)
plot(c(left(r),right(r)),c(0,1),col='white',xlab='Product',ylab='Cumulative probability')
lines(ab,col='cyan',lwd=7)
many = 1000
perftimes <- function(u1,u2,col='black',col2='white') {
u = sort(u1 * u2)
if ((u1[[1]] < u1[[2]]) && (u2[[1]] < u2[[2]])) lines(u,p,col=col) else if (col2 != 'white') lines(u,p,col=col2)
}
p = 0:(many-1)/(many-1)
randomrange <- function(x) return(left(x)+runif(1)*(right(x)-left(x)))
for (i in 1:500) {
u1 = seq(randomrange(a1),randomrange(a2),length.out=many)
u2 = seq(randomrange(b1),randomrange(b2),length.out=many)
perftimes(u1,u2)
}
lines(ab,col='cyan',lwd=7)
alledges <- function(a1,a2, b1,b2) {
edges <- function(w,x,y,z) perftimes(seq(w,x,length.out=many), seq(y,z,length.out=many), 'red') # , 'blue')
edges(left(a1),left(a2), left(b1),left(b2)); edges(left(a1),left(a2), left(b1),right(b2))
edges(left(a1),left(a2), right(b1),left(b2)); edges(left(a1),left(a2), right(b1),right(b2))
edges(left(a1), right(a2), left(b1),left(b2)); edges(left(a1), right(a2), left(b1),right(b2))
edges(left(a1), right(a2), right(b1),left(b2)); edges(left(a1), right(a2), right(b1),right(b2))
edges(right(a1),left(a2), left(b1),left(b2)); edges(right(a1),left(a2), left(b1),right(b2))
edges(right(a1),left(a2), right(b1),left(b2)); edges(right(a1),left(a2), right(b1),right(a2))
edges(right(a1), right(a2), left(b1),left(b2)); edges(right(a1), right(a2), left(b1),right(b2))
edges(right(a1), right(a2), right(b1),left(b2)); edges(right(a1), right(a2), right(b1), right(b2))
}
alledges(a1,a2, b1, b2)
}
tpm( interval(1,3), interval(2,4), interval(2,4), interval(3,5)) # both operands positive, OKAY
tpm( interval(1,3)-13, interval(2,4)-13, interval(2,4)-13, interval(3,5)-13) # both operands negative, OKAY
tpm( interval(1,3)-13, interval(2,4)-13, interval(2,4), interval(3,5)) # one operand negative, BAD
tpm( interval(1,3)-13, interval(2,4)-13, interval(2,4), interval(3,5)) # one operand straddles zero, BAD
tpm( interval(1,3)-2, interval(2,4), interval(2,4)-5, interval(3,5)) # both operands straddle zero, BAD
# the trick of deferring application of the sign and swapping opposite for perfect solve the first of these BAD cases
o = -oppositeconv.pbox(-U(interval(1,3)-13, interval(2,4)-13),U(interval(2,4), interval(3,5)),'*')
tpm( interval(1,3)-13, interval(2,4)-13, interval(2,4), interval(3,5)) # one operand negative
lines(o,col='tan',lwd=5)
We need to improve the perfectconv.pbox algorithm so that it can correctly compute bounds on multiplication (and division) under the assumption of perfect dependence for all operands, whatever their signs. It will also be useful to be able to infer whether the resulting product (or quotient) is perfectly dependent with its factors (or dividend, and oppositely dependent with the divisor).