The Wikipedia entries for Bayes’ rule and Bayesian analysis describe how to apply Bayes’ rule to the problem of estimating the probability that a patient has a disease given multiple independent test results for that disease. The often unspoken problem is that medical tests for disease are frequently not independent. What can be concluded from Bayes’ rule without the assumption of independence? Even if we assume that the probabilities for the prior and the multiple specificities and sensitivities are just scalar (real) values, it is not obvious how to generalize Bayes’ rule without assuming those likelihoods can be multiplied together, or at least no such generalization seems to be widely known.
It seems clear that omitting the independence assumption between the tests will result in a set of possible values—no doubt an interval—for the posterior. To compute these bounds we should be able to use the Fréchet (1935) inequalities, which allow us to say
P(A and B) ∈ [ max(0, a+b–1), min(a, b) ],
P(A or B) ∈ [ max(a, b), min(1, a+b) ],
where a = P(A), b = P(B). I usually prefer to write these statements as equalities so we say, for instance, that P(A and B) is equal to the interval, rather than an element of the interval. Instead of multiplying the likelihoods (which assumes independence) we should apply the Fréchet conjunction to get an interval or band for the likelihood, which would then be propagated through to an interval for the posterior. How would this work exactly? We need to spell out the formulas and give simple numerical examples that show them working.
For reference, the general definitional formulas for conjunctions and disjunctions are
P(A and B) = P(A) P(B given A),
P(A or B) = P(A) + P(B) – P(A and B).
When A and B are independent, then P(B given A) is just the same as P(B), and the probability of the conjunction is simply the product of the two probabilities
P(A and B) = P(A) P(B) = ab.
Likewise, disjunctions under independence are given by
P(A or B) = P(A) + P(B) – P(A) P(B) = a + b – ab = 1 - (1 - a)(1 - b).
These independence results are values within the Fréchet intervals. Negation and exclusive disjunction are given by
P(not A) = 1 – a,
P(A or B but not both) = [ max(a–b, b–a), min(a+b, 2–a–b) ].
We’ve worked out formulas for many logical operations under a variety of dependence assumptions. These formulas are individually suitable for use with interval values, although the general problem of evaluating arbitrary logical expressions with interval inputs with imprecisely specified dependencies is NP-hard and requires mathematical programming techniques (Hailperin 1986).
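For concreteness, here is a minimal Python sketch of the Fréchet bounds quoted above (Python rather than Risk Calc, and the function names frechet_and, frechet_or, and frechet_xor are ours):

def frechet_and(a, b):
    # Bounds on P(A and B) given only P(A)=a and P(B)=b, no dependence assumed.
    return (max(0.0, a + b - 1.0), min(a, b))

def frechet_or(a, b):
    # Bounds on P(A or B) given only the marginals.
    return (max(a, b), min(1.0, a + b))

def frechet_xor(a, b):
    # Bounds on P(A or B but not both).
    return (max(a - b, b - a), min(a + b, 2.0 - a - b))

print(frechet_and(0.2, 0.5))  # (0.0, 0.2); independence would give ab = 0.1
print(frechet_or(0.2, 0.5))   # (0.5, 0.7); independence would give a + b - ab = 0.6
print(frechet_xor(0.2, 0.5))  # (0.3, 0.7)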
<<Gang’s answers in blue>>
<<Scott’s replies in plum>>
Dear Scott,
Regarding the long list, I think two are completely solved.
1) Bayes’ rule for multiple tests without independence assumptions
- P(A|B) = [ max(0, a+b–1) / b, min(a, b) / b ]
- P(B|A) = [ max(0, a+b–1) / a, min(a, b) / a ]
- where P(A) = a and P(B) = b
- These formulas also satisfy P(A|B) = P(A) P(B|A) / P(B) already.
2) Bayes’ rule with epistemic uncertainty and without independence assumptions
- Given the values of any two of sensitivity, specificity, and prevalence, the third one can still take any possible values (with uniform probability distribution). So they are independent.
I will send a more detailed email tonight explaining my thoughts. For the other questions, I am "half way" there. You know, sometimes one just needs "sparks". For some of them I have "half sparks", and I am looking for the complete sparks now. :-)
I will also begin to work on the problem you mentioned in this email tonight.
Thanks a lot.
Gang
Gang:
I'm familiar with the formulas for conditionals P(A|B) from a 2004 paper "Modus tollens probabilized" by Wagner. I think we know how to intervalize them so that repeated parameters won't be a problem, and how to tighten them to account for partial dependency information, but I don't quite see how to use them to solve Bayes' rule to get something like P(Disease | positive for test A, positive for test B).
I look forward to your more detailed email.
Bayes’ rule for multiple tests without independence assumptions
If my understanding is correct, we basically need to find the interval for P(A | B & C).
Now, we already know that P(B&C) = [ max(0, b+c–1), min(b, c) ].
Case 1: min(b, c) = 0
That means P(B&C) = 0, P(A | B & C) is undefined.
Case 2: min(b, c) > 0, and b+c-1 >0, i.e., max(0, b+c–1) = b+c-1 > 0
Therefore, P(B&C) = [ b+c–1, min(b, c) ]
Let us denote B&C = E, b+c–1 = e1>0, min(b, c) = e2, i.e., P(E) = [e1, e2 ] and we want to find the interval for P(A | E).
For any e such that e1 <= P(E) = e <= e2, P(A | E) = [max(0, a+e–1) / e, min(a, e)/e]. Therefore, we need to find the union of all intervals [max(0, a+e–1) / e, min(a, e)/e] where 0 < e1 <= e <= e2.
Here, since e > 0, max(0, a+e–1) / e =
· 0, when a+e–1<=0, (i.e., when e <=1- a)
· 1 + (a–1) / e, when a+e–1>=0, (i.e., when e >=1- a)
Notice that 1 + (a–1) / e decreases (non-strictly though) when e increases.
Now, we need to consider two sub cases
· Sub case 2.1: If e can reach the value <=1- a, i.e., when e1<=1- a (or b+c-1<=1- a, b+c+a<=2), then max(0, a+e–1) / e can reach 0.
· Sub case 2.2: If e cannot reach the value <=1- a, i.e., when e1>1- a (or b+c-1>1- a, b+c+a>2), then the smallest possible value max(0, a+e–1) / e can reach is 1 + (a–1) / e2.
Here, since e > 0, min(a, e)/e =
· 1, when a>=e, (i.e., when e <= a)
· a / e, when a<=e, (i.e., when e >= a)
Therefore, min(a, e)/e decreases (non-strictly though) when e increases.
Therefore, union of all intervals [max(0, a+e–1) / e, min(a, e)/e] where e1 <= e <= e2 will be
· [0, min(a, e1)/e1], when b+c+a<=2
· [1 + (a–1) / e2, min(a, e1)/e1], when b+c+a>2
That means
· When b+c+a<=2,
P(A | B & C) = [0, min(a, e1)/e1]
= [0, min(a, b+c–1)/ (b+c–1)]
· When b+c+a>2,
P(A | B & C) = [1 + (a–1) / e2, min(a, e1)/e1]
= [1+ (a-1)/min(b, c), min(a, b+c–1)/ (b+c–1)]
Case 3: min(b, c) > 0 and b+c-1 <=0
Therefore, P(B&C) = [ 0, min(b, c) ]
Let us denote B&C = E, min(b, c) = e2, i.e., P(E) = [0, e2 ] and we want to find the interval for P(A | E).
For any e such that 0 <= P(E) = e <= e2, P(A | E) = [max(0, a+e–1) / e, min(a, e)/e]. Therefore, we need to find the union of all intervals [max(0, a+e–1) / e, min(a, e)/e] where 0 <= e <= e2, and e2 > 0.
Here, max(0, a+e–1) / e =
· Undefined, when e = 0.
· 0, when a+e–1<=0, (i.e., when 0< e <=1- a), this value could be reached when e is close to 0.
· 1 + (a–1) / e, when a+e–1>=0, (i.e., when e >=1- a)
Here, min(a, e)/e =
· Undefined, when e = 0.
· 1, when a>=e, (i.e., when 0< e <= a), this value could be reached when e is close to 0.
· a / e, when a<=e, (i.e., when e >= a)
Therefore, union of all intervals [max(0, a+e–1) / e, min(a, e)/e] where 0 <= e <= e2 will be
[0, 1]
That means
P(A | B & C) = [0, 1]
Summary:
1. When min(b, c) = 0
P(A | B & C) is undefined
2. When b+c-1 <=0 and min(b, c) > 0
P(A | B & C) = [0,1]
3. When min(b, c) > 0 and b+c-1 >0 and b+c+a<=2
P(A | B & C) = [0, min(a, b+c–1)/ (b+c–1)]
4. When min(b, c) > 0 and b+c-1 >0 and b+c+a>2
P(A | B & C) = [1+ (a-1)/min(b, c), min(a, b+c–1)/ (b+c–1)]
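Before Scott's Risk Calc version below, here is a Python transcription of this four-case summary, together with a crude random sanity check; the names and the sampling scheme are ours, and the check only approaches the bounds from inside:

import random

def bayes2(a, b, c):
    # Bounds on P(A | B & C) from P(A)=a, P(B)=b, P(C)=c, no independence assumed.
    if min(b, c) == 0:
        return None  # P(B & C) may be zero, so the conditional is undefined
    if b + c - 1 <= 0:
        return (0.0, 1.0)
    if a + b + c <= 2:
        return (0.0, min(a, b + c - 1) / (b + c - 1))
    return (1 + (a - 1) / min(b, c), min(a, b + c - 1) / (b + c - 1))

def check(a, b, c, trials=100000):
    # Sample e = P(B&C) within its Frechet bounds, then P(A&B&C) within its
    # bounds given e; track the observed range of P(A | B & C) = P(A&B&C)/e.
    lo, hi = 1.0, 0.0
    for _ in range(trials):
        e = random.uniform(max(0.0, b + c - 1.0), min(b, c))
        if e <= 0:
            continue
        abc = random.uniform(max(0.0, a + e - 1.0), min(a, e))
        lo, hi = min(lo, abc / e), max(hi, abc / e)
    return lo, hi

print(bayes2(0.1, 0.75, 0.85))  # (0.0, 0.1666...), matching the transcript below
print(check(0.1, 0.75, 0.85))   # approaches those bounds from inside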
Folks:
I wrote the Risk Calc function below to evaluate Gang's formula for P(A | B & C). You can call it as "bayes2(a,b,c)" where a is the prior probability, and b and c are the probabilities of the first and second test being positive respectively. The behavior of this trivariate function is pretty easy to suss out with a few evaluations, as below. Its value is the dunno interval [0,1] if either b or c is small or moderate. When they're both high, the value tends to the prior a. I guess independence is a really strong assumption in this problem.
Does Gang’s formula enclose the corresponding formula that assumes independence? The Wikipedia page http://en.wikipedia.org/wiki/Bayes%27_theorem says
P(A|B&C) = P(A) P(B|A) P(C|A&B) / ( P(B) P(C|B) )
P(A|B&C) = P(B|A&C) P(A|C) / P(B|C)
Scott
function bayes2() if (min($2,$3)==0) return 'undefined' else if ($2+$3-1 <= 0) return [0,1] else if ($2+$3+$1<=2) return [0, min($1, $2+$3-1)/ ($2+$3-1)] else return [1+ ($1-1)/min($2, $3), min($1, $2+$3-1)/ ($2+$3-1)]
bayes2(0.1, 0.01, 0.01)
[ 0, 1]
bayes2(0.1, 0.0005, 0.999)
[ 0, 1]
bayes2(0.1, 0.5, 0.5)
[ 0, 1]
bayes2(0.1, 0.56, 0.56)
[ 0, 0.8333333333334]
bayes2(0.1, 0.6, 0.6)
[ 0, 0.5000000000001]
bayes2(0.1, 0.75, 0.85)
[ 0, 0.1666666666667]
bayes2(0.1, 0.75, 0.95)
[ 0, 0.1428571428572]
bayes2(0.1, 0.9975, 0.9985)
[ 0.09774436090225, 0.1004016064258]
bayes2(0.91, 0.01, 0.01)
[ 0, 1]
bayes2(0.91, 0.0005, 0.999)
[ 0, 1]
bayes2(0.91, 0.5, 0.5)
[ 0, 1]
bayes2(0.91, 0.56, 0.56)
[ 0.8392857142857, 1]
bayes2(0.91, 0.6, 0.6)
[ 0.85, 1]
bayes2(0.91, 0.75, 0.85)
[ 0.88, 1]
bayes2(0.91, 0.75, 0.95)
[ 0.88, 1]
bayes2(0.91, 0.9975, 0.9985)
[ 0.9097744360902, 0.9136546184739]
bayes2(0.51, 0.01, 0.01)
[ 0, 1]
bayes2(0.51, 0.0005, 0.999)
[ 0, 1]
bayes2(0.51, 0.5, 0.5)
[ 0, 1]
bayes2(0.51, 0.56, 0.56)
[ 0, 1]
bayes2(0.51, 0.6, 0.6)
[ 0, 1]
bayes2(0.51, 0.75, 0.85)
[ 0.3466666666666, 0.85]
bayes2(0.51, 0.75, 0.95)
[ 0.3466666666666, 0.7285714285715]
bayes2(0.51, 0.9975, 0.9985)
[ 0.5087719298245, 0.5120481927711]
Gang:
Maybe you can help us understand what you derived.
I presumed that A denoted the event "has the disease", and B and C are "test positive under test 1" and "test positive under test 2" respectively. In ARRA Need 13, we define a test's sensitivity as P(test positive | have the disease), and specificity as P(test negative | healthy), so I computed the probability of testing positive with the law of total probability as P(positive) = P(positive | disease) + P(positive | not disease) = sensitivity + (1 - specificity), for both tests. I took the probability of having the disease (A) to be the prior prevalence of the disease. Does this sound right?
Your formula would then compute the posterior probability of having the disease given that you tested positive under both tests, right? If so, then using the numbers from Need 13, and assuming the two tests have the same statistical characteristics, the posterior probability of having the disease given two positive tests would be
// P(A | B & C) = [1+ (a-1)/min(b, c), min(a, b+c-1)/ (b+c-1)]
function bayes2() if (min($2,$3)==0) return 'undefined' else if ($2+$3-1 <= 0) return [0,1] else if ($2+$3+$1<=2) return [0, min($1, $2+$3-1)/ ($2+$3-1)] else return [1+ ($1-1)/min($2, $3), min($1, $2+$3-1)/ ($2+$3-1)]
prevalence = 0.01%
sensitivity1 = 99.9%
specificity1 = 99.99%
sensitivity2 = sensitivity1
specificity2 = specificity1
positive1 = sensitivity1 + (1-specificity1)
positive2 = sensitivity2 + (1-specificity2)
bayes2(prevalence, positive1, positive2) + 0
[ 0, 0.0001001803245843]
But, of course, this range seems quite small. For comparison, the posterior after only one test would be
function bayes_medicaltest() return 1/(1+((1/$1-1)*(1-$3))/$2) // args: prevalence, sensitivity, specificity
bayes_medicaltest(prevalence,sensitivity1,specificity1)
0.4997748761818
Could the right posterior after two positive tests be the logical negation of your formula? If so, it would be in the interval [0.9998998, 1].
The posterior probability of having the disease given that you tested positive for one test but negative for the other would be computed as
bayes2(prevalence, positive1, not positive2) + 0
[ 0, 1]
wouldn't it? I guess this seems reasonable given we make no assumption about the dependence between the two tests.
Cheers,
Scott
Dear Scott,
I think the law of total probability “P(positive) = P(positive | disease) + P(positive | not disease) = sensitivity + (1 - specificity)”, which you used to compute P(positive), seems incorrect. :-) Instead, I guess it should be
P(positive) = P(positive & disease) + P(positive & not disease)
= P(positive | disease) * P(disease) + P(positive | not disease) * P(not disease)
= sensitivity * prevalence + (1 - specificity) * (1- prevalence)
Then
positive1 = sensitivity1 * prevalence + (1 – specificity1) * (1- prevalence)
= 99.9% * 0.01% + (1-99.99%) * (1- 0.01%)
= 0.00999% + 0.01% * 99.99% = 0.00999% + 0.009999%
= 0.019989%
Similarly,
positive2 = sensitivity2 * prevalence + (1 – specificity2) * (1- prevalence)
= 0.019989%
Therefore
bayes2(prevalence, positive1, positive2)
= [0, 1]
Also,
bayes2(prevalence, positive1, not positive2)
= [0, 1]
These results seem reasonable, right? :-)
Thanks a lot.
Gang
Gang:
Ha! Of course you're right! I must have been asleep. You're officially my hero. Good work!
Gang:
Trust you are well.
Now that I'm back from two weeks of travel, we were talking excitedly today about your derivation of Bayes' rule for the case when we cannot assume independence between two tests. It looks like there'll be an inescapable repeated variable in the calculation, so it becomes a little tricky if there's uncertainty in prevalence and specificity. Interesting.
In our discussions, we realized that we have three follow-on questions for you:
1) Does your derivation generalize to more than two tests in an easy way?
2) Can we get analogous formulas for some other dependence assumptions too? In red below are descriptions of two Risk Calc functions 'cond' and 'imply' for which we've spelled out formulas for several other dependence assumptions. Can you handle some of these dependencies too? Some will probably be trivial.
cond(a, b)
The cond function computes bounds on the conditional probability of an event A given an event B, whose probabilities are a and b respectively. This conditional probability is sometimes denoted with the notation “Prob(A | B)”. The arguments can be any kind of logical values, including Booleans (true or false), scalar probabilities, intervals, or distributions of various kinds, but both must be dimensionless and their ranges must lie entirely within the unit interval [0,1]. If the value b includes zero, then the function returns the dunno interval [0,1]. This is one of several Risk Calc functions that support uncertainty logic. See also logical conjunctions and modusponens. See especially the imply function.
Risk Calc returns bounds on the probability of the first event occurring given that the second event occurs, assuming nothing about the dependence between the two events. For instance, the expression cond(0.2, 0.5) yields the interval [0, 0.4]. These bounds are the best possible given no information except the marginal probabilities for each event separately.
Tightening the result
Sometimes information is available about the dependence between the underlying events that can be used to tighten the calculated conditional probability. Let a be the probability of event A and let b be the probability of event B.
Positive dependence. If A and B are positively dependent, then the best possible bounds on the conditional probability Prob(A | B) are env(a,min(a/b,1)).
Negative dependence. If A and B are negatively dependent, then the best possible bounds on the conditional probability Prob(A | B) are env(max(0,(a-1)/b+1),a).
Independence. If A and B are independent, then the conditional probability of A given B is the same as the marginal probability of A, which is a.
Perfect dependence. If A and B are perfectly dependent, then the bounds on the conditional probability Prob(A | B) can be computed as min(a/b,1).
Opposite dependence. If A and B are as close to mutually exclusive as they can be given their marginal probabilities, then the probability Prob(A | B) can be computed as max(0,(a-1)/b+1).
Mutual exclusivity. If A and B are mutually exclusive events, the probability of A given B is zero.
imply(a, b)
The imply function computes the truth value or probability of the logical implication A→B (which is read “A implies B”) between the events A and B whose truth values or probabilities are prescribed by the arguments a and b respectively. The arguments can be any kind of logical values, including Booleans (true or false), scalar probabilities, intervals, or distributions of various kinds, but both must be dimensionless and their ranges must lie entirely within the unit interval [0,1]. The logical implication A→B between two events A and B is equivalent to the disjunction (not(A) or B). This is one of several Risk Calc functions that support uncertainty logic. See also logical disjunctions and modusponens. See especially the cond function.
If the operands are Boolean values (i.e., either true or false), Risk Calc’s results agree with the traditional results of Boolean logic. For instance if a is 1 (true) and b is 0 (false), then the expression imply(a, b) yields the value 0 (false). All other possible values of a and b yield the value 1 (true).
If the operands represent probabilities for two different events, Risk Calc returns bounds on the probability that the first event implies the occurrence of the second event, assuming nothing about the dependence between the two events. For instance, the expression imply(0.2, 0.5) yields the interval [0.8, 1]. These bounds are the best possible given no information except the marginal probabilities for each event separately.
Tightening the result
Sometimes information is available about the dependence between the underlying events that can be used to tighten the calculated probability of the implication. Let a be the probability of event A and let b be the probability of event B.
Positive dependence. If A and B are positively dependent, the probability of A→B is orn(not(a),b). The orn function is used here, because, if A and B are positively dependent, then not(A) and B are negatively dependent.
Negative dependence. If A and B are negatively dependent, the probability of A→B is orp(not(a),b).
Independence. If A and B are independent, you can compute the probability of the implication A→B with the expression ori(not(a),b), or equivalently, the expression !a|||b.
Perfect dependence. If A and B are perfectly dependent, then the bounds on the probability of A→B can be computed as min(1-a+b, 1).
Opposite dependence. If A and B are as close to mutually exclusive as they can be given their marginal truth values, then the probability of A→B can be computed as max(1-a, b).
Mutual exclusivity. If A and B are mutually exclusive events, the probability of the implication A→B is just not(a).
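A sketch of these cond() and imply() dependence cases in Python (our framing, not Risk Calc's; env(x,y) is rendered as the interval spanning its arguments, and the orn/orp forms for positive and negative dependence of the implication are omitted since we have not defined those functions here):

def cond(a, b, dep="frechet"):
    # Bounds on P(A | B) for scalar a = P(A), b = P(B), under a named
    # dependence assumption; returns a (lo, hi) pair.
    if b <= 0:
        return (0.0, 1.0)  # the dunno interval
    opp = max(0.0, (a - 1.0) / b + 1.0)  # opposite-dependence value
    perf = min(a / b, 1.0)               # perfect-dependence value
    cases = {
        "frechet": (opp, perf),                    # no dependence information
        "positive": (min(a, perf), max(a, perf)),  # env(a, min(a/b, 1))
        "negative": (min(opp, a), max(opp, a)),    # env(max(0, (a-1)/b + 1), a)
        "independent": (a, a),
        "perfect": (perf, perf),
        "opposite": (opp, opp),
        "exclusive": (0.0, 0.0),
    }
    return cases[dep]

def imply(a, b, dep="frechet"):
    # P(A -> B) = P(not(A) or B); the Frechet disjunction of not(a) and b
    # gives [max(1-a, b), min(1, 1-a+b)] when nothing is known about dependence.
    na = 1.0 - a
    cases = {
        "frechet": (max(na, b), min(1.0, na + b)),
        "independent": (na + b - na * b, na + b - na * b),  # ori(not(a), b)
        "perfect": (min(1.0 - a + b, 1.0), min(1.0 - a + b, 1.0)),
        "opposite": (max(1.0 - a, b), max(1.0 - a, b)),
        "exclusive": (na, na),
    }
    return cases[dep]

print(cond(0.2, 0.5))              # (0.0, 0.4), as in the description above
print(cond(0.2, 0.5, "positive"))  # env(0.2, 0.4) = (0.2, 0.4)
print(imply(0.2, 0.5))             # (0.8, 1.0), as in the description above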
3) Finally, from ARRA Need #13, we ask an even more basic question. It's probably not fair to ask you this question, but perhaps you're thinking more clearly and cogently than we've been able to do here lately. Does Bayes’ rule assume independence between the sensitivity and specificity (and prevalence)? If so, then can it be restructured by appeal to the Fréchet (1935) inequalities to generalize it for cases where it is not reasonable to assume independence? In particular, if there is some reason to think that specificity and sensitivity are not stochastically or epistemically independent, what weaker conclusion can be drawn? And if somehow there is no issue of dependence between the inputs to Bayes’ rule, what’s the pedagogical explanation of why there isn’t? It is remarkable to me that I cannot find a clear answer or statement about this question in the material I've read. Lalkhen and McCluskey (http://ceaccp.oxfordjournals.org/content/8/6/221.full) point out "The sensitivity and specificity of a quantitative test are dependent on the cut-off value above or below which the test is positive. In general, the higher the sensitivity, the lower the specificity, and vice versa". This seems to suggest that sensitivity and specificity will be stochastically dependent on each other. See also some confusing language on the subject at http://www.mclibrary.duke.edu/training/courses/ebmcourse/diagnosis/sensitivity.pdf.
11-b. Bayes’ rule for multiple tests (more than two) without independence assumptions
We basically need to find the interval for P(A | B1 & … & Bn), where P(A) = a and P(B1) = b1, …, P(Bn) = bn.
Stage 1 Compute P(B1 & …&Bn).
We can compute P(B1 & B2) = [ max(0, b1+ b2–1), min(b1, b2) ], which we denote as [p2, q2].
Let us denote P(B1 & … & Bk) as [pk, qk]; then we need to find out how to compute [pk, qk] from [p_{k-1}, q_{k-1}] for 2 < k <= n.
In fact, P(B1 & … & B_{k-1}) = [p_{k-1}, q_{k-1}] and P(Bk) = bk. For any value b such that p_{k-1} <= b = P(B1 & … & B_{k-1}) <= q_{k-1}, P(B1 & … & Bk) = [ max(0, b + bk – 1), min(b, bk) ]. Therefore, we need to find the union of all intervals [ max(0, b + bk – 1), min(b, bk) ] where p_{k-1} <= b <= q_{k-1}.
We can easily see that this union is [ max(0, p_{k-1} + bk – 1), min(q_{k-1}, bk) ]. Therefore, we get the algorithm to compute P(B1 & … & Bn) as follows,
· First, we compute P(B1 & B2) = [p2, q2] = [ max(0, b1 + b2 – 1), min(b1, b2) ].
· For each 2 < k <= n, we compute [pk, qk] = [ max(0, p_{k-1} + bk – 1), min(q_{k-1}, bk) ].
· The computed [pn, qn] will be P(B1 & … & Bn).
Stage 2 Compute P(A |B1 & …&Bn).
Now from Stage 1, we already compute P(B1 & …&Bn) as [pn, qn].
Case 1: qn = 0
That means P(B1 & …&Bn) = 0, P(A | B1 & …&Bn) is undefined.
Case 2: qn > 0 and pn = 0
Therefore, P(B1 & …&Bn) = [ 0, qn ]
Let us denote B1 & …&Bn = E, i.e., P(E) = [ 0, qn ] and we want to find the interval for P(A | E).
For any e such that 0 <= P(E) = e <= qn, P(A | E) = [max(0, a+e–1) / e, min(a, e)/e]. Therefore, we need to find the union of all intervals [max(0, a+e–1) / e, min(a, e)/e] where 0 <= e <= qn, and qn > 0.
Here, max(0, a+e–1) / e =
· Undefined, when e = 0.
· 0, when a+e–1<=0, (i.e., when 0< e <=1- a), this value could be reached when e is close to 0.
· 1 + (a–1) / e, when a+e–1>=0, (i.e., when e >=1- a)
Here, min(a, e)/e =
· Undefined, when e = 0.
· 1, when a>=e, (i.e., when 0< e <= a), this value could be reached when e is close to 0.
· a / e, when a<=e, (i.e., when e >= a)
Therefore, union of all intervals [max(0, a+e–1) / e, min(a, e)/e] where 0 <= e <= qn will be
[0, 1]
That means
P(A |B1 & …&Bn) = [0, 1]
Case 3: qn > 0, and pn >0
Let us denote B1 & …&Bn = E, i.e., P(E) = [pn, qn ] and we want to find the interval for P(A | E).
For any e such that pn <= P(E) = e <= qn, P(A | E) = [max(0, a+e–1) / e, min(a, e)/e]. Therefore, we need to find the union of all intervals [max(0, a+e–1) / e, min(a, e)/e] where 0 < pn <= e <= qn.
Here, since e > 0, max(0, a+e–1) / e =
· 0, when a+e–1<=0, (i.e., when e <=1- a)
· 1 + (a–1) / e, when a+e–1>=0, (i.e., when e >=1- a)
Notice that 1 + (a–1) / e decreases (non-strictly though) when e increases.
Now, we need to consider two sub cases
· Sub case 3.1: If e can reach a value <= 1 – a, i.e., when pn <= 1 – a, then max(0, a+e–1) / e can reach 0.
· Sub case 3.2: If e cannot reach a value <= 1 – a, i.e., when pn > 1 – a, then the smallest possible value max(0, a+e–1) / e can reach is 1 + (a–1) / qn.
Here, since e > 0, min(a, e)/e =
· 1, when a>=e, (i.e., when e <= a)
· a / e, when a<=e, (i.e., when e >= a)
Therefore, min(a, e)/e decreases (non-strictly though) when e increases.
Therefore, union of all intervals [max(0, a+e–1) / e, min(a, e)/e] where pn <= e <= qn will be
· [0, min(a, pn)/ pn], when pn <=1- a
· [1 + (a–1) / qn, min(a, pn)/ pn], when pn >1- a
Therefore, we get the algorithm in Stage 2 to compute P(A | B1 & … & Bn) - based on the result P(B1 & … & Bn) = [pn, qn] computed in Stage 1 - as follows,
· If qn = 0, P(A | B1 & … & Bn) is undefined.
· If qn > 0 and pn = 0, P(A | B1 & … & Bn) = [0,1].
· If qn > 0 and 0 < pn <= 1 – a, P(A | B1 & … & Bn) = [0, min(a, pn)/ pn].
· If qn > 0 and pn > 1 – a, P(A | B1 & … & Bn) = [1 + (a–1) / qn, min(a, pn)/ pn].
// Hailperin (1965, http://www.jstor.org/pss/2313491, "Best possible inequalities for the probability of a logical function of events")
func fand1() begin b = [max(0, arg(1)+arg(2)-1), min(arg(1),arg(2))]; for i = 3 to args do b = [max(0,left(b)+arg(i)-1),min(right(b),arg(i))]; return b; end
func fand2() begin b0 = b1 = arg(1); for i = 2 to args do begin b0 = b0 + arg(i); b1 = min(b1,arg(i)); end; return [max(0,b0-(args-1)),b1]; end
for i = 1 to 5 do b.i = random
for i = 1 to 5 do say b.i
0.8893432617188
0.7359313964844
0.4613037109375
0.5315551757812
0.0352783203125
fand1(b.1,b.2, b.3,b.4,b.5)
[ 0, 0.0352783203125]
fand2(b.1,b.2, b.3,b.4,b.5)
[ 0, 0.0352783203125]
// interval arguments
for i = 1 to 5 do b.i = env(random, random)
for i = 1 to 5 do say b.i
[ 0.6208190917968, 0.7710876464844]
[ 0.2170104980468, 0.253662109375]
[ 0.3354797363281, 0.5662231445313]
[ 0.1328430175781, 0.6141967773438]
[ 0.2127075195312, 0.9486694335938]
fand1(b.1,b.2, b.3,b.4,b.5)
[ 0, 0.253662109375]
fand2(b.1,b.2, b.3,b.4,b.5)
[ 0, 0.253662109375]
NEED TO IMPLEMENT AND CHECK GANG’S MORE-THAN-TWO FORMULA
NEED TO IMPLEMENT AND CHECK ALGORITHM TO COMPUTE P(Bi) FROM SENS. AND SPEC.
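As a step toward these two to-dos, here is a Python sketch (all names ours): positive_prob() computes P(test positive) from sensitivity, specificity, and prevalence by the law of total probability as Gang corrected earlier, and bayes_n() applies the two-stage algorithm above:

def positive_prob(sens, spec, prev):
    # P(positive) = sens*prev + (1 - spec)*(1 - prev), by total probability.
    return sens * prev + (1.0 - spec) * (1.0 - prev)

def frechet_and_many(probs):
    # Stage 1: iterated Frechet bounds [pn, qn] on P(B1 & ... & Bn).
    p, q = probs[0], probs[0]
    for b in probs[1:]:
        p, q = max(0.0, p + b - 1.0), min(q, b)
    return p, q

def bayes_n(a, probs):
    # Stage 2: bounds on P(A | B1 & ... & Bn), per the four cases above.
    p, q = frechet_and_many(probs)
    if q == 0:
        return None  # undefined
    if p == 0:
        return (0.0, 1.0)
    if p <= 1.0 - a:
        return (0.0, min(a, p) / p)
    return (1.0 + (a - 1.0) / q, min(a, p) / p)

prev = 0.0001  # the Need 13 numbers used earlier in the thread
b = positive_prob(0.999, 0.9999, prev)
print(bayes_n(prev, [b, b]))            # (0.0, 1.0), matching Gang's result
print(bayes_n(0.91, [0.9975, 0.9985]))  # matches bayes2(0.91, 0.9975, 0.9985)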
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
My previous formula for P(A | B & C) - i.e., P(ill | positive1 & positive2) - only uses P(A) = P(ill), P(B) = P(positive1), and P(C) = P(positive2) as inputs. Actually, we also have the sensitivities P(positive1 | ill), P(positive2 | ill) and the specificities P(negative1 | healthy), P(negative2 | healthy) available. I have seen some posts online regarding how to use Bayes’ rule for multiple tests. This makes me revisit this question and propose a new solution as follows.
1) Bayes’ rule P(A | B) = P(B | A) * P(A) / P(B) is used to estimate the probability that a patient has a disease given multiple test results. This is what I found regarding how Bayes’ rule works with the independence assumption for the two tests and without consideration of epistemic uncertainty,
Given:
sensitivities P(positive1 | ill) = sen1, P(positive2 | ill) = sen2
specificities P(negative1 | healthy) = spe1, P(negative2 | healthy) = spe2
prevalence P(ill) = p
Compute:
P(ill | positive1 & positive2) = P(positive1 & positive2 | ill ) * P(ill) / P(positive1 & positive2)
= P(positive1 | ill ) * P(positive2 | ill ) * P(ill) / ( P(positive1 & positive2 & ill) + P(positive1 & positive2 & healthy) )
= P(positive1 | ill ) * P(positive2 | ill ) * P(ill) / ( P(positive1 & positive2 | ill) * P(ill) + P(positive1 & positive2| healthy) * P(healthy) )
= P(positive1 | ill ) * P(positive2 | ill ) * P(ill) / ( P(positive1 | ill ) * P(positive2 | ill ) * P(ill) + P(positive1 | healthy ) * P(positive2 | healthy ) * P(healthy) )
= P(positive1 | ill ) * P(positive2 | ill ) * P(ill) / ( P(positive1 | ill ) * P(positive2 | ill ) * P(ill) + ( 1 - P(negative1 | healthy ) ) * ( 1 - P(negative2 | healthy ) ) * ( 1 - P(ill) ) )
= sen1 * sen2 * p / ( sen1 * sen2 * p + ( 1 - spe1 ) * ( 1 - spe2 ) * (1 - p) )
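A quick numerical check of this independence formula (our Python, using the Need 13 numbers quoted earlier in the thread):

def posterior_indep(p, sen1, spe1, sen2, spe2):
    # P(ill | positive1 & positive2) assuming the two tests are conditionally
    # independent given disease status.
    return sen1 * sen2 * p / (sen1 * sen2 * p + (1 - spe1) * (1 - spe2) * (1 - p))

print(posterior_indep(0.0001, 0.999, 0.9999, 0.999, 0.9999))
# ~0.9999: under independence, two positives are nearly conclusive, in sharp
# contrast to the vacuous [0, 1] obtained without the independence assumption.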
2) Considering epistemic uncertainty, we should instead be given interval values for the sensitivities, specificities, and prevalence, as listed under 4) below.
3) Considering Fréchet inequalities (without independence assumption), we have
P(A) = a, P(B) = b ==> P(A&B) = [max(0, a+b-1), min(a, b)]
Also, we can easily get
P(A) = a, P(B) = b ==> P(!A) = 1-a, P(!B) = 1-b
==> P(!A&!B) = [max(0, 1-a-b), min(1-a, 1-b)]
In the interval cases, we have
P(A) = [a_l, a_h], P(B) = [b_l, b_h] ==> P(A&B) = [max(0, a_l + b_l - 1), min(a_h, b_h)]
==> P(!A&!B) = [max(0, 1 - a_h - b_h), min(1 - a_l, 1 - b_l) ]
4) Therefore, this is how it works without the independence assumption for the two tests and with consideration of epistemic uncertainty,
Given:
sensitivities P(positive1 | ill) = [sen_l1,sen_h1], P(positive2 | ill) = [sen_l2,sen_h2]
specificities P(negative1 | healthy) = [spe_l1,spe_h1], P(negative2 | healthy) = [spe_l2,spe_h2]
prevalence P(ill) = [p_l, p_h]
Compute:
P(ill | positive1 & positive2) = P(positive1 & positive2 | ill ) * P(ill) / P(positive1 & positive2)
= P(positive1 & positive2 | ill ) * P(ill) / ( P(positive1 & positive2 & ill) + P(positive1 & positive2 & healthy) )
= P(positive1 & positive2 | ill ) * P(ill) / ( P(positive1 & positive2 | ill) * P(ill) + P(positive1 & positive2| healthy) * P(healthy) )
= 1 / ( 1 + P(positive1 & positive2| healthy) * P(healthy) / ( P(positive1 & positive2 | ill ) * P(ill) ) )
= 1 / ( 1 + P(positive1 & positive2| healthy) * ( 1-P(ill) ) / ( P(positive1 & positive2 | ill ) * P(ill) ) )
= 1 / ( 1 + P(positive1 & positive2| healthy) * ( 1/P(ill) - 1 ) / P(positive1 & positive2 | ill ) )
= 1 / ( 1 + P(!negative1 & !negative2| healthy) * ( 1/P(ill) - 1 ) / P(positive1 & positive2 | ill ) )
= 1 / ( 1 + [max(0, 1 - spe_h1 - spe_h2), min(1-spe_l1,1-spe_l2)] * ( 1/[p_l, p_h] - 1 ) / [max(0, sen_l1 + sen_l2 -1), min(sen_h1,sen_h2)] )
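A sketch of this interval evaluation in Python (our code; intervals are (lo, hi) pairs). Since each input interval appears only once in 1/(1 + X*(1/p - 1)/Y), naive interval arithmetic gives tight bounds here:

def i_mul(x, y):
    # Interval product; both factors assumed nonnegative here.
    return (x[0] * y[0], x[1] * y[1])

def i_div(x, y):
    # Interval quotient; the divisor is assumed strictly positive.
    return (x[0] / y[1], x[1] / y[0])

def posterior_two_tests(sen1, sen2, spe1, spe2, p):
    # Interval bounds on P(ill | positive1 & positive2), with no independence
    # between the tests and interval-valued sens/spec/prevalence.
    Y = (max(0.0, sen1[0] + sen2[0] - 1.0), min(sen1[1], sen2[1]))  # P(pos1 & pos2 | ill)
    X = (max(0.0, 1.0 - spe1[1] - spe2[1]),
         min(1.0 - spe1[0], 1.0 - spe2[0]))                        # P(pos1 & pos2 | healthy)
    odds = (1.0 / p[1] - 1.0, 1.0 / p[0] - 1.0)                    # interval for 1/P(ill) - 1
    t = i_div(i_mul(X, odds), Y)                                   # requires Y > 0
    return (1.0 / (1.0 + t[1]), 1.0 / (1.0 + t[0]))

print(posterior_two_tests((0.99, 0.999), (0.99, 0.999),
                          (0.999, 0.9999), (0.999, 0.9999),
                          (0.0001, 0.001)))
# roughly (0.089, 1.0) for these inputs: still nearly vacuous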
$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$
Now we extend the results to the case of n tests.
Given:
sensitivities P(positive1 | ill) = [sen_l1,sen_h1] ,..., P(positiven | ill) = [sen_ln,sen_hn]
specificities P(negative1 | healthy) = [spe_l1,spe_h1] ,..., P(negativen | healthy) = [spe_ln,spe_hn]
prevalence P(ill) = [p_l, p_h]
We can compute:
P(ill | positive1 &...& positiven) = P(positive1 &...& positiven | ill ) * P(ill) / P(positive1 &...& positiven)
= P(positive1 &...& positiven | ill ) * P(ill) / ( P(positive1 &...& positiven & ill) + P(positive1 &...& positiven & healthy) )
= P(positive1 &...& positiven | ill ) * P(ill) / ( P(positive1 &...& positiven | ill) * P(ill) + P(positive1 &...& positiven| healthy) * P(healthy) )
= 1 / ( 1 + P(positive1 &...& positiven| healthy) * P(healthy) / ( P(positive1 &...& positiven| ill ) * P(ill) ) )
= 1 / ( 1 + P(positive1 &...& positiven| healthy) * ( 1-P(ill) ) / ( P(positive1 &...& positiven| ill ) * P(ill) ) )
= 1 / ( 1 + P(positive1 &...& positiven| healthy) * ( 1/P(ill) - 1 ) / P(positive1 &...& positiven| ill ) )
= 1 / ( 1 + P(!negative1 & ...& !negativen| healthy) * ( 1/P(ill) - 1 ) / P(positive1 &...& positiven | ill ) )
= 1 / ( 1 + P(!negative1 & ...& !negativen| healthy) * ( 1/[p_l, p_h] - 1 ) / P(positive1 &...& positiven | ill ) )
We therefore need to compute
P(positive1 &...& positiven | ill )
P(!negative1 & ...& !negativen| healthy)
1) Compute P(positive1 &...& positiven | ill )
* We have P(positive1 | ill) = [sen_l1,sen_h1]
* Once we have P(positive1 &...& positivek | ill ) = [l, h], we compute P(positive1 &...& positivek & positive(k+1) | ill ) as
P(positive1 &...& positivek & positive(k+1) | ill )
= P( (positive1 &...& positivek ) & positive(k+1) | ill )
= [max(0, l + sen_l(k+1) - 1), min(h, sen_h(k+1))]
* We therefore can compute the result of P(positive1 &...& positiven | ill )
2) Compute P(!negative1 & ...& !negativen| healthy)
* Since P(negative1 | healthy) = [spe_l1,spe_h1] ,..., P(negativen | healthy) = [spe_ln,spe_hn]
we have P(!negative1 | healthy) = [1 - spe_h1,1 - spe_l1] ,..., P(!negativen | healthy) = [1 - spe_hn, 1- spe_ln]
* We have P(!negative1 | healthy) = [1 - spe_h1,1 - spe_l1]
* Once we have P(!negative1 & ...& !negativek| healthy) = [l, h], we compute P(!negative1 & ...& !negativek & !negative(k+1)| healthy) as
P(!negative1 & ...& !negativek & !negative(k+1)| healthy)
= P( (!negative1 & ...& !negativek) & !negative(k+1)| healthy)
= [max(0, l + (1 - spe_h(k+1)) - 1), min(h, 1 - spe_l(k+1))]
= [max(0, l - spe_h(k+1)), min(h, 1 - spe_l(k+1))]
* We therefore can compute the result of P(!negative1 & ...& !negativen| healthy)
3) Summary,
* We first compute
P(positive1 &...& positiven | ill )
P(!negative1 & ...& !negativen| healthy)
following steps described in 1) and 2)
* Then we compute
P(ill | positive1 &...& positiven) = 1 / ( 1 + P(!negative1 & ...& !negativen| healthy) * ( 1/[p_l, p_h] - 1 ) / P(positive1 &...& positiven | ill ) )
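Finally, a Python sketch of this whole n-test summary (names ours; the degenerate case where the conjunction of sensitivities can be zero is mapped to the vacuous interval):

def frechet_and_intervals(intervals):
    # Bounds on P(E1 & ... & En) from interval-valued marginals (l_i, h_i).
    lo, hi = intervals[0]
    for l, h in intervals[1:]:
        lo, hi = max(0.0, lo + l - 1.0), min(hi, h)
    return lo, hi

def posterior_n_tests(sens, specs, prev):
    # Bounds on P(ill | positive1 & ... & positiven); sens and specs are lists
    # of (lo, hi) intervals, prev is an interval.
    num = frechet_and_intervals(sens)                    # P(pos1 & ... & posn | ill)
    comp = [(1.0 - h, 1.0 - l) for (l, h) in specs]      # P(!neg_i | healthy)
    den = frechet_and_intervals(comp)                    # P(pos1 & ... & posn | healthy)
    odds = (1.0 / prev[1] - 1.0, 1.0 / prev[0] - 1.0)    # 1/P(ill) - 1
    if num[0] <= 0.0:
        return (0.0, 1.0)  # numerator may be zero: no nontrivial lower bound
    return (1.0 / (1.0 + den[1] * odds[1] / num[0]),
            1.0 / (1.0 + den[0] * odds[0] / num[1]))

sens = [(0.99, 0.999)] * 3
specs = [(0.999, 0.9999)] * 3
print(posterior_n_tests(sens, specs, (0.0001, 0.001)))
# roughly (0.088, 1.0): even three positive tests stay nearly vacuous without
# an independence assumption between the tests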