New STD

Post date: Dec 31, 2015 4:18:4 PM

In a CNN story of 8 December 2015 entitled "What you should know about this 'new' STD", Kristine Thomason of Health.com wrote about Mycoplasma genitalium (MG):

"Though experts have known of [mycoplasma genitalium]'s existence since the '80s, the new paper published in the International Journal of Epidemiology reveals that the bacterial infection, which resides in the urinary and genital tracts, likely spreads through sexual contact.

"To reach their findings, researchers at University College London, examined urine samples of 4,507 men and women between 18 and 44 years old who were sexually active with at least one partner. Of these participants, 48 women and 24 men were diagnosed with MG. However, when the researchers tested urine samples from about 200 teenagers who had never had sex, zero tested positive for the infection."

Is it reasonable to conclude from these data that the roughly 200 teenagers who had never had sex have a lower infection rate than the sexually active men and women?
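As a rough first look, it helps to see what the adult rate would imply for a group of 200. The base R sketch below uses only the counts quoted above; the variable names are illustrative, and treating the teen sample as exactly 200 is an assumption, since the story says "about 200".

```r
# Counts quoted in the CNN story
adult_cases <- 48 + 24
adult_n     <- 4507
teen_n      <- 200   # assumption: the story says "about 200"

adult_rate          <- adult_cases / adult_n             # observed adult infection rate
expected_teen_cases <- teen_n * adult_rate               # cases expected among teens at that rate
p_zero              <- dbinom(0, teen_n, adult_rate)     # chance of seeing no cases at that rate

print(round(c(adult_rate = adult_rate,
              expected_teen_cases = expected_teen_cases,
              p_zero = p_zero), 3))
```

The adult rate is about 1.6%, so roughly three cases would be expected among 200 people at that rate, and the chance of seeing none is about 0.04. This plug-in calculation ignores the sampling uncertainty in the adult rate itself, which a more formal analysis should account for.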

We can answer this question with confidence boxes (c-boxes), but doing so requires subtracting one c-box from another. We can use the Megan software if convenient, or Risk Calc, or R with the pbox.r library. Here is an R script that does it:

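source("C:\\Users\\Scott\\Google Drive\\ramas_pboxr\\pbox.r")  # load the pbox.r library (adjust the path to your own copy)

adults = CBbinomial(48+24, 4507)  # c-box for the adult infection rate: 72 cases among 4,507

teens = CBbinomial(0, 200)  # c-box for the teen infection rate: 0 cases among 200

adults > teens  # interval probability that the adult rate exceeds the teen rate

d = adults - teens  # c-box for the difference between the two rates

pl(-.2,.2)  # set up the plot with an x-axis running from -0.2 to 0.2

red(d)  # draw the difference in red

abline(v=0)  # vertical reference line at zero difference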

Executing this script in R yields the output "Interval: [0.95, 1]" and the graph below depicting the difference d between adult and teenager infection rates.

Insofar as this difference lies to the right of zero, there is statistical evidence that adults have a higher infection rate than teens. Its left tail, however, extends slightly into the negative range. The R expression prob(d,0) yields the interval [0, 0.05], which is the vertical extent of the difference where it crosses the line at zero. This is the (imprecise) probability that the teen rate is larger than the adult rate, and it is the complement of the value [0.95, 1] of the logical expression adults > teens. We therefore have at least 95% confidence that adults have a greater infection rate than teens.

Would the conclusion have been as strong if one of the 200 teenagers had had the disease?  Changing the 0 to a 1 in the specification of the teenager c-box would lead to the difference below:

The expression prob(d,0) yields the interval [0.035, 0.185]. Its complement is [0.815, 0.965], so we could be sure of only 81.5% confidence that adults have a higher infection rate than teens. The evidence for the conclusion would therefore have been weaker had there been even one case among the teenagers.
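For readers without pbox.r, both intervals can be approximated in base R. The sketch below is not the library's method. It assumes that the c-box for k cases in n trials is bounded on the left by Beta(k, n-k+1) and on the right by Beta(k+1, n-k), that the adult and teen samples are independent, and that 0 <= k < n. The function names prob_less and prob_teens_exceed are hypothetical.

```r
# Sketch only: approximates prob(d, 0) with base R rather than pbox.r.
# Assumptions: the binomial c-box for k cases in n trials is bounded by
# Beta(k, n-k+1) and Beta(k+1, n-k); the two samples are independent;
# 0 <= k < n. Function names are hypothetical.

# P(X < Y) for independent X ~ Beta(a1, b1) and Y ~ Beta(a2, b2), computed
# by averaging P(Y > x) over an even grid of X's quantiles.
prob_less <- function(a1, b1, a2, b2, m = 20000) {
  if (a2 == 0) return(0)  # Y is a point mass at 0, so X < Y cannot happen
  if (a1 == 0) return(1)  # X is a point mass at 0 and Y > 0 almost surely
  x <- qbeta((seq_len(m) - 0.5) / m, a1, b1)
  mean(pbeta(x, a2, b2, lower.tail = FALSE))
}

# Bounds on the probability that the teen rate exceeds the adult rate.
prob_teens_exceed <- function(k_adult, n_adult, k_teen, n_teen) {
  c(
    # lower bound: adults at the right edge of their c-box, teens at the left
    lower = prob_less(k_adult + 1, n_adult - k_adult,
                      k_teen, n_teen - k_teen + 1),
    # upper bound: adults at the left edge of their c-box, teens at the right
    upper = prob_less(k_adult, n_adult - k_adult + 1,
                      k_teen + 1, n_teen - k_teen)
  )
}

zero_cases <- prob_teens_exceed(48 + 24, 4507, 0, 200)  # data as reported
one_case   <- prob_teens_exceed(48 + 24, 4507, 1, 200)  # what-if: one teen case
print(round(rbind(zero_cases, one_case), 3))
```

The bounds should land close to the intervals quoted above, though probably not exactly on them, since pbox.r works with its own discretized representation of the c-boxes. Subtracting each bound from 1 gives the corresponding confidence that the adult rate is the higher one.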

Of course, this quantitative analysis does not address the biological question of whether a person's age or even their sexual maturity rather than their sexual activity per se might be responsible for the difference in infection rates.