Post date: Apr 04, 2014 12:22:39 PM
Scott asked Michael:
I recently realized that the maximum likelihood estimates for the min and max of a uniform distribution are the observed min and max of the sample data. Because that is obviously a flawed method for that case, it seems like it would be good to add the uniform to our list of challenge problems. Do you have a c-box for the uniform case?
Michael replied to Scott:
So, I didn't mean to today, but I solved the problem. It should be worth a small paper, I think. I initially solved it using Bayesian inference with a 1/sigma prior on (b-a), but I think we can get there from a pivot too. The question is how much work has already been done for us by past statisticians.
SOLUTION:
*Note that this solution is only valid for sample size n >= 2, which makes sense since we have two free parameters.*
Let "a" and "b" be the lower and upper bound of the uniform distribution in question. The (precise) c-box estimating the distribution width is
F(b-a) = 1 - q^(n-1) * ( n - (n-1)*q )
evaluated over [max(x) - min(x), Inf) (the width can be no smaller than the sample range), where
q = ( max(x) - min(x) ) / ( b - a )
and where F is the CCF <<Scott guesses this is the "cumulative confidence function" for the distribution width b-a>>. For a given value of (b-a), (a+b)/2 is uniformly distributed on [max(x)-(b-a)/2, min(x)+(b-a)/2]. I obtain this approximate (?) confidence distribution by doing Bayesian inference with a (hopefully) non-informative prior of 1/(b-a).
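For context, the identity behind this formula is a standard order-statistics fact: the sample range of n iid uniform variables, scaled by the true width, follows a Beta(n-1, 2) distribution, and the expression above is exactly its survival function:

\[
Q = \frac{\max(x)-\min(x)}{b-a} \sim \mathrm{Beta}(n-1,\,2),
\qquad
P(Q \le q) = \int_0^q n(n-1)\,t^{\,n-2}(1-t)\,dt = q^{\,n-1}\bigl(n-(n-1)\,q\bigr),
\]

so F above is just P(Q > q).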
You can validate the confidence distribution for (b-a) with the following R script:

n = 2
a = 0
b = 2
ccfTrk = array(0, 1e4)
for (kk in 1:1e4) {
  x = rbeta(n, 1, 1)*(b-a)                        # rbeta(n, 1, 1) is uniform(0,1), so x ~ uniform(0, b-a)
  qLoc = (max(x) - min(x))/(b-a)                  # pivot: sample range scaled by the true width
  ccfTrk[kk] = 1 - qLoc^(n-1)*(n - (n-1)*qLoc)    # CCF evaluated at the true width
}
# if the confidence distribution is valid, these values are uniform on [0,1]
plot(c(0,1), c(0,1), type='l')                    # reference diagonal
lines(sort(ccfTrk), (1:1e4)/1e4, col=4)           # empirical CDF of the CCF values
<<Scott says: I can tell that the variable ccfTrk is the thing that should be uniformly distributed on [0,1], but it doesn't make sense to me that b-a should appear in the quotient qLoc, which I take to be part of the c-box calculation. The difference b-a is the (unknown) parameter. Likewise, the argument of F on the left-hand side of the definition of the c-box should also be w for the possible value of the width, not the true width b-a>>
I leave to you the more difficult challenge of validating the confidence distribution for (a+b)/2; a sketch of one way to do it follows the sampling recipe below. Note that, for a given (b-a), (a+b)/2 is uniformly distributed on [max(x)-(b-a)/2, min(x)+(b-a)/2]. I'm sure it can be verified mathematically too.
I would be shocked if no one else has ever derived this as a posterior for uniform(a,b). I also seem to recall some special frequentist solution in my statistics textbook; it might've just been that (min(x), max(x)) is a sufficient statistic for this problem. I'll see what I can find. Anyway, if this leads to a paper, it should be a short paper. The ratio ( max(x) - min(x) )/( b - a ) is definitely a pivot, and I think the pair { ( max(x) - min(x) )/( b - a ) , ( max(x) + min(x) - (a+b) )/( b - a ) } is a two-dimensional pivot. So, yeah, short paper. Once you have a pivot, you have a confidence structure.
Also, note that if you want to generate a random sample of (a,b), given min(x) and max(x), take q ~ beta(n-1,2). Then,
(b-a) = (max(x)-min(x))/q
Then, for each (b-a), take u ~ beta(1,1)
(a+b)/2 = ( max(x)-(b-a)/2 ) + ( (b-a) - (max(x)-min(x)) )*u
And that will give you a joint Monte Carlo sample of (b-a) and (a+b)/2.
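Here is a minimal sketch of that validation, using the sampling recipe just given; the test case (n = 4 from uniform(0, 2)) and the loop sizes are arbitrary choices, not part of the derivation:

trials = 1000                       # number of simulated data sets
K = 2000                            # confidence-distribution samples per data set
n = 4; a = 0; b = 2                 # test case
pit = numeric(trials)
for (tt in 1:trials) {
  x = runif(n, a, b)
  r = max(x) - min(x)
  q = rbeta(K, n-1, 2)              # pivot draws, per the recipe above
  w = r/q                           # width samples
  u = runif(K)
  m = (max(x) - w/2) + (w - r)*u    # midpoint samples
  pit[tt] = mean(m <= (a+b)/2)      # confidence CDF evaluated at the true midpoint
}
# if the confidence distribution is valid, these values are uniform on [0,1]
plot(c(0,1), c(0,1), type='l', col=2)
lines(sort(pit), (1:trials)/trials)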
Also, for fun ... the expected values of (a,b) derived from this confidence distribution are as follows:
E[ (a+b)/2 | min(x),max(x) ] = ( min(x) + max(x) ) / 2
E[ (b-a) | min(x),max(x) ] = ....
Actually, I have not been able to reliably calculate the expectation of a 1/beta variable. I'm not 100% sure that it's stable, pretty sure, but not 100%.
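For what it's worth, the inverse moment of a beta variable does have a closed form (a standard beta-moment calculation), which settles the stability question:

\[
Q \sim \mathrm{Beta}(\alpha,\beta), \ \alpha > 1: \qquad
E\!\left[Q^{-1}\right] \;=\; \frac{B(\alpha-1,\,\beta)}{B(\alpha,\,\beta)} \;=\; \frac{\alpha+\beta-1}{\alpha-1},
\]

so with alpha = n-1 and beta = 2,

\[
E\!\left[(b-a)\mid \min(x),\max(x)\right] \;=\; \bigl(\max(x)-\min(x)\bigr)\,\frac{n}{n-2}, \qquad n \ge 3,
\]

and the expectation is infinite for n = 2 (alpha = 1), which is presumably why it looked unstable.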
I guess you could just use the median as your canonical point estimate ...
Your point estimate for (a+b)/2 will still be ( min(x) + max(x) ) / 2.
Your point estimate for (b-a) will be as follows:
(b-a)_hat = ( max(x) - min(x) ) * ( 1 / qbeta( 0.5 , n-1 , 2 ) )
Here follow some sample values of the median correction factor 1/qbeta(0.5, n-1, 2):

    n     1/qbeta(0.5, n-1, 2)
    2     3.414214
    3     2.000000
    4     1.627942
    5     1.457323
    6     1.359527
    7     1.296159
    8     1.251770
    9     1.218947
   10     1.193692
   20     1.089930
   30     1.058556
   40     1.043411
   50     1.034490
   60     1.028611
   70     1.024444
   80     1.021336
   90     1.018930
  100     1.017011
  200     1.008448
  300     1.005620
  400     1.004210
  500     1.003366
  600     1.002804
  700     1.002402
  800     1.002101
  900     1.001868
 1000     1.001681
So, as you would expect, as n -> infinity, we converge to the maximum likelihood estimate of (b-a). The asymptotic formula appears to be
(b-a)_hat = ( max(x) - min(x) ) * ( 1 + 1.68/n )
and let's say that this formula is valid for n > 20.
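As a quick check of that rule of thumb against the exact median factor (the n values here are arbitrary):

n = c(20, 50, 100, 500, 1000)
cbind(n, exact = 1/qbeta(0.5, n-1, 2), approx = 1 + 1.68/n)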
INITIAL THOUGHTS ABOUT THE LIKELIHOOD, FOR POSTERITY OR SOMETHING; I DON'T KNOW
If you imagine the (a,b) plane, where "a" and "b" are your lower and upper bound, respectively, the curves of constant likelihood are all of the form b-a = constant. Those lines are clipped by the constraints a <= min(x) and b >= max(x). So, given a hypothetical value of (b-a), all values of (a+b)/2 in [max(x)-(b-a)/2, min(x)+(b-a)/2] are possible and equally likely. Now, there is a small not-actual-paradox: you would think that the inferred (a+b)/2 should be focused near the sample mean. However, because the likelihood is flat across all feasible midpoints for any given width, the inference depends on the data only through min(x) and max(x), not the sample mean.
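To make that concrete, the likelihood being described is

\[
L(a,b \mid x) \;=\; (b-a)^{-n}\,\mathbf{1}\{a \le \min(x)\}\,\mathbf{1}\{b \ge \max(x)\},
\]

and multiplying by the 1/(b-a) prior gives a posterior proportional to (b-a)^{-(n+1)} on the same region, which is where the confidence distribution above comes from.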
# Scott interpreted Michael's email to suggest that the following three functions can be added to the software page:
# x[i] ~ uniform with midpoint m and width w, i.e., uniform on [m - w/2, m + w/2]
nextvalue.uniform <- function(x) {n=length(x); w=(max(x)-min(x))/beta(n-1,2); m=(max(x)-w/2)+(w-(max(x)-min(x)))*uniform(0,1); return(uniform(m-w/2, m+w/2))}
parameter.uniform.width <- function(x) return((max(x)-min(x))/beta(length(x)-1,2))
parameter.uniform.midpoint <- function(x) {w=(max(x)-min(x))/beta(length(x)-1,2); return((max(x)-w/2)+(w-(max(x)-min(x)))*uniform(0,1))}
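Note that these one-liners assume the software page's convention that beta and uniform return vectors of Monte Carlo deviates rather than single numbers; if that reading is right, shims like the following make them runnable on their own:

many = 1e4                                   # number of deviates per distribution (assumed convention)
beta <- function(v, w) rbeta(many, v, w)     # note: masks base R's beta() function
uniform <- function(a, b) runif(many, a, b)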
# c-box for the width of a uniform distribution (one of its two parameters)
some = 1e4
par(mfrow=c(5,5))                        # 5x5 grid of validation plots
nn = round(runif(5*5, 2, 25))            # a random sample size for each panel
mm = c(-20, -1, 0, 1, 20)                # midpoints to test
ww = c(0.01, 0.1, 1, 4, 8)               # widths to test
for (i in 1:5) for (j in 1:5) {
  a = mm[[i]] - ww[[j]]/2
  b = mm[[i]] + ww[[j]]/2
  n = nn[[(i-1)*5 + j]]
  ccfTrk = array(0, some)
  for (kk in 1:some) {
    x = runif(n, a, b)
    qLoc = (max(x) - min(x))/(b-a)                    # pivot: scaled sample range
    ccfTrk[kk] = 1 - qLoc^(n-1)*(n - (n-1)*qLoc)      # CCF at the true width
  }
  plot(c(0,1), c(0,1), type='l', col=2)               # reference diagonal
  lines(sort(ccfTrk), (1:some)/some)                  # should hug the diagonal if valid
  title(paste(n, '~U(', a, ',', b, ')'))
}
# but what is the c-box anyway? Is it F(w) evaluated across candidate values w of the distribution width?
par(mfrow=c(1,1))
a = 1
b = 10
n = 20
x = runif(n, a, b)
bma = 0:10000/500                        # candidate widths w
q = pmin((max(x) - min(x))/bma, 1)       # pivot at each candidate width, not at the true b-a; widths below the sample range are impossible, so q is capped at 1 (ccf = 0 there)
ccf = 1 - q^(n-1)*(n - (n-1)*q)          # confidence CDF as a function of the candidate width
plot(bma, ccf, type='l')
bma[which.min(abs(ccf - 0.5))]           # median width: the point estimate (b-a)_hat from above
Michael later added:
Attached are images of Monte-Carlo samples for confidence distributions on (a,b) for the following data generated from a uniform distribution:
> x
[1] 1.037293 3.547301 1.007426 3.193071
The "exact" sample was generated from the exact confidence distribution exposited above. The "approx" sample was generated by inferring mean and standard deviation *as though* x were normal, and then transforming those moments into lower and upper bound for a uniform distribution.
Also, the blue lines mark the sample min and sample max values of "x".
As you can tell, even for a sample size of four, the moment-based approximate confidence distribution is not *terrible* when compared to the exact confidence distribution. It just ... goes outside the lines a bit, implying (a,b) values that we know are impossible.