Post date: Jul 05, 2021 8:50:47 PM
The variance of a data set of intervals should be different from the variance for a Dempster-Shafer structure of exactly the same intervals. See the bottom of the Moments page for a discussion of why this is. See also Variance-extreme distributions for some very simple numerical examples with two intervals that illustrate this.
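The gap can be seen with a tiny hypothetical two-interval pair (chosen only for illustration; it is not necessarily one of the examples on that page). This is a Python sketch rather than the R used later in the post, with `pvar` an assumed helper computing the population variance:

```python
import itertools

def pvar(xs):
    # population (1/n) variance of equally weighted point masses
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# hypothetical pair of intervals, chosen only for illustration
intervals = [(0.0, 1.0), (0.25, 0.75)]

# interval-data interpretation: one point mass per interval; the largest
# variance is attained at some choice of endpoints, so try them all
data_max = max(pvar(list(pts)) for pts in itertools.product(*intervals))

# Dempster-Shafer interpretation: put a half-half Bernoulli on each
# focal element, i.e. equal weight on all four endpoints
dss_var = pvar([e for iv in intervals for e in iv])

print(data_max, dss_var)  # prints 0.140625 0.15625
```

Already for this pair the Dempster-Shafer mixture (0.15625) exceeds the largest one-point-per-interval variance (0.140625).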
The realisation that Dempster-Shafer structures and collections of intervals in general have different upper variances arose while considering Marco's 10 intervals, described in a FAQ about moments:
[1, 3]
[1.1, 2.9]
[1.19999, 2.8]
[1.3, 2.70001]
[1.39999, 2.60001]
[1.5, 2.5]
[1.6, 2.4]
[1.69999, 2.3]
[1.8, 2.20001]
[1.89999, 2.10001]
If these intervals are equally weighted focal elements of a Dempster-Shafer structure, its biggest possible variance comes from mixing together 10 little Bernoulli distributions, each placing half its mass on each endpoint of the corresponding interval, which gives a variance of 0.3850034. The left figure below shows in blue where the masses would be.
On the other hand, if we apply Vladik’s variance algorithms to the 10 intervals as though they are imprecise data points, interpreting each interval as a single point of zero variance, the largest possible variance would be about 0.3849054, very slightly smaller. It is the variance of 10 point masses at {3, 2.9, 1.19999, 2.70001, 1.39999, 1.5, 1.6, 1.69999, 2.20001, 1.89999}, which extremise the variance among all distributions with a single point mass per interval. The right figure shows in blue where the masses would be in this distribution.
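Both numbers can be checked directly. A Python sketch (mirroring the R snippets in the archived discussion, with `pvar` an assumed helper for the population variance):

```python
# Marco's 10 intervals as (lo, hi) pairs
a = [(1, 3), (1.1, 2.9), (1.19999, 2.8), (1.3, 2.70001), (1.39999, 2.60001),
     (1.5, 2.5), (1.6, 2.4), (1.69999, 2.3), (1.8, 2.20001), (1.89999, 2.10001)]

def pvar(xs):
    # population (1/n) variance of equally weighted point masses
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# DSS bound: equal mass 1/20 on all endpoints (mixture of 10 Bernoullis)
dss = pvar([e for lo_hi in a for e in lo_hi])

# interval-data bound: one point mass per interval, at the extremising
# endpoints (1, 1, 0, 1, 0, 0, 0, 0, 1, 0), where 1 means right endpoint
which = [1, 1, 0, 1, 0, 0, 0, 0, 1, 0]
data = pvar([iv[w] for iv, w in zip(a, which)])

print(round(dss, 7), round(data, 7))  # prints 0.3850034 0.3849054
```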
See Dominik’s Matlab code for algorithms to evaluate maximal variances. See also some stupid code to search for distributions that extremize variance for the simplest problems with two intervals.
This post has been edited with corrections by Marco. The original post and corrections are archived below.
Scott originally said:
I had a little crisis when I realised that the variance of a data set of intervals should be different from the variance for a DSS of exactly the same intervals.
There are some questions for Marco on FAQ on the first of the two questions about moments. It seems to me that the biggest possible variance for Marco’s 10-element DSS comes from mixing together 10 Bernoulli distributions over those respective ranges, which gives the largest variance as 0.385, much smaller than Marco’s value. The left figure below shows in blue where the masses would be. Are there other distributions that could yield bigger variance for the mixture?
On the other hand, it seems to me that if we apply Vladik’s variance algorithms to the 10 intervals as though they are data, and we interpret the intervals to have zero variance, the largest possible variance would be about 0.3825, which is smaller still. The right figure shows in blue where the masses would be in one of the distributions that extremise variance and have one point mass in each interval. At least I think this is what Vladik’s algorithms are assuming. Is that wrong?
I haven’t been able to run Dominik’s Matlab code (evil new VPN). So I made some stupid code to search for distributions that extremize variance for the simplest problems with two intervals. These considerations about very simple numerical examples are recorded at Variance-extreme distributions.
The two FAQ questions involving moments are kind of important, so we can maybe afford some material which readers could use to get a running start on the topic. I added the subpage Moments, with some questions for Dominik, that the FAQ could then summarise or refer to.
The last time I had this kind of a weekend, I was totally and fundamentally wrong about every single idea, so I’m acutely aware I may be on drugs. I’m braced for when you tell me that I’ve completely misunderstood what’s happening. Please don’t hesitate.
Marco replied:
Scott,
I think the confusion comes from the fact that I was not using the population variance. https://sites.google.com/site/fuzzypossrisk/say/variance-extremedistributions
When I compute the population variance I get 0.385 for the ten-element DSS, as the largest variance. However, my masses are not located as attached in momdss1 and momdss2. The masses yielding the largest variance are located at (1, 1, 0, 1, 0, 0, 0, 0, 1, 0), where 0 denotes the left and 1 the right endpoint (intervals indexed from bottom to top). With the configuration of figure momdss2, which is (0, 1, 0, 1, 0, 1, 0, 1, 0, 1), I do get 0.3825, but I guess that is because it’s a suboptimal maximum. I am not sure what the configuration in figure momdss2 is meant to represent; I find it counterintuitive to have mass on both endpoints. What is the mass on both ends meant to depict?
In summary, the best upper bound on the population variance for this interval dataset is indeed 0.385. Because of the relatively small sample size, this could be computed with a brute-force algorithm. Vladik’s algorithm is not brute force, which makes it applicable to larger datasets, but it’s suboptimal (not wrong) when compared with the brute-force one. Here suboptimal means that it computes an inner approximation of the bound.
Cheers
Marco
Scott replied:
Marco:
I think the confusion comes from the fact that I was not using the population variance. https://sites.google.com/site/fuzzypossrisk/say/variance-extremedistributions
Of course. I think Ander thought of that, but I re-confused myself.
When I compute the population variance I get 0.385 for the ten-element DSS, as the largest variance. However, my masses are not located as attached in momdss1 and momdss2. The masses yielding the largest variance are located at (1, 1, 0, 1, 0, 0, 0, 0, 1, 0), where 0 denotes the left and 1 the right endpoint (interval indexed from bottom to top).
Meaning 0.3849054, right?
a = c(
interval(1, 3),
interval(1.1, 2.9),
interval(1.19999, 2.8),
interval(1.3, 2.70001),
interval(1.39999, 2.60001),
interval(1.5, 2.5),
interval(1.6, 2.4),
interval(1.69999, 2.3),
interval(1.8, 2.20001),
interval(1.89999, 2.10001))
which = c(1, 1, 0, 1, 0, 0, 0, 0, 1, 0)   # 1 = right endpoint, 0 = left
A = NULL
for (i in 1:length(a)) A = c(A, ifelse(which[[i]], a[[i]]@hi, a[[i]]@lo))
pvar(A)   # population variance: 0.3849054
My brute-force search was not brutish enough I guess.
With the configuration of figure momdss2, which is (0, 1, 0, 1, 0, 1, 0, 1, 0, 1), I do get 0.3825, but I guess that is because it’s a suboptimal maximum.
Yes, I was wrong.
I am not sure what the configuration in figure momdss2 is meant to represent; I find it counterintuitive to have mass on both endpoints. What is the mass on both ends meant to depict?
Well, it just represents a little Bernoulli distribution on each interval, with half the mass at each endpoint. When you mix together 10 such Bernoulli distributions, the variance is 0.3850034, just slightly larger than the variance under the one-point-mass-per-interval interpretation that Vladik’s algorithms assume for those intervals.
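The mixture value can also be obtained via the law of total variance: each half-half Bernoulli on [lo, hi] has its mean at the midpoint and variance equal to the squared half-width, and the mixture variance is the average of the component variances plus the variance of the component means. A Python check (with `pvar` an assumed population-variance helper, standing in for the R `pvar` used here):

```python
# Marco's 10 intervals as (lo, hi) pairs
a = [(1, 3), (1.1, 2.9), (1.19999, 2.8), (1.3, 2.70001), (1.39999, 2.60001),
     (1.5, 2.5), (1.6, 2.4), (1.69999, 2.3), (1.8, 2.20001), (1.89999, 2.10001)]

def pvar(xs):
    # population (1/n) variance of equally weighted point masses
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

mids = [(lo + hi) / 2 for lo, hi in a]           # Bernoulli means
radii2 = [((hi - lo) / 2) ** 2 for lo, hi in a]  # Bernoulli variances

total = sum(radii2) / len(a) + pvar(mids)   # law of total variance
direct = pvar([e for iv in a for e in iv])  # mass 1/20 on each endpoint

print(round(total, 7), round(direct, 7))  # prints 0.3850034 0.3850034
```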
largest = c(1.00000, 2.90000, 1.19999, 2.70001, 1.39999, 2.50000, 1.60000, 2.30000, 1.80000, 2.10001,
            3.00000, 1.10000, 2.80000, 1.30000, 2.60001, 1.50000, 2.40000, 1.69999, 2.20001, 1.89999)
pvar(largest) # 0.3850034
It’s possible, I suppose, that this difference comes from round-off error somewhere, but I think the principle that the DSS can have a larger variance than the collection of zero-variance intervals is sound, as it seems easy to demonstrate with the two-interval examples in the Variance-extreme distributions post on Say. Unless I’ve made an embarrassing mistake there too.
In summary, the best upper bound on the population variance for this interval dataset is indeed 0.385. Because of the relatively small sample size, this could be computed with a brute-force algorithm. Vladik’s algorithm is not brute force, which makes it applicable to larger datasets, but it’s suboptimal (not wrong) when compared with the brute-force one. Here suboptimal means that it computes an inner approximation of the bound.
Huh? Oh! You mean it’s not conservative?? An inner approximation of the upper bound? That sounds wrong to me. Maybe you mean outer?
Thanks for clearing most of this up, Marco. I’ll try to fix the posts I made.
Cheers,
Scott