NOUS Object-Oriented Statistics

Terminology
Object-Oriented Statistics (OOS) is a term coined by Mark Moulton (see History) to refer to any model that requires "objectivity" as a condition of fit between the observations in a dataset and the model estimates corresponding to those observations.  NOUS is a specific flavor of OOS developed by Moulton and colleagues and embodied in Damon and an earlier algorithm called NOUS.  NOUS views entities as "points floating in space", and these points (to the degree they really are points and not mushy or malformed blobs) are called "objects".

The discussion that follows lapses into the language of psychometrics, specifically educational testing, but the concepts generalize to any field.  "Person" means "row entity".  "Item" (i.e., question) means "column entity".  Data, in this context, is what happens when row entities interact with column entities.  It could be pixels in a photograph, price points over time, or burglary rates in different neighborhoods.

Objectivity (a.k.a. "specific objectivity") is a term that Danish mathematician Georg Rasch used to describe a property he considered highly desirable in psychometrics and educational testing.  The property is that the statistics used to describe persons are the same regardless of the sample of items used to calculate them, and the statistics used to describe items are the same regardless of the sample of persons to which they are applied.  A test can only be considered "fair" if it has this property.  The Rasch model is a tool for determining whether tests have this property and, if they do not, for editing them until they do.

This makes sense, right?  For a test to be "fair", we must be able to assume that each examinee is exposed to exactly the same items.  Otherwise, we could not be sure whether an examinee passed because of real ability or because the items happened to be easy.  In practice, however, examinees rarely experience the "same test".  They may accidentally skip items or be administered a test form with different items on it.  Rasch's model offers a way to compute person statistics that are pretty much the same regardless of the sample of items administered to each person.  This makes it possible to compare examinees as though they had taken the same test, even if they were in fact exposed to different items.
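
The invariance described above can be seen directly in the standard dichotomous form of the Rasch model.  The symbols below are notation chosen for this illustration and do not come from the text: \beta_n is the ability of person n, \delta_i is the difficulty of item i, and X_{ni} = 1 means person n answered item i correctly.

```latex
% Probability that person n succeeds on item i
\[
P(X_{ni} = 1) = \frac{e^{\beta_n - \delta_i}}{1 + e^{\beta_n - \delta_i}}
\]

% The log-odds of success is the simple difference of the two parameters
\[
\ln \frac{P(X_{ni} = 1)}{1 - P(X_{ni} = 1)} = \beta_n - \delta_i
\]

% Comparing two persons n and m on the same item i, the item term cancels
\[
(\beta_n - \delta_i) - (\beta_m - \delta_i) = \beta_n - \beta_m
\]
```

Because \delta_i drops out of the last line, the comparison between two persons does not depend on which item is used to make it, provided the data fit the model.  The same cancellation works for comparing two items across persons.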

OOS is a Goal, Not an Algorithm
It is important to realize that the Rasch model does not itself calculate object-oriented statistics.  Just running the program will not do it.  What it does is tell the analyst whether a given test meets the standard of objectivity.  Where the test does not, the model produces cell estimates that differ significantly from the corresponding cell observations.  This is called "misfit".  To the degree a dataset does not "fit" the model, it is "not objective" according to Rasch's criterion.  To the degree data do match the model's estimates, the resulting student and item measures are objective, and will be the same regardless of which subset of items or persons is used to compute them (with some important caveats I won't go into here).
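
The comparison of cell estimates with cell observations can be sketched as follows.  This is a minimal illustration, not the code of any particular Rasch program: it assumes person abilities and item difficulties have already been estimated, and it uses the mean squared standardized residual (often called "outfit") as one common way to summarize misfit.  The function names and the example values are hypothetical.

```python
import numpy as np

def rasch_expected(abilities, difficulties):
    """Model probability of a correct response for every person-item cell."""
    # Each cell's logit is (person ability - item difficulty).
    logits = np.subtract.outer(abilities, difficulties)
    return 1.0 / (1.0 + np.exp(-logits))

def outfit_mean_square(observed, expected):
    """Mean squared standardized residual per item (column) and per person (row)."""
    # The variance of a 0/1 response with success probability p is p * (1 - p).
    z_squared = (observed - expected) ** 2 / (expected * (1.0 - expected))
    return z_squared.mean(axis=0), z_squared.mean(axis=1)

# Hypothetical estimates and 0/1 observations (3 persons x 2 items).
abilities = np.array([-1.0, 0.0, 1.0])
difficulties = np.array([-0.5, 0.5])
observed = np.array([[1.0, 0.0],
                     [1.0, 0.0],
                     [1.0, 1.0]])

expected = rasch_expected(abilities, difficulties)
item_fit, person_fit = outfit_mean_square(observed, expected)
```

Values near 1 indicate that observations scatter around the estimates about as much as the model expects.  Much larger values flag the items or persons whose observations depart from the model.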

How is it used?  A testing company field-tests a set of items on a sample of students.  It runs the data through a Rasch program and looks for items with high misfit statistics.  After the causes of misfit have been investigated, these items are dropped from the analysis, along with misfitting persons.  The process is repeated until no person or item misfits above a certain threshold.  The item measures are then locked in ("anchored") at those values, and only these items are included on live tests.  This increases the likelihood of a "fair" test.
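
The repeat-until-clean loop described above might look like the sketch below.  It is deliberately simplified: it prunes items only, drops one item per pass, and skips the human step of investigating why an item misfits.  The function name and the `item_misfit` callable are hypothetical.  The caller supplies whatever routine re-estimates the model and returns one misfit value per remaining item.

```python
import numpy as np

def prune_misfitting_items(data, item_misfit, threshold):
    """Repeatedly drop the worst item whose misfit exceeds `threshold`.

    data        : 2-D array, rows = persons, columns = items
    item_misfit : caller-supplied function (hypothetical) that refits the
                  model on the given columns and returns one misfit value
                  per column
    Returns the original column indices of the items that were retained.
    """
    keep = np.arange(data.shape[1])
    while keep.size > 1:
        misfit = np.asarray(item_misfit(data[:, keep]))
        worst = int(np.argmax(misfit))
        if misfit[worst] <= threshold:
            break  # every remaining item fits well enough
        # Drop a single item, then refit: removing one misfitting item
        # can change the fit statistics of all the others.
        keep = np.delete(keep, worst)
    return keep
```

Dropping one item at a time and refitting is a design choice made for this sketch.  The items that survive are the ones whose measures would then be anchored.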

It is still likely (indeed inevitable) that some examinees misfit the live test, and this will show up in live test fit statistics.  What this means is that the measure calculated for that student is to some degree "meaningless".  He may get a score, but really it is kind of uninterpretable.  (That doesn't stop us from giving him a score anyway, but it should!)

What is true of the Rasch model is also true of the NOUS model behind Damon.  Damon does not itself compute objective statistics.  It shows whether you succeeded in doing so and offers a pathway for approaching objectivity.  Some models have this objectivity property; most do not.  Models that do not have it depend for their validity on the sample of entities being "representative" of the population.  They yield "sample-dependent" statistics.  OOS (objective) models do not require representative samples.  They yield "sample-independent" or "sample-free" statistics.

OOS Models
  • Rasch Model
  • NOUS (both Damon and its first incarnation)
  • Some other matrix decomposition methods, such as Singular Value Decomposition, so long as these methods are adapted to handle misfit, randomly and non-randomly missing values, correct determination of rank (dimensionality), and other such requirements

Non-OOS Models
  • Multivariate Regression
  • Correlations, Means, Standard Deviations, ANOVA, etc.
  • Neural Networks
  • Bayesian Methods (to a degree)

These classifications are off-the-cuff.  A proper determination of the "objectivity" properties of the various methodologies has not been carried out.

I hasten to point out that there is nothing "wrong" with non-OOS methods.  They simply require the statistical power of representative samples, whereas OOS methods do not.
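
Singular Value Decomposition appears in the OOS list above only on condition that it is adapted.  For orientation, the unadapted starting point is sketched here: a truncated SVD that produces cell estimates of a chosen rank.  The function name is hypothetical, the rank is simply handed in by the caller, and the sketch does nothing about missing cells or misfit, which are exactly the adaptations the list says are required.

```python
import numpy as np

def truncated_svd_estimates(data, rank):
    """Best rank-`rank` least-squares approximation of a complete data matrix."""
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    # Keep only the `rank` largest singular values and their vectors.
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]

# Hypothetical complete dataset built from 2-dimensional row and column factors.
rows = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
cols = np.array([[1.0, 2.0], [3.0, 1.0], [0.0, 1.0]])
data = rows @ cols.T

estimates_rank2 = truncated_svd_estimates(data, rank=2)  # reproduces the data
estimates_rank1 = truncated_svd_estimates(data, rank=1)  # too few dimensions
```

With the correct rank, the estimates match the observations.  With too low a rank they do not, which is one reason the determination of dimensionality matters.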

Rasch vs. Damon
The Rasch model gets its objectivity property by requiring that items reside in a 1-dimensional space.  Items differ only in terms of difficulty on some defined dimension, and in no other respect.  (So long as items are in the same dimension, it doesn't matter that persons are multidimensional.)  Thus, a math test has to be composed only of math items, and they should be the same type of math item.  A test of upper body strength should consist only of tasks that test upper body strength.  Also, items need to be positively correlated with each other.

Damon's model generalizes the Rasch model to multidimensional spaces (without the requirement for positive correlation -- a complex topic not treated here).  Items must reside in the same D-dimensional space, and differ only in their location in that space.  Unlike Rasch, a test may contain a mix of math, language, and science items, but each item must partake of at least a little math, a little language, and a little science, and the items should not introduce foreign elements unshared by the rest of the items, like fluency in French.  It does not matter that persons are more complex than the items; ability in areas besides math, language, and science is automatically filtered out so long as the items are constrained to the math/science/language space.

Whereas Rasch reports one statistic per entity (person ability or item difficulty), Damon reports an array of spatial coordinates for each entity.  Rasch looks at the world through the prism of a yardstick.  Damon looks at the world through the prism of spaces.  Humans view the physical universe in three spatial dimensions.  Damon views it in 10 dimensions, or 100, or however many are necessary to capture the shared richness of the items.

To the degree all items participate in the same D-dimensional space, Damon's objectivity properties kick in:
  • Persons map into the same spatial location (as coordinates) regardless of the items used to map them.
  • Items map into the same spatial location regardless of the persons used to map them.
  • The results of one analysis can be applied to another that participates in the same space.
  • And, let us not forget, predictions of values for missing cells will be accurate, in this sense:  
    Damon's cell prediction will approximate the value obtained by averaging a large number of independent observations for that cell, if such were possible.
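
The first and last properties in this list can be illustrated with a small numerical sketch.  It is not Damon's algorithm.  It assumes, purely for illustration, that each cell value is the dot product of a person's coordinates and an item's coordinates, that the item coordinates are already known ("anchored"), and that the data are noise-free.  All names and values are hypothetical.

```python
import numpy as np

def locate_person(responses, item_coords):
    """Least-squares coordinates of one person, given anchored item coordinates."""
    coords, *_ = np.linalg.lstsq(item_coords, responses, rcond=None)
    return coords

# Hypothetical anchored coordinates for 8 items in a 2-dimensional space.
item_coords = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0],
                        [1.0, 2.0], [3.0, 1.0], [1.0, -1.0], [2.0, 3.0]])
person = np.array([1.5, -0.5])       # the person's "true" location
responses = item_coords @ person     # noise-free cell values for this person

# Map the same person using two non-overlapping sets of items.
from_first_half = locate_person(responses[:4], item_coords[:4])
from_second_half = locate_person(responses[4:], item_coords[4:])

# Predict a cell the person never "saw" when only the first four items were used.
predicted_last_cell = item_coords[7] @ from_first_half
```

Both item sets recover the same coordinates, and the prediction for the unseen cell agrees with the value that would have been observed.  With noisy data the agreement would be approximate, and it holds only to the degree the items really share the same space.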

Projection into Subspaces
Damon has a special method for handling items that do not fit into the same space as the remaining items.  It projects that item into the subspace erected by the other items, filtering out all extraneous dimensions without disturbing the subspace in any way.  This has many useful applications.
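
One way to picture such a projection is the ordinary least-squares sketch below.  It is a stand-in, not Damon's actual method, which is not specified here.  It assumes the person coordinates have already been derived from the other items and are held fixed, so the subspace is untouched.  The function name and the numbers are hypothetical.

```python
import numpy as np

def project_item(item_column, person_coords):
    """Split one item's observations into an in-subspace part and a remainder.

    person_coords : persons x D array, fixed beforehand using the other items
    item_column   : one observation per person for the item being projected
    """
    # Coordinates of the item within the existing D-dimensional space.
    item_coords, *_ = np.linalg.lstsq(person_coords, item_column, rcond=None)
    in_subspace = person_coords @ item_coords
    extraneous = item_column - in_subspace   # what the subspace cannot express
    return item_coords, in_subspace, extraneous

# Hypothetical person coordinates (4 persons, 2 dimensions).
person_coords = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])

# An item that mostly lives in the space, plus a component foreign to it.
shared_part = person_coords @ np.array([2.0, 1.0])
foreign_part = np.array([1.0, 1.0, -1.0, 0.0])   # orthogonal to the subspace
item_column = shared_part + foreign_part

item_coords, in_subspace, extraneous = project_item(item_column, person_coords)
```

The person coordinates are only read, never updated, so the existing space is not disturbed.  The foreign component ends up entirely in the remainder.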

Science vs. Description
Object-Oriented Statistics represents a paradigm, not a methodology per se.  Most statistical methods are descriptive, not intended to generalize beyond a given population.  While valuable, description is not science.  

Science wants the intrinsic properties of objects.  Descriptive statistics do not provide these.  Science wants invariant laws that relate objects to each other in predictable ways.  Descriptive statistics do not establish invariance; predictability depends on representative samples.  Science wants to provide a basis for technology -- the invention of systems that act in a predictable manner according to the intrinsic properties of their components.  Descriptive statistics are unsafe for developing technology.  

Object-oriented statistics do not suffer the limitations of descriptive statistics.  By viewing the universe as the interaction of objects containing intrinsic properties, and by discovering and tabulating those properties and how they interact with other objects, object-oriented statistical methods derive invariant laws and provide a foundation for safe and predictable technologies in fields that, like education, have been resistant to scientific treatment.

Damon is a step in that direction.