Welcome!

Important Update, October 31, 2023

This website and the version of Damon it describes were last current in 2020.  Since then, I have been hard at work building related applications and reworking Damon from scratch.  When released, the new Damon will:

In the meantime, I have removed most of the resources on this website to start the transition to the new Damon, which will probably migrate to a new website.  

I don't have a completion date, but progress is steady.

-- Mark Moulton

****************************************************

Description

Pythias Consulting (meaning me, Mark Moulton) offers open-source software (Damon) for analyzing multidimensional tabular datasets.  It also offers consulting in the application of Damon to problems in psychometrics.  This website is about the legacy version of Damon, written and maintained from 2010 to 2018, with important updates in 2020 and 2023.  

Damon predicts missing cells and replaces observed values with "most likely" values.  Developed in the field of psychometrics (measurement of mental traits, especially in the field of education), Damon is also applicable to problems in statistics, prediction, and data analysis.  Some examples:

Mathematical Features

Usability Features

Limitations

Methodology

Damon implements a generalized alternating least squares (ALS) algorithm that decomposes a matrix into arrays of row and column coordinates.  The rank, or dimensionality, of these arrays is determined using Rasch-like objectivity criteria.  Data arrays may be rectangular, and they may have an arbitrarily large number of missing cells.  Data may be nominal, dichotomous, ordinal, interval, or ratio, and a given dataset may contain a mix of data types and metrics.  The optimal dimensionality is found by using Damon to predict pseudo-missing cells (observed cells temporarily treated as missing) at each of a specified range of dimensionalities and selecting the dimensionality that:  a) maximizes the accuracy of the predictions, and b) maximizes the stability of the coordinate structure.  To the degree Damon cell estimates match the original observations under these conditions, they are said to be "objective": likely to reproduce across different samples of row and column entities and unlikely to suffer from overfit.  Damon supports the editing of datasets to optimize their objectivity.
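
To make the idea concrete, here is a minimal sketch of plain ALS decomposition with missing cells, plus a dimensionality search over pseudo-missing cells.  It is not Damon's code or API.  The function names (als_decompose, choose_rank), the small ridge penalty, and the use of NumPy with NaN marking missing cells are all assumptions made for illustration.  The sketch covers only criterion (a), prediction accuracy; it does not assess the stability of the coordinate structure, and it assumes interval-like numeric data.

```python
import numpy as np

def als_decompose(data, rank, n_iter=50, ridge=1e-6, seed=0):
    """Fit row and column coordinates to the observed (non-NaN) cells.

    Hypothetical helper, not part of Damon.  Returns (rows, cols) such
    that rows @ cols.T estimates every cell, including missing ones.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = data.shape
    observed = ~np.isnan(data)
    rows = rng.normal(size=(n_rows, rank))
    cols = rng.normal(size=(n_cols, rank))
    penalty = ridge * np.eye(rank)  # keeps each small solve well-conditioned
    for _ in range(n_iter):
        # Hold column coordinates fixed; solve least squares for each row.
        for i in range(n_rows):
            mask = observed[i]
            if mask.any():
                c = cols[mask]
                rows[i] = np.linalg.solve(c.T @ c + penalty, c.T @ data[i, mask])
        # Hold row coordinates fixed; solve least squares for each column.
        for j in range(n_cols):
            mask = observed[:, j]
            if mask.any():
                r = rows[mask]
                cols[j] = np.linalg.solve(r.T @ r + penalty, r.T @ data[mask, j])
    return rows, cols

def choose_rank(data, ranks, holdout_frac=0.1, seed=0):
    """Pick the rank that best predicts cells made pseudo-missing.

    Hypothetical helper: hides a random fraction of observed cells,
    fits each candidate rank, and scores it by RMSE on the hidden cells.
    """
    rng = np.random.default_rng(seed)
    observed_idx = np.argwhere(~np.isnan(data))
    n_hold = max(1, int(holdout_frac * len(observed_idx)))
    pick = rng.choice(len(observed_idx), size=n_hold, replace=False)
    hold = observed_idx[pick]
    train = data.copy()
    train[hold[:, 0], hold[:, 1]] = np.nan  # make these cells pseudo-missing
    errors = {}
    for rank in ranks:
        rows, cols = als_decompose(train, rank, seed=seed)
        estimates = rows @ cols.T
        diff = estimates[hold[:, 0], hold[:, 1]] - data[hold[:, 0], hold[:, 1]]
        errors[rank] = float(np.sqrt(np.mean(diff ** 2)))
    best = min(errors, key=errors.get)
    return best, errors
```

Calling choose_rank(data, ranks=[1, 2, 3]) on a float array with NaN in the missing cells returns the best-scoring rank and a dictionary of holdout errors per rank.  A fuller treatment would also compare coordinates across subsamples to address criterion (b).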

In the field of psychometrics, Damon is classified as a multidimensional generalization of the Rasch model with similar objectivity properties.  In the field of data mining, it is a generalized form of alternating least squares matrix decomposition.  It is similar to the decomposition algorithms used successfully in the Netflix Prize competition, which concluded in 2009.