Here is what it looks like to run Damon in an interactive Python interpreter. (To write reusable programs, you type the same commands into a Python script, a plain text file with a .py extension.) In this interactive session, we:

- create an artificial 3-dimensional dataset with noise and about 10% of cells missing
- run coord() to find the best dimensionality and compute row and column coordinates
- run base_est() to estimate every cell, including the missing ones
- compare the estimates to the "true" model values

Running Damon is not hard, but it does require some comfort with coding in Python and a way to look up how to use Damon's methods. If you are willing to climb that learning curve, Damon is as powerful and customizable as you want it to be.
# Import NumPy and Damon, plus some tools
>>> import numpy as np
>>> import damon1.core as dmn
>>> import damon1.tools as dmnt

# Create a 100 x 80 (rows x cols) artificial dataset. Make it
# 3-dimensional. Add noise. Make 10% of cells missing.
>>> created = dmn.create_data(100, 80, ndim=3, noise=5, p_nan=0.10)
create_data() is working...
Number of Rows= 100
Number of Columns= 80
Number of Dimensions= 3
Data Min= -9.097
Data Max= 8.604
Proportion made missing= 0.098
Not-a-Number Value (nanval)= -999.0
create_data() is done.
Contains:
['fac0coord', 'model', 'fac1coord', 'data', 'anskey']

# "d" is a Damon object containing "data", with noise and missing values.
>>> d = created['data']

# "m" is a Damon object containing "model" values, no noise or missing values.
>>> m = created['model']

# Here's a snippet of what the data looks like. -999 means missing.
>>> print d.coredata
[[  -1.45   -0.43   -3.52 ...,    1.96    0.66 -999.  ]
 [  -2.3     0.97    2.91 ...,   -3.91    2.88   -2.49]
 [-999.     -1.1  -999.   ...,    0.24    0.17   -1.16]
 ...,
 [   0.7     2.7     1.83 ...,    4.01    5.92    0.54]
 [  -0.17   -1.6     2.51 ...,   -0.36   -4.02   -0.37]
 [  -2.34   -1.29   -1.16 ..., -999.     -2.67 -999.  ]]

# Run the coord() method to get coordinates for some optimal
# dimensionality between 1 and 6. The coordinates are not displayed
# but stored "inside" the Damon object.
>>> d.coord(ndim=[range(1, 7)])
coord() is working...
Getting best dimensionality...
1..2..3..4..5..6..

Dim   Acc    Stab   Obj    Speed  Err
1     0.439  0.889  0.625  0.432  2.383
2     0.638  0.936  0.773  0.87   2.044
3     0.794  0.955  0.871  0.996  1.617
4     0.791  0.821  0.806  0.88   1.628
5     0.773  0.698  0.735  0.813  1.692
6     0.719  0.617  0.666  0.796  1.882

Best Dimensionality = 3

Seed  Acc    Stab   Obj    Speed  Err
1     0.794  0.955  0.87   0.985  1.617
2     0.794  0.955  0.87   0.988  1.617
3     0.794  0.955  0.87   0.996  1.617

Best coordinate seed is 3, out of 3 attempts.
Warning in coord()/seed(): Unable to find starting coordinates
that meet your 'seed' requirements.
It is possible the dataset cannot yield the desired objectivity.

Dim  Fac  Iter  Change   jolt_
3    0    0     1.0001
3    1    0     1.0001
3    0    1     0.05837
3    1    1     0.05837
3    0    2     0.03156
3    1    2     0.03156
3    0    3     0.00237
3    1    3     0.00237
3    0    4     0.00019
3    1    4     0.00019

coord() is done -- see my_obj.coord_out
Contains:
['ndim', 'fac1coord', 'anchors', 'changelog', 'facs_per_ent', 'fac0coord']

# Using the coordinates, calculate an estimate for each cell, including missing.
>>> d.base_est()
base_est() is working...
base_est() is done -- see my_obj.base_est_out
Contains:
['nheaders4cols', 'key4rows', 'nheaders4rows', 'rowlabels', 'validchars',
 'rowkeytype', 'coredata', 'colkeytype', 'nanval', 'collabels', 'key4cols',
 'ecutmaxpos']

# Here are the estimates. Missing cells are filled in.
>>> estimates = d.base_est_out['coredata']
>>> print estimates
[[-1.22  1.59 -0.69 ...,  0.74  2.23  3.55]
 [-2.06  0.67  2.29 ..., -1.2   0.93 -2.18]
 [ 0.47 -1.    0.22 ..., -0.94 -1.66 -2.26]
 ...,
 [-0.63  1.9   3.78 ...,  3.17  4.74  0.32]
 [-0.37 -0.98  0.64 ..., -1.97 -1.96 -3.24]
 [-3.9   0.76 -0.92 ..., -4.41 -1.05 -0.15]]

# Here are the original "true" values, i.e., the model values
# without noise or missing values.
>>> true = m.coredata
>>> print true
[[-0.55  1.22 -1.42 ...,  1.11  2.36  3.42]
 [-2.3   0.57  2.16 ..., -1.56  1.03 -1.34]
 [-0.15 -0.65  0.93 ..., -1.11 -1.38 -2.06]
 ...,
 [-0.7   1.85  3.78 ...,  2.81  5.07 -0.21]
 [-0.47 -0.7   1.11 ..., -1.56 -1.57 -2.37]
 [-4.19  0.36 -0.76 ..., -4.7  -0.87  0.64]]

# How well do the estimates match the true values? A 0.976 correlation.
>>> est_v_true = dmnt.correl(estimates, true)
>>> print est_v_true
0.97596437021

# How well did Damon predict the true values of the missing cells?
# The correlation is almost as high!
>>> missing = d.coredata == -999
>>> est_v_true_missing = dmnt.correl(estimates[missing], true[missing])
>>> print est_v_true_missing
0.975683152898

**************************************************
# The above looks like a lot of code, but it's not.
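One small idiom from the session deserves a note: Damon marks missing cells with the sentinel value -999 (its nanval) rather than with NaN, which is why a plain equality comparison like `missing = d.coredata == -999` produces a boolean mask of the missing cells. A minimal NumPy illustration, with made-up arrays that are not from the session above:

```python
import numpy as np

nanval = -999.0
data = np.array([[1.5, nanval, 2.0],
                 [nanval, 0.5, 3.5]])

# A sentinel supports equality tests. By contrast, np.nan == np.nan
# is False, so NaN-coded data would need np.isnan() instead.
missing = data == nanval
print(missing.sum())       # number of missing cells: 2

# The boolean mask pulls out exactly those cells from any
# same-shaped array, such as a matrix of estimates.
estimates = np.array([[1.4, 0.9, 2.1],
                      [2.2, 0.6, 3.3]])
print(estimates[missing])  # the two estimates for the missing cells
```

This is the pattern behind correlating `estimates[missing]` with `true[missing]`: both masks select the same cell positions, so the two extracted vectors line up element for element.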
# Here is what it looks like to calculate estimates without all the
# printouts (controlled by verbose=None):
>>> created = dmn.create_data(100, 80, ndim=3, noise=5, p_nan=0.10, verbose=None)
>>> d = created['data']
>>> d.coord(ndim=[range(1, 7)])
>>> d.base_est()
>>> print d.base_est_out['coredata']
[[-1.22  1.59 -0.69 ...,  0.74  2.23  3.55]
 [-2.06  0.67  2.29 ..., -1.2   0.93 -2.18]
 [ 0.47 -1.    0.22 ..., -0.94 -1.66 -2.26]
 ...,
 [-0.63  1.9   3.78 ...,  3.17  4.74  0.32]
 [-0.37 -0.98  0.64 ..., -1.97 -1.96 -3.24]
 [-3.9   0.76 -0.92 ..., -4.41 -1.05 -0.15]]

# In just 5 lines, we created, analyzed, and got cell predictions
# for a 3-dimensional dataset.
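For readers who want to see the kind of computation this pipeline performs, the whole sequence (simulate a low-dimensional dataset, solve for row and column coordinates, estimate every cell, correlate with truth) can be imitated in plain NumPy. This is only a sketch under assumptions: alternating least squares on the observed cells is a stand-in for whatever coord() actually does internally, and every name below is illustrative, not part of Damon's API.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_cols, ndim, nanval = 100, 80, 3, -999.0

# Analog of create_data(): build a rank-3 "model" matrix from row and
# column coordinates, add noise, and knock out ~10% of cells.
row_coords = rng.normal(size=(n_rows, ndim))
col_coords = rng.normal(size=(ndim, n_cols))
true = row_coords @ col_coords
data = true + rng.normal(scale=0.5, size=true.shape)
missing = rng.random(size=true.shape) < 0.10
data[missing] = nanval

# Analog of coord(): alternating least squares on observed cells only.
# Solve for row coordinates given column coordinates, then vice versa.
obs = data != nanval
R = rng.normal(size=(n_rows, ndim))
C = rng.normal(size=(ndim, n_cols))
for _ in range(50):
    for i in range(n_rows):
        m = obs[i]
        R[i] = np.linalg.lstsq(C[:, m].T, data[i, m], rcond=None)[0]
    for j in range(n_cols):
        m = obs[:, j]
        C[:, j] = np.linalg.lstsq(R[m], data[m, j], rcond=None)[0]

# Analog of base_est(): an estimate for every cell, missing included.
estimates = R @ C

# How well were the missing cells predicted?
r = np.corrcoef(estimates[missing], true[missing])[0, 1]
print(round(r, 3))  # correlation on the missing cells
```

Because the data really are low-dimensional, the coordinates recover the structure well and the correlation on the held-out cells comes out high, which is the same phenomenon the Damon session above demonstrates with its 0.976 correlation.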