Damon Objects

To analyze a dataset, you need to turn it into a "Damon object".  That is what Damon() does.  (The parentheses just mean that Damon is a class, method, or function that takes parameters.)  Damon() loads data, formats it, and allows you to apply a series of "methods" to the dataset in a specific order.  There is one dataset per Damon object, and typically each method is used once within that Damon object.

Create a Damon Script Module
So far we have written our commands in Python's IDLE interpreter.  This is fine for experimentation and short commands but unwieldy for something like Damon.  The alternative is to write a Python script, called a "module".  The usual way to do this is to click on File/New Window in the IDLE menu and save the resulting text as a Python file (with a .py extension).  These text files are where you write your Python programs.

In the case of Damon, the simplest thing is to use Damon's template.py module.  This pre-loads the necessary modules and packages and contains a handy cheat-sheet of Damon methods (also accessible by importing damon1 and typing >>>  help(damon1) ).

Go to the IDLE interpreter and click on File/Open Module.  In the box, type damon1.templates.blank .  (Notice you don't need to add the .py extension; Python knows what you mean.)  A text window with a bunch of text will pop up.  

damon1.template


"""template.py
Template for writing Damon programs.

Copyright (c) 2009 - 2011, [Developer Name] for [Company Name].

Purpose:  

Damon Version:  
Python Version:
Numpy Version:

License
-------
This program references one or more software modules that are
under copyright to Pythias Consulting, LLC.  Therefore, it is subject
to either the Gnu Affero General Public License or the Pythias
Commercial License, a copy of which is contained in the current
working directory.

How To Use
----------
You can run Damon from the prompt interactively on IDLE, or
you can run it from scripts.  This template is for writing Damon

  |
  |
snip
  |
  |
  V

"""

import os
import sys

import numpy as np
import numpy.random as npr
import numpy.linalg as npla
import numpy.ma as npma

try:
    import matplotlib.pyplot as plt
except ImportError:
    pass

import damon1 as damon1
import damon1.core as dmn
import damon1.tools as dmnt


# Start programming here...


Save the blank.py file (File/Save As) into a directory of your choice under a different name.  Call it my_script1.py.  (If you skip this step, you lose your pristine blank.py file and will have to create a new one, no big deal.)

Now scan through the text of this file.  You will notice it contains some information followed by a "cheat-sheet" of Damon methods in the order you will use them (if you need to).  You can read those later.  For now, let me point out a few things.  First, this information is in a green text block, demarcated by triple quotes (""") at the top and bottom.  Python treats everything between the triple quotes as a single string of documentation (a "docstring") and does not run it as code.  You can also use the "#" pound sign to mark the rest of a given line as a comment.
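As a small illustration of the two kinds of documentation, here is a sketch that is not taken from the Damon template.  The function name describe and the variable x are made up for the example, and the print calls are written so that they run under Python 3 as well as the Python 2 used in this tutorial.

```python
"""my_script1.py
Everything between the triple quotes is one string.  Python keeps it as
documentation and does not run it as code.
"""

# A "#" comment runs to the end of its line and is ignored entirely.
x = 2 + 3  # a comment can also follow code on the same line


def describe():
    """Return a short description (this string is the function's docstring)."""
    return "x is %d" % x


print(describe.__doc__)   # the docstring is stored and can be read back
print(describe())
```

Typing help(describe) at the IDLE prompt displays the same docstring, which is how the help() listings used in this tutorial are produced.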

After the instructions there is a series of import statements.  We import from Python, Numpy, and a few other packages.  You may add to these import statements later if you want to use other Python packages.

A Note on Style
Before writing any code, you should know that there is a definite "style" for writing Python code (I learned this the hard way).  It is documented at PEP 8 -- Style Guide for Python Code.  Example:  All variables, functions, and methods are given names in lower case with connecting underscores.  CamelCase (upper-lower case) is reserved for classes.  So create_data() and base_est() are lower case while Damon() is capitalized.
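The naming rules can be seen side by side in a short sketch.  ScoreTable, create_table, and mean_score are hypothetical names invented for this illustration; they are not part of Damon.

```python
class ScoreTable(object):                     # class name: CamelCase
    """Hypothetical container, used only to illustrate naming."""

    def __init__(self, raw_scores):
        self.raw_scores = raw_scores          # attribute: lower case

    def mean_score(self):                     # method: lower case + underscore
        return sum(self.raw_scores) / float(len(self.raw_scores))


def create_table(raw_scores):                 # function: lower case + underscore
    return ScoreTable(raw_scores)


my_table = create_table([1, 0, 1, 1])         # variable: lower case + underscore
print(my_table.mean_score())
```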

Create an Array
Now we start programming.  The first thing to do is perform the necessary imports by running the module.  Hit the F5 key on your computer.  This is IDLE's shortcut for "Run Module".  Now Numpy and the other modules are loaded and ready to use.

To simulate data, let's create a numpy array of random numbers (decimals between 0 and 1) called a.  This uses the numpy.random module, which I abbreviated as npr in the import statement.  We could just use a = npr.rand(10,8), but I added the RandomState(seed=1) bit to make sure that every time I run the script it creates the same set of random numbers (corresponding to Seed #1).

Create an array

import os
import sys

import numpy as np
import numpy.random as npr
import numpy.linalg as npla
import numpy.ma as npma

try:
    import matplotlib.pyplot as plt
except ImportError:
    pass

import damon1 as damon1
import damon1.core as dmn
import damon1.tools as dmnt


# Start programming here...

# Create a 10 x 8 array of random numbers with seed 1
a = npr.RandomState(seed=1).rand(10,8)

# Print out on IDLE screen.  The "\n" character means "new line".
print 'a=\n',a

# Make the array easier to read using numpy's set_printoptions().
np.set_printoptions(precision=2,suppress=True)
print 'a=\n',a,'\n'



Save and hit F5.  Here is what pops up on the IDLE screen.

IDLE Array Display

a=
[[  4.17022005e-01   7.20324493e-01   1.14374817e-04   3.02332573e-01
    1.46755891e-01   9.23385948e-02   1.86260211e-01   3.45560727e-01]
 [  3.96767474e-01   5.38816734e-01   4.19194514e-01   6.85219500e-01
    2.04452250e-01   8.78117436e-01   2.73875932e-02   6.70467510e-01]
 [  4.17304802e-01   5.58689828e-01   1.40386939e-01   1.98101489e-01
    8.00744569e-01   9.68261576e-01   3.13424178e-01   6.92322616e-01]
 [  8.76389152e-01   8.94606664e-01   8.50442114e-02   3.90547832e-02
    1.69830420e-01   8.78142503e-01   9.83468338e-02   4.21107625e-01]
 [  9.57889530e-01   5.33165285e-01   6.91877114e-01   3.15515631e-01
    6.86500928e-01   8.34625672e-01   1.82882773e-02   7.50144315e-01]
 [  9.88861089e-01   7.48165654e-01   2.80443992e-01   7.89279328e-01
    1.03226007e-01   4.47893526e-01   9.08595503e-01   2.93614148e-01]
 [  2.87775339e-01   1.30028572e-01   1.93669579e-02   6.78835533e-01
    2.11628116e-01   2.65546659e-01   4.91573159e-01   5.33625451e-02]
 [  5.74117605e-01   1.46728575e-01   5.89305537e-01   6.99758360e-01
    1.02334429e-01   4.14055988e-01   6.94400158e-01   4.14179270e-01]
 [  4.99534589e-02   5.35896406e-01   6.63794645e-01   5.14889112e-01
    9.44594756e-01   5.86555041e-01   9.03401915e-01   1.37474704e-01]
 [  1.39276347e-01   8.07391289e-01   3.97676837e-01   1.65354197e-01
    9.27508580e-01   3.47765860e-01   7.50812103e-01   7.25997985e-01]]
a=
[[ 0.42  0.72  0.    0.3   0.15  0.09  0.19  0.35]
 [ 0.4   0.54  0.42  0.69  0.2   0.88  0.03  0.67]
 [ 0.42  0.56  0.14  0.2   0.8   0.97  0.31  0.69]
 [ 0.88  0.89  0.09  0.04  0.17  0.88  0.1   0.42]
 [ 0.96  0.53  0.69  0.32  0.69  0.83  0.02  0.75]
 [ 0.99  0.75  0.28  0.79  0.1   0.45  0.91  0.29]
 [ 0.29  0.13  0.02  0.68  0.21  0.27  0.49  0.05]
 [ 0.57  0.15  0.59  0.7   0.1   0.41  0.69  0.41]
 [ 0.05  0.54  0.66  0.51  0.94  0.59  0.9   0.14]
 [ 0.14  0.81  0.4   0.17  0.93  0.35  0.75  0.73]] 
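The effect of the seed can be checked directly.  The sketch below uses only numpy calls that already appear in this tutorial (RandomState and rand); the names a1, a2, and a3 are made up for the comparison.

```python
import numpy as np
import numpy.random as npr

# Two generators built from the same seed produce identical arrays.
a1 = npr.RandomState(seed=1).rand(10, 8)
a2 = npr.RandomState(seed=1).rand(10, 8)
print(np.array_equal(a1, a2))

# A different seed gives a different set of numbers.
a3 = npr.RandomState(seed=2).rand(10, 8)
print(np.array_equal(a1, a3))

# rand() returns decimals in the range [0, 1), not integers.
print(a1.dtype)
print(a1.min() >= 0.0 and a1.max() < 1.0)
```

Because the seed is fixed, the first value of a1 is the 4.17022005e-01 shown in the display above.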



Here, a is our data array.  Now we convert it into a Damon object.  To do this, type >>> help(dmn.Damon.__init__) to get a copy of the Damon.__init__ options and paste it into my_script1.py, making sure to include the closing parenthesis.  The result is shown below under "Initialize Damon Object".  Notice that I have switched from the "core" abbreviation to "dmn".  Notice also that when you paste into a script file, all the messy wrap-arounds go away; the script file basically has no right margin.

Syntax
Let's look at some of the syntax:  

    d = dmn.Damon(data=???,format_=???,workformat='RCD_dicts',validchars=None,...,argn)

This pattern is very common.  We define a Damon object d to be the outcome of running dmn.Damon() using a set of "arguments" or parameters inside parentheses.  (Actually, it's the result of running dmn.Damon.__init__(), but Python doesn't need the __init__.)  

The leading arguments do not have anything after the equal sign.  You have to specify these.  In this case, we make data = a (the name of the data array we just created) and format_ = 'array' (indicating that a is an array).  (The "format_" argument has a trailing underscore to differentiate it from Python's built-in format() function.)  The remaining arguments have values after the equal sign.  These are defaults you can override, which I am doing in the case of rowkeytype and colkeytype, to force the keys to be integers.

In this case, the d = dmn.Damon(...) statement you just pasted looks messy.  That's because I've given each argument its own line and added in-line documentation (following the "#" character) showing the options for each argument.  But you can get the same result more simply:

    d = dmn.Damon(data=a,format_='array',rowkeytype=int,colkeytype=int), or even 
    d = dmn.Damon(a,'array',rowkeytype=int,colkeytype=int)         # You can leave out the argument names if they are in order

I could get away with specifying just the first two arguments if all the defaults were what I want.  In this case, though, I want to override the rowkeytype and colkeytype arguments to be integer instead of string. 
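The same calling rules apply to any Python function, so they can be tried out without Damon.  In the sketch below, describe_call is a hypothetical stand-in that only reports the arguments it receives.  Its argument names are borrowed from Damon(), but the 'S60' defaults for rowkeytype and colkeytype are assumptions chosen for illustration, not Damon's documented defaults.

```python
def describe_call(data, format_, workformat='RCD_dicts', rowkeytype='S60',
                  colkeytype='S60'):
    """Hypothetical stand-in: return the arguments exactly as received."""
    return {'data': data, 'format_': format_, 'workformat': workformat,
            'rowkeytype': rowkeytype, 'colkeytype': colkeytype}


a = [[1, 2], [3, 4]]

# Keyword form: every argument is named, so their order does not matter.
by_name = describe_call(format_='array', data=a, colkeytype=int, rowkeytype=int)

# Positional form: the first two values are matched to data and format_.
by_position = describe_call(a, 'array', rowkeytype=int, colkeytype=int)

print(by_name == by_position)
print(by_position['workformat'])      # default left untouched

# Leaving out a required argument (format_) is an error.
try:
    describe_call(a)
except TypeError as err:
    print("TypeError: %s" % err)
```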

In general, you will want to override some of the Damon arguments.  That means you need to know what the options are.  To get a full explanation of the options and syntax for each argument, type >>> help(dmn.Damon.__init__) and browse the "Arguments" section.  All the information you need is there.

After building the Damon object, we tell Python to print its row labels, column labels, and core data.  We also tell it to print out the data for Row 1 and Column 5 using the core_row and core_col Damon attributes.  

Initialize Damon Object

# Create a Damon object
d = dmn.Damon(data = a,    # [<array, file, [file list], datadict, Damon object, hd5 file>  => data in format specified by format_=]
              format_ = 'array',    # [<'textfile', ['textfiles'],'array','datadict','datadict_link','datadict_whole','Damon','hd5','pickle'>]
              workformat = 'RCD_dicts',   # [<'RCD','whole','RCD_whole','RCD_dicts','RCD_dicts_whole'>]
              validchars = None,   # [<None,'Guess',['All',[valid chars],<'Num',omitted>],['Cols',{'ID1':['a','b'],'ID2':['All'],'ID3':['1.2 -- 3.5'],'ID4':['0 -- '],...}]>]
              nheaders4rows = 0,  # [number of columns to hold row labels]
              key4rows = 0,   # [<None, nth column from left which holds row keys>]
              rowkeytype = int,     # [<None, type of row keys>]
              nheaders4cols = 0,  # [number of rows to hold column labels]
              key4cols = 0, # [<None, nth row from top which holds column keys>]
              colkeytype = int,     # [<None, type of column keys>]
              check_dups = 'warn',   # [<None,'warn','stop'> => response to duplicate row/col keys]
              dtype = [object,None], # [[type of 'whole' matrix, <None, int number of decimals>], e.g. ['S60',8],[object,None] ]
              nanval = -999,    # [Value to which non-numeric/invalid characters should be converted.]
              missingchars = None,  # [<None, [list of elements to make missing]>]
              miss4headers = None, # [<None, [[list of elements to make missing in headers]>]
              recode = None, # [<None,{0:[[slice(StartRow,EndRow),slice(StartCol,EndCol)],{RecodeFrom:RecodeTo,...}],...}>]
              cols2left = None,    # [<None, [ordered list of col keys, to shift to left and use as rowlabels]>]
              selectrange = None,   # [<None,[slice(StartRow,EndRow),slice(StartCol,EndCol)]>]
              delimiter = ',',  # [<None, character to delimit input file columns (e.g. ',' for .csv and '\t' for .txt tab-delimited files)>]
              pytables = None,    # [<None,'filename.hd5'> => Name of .hd5 file to hold Damon outputs]
              verbose = True,    # [<None, True> => report method calls]
              )

print 'd.collabels =\n',d.collabels
print 'd.rowlabels =\n',d.rowlabels
print 'd.coredata =\n',d.coredata
print 'd.core_row[1] = Row 1 =\n',d.core_row[1]
print 'd.core_col[5] = Column 5 =\n',d.core_col[5]



Save and hit F5.  Here is what pops up on the IDLE screen.

Damon Object on IDLE

Building Damon object...

Rows in coredata: 10
Columns in coredata: 8 

Damon object has been built.
Contains:
['missingchars', 'core_row', 'verbose', 'dtype', 'rl_col', 'validchars', 'whole_row', 'coredata', 'recode', 'path', 'colkeytype', 'collabels', 'data_out', 'fileh', 'key4rows', 'nheaders4rows', 'cols2left', 'workformat', 'miss4headers', 'cl_row', 'rowkeytype', 'rl_row', 'selectrange', 'nanval', 'format_', 'nheaders4cols', 'core_col', 'pytables', 'rowlabels', 'check_dups', 'cl_col', 'delimiter', 'whole_col', 'whole', 'key4cols'] 

d.collabels =
[[ 0.  1.  2.  3.  4.  5.  6.  7.  8.]]
d.rowlabels =
[[  0.]
 [  1.]
 [  2.]
 [  3.]
 [  4.]
 [  5.]
 [  6.]
 [  7.]
 [  8.]
 [  9.]
 [ 10.]]
d.coredata =
[[ 0.42  0.72  0.    0.3   0.15  0.09  0.19  0.35]
 [ 0.4   0.54  0.42  0.69  0.2   0.88  0.03  0.67]
 [ 0.42  0.56  0.14  0.2   0.8   0.97  0.31  0.69]
 [ 0.88  0.89  0.09  0.04  0.17  0.88  0.1   0.42]
 [ 0.96  0.53  0.69  0.32  0.69  0.83  0.02  0.75]
 [ 0.99  0.75  0.28  0.79  0.1   0.45  0.91  0.29]
 [ 0.29  0.13  0.02  0.68  0.21  0.27  0.49  0.05]
 [ 0.57  0.15  0.59  0.7   0.1   0.41  0.69  0.41]
 [ 0.05  0.54  0.66  0.51  0.94  0.59  0.9   0.14]
 [ 0.14  0.81  0.4   0.17  0.93  0.35  0.75  0.73]]
d.core_row[1] = Row 1 =
[ 0.42  0.72  0.    0.3   0.15  0.09  0.19  0.35]
d.core_col[5] = Column 5 =
[ 0.15  0.2   0.8   0.17  0.69  0.1   0.21  0.1   0.94  0.93]
>>> 


IDLE prints out (optionally, depending on whether verbose = True):
  1. The progress of Damon's initialization process as it's occurring
  2. The "attributes" of the newly created Damon object
We see that the new Damon object contains a lot of elements, called "attributes", information that Damon methods may eventually need.  The three most important attributes are 'collabels', 'rowlabels', and 'coredata'.  Here we see that Damon automatically assigned numerical row and column labels to our data array a.  It did so because the nheaders4rows and nheaders4cols arguments were (by default) set at 0, meaning that a did not originally have any row or column headers.  Damon assigned them because most Damon methods require each row and column entity to be labeled.

Had array a already contained these labels, we would have set nheaders4rows = 1 and nheaders4cols = 1.

Row and Column Keys
Damon requires that each row and column contain a unique identifier or key.  This identifier will be included in the row and column headers (same as "labels"), but the headers may contain additional information.  For instance, if the rows are students, the rowlabels array may contain for each student a unique student key, the student's classroom, and an ethnicity code.  The key4rows argument tells which column in the rowlabels array contains the student key, where counting starts at 0 (meaning "the first column").  The key4cols argument tells which row in the collabels array contains the unique item identifier.
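To make the roles of nheaders4rows, nheaders4cols, key4rows, and key4cols concrete, here is a numpy sketch of the student example.  The roster is invented, and the slicing only imitates the layout seen in the d.rowlabels and d.collabels displays above (where both label arrays include the corner cell).  It is an assumption about the layout, not Damon's own parsing code.

```python
import numpy as np

# Hypothetical roster: student key, classroom, ethnicity code, then two items.
whole = np.array([['SID', 'Class', 'Eth', 'Item1', 'Item2'],
                  ['s01', 'A',     '2',   '1',     '0'],
                  ['s02', 'B',     '1',   '0',     '1'],
                  ['s03', 'A',     '3',   '1',     '1']])

nheaders4rows = 3   # three leading columns hold row labels
nheaders4cols = 1   # one leading row holds column labels
key4rows = 0        # the unique student key is column 0 of rowlabels
key4cols = 0        # the unique item key is row 0 of collabels

rowlabels = whole[:, :nheaders4rows]
collabels = whole[:nheaders4cols, :]
coredata = whole[nheaders4cols:, nheaders4rows:].astype(float)

# The keys are the label cells that sit next to actual data.
row_keys = rowlabels[nheaders4cols:, key4rows]
col_keys = collabels[key4cols, nheaders4rows:]

print(row_keys)
print(col_keys)
print(coredata)
```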

All Elements in Numpy Arrays Have the Same Type
Why are the labels parsed out from the data?  Aside from the obvious fact that labels can't be analyzed, Numpy requires that all elements in an array have the same type -- typically string (text), int (integer), or float (decimal).  The core data will typically (but not necessarily) be numerical while the labels are int or string.  When an array is built from values of mixed types, Numpy assigns it the most general type that can contain them all.  Building an array from a list that mixes an 'a' with numbers will convert all the numbers into strings (e.g., ['a','1','2','3']).  The most general array type possible is called "object", which can contain anything.  In Damon, the most commonly used types are 'S60' (string, with a maximum of 60 characters per cell), int (integer), and float (decimal).
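The type rules are easy to verify at the prompt.  This sketch uses only standard numpy array creation; the variable names are made up for the example.

```python
import numpy as np

numbers = np.array([1, 2, 3])
print(numbers.dtype.kind)          # 'i' means integer

# Mixing a string with numbers turns every element into a string.
mixed = np.array(['a', 1, 2, 3])
print(mixed.tolist())

# Integers mixed with a decimal are all promoted to float.
promoted = np.array([1, 2, 3.5])
print(promoted.tolist())

# dtype=object keeps each element as the Python value it was given.
anything = np.array(['a', 1, 2.5], dtype=object)
print(anything.tolist())

# An existing integer array cannot absorb a non-numeric string.
try:
    numbers[0] = 'a'
except ValueError as err:
    print("ValueError: %s" % err)
```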

Because the rowlabel and collabel arrays may be string whereas the unique row and column keys may be integer, Damon contains arguments called rowkeytype and colkeytype.  If rowkeytype = int, this means that whenever Damon extracts row keys, it will convert them to integers before doing anything with them.  Damon is smart enough to convert to integer only those keys that label actual data.  That means it won't be thrown off when the header of the row keys column is a string, like "Student_ID".
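The idea of converting only the keys that label data can be sketched in a few lines.  convert_keys is a hypothetical helper written for this illustration; it shows the behavior described above, not how Damon implements it.

```python
def convert_keys(key_column, nheaders4cols, keytype):
    """Convert only the keys that label data rows; leave header cells alone."""
    headers = key_column[:nheaders4cols]
    data_keys = [keytype(k) for k in key_column[nheaders4cols:]]
    return headers, data_keys


# The first cell is the header of the key column, the rest are student keys.
key_column = ['Student_ID', '101', '102', '103']
headers, row_keys = convert_keys(key_column, nheaders4cols=1, keytype=int)

print(headers)
print(row_keys)
```

Calling int('Student_ID') would raise a ValueError, which is why the header cell has to be set aside before the conversion.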

Damon is pretty good at handling the type of the row and column keys.  There are some scenarios where it can get confused -- when columns are assigned new names by the parse() method, for instance, or when subscale() adds new columns.  Begin by using your preferred type for the keys.  If you run into trouble, specify rowkeytype and colkeytype as string (e.g., 'S60'). 

Damon Object Attributes
As mentioned, all of the various elements of the Damon object are stored as "attributes" and can be accessed with dot notation.  Damon automatically prints out a list of newly created attributes every time it's run.  Look again at the bottom of my_script1.py.  We accessed the desired arrays using the following syntax:  my_obj.obj_attribute  :

    d.collabels         => the column labels in the d Damon object
    d.rowlabels        => the row labels in the d Damon object
    d.coredata          => the core data in the d Damon object
    d.core_row[1]     => the Row 1 data in the d Damon object
    d.core_col[5]       => the Column 5 data in the d Damon object

Notice that we are able to capture the data in any individual row or column using its key ("1" and "5" are not indexes here; they refer to unique row/column keys).  In this case the keys happen to be integers that were assigned automatically.  Usually, they will come from the dataset itself and consist of student names and item descriptors. 
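The difference between a key and an index can be seen with an ordinary dictionary.  The sketch below uses made-up names and a few values from array a; it assumes, as the d.rowlabels display suggests, that the automatically assigned keys start at 1.

```python
# Rows of a small data table, in storage order.
rows = [[0.42, 0.72], [0.40, 0.54], [0.42, 0.56]]

# Automatically assigned integer keys, starting at 1.
row_keys = [1, 2, 3]
core_row = dict(zip(row_keys, rows))

print(core_row[1] == rows[0])   # key 1 refers to the first row (index 0)
print(rows[1])                  # index 1 is the second row

# String keys work the same way and carry no positional meaning at all.
named = dict(zip(['Mark', 'Armeni', 'Lani'], rows))
print(named['Armeni'])
```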

Objects versus Dictionaries
For practical purposes, a Python object behaves much like a specialized Python dictionary with a slightly different syntax.  With a dictionary, you access an entity's data by putting its key in square brackets [...]:

>>> my_dict = {'rowlabels':np.array(['SID','Mark','Armeni','Lani','Jen'])}
>>> my_dict['rowlabels']
np.array(['SID','Mark','Armeni','Lani','Jen'])

With an object, you use dot notation (object.attribute) and drop the quotes:
>>> d.rowlabels
np.array(['SID','Mark','Armeni','Lani','Jen'])

So similar are objects and dictionaries that you could access rowlabels using the d object's __dict__ attribute (automatically assigned by Python):
>>> d.__dict__['rowlabels']
np.array(['SID','Mark','Armeni','Lani','Jen'])

So, in the example above, 'core_row' is a Python dictionary whose keys are accessed using square bracket [...] notation, but is itself an attribute of the d Damon object and accessed using dot notation -- d.core_row.  Damon uses both forms of notation routinely depending on whether it is referring to objects or dictionaries.
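Here is a self-contained version of the same comparison that runs without Damon.  Record is a hypothetical minimal class, used only to show dot notation, __dict__, and a dictionary stored as an attribute.

```python
class Record(object):
    """Hypothetical minimal object, not a Damon class."""

    def __init__(self, rowlabels, core_row):
        self.rowlabels = rowlabels
        self.core_row = core_row      # an attribute that is itself a dictionary


r = Record(rowlabels=['SID', 'Mark', 'Armeni'],
           core_row={'Mark': [0, 1], 'Armeni': [1, 1]})

print(r.rowlabels)                              # dot notation
print(r.__dict__['rowlabels'])                  # the same list via __dict__
print(r.rowlabels is r.__dict__['rowlabels'])   # both refer to one object
print(r.core_row['Armeni'])                     # dot first, then square brackets
print(getattr(r, 'rowlabels'))                  # attribute name held in a string
```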

Build Another Array and Damon object
Let's do another example.  In my_script1.py, add the following code:

New Damon Object

# Create a new array
b = np.array([['SID','Item1','Item2','Item3','Item4','Item5'],
              ['Mark',0,'M',1,0,0],
              ['Armeni',1,1,'M',0,0],
              ['Lani',0,'M',0,1,0],
              ['Jen',0,1,1,1,1]])

print 'b=\n',b

# Initialize b as a new Damon object.
# All non-valid characters are converted to -999.
b_obj = dmn.Damon(data = b,
                  format_ = 'array',
                  validchars = ['All',[0,1],'Num'],
                  nheaders4rows = 1,
                  key4rows = 0,
                  rowkeytype = 'S10',
                  nheaders4cols = 1,
                  key4cols = 0,
                  colkeytype = 'S10',
                  nanval = -999
                  )

# Print some outputs
print 'b_obj.collabels =\n',b_obj.collabels
print 'b_obj.rowlabels =\n',b_obj.rowlabels
print 'b_obj.coredata =\n',b_obj.coredata
print "b_obj.core_row['Armeni'] = Armeni =\n",b_obj.core_row['Armeni']
print "b_obj.core_col['Item4'] = Item4 =\n",b_obj.core_col['Item4']


Hit F5 to run my_script1.py and review the IDLE outputs.  The missing cells marked 'M' have automatically been converted to a numerical "not-a-number value" (called a nanval in Damon).  The core_row attribute enabled us to call up Armeni's data.  The core_col attribute enabled us to call up the 'Item4' data.

New Object Attributes in IDLE

Building Damon object...

Rows in coredata: 4
Columns in coredata: 5 

Damon object has been built.
Contains:
['missingchars', 'core_row', 'verbose', 'dtype', 'rl_col', 'validchars', 'whole_row', 'coredata', 'recode', 'path', 'colkeytype', 'collabels', 'data_out', 'fileh', 'key4rows', 'nheaders4rows', 'cols2left', 'workformat', 'miss4headers', 'cl_row', 'rowkeytype', 'rl_row', 'selectrange', 'nanval', 'format_', 'nheaders4cols', 'core_col', 'pytables', 'rowlabels', 'check_dups', 'cl_col', 'delimiter', 'whole_col', 'whole', 'key4cols'] 

b_obj.collabels =
[['SID' 'Item1' 'Item2' 'Item3' 'Item4' 'Item5']]
b_obj.rowlabels =
[['SID']
 ['Mark']
 ['Armeni']
 ['Lani']
 ['Jen']]
b_obj.coredata =
[[   0. -999.    1.    0.    0.]
 [   1.    1. -999.    0.    0.]
 [   0. -999.    0.    1.    0.]
 [   0.    1.    1.    1.    1.]]
b_obj.core_row['Armeni'] = Armeni =
[   1.    1. -999.    0.    0.]
b_obj.core_col['Item4'] = Item4 =
[ 0.  0.  1.  1.]
>>> 
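The conversion of 'M' to -999 can be imitated in plain Python.  to_valid is a hypothetical helper and reflects only one reading of validchars = ['All',[0,1],'Num'], namely that 0 and 1 are the valid values and everything else becomes nanval.  Consult the Damon.__init__ documentation for the actual rules.

```python
def to_valid(cell, valid=(0, 1), nanval=-999):
    """Return cell as a float if it is a valid number, otherwise nanval."""
    try:
        value = float(cell)
    except ValueError:
        return float(nanval)          # not a number at all, e.g. 'M'
    return value if value in valid else float(nanval)


# Numpy stores the mixed array b as strings, so the cells arrive as text.
raw_rows = [['0', 'M', '1', '0', '0'],     # Mark
            ['1', '1', 'M', '0', '0']]     # Armeni

coredata = [[to_valid(cell) for cell in row] for row in raw_rows]
print(coredata)
```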

 

'datadicts' versus 'Damon objects'
There is one other thing we need to cover.  If you look at the contents of the Damon object b_obj created above, you will see an attribute called 'data_out'.  This is a Python dictionary containing all the Damon object outputs.  That means there are really two ways of accessing the Damon object outputs:
  • Using dot notation, as "attributes" of b_obj
  • Using dictionary notation, as dictionary "keys" of b_obj.data_out
Here is what the two syntaxes look like:

Damon.Attribute vs 'DataDict'

# Access data elements using the DamonObj "attribute" method
print "\n\nDamonObj 'attribute' method:"
print "b_obj.collabels =\n",b_obj.collabels
print "b_obj.rowlabels =\n",b_obj.rowlabels
print "b_obj.coredata =\n",b_obj.coredata
print "Look up Armeni: b_obj.core_row['Armeni'] =\n",b_obj.core_row['Armeni']
print "Look up Item4: b_obj.core_col['Item4'] =\n",b_obj.core_col['Item4']

# Access the same information using the DamonObj.data_out "DataDict" method
print "\n\n'DataDict' method"
print "b_obj.data_out['collabels'] =\n",b_obj.data_out['collabels']
print "b_obj.data_out['rowlabels'] =\n",b_obj.data_out['rowlabels']
print "b_obj.data_out['coredata'] =\n",b_obj.data_out['coredata']
print "Look up Armeni: b_obj.data_out['core_row']['Armeni'] =\n",b_obj.data_out['core_row']['Armeni']
print "Look up Item4: b_obj.data_out['core_col']['Item4'] =\n",b_obj.data_out['core_col']['Item4']


Save and hit F5.  Here is what it looks like in IDLE:

IDLE Display

DamonObj 'attribute' method:
b_obj.collabels =
[['SID' 'Item1' 'Item2' 'Item3' 'Item4' 'Item5']]
b_obj.rowlabels =
[['SID']
 ['Mark']
 ['Armeni']
 ['Lani']
 ['Jen']]
b_obj.coredata =
[[   0. -999.    1.    0.    0.]
 [   1.    1. -999.    0.    0.]
 [   0. -999.    0.    1.    0.]
 [   0.    1.    1.    1.    1.]]
Look up Armeni: b_obj.core_row['Armeni'] =
[   1.    1. -999.    0.    0.]
Look up Item4: b_obj.core_col['Item4'] =
[ 0.  0.  1.  1.]


'DataDict' method
b_obj.data_out['collabels'] =
[['SID' 'Item1' 'Item2' 'Item3' 'Item4' 'Item5']]
b_obj.data_out['rowlabels'] =
[['SID']
 ['Mark']
 ['Armeni']
 ['Lani']
 ['Jen']]
b_obj.data_out['coredata'] =
[[   0. -999.    1.    0.    0.]
 [   1.    1. -999.    0.    0.]
 [   0. -999.    0.    1.    0.]
 [   0.    1.    1.    1.    1.]]
Look up Armeni: b_obj.data_out['core_row']['Armeni'] =
[   1.    1. -999.    0.    0.]
Look up Item4: b_obj.data_out['core_col']['Item4'] =
[ 0.  0.  1.  1.]
>>> 


In other words, the two syntaxes yield the same outputs.  So why the duplication?  Why include data_out at all?  The first answer is that sometimes it is easier to access entities using regular Python dictionaries rather than the Object.Attribute method.  Object attributes carry some extra baggage that can be confusing.  The second answer is that, to save some overhead, every Damon method outputs a stripped-down data dictionary that Damon calls a 'datadict', and these datadicts are assigned to the Damon object as attributes.  For instance,
  • Running my_obj.standardize() creates a datadict called my_obj.standardize_out
  • Running my_obj.coord() creates my_obj.coord_out
  • Running my_obj.base_est() creates my_obj.base_est_out
  • Initializing the my_obj Damon object creates, as we saw, my_obj.data_out
You see the pattern.  The outputs are Python dictionaries which are accessed using the square-bracket syntax.  Thus, my_obj.standardize_out['coredata'] returns the coredata array in the standardize_out dictionary, which is attached to my_obj.

All datadicts are required to contain certain keys ('coredata' for instance), and all are readily convertible into Damon objects (using Damon's format_ argument) should the need arise.  More information about Damon object attributes and output datadicts can be found in the Damon.__init__ documentation.
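The naming pattern can be imitated with a toy class.  TinyObject and its center() method are made up for this sketch and are not part of Damon; the point is only that each method stores its result as a dictionary attribute named after the method, with the same keys as data_out.

```python
class TinyObject(object):
    """Hypothetical class that imitates the '<method>_out' naming pattern."""

    def __init__(self, rowlabels, collabels, coredata):
        self.data_out = {'rowlabels': rowlabels,
                         'collabels': collabels,
                         'coredata': coredata}

    def center(self):
        """Made-up method: subtract the overall mean from every cell."""
        cells = [x for row in self.data_out['coredata'] for x in row]
        mean = sum(cells) / float(len(cells))
        centered = [[x - mean for x in row] for row in self.data_out['coredata']]
        # Store the result on the object under the method's name plus '_out'.
        self.center_out = {'rowlabels': self.data_out['rowlabels'],
                           'collabels': self.data_out['collabels'],
                           'coredata': centered}


my_obj = TinyObject(['Mark', 'Armeni'], ['Item1', 'Item2'], [[0, 1], [1, 2]])
my_obj.center()

print(my_obj.data_out['coredata'])
print(my_obj.center_out['coredata'])
```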

This is enough to give you a flavor of what Damon() does.  Take it further by reading the documentation ( >>> help(dmn.Damon.__init__) ) and practicing with small datasets.  It is quite a powerful utility and will allow you to read and format a lot of extremely messy data.  Take the time to get used to it.