Contents:
1. What is ValidateIS?
2. Who can use the ValidateIS program?
3. How to install ValidateIS?
4. How to use ValidateIS?
5. How to make your own modifications?
6. Where to report bugs?
1. What is ValidateIS?
ValidateIS is an efficient simulation program for validating
instantaneous sampling (IS) as a method of estimating activity durations,
when continuous activity or behaviour logs are available. It can also
be used for analyzing pilot studies and determining appropriate
sampling intervals for later experiments.
ValidateIS calculates several evaluation measures which reflect the
accuracy of IS estimates. It is also possible to output only the error
distributions for plotting. In addition, the user can add her or his
own measures to the program.
ValidateIS differs from random simulation programs by simulating all
possible samplings, not just a random subset of them. In practice,
instantaneous sampling is simulated from all possible starting points,
treating the time series cyclically. As a result, ValidateIS gives
more accurate results than random simulation programs. Still, the
program is efficient, because the simulation is implemented by dynamic
programming.
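The idea of simulating every possible sampling can be illustrated with
a minimal Python sketch. This is not the program's implementation
(which relies on dynamic programming); it is a naive version under
stated assumptions: the log has been expanded to one 0/1 value per
second, every second of the log is tried as a starting point, sampling
wraps around the end of the log, and the error is the plain difference
between the IS estimate and the true proportion. The function name
is_errors and these conventions are illustrative only.

```python
def is_errors(active, delta):
    """Signed IS errors for every possible starting point.

    active: list of 0/1 values, one per second (1 = target activity).
    delta:  sampling interval in seconds.
    Assumption: the series is treated cyclically, so sample k of the run
    starting at second s is taken at second (s + k * delta) mod T.
    """
    total = len(active)
    if delta <= 0 or delta > total:
        raise ValueError("delta must be in 1..len(active)")
    true_prop = sum(active) / total
    n_samples = total // delta  # samples per simulated run (assumed)
    errors = []
    for start in range(total):
        hits = sum(active[(start + k * delta) % total]
                   for k in range(n_samples))
        errors.append(hits / n_samples - true_prop)
    return errors


# One second of activity in a four-second log, sampled every 2 s.
errors = is_errors([1, 0, 0, 0], 2)
magnitudes = [abs(e) for e in errors]
print(errors)      # [0.25, -0.25, 0.25, -0.25]
print(magnitudes)  # [0.25, 0.25, 0.25, 0.25]
```

Because every starting point is covered, the result is the full error
distribution rather than a random sample of it. The naive loop above
needs roughly T * T / Delta steps per interval for a log of T seconds,
which is the cost that a dynamic programming implementation avoids.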
2. Who can use the ValidateIS program?
ValidateIS is distributed under the following copyright notice, which
is also included in the source file:
------------------------------------------------------------------------
(C) Copyright 2012-2015 Wilhelmiina Hämäläinen.
The code can be freely used for academic/research purposes. Direct
or indirect use for commercial advantage is not allowed without
written permission from the author.
The code can be modified and redistributed if this note is left
intact.
-------------------------------------------------------------------------
If you use ValidateIS in your own publications, please cite the original
publication:
Wilhelmiina Hämäläinen, Salla Ruuska, Tuomo Kokkonen, Saana Orkola,
Jaakko Mononen: Measuring behaviour accurately with instantaneous sampling:
A new tool for selecting appropriate sampling intervals. To appear 2016
(details later).
Source code:
W. Hämäläinen: ValidateIS (software).
https://sites.google.com/site/whsivut/home/sourcecode/isvalidate
2012-2015.
3. How to install ValidateIS?
ValidateIS is written for Linux/Unix and the gcc compiler. To build it,
simply run the command "make". It may also work in other environments
and/or with other compilers, but you will have to find that out yourself.
4. How to use ValidateIS?
4.1 Basic syntax
The basic syntax is
validateIS -n<name file> -a<activity code>
Here, the name file lists log files which are used as input (see
4.2 below) and the activity code defines what behaviour or activity is
studied. The activity code should be a natural number between 1 and 255.
The program first outputs some statistics on the input files. For
each log file, it reports the range of time stamps, the frequency, the
average and standard deviation of the bout length and break length, and
the proportion of the selected activity. Finally, the program outputs
combined summary statistics.
Then the program outputs evaluation measures for all tested sampling
interval lengths Delta: 30 s, every minute from 1 to 30 min (60 s to
1800 s), and every 10th minute from 30 min to 120 min (1800 s to
7200 s). The basic measures calculated over all log files are
avg(avgmagn_i)    average of the file-specific average error magnitudes
stdev(avgmagn_i)  standard deviation of the file-specific average
                  error magnitudes
avg(stdevmagn_i)  average of the file-specific standard deviations of
                  the error magnitude
min(magn)         minimum error magnitude
max(magn)         maximum error magnitude
med(avgmagn_i)    median of the file-specific average error magnitudes
For more evaluation measures and output modification see 4.3 below.
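As an illustration of how the basic measures above relate to each
other, here is a small Python sketch that combines file-specific error
magnitudes into the six summary values. It is a sketch under
assumptions, not the program's code: the function name summarize and
the dictionary keys are made up for this example, min(magn) and
max(magn) are assumed to be taken over all files pooled together, and
each file must contribute at least two magnitudes for the unbiased
standard deviation to be defined.

```python
import statistics


def summarize(magnitudes_per_file, biased=False):
    """Combine file-specific error magnitudes into summary measures.

    magnitudes_per_file: one list of error magnitudes per log file.
    biased: if True, use population (divide-by-n) standard deviations,
            in the spirit of option -B; otherwise sample (n-1) ones.
    """
    stdev = statistics.pstdev if biased else statistics.stdev
    avgmagn = [statistics.mean(m) for m in magnitudes_per_file]
    stdevmagn = [stdev(m) for m in magnitudes_per_file]
    all_magn = [x for m in magnitudes_per_file for x in m]
    return {
        "avg(avgmagn_i)": statistics.mean(avgmagn),
        "stdev(avgmagn_i)": stdev(avgmagn),
        "avg(stdevmagn_i)": statistics.mean(stdevmagn),
        "min(magn)": min(all_magn),
        "max(magn)": max(all_magn),
        "med(avgmagn_i)": statistics.median(avgmagn),
    }


# Two hypothetical log files with a handful of error magnitudes each.
summary = summarize([[0.1, 0.3], [0.2, 0.2, 0.5]])
for name, value in summary.items():
    print(f"{name:18s} {value:.4f}")
```

The biased flag mirrors the difference between the default unbiased
estimates and the biased ones selected with option -B.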
4.2 Input file format
The program input consists of log files. The log file names are given in a
separate name file, whose name is given as a command line parameter. The
name file is a text file which contains one log file name on each line.
Each log file is a text file of the following format:
time1 code1
time2 code2
...
That is, each row gives the time stamp (in seconds, counted from a
starting point of your choice) and the activity code (between 1 and 255).
Note that the log files must be in the directory where you run the
program, unless the name file gives their full path names.
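To check input files before running the program, the format described
above can be validated with a short Python sketch. The helper names
read_name_file and parse_log and the file names log1.txt and log2.txt
are hypothetical, and the sketch assumes that the two fields are
separated by whitespace, that time stamps may be any number, and that
blank lines can be ignored; the program itself may be stricter or more
lenient.

```python
def read_name_file(lines):
    """Return the log file names listed one per line (blank lines skipped)."""
    return [line.strip() for line in lines if line.strip()]


def parse_log(lines):
    """Parse 'time code' rows into a list of (time, code) tuples."""
    records = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines (an assumption)
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"row {number}: expected 'time code'")
        time, code = float(fields[0]), int(fields[1])
        if not 1 <= code <= 255:
            raise ValueError(f"row {number}: code {code} not in 1..255")
        records.append((time, code))
    return records


# Hypothetical contents of a name file and of one log file.
names = read_name_file(["log1.txt\n", "log2.txt\n"])
records = parse_log(["0 1\n", "95 2\n", "410 1\n"])
print(names)
print(records)
```

Both helpers accept any iterable of lines, so an open file object can
be passed directly, for example parse_log(open(name)) for each name
returned by read_name_file.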
4.3 Extra options
The program output can be modified by extra options:
-e <maxerr> outputs the probability (confidence) that the error magnitude
is at most maxerr. E.g. -e0.10 gives the probability that the error
magnitude is at most 10%. maxerr should be in [0.0,1.0] (i.e. 0-100%).
-p <minp> outputs the minimum error magnitude (confidence bound) which
holds with probability (confidence) minp. E.g. -p0.90 gives the smallest
upper bound for the error magnitude such that at least 90% of the error
magnitudes are less than the upper bound. minp should be in ]0.0,1.0].
-d <Delta> outputs only the error magnitude distribution for the given
activity and IS interval Delta. Delta should be a positive integer
(interval length in seconds). Note: If Delta is not long enough to
catch any occurrence of the activity, then the error magnitude of 1.0
(100%) is output.
-v<vmode> cross-validates PAC models with the given maxerr and minp,
using leave-one-out (LOO) cross-validation. Two modes:
vmode=1 the model holds in the training sets as a whole
vmode=2 the model holds in all separate training sets
-x also outputs avg(err) and stdev(err) (signed error, not its magnitude).
Note: the average error should be 0; otherwise IS makes a systematic error.
-o calculates your own measures, defined in own.c
-B uses biased estimates for variance and stdev (default: unbiased estimates)
-f includes the first activity (which may be incomplete and produce
unrealistic results)
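The measures behind options -e and -p above can be read as an
empirical probability and an empirical confidence bound over the error
magnitude distribution. The following Python sketch shows one way to
compute them from a list of error magnitudes. The function names
prob_within and bound_for are made up for this example, and the
sketch assumes that "at most" and "less than" both mean <= and that
the bound is always one of the observed magnitudes; the program may
handle ties and boundaries differently.

```python
def prob_within(magnitudes, maxerr):
    """Share of error magnitudes that are at most maxerr (cf. option -e)."""
    return sum(1 for m in magnitudes if m <= maxerr) / len(magnitudes)


def bound_for(magnitudes, minp):
    """Smallest observed magnitude b such that at least a share minp of
    the magnitudes is <= b (cf. option -p). minp must be in ]0.0,1.0]."""
    if not 0.0 < minp <= 1.0:
        raise ValueError("minp must be in ]0.0,1.0]")
    ordered = sorted(magnitudes)
    n = len(ordered)
    for rank, value in enumerate(ordered, start=1):
        if rank / n >= minp:
            return value
    return ordered[-1]  # not reached for valid minp; kept as a safeguard


magnitudes = [0.05, 0.10, 0.20, 0.40]
print(prob_within(magnitudes, 0.10))  # 0.5
print(bound_for(magnitudes, 0.75))    # 0.2
```

The two functions are inverses in spirit: prob_within answers "how
often is the error at most maxerr?", while bound_for answers "how
large an error must be allowed to reach confidence minp?".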
5. How to make your own modifications?
You can define your own measures in files own.c and own.h. To call the
measures, give option -o.
6. Where to report bugs?
If you find any bugs, please report them to Wilhelmiina Hämäläinen,
wilhelmiina.hamalainen@gmail.com.