
UATSA 2.1

Time Series Analysis Software

University of Arizona

USER’S MANUAL

The University of Arizona, 2009

Main Topics

About the software

UATSA is a Graphical User Interface (GUI) developed at the University of Arizona to be used as a complementary teaching tool in a regular course on time series analysis. It helps students obtain quick numerical and graphical results from some widely known methods of time series analysis in the time, frequency and space domains.

UATSA 2.1 was written and compiled as a stand-alone application using MATLAB® 7.6.0 (2008a). This academic software is not intended for commercial use or other non-academic purposes. The software is provided "as is", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the authors or copyright holders be liable for any claim, damages or other liability, whether in an action of contract, tort or otherwise, arising from, out of or in connection with the software or the use or other dealings in the software.

Developers

Juan B. Valdés, University of Arizona (jvaldes@u.arizona.edu)

Julio E. Cañón, University of Arizona/Universidad de Antioquia (Medellin, Colombia) (jecanon@udea.edu.co)

Acknowledgements

The authors thank the contributions by Javier Gonzalez (Frequency Analysis) and Miguel Moreno (Kalman Filter). The MSSA routine is based on code developed by Eric Breitenberger (1997).


How to install UATSA

UATSA 2.1 was written and compiled using MATLAB® 7.6.0 (2008a). The software, however, works as a stand-alone application independently of MATLAB and can be installed on any Windows PC by following these steps (information provided by the MATLAB user help):

    1. Run the executable package uatsa_pkg.exe. This package contains the following files: MCRInstaller.exe and uatsa.exe. A readme file is also provided with the installation procedure described step by step.

    2. Run the MCRInstaller.exe.

    3. When the MATLAB Component Runtime startup screen appears, click Next to begin the installation.

    4. The setup wizard starts. Click Next to continue.

    5. The Select Installation Folder dialog lets you choose where you want to install the MCR. This dialog also lets you view available and required disk space on your system. You may also choose whether you want to install the MCR for just yourself or others. Select your options, and then click Next to continue.

    6. Confirm your selections by clicking Next. The installation begins. The process takes some time because of the number of files installed.

    7. When the installation is completed, click Close on the Installation Completed dialog to exit.

    8. Add the following directory to your system path: <root directory>\MATLAB\MATLAB Compiler Runtime\v78\runtime\win32

    9. Run the executable uatsa.exe.


Structure

The main window of UATSA contains three main features:

    1. An upper menu bar to run all the methods and manage files and figure options.

    2. An interactive table that displays the analyzed dataset.

    3. A panel that displays plots generated by each method.

Main menu

The main menu contains the following options:

Data import

UATSA provides two ways to import data:

1) Opening a file previously processed and stored as an ASCII file (file extensions *.txt and *.dat are allowed).

2) Pasting data directly from the clipboard (data can be copied from Excel worksheets and pasted in UATSA's table by clicking this option).

To facilitate data processing, UATSA uses a single format for all its routines, with time series organized in columns (one series per column and one observation per row).

Edit/save current plot

This option retrieves the figure currently displayed in the plot panel and allows the user to modify and save it using the MATLAB figure menu bar and editing tools, which are included with the MCRInstaller.

Close

Users can close figures opened outside the main window by clicking the Close figures option. The Close module option will close UATSA and all its figures.

Analytical modules

UATSA includes the following analytical modules:

Exploratory data analysis: The module on exploratory data analysis provides analytical tools and graphical outputs to examine the statistical structure of time series. Its aim is to familiarize users with powerful, yet simple, tools to visualize and quantitatively characterize time series. As with all the modules, data can be either imported from ASCII files written in specified formats or pasted directly into a table created in the interactive panel. The module allows the analysis of both single and multiple data series, which may have missing values. Users can choose among several plot options (histograms, autocorrelograms, periodograms, box-whisker plots and phase-state diagrams) to represent the data series.

After the visual inspection, users may choose among several probability distribution functions to fit the data, using the Kolmogorov-Smirnov goodness-of-fit test as the criterion of acceptance or rejection (Haan, 2002). A probability plot with the results of the test is generated in the main panel. Users can also detrend the data series using linear regression, differencing and smoothing techniques (an option is provided to detrend the series with the module of analysis and synthesis in the frequency domain). The module also provides tools for data augmentation and gap filling using smoothing and cross correlation techniques for multiple time series.

Analysis in the time domain: Once the data's main features have been identified, users can proceed with the module on time domain analysis and synthesis (see figure 2), which provides tools to visualize the autocorrelation function (ACF) and partial ACF (PACF) for univariate and multivariate analysis, to identify ARMA models and to generate annual and seasonal series (for a detailed description of the methods see Bras and Rodríguez-Iturbe, 1985). Each analysis is divided into three main steps: 1) verification of normality and stationarity conditions, 2) determination of the ARMA model that best fits the observed data and 3) generation of synthetic series.

Users must first check that series are normally distributed and stationary before choosing and running the appropriate ARMA model. For univariate annual analysis the module generates a window containing five plots (figure 2d). The first two, representing the series and its distribution on a normal probability scale, serve to check the assumptions of normality and stationarity. The third and fourth plots, representing the autocorrelograms of the series and of its residual after subtracting the ARMA component, together with the fifth plot, representing the residual on a normal probability scale, help determine what kind of autoregressive model best fits the data (i.e., the closer the residual values are to a normal distribution, the better the model fits the series). If the data are not stationary or not normally distributed, users may subtract any significant trends and periodic components, using either the exploratory module or the frequency domain module, to obtain a residual series to work with.

Once the ARMA model that best fits the series has been identified, users can generate time series of any given length. The result is an output file containing the generated series and its statistical properties, and a figure showing the generated and historical series and their respective autocorrelograms (figure 2e); the subtracted trends and periodic components must be added back to the generated series. For the annual multivariate case the module additionally generates a comparative bar plot of the multiple time series chosen (figure 2h).

In addition to verifying normality and stationarity conditions, seasonal analyses require users to define, for every season considered, the probability distribution that best fits the seasonal data (currently the module incorporates only normal and lognormal distributions). The module displays the generated time series as well as the mean seasonal values of both the observed and generated series, bounded by a confidence region representing one standard deviation above and below the historical mean seasonal values (figure 2g). Once the distribution functions for all the seasons and series have been determined, users can generate correlated ARMA series. Users must review the results by comparing the generated and observed series and checking that first and second order statistical properties are preserved.

Univariate analysis: This module allows you to perform single-site time series analysis on a seasonal or annual basis using different generation models (AR(1), MA(1), ARMA(p,q)). The module assumes your data are normally distributed and stationary in order to generate the synthetic series (the singular spectrum analysis module may help preprocess the signal by subtracting its deterministic structure).

Multivariate analysis: This module allows you to perform multiple-site time series analysis on a seasonal or annual basis using AR models. The module requires your data to be normally distributed and stationary to generate the synthetic series.

Analysis in the frequency domain: The module of analysis in the frequency domain aims at helping users identify, decompose and extract meaningful information from time series, such as trends and periodic components. It includes three mathematical tools: the Fast Fourier Transform (FFT), Multichannel Singular Spectrum Analysis (MSSA) and wavelets.

Fast Fourier Transform (FFT): The procedure is divided into four main steps: 1) frequency identification, 2) frequency decomposition, 3) extraction of significant components and 4) analysis of the residual. In the first step the module generates the FFT periodogram to identify significant frequencies in the series. In the second step, users choose the window length or embedding dimension to decompose the series into its main frequencies. The relative weights of these components (i.e., trends and periodicities) are represented by their eigenvalues plotted in descending order (figure 3a). W-correlations of the full decomposition eigenvalues are also depicted in a grayscale matrix from full correlation (black) to zero correlation (white) (figure 3b). These graphs allow users to identify and choose significant components in terms of their correlation values.

In the third step users may extract and plot the significant components (i.e., trends and periodicities) identified in the decomposition. The module generates phase-state plots of paired harmonics to refine the identification of those components of the spectrum that are in phase quadrature (figure 3c). A diagram shows the power spectrum of each selected component. In the fourth step, users subtract the previously identified significant components to obtain a residual series, which should represent a random process. The module generates a plot that superimposes the sum of significant components on the original series, and also plots the residual series and its spectrum (figure 3d). Users may repeat the process on the residual series until it exhibits no significant frequencies in its power spectrum.

Multichannel Singular Spectrum Analysis (MSSA): MSSA is a data-adaptive, nonparametric technique that allows the analysis of multiple signals (channels) simultaneously and the extraction of their common significant trends and oscillations (for detailed discussions on MSSA methodology and its applications see Vautard et al., 1992; Allen and Smith, 1996; Golyandina et al., 2001; and Ghil et al., 2002). The procedure in the module—adapted from code written by Breitenberger (1997)—is divided into four steps: 1) identification of signal structure, 2) signal decomposition, 3) extraction of significant components and 4) reconstruction of significant components and analysis of the residual. Users must choose the window size or embedding dimension for the analysis and decide which components are significantly different from red noise; the red-noise benchmark is calculated automatically by generating Monte Carlo simulations of AR(1) processes with the same first and second order statistics as the series. Series must have the same length and be concurrent (single series can also be analyzed, in which case the procedure is known simply as SSA). The module displays the eigenvalues against the variance explained and their dominant frequencies, pairs of eigenvectors in decreasing order, and the observed and reconstructed series using significant components selected by the user (figures 4a and b).
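
For intuition, the red-noise benchmark can be approximated by fitting an AR(1) model to a channel and simulating surrogates with the same lag-1 correlation and variance. The following is a minimal MATLAB sketch of one such surrogate (illustrative only; the module generates its Monte Carlo surrogates automatically, and the variable x is an assumed single-channel series):

```matlab
% Sketch of one AR(1) (red noise) surrogate matched to a series x
% (illustrative; not UATSA's internal code).
x   = x - mean(x);                              % work with anomalies
c0  = var(x);                                   % lag-0 variance
c1  = sum(x(1:end-1).*x(2:end))/(length(x)-1);  % lag-1 covariance
phi = c1/c0;                                    % lag-1 autocorrelation
sig = sqrt(c0*(1 - phi^2));                     % innovation standard deviation
e   = sig*randn(length(x), 1);
surrogate = filter(1, [1 -phi], e);             % AR(1) with matched statistics
```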

Wavelets: Wavelets are useful tools in the time-frequency domain for the analysis of non-stationary variances at different time scales. Wavelet decomposition helps identify dominant modes of variability and changes of the variability itself with time (Torrence and Compo, 1998). The procedure in the module allows users to decompose, reconstruct and de-noise a signal at several levels or scales using different wavelet families as filters (i.e., Daubechies, Symlets) (figure 5). Users define the wavelet filter and the levels at which to scale or de-noise the signals.
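
As an illustration of what such a decomposition involves, the following MATLAB sketch uses the Wavelet Toolbox functions wavedec, wrcoef and wden (the signal x, the number of levels and the Daubechies filter db4 are assumed choices, not the module's fixed settings):

```matlab
% Multilevel wavelet decomposition and de-noising sketch
% (illustrative; requires the Wavelet Toolbox).
lev = 4;                                    % assumed number of levels
[C, Lc] = wavedec(x, lev, 'db4');           % decompose the signal
approx  = wrcoef('a', C, Lc, 'db4', lev);   % smooth approximation at level 4
detail1 = wrcoef('d', C, Lc, 'db4', 1);     % finest-scale detail
xden    = wden(x, 'sqtwolog', 's', 'sln', lev, 'db4');  % de-noised signal
plot(x); hold on; plot(approx); plot(xden); hold off;
```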

Spatial analysis: Time series in hydrology are generally associated with processes and variables that are distributed in space. Therefore, the analysis of time series is not restricted to the time and frequency domains but includes the spatial analysis of patterns and features evolving through time. This module aims at introducing techniques that are useful for analyzing spatially-related attributes associated with distributed time series. The module includes principal components, canonical correlation and clustering analyses.

Principal components analysis: PCA is a non-parametric technique that transforms a multidimensional set of correlated variables into a set of uncorrelated (orthogonal) components, with each component explaining a specific amount of the variance of the original variables (the components are usually organized in decreasing order of variance explained, with the first components explaining most of the variance). The technique is commonly employed to reduce the dimensionality of problems that involve large datasets (i.e., spatial analysis of distributed variables) by focusing only on those components that explain a significant part of the variance (for a detailed description of PCA see Jolliffe, 2002).

The PCA procedure incorporated in the module (figure 6) allows users to obtain the eigenvalues and their explained variances (in a Pareto diagram, figure 6a) and the eigenvectors and principal components of a set of observations on several variables (the module automatically centers the variables on their mean values). The module may be employed either to determine principal components of several variables in the same time sequence or to determine principal components of a single variable uniformly distributed in space (each case requires a specific format to introduce the data). If the spatial option is enabled, users can also plot the spatial distribution of mean values and of the coefficients of the eigenvectors by specifying the properties of the grid to be plotted (i.e., number of grid cells in latitude and longitude, grid size or resolution, and the coordinates of the bottom left corner). Border lines representing basin limits can also be mapped. Figure 6d shows the spatial plots generated by the module for the mean values and the first three EOFs of the analyzed variable (precipitation in the Colorado Basin).
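
The underlying computation can be summarized in a few lines of MATLAB. This is a minimal sketch of PCA via the covariance eigendecomposition, not the module's actual code (X is an assumed data matrix with variables in columns and observations in rows):

```matlab
% PCA sketch: eigendecomposition of the covariance matrix
% (illustrative; not UATSA's internal code).
Xc = X - repmat(mean(X), size(X,1), 1);    % center each variable on its mean
[V, D] = eig(cov(Xc));                     % eigenvectors (EOFs) and eigenvalues
[lambda, idx] = sort(diag(D), 'descend');  % order by variance explained
V = V(:, idx);
explained = 100*lambda/sum(lambda);        % percent of variance per component
PC = Xc*V;                                 % principal component scores
pareto(explained);                         % Pareto diagram, as in figure 6a
```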

Canonical correlation: CCA is a multivariate statistical technique that correlates multiple dependent variables with multiple independent variables by creating canonical functions from linear composites of the dependent and independent variables (canonical variables) and maximizing the correlation among them (for a detailed description of the method see Hair et al., 1998). The CCA procedure incorporated in the module calculates the eigenvalues and eigenvectors of the cross correlation of two matrices X(n x m) and Y(n x k), which may have different numbers of sites or variables but must have the same number of observations. The module normalizes the two sets of variables X and Y introduced by the user and calculates the matrices of canonical correlation coefficients (linear combinations of the variables in X and Y) and the matrices of transformed canonical variables (U and V in figure 7). The module produces a plot of the transformed variables as shown in figure 7.

Cluster analysis: Cluster analysis encompasses several statistical techniques employed to classify or group data into different sets according to similarity measures (i.e., correlation coefficients). Clustering may be particularly useful to visualize spatial features associated with distributed variables (for detailed descriptions of clustering techniques see Everitt et al., 2001). The analysis currently available in the module uses correlation coefficients as the measure of similarity for hierarchical and k-means clustering. It performs the following steps: 1) standardize the variables, 2) compute the similarity measure (Euclidean distances on correlation coefficients), 3) group data with similar minimum distances (correlations) and 4) plot the clusters in a dendrogram (figure 8a) and, if the spatial option is enabled and coordinates are provided, map the clusters (figure 8b). The user chooses the threshold distance that defines the clusters.
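
As an illustration, the hierarchical variant of these steps can be sketched with MATLAB Statistics Toolbox functions. Here a correlation-based dissimilarity (one minus the correlation coefficient) stands in for the similarity measure described above; X is an assumed matrix with one site per row, and the cutoff value is a hypothetical user choice:

```matlab
% Hierarchical clustering sketch with a correlation-based distance
% (illustrative; requires the Statistics Toolbox).
Z   = zscore(X')';                      % 1) standardize each site's series
D   = pdist(Z, 'correlation');          % 2) dissimilarity = 1 - correlation
tre = linkage(D, 'average');            % 3) group by average linkage
dendrogram(tre);                        % 4) plot the clusters in a dendrogram
cut = 0.5;                              % user-chosen threshold distance
grp = cluster(tre, 'cutoff', cut, 'criterion', 'distance');
```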

State-space models: A state-space model is a mathematical representation of a system defined by inputs, outputs and state variables related by first-order differential equations. Stochastic systems, such as those commonly found in hydrology, are characterized by the presence of statistical noise in the system's dynamics and measurements. It is therefore necessary to filter this noise to estimate the behavior underlying these stochastic systems. The Kalman filter (described in detail by Welch and Bishop, 2006) is a popular algorithm to efficiently estimate the most likely state of a stochastic system by recursively filtering the noise (usually assumed Gaussian) from observed signals. The module on state-space models currently implemented includes (figure 9): a procedure for verifying the stability and convergence of Kalman filter parameter estimates when both the observed and generated (system) series are known—an ideal situation used to illustrate the application and efficiency of the filter—and a Kalman filter to estimate the state of the system based on observed time series only.
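
To make the recursion concrete, the following MATLAB sketch applies a scalar Kalman filter to a local-level (random walk plus noise) model. It is a minimal illustration, not the module's implementation; y is an assumed observed series and the noise variances q and r are assumed values:

```matlab
% Scalar Kalman filter sketch for a local-level model
% (illustrative; not UATSA's internal code).
q = 0.01; r = 1;                 % assumed process and measurement noise variances
n = length(y);
xhat = zeros(n,1); P = zeros(n,1);
xhat(1) = y(1); P(1) = r;        % initialize from the first observation
for k = 2:n
    xpred = xhat(k-1);           % time update: the state is a random walk
    Ppred = P(k-1) + q;
    K = Ppred/(Ppred + r);       % Kalman gain
    xhat(k) = xpred + K*(y(k) - xpred);   % measurement update
    P(k) = (1 - K)*Ppred;
end
plot(1:n, y, '.', 1:n, xhat, '-');
```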


Exploratory Analysis

This module accepts one single-column time series at a time. You must first save your time series in single-column format in a txt file whose name has the prefix “stat_” followed by an ID name for your series (e.g., stat_example1.txt), with one value per line.

The program produces plots showing: 1) Data series; 2) Series box plot indicating the mean, standard deviation, the 25th and 75th percentiles and outliers; 3) Series' histogram; 4) Series' autocorrelogram; 5) Series' power spectrum in a semilog scale; 6) Distribution of the transformed values in plotting position. Some representative values of the series (mean, variance, skew and first correlation coefficient) are also included.

Tables summarizing the statistical properties of the series are generated in a separate window. The user can copy these results and paste them in Excel, for instance.


Analysis in the frequency domain

This module accepts a single column time series and its ordinal timeline, if any, in a separate column (optional).

Time series    Timeline
3.35           1982
4.39           1983
1.20           1984
3.64           1985
1.047e+001     1986
3.94           1987
8.55e+000      1988
1.610          1989

Fast Fourier Transform (FFT)

The periodogram in step 1 is calculated using the Fast Fourier Transform (FFT). Use this periodogram to identify representative frequencies in the original series before proceeding to step 2.
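
For reference, such a periodogram can be computed directly with MATLAB's fft function. This is a minimal sketch of the calculation, not UATSA's internal code (the example file ssa_maunaloa.txt from UATSA\data is assumed to hold a single-column series):

```matlab
% Minimal periodogram sketch (illustrative; not UATSA's internal code).
x = load('ssa_maunaloa.txt');       % single-column time series
x = x - mean(x);                    % remove the mean before transforming
n = length(x);
X = fft(x);                         % discrete Fourier transform
P = abs(X(1:floor(n/2)+1)).^2/n;    % one-sided power estimate
f = (0:floor(n/2))'/n;              % frequency in cycles per time step
plot(f, P); xlabel('Frequency (cycles/step)'); ylabel('Power');
```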

In step 2 the original series is decomposed, for a chosen window length or embedding dimension L, into its singular components (see Golyandina et al., 2001, for a detailed description of basic SSA).
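
Conceptually, this step embeds the series into a trajectory matrix and takes its singular value decomposition. A minimal MATLAB sketch of the idea (illustrative only; x is the series and L the chosen window length):

```matlab
% Basic SSA decomposition sketch (illustrative; not UATSA's internal code).
N = length(x);
K = N - L + 1;
T = zeros(L, K);
for k = 1:K
    T(:, k) = x(k:k+L-1);           % trajectory (Hankel) matrix
end
[U, S, V] = svd(T, 'econ');         % singular value decomposition
lambda = diag(S).^2;                % component weights (eigenvalues)
semilogy(lambda, 'o-');             % eigenvalues in descending order
xlabel('Component'); ylabel('Eigenvalue');
```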

The relative weights of these components, represented by their eigenvalues in descending order, are plotted by the module.

In this case, the first component represents the trend of the series and the two leading paired components (2-3) and (4-5) represent possible harmonics.

W-correlations of the full decomposition eigenvalues are also depicted in a grayscale matrix, from full correlation (black) to zero correlation (white). This graph allows you to determine how many components are representative in terms of their correlation values.

In this example, the first five components show high correlations. The first one is caused by the trend in the series. The other two pairs (2-3 and 4-5) describe harmonics, and the other components are likely describing a white noise process.

After the decomposition with the selected embedding dimension has been done, step 3 makes it possible to extract trends and periodicities from the components (in this example, the first six components have been chosen).

In this example, the first component extracts the trend of the series, while the next four display a well defined periodicity. The sixth component, however, shows no regular pattern.

The scatterplots of paired harmonic vectors (components) in step 3 allow you to identify which components of the spectrum are in quadrature (forming a closed circle). In this example, components 2-3 and 4-5 show such a quadrature.

A third diagram shows the power spectrum of each selected component.

Step 4 subtracts the components you identified in step 3 as representative of trends and periodicities in the series. As a result of this subtraction, a new series of residue values is created and stored in the file UATSA\results\ssa_residuals_<username>.txt.

The upper part of the plot shown below represents the sum of the leading components (including trends and harmonics) chosen after the decomposition process. The sum of leading components, in red, overlaps the original series, in blue, to give you an idea of how representative these components are.

In the bottom left you will find the residue obtained after subtracting the sum of the chosen trend and periodic components from the original series. Subtraction of periodic components may continue until the residue exhibits no representative frequencies in its power spectrum, plotted to the right. In this example, an identifiable frequency remains in the residue.

Additional components may be extracted from the residue series by repeating the procedure over the residue.


Analysis in the time domain

The module allows you to choose between two alternatives for generating series at a single site: one based on annual records and one based on seasonal records.

Annual Generation Model

To run this module, you must first save your time series in a univaran_*.txt file using a two-column format ([year data]), as in the example file univaran_example1.txt.

You must first check that your data are normally distributed and stationary before choosing and running the appropriate generation model. The module offers three specific generation models, all of them frequently used in the analysis of hydrological time series: AR(1), MA(1) and ARMA(1,1). A general form, ARMA(p,q), is also available and can be used instead of the three specific models (see Haan, 2002, and Bras and Rodríguez-Iturbe, 1985, for a detailed description of ARMA models and multivariate analysis).
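
For orientation, generating a synthetic ARMA(1,1) series is a one-line filtering operation in MATLAB once the parameters have been fitted. The sketch below is illustrative only (phi, theta, sig and the historical mean xbar are assumed example values, not output of the module):

```matlab
% Sketch of synthetic ARMA(1,1) generation (illustrative; not UATSA's code).
phi = 0.6; theta = 0.2; sig = 1.0;   % assumed example parameters
xbar = 0;                            % replace with the historical mean
nyears = 100;
e = sig*randn(nyears, 1);            % Gaussian innovations
% z(t) = phi*z(t-1) + e(t) + theta*e(t-1)
z = filter([1 theta], [1 -phi], e);
x = z + xbar;                        % synthetic series around the mean
```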

The first two graphs, representing your series and its distribution on a normal probability plot, allow you to check the assumptions of normality and stationarity. If your series is not stationary or not normally distributed, you may use the SSA module to subtract the periodic components and obtain the residue series.

The third and fourth graphs, representing the autocorrelograms of the series and of its residue after subtracting the ARMA(p,q) component, help in determining what kind of autoregressive model best fits your data.

The fifth graph represents the residue plotted on a normal probability scale. In this graph, the closer the residue values are to the normal distribution, the better the model fits your series.

Once you have determined the ARMA(p,q) model that best fits your series, you will be ready to generate series for the number of years you have chosen. As a result of these calculations you will obtain a plot showing the generated series and the historical data, as well as their autocorrelograms.

The last graph you generate will be saved as univaran_plot_<name>.jpg in UATSA\results\. The last generated series will also be saved in the same folder in a file univaran_results_<name>.txt.

You must keep in mind that these files will change every time you run the module to generate new series. If you want to save many realizations you must save each file with a different name before running the module again.

Seasonal Generation Model

To run this module, you must first save your time series in a univarseason_*.txt file using a space-separated column format [year season1 season2 … seasonN], as in the example file univarseason_example.txt.

The module gives you the option to transform the original data series to its logarithms, if this transformation helps fit your series to a lognormal distribution function.

The following are examples of the plots produced by this module that let you check what kind of distribution best describes your seasonal data: 1) Seasonal series; 2) Histograms; 3) Normal distribution plots; 4) Lognormal distribution plots.

[Example figure panels: I Time Series, II Histograms, III Normal Distribution Plot, IV Lognormal Distribution Plot]

In the example given, the best fit is obtained when a logarithmic transformation is applied to the original data (straight lines in figure IV). The histograms indicate a positive skew of the series.

The following is an example of the kind of graphical output you will obtain from this module: the first plot shows a range of the generated series (when synthetic series are long, the program shows only part of the series). The second plot shows the mean seasonal values of the historical series and the generated series. Lines representing one standard deviation above and below the historical mean seasonal values are also included.


Multivariate analysis

The first window of the module allows you to choose between two alternatives for generating series at multiple sites: one based on annual records and one based on seasonal records.

Multivariate annual generation

To run this module, you must first save your time series in two separate multivaran_<username>.txt files using a single-column format ([data]).

Files must have the same length of record (number of data points) for the same years. The module does not accept missing data.

The module generates two plots:

I) Comparative bar plot of the two time series, when you choose the data files.

II) Generated series for two sites, when you calculate values.

Multivariate Seasonal Model

This version allows you to choose between normal (labeled 0) and lognormal (labeled 1) distributions. The distribution file must contain one row per site and one column per season (e.g., four sites in rows and 12 seasons in columns). Remember: 0 represents the normal distribution, while 1 represents the lognormal distribution.

To run the module, follow the steps indicated in the module window.

Once you choose your data files, the module will plot a box diagram for each site, indicating the seasonal values of the mean and the 25th and 75th percentiles. The blue lines show the extent of the data, and dots indicate outliers (sample points that are beyond 1.5 times the interquartile range of the sample).

In the step to verify normality and stationarity, the module will plot the seasonal values on normal probability scales, indicating the normal distribution with blue dots and the lognormal distribution with green dots.

Once you have introduced the different distribution functions for all the seasons and sites in the file PDF_multivarseason.txt, you can generate a series that follows an AR(1) model.

The results will then be plotted in box plots like those obtained in the previous stage for the original data.

The generated series will also be plotted per site.

You should review the results, comparing the generated series with the historical series and checking that the relevant statistics are preserved.


Empirical Orthogonal Functions (EOF)

This module allows you to obtain the eigenvalues, eigenvectors and principal components of a set of observations on several variables. You can use the module either to determine principal components of several variables in a time sequence or to determine principal components of a single variable whose time series are distributed uniformly in space. Variables must be organized in columns with observations (time series) in rows. A column representing time may also be included (the program assigns a timeline by default).

If data are spatially distributed, each row represents the observations (time series) of the analyzed variable at a specific site. Sites must be ordered according to the grid that they represent, starting with one (1) at the bottom left corner and ending at the upper right corner.

The grid may be either regular (in which case the user must provide the entire matrix of datasets) or irregular (in which case the user provides the coordinates of the variables in space). Remember that this version does not accept series with missing data (NaN).

Once the input file has been introduced, you must indicate whether the data represent a spatial variable.

The variance explained by each eigenvector and the cumulative variance explained are plotted together in a Pareto diagram.

The principal components, obtained after applying the eigenvectors to the standardized variables, are plotted in step 4.

If the spatial option is enabled, you can also plot the spatial distribution of the mean values and of the coefficients of the eigenvectors.

Users must specify the properties of the grid to be plotted (i.e., number of grid cells in latitude, number of grid cells in longitude, grid size or resolution in degrees, and the coordinates of the bottom left corner).

Users can also include a border line representing basin limits. The borderline coordinates should be organized in two columns: first column for latitude and second column for longitude.

More than one basin border line can be represented using the same two-column array. Just be sure that the coordinates are organized in such a way that each border line is closed at the end of its sequence, using a [NaN NaN] row to separate border lines.

Remember: the coordinates must be consistent with the coordinates assigned to the bottom left corner of the grid in the module. The module works with the following coordinate nomenclature:

Latitude = [-90 to 90]

Longitude = [-180 to 180]

The example file included in this manual represents monthly precipitation data in the Colorado River Basin, distributed over a regular grid of 0.5° x 0.5° with bottom left coordinates 116°W, 31°N (International Research Institute for Climate Prediction (IRI), 2004: M. New prcp terrestrial dataset, 1900-1998). The number of grid cells in latitude is 28 and the number of grid cells in longitude is 23, for a total of 28x23=644 cells ordered in rows, from the bottom left to the upper right coordinates, as mentioned above. Data in columns represent months from 1900 to 1998 (12x98=1176 columns). A file colorado_border.txt with the border line coordinates of the Colorado Basin is also included.
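
To illustrate the grid ordering, the following MATLAB sketch maps one EOF (a 644-element eigenvector, here the assumed variable eof1) back onto the 28 x 23 grid; it assumes the sites run west to east along each row of latitude, starting at the bottom left corner:

```matlab
% Sketch of mapping a 644-element eigenvector back onto the grid
% (illustrative; assumes west-to-east ordering within each latitude row).
nlat = 28; nlon = 23;
eofgrid = reshape(eof1, nlon, nlat)';    % one row of the matrix per latitude
lon = -116 + 0.5*(0:nlon-1);             % 0.5-degree steps eastward from 116W
lat =   31 + 0.5*(0:nlat-1);             % 0.5-degree steps northward from 31N
contourf(lon, lat, eofgrid);             % row 1 = southernmost latitude
xlabel('Longitude'); ylabel('Latitude');
```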


Canonical Correlation

This module allows you to obtain the eigenvalues and eigenvectors of the cross correlation of two sets of variables that you introduce to the module, following the steps indicated in the module's main window.

The input files must be matrix arrays with columns representing variables and rows representing observations.

The MATLAB function canoncorr is employed in this module (the following description is taken from the MATLAB help documentation; see the references at the end).

This function computes the sample canonical coefficients for the n-by-d1 and n-by-d2 data matrices X and Y.

X and Y must have the same number of observations (rows) but can have different numbers of variables (columns). A and B are d1-by-d and d2-by-d matrices, where d = min(rank(X),rank(Y)).

The jth columns of A and B contain the canonical coefficients, i.e., the linear combination of variables making up the jth canonical variable for X and Y, respectively. The columns of A and B are scaled to make the covariance matrices of the canonical variables the identity matrix. Matrices must be full rank to obtain valid results.

Before running this module, you must save the two sets of variables you want to compare in two separate files of the form cc_<username>.txt. In these files, the variables must be organized in columns and the observations in rows.

Variables in columns, observations in rows:

30  135  1.5
45  100  1.8
39   96  1.2
48  105  1.7

Remember: these two files must have the same number of observations (rows) but can have different numbers of variables (columns).

In step 2, the module normalizes the two sets of variables introduced in step 1. In step 3, the module calculates the canonical correlation coefficients A and B and the transformed canonical variables U and V, and produces a plot of the transformed variables U vs. V.
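
For users working directly in MATLAB, the same computation is a single call to canoncorr (Statistics Toolbox). The sketch below loads the example files shipped with UATSA and plots the first pair of canonical variables:

```matlab
% Canonical correlation of the two example datasets (illustrative).
X = load('cc_data1.txt');            % variables in columns, observations in rows
Y = load('cc_data2.txt');
[A, B, r, U, V] = canoncorr(X, Y);   % coefficients, correlations, canonical vars
plot(U(:,1), V(:,1), '.');           % first pair of canonical variables
xlabel('U_1'); ylabel('V_1');
title(sprintf('First canonical correlation r = %.2f', r(1)));
```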

The coefficients A and B, the transformed values U and V, the general correlation coefficients r, and the generated plot will be saved as six independent files in UATSA\results\ as follows:

cc_userdata_A.txt

cc_userdata_B.txt

cc_userdata_U.txt

cc_userdata_V.txt

cc_userdata_r.txt

cc_plot_userdata.jpg


Cluster analysis


Example Files

Some example files have been prepared to help you gain proficiency both in managing the structure of the input files and in using the different modules of the software. The following files are stored in the directory UATSA\data:

Sample statistics

stat_example1.txt

Singular Spectrum Analysis

ssa_maunaloa.txt

Univariate analysis of time series

Seasonal: univarseason_example.txt

Annual: univaran_example1.txt

Multivariate analysis of time series

Seasonal: multiseason_data1.txt to multiseason_data5.txt (five files for five different sites)

Annual: multian_example1.txt and multian_example2.txt

Empirical Orthogonal Functions (EOF)

eof_colorado.txt (see EOF for an explanation of how to use this file)

col_bd.txt (border line of the Colorado River Basin)

eof_manly.txt (taken from Manly, 1986)

Canonical Correlation Analysis

cc_data1.txt (taken from Manly, 1986)

cc_data2.txt (taken from Manly, 1986)


References

Bras, R.L. and Rodríguez-Iturbe, I. (1985). Random Functions and Hydrology. Addison-Wesley, U.S.A.

Golyandina, N., Nekrutkin, V. and Zhigljavsky, A. (2001). Analysis of Time Series Structure: SSA and Related Techniques. Chapman & Hall/CRC, U.S.A.

Haan, C.T. (2002). Statistical Methods in Hydrology, Second Edition. Blackwell Publishing, U.S.A.

Manly, B.F.J. (1986). Multivariate Statistical Methods: A Primer. Chapman and Hall, U.K.

MATLAB (2004). Software Help Documentation and Reference Guide. The MathWorks Inc., U.S.A.
