Title: Causality Networks From Multivariate Time Series
Abstract: In the analysis of multivariate time series (signals), the first objective is to estimate the connectivity structure of the observed variables (or subsystems), where connectivity is also referred to as interdependence, coupling, information flow or Granger causality. Depending on the type of analysis one wants to pursue, which may be restricted by the size of the data, one selects a connectivity measure to estimate the driving-response connections among the observed variables. For example, if the multivariate time series is very short, one would rather use a linear measure of bivariate (Granger) causality, or even the linear cross-correlation. At the other extreme, for a very long multivariate time series, one would prefer a nonlinear and even multivariate measure of causality, where the estimation of the driving-response relationship between two of the observed variables also takes the other observed variables into account. When the measure is computed on all directed pairs of observed variables, a complex network is formed, also called a connectivity or causality network, where the nodes are the observed variables and the connections are the estimated inter-dependences. For a network with binary connections, the inter-dependences are discretized to zero (not significant) and one (significant) by applying a criterion for significance, e.g., an arbitrary threshold or statistical testing. In the era of big data and complex systems, the case of high-dimensional time series is of particular interest, where each time series (observed variable) corresponds to a subsystem, and the underlying system is composed of many subsystems. In this case, even when the time series are long, the multivariate measure of causality or inter-dependence may fail unless dimension reduction is built into the estimation scheme.
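The construction described above (a measure computed on all directed pairs, then discretized to a binary network) can be sketched in a few lines. The following is only a minimal illustration under stated assumptions, not one of the tutorial's Matlab programs: the three-variable toy system, the model order of one, the function names and the arbitrary threshold of 0.05 are all hypothetical choices made for this example. The measure used is the linear bivariate Granger causality index, i.e., the log ratio of the residual sums of squares of the restricted and unrestricted autoregressive models.

```python
import math
import random


def simulate(n=2000, seed=0):
    """Toy system (an assumption for illustration): X1 drives X2, X3 is independent."""
    rng = random.Random(seed)
    x1, x2, x3 = [0.0], [0.0], [0.0]
    for _ in range(n - 1):
        e1, e2, e3 = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        # x2 is appended first so that it uses the previous value of x1.
        x2.append(0.3 * x2[-1] + 0.5 * x1[-1] + e2)
        x1.append(0.5 * x1[-1] + e1)
        x3.append(0.4 * x3[-1] + e3)
    return [x1, x2, x3]


def granger_index(x, y):
    """Order-1 linear Granger causality index for x -> y (no intercept)."""
    yt, yl, xl = y[1:], y[:-1], x[:-1]
    syy = sum(a * a for a in yl)
    sxx = sum(a * a for a in xl)
    sxy = sum(a * b for a, b in zip(xl, yl))
    ry = sum(a * b for a, b in zip(yl, yt))
    rx = sum(a * b for a, b in zip(xl, yt))
    # Restricted model: y_t = a * y_{t-1}
    a_r = ry / syy
    rss_r = sum((t - a_r * u) ** 2 for t, u in zip(yt, yl))
    # Unrestricted model: y_t = a * y_{t-1} + b * x_{t-1} (2x2 normal equations)
    det = syy * sxx - sxy * sxy
    a_u = (ry * sxx - rx * sxy) / det
    b_u = (rx * syy - ry * sxy) / det
    rss_u = sum((t - a_u * u - b_u * v) ** 2 for t, u, v in zip(yt, yl, xl))
    return math.log(rss_r / rss_u)


def causality_network(series, threshold=0.05):
    """Binary adjacency matrix: adj[i][j] = 1 if the index for i -> j exceeds the threshold."""
    k = len(series)
    adj = [[0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i != j and granger_index(series[i], series[j]) > threshold:
                adj[i][j] = 1
    return adj


adjacency = causality_network(simulate())
for row in adjacency:
    print(row)
```

In practice the arbitrary threshold would typically be replaced by a statistical significance test, the second criterion named above.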
Dimension reduction in the estimation of the direct causality for a driving-response variable pair means restricting the large number of other observed variables to a small subset of the most relevant ones, i.e., those most related to the response. I will first present the framework of connectivity analysis of multivariate time series, focusing on direct connections and many observed variables. I will present linear and nonlinear causality measures and derive networks when the number of time series is large. I will also introduce causality measures that we have developed, which apply dimension reduction for high-dimensional time series. I will illustrate on simulated data the ability of causality measures using dimension reduction to identify the underlying complex network (the connectivity structure of the underlying complex system) solely on the basis of the observed multivariate time series. Case studies on real-world applications will be presented, in particular on multivariate time series records of epileptic electroencephalograms and world financial markets. The tutorial will be completed with a hands-on session, where Matlab programs computing causality measures and networks on some exemplary systems will be illustrated and discussed.
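The idea of dimension reduction for direct causality can be illustrated with a small sketch. It is not one of the measures referred to in the abstract: as a simple stand-in, it uses a linear conditional Granger causality index of order one and selects the conditioning variables by ranking the other variables on their absolute lagged correlation with the response. The toy chain system (X1 drives X2, X2 drives X3, plus two noise variables), the function names, the number of conditioning variables and the threshold are all assumptions made for this example.

```python
import math
import random


def simulate(n=2000, seed=1):
    """Toy chain (an assumption): X1 -> X2 -> X3; X4 and X5 are white noise."""
    rng = random.Random(seed)
    x = [[0.0] for _ in range(5)]
    for _ in range(n - 1):
        p = [s[-1] for s in x]  # values at the previous time step
        e = [rng.gauss(0, 1) for _ in range(5)]
        x[0].append(0.7 * p[0] + e[0])
        x[1].append(0.6 * p[0] + e[1])
        x[2].append(0.6 * p[1] + e[2])
        x[3].append(e[3])
        x[4].append(e[4])
    return x


def rss(target, regressors):
    """Residual sum of squares of a least-squares fit without intercept."""
    m = len(regressors)
    # Augmented normal equations [R'R | R'y]
    a = [[sum(u * v for u, v in zip(regressors[i], regressors[j])) for j in range(m)]
         + [sum(u * v for u, v in zip(regressors[i], target))] for i in range(m)]
    for c in range(m):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, m), key=lambda r: abs(a[r][c]))
        a[c], a[piv] = a[piv], a[c]
        for r in range(m):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [v - f * w for v, w in zip(a[r], a[c])]
    coef = [a[i][m] / a[i][i] for i in range(m)]
    return sum((t - sum(b * reg[k] for b, reg in zip(coef, regressors))) ** 2
               for k, t in enumerate(target))


def lagged_abs_corr(z, y):
    """Absolute correlation between z_{t-1} and y_t."""
    zl, yt = z[:-1], y[1:]
    mz, my = sum(zl) / len(zl), sum(yt) / len(yt)
    cov = sum((u - mz) * (v - my) for u, v in zip(zl, yt))
    vz = sum((u - mz) ** 2 for u in zl)
    vy = sum((v - my) ** 2 for v in yt)
    return abs(cov) / math.sqrt(vz * vy)


def conditional_gci(series, i, j, n_cond=1):
    """Order-1 index for i -> j, conditioned on the n_cond others most related to j."""
    others = [k for k in range(len(series)) if k not in (i, j)]
    others.sort(key=lambda k: lagged_abs_corr(series[k], series[j]), reverse=True)
    chosen = others[:n_cond]  # dimension reduction: keep only the most relevant
    target = series[j][1:]
    base = [series[j][:-1]] + [series[k][:-1] for k in chosen]
    return math.log(rss(target, base) / rss(target, base + [series[i][:-1]]))


def causality_network(series, threshold=0.05):
    """Binary adjacency matrix of estimated direct connections."""
    k = len(series)
    return [[int(i != j and conditional_gci(series, i, j) > threshold)
             for j in range(k)] for i in range(k)]


adjacency = causality_network(simulate())
for row in adjacency:
    print(row)
```

In this toy chain, conditioning on the single most relevant variable (X2 when the response is X3) is what allows the indirect connection from X1 to X3 to be discarded, whereas a bivariate measure may flag it. With many observed variables, conditioning on all of them would inflate the number of model terms, which is the situation dimension reduction is meant to avoid.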