The approach for complexity analysis of multivariate time series

Kazimieras Pukenas1

1Lithuanian Sports University, Kaunas, Lithuania

1Corresponding author

Mathematical Models in Engineering, Vol. 1, Issue 2, 2015, p. 61-66.
Received 15 June 2015; received in revised form 1 August 2015; accepted 11 August 2015; published 30 December 2015

Copyright © 2015 JVE International Ltd.

Table of Contents Abstract Full-text  Download PDF References
Views 9
Reads 5
Downloads 22
since May, 2018
Abstract.

This paper proposes to estimate the complexity of a multivariate time series by the spatio-temporal entropy based on multivariate singular spectrum analysis (M-SSA). In order to account for both within- and cross-component dependencies in multiple data channels the high dimensional block Toeplitz covariance matrix is decomposed as a Kronecker product of a spatial and a temporal covariance matrix and the multivariate spatio-temporal entropy is defined in terms of modulus and angle of the complex quantity constructed from the spatial and temporal components of the multivariate entropy. The benefits of the proposed approach are illustrated by simulations on complexity analysis of multivariate deterministic and stochastic processes.

Keywords: multivariate time series analysis, multivariate entropy, multivariate singular spectrum analysis, Kronecker product expansion.

1. Introduction

Spatially extended complex dynamical systems, common in physics, biology and the social sciences such as economics, contains multiple interacting components and is nonlinear, non-stationary, and noisy, and one goal of data analysis is to detect, characterize, and possibly predict any events that can significantly affect the normal functioning of the system [1]. A lot of analysis techniques, originated from synchronization theory, nonlinear dynamics, information theory, statistical physics, and from the theory of stochastic processes, is available to estimate strength and direction of interactions from time series [2]. Among other techniques, the entropy measures are a powerful tool for the analysis of time series, as they allow describing the probability distributions of the possible state of a system, and therefore are capable to map changes of spatio-temporal patterns over time and detect early signs of transition from one to another regime. Various practical definitions and calculations of entropy exist [3-6], but most of them have been designed predominantly for the scalar case and are not suited for multivariate time series that are routinely measured in experimental and biological systems. Hence, suitable tools for the detection and quantitative description of the rich variety of modes that can be excited due to the interplay between spatial and temporal behavior are needed. Multivariate multiscale entropy (MMSE) has been recently suggested as a novel measure to characterize the complexity of multivariate time series [4]. The MMSE method involves multivariate sample entropy (MSampEn) and allows to account for both within- and cross-channel dependencies in multiple data channels. Furthermore, it operates over multiple temporal scales. The MMSE method is shown to provide an assessment of the underlying dynamical richness of multivariate observations and has more degrees of freedom in the analysis than standard MSE [4]. However, this MSampEn approach has limited application when the number of variables increases. Differently to the univariate sample entropy (SampEn) introduced by Richman and Moorman [5], which operates on delay vectors of a univariate time series, MSampEn calculates, for a fixed embedding dimension m, the average number of composite delay vector pairs that are within a fixed threshold r and repeats the same calculation for the dimension (m+1) [4]. Because the composite delay vectors are high-dimensional (the dimension is equal to the product of the number of variables and the embedding dimension of the reconstructed phase space), the probability that at least two composite delay vectors will match with respect to tolerance r quickly decreases as the number of variables increases. Among the other multivariate entropy approaches described is Multivariate Principal Subspace Entropy (MPSE) [6], introduced as a measure to estimate spatio-temporal complexity of functional magnetic resonance imaging (fMRI) time series. MPSE is based on the multivariate Gaussian density assumption of the fMRI data. Multivariate singular spectrum analysis (M-SSA) [7-10] provides insight into the dynamics of the underlying system by decomposing the delay-coordinate phase space of a given multivariate time series into a set of data-adaptive orthonormal components and is a powerful tool for spatiotemporal analysis of the vector-valued processes. M-SSA combines two useful approaches of statistical analysis: (1) it determines – with the help of principal component analysis – major directions in the system’s phase space that are populated by the multivariate time series; and (2) it extracts major spectral components by means of singular spectrum analysis (SSA). However, the Shannon entropy constructed from the M-SSA eigenvalue spectrum has weak distinguishing ability and does not provide information about the degree of complexity for both temporal and spatial components of multivariate time series.

In the present paper, it is shown that high-dimensional block Toeplitz covariance matrices (on which M-SSA operates) estimation via Kronecker product expansions makes it possible to define the complexity of multivariate time series by means of the spatial and temporal components and to obtain more information about the underlying process.

2. Description of the algorithm

Let x=xdn:d=1,,D,  n=1,,N be a multivariate time series with D components (variables) of length N. In practical implementation (neuroimagery with EEG, geosciences), this denotes samples of a space–time random process defined over a D-grid of space samples and an N-grid of time samples. It is assumed that each component has been centred and normalized. Each time series component of the multivariate system can be reconstructed in M-dimensional phase space by selecting the embedding dimension M and time delay τ. Each phase point in the phase space is thus defined by [11]:

(1)
x d n = x d n , x d n + τ , , x d n + M - 1 τ T ,

where n=1, 2,,N-M-1τ, and T denotes the transpose of a real matrix. At τ=1, the reconstructed phase space matrix Xd with M rows and L=N-M+1 columns (called a trajectory matrix) is defined by:

(2)
X d = x d ( 1 ) x d N - M + 1 x d M x d N ,

and encompasses M delayed versions of each component. The total trajectory matrix of the multivariate system X will be a concatenation of the component trajectory matrices Xd computed for each component, that is, X=X1,X2,,XDT, and has DM rows of length L=N-M+1. The classical M-SSA algorithm [8, 9] then computes the covariance matrix Σ=1/LXXTRDM×DM of X and its eigendecomposition. The total trajectory matrix X can be rearranged such that the delayed versions of all D components follow one another, that is, X=XD1,XD2,,XDMT. In this case, and for LM, the covariance matrix ΣRMD × MD takes the block Toeplitz form:

(3)
Σ = Σ 0 Σ 1 Σ M - 2 Σ M - 1 Σ ( - 1 ) Σ 0 Σ M - 3 Σ M - 2 Σ ( - ( M - 2 ) ) Σ ( - ( M - 3 ) ) Σ 0 Σ 2 Σ ( - ( M - 1 ) ) Σ ( - ( M - 2 ) ) Σ ( - 1 ) Σ 0 ,

with equal block submatrices ΣQRD×D along each (top left to lower right) diagonal. The covariance matrix Σ combines all auto- as well as cross-covariances up to a time lag equal to M-1. The spatio-temporal covariance matrix ΣRMD×MD can be represented as a Kronecker product (KP) [12-16]:

(4)
Σ = i = 1 r T i S i ,

for some separation rank r, some sequence of TiRM×M and SiRD×D. The KP parametrization assumes that an arbitrary spatio-temporal correlation can be modelled as a product of a spatial and a temporal factor. These two factors are independent of each other; hence, the spatial and temporal correlations are separated from each other in the KP model. The spatial covariance matrix captures the spatial correlations between the components of multivariate time series (in real-world application, signals registered at different sensors) and the temporal covariance matrix captures the temporal correlations as lag dependent. The meaning of Eq. (4) is that the temporal covariance matrix Ti is fixed in space and the spatial covariance matrix Si does not vary over time. In other words, space and time are not correlated. In order to effective detect the structure of the spatio-temporal dependencies in high-dimensional multivariate systems, we use the full rank approximation model, i.e., define r=M. The solution to the nearest Kronecker product for a space-time covariance matrix (NKPST) problem is given by the singular value decomposition (SVD) of a permuted version of the space-time covariance matrix ΣRMD×MD in the following:

1) The matrix ΣRMD×MD is rearranged into another matrix ΞΣRM2×D2 such that the sum of squares that arise in Σ-i=1rTiSiF is exactly the same as the sum of squares that arise in ΞΣ-i=1rvecTivecSiTF, where vecW denotes the vectorized form of the square matrix W (concatenation of columns of W into a column vector). The M2 rows of Ξ are defined as the transpose of the vectorized D×D block submatrices of Σ, where each block submatrix is ΣQi,j=Σi-1D+1:iD,  j-1D+1:D, i, j=1,, M. That is, the permutation operator R:RMD×MDRM2 × D2 is defined by setting the i-1M+jth row of Ξ equal to vecΣQi, jT, i, j=1,, M.

2) The SVD is performed on the matrix Ξ:

(5)
Ξ = U Λ V T = i = 1 r u i λ i v i T ,

and the matrices Ti and Si are defined for the selected Kronecker product (full rank approximation) model by vecTi=λiui and vecSi=λiv1, where ui and vi are left and right singular vectors corresponding to the singular value λi [13, 16]. This is converted back to a square TiRM×M and the SiRD × D matrix by applying the inverse permutation operator.

Finally, the SVD is performed on the estimated matrices Ti and Si, and the spatial and temporal components of the multivariate entropy (MVE) can be defined as the Shannon entropy of the corresponding normalized eigenvalue spectrum:

(6)
E S = - 1 l n D j = 1 D σ S , j ln σ S , j ,
(7)
E T = - 1 l n M j = 1 M σ T , j ln σ T , j ,

where σS,j=λS,j/λS,j and σT,j=λT,j/λT, are the summarized λS,j=i=1rλS,ji,λT,j=i=1rλT,ji and normalized eigenvalues of the decomposed matrices Si and Ti respectively. The Shannon entropy quantifies the degree of uniformity in the distribution of the eigenvalues: it tends to unity at uniform distribution of the eigenvalues and tends to zero if the eigenvalue spectrum is concentrated into a single first largest eigenvalue. The entropy features ES and ET can be expressed in complex form EΣ=ES+iET, and the multivariate spatio-temporal entropy can be defined by modulus EΣ and angle φE of this complex quantity. The angle φE reflects the difference between ES and ET and ranges from 0 (at ET=0) to π/2 (at ES=0).

3. Simulation results

To validate the ability of the proposed algorithm to reveal both within- and cross-component properties in multivariate data, we generated a multivariate time series with defined changes of the complexity. The overembedding (M=16) for all trials was chosen in order to capture the temporal properties of the time series. All data processing and analyses were performed using Matlab software (MathWorks, Natick, MA).

First, to illustrate the ability of the proposed algorithm to model cross-component properties in multivariate data, we considered a multivariate deterministic system – a chain of diffusively coupled Rössler oscillators [9, 17]:

(8)
x ˙ j = - ω j y j - z j ,
y ˙ j = ω j x j + α y j + c y j + 1 - 2 y j + y j - 1 ,
z ˙ j = 0.1 + z j x j - 8.5 .

The position in the chain is given by the index j=1,…, J; ωj=ω1+0.02j-1 are the associated natural frequencies, with ω1=1, and we assume that free boundary conditions x0n=x1n and xJ+1n=xJn exist. The frequency mismatch Δjkω between oscillators j and k is given by Δjkω=0.02j-k. The parameter α=0.15, and c0 is the coupling strength. We considered a chain of J=10 Rössler oscillators and generated a time series using the ODE45 integrator of Matlab. The solution was sampled at time intervals t= 0.1, and the proposed algorithm was applied to x-components of length N= 2048 of the Rössler systems. Fig. 1(a) shows the EΣ and φE curves when the coupling strength c is varied. As c increases, the oscillators form synchronous clusters within which the observed frequency is the same [9]. With a further increase of c the number of clusters decreases as clusters merge, and at sufficiently high c all oscillators are synchronized, i.e., all the mean frequencies coincide, although the amplitudes of the oscillators remain chaotic [17]. MVE is characterized by two components – modulus EΣ and angle φE – that provide comprehensive information about the inherent structure of the multivariate data. The increasing value of φE indicates that EΣ decreases due to the growing strength of dependence between components of a multivariate time series.

To illustrate the ability of the proposed algorithm to model within-component properties in multivariate data, we considered a chain of above-mentioned Rössler oscillators at c=0 and gradually increased the parameter α from 0.10 to 0.35. The larger α, the more complex the resulting time series will be. The results are displayed in Fig. 1(b). In this case, the increased value of φE indicates that EΣ slightly increases because of the degree of randomness of a time series.

To demonstrate the performance of the proposed algorithm for quantifying changes in the dynamics underlying the experimental data set, we have applied the algorithm to multi-channel EEG recordings from epilepsy patients. It is well known that during epileptic seizures EEG time series display a marked change of the dynamics (mainly oscillations of comparatively large amplitude and well defined frequency). The EEG data with the clinically marked start and end of seizure are taken from the CHB-MIT Scalp EEG Database (http://www.physionet.org/physiobank/database/chbmit/). We analyzed 23 contacts of one-minute raw (unfiltered) scalp EEG recordings (from record, numbered chb01/chb01_04.edf) using the block Toeplitz covariance matrix, which was constructed over a moving window of length 3.9 s, corresponding to 1000 data points. This window was shifted with 90 % overlap over the whole EEG recording, and the time evolution of the EΣ and φE was computed. As we can see in Fig. 2, both EΣ and φE indicate sudden changes of the brain dynamics in preictal and ictal states. Furthermore, the angle φE of the multivariate entropy provides essential information about factors that cause this complicated dynamics. The considerable positive deviation of φE from its mean value denotes that EΣ changes due to the growing strength of dependence between components of a multivariate time series, and also because of the degree of randomness of a time series.

Fig. 1. Multivariate entropy (modulus EΣ and angle φE) for: a) a chain of coupled Rössler oscillators (x components) at various coupling strengths c; b) a chain of Rössler oscillators (x components) at c=0 and various values of parameter α

a)

b)

Fig. 2. Multivariate entropy (modulus EΣ and angle φE) applied on epileptic EEG. The period between two vertical dashed lines is visually identified by the neurologists as the seizure period

4. Conclusion

A simulation involving a multivariate deterministic and stochastic time series shows that the approach based on a full rank Kronecker product expansion of the block-Toeplitz covariance matrix, calculated from the full augmented trajectory matrix, can reveal temporal dynamics and relationship between the signals; furthermore, it is a suitable tool for the assessment of the underlying dynamic complexity of multivariate observations. The proposed approach defined in terms of modulus and angle of the complex quantity constructed from the spatial and temporal components of the multivariate entropy reflects both the integrative properties of the systems and the ratio between the complexities in the spatial and temporal directions. The algorithm can be applied to the estimation of the complexity of multivariate time series with a large number (tens) of variates (e.g. EEG), when the application of the other approaches becomes problematic.

References

  1. Seely A. J. E., Maclem P. T. Complex systems and the technology of variability analysis. Critical Care, Vol. 8, 2004, p. 367-384.
  2. Lehnertz K., Ansmanna G., Bialonski S., Dickten H., Geier Ch., Porz S. Evolving networks in the human epileptic brain. Physica D, Vol. 267, 2014, p. 7-15.
  3. Bollt E. M., Skufca J. D., McGregor S. J. Control entropy: a complexity measure for nonstationary signals. Mathematical Biosciences and Engineering, Vol. 6, Issue 1, 2009, p. 1-25.
  4. Ahmed M. U., Mandic D. P. Multivariate multiscale entropy: a tool for complexity analysis of multichannel data. Physical Review E, Vol. 84, 2011, p. 061918.
  5. Richman J. S., Moorman J. R. Physiological time-series analysis using approximate entropy and sample entropy. American Journal of Physiology Heart and Circulatory Physiology, Vol. 278, 2000, p. H2039-H2049.
  6. Schütze H., Martinetz T., Anders S., Mamlouk A. M. A multivariate approach to estimate complexity of FMRI time series. Lecture Notes in Computer Science, Vol. 7553, 2012, p. 540-547.
  7. Plaut G., Vautard R. Spells of low-frequency oscillations and weather regimes in the northern hemisphere. Journal of the Atmospheric Sciences, Vol. 51, Issue 2, 1994, p. 210-236.
  8. Ghil M., Allen M. R., Dettinger M. D., Ide K., Kondrashov D., Mann M. E., et al. Advanced spectral methods for climatic time series. Reviews of Geophysics, Vol. 40, 2002, p. 1-41.
  9. Groth A., Ghil M. Multivariate singular spectrum analysis and the road to phase synchronization. Physical Review E, Vol. 84, 2011, p. 036206.
  10. Golyandina N., Korobeynikov A., Shlemov A., Usevich K. Multivariate and 2D extensions of singular spectrum analysis with the Rssa package. arXiv:1309.5050, 2013, p. 1-74.
  11. Kantz H., Schreiber T. Nonlinear Time Series Analysis. Cambridge University Press, Cambridge, 2003.
  12. Loan C. V., Pitsianis N. Approximation with Kronecker Products, in Linear Algebra for Large Scale and Real Time Applications. Kluwer Publications, Amsterdam, 1993, p. 293-314.
  13. Genton M. G. Separable approximations of space-time covariance matrices. Environmetrics, Vol. 18, 2007, p. 681-695.
  14. Tsiligkaridis T., Hero A. O. Covariance estimation in high dimensions via Kronecker product expansions. IEEE Transactions on Signal Processing, Vol. 61, Issue 21, 2013, p. 5347-5360.
  15. Wirfält P., Jansson M. On Kronecker and linearly structured covariance matrix estimation. IEEE Transactions on Signal Processing, Vol. 62, Issue 6, 2014, p. 1536-1547.
  16. Bijma F., de Munck J. C., Heethaar R. M. The spatiotemporal MEG covariance matrix modeled as a sum of Kronecker products. NeuroImage, Vol. 27, 2005, p. 402-415.
  17. Osipov G. V., Pikovsky A. S., Rosenblum M. G., Kurths J. Phase synchronization effects in a lattice of nonidentical Roessler oscillators. Physical Review E, Vol. 55, 1997, p. 2353-2361.