Multivariate Time Series Dataset for Space Weather Data Analytics [pdf][post][data]
> [DOI: 10.1038/s41597-020-0548-x]
Angryk, R.A., Martens, P.C., Aydin, B., Kempton, D., Mahajan, S.S., Basodi, S., Ahmadzadeh, A., Cai, X., Boubrahimi, S.F., Hamdi, S.M., Schuh, M.A. and Georgoulis, M.K.
Scientific Data, Nature
High-quality time series data for solar flare prediction have only recently become accessible. Many previous and current projects rely on point-in-time measurements, and it remains an open question whether a time-series approach will enable new progress in applying machine learning to flare prediction. Here we use a benchmark dataset, Space Weather ANalytics for Solar Flares (SWAN-SF), released by Angryk et al. [A-1] and made entirely of multivariate time series, to carry out unbiased flare forecasting and, we hope, to help settle that question.
The SWAN-SF dataset consists of five partitions (see Fig. 1). The partitions do not overlap in time and are divided so that each contains approximately the same number of X- and M-class flares (see Fig. 2). The data points in this dataset are time series slices of physical (magnetic field) parameters, extracted from flaring and flare-quiet regions in a sliding fashion. That is, for a particular flare with a unique id, k equal-length multivariate time series are collected from a fixed period of time in the history of that flare. This period is called the observation window, denoted by T_obs, and spans 24 hours. If t_i denotes the starting point of the i-th slice of the multivariate time series, then the (i+1)-th slice starts at t_i + τ, where T_obs = 8τ. [Xb]
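The sliding scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build SWAN-SF: the function names `slice_windows` and `sliding_slices` are hypothetical, the 24-hour window and the relation T_obs = 8τ are taken from the text as written, and the sample counts in the usage lines are arbitrary.

```python
from datetime import datetime, timedelta

# Values taken from the description above (assumed, not read from the dataset).
T_OBS = timedelta(hours=24)  # observation window length
TAU = T_OBS / 8              # step between consecutive slices, from T_obs = 8 * tau


def slice_windows(t0, k, t_obs=T_OBS, tau=TAU):
    """Return k (start, end) pairs: slice i starts at t0 + i * tau and lasts t_obs."""
    return [(t0 + i * tau, t0 + i * tau + t_obs) for i in range(k)]


def sliding_slices(rows, window_len, step):
    """Cut a sequence of multivariate samples into equal-length, overlapping slices.

    rows:       list of samples, each a tuple of parameter values at one time step
    window_len: number of samples per slice
    step:       number of samples between the starts of consecutive slices
    """
    last_start = len(rows) - window_len
    return [rows[s:s + window_len] for s in range(0, last_start + 1, step)]


# Usage: three slice windows starting from a hypothetical reference time.
windows = slice_windows(datetime(2012, 1, 1), k=3)

# Usage: ten two-parameter samples, cut into slices of 4 samples with a step of 2.
samples = [(float(i), float(i) * 0.5) for i in range(10)]
slices = sliding_slices(samples, window_len=4, step=2)
```

Because the step is shorter than the window, consecutive slices overlap heavily, which matters when splitting data for training and testing: slices from the same flare should stay on the same side of the split.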
SWAN-SF contains a collection of 82 physical parameters derived from the vector magnetic field data. These parameters are potentially important for analyzing and forecasting solar flares and coronal mass ejections. This sprint's goal is to rank these features in order of their usefulness for predicting flare activity.
On the right, 24 of those magnetic-field parameters are listed with their formulas and units.