Massive feature extraction for explaining and foretelling hydroclimatic time series forecastability at the global scale

Papacharalampous, G.; Tyralis, H.; Pechlivanidis, I. G.; Grimaldi, S.; Volpi, E.

doi:10.1016/j.gsf.2022.101349

Statistical analyses and descriptive characterizations are sometimes assumed to offer information on time series forecastability. Despite the scientific interest suggested by such assumptions, the relationships between descriptive time series features (e.g., temporal dependence, entropy, seasonality, trend and linearity features) and actual time series forecastability (quantified by issuing and assessing forecasts for the past) are scarcely studied and quantified in the literature. In this work, we aim to fill this gap by investigating such relationships, and the way that they can be exploited for understanding hydroclimatic forecastability and its patterns. To this end, we follow a systematic framework bringing together a variety of concepts and methods that are mostly new for hydrology, including 57 descriptive features and nine seasonal time series forecasting methods (i.e., one simple, five exponential smoothing, two state space and one automated autoregressive fractionally integrated moving average methods). We apply this framework to three global datasets originating from the larger Global Historical Climatology Network (GHCN) and Global Streamflow Indices and Metadata (GSIM) archives. As these datasets comprise over 13,000 monthly temperature, precipitation and river flow time series from several continents and hydroclimatic regimes, they allow us to provide reliable characterizations and interpretations of 12-month ahead hydroclimatic forecastability at the global scale. We first find that the exponential smoothing and state space methods for time series forecasting are roughly equally efficient in identifying an upper limit of this forecastability in terms of Nash-Sutcliffe efficiency, while the simple method is shown to be mostly useful in identifying its lower limit.
We then demonstrate that the assessed forecastability is strongly related to several descriptive features, including seasonality, entropy, (partial) autocorrelation, stability, (non)linearity, spikiness and heterogeneity features, among others. We further (i) show that, if such descriptive information is available for a monthly hydroclimatic time series, we can even foretell the quality of its future forecasts with a considerable degree of confidence, and (ii) rank the features according to their efficiency in explaining and foretelling forecastability. We believe that the obtained rankings are of key importance for understanding forecastability. Spatial forecastability patterns are also revealed through our experiments, with East Asia (Europe) being characterized by larger (smaller) monthly temperature time series forecastability and the Indian subcontinent (Australia) being characterized by larger (smaller) monthly precipitation time series forecastability, compared to other continental-scale regions, and less notable differences characterizing monthly river flow from continent to continent. A comprehensive interpretation of such patterns through massive feature extraction and feature-based time series clustering is shown to be possible. Indeed, continental-scale regions characterized by different degrees of forecastability are also attributed to different clusters or mixtures of clusters (because of their essential differences in terms of descriptive features).
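
As an illustration of how 12-month ahead forecastability can be quantified with the Nash-Sutcliffe efficiency, the following is a minimal sketch and not the authors' implementation. It assumes the standard definition NSE = 1 - sum((o - f)^2) / sum((o - mean(o))^2) and uses a seasonal naive forecast as a stand-in for a simple benchmark; the abstract does not say which simple method was used. The function names and the synthetic series are hypothetical.

```python
import math


def nash_sutcliffe_efficiency(observed, forecast):
    """Standard NSE: 1 minus the ratio of squared forecast errors to the
    squared deviations of the observations from their own mean."""
    if len(observed) != len(forecast) or not observed:
        raise ValueError("observed and forecast must be non-empty and equal in length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    if sst == 0:
        raise ValueError("observed series is constant; NSE is undefined")
    return 1.0 - sse / sst


def seasonal_naive_forecast(history, horizon=12, period=12):
    """Assumed simple benchmark: repeat the last observed seasonal cycle."""
    if len(history) < period:
        raise ValueError("history must cover at least one full period")
    last_cycle = history[-period:]
    return [last_cycle[h % period] for h in range(horizon)]


# Synthetic, perfectly periodic monthly series (illustrative data only).
series = [10.0 + 5.0 * math.sin(2.0 * math.pi * m / 12.0) for m in range(48)]
train, test = series[:-12], series[-12:]  # hold out the last 12 months

forecast = seasonal_naive_forecast(train, horizon=12, period=12)
nse = nash_sutcliffe_efficiency(test, forecast)
print(f"12-month ahead NSE: {nse:.3f}")
```

An NSE of 1 indicates a perfect forecast, while an NSE of 0 means the forecast is no better than the mean of the held-out observations. Real hydroclimatic series would fall between or below these values depending on the series and the method.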

Papacharalampous, G., Tyralis, H., Pechlivanidis, I.G., Grimaldi, S., Volpi, E. (2022). Massive feature extraction for explaining and foretelling hydroclimatic time series forecastability at the global scale. Geoscience Frontiers, 13(3), 101349 [10.1016/j.gsf.2022.101349].