Fasulo, A., Naccarato, A., Pizzichini, A. (2019). Nowcasting the Italian unemployment rate with Google Trends. RIVISTA ITALIANA DI ECONOMIA, DEMOGRAFIA E STATISTICA, LXXIII, 29-40.
Nowcasting the Italian unemployment rate with Google Trends
Andrea Fasulo; Alessia Naccarato
2019-01-01
Abstract
The Italian National Institute of Statistics, like most National Statistical Institutes in the world, produces forecasts of socio-economic indicators by means of statistical models that make no use of information from external sources and rely only on the data provided by its own sample survey. In the field of Official Statistics, some studies have recently assessed whether internet search data, whose main features are easy availability and low cost, can be used to facilitate the estimation of phenomena of interest or to produce additional information. Several studies have used Google Trends (GT) time series to nowcast important short-term economic indicators. One of the most studied is the unemployment rate, and many studies have focused specifically on predicting the youth unemployment rate, because young people are assumed to use online job search channels more than other age groups. The paper tries to verify the consistency of the time series available from GT and compares different models for nowcasting the quarterly unemployment rate for different age categories, specifically 15-24, 25-34 and 35-49. It also presents analyses of the volatility of the GT time series. The results reveal critical issues in terms of high variability of the GT time series, which calls into question their use in the production of Official Statistics. Furthermore, the nowcast results show that, for each age category analysed, the best predictions are always those provided by the ARIMA model in which the exogenous variable is the GT query share. The age category with the greatest improvement in prediction is 25-34.
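For readers unfamiliar with the model class named in the abstract, an ARIMA model with an exogenous variable can be written, in one common form, as a regression with ARIMA errors. The sketch below is a generic textbook formulation and not necessarily the specification used in the paper: the orders p, d and q, the single coefficient on the GT variable, and the assumption that the GT query share has been aggregated to quarterly frequency are all illustrative, since the abstract does not state them.

```latex
% Generic regression with ARIMA(p,d,q) errors (illustrative, not the paper's stated specification).
% y_t : quarterly unemployment rate for one age category (15-24, 25-34 or 35-49)
% x_t : GT query share for quarter t (assumed already aggregated to quarterly frequency)
% B   : backshift operator, B y_t = y_{t-1}
\begin{align}
  y_t &= \beta\, x_t + \eta_t, \\
  \phi(B)\,(1 - B)^{d}\, \eta_t &= \theta(B)\, \varepsilon_t,
  \qquad \varepsilon_t \sim \text{i.i.d. } (0, \sigma^2), \\
  \phi(B) &= 1 - \phi_1 B - \dots - \phi_p B^{p}, \qquad
  \theta(B) = 1 + \theta_1 B + \dots + \theta_q B^{q}.
\end{align}
```

Under this form, a nowcast for the current quarter uses the value of x_t already available from GT before the survey estimate of y_t is released, and setting the coefficient on x_t to zero recovers the plain ARIMA benchmark against which the improvement is measured.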