Data processing is at the core of any statistical information system. Statisticians want to specify data transformations and manipulations at a high level, in terms of the entities of statistical models. We present an approach in which a high-level language, EXL, is used for the declarative specification of statistical programs, which are then translated into executable form for various target systems. The language is based on the theory of schema mappings, in particular those defined by a specific class of tuple-generating dependencies (tgds), which we use to optimize user programs and to facilitate their translation to the target systems. The characteristics of this class guarantee good tractability properties and make the approach applicable in Big Data settings. A concrete implementation, EXLEngine, has been developed and is currently in use at the Bank of Italy.
Atzeni, P., Bellomarini, L., Bugiotti, F., & de Leonardis, M. (2017). Executable schema mappings for statistical data processing. Distributed and Parallel Databases, 1-36 [10.1007/s10619-017-7212-2].