Bayesian network structural learning for complex survey data

MARELLA, Daniela; VICARD, Paola
2014-01-01

Abstract

Bayesian networks (BNs) are multivariate statistical models that satisfy sets of conditional independence statements. Recently, BNs have been applied to problems in official statistics. The association structure can be learnt from data through a sequence of independence and conditional independence tests using the PC algorithm. The learning process assumes independent and identically distributed observations. This assumption is almost never valid for sample survey data, since most commonly used survey designs employ stratification, cluster sampling, and/or unequal selection probabilities. The design may therefore not be ignorable, and it must be taken into account in the learning process. Alternative procedures for Bayesian network structural learning under complex designs are consequently of growing interest. A correction of the PC algorithm is proposed to take the complexity of the sampling design into account. In most cases, design effects are provided only for the cells and for specific marginals of the contingency table. In such a situation, the first-order Rao-Scott corrections can be computed for those loglinear models that admit an explicit solution to the likelihood equations. Therefore, we focus on decomposable models, a subset of hierarchical loglinear models, which are typically used to investigate the association structure in terms of (conditional) independence between categorical variables.
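
As an illustration of the kind of correction the abstract refers to, the following minimal sketch computes a first-order Rao-Scott corrected Pearson statistic for a single test of independence in a two-way table, using only design effects for the cells and for the row and column marginals. It is not the authors' procedure: the function names, the argument layout, and the restriction to a two-way table are assumptions made for the example, and the formula is the standard first-order correction for independence in an r x c table. In a PC-style search, a statistic of this kind would stand in for the uncorrected one in each independence test.

```python
def pearson_chi2(counts):
    """Pearson X^2 for independence in an r x c table.

    `counts` is assumed to hold n times the survey-weighted estimates of the
    cell proportions, with n the sample size (an assumption of this sketch).
    """
    n = sum(sum(row) for row in counts)
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(col) for col in zip(*counts)]
    x2 = 0.0
    for i, row in enumerate(counts):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            x2 += (observed - expected) ** 2 / expected
    return x2


def first_order_delta(counts, deff_cells, deff_rows, deff_cols):
    """First-order correction factor (generalised design effect).

    Uses only cell design effects and the design effects of the row and
    column marginal proportions.
    """
    n = sum(sum(row) for row in counts)
    p_row = [sum(row) / n for row in counts]
    p_col = [sum(col) / n for col in zip(*counts)]
    r, c = len(p_row), len(p_col)
    total = sum(
        (1 - p_row[i] * p_col[j]) * deff_cells[i][j]
        for i in range(r)
        for j in range(c)
    )
    total -= sum((1 - p_row[i]) * deff_rows[i] for i in range(r))
    total -= sum((1 - p_col[j]) * deff_cols[j] for j in range(c))
    return total / ((r - 1) * (c - 1))


def rao_scott_first_order(counts, deff_cells, deff_rows, deff_cols):
    """Return (corrected statistic, degrees of freedom).

    The corrected statistic X^2 / delta is referred to a chi-square
    distribution with (r - 1)(c - 1) degrees of freedom.
    """
    r, c = len(counts), len(counts[0])
    delta = first_order_delta(counts, deff_cells, deff_rows, deff_cols)
    return pearson_chi2(counts) / delta, (r - 1) * (c - 1)


if __name__ == "__main__":
    # Hypothetical 2 x 2 table with every design effect equal to 2.
    table = [[30, 20], [20, 30]]
    stat, df = rao_scott_first_order(
        table, [[2.0, 2.0], [2.0, 2.0]], [2.0, 2.0], [2.0, 2.0]
    )
    print(stat, df)
```

When every design effect equals 1 the correction factor is 1 and the usual Pearson statistic is recovered; a common design effect d simply divides the statistic by d.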
ISBN: 978-84-937822-4-5
Marella, D., Musella, F., Vicard, P. (2014). Bayesian network structural learning for complex survey data. In 7th International Conference of the ERCIM (European Research Consortium for Informatics and Mathematics) Working Group on Computational and Methodological Statistics (ERCIM 2014).

Use this identifier to cite or link to this document: https://hdl.handle.net/11590/308254