Current anonymization techniques for statistical databases exhibit significant limitations related to the utility-privacy trade-off, the introduction of artefacts, and the vulnerability to correlation. We propose an anonymization technique based on the whitening/recolouring procedure, which treats the database as an instance of a random population and applies statistical signal processing methods to it. In response to a query, the technique estimates the covariance matrix of the true data and builds a linear transformation of the data, producing an output that has the same statistical characteristics as the true data up to the second order but is not directly linked to single records. The technique is applied to a real database containing the location data of taxi trips in New York. We show that the technique reduces the artefacts introduced by noise addition while preserving the first- and second-order statistical features of the true data (hence maintaining the utility of the query output).
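The whitening/recolouring idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `recolour`, the use of Cholesky factors, the Laplace noise, and the synthetic two-dimensional data standing in for location records are all assumptions made for illustration. The sketch whitens a noise-perturbed copy of the data, then recolours it with the covariance estimated from the true data, so that the released values share the sample mean and sample covariance of the true data.

```python
import numpy as np


def recolour(true_data, noisy_data):
    """Whiten noisy_data, then recolour it with the covariance of true_data.

    Both inputs are (n_records, n_features) arrays. Hypothetical helper,
    written only to illustrate the whitening/recolouring idea.
    """
    mu_true = true_data.mean(axis=0)
    mu_noisy = noisy_data.mean(axis=0)

    # Sample covariance matrices (features are columns, hence rowvar=False).
    cov_true = np.cov(true_data, rowvar=False)
    cov_noisy = np.cov(noisy_data, rowvar=False)

    # Cholesky factors: cov = L @ L.T (assumes both matrices are positive definite).
    l_true = np.linalg.cholesky(cov_true)
    l_noisy = np.linalg.cholesky(cov_noisy)

    # Whitening: solve L_noisy @ W = centred noisy data, so W has identity covariance.
    white = np.linalg.solve(l_noisy, (noisy_data - mu_noisy).T)

    # Recolouring: impose the covariance and mean of the true data.
    return (l_true @ white).T + mu_true


# Synthetic correlated 2-D data (an assumption; not the taxi data set).
rng = np.random.default_rng(0)
true_xy = rng.multivariate_normal(
    mean=[5.0, -3.0], cov=[[2.0, 1.2], [1.2, 1.5]], size=1000
)

# Noise-perturbed copy; the Laplace scale is an arbitrary illustrative choice.
noisy_xy = true_xy + rng.laplace(scale=1.0, size=true_xy.shape)

released = recolour(true_xy, noisy_xy)

# First- and second-order sample statistics match those of the true data.
print(np.allclose(released.mean(axis=0), true_xy.mean(axis=0)))
print(np.allclose(np.cov(released, rowvar=False), np.cov(true_xy, rowvar=False)))
```

In this sketch the match in mean and covariance is exact up to floating-point error, because the same sample estimators are used for whitening and recolouring. How much privacy the released records retain depends on the noise step, which this example does not analyse.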

D'Acquisto, G., Mazzoccoli, A., Ciminelli, F., Naldi, M. (2020). Privacy Through Data Recolouring. In Annual Privacy Forum (pp. 61-72). Springer, Cham [10.1007/978-3-030-55196-4_4].

Privacy Through Data Recolouring

Giuseppe D’Acquisto;Alessandro Mazzoccoli;Maurizio Naldi
2020-01-01

ISBN: 978-3-030-55195-7


Use this identifier to cite or link to this document: https://hdl.handle.net/11590/420487