Reconstructing weighted networks from partial information is necessary in many important circumstances, e.g. for a correct estimation of systemic risk. It has been shown that, in order to achieve an accurate reconstruction, it is crucial to reliably replicate the empirical degree sequence, which is however unknown in many realistic situations. More recently, it has been found that the knowledge of the degree sequence can be replaced by the knowledge of the strength sequence, which is typically accessible, complemented by that of the total number of links, thus considerably relaxing the observational requirements. Here we further relax these requirements and devise a procedure valid when even the the total number of links is unavailable. We assume that, apart from the heterogeneity induced by the degree sequence itself, the network is homogeneous, so that its (global) link density can be estimated by sampling subsets of nodes with representative density. We show that the best way of sampling nodes is the random selection scheme, any other procedure being biased towards unrealistically large, or small, link densities. We then introduce our core technique for reconstructing both the topology and the link weights of the unknown network in detail. When tested on real economic and financial data sets, our method achieves a remarkable accuracy and is very robust with respect to the sampled subsets, thus representing a reliable practical tool whenever the available topological information is restricted to small portions of nodes.
Squartini, T., Cimini, G., Gabrielli, A., Garlaschelli, D. (2017). Network reconstruction via density sampling. APPLIED NETWORK SCIENCE, 2(1), 3 [10.1007/s41109-017-0021-8].
Network reconstruction via density sampling
Gabrielli A.;
2017-01-01
Abstract
Reconstructing weighted networks from partial information is necessary in many important circumstances, e.g. for a correct estimation of systemic risk. It has been shown that, in order to achieve an accurate reconstruction, it is crucial to reliably replicate the empirical degree sequence, which is however unknown in many realistic situations. More recently, it has been found that the knowledge of the degree sequence can be replaced by the knowledge of the strength sequence, which is typically accessible, complemented by that of the total number of links, thus considerably relaxing the observational requirements. Here we further relax these requirements and devise a procedure valid when even the the total number of links is unavailable. We assume that, apart from the heterogeneity induced by the degree sequence itself, the network is homogeneous, so that its (global) link density can be estimated by sampling subsets of nodes with representative density. We show that the best way of sampling nodes is the random selection scheme, any other procedure being biased towards unrealistically large, or small, link densities. We then introduce our core technique for reconstructing both the topology and the link weights of the unknown network in detail. When tested on real economic and financial data sets, our method achieves a remarkable accuracy and is very robust with respect to the sampled subsets, thus representing a reliable practical tool whenever the available topological information is restricted to small portions of nodes.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.