Property graphs are becoming popular for Intrusion Detection Systems (IDSs) because they make it possible to leverage distributed graph processing platforms to identify malicious network traffic patterns. However, no benchmark for studying their performance on big data has yet been reported. In general, benchmarking a system involves executing workloads on datasets, both of which must be representative of the application of interest. Few datasets containing real network traffic are openly available due to privacy concerns, which in turn can limit the scope and results of a benchmark. In this work, we build two synthetic data generators for benchmarking next-generation IDSs by introducing support for property graphs in two well-known graph generation algorithms: Barabási-Albert and Kronecker. We run an extensive experimental evaluation using a publicly available dataset as the seed for data generation, and we show that the proposed approach generates synthetic datasets with high veracity while also exhibiting linear performance scalability.
Iannucci, S., Kholidy, H.A., Ghimire, A.D., Jia, R., Abdelwahed, S., & Banicescu, I. (2017). A Comparison of Graph-Based Synthetic Data Generators for Benchmarking Next-Generation Intrusion Detection Systems. In Proceedings - IEEE International Conference on Cluster Computing, ICCC (pp.278-289). 345 E 47TH ST, NEW YORK, NY 10017 USA : Institute of Electrical and Electronics Engineers Inc. [10.1109/CLUSTER.2017.54].
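To make the idea of a property-graph generator more concrete, the following is a minimal sketch of Barabási-Albert preferential attachment in which every generated edge also receives a set of properties. It is not the implementation from the paper: the abstract does not say how properties are attached, so drawing each edge's properties uniformly from a list of seed records is an assumption made purely for illustration. The function name, its parameters, and the seed records (protocol and byte counts) are all hypothetical.

```python
import random


def barabasi_albert_property_graph(n, m, seed_edge_props, rng=None):
    """Grow a graph with n nodes by preferential attachment.

    Each new node attaches to m distinct existing nodes. Every edge gets a
    copy of a property record drawn uniformly from seed_edge_props
    (an illustrative assumption, not the method described in the paper).
    Returns a list of (source, target, properties) tuples.
    """
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    if not seed_edge_props:
        raise ValueError("seed_edge_props must not be empty")
    rng = rng or random.Random()

    edges = []
    # Each node appears in this pool once per incident edge, so a uniform
    # draw from it selects a node with probability proportional to its degree.
    degree_pool = []
    # The first new node connects to all m initial nodes.
    targets = list(range(m))

    for new_node in range(m, n):
        for target in targets:
            props = dict(rng.choice(seed_edge_props))  # copy the seed record
            edges.append((new_node, target, props))
        degree_pool.extend(targets)
        degree_pool.extend([new_node] * m)

        # Pick m distinct targets for the next node, weighted by degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(degree_pool))
        targets = sorted(chosen)

    return edges


# Hypothetical seed records standing in for flows extracted from a real trace.
seed_records = [
    {"protocol": "tcp", "bytes": 1200},
    {"protocol": "udp", "bytes": 300},
    {"protocol": "icmp", "bytes": 64},
]

graph_edges = barabasi_albert_property_graph(50, 3, seed_records, random.Random(0))
```

Each new node contributes exactly `m` edges, so the output size and the work done grow linearly with the number of generated nodes. A generator aiming at high veracity would presumably need to preserve correlations between properties and graph structure, which this independent sampling does not attempt.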