Machine Learning (ML) has emerged as a powerful tool for predicting task execution times across the variety of VM types offered by Infrastructure-as-a-Service (IaaS) clouds. However, training ML models that deliver accurate predictions can become uneconomical for users because collecting samples is costly in both time and money, especially when an IaaS cloud offers a wide choice of VM types. This paper investigates an ML model bootstrapping technique that leverages analytical modeling to reduce the cost of collecting training samples while maintaining robust performance predictions. Complementarily, the technique can be used to improve the accuracy of ML models when few training samples are available. Experimental results highlight the potential of the proposed technique across various workloads and a large set of VM types, paving the way for more cost-effective ML-based performance prediction in IaaS clouds.
Marotta, R., Russo Russo, G., Quaglia, F., Di Sanzo, P. (2025). A Bootstrapping Technique for Reducing the Costs of Machine Learning Models for Predicting Execution Times in IaaS Clouds. In Proceedings of the 2025 ACM Symposium on Cloud Computing. ACM [10.1145/3772052.3772251].
A Bootstrapping Technique for Reducing the Costs of Machine Learning Models for Predicting Execution Times in IaaS Clouds
Romolo Marotta; Gabriele Russo Russo; Pierangelo Di Sanzo
2025-01-01


