Software anomalies are recognized as a major problem affecting the performance and availability of many computer systems. The accumulation of anomalies of different kinds, such as memory leaks and unterminated threads, may cause a system either to fail or to operate at suboptimal performance levels. This problem particularly affects web servers, where hosted applications are typically intended to run continuously, which increases the probability of anomaly accumulation and, consequently, its effects. Because the occurrence of anomalies is unpredictable, continuous system monitoring is required to detect possible system failures and/or excessive performance degradation, so that a recovery procedure can be started in a timely manner. In this paper, we present a Machine Learning-based framework for the proactive management of client-server applications in the cloud. Using optimized Machine Learning models and continuous measurement of system features, the framework predicts the remaining time to the occurrence of an unexpected event (system failure, service level agreement violation, etc.) on a virtual machine hosting a server instance of the application. The framework is able to manage virtual machines in the presence of different types of anomalies and with different anomaly occurrence patterns. We show the effectiveness of the proposed solution by presenting the results of a set of experiments we carried out in the context of a real-world-inspired scenario.
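As a rough illustration of the prediction step described above, the following sketch fits a one-feature linear model that maps a measured system feature to the remaining time to failure, then uses the prediction to decide whether a recovery action should start. The abstract does not specify the models or features used, so the choice of ordinary least squares, the single memory-usage feature, the sample data, and all function names here are assumptions made purely for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b for a single feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("feature has no variance; cannot fit a line")
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    b = my - a * mx
    return a, b


def predict_rttf(model, used_memory_mb):
    """Predicted remaining time to failure in seconds, clamped at zero."""
    a, b = model
    return max(0.0, a * used_memory_mb + b)


def should_recover(model, used_memory_mb, threshold_s):
    """True when the predicted remaining time falls within the safety margin."""
    return predict_rttf(model, used_memory_mb) <= threshold_s


# Hypothetical training samples from a monitored virtual machine:
# memory in use (MB) and the time (s) that remained before the failure.
memory_mb = [100, 200, 300, 400, 500]
rttf_s = [800, 600, 400, 200, 0]

model = fit_line(memory_mb, rttf_s)

# Proactive check on a fresh measurement, with an assumed 120 s margin.
if should_recover(model, used_memory_mb=450, threshold_s=120):
    print("start recovery procedure")
```

A real deployment would draw on many features (threads, CPU, swap, response time) and a model selected by validation; the single-feature line only shows the shape of the monitor, predict, and act loop.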
Di Sanzo, P., Pellegrini, A., Avresky, D.R. (2015). Machine Learning for Achieving Self-* Properties and Seamless Execution of Applications in the Cloud. In 2015 IEEE Fourth Symposium on Network Cloud Computing and Applications (NCCA) (pp. 51-58). IEEE Computer Society [10.1109/NCCA.2015.18].