Monitoring-as-a-Service in the Cloud

Shicong Meng
College of Computing, Georgia Tech


Abstract

Delivering monitoring as a service (MaaS) is attracting significant interest and demand from both cloud service providers and consumers. A fundamental challenge for effective MaaS is to architect state monitoring as a layered design that offers performance and cost elasticity, with automated tuning of system parameters and adaptive control of the knobs across different monitoring service stacks. We argue that MaaS should support not only conventional state monitoring functionality, such as periodic data collection, centralized violation detection, and single-tenant provisioning, but also advanced capabilities delivered as plug-and-play services. By consolidating monitoring demands at different levels (infrastructure, platform, and application), these capabilities enable cloud monitoring services with better quality, scalability, and effectiveness at lower cost. In this talk, we present three enhanced MaaS capabilities: violation-likelihood-based state monitoring, distributed window-based violation detection, and multi-tenancy-enabled state monitoring. We first discuss the limitations of existing state monitoring approaches in terms of state information collection, distributed violation detection, and multi-tenancy support; these limitations offer insights into how enhanced MaaS capabilities can provide value-added benefits to both cloud service providers and consumers. Specifically, for state information tracking and collection, we show that violation-likelihood-based state monitoring can dynamically adjust monitoring intensity based on the likelihood of detecting important events, leading to significant gains in monitoring service consolidation. For distributed violation detection, we develop a window-based detection framework that is more resilient to noise and outliers and is also cost-effective in terms of messaging and coordination. For multi-tenancy support, we develop a multi-tenancy-based scheduling method that allows multiple cloud service consumers to use MaaS with improved performance and efficiency at a more affordable cost. Through an example cloud application provisioning service, we show how these value-added monitoring services can relieve users from exploring complex options such as virtual machine (VM) types and VM cluster configurations, by leveraging cumulative cloud application monitoring data to efficiently find provisioning plans for a user-specified performance goal.
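
As a rough illustration of two of the ideas above, the following Python sketch shows one way window-based violation detection and likelihood-driven monitoring intensity could look on a single monitored value. It is not the implementation presented in the talk. The function names, the rule that an alert requires every sample in the window to exceed the threshold, and the linear interval heuristic (distance to the threshold as a stand-in for violation likelihood) are all assumptions made for illustration.

```python
from collections import deque


def window_violation(samples, threshold, window):
    """Return the indices at which the last `window` samples all exceed
    `threshold` (assumed definition of a window-based violation)."""
    recent = deque(maxlen=window)  # sliding window of per-sample violation flags
    alerts = []
    for i, value in enumerate(samples):
        recent.append(value > threshold)
        # Alert only when the window is full and every sample in it violates.
        if len(recent) == window and all(recent):
            alerts.append(i)
    return alerts


def next_interval(value, threshold, base_interval, max_interval):
    """Hypothetical heuristic: sample less often the further the current value
    is below a positive threshold, i.e. the less likely a violation appears."""
    if value >= threshold:
        return base_interval  # at or above the threshold: sample at full rate
    margin = (threshold - value) / threshold  # relative distance to threshold
    interval = base_interval + margin * (max_interval - base_interval)
    # Clamp so that negative values cannot push the interval past the maximum.
    return min(max_interval, max(base_interval, interval))


# Usage: a CPU utilization trace (percent) with one isolated spike at index 1.
trace = [10, 95, 20, 92, 93, 97, 30]
alerts = window_violation(trace, threshold=90, window=3)
interval = next_interval(45, threshold=90, base_interval=1, max_interval=10)
```

In this trace the isolated spike at index 1 raises no alert, while the sustained run at indices 3 to 5 does, which is the sense in which a window-based detector tolerates outliers. Under the assumed heuristic, a value halfway to the threshold is sampled at an interval halfway between the base and maximum intervals.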