Microservice architecture has emerged as a powerful paradigm for cloud computing owing to its efficient infrastructure management and its capacity to serve users at large scale. A cloud provider requires flexible resource management, such as auto-scaling and provisioning, to meet continually changing demands. A common approach in both commercial and open-source computing platforms is workload-based automatic scaling, which increases the number of instances as the number of incoming requests grows. Concurrency is a request-based policy recently proposed in the evolving microservice framework; under this policy, the platform scales out according to the configured maximum number of requests that each instance may process in parallel. However, it has proven difficult to identify the concurrency configuration that provides the best possible service quality, because throughput and latency depend on many factors, including the workload and the complexity of the infrastructure. Therefore, this study investigated the applicability of an artificial intelligence approach to request-based auto-scaling in the microservice framework. Our results showed that the proposed model could learn an effective scaling policy within a limited number of pods, thereby outperforming the underlying auto-scaling configuration.
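To make the concurrency policy concrete, the following minimal sketch shows how a request-based autoscaler might derive a pod count from the number of in-flight requests and a per-instance concurrency target. It is an illustration under stated assumptions, not the model proposed in the study: the function name `desired_pods` and its parameters are hypothetical, and real platforms typically add averaging windows and stabilization delays that are omitted here.

```python
import math


def desired_pods(in_flight_requests: int,
                 target_concurrency: int,
                 max_pods: int,
                 min_pods: int = 1) -> int:
    """Return a pod count for a concurrency-based scaling policy.

    Hypothetical helper: each pod is assumed to handle at most
    `target_concurrency` requests in parallel, and the result is
    clamped to the range [min_pods, max_pods].
    """
    if target_concurrency <= 0:
        raise ValueError("target_concurrency must be positive")
    # Pods needed so that no pod exceeds its configured concurrency.
    needed = math.ceil(in_flight_requests / target_concurrency)
    # Respect the configured lower and upper bounds on the pod count.
    return max(min_pods, min(needed, max_pods))


if __name__ == "__main__":
    # 250 concurrent requests with a target of 100 per pod.
    print(desired_pods(250, target_concurrency=100, max_pods=10))
```

The sketch also shows why the configuration is hard to tune: the pod count depends directly on `target_concurrency`, so a value that is too high overloads each pod, while a value that is too low exhausts the pod limit early.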
This paper puts forward an autoscaler called EPMA (Elastic Platform for Microservice-based Applications), whose main objective is resource optimization. EPMA makes two main contributions. The first is an analysis module that correctly detects and identifies the root cause of performance degradation using a cross-layer approach. The detected problems may be request-related, such as a workload increase or a specific request that consumes a large amount of resources; problems at the container and VM (Virtual Machine) layers are also considered. The second is a planning module that takes the detected problems into account and produces an optimized elasticity plan, which avoids unnecessary resource provisioning. To this end, the planning module selects the microservices to which resources should be added, as well as the amount of resources needed to correct the situation. Experiments conducted on two concrete use cases showed that the EPMA autoscaler detects and identifies the different problems that cause the overload state and launches an optimized resource provisioning that reduces the computing resources used.
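The analyze-then-plan flow described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the metric fields, the thresholds `CPU_LIMIT` and `SURGE_FACTOR`, the cause labels, and the proportional sizing rule are hypothetical and are not taken from EPMA itself.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ServiceMetrics:
    """Hypothetical per-microservice measurements across layers."""
    name: str
    request_rate: float    # requests/s observed now
    baseline_rate: float   # requests/s considered normal
    container_cpu: float   # container CPU utilization, 0..1
    vm_cpu: float          # CPU utilization of the hosting VM, 0..1
    replicas: int


CPU_LIMIT = 0.8      # assumed utilization threshold
SURGE_FACTOR = 1.5   # assumed ratio that counts as a workload increase


def diagnose(m: ServiceMetrics) -> Optional[str]:
    """Return an assumed root-cause label, or None if the service is healthy."""
    if m.container_cpu > CPU_LIMIT:
        if m.request_rate > SURGE_FACTOR * m.baseline_rate:
            return "workload_increase"
        return "container_saturation"
    if m.vm_cpu > CPU_LIMIT:
        return "vm_saturation"
    return None


def plan_scaling(metrics: List[ServiceMetrics]) -> Dict[str, int]:
    """Map each service that needs resources to the extra replicas to add."""
    plan: Dict[str, int] = {}
    for m in metrics:
        cause = diagnose(m)
        # VM-level causes would need a VM-level action, which this sketch omits.
        if cause in ("workload_increase", "container_saturation"):
            # Size the service so utilization falls back to CPU_LIMIT.
            target = math.ceil(m.replicas * m.container_cpu / CPU_LIMIT)
            if target > m.replicas:
                plan[m.name] = target - m.replicas
    return plan
```

Only services with a diagnosed container-level or request-level cause receive replicas, which mirrors the stated goal of avoiding unnecessary provisioning for healthy services.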