Stable Modeling on Resource Usage Parameters of MapReduce Application

Hexagon, usage, resources, connection.

The Hadoop MapReduce framework is now widely used in production to analyze big data. Applications built on the MapReduce programming model generate and process these large data sets. Because they serve different computational purposes, MapReduce applications have different resource requirements. For a given application, a resource bottleneck in the cloud computing platform inevitably degrades execution performance. Identifying the bottleneck among the resources allocated to a MapReduce application is therefore crucial for both cloud operators and program developers. In this paper, we model the relationships among the resource usage parameters of MapReduce applications using multiple linear regression and investigate the minimum sampling time required for stable modeling. Based on this analysis, we propose an approach for building a stable performance model that exposes the bottleneck resource of a Hadoop platform and yields effective optimization suggestions.
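
For orientation, the standard form of a multiple linear regression model is sketched below. The abstract does not give the model itself, so every symbol here is an assumption made for illustration: y_k is one sampled resource usage parameter at sample k, x_{k,1}, ..., x_{k,p} are the other sampled parameters, and T is the sampling time. The last line shows one possible way to read "stable modeling" (coefficients that stop changing once the sampling time is long enough); the tolerance \delta and the increment \Delta are hypothetical and not taken from the paper.

```latex
% Multiple linear regression over n samples and p explanatory parameters
% (all symbols are illustrative assumptions)
y_k = \beta_0 + \sum_{i=1}^{p} \beta_i \, x_{k,i} + \varepsilon_k,
\qquad k = 1, \dots, n

% Ordinary least squares estimate, where X is the n x (p+1) design matrix
% whose first column is all ones and y is the vector of the y_k
\hat{\beta} = \left( X^{\top} X \right)^{-1} X^{\top} y

% One possible stability criterion (hypothetical): the estimate fitted on
% samples collected up to time T changes little when sampling continues
\left\lVert \hat{\beta}(T + \Delta) - \hat{\beta}(T) \right\rVert \le \delta
```

Under this reading, the minimum sampling time would be the smallest T for which the last condition holds for all longer sampling windows.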

Yangyuan Li