Exploring characteristics of inter-cluster machines and cloud applications on Google clusters

Yuhui Lin, Adam David Barker, Sheriffo Ceesay

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Abstract

Modern cluster management systems have been evolving to cope with running and managing diverse cloud applications on heterogeneous computing clusters. Consequently, system behaviour has become complex and non-trivial to explain. In this paper we take the recently published Google trace data set version 3 (V3) as a case study to explore various aspects of inter-cluster differences. We analyse the distribution of the underlying physical machine resources, e.g. the number and types of machines, and metrics of computational job requests, e.g. job duration, utilisation and Cycles Per Instruction (CPI). We also apply an unsupervised learning algorithm to these metrics to characterise jobs. Our analysis suggests that the composition of the underlying machine resources can differ substantially between cells, and that cells with similar machine resource structures can utilise resources differently depending on the characteristics of job requests.
Original language: English
Title of host publication: The 4th Workshop on Benchmarking, Performance Tuning and Optimization for Big Data Applications (BPOD)
Publisher: IEEE Computer Society
Publication status: Published - 10 Dec 2020
Event: IEEE International Conference on Big Data - IEEE BigData 2020
Duration: 10 Dec 2020 → 13 Dec 2020
https://bigdataieee.org/BigData2020/

Conference

Conference: IEEE International Conference on Big Data - IEEE BigData 2020
Period: 10/12/20 → 13/12/20

Keywords

  • Cloud computing
  • Google cloud traces
  • Cloud application characteristics
