A survey: benchmarking and performance modelling of data intensive applications

Sheriffo Ceesay, Yuhui Lin*, Adam David Barker*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

In recent years, there has been considerable focus on benchmarking and performance modelling of data-intensive applications to understand and improve the development of big data systems. Several interesting approaches have been proposed; however, at the time of writing and to the best of our knowledge, there are no comprehensive surveys that thoroughly examine the gaps, trends and trajectories of this area. To fill this void, we present a review of the state-of-the-art benchmarking and performance modelling efforts in data-intensive applications. We start by introducing the two most common dataflow patterns used. For each of these patterns, we review the approaches to benchmarking, modelling, and validation and experimental environments. Furthermore, we construct a taxonomy and classification to provide a deep understanding of the focus areas of this domain and identify opportunities for further research. We conclude by analysing each research gap and highlighting future trends.
Original language: English
Title of host publication: 2020 IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT)
Publisher: IEEE
Pages: 67-76
Number of pages: 10
ISBN (Electronic): 9780738123967
ISBN (Print): 9781665415675
Publication status: Published - 7 Dec 2020
Event: 7th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT 2020) - University of Leicester, Leicester, United Kingdom
Duration: 7 Dec 2020 to 12 Dec 2020
Conference number: 7
https://www.cs.le.ac.uk/events/BDCAT2020/index.htm

Conference

Conference: 7th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT 2020)
Abbreviated title: BDCAT 2020
Country/Territory: United Kingdom
City: Leicester
Period: 7/12/20 to 12/12/20

Keywords

  • Dataflow with cycles
  • Communication patterns
  • Modelling
  • Machine learning
  • Big data
  • Data-intensive application
