Abstract
In recent years, considerable attention has been paid to benchmarking and performance modelling of data-intensive applications, with the aim of understanding and improving the development of big data systems. Several interesting approaches have been proposed; however, at the time of writing and to the best of our knowledge, no comprehensive survey thoroughly examines the gaps, trends and trajectories of this area. To fill this void, we present a review of state-of-the-art benchmarking and performance modelling efforts in data-intensive applications.
We start by introducing the two most common dataflow patterns in use; for each pattern, we review the approaches to benchmarking and modelling, together with the validation and experimental environments used. Furthermore, we construct a taxonomy and classification to provide a deep understanding of the focus areas of this domain and to identify opportunities for further research. We conclude by analysing each research gap and highlighting future trends.
Original language | English |
---|---|
Title of host publication | 2020 IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT) |
Publisher | IEEE |
Pages | 67-76 |
Number of pages | 10 |
ISBN (Electronic) | 9780738123967 |
ISBN (Print) | 9781665415675 |
Publication status | Published - 7 Dec 2020 |
Event | 7th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT 2020) - University of Leicester, Leicester, United Kingdom. Duration: 7 Dec 2020 → 12 Dec 2020. Conference number: 7. https://www.cs.le.ac.uk/events/BDCAT2020/index.htm |
Conference
Conference | 7th IEEE/ACM International Conference on Big Data Computing, Applications and Technologies (BDCAT 2020) |
---|---|
Abbreviated title | BDCAT 2020 |
Country/Territory | United Kingdom |
City | Leicester |
Period | 7/12/20 → 12/12/20 |
Keywords
- Dataflow with cycles
- Communication patterns
- Modelling
- Machine learning
- Big data
- Data-intensive application