Asterism: Pegasus and Dispel4py Hybrid Workflows for Data-Intensive Science

Rosa Filgueira, Rafael Ferreira Da Silva, Amrey Krause, Ewa Deelman, Malcolm Atkinson

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

We present Asterism, an open-source data-intensive framework that combines the strengths of traditional workflow management systems with new parallel stream-based dataflow systems to run data-intensive applications across multiple heterogeneous resources, without users having to: re-formulate their methods according to different enactment engines; manage the data distribution across systems; parallelize their methods; co-place and schedule their methods with computing resources; and store and transfer large/small volumes of data. We also present the Data-Intensive workflows as a Service (DIaaS) model, which enables easy data-intensive workflow composition and deployment on clouds using containers. The feasibility of Asterism and the DIaaS model has been evaluated using a real domain application on the NSF-Chameleon cloud. Experimental results show how Asterism successfully and efficiently exploits combinations of diverse computational platforms, whereas DIaaS delivers specialized software to execute data-intensive applications in a scalable, efficient, and robust way, reducing the engineering time and computational cost.

Original language: English
Title of host publication: Proceedings of DataCloud 2016
Subtitle of host publication: 7th International Workshop on Data-Intensive Computing in the Clouds - Held in conjunction with SC 2016: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-8
Number of pages: 8
ISBN (Electronic): 9781509061587
Publication status: Published - 6 Feb 2017
Event: 7th International Workshop on Data-Intensive Computing in the Clouds, DataCloud 2016 - Salt Lake City, United States
Duration: 14 Nov 2016 → …

Publication series

Name: Proceedings of DataCloud 2016: 7th International Workshop on Data-Intensive Computing in the Clouds - Held in conjunction with SC 2016: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 7th International Workshop on Data-Intensive Computing in the Clouds, DataCloud 2016
Country/Territory: United States
City: Salt Lake City
Period: 14/11/16 → …

Keywords

  • Data-Intensive science
  • Deployment and reusability of execution environments
  • scientific workflows
  • stream-based system
