Seastar: a comprehensive framework for telemetry data in HPC environments

Ole Weidner, Adam David Barker, Malcolm Atkinson

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A large number of 2nd generation high-performance computing applications and services rely on adaptive and dynamic architectures and execution strategies to run efficiently, resiliently, and at scale on today’s HPC infrastructures. They require information about applications and their environment to steer and optimize execution. We define this information as telemetry data.

Current HPC platforms do not provide the infrastructure, interfaces, and conceptual models to collect, store, analyze, and access such data. Today, applications depend on application- and platform-specific techniques for collecting telemetry data, introducing significant development overheads that inhibit portability and mobility. The development and adoption of adaptive, context-aware strategies is thereby impaired. To facilitate 2nd generation applications, more efficient application development, and swift adoption of adaptive applications in production, a comprehensive framework for telemetry data management must be provided by future HPC systems and services.

We introduce Seastar, a conceptual model and a software framework to collect, store, analyze, and exploit streams of telemetry data generated by HPC systems and their applications. We show how Seastar can be integrated with HPC platform architectures and how it enables common application execution strategies.
Original language: English
Title of host publication: Proceedings of the 7th International Workshop on Runtime and Operating Systems for Supercomputers ROSS 2017
Place of Publication: New York
Publisher: ACM
Number of pages: 8
ISBN (Print): 9781450350860
Publication status: Published - 27 Jun 2017
Event: International Workshop on Runtime and Operating Systems for Supercomputers - Washington, DC, United States
Duration: 27 Jun 2017 → 27 Jun 2017
http://www.mcs.anl.gov/events/workshops/ross/2017/index.php

Conference

Conference: International Workshop on Runtime and Operating Systems for Supercomputers
Abbreviated title: ROSS
Country/Territory: United States
City: Washington, DC
Period: 27/06/17 → 27/06/17

Keywords

  • HPC platform models
  • APIs
  • Telemetry data management
  • Context awareness
  • Adaptive applications
