Abstract
The integration of clusters of computers into computational grids has recently gained the attention of many computational scientists. While considerable progress has been made in building middleware and workflow tools that facilitate the sharing of compute resources, little attention has been paid to grid scheduling and load balancing techniques to reduce job waiting time. Based on a detailed analysis of usage characteristics of an existing grid that involves a large CPU cluster, we observe that grid scheduling decisions can be significantly improved if the characteristics of current usage patterns are understood and extrapolated into the future. We describe a formal framework that uses Kalman filter theory to predict future CPU resource utilisation. This ability to predict future resource utilisation forms the basis for significantly improved grid scheduling decisions. The paper describes the architecture for such a prediction and grid scheduling framework and its implementation using Condor. By way of replicated experiments we demonstrate that the prediction achieves a precision within 15-20% of the utilisation later observed and can significantly improve scheduling quality, compared to approaches that only take into account current load indicators.
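To illustrate the general idea of Kalman-filter-based utilisation prediction, the following is a minimal sketch of a scalar filter, not the framework described in the paper. The abstract does not specify the state model, so the random-walk model, the function name `kalman_predict_utilisation`, and the noise variances `process_var` and `measurement_var` are all assumptions chosen for illustration.

```python
def kalman_predict_utilisation(observations, process_var=1e-3, measurement_var=1e-2):
    """Return one-step-ahead predictions of CPU utilisation (fractions in [0, 1]).

    Hypothetical sketch: assumes a scalar random-walk state model, which is
    not taken from the paper. The last element of the result is the forecast
    for the next, not yet observed, interval.
    """
    if not observations:
        raise ValueError("at least one observation is required")

    estimate = observations[0]  # initial state: the first observed utilisation
    error_var = 1.0             # initial estimate uncertainty (assumed value)
    predictions = []

    for z in observations[1:]:
        # Predict step: under a random walk the state carries over unchanged,
        # while the uncertainty grows by the process noise.
        prior = estimate
        prior_var = error_var + process_var
        predictions.append(prior)

        # Update step: blend the prediction with the new measurement z.
        gain = prior_var / (prior_var + measurement_var)
        estimate = prior + gain * (z - prior)
        error_var = (1.0 - gain) * prior_var

    # Forecast for the next interval, based on all observations so far.
    predictions.append(estimate)
    return predictions


if __name__ == "__main__":
    # Made-up utilisation samples, purely for demonstration.
    samples = [0.20, 0.25, 0.40, 0.55, 0.60]
    print(kalman_predict_utilisation(samples))
```

A scheduler could compare such a forecast across clusters rather than the current load alone, which is the contrast the abstract draws with approaches based only on current load indicators.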
Original language | English |
---|---|
Title of host publication | Proceedings of the Parallel and Distributed Processing Symposium (IPDPS'07) |
Publisher | IEEE |
Pages | 1-10 |
Number of pages | 10 |
ISBN (Print) | 1-4244-0910-1 |
Publication status | Published - 2007 |