Detailed Information on Publication Record
2015
Planning for distributed workflows: constraint-based coscheduling of computational jobs and data placement in distributed environments
MAKATUN, Dzmitry, Jerome LAURET, Hana RUDOVÁ and Michal ŠUMBERA
Basic information
Original name
Planning for distributed workflows: constraint-based coscheduling of computational jobs and data placement in distributed environments
Authors
MAKATUN, Dzmitry (112 Belarus), Jerome LAURET (840 United States of America), Hana RUDOVÁ (203 Czech Republic, guarantor, belonging to the institution) and Michal ŠUMBERA (203 Czech Republic)
Edition
Prague, Czech Republic, Journal of Physics: Conference Series, vol. 608, p. 1-6, 6 pp. 2015
Publisher
Institute of Physics Publishing
Other information
Language
English
Type of outcome
Proceedings paper
Field of Study
10201 Computer sciences, information science, bioinformatics
Country of publisher
United Kingdom of Great Britain and Northern Ireland
Confidentiality degree
is not subject to a state or trade secret
Publication form
printed version "print"
References:
RIV identification code
RIV/00216224:14330/15:00081123
Organization unit
Faculty of Informatics
ISSN
UT WoS
000358218000028
Keywords in English
planning; constraint programming; distributed computational resources; STAR experiment
Tags
International impact, Reviewed
Changed: 27/8/2019 12:26, RNDr. Pavel Šmerk, Ph.D.
Abstract
In the original
When running data-intensive applications on distributed computational resources, long I/O overheads may be observed as remotely stored data is accessed. Latency and bandwidth can become the major limiting factors for overall computational performance and can reduce the CPU/WallTime ratio through excessive I/O wait. Building on our previous research, we propose a constraint-programming-based planner that schedules computational jobs and data placements (transfers) in a distributed environment in order to optimize resource utilization and reduce the overall processing completion time. The optimization is achieved by ensuring that none of the resources (network links, data storage and CPUs) is oversaturated at any moment and that either (a) the data is pre-placed at the site where the job runs or (b) the jobs are scheduled where the data is already present. Such an approach eliminates the idle CPU cycles that occur when a job waits for I/O from a remote site, and would have wide application in the community. Our planner was evaluated in simulations based on data extracted from log files of the batch and data management systems of the STAR experiment. The results of the evaluation and the estimated performance improvements are discussed in this paper.
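The co-scheduling idea in the abstract can be illustrated with a small sketch. This is not the authors' planner: it swaps constraint programming for a simple greedy heuristic, and every name in it (`Job`, `plan`, the site labels, the bandwidth table) is a hypothetical chosen for illustration. It assumes one input file per job, one transfer at a time per network link, and no caching of transferred files at the destination.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str          # hypothetical job label
    input_file: str    # the single file this job reads
    cpu_time: float    # seconds of CPU work once the data is local


def plan(jobs, file_size, file_site, cpu_slots, bandwidth):
    """Greedy sketch of co-scheduling jobs and data transfers.

    file_size: file -> size in MB
    file_site: file -> site that initially holds the file
    cpu_slots: site -> number of CPUs at that site
    bandwidth: (src, dst) -> link speed in MB/s
    Returns a list of (job name, site, start, end) tuples.
    """
    # Time at which each CPU and each link next becomes free.
    cpu_free = {site: [0.0] * n for site, n in cpu_slots.items()}
    link_free = {link: 0.0 for link in bandwidth}
    schedule = []

    for job in jobs:
        src = file_site[job.input_file]
        best = None
        for site, slots in cpu_free.items():
            if site == src:
                # Option (b): run where the data already is.
                link, data_ready = None, 0.0
            elif (src, site) in bandwidth:
                # Option (a): pre-place the data, queued behind earlier
                # transfers so the link is never oversubscribed.
                link = (src, site)
                data_ready = (link_free[link]
                              + file_size[job.input_file] / bandwidth[link])
            else:
                continue  # no route from the data to this site
            cpu = min(range(len(slots)), key=slots.__getitem__)
            # The job starts only when both the CPU and the data are ready,
            # so it never sits on a CPU waiting for remote I/O.
            start = max(slots[cpu], data_ready)
            end = start + job.cpu_time
            if best is None or end < best[0]:
                best = (end, start, site, cpu, link, data_ready)
        if best is None:
            raise ValueError(f"no feasible site for job {job.name}")
        end, start, site, cpu, link, data_ready = best
        cpu_free[site][cpu] = end
        if link is not None:
            link_free[link] = data_ready
        schedule.append((job.name, site, start, end))
    return schedule
```

For example, with two single-CPU sites `A` and `B`, a 100 MB file held at `A`, a 10 MB/s link from `A` to `B`, and two 20-second jobs reading that file, the sketch runs the first job at `A` and ships the file to `B` for the second, since waiting 10 seconds for the transfer finishes sooner than queuing behind the first job. A constraint-programming formulation, as described in the abstract, would instead search over all such assignments jointly rather than committing job by job.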
Links
GAP202/12/0306, research and development project