Planning for distributed workflows: constraint-based
coscheduling of computational jobs and data placement in
distributed environments

D 2015

MAKATUN, Dzmitry; Jerome LAURET; Hana RUDOVÁ and Michal ŠUMBERA

Basic information

Original title

Planning for distributed workflows: constraint-based coscheduling of computational jobs and data placement in distributed environments

Authors

MAKATUN, Dzmitry; Jerome LAURET; Hana RUDOVÁ and Michal ŠUMBERA

Edition

Prague, Czech Republic, Journal of Physics: Conference Series, vol. 608, pp. 1-6, 6 pp. 2015

Publisher

Institute of Physics Publishing

Other information

Language

English

Type of outcome

Proceedings paper

Field of study

10201 Computer sciences, information science, bioinformatics

Country of publisher

United Kingdom of Great Britain and Northern Ireland

Confidentiality

is not subject to a state or trade secret

Publication form

printed version "print"

Links

URL

Marked for transfer to RIV

Yes

RIV identification code

RIV/00216224:14330/15:00081123

Organization unit

Faculty of Informatics

ISSN

1742-6588

DOI

https://doi.org/10.1088/1742-6596/608/1/012028A

UT WoS

000358218000028

EID Scopus

2-s2.0-84937847761

Keywords in English

planning; constraint programming; distributed computational resources; STAR experiment

Flags

International impact, Reviewed

Changed: 27. 8. 2019 12:26, RNDr. Pavel Šmerk, Ph.D.

Abstract

In the original

When data-intensive applications run on distributed computational resources, long I/O overheads may be observed as remotely stored data is accessed. Latency and bandwidth can become the major limiting factors for overall computation performance and can reduce the CPU/WallTime ratio through excessive I/O wait. Building on our previous research, we propose a constraint-programming-based planner that schedules computational jobs and data placements (transfers) in a distributed environment in order to optimize resource utilization and reduce the overall processing completion time. The optimization is achieved by ensuring that none of the resources (network links, data storage and CPUs) is oversaturated at any moment and that either (a) the data is pre-placed at the site where the job runs or (b) the jobs are scheduled where the data is already present. Such an approach eliminates the idle CPU cycles that occur when a job waits for I/O from a remote site and would have wide application in the community. Our planner was evaluated and simulated using data extracted from log files of the batch and data management systems of the STAR experiment. The results of the evaluation and the estimated performance improvements are discussed in this paper.

Related projects

GAP202/12/0306, research and development project

Name: Dyschnet - Dynamic planning and scheduling of computational and network resources (Acronym: Dyschnet)

Investor: Czech Science Foundation, Dyschnet - Dynamic planning and scheduling of computational and network resources

Cite

MAKATUN, Dzmitry; Jerome LAURET; Hana RUDOVÁ and Michal ŠUMBERA. Planning for distributed workflows: constraint-based coscheduling of computational jobs and data placement in distributed environments. In Journal of Physics: Conference Series, vol. 608. Prague, Czech Republic: Institute of Physics Publishing, 2015, pp. 1-6. ISSN 1742-6588. Available from: https://doi.org/10.1088/1742-6596/608/1/012028A.

@inproceedings{1313981,
   author = {Makatun, Dzmitry and Lauret, Jerome and Rudová, Hana and Šumbera, Michal},
   address = {Prague, Czech Republic},
   booktitle = {Journal of Physics: Conference Series, vol. 608},
   doi = {https://doi.org/10.1088/1742-6596/608/1/012028A},
   keywords = {planning; constraint programming; distributed computational resources; STAR experiment},
   howpublished = {printed version "print"},
   language = {eng},
   location = {Prague, Czech Republic},
   pages = {1-6},
   publisher = {Institute of Physics Publishing},
   title = {Planning for distributed workflows: constraint-based coscheduling of computational jobs and data placement in distributed environments},
   url = {http://dx.doi.org/10.1088/1742-6596/608/1/012028A},
   year = {2015}
}

TY  - CONF
ID  - 1313981
AU  - Makatun, Dzmitry
AU  - Lauret, Jerome
AU  - Rudová, Hana
AU  - Šumbera, Michal
PY  - 2015
TI  - Planning for distributed workflows: constraint-based coscheduling of computational jobs and data placement in distributed environments
PB  - Institute of Physics Publishing
CY  - Prague, Czech Republic
KW  - planning
KW  - constraint programming
KW  - distributed computational resources
KW  - STAR experiment
UR  - http://dx.doi.org/10.1088/1742-6596/608/1/012028A
L2  - http://dx.doi.org/10.1088/1742-6596/608/1/012028A
N2  - When running data intensive applications on distributed computational resources long I/O overheads may be observed as access to remotely stored data is performed. Latencies and bandwidth can become the major limiting factor for the overall computation performance and can reduce the CPU/WallTime ratio to excessive IO wait. Reusing the knowledge of our previous research, we propose a constraint programming based planner that schedules computational jobs and data placements (transfers) in a distributed environment in order to optimize resource utilization and reduce the overall processing completion time. The optimization is achieved by ensuring that none of the resources (network links, data storages and CPUs) are oversaturated at any moment of time and either (a) that the data is pre-placed at the site where the job runs or (b) that the jobs are scheduled where the data is already present. Such an approach eliminates the idle CPU cycles occurring when the job is waiting for the I/O from a remote site and would have wide application in the community. Our planner was evaluated and simulated based on data extracted from log files of batch and data management systems of the STAR experiment. The results of evaluation and estimation of performance improvements are discussed in this paper.
ER  -

MAKATUN, Dzmitry; Jerome LAURET; Hana RUDOVÁ and Michal ŠUMBERA. Planning for distributed workflows: constraint-based coscheduling of computational jobs and data placement in distributed environments. In \textit{Journal of Physics: Conference Series, vol. 608}. Prague, Czech Republic: Institute of Physics Publishing, 2015, pp.~1-6. ISSN~1742-6588. Available from: https://doi.org/10.1088/1742-6596/608/1/012028A.