Umpalumpa: a framework for efficient execution of complex image
processing workloads on heterogeneous nodes

J 2023

STŘELÁK, David, David MYŠKA, Filip PETROVIČ, Jan POLÁK, Jaroslav OĽHA et al.

Basic information

Original name

Umpalumpa: a framework for efficient execution of complex image processing workloads on heterogeneous nodes

Authors

STŘELÁK, David (203 Czech Republic, belonging to the institution), David MYŠKA (203 Czech Republic, belonging to the institution), Filip PETROVIČ (703 Slovakia, belonging to the institution), Jan POLÁK (203 Czech Republic, belonging to the institution), Jaroslav OĽHA (703 Slovakia, belonging to the institution) and Jiří FILIPOVIČ (203 Czech Republic, belonging to the institution)

Edition

Computing, Springer, 2023, 0010-485X

Other information

Language

English

Type of outcome

Article in a professional journal

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Austria

Confidentiality degree

is not subject to a state or trade secret

Impact factor

3.700 in 2022

RIV identification code

RIV/00216224:14610/23:00131054

Organization unit

Institute of Computer Science

DOI

http://dx.doi.org/10.1007/s00607-023-01190-w

UT WoS

001010699200001

Keywords in English

Image processing; task-based systems; auto-tuning; data-aware architecture; CUDA

Abstract

In the original

Modern computers are typically heterogeneous devices: besides the standard central processing unit (CPU), they commonly include an accelerator such as a graphics processing unit (GPU). However, exploiting the full potential of such computers is challenging, especially when complex workloads consisting of multiple computationally demanding tasks are to be processed. This paper proposes a framework called Umpalumpa, which aims to manage complex workloads on heterogeneous computers. Umpalumpa combines three aspects that ease programming and optimize code performance. Firstly, it implements a data-centric design, where data are described by their physical properties (e.g., location in memory, size) and logical properties (e.g., dimensionality, shape, padding). Secondly, Umpalumpa utilizes task-based parallelism to schedule tasks on heterogeneous nodes. Thirdly, tasks can be dynamically autotuned on a source code level according to the hardware where the task is executed and the processed data. Altogether, Umpalumpa allows for implementing a complex workload, which is automatically executed on CPUs and accelerators, and allows autotuning to maximize the performance with the given hardware and data input. Umpalumpa focuses on image processing workloads, but the concept is generic and can be extended to different types of workloads. We demonstrate the usability of the proposed framework on two previously accelerated applications from cryogenic electron microscopy: 3D Fourier reconstruction and movie alignment. We show that, compared to the original implementations, Umpalumpa reduces the complexity and improves the maintainability of the applications’ main loops while improving performance through automatic memory management and autotuning of the GPU kernels.
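
The data-centric and autotuning ideas in the abstract can be illustrated with a small sketch. This is not Umpalumpa's API: every name below (`PhysicalDescriptor`, `LogicalDescriptor`, `Payload`, `TuningCache`, `toy_cost`) is hypothetical, and the cost function is a toy model standing in for a measured kernel runtime. The sketch only shows how data described by physical and logical properties could serve as the key for selecting a tuned configuration per device and data shape.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class PhysicalDescriptor:
    """Where the bytes live and how many there are (hypothetical names)."""
    location: str      # e.g. "cpu" or "gpu"
    size_bytes: int


@dataclass(frozen=True)
class LogicalDescriptor:
    """How the bytes are interpreted (hypothetical names)."""
    shape: Tuple[int, ...]   # e.g. (height, width)
    padded: bool = False

    @property
    def dimensionality(self) -> int:
        return len(self.shape)


@dataclass(frozen=True)
class Payload:
    """Data described by its physical and logical properties."""
    physical: PhysicalDescriptor
    logical: LogicalDescriptor


def toy_cost(payload: Payload, block: int) -> float:
    """Toy stand-in for a measured runtime (an assumption, not a benchmark).

    Penalizes idle threads in the last block along the innermost
    dimension and slightly prefers larger blocks.
    """
    width = payload.logical.shape[-1]
    wasted_threads = (-width) % block
    return wasted_threads + 1.0 / block


class TuningCache:
    """Remembers the best block size per (device, data shape) pair."""

    def __init__(self) -> None:
        self._best: Dict[Tuple[str, Tuple[int, ...]], int] = {}

    def best_block_size(
        self,
        payload: Payload,
        candidates: Iterable[int],
        cost: Callable[[Payload, int], float],
    ) -> int:
        key = (payload.physical.location, payload.logical.shape)
        if key not in self._best:
            # Exhaustive search over candidates; evaluated once per key.
            self._best[key] = min(candidates, key=lambda c: cost(payload, c))
        return self._best[key]


if __name__ == "__main__":
    cache = TuningCache()
    image = Payload(PhysicalDescriptor("gpu", 96 * 96 * 4),
                    LogicalDescriptor((96, 96)))
    print(cache.best_block_size(image, (32, 64, 128), toy_cost))
```

In this sketch the tuning decision depends on both the device and the data shape, which mirrors the abstract's statement that tasks are tuned according to the hardware and the processed data. A real autotuner would time actual kernel variants rather than evaluate a closed-form cost.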

Links

LM2018140, research and development project

Name: e-Infrastruktura CZ (Acronym: e-INFRA CZ)

Investor: Ministry of Education, Youth and Sports of the CR

Cite

STŘELÁK, David, David MYŠKA, Filip PETROVIČ, Jan POLÁK, Jaroslav OĽHA and Jiří FILIPOVIČ. Umpalumpa: a framework for efficient execution of complex image processing workloads on heterogeneous nodes. Computing. Springer, 2023, vol. 105, No 11, p. 2389-2417. ISSN 0010-485X. Available from: https://dx.doi.org/10.1007/s00607-023-01190-w.

@article{2293221,
   author = {Střelák, David and Myška, David and Petrovič, Filip and Polák, Jan and Oľha, Jaroslav and Filipovič, Jiří},
   number = {11},
   pages = {2389--2417},
   doi = {10.1007/s00607-023-01190-w},
   keywords = {Image processing; task-based systems; auto-tuning; data-aware architecture; CUDA},
   language = {eng},
   issn = {0010-485X},
   journal = {Computing},
   title = {Umpalumpa: a framework for efficient execution of complex image processing workloads on heterogeneous nodes},
   url = {https://doi.org/10.1007/s00607-023-01190-w},
   volume = {105},
   year = {2023}
}

TY  - JOUR
ID  - 2293221
AU  - Střelák, David
AU  - Myška, David
AU  - Petrovič, Filip
AU  - Polák, Jan
AU  - Oľha, Jaroslav
AU  - Filipovič, Jiří
PY  - 2023
TI  - Umpalumpa: a framework for efficient execution of complex image processing workloads on heterogeneous nodes
JF  - Computing
VL  - 105
IS  - 11
SP  - 2389
EP  - 2417
PB  - Springer
SN  - 0010485X
KW  - Image processing
KW  - task-based systems
KW  - auto-tuning
KW  - data-aware architecture
KW  - CUDA
UR  - https://doi.org/10.1007/s00607-023-01190-w
N2  - Modern computers are typically heterogeneous devices: besides the standard central processing unit (CPU), they commonly include an accelerator such as a graphics processing unit (GPU). However, exploiting the full potential of such computers is challenging, especially when complex workloads consisting of multiple computationally demanding tasks are to be processed. This paper proposes a framework called Umpalumpa, which aims to manage complex workloads on heterogeneous computers. Umpalumpa combines three aspects that ease programming and optimize code performance. Firstly, it implements a data-centric design, where data are described by their physical properties (e.g., location in memory, size) and logical properties (e.g., dimensionality, shape, padding). Secondly, Umpalumpa utilizes task-based parallelism to schedule tasks on heterogeneous nodes. Thirdly, tasks can be dynamically autotuned on a source code level according to the hardware where the task is executed and the processed data. Altogether, Umpalumpa allows for implementing a complex workload, which is automatically executed on CPUs and accelerators, and allows autotuning to maximize the performance with the given hardware and data input. Umpalumpa focuses on image processing workloads, but the concept is generic and can be extended to different types of workloads. We demonstrate the usability of the proposed framework on two previously accelerated applications from cryogenic electron microscopy: 3D Fourier reconstruction and movie alignment. We show that, compared to the original implementations, Umpalumpa reduces the complexity and improves the maintainability of the applications’ main loops while improving performance through automatic memory management and autotuning of the GPU kernels.
ER  -

STŘELÁK, David, David MYŠKA, Filip PETROVIČ, Jan POLÁK, Jaroslav OĽHA and Jiří FILIPOVIČ. Umpalumpa: a framework for efficient execution of complex image processing workloads on heterogeneous nodes. \textit{Computing}. Springer, 2023, vol.~105, No~11, p.~2389-2417. ISSN~0010-485X. Available from: https://dx.doi.org/10.1007/s00607-023-01190-w.
