D 2015

Acceleration of dRMSD Calculation and Efficient Usage of GPU Caches

FILIPOVIČ, Jiří, Jan PLHÁK and David STŘELÁK

Basic information

Original name

Acceleration of dRMSD Calculation and Efficient Usage of GPU Caches

Name in Czech

Akcelerace dRMSD výpočtu a efektivní užití GPU cache

Authors

FILIPOVIČ, Jiří (203 Czech Republic, guarantor, belonging to the institution), Jan PLHÁK (203 Czech Republic, belonging to the institution) and David STŘELÁK (203 Czech Republic, belonging to the institution)

Edition

Not specified, Proceedings of IEEE International Conference on High Performance Computing & Simulation, p. 47-54, 8 pp. 2015

Publisher

IEEE

Other information

Language

English

Type of outcome

Proceedings paper

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

Netherlands

Confidentiality degree

is not subject to a state or trade secret

Publication form

printed version "print"

RIV identification code

RIV/00216224:14330/15:00083460

Organization unit

Faculty of Informatics

ISBN

978-1-4673-7812-3

UT WoS

000375684100006

Keywords (in Czech)

RMSD; GPU; optimalizace kódu; cache

Keywords in English

RMSD; GPU; code optimization; cache

Tags

International impact, Reviewed
Changed: 13/7/2016 11:10, doc. RNDr. Jiří Filipovič, Ph.D.

Abstract

In the original

In this paper, we introduce a GPU acceleration of the dRMSD algorithm, which is used to compare different structures of a molecule. Compared to a multithreaded CPU implementation, we have reached a 13.4x speedup in clustering and a 62.7x speedup in 1:1 dRMSD computation using a mid-range GPU. The dRMSD computation exhibits strong memory locality and is therefore compute-bound. Alongside a conservative implementation using shared memory, we have implemented variants of the algorithm that use GPU caches to maintain memory locality. Our cache-based implementation reaches 96.5 % and 91.6 % of the shared-memory performance on Fermi and Maxwell, respectively. We have identified several performance pitfalls related to cache blocking in compute-bound codes and suggested optimization techniques to improve performance.
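
For orientation only: the record does not define dRMSD. It is commonly defined as the root-mean-square difference between corresponding intramolecular atom-pair distances of two conformations. The sketch below is a plain CPU reference under that assumed definition, not the paper's GPU implementation; the function name `drmsd` and the input layout (a list of `(x, y, z)` tuples per conformation) are hypothetical.

```python
import math


def drmsd(a, b):
    """Distance RMSD of two conformations given as lists of (x, y, z) atoms.

    Assumes the common definition: RMS of (d_ij(a) - d_ij(b)) over all
    atom pairs i < j. This is an illustrative reference, not the paper's code.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("conformations need the same number of atoms (at least 2)")
    n = len(a)
    total = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            d_a = math.dist(a[i], a[j])  # pair distance in conformation a
            d_b = math.dist(b[i], b[j])  # same pair in conformation b
            total += (d_a - d_b) ** 2
    pairs = n * (n - 1) // 2
    return math.sqrt(total / pairs)
```

Each atom's coordinates are reused for many pairs, which is consistent with the memory locality the abstract mentions. Because only internal distances are compared, the result does not change when either conformation is translated or rotated as a whole.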

Links

EE2.3.30.0037, research and development project
Name: Zaměstnáním nejlepších mladých vědců k rozvoji mezinárodní spolupráce (Employing the best young scientists to develop international cooperation)