FACULTY OF INFORMATICS Masaryk University
PA039: Supercomputer Architecture and Intensive Computing
Message Passing Interface
Luděk Matýska
Spring 2024
Luděk Matýska • MPI • Spring 2024 1/46

Parallel programming
■ Data parallelism
  ■ Identical instructions on different processors process different data
  ■ In principle the SIMD model (Single Instruction Multiple Data)
  ■ For example loop parallelization
■ Task parallelism
  ■ MIMD model (Multiple Instruction Multiple Data)
  ■ Independent blocks (functions, procedures, programs) run in parallel
■ SPMD
  ■ No synchronization at the level of individual instructions
  ■ Equivalent to MIMD
■ Message passing targets SPMD/MIMD

Before MPI
■ Many competing message passing libraries
  ■ Vendor-specific/proprietary libraries
  ■ Academic, narrowly specific implementations
■ Different communication models
  ■ Difficult application development
  ■ Need for an "own" communication model to encapsulate the specific models
■ MPI: an attempt to define a standard set of communication calls

Message Passing Interface
■ Communication interface for parallel programs
■ Defined through an API
■ Standardized
■ Several independent implementations
  ■ Potential for optimization for specific hardware
  ■ Some problems with real interoperability

Programming model
MPI was designed originally for distributed memory architectures

Programming model
Currently supports hybrid models
[Diagram: several nodes, each with multiple CPUs sharing a local memory, connected by a network]

MPI Evolution
■ Versions
  ■ 1.0 (1994)
    ■ Basic, never implemented
    ■ Bindings for C and Fortran
  ■ 1.1 (1995)
    ■ Removal of major deficiencies in version 1.0
    ■ Implemented
  ■ 1.2 (1996)
    ■ Intermediate version (precedes MPI-2)
    ■ Extension of the MPI-1 standard

MPI-2.0 (1997)
■ Experimental implementation of the full MPI-2 standard
■ Extensions
  ■ Parallel I/O
  ■ One-sided operations (put, get)
  ■ Process manipulation
  ■ Bindings for C++ and Fortran 90
■ Stable for 10 years
  ■ Version 2.2 in 2009

MPI-3.0 (2012)
■ Motivated by weaknesses of previous versions and also to reflect hardware innovation (esp. multicore processors), see http://www.mpi-forum.org/
■ Major new features
  ■ Non-blocking collectives, neighbourhood collectives
  ■ Improved one-sided communication
  ■ New tools interface and bindings for Fortran 2008
■ Other new features
  ■ Matching Probe and Recv for thread-safe probe and receive
  ■ New functions
  ■ Removed previously deprecated functions from C++ bindings
■ Working groups
■ MPI 3.1 ratified in June 2015
  ■ Fully adopted in all major MPI implementations

MPI 4
■ The current version
■ Major additions:
  ■ "Big count" operations
  ■ Persistent collectives
  ■ Partitioned communication
  ■ Topology solutions
  ■ Simple fault handling to enable fault tolerance solutions
  ■ New tool interface for events
■ Open MPI implementation
  ■ Currently version 4.1 (approved in 2023)
  ■ Joint project of developers of several MPI streams

MPI Design Goals
■ Portability
  ■ Define standard APIs
  ■ Define bindings for different languages
  ■ Independent implementations
■ Performance
  ■ Independent hardware-specific optimization
  ■ Libraries, potential for changes in algorithms
    ■ e.g. new versions of collective operations
■ Functionality
  ■ Goal to cover all aspects of inter-processor communication

Design Goals II
■ Library for message passing
■ Designed for use on parallel computers, clusters and even Grids
■ Makes parallel hardware available for
  ■ Users
  ■ Library authors
  ■ Tools and applications developers

Core MPI
MPI_Init       MPI initialization
MPI_Comm_size  Provide number of processes
MPI_Comm_rank  Provide own (process) identity
MPI_Send       Send a message
MPI_Recv       Receive a message
MPI_Finalize   MPI finish

MPI Initialization
■ Creates an environment
■ Specifies that the program will use the MPI libraries
■ No explicit work with processes
■ Added since MPI-3.0

Identity
■ Any parallel (distributed) program needs to know
  ■ How many processes are participating in the computation
  ■ The identity of its "own" process
■ MPI_Comm_size(MPI_COMM_WORLD, &size)
  ■ Returns the number of processes that share the default MPI_COMM_WORLD communicator (see later)
■ MPI_Comm_rank(MPI_COMM_WORLD, &rank)
  ■ Returns the number (identity) of the calling process

Work with messages
■ Naive/primitive model
  ■ Process A sends a message: operation send
  ■ Process B receives a message: operation receive
■ Lots of questions
  ■ How to properly specify (define) the data?
  ■ How to specify (identify) process B (the receiver)?
  ■ How does the receiver recognise that the data are for it?
  ■ How is successful completion recognised?
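The core calls listed above (MPI_Init, MPI_Comm_size, MPI_Comm_rank, MPI_Finalize) combine into a minimal program. A sketch, assuming an MPI implementation such as Open MPI or MPICH is installed (compile with mpicc, launch with mpirun):

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int size, rank;

    MPI_Init(&argc, &argv);                /* create the MPI environment */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* number of participating processes */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* identity of this process */

    printf("Hello from process %d of %d\n", rank, size);

    MPI_Finalize();                        /* finish MPI */
    return 0;
}
```

Launched e.g. as `mpirun -np 4 ./hello`, the same SPMD binary runs in four processes, each printing its own rank between 0 and 3.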
Classical approach
■ We send data as a byte stream
  ■ It is left to the sender and receiver to properly set up and recognize the data
■ Each process has a unique identifier
  ■ We have to know the identity of the sender and the receiver
  ■ Broadcast operation
  ■ We can specify a tag for better recognition (e.g. the message sequence number)
■ Synchronization
  ■ Explicit collaboration between a sender and a receiver
  ■ It defines the order of messages

Classical approach II
■ send(buffer, len, destination, tag)
  ■ buffer contains the data, its length is len
  ■ The message is sent to the process whose identity is destination
  ■ The message carries the tag tag
■ recv(buffer, maxlen, source, tag, actlen)
  ■ The message will be accepted (read) into a memory space defined by buffer, whose length is maxlen
  ■ The actual size of the accepted message is actlen
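The classical send/recv pair maps directly onto MPI_Send and MPI_Recv; the actual length (actlen above) is recovered from the receive status with MPI_Get_count. A sketch assuming exactly two processes (run e.g. with mpirun -np 2):

```c
#include <stdio.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    char buffer[64];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                       /* process A: the sender */
        strcpy(buffer, "hello");
        /* send 6 chars (incl. '\0') to process 1 with tag 0 */
        MPI_Send(buffer, 6, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {                /* process B: the receiver */
        MPI_Status status;
        int actlen;
        /* accept at most 64 chars from process 0 with tag 0 */
        MPI_Recv(buffer, 64, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
        MPI_Get_count(&status, MPI_CHAR, &actlen);  /* actual size received */
        printf("received %d chars: %s\n", actlen, buffer);
    }

    MPI_Finalize();
    return 0;
}
```

Note that, unlike the raw byte-stream model, MPI describes the data with a count and a datatype (here MPI_CHAR), and scopes sender/receiver identities and tags by a communicator (here MPI_COMM_WORLD).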