Every structure should divide the information that they store across processors, in order to do this two main approach are considered inside OpenFPM.
- Domain decomposition
- Generate graph model
- Find optimal decomposition
- Post process the decomposition
- Domain decomposition
Data driven decomposition
- Divide the total information in equal chunks
The first approach try to create a model for computation and comunication and try to find a decomposition that minimize it.
The second just divide the domain as the user specified and assign 1 sub-domain to each processor
In a distributed-memory setting, where data are scattered across processors, two factors are important: equal division of work across processors and reduction of the communication overhead. A typical approach, is to formulate the problem as a graph-partitioning problem: the physical domain is divided into sub-sub-domains (Domain decomposition) (vertices of the graph), each of them carrying a weight modelling the computational cost. The communication pattern between sub-domains is represented as links between the sub-sub-domains (edges of the graph) with weights formalizing the communication cost (Generate graph model). The requirement of balanced computation with minimal communication then translates to the optimization problem of finding a graph partitioning where each group contains the same sum of weights, and the sum of the cut edges is minimal (Find optimal decomposition). As final step the decomposition can be post-process further more to be for optimal, based on factors not considered by the optimization process, like merging vertex, or merging sub-sub-domain, to create bigger sub-domains. Even if a model decomposition it is not bind to a graph model it is true that until now it is the main approach
The decomposition is chosen by the user and kept fixed.
Data Driven decomposition
Data is divided across processors, It is the default decomposition when a decomposition is not available yet, it basically divide the data consistently across processors regardless of geometrical meaning or communication
Every sub-domain has an extended part called ghost
With the dashed red box we indicate the local domain while with the othe color boxes we indicate the external Ghost boxes that cover this extended part, these boxes are indicated by G5_0 G8_0 G9_0 and G9_1. These boxes can be calculated as the intersection of the extended domain with the adjacent sub-domains. Note also that each external ghost box of a sub-domain is an internal ghost box on another processor sub-domain
The following image show the internal ghost boxes of the same local processor indicated with B5_0 B8_0 B9_0 and B9_1. The production of these ghost boxes can be produced as the intersection of the local sub-domain with the expanded adjacent processor sub-domains.
Each distributed structure in general define a function to synchronize the ghost part