Changeset 2906 for inundation/parallel/documentation/parallel.tex
Timestamp: May 18, 2006, 11:45:48 AM
Files: 1 edited
inundation/parallel/documentation/parallel.tex
--- r2849
+++ r2906

 The first step in parallelising the code is to subdivide the mesh
-into, roughly, equally sized partitions. On a rectangular domain this may be
+into, roughly, equally sized partitions. On a rectangular mesh this may be
 done by a simple co-ordinate based dissection, but on a complicated
-domain such as the Merimbula grid shown in Figure \ref{fig:mergrid}
+domain such as the Merimbula mesh shown in Figure \ref{fig:mergrid}
 a more sophisticated approach must be used. We use pymetis, a
 python wrapper around the Metis
…
 \begin{figure}[hbtp]
 \centerline{ \includegraphics[scale = 0.75]{figures/mermesh.eps}}
-\caption{The Merimbula grid.}
+\caption{The Merimbula mesh.}
 \label{fig:mergrid}
 \end{figure}
…
 setting up the communication pattern as well as assigning the local numbering scheme for the submeshes.

-Consider the example subpartitioning given in Figure \ref{fig:subdomain}. During the \code{evolve} calculations Triangle 3 in Submesh 0 will need to access its neighbour Triangle 4 stored in Submesh 1. The standard approach to this problem is to add an extra layer of triangles, which we call ghost triangles. The ghost triangles
+Consider the example subpartitioning given in Figure \ref{fig:subdomain}. During the \code{evolve} calculations Triangle 2 in Submesh 0 will need to access its neighbour Triangle 3 stored in Submesh 1. The standard approach to this problem is to add an extra layer of triangles, which we call ghost triangles. The ghost triangles
 are read-only: they should not be updated during the calculations, they are only there to hold any extra information that a processor may need to complete its calculations. The ghost triangle values are updated through communication calls. Figure \ref{fig:subdomaing} shows the submeshes with the extra layer of ghost triangles.
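The ghost-triangle layer described in the hunk above can be sketched in a few lines of plain Python. This is an illustrative reconstruction only, not ANUGA's \code{build_submesh}: the \code{neighbours} adjacency list and the \code{part} triangle-to-processor assignment are hypothetical inputs, the latter standing in for the output of the Metis partitioner.

```python
def ghost_layer(neighbours, part, p):
    """Return the triangles owned by other processors that share an
    edge with a triangle assigned to processor p (its ghost layer)."""
    ghosts = set()
    for tri, owner in enumerate(part):
        if owner != p:
            continue
        for nbr in neighbours[tri]:
            if part[nbr] != p:      # neighbour lives on another processor
                ghosts.add(nbr)
    return sorted(ghosts)

# A six-triangle strip split into two submeshes: 0-2 on processor 0, 3-5 on 1.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
part = [0, 0, 0, 1, 1, 1]
print(ghost_layer(neighbours, part, 0))   # [3]
print(ghost_layer(neighbours, part, 1))   # [2]
```

Each processor thus stores its own full triangles plus a read-only copy of the neighbouring triangles it needs during \code{evolve}.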
 
 \begin{figure}[hbtp]
 \centerline{ \includegraphics[scale = 0.6]{figures/subdomain.eps}}
-\caption{An example subpartioning .}
+\caption{An example subpartitioning of a mesh.}
 \label{fig:subdomain}
 \end{figure}
…
 \begin{figure}[hbtp]
 \centerline{ \includegraphics[scale = 0.6]{figures/subdomainghost.eps}}
-\caption{An example subpartioning with ghost triangles. }
+\caption{An example subpartitioning with ghost triangles. The numbers in brackets show the local numbering scheme that is calculated and stored with the mesh, but not implemented until the local mesh is built. See Section \ref{sec:part4}. }
 \label{fig:subdomaing}
 \end{figure}

-When partitioning the mesh we introduce new, dummy, boundary edges. For example, Triangle 3 in Submesh 1, from Figure \ref{fig:subdomaing}, originally shared an edge with Triangle 2, but after partitioning that edge becomes a boundary edge. These new boundary edges are are tagged as \code{ghost} and should, in general, be assigned a type of \code{None}. The following piece of code taken from {\tt run_parallel_advection.py} shows an example.
-
+When partitioning the mesh we introduce new, dummy, boundary edges. For example, Triangle 2 in Submesh 1 from Figure \ref{fig:subdomaing} originally shared an edge with Triangle 1, but after partitioning that edge becomes a boundary edge. These new boundary edges are tagged as \code{ghost} and should, in general, be assigned a type of \code{None}. The following piece of code taken from {\tt run_parallel_advection.py} shows an example.
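The tagging of the new, cut boundary edges can also be sketched. Again this is a simplified illustration, not the actual ANUGA partitioning code: here \code{neighbours[tri][edge]} is a hypothetical per-edge neighbour table (\code{None} for an original boundary edge), and the result is a boundary dictionary in the usual \code{(triangle, edge) -> tag} form, ready for \code{domain.set_boundary(\{'ghost': None\})}.

```python
def tag_ghost_boundary(neighbours, part, p, boundary):
    """Return a copy of submesh p's boundary dictionary where every edge
    cut by the partition is tagged 'ghost'."""
    new_boundary = dict(boundary)
    for tri, owner in enumerate(part):
        if owner != p:
            continue
        for edge, nbr in enumerate(neighbours[tri]):
            # An edge whose neighbour lives on another processor becomes
            # a (dummy) boundary edge of this submesh.
            if nbr is not None and part[nbr] != p:
                new_boundary[(tri, edge)] = 'ghost'
    return new_boundary

# Four triangles in a strip; the partition cuts the edge between 1 and 2.
neighbours = [[None, 1, None], [0, 2, None], [1, 3, None], [2, None, None]]
part = [0, 0, 1, 1]
print(tag_ghost_boundary(neighbours, part, 0, {}))   # {(1, 1): 'ghost'}
print(tag_ghost_boundary(neighbours, part, 1, {}))   # {(2, 0): 'ghost'}
```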
 {\small \begin{verbatim}
 T = Transmissive_boundary(domain)
…
 \end{verbatim}}

-
-Looking at Figure \ref{fig:subdomaing} we see that after each \code{evolve} step Processor 0 will have to send the updated values for Triangle 3 and Triangle 5 to Processor 1, and similarly Processor 1 will have to send the updated values for triangles 4, 7 and 6 (recall that Submesh $p$ will be assigned to Processor $p$). The \code{build_submesh} function builds a dictionary that defines the communication pattern.
-
-Finally, the ANUGA code assumes that the triangles (and nodes etc.) are numbered consecutively starting from 1 (FIXME (Ole): Isn't it 0?). Consequently, if Submesh 1 in Figure \ref{fig:subdomaing} was passed into the \code{evolve} calculations it would crash due to the 'missing' triangles. The \code{build_submesh} function determines a local numbering scheme for each submesh, but it does not actually update the numbering, that is left to the function \code{build_local}.
+Looking at Figure \ref{fig:subdomaing} we see that after each \code{evolve} step Processor 0 will have to send the updated values for Triangle 2 and Triangle 4 to Processor 1, and similarly Processor 1 will have to send the updated values for Triangle 3 and Triangle 5 (recall that Submesh $p$ will be assigned to Processor $p$). The \code{build_submesh} function builds a dictionary that defines the communication pattern.
+
+Finally, the ANUGA code assumes that the triangles (and nodes etc.) are numbered consecutively starting from 0. Consequently, if Submesh 1 in Figure \ref{fig:subdomaing} were passed into the \code{evolve} calculations it would crash. The \code{build_submesh} function determines a local numbering scheme for each submesh, but it does not actually update the numbering; that is left to \code{build_local}.
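The communication-pattern dictionary that \code{build_submesh} is said to build can be illustrated as follows. This is a toy reconstruction under the same hypothetical \code{neighbours}/\code{part} inputs as before, not ANUGA's actual data structures: it records, for each (sender, receiver) pair, which full triangles must be transmitted after every \code{evolve} step.

```python
def comm_pattern(neighbours, part):
    """Map (sender, receiver) -> sorted list of triangles the sender must
    transmit (its full triangles that are ghosts on the receiver)."""
    sends = {}
    for tri, owner in enumerate(part):
        for nbr in neighbours[tri]:
            other = part[nbr]
            if other != owner:
                # 'tri' appears in processor 'other's ghost layer,
                # so its owner must send the updated value there.
                sends.setdefault((owner, other), set()).add(tri)
    return {pair: sorted(tris) for pair, tris in sends.items()}

# Same six-triangle strip as before: 0-2 on processor 0, 3-5 on processor 1.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
part = [0, 0, 0, 1, 1, 1]
print(comm_pattern(neighbours, part))   # {(0, 1): [2], (1, 0): [3]}
```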
 
 
 \subsection {Sending the Submeshes}\label{sec:part3}
…
 All of the functions described so far must be run in serial on Processor 0. The next step is to start the parallel computation by spreading the submeshes over the processors. The communication is carried out by
 \code{send_submesh} and \code{rec_submesh} defined in {\tt build_commun.py}.
-The \code{send_submesh} function should be called on Processor 0 and sends the Submesh $p$ to Processor $p$, while \code{rec_submesh} should be called by Processor $p$ to receive Submesh $p$ from Processor 0. Note that the order of communication is very important, if any changes are made to the \code{send_submesh} function the corresponding change must be made to the \code{rec_submesh} function.
+The \code{send_submesh} function should be called on Processor 0 and sends Submesh $p$ to Processor $p$, while \code{rec_submesh} should be called by Processor $p$ to receive Submesh $p$ from Processor 0.
+
+As an aside, the order of communication is very important. If someone were to modify the \code{send_submesh} routine the corresponding change must be made to the \code{rec_submesh} routine.

 While it is possible to get Processor 0 to communicate its submesh to itself, it is an expensive and unnecessary communication call. The {\tt build_commun.py} file also includes a function called \code{extract_hostmesh} that should be called on Processor 0 to extract Submesh 0.
 
 
-\subsection {Building the Local Mesh}
+\subsection {Building the Local Mesh}\label{sec:part4}
 After using \code{send_submesh} and \code{rec_submesh}, Processor $p$ should have its own local copy of Submesh $p$; however, as stated previously, the triangle numbering will be incorrect on all processors except number $0$.
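The remark that \code{send_submesh} and \code{rec_submesh} must stay in lockstep can be made concrete with a toy stand-in for the message channel. The FIFO below is only a sketch (the real routines use point-to-point message passing, and the field list here is illustrative): because the fields are pushed in a fixed, agreed-upon order, the receiver recovers them only if it pops in exactly the same order.

```python
from collections import deque

# Illustrative field order; sender and receiver must agree on it exactly.
FIELDS = ['nodes', 'triangles', 'boundary', 'quantities']

def toy_send_submesh(channel, submesh):
    for name in FIELDS:
        channel.append(submesh[name])     # sent in the agreed order

def toy_rec_submesh(channel):
    # Popping in the same order reassembles the submesh; any mismatch
    # between the two routines would silently scramble the fields.
    return {name: channel.popleft() for name in FIELDS}

channel = deque()
submesh = {'nodes': [0, 1, 2], 'triangles': [(0, 1, 2)],
           'boundary': {}, 'quantities': {'stage': [4.0]}}
toy_send_submesh(channel, submesh)
print(toy_rec_submesh(channel) == submesh)   # True
```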
 The \code{build_local_mesh} function from {\tt build_local.py} primarily focuses on renumbering the information stored with the submesh, including the nodes, vertices and quantities. Figure \ref{fig:subdomainf} shows what the mesh in each processor may look like.
…
 \begin{verbatim}
 #######################
-# Partition the domain
+# Partition the mesh
 #######################
…
 \end{verbatim}

-This rectangular mesh is artificial, and the approach to subpartitioning the mesh is different to the one described above, however this example may be of interest to those who want to measure the parallel efficiency of the code on their machine. A rectangular mesh should give a good load balance and is therefore an important first test problem.
+Most simulations will not be done on a rectangular mesh, and the approach to subpartitioning the mesh is different to the one described above; however, this example may be of interest to those who want to measure the parallel efficiency of the code on their machine. A rectangular mesh should give a good load balance and is therefore an important first test problem.
 
 
-A more \lq real life\rq\ mesh is the Merimbula mesh used in the code shown in Section \ref{sec:codeRPMM}. This example also solves the advection equation. In this case the techniques described in Section \ref{sec:part} must be used to partition the mesh. Figure \ref{fig:code} shows the part of the code that is responsible for spreading the domain over the processors. We now look at the code in detail.
+A more \lq real life\rq\ mesh is the Merimbula mesh used in the code shown in Section \ref{sec:codeRPMM}. This example also solves the advection equation. In this case the techniques described in Section \ref{sec:part} must be used to partition the mesh. Figure \ref{fig:code} shows the part of the code that is responsible for spreading the mesh over the processors.
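The renumbering that \code{build_local_mesh} performs can be sketched in isolation. This is an illustrative reconstruction, not the ANUGA routine: given a hypothetical list of the submesh's global triangle ids (full triangles first, then ghosts) and their global neighbour references, it assigns consecutive local numbers starting from 0 and remaps the references.

```python
def renumber(global_ids, neighbours):
    """Assign consecutive local numbers (from 0) to a submesh's triangles
    and remap the neighbour references into the local scheme."""
    local = {g: i for i, g in enumerate(global_ids)}
    remapped = {local[g]: [local[n] for n in nbrs if n in local]
                for g, nbrs in neighbours.items() if g in local}
    return local, remapped

# Submesh 1 owns global triangles 3, 4, 5 and holds ghost triangle 2.
global_ids = [3, 4, 5, 2]                  # full triangles first, then ghosts
neighbours = {3: [2, 4], 4: [3, 5], 5: [4], 2: [3]}
local, remapped = renumber(global_ids, neighbours)
print(local)      # {3: 0, 4: 1, 5: 2, 2: 3}
print(remapped)   # {0: [3, 1], 1: [0, 2], 2: [1], 3: [0]}
```

After this step the submesh is numbered consecutively from 0, as the Domain data structure requires.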
+We now look at the code in detail.
 
 \begin{figure}[htbp]
…
 filename = 'merimbula_10785.tsh'

-domain_full = pmesh_to_domain_instance(filename, Advection_Domain)
-domain_full.set_quantity('stage', Set_Stage(756000.0,756500.0,4.0))
-
-# Define the domain boundaries for visualisation
-
-rect = array(domain_full.xy_extent, Float)
+mesh_full = pmesh_to_domain_instance(filename, Advection_Domain)
+mesh_full.set_quantity('stage', Set_Stage(756000.0,756500.0,4.0))

 # Subdivide the mesh

 nodes, triangles, boundary, triangles_per_proc, quantities =\
-   pmesh_divide_metis(domain_full, numprocs)
+   pmesh_divide_metis(mesh_full, numprocs)

 # Build the mesh that should be assigned to each processor.
…
 # Build the local mesh for processor 0

-hostmesh = extract_hostmesh(submesh)
-points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict = \
-   build_local_mesh(hostmesh, 0, triangles_per_proc[0], numprocs)
+points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict =\
+   extract_hostmesh(submesh, triangles_per_proc)

 else:
…
 \begin{itemize}
 \item
-These first few lines of code read in and define the (global) mesh.
+These first few lines of code read in and define the (global) mesh. The \code{Set_Stage} function sets the initial conditions; see the code in Section \ref{sec:codeRPMM} for the definition of \code{Set_Stage}.
 \begin{verbatim}
 filename = 'merimbula_10785.tsh'
-domain_full = pmesh_to_domain_instance(filename, Advection_Domain)
-domain_full.set_quantity('stage', Set_Stage(756000.0,756500.0,4.0))
-\end{verbatim}
-
-\item
-The \code{rect} array is used by the visualiser and records the domain size.
+mesh_full = pmesh_to_domain_instance(filename, Advection_Domain)
+mesh_full.set_quantity('stage', Set_Stage(756000.0,756500.0,4.0))
+\end{verbatim}
+
 \item \code{pmesh_divide_metis} divides the mesh into a set of non-overlapping subdomains as described in Section \ref{sec:part1}.
 \begin{verbatim}
 nodes, triangles, boundary, triangles_per_proc, quantities =\
-    pmesh_divide_metis(domain_full, numprocs)
+    pmesh_divide_metis(mesh_full, numprocs)
 \end{verbatim}

-\item The next step is to build a boundary layer of ghost triangles and define the communication pattern. This step is implemented by \code{build_submesh} as discussed in Section \ref{sec:part2}.
+\item The next step is to build a boundary layer of ghost triangles and define the communication pattern. This step is implemented by \code{build_submesh} as discussed in Section \ref{sec:part2}. The \code{submesh} variable contains a copy of the submesh for each processor.
 \begin{verbatim}
 submesh = build_submesh(nodes, triangles, boundary, quantities, \
…
 \end{verbatim}

-The processors receive a given subpartition by calling \code{rec_submesh}. The \code{rec_submesh} routine also calls \code{build_local_mesh}. The \code{build_local_mesh} routine described in Section \ref{sec:part4} ensures that the information is stored in a way that is compatible with the Domain datastructure. This means, for example, that the triangles and nodes must be numbered consecutively starting from 1 (FIXME (Ole): or is it 0?).
+The processors receive a given subpartition by calling \code{rec_submesh}. The \code{rec_submesh} routine also calls \code{build_local_mesh}.
+The \code{build_local_mesh} routine described in Section \ref{sec:part4} ensures that the information is stored in a way that is compatible with the Domain datastructure. This means, for example, that the triangles and nodes must be numbered consecutively starting from 0.
+
 \begin{verbatim}
 points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict =\
     rec_submesh(0)
 \end{verbatim}

-Note that the submesh is not received by, or sent to, Processor 0. Rather \code{hostmesh = extract_hostmesh(submesh)} extracts the appropriate information. This saves the cost of an unnecessary communication call. It is described further in Section \ref{sec:part3}.
-\begin{verbatim}
-hostmesh = extract_hostmesh(submesh)
-points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict = \
-    build_local_mesh(hostmesh, 0, triangles_per_proc[0], numprocs)
-\end{verbatim}
+Note that the submesh is not received by, or sent to, Processor 0. Rather, \code{extract_hostmesh} simply extracts the mesh that has been assigned to Processor 0; this saves the cost of an unnecessary communication call. Recall that \code{submesh} contains the list of submeshes to be assigned to each processor. This is described further in Section \ref{sec:part3}. The \code{extract_hostmesh} routine also renumbers the nodes.
+\begin{verbatim}
+points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict =\
+    extract_hostmesh(submesh, triangles_per_proc)
+\end{verbatim}
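The remaining piece of the story is the ghost update after each \code{evolve} step. The sketch below is a toy, single-process simulation of that exchange, not ANUGA's communication code: \code{full_send} plays the role of the dictionary built by \code{build_submesh} (here keyed on hypothetical (sender, receiver) pairs), and each copy stands in for one message-passing call that refreshes a stale, read-only ghost value.

```python
def update_ghosts(values, full_send):
    """Copy each processor's authoritative (full) triangle values into
    the read-only ghost copies held by the other processors.
    values[p] maps global triangle id -> quantity value on processor p."""
    for (src, dst), triangles in full_send.items():
        for tri in triangles:
            values[dst][tri] = values[src][tri]   # one 'communication call'

# Processor 0 owns triangles 0-2, processor 1 owns 3-5; each holds one ghost.
values = {0: {0: 1.0, 1: 1.0, 2: 5.0, 3: 0.0},    # triangle 3 is a stale ghost
          1: {3: 2.0, 4: 2.0, 5: 2.0, 2: 0.0}}    # triangle 2 is a stale ghost
full_send = {(0, 1): [2], (1, 0): [3]}
update_ghosts(values, full_send)
print(values[1][2], values[0][3])   # 5.0 2.0
```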