Timestamp: May 18, 2006, 11:45:48 AM (18 years ago)
Author: linda
Message: Made correction to the parallel report
File: 1 edited

Legend: unmodified lines are shown without a prefix; removed lines are prefixed with -; added lines are prefixed with +.
  • inundation/parallel/documentation/parallel.tex

--- r2849
+++ r2906
@@ -27 +27 @@
 
 The first step in parallelising the code is to subdivide the mesh
-into, roughly, equally sized partitions. On a rectangular domain this may be
+into roughly equally sized partitions. On a rectangular mesh this may be
 done by a simple co-ordinate based dissection, but on a complicated
-domain such as the Merimbula grid shown in Figure \ref{fig:mergrid}
+domain such as the Merimbula mesh shown in Figure \ref{fig:mergrid}
 a more sophisticated approach must be used. We use pymetis, a
 python wrapper around the Metis
     
@@ -40 +40 @@
 \begin{figure}[hbtp]
   \centerline{ \includegraphics[scale = 0.75]{figures/mermesh.eps}}
-  \caption{The Merimbula grid.}
+  \caption{The Merimbula mesh.}
  \label{fig:mergrid}
 \end{figure}
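To make the simple co-ordinate based dissection concrete, the following self-contained sketch sorts the triangles by the $x$-coordinate of their centroids and cuts the sorted list into \code{numprocs} roughly equal chunks. This is purely illustrative; \code{coordinate_dissection} is a hypothetical helper, not the partitioning code ANUGA uses.
\begin{verbatim}
# Illustrative sketch only (not ANUGA code): a coordinate-based
# dissection that splits a triangle mesh into numprocs partitions.

def coordinate_dissection(nodes, triangles, numprocs):
    # nodes: list of (x, y) pairs; triangles: triples of node ids
    def centroid_x(tri):
        return sum(nodes[v][0] for v in tri) / 3.0

    order = sorted(range(len(triangles)),
                   key=lambda t: centroid_x(triangles[t]))
    partition = [0] * len(triangles)
    for rank, t in enumerate(order):
        # rank*numprocs//len(triangles) cuts the sorted list into
        # numprocs contiguous, roughly equal chunks
        partition[t] = rank * numprocs // len(triangles)
    return partition

# Four triangles strung out along the x-axis, two processors:
nodes = [(0,0), (1,0), (1,1), (2,0), (2,1), (3,0)]
triangles = [(0,1,2), (1,3,2), (2,3,4), (3,5,4)]
print(coordinate_dissection(nodes, triangles, 2))   # [0, 0, 1, 1]
\end{verbatim}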
     
@@ -87 +87 @@
 setting up the communication pattern as well as assigning the local numbering scheme for the submeshes.
 
-Consider the example subpartitioning given in Figure \ref{fig:subdomain}. During the \code{evolve} calculations Triangle 3 in Submesh 0 will need to access its neighbour Triangle 4 stored in Submesh 1. The standard approach to this problem is to add an extra layer of triangles, which we call ghost triangles. The ghost triangles
+Consider the example subpartitioning given in Figure \ref{fig:subdomain}. During the \code{evolve} calculations Triangle 2 in Submesh 0 will need to access its neighbour Triangle 3 stored in Submesh 1. The standard approach to this problem is to add an extra layer of triangles, which we call ghost triangles. The ghost triangles
 are read-only: they should not be updated during the calculations; they are only there to hold any extra information that a processor may need to complete its calculations. The ghost triangle values are updated through communication calls. Figure \ref{fig:subdomaing} shows the submeshes with the extra layer of ghost triangles.
 
 \begin{figure}[hbtp]
   \centerline{ \includegraphics[scale = 0.6]{figures/subdomain.eps}}
-  \caption{An example subpartioning.}
+  \caption{An example subpartitioning of a mesh.}
  \label{fig:subdomain}
 \end{figure}
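The ghost layer itself is easy to characterise: a triangle is a ghost for Submesh $p$ if it is owned by another submesh but shares an edge with a triangle of Submesh $p$. The self-contained sketch below finds one such layer; \code{ghost_layer} is an illustration, not the actual \code{build_submesh} code.
\begin{verbatim}
# Illustrative sketch only: find the ghost layer of submesh p, i.e.
# triangles owned elsewhere that share an edge with submesh p.

def ghost_layer(triangles, partition, p):
    # Map each edge (sorted node pair) to the triangles that use it
    edge_map = {}
    for t, tri in enumerate(triangles):
        for i in range(3):
            edge = tuple(sorted((tri[i], tri[(i + 1) % 3])))
            edge_map.setdefault(edge, []).append(t)

    ghosts = set()
    for tris in edge_map.values():
        if len(tris) == 2:
            a, b = tris
            if partition[a] == p and partition[b] != p:
                ghosts.add(b)
            elif partition[b] == p and partition[a] != p:
                ghosts.add(a)
    return sorted(ghosts)

triangles = [(0,1,2), (1,3,2), (2,3,4), (3,5,4)]
partition = [0, 0, 1, 1]
print(ghost_layer(triangles, partition, 0))   # [2]
print(ghost_layer(triangles, partition, 1))   # [1]
\end{verbatim}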
     
@@ -99 +99 @@
 \begin{figure}[hbtp]
   \centerline{ \includegraphics[scale = 0.6]{figures/subdomainghost.eps}}
-  \caption{An example subpartioning with ghost triangles.}
+  \caption{An example subpartitioning with ghost triangles. The numbers in brackets show the local numbering scheme that is calculated and stored with the mesh, but not applied until the local mesh is built. See Section \ref{sec:part4}.}
  \label{fig:subdomaing}
 \end{figure}
 
-When partitioning the mesh we introduce new, dummy, boundary edges. For example, Triangle 3 in Submesh 1, from Figure \ref{fig:subdomaing}, originally shared an edge with Triangle 2, but after partitioning that edge becomes a boundary edge. These new boundary edges are are tagged as \code{ghost} and should, in general, be assigned a type of \code{None}. The following piece of code taken from {\tt run_parallel_advection.py} shows an example.
-
+When partitioning the mesh we introduce new, dummy boundary edges. For example, Triangle 2 in Submesh 1 from Figure \ref{fig:subdomaing} originally shared an edge with Triangle 1, but after partitioning that edge becomes a boundary edge. These new boundary edges are tagged as \code{ghost} and should, in general, be assigned a type of \code{None}. The following piece of code taken from {\tt run_parallel_advection.py} shows an example.
 {\small \begin{verbatim}
 T = Transmissive_boundary(domain)
     
@@ -112 +111 @@
 \end{verbatim}}
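For context, a complete boundary assignment might look like the sketch below. The tag names other than \code{ghost} are illustrative assumptions, not necessarily the tags used in {\tt run_parallel_advection.py}; the key point is that the ghost-tagged edges are mapped to \code{None}.
\begin{verbatim}
T = Transmissive_boundary(domain)
# 'inflow' and 'outflow' are hypothetical tag names used only for
# illustration; the ghost-tagged boundary edges are assigned None.
domain.set_boundary({'inflow': T, 'outflow': T, 'ghost': None})
\end{verbatim}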
 
-
-Looking at Figure \ref{fig:subdomaing} we see that after each \code{evolve} step Processor 0  will have to send the updated values for Triangle 3 and Triangle 5 to Processor 1, and similarly Processor 1 will have to send the updated values for triangles 4, 7 and 6 (recall that Submesh $p$ will be assigned to Processor $p$). The \code{build_submesh} function builds a dictionary that defines the communication pattern.
-
-Finally, the ANUGA code assumes that the triangles (and nodes etc.) are numbered consecutively starting from 1 (FIXME (Ole): Isn't it 0?). Consequently, if Submesh 1 in Figure \ref{fig:subdomaing} was passed into the \code{evolve} calculations it would crash due to the 'missing' triangles. The \code{build_submesh} function determines a local numbering scheme for each submesh, but it does not actually update the numbering, that is left to the function \code{build_local}.
+Looking at Figure \ref{fig:subdomaing} we see that after each \code{evolve} step Processor 0 will have to send the updated values for Triangle 2 and Triangle 4 to Processor 1, and similarly Processor 1 will have to send the updated values for Triangle 3 and Triangle 5 (recall that Submesh $p$ will be assigned to Processor $p$). The \code{build_submesh} function builds a dictionary that defines the communication pattern.
+
+Finally, the ANUGA code assumes that the triangles (and nodes etc.) are numbered consecutively starting from 0. Consequently, if Submesh 1 in Figure \ref{fig:subdomaing} were passed into the \code{evolve} calculations it would crash. The \code{build_submesh} function determines a local numbering scheme for each submesh, but it does not actually update the numbering; that is left to \code{build_local}.
+
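Returning to the communication pattern: for the exchange just described it can be pictured as a pair of dictionaries keyed by the neighbouring processor number, one listing the full triangles to send and one listing the ghost triangles to receive. The layout below is a conceptual sketch only; the exact structure built by \code{build_submesh} may differ.
\begin{verbatim}
# Conceptual sketch of the communication pattern (the exact layout of
# build_submesh's dictionaries may differ).

# On Processor 0:
full_send_dict  = {1: [2, 4]}   # send Triangles 2 and 4 to Processor 1
ghost_recv_dict = {1: [3, 5]}   # ghosts 3 and 5 are updated by Processor 1

# On Processor 1:
full_send_dict  = {0: [3, 5]}
ghost_recv_dict = {0: [2, 4]}

# After each evolve step a processor sends the values of the triangles
# in full_send_dict[q] to processor q, and stores the values it
# receives from q in the ghosts listed in ghost_recv_dict[q].
\end{verbatim}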
 
 \subsection {Sending the Submeshes}\label{sec:part3}
     
@@ -121 +120 @@
 All of the functions described so far must be run in serial on Processor 0. The next step is to start the parallel computation by spreading the submeshes over the processors. The communication is carried out by
 \code{send_submesh} and \code{rec_submesh} defined in {\tt build_commun.py}.
-The \code{send_submesh} function should be called on Processor 0 and sends the Submesh $p$ to Processor $p$, while \code{rec_submesh} should be called by Processor $p$ to receive Submesh $p$ from Processor 0. Note that the order of communication is very important, if any changes are made to the \code{send_submesh} function the corresponding change must be made to the \code{rec_submesh} function.
+The \code{send_submesh} function should be called on Processor 0 and sends Submesh $p$ to Processor $p$, while \code{rec_submesh} should be called by Processor $p$ to receive Submesh $p$ from Processor 0.
+
+As an aside, the order of communication is very important. If someone were to modify the \code{send_submesh} routine, the corresponding change must be made to the \code{rec_submesh} routine.
 
 While it is possible to get Processor 0 to communicate its submesh to itself, it is an expensive and unnecessary communication call. The {\tt build_commun.py} file also includes a function called \code{extract_hostmesh} that should be called on Processor 0 to extract Submesh 0.
 
 
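This pairing is the familiar matched send/receive discipline. The sketch below illustrates it with mpi4py rather than the MPI bindings ANUGA actually uses, and assumes \code{submesh} is a dictionary holding one entry per processor for each kind of mesh data; swapping two sends without swapping the matching receives would scramble the submesh.
\begin{verbatim}
# Illustration only (mpi4py, not ANUGA's bindings): the receive order
# on Processor p must mirror the send order on Processor 0.
from mpi4py import MPI
comm = MPI.COMM_WORLD

def send_submesh_sketch(submesh, p):
    # Called on Processor 0: send the parts of Submesh p in a fixed order.
    comm.send(submesh['nodes'][p],     dest=p)
    comm.send(submesh['triangles'][p], dest=p)
    comm.send(submesh['boundary'][p],  dest=p)

def rec_submesh_sketch():
    # Called on Processor p: receive in exactly the same order.
    nodes     = comm.recv(source=0)
    triangles = comm.recv(source=0)
    boundary  = comm.recv(source=0)
    return nodes, triangles, boundary
\end{verbatim}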
-\subsection {Building the Local Mesh}
+\subsection {Building the Local Mesh}\label{sec:part4}
 After using \code{send_submesh} and \code{rec_submesh}, Processor $p$ should have its own local copy of Submesh $p$; however, as stated previously, the triangle numbering will be incorrect on all processors except number $0$. The \code{build_local_mesh} function from {\tt build_local.py} primarily focuses on renumbering the information stored with the submesh, including the nodes, vertices and quantities. Figure \ref{fig:subdomainf} shows what the mesh in each processor may look like.
 
     
@@ -147 +148 @@
 \begin{verbatim}
 #######################
-# Partition the domain
+# Partition the mesh
 #######################
 
     
@@ -157 +158 @@
 \end{verbatim}
 
-This rectangular mesh is artificial, and the approach to subpartitioning the mesh is different to the one described above, however this example may be of interest to those who want to measure the parallel efficiency of the code on their machine. A rectangular mesh should give a good load balance and is therefore an important first test problem.
-
-
-A more \lq real life\rq\ mesh is the Merimbula mesh used in the code shown in Section \ref{sec:codeRPMM}. This example also solves the advection equation. In this case the techniques described in Section \ref{sec:part} must be used to partition the mesh. Figure \ref{fig:code} shows the part of the code that is responsible for spreading the domain over the processors. We now look at the code in detail.
+Most simulations will not be done on a rectangular mesh, and the approach to subpartitioning the mesh is different from the one described above; however, this example may be of interest to those who want to measure the parallel efficiency of the code on their machine. A rectangular mesh should give a good load balance and is therefore an important first test problem.
+
+
+A more \lq real life\rq\ mesh is the Merimbula mesh used in the code shown in Section \ref{sec:codeRPMM}. This example also solves the advection equation. In this case the techniques described in Section \ref{sec:part} must be used to partition the mesh. Figure \ref{fig:code} shows the part of the code that is responsible for spreading the mesh over the processors. We now look at the code in detail.
 
 \begin{figure}[htbp]
     
@@ -170 +171 @@
     filename = 'merimbula_10785.tsh'
 
-    domain_full = pmesh_to_domain_instance(filename, Advection_Domain)
-    domain_full.set_quantity('stage', Set_Stage(756000.0,756500.0,4.0))
-
-    # Define the domain boundaries for visualisation
-
-    rect = array(domain_full.xy_extent, Float)
+    mesh_full = pmesh_to_domain_instance(filename, Advection_Domain)
+    mesh_full.set_quantity('stage', Set_Stage(756000.0,756500.0,4.0))
 
     # Subdivide the mesh
 
     nodes, triangles, boundary, triangles_per_proc, quantities  =\
-            pmesh_divide_metis(domain_full, numprocs)
+            pmesh_divide_metis(mesh_full, numprocs)
 
     # Build the mesh that should be assigned to each processor.
     
@@ -195 +192 @@
     # Build the local mesh for processor 0
 
-    hostmesh = extract_hostmesh(submesh)
-    points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict = \
-             build_local_mesh(hostmesh, 0, triangles_per_proc[0], numprocs)
+    points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict =\
+             extract_hostmesh(submesh, triangles_per_proc)
 
 else:
     
@@ -213 +209 @@
 \begin{itemize}
 \item
-These first few lines of code read in and define the (global) mesh.
+These first few lines of code read in and define the (global) mesh. The \code{Set_Stage} function sets the initial conditions; see the code in Section \ref{sec:codeRPMM} for its definition.
 \begin{verbatim}
     filename = 'merimbula_10785.tsh'
-    domain_full = pmesh_to_domain_instance(filename, Advection_Domain)
-    domain_full.set_quantity('stage', Set_Stage(756000.0,756500.0,4.0))
-\end{verbatim}
-
-\item
-The \code{rect} array is used by the visualiser and records the domain size.
+    mesh_full = pmesh_to_domain_instance(filename, Advection_Domain)
+    mesh_full.set_quantity('stage', Set_Stage(756000.0,756500.0,4.0))
+\end{verbatim}
+
 \item \code{pmesh_divide_metis} divides the mesh into a set of non-overlapping subdomains as described in Section \ref{sec:part1}.
 \begin{verbatim}
     nodes, triangles, boundary, triangles_per_proc, quantities  =\
-            pmesh_divide_metis(domain_full, numprocs)
-\end{verbatim}
-
-\item The next step is to build a boundary layer of ghost triangles and define the communication pattern. This step is implemented by \code{build_submesh} as discussed in Section \ref{sec:part2}.
+            pmesh_divide_metis(mesh_full, numprocs)
+\end{verbatim}
+
+\item The next step is to build a boundary layer of ghost triangles and define the communication pattern. This step is implemented by \code{build_submesh} as discussed in Section \ref{sec:part2}. The \code{submesh} variable contains a copy of the submesh for each processor.
 \begin{verbatim}
     submesh = build_submesh(nodes, triangles, boundary, quantities, \
     
@@ -240 +234 @@
 \end{verbatim}
 
-The processors receive a given subpartition by calling \code{rec_submesh}. The \code{rec_submesh} routine also calls \code{build_local_mesh}. The \code{build_local_mesh} routine described in Section \ref{sec:part4} ensures that the information is stored in a way that is compatible with the Domain datastructure. This means, for example, that the triangles and nodes must be numbered consecutively starting from 1 (FIXME (Ole): or is it 0?).
-\begin{verbatim}
-    points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict = \
+
+Each processor receives its subpartition by calling \code{rec_submesh}. The \code{rec_submesh} routine also calls \code{build_local_mesh}. The \code{build_local_mesh} routine described in Section \ref{sec:part4} ensures that the information is stored in a way that is compatible with the Domain data structure. This means, for example, that the triangles and nodes must be numbered consecutively starting from 0.
+
+\begin{verbatim}
+    points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict =\
             rec_submesh(0)
 \end{verbatim}
 
-Note that the submesh is not received by, or sent to, Processor 0. Rather     \code{hostmesh = extract_hostmesh(submesh)} extracts the appropriate information. This saves the cost of an unnecessary communication call. It is described further in Section \ref{sec:part3}.
-\begin{verbatim}
-    hostmesh = extract_hostmesh(submesh)
-    points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict = \
-             build_local_mesh(hostmesh, 0, triangles_per_proc[0], numprocs)
+Note that the submesh is not received by, or sent to, Processor 0. Rather, \code{extract_hostmesh} simply extracts the mesh that has been assigned to Processor 0 and, like \code{build_local_mesh}, renumbers the nodes, vertices and quantities. Recall that \code{submesh} contains the list of submeshes to be assigned to each processor. This is described further in Section \ref{sec:part3}.
+\begin{verbatim}
+    points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict =\
+             extract_hostmesh(submesh, triangles_per_proc)
 \end{verbatim}
 
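The renumbering performed by \code{build_local_mesh} (and, for Processor 0, by \code{extract_hostmesh}) amounts to mapping each submesh's global ids onto consecutive local ids starting from 0. A minimal sketch, using Submesh 1 from Figure \ref{fig:subdomaing} and assuming, purely for illustration, that full triangles are numbered before ghosts:
\begin{verbatim}
def renumber(global_ids):
    # Map global ids to consecutive local ids 0, 1, 2, ...
    # (a sketch of the renumbering build_local_mesh performs)
    local = {}
    for gid in global_ids:
        local[gid] = len(local)
    return local

# Submesh 1 owns Triangles 3 and 5 and carries ghosts 2 and 4;
# numbering the full triangles first is an assumed convention.
print(renumber([3, 5, 2, 4]))   # {3: 0, 5: 1, 2: 2, 4: 3}
\end{verbatim}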