Changeset 2786


Timestamp:
May 1, 2006, 10:36:09 AM
Author:
linda
Message:

Continued working on the parallel documentation. Finished (awaiting feedback) the description of how to parallelise the code and the example codes.

Location:
inundation/parallel/documentation
Files:
6 edited

  • inundation/parallel/documentation/code.tex

    r2768 r2786  
    11
    2 \chapter{Code Listing}
    3 \section{run_parallel_advection.py}
    4 \verbatiminput{RunParallelAdvection.py}
     2\chapter{Code Listing}\label{chap:code}
     3\section{run_parallel_advection.py}\label{sec:codeRPA}
     4\verbatiminput{code/RunParallelAdvection.py}
    55\newpage
    6 \section{run_parallel_merimbula_metis.py}
    7 \verbatiminput{RunParallelMerimbulaMetis.py}
     6\section{run_parallel_merimbula_metis.py}\label{sec:codeRPMM}
     7\verbatiminput{code/RunParallelMerimbulaMetis.py}
    88\newpage
    9 \section{run_parallel_sw_merimbula_metis.py}
    10 \verbatiminput{RunParallelSwMerimbulaMetis.py}
     9\section{run_parallel_sw_merimbula_metis.py}\label{sec:codeRPSMM}
     10\verbatiminput{code/RunParallelSwMerimbulaMetis.py}
  • inundation/parallel/documentation/code/RunParallelAdvection.py

    r2785 r2786  
    8686T = Transmissive_boundary(domain)
    8787domain.default_order = 2
    88 domain.set_boundary( {'left': T, 'right': T, 'bottom': T, 'top': T, 'ghost': None} )
     88domain.set_boundary( {'left': T, 'right': T, 'bottom': T, 'top': T, \
     89                      'ghost': None} )
    8990
    9091# Ensure that the domain definitions make sense
     
    112113if myid == 0:
    113114    print 'That took %.2f seconds' %(time.time()-t0)
    114     print 'Communication time %.2f seconds'%domain.communication_time
    115     print 'Reduction Communication time %.2f seconds'%domain.communication_reduce_time
     115    print 'Communication time %.2f seconds'\
     116          %domain.communication_time
     117    print 'Reduction Communication time %.2f seconds'\
     118          %domain.communication_reduce_time
  • inundation/parallel/documentation/code/RunParallelMerimbulaMetis.py

    r2785 r2786  
    9696    rect = array(domain_full.xy_extent, Float)
    9797
    98     # Subdivide the mes
     98    # Subdivide the mesh
    9999
    100100    nodes, triangles, boundary, triangles_per_proc, quantities  =\
     
    154154T = Transmissive_boundary(domain)
    155155#R = Reflective_boundary(domain)
    156 domain.set_boundary( {'outflow': T, 'inflow': T, 'inner':T, 'exterior': T, 'open':T, 'ghost':None} )
     156domain.set_boundary( {'outflow': T, 'inflow': T, 'inner':T, \
     157                      'exterior': T, 'open':T, 'ghost':None} )
    157158
    158159# Set the initial quantities
  • inundation/parallel/documentation/code/RunParallelSwMerimbulaMetis.py

    r2785 r2786  
    189189    print 'That took %.2f seconds' %(time.time()-t0)
    190190    print 'Communication time %.2f seconds'%domain.communication_time
    191     print 'Reduction Communication time %.2f seconds'%domain.communication_reduce_time
    192     print 'Broadcast time %.2f seconds'%domain.communication_broadcast_time
     191    print 'Reduction Communication time %.2f seconds'\
     192          %domain.communication_reduce_time
     193    print 'Broadcast time %.2f seconds'\
     194          %domain.communication_broadcast_time
  • inundation/parallel/documentation/parallel.tex

    r2767 r2786  
    11\chapter{Running the Code in Parallel}
    22
    3 This chapter looks at how to run the code in parallel. The first section gives an overview of the main steps required to divide the mesh over the processors. The second sections gives some example code and the final section talks about how to run the code on a few specific architectures.
    4 
    5 
    6 \section {Partitioning the Mesh}
     3This chapter looks at how to run the code in parallel. The first section gives an overview of the main steps required to divide the mesh over the processors. The second section describes some example code and the final section discusses how to run the code on a few specific architectures.
     4
     5
     6\section {Partitioning the Mesh}\label{sec:part}
    77There are four main steps required to run the code in parallel. They are:
    88\begin{enumerate}
     
    1919
    2020\begin{figure}[hbtp]
    21   \centerline{ \includegraphics[scale = 0.75]{domain.eps}}
     21  \centerline{ \includegraphics[scale = 0.75]{figures/domain.eps}}
    2222  \caption{The main steps used to divide the mesh over the processors.}
    2323  \label{fig:subpart}
    2424\end{figure}
    2525
    26 \subsection {Subdividing the Global Mesh}
     26\subsection {Subdividing the Global Mesh}\label{sec:part1}
    2727
    2828The first step in parallelising the code is to subdivide the mesh
     
    3939
    4040\begin{figure}[hbtp]
    41   \centerline{ \includegraphics[scale = 0.75]{mermesh.eps}}
     41  \centerline{ \includegraphics[scale = 0.75]{figures/mermesh.eps}}
    4242  \caption{The Merimbula grid.}
    4343 \label{fig:mergrid}
     
    4747
    4848\begin{figure}[hbtp]
    49   \centerline{ \includegraphics[scale = 0.75]{mermesh4c.eps}
    50   \includegraphics[scale = 0.75]{mermesh4a.eps}}
    51 
    52 
    53   \centerline{ \includegraphics[scale = 0.75]{mermesh4d.eps}
    54   \includegraphics[scale = 0.75]{mermesh4b.eps}}
     49  \centerline{ \includegraphics[scale = 0.75]{figures/mermesh4c.eps}
     50  \includegraphics[scale = 0.75]{figures/mermesh4a.eps}}
     51
     52
     53  \centerline{ \includegraphics[scale = 0.75]{figures/mermesh4d.eps}
     54  \includegraphics[scale = 0.75]{figures/mermesh4b.eps}}
    5555  \caption{The Merimbula grid partitioned over 4 processors using Metis.}
    5656 \label{fig:mergrid4}
     
    8383The number of submeshes found by Pymetis is equal to the number of processors; Submesh $p$ will be assigned to Processor $p$.
    8484
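As a minimal sketch of this step (the call is the one used in the example code of Section \ref{sec:codeRPMM}; the final check is illustrative only), Processor 0 partitions the global mesh as follows:
\begin{verbatim}
# Sketch only: partition the global mesh into numprocs submeshes
nodes, triangles, boundary, triangles_per_proc, quantities = \
    pmesh_divide_metis(domain_full, numprocs)

# One submesh per processor: submesh p will be assigned to processor p
assert len(triangles_per_proc) == numprocs
\end{verbatim}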
    85 \subsection {Building the Ghost Layer}
     85\subsection {Building the Ghost Layer}\label{sec:part2}
    8686The function {\tt build_submesh} is the workhorse and is responsible for
    8787setting up the communication pattern as well as assigning the local numbering scheme for the submeshes.
     
    9191
    9292\begin{figure}[hbtp]
    93   \centerline{ \includegraphics[scale = 0.6]{subdomain.eps}}
     93  \centerline{ \includegraphics[scale = 0.6]{figures/subdomain.eps}}
    9494  \caption{An example subpartitioning.}
    9595 \label{fig:subdomain}
     
    9898
    9999\begin{figure}[hbtp]
    100   \centerline{ \includegraphics[scale = 0.6]{subdomainghost.eps}}
     100  \centerline{ \includegraphics[scale = 0.6]{figures/subdomainghost.eps}}
    101101  \caption{An example subpartitioning with ghost triangles.}
    102102 \label{fig:subdomaing}
     
    117117Finally, the ANUGA code assumes that the triangles (and nodes etc.) are numbered consecutively starting from 1. Consequently, if Submesh 1 in Figure \ref{fig:subdomaing} were passed into the \code{evolve} calculations it would crash. The \code{build_submesh} function determines a local numbering scheme for each submesh, but it does not actually update the numbering; that is left to \code{build_local}.
    118118
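As a sketch of how this step is invoked (the call is the one used in the example code of Section \ref{sec:codeRPMM}; only the comments are new), Processor 0 builds all of the submeshes at once:
\begin{verbatim}
# Sketch only: build the ghost layer, the communication pattern and
# the (not yet applied) local numbering scheme for every submesh
submesh = build_submesh(nodes, triangles, boundary, quantities, \
                        triangles_per_proc)
\end{verbatim}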
    119 \subsection {Sending the Submeshes}
     119\subsection {Sending the Submeshes}\label{sec:part3}
    120120
    121121All of the functions described so far must be run in serial on Processor 0. The next step is to start the parallel computation by spreading the submeshes over the processors. The communication is carried out by
     
    131131
    132132\begin{figure}[hbtp]
    133   \centerline{ \includegraphics[scale = 0.6]{subdomainfinal.eps}}
     133  \centerline{ \includegraphics[scale = 0.6]{figures/subdomainfinal.eps}}
    134134  \caption{An example subpartitioning after the submeshes have been renumbered.}
    135135 \label{fig:subdomainf}
     
    140140\section{Some Example Code}
    141141
    142 \begin{figure}
     142Chapter \ref{chap:code} gives full listings of some example codes.
     143
      144The first example in Section \ref{sec:codeRPA} solves the advection equation on a
      145rectangular mesh. A rectangular mesh is highly structured, so a coordinate-based decomposition can be used and the partitioning is simply done by calling the
      146routine \code{parallel_rectangle} as shown below.
     147\begin{verbatim}
     148#######################
     149# Partition the domain
     150#######################
     151
      152# Build a unit mesh, subdivide it over numprocs processors with each
     153# submesh containing M*N nodes
     154
     155points, vertices, boundary, full_send_dict, ghost_recv_dict =  \
     156    parallel_rectangle(N, M, len1_g=1.0)
     157\end{verbatim}
     158
      159This rectangular mesh is artificial, and the approach to subpartitioning the mesh is different to the one described above; however, this example may be of interest to those who want to measure the parallel efficiency of the code on their machine. A rectangular mesh should give a good load balance and is therefore an important first test problem.
     160
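For those using this example to gauge parallel overheads, the communication timers that the listing already prints can be combined into a simple overhead fraction. The following is only a sketch: it assumes the variables used in Section \ref{sec:codeRPA}, namely \code{t0} taken before the evolve loop and the \code{domain.communication_time} and \code{domain.communication_reduce_time} attributes.
\begin{verbatim}
# Sketch only: t0 is taken before the evolve loop, as in the full
# listing; the communication timers are those printed at the end of
# run_parallel_advection.py
if myid == 0:
    total_time = time.time() - t0
    comm_time  = domain.communication_time \
               + domain.communication_reduce_time
    print 'Fraction of time spent in communication: %.2f' \
          %(comm_time/total_time)
\end{verbatim}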
     161
     162A more \lq real life\rq\ mesh is the Merimbula mesh used in the code shown in Section \ref{sec:codeRPMM}. This example also solves the advection equation. In this case the techniques described in Section \ref{sec:part} must be used to partition the mesh. Figure \ref{fig:code} shows the part of the code that is responsible for spreading the domain over the processors. We now look at the code in detail.
     163
     164\begin{figure}[htbp]
    143165\begin{verbatim}
    144166if myid == 0:
     
    146168    # Read in the test files
    147169
    148     filename = 'merimbula_10785_1.tsh'
    149 
    150     # Build the whole domain
    151    
    152     domain_full = pmesh_to_domain_instance(filename, Domain)
     170    filename = 'merimbula_10785.tsh'
     171
     172    domain_full = pmesh_to_domain_instance(filename, Advection_Domain)
     173    domain_full.set_quantity('stage', Set_Stage(756000.0,756500.0,4.0))
    153174
    154175    # Define the domain boundaries for visualisation
    155 
     176   
    156177    rect = array(domain_full.xy_extent, Float)
    157178
    158     # Initialise the wave
    159 
    160     domain_full.set_quantity('stage', Set_Stage(756000.0,756500.0,2.0))
    161 
    162179    # Subdivide the mesh
    163    
    164     nodes, triangles, boundary, triangles_per_proc, quantities = \
    165          pmesh_divide_metis(domain_full, numprocs)
     180
     181    nodes, triangles, boundary, triangles_per_proc, quantities  =\
     182            pmesh_divide_metis(domain_full, numprocs)
    166183
    167184    # Build the mesh that should be assigned to each processor,
    168185    # this includes ghost nodes and the communication pattern
    169186   
    170     submesh = build_submesh(nodes, triangles, boundary,\
    171                             quantities, triangles_per_proc)
     187    submesh = build_submesh(nodes, triangles, boundary, quantities, \
     188                            triangles_per_proc)
    172189
    173190    # Send the mesh partition to the appropriate processor
     
    183200
    184201else:
    185    
    186202    # Read in the mesh partition that belongs to this
    187203    # processor (note that the information is in the
    188     # correct form for the GA data structure)
    189 
    190     points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict \
    191             = rec_submesh(0)
    192 \end{verbatim}
    193 \end{figure}
     204    # correct form for the GA data structure
     205
     206    points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict = \
     207             rec_submesh(0)
     208\end{verbatim}
     209  \caption{A section of code taken from {\tt run_parallel_merimbula_metis.py} (Section \protect \ref{sec:codeRPMM}) showing how to subdivide the mesh.}
     210 \label{fig:code}
     211\end{figure}
     212\newpage
     213\begin{itemize}
     214\item
     215These first few lines of code read in and define the (global) mesh.
     216\begin{verbatim}
     217    filename = 'merimbula_10785.tsh'
     218    domain_full = pmesh_to_domain_instance(filename, Advection_Domain)
     219    domain_full.set_quantity('stage', Set_Stage(756000.0,756500.0,4.0))
     220\end{verbatim}
     221
     222\item
     223The \code{rect} array is used by the visualiser and records the domain size.
     224\item \code{pmesh_divide_metis} divides the mesh into a set of non-overlapping subdomains as described in Section \ref{sec:part1}.
     225\begin{verbatim}
     226    nodes, triangles, boundary, triangles_per_proc, quantities  =\
     227            pmesh_divide_metis(domain_full, numprocs)
     228\end{verbatim}
     229
     230\item The next step is to build a boundary layer of ghost triangles and define the communication pattern. This step is implemented by \code{build_submesh} as discussed in Section \ref{sec:part2}.
     231\begin{verbatim}       
     232    submesh = build_submesh(nodes, triangles, boundary, quantities, \
     233                            triangles_per_proc)
     234\end{verbatim}
     235
     236\item The actual parallel communication starts when the submesh partitions are sent to the processors by calling \code{send_submesh}.
     237\begin{verbatim}
     238    for p in range(1, numprocs):
     239      send_submesh(submesh, triangles_per_proc, p)
     240\end{verbatim}
     241
      242The processors receive a given subpartition by calling \code{rec_submesh}. The \code{rec_submesh} routine also calls \code{build_local_mesh}, which, as described in Section \ref{sec:part4}, ensures that the information is stored in a way that is compatible with the Domain data structure. This means, for example, that the triangles and nodes must be numbered consecutively starting from 1.
     243\begin{verbatim}
     244    points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict = \
     245             rec_submesh(0)
     246\end{verbatim}
     247
      248Note that the submesh is not received by, or sent to, Processor 0. Rather, \code{hostmesh = extract_hostmesh(submesh)} extracts the appropriate information. This saves the cost of an unnecessary communication call. It is described further in Section \ref{sec:part3}.
     249\begin{verbatim}
     250    hostmesh = extract_hostmesh(submesh)
     251    points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict = \
     252             build_local_mesh(hostmesh, 0, triangles_per_proc[0], numprocs)
     253\end{verbatim}
     254
     255\end{itemize}
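Putting these steps together, the distribution phase of {\tt run_parallel_merimbula_metis.py} reduces to the sketch below. The calls are exactly those shown in Figure \ref{fig:code}; only the comments, which point back to the relevant subsections, are new.
\begin{verbatim}
if myid == 0:
    # Subdivide the global mesh (see Subdividing the Global Mesh)
    nodes, triangles, boundary, triangles_per_proc, quantities = \
            pmesh_divide_metis(domain_full, numprocs)

    # Add the ghost layer and the communication pattern
    # (see Building the Ghost Layer)
    submesh = build_submesh(nodes, triangles, boundary, quantities, \
                            triangles_per_proc)

    # Send each remaining submesh to its processor
    # (see Sending the Submeshes)
    for p in range(1, numprocs):
      send_submesh(submesh, triangles_per_proc, p)

    # Processor 0 keeps its own submesh and renumbers it locally
    hostmesh = extract_hostmesh(submesh)
    points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict = \
             build_local_mesh(hostmesh, 0, triangles_per_proc[0], numprocs)
else:
    # Receive and renumber the submesh assigned to this processor
    points, vertices, boundary, quantities, ghost_recv_dict, full_send_dict = \
             rec_submesh(0)
\end{verbatim}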
    194256
    195257\section{Running the Code}
  • inundation/parallel/documentation/visualisation.tex

    r2723 r2786  
    3030\end{itemize}
    3131Screenshot:\\
    32 \includegraphics{vis-screenshot.eps}\\
     32\includegraphics{figures/vis-screenshot.eps}\\
    3333
    3434Unlike the old VPython visualiser, the behaviour of the VTK