------------------------------------------------
PyPAR - Parallel Python, no-frills MPI interface

Author:  Ole Nielsen (2001, 2002, 2003)
Email:   Ole.Nielsen@anu.edu.au
Version: See pypar.__version__
Date:    See pypar.__date__
------------------------------------------------

CONTENTS:
  TESTING THE INSTALLATION
  GETTING STARTED
  FIRST PARALLEL PROGRAM
  PROGRAMMING FOR EFFICIENCY
  MORE PARALLEL PROGRAMMING WITH PYPAR
  PYPAR REFERENCE
  DATA TYPES
  STATUS OBJECT
  EXTENSIONS
  PERFORMANCE
  PROBLEMS
  REFERENCES

TESTING THE INSTALLATION

  After installing pypar, run

    mpirun -np 2 testpypar.py

  to verify that everything works.
  If it doesn't, try running a C/MPI program
  (for example ctiming.c) to see whether the problem lies with
  the local MPI installation or with pypar.


GETTING STARTED
  Assuming that the pypar C-extension compiled, you should be able to
  write
  >> import pypar
  at the Python prompt.
  (This may not work with MPICH under SunOS - this doesn't matter;
  just skip this interactive bit and go to FIRST PARALLEL PROGRAM.)

  Then you can write
  >> pypar.size()
  to get the number of parallel processors in play.
  (When using pypar from the command line this number will be 1.)

  Processors are numbered from 0 to pypar.size()-1.
  Try
  >> pypar.rank()
  to get the processor number of the current processor.
  (When using pypar from the command line this number will be 0.)

  Finally, try
  >> pypar.Get_processor_name()
  to get the host name of the current node.
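
  Putting these together, a minimal interactive session might look as
  follows (a single-process run from the command line; the host name
  shown here is made up for illustration):

    >> import pypar
    >> pypar.size()
    1
    >> pypar.rank()
    0
    >> pypar.Get_processor_name()
    'node0'
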
FIRST PARALLEL PROGRAM
  Take a look at the file demo.py supplied with this distribution.
  To execute it in parallel on, say, four processors, run

    mpirun -np 4 demo.py

  from the UNIX command line.
  This will start four copies of the program, each on its own processor.

  The program demo.py creates a message on processor 0 and
  sends it around in a ring
  - each processor adding a bit to it - until it arrives back at
  processor 0.

  All parallel programs must find out which processor they are running on.
  This is accomplished by the call

    myid = pypar.rank()

  The total number of processors is obtained from

    proc = pypar.size()

  One can then have different code for different processors by branching
  as in

    if myid == 0:
      ...

  To send a general Python structure A to processor p, write
    pypar.send(A, p)
  and to receive something from processor q, write
    X = pypar.receive(q)

  This caters for any picklable Python structure and makes parallel
  programs very simple and readable. While reasonably fast, it does not
  achieve the full bandwidth obtainable with MPI.
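
  As a sketch of the idea (this is illustrative, not the actual contents
  of demo.py), a minimal ring program could look like the following,
  assuming at least two processors:

    # Minimal sketch of ring communication (illustrative only, not the
    # actual demo.py). Run with e.g.: mpirun -np 4 ring.py
    import pypar

    myid = pypar.rank()   # id of this processor
    proc = pypar.size()   # total number of processors (assumed >= 2)

    if myid == 0:
        msg = 'P0'
        pypar.send(msg, 1)                  # start the ring
        msg = pypar.receive(proc - 1)       # message returns, fully extended
        print 'Processor 0 got back: %s' % msg
    else:
        msg = pypar.receive(myid - 1)       # receive from predecessor
        msg = msg + ' P%d' % myid           # add our bit
        pypar.send(msg, (myid + 1) % proc)  # pass on to successor

    pypar.Finalize()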


PROGRAMMING FOR EFFICIENCY
  For really fast communication one must stick to Numeric arrays and use
  the 'raw' versions of send and receive.
  To send a Numeric array A to processor p, write
    pypar.raw_send(A, p)
  and to receive the array from processor q, write
    X = pypar.raw_receive(X, q)
  Note that X acts as a buffer and must be pre-allocated prior to the
  receive statement, as in Fortran and C programs using MPI.
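
  For example (a sketch assuming at least two processors):

    import pypar
    import Numeric

    myid = pypar.rank()
    N = 1000

    if myid == 0:
        A = Numeric.arange(N).astype('d')  # double precision data to send
        pypar.raw_send(A, 1)
    elif myid == 1:
        X = Numeric.zeros(N, 'd')      # pre-allocated, right size and type
        X = pypar.raw_receive(X, 0)    # fills X and returns a reference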

  See the script pytiming for an example of communication of Numeric arrays.


MORE PARALLEL PROGRAMMING WITH PYPAR

  When you are ready to move on, have a look at the supplied
  demos and the script testpypar.py, which all give examples
  of parallel programming with pypar.
  You can also look at a standard MPI reference (C or Fortran)
  to get some ideas. The principles are the same in pypar except
  that many calls have been simplified.


PYPAR REFERENCE
  Here is a list of functions provided by pypar
  (see the section on DATA TYPES for an explanation of 'vanilla'):

  size() -- Number of processors
  rank() -- Id of current processor
  Get_processor_name() -- Return host name of current node

  send(x, destination, tag=0, vanilla=0) -- Blocking send (all types)
    Sends data in x to destination with the given tag.

  y = receive(source, tag=0) -- Blocking receive (all types)
    Receives data (y) from source (possibly with a specified tag).

  y, status = receive(source, tag=0, return_status=True) -- Blocking receive (all types)
    Receives data (y) and a status object from source (possibly with a
    specified tag).

  raw_send(x, destination, tag=0, vanilla=0) -- Raw blocking send (Fastest)
    Sends data in x to destination with the given tag.
    It differs from send in that the receiver MUST provide a buffer
    to store the received data.
    Although it accepts all types, raw_send is intended mainly
    for Numeric arrays.

  raw_receive(x, source, tag=0, vanilla=0) -- Raw blocking receive (Fastest)
    Receives data from source (possibly with a specified tag) and puts
    it in x (which must be of compatible size and type).
    It also returns a reference to x.
    Although it accepts all types, raw_receive is intended mainly
    for Numeric arrays.

  x, status = raw_receive(x, source, tag=0, vanilla=0, return_status=True) -- Raw blocking receive (Fastest)
    Receives data and a status object from source (possibly with a
    specified tag) and puts the data in x (which must be of compatible
    size and type).

  bcast(X, rootid) -- Broadcasts X from rootid to all other processors.
    All processors must issue the same bcast.

  raw_scatter(x, nums, buffer, source, vanilla=0)
    Scatters the first nums elements in x from source to buffer
    (of size given by nums).

  scatter(x, source, vanilla=0)
    Scatters all elements in x from source to a buffer
    created by this function and returned.

  raw_gather(x, buffer, source, vanilla=0)
    Gathers all elements in x to buffer at source.
    Buffer must have size len(x) * numprocs and
    shape[0] == x.shape[0] * numprocs.

  gather(x, source, vanilla=0)
    Gathers all elements in x to a buffer of
    size len(x) * numprocs, created by this function and returned.
    If x is multidimensional, the buffer will have
    the size of the zeroth dimension multiplied by numprocs.

  raw_reduce(x, buffer, op, source, vanilla=0)
    Reduces all elements in x to buffer (of the same size as x)
    at source, applying operation op elementwise.

  reduce(x, op, source, vanilla=0)
    Reduces all elements in x at source,
    applying operation op elementwise, and returns the result in a
    new buffer, which is created by this function.

  Wtime() -- MPI wall time
  Barrier() -- Synchronisation point. Makes processors wait until all
               processors have reached this point.
  Abort() -- Terminate all processes.
  Finalize() -- Cleanup MPI. No parallelism can take place after this point.

  See pypar.py for doc strings on individual functions.

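  As a quick illustration of the collective calls, here is a sketch in
  which every processor contributes a small array and processor 0 gathers
  the pieces (illustrative only; see testpypar.py for tested usage of the
  other collectives):

    import pypar
    import Numeric

    myid = pypar.rank()
    proc = pypar.size()

    # Each processor contributes two values derived from its rank;
    # processor 0 receives the concatenation in rank order.
    x = Numeric.array([float(myid), float(myid) + 0.5])
    G = pypar.gather(x, 0)   # buffer of size len(x)*numprocs, returned
    if myid == 0:            # G is meaningful at the source processor
        print 'Gathered:', G

    pypar.Barrier()          # wait until all processors arrive here
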

DATA TYPES
  Pypar automatically handles different data types using one of
  three protocols:
    'array': Numeric arrays of type 'i', 'l', 'f', or 'd' can be communicated
             with the underlying mpiext.send_array and mpiext.receive_array.
             This is the fastest mode.
    'string': Text strings can be communicated with mpiext.send_string and
              mpiext.receive_string.
    'vanilla': All other types can be communicated using the functions
               send_vanilla and receive_vanilla, provided that the objects
               can be serialised using pickle (or cPickle). This mode is
               less efficient than the first two but it can handle
               complex structures.

  Rules:
    If the keyword argument vanilla == 1, the vanilla protocol is chosen
    regardless of x's type.
    Otherwise, if x is a string, the string protocol is chosen.
    If x is an array, the 'array' protocol is chosen provided that x has
    one of the admissible typecodes.

  Functions that take vanilla as a keyword argument can thus force vanilla
  mode on any data type, as sketched below.
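
  A sketch (assuming at least two processors; normally you would let
  pypar pick the faster 'array' protocol for a Numeric array):

    import pypar
    import Numeric

    A = Numeric.array([1.0, 2.0, 3.0])

    if pypar.rank() == 0:
        pypar.send(A, 1, vanilla=1)   # pickled, not sent as a raw array
    elif pypar.rank() == 1:
        X = pypar.receive(0)          # arrives as the unpickled structure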

STATUS OBJECT
  A status object can optionally be returned from receive and raw_receive
  by specifying return_status=True in the call.
  The status object can subsequently be queried for information about
  the communication. The fields are:
    status.source: The origin of the received message (use e.g. with pypar.any_source)
    status.tag: The tag of the received message (use e.g. with pypar.any_tag)
    status.error: Error code generated by the underlying C function
    status.length: Number of elements received
    status.size: Size (in bytes) of one element
    status.bytes(): Number of payload bytes received (excluding control info)

  The status object is essential when used together with any_source or
  any_tag, as in the sketch below.

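  A sketch of a collector pattern (assuming at least two processors)
  where processor 0 accepts messages from any processor in arrival order
  and uses the status object to find out who sent what:

    import pypar

    myid = pypar.rank()
    proc = pypar.size()

    if myid == 0:
        # Accept one message from each other processor, as they arrive
        for i in range(proc - 1):
            msg, status = pypar.receive(pypar.any_source, tag=pypar.any_tag,
                                        return_status=True)
            print 'Received %s from processor %d (tag %d)'\
                  % (msg, status.source, status.tag)
    else:
        pypar.send('result from %d' % myid, 0)
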

EXTENSIONS
  At this stage only a subset of MPI is implemented. However,
  most real-world MPI programs use only these simple functions.
  If you find that you need other MPI calls, please feel free to
  add them to the C-extension. Alternatively, drop me a note and I'll
  use that as an excuse to update pypar.


PERFORMANCE
  If you are passing simple Numeric arrays around you can reduce
  the communication time by using the 'raw' versions of send and
  receive (see PYPAR REFERENCE above). These versions are closer to the
  underlying MPI implementation in that one must provide receive buffers
  of the right size. You will find that they can be somewhat faster, as
  they bypass pypar's mechanism for automatically transferring the needed
  buffer size. Also, using simple Numeric arrays will bypass pypar's
  pickling of complex structures.

  This will mainly be of interest if you are using 'fine-grained'
  parallelism, i.e. if you have frequent communications.

  Try both versions and see for yourself if there is any noticeable
  difference.

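  A rough comparison using pypar.Wtime() might look like this (a sketch
  assuming at least two processors; the supplied pytiming script does
  this more carefully):

    import pypar
    import Numeric

    myid = pypar.rank()
    N = 100000
    A = Numeric.arange(N).astype('d')

    # Time the general (pickling-capable) send/receive
    pypar.Barrier()
    t0 = pypar.Wtime()
    if myid == 0:
        pypar.send(A, 1)
    elif myid == 1:
        A = pypar.receive(0)
    pypar.Barrier()
    if myid == 0:
        print 'send/receive:         %f s' % (pypar.Wtime() - t0)

    # Time the raw versions with a pre-allocated receive buffer
    B = Numeric.zeros(N, 'd')
    pypar.Barrier()
    t0 = pypar.Wtime()
    if myid == 0:
        pypar.raw_send(A, 1)
    elif myid == 1:
        B = pypar.raw_receive(B, 0)
    pypar.Barrier()
    if myid == 0:
        print 'raw_send/raw_receive: %f s' % (pypar.Wtime() - t0)
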

PROBLEMS
  If you encounter any problems with this package, please drop me a line.
  I cannot guarantee that I can help you, but I will try.

  Ole Nielsen
  Australian National University
  Email: Ole.Nielsen@anu.edu.au


REFERENCES
  To learn more about MPI in general, see the web sites
    http://www-unix.mcs.anl.gov/mpi/
    http://www.redbooks.ibm.com/pubs/pdfs/redbooks/sg245380.pdf

  or the books
    "Using MPI, 2nd Edition", by Gropp, Lusk, and Skjellum
    "The LAM companion to 'Using MPI...'", by Zdzislaw Meglicki
    "Parallel Programming With MPI", by Peter S. Pacheco
    "RS/6000 SP: Practical MPI Programming", by Yukiya Aoyama and Jun Nakano

  To learn more about Python, see the web site
    http://www.python.org