------------------------------------------------
PyPAR - Parallel Python, no-frills MPI interface

Author:  Ole Nielsen (2001, 2002, 2003)
Email:   Ole.Nielsen@anu.edu.au
Version: See pypar.__version__
Date:    See pypar.__date__
------------------------------------------------

CONTENTS:
  DATA TYPES
  PROBLEMS


After having installed pypar run

  mpirun -np 2 testpypar.py

to verify that everything works.
If it doesn't, try running a C/MPI program
(for example ctiming.c) to see whether the problem lies with
the local MPI installation or with pypar.


Assuming that the pypar C-extension compiled, you should be able to
write
  >>> import pypar
from the Python prompt.
(This may not work with MPICH under SunOS. This doesn't matter:
just skip this interactive bit and go to FIRST PARALLEL PROGRAM.)

Then you can write
  >>> pypar.size()
to get the number of parallel processors in play.
(When using pypar from the command line this number will be 1.)

Processors are numbered from 0 to pypar.size()-1.
Try
  >>> pypar.rank()
to get the processor number of the current processor.
(When using pypar from the command line this number will be 0.)

Finally, try
  >>> pypar.get_processor_name()
to get the host name of the current node.

FIRST PARALLEL PROGRAM
Take a look at the file demo.py supplied with this distribution.
To execute it in parallel on four processors, say, run

  mpirun -np 4 demo.py

from the UNIX command line.
This will start four copies of the program, each on its own processor.

The program demo.py creates a message on processor 0 and
sends it around in a ring,
each processor adding a bit to it, until it arrives back at
processor 0.

All parallel programs must find out what processor they are running on.
This is accomplished by the call

  myid = pypar.rank()

The total number of processors is obtained from

  proc = pypar.size()

One can then have different codes for different processors by branching
as in

  if myid == 0:
    ...

To send a general Python structure A to processor p, write
  pypar.send(A, p)
and to receive something from processor q, write
  X = pypar.receive(q)

This will cater for any picklable Python structure and makes parallel
programs very simple and readable. While reasonably fast, it does not
achieve the full bandwidth obtainable by MPI.
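
The ring described above can be sketched with nothing more than rank(),
size(), send() and receive(). This is an illustration only, not the
supplied demo.py: the helper names ring_neighbours and pass_around_ring
are made up for this example, and the pypar module is passed in as the
argument comm so that the logic can also be tried with a stand-in object
when no MPI installation is at hand.

```python
def ring_neighbours(myid, nproc):
    """Return (source, destination) ranks for one step around the ring."""
    source = (myid - 1) % nproc
    destination = (myid + 1) % nproc
    return source, destination


def pass_around_ring(comm):
    """Send a message once around the ring; processor 0 returns the final text.

    comm is assumed to be the imported pypar module. Only rank(), size(),
    send(x, destination) and receive(source) are used.
    """
    myid = comm.rank()
    nproc = comm.size()
    if nproc < 2:
        raise ValueError("the ring needs at least two processors")

    source, destination = ring_neighbours(myid, nproc)

    if myid == 0:
        # Processor 0 starts the message and waits for it to come back.
        comm.send("P0", destination)
        return comm.receive(source)

    # Every other processor receives, adds its bit and passes it on.
    msg = comm.receive(source)
    comm.send(msg + " -> P%d" % myid, destination)
    return None
```

In a script started with mpirun one would write "import pypar", call
pass_around_ring(pypar) on every processor, and finish with
pypar.finalize().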


Really low latency communication can be achieved by sticking
to Numeric arrays and specifying receive buffers whenever possible.

To send a Numeric array A to processor p, write
  pypar.send(A, p, use_buffer=True)
and to receive the array from processor q, write
  X = pypar.receive(q, buffer=X)
Note that X acts as a buffer and must be pre-allocated prior to the
receive statement, as in Fortran and C programs using MPI.

These forms have superseded the raw forms present in pypar
prior to version 1.9. The raw forms have been recast in terms of the
above and have been retained for backwards compatibility.
See the script pytiming for an example of communication of Numeric arrays.


When you are ready to move on, have a look at the supplied
demos and the script testpypar.py, which all give examples
of parallel programming with pypar.
You can also look at a standard MPI reference (C or Fortran)
to get some ideas. The principles are the same in pypar except
that many calls have been simplified.


REFERENCE
Here is a list of functions provided by pypar:
(See section on Data types for explanation of 'vanilla').

Identification:
---------------

  size()                -- Number of processors
  rank()                -- Id of current processor
  get_processor_name()  -- Return host name of current node

Basic send forms:
-----------------
  send(x, destination)
    Sends data x of any type to destination with default tag.

  send(x, destination, tag=t)
    Sends data x of any type to destination with tag t.

  send(x, destination, use_buffer=True)
    Sends data x of any type to destination,
    assuming that the recipient will specify a suitable buffer.

  send(x, destination, bypass=True)
    Sends a Numeric array of any type to the recipient, assuming
    that a suitable buffer has been specified and that the
    recipient also specifies bypass=True.


Basic receive forms:
--------------------
  y = receive(source)
    Receives data y of any type from source with default tag.

  y = receive(source, tag=t)
    Receives data y of any type from source with tag t.

  y, status = receive(source, return_status=True)
    Receives data y and a status object from source.

  y = receive(source, buffer=x)
    Receives data y from source and puts
    it in x (which must be of compatible size and type).
    It also returns a reference to x.
    (Although it will accept all types, this form is intended
    mainly for Numeric arrays.)

Collective Communication:
-------------------------

  broadcast(x, root):
    Broadcasts x from root to all other processors.
    All processors must issue the same broadcast call.

  gather(x, root):
    Gathers all elements in x to a buffer of
    size len(x) * numprocs
    created by this function.
    If x is multidimensional, the buffer will have
    the size of the zeroth dimension multiplied by numprocs.
    A reference to the created buffer is returned.

  gather(x, root, buffer=y):
    Gathers all elements in x to the specified buffer y
    at root.
    The buffer must have size len(x) * numprocs and
    shape[0] == x.shape[0]*numprocs.
    A reference to the buffer y is returned.

  scatter(x, root):
    Scatters all elements in x from root to all other processors
    in a buffer created by this function.
    A reference to the created buffer is returned.

  scatter(x, root, buffer=y):
    Scatters all elements in x from root to all other processors
    using the specified buffer y.
    A reference to the buffer y is returned.

  reduce(x, op, root):
    Reduces all elements in x at root,
    applying operation op elementwise, and returns the result in
    a buffer created by this function.
    A reference to the created buffer is returned.

  reduce(x, op, root, buffer=y):
    Reduces all elements in x to the specified buffer y
    (of the same size as x)
    at root, applying operation op elementwise.
    A reference to the buffer y is returned.
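
To make the buffer sizes above concrete, here is a serial model of what
gather, scatter and reduce compute, written with plain Python lists. It
does not call pypar at all. The names gather_model, scatter_model and
reduce_model are invented for this sketch, the equal-sized consecutive
split in scatter_model is an assumption about the layout, and op is an
ordinary two-argument Python function standing in for whatever operation
value pypar's reduce expects.

```python
import operator


def gather_model(pieces):
    """Concatenate one piece per processor, in rank order."""
    buffer = []
    for piece in pieces:
        buffer.extend(piece)
    return buffer


def scatter_model(x, numprocs):
    """Split x into numprocs equal consecutive pieces (assumed layout)."""
    if len(x) % numprocs != 0:
        raise ValueError("len(x) must be a multiple of numprocs")
    n = len(x) // numprocs
    return [x[p * n:(p + 1) * n] for p in range(numprocs)]


def reduce_model(pieces, op):
    """Combine the pieces elementwise with the binary function op."""
    result = list(pieces[0])
    for piece in pieces[1:]:
        result = [op(a, b) for a, b in zip(result, piece)]
    return result
```

With three processors each holding two elements, gather_model returns a
list of 2 * 3 = 6 elements, scatter_model undoes that split, and
reduce_model(pieces, operator.add) returns a list of the same size as one
piece.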


Other functions:
----------------

  time()        -- MPI wall time
  barrier()     -- Synchronisation point. Makes processors wait until all
                   processors have reached this point.
  abort()       -- Terminate all processes.
  finalize()    -- Clean up MPI. No parallelism can take place after this point.
  initialized() -- True if MPI has been initialised


See pypar.py for doc strings on individual functions.
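
As an illustration of how time() and barrier() fit together, the sketch
below times a piece of work so that every processor measures the same
interval. The helper name timed_section is hypothetical, and comm is
assumed to be the imported pypar module (only barrier() and time() are
called).

```python
def timed_section(comm, work):
    """Run work() between two barriers and return (result, elapsed seconds)."""
    comm.barrier()              # start everyone together
    t0 = comm.time()
    result = work()
    comm.barrier()              # wait for the slowest processor
    elapsed = comm.time() - t0
    return result, elapsed
```

The second barrier means the elapsed time reflects the slowest processor,
which is usually the figure of interest in a parallel run.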


DATA TYPES
Pypar automatically handles different data types differently.
There are three protocols:
  'array':   Numeric arrays of type Int ('i', 'l'), Float ('f', 'd'),
             or Complex ('F', 'D') can be communicated
             with the underlying mpiext.send_array and mpiext.receive_array.
             This is the fastest mode.
             Note that even though the underlying C implementation does not
             support Complex as a native datatype, pypar handles them
             efficiently and seamlessly by transmitting them as arrays of
             floats of twice the size.
  'string':  Text strings can be communicated with mpiext.send_string and
             mpiext.receive_string.
  'vanilla': All other types can be communicated using the scripts
             send_vanilla and receive_vanilla provided that the objects
             can be serialised using
             pickle (or cPickle). The latter mode is less efficient than the
             first two but it can handle general structures.
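
The idea of sending complex data as floats of twice the size can be
illustrated in a few lines. This is a sketch of the principle, not
pypar's own code: the function names are invented, the standard library
array type stands in for a Numeric array, and the interleaved
real/imaginary layout is an assumption.

```python
from array import array


def complex_to_floats(values):
    """Flatten complex numbers into [re0, im0, re1, im1, ...]."""
    flat = array('d')
    for z in values:
        flat.append(z.real)
        flat.append(z.imag)
    return flat


def floats_to_complex(flat):
    """Rebuild the complex numbers from an interleaved float buffer."""
    if len(flat) % 2 != 0:
        raise ValueError("buffer must hold an even number of floats")
    return [complex(flat[i], flat[i + 1]) for i in range(0, len(flat), 2)]
```

A sequence of n complex numbers becomes a float buffer of length 2 * n,
and floats_to_complex recovers the original values on the receiving side.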

Rules:
  If keyword argument vanilla == 1, vanilla is chosen regardless of
  x's type.
  Otherwise, if x is a string, the 'string' protocol is chosen.
  If x is an array, the 'array' protocol is chosen provided that x has one
  of the admissible typecodes.
  In all other cases the 'vanilla' protocol is used.

Functions that take vanilla as a keyword argument can force vanilla mode
on any datatype.
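
The selection rules can be written down as a small function. This is a
sketch of the rules as stated here, not a copy of pypar's internals: the
names choose_protocol, typecode_of and ADMISSIBLE_TYPECODES are invented,
and an array is recognised simply by having a typecode (a method on
Numeric arrays, an attribute on the standard library array type used here
as a stand-in).

```python
from array import array

# Typecodes listed for the 'array' protocol.
ADMISSIBLE_TYPECODES = ('i', 'l', 'f', 'd', 'F', 'D')


def typecode_of(x):
    """Return x's typecode, or None if x does not look like an array."""
    code = getattr(x, 'typecode', None)
    if callable(code):          # Numeric arrays expose typecode() as a method
        code = code()
    return code


def choose_protocol(x, vanilla=False):
    """Return 'array', 'string' or 'vanilla' following the rules above."""
    if vanilla:
        return 'vanilla'        # forced, regardless of x's type
    if isinstance(x, str):
        return 'string'
    if typecode_of(x) in ADMISSIBLE_TYPECODES:
        return 'array'
    return 'vanilla'            # everything else is pickled
```

For example, choose_protocol(array('d', [1.0, 2.0])) gives 'array', while
a dictionary or an array with a typecode outside the list falls through to
'vanilla'.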

A status object can optionally be returned from receive and raw_receive by
specifying return_status=True in the call.
The status object can subsequently be queried for information about the communication.
The fields are:
  status.source:  The origin of the received message (use e.g. with pypar.any_source)
  status.tag:     The tag of the received message (use e.g. with pypar.any_tag)
  status.error:   Error code generated by underlying C function
  status.length:  Number of elements received
  status.size:    Size (in bytes) of one element
  status.bytes(): Number of payload bytes received (excl. control info)

The status object is essential when used together with any_source or any_tag.
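
A typical use of the status object is a master processor that accepts
results from its workers in whatever order they finish. The sketch below
assumes that comm is the imported pypar module; the helper name
collect_results is hypothetical. It relies only on receive with
return_status=True, on any_source and on status.source.

```python
def collect_results(comm, nworkers):
    """Receive one message from each worker, in whatever order they arrive.

    Returns a dictionary mapping the sending processor's rank to its message.
    """
    results = {}
    for _ in range(nworkers):
        y, status = comm.receive(comm.any_source, return_status=True)
        # status.source tells us which processor actually sent this message.
        results[status.source] = y
    return results
```

On processor 0 one might call collect_results(pypar, pypar.size() - 1)
while every other processor sends its result with pypar.send(result, 0).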


EXTENSIONS
At this stage only a subset of MPI is implemented. However,
most real-world MPI programs use only these simple functions.
If you find that you need other MPI calls, please feel free to
add them to the C-extension. Alternatively, drop me a note and I'll
use that as an excuse to update pypar.


If you are passing simple Numeric arrays around you can reduce
the communication time by using the 'buffer' keyword arguments
(see REFERENCE above). These versions are closer to the underlying MPI
implementation in that one must provide receive buffers of the right size.
However, you will find that these versions have lower latency and
can be somewhat faster as they bypass
pypar's mechanism for automatically transferring the needed buffer size.
Also, using simple Numeric arrays will bypass pypar's pickling of general
structures.

This will mainly be of interest if you are using 'fine grained'
parallelism, i.e. if you have frequent communications.

Try both versions and see for yourself if there is any noticeable difference.
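
One way to compare the two versions is a simple ping-pong between two
processors. The sketch below is only a starting point (the supplied
pytiming script is the author's own benchmark): the helper name pingpong
is hypothetical, and comm is assumed to be the imported pypar module.
Passing a pre-allocated buffer selects the send(..., use_buffer=True) and
receive(..., buffer=...) forms; leaving it out selects the plain forms.

```python
def pingpong(comm, x, partner, repeats=10, buffer=None):
    """Return the average round-trip time between this processor and partner.

    With buffer=None the plain forms are used; otherwise x is sent with
    use_buffer=True and received into the pre-allocated buffer.
    The processor with the lower rank sends first.
    """
    use_buffer = buffer is not None
    first = comm.rank() < partner

    def send():
        if use_buffer:
            comm.send(x, partner, use_buffer=True)
        else:
            comm.send(x, partner)

    def receive():
        if use_buffer:
            return comm.receive(partner, buffer=buffer)
        return comm.receive(partner)

    t0 = comm.time()
    for _ in range(repeats):
        if first:
            send()
            receive()
        else:
            receive()
            send()
    return (comm.time() - t0) / repeats
```

Both processors must make the matching call: processor 0 with partner 1
and processor 1 with partner 0. Running it once without a buffer and once
with a buffer of the same size and type as x gives the two timings to
compare.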


PROBLEMS
If you encounter any problems with this package please drop me a line.
I cannot guarantee that I can help you, but I will try.

Ole Nielsen
Australian National University
Email: Ole.Nielsen@anu.edu.au


REFERENCES
To learn more about MPI in general, see the web sites
  http://www-unix.mcs.anl.gov/mpi/
  http://www.redbooks.ibm.com/pubs/pdfs/redbooks/sg245380.pdf

or the books
  "Using MPI, 2nd Edition", by Gropp, Lusk, and Skjellum
  "The LAM companion to 'Using MPI...'", by Zdzislaw Meglicki
  "Parallel Programming With MPI", by Peter S. Pacheco
  "RS/6000 SP: Practical MPI Programming", by Yukiya Aoyama and Jun Nakano

To learn more about Python, see the web site
  http://www.python.org