Changeset 123 for pypar/DOC


Timestamp: Jul 11, 2005, 3:20:03 PM
Author: ole
Message: Files for 64 bit machine + latest CVS version

File: 1 edited

Legend:
  Unmodified  (shown with no marker)
  Added       (marked with '+')
  Removed     (marked with '-')
  • pypar/DOC

    (diff from r85 to r123)

 PROGRAMMING FOR EFFICIENCY
-  For really fast communication one must stick to Numeric arrays and use
-  'raw' versions of send and receive, e.g.:
+  Really low latency communication can be achieved by sticking
+  to Numeric arrays and specifying receive buffers whenever possible.
+
   To send a Numeric array A to processor p, write
-    pypar.raw_send(A, p)
+    pypar.send(A, p, use_buffer=True)
   and to receive the array from processor q, write
-    X = pypar.raw_receive(X, q)
+    X = pypar.receive(q, buffer=X)
   Note that X acts as a buffer and must be pre-allocated prior to the
   receive statement as in Fortran and C programs using MPI.

+  These forms have superseded the raw forms present in pypar
+  prior to version 1.9. The raw forms have been recast in terms of the
+  above and have been retained for backwards compatibility.
   See the script pytiming for an example of communication of Numeric arrays.

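  For illustration, a minimal sketch (not from the changeset itself) of the
  buffered forms between two processors; run with at least two MPI processes
  (e.g. mpirun -np 2):

    import Numeric, pypar

    N = 1000
    if pypar.rank() == 0:
        A = Numeric.arange(N).astype('d')    # array of doubles to send
        pypar.send(A, 1, use_buffer=True)    # receiver will supply the buffer
    elif pypar.rank() == 1:
        X = Numeric.zeros(N, 'd')            # pre-allocated receive buffer
        X = pypar.receive(0, buffer=X)       # data is placed directly in X
    pypar.finalize()
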
     
   (See section on Data types for explanation of 'vanilla').

+  Identification:
+  ---------------
+
   size() -- Number of processors
   rank() -- Id of current processor
-  Get_processor_name() -- Return host name of current node
-
-  send(x, destination, tag=0, vanilla=0) -- Blocking send (all types)
-    Sends data in x to destination with given tag.
-
-  y=receive(source, tag=0) -- Blocking receive (all types)
-    receives data (y) from source (possible with specified tag).
-
-  y, status=receive(source, tag=0, return_status=True) -- Blocking receive (all types)
-    receives data (y) and status object from source (possible with specified tag).
-
-  raw_send(x, destination, tag=0, vanilla=0): -- Blocking send (Fastest)
-    Sends data in x to destination with given tag.
-    It differs from send in that the receiver MUST provide a buffer
-    to store the received data.
-    Although it will accept all types raw_send is thought to be used
-    mainly for Numeric arrays.
-
-  raw_receive(x, source, tag=0, vanilla=0):  -- Raw blocking receive (Fastest)
-    receives data from source (possible with specified tag) and puts
+  get_processor_name() -- Return host name of current node
+
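  As an illustration (a sketch, not from the changeset itself), the
  identification functions can be used for a minimal 'hello world':

    import pypar

    myid = pypar.rank()                 # id of this processor
    numprocs = pypar.size()             # total number of processors
    node = pypar.get_processor_name()   # host name of this node
    print 'I am processor %d of %d on node %s' % (myid, numprocs, node)
    pypar.finalize()
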
+  Basic send forms:
+  -----------------
+  send(x, destination)
+    Sends data x of any type to destination with default tag.
+
+  send(x, destination, tag=t)
+    Sends data x of any type to destination with tag t.
+
+  send(x, destination, use_buffer=True)
+    Sends data x of any type to destination
+    assuming that recipient will specify a suitable buffer.
+
+  send(x, destination, bypass=True)
+    Sends a Numeric array of any type to recipient assuming
+    that a suitable buffer has been specified and that
+    recipient also specifies bypass=True.
+
+
+  Basic receive forms:
+  --------------------
+  y=receive(source)
+    Receives data y of any type from source with default tag.
+
+  y=receive(source, tag=t)
+    Receives data y of any type from source with tag t.
+
+  y,status=receive(source, return_status=True)
+    Receives data y and a status object from source.
+
+  y=receive(source, buffer=x)
+    Receives data y from source and puts
     it in x (which must be of compatible size and type).
     It also returns a reference to x.
-    Although it will accept all types raw_send is thought to be used
-    mainly for Numeric arrays.
-
-  x, status = raw_receive(x, source, tag=0, vanilla=0, return_status=True):  -- Raw blocking receive (Fastest)
-    receives data and status object from source (possible with specified tag) and puts
-    it in x (which must be of compatible size and type).
-
-
-  bcast(X, rootid) -- Broadcasts X from rootid to all other processors.
-                      All processors must issue the same bcast.
-
-
-  raw_scatter(x, nums, buffer, source, vanilla=0):
-     Scatter the first nums elements in x to buffer
-     (of size given by nums) from source.
-
-
-  scatter(x, source, vanilla=0):
-     Scatter all elements in x to a buffer
-     created by this function and returned.
-
-
-  raw_gather(x, buffer, source, vanilla=0):
-     Gather all elements in x to buffer
+    (Although it will accept all types this form is thought to be used
+    mainly for Numeric arrays).
+
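  To illustrate the basic forms above, here is a sketch (not from the changeset
  itself); the status attributes shown (status.source, status.tag) are
  assumptions -- see pypar.py for the actual Status interface:

    import pypar

    if pypar.rank() == 0:
        msg = {'coords': [1.0, 2.0, 3.0], 'label': 'test'}   # any picklable object
        pypar.send(msg, 1, tag=7)
    elif pypar.rank() == 1:
        # status attribute names assumed; check pypar.py
        msg, status = pypar.receive(0, tag=7, return_status=True)
        print 'Received %s (tag %d) from processor %d' % (msg, status.tag, status.source)
    pypar.finalize()
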
+  Collective Communication:
+  -------------------------
+
+  broadcast(x, root):
+    Broadcasts x from root to all other processors.
+    All processors must issue the same broadcast.
+
+  gather(x, root):
+     Gather all elements in x to buffer of
+     size len(x) * numprocs
+     created by this function.
+     If x is multidimensional, the buffer will have
+     the size of the zeroth dimension multiplied by numprocs.
+     A reference to the created buffer is returned.
+
+  gather(x, root, buffer=y):
+     Gather all elements in x to specified buffer y
      from source.
      Buffer must have size len(x) * numprocs and
-     shape[0] == x.shape[0]*numprocs
-
-  gather(x, source, vanilla=0):
-     Gather all elements in x to buffer of
-     size len(x) * numprocs
-     created by this function and returned.
-     If x is multidimensional buffer will have
-     the size of zero'th dimension multiplied by numprocs
-
-
-  raw_reduce(x, buffer, op, source, vanilla=0):
-     Reduce all elements in x to buffer (of the same size as x)
+     shape[0] == x.shape[0]*numprocs.
+     A reference to the buffer y is returned.
+
+  scatter(x, root):
+     Scatter all elements in x from root to all other processors
+     in a buffer created by this function.
+     A reference to the created buffer is returned.
+
+  scatter(x, root, buffer=y):
+     Scatter all elements in x from root to all other processors
+     using specified buffer y.
+     A reference to the buffer y is returned.
+
+  reduce(x, op, root):
+     Reduce all elements in x at root
+     applying operation op elementwise and return result in
+     buffer created by this function.
+     A reference to the created buffer is returned.
+
+  reduce(x, op, root, buffer=y):
+     Reduce all elements in x to specified buffer y
+     (of the same size as x)
      at source applying operation op elementwise.
+     A reference to the buffer y is returned.


-  reduce(x, op, source, vanilla=0):
-     Reduce all elements in x at source
-     applying operation op elementwise and return result in new buffer.
-     Buffer is created and returned.
-
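  For illustration, a sketch (not from the changeset itself) combining gather
  and reduce; the name of the reduction operation constant (pypar.SUM) is an
  assumption -- check pypar.py for the constants actually exported:

    import Numeric, pypar

    myid = pypar.rank()

    # Each processor contributes a small array of doubles.
    x = Numeric.array([myid, 2.0*myid], 'd')

    # Collect everybody's x at processor 0 (length len(x)*numprocs).
    all_x = pypar.gather(x, 0)

    # Elementwise sum of x across all processors, result at processor 0.
    # pypar.SUM is assumed here; see pypar.py for the exact name.
    total = pypar.reduce(x, pypar.SUM, 0)

    if myid == 0:
        print 'gathered:', all_x
        print 'sum     :', total

    pypar.finalize()
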
+  Other functions:
+  ----------------

-  Wtime() -- MPI wall time
-  Barrier() -- Synchronisation point. Makes processors wait until all
+  time() -- MPI wall time
+  barrier() -- Synchronisation point. Makes processors wait until all
                processors have reached this point.
-  Abort() -- Terminate all processes.
-  Finalize() -- Cleanup MPI. No parallelism can take place after this point.
-
+  abort() -- Terminate all processes.
+  finalize() -- Cleanup MPI. No parallelism can take place after this point.
+  initialized() -- True if MPI has been initialised.
+
+
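  A small sketch (not from the changeset itself) of timing a section of code
  with barrier() and time():

    import pypar

    pypar.barrier()                  # make sure all processors start together
    t0 = pypar.time()

    # ... work to be timed goes here ...

    pypar.barrier()                  # wait for the slowest processor
    if pypar.rank() == 0:
        print 'Elapsed wall time: %.6f s' % (pypar.time() - t0)

    pypar.finalize()
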
   See pypar.py for doc strings on individual functions.


 DATA TYPES
   Pypar automatically handles different data types differently.
   There are three protocols:
-    'array': Numeric arrays of type 'i', 'l', 'f', or 'd' can be communicated
+    'array': Numeric arrays of type Int ('i', 'l'), Float ('f', 'd'),
+             or Complex ('F', 'D') can be communicated
              with the underlying mpiext.send_array and mpiext.receive_array.
              This is the fastest mode.
+             Note that even though the underlying C implementation does not
+             support Complex as a native datatype, pypar handles them
+             efficiently and seamlessly by transmitting them as arrays of
+             floats of twice the size.
     'string': Text strings can be communicated with mpiext.send_string and
               mpiext.receive_string.
     
                can be serialised using
                pickle (or cPickle). The latter mode is less efficient than the
-               first two but it can handle complex structures.
+               first two but it can handle general structures.

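  As an illustration (a sketch, not from the changeset itself) of the three
  protocols being selected automatically:

    import Numeric, pypar

    if pypar.rank() == 0:
        pypar.send(Numeric.zeros(100, 'F'), 1)     # 'array' protocol (Complex)
        pypar.send('hello', 1)                     # 'string' protocol
        pypar.send({'a': 1, 'b': [2, 3]}, 1)       # 'vanilla' protocol (pickled)
    elif pypar.rank() == 1:
        A = pypar.receive(0)
        s = pypar.receive(0)
        d = pypar.receive(0)
    pypar.finalize()
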
   Rules:
     
 PERFORMANCE
   If you are passing simple Numeric arrays around you can reduce
-  the communication time by using the '_raw' versions of send and
-  receive (see REFERENCE above). These version are closer to the underlying MPI
+  the communication time by using the 'buffer' keyword arguments
+  (see REFERENCE above). These versions are closer to the underlying MPI
   implementation in that one must provide receive buffers of the right size.
-  However, you will find that this can be somewhat faster as they bypass
+  However, you will find that these versions have lower latency and
+  can be somewhat faster as they bypass
   pypar's mechanism for automatically transferring the needed buffer size.
-  Also, using simple numeric arrays will bypass pypar's pickling of complex
+  Also, using simple Numeric arrays will bypass pypar's pickling of general
   structures.

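  As a rough way (a sketch, not from the changeset itself) to see the effect on
  your own installation, a simple ping-pong between two processors can be timed
  as below; the script pytiming does this properly:

    import Numeric, pypar

    N = 10000          # array length
    repeats = 100      # number of round trips
    A = Numeric.zeros(N, 'd')

    myid = pypar.rank()
    other = 1 - myid   # assumes exactly two processors

    pypar.barrier()
    t0 = pypar.time()
    for i in range(repeats):
        if myid == 0:
            pypar.send(A, other, use_buffer=True)
            A = pypar.receive(other, buffer=A)
        else:
            A = pypar.receive(other, buffer=A)
            pypar.send(A, other, use_buffer=True)

    if myid == 0:
        print 'Average round trip time: %.6f s' % ((pypar.time() - t0)/repeats)

    pypar.finalize()
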