Changeset 1946


Ignore:
Timestamp:
Oct 17, 2005, 4:39:37 PM (18 years ago)
Author:
duncan
Message:

changes

File:
1 edited

Legend:

Unmodified
Added
Removed
  • inundation/wiki/least_squares_refactoring.txt

    r1918 r1946  
    66
    77 Software, called least squares, has been developed to implement a
    8 penalised least-squares fit and associated interpolations. It is in
    9 least_squares.py.  Steven Roberts outlined the least squares algorithm
    10 used. The code is currently not meeting user requirements with
    11 regards to speed and memory use.  Additional functionality is also needed.
     8penalised least-squares fit and associated interpolations. It is
     9currently in least_squares.py.  Steven Roberts outlined the least
     10squares algorithm used. The code is currently not meeting user
     11requirements with regards to speed and memory use.  Additional
     12functionality is also needed.
     13
    1214
    1315SCOPE
    14 This is looking at refactoring the Class Interpolation and the
    15 functions fit_to_mesh_file, fit_to_mesh and the command line use of
    16 least_squares.py.
     16
     17This is looking at redesigning the least_squares module, excluding
     18interpolate_function and pts2rectangle.  (Other modules will
     19be modified, based on these changes.)
     20
    1721
    1822LIFE-CYCLE
     
    3438INTRODUCTION
    3539
    36 This document specifies the requirements to implement least squares
    37 is refactored.
     40This document specifies the requirements of the refactoring of least squares.
    3841
    3942DEFINITIONS
     
    4952
    5053
    51 Requirements
     54Requirements - Fitting
    5255
    53 When interpolating, return information on which points were outside
    54 the mesh (in such a way that it can not be interporated as real data).
    55 If attribute information for points outside the mesh is returned, that can
    56 only be a float, let the user set the default value that is returned
    57 for points outside the mesh.
     56When in verbose mode display how many points are in the
     57mesh and how many are outside its boundary. Also display the max # of
     58points/ triangle and the minimum number of points.  If this operation
     59takes a lot of time, let it be optional.
    5860
    59 When fitting, when in verbose mode display how many triangles are in the
    60 mesh and how many are out. Also display the max # of points/ triangle
    61 and the minimum number of points.  If this operation takes alot of
    62 time, let it be optional.
    63 
    64 Warn the user when fitting to mesh if the mesh is outside the points boundary.
    65  (Note: by using alpha least squares can always return a result in this
    66  situation)
     61Raise an exception if no points are inside the mesh.
    6762
    6863Fitting 5 million points to a one million triangles mesh will not cause a
    6964memory error.
    7065
     66Remove least_squares commandline interface.
     67
     68Points do not have to be supplied all at once.  They can be supplied
     69as blocks. (This will improve memory use)
     70
     71Requirements - Interpolation
     72
     73Return information on which points were outside the mesh (in such a
     74way that it can not be interpreted as real data).  If attribute
     75information for points outside the mesh is returned, that can only be
     76a float, let the user set the default value that is returned for
     77points outside the mesh.
     78
    7179Reduce the time spent building matrix A. (not a testable requirement)
    7280
    73 (This isn't long, are there any other requirements?)
    74  
    75 Requirement Notes
    76 How should we handle points outside of the mesh when interpolating?
    77 Currently the interpolated values are set to 0.0. 
    78 Other options are, set to a given value.
    79 Raise an error.
    80 Raise/return a warning and set to a given value/0.0. (I like this one)
    8181
     82Points do not have to be supplied all at once.  They can be supplied
     83as blocks. (This will improve memory use)
    8284
    83 How should interpolate_sww respond? 
    84 warning/error/exit/replace with given value.
     85A target to aim for
     86Algorithmic complexity of build equations should be linear in number
     87of data points and logorithmic in the number of vertices.
     88
     89Constraint
     90Memory consumption of least squares equations must be indepenent on
     91the number of datapoints. (Except for the memory used to store the points.)
     92
     93How should interpolate_sww respond to a ? 
     94Warning and replace with given value.
    8595
    8696General Error Handling Guidelines
     
    115125*To save memory;
    116126When interpolating, don't build AtA, B and D. (This has been implemented)
    117 When fitting, don't build A. - Work with AtA and Atz.
     127When fitting, don't build A. - Work with AtA and Atz. Build Atz
     128directly, rather than through At*z - use code similar to AtA
    118129
    119130
     
    124135This is for the layer interpolation and fitting with IO: Use blocking
    125136when interpolating/fitting a large number of points (how large?  Can the
    126 program check the available memory and work out a good block size?)
     137program check the available memory and work out a good block size?
     138Maybe have it user specified.  Do have the option for user specified)
    127139
    128 The code has to be refactored so fitting can handling blocked point
     140The code has to be refactored so fitting can handle blocked point
    129141info.  Need a building phase and a solve phase. 
    130142
     
    137149Options (still thinking about this one)
    138150*  It should look in the cell of the point and then all adjacent
    139 cells. If it isn't there then ... we have problems.  My feeling is it
    140 has to be there.
    141 - Building the structure to keep this info could be prohibative.
     151cells. If it isn't there then ... expand more.
     152- Building the structure to keep this info could be prohibitive.
    142153
     154*A feature that could work for the current implementation is to keep a
     155 list of triangles that have been checked, so each incorrect triangle
     156 isn't checked three times, if all three verts are in the cell being checked.
    143157
    144158Another speed up option is to write the bottlenecks in C. Do it if necessary.
    145159
     160* To give warnings.
     161Use warnings module?  Check this out.
    146162
    147163DESIGN IDEAS, refactoring
     
    156172
    157173Have least squares as two stand alone packages.  One that has the guts
    158 of LS (e.g. the two new classes).  One that is has the IO API's
    159 (functions that load and write files).  The guts package will be
    160 dependent of the general mesh package.  The IO API will rely on an IO
    161 module.
     174of LS (e.g. the two new classes).  One that has the IO API's
     175(functions that load and write files and do the blocking).  The guts
     176package will be dependent of the general mesh package.  The IO API
     177will rely on an IO module.
    162178
    163179HOW WILL IMPLEMENTATION OCCUR?
     
    165181and paste from the old least squares. Implement memory and time tests
    166182on the old least squares (and new LS) before doing any changes. 
     183
     184For the speed up, discuss/ get go ahead from Ole before implementing
     185any major changes, since there is no obvious way forward.
Note: See TracChangeset for help on using the changeset viewer.