
Version 13 (modified by rwilson, 16 years ago)

Internal Documentation/Classes


Class Geospatial_data

Defined in file geospatial_data/geospatial_data.py.

Discussion

Class Attributes

data_points The points data in the object.
geo_reference Current origin the points data is relative to.
attributes Dictionary of key:value attributes for the point data. Each value is a list/array of attribute values of the same dimensionality as the point data.
file_name Name of .pts or .txt file that the points data was read from or is being read from (blocking).
max_read_lines Maximum number of text lines that are read each step during blocking reads.
default_attribute_name The name of the default attribute if attribute name not specified.
verbose True if instance methods are to be verbose. Instance methods shouldn't take a verbose parameter - it should come from the attribute.
blocking_georef ??
blocking_keys ??
number_of_points ??
fid ??
start_row ??
last_row ??
show_verbose ??
verbose_block_size ??
block_number ??
number_of_blocks ??
header ??
file_pointer ??

Class Methods

__init__(data_points=None, attributes=None, geo_reference=None, default_attribute=None, file_name=None, latitudes=None, longitudes=None, points_are_lats_longs=False, max_read_lines=None, load_file_now=True, verbose=False) data_points is either a 2-dimensional (Mx2) set of points or a filename (ending in .pts, .csv or .txt) of a file containing the point data. attributes contains the attribute value to be stored at each data point and must be either a list or vector array (length M, assumed attribute 'elevation') or a dictionary with multiple attribute name keys and value lists. geo_reference is the Geo_reference object defining the origin of the points in this object. default_attribute_name is the name of the attribute considered to be default in the get_attributes() method. file_name is the name of a NetCDF (or .txt) file to load points data from. latitudes and longitudes are lists/vectors defining points as lats/longs rather than through the data_points parameter. points_are_lats_longs is a boolean which is True if the points data are actually lats/longs and not UTM coordinates. max_read_lines sets the number of lines to read in a 'block' if blocking data from a file. load_file_now is True if the file data is to be automatically loaded into memory (used if blocking). verbose is True if this class is to be verbose.
__len__() Returns the length of the data_points attribute.
__repr__() Returns a string representation of get_data_points(absolute=True).
check_data_points(data_points) Populate the data_points attribute dataset.
set_attributes(attributes) Assign attributes data to this instance. attributes is either list/array of attributes values (default name elevation) or a dictionary of attribute name:values pairs.
set_geo_reference() Set or change the Geo_reference object for this instance. If changing, ensure that absolute coordinate values are unchanged.
set_default_attribute(default_attribute_name) Set the instance default attribute name (used in get_attributes()).
set_verbose(verbose=False) Set the verbose attribute for this instance.
clip(polygon, closed=True, verbose=False) Clip the instance points set by a given polygon. polygon is either a list or array of points (Nx2) or a Geospatial_data instance. closed is True if any points on the bounding polygon are to be considered inside the polygon. verbose - not sure why this is here - class has a verbose attribute. Method returns the point set inside the clipping polygon.
clip_outside(polygon, closed=True, verbose=False) Clip the instance points set by a given polygon. polygon is either a list or array of points (Nx2) or a Geospatial_data instance. closed is True if any points on the bounding polygon are to be considered inside the polygon. verbose - not sure why this is here - class has a verbose attribute. Method returns the point set outside the clipping polygon.
get_geo_reference() Return the Geo_reference reference assigned to this instance.
get_data_points(absolute=True, geo_reference=None, as_lat_long=False, isSouthHemisphere=True) Get coordinates of the points dataset in this instance. If absolute is True return absolute UTM coordinates, else return coordinates relative to the associated Geo_reference origin. If geo_reference is supplied return coordinates relative to that origin. If as_lat_long is True return coordinates as lat/long points. If isSouthHemisphere is True return any lat/long coordinates as for the southern hemisphere, else the northern.
get_attributes(attribute_name=None) Return the attribute values for one attribute. If attribute_name is None return the values for the default attribute.
get_all_attributes() Return all attributes as a (possibly empty) dictionary or None (if no attributes).
__add__(other) Method to allow addition of two Geospatial_data instances. In this case, 'add' means concatenate the point datasets and attributes. Throws an exception if the attributes aren't the same in both instances. Returns a new Geospatial_data instance whose points are absolute.
__radd__(other) Reflected form of __add__(), used when this instance is the right-hand operand. 'Add' again means concatenate the point datasets and attributes. Throws an exception if the attributes aren't the same in both instances. Returns a new Geospatial_data instance whose points are absolute.
import_points_file(file_name, delimiter=None, verbose=False) Import a .txt, .csv or .pts file into this instance from file_name. Also reads attributes and geo_reference data from the input file. delimiter is unused. verbose is superfluous.
export_points_file(file_name, absolute=True, as_lat_long=False, isSouthHemisphere=True) Write Geospatial_data instance data to a .pts, .csv or .txt file. file_name is the path to the file to write. If absolute is True then the point coordinates in the instance are converted to absolute form and then written out. If absolute is False the instance's point coordinates are left unchanged and written out (note, they may be relative or absolute). If as_lat_long is True then points are written out as lat/long coordinates, southern hemisphere form if isSouthHemisphere is True, northern if False.
get_sample(indices) Get a new Geospatial_data instance containing a sample of points in this instance. indices contains the ordinals of the points to be placed in the new instance. Attributes of the sampled points are copied to the new instance.
split(factor=0.5, seed_num=None, verbose=False) Split this instance into two new instances containing a random selection of points. factor is the fraction of the input instance to be copied into the first split instance. seed_num is the random seed to use (testing only). If verbose is True then this method is to be verbose.
__iter__() Allow iteration of the instance over input blocks. This function sets up state ready for the next() method. This means that blocking_georef, blocking_keys and number_of_points are read from the file if it is a .pts file, or the header and file_pointer are read from the assumed .csv file. In the .pts case we also initialize variables used for PTS blocking.
next() Read another block of data into memory. Return an instance of a new Geospatial_data object containing the data block read.
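
The coordinate and addition semantics described above can be illustrated with a minimal sketch. This is not the ANUGA implementation: PointSet, its origin tuple (standing in for a Geo_reference) and its attribute layout are hypothetical names used only for illustration. The sketch assumes that relative points are offsets from an (easting, northing) origin and that adding two sets yields absolute points with a zero origin.

```python
class PointSet:
    """Hypothetical stand-in for Geospatial_data (illustration only)."""

    def __init__(self, points, attributes=None, origin=(0.0, 0.0)):
        self.points = [tuple(p) for p in points]   # stored relative to origin
        self.attributes = dict(attributes or {})   # name -> one value per point
        self.origin = origin                       # assumed (easting, northing)

    def get_data_points(self, absolute=True):
        """Return points relative to the origin, or shifted to absolute."""
        if not absolute:
            return list(self.points)
        ox, oy = self.origin
        return [(x + ox, y + oy) for x, y in self.points]

    def __add__(self, other):
        """Concatenate points and attributes; the result is absolute."""
        if set(self.attributes) != set(other.attributes):
            raise ValueError("attribute names differ between instances")
        points = (self.get_data_points(absolute=True)
                  + other.get_data_points(absolute=True))
        attributes = {name: list(self.attributes[name]) + list(other.attributes[name])
                      for name in self.attributes}
        return PointSet(points, attributes, origin=(0.0, 0.0))


a = PointSet([(1.0, 2.0)], {"elevation": [10.0]}, origin=(100.0, 200.0))
b = PointSet([(5.0, 6.0)], {"elevation": [20.0]})
combined = a + b
```

Here combined holds both points in absolute form, so the origin of a is no longer needed to interpret them.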

Module Methods

_set_using_lat_long(latitudes, longitudes, geo_reference, data_points, points_are_lats_longs) Set the points data with lat/long values. If geo_reference is supplied will always throw exception?
_read_pts_file(file_name, verbose=False) Read a NetCDF .pts file and return (dict_points, dict_attributes, geo_reference).
_read_csv_file(file_name, verbose=False) Read a .csv file and return (dict_points, dict_attributes, geo_reference).
_read_csv_file_header(file_pointer, delimiter=CSV_DELIMITER, verbose=False) Read the header from an already open CSV text file. Returns the cleaned header and file_pointer (the open file handle).
_read_csv_file_blocking(file_pointer, header, delimiter=CSV_DELIMITER, max_read_lines=MAX_READ_LINES, verbose=False) Read a .csv file with blocking semantics - will raise StopIteration if no more data. Otherwise return (points, dict_attributes, geo_ref, file_pointer).
_read_pts_file_header(fid, verbose=False) Read .pts file header information from open file fid. Returns (geo_ref, attribute_keys, number_of_points).
_read_pts_file_blocking(fid, start_row, fin_row, keys) Read the body of a .pts file, with blocking semantics. Returns (point_list, attributes) where point_list is the points subset defined by [start_row:fin_row] and attributes is the attributes from the same subset.
_write_pts_file(file_name, write_data_points, write_attributes=None, write_geo_reference=None) Write a .pts data file to a file file_name. write_data_points holds the points data to write. write_attributes contains the attributes to write. write_geo_reference is the Geo_reference data to write.
_write_csv_file(file_name, write_data_points, write_attributes=None, as_lat_long=False, delimiter=',') Write point and attributes data to a .csv file file_name. If as_lat_long is True write coordinates out as lat/long.
_write_urs_file(file_name, points, delimiter=' ') Write points data to a URS file file_name. points data is in lat/long form.
_point_atts2array(point_atts) Convert a dictionary of points data into a dictionary of num.array data. Input dictionary has keys 'pointlist' of points data and key 'attributelist' which is a dictionary of attributes data. Changes point_atts in-place.
geospatial_data2points_dictionary(geospatial_data) Convert a geospatial object to a dictionary of points. Dictionary is left with keys 'pointlist' which contains a list/array of points, 'attributelist' which is a dictionary of key:value pairs and 'geo_reference' which is the geo_reference object.
points_dictionary2geospatial_data(geospatial_data) Convert a dictionary of points to a Geospatial_data object. Inverse of geospatial_data2points_dictionary() above.
ensure_absolute(points, geo_reference=None) Ensure that a set of points are in absolute coordinates. If geo_reference is specified, assume the points are relative to it.
ensure_geospatial(points, geo_reference=None) Convert a set of points to a Geospatial_data instance. geo_reference may be either a Geo_reference object or a 3-tuple of (zone, easting, northing).
find_optimal_smoothing_parameter(data_file, alpha_list=None, mesh_file=None, boundary_poly=None, mesh_resolution=100000, north_boundary=None, south_boundary=None, east_boundary=None, west_boundary=None, plot_name='all_alphas', split_factor=0.1, seed_num=False, cache=False, verbose=False) Using a sample of points from data_file and a selection of alpha values from alpha_list, returns the alpha value that has the smallest covariance between the predicted values and the removed values.
find_optimal_smoothing_parameter() Obsolete version of the above function.
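
The blocking CSV semantics listed above (read the header once, then read up to max_read_lines rows per call, raising StopIteration when the data runs out) can be sketched as follows. The function names read_header and read_block are hypothetical, and the column layout (x, y, then attribute columns) is an assumption made for the example; the real module functions also handle a geo_reference, which is omitted here. The sketch is written for Python 3.

```python
import io


def read_header(file_pointer, delimiter=","):
    """Read and clean the first line of an already open CSV file."""
    header = [name.strip() for name in file_pointer.readline().split(delimiter)]
    return header, file_pointer


def read_block(file_pointer, header, delimiter=",", max_read_lines=2):
    """Read up to max_read_lines rows; raise StopIteration when exhausted."""
    points = []
    attributes = {name: [] for name in header[2:]}
    for _ in range(max_read_lines):
        line = file_pointer.readline()
        if not line.strip():
            break  # end of file; this sketch also ends a block at a blank line
        values = [float(v) for v in line.split(delimiter)]
        points.append((values[0], values[1]))
        for name, value in zip(header[2:], values[2:]):
            attributes[name].append(value)
    if not points:
        raise StopIteration
    return points, attributes, file_pointer


# Three data rows read in blocks of at most two rows each.
text = io.StringIO("x, y, elevation\n0, 0, 1.5\n1, 0, 2.5\n1, 1, 3.5\n")
header, fp = read_header(text)
blocks = []
while True:
    try:
        points, attributes, fp = read_block(fp, header, max_read_lines=2)
    except StopIteration:
        break
    blocks.append((points, attributes))
```

The caller sees two blocks here, the second holding the single remaining row.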

Notes

This class is defined in the 'old' way. Will need to be changed for Python 2.6/3.x.

Way too overloaded in the constructor. For instance, data_points are either a 2-dimensional array of points or a filename of input data. Yet there is another file_name parameter!? Maybe better not to fill with data in the constructor but call one of a selection of 'fill_with_data' routines later.
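
One possible shape for the 'fill_with_data' idea is sketched below. The class Points and the routines fill_from_points and fill_from_csv are hypothetical names, not ANUGA code, and the CSV layout (x, y, then attribute columns) is an assumption. The point is only that the constructor takes no data, so each data source gets its own unambiguous entry point.

```python
import io


class Points:
    """Hypothetical class (not ANUGA code) with a data-free constructor."""

    def __init__(self, verbose=False):
        self.data_points = []
        self.attributes = {}
        self.verbose = verbose

    def fill_from_points(self, data_points, attributes=None):
        """Fill from an in-memory sequence of (x, y) pairs."""
        self.data_points = [tuple(p) for p in data_points]
        self.attributes = dict(attributes or {})
        return self

    def fill_from_csv(self, file_like):
        """Fill from an open CSV source; assumes columns x, y, attributes."""
        header = [name.strip() for name in file_like.readline().split(",")]
        rows = [[float(v) for v in line.split(",")]
                for line in file_like if line.strip()]
        attributes = {name: [row[i + 2] for row in rows]
                      for i, name in enumerate(header[2:])}
        return self.fill_from_points([row[:2] for row in rows], attributes)


from_memory = Points().fill_from_points([(0.0, 0.0), (1.0, 0.0)])
from_file = Points().fill_from_csv(io.StringIO("x,y,elevation\n0,0,1.5\n1,0,2.5\n"))
```

With this layout there is no need for both a data_points-or-filename parameter and a separate file_name parameter.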

Not sure that building blocking into this class is the correct approach. Probably better to override file semantics to provide blocking and then get this class to contain a reference to a file-like object to get data from. See above point.

Overloaded way too much in some methods. For instance, get_data_points() should be broken up into a simple method and attributes that control UTM/lat/long coordinates, southern/northern hemisphere, etc.

Return values overloaded too much. For instance, get_all_attributes() should always return an empty dictionary if no attributes, not "either an empty dictionary or None".
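
A minimal sketch of the suggested single return type follows. AttributeHolder is a hypothetical class used only to show the idea; internally the attributes may still be None, but the accessor normalises that to an empty dictionary.

```python
class AttributeHolder:
    """Hypothetical class showing an accessor with one return type."""

    def __init__(self, attributes=None):
        self._attributes = attributes  # may be None internally

    def get_all_attributes(self):
        # Always return a dictionary so callers never need a None check.
        return dict(self._attributes) if self._attributes else {}


empty = AttributeHolder()
filled = AttributeHolder({"elevation": [1.0, 2.0]})
```

Callers can then iterate over the result or test its length without first checking for None.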

