Opened 17 years ago
Last modified 13 years ago
#272 new enhancement
Write blocking csv2pts script
| Reported by: | ole | Owned by: | nariman |
|---|---|---|---|
| Priority: | low | Milestone: | |
| Component: | Efficiency and optimisation | Version: | |
| Severity: | normal | Keywords: | |
| Cc: | | | |
Description
Ted Rigby has shown that ANUGA performs poorly when converting 18 million points from a csv file to the pts netcdf format. Currently we do this by creating a Geospatial data object from the csv file and then exporting it to the pts format. The problem is that all data is held in memory at once, which causes thrashing for large datasets. This problem is holding Ted back from running a validation against the Macquarie Rivulet catchment.
One solution would be to make use of blocking, so that the csv file isn't read until the export code is run - and then only in blocks.
Another, and perhaps simpler, approach would be to write a conversion script, csv2pts, which does the conversion without ever having to store all data in memory.
Change History (7)
comment:1 Changed 17 years ago by
comment:2 Changed 17 years ago by
Owner: changed from duncan to ole
comment:3 Changed 17 years ago by
Status: new → assigned
comment:4 Changed 17 years ago by
Consider changing export_points_file/_write_pts_file in geospatial so that, if a csv/txt file is loaded using blocking, calling export_points_file also writes it using blocking.
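The suggestion above could look roughly like this: if the file was loaded in blocking mode, the object keeps a block source rather than an in-memory array, and export streams each block straight to the writer. Class and method names below mirror the ticket's wording but are a hypothetical sketch, not the actual Geospatial data implementation:

```python
class BlockingGeospatial:
    """Illustrative stand-in for a Geospatial data object loaded
    with blocking: it holds a callable producing block iterators
    instead of the full point array."""

    def __init__(self, block_source):
        # block_source: zero-argument callable returning an
        # iterator over blocks (lists of point tuples)
        self._block_source = block_source

    def export_points_file(self, write_block):
        """Stream every block to write_block (e.g. a pts netcdf
        appender); nothing accumulates in memory. Returns the
        total number of points written."""
        total = 0
        for block in self._block_source():
            write_block(block)
            total += len(block)
        return total
```

The key design point is that load and export share the same block size, so memory use is bounded regardless of file size.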
comment:5 Changed 17 years ago by
Owner: changed from ole to duncan
Priority: high → normal
Status: assigned → new
comment:6 Changed 16 years ago by
Owner: changed from duncan to nariman
comment:7 Changed 13 years ago by
Priority: normal → low
Btw - Ted asked why a hardwired blocking value of 1e30 is passed into the reader within one of the methods in Geospatial data.