Opened 17 years ago
Last modified 13 years ago
#272 new enhancement
Write blocking csv2pts script
| Reported by: | ole | Owned by: | nariman |
|---|---|---|---|
| Priority: | low | Milestone: | |
| Component: | Efficiency and optimisation | Version: | |
| Severity: | normal | Keywords: | |
| Cc: | | | |
Description
Ted Rigby has shown that ANUGA performs poorly when converting 18 million points in a csv file to the pts netcdf format. Currently, we do this by creating a Geospatial data object from the csv file and then exporting it to the pts format. The problem is that all the data is held in memory, which causes thrashing for large datasets. This problem is holding Ted back from running a validation against the Macquarie Rivulet catchment.
One solution would be to make use of blocking so that the csv file isn't read until the export code is run - and then only in blocks.
Another, and perhaps simpler, approach would be to write a conversion script, csv2pts, which does the conversion without ever having to store all data in memory.
Change History (7)
comment:1 Changed 17 years ago by ole
comment:2 Changed 17 years ago by ole
- Owner changed from duncan to ole
comment:3 Changed 17 years ago by ole
- Status changed from new to assigned
comment:4 Changed 17 years ago by duncan
Consider changing export_points_file/_write_pts_file in geospatial so that, if a csv/txt file is loaded using blocking, calling export_points_file also writes it using blocking.
comment:5 Changed 16 years ago by ole
- Owner changed from ole to duncan
- Priority changed from high to normal
- Status changed from assigned to new
comment:6 Changed 15 years ago by nariman
- Owner changed from duncan to nariman
comment:7 Changed 13 years ago by habili
- Priority changed from normal to low
Btw - Ted asked why a hardwired blocking value of 1e30 is passed into the reader within one of the methods in Geospatial data.