Opened 15 years ago

Last modified 11 years ago

#272 new enhancement

Write blocking csv2pts script

Reported by: ole Owned by: nariman
Priority: low Milestone:
Component: Efficiency and optimisation Version:
Severity: normal Keywords:
Cc:

Description

Ted Rigby has shown that ANUGA performs poorly when converting 18 million points in a csv file to the pts netcdf format. Currently, we do this by creating a Geospatial data object from the csv file and then exporting it to the pts format. The problem is that all data is held in memory at once, which causes thrashing for large datasets. This is holding Ted back from running a validation against the Macquarie Rivulet catchment.

One solution would be to make use of blocking so that the csv file isn't read until the export code is run - and then only in blocks.

Another, and perhaps simpler, approach would be to write a conversion script, csv2pts, which does the conversion without ever having to store all data in memory.
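A minimal sketch of the conversion-script approach, assuming a csv file with a header row and purely numeric columns. The names iter_csv_blocks, csv2pts_blocked and write_block are hypothetical and not part of ANUGA; write_block stands in for whatever appends one block of points to the pts (netcdf) file. The point is only that a single block of rows is in memory at any time.

```python
import csv
from itertools import islice


def iter_csv_blocks(path, block_size=100000):
    """Yield (header, rows) pairs, each holding at most block_size rows.

    Rows are converted to lists of floats; blank lines are skipped.
    Only one block is kept in memory at a time.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        first = next(reader, None)
        if first is None:  # empty file: nothing to yield
            return
        header = [name.strip() for name in first]
        data = (row for row in reader if row)  # lazily drop blank lines
        while True:
            rows = [[float(v) for v in row] for row in islice(data, block_size)]
            if not rows:
                break
            yield header, rows


def csv2pts_blocked(csv_path, write_block, block_size=100000):
    """Stream csv_path to write_block(header, rows); return the point count.

    write_block is a hypothetical callback that appends one block of
    points to the output file.
    """
    total = 0
    for header, rows in iter_csv_blocks(csv_path, block_size):
        write_block(header, rows)
        total += len(rows)
    return total
```

With a netcdf writer that appends along an unlimited dimension as the callback, peak memory would be set by block_size rather than by the size of the csv file.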

Change History (7)

comment:1 Changed 15 years ago by ole

Btw - Ted asked why a hardwired blocking value of 1e30 is passed into the reader within one of the methods in Geospatial data.

comment:2 Changed 15 years ago by ole

  • Owner changed from duncan to ole

comment:3 Changed 15 years ago by ole

  • Status changed from new to assigned

comment:4 Changed 15 years ago by duncan

Consider changing export_points_file/_write_pts_file in geospatial so that, if a csv/txt file is loaded using blocking, calling export_points_file also writes it using blocking.
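A rough sketch of that idea, not ANUGA's actual classes: LazyPoints, block_size and write_block are hypothetical names. The object only remembers the file and the block size at load time, and export pulls rows through the same block iterator, so a blocked load leads to a blocked write. Values are left as strings to keep the sketch short.

```python
import csv
from itertools import islice


class LazyPoints:
    """Hypothetical stand-in for a point set loaded from a csv/txt file."""

    def __init__(self, path, block_size=None):
        self.path = path
        self.block_size = block_size  # None means "no blocking"

    def blocks(self):
        """Yield lists of rows, at most block_size rows each."""
        with open(self.path, newline="") as f:
            rows = (r for r in csv.reader(f) if r)  # skip blank lines
            next(rows, None)  # skip the header row
            while True:
                # islice(rows, None) takes everything, so the unblocked
                # case is simply one large block.
                block = list(islice(rows, self.block_size))
                if not block:
                    return
                yield block

    def export(self, write_block):
        """Pass each block to write_block; return the number of blocks."""
        calls = 0
        for block in self.blocks():
            write_block(block)
            calls += 1
        return calls
```

Because the file is not read until export is called, the unblocked case still costs one full read, but the blocked case never holds more than block_size rows.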

comment:5 Changed 15 years ago by ole

  • Owner changed from ole to duncan
  • Priority changed from high to normal
  • Status changed from assigned to new

comment:6 Changed 13 years ago by nariman

  • Owner changed from duncan to nariman

comment:7 Changed 11 years ago by habili

  • Priority changed from normal to low