% Source: branches/numpy/anuga/utilities/csv_tools.tex @ 7110
% Last change on this file since 7110 was 7110, checked in by rwilson, 15 years ago
% LaTeX fragment documenting csv_tools.py.

\documentclass{manual}

\title{Module csv\_tools}

\author{ANUGA Developer}

\begin{document}
\maketitle

\chapter{Module \code{csv_tools} functions}

This document describes the functions within the \code{csv_tools.py} module.
The \LaTeX{} source is here as a placeholder until it is decided where this sort of thing goes.

\section{\code{merge_csv_key_values()}}

\begin{methoddesc}{merge_csv_key_values}{file_title_list,
                                         output_file,
                                         key_col='hours',
                                         data_col='stage'}
Module: \module{csv_tools}

Merge one or more CSV files into a single output CSV file.  The output file contains a \emph{key}
column that is common to all input files and one \emph{data} column from each input file.

\code{file_title_list} is a list of 2-tuples, each containing the path to an input file and a
new column header string for the \emph{data} column from that file.

\code{output_file} is the path to the output file.

\code{key_col} is the column header string that identifies the \emph{key} column in each
input file.  If not provided, the \emph{key} column is assumed to have the header string 'hours'.

\code{data_col} is the column header string that identifies the \emph{data} column in each
input file.  If not provided, the \emph{data} column is assumed to have the header string 'stage'.

As an example, suppose we have two CSV files, \code{alpha.csv}:

\begin{table}[htp]
  \begin{center}
    \begin{tabular}{|cccc|}
      \hline
      time & hours & stage & depth \\
      \hline
      3600 & 1.00 & 100.3 & 10.2 \\
      3636 & 1.01 & 100.3 & 10.0 \\
      3672 & 1.02 & 100.3 &  9.7 \\
      3708 & 1.03 & 100.3 &  8.9 \\
      3744 & 1.04 & 100.3 &  7.1 \\
      \hline
    \end{tabular}
  \end{center}
\end{table}

and \code{beta.csv}:

\begin{table}[htp]
  \begin{center}
    \begin{tabular}{|cccc|}
      \hline
      time & hours & stage & depth \\
      \hline
      3600 & 1.00 & 100.3 & 11.3 \\
      3636 & 1.01 & 100.3 & 10.5 \\
      3672 & 1.02 & 100.3 & 10.0 \\
      3708 & 1.03 & 100.3 &  9.7 \\
      3744 & 1.04 & 100.3 &  8.2 \\
      \hline
    \end{tabular}
  \end{center}
\end{table}

and we wish to merge these two files, using the \code{hours} column as the \emph{key} and
the \code{depth} column as \emph{data} in both input files.

We would do this with the code fragment:

\begin{verbatim}
# assumes merge_csv_key_values has already been imported from the csv_tools module
title_list = [('alpha.csv', 'alpha'), ('beta.csv', 'beta')]
output_file = 'gamma.csv'
merge_csv_key_values(title_list, output_file, key_col='hours', data_col='depth')
\end{verbatim}
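
To try the fragment above, the two input files must exist.  The following is a minimal sketch,
assuming Python 3 and the standard \code{csv} module, that writes the two sample tables to a
scratch directory.  The names \code{write_sample_csv}, \code{ALPHA_ROWS}, \code{BETA_ROWS} and
\code{work_dir} are hypothetical helpers introduced here and are not part of \code{csv_tools}.

```python
import csv
import os
import tempfile

# Column headers shared by both sample input files.
HEADER = ['time', 'hours', 'stage', 'depth']

# Rows copied from the alpha.csv and beta.csv tables in this section.
ALPHA_ROWS = [
    ['3600', '1.00', '100.3', '10.2'],
    ['3636', '1.01', '100.3', '10.0'],
    ['3672', '1.02', '100.3', '9.7'],
    ['3708', '1.03', '100.3', '8.9'],
    ['3744', '1.04', '100.3', '7.1'],
]
BETA_ROWS = [
    ['3600', '1.00', '100.3', '11.3'],
    ['3636', '1.01', '100.3', '10.5'],
    ['3672', '1.02', '100.3', '10.0'],
    ['3708', '1.03', '100.3', '9.7'],
    ['3744', '1.04', '100.3', '8.2'],
]


def write_sample_csv(path, rows):
    """Write the header line followed by the given rows to 'path'."""
    with open(path, 'w', newline='') as fd:
        writer = csv.writer(fd)
        writer.writerow(HEADER)
        writer.writerows(rows)


# Hypothetical scratch directory; any writable directory would do.
work_dir = tempfile.mkdtemp()
alpha_path = os.path.join(work_dir, 'alpha.csv')
beta_path = os.path.join(work_dir, 'beta.csv')
write_sample_csv(alpha_path, ALPHA_ROWS)
write_sample_csv(beta_path, BETA_ROWS)
```

The resulting paths can then be used in place of \code{'alpha.csv'} and \code{'beta.csv'} in
the \code{title_list} of the fragment.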

The output file \code{gamma.csv} would contain:

\begin{table}[htp]
  \begin{center}
    \begin{tabular}{|ccc|}
      \hline
      hours & alpha & beta \\
      \hline
      1.00 & 10.2 & 11.3 \\
      1.01 & 10.0 & 10.5 \\
      1.02 &  9.7 & 10.0 \\
      1.03 &  8.9 &  9.7 \\
      1.04 &  7.1 &  8.2 \\
      \hline
    \end{tabular}
  \end{center}
\end{table}

The function looks for the \emph{key} column in every input file and checks that it is the same
in all of them.

The \emph{data} column must exist in all input files, and the column from each file is copied to the
output file.  Its column header string is replaced by the string given for that input file
in the corresponding \code{file_title_list} 2-tuple.
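
The behaviour described above can be sketched in a few lines of Python.  This is only an
illustration under stated assumptions, not the code in \code{csv_tools.py}: the name
\code{merge_key_values_sketch} is hypothetical, Python 3 is assumed, key values are compared as
strings, and raising \code{ValueError} for a missing column or mismatched keys is an assumption
about error handling.

```python
import csv


def merge_key_values_sketch(file_title_list, output_file,
                            key_col='hours', data_col='stage'):
    """Illustrative sketch only; not the ANUGA implementation."""
    key_values = None    # key column taken from the first input file
    titles = []          # new header string for each data column
    data_columns = []    # one list of data values per input file

    for (path, title) in file_title_list:
        with open(path, newline='') as fd:
            reader = csv.DictReader(fd)
            fieldnames = reader.fieldnames or []
            # Both the key and the data column must exist in every file.
            for col in (key_col, data_col):
                if col not in fieldnames:
                    raise ValueError("column '%s' not found in %s"
                                     % (col, path))
            rows = list(reader)

        # The key column must be the same in all input files
        # (assumption: compared here as raw strings).
        keys = [row[key_col] for row in rows]
        if key_values is None:
            key_values = keys
        elif keys != key_values:
            raise ValueError("key column '%s' in %s differs from earlier files"
                             % (key_col, path))

        titles.append(title)
        data_columns.append([row[data_col] for row in rows])

    # Write the key column followed by one renamed data column per input file.
    with open(output_file, 'w', newline='') as fd:
        writer = csv.writer(fd)
        writer.writerow([key_col] + titles)
        for i, key in enumerate(key_values or []):
            writer.writerow([key] + [column[i] for column in data_columns])
```

Called with the \code{alpha.csv} and \code{beta.csv} tables above and \code{data_col='depth'},
this sketch writes the header \code{hours,alpha,beta} followed by one row per key value, matching
the \code{gamma.csv} table.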
\end{methoddesc}

\end{document}