\documentclass{manual}

\title{Module csv\_tools}

\author{ANUGA Developer}

\begin{document}
\maketitle

\chapter{Module \code{csv_tools} functions}

This document describes the functions within the \code{csv_tools.py} module.
The \LaTeX{} source is kept here as a placeholder until it is decided where this sort of thing belongs.

\section{\code{merge_csv_key_values()}}

\begin{methoddesc}{merge_csv_key_values}{file_title_list,
                                         output_file,
                                         key_col='hours',
                                         data_col='stage'}
Module: \module{csv_tools}

Merge one or more CSV files into a single output CSV file.  The output file contains a \emph{key}
column that is common to all input files and one \emph{data} column from each input file.

\code{file_title_list} is a list of 2-tuples, each containing the path to an input file and the
new column header string for the \emph{data} column taken from that file.

\code{output_file} is the path to the output file.

\code{key_col} is the column header string that identifies the \emph{key} column in each
input file.  If not provided, the \emph{key} column is assumed to have the header string \code{'hours'}.

\code{data_col} is the column header string that identifies the \emph{data} column in each
input file.  If not provided, the \emph{data} column is assumed to have the header string \code{'stage'}.

As an example, suppose we have two CSV files, \code{alpha.csv}:

\begin{table}[htp]
\begin{center}
\begin{tabular}{|cccc|}
\hline
time & hours & stage & depth \\
\hline
3600 & 1.00 & 100.3 & 10.2 \\
3636 & 1.01 & 100.3 & 10.0 \\
3672 & 1.02 & 100.3 & 9.7 \\
3708 & 1.03 & 100.3 & 8.9 \\
3744 & 1.04 & 100.3 & 7.1 \\
\hline
\end{tabular}
\end{center}
\end{table}

and \code{beta.csv}:

\begin{table}[htp]
\begin{center}
\begin{tabular}{|cccc|}
\hline
time & hours & stage & depth \\
\hline
3600 & 1.00 & 100.3 & 11.3 \\
3636 & 1.01 & 100.3 & 10.5 \\
3672 & 1.02 & 100.3 & 10.0 \\
3708 & 1.03 & 100.3 & 9.7 \\
3744 & 1.04 & 100.3 & 8.2 \\
\hline
\end{tabular}
\end{center}
\end{table}

and we wish to merge these two files, using the \code{hours} column as the \emph{key} and
the \code{depth} column as the \emph{data} column in both input files.

This can be done with the code fragment:

\begin{verbatim}
title_list = [('alpha.csv', 'alpha'), ('beta.csv', 'beta')]
output_file = 'gamma.csv'
merge_csv_key_values(title_list, output_file, key_col='hours', data_col='depth')
\end{verbatim}

The output file \code{gamma.csv} would contain:

\begin{table}[htp]
\begin{center}
\begin{tabular}{|ccc|}
\hline
hours & alpha & beta \\
\hline
1.00 & 10.2 & 11.3 \\
1.01 & 10.0 & 10.5 \\
1.02 & 9.7 & 10.0 \\
1.03 & 8.9 & 9.7 \\
1.04 & 7.1 & 8.2 \\
\hline
\end{tabular}
\end{center}
\end{table}

The function looks for the \emph{key} column in every input file and checks that it is the same in all of them.

The \emph{data} column must exist in every input file, and the column from each file is copied to the
output file.  Its column header string is replaced by the string paired with that input filename
in the corresponding 2-tuple of \code{file_title_list}.
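
The checking and copying described above could be reproduced with the standard \code{csv} module
roughly as follows.  This is only a sketch: \code{merge_csv_sketch} is a hypothetical stand-in and
not the code in \module{csv_tools}, and raising \code{ValueError} on a key mismatch is an assumption,
since this document does not say how the real function reports the problem.  Key values are compared
as strings here, so \code{1.0} and \code{1.00} would count as different.

```python
import csv


def merge_csv_sketch(file_title_list, output_file,
                     key_col='hours', data_col='stage'):
    """Illustrative stand-in; NOT the csv_tools implementation."""
    key_values = None   # key column, taken from the first input file
    columns = []        # one (new_title, data_values) pair per input file

    for path, title in file_title_list:
        with open(path, newline='') as handle:
            rows = list(csv.DictReader(handle))
        keys = [row[key_col] for row in rows]
        if key_values is None:
            key_values = keys
        elif keys != key_values:
            # Assumed behaviour: reject inputs whose key columns differ.
            raise ValueError('key column %r differs in %s' % (key_col, path))
        columns.append((title, [row[data_col] for row in rows]))

    with open(output_file, 'w', newline='') as handle:
        writer = csv.writer(handle)
        # Header row: the key header, then the new title for each input file.
        writer.writerow([key_col] + [title for title, _ in columns])
        for i, key in enumerate(key_values or []):
            writer.writerow([key] + [values[i] for _, values in columns])
```

Called with the same arguments as the fragment above, the sketch writes one row per key value,
with the \code{depth} values from each input file under the headers \code{alpha} and \code{beta}.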
\end{methoddesc}

\end{document}