\documentclass{manual}

\title{Module csv\_tools}

\author{ANUGA Developer}

\begin{document}
\maketitle

\chapter{Module \code{csv_tools} functions}

This document describes the functions within the \code{csv_tools.py} module.
The \LaTeX{} is here as a placeholder while it is decided where this sort of thing goes.

\section{\code{merge_csv_key_values()}}

\begin{methoddesc}{merge_csv_key_values}{file_title_list,
                                         output_file,
                                         key_col='hours',
                                         data_col='stage'}
Module: \module{csv_tools}

Merge one or more CSV files into a single output CSV file.  The output file contains a \emph{key}
column that is common to all input files and one \emph{data} column from each input file.

\code{file_title_list} is a list of 2-tuples, each containing the path to an input file and a
new column header string for the \emph{data} column from that file.

\code{output_file} is the path to the output file.

\code{key_col} is the column header string that identifies the \emph{key} column in each
input file.  If not provided, the \emph{key} column header defaults to \code{'hours'}.

\code{data_col} is the column header string that identifies the \emph{data} column in each
input file.  If not provided, the \emph{data} column header defaults to \code{'stage'}.

As an example, suppose we have two CSV files, \code{alpha.csv}:

\begin{table}[htp]
\begin{center}
\begin{tabular}{|cccc|}
\hline
time & hours & stage & depth \\
\hline
3600 & 1.00 & 100.3 & 10.2 \\
3636 & 1.01 & 100.3 & 10.0 \\
3672 & 1.02 & 100.3 & 9.7 \\
3708 & 1.03 & 100.3 & 8.9 \\
3744 & 1.04 & 100.3 & 7.1 \\
\hline
\end{tabular}
\end{center}
\end{table}

and \code{beta.csv}:

\begin{table}[htp]
\begin{center}
\begin{tabular}{|cccc|}
\hline
time & hours & stage & depth \\
\hline
3600 & 1.00 & 100.3 & 11.3 \\
3636 & 1.01 & 100.3 & 10.5 \\
3672 & 1.02 & 100.3 & 10.0 \\
3708 & 1.03 & 100.3 & 9.7 \\
3744 & 1.04 & 100.3 & 8.2 \\
\hline
\end{tabular}
\end{center}
\end{table}

and we wish to merge these two files, using the \code{hours} column as the \emph{key} and
the \code{depth} column as \emph{data} in both input files.

We would do this with the code fragment:

\begin{verbatim}
title_list = [('alpha.csv', 'alpha'), ('beta.csv', 'beta')]
output_file = 'gamma.csv'
merge_csv_key_values(title_list, output_file, key_col='hours', data_col='depth')
\end{verbatim}

The output file \code{gamma.csv} would contain:

\begin{table}[htp]
\begin{center}
\begin{tabular}{|ccc|}
\hline
hours & alpha & beta \\
\hline
1.00 & 10.2 & 11.3 \\
1.01 & 10.0 & 10.5 \\
1.02 & 9.7 & 10.0 \\
1.03 & 8.9 & 9.7 \\
1.04 & 7.1 & 8.2 \\
\hline
\end{tabular}
\end{center}
\end{table}

The function looks for the \emph{key} column in every input file and checks that it is the
same in all of them.

The \emph{data} column must exist in every input file, and the column from each file is copied to the
output file.  In the output, each copied column takes as its header the string paired with that
file's path in the \code{file_title_list} 2-tuple.
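
The behaviour described above can be sketched in a few lines of Python using only the
standard \code{csv} module.  This is a minimal illustration, not the \module{csv_tools}
implementation: the name \code{merge_key_values_sketch}, the use of \code{ValueError}, and
the comparison of key values as strings are all assumptions made for the example.

\begin{verbatim}
import csv


def merge_key_values_sketch(file_title_list, output_file,
                            key_col='hours', data_col='stage'):
    """Hypothetical stand-in that mirrors the documented behaviour."""
    key_values = None   # key column taken from the first input file
    titles = []         # new header string for each data column
    columns = []        # one list of data values per input file

    for path, title in file_title_list:
        with open(path, newline='') as f:
            reader = csv.DictReader(f)
            header = reader.fieldnames or []
            rows = list(reader)

        # Both the key and the data column must exist in every file.
        for col in (key_col, data_col):
            if col not in header:
                raise ValueError('column %r not found in %s' % (col, path))

        # Assumption: the key column is "the same" when its values
        # match as strings, row for row.
        keys = [row[key_col] for row in rows]
        if key_values is None:
            key_values = keys
        elif keys != key_values:
            raise ValueError('key column %r differs in %s' % (key_col, path))

        titles.append(title)
        columns.append([row[data_col] for row in rows])

    with open(output_file, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow([key_col] + titles)
        for i, key in enumerate(key_values or []):
            writer.writerow([key] + [column[i] for column in columns])
\end{verbatim}

Called with the same arguments as \code{merge_csv_key_values()} in the fragment above, the
sketch writes a header row of \code{hours}, \code{alpha} and \code{beta}, followed by one row per
key value.  Values are written as the strings read from the input files, with no numeric conversion.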
\end{methoddesc}

\end{document}