NOTICE: This version of the NSF Unidata web site (archive.unidata.ucar.edu) is no longer being updated.
Current content can be found at unidata.ucar.edu.

Re: [netcdfgroup] wanted: CSV to netcdf conversion utility

<49AC0A41.5090209@xxxxxxxx>
In-Reply-To: <49AC0A41.5090209@xxxxxxxx>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="Boundary-00=_K6HsJyi/OUlNDCX"
Message-ID: <200903061238.18585.gordon.keith@xxxxxxxx>
X-Loop:  netcdfgroup@xxxxxxxxxxxxxxxx

--Boundary-00=_K6HsJyi/OUlNDCX
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Solved.

With advice from Steve I've written a small ferret script (attached).

With ferret installed the command:

ferret -script csv2nc data.csv

will create a NetCDF file data.csv.nc with variable name being the upper case
of the alphanumeric parts of the column names, the long_name attribute being
the full column name and the units attribute being anything enclosed in
paranthesis in the column name.

The csv file must have as its first line a comma separated list of column
names which must have unique alphanumeric parts.

Ferrets conversion rules mean dates, times and numbers are stored as floats,
everthing else is stored as chars.

Regards
Gordon

On Tue, 3 Mar 2009 03:33:05 am Steve Hankin wrote:
> Hi Gordon,
>
> The Ferret program (http://www.ferret.noaa.gov/) can do this.  It's not
> a "basic utility" -- rather a full analysis and visualization
> application -- but with two commands it can do the conversion.  With
> another couple of commands it can capture and use the column headings as
> the variable names.  You can also go the next steps to create a proper
> CF-conformant netCDF file by capturing one of the variable (typically
> time or depth) as the netCDF "coordinate" variable of the file.
>
>     - Steve
>
> =============================================
>
> Gordon Keith wrote:
> > Is there a program available to convert CSV files to netcdf format?
> >
> > We are finding netcdf a useful format for accessing data and it would be
> > good to be able to convert data that arrives in CSV (Comma Separated
> > Variable) format to basic netcdf files.
> >
> > I'm thinking a basic utility that reads a CSV file, uses the first row as
> > variable names, uses the second row to determine data types, and puts the
> > data from the file into a CSV file. It occurs to me that such a utility
> > shouldn't be too hard to write, so someone has probably done it.
> >
> > Regards
> > Gordon

--

Gordon Keith
Programmer/Data Analyst
Marine Acoustics
CSIRO Marine and Atmospheric Research
http://www.cmar.csiro.au

"I think there is a world market for maybe five computers"
- Thomas Watson, Chairman of IBM, 1943

--Boundary-00=_K6HsJyi/OUlNDCX
Content-Type: text/plain; charset="iso-8859-1"; name="csv2nc.jnl"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="csv2nc.jnl"

 ! Convert CSV file to NetCDF
 ! Requires one parameter, the name of a CSV file to convert.
 ! Each column of the file must have a name in the first row (trailing unnamed
 columns are dropped)
 ! The alphanumeric portion of the column names must be unique
 !
 ! author: gordon.keith@xxxxxxxx

let csv_cols={spawn:"head -1 $1 | sed s/[^a-z,A-Z_0-9]//g | sed 's/,*$//'"}
columns/var="`csv_cols`"/skip=1 $1
save/file=$1.nc/clobber `csv_cols`

 ! apply long names and units
use $1.nc
let csv_vars="`csv_cols`"
let csv_ttls={spawn:"head -1 $1 | sed 's/[\"\"]//g'"}
let csv_unit={spawn:"head -1 $1 | sed 's/[\"\"]//g' | sed 's/,[^,(]*(/,(/g' |
sed 's/)[^,]*,/),/g' | sed 's/,[^,(]*,/,,/g' | sed 's/,[^,(][^,(]*,/,,/g' | sed
's/^[^,(]*//' | sed 's/[),][^),]*$//' | sed 's/[)(]//g'"}
repeat/range=1:`strlen(csv_vars)` (  \
   let csv_vc=strindex(csv_vars,",") ;    \
   let csv_tc=strindex(csv_ttls,",") ;    \
   let csv_uc=strindex(csv_unit,",") ;    \
   if `csv_vc EQ 0` then exit/loop ;  \
   let csv_var=substring(csv_vars,1,csv_vc - 1) ;\
   if `csv_tc GT 1` then define att/output/type=string
   `csv_var`.long_name="`substring(csv_ttls,1,csv_tc - 1)`"  ; \
   if `csv_uc GT 1` then define att/output/type=string
   `csv_var`.units="`substring(csv_unit,1,csv_uc - 1)`" ; \
   let csv_vars=substring("`csv_vars`",`csv_vc` + 1, 20480) ; \
   let csv_ttls=substring("`csv_ttls`",`csv_tc` + 1, 20480) ; \
   let csv_unit=substring("`csv_unit`",`csv_uc` + 1, 20480) ; \
)
if `csv_tc GT 1` then
   define att/output/type=string
   `csv_vars`.long_name="`substring(csv_ttls,1,csv_tc - 1)`"
elif `strlen(csv_ttls) GT 0` then
   define att/output/type=string `csv_vars`.long_name="`csv_ttls`"
endif
if `csv_uc EQ 1` then
elif `csv_uc GT 1` then
   define att/output/type=string `csv_vars`.units="`substring(csv_unit,1,csv_uc
   - 1)`"
elif `strlen(csv_unit) GT 0` then
   define att/output/type=string `csv_vars`.units="`csv_unit`"
endif
save/file=$1.nc/clobber `csv_cols`


--Boundary-00=_K6HsJyi/OUlNDCX--