Storing data and meta-data


The data format choices described here for ALMA are very similar to those made by AMIP for collecting data from GCMs. This ensures that the proposed data format is applicable to off-line land-surface simulations as well as to coupled experiments, and that it is thus suited to all actions of the GLASS project.

Information needed to describe the data.

To ensure that data can be exchanged easily, the files should contain not only the values but also a full description of the variables. This descriptive information is called the meta-data. Such self-descriptive files can be exchanged and used with little extra information. To achieve this, a thorough analysis is needed of the information required to describe geophysical data fully. Such work was performed for modelling applications by groups at the Hadley Centre and at NCAR, and resulted in two meta-data conventions which are in the process of being merged.

For ALMA it is proposed to adopt this new convention, which will build on the one developed by the Hadley Centre and is called GDT. Two of its features are essential for land-surface schemes: it can store non-rectilinear grids, so that irregular grids can be used, and it can compress data by gathering, so that ocean points do not have to be stored when a longitude-latitude grid is used. Neither feature was present in older conventions such as the COARDS convention.
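
Compression by gathering can be illustrated with a small sketch in Python. Everything in it is an assumption made for illustration: the 3 x 4 grid, the land mask, the temperature values, the fill value and the function names are hypothetical, and the zero-based, row-major point index is only one possible way of numbering grid points, not necessarily the one prescribed by the convention.

```python
# Hypothetical 3 x 4 latitude-longitude grid: 1 marks land, 0 marks ocean.
land_mask = [
    [0, 1, 1, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
]
NLAT, NLON = 3, 4
FILL = -9999.0  # assumed placeholder for ocean points on the full grid

# Hypothetical field on the full grid (for example a temperature in K).
field = [
    [FILL, 281.0, 282.5, FILL],
    [FILL, FILL, 284.0, 285.5],
    [290.0, FILL, FILL, FILL],
]


def gather(grid, mask):
    """Keep land points only: return their flat indices and their values."""
    nlon = len(mask[0])
    index = [
        j * nlon + i
        for j, row in enumerate(mask)
        for i, is_land in enumerate(row)
        if is_land
    ]
    values = [grid[k // nlon][k % nlon] for k in index]
    return index, values


def scatter(index, values, nlat, nlon, fill):
    """Rebuild the full grid from the gathered values."""
    grid = [[fill] * nlon for _ in range(nlat)]
    for k, value in zip(index, values):
        grid[k // nlon][k % nlon] = value
    return grid


land_index, land_values = gather(field, land_mask)
restored = scatter(land_index, land_values, NLAT, NLON, FILL)
```

Only the five land values and their five indices need to be written to the file instead of twelve grid values, and the full grid can be rebuilt from them when it is needed for plotting or analysis.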

Once the meta-data convention is chosen, an appropriate numerical format is needed in which to store it all in a file. The only condition such a format has to fulfill is that it allows all of the meta-data to be written to the file in an unambiguous way. If this is the case, the numerical format can be changed as computer technology evolves, since programs can be written which transfer all the information from the old files into the new ones without the intervention of an operator. Thus, if the right specifications for the meta-data are chosen, the choice of data format can be guided by convenience for the users.
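
The argument that a complete meta-data description makes the file format replaceable can be sketched in Python. This is a toy illustration only: the variable, its attribute names and both file layouts (a JSON text and a line-oriented text) are hypothetical stand-ins and have nothing to do with the formats actually proposed for ALMA.

```python
import json

# Hypothetical self-describing variable: the values plus the meta-data
# needed to interpret them.
variable = {
    "name": "Tair",
    "attributes": {
        "units": "K",
        "long_name": "near-surface air temperature",
        "missing_value": -9999.0,
    },
    "data": [280.1, 281.4, -9999.0],
}


def write_old(var):
    """'Old' format: one JSON document."""
    return json.dumps(var)


def read_old(text):
    return json.loads(text)


def write_new(var):
    """'New' format: one keyword-prefixed line per item."""
    lines = ["variable " + var["name"]]
    for key, value in sorted(var["attributes"].items()):
        lines.append("attribute " + key + " " + json.dumps(value))
    lines.append("data " + json.dumps(var["data"]))
    return "\n".join(lines)


def read_new(text):
    var = {"name": None, "attributes": {}, "data": None}
    for line in text.splitlines():
        kind, rest = line.split(" ", 1)
        if kind == "variable":
            var["name"] = rest
        elif kind == "attribute":
            key, value = rest.split(" ", 1)
            var["attributes"][key] = json.loads(value)
        elif kind == "data":
            var["data"] = json.loads(rest)
    return var


def migrate(old_text):
    """Convert an old-format file to the new format with no manual input."""
    return write_new(read_old(old_text))


new_text = migrate(write_old(variable))
```

Because both layouts carry the same values and the same meta-data, reading the migrated text back yields the original variable and nothing has to be supplied by hand during the conversion.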

Format used to store the data.

A data format which allows all meta-information to be stored in the same file as the data is netCDF. This is a machine-independent binary format, and the software needed to store or retrieve data runs on a wide range of platforms. In other words, the data is stored in a compact form and a file generated on one machine can be read on any other computer. The netCDF format is in the public domain and is maintained by UNIDATA. This ensures that it is widely distributed in the geophysical community and that its maintenance is assured. Furthermore, a large fraction of the data analysis and graphics software used in the geophysical community can read or write netCDF files.

The following information can be stored along with the data in a netCDF file :

The following information can be stored for each variable :

Header information extracted with the "ncdump -h" command from some netCDF files containing off-line forcing data for land-surface schemes is presented :

Software freely available to write and analyze data in the netCDF format.

A long list of software packages which can read or write netCDF files is available on the UNIDATA web site. Here we highlight the packages which are in the public domain and will be particularly useful for working with the data exchanged within GLASS. Additions to this list are welcome.
Data management software :
Graphics :
I/O libraries :

Last modified: Mon May 15 22:32:20 WEST 2000