NetCDF Files and Components

The components of a netCDF file are described in section 2 of the NUG [NUG] . In this section we describe conventions associated with filenames and the basic components of a netCDF file. We also introduce new attributes for describing the contents of a file.

Filename

NetCDF files should have the file name extension ".nc".

Data Types

The netCDF data types char, byte, short, int, float or real, and double are all acceptable. The char type is not intended for numeric data. One byte numeric data should be stored using the byte data type. All integer types are treated by the netCDF interface as signed. It is possible to treat the byte type as unsigned by using the NUG convention of indicating the unsigned range using the valid_min, valid_max, or valid_range attributes.

NetCDF does not support a character string type, so these must be represented as character arrays. In this document, a one dimensional array of character data is simply referred to as a "string". An n-dimensional array of strings must be implemented as a character array of dimension (n,max_string_length), with the last (most rapidly varying) dimension declared large enough to contain the longest string in the array. All the strings in a given array are therefore defined to be equal in length. For example, an array of strings containing the names of the months would be dimensioned (12,9) in order to accommodate "September", the month with the longest name.

Naming Conventions

Variable, dimension and attribute names should begin with a letter and be composed of letters, digits, and underscores. Note that this is in conformance with the COARDS conventions, but is more restrictive than the netCDF interface which allows use of the hyphen character. The netCDF interface also allows leading underscores in names, but the NUG states that this is reserved for system use.

Case is significant in netCDF names, but it is recommended that names should not be distinguished purely by case, i.e., if case is disregarded, no two names should be the same. It is also recommended that names should be obviously meaningful, if possible, as this renders the file more effectively self-describing.

This convention does not standardize any variable or dimension names. Attribute names and their contents, where standardized, are given in English in this document and should appear in English in conforming netCDF files for the sake of portability. Languages other than English are permitted for variables, dimensions, and non-standardized attributes. The content of some standardized attributes are string values that are not standardized, and thus are not required to be in English. For example, a description of what a variable represents may be given in a non-English language using the long_name attribute (see [long-name] ) whose contents are not standardized, but a description given by the standard_name attribute (see [standard-name] ) must be taken from the standard name table which is in English.

Dimensions

A variable may have any number of dimensions, including zero, and the dimensions must all have different names. COARDS strongly recommends limiting the number of dimensions to four, but we wish to allow greater flexibility . The dimensions of the variable define the axes of the quantity it contains. Dimensions other than those of space and time may be included. Several examples can be found in this document. Under certain circumstances, one may need more than one dimension in a particular quantity. For instance, a variable containing a two-dimensional probability density function might correlate the temperature at two different vertical levels, and hence would have temperature on both axes.

If any or all of the dimensions of a variable have the interpretations of "date or time" (T), "height or depth" (Z), "latitude" (Y), or "longitude" (X) then we recommend, but do not require (see [coards-relationship] ), those dimensions to appear in the relative order T, then Z, then Y, then X in the CDL definition corresponding to the file. All other dimensions should, whenever possible, be placed to the left of the spatiotemporal dimensions.

Dimensions may be of any size, including unity. When a single value of some coordinate applies to all the values in a variable, the recommended means of attaching this information to the variable is by use of a dimension of size unity with a one-element coordinate variable. It is also acceptable to use a scalar coordinate variable which eliminates the need for an associated size one dimension in the data variable. The advantage of using either a coordinate variable or an auxiliary coordinate variable is that all its attributes can be used to describe the single-valued quantity, including boundaries. For example, a variable containing data for temperature at 1.5 m above the ground has a single-valued coordinate supplying a height of 1.5 m, and a time-mean quantity has a single-valued time coordinate with an associated boundary variable to record the start and end of the averaging period.

Variables

This convention does not standardize variable names.

NetCDF variables that contain coordinate data are referred to as coordinate variables, auxiliary coordinate variables, scalar coordinate variables, or multidimensional coordinate variables.

Missing Data

The NUG conventions (NUG appendix B) provide the _FillValue, missing_value, valid_min, valid_max, and valid_range attributes to indicate missing data.

The NUG conventions for missing data changed significantly between version 2.3 and version 2.4. Since version 2.4 the NUG defines missing data as all values outside of the valid_range, and specifies how the valid_range should be defined from the _FillValue (which has library specified default values) if it hasn’t been explicitly specified. If only one missing value is needed for a variable then we recommend that this value be specified using the _FillValue attribute. Doing this guarantees that the missing value will be recognized by generic applications that follow either the before or after version 2.4 conventions.

The scalar attribute with the name _FillValue and of the same type as its variable is recognized by the netCDF library as the value used to pre-fill disk space allocated to the variable. This value is considered to be a special value that indicates undefined or missing data, and is returned when reading values that were not written. The _FillValue should be outside the range specified by valid_range (if used) for a variable. The netCDF library defines a default fill value for each data type (NUG appendix C).

The missing values of a variable with scale_factor and/or add_offset attributes (see section [packed-data]) are interpreted relative to the variable’s external values (a.k.a. the packed values, the raw values, the values stored in the netCDF file), not the values that result after the scale and offset are applied. Applications that process variables that have attributes to indicate both a transformation (via a scale and/or offset) and missing values should first check that a data value is valid, and then apply the transformation. Note that values that are identified as missing should not be transformed. Since the missing value is outside the valid range it is possible that applying a transformation to it could result in an invalid operation. For example, the default _FillValue is very close to the maximum representable value of IEEE single precision floats, and multiplying it by 100 produces an "Infinity" (using single precision arithmetic).

Attributes

This standard describes many attributes (some mandatory, others optional), but a file may also contain non-standard attributes. Such attributes do not represent a violation of this standard. Application programs should ignore attributes that they do not recognise or which are irrelevant for their purposes. Conventional attribute names should be used wherever applicable. Non-standard names should be as meaningful as possible. Before introducing an attribute, consideration should be given to whether the information would be better represented as a variable. In general, if a proposed attribute requires ancillary data to describe it, is multidimensional, requires any of the defined netCDF dimensions to index its values, or requires a significant amount of storage, a variable should be used instead. When this standard defines string attributes that may take various prescribed values, the possible values are generally given in lower case. However, applications programs should not be sensitive to case in these attributes. Several string attributes are defined by this standard to contain "blank-separated lists". Consecutive words in such a list are separated by one or more adjacent spaces. The list may begin and end with any number of spaces. See [attribute-appendix] for a list of attributes described by this standard.

Identification of Conventions

We recommend that netCDF files that follow these conventions indicate this by setting the NUG defined global attribute Conventions to the string value "CF-1.6". The string is interpreted as a directory name relative to a directory that is a repository of documents describing sets of discipline-specific conventions. The conventions directory name is currently interpreted relative to the directory pub/netcdf/Conventions/ on the host machine ftp.unidata.ucar.edu. The web based versions of this document are linked from the netCDF Conventions web page.

Description of file contents

The following attributes are intended to provide information about where the data came from and what has been done to it. This information is mainly for the benefit of human readers. The attribute values are all character strings. For readability in ncdump outputs it is recommended to embed newline characters into long strings to break them into lines. For backwards compatibility with COARDS none of these global attributes is required.

The NUG defines title and history to be global attributes. We wish to allow the newly defined attributes, i.e., institution, source, references, and comment, to be either global or assigned to individual variables. When an attribute appears both globally and as a variable attribute, the variable’s version has precedence.

title: A succinct description of what is in the dataset.
institution: Specifies where the original data was produced.
source: The method of production of the original data. If it was model-generated, source should name the model and its version, as specifically as could be useful. If it is observational, source should characterize it (e.g., "surface observation" or "radiosonde").
history: Provides an audit trail for modifications to the original data. Well-behaved generic netCDF filters will automatically append their name and the parameters with which they were invoked to the global history attribute of an input netCDF file. We recommend that each line begin with a timestamp indicating the date and time of day that the program was executed.
references: Published or web-based references that describe the data or methods used to produce it.
comment: Miscellaneous information about the data or methods used to produce it.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ch02.adoc

ch02.adoc

NetCDF Files and Components

Filename

Data Types

Naming Conventions

Dimensions

Variables

Missing Data

Attributes

Identification of Conventions

Description of file contents

Files

ch02.adoc

Latest commit

History

ch02.adoc

File metadata and controls

NetCDF Files and Components

Filename

Data Types

Naming Conventions

Dimensions

Variables

Missing Data

Attributes

Identification of Conventions

Description of file contents