The components of a netCDF file are described in section 2 of the NUG [NUG] . In this section we describe conventions associated with filenames and the basic components of a netCDF file. We also introduce new attributes for describing the contents of a file.
The netCDF data types char
, byte
, short
, int
, float
or real
, and double
are all acceptable. The char
type is not intended for numeric data. One byte numeric data should be stored using the byte
data type. All integer types are treated by the netCDF interface as signed. It is possible to treat the byte
type as unsigned by using the NUG convention of indicating the unsigned range using the valid_min
, valid_max
, or valid_range
attributes.
NetCDF does not support a character string type, so these must be represented as character arrays. In this document, a one dimensional array of character data is simply referred to as a "string". An n-dimensional array of strings must be implemented as a character array of dimension (n,max_string_length), with the last (most rapidly varying) dimension declared large enough to contain the longest string in the array. All the strings in a given array are therefore defined to be equal in length. For example, an array of strings containing the names of the months would be dimensioned (12,9) in order to accommodate "September", the month with the longest name.
Variable, dimension and attribute names should begin with a letter and be composed of letters, digits, and underscores. Note that this is in conformance with the COARDS conventions, but is more restrictive than the netCDF interface which allows use of the hyphen character. The netCDF interface also allows leading underscores in names, but the NUG states that this is reserved for system use.
Case is significant in netCDF names, but it is recommended that names should not be distinguished purely by case, i.e., if case is disregarded, no two names should be the same. It is also recommended that names should be obviously meaningful, if possible, as this renders the file more effectively self-describing.
This convention does not standardize any variable or dimension names. Attribute names and their contents, where standardized, are given in English in this document and should appear in English in conforming netCDF files for the sake of portability. Languages other than English are permitted for variables, dimensions, and non-standardized attributes. The content of some standardized attributes are string values that are not standardized, and thus are not required to be in English. For example, a description of what a variable represents may be given in a non-English language using the long_name
attribute (see [long-name] ) whose contents are not standardized, but a description given by the standard_name
attribute (see [standard-name] ) must be taken from the standard name table which is in English.
A variable may have any number of dimensions, including zero, and the dimensions must all have different names. COARDS strongly recommends limiting the number of dimensions to four, but we wish to allow greater flexibility . The dimensions of the variable define the axes of the quantity it contains. Dimensions other than those of space and time may be included. Several examples can be found in this document. Under certain circumstances, one may need more than one dimension in a particular quantity. For instance, a variable containing a two-dimensional probability density function might correlate the temperature at two different vertical levels, and hence would have temperature on both axes.
If any or all of the dimensions of a variable have the interpretations of "date or time" (T
), "height or depth" (Z
), "latitude" (Y
), or "longitude" (X
) then we recommend, but do not require (see [coards-relationship] ), those dimensions to appear in the relative order T
, then Z
, then Y
, then X
in the CDL definition corresponding to the file. All other dimensions should, whenever possible, be placed to the left of the spatiotemporal dimensions.
Dimensions may be of any size, including unity. When a single value of some coordinate applies to all the values in a variable, the recommended means of attaching this information to the variable is by use of a dimension of size unity with a one-element coordinate variable. It is also acceptable to use a scalar coordinate variable which eliminates the need for an associated size one dimension in the data variable. The advantage of using either a coordinate variable or an auxiliary coordinate variable is that all its attributes can be used to describe the single-valued quantity, including boundaries. For example, a variable containing data for temperature at 1.5 m above the ground has a single-valued coordinate supplying a height of 1.5 m, and a time-mean quantity has a single-valued time coordinate with an associated boundary variable to record the start and end of the averaging period.
This convention does not standardize variable names.
NetCDF variables that contain coordinate data are referred to as coordinate variables, auxiliary coordinate variables, scalar coordinate variables, or multidimensional coordinate variables.
The NUG conventions
(NUG
appendix B) provide the _FillValue
, missing_value
,
valid_min
, valid_max
, and valid_range
attributes to indicate
missing data.
The NUG conventions for missing data changed significantly between version 2.3 and version 2.4. Since version 2.4 the NUG defines missing data as all values outside of the valid_range
, and specifies how the valid_range
should be defined from the _FillValue
(which has library specified default values) if it hasn’t been explicitly specified. If only one missing value is needed for a variable then we recommend that this value be specified using the _FillValue
attribute. Doing this guarantees that the missing value will be recognized by generic applications that follow either the before or after version 2.4 conventions.
The scalar attribute with the name _FillValue
and of the same type as its
variable is recognized by the netCDF library as the value used to pre-fill disk
space allocated to the variable. This value is considered to be a special value
that indicates undefined or missing data, and is returned when reading values
that were not written. The _FillValue
should be outside the range
specified by valid_range
(if used) for a variable. The netCDF library
defines a default fill value for each data type
(NUG
appendix C).
The missing values of a variable with scale_factor
and/or
add_offset
attributes (see section [packed-data]) are
interpreted relative to the variable’s external values (a.k.a. the
packed values, the raw values, the values stored in the netCDF file),
not the values that result after the scale and offset are applied.
Applications that process variables that have attributes to indicate
both a transformation (via a scale and/or offset) and missing values
should first check that a data value is valid, and then apply the
transformation. Note that values that are identified as missing should
not be transformed. Since the missing value is outside the valid range
it is possible that applying a transformation to it could result in an
invalid operation. For example, the default _FillValue
is very
close to the maximum representable value of IEEE single precision
floats, and multiplying it by 100 produces an "Infinity" (using single
precision arithmetic).
This standard describes many attributes (some mandatory, others optional), but a file may also contain non-standard attributes. Such attributes do not represent a violation of this standard. Application programs should ignore attributes that they do not recognise or which are irrelevant for their purposes. Conventional attribute names should be used wherever applicable. Non-standard names should be as meaningful as possible. Before introducing an attribute, consideration should be given to whether the information would be better represented as a variable. In general, if a proposed attribute requires ancillary data to describe it, is multidimensional, requires any of the defined netCDF dimensions to index its values, or requires a significant amount of storage, a variable should be used instead. When this standard defines string attributes that may take various prescribed values, the possible values are generally given in lower case. However, applications programs should not be sensitive to case in these attributes. Several string attributes are defined by this standard to contain "blank-separated lists". Consecutive words in such a list are separated by one or more adjacent spaces. The list may begin and end with any number of spaces. See [attribute-appendix] for a list of attributes described by this standard.
We recommend that netCDF files that follow these conventions indicate
this by setting the NUG defined global attribute Conventions
to
the string value "CF-1.6
". The string is interpreted as a
directory name relative to a directory that is a repository of documents
describing sets of discipline-specific conventions. The conventions
directory name is currently interpreted relative to the directory
pub/netcdf/Conventions/
on the host machine
ftp.unidata.ucar.edu
. The web based versions of this document are
linked from the
netCDF
Conventions web page.
The following attributes are intended to provide information about where the data came from and what has been done to it. This information is mainly for the benefit of human readers. The attribute values are all character strings. For readability in ncdump outputs it is recommended to embed newline characters into long strings to break them into lines. For backwards compatibility with COARDS none of these global attributes is required.
The NUG defines title
and history
to be global attributes. We wish to allow the newly defined attributes, i.e., institution
, source
, references
, and comment
, to be either global or assigned to individual variables. When an attribute appears both globally and as a variable attribute, the variable’s version has precedence.
title
-
A succinct description of what is in the dataset.
institution
-
Specifies where the original data was produced.
source
-
The method of production of the original data. If it was model-generated,
source
should name the model and its version, as specifically as could be useful. If it is observational,source
should characterize it (e.g., "surface observation
" or "radiosonde
"). history
-
Provides an audit trail for modifications to the original data. Well-behaved generic netCDF filters will automatically append their name and the parameters with which they were invoked to the global history attribute of an input netCDF file. We recommend that each line begin with a timestamp indicating the date and time of day that the program was executed.
references
-
Published or web-based references that describe the data or methods used to produce it.
comment
-
Miscellaneous information about the data or methods used to produce it.