NCL data types overview
Basic numeric types
In NCL, the basic data types include the standard types found in nearly every programming language. Types classified as numeric types support all of the algebraic functions available in NCL (see Expressions).
In version 5.2.0 a number of new types were added to complement the original types. The new types include 64-bit integers and add explicit unsigned integer types for every integer size. However, for backwards compatibility, only the original numeric types (double, float, long, integer, short, and byte) are allowed when the keyword numeric is applied to a function or procedure parameter. Therefore two new keywords have been added as of version 5.2.0: enumeric (extra-numeric), which includes only the newly added numeric types, and snumeric (super-numeric), which includes all the numeric types and which should now be used for a parameter that can have any numeric type. For more information about the usage of these keywords in function and procedure definitions see the section Constraining the type and dimensionality of input parameters.
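As a minimal sketch of the keyword distinction described above, the following defines a function whose parameter is declared snumeric, so it should accept both the original and the newer numeric types. The function name describe_type is hypothetical and chosen only for illustration.

```ncl
; Hypothetical helper: accepts any numeric type because the
; parameter is declared snumeric rather than numeric.
undef("describe_type")
function describe_type(x:snumeric)
begin
  return(typeof(x))
end

print(describe_type(3.5))   ; a float, one of the original numeric types
print(describe_type(7Q))    ; a uint64, one of the types added in 5.2.0
```

Had the parameter been declared numeric instead, the second call would be expected to fail the type check, since uint64 belongs to the enumeric group.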
Here is a table listing important attributes of each numeric type:
|Type||Numeric category||System type||Type size||Minimum value||Maximum value||Default _FillValue||Literal suffix|
|double||numeric||all||64 bits||+/- 2.22507e-308||+/- 8.98846e+307||9.969209968386869e+36||d or D|
|long||numeric||64 bit||64 bits||-9223372036854775808||9223372036854775807||-2147483647||l|
|ulong||enumeric||64 bit||64 bits||0||18446744073709551615||4294967295||L|
|float||numeric||all||32 bits||+/- 1.175494e-38||+/- 1.701411e+38||9.96921e+36||none|
|long||numeric||32 bit||32 bits||-2147483648||2147483647||-2147483647||l|
|ulong||enumeric||32 bit||32 bits||0||4294967295||4294967295||L|
|integer||numeric||all||32 bits||-2147483648||2147483647||-2147483647||i (optional)|
|byte (*)||numeric||all||8 bits||-128||127||-127||b|
- The "Default _FillValue" column documents the default value of the _FillValue attribute for each numeric type as of version 6.0.0. In order to avoid accidental equivalences of fill values with actual data or the results of calculations, the default fill values (a.k.a. missing values) have changed to values closer to an extreme of the possible range of each data type. One consequence is that the fill values have more digits and are harder to input accurately. However, the function default_fillvalue returns the default fill value for each type, eliminating the need to type it literally. More information about fill values can be found in the section Missing values.
- The "Literal suffix" column lists the suffixes that can be used to control the types of literal numerical values entered on the command line or contained in an NCL script. By default, numbers that do not have a decimal point are input as integer types; numbers containing a decimal point or an 'e' or 'E' exponent indicator are input as float types. Adding one of the recognized suffixes indicates the value should be input as the indicated type. For the various sizes of integers, the suffix must immediately follow the final digit of the number. In the case of the double type, the 'd' or 'D' character either directly follows a final digit or decimal point, or if exponential notation is used, it replaces the normal 'e' or 'E'. See Creating data below for more information.
- The long and ulong types have different sizes and different minimum and maximum values depending on whether NCL is built as a 32-bit or 64-bit executable. However, the default fill value is the same in either case.
- (*) Note that as of version 6.0.0 the byte type has changed from unsigned to signed. This is to help NCL track more closely with NetCDF usage. If you have byte data that you need to treat as 8 bit unsigned data it is coercible to the ubyte type directly by assignment or through use of the tounsigned function.
- CAUTION: Currently, arithmetic overflow and underflow are not always reported as errors to the user. Assignment of out-of-range values may cause errors.
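The notes above on literal suffixes and default fill values can be sketched as follows. The variable names are arbitrary; the suffixes and the default_fillvalue function are the ones named in the text.

```ncl
; Literal suffixes select the type of a constant.
i  = 100      ; no decimal point and no suffix: integer
f  = 2.5      ; decimal point: float
d  = 2.5d     ; 'd' suffix: double
h  = 100h     ; short
q  = 100q     ; int64
ul = 100L     ; ulong

print(typeof(i) + " " + typeof(f) + " " + typeof(d))
print(typeof(h) + " " + typeof(q) + " " + typeof(ul))

; default_fillvalue takes a type name and returns the default
; fill value for that type, so it never has to be typed by hand.
fv_float = default_fillvalue("float")
fv_int   = default_fillvalue("integer")
print(fv_float)
print(fv_int)
```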
Non-numeric types
Non-numeric types are types for which there is no numeric value and that cannot be coerced into a numeric type. In general, non-numeric types support only the .ne. and .eq. operations, with the exception of the string and logical types. Strings use the '+' operation to concatenate two or more strings. The logical type supports .and., .or., .not., and .xor.. Here are NCL's non-numeric types:
|Type||Type size||Default _FillValue|
|logical||N/A||Missing (assigned as _Missing)|
The logical type is generated from relational expressions or by assignment from logical literal values. Logical values are either True, False or Missing. However, note that to literally assign the Missing value to a logical variable or attribute you must use the _Missing keyword (including the initial underscore character). General discussions of logical variables will use the printed form Missing, dropping the initial underscore.
The string type is a sequence of characters of any length. Strings can be generated in a number of ways, including by assignment of a literal sequence of characters delimited by a pair of double quote characters ("an example string"). Note that a single string, whatever its length, is a scalar value, not an array. Therefore it is not possible to get substrings of a string using NCL subscripting syntax. However, there are numerous functions for accessing and manipulating substrings of strings. See http://www.ncl.ucar.edu/Document/Functions/string.shtml. The function strlen gives the length of the string, not including the NULL terminating character (0x00). Therefore, any character array derived from a string has a dimension size one greater than the value returned by strlen.
While an essential type, the character type has some peculiarities in NCL. Although it is not itself a numeric type, currently the only way to assign literal values to a character variable is by using the ASCII numerical equivalents of individual characters, followed by the suffix 'C'. For instance, to create a variable with the value 'A' you would use a statement such as: a_cap = 65C. To better track NetCDF usage, as of version 6.0.0 the character type was changed from a signed to an unsigned type. Values that have no associated printable glyph are formatted as hexadecimal numbers for output. The easiest way to create a character variable based on recognizable characters is to use the function tocharacter with a string type argument (as of version 6.0.0) or, with older versions of NCL, the function stringtocharacter. However, because of the inclusion of the NULL string termination character in any conversion from string to character, it is still a bit cumbersome to generate a single scalar character variable starting from a string:
    c_array = tocharacter("c")   ; character array of length 2: c, 0x00
    c_scalar = c_array(0)        ; scalar value: c
The file type is a reference to a file in a supported file format. Values of type file are returned by the addfile intrinsic function.
The group type is a reference to a group in a file in a supported file format (only for HDF5 files, as of version 6.0.0). Values of type group are returned by the following code:
    myfile = addfile("my.h5", "r")   ; open an HDF5 file (for read only).
    mygroup = myfile->/group_name    ; access group_name from myfile.
The list type is a container for objects of any type. The addfiles function returns a list type value.
Coercion of types
Coercion is basically the implicit conversion of data from one data type to another. This can occur during assignment of a variable of one type to a variable of a different type, when arguments that are not of the declared type are passed to a function or procedure, or when two values of different types are operands to the same operator. Operands must be of compatible types to perform the requested operation. For example, when a float value is multiplied by an integer value, the integer is automatically converted to a float value and the result value is type float. NCL allows this operation because any possible integer value is within the range of the float type. The table below indicates which conversions can take place automatically. "Yes" in a box indicates that the type specified at the left end of the box row can be converted automatically into the type at the top of the box column.
|from / to||float||double||byte||ubyte||short||ushort||integer||uint||long||ulong||int64||uint64||character||string||logical|
Here are some points to note about the table:
- Automatic conversions are allowed from any of the integer types either to float or double, but there can be loss of information because of the limited precision available for the floating point types. In particular, small differences between integers with large absolute values may be lost upon conversion. The float type can only distinguish between approximately the first 6 or 7 decimal digits of a number, while the double type distinguishes the first 14 digits. Even a double type does not have enough precision to distinguish small differences between large 64 bit integers.
- Automatic conversions are allowed in both directions between signed and unsigned integer types of the same size. This is allowed because the operation is fully reversible with no loss of information. This means that negative signed values convert to unsigned values larger than the maximum signed value and vice versa. This conversion is equivalent to adding or subtracting the maximum value of the unsigned type plus 1. For example, a negative short value -16 converts to ushort value 65520 (-16 + 65535 + 1). 40001 ushort converts to short -25535 (40001 - 65536).
- One important point, covered in more detail in the section on constraining the type and dimensionality of input parameters, has to do with how parameters work in functions and procedures. Parameters in NCL are pass-by-reference, meaning that a change to the value of a parameter inside a function or procedure changes the value of the variable passed in to it. This is important because when a parameter is expected to be a certain type, say float, and an integer variable is passed as the parameter, the integer value must be coerced to a float. Since there is no automatic conversion possible from float to integer, changes to the parameter within the function or procedure cannot be propagated back to the calling variable. A warning message is given when there is a possibility of this occurring. If you want to allow for more than one numeric type without the warning message, then you can use numeric or snumeric as the input type. However, in this case if the value is modified and you want to avoid errors and ensure that the value is propagated back to its calling variable, you must be careful that any calculations performed do not end up promoting the value to a type that is not coercible back to the input type.
- Coercion of numeric and character values to logical occurs as follows: values equal to 0 convert to False; all other values convert to True, except for missing values, which convert to Missing.
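The pass-by-reference point in the list above can be illustrated with a small sketch. The procedure name add_one is hypothetical, and the comments describe the behavior the text predicts rather than verified output.

```ncl
; Hypothetical procedure with a parameter constrained to float.
undef("add_one")
procedure add_one(x:float)
begin
  x = x + 1.0
end

i = 5         ; integer variable
add_one(i)    ; i is coerced to float on the way in; a warning is expected
print(i)      ; per the rule above, the change should not reach i

f = 5.0       ; float variable, no coercion needed
add_one(f)
print(f)      ; modified in place through pass-by-reference
```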
A special set of functions exists for forcing the coercion in cases where automatic coercion is not allowed. As of version 5.2.0 a general set of functions has been implemented that will convert any numeric type, as well as the string and character types, to the type called out in the function name whether or not implicit coercion would be allowed. Since they work even if the input type is already the same as the output type, they can greatly simplify the code compared to the old single conversion functions. These functions have names like tofloat or touint64. Values that are out-of-range for the 'to' type become missing values. If the _FillValue attribute value of the 'from' type is out-of-range for the 'to' type, it is changed to the default fill value of the 'to' type. This means that in some cases, the use of the explicit conversion functions has different results from automatic conversion. For instance, automatic conversion of a negative integer to a uint value results in a positive value larger than the maximum signed integer, but the result of passing the value to the touint function is a missing value.
Floating point types lose the decimal portion of their value when converted to any of the integer types. When a character type is converted to a numeric value, each character element is treated as the numeric equivalent of its ASCII value (e.g. 'A' is decimal 65). String conversion assumes that the string is a representation of a numeric value in string format. Each character of the string is read until a character that could not be part of a valid value of the specified type is encountered.
Prior to version 5.2.0 the only available conversion functions were specific for each type of conversion. These functions have names like doubletointeger or stringtofloat. They are deprecated but still supported. They exist for the original character, string, and numeric types, but not for the newer enumeric types. Where available, they should work just like an equivalent invocation of the appropriate "to" type function.
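The general conversion functions described above might be used as in the following sketch. The variable names are arbitrary, and the comments restate the behavior described in the text rather than captured output.

```ncl
f = 3.7
i = toint(f)       ; the decimal portion is dropped
n = toint("42")    ; the string is read as a numeric value
a = toint(65C)     ; the character is treated as its ASCII code
x = tofloat(i)     ; also works where automatic coercion would be allowed

neg = -5
u = touint(neg)    ; out of range for uint, so a missing value is expected
print(ismissing(u))
```

Note the contrast with automatic coercion: according to the text, assigning the same negative integer to a uint variable would produce a large positive value instead of a missing value.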
Creating data
Data exist in NCL either as a single scalar value or a multi-dimensional array of scalar elements. The term value in NCL can refer to either a single scalar value or a multi-dimensional array of a specific data type.
There are several ways to create data:
- By entering constant values.
- By using functions like asciiread, cbinread, fbinrecread, and addfile to read data from a file.
- By using functions like ispan or random_normal that generate new data.
- By using functions like sin that generate new data by transforming existing data.
Constants and Arrays of constants
Constant values are values that are either entered at the command line or are written textually in an NCL script. All numeric types can be expressed as constant values using type-specific single-letter suffixes directly following the number. Numbers that appear without a suffix are by default treated as type integer if there is no decimal point. Floating point constants are entered either in scientific notation or as numbers with a decimal point. Double precision constants are entered by replacing the 'e' or 'E' in normal scientific notation with 'd' or 'D', or simply by adding 'd' or 'D' as a suffix to any number. Integer constants are entered without decimal points. String constants consist of characters enclosed in double quotes ("").
The following are examples of single scalar constant values:
    Floating Point Constants: 3.141592 1e-12 2.
    Double Precision Constants: 3.14159265358979d0 1d-12 2D
    Integer Constants: 100 1
    Unsigned Integer Constants: 200I 3I
    Char Constants: 100C 1C
    Byte Constants: -100b 1b
    Ubyte Constants: 0B 243B
    Short Constants: -100h 1h
    Long Constants: -100l 1l
    Unsigned Long Constants: 100L 1L
    Int64 Constants: -100q 1q
    Unsigned Int64 Constants: 100Q 1Q
    String Constants: "a" "Hello World" "100.00"
    Logical Constants: True False _Missing
Notes:
- The only way to specify characters literally is to use the ASCII equivalent numerical value followed by the letter 'C' (e.g. 65C specifies the character 'A'). However, you can use the tochar function with a string argument to convert a string to an array of characters. Note that conversions from strings always include the terminating NULL character. This means that converting an empty string to character produces a character array of length one with contents equal to numerical zero.
- The logical constants are NCL keywords. The keyword _Missing was added in version 5.1.0 and contains the initial underscore only to avoid backwards compatibility issues where users may have used the string "Missing" as the name of a variable. When NCL outputs a logical result with this value it appears as "Missing". General discussion of logical values in this document primarily uses the printed form.
Arrays of constants can be constructed using the array designator characters "(/" and "/)" with constant values separated by commas. Multi-dimensional constant arrays can be created by nesting the array designator characters. The following are examples of constant arrays. Each example uses the assignment statement to assign the constant arrays to a variable.
1D Floating Point Constant Array:
    var0 = (/ 1.2, 2.3, 3.4, 4.5, 5.6, 6.7 /)
2D Floating Point Constant Array:
    var1 = (/ (/ 1.2, 2.3, 3.4, 4.5, 5.6, 6.7 /), (/ 7.8, 8.9, 9.1, 10.2, 11.3, 12.4/) /)
1D Double Precision Constant Array:
    var0 = (/ 1.2d, 2.3d, 3.4d, 4.5d, 5.6d, 6.7d /)
1D Integer Constant Array:
    var2 = (/ -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5 /)
  or:
    varsi = (/ -5i, -4i, -3i, -2i, -1i, 0i, 1i, 2i, 3i, 4i, 5i /)
1D Unsigned Integer Constant Array:
    varuI = (/ 85I, 64I, 43I, 12I, 21I, 0I, 1I, 2I, 3I, 4I, 5I /)
1D Byte Constant Array:
    varsb = (/ -5b, -4b, -3b, -2b, -1b, 0b, 1b, 2b, 3b, 4b, 5b /)
1D Unsigned Byte Constant Array:
    varub = (/ 85B, 64B, 43B, 12B, 21B, 0B, 1B, 2B, 3B, 4B, 5B /)
1D Short Constant Array:
    varss = (/ -5h, -4h, -3h, -2h, -1h, 0h, 1h, 2h, 3h, 4h, 5h /)
1D Unsigned Short Constant Array:
    varus = (/ 85H, 64H, 43H, 12H, 21H, 0H, 1H, 2H, 3H, 4H, 5H /)
1D Long Constant Array:
    varsl = (/ -5l, -4l, -3l, -2l, -1l, 0l, 1l, 2l, 3l, 4l, 5l /)
1D Unsigned Long Constant Array:
    varul = (/ 85L, 64L, 43L, 12L, 21L, 0L, 1L, 2L, 3L, 4L, 5L /)
1D Int64 Constant Array:
    varsq = (/ -5q, -4q, -3q, -2q, -1q, 0q, 1q, 2q, 3q, 4q, 5q /)
1D Unsigned Int64 Constant Array:
    varuq = (/ 85Q, 64Q, 43Q, 12Q, 21Q, 0Q, 1Q, 2Q, 3Q, 4Q, 5Q /)
1D Character Constant Array:
    varc = (/ 85C, 64C, 43C, 12C, 21C, 0C, 1C, 2C, 3C, 4C, 5C /)
1D String Constant Array:
    var3 = (/ "one", "two", "three", "four", "five" /)
1D Logical Constant Array:
    varlc = (/ True, True, False, _Missing, True /)
Creating new data arrays
Arrays of any numeric, string, character, logical, or graphic type can be created using the new statement. It is important to note that new is not a function; it is a statement. Because new is a statement, it is possible to use a type keyword as an argument to new. There are three possible ways to use new:
- Create an array of data with a specific missing value assigned to each element.
- Create the array of data using the default missing value.
- Create the array of data with no missing value.
The new statement takes, as parameters, an array of integer dimension sizes, the keyword for the data type, and optionally a missing value to assign to each element of the new data array, or the "No_FillValue" string to indicate no missing values are to be assigned. (Recognition of "No_FillValue" was added in version a034.)
Here are three examples showing each of the different ways to use new:
- The following creates a 5 x 6 x 7 three-dimensional float array
filled with the default missing value for float types and the
attribute _FillValue is also set:
    a = new( (/ 5, 6, 7 /), float)
    print(a)

    Variable: a
    Type: float
    Total Size: 840 bytes
                210 values
    Number of Dimensions: 3
    Dimensions and sizes: [5] x [6] x [7]
    Coordinates:
    Number Of Attributes: 1
      _FillValue : 9.96921e+36

    (0,0,0) 9.96921e+36
    (0,0,1) 9.96921e+36
    (0,0,2) 9.96921e+36
    . . .
The default missing values are shown in the charts at the top of this page in the Basic numeric types section or the Non-numeric types section. Note that the default fill values have changed in version 6.0.0. See Missing values for more information.
- The following is an example of how to assign a specific missing
value. The result is an array filled with -1e12 at every index.
a = new( (/ 5, 6, 7 /), float, -1e12)
- The following is an example of how to create an array of data with
no missing values. Note that the data array will not be initialized to
any particular value, so you must only use this option if you know you
will be initializing the whole array at some point.
a = new( (/ 5, 6, 7 /), float, "No_FillValue")
If variable a was already defined before the above statement was executed and it had a _FillValue attribute, then the attribute and its value will be unchanged even though the elements of the array now all have undefined values.
Importing data arrays and files
Data can be read in to NCL in a variety of ways. If data exist as a UNIX file in ASCII, C binary, or Fortran binary format, the data can be read in to NCL with one of the following functions respectively: asciiread, cbinread, or fbinrecread. Each of these functions requires the user to enter a string file name, a dimension size array similar to the new command, and finally a string denoting the type the data are stored in. These functions return a single array of a single data type per call. Currently, binary or ASCII files containing more than one array are not officially supported.
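A minimal sketch of the three read functions follows. All file names and the 10 x 5 dimension sizes are hypothetical, and the files are assumed to contain float data. Note that fbinrecread also takes a record number (counted from 0) as its second argument.

```ncl
; Hypothetical file names and dimension sizes, for illustration only.
dims = (/ 10, 5 /)

a = asciiread("data.txt", dims, "float")            ; ASCII text file
c = cbinread("data.bin", dims, "float")             ; C binary file
r = fbinrecread("data_f.bin", 0, dims, "float")     ; first record of a Fortran binary file

print(dimsizes(a))
```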
Another way to import data is to import from a supported file format. Supported file formats are specially recognized data formats that store variables and other information. Supported formats allow direct access to variables and other information in data files through NCL. This access is very different than the abovementioned methods of reading files. NCL has a special syntax for referencing variables in files that simplifies the importation of external data. The addfile function is used to open external files that are in a supported format. The addfile function returns a reference to a file that is used to access data within that file. This is all covered in the variables section of this reference guide.
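As a brief sketch of the file reference syntax mentioned above, assuming a NetCDF file named example.nc that contains a variable named T (both names are hypothetical):

```ncl
f = addfile("example.nc", "r")   ; open the file read-only; f is a file reference
t = f->T                         ; read variable T from the file
printVarSummary(t)               ; show type, dimensions, and attributes
```

Unlike the asciiread family, no dimension sizes or type strings are supplied here, because a supported format stores that information alongside the data.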
For specific information on a supported file format, see the Supported data format information section of the reference guide.