unitvals
)Unitvals.Rmd
Hector both allows (internally to each component, and when reading inputs) and requires (passing data between components, and writing outputs) values to have associated units. The internal unit system is simple, not providing unit conversion for example, but also lightweight and easy. Our use of units is prompted primarily by safety–to avoid the “Mars Climate Orbiter” problem of mismatched unit–as well as having easy-to-interpret outputs.
The internal C++ type is called unitval
, and its code is located in headers/data/unitval.hpp
.
Two ways to create a unitval
:
unitval x(1, U_PGC); # x now holds 1 petagram of carbon
unitval y;
y.set(2, U_CO2_PPMV);
The unitval
class supports basic mathematical operations.
# assume that y and z are already defined unitvals
unitval x = y + z; # y and z must have identical units or an error is thrown
unitval x = y * 2.0; # multiplying by a constant always works
unitval
value
Internally within components, we often simply want to work with numbers, extracting them from unitval
objects. This is straightforward, but first we have to ‘prove’ that we know the correct units:
double x = y.value(U_PGC); # we assert that y's units are petagrams C
unitval
required?
Using unitvals
within a component is optional, but it’s required when passing data between components. This falls into three categories:
unitval
optional?
In input files.
TODO.
if( varName == D_PREINDUSTRIAL_N2O ) {`
`H_ASSERT( data.date == Core::undefinedIndex() , "date not allowed" );`
`N0 = unitval::parse_unitval( data.value_str, data.units_str, U_PPBV_N2O );`
Briefly, when you need to return a missing value, use the MISSING_FLOAT
macro defined in inst/include/unitval.hpp
. This macro will return NA_REAL
(from Rcpp
) if Hector is compiled as an R package (technically, if the USE_RCPP
macro is defined) and NAN
(from std
) otherwise.
Why this complexity? R (and, by extension, Rcpp) distinguish between two kinds of “placeholder” numeric values: A missing value (NA_real_
in R; often abbreviated to NA
) represents an unobserved value; for instance, if you take annual observations from 1990 to 2000 but failed to make a measurement in 1996. On the other hand, not a number (NaN
) represents the result of an invalid mathematical operation, such as taking the square root of a negative number. Both of these values propagate through any calculations in which they are involved, both in R and C++, and both can be detected and removed by R’s is.na()
function and removed by na.omit
, na.rm = TRUE
, and friends. (The R function is.nan()
can used to check specifically for NaN
values – is.nan(NaN) => TRUE
, but is.nan(NA) => FALSE
). However, because they mean fundamentally different things (and suggest different errors), we try to preserve this distinction if we can. (For example, if some code you run returns NA
then you would suspect that required values were not set somewhere. On the other hand, if the code returns NaN
, you know that all values were set but resulted in a mathematically invalid calculation. Your approach to debugging these problems would likely be different.)