Hector both allows (within each component, and when reading inputs) and requires (when passing data between components, and when writing outputs) values to have associated units. The internal unit system is deliberately simple (it does not provide unit conversion, for example) but lightweight and easy to use. Units are used primarily for safety – to avoid the “Mars Climate Orbiter” problem of mismatched units – and to make outputs easy to interpret.
The internal C++ type is called unitval, and its code is located in inst/include/unitval.hpp. (There is also a restricted type of unitval called a fluxpool that allows only non-negative values.)
Creating a variable with associated unit
Two ways to create a unitval:
unitval x(1, U_PGC); // x now holds 1 petagram of carbon
unitval y;
y.set(2, U_CO2_PPMV);
Doing math
The unitval class supports basic mathematical operations.
// assume that y and z are already defined unitvals
unitval x = y + z; // y and z must have identical units or an error is thrown
unitval w = y * 2.0; // multiplying by a constant always works
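To make the unit check concrete, here is a minimal, self-contained sketch of a units-checked value type. Unit, TinyUnitval, and addition_is_rejected are hypothetical names invented for illustration; this is not Hector's unitval implementation, only an assumption about how such a check could look.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for illustration; not Hector's actual definitions.
enum class Unit { PgC, ppmvCO2 };

class TinyUnitval {
public:
    TinyUnitval(double v, Unit u) : val_(v), unit_(u) {}

    // Addition is only defined for operands with identical units.
    TinyUnitval operator+(const TinyUnitval& rhs) const {
        if (unit_ != rhs.unit_) {
            throw std::invalid_argument("unit mismatch in addition");
        }
        return TinyUnitval(val_ + rhs.val_, unit_);
    }

    // Scaling by a plain number keeps the unit unchanged.
    TinyUnitval operator*(double k) const {
        return TinyUnitval(val_ * k, unit_);
    }

    double raw() const { return val_; }
    Unit unit() const { return unit_; }

private:
    double val_;
    Unit unit_;
};

// Returns true if adding a and b is rejected because the units differ.
bool addition_is_rejected(const TinyUnitval& a, const TinyUnitval& b) {
    try {
        (void)(a + b);
        return false;
    } catch (const std::invalid_argument&) {
        return true;
    }
}

// Usage: same-unit sums work, mixed-unit sums throw.
void demo_math() {
    TinyUnitval y(1.0, Unit::PgC), z(2.0, Unit::PgC);
    TinyUnitval co2(400.0, Unit::ppmvCO2);
    assert((y + z).raw() == 3.0);
    assert((y * 2.0).raw() == 2.0);
    assert(addition_is_rejected(y, co2));
}
```

Failing loudly on a mismatch, rather than converting silently, matches the safety motivation described above.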
Extracting a unitval value
Within components, we often simply want to work with numbers, extracting them from unitval objects. This is straightforward, but first we have to ‘prove’ that we know the correct units:
double x = y.value(U_PGC); // we assert that y's units are petagrams C
The core will throw an error if we get the units of y wrong.
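The ‘prove the units’ idea can be sketched as an accessor that returns the raw number only when the caller names the stored unit. Unit and TinyUnitval are hypothetical illustrations rather than Hector's code, and the exception type is an assumption.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical unit tags for illustration only.
enum class Unit { PgC, ppmvCO2 };

class TinyUnitval {
public:
    TinyUnitval(double v, Unit u) : val_(v), unit_(u) {}

    // Hand back the number only if the asserted unit matches the stored one.
    double value(Unit expected) const {
        if (expected != unit_) {
            throw std::invalid_argument("asserted unit does not match stored unit");
        }
        return val_;
    }

private:
    double val_;
    Unit unit_;
};

// Usage: the correct assertion succeeds, the wrong one throws.
void demo_value_extraction() {
    TinyUnitval y(2.0, Unit::PgC);
    assert(y.value(Unit::PgC) == 2.0);

    bool threw = false;
    try {
        (void)y.value(Unit::ppmvCO2);
    } catch (const std::invalid_argument&) {
        threw = true;
    }
    assert(threw);
}
```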
When is unitval required?
Using unitvals within a component is optional, but it’s required when passing data between components. This falls into three categories:
- Querying the core for data. Read more about Hector’s internal data passing.
- Responding to a query for data.
- Output. Read more about Hector’s output visitors.
Fluxpool values
Hector v3 introduces a new internal unit class: fluxpool, a subclass of unitval defined in inst/include/fluxpool.hpp. This class imposes a restriction: values may only be zero or positive, and any negative value, or operation that results in a negative value, throws an error.
This is used for carbon fluxes and pools that may not be
negative; why this is so for pools is hopefully obvious, and fluxes we
think of as possibly being ‘negative’ (e.g., net biome production) are
in fact the difference between two nominally positive fluxes.
Currently fluxpools may not be passed between model components.
Finally, the fluxpool class is where carbon tracking is implemented; see the relevant example.
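The non-negativity rule can be illustrated with a small stand-in class. TinyFluxpool and subtraction_throws are hypothetical names, and the sketch leaves out units and carbon tracking; it only shows one way, under those assumptions, to reject negative values and negative results.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for illustration; not Hector's fluxpool class.
class TinyFluxpool {
public:
    explicit TinyFluxpool(double v) : val_(require_non_negative(v)) {}

    TinyFluxpool operator+(const TinyFluxpool& rhs) const {
        return TinyFluxpool(val_ + rhs.val_);
    }

    // Constructing the result re-runs the check, so a negative difference throws.
    TinyFluxpool operator-(const TinyFluxpool& rhs) const {
        return TinyFluxpool(val_ - rhs.val_);
    }

    double raw() const { return val_; }

private:
    static double require_non_negative(double v) {
        if (v < 0.0) {
            throw std::domain_error("fluxpool value may not be negative");
        }
        return v;
    }

    double val_;
};

// Returns true if a - b is rejected because the result would be negative.
bool subtraction_throws(const TinyFluxpool& a, const TinyFluxpool& b) {
    try {
        (void)(a - b);
        return false;
    } catch (const std::domain_error&) {
        return true;
    }
}

// Usage: keep both gross fluxes non-negative; the net is a plain number.
void demo_fluxpool() {
    TinyFluxpool uptake(3.0), release(5.0);
    assert((uptake + release).raw() == 8.0);
    assert(subtraction_throws(uptake, release));
    double net = uptake.raw() - release.raw();  // a net flux may be negative
    assert(net == -2.0);
}
```

The demo mirrors the point made above: a ‘negative’ net flux is represented as the difference between two non-negative fluxes, not as a negative fluxpool.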
Missing values
Briefly, when you need to return a missing value, use the MISSING_FLOAT macro defined in inst/include/unitval.hpp. This macro will return NA_REAL (from Rcpp) if Hector is compiled as an R package (technically, if the USE_RCPP macro is defined) and NAN (from std) otherwise.
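The conditional behaviour described above can be sketched as follows. This follows the description only; the actual definition in unitval.hpp may differ in detail, and lookup_or_missing is a hypothetical helper added for illustration.

```cpp
#include <cassert>
#include <cmath>

// Sketch of the behaviour described in the text, not a copy of Hector's header.
#ifdef USE_RCPP
#include <Rcpp.h>
#define MISSING_FLOAT NA_REAL  // R's missing value when built as an R package
#else
#define MISSING_FLOAT NAN      // plain C++ not-a-number otherwise
#endif

// Hypothetical helper: return the value if we have one, else a missing marker.
double lookup_or_missing(bool have_value, double value) {
    return have_value ? value : MISSING_FLOAT;
}

// Usage in a standalone (non-Rcpp) build.
void demo_missing() {
    assert(lookup_or_missing(true, 1.5) == 1.5);
    assert(std::isnan(lookup_or_missing(false, 0.0)));
}
```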
Why this complexity?
R (and, by extension, Rcpp) distinguishes between two kinds of “placeholder” numeric values. A missing value (NA_real_ in R; often abbreviated to NA) represents an unobserved value; for instance, if you take annual observations from 1990 to 2000 but fail to make a measurement in 1996. On the other hand, not a number (NaN) represents the result of an invalid mathematical operation, such as taking the square root of a negative number.
Both of these values propagate through any calculations in which they are involved, both in R and C++, and can be detected and removed by R’s is.na() and similar functions. (Note that the R function is.nan() can be used to check specifically for NaN values – is.nan(NaN) => TRUE, but is.nan(NA) => FALSE). However, because they mean fundamentally different things (and suggest different errors), we try to preserve this distinction if we can.
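As a small illustration of the propagation described above, the following sketch covers only the plain C++ NaN case (it does not involve R's NA). mean_of_three is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: arithmetic mean of three numbers.
double mean_of_three(double a, double b, double c) {
    return (a + b + c) / 3.0;
}

// Usage: one invalid input makes the whole result NaN.
void demo_nan_propagation() {
    double invalid = std::sqrt(-1.0);  // invalid operation yields NaN
    assert(std::isnan(invalid));
    assert(std::isnan(mean_of_three(1.0, invalid, 3.0)));
    assert(!std::isnan(mean_of_three(1.0, 2.0, 3.0)));
}
```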
See also the notes on Hector units.