
Simulate TPCs with bootstrap to propagate uncertainty
Source: vignettes/articles/tpcs-simulate-bootstrap.Rmd
Simulate thermal performance curves using bootstrap with residual resampling
Why propagate uncertainty?
Forecasting necessarily incorporates uncertainty about our knowledge of the target system (i.e., model structure, parameter uncertainty, response or predictor variable errors, etc.), which adds to uncertainty in how findings are communicated and to the unavoidable randomness of natural systems (Simmonds et al. 2022).
At least three different sources of model uncertainty should be addressed in forecasts:
- Parameter uncertainty: the accuracy of the estimated model parameters may affect the confidence of the predictions. For example, in the TPCs model fitting article, the briere2 model yields a $T_{\max}$ estimate of 36.5 ºC. Let's imagine a forecaster aiming to identify "safe" regions where the pest may not become established due to extremely high maximum temperatures (e.g., monthly maxima exceeding the pest's $T_{\max}$). It is possible that all the forecast regions have monthly maximum temperatures of about 34 ºC, which lie below the estimate of 36.5 ºC, leading the assessment to identify no risk. However, if we incorporate how the uncertainty of each parameter contributes to the variability of the predicted TPC, using simulated TPC ribbons in the plot (see, e.g., plot_uncertainties() below), there are possible scenarios in which several TPC-derived $T_{\max}$ values lie below 34 ºC, yielding a non-negligible risk likelihood (a minimal numerical illustration is sketched after this list).
- Predictor uncertainty: incorporating the variability of the predictor (in the above case, monthly maximum temperatures) will also yield a probability distribution of forecast outcomes. This would result in some scenarios with maximum temperatures above the $T_{\max}$ estimate of 36.5 ºC and others where they remain below it.
- Source data uncertainty: additionally, TPCs are usually fitted to summarized data from experiments under laboratory conditions. These measures incorporate both measurement error (at the individual level) and uncertainty measures summarizing the variability of rate estimates at the population level.
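As a minimal, purely illustrative sketch of the first point (the numbers and the tmax_boot vector below are made up, not outputs of mappestRisk): if we had a set of bootstrapped $T_{\max}$ estimates, the share of draws falling below a regional monthly maximum of 34 ºC would give a rough likelihood that such temperatures already exceed the pest's upper thermal limit.
# hypothetical bootstrapped T_max draws (ºC); in practice these would be
# derived from the simulated TPCs returned by predict_curves()
set.seed(42)
tmax_boot <- rnorm(100, mean = 36.5, sd = 1.8)
# regional monthly maximum temperature from the example above
t_region <- 34
# proportion of bootstrap draws in which T_max falls below 34 ºC, i.e.,
# scenarios where regional maxima already exceed the upper thermal limit
mean(tmax_boot < t_region)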
For now, mappestRisk enables users to explicitly account for parameter uncertainty by simulating TPCs with a bootstrapping technique based on residual resampling, as suggested and implemented by the rTPC package (see this vignette/article; Padfield, O'Sullivan, and Windram 2025) and described in the section below. Measurement uncertainty of the source data may be incorporated in future updates of the package through a weights argument of fit_devmodels() and predict_curves() (Padfield, O'Sullivan, and Pawar 2021), since these functions are based on the rTPC - nls.multstart framework, which has recently incorporated an article on how to simulate curves by weighted bootstrapping. Finally, a discussion on how to deal with predictor uncertainty in the forecasts and how to overcome communication uncertainty is given in the generate-risk-maps vignette article.
Simulate bootstrapped TPCs with predict_curves()
As mentioned above, mappestRisk includes two functions that automate the workflow suggested by the rTPC package to simulate curves and thereby propagate parameter uncertainty. We implemented the residual resampling method following the rTPC vignette suggestions, but with a raw resampling code workflow rather than the car::Boot() function, for simplicity. We opted for residual resampling rather than case resampling because the predictor data are controlled by the experimental researchers and there are usually few data points per curve, with low representation of the hot decay region of the TPC in the source data, which would make refitting new TPC models difficult in many case resampling iterations. For further insights on the methodology, please refer to the original rTPC vignette. Further updates of the package may incorporate variance modelling for heteroskedastic residuals, as well as an argument to choose between residual resampling and case resampling1.
The predict_curves() function takes the temperature and development rate data arguments, a fitted_parameters argument that requires as input the table obtained from fit_devmodels(), a model_name_2boot argument to choose which TPC model(s) to bootstrap among those available in fitted_parameters, and the number of bootstrap samples, n_boots_samples. The function implicitly extracts the residuals and the fitted values of the estimated TPC from fit_devmodels(). Then, it resamples these residuals with replacement. Next, the function automatically calculates the new resampled observations for the $b$-th iteration (i.e., $y_i^{(b)}$) as follows:

$$y_i^{(b)} = \hat{y}_i + \hat{\varepsilon}_i^{(b)}$$

where $\hat{y}_i$ represents the fitted values from the TPC model and $\hat{\varepsilon}_i^{(b)}$ denotes the resampled residuals of that model. Note that the number of iterations $b = 1, \dots, B$ is given by n_boots_samples, i.e., $B$. This results in $B$ resampled data sets. Each of them is then used to fit a new nonlinear model using fit_devmodels(); those that adequately converged (a total of $B' \leq B$ models) constitute the newly bootstrapped nonlinear regression models. Finally, the function calculates the predictions of these bootstrapped models along a temperature grid (more specifically, from 20 ºC below the minimum to 15 ºC above the maximum temperature value in the temp argument, in steps of 0.01 ºC). This results in a total of $B'$ simulated thermal performance curves that are used for propagating parameter uncertainty for inference.
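To make the residual-resampling step concrete, here is a minimal, self-contained base-R sketch of a single bootstrap iteration. It uses made-up data and a quadratic stand-in for a TPC model purely for illustration; it is not the internal implementation of predict_curves().
# made-up experimental temperatures (ºC) and development rates (1/days)
set.seed(123)
temp <- c(15, 18, 21, 24, 27, 30, 33)
rate <- c(0.02, 0.05, 0.09, 0.12, 0.13, 0.10, 0.03)
# 1. fit a simple curve and extract fitted values and residuals
fit0 <- lm(rate ~ poly(temp, 2))
fitted0 <- fitted(fit0)
resid0 <- residuals(fit0)
# 2. resample the residuals with replacement and build a new response
rate_boot <- fitted0 + sample(resid0, size = length(resid0), replace = TRUE)
# 3. refit the model to the resampled data
fit_boot <- lm(rate_boot ~ poly(temp, 2))
# 4. predict along a fine temperature grid (20 ºC below the minimum and
#    15 ºC above the maximum, every 0.01 ºC, as predict_curves() does)
temp_grid <- seq(min(temp) - 20, max(temp) + 15, by = 0.01)
pred_boot <- predict(fit_boot, newdata = data.frame(temp = temp_grid))
# repeating steps 2-4 B times yields B simulated curves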
Here is a full example using the mappestRisk functions and the aphid example data set:
# fit the TPC models first (as in the model fitting article):
data("aphid")
fitted_tpcs_aphid <- fit_devmodels(temp = aphid$temperature,
dev_rate = aphid$rate_value,
model_name = "all")
#> Warning in fit_devmodels(temp = aphid$temperature, dev_rate = aphid$rate_value, : TPC model beta had one or more parameters with
#> unexpectedly large standard errors. Please consider it for further analyses
#> Warning in fit_devmodels(temp = aphid$temperature, dev_rate = aphid$rate_value, : TPC model boatman had one or more parameters with
#> unexpectedly large standard errors. Please consider it for further analyses
#> Warning in fit_devmodels(temp = aphid$temperature, dev_rate = aphid$rate_value, : TPC model briere1 had one or more parameters with
#> unexpectedly large standard errors. Please consider it for further analyses
#> Warning in fit_devmodels(temp = aphid$temperature, dev_rate = aphid$rate_value, : TPC model joehnk had one or more parameters with
#> unexpectedly large standard errors. Please consider it for further analyses
#> Warning in fit_devmodels(temp = aphid$temperature, dev_rate = aphid$rate_value, : TPC model kamykowski had one or more parameters with
#> unexpectedly large standard errors. Please consider it for further analyses
preds_boots_aphid <- predict_curves(temp = aphid$temperature,
dev_rate = aphid$rate_value,
fitted_parameters = fitted_tpcs_aphid,
model_name_2boot = c("briere2", "lactin2"),
propagate_uncertainty = TRUE,
n_boots_samples = 10)
#> Warning in predict_curves(temp = aphid$temperature, dev_rate =
#> aphid$rate_value, : 100 iterations might be desirable. Consider increasing
#> `n_boots_samples` if possible
#>
#> Note: the simulation of new bootstrapped curves takes some time. Await patiently or reduce your `n_boots_samples`
#>
#> Bootstrapping simulations completed for briere2
#> Warning in fit_devmodels(temp = x$temp, dev_rate = x$dev_rate, model_name = unique(x$model_name)): TPC model lactin2 had one or more parameters with
#> unexpectedly large standard errors. Please consider it for further analyses
#> Warning in fit_devmodels(temp = x$temp, dev_rate = x$dev_rate, model_name = unique(x$model_name)): TPC model lactin2 had one or more parameters with
#> unexpectedly large standard errors. Please consider it for further analyses
#>
#> Bootstrapping simulations completed for lactin2
By default, the predict_curves() function propagates uncertainty by simulating as many curves as requested through the n_boots_samples argument for each of the models selected in model_name_2boot. If the bootstrap can be performed, this default configuration outputs a tibble with simulated TPCs both for plotting purposes and for thermal trait calculation. n_boots_samples is set to 100 by default. We recommend avoiding lower values, which may inaccurately reflect uncertainty, and thinking carefully before bootstrapping with larger samples, since they may strongly increase computational demands with little benefit for the purposes addressed here. If propagate_uncertainty is set to FALSE, the function outputs the same tibble with one single curve (temperatures and predictions) coming from the estimated TPC (similar to those plotted by plot_devmodels()) for applying subsequent steps of the mappestRisk suggested workflow.
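For instance, here is a sketch of a call that skips the bootstrap and keeps only the curve derived from the point estimates, reusing the objects fitted above (only arguments already documented in this article are used):
preds_estimate_aphid <- predict_curves(temp = aphid$temperature,
                                       dev_rate = aphid$rate_value,
                                       fitted_parameters = fitted_tpcs_aphid,
                                       model_name_2boot = c("briere2", "lactin2"),
                                       propagate_uncertainty = FALSE)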
The tibble output of predict_curves() can be easily visualized with plot_uncertainties(), where the curve from the model estimate is plotted as a thicker, dark orange line and the bootstrapped curves are depicted as thinner, dark blue lines forming a sort of ribbon around the central curve. If more than one model has been successfully bootstrapped, the predicted curves are plotted in separate facets.
plot_uncertainties(bootstrap_tpcs = preds_boots_aphid,
temp = aphid$temperature,
dev_rate = aphid$rate_value,
species = "Brachycaudus schwartzi",
life_stage = "Nymphs")
These simulated TPCs may guide ecologically realistic model selection and propagate parameter uncertainty into subsequent analyses. However, as with plot_devmodels(), we discourage selecting among TPC models solely on the basis of statistical information; rely instead on informed ecological criteria.
Please consider carefully whether your data are suitable for these procedures.