Pool estimates and variances obtained by analysing multiple synthetic datasets
syntheticPool.Rd
This function pools estimates and variances obtained by analysing multiple synthetic imputations (e.g. created using gFormulaImpute), using the method developed by Raghunathan et al. (2003).
Arguments
- fits
Collection of model fits produced by a call of the form with(imps, lm(y~regime)), where imps is a collection of imputed datasets of class mids.
Details
The only argument to syntheticPool is a set of model fits obtained by running an analysis on an imputed dataset collection of class mids, as created for example using the mice function in the mice package.
The function returns a table containing the overall parameter estimates, the within, between and total imputation variances, 95% confidence intervals, and p-values testing the null hypothesis that the corresponding parameters equal zero.
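For reference, the combining rules of Raghunathan et al. (2003) can be written as below. This is a sketch of the published rules, not a transcription of the package source; the symbols $\hat{q}_m$ and $\hat{v}_m$ (the point estimate and its variance estimate from the $m$-th synthetic dataset) are notation introduced here, and the labels on the right indicate the assumed correspondence with the columns of the returned table.

```latex
\begin{aligned}
\bar{q}_M &= \frac{1}{M}\sum_{m=1}^{M} \hat{q}_m && \text{(Estimate)} \\
\bar{v}_M &= \frac{1}{M}\sum_{m=1}^{M} \hat{v}_m && \text{(Within)} \\
b_M &= \frac{1}{M-1}\sum_{m=1}^{M} \left(\hat{q}_m - \bar{q}_M\right)^2 && \text{(Between)} \\
T_M &= \left(1 + \frac{1}{M}\right) b_M - \bar{v}_M && \text{(Total)} \\
\nu_M &= (M-1)\left(1 - \frac{\bar{v}_M}{(1 + 1/M)\, b_M}\right)^{2} && \text{(df)}
\end{aligned}
```

As a check against the intercept row in the Examples section (M = 10): (1 + 1/10) × 0.001763004 − 0.0008125695 ≈ 0.001126735, which agrees with the reported Total.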
It is possible for the variance estimator developed by Raghunathan et al. (2003) to be negative. In this case syntheticPool stops with a message advising you to re-impute using a larger number of imputations M and/or nSim.
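The negative-variance case can be illustrated with a small base R sketch. The helper pool_from_components is hypothetical and is not part of gFormulaMI. It assumes the Raghunathan et al. (2003) total variance, (1 + 1/M) × Between − Within, with t-based intervals, and it takes already-averaged components for a single parameter instead of a collection of model fits.

```r
# Hypothetical helper (not part of gFormulaMI): pools one parameter from its
# averaged components, assuming the Raghunathan et al. (2003) rules.
pool_from_components <- function(estimate, within, between, M) {
  # Total variance: inflated between-imputation variance minus within variance
  total <- (1 + 1 / M) * between - within
  if (total < 0) {
    # Mirrors the documented behaviour: stop instead of reporting a negative variance
    stop("Pooled variance is negative; re-impute with larger M and/or nSim.")
  }
  # Degrees of freedom for the t reference distribution
  df <- (M - 1) * (1 - within / ((1 + 1 / M) * between))^2
  se <- sqrt(total)
  half_width <- qt(0.975, df) * se
  list(
    total = total,
    df = df,
    ci = c(estimate - half_width, estimate + half_width),
    p = 2 * pt(abs(estimate / se), df, lower.tail = FALSE)
  )
}

# Intercept row from the Examples output (M = 10)
res <- pool_from_components(-0.02071539, 0.0008125695, 0.001763004, M = 10)
res$total  # approximately 0.001126735
res$df     # approximately 3.038

# A within variance that dominates the between variance triggers the error
neg <- tryCatch(
  pool_from_components(0, within = 0.01, between = 0.001, M = 10),
  error = function(e) conditionMessage(e)
)
```

Because the within variance is subtracted, the total is only guaranteed to be positive when the between-imputation variance is estimated well enough, which is why increasing M and/or nSim is the suggested remedy.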
The development of the gFormulaMI package was supported by a grant from the UK Medical Research Council (MR/T023953/1).
References
Raghunathan TE, Reiter JP, Rubin DB. 2003. Multiple imputation for statistical disclosure limitation. Journal of Official Statistics, 19(1), p.1-16.
Author
Jonathan Bartlett jonathan.bartlett1@lshtm.ac.uk
Examples
set.seed(7626)
#impute synthetic datasets under two regimes of interest using gFormulaImpute
imps <- gFormulaImpute(data = simDataFullyObs, M = 10,
                       trtVars = c("a0", "a1", "a2"),
                       trtRegimes = list(c(0, 0, 0), c(1, 1, 1)))
#> [1] "Input data is a regular data frame."
#> [1] "Variables imputed using:"
#>     l0     a0     l1     a1     l2     a2      y regime
#> "norm"     "" "norm"     "" "norm"     "" "norm"     ""
#> [1] "Predictor matrix is set to:"
#>        l0 a0 l1 a1 l2 a2 y regime
#> l0      0  0  0  0  0  0 0      0
#> a0      1  0  0  0  0  0 0      0
#> l1      1  1  0  0  0  0 0      0
#> a1      1  1  1  0  0  0 0      0
#> l2      1  1  1  1  0  0 0      0
#> a2      1  1  1  1  1  0 0      0
#> y       1  1  1  1  1  1 0      0
#> regime  1  1  1  1  1  1 1      0
#fit linear model to final outcome with regime as covariate
fits <- with(imps, lm(y~factor(regime)))
#pool results using Raghunathan et al 2003 rules
syntheticPool(fits)
#>                    Estimate       Within     Between       Total       df
#> (Intercept)     -0.02071539 0.0008125695 0.001763004 0.001126735 3.038045
#> factor(regime)2  2.96593502 0.0016251389 0.004203106 0.002998278 3.784951
#>                   95% CI L   95% CI U            p
#> (Intercept)     -0.1267874 0.08535658 5.803156e-01
#> factor(regime)2  2.8104386 3.12143146 1.304839e-06