Using tbl_regression with imputed data/pooled regression models - imputation

I've had great success using the gtsummary::tbl_regression function to display regression model results. I can't see how to use tbl_regression with pooled regression models from imputed data sets, however, and I'd really like to.
I don't have a reproducible example handy; I just wanted to see whether anyone else has found a way to work with, say, mids objects created by the mice package in tbl_regression.

In the current development version of gtsummary, it's possible to summarize models estimated on imputed data from the mice package. Here's an example:
# install dev version of gtsummary
remotes::install_github("ddsjoberg/gtsummary")
library(gtsummary)
packageVersion("gtsummary")
#> [1] ‘1.3.5.9012’
# impute the data
df_imputed <- mice::mice(trial, m = 2)
# build the model
imputed_model <- with(df_imputed, lm(age ~ marker + grade))
# present beautiful table with gtsummary
tbl_regression(imputed_model)
#> pool_and_tidy_mice: Tidying mice model with
#> `mice::pool(x) %>% mice::tidy(exponentiate = FALSE, conf.int = TRUE, conf.level = 0.95)`
Created on 2020-12-16 by the reprex package (v0.3.0)
It's important to note that you pass the mice model object to tbl_regression() BEFORE you pool the results. The tbl_regression() function needs access to the individual models in order to correctly identify the reference row and variable labels (among other things). Internally, the tidying function used on the mice model will first pool the results, then tidy the results. The code used for this process is printed to the console for transparency (as seen in the example above).
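For reference, the pooling-then-tidying step corresponds roughly to the following (an illustrative sketch based on the message printed above, not the exact gtsummary internals):
pooled <- mice::pool(imputed_model)
pooled_tidy <- mice::tidy(pooled, exponentiate = FALSE,
                          conf.int = TRUE, conf.level = 0.95)
head(pooled_tidy)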

Related

Testing a trading system on bootstrap samples using the arch library in Python

I am trying to test a hypothesis about the outperformance of a trading strategy over buy and hold. I have the original data's returns, containing 1261 observations, as the sample to be used for the bootstrap.
I want to know whether I have applied it correctly.
import pandas as pd
from arch.bootstrap import CircularBlockBootstrap

def back_test_series(x):
    # wrap the resampled returns and hand back the 'Close' series
    df = pd.DataFrame(x, columns=['Close'])
    return df.Close

bs = CircularBlockBootstrap(40, sample_return)   # block size of 40
results = bs.apply(back_test_series, 2500)       # 2500 bootstrap replications
Above, sample_return is the sample containing 2761 returns on the actual data. I created 2500 bootstrapped samples containing 2761 observations each, and then computed cumulative returns to get price time series:
time_series = []
for simu in results:
    df = pd.DataFrame(simu, columns=["Close"])
    df['Close'] = (1 + df['Close']).cumprod()   # cumulative return -> price path
    time_series.append(df)
and finally ran my backtest on the price series obtained from the bootstrap:
final_results = []
for simulation in time_series:                  # iterate over the DataFrames directly
    x = Backtesting.scrip_backtest(simulation)
    final_results.append(x)
Backtesting.scrip_backtest is my trading strategy, which returns statistics such as the buy-and-hold CAGR, the strategy CAGR, and the standard deviation of strategy returns.
My questions: can I use the bootstrap in this way? Should I use MovingBlockBootstrap or CircularBlockBootstrap?
Is it correct to run a trading strategy on bootstrapped time series as described above?
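For what it's worth, here is a hypothetical sketch of how the collected statistics could feed the outperformance test; the keys strategy_cagr and bh_cagr are assumptions, since the return format of scrip_backtest is not shown above:
import numpy as np

# Hypothetical field names: adjust to whatever scrip_backtest actually returns.
diffs = np.array([r["strategy_cagr"] - r["bh_cagr"] for r in final_results])

# Share of bootstrap samples in which the strategy fails to beat buy and hold;
# a small value is evidence of outperformance.
p_value = np.mean(diffs <= 0)
print(f"bootstrap one-sided p-value: {p_value:.3f}")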

FMU 2.0 interaction - requires parallel "container" for parameter values etc?

I work with pyfmi in Jupyter notebooks to run simulations, and I like to work interactively, evaluating incremental changes in parameters etc. A long time ago I found it necessary to introduce a dictionary that works as a "container" for parameter and initial values. Now I wonder whether there is a way to get rid of this "container", which is, after all, partly a parallel structure to the "model".
A typical workflow looks like this:
# create a figure where results from the different simulations below will be shown
model = load_fmu(fmu_model)
parDict = {}                      # the "container" for parameter and initial values
parDict['model.x_0'] = 1
parDict['model.a'] = 2
for key in parDict.keys(): model.set(key, parDict[key])
sim_res = model.simulate(final_time=10)
# plot results...
model = load_fmu(fmu_model)
parDict['model.x_0'] = 3
for key in parDict.keys(): model.set(key, parDict[key])
sim_res = model.simulate(final_time=10)
# plot results...
There is a function model.reset() that brings the state back to the default values from compilation without loading the FMU again, but you need to do more than the following:
model.reset()
parDict['model.x_0'] = 3
for key in parDict.keys(): model.set(key, parDict[key])
sim_res = model.simulate(final_time=10)
# plot results...
So this does NOT work on its own: after a reset, the parameters and initial values still need to be brought back, which means we still need parDict, though we can at least avoid the load_fmu() call.
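One small mitigation (my own sketch, not a pyfmi feature) is to wrap the reset/set/simulate steps in a helper, so that parDict is at least applied in a single place:
# Applies a dict of parameter/initial values to an already-loaded FMU and
# simulates it; final_time=10 mirrors the workflow above.
def simulate_with(model, par_dict, final_time=10):
    model.reset()                         # back to the compile-time defaults
    for name, value in par_dict.items():
        model.set(name, value)            # pyfmi's set(name, value)
    return model.simulate(final_time=final_time)

# Usage, assuming model = load_fmu(fmu_model) has already been run:
# sim_res = simulate_with(model, {'model.x_0': 3, 'model.a': 2})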

Visualizing an AutoDiff MultibodyPlant in PyDrake

I am trying to build a simple multibody plant system in Drake using the basic DrakeVisualizer. However, for my use case, I also want to be able to automatically track the derivatives through the physics simulation, so I am using the AutoDiffXd version of the system:
timestep = 1e-3
builder = DiagramBuilder_[AutoDiffXd]()
plant = MultibodyPlant(timestep)
scene_graph = SceneGraph_[AutoDiffXd]()
brick_file = FindResourceOrThrow("drake/examples/manipulation_station/models/061_foam_brick.sdf")
parser = Parser(plant)
brick = parser.AddModelFromFile(brick_file, model_name="brick")
plant.Finalize()
plant_ad = plant.ToAutoDiffXd()
plant_ad.RegisterAsSourceForSceneGraph(scene_graph)
scene_graph.AddRenderer("renderer", MakeRenderEngineVtk(RenderEngineVtkParams()))
DrakeVisualizer.AddToBuilder(builder, scene_graph)
builder.AddSystem(plant_ad)
builder.AddSystem(scene_graph)
builder.Connect(plant_ad.get_geometry_poses_output_port(), scene_graph.get_source_pose_port(plant_ad.get_source_id()))
builder.Connect(scene_graph.get_query_output_port(), plant_ad.get_geometry_query_input_port())
diagram = builder.Build()
context = diagram.CreateDefaultContext()
simulator = Simulator_[AutoDiffXd](diagram, context)
simulator.AdvanceTo(2.0)
However, when I run this, I get the following error:
File "/home/craig/Repos/drake-exps/autoDiffExperiment.py", line 102, in auto_phys
DrakeVisualizer.AddToBuilder(builder, scene_graph)
TypeError: AddToBuilder(): incompatible function arguments. The following argument types are supported:
1. (builder: pydrake.systems.framework.DiagramBuilder_[float], scene_graph: drake::geometry::SceneGraph<double>, lcm: pydrake.lcm.DrakeLcmInterface = None, params: pydrake.geometry.DrakeVisualizerParams = <pydrake.geometry.DrakeVisualizerParams object at 0x7ff6274e14b0>) -> pydrake.geometry.DrakeVisualizer
2. (builder: pydrake.systems.framework.DiagramBuilder_[float], query_object_port: pydrake.systems.framework.OutputPort_[float], lcm: pydrake.lcm.DrakeLcmInterface = None, params: pydrake.geometry.DrakeVisualizerParams = <pydrake.geometry.DrakeVisualizerParams object at 0x7ff627736730>) -> pydrake.geometry.DrakeVisualizer
Invoked with: <pydrake.systems.framework.DiagramBuilder_[AutoDiffXd] object at 0x7ff65654f8f0>, <pydrake.geometry.SceneGraph_[AutoDiffXd] object at 0x7ff656562130>
From this error, it appears the DrakeVisualizer class only accepts systems that use float scalars exclusively. So I am stuck: either I can go back to floats (but lose the autodiff differentiable-simulation functionality I was after in the first place), or continue to use AutoDiffXd systems (but be completely unable to visualize what is going on in my simulation).
Is there a way to get both that I am missing?
Sorry for the pain and inconvenience. Your description and assessment are all spot on. Most of the visualization mechanisms are float only and, in their current state, attempts to visualize an AutoDiff diagram will fail.
You have a couple of options (neither of which is appealing):
Go with one of the outcomes you've described above (no vis or no derivatives).
Put in a Drake feature request to be able to attach a visualizer to an AutoDiff diagram.
I can come up with some hacky workarounds (though it isn't immediately clear they would even work). So, if you're desperate for derivatives and visualization, they could be explored. But ultimately, the feature request and a formal Drake solution would be the best long-term resolution.
=====================================
Big update. As of #14569, the DrakeVisualizer class is now templated on the scalar type (item 2 in the list above). That has two implications:
You can build an AutoDiffXd-valued diagram with a visualizer in it (as in your example), or
you can create a double-valued diagram and scalar-convert it (i.e., diagram.ToAutoDiffXd()) into an AutoDiffXd-valued diagram; a sketch of this second option follows.
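This is a minimal sketch, assuming a post-#14569 Drake (exact names may vary by release):
from pydrake.common import FindResourceOrThrow
from pydrake.geometry import DrakeVisualizer
from pydrake.multibody.parsing import Parser
from pydrake.multibody.plant import AddMultibodyPlantSceneGraph
from pydrake.systems.framework import DiagramBuilder

# Build everything with the default (double) scalar type, visualizer included.
builder = DiagramBuilder()
plant, scene_graph = AddMultibodyPlantSceneGraph(builder, time_step=1e-3)
brick_file = FindResourceOrThrow(
    "drake/examples/manipulation_station/models/061_foam_brick.sdf")
Parser(plant).AddModelFromFile(brick_file, model_name="brick")
plant.Finalize()
DrakeVisualizer.AddToBuilder(builder, scene_graph)

diagram = builder.Build()             # double-valued diagram
diagram_ad = diagram.ToAutoDiffXd()   # AutoDiffXd-valued copy, visualizer and all
context_ad = diagram_ad.CreateDefaultContext()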

Is it possible to use the tbl_regression function with the lmer function with random effects?

I work on the antifungal activity of some molecules ("cyclos") combined with fungicides, and I want to assess the impact of these cyclos and their concentration ratio. CMI is a quantitative variable and all other variables are factors.
I have this script:
mod = lmer(CMI ~ cyclo*ratio + (1|fungicide) + (1|strains), data)
I'd like to know whether I can use tbl_regression() (from library(gtsummary)) with my lmer() model.
If so, what do I have to specify for the exponentiate argument?
If I write exponentiate = FALSE, I obtain the same values as the estimates in the classical summary(mod).
Thank you for your help
Steffi
The default behavior of tbl_regression() for mixed-effects models is to print the fixed effects only. To see the full output, including the random components, you need to override the default function for tidying up the model results using the tidy_fun= argument.
library(gtsummary)

lme4::lmer(age ~ marker + (1 | grade), trial) %>%
  tbl_regression(
    # set the tidying function to broom.mixed::tidy to show random effects
    tidy_fun = broom.mixed::tidy
  )
You can use the label= argument to update the label displayed for the random components if you wish.
The default is exponentiate = FALSE, so you don't need to specify it in the tbl_regression() call.
For more details on the tidy_fun= argument, you can review this help file: http://www.danieldsjoberg.com/gtsummary/reference/vetted_models.html
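As a quick sketch of the label= syntax, it takes "variable ~ 'new label'" formulas; here the fixed effect marker is relabeled (per the note above, the random components can be relabeled the same way once you know the variable names the tidier assigns):
lme4::lmer(age ~ marker + (1 | grade), trial) %>%
  tbl_regression(
    tidy_fun = broom.mixed::tidy,
    label = list(marker ~ "Marker Level (ng/mL)")
  )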
Hope this helps! Happy Coding!

Calculations in a table based on variable names in MATLAB

I am trying to find a better solution for calculations using data stored in a table. I have a large table with many variables (100+), from which I select a smaller sub-table with only two observations and their difference, for a smaller selection of variables. The resulting table looks, for example, like this:
                 air          bbs          bri
              _________    _________    _________

    test1        12.451        0.549       3.6987
    test2          10.2         0.47         3.99
    diff          2.251     0.078999     -0.29132
Now, I need to multiply the 'diff' row by various coefficients that differ between variables. I can get the result I need with the following code:
T(4,:) = array2table([T.air(3)*0.2*0.25, T.bbs(3)*0.1*0.25, T.bri(3)*0.7*0.6/2]);
However, I need a more flexible solution, since the selection of variables will differ between applications. I was thinking that a better solution might be to use either varfun or rowfun with a specific function that would assign the correct coefficients/equations based on variable names:
T(4,:) = varfun(@func, T(3,:), 'InputVariables', {'air' 'bbs' 'bri'});
or
T(4,:) = rowfun(@func, T(3,:), 'OutputVariableNames', T.Properties.VariableNames);
However, the current solution I have is just as inflexible as the basic calculation above:
function [air_out, bbs_out, bri_out] = func(air, bbs, bri)
    air_out = air*0.2*0.25;
    bbs_out = bbs*0.1*0.25;
    bri_out = bri*0.7*0.6/2;
end
since I need to define every input/output variable. What I need is to assign, inside the function, coefficients/equations for every variable, plus the ability to apply the function only to the variables that are present in the specific sub-table; roughly the kind of mapping sketched below.
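A hypothetical, untested sketch of what I have in mind: keep the per-variable coefficients in a containers.Map keyed by variable name and apply them only to the variables that actually appear in the current sub-table.
coefs = containers.Map( ...
    {'air', 'bbs', 'bri'}, ...
    {0.2*0.25, 0.1*0.25, 0.7*0.6/2});

for v = T.Properties.VariableNames      % loop over this sub-table's variables
    name = v{1};
    if isKey(coefs, name)
        % scale the 'diff' row (row 3) and store the result in row 4
        T.(name)(4) = T.(name)(3) * coefs(name);
    end
end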
Any suggestions?