Merging tbl_svysummary and stacked tbl_regression tables with different variable names but same labels - gtsummary

Follow-up question to "Renaming Rows in gtsummary, tbl_regression/tbl_stack":
I am now trying to merge the renamed, stacked table (Table 1) with a tbl_summary table that includes the prevalence for each of the outcomes (Table 2). However, because each renamed line of Table 1 is, in reality, just the same variable repeated over and over again, it doesn't merge with Table 2, instead creating a (Table 3) that has duplicated outcome names stacked onto one another. Any way to merge these tables so that the lines of Table 1 match seamlessly with those from Table 2?

UPDATE:
As of gtsummary v1.4.0, tbl_uvregression() accepts survey objects.
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.4.0'
# convert trial data frame to survey object
tbl <-
  survey::svydesign(
    data = trial[c("response", "death", "age", "marker")],
    ids = ~1,
    weights = ~1
  ) %>%
  # build univariate regression models
  tbl_uvregression(
    x = age,
    method = survey::svyglm,
    method.args = list(family = binomial),
    exponentiate = TRUE,
    formula = "{y} ~ {x} + marker",
    label = list(response = "Response", death = "Death"),
    hide_n = TRUE,
    include = -marker
  ) %>%
  add_n() %>%
  add_nevent() %>%
  modify_header(
    label = "**Outcome**",
    estimate = "**Age OR**"
  )
Created on 2021-04-14 by the reprex package (v2.0.0)

Related

Gtsummary table values less than -1 outputting as 1.00?

Since installing the newest version of R, all my gtsummary table values less than -1 have been outputting to 1.00. Does anyone have insight on how to fix this very weird issue?
Here is example code:
library(tidyverse)
library(gtsummary)
library(haven)
library(mice)
library(googlesheets4)
data <- read_sheet("https://docs.google.com/spreadsheets/d/1yyw-0xseZSLjD4jc8sw7IksN-S0M3vcKWHy4ksMPL4c/edit?usp=sharing")
datami <- mice(data, m = 23, seed=10)
datareg <- with(
  datami,
  lm(SUD ~ NUM + MIND + AGE + SEX + CRAVE)
)
table <- tbl_regression(
  datareg,
  estimate_fun = purrr::partial(style_ratio, digits = 2),
  pvalue_fun = ~style_pvalue(.x, digits = 2),
  add_estimate_to_reference_rows = TRUE
) %>%
  modify_header(label = "**Predictor**", estimate = "**Unstandardized Coefficient**") %>%
  modify_footnote(update = c(p.value, ci, estimate) ~ "Reference group") %>%
  modify_caption("Table: Multiple Imputation Predicting Variable")
table
Have re-installed R & gtsummary multiple times to no avail.
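You're using style_ratio() to round and format the estimates. That function is meant for odds ratios, risk ratios, and similar quantities, which are always positive, so it mishandles the negative coefficients of a linear model. Update the estimate_fun argument to use style_number() instead.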
I should update the ratio function to have better behavior when negative values are passed.
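As a minimal sketch of that fix (using mtcars rather than the imputed data from the question, which is an assumption made only to keep the example self-contained), style_number() can be wrapped in a small function and passed as estimate_fun:

```r
library(gtsummary)

# Hypothetical helper: format estimates to 2 decimals with style_number(),
# which is a general number formatter and accepts negative values
fmt_estimate <- function(x) style_number(x, digits = 2)

# Quick check on a few values, including negatives
formatted <- fmt_estimate(c(-1.234, -0.5, 2))

# Same idea inside tbl_regression(); both slopes in this model are negative
model <- lm(mpg ~ cyl + wt, data = mtcars)
tbl <- tbl_regression(model, estimate_fun = fmt_estimate)
```

The same function should drop into the original call in place of purrr::partial(style_ratio, digits = 2).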

Is there a way to add a column with the test statistics in gtsummary::tbl_regression()?

Is there a way to add the Statistic for each variable as a column in a tbl_regression() with {gtsummary}? In psychology it is not uncommon for reviewers to expect this column in the tables.
When using {sjPlot}, you just need to add the show.stat parameter, and the Statistic column will appear showing the t value from summary(model).
library(gtsummary)
library(sjPlot)
model <- lm(mpg ~ cyl + wt, data = mtcars)
tab_model(model, show.stat = TRUE)
With {gtsummary} I can't find anywhere how to add an equivalent column when using tbl_regression(). Would just want to show a column with the values already present in the model.
library(gtsummary)
model <- lm(mpg ~ cyl + wt, data = mtcars)
stats_to_include <- c("r.squared", "adj.r.squared", "nobs")
tbl_regression(model, intercept = TRUE, show_single_row = everything()) %>%
  bold_p() %>%
  add_glance_table(include = all_of(stats_to_include))
Yep! The statistic column is already in the table, but it's hidden by default. You can unhide it (and other columns) with the modify_column_unhide() function. Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.5.2'
model <- lm(mpg ~ cyl + wt, data = mtcars)
tbl <-
  tbl_regression(model, intercept = TRUE, show_single_row = everything()) %>%
  bold_p() %>%
  add_glance_table(include = c(r.squared, adj.r.squared, nobs)) %>%
  modify_column_unhide(columns = c(statistic, std.error))
Created on 2022-02-07 by the reprex package (v2.0.1)
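To see which columns are available to unhide, one option is to inspect the table_body data frame stored in the gtsummary object. This is a rough sketch that assumes the column names match those used in the example above (statistic, std.error):

```r
library(gtsummary)

model <- lm(mpg ~ cyl + wt, data = mtcars)
tbl_plain <- tbl_regression(model, intercept = TRUE)

# table_body holds every column, including those hidden when printing
available_columns <- names(tbl_plain$table_body)
print(available_columns)
```

Any name listed there can, in principle, be passed to modify_column_unhide().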
FYI, if you're interested, we also support journal themes in gtsummary. You can, for example, load the JAMA theme and the gtsummary results will be auto-formatted for publication in JAMA. We don't have a psychology theme, but if you file a GitHub Issue, we can collaborate on adding one. We can add things like showing the statistic column by default (and much more).
https://www.danieldsjoberg.com/gtsummary/reference/theme_gtsummary.html
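A short sketch of how the JAMA theme mentioned above might be applied and then cleared. The model is the same mtcars example used earlier in this thread; the exact formatting changes depend on the installed gtsummary version:

```r
library(gtsummary)

# Set the journal theme; tables built afterwards pick up its formatting
theme_gtsummary_journal(journal = "jama")

model <- lm(mpg ~ cyl + wt, data = mtcars)
tbl_jama <- tbl_regression(model)

# Return to the default formatting for the rest of the session
reset_gtsummary_theme()
```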

Rstudio gtsummary table with three categorical variables

I'm trying to figure out how to use the gtsummary package for my dataset.
I have three categorical variables, two of which are set as strata. I'm not interested in the frequency of each sample but want the numeric value in the table.
Currently I'm using this simple code (x, y and z are my categorical variables, whereas SOC holds the numeric values; y and z should go in the header as strata).
Data %>%
  select(x, y, z, SOC) %>%
  tbl_strata(
    strata = z,
    .tbl_fun =
      ~ .x %>%
        tbl_summary(by = y, missing = "no"),
    statistic = list(all_continuous() ~ "{mean} ({sd})")
  ) %>%
  modify_caption("**Soil organic carbon [%]**") %>%
  bold_labels()
Edit
Let's take the trial dataset as an example:
trial %>%
  select(trt, grade, stage, age, marker) %>%
  tbl_strata(
    strata = stage,
    ~ tbl_summary(.x, by = grade, missing = "no"),
    statistic = list(all_continuous() ~ "{mean} ({sd})")
  ) %>%
  bold_labels()
What I'm looking for is a table like this, but without the frequency showing of each treatment (Drug A, B). I only want the age and marker to show up in my table but organized by treatment. I'd like to have the first section showing only the age and marker for the group that received Drug A. Then a section showing the same for Drug B.
Edit 2
Your input is exactly what I am looking for. With the trial dataset it works perfectly fine. However, once I put in my data, the numeric values are all in one column instead of in rows. I also still get the frequencies and I can't figure out why. I use exactly the same code and the same number of variables, and my table looks somewhat like this:
I think nesting calls to tbl_strata() (one merging and the other stacking) will get you what you're after. Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.5.2'
tbl <-
  trial %>%
  select(trt, grade, stage, age, marker) %>%
  tbl_strata(
    strata = trt,
    function(data) {
      data %>%
        tbl_strata(
          strata = stage,
          ~ tbl_summary(
            .x,
            by = grade,
            statistic = all_continuous() ~ "{mean} ({sd})",
            missing = "no"
          ) %>%
            modify_header(all_stat_cols() ~ "**{level}**")
        )
    },
    .combine_with = "tbl_stack",
    .combine_args = list(group_header = c("Drug A", "Drug B"))
  ) %>%
  bold_labels()
Created on 2022-03-04 by the reprex package (v2.0.1)

Using mutate to swap columns renders a negative Difference output

I have used the gtsummary package (great package btw) since last month on my reports.
Now I am building a cohort table that will show pre-test value, post-test value, difference (p.p) and a t-test p-value.
I'm trying to build the same table as I built with Arsenal, with pre-test in the first column, post-test in the second column, and so on, but the difference column shows a negative value when I don't expect one.
I used mutate() to swap the two columns, because without it the post-test is shown as the first column. I also tried moving the post-test rows to the top of the dataset itself, as suggested in some posts, but to no avail.
homesurvey %>%
  select(period, CB2.Textbooks, CB2.Magazines, CB2.Newspapers, CB2.Religious_books, CB2.Coloring_books, CB2.Comics) %>%
  mutate(period = forcats::fct_rev(period)) %>%
  tbl_summary(
    by = period,
    statistic = all_continuous() ~ "{n} ({sd})",
    label = list(
      CB2.Textbooks ~ "Textbooks",
      CB2.Magazines ~ "Magazines",
      CB2.Newspapers ~ "Newspapers",
      CB2.Religious_books ~ "Religious books",
      CB2.Coloring_books ~ "Coloring books",
      CB2.Comics ~ "Comics"
    )
  ) %>%
  add_difference() %>%
  modify_column_hide(ci)
The difference comes out negative even though I don't expect it to.
Output
I am looking at your example output (thanks for including it). The first row shows 82% in the pre-assessment and 96% in the post-assessment. 82 - 96 is roughly -15%, so the difference should indeed be negative.
You can, however, flip the estimate by multiplying it by -1. Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.5.0'
tbl <-
  trial %>%
  select(response, death, trt) %>%
  tbl_summary(by = trt, missing = "no") %>%
  add_difference() %>%
  modify_column_hide(ci) %>%
  # you can flip the difference estimate by multiplying it by -1
  modify_table_body(
    ~ .x %>%
      dplyr::mutate(estimate = -1 * estimate)
  )
Created on 2021-11-10 by the reprex package (v2.0.1)
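The sign convention described above (first column minus second column) can be checked by hand in base R. The data below are made up purely for illustration and are not the homesurvey data from the question:

```r
# Hypothetical toy data: 1 = household owns textbooks, 0 = does not
period <- factor(rep(c("pre", "post"), each = 5), levels = c("pre", "post"))
owns_textbooks <- c(1, 1, 1, 1, 0, 1, 1, 1, 1, 1)

# Proportion per group, in factor-level order (pre first, post second)
props <- tapply(owns_textbooks, period, mean)
diff_pre_first <- props[[1]] - props[[2]]   # pre minus post

# Reversing the level order swaps the columns and therefore the sign
period_rev <- factor(period, levels = rev(levels(period)))
props_rev <- tapply(owns_textbooks, period_rev, mean)
diff_post_first <- props_rev[[1]] - props_rev[[2]]   # post minus pre
```

With pre-test in the first column the difference is negative whenever the post-test value is higher, which is why multiplying the estimate by -1 is needed to report post minus pre while keeping pre-test first.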

Add Control and Cases numbers instead of add_nevent()

The following code produces "N" and "Event N" columns as part of the univariate regression table.
I have a case-control dataset, and I would like columns "Cases" and "Controls" giving the numbers of cases and controls instead.
Cases and controls are determined by the variable "response" in the code below: response = 1 means "case" and response = 0 means "control".
How can I do this?
thanks,
nelly
tbl_uv_nevent_ex <-
  trial[c("response", "trt", "age", "grade")] %>%
  tbl_uvregression(
    method = glm,
    y = response,
    method.args = list(family = binomial)
  ) %>%
  add_nevent()
You can do this by adding the number of controls to the .$table_body data frame. I've included an example below. There is a sticking point at the moment: after you add the number of controls to the data frame that will be printed, the new column also has to be added to gtsummary's internal set of printing instructions. This step is a headache for now, but we're working on a solution to make it accessible to users. Here is the first draft of that function: http://www.danieldsjoberg.com/gtsummary/dev/reference/modify_table_header.html
In the meantime, here's how you can get that done:
library(gtsummary)
tbl <-
  trial[c("response", "trt", "age")] %>%
  tbl_uvregression(
    method = glm,
    y = response,
    method.args = list(family = binomial),
    exponentiate = TRUE
  ) %>%
  add_nevent()
# add the number of controls to table
tbl$table_body <-
  tbl$table_body %>%
  dplyr::mutate(
    n_nonevent = N - nevent
  ) %>%
  dplyr::relocate(n_nonevent, .after = nevent)
# updating internal info with new column (this part will not be required in the future)
tbl$table_header <-
  gtsummary:::table_header_fill_missing(
    tbl$table_header,
    tbl$table_body
  )
# print tbl with Case and Control Ns
tbl %>%
  modify_header(
    list(
      nevent ~ "**Case N**",
      n_nonevent ~ "**Control N**"
    )
  )
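As a sanity check on the n_nonevent = N - nevent step, the case and control counts can also be computed directly from the data. This is only a sketch: it assumes each univariate model drops rows where either the covariate or response is missing, and the counts data frame is a hypothetical helper, not part of gtsummary:

```r
library(gtsummary)  # used here only for the trial data set

covariates <- c("trt", "age")

# For each covariate, count cases (response == 1) and controls (response == 0)
# among rows where both the covariate and the outcome are observed
counts <- do.call(rbind, lapply(covariates, function(v) {
  complete <- !is.na(trial[[v]]) & !is.na(trial$response)
  data.frame(
    variable = v,
    N        = sum(complete),
    cases    = sum(trial$response[complete] == 1),
    controls = sum(trial$response[complete] == 0)
  )
}))
print(counts)
```

The controls column should match N minus the event count for each row of the regression table.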