RStudio gtsummary table with three categorical variables - gtsummary

I'm trying to figure out how to use the gtsummary package for my dataset.
I have three categorical variables, two of which are set as strata. I'm not interested in the frequency of each sample; I want the numeric values in the table.
Currently I'm using this simple code (x, y, and z are my categorical variables, whereas SOC holds the numeric values). y and z should go in the header (strata).
Data %>%
  select(x, y, z, SOC) %>%
  tbl_strata(strata = z,
             .tbl_fun =
               ~ .x %>%
                 tbl_summary(by = y, missing = "no"),
             statistic = list(all_continuous() ~ "{mean} ({sd})")) %>%
  modify_caption("**Soil organic carbon [%]**") %>%
  bold_labels()
Edit
Let's take the trial dataset as an example:
trial %>%
  select(trt, grade, stage, age, marker) %>%
  tbl_strata(strata = stage,
             ~ tbl_summary(.x, by = grade, missing = "no"),
             statistic = list(all_continuous() ~ "{mean} ({sd})")) %>%
  bold_labels()
What I'm looking for is a table like this, but without showing the frequency of each treatment (Drug A, Drug B). I only want age and marker to show up in my table, organized by treatment: first a section showing only the age and marker for the group that received Drug A, then a section showing the same for Drug B.
Edit 2
Your input is exactly what I am looking for. With the trial dataset it works perfectly fine. However, once I put in my data, the numeric values are all in one column instead of in rows. I also still get the frequencies and I can't figure out why. I use exactly the same code and the same number of variables, and my table looks somewhat like this:

I think nesting calls to tbl_strata() (one merging and the other stacking) will get you what you're after. Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.5.2'
tbl <-
  trial %>%
  select(trt, grade, stage, age, marker) %>%
  tbl_strata(
    strata = trt,
    function(data) {
      data %>%
        tbl_strata(
          strata = stage,
          ~ tbl_summary(
            .x,
            by = grade,
            statistic = all_continuous() ~ "{mean} ({sd})",
            missing = "no"
          ) %>%
            modify_header(all_stat_cols() ~ "**{level}**")
        )
    },
    .combine_with = "tbl_stack",
    .combine_args = list(group_header = c("Drug A", "Drug B"))
  ) %>%
  bold_labels()
Created on 2022-03-04 by the reprex package (v2.0.1)
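The answer above hard-codes the group headers. As a hedged sketch of one way to avoid that with your own data, the headers can be derived from the stratifying column. This assumes tbl_strata() stacks the strata in sorted order of their values, and the helper name summarize_by_stage is made up for illustration:

```r
library(gtsummary)
library(dplyr)

# Assumption: tbl_strata() stacks the strata in sorted order, so the sorted
# unique values of trt line up with the stacked tables.
group_headers <- sort(unique(trial$trt))

# Hypothetical helper: age and marker summarized by grade within each stage.
summarize_by_stage <- function(data) {
  tbl_strata(
    data,
    strata = stage,
    .tbl_fun = ~ tbl_summary(
      .x,
      by = grade,
      statistic = all_continuous() ~ "{mean} ({sd})",
      missing = "no"
    ) %>%
      modify_header(all_stat_cols() ~ "**{level}**")
  )
}

# Outer call stacks one block per treatment, labelled from the data.
tbl <-
  trial %>%
  select(trt, grade, stage, age, marker) %>%
  tbl_strata(
    strata = trt,
    .tbl_fun = summarize_by_stage,
    .combine_with = "tbl_stack",
    .combine_args = list(group_header = group_headers)
  ) %>%
  bold_labels()
```

If the headers come out in the wrong order for your data, check the type and level order of the stratifying column first.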

Related

R gtsummary package: How to Hide Colums in Summary Table

I'm using gtsummary to prepare my tables, and I'm trying to hide one of the group columns: the third column, labeled "1, N = 61".
Below is the code I ran:
library(gtsummary)
trial <- trial
trial %>%
  tbl_summary(by = response,
              statistic = list(trt ~ "{n}/{N} ({p}%)")) %>%
  add_overall() %>%
  add_p() %>%
  modify_column_hide(columns = "1")
I was expecting the third column, "1, N = 61", to be hidden.
The two group columns (levels 0 and 1) are named "stat_1" and "stat_2" internally, respectively.
So, to hide the one you asked about:
trial %>%
  tbl_summary(by = response,
              statistic = list(trt ~ "{n}/{N} ({p}%)")) %>%
  modify_column_hide(columns = "stat_2") %>%
  add_p() %>%
  add_overall()
If you also want to hide the parameters of the statistical tests you run, you can do it as described here:
https://rdrr.io/cran/gtsummary/man/modify_column_hide.html
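To find the internal name of any column rather than guessing, a small sketch like the following may help. It uses show_header_names() to print the internal names next to the displayed headers, and hides the column only after the table is otherwise complete. Restricting the table to trt with include is just to keep the example short:

```r
library(gtsummary)

tbl <-
  trial %>%
  tbl_summary(by = response, include = trt) %>%
  add_overall() %>%
  add_p()

# Print the internal column names next to their displayed headers.
show_header_names(tbl)

# Hide a column by its internal name once the table is complete.
tbl_hidden <- tbl %>% modify_column_hide(columns = stat_2)
```

Here stat_0 is the overall column added by add_overall(), while stat_1 and stat_2 are the two levels of the by variable.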

Trying to display both mean and median with gtsummary

What is the correct syntax to display both median and mean of a continuous variable using tbl_continuous? Also, is it possible to display on 2 lines as you can do with tbl_summary and the continuous2 argument?
Code below is just displaying medians (see image).
comparison.data %>%
  select(imaging, los.minutes, acuity) %>%
  tbl_continuous(
    by = imaging,
    variable = los.minutes,
    statistic =
      los.minutes ~ c("{mean} ({sd})",
                      "{median} ({sd})")
  ) %>%
  modify_spanning_header(all_stat_cols() ~ "**Imaging status**")
Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.6.1'
tbl <-
  trial %>%
  tbl_continuous(
    variable = age,
    by = trt,
    include = grade,
    statistic = ~"{mean} \n{median}"
  ) %>%
  as_gt() %>%
  gt::fmt_markdown(columns = everything())
Created on 2022-08-24 by the reprex package (v2.0.1)
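For comparison, the two-line layout the question mentions is available in tbl_summary() through the "continuous2" type. The sketch below is not a drop-in replacement for tbl_continuous(), since it summarizes age by treatment without crossing it with a second categorical variable. The choice of statistics is only an example:

```r
library(gtsummary)

# "continuous2" puts each statistic string on its own row under the variable.
tbl <-
  trial %>%
  tbl_summary(
    by = trt,
    include = age,
    type = age ~ "continuous2",
    statistic = age ~ c("{mean} ({sd})", "{median} ({p25}, {p75})"),
    missing = "no"
  )
```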

Is there a way to add a column with the test statistics in gtsummary::tbl_regression()?

Is there a way to add the statistic for each variable as a column in tbl_regression() with {gtsummary}? In psychology it is not uncommon for reviewers to expect this column in the tables.
When using {sjPlot}, you just need to add the show.stat parameter, and the Statistic column will appear, showing the t value from summary(model).
library(gtsummary)
library(sjPlot)
model <- lm(mpg ~ cyl + wt, data = mtcars)
tab_model(model, show.stat = TRUE)
With {gtsummary} I can't find anywhere how to add an equivalent column when using tbl_regression(). I just want to show a column with the values already present in the model.
library(gtsummary)
model <- lm(mpg ~ cyl + wt, data = mtcars)
stats_to_include <- c("r.squared", "adj.r.squared", "nobs")
tbl_regression(model, intercept = TRUE, show_single_row = everything()) %>%
  bold_p() %>%
  add_glance_table(include = all_of(stats_to_include))
Yep! The statistic column is already in the table, but it's hidden by default. You can unhide it (and other columns) with the modify_column_unhide() function. Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.5.2'
model <- lm(mpg ~ cyl + wt, data = mtcars)
tbl <-
  tbl_regression(model, intercept = TRUE, show_single_row = everything()) %>%
  bold_p() %>%
  add_glance_table(include = c(r.squared, adj.r.squared, nobs)) %>%
  modify_column_unhide(columns = c(statistic, std.error))
Created on 2022-02-07 by the reprex package (v2.0.1)
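To see which columns are present but hidden, one option is to inspect the table's styling information. This is a rough sketch that relies on the internal table_styling$header structure (with its column and hide fields), which is an implementation detail and could change between versions:

```r
library(gtsummary)

model <- lm(mpg ~ cyl + wt, data = mtcars)
tbl <- tbl_regression(model)

# Internal structure (assumption: stable enough for interactive inspection).
header <- tbl$table_styling$header

# Names of columns that exist in the table body but are not displayed.
hidden_columns <- header$column[header$hide]
hidden_columns
```

Any name listed there can be passed to modify_column_unhide().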
FYI, if you're interested, we also support journal themes in gtsummary. You can, for example, load the JAMA theme and the gtsummary results will be auto-formatted for publication in JAMA. We don't have a psychology theme, but if you file a GitHub issue, we can collaborate on adding one. We can add things like showing the statistic column by default (and much more).
https://www.danieldsjoberg.com/gtsummary/reference/theme_gtsummary.html

Using mutate to swap columns renders a negative Difference output

I have been using the gtsummary package (great package, by the way) for my reports since last month.
Now I am building a cohort table that shows the pre-test value, post-test value, difference (p.p.), and a t-test p-value.
I'm trying to build the same table I built with Arsenal, with pre-test in the first column, post-test in the second column, and so on, but the difference column shows a negative value when it isn't supposed to.
I used mutate() to swap the two columns, because without it the post-test appears as the first column. I also tried swapping the order of the post-test rows in the dataset itself, as I read in some posts, but to no avail.
homesurvey %>%
  select(period, CB2.Textbooks, CB2.Magazines, CB2.Newspapers, CB2.Religious_books, CB2.Coloring_books, CB2.Comics) %>%
  mutate(period = forcats::fct_rev(period)) %>%
  tbl_summary(by = period,
              statistic = all_continuous() ~ "{n} ({sd})",
              label = list(CB2.Textbooks ~ "Textbooks",
                           CB2.Magazines ~ "Magazines",
                           CB2.Newspapers ~ "Newspapers",
                           CB2.Religious_books ~ "Religious books",
                           CB2.Coloring_books ~ "Coloring books",
                           CB2.Comics ~ "Comics")
  ) %>%
  add_difference() %>%
  modify_column_hide(ci)
It shows a negative difference even though it isn't supposed to.
I am looking at your example output (thanks for including it). The first row shows 82% in the pre-assessment and 96% in the post-assessment. Pre minus post is 82% - 96%, roughly -15 percentage points (the displayed percentages are rounded), so the difference should indeed be negative.
You can, however, flip the estimate by multiplying it by -1. Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.5.0'
tbl <-
  trial %>%
  select(response, death, trt) %>%
  tbl_summary(by = trt, missing = "no") %>%
  add_difference() %>%
  modify_column_hide(ci) %>%
  # you can flip the difference estimate by multiplying it by -1
  modify_table_body(
    ~ .x %>%
      dplyr::mutate(estimate = -1 * estimate)
  )
Created on 2021-11-10 by the reprex package (v2.0.1)
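To confirm the direction of the subtraction by hand, the raw rates can be compared in base R. This sketch uses the trial example data and assumes, as the answer above describes, that the difference is the first column minus the second:

```r
library(gtsummary)  # loaded only for the trial example data

# Response rate in each treatment arm (missing responses dropped).
rates <- tapply(trial$response, trial$trt, mean, na.rm = TRUE)

# First group minus second group, the direction described in the answer.
diff_a_minus_b <- unname(rates["Drug A"] - rates["Drug B"])

# Reversing the subtraction only flips the sign.
diff_b_minus_a <- -1 * diff_a_minus_b
```

Multiplying by -1 changes only the sign of the estimate, which is why the example above also hides the confidence interval column rather than leaving an unflipped interval next to a flipped estimate.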

Change the default Statistical test performed by "add_p()" function in gtsummary summary tables

I am using gtsummary package to generate summary tables.
I would like to do the following:
1. Have add_p() perform a two-proportions z-test (using stats::prop.test) for the proportions across the "by" variable, instead of the chi-square test of independence.
2. Display in the footnote that the statistical test performed is the "2-sample test for equality of proportions with continuity correction".
How can I do that within this example code?
trial2 <- trial %>% select(trt, grade)
trial3 <- trial2[-which(trial2$grade == "III"), ]
trial4 <- droplevels(trial3)
trial4 %>%
  tbl_summary(
    by = trt,
    statistic = list(all_continuous() ~ "{mean} ({sd})",
                     all_categorical() ~ "{n} / {N} ({p}%)"),
    digits = all_continuous() ~ 2,
    label = grade ~ "Tumor Grade"
  ) %>%
  add_p()
Thank you!
You have two options. First, you can build a custom p-value function that calculates the p-value with prop.test(). There is an example of this in the add_p.tbl_summary() help file.
The second (and easier) option is to install the current development version of the package from GitHub. In this version, the prop.test() option is already built in. Example below!
remotes::install_github("ddsjoberg/gtsummary")
library(gtsummary)
packageVersion("gtsummary")
#> [1] ‘1.3.5.9017’
trial %>%
  select(response, death, trt) %>%
  tbl_summary(by = trt) %>%
  add_p(test = everything() ~ "prop.test") %>%
  modify_footnote(p.value ~ "2-sample test for equality of proportions with continuity correction")
You may also want to check out the new function add_difference() that also reports the prop.test() p-value along with differences between groups.
trial %>%
  select(trt, response, death) %>%
  tbl_summary(by = trt,
              statistic = all_dichotomous() ~ "{p}%",
              missing = "no") %>%
  modify_footnote(all_stat_cols() ~ NA) %>%
  add_n() %>%
  add_difference(estimate_fun = ~paste0(style_sigfig(. * 100), "%"))
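For the first option mentioned above (a custom p-value function), a minimal sketch might look like the following. The function name prop_test_custom is made up for illustration, and the sketch assumes a dichotomous variable whose second table column is the "success" level. Custom test functions receive data, variable, and by, and should return a data frame containing p.value:

```r
library(gtsummary)
library(dplyr)

# Hypothetical custom test: two-sample prop.test() on a dichotomous variable.
prop_test_custom <- function(data, variable, by, ...) {
  # Rows are the by groups, columns are the variable's levels (NAs dropped).
  tab <- table(data[[by]], data[[variable]])
  # Assumption: the second column is the level counted as a "success".
  res <- stats::prop.test(x = tab[, 2], n = rowSums(tab))
  data.frame(p.value = res$p.value, method = res$method)
}

tbl <-
  trial %>%
  select(response, death, trt) %>%
  tbl_summary(by = trt, missing = "no") %>%
  add_p(test = everything() ~ "prop_test_custom")
```

Returning the method string alongside the p-value keeps the test description available for the footnote; if it is not picked up in your version, modify_footnote() can set it explicitly as in the answer above.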