I had an R Markdown script that had been running without a problem. After I updated R and RStudio on my M1 Mac, I now get an error from tbl_svysummary.
I have set up my survey data, and it used to work.
Here is the code:
#The data frame is labelled for having the questions on tables, this was working well. An example of the labelling:
label(srvydat$q3) <- "Q3. What is the postcode and suburb/place that you live in?"
#with each variable having a label
##the data setting it as a survey data
srvydatdes <- svydesign(id=~1, data= srvydat, weights=~thew)
## on only one variable summary it works still
cuales <- srvydat %>% select(starts_with("q4")) %>% names()
srvydatdes %>% tbl_svysummary(include=all_of(cuales), statistic=list(all_categorical()~"{p}%" ))
#no problem with above, but when I do a "By " table
cuales <- srvydat %>% select(starts_with(c("q9","q5a"))) %>% names()
chana <- srvydatdes %>% tbl_svysummary(include=all_of(cuales), by=q5a,statistic=list(all_categorical()~"{p}%" ))
#here I get the following error:
Error in mutate():
! Problem while computing df_stats = pmap(...).
Caused by error in left_join():
! Can't join on x$by x y$by because of incompatible types.
ℹ x$by is of type >.
ℹ y$by is of type <factor<38ed8>>>.
Backtrace:
... %>% ...
dplyr:::left_join.data.frame(., svy_p, by = c("by", "variable_levels"))
I have checked that the variable being used is a factor.
And it used to work with previous versions.
Thanks
UPDATE
If I set up the survey design without the labelling, then tbl_svysummary works with by (of course I lose the labels, and I would still like to keep them).
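One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the labels are set with Hmisc's `label<-`, each labelled column also gains a `labelled` class, and the join inside tbl_svysummary may then see two different types for the by variable. A minimal base R sketch of two workarounds follows. The data frame and the helper `strip_labelled_class()` are hypothetical stand-ins, not part of the original script.

```r
# Hypothetical stand-in for srvydat; column names mirror the question
srvydat_demo <- data.frame(
  q5a  = factor(c("Yes", "No", "Yes")),
  thew = c(1.2, 0.8, 1.0)
)

# Option 1: store the label as a plain attribute, so the column
# keeps its ordinary "factor" class
attr(srvydat_demo$q5a, "label") <- "Q5a. Example question text"
class(srvydat_demo$q5a)

# Option 2 (hypothetical helper): drop an extra "labelled" class from a
# column that already carries it, keeping the label attribute itself
strip_labelled_class <- function(x) {
  class(x) <- setdiff(class(x), "labelled")
  x
}

# Simulated column with the extra class, as a labelling helper might leave it
already_labelled <- structure(
  factor(c("a", "b")),
  class = c("labelled", "factor"),
  label = "Some question"
)
cleaned <- strip_labelled_class(already_labelled)
class(cleaned)
```

{gtsummary} picks up the `label` attribute for row labels, so the question text should still appear in the table if the mismatch really does come from the extra class.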
Is there a way to add the Statistic for each variable as a column in a tbl_regression() table with {gtsummary}? In psychology it is not uncommon for reviewers to expect this column in the tables.
When using {sjPlot}, just need to add the show.stat parameter, and the Statistic column will appear showing the t value from summary(model).
library(gtsummary)
library(sjPlot)
model <- lm(mpg ~ cyl + wt, data = mtcars)
tab_model(model, show.stat = TRUE)
With {gtsummary} I can't find how to add an equivalent column when using tbl_regression(). I just want to show a column with the values already present in the model.
library(gtsummary)
model <- lm(mpg ~ cyl + wt, data = mtcars)
stats_to_include =c("r.squared", "adj.r.squared", "nobs")
tbl_regression(model, intercept = TRUE, show_single_row = everything()) %>%
bold_p() %>%
add_glance_table(include = all_of(stats_to_include))
Yep! The statistic column is already in the table, but it's hidden by default. You can unhide it (and other columns) with the modify_column_unhide() function. Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.5.2'
model <- lm(mpg ~ cyl + wt, data = mtcars)
tbl <-
tbl_regression(model, intercept = TRUE, show_single_row = everything()) %>%
bold_p() %>%
add_glance_table(include = c(r.squared, adj.r.squared, nobs)) %>%
modify_column_unhide(columns = c(statistic, std.error))
Created on 2022-02-07 by the reprex package (v2.0.1)
FYI, if you're interested, we also support journal themes in gtsummary. You can, for example, load the JAMA theme and the gtsummary results will be auto-formatted for publication in JAMA. We don't have a psychology theme, but if you file a GitHub Issue, we can collaborate on adding one. We can add things like showing the statistic column by default (and much more).
https://www.danieldsjoberg.com/gtsummary/reference/theme_gtsummary.html
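As a brief sketch of what loading a journal theme looks like, the example below applies the JAMA theme to the regression table from the question and then restores the defaults. The object name `tbl_jama` is made up for illustration.

```r
library(gtsummary)

# Apply the JAMA journal theme to all tables created from here on
theme_gtsummary_journal(journal = "jama")

model <- lm(mpg ~ cyl + wt, data = mtcars)
tbl_jama <- tbl_regression(model, intercept = TRUE)

# Restore the default gtsummary formatting when done
reset_gtsummary_theme()
```

Because the theme is set globally, resetting it afterwards keeps later tables in the same session from inheriting the journal formatting.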
I'm trying to figure out how to use the gtsummary package for my dataset.
I have three categorical variables, two of which are set as strata. I'm not interested in the frequency of each sample but want the numeric value in the table.
Currently I'm using this simple code (x, y, and z are my categorical variables, whereas SOC holds the numeric values; y and z should go in the header as strata).
Data %>%
select(x, y, z , SOC) %>%
tbl_strata(strata=z,
.tbl_fun =
~ .x %>%
tbl_summary(by = y , missing = "no"),
statistic = list(all_continuous()~ "{mean} ({sd})" ))%>%
modify_caption("**Soil organic carbon [%]**")%>%
bold_labels()
Edit
Let's take the trial dataset as an example:
trial %>%
select(trt, grade, stage, age, marker) %>%
tbl_strata(strata=stage,
~tbl_summary(.x, by = grade , missing = "no"),
statistic = list(all_continuous()~ "{mean} ({sd})"
))%>%
bold_labels()
What I'm looking for is a table like this, but without the frequency showing of each treatment (Drug A, B). I only want the age and marker to show up in my table but organized by treatment. I'd like to have the first section showing only the age and marker for the group that received Drug A. Then a section showing the same for Drug B.
Edit 2
Your input is exactly what I am looking for. With the trial dataset it works perfectly fine. However, once I put in my data, the numeric values are all in one column instead of in rows. I also still get the frequencies and I can't figure out why. I use exactly the same code and the same number of variables, and my table looks somewhat like this:
I think nesting calls to tbl_strata() (one merging and the other stacking) will get you what you're after. Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.5.2'
tbl <-
  trial %>%
  select(trt, grade, stage, age, marker) %>%
  tbl_strata(
    strata = trt,
    function(data) {
      data %>%
        tbl_strata(
          strata = stage,
          ~ tbl_summary(
            .x,
            by = grade,
            statistic = all_continuous() ~ "{mean} ({sd})",
            missing = "no"
          ) %>%
            modify_header(all_stat_cols() ~ "**{level}**")
        )
    },
    .combine_with = "tbl_stack",
    .combine_args = list(group_header = c("Drug A", "Drug B"))
  ) %>%
  bold_labels()
Created on 2022-03-04 by the reprex package (v2.0.1)
I have used the gtsummary package (great package btw) since last month on my reports.
Now I am building a cohort table that will show pre-test value, post-test value, difference (p.p) and a t-test p-value.
I'm trying to build the same table as I built with Arsenal, with pre-test in the first column, post-test in the second column and so on, but the difference column shows a negative value when it isn't supposed to.
I used mutate() to swap the two columns, because without it the post-test appears as the first column. I also tried moving the post-test rows to the top of the dataset itself, as suggested in some posts, but to no avail.
homesurvey %>%
select(period, CB2.Textbooks, CB2.Magazines, CB2.Newspapers, CB2.Religious_books, CB2.Coloring_books, CB2.Comics) %>%
mutate(period = forcats::fct_rev(period)) %>%
tbl_summary(by = period,
statistic = all_continuous() ~ "{n} ({sd})",
label = list (CB2.Textbooks ~ "Textbooks",
CB2.Magazines ~ "Magazines",
CB2.Newspapers ~ "Newspapers",
CB2.Religious_books ~ "Religious books",
CB2.Coloring_books ~ "Coloring books",
CB2.Comics ~ "Comics")
)%>%
add_difference() %>%
modify_column_hide(ci)
It shows a negative difference even if it isn't supposed to be.
Output
I am looking at your example output (thanks for including it). The first row shows 82% in the pre-assessment and 96% in the post-assessment. The difference is pre minus post, 82% - 96%, which is negative (shown as -15% in your table), so the difference should indeed be negative.
You can, however, flip the estimate by multiplying it by -1. Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.5.0'
tbl <-
  trial %>%
  select(response, death, trt) %>%
  tbl_summary(by = trt, missing = "no") %>%
  add_difference() %>%
  modify_column_hide(ci) %>%
  # you can flip the difference estimate by multiplying it by -1
  modify_table_body(
    ~ .x %>%
      dplyr::mutate(estimate = -1 * estimate)
  )
Created on 2021-11-10 by the reprex package (v2.0.1)
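To make the sign convention concrete, here is a small base R sketch with made-up scores. It assumes, as the answer above describes, that the difference is the first column minus the second, so reversing the factor levels (what `forcats::fct_rev()` does in the question) flips the sign but not the magnitude.

```r
# Made-up scores: three pre-test and three post-test observations
scores <- c(3, 4, 5, 6, 7, 8)
period <- factor(rep(c("pre", "post"), each = 3), levels = c("pre", "post"))

# Group means in level order: pre first, post second
means <- tapply(scores, period, mean)
diff_pre_first <- means[["pre"]] - means[["post"]]

# Base R equivalent of forcats::fct_rev(): post first, pre second
period_rev <- factor(period, levels = rev(levels(period)))
means_rev <- tapply(scores, period_rev, mean)
diff_post_first <- means_rev[[1]] - means_rev[[2]]

diff_pre_first   # first column minus second with pre first
diff_post_first  # same magnitude, opposite sign
```

So a positive "post minus pre" difference can come either from putting post-test in the first column or from negating the estimate after the fact, as in the answer.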
The following code produces an N column of "N" and "Event N" as part of the univariate regression table.
I have a case control dataset and I would like to have the columns "Cases" and "Controls" giving the numbers of cases and controls instead.
The "cases" and "controls" are determined by the variable "response" in the code below, e.g. response = 1 means "case", while response = 0 means "control".
How can I do this?
thanks,
nelly
tbl_uv_nevent_ex <-
trial[c("response", "trt", "age", "grade")] %>%
tbl_uvregression(
method = glm,
y = response,
method.args = list(family = binomial)
) %>%
add_nevent()
You can do this by adding the number of controls to the .$table_body data frame. I've included an example below. There is a sticking point at the moment: after you add the number of controls to the data frame that will be printed, the new column also has to be added to the internal set of printing instructions in gtsummary. This step is a headache now, but we're working on a solution to make it accessible to users. Here is the first draft of that function: http://www.danieldsjoberg.com/gtsummary/dev/reference/modify_table_header.html
In the meantime, here's how you can get that done:
library(gtsummary)
tbl <-
  trial[c("response", "trt", "age")] %>%
  tbl_uvregression(
    method = glm,
    y = response,
    method.args = list(family = binomial),
    exponentiate = TRUE
  ) %>%
  add_nevent()
# add the number of controls to table
tbl$table_body <-
  tbl$table_body %>%
  dplyr::mutate(
    n_nonevent = N - nevent
  ) %>%
  dplyr::relocate(n_nonevent, .after = nevent)
# updating internal info with new column (this part will not be required in the future)
tbl$table_header <-
  gtsummary:::table_header_fill_missing(tbl$table_header,
                                        tbl$table_body)
# print tbl with Case and Control Ns
tbl %>%
  modify_header(
    list(nevent ~ "**Case N**",
         n_nonevent ~ "**Control N**")
  )
```
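The snippet above relies on the internal `table_header` object, which later {gtsummary} releases replaced, so it may not run on a current installation. Below is a sketch of the same idea using `modify_table_body()`. It assumes a version in which the regression table body carries `N_obs` and `N_event` columns; those column names are an assumption here, so check `names(tbl$table_body)` on your own installation first. `tbl_cc` and `N_nonevent` are names made up for this example.

```r
library(gtsummary)

tbl_cc <-
  trial[c("response", "trt", "age")] %>%
  tbl_uvregression(
    method = glm,
    y = response,
    method.args = list(family = binomial),
    exponentiate = TRUE,
    hide_n = TRUE # hide the default N column; case/control Ns replace it
  ) %>%
  # controls = all observations minus events (assumes N_obs and N_event exist)
  modify_table_body(
    ~ .x %>%
      dplyr::mutate(N_nonevent = N_obs - N_event) %>%
      dplyr::relocate(c(N_event, N_nonevent), .before = estimate)
  ) %>%
  # assigning a header also unhides the two columns
  modify_header(
    list(N_event ~ "**Case N**",
         N_nonevent ~ "**Control N**")
  )
```

No private `gtsummary:::` functions are needed in this form, which is the main reason to prefer it where it is available.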
Most collaborators prefer tables in Word format. With the advent of rmarkdown, knitr, gtsummary and flextable this is finally coming of age, but I cannot wrap my head around how to generate the final table below without manually setting the indentation. I think table 1 below leaves far too much air between the rows, but I cannot figure out how to tighten the row spacing programmatically (I tried autofit, height, height_all and hrule without obtaining the desired output). Instead, I used the compact style in Word to generate table 2 below. However, I would then have to insert the indentation for the cyl categories manually. Does anyone know how this can be done programmatically?
---
title: "testing T´s"
output:
  word_document:
    reference_docx: temp.docx
  html_document:
    df_print: paged
editor_options:
  chunk_output_type: inline
---
Plain
====
```{r results='asis',echo=FALSE,message=FALSE}
library(gtsummary)
library(flextable)
set_gtsummary_theme(theme_gtsummary_jama())
a <- mtcars[1:20,c(1,2,9,4)]
b <- tbl_summary(a,
missing="ifany",
by=am,
type=list(cyl~"categorical"))%>%
bold_labels() %>%
add_p() %>% add_overall()
```
Flextable
====
```{r results='asis',echo=FALSE,message=FALSE}
fl <- gtsummary::as_flextable(b) %>% font(fontname = "Bodoni 72",part = "all") %>% fontsize(size=8,part="all") %>% autofit(add_h = -.5)
fl
```
At the moment, there is no simple way to do this. But I have included a code example that I think does solve your problem.
With {flextable}, the order in which the functions are called matters. Running as_flextable() and then appending additional calls doesn't seem to get you what you want.
The alternative is to save the calls, insert the new flextable function calls where needed, then evaluate the calls. That is what is done in the example below.
---
title: "Untitled"
output: word_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE, message = FALSE)
```
```{r}
library(tidyverse)
library(gtsummary)
library(flextable)
set_gtsummary_theme(theme_gtsummary_jama())
tbl <-
mtcars[1:20, c(1, 2, 9, 4)] %>%
tbl_summary(
missing = "ifany",
by = am,
type = list(cyl ~ "categorical")
) %>%
bold_labels() %>%
add_p() %>%
add_overall()
```
### Default Flextable
```{r}
gtsummary::as_flextable(tbl)
```
### Compact Flextable
```{r}
# this function inserts additional flextable calls, then evaluates the calls
update_flextable_calls <- function(x, call_list, after) {
  # saving calls that create the flextable
  x_calls <- gtsummary::as_flextable(x, return_calls = TRUE)
  # adding new calls at `after=`
  after_n <- names(x_calls) %in% after %>% which()
  x_calls <- c(
    x_calls[1:after_n],
    call_list,
    x_calls[(after_n + 1):length(x_calls)]
  )
  # evaluating calls
  x_calls %>%
    unlist() %>%
    purrr::compact() %>%
    # concatenating expressions with %>% between each of them
    purrr::reduce(function(x, y) rlang::expr(!!x %>% !!y)) %>%
    # evaluating expressions
    eval()
}
# list of calls that make a table compact
compact_calls <- list(
  rlang::expr(font(fontname = "Bodoni 72", part = "all")),
  rlang::expr(fontsize(size = 8, part = "all")),
  rlang::expr(padding(padding.top = 0, part = "all")),
  rlang::expr(padding(padding.bottom = 0, part = "all"))
)
# adding the compact calls, and evaluating them
update_flextable_calls(
  x = tbl, # gtsummary table
  call_list = compact_calls, # calls that make flextable compact
  after = "footnote" # add calls after the "footnote" functions
)
```
This obviously isn't a great permanent solution. We have a theme called theme_gtsummary_compact() that makes the {gt} tables compact with smaller font and reduced padding. We can update the theme to also make flextables more compact! I'd love it if you created an issue on GitHub to update theme_gtsummary_compact() for flextables, and we can collaborate on a solution that works well for you.
https://github.com/ddsjoberg/gtsummary/issues/new/choose
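For reference, a short sketch of how the existing compact theme is switched on, using the same mtcars subset as above. As the answer notes, the theme targets {gt} output; whether it also tightens flextable output depends on the installed {gtsummary} version, so treat that as something to verify. `tbl_compact` is a name made up for this example.

```r
library(gtsummary)

# Turn on the compact theme for all tables created from here on
theme_gtsummary_compact()

tbl_compact <-
  mtcars[1:20, c(1, 2, 9, 4)] %>%
  tbl_summary(
    by = am,
    type = list(cyl ~ "categorical")
  )

# Restore the default theme afterwards
reset_gtsummary_theme()
```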