gtsummary inline statement in gt resulting in NA next to percentage


I use the inline statement (inline_text()) from the gtsummary package in R Markdown. However, I get a strange result when I use it with a certain variable.
The problem happens when a variable's label is identical to one of its levels. Here is the problem demonstrated with the trial data frame that comes with the package.
library(labelled)  # provides var_label()
var_label(trial) <- list(trt = "Drug A")
tbl1 <- trial %>%
  select(trt) %>%
  tbl_summary()
inline_text(tbl1, variable = trt, level = "Drug A")
it results in:
[1] NA "98 (49%)"
Any idea why this is happening?
Here is my very minimalistic YAML:
title: "hello"
author: "ebay"
date: "3/5/2021"
and my setup chunk:
library(gtsummary)
knitr::opts_chunk$set(error = F, echo = F, warning = F, fig.width=6.3, fig.height=4.5, fig.align = "center")

The labels of the variable and its levels shouldn't be 100% identical.
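A minimal sketch of that workaround, assuming the only change needed is a variable label that differs from every level. The label "Treatment" is an arbitrary choice for illustration, not something from the original question:

```r
library(gtsummary)

# Give trt a label that is NOT identical to one of its levels ("Drug A", "Drug B").
# "Treatment" is a hypothetical label chosen for this sketch.
tbl1 <- tbl_summary(
  trial,
  include = trt,
  label = trt ~ "Treatment"
)

# With distinct labels, the level lookup should match a single row
res <- inline_text(tbl1, variable = trt, level = "Drug A")
res
```

The same idea applies if the label is set with var_label() instead of the label argument: any text that does not exactly match a level name should avoid the ambiguity.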

Related

I need to create a table where I summarize several continuous variables by two categorical variables

In the first figure you can see what I've been trying, with no success.
"ciclo" and "etnibee" are the two categorical variables.
In the second figure you can see what I would like to get.
Expected Outcomes
Please help me, thanks in advance.

The table is possible to construct using the various building blocks of tables available in gtsummary. Admittedly, it's not the easiest, though. Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.4.1'
library(tidyverse)
fun1 <- function(data, variable, by) {
  # extract variable label
  lbl <- attr(data[[variable]], "label") %||% variable
  # construct table
  data %>%
    nest(data = -all_of(by)) %>%
    arrange(.data[[by]]) %>%
    rowwise() %>%
    mutate(
      tbl =
        tbl_summary(
          data = data,
          include = variable,
          missing = "no",
          label = list(as.character(.data[[by]])) %>% setNames(.env$variable)
        ) %>%
        modify_header(stat_0 ~ paste0("**", lbl, "**")) %>%
        modify_table_body(~.x %>% mutate(variable = .env$by)) %>%
        list()
    ) %>%
    pull(tbl) %>%
    tbl_stack(quiet = TRUE)
}
# now stratify all these results by another variable
final_tbl <-
  tbl_strata(
    trial,
    strata = trt,
    ~c("age", "marker") %>%
      # now add multiple variable columns
      map(function(v) fun1(data = .x, variable = v, by = "grade")) %>%
      tbl_merge() %>%
      modify_spanning_header(everything() ~ NA),
    .combine_with = "tbl_stack"
  )
Created on 2021-07-10 by the reprex package (v2.0.0)

Using mgcv gam() and gtsummary tbl_uvregression()

How do we specify a smoothing spline fit for certain variables in tbl_uvregression() with method = gam?
data %>%
  select(outcome, predictors) %>%
  tbl_uvregression(
    method = gam,
    y = outcome,
    method.args = list(family = binomial),
    exponentiate = T)
For example if I want to indicate s(x1) in the gam model formula for variable x1, how do we add that in the above code?

You cannot wrap the variable in a function, like s(), in tbl_uvregression(). You will need to construct the individual tables with tbl_regression(), then stack them on top of one another. Code example below! It is a bit strange, though, because the smoothed terms don't have a single odds ratio, so you're just getting a table of p-values.
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.4.1'
library(tidyverse)
library(mgcv)
#> Loading required package: nlme
#>
#> Attaching package: 'nlme'
#> The following object is masked from 'package:dplyr':
#>
#> collapse
#> This is mgcv 1.8-35. For overview type 'help("mgcv-package")'.
tbl_uv <-
  tibble(variable = c("age", "marker")) %>%
  rowwise() %>%
  mutate(
    # build reg models
    tbl =
      glue::glue("response ~ s({variable})") %>%
      as.formula() %>%
      gam(data = trial, family = binomial) %>%
      tbl_regression() %>%
      list()
  ) %>%
  # stack the regression tables
  pull() %>%
  tbl_stack()
Created on 2021-05-20 by the reprex package (v2.0.0)

How to modify the default variable type defined by "all_categorical()" in "gtsummary" when the mean of an ordinal variable is wanted?

The variable "Var2" is treated as a categorical variable by default, but sometimes the mean (sd) is needed instead. So I am interested in how to modify this.
data_table_1 =
  data %>%
  dplyr::select(group, var1, var2)
data_table_1 %>%
  tbl_summary(by = group, missing = "no",
              statistic = list(all_continuous() ~ "{mean} ± {sd}",
                               all_categorical() ~ "{n} ({p}%)"),
              digits = list(all_continuous() ~ c(2, 2))) %>%
  add_p(test = list(all_continuous() ~ "pttest2", all_categorical() ~ "pttest2"),
        pvalue_fun = function(x) sprintf(x, fmt='%#.3f'))

The function tbl_summary() does its best to guess the type of summary that best suits the data...but this is not always how you'd like to summarize your data. To update the default summary type, use the type= argument. In this case you'd want to include type = list(Var2 ~ "continuous") to summarize the data continuously.
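As a rough illustration of the type= argument, here is a small sketch on the built-in mtcars data, which stands in for the asker's data (their data frame is not shown). The variable carb has only a few distinct values, so it would be summarized as categorical by default; am plays the role of group:

```r
library(gtsummary)

# mtcars is a stand-in data set for this sketch; carb takes only a handful of
# distinct values, so tbl_summary() would treat it as categorical by default.
tbl <- tbl_summary(
  mtcars,
  by = am,
  include = carb,
  type = list(carb ~ "continuous"),               # override the guessed type
  statistic = all_continuous() ~ "{mean} ({sd})", # mean (sd) instead of counts
  missing = "no"
)
tbl
```

Once the type is overridden, the variable is also matched by all_continuous() in the statistic and digits arguments rather than by all_categorical().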
Hope this helps!

R isn't recognizing the date field I have sent as an input to Rstudio in .csv format

[This is my sample data.]
What I am trying to achieve is forecasting in R, with dates read from a CSV file via RStudio.
When I try to change the data type of the date field in my input using as.Date(my_date_field, "%Y-%m-%d"), class(my_date_field) returns "Date", but printing the values of my_date_field gives only NAs.
So I am unable to forecast on a timeline basis at all.
Please help me sort out this issue.
The code I've used for forecasting is:
library(forecast)
library(lubridate)
FitData <- read.csv("~//Power BI//fit.csv")
Fitdataset <- aggregate(FitData$Metric ~ FitData$PED, data = FitData, FUN= sum)
Fitdataset$FitData$PED<- as.Date(Fitdataset$FitData$PED, format="%y-%d-%m")
ts_FitData <- ts(Fitdataset$FitData$Metric, frequency=12, start=c(Fitdataset$FitData$PED`1,1))
decom <- stl(ts_FitData, s.window = "periodic")
pred <- forecast(decom, h = 7)
plot(pred)
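One common cause of the all-NA result described above is a format string that does not match how the dates are actually written in the file. The sketch below uses made-up date strings in day/month/year order; the real layout of the PED column is not shown in the question, so treat the formats here as assumptions:

```r
# Hypothetical date strings; the actual contents of the CSV column are unknown
x <- c("05/03/2021", "17/04/2021")

# The format does not match the strings, so every element becomes NA,
# even though the result still has class "Date"
bad <- as.Date(x, format = "%Y-%m-%d")
class(bad)
bad

# A format that matches the strings (day/month/4-digit year) parses correctly
good <- as.Date(x, format = "%d/%m/%Y")
good
```

When reading a CSV, it can help to look at the raw strings first, for example with head() on the column, and then build the format from what is actually there. Note that %y is a two-digit year while %Y is a four-digit year.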

Remove variable labels attached with foreign/Hmisc SPSS import functions

As usual, I have an SPSS file that I've imported into R with the spss.get function from the Hmisc package. I'm bothered by the labelled class that Hmisc::spss.get adds to all variables in the data.frame, and want to remove it.
The labelled class gives me headaches when I try to run ggplot, or even when I want to do some menial analysis! One solution would be to remove the labelled class from each variable in the data.frame. How can I do that? Is that possible at all? If not, what are my other options?
I really want to avoid re-editing variables "from scratch" with as.data.frame(lapply(x, as.numeric)) and as.character where applicable... And I certainly don't want to run SPSS and remove labels manually (I don't like SPSS, nor care to install it)!
Thanks!

Here's how I get rid of the labels altogether. Similar to Jyotirmoy's solution but works for a vector as well as a data.frame. (Partial credits to Frank Harrell)
clear.labels <- function(x) {
  if(is.list(x)) {
    for(i in 1 : length(x)) class(x[[i]]) <- setdiff(class(x[[i]]), 'labelled')
    for(i in 1 : length(x)) attr(x[[i]],"label") <- NULL
  }
  else {
    class(x) <- setdiff(class(x), "labelled")
    attr(x, "label") <- NULL
  }
  return(x)
}
Use as follows:
my.unlabelled.df <- clear.labels(my.labelled.df)
EDIT
Here's a bit of a cleaner version of the function, same results:
clear.labels <- function(x) {
  if(is.list(x)) {
    for(i in seq_along(x)) {
      class(x[[i]]) <- setdiff(class(x[[i]]), 'labelled')
      attr(x[[i]],"label") <- NULL
    }
  } else {
    class(x) <- setdiff(class(x), "labelled")
    attr(x, "label") <- NULL
  }
  return(x)
}

A belated note/warning regarding class membership in R objects. The correct way to identify "labelled" is not to test with an is function or equality (==) but rather with inherits. Methods that test a specific position in the class vector will not pick up cases where the existing classes are not in the assumed order.
You can avoid creating "labelled" variables in spss.get with the argument: , use.value.labels=FALSE.
w <- spss.get('/tmp/my.sav', use.value.labels=FALSE, datevars=c('birthdate','deathdate'))
The code from Bhattacharya could fail if the class of the labelled vector were simply "labelled" rather than c("labelled", "factor") in which case it should have been:
class(x[[i]]) <- NULL # no error from assignment of empty vector
The error you report can be reproduced with this code:
> b <- 4:6
> label(b) <- 'B Label'
> str(b)
Class 'labelled' atomic [1:3] 4 5 6
..- attr(*, "label")= chr "B Label"
> class(b) <- class(b)[-1]
Error in class(b) <- class(b)[-1] :
invalid replacement object to be a class string
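To illustrate the inherits() advice above, here is a minimal base R sketch. The helper name strip_label is hypothetical (it is not part of Hmisc), and the test objects are built by hand with structure() rather than with spss.get, so this is only an approximation of what an imported variable looks like:

```r
# Hypothetical helper: drop the "labelled" class and the "label" attribute
# from a single vector, wherever "labelled" sits in the class vector
strip_label <- function(x) {
  if (inherits(x, "labelled")) {
    remaining <- setdiff(class(x), "labelled")
    if (length(remaining) == 0) {
      x <- unclass(x)        # "labelled" was the only class
    } else {
      class(x) <- remaining  # keep e.g. "factor"
    }
  }
  attr(x, "label") <- NULL
  x
}

# Hand-built stand-ins for variables returned by spss.get()
b <- structure(4:6, class = "labelled", label = "B Label")
f <- structure(factor(c("a", "b")), class = c("factor", "labelled"), label = "F Label")

b_clean <- strip_label(b)
f_clean <- strip_label(f)

# Apply to every column of a data frame while keeping it a data frame
df <- data.frame(id = 1:3)
df$b <- b
df[] <- lapply(df, strip_label)
```

Because the check uses inherits(), it works whether "labelled" comes first or last in the class vector, and the single-class case is handled without the assignment error shown above.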

You can try out the read.spss function from the foreign package.
A rough and ready way to get rid of the labelled class created by spss.get
for (i in 1:ncol(x)) {
  z<-class(x[[i]])
  if (z[[1]]=='labelled'){
    class(x[[i]])<-z[-1]
    attr(x[[i]],'label')<-NULL
  }
}
But can you please give an example where labelled causes problems?
If I have a variable MAED in a data frame x created by spss.get, I have:
> class(x$MAED)
[1] "labelled" "factor"
> is.factor(x$MAED)
[1] TRUE
So well-written code that expects a factor (say) should not have any problems.

Suppose:
library(Hmisc)
w <- spss.get('...')
You could remove the labels of a variable called "var1" by using:
attributes(w$var1)$label <- NULL
If you also want to remove the class "labelled", you could do:
class(w$var1) <- NULL
or if the variable has more than one class:
class(w$var1) <- class(w$var1)[-which(class(w$var1)=="labelled")]
Hope this helps!

Well, I figured out that the unclass function can be used to remove classes (who would have guessed, eh?!):
library(Hmisc)
# let's presuppose that variable x is gathered through spss.get() function
# and that x is factor
> class(x)
[1] "labelled" "factor"
> foo <- unclass(x)
> class(foo)
[1] "integer"
It's not the most elegant solution; just imagine back-converting a bunch of vectors... If anyone tops this, I'll check it as an answer...
