I'm using a postgres db for a shiny app, and I'm having trouble getting a dplyr query to work.
I have the following reactive. si.division is a dataframe, and input$si_division is a select input:
si_division_selected <- reactive({
  si.division %>%
    filter(division_name %in% input$si_division) %>%
    select(division_code) %>%
    unlist(use.names = FALSE)
})
I'm trying to pass this into a dplyr query using src_pool:
industry_division_code <- src_pool(pool) %>% tbl("si_alldata") %>%
  translate_sql(division_code %in% si_division_selected()) %>%
  select(industry_code)
I'm getting the following error:
Error in UseMethod: no applicable method for 'select_' applied to an
object of class "c('sql', 'character')
I have also tried:
industry_division_code <- src_pool(pool) %>% tbl("si_alldata") %>%
  filter(division_code %in% si_division_selected()) %>%
  select(industry_code)
Which returns:
Error in postgresqlExecStatement(conn, statement, ...) : RS-DBI
driver: (could not Retrieve the result : ERROR: syntax error at or
near "SI_DIVISION_SELECTED" LINE 5: WHERE ("division_code" IN
SI_DIVISION_SELECTED()))
If I load the file into R instead of using the database I have no issues:
industry_division_code <- si_alldata %>%
  filter(division_code %in% si_division_selected()) %>%
  select(industry_code)
If you want to keep passing si_division_selected() directly in the filter, you should be able to force its evaluation with rlang's !! (unquote) operator, so the line would read: filter(division_code %in% !!si_division_selected()). Without it, dbplyr tries to translate the call itself into SQL, which is why SI_DIVISION_SELECTED() shows up in the error. That said, saving the result to a variable first, as in your current solution, would be my preferred approach.
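Both options can be compared without a live database. The sketch below is only an illustration: the plain function standing in for the Shiny reactive, the example codes "A" and "B", and the column types are assumptions, and dbplyr's lazy_frame() with simulate_postgres() replaces the real pool connection so that the generated SQL can be inspected.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for the Shiny reactive: a plain function returning codes
si_division_selected <- function() c("A", "B")

# Simulated remote table (assumed columns); no database connection is needed
si_alldata <- lazy_frame(
  division_code = character(),
  industry_code = character(),
  con = simulate_postgres()
)

# Option 1: evaluate the function first, then filter on the local variable
codes <- si_division_selected()
q1 <- si_alldata %>%
  filter(division_code %in% codes) %>%
  select(industry_code)

# Option 2: unquote with !! so the call is evaluated before SQL translation
q2 <- si_alldata %>%
  filter(division_code %in% !!si_division_selected()) %>%
  select(industry_code)

# Inspect the SQL that would be sent to the database
show_query(q1)
show_query(q2)
```

In both cases the function call should no longer appear in the generated SQL; the values it returned are inlined instead.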
I was trying to show only the "Yes" level in the output and keep "No" in the background, instead of printing all levels.
I tried the code below:
table4 <- Age_group_socio %>%
  tbl_summary(
    by = `Age england`,
    missing = "no",
    value = list(c(
      `Cardiac diseases` ~ 'Yes', Hypertension ~ 'Yes', `Liver diseases` ~ 'Yes',
      `Renal diseases` ~ 'Yes', Diabetes ~ 'Yes', `Neurological diseases` ~ "Yes",
      Malignancy ~ 'Yes', Malaria ~ 'Yes', HIV ~ 'Yes',
      `Other immune deficiency diseases` ~ 'Yes', Tuberculosis ~ 'Yes',
      `Other chronic lung diseases` ~ 'Yes', Measles ~ "Yes"
    ))
  ) %>%
  bold_labels()
Error: 'value' argument must be a list of formulas or named list (see ?syntax). LHS of the formula is the variable specification, and the RHS is the value specification: list(stage ~ "T1")
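The error message asks for a flat list of formulas, whereas list(c(...)) nests the formulas one level deeper, so dropping the c() wrapper is a likely fix. A minimal sketch of the form the message describes, assuming a small made-up data frame (the column names and values below are hypothetical and only mimic the real data):

```r
library(gtsummary)

# Hypothetical toy data; check.names = FALSE keeps the spaces in column names
df <- data.frame(
  `Age group`        = c("young", "old", "young", "old"),
  `Cardiac diseases` = c("Yes", "No", "Yes", "No"),
  Hypertension       = c("No", "Yes", "Yes", "No"),
  check.names = FALSE
)

# value is a plain list of formulas: variable on the LHS, level to show on the RHS
tbl <- tbl_summary(
  df,
  by = `Age group`,
  missing = "no",
  value = list(`Cardiac diseases` ~ "Yes", Hypertension ~ "Yes")
)
tbl
```

Names containing spaces need backticks on the left-hand side of each formula.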
In the first figure you can see what I've been trying with no success.
"ciclo" and "etnibee" are two categorical variables
In the second figure you can see what I wish I could get...
Expected Outcomes
Please help me, thanks in advance.
The table is possible to construct using the various building blocks of tables available in gtsummary. Admittedly, it's not the easiest, though. Example below!
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.4.1'
library(tidyverse)
fun1 <- function(data, variable, by) {
  # extract variable label
  lbl <- attr(data[[variable]], "label") %||% variable

  # construct table
  data %>%
    nest(data = -all_of(by)) %>%
    arrange(.data[[by]]) %>%
    rowwise() %>%
    mutate(
      tbl =
        tbl_summary(
          data = data,
          include = variable,
          missing = "no",
          label = list(as.character(.data[[by]])) %>% setNames(.env$variable)
        ) %>%
        modify_header(stat_0 ~ paste0("**", lbl, "**")) %>%
        modify_table_body(~.x %>% mutate(variable = .env$by)) %>%
        list()
    ) %>%
    pull(tbl) %>%
    tbl_stack(quiet = TRUE)
}

# now stratify all these results by another variable
final_tbl <-
  tbl_strata(
    trial,
    strata = trt,
    ~c("age", "marker") %>%
      # now add multiple variable columns
      map(function(v) fun1(data = .x, variable = v, by = "grade")) %>%
      tbl_merge() %>%
      modify_spanning_header(everything() ~ NA),
    .combine_with = "tbl_stack"
  )
Created on 2021-07-10 by the reprex package (v2.0.0)
How do we specify a smoothing spline fit for certain variables in tbl_uvregression() with method = gam?
data %>%
  select(outcome, predictors) %>%
  tbl_uvregression(
    method = gam,
    y = outcome,
    method.args = list(family = binomial),
    exponentiate = TRUE)
For example if I want to indicate s(x1) in the gam model formula for variable x1, how do we add that in the above code?
You cannot wrap the variable in a function, like s(), in tbl_uvregression(). You will need to construct the individual tables with tbl_regression(), then stack them on top of one another. Code example below! It is a bit strange, though, because the smoothed terms don't have a single odds ratio, so you're just getting a table of p-values.
library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.4.1'
library(tidyverse)
library(mgcv)
#> Loading required package: nlme
#>
#> Attaching package: 'nlme'
#> The following object is masked from 'package:dplyr':
#>
#> collapse
#> This is mgcv 1.8-35. For overview type 'help("mgcv-package")'.
tbl_uv <-
  tibble(variable = c("age", "marker")) %>%
  rowwise() %>%
  mutate(
    # build reg models
    tbl =
      glue::glue("response ~ s({variable})") %>%
      as.formula() %>%
      gam(data = trial, family = binomial) %>%
      tbl_regression() %>%
      list()
  ) %>%
  # stack the regression tables
  pull() %>%
  tbl_stack()
Created on 2021-05-20 by the reprex package (v2.0.0)
I use inline_text() from the gtsummary package in R Markdown. However, I get a strange result when I use it with a certain variable.
The problem happens when a variable and one of its levels have the same label. Here is the problem demonstrated with the trial data frame that comes with the package.
library(labelled)  # provides var_label()
var_label(trial) <- list(trt = "Drug A")
tbl1 <- trial %>%
  select(trt) %>%
  tbl_summary()
inline_text(tbl1, variable = trt, level = "Drug A")
it results in:
[1] NA "98 (49%)"
Any idea why this is happening?
Here is my very minimalistic YAML:
title: "hello"
author: "ebay"
date: "3/5/2021"
and my setup chunk:
library(gtsummary)
knitr::opts_chunk$set(error = F, echo = F, warning = F, fig.width=6.3, fig.height=4.5, fig.align = "center")
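The label of a variable and the labels of its levels shouldn't be identical. When they are, inline_text() appears to match both the variable's header row and the level row, which would explain the two-element result: NA for the header and "98 (49%)" for the level.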
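A minimal sketch of the workaround, assuming any label that differs from the level names is acceptable ("Treatment" is only an illustrative choice). Here the label is set through the label argument of tbl_summary() rather than var_label():

```r
library(gtsummary)

# Give trt a label that does not collide with its levels ("Drug A", "Drug B")
tbl1 <- tbl_summary(
  trial,
  include = trt,
  label = trt ~ "Treatment"
)

# With distinct labels, only the level row should match
res <- inline_text(tbl1, variable = trt, level = "Drug A")
res
```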
I am trying to merge two data sets and thus trying to cut some NA data with the following code:
vedba <- read.csv(vedba_in)
head(vedba)
vedba$Start <- as.POSIXct(strptime(vedba$Midway,format="%d/%m/%Y %H:%M:%S"),tz="GMT")
head(vedba)
##cut data
cts <- cut(as.numeric(vedba$Start),breaks=c(breaks[1] - 3600,breaks))
## which dive does each vedba event belong to
cts_splt <- strsplit(as.character(cts)[!is.na(cts)],split=",")
cts_splt <- unlist(cts_splt)
cts_splt <- cts_splt[seq(1,length(cts_splt), by=2)]
substring(cts_splt,1) <- "0"
cts_splt <- as.numeric(cts_splt)
dive_no <- match(cts_splt,as.numeric(begin))
Yet when I run it, I receive the following error:
Error in seq.default(1, length(cts_splt), by = 2) :
wrong sign in 'by' argument
I am stumped and can't fix it. I have used this argument before and haven't had the error so something must be wrong with my data set. Any clues?
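One thing worth checking: seq() raises this error when asked to count from 1 up to 0 in steps of 2, which is what the line evaluates to if cts_splt is empty (for example, if every element of cts is NA). Whether that is what happens with this data set is an assumption; the sketch below only reproduces the message and shows one possible guard.

```r
# Empty input, as would result if every element of `cts` were NA (assumption)
cts_splt <- character(0)

# The original call becomes seq(1, 0, by = 2); capture the error message
msg <- tryCatch(
  seq(1, length(cts_splt), by = 2),
  error = function(e) conditionMessage(e)
)
msg

# A guard that selects the odd positions and tolerates empty input
idx <- seq_along(cts_splt)
odd <- cts_splt[idx %% 2 == 1]
length(odd)
```

Running sum(!is.na(cts)) and head(vedba$Start) before the strsplit() step would show whether the dates were parsed and fall inside the breaks.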
I have uploaded an image of what my data looks like.
My vedba_in data