Exponentiate linear regression coefficient using the gtsummary package

I want to exponentiate my log-scale estimates in the gtsummary package, but I get an error when I set exponentiate = TRUE.
Can someone kindly help me?
mu <- lm(log(iron) ~ treatment + log(erythroferrone) + log(epo) + log(crp) + log(hepciden),
         data = endline1) %>%
  tbl_regression(exponentiate = TRUE) %>%
  as_gt()
theme_gtsummary_journal(journal = "nejm")
Error in tidy_and_attach(., tidy_fun = tidy_fun, conf.int = conf.int, : exponentiate = TRUE is not valid for this type of model.
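For context, gtsummary only accepts exponentiate = TRUE for models fitted on a log or logit scale (e.g. logistic or Poisson glm); it refuses it for a plain lm(), even when the outcome is log-transformed. A minimal, untested sketch of one possible workaround is to exponentiate the estimates yourself in a custom tidier passed to tidy_fun; the exp_tidy helper below is hypothetical and not part of gtsummary:

library(gtsummary)
library(dplyr)

# Hypothetical helper: tidy the lm fit, then exponentiate the estimate and
# confidence limits by hand, since exponentiate = TRUE is rejected for lm().
exp_tidy <- function(x, conf.int = TRUE, conf.level = 0.95, ...) {
  broom::tidy(x, conf.int = conf.int, conf.level = conf.level, ...) %>%
    mutate(across(c(estimate, conf.low, conf.high), exp))
}

theme_gtsummary_journal(journal = "nejm")  # set the theme before building the table

mu <- lm(log(iron) ~ treatment + log(erythroferrone) + log(epo) +
           log(crp) + log(hepciden), data = endline1) %>%
  tbl_regression(tidy_fun = exp_tidy) %>%
  as_gt()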

Related

Set specified factor level as reference in GT regression?

I am using the gtsummary package to generate tables from logistic regressions.
I would like to, for example, use the stage level "T3" in the trial data as the reference level instead of the default "T1". How can I do that within this example code?
I aim to do this for both univariable and multivariable logistic regression, so I presume the answer will work in both scenarios.
library(gtsummary)
library(dplyr)
trial %>%
  dplyr::select(age, trt, marker, stage, response, death, ttdeath) %>%
  tbl_uvregression(
    method = glm,
    y = death,
    method.args = list(family = binomial),
    exponentiate = TRUE,
    pvalue_fun = function(x) style_pvalue(x, digits = 2)
  ) %>%
  # overrides the default that shows p-values for each level
  add_global_p() %>%
  # adjusts global p-values for multiple testing (default method: FDR)
  add_q() %>%
  # bold p-values under a given threshold (default 0.05)
  bold_p() %>%
  # now bold q-values under the threshold of 0.10
  bold_p(t = 0.10, q = TRUE) %>%
  bold_labels() %>%
  as_gt()
I managed to solve my own problem by using the forcats function "fct_relevel" to set the desired level for a categorical variable as the reference.
trial$stage <- forcats::fct_relevel(trial$stage, "T3")
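For reference, a small untested sketch of how the releveling slots into the modelling pipeline; the multivariable formula below (death ~ age + stage) is only an illustration and not from the original post:

library(gtsummary)
library(dplyr)
library(forcats)

trial %>%
  # make "T3" the first (reference) level before any model is fitted
  mutate(stage = fct_relevel(stage, "T3")) %>%
  glm(death ~ age + stage, family = binomial, data = .) %>%
  tbl_regression(exponentiate = TRUE)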

MNIST dataset boosting

I am trying to apply Gradient Boosting to the MNIST dataset. This is my code:
library(dplyr)
library(caret)

mnist <- snedata::download_mnist()
mnist_num <- as.data.frame(lapply(mnist[1:10000, ], as.numeric)) %>%
  mutate(id = row_number())
mnist_num <- mnist_num[, sapply(mnist_num, function(x) {max(x) - min(x) > 0})]
mnist_train <- sample_frac(mnist_num, .70)
mnist_test <- anti_join(mnist_num, mnist_train, by = 'id')

set.seed(5000)
library(gbm)
boost_mnist <- gbm(Label ~ ., data = mnist_train, distribution = "bernoulli",
                   n.trees = 70, interaction.depth = 4, shrinkage = 0.3)
It shows the following error:
"Error in gbm.fit(x = x, y = y, offset = offset, distribution = distribution, : Bernoulli requires the response to be in {0,1}"
What is wrong here? Can anyone show me how to do this correctly?
The error
Error in gbm.fit(x = x, y = y, offset = offset, distribution = distribution, : Bernoulli requires the response to be in {0,1}
is due to the choice of distribution: "bernoulli" only works for a dichotomous (0/1) response, while the MNIST label has ten classes, so you should use "multinomial" instead.
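A minimal, untested sketch of what that change could look like, assuming Label is converted to a factor and the helper id column is dropped from the predictors:

# Multi-class boosting with gbm; "multinomial" expects a factor response.
mnist_train$Label <- factor(mnist_train$Label)
mnist_test$Label  <- factor(mnist_test$Label)

boost_mnist <- gbm(Label ~ . - id, data = mnist_train,
                   distribution = "multinomial",
                   n.trees = 70, interaction.depth = 4, shrinkage = 0.3)

# predict() returns an array of class scores per requested tree count;
# take the column with the highest score as the predicted class index.
scores <- predict(boost_mnist, mnist_test, n.trees = 70, type = "response")
pred   <- apply(scores[, , 1], 1, which.max)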

Multivariate linear regression : TensorFlow vs Excel

For my job, I have to implement some simple machine learning.
I don't have a strong background in math, so it's hard to understand exactly what I'm doing.
My first attempt at doing something with TensorFlow is to compute a multivariate linear regression.
I take this little data pool:
Conso,DJUclim,DJUchau
171408,0,282.8
151620,0.9,171.6
164475,2.7,137.8
153866,10,99.5
162933,65.6,32.4
188475,183,0.8
210994,231.5,0.2
222873,256.3,0
179239,109.9,9.2
159162,45.9,32.5
158104,4.7,142.6
174184,0.6,227.9
and try to find the best values of X1, X2, and B for Conso = X1*DJUchau + X2*DJUclim + B.
With Excel I found:
X1 = 118.734745
X2 = 306.035978
B = 140288.882921
and
r_square = 94.8375%
rmse = 5660.507380
Then I try to do the same thing with TensorFlow.
After 14k+ iterations I find:
X1 = 118.689559
X2 = 305.991638
B = 140296.921875
and
r_square = 94.8367%
rmse = 4902.14502
Why don't I get the same values?
Which result is correct (Excel or my ML)?
Why does Excel do this instantly while TensorFlow needs so much training?
Is TensorFlow overkill for simple regression, and does it hurt performance?
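One way to see why Excel is instant is that ordinary least squares has a closed-form solution: Excel's LINEST (and R's lm below) solve it directly, while plain gradient descent in TensorFlow only approaches the same coefficients after many iterations. A quick cross-check in R using the data from the post (a sketch, assuming the same model):

conso <- data.frame(
  Conso   = c(171408, 151620, 164475, 153866, 162933, 188475,
              210994, 222873, 179239, 159162, 158104, 174184),
  DJUclim = c(0, 0.9, 2.7, 10, 65.6, 183, 231.5, 256.3, 109.9, 45.9, 4.7, 0.6),
  DJUchau = c(282.8, 171.6, 137.8, 99.5, 32.4, 0.8, 0.2, 0, 9.2, 32.5, 142.6, 227.9)
)

# Closed-form least squares for Conso = X1*DJUchau + X2*DJUclim + B
fit <- lm(Conso ~ DJUchau + DJUclim, data = conso)
coef(fit)                 # X1, X2 and the intercept B
summary(fit)$r.squared    # should be close to the ~94.8% reported above

The small coefficient differences only reflect where gradient descent stopped. The RMSE gap is likely just a definitional difference: 5660.507380 * sqrt(9/12) ≈ 4902.145, so the Excel figure appears to divide the squared error by the residual degrees of freedom (n - 3 = 9) while the TensorFlow figure divides by n = 12.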

Nonlinear Regression & Optimisation

I need some help. I have a model function, which is:
function y = Surf(param, x)
global af1 af2 tData % A2 mER2
A1 = param(1); m1 = param(2); A2 = param(3); m2 = param(4);
m = param(5); n = param(6);
k1 = @(T) A1*exp(mER1/T);
k2 = @(T) A2*exp(mER2/T);
af = @(T) sech(af1*T + af2);
y = zeros(length(x), 1);
for i = 1:length(x)
    a = x(i,1); T = temperature(i,1);
    y(i) = (k2(T) + k1(T)*(a.^m))*((af(T) - a).^n);
end
end
I also have a data set giving Cure, Cure_rate, and Temperature, each stored as a single column vector.
Basically, I tried to use:
[output,R1] = lsqcurvefit(@Surf, initial_guess, Cure, Cure_rate)
[output2,R2] = nlinfit(Cure, Cure_rate, @Surf, initial_guess)
and they work pretty well (initial_guess is my initial guess for the parameters of the model above: [1.1e+07 -7.8e+03 1.2e+06 -7.1e+03 2.2 0.72]).
My main problem is that when I try other methods that can do nonlinear regression, such as fminsearch, fmincon, fsolve, fminunc, etc., they just don't work, and I am quite confused about the inputs they expect. Unlike nlinfit and lsqcurvefit, which take the data (Cure, Cure_rate) directly, most of them take only the model function and the initial guess. This is how I called them:
output3 = fminsearch(@Surf, initial_guess)
output4 = fsolve(@Surf, initial_guess)
output5 = fmincon(@Surf, x0, A, b, Aeq, beq)
(Not sure what I should put for the linear inequality constraints A, b and Aeq, beq.)
output6 = fminunc(@Surf, initial_guess)
The problem is that MATLAB keeps saying I have either not enough inputs or too many inputs, which I don't understand. How should I include my data set (Cure, Cure_rate) in these functions the way nlinfit and lsqcurvefit allow?
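For what it's worth, the underlying issue is the same in any language: generic minimizers such as fminsearch or fminunc expect a scalar objective that takes only the parameter vector, so the data has to be wrapped into that objective (for example as a sum of squared residuals) rather than passed as extra arguments the way lsqcurvefit and nlinfit accept them. A rough R analogue of that wrapping idea, using optim; surf_model, cure, and cure_rate below are stand-ins, not the MATLAB code above:

# Placeholder model: plays the same role as Surf(param, x) above.
surf_model <- function(param, x) param[1] + param[2] * x   # stand-in only

# The objective closes over the data and returns a single number (SSE),
# which is all a generic minimizer needs.
make_sse <- function(x, y) {
  function(param) sum((y - surf_model(param, x))^2)
}

cure      <- c(0.1, 0.2, 0.3, 0.4)   # stand-in data
cure_rate <- c(0.5, 0.9, 1.2, 1.4)   # stand-in data

fit <- optim(par = c(0, 1), fn = make_sse(cure, cure_rate))
fit$par   # least-squares estimates of the parameters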

First non demo example for Gaussian process using GPML (Matlab)?

After getting a basic understanding of the GPML toolbox, I wrote my first code using these tools. I have a data matrix, data, consisting of two columns with 1000 values in total. I want to use this matrix to estimate a GP with the GPML toolbox. I have written my code as follows:
x = data(1:200,1);    % training inputs
Y = data(1:201,2);    % training targets
Ys = data(201:400,2);
Xs = data(201:400,1); % possibly test cases
covfunc = {@covSE, 3};
ell = 1/4; sf = 1;
hyp.cov = log([ell; sf]);
likfunc = @likGauss;
sn = 0.1;
hyp.lik = log(sn);
[ymu ys2 fmu fs2] = gp(hyp, @infExact, [], covfunc, likfunc, X, Y, Xs, Ys);
plot(Xs, fmu);
But when I run this code I get:
Error using covMaha (line 58) Parameter mode is either 'eye', 'iso',
'ard', 'proj', 'fact', or 'vlen'
Could you please help me figure out where I am making a mistake?
I know this is way late, but I just ran into this myself. The way to fix it is to change
covfunc = {@covSE, 3};
to something like
covfunc = {@covSE, 'iso'};
It doesn't have to be 'iso', it can be any of the options listed in the error message. Just make sure your hyperparameters are set correctly for the specific mode you choose. This is detailed more in the covMaha.m file in GPML.