Odds Ratio in gtsummary

#gtsummary
Is there a way to make a table that has both Odds Ratio and adjusted Odds Ratio in gtsummary?
#RStats
I tried using tbl_regression(), however that only gives the adjusted OR and 95% CI after regression. I want both in the same table.

I think these slides have the solution you're after: https://www.danieldsjoberg.com/clinical-reporting-gtsummary-rmed/slides/#/tbl_merge-for-side-by-side-tables
You construct two tables: one with univariate (unadjusted) results using tbl_uvregression() and a second with adjusted ORs using tbl_regression(). Then you combine them side by side using tbl_merge().
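A minimal sketch of that approach, using the trial example dataset that ships with gtsummary (your own data, outcome, and covariates go in its place):

library(gtsummary)

# Unadjusted ORs: one univariable logistic model per covariate
tbl_uv <- tbl_uvregression(
  trial,
  method = glm,
  y = response,
  include = c(age, grade),
  method.args = list(family = binomial),
  exponentiate = TRUE
)

# Adjusted ORs: a single multivariable logistic model
tbl_mv <- tbl_regression(
  glm(response ~ age + grade, data = trial, family = binomial),
  exponentiate = TRUE
)

# Merge the two tables side by side with spanning headers
tbl_merge(
  tbls = list(tbl_uv, tbl_mv),
  tab_spanner = c("**Unadjusted**", "**Adjusted**")
)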

Related

How to pass a vector from tableau to R

I need to pass a vector of arguments to Rserve from Tableau. Specifically, I am using IRR calculations in R (on Rserve), and I want to pass a vector of cash flows that are stored as columns in my table (instead of rows/measures). So I want to collect all those cash flows in a vector and pass it on to Rserve. Passing them one at a time slows down IO.
SCRIPT_REAL("r_func(c(.arg1, .arg2, .arg3))",sum(cf1), sum(cf2), sum(cf3))
cf1..cfn are cash flows corresponding to various periods. The above code works well when the cash flows are few, but takes a long time when I have a few hundred. Further, the time is spent not in calculation but in IO when communicating with the remote Rserve. With a local Rserve, this calculation finishes in a few seconds, while on the remote one it takes well over a minute.
Also, I want to point out that Tableau/Rserve set one argument after another, and that takes time. My expectation is that with a single vector there would be just one transfer and one argument-setting step, which should speed things up.
The first step in understanding how Tableau interacts with R or Python is understanding how Tableau's table calcs work.
Tableau's SCRIPT_XXX() functions are table calculations, which means that you invoke them on a vector of aggregate query results, and the corresponding R or Python code needs to return a vector, usually of the same size. (I think you may be able to return a scalar or a smaller vector that gets replicated to appear like a vector of the same size as the argument -- but I'm not certain.)
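As a rough sketch of that contract (the function names here are hypothetical, not from the question): on the Rserve side, each .argN arrives as a vector with one element per row of the table-calc partition, and the result has to line up with it:

# Hypothetical Rserve-side helper: summarize a whole partition, then
# replicate the scalar so the result matches the input length.
summarise_partition <- function(x) {
  m <- mean(x)          # scalar summary of the partition's values
  rep(m, length(x))     # one copy per input row, as Tableau expects
}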
You can control how your data is partitioned into vectors, and also the ordering of data in the vectors, by editing the table calc to specify the partitioning and addressing for that calc.
Partitioning determines how your aggregate query results are broken up into vectors for calculation purposes. Addressing determines how the elements of each vector are ordered. You can either do that based on the physical layout of the table structure, or (better) based on the specific dimensions.
See the Tableau online help for table calcs for more info, and look for online training videos from Tableau or blog entries (especially from anyone named Bora).
One way to test your understanding of these concepts is to create a Tableau table (i.e., a viz with a mark type of text) with several dimensions on the row and column shelves. Then create calculated fields for INDEX() and SIZE() and display them on text. Finally, change the partitioning and addressing in different ways by editing those table calcs. Try several different permutations. When you can confidently predict what those functions will produce for different settings, then you're ready for more complex tasks - such as talking to R.
It is also instructive to experiment with FIRST(), LAST(), LOOKUP(), WINDOW_SUM(), etc. -- and finally dig into PREVIOUS_VALUE(). Warning: PREVIOUS_VALUE() is a bit odd and does not behave the way you probably assume it does. Still, it is a useful technique that can implement a recursive calculation, and it is about as close to a for loop as Tableau gets.

How to sort by any measure in a Tableau table

I've built a new worksheet that has two dimensions and several facts. When I try to sort on any column, it only seems to sort within the dimensions. Is it possible to sort based on the column, ignoring dimensions? I find that if I concatenate the two dimensions into one... that does work, but it is not ideal.
Ah yes, sorting in Tableau. It took me a long time to understand it. Tableau doesn't sort the way you would expect from other tools like Excel, because it groups dimensions from left to right: think of each dimension as nested inside the one to its left. Another way to put it is that Tableau doesn't sort measures; it sorts dimensions based on the value of a measure. That's why concatenating the dimensions yields the expected result: you then have just one calculated dimension, and that dimension gets sorted by the value of a measure. You can right-click the concatenated dimension on your Rows shelf and choose Show Header. That's probably your best bet.
See this article from The Information Lab on the sorting in Tableau: https://www.theinformationlab.co.uk/2014/11/03/understanding-sorting-tableau/
There are some Tableau Community posts about it too.
https://community.tableau.com/thread/118958
https://community.tableau.com/thread/221956
https://community.tableau.com/thread/164714

Normalized histogram in MATLAB incorrect?

I have the following set of data:
X=[4.692
6.328
4.677
6.836
5.032
5.269
5.732
5.083
4.772
4.659
4.564
5.627
4.959
4.631
6.407
4.747
4.920
4.771
5.308
5.200
5.242
4.738
4.758
4.725
4.808
4.618
4.638
7.829
7.702
4.659]; % Sample set
I fitted a Pareto distribution to this using the maximum likelihood method and obtained a plot of the fitted density over the histogram (figure omitted here). The following bit of code is what draws the histogram:
% Bin the sample (the unused third output of histcounts is dropped)
[N,edges] = histcounts(X,'BinMethod','auto');
bin_middles = mean([edges(1:end-1); edges(2:end)]);   % bin centers
f_X_sample = N/trapz(bin_middles,N);                  % scale so the area is ~1
bar(bin_middles,f_X_sample,1);                        % width 1 so bars touch
% Newer MATLAB can also normalize directly: histcounts(X,'Normalization','pdf')
Am I doing this right? I checked 100 times and the Pareto distribution is indeed optimal, but it seems awfully different from the histogram. Is there an error that may be causing this? Thank you!
I would agree with @tashuhka's comment that you need to think about how you're binning your data.
Imagine the extreme case where you lump everything together into one bin and then try to fit that single point to a distribution. Your PDF would look nothing like your single square bar. Split into two bins, and the fit still sucks, but at least one bar is (probably) a little bigger than the other, and so on. At the other extreme, every data point gets its own bar, and the bar graph becomes a random forest of bars, each with a count of one.
There are a number of different strategies for choosing an "optimal" bin size that minimizes the number of bins but maximizes the representation of the underlying PDF.
Finally, note that you only have 30 points here, so your other problem may be that you just haven't collected enough data to really nail down the underlying PDF.
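To see this concretely, here is a quick sketch (written in R rather than MATLAB, with synthetic data standing in for the sample above) of how the bin count alone changes the picture:

# Density histograms of the same 30 skewed points with 2, 8, and 30 bins.
# (In R, breaks is only a suggestion to hist(), but the effect is the same.)
set.seed(1)
x <- 4.5 + rexp(30, rate = 2)   # synthetic stand-in for the sample above
par(mfrow = c(1, 3))            # three panels side by side
for (k in c(2, 8, 30)) {
  hist(x, breaks = k, freq = FALSE, main = paste(k, "bins"))
}

With 2 bins the shape of the distribution is invisible; with 30, nearly every bar is a single count.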

how to compare each record with the median of the whole set of data in tableau

I have a calculated field for the median, but I want to filter down to each record and compare it with the median. The issue is that when I filter, the median becomes the record itself. How can I fix the median?
Thanks.
If you are using the latest Tableau, then you can use an LOD (Level of Detail) calculation, as advised by Alex. In your case, simply wrapping the median in {} should do the trick: {MEDIAN([Sales])}
This configures the calculation to ignore the effects of the filter. If your worksheet is complicated, though, more modifications to the formula might be required.
If you are using older versions of Tableau, then there is a more complicated workaround with table calculations and special filters to achieve the same result.

Gaussian Mixture Modelling Matlab

I'm using a Gaussian mixture model to estimate a log-likelihood function (the parameters are estimated by the EM algorithm), and I'm using MATLAB. My data is of size 17991402×1: 17,991,402 data points in one dimension.
When I run gmdistribution.fit(X,2) I get the desired output.
But when I run gmdistribution.fit(X,k) for k>2, the code crashes with an "OUT OF MEMORY" error. I have also tried an open-source implementation, which gives me the same problem. Can someone help me out here? I'm basically looking for code that will let me use different numbers of components on such a large dataset.
Thanks!!!
Is it possible for you to decrease the number of iterations? The default is 100.
% Cap EM at 50 iterations and loosen the convergence tolerance
OPTIONS = statset('MaxIter',50,'Display','final','TolFun',1e-6);
gmdistribution.fit(X,3,OPTIONS)
Or you may consider under-sampling the original data: fit the mixture on a random subset, which cuts memory use roughly in proportion to the subset size.
A general solution to the out-of-memory problem is described in this document.