gtsummary decimals mean and SD - changing defaults - gtsummary

I would like to obtain 3 decimals for continuous variable output in gtsummary, for both mean and SD. By default it reports 2 decimals. A prior post shows a way to change the percentages for categorical variables to 1 decimal:
# set theme where percentages are rounded to 1 decimal place
set_gtsummary_theme(list(
"tbl_summary-fn:percent_fun" = function(x) sprintf(x * 100, fmt='%#.1f')
))
Is there a way to change it for Mean and SD for continuous variables?
Thanks!

The default for continuous variables isn't 2 decimal places. The number of decimal places is determined by the spread of the data, e.g. variables with a large spread may be rounded to the nearest integer, and variables with a small spread may be rounded to 2 or 3 decimal places. Unfortunately, there is no way to change the default globally. You'll need to use the tbl_summary(digits=) argument to change the number of decimal places a variable's summary statistics are rounded to.

Related

Checking if there are rounding issues for epoch figures

I have an array of integers (they are actually epochs) and I would like to check if they can be represented in double precision floating point without rounding issues.
So I have a large n rows by 1 column array like this:
1104757200
1104757320
1135981260
1135981560
1135982040
1135982280
1135982340
1135982580
1135982880
1135983420
1135984020
1135984140
1135984200
1135985100
1135985340
And I would like to know if they can be stored without losing precision as double precision floating point numbers.
The output could be another array -vector- with 0 or 1 depending on if the number can be represented without losing precision or not.
Any tips on how to do that check in Matlab would be welcomed.
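The underlying rule does not depend on the language: an IEEE 754 double has a 53-bit significand, so every integer with magnitude up to 2^53 (about 9.007e15) is stored exactly, and epoch values around 1.1e9 are far below that limit. A minimal sketch of the check, written in Python rather than Matlab; the helper name `fits_in_double` is made up for illustration:

```python
def fits_in_double(n: int) -> bool:
    """Return True if the integer n survives a round trip through a double."""
    # float(n) rounds n to the nearest double; int() converts that double
    # back to an integer exactly, so any difference means precision was lost.
    return int(float(n)) == n


# A few of the epoch values from the question.
epochs = [1104757200, 1104757320, 1135981260, 1135985340]

# 1 where the value is exactly representable, 0 where it is not.
flags = [1 if fits_in_double(e) else 0 for e in epochs]
print(flags)

# The first integer that does NOT fit is 2**53 + 1.
print(fits_in_double(2**53), fits_in_double(2**53 + 1))
```

The same idea should carry over to Matlab by comparing the values against 2^53, but that translation is not shown or tested here.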

In Progress 4GL, is there a way to convert a string to a decimal without any loss in precision?

Say that I needed to turn the character variable "0.0000000001" into a decimal. But if I were to write out the following logic:
define variable tinyChar as character initial "0.0000000001" no-undo.
define variable tinyNum as decimal no-undo.
assign tinyNum = decimal(tinyChar).
display tinyNum.
It produces this result:
0.00
So that must not be the solution, and truncating would also just remove the data I'm trying to preserve. Does anyone know how I can preserve the precision of small decimal numbers? It doesn't have to be this crazy case here to the ten-billionth place, but having at least 7 or 8 numbers of precision would help with my issue.
Variables (the DEFINE VARIABLE statement) default to the maximum of 10 decimal places so you do not need to actually declare the precision.
Your problem is really that the DISPLAY format is decoupled from the internal data representation and precision (this is also true for lots of other field and variable attributes like character string width).
This is a feature. But one that people new to OpenEdge often trip over. It is very common to think that the display format constrains the storage of the field or variable. It does not. (Much to the chagrin of anyone using SQL to access the data.)
You can either change the variable's default format as Mike indicates by changing the DEFINE VARIABLE, or you can override the format in any individual DISPLAY.
So the minimum change needed to your code is:
define variable tinyChar as character initial "0.0000000001" no-undo.
define variable tinyNum as decimal no-undo.
assign tinyNum = decimal(tinyChar).
display tinyNum format "9.9999999999".
Regardless of the DISPLAY format any calculations you might do with the variable will use all 10 decimal places.
Decimal fields defined in a database schema (as opposed to variables) default to 2 decimal places. You can change that when you create them.
Calculated values assigned to a decimal field or variable will be ROUNDED to the precision defined.
Per the doc[1] the DECIMAL type stores up to 10 decimal places. If you have more, I think it truncates the value.
[1] https://docs.progress.com/bundle/openedge-abl-reference-122/page/DEFINE-VARIABLE-statement.html
Try using the DECIMALS and FORMAT options when you define the DECIMAL variable:
define variable tinyChar as character initial "0.0000000001" no-undo.
define variable tinyNum as decimal no-undo
DECIMALS 10 FORMAT "9.9999999999".
assign tinyNum = decimal(tinyChar).
display tinyNum.
Consider converting it by multiplying by a constant (1000000 or something similar). Perhaps even make it an integer?
Progress is not the best environment for such small floating point numbers. If you depend on calculations with them, precision will suffer!
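The split between stored precision and display format described in the answers above can be illustrated outside ABL. The sketch below uses Python's `decimal` module as a stand-in for the ABL DECIMAL type (an analogy only, not how OpenEdge is implemented); the variable names are chosen for illustration:

```python
from decimal import Decimal

# Parse the string exactly; nothing is lost at this step.
tiny_num = Decimal("0.0000000001")

# A two-decimal display format hides the value, much like the default DISPLAY.
two_places = format(tiny_num, ".2f")

# A wider format shows that the stored value was intact all along.
ten_places = format(tiny_num, ".10f")

# The scaling suggestion from the last answer: multiply by a constant
# so that the quantity of interest becomes a whole number.
scaled = tiny_num * 10**10

print(two_places, ten_places, scaled == 1)
```

Only the formatting step differs between the two printed strings; `tiny_num` itself never changes.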

Precision of double values in Spark

I am reading some data from a CSV file, and I have custom code to parse string values into different data types. For numbers, I use:
val format = NumberFormat.getNumberInstance()
which returns a DecimalFormat, and I call the parse function on that to get my numeric value. DecimalFormat has arbitrary precision, so I am not losing any precision there. However, when the data is pushed into a Spark DataFrame, it is stored using DoubleType. At this point, I expect to see some precision issues, but I do not. I tried entering values from 0.1, 0.01, 0.001, ..., 1e-11 in my CSV file, and when I look at the values stored in the Spark DataFrame, they are all accurately represented (i.e. not like 0.099999999). I am surprised by this behavior since I do not expect a double value to store arbitrary precision. Can anyone help me understand the magic here?
Cheers!
There are probably two issues here: the number of significant digits that a Double can represent in its mantissa; and the range of its exponent.
Roughly, a Double has about 16 (decimal) digits of precision, and the exponent can cover the range from about 10^-308 to 10^+308. (Obviously, the actual limits are set by the binary representation used by the IEEE 754 format.)
When you try to store a number like 1e-11, this can be accurately approximated within the 53 bits of the mantissa (52 of which are stored explicitly). Where you'll get accuracy issues is when you want to subtract two numbers that are so close together that they only differ by a small number of the least significant bits (assuming that their mantissas have been shifted so that their exponents are the same).
For example, if you try (1e20 + 2) - (1e20 + 1), you'd hope to get 1, but actually you'll get zero. This is because a Double does not have enough precision to represent the 20 (decimal) digits needed. However, (1e100 + 2e90) - (1e100 + 1e90) is computed to be almost exactly 1e90, as it should be.
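The two subtractions above can be checked directly. The sketch below uses Python instead of Scala, on the assumption that both use the same IEEE 754 double arithmetic. It also shows one likely reason values such as 0.1 look exact when printed: Python prints the shortest decimal string that round-trips to the stored double. The JVM's Double.toString is documented to work along similar lines, though whether that fully explains the Spark output is an assumption here.

```python
import math
import sys
from decimal import Decimal

# Number of significand bits in a double on this platform.
mantissa_bits = sys.float_info.mant_dig

# The double nearest to 0.1 is not exactly 0.1 ...
stored = Decimal(0.1)      # exact value of the stored double
shown = repr(0.1)          # ... but it prints as the shortest round-trip string

# Catastrophic loss: at 1e20 adjacent doubles are far more than 2 apart,
# so adding 1 or 2 changes nothing and the difference is 0.
lost = (1e20 + 2) - (1e20 + 1)

# Here the terms differ within the available ~16 digits, so the result
# is close to 1e90 (to within roughly one part in a million).
kept = (1e100 + 2e90) - (1e100 + 1e90)

print(mantissa_bits, shown, lost, kept)
```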

Number of decimal digits to show

How to change the number of decimal digits?
By changing the format, Matlab can show only 4 digits (short) or 15 (long). But I want exactly 3 digits to show.
To elaborate on Hamataro's answer, you could also use the roundn function to round to a specific decimal precision, e.g.: roundn(1.23456789,-3) will yield 1.235. However, Matlab will still display the result in either of the formats you have mentioned, i.e. 1.2350 if format is set to short, and 1.235000000000000 if format is set to long.
Alternatively, if you use sprintf, you can use the %g formatting option to display a set number of significant digits, regardless of where the decimal point is. sprintf('%0.3g',1.23456789) yields 1.23; sprintf('%0.3g',12.3456789) yields 12.3
You can either use sprintf or do:
var2 = round(var1*1000)/1000
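The difference between rounding the stored value and formatting it for display, and between fixed decimals and significant digits, can be sketched as follows. This uses Python, whose `%g`-style format specifiers follow the same C conventions as the Matlab sprintf calls quoted above; the variable names are illustrative.

```python
value = 1.23456789

# Round the stored number to 3 decimals (same trick as round(var1*1000)/1000).
rounded = round(value * 1000) / 1000

# Format for display only: exactly 3 digits after the decimal point.
fixed = f"{value:.3f}"

# Format for display only: 3 significant digits, wherever the point is.
sig_small = f"{value:.3g}"
sig_large = f"{12.3456789:.3g}"

print(rounded, fixed, sig_small, sig_large)
```

If the goal is exactly 3 digits after the decimal point, the fixed-point form is the one to use; the `g` form counts significant digits instead.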

Matlab precision when specifying fractions

I wanted to create a vector with three values 1/6, 2/3 and 1/6. Obviously Matlab has to convert these rational numbers into real numbers, but I expected that it would maximize the precision available.
It's storing the values as doubles but it's storing them as -
b =
0.1667 0.6667 0.1667
This is a huge loss of precision. Isn't double supposed to mean 52 bits of accuracy for the fractional part of the number? Why are the numbers truncated so severely?
The numbers are only displayed that way. Internally, they use full precision. You can use the format command to change display precision. For example:
format long
will display them as:
0.166666666666667 0.666666666666667 0.166666666666667
So the answer is simple; there is no loss of precision. It's only a display issue.
You can read the documentation on what other formats you can use to display numbers.
Fractions such as 1/6 or 2/3 cannot be stored exactly in a double; they are stored as the nearest binary approximation. (Values like 1/2 and 1/4, whose denominators are powers of two, are stored exactly.) If you need to keep the exact fraction, you could store it as a string, and whenever you want to do a calculation, convert it to a number and continue.
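Both points in this thread, that the short output is only a display setting and that 1/6 has no exact double representation, can be illustrated with a short sketch. It is written in Python rather than Matlab, and uses the standard `fractions` module as one possible way to keep exact rationals (the string-based approach above is another).

```python
from fractions import Fraction

b = [1 / 6, 2 / 3, 1 / 6]

# Same stored doubles, two display widths (compare "format short" / "format long").
short_fmt = " ".join(f"{x:.4f}" for x in b)
long_fmt = " ".join(f"{x:.15f}" for x in b)

# 1/2 is a power-of-two fraction and is stored exactly; 1/6 is not.
half_is_exact = Fraction(0.5) == Fraction(1, 2)
sixth_is_exact = Fraction(1 / 6) == Fraction(1, 6)

# Exact rational arithmetic, if the exact fractions matter.
exact_total = Fraction(1, 6) + Fraction(2, 3) + Fraction(1, 6)

print(short_fmt)
print(long_fmt)
print(half_is_exact, sixth_is_exact, exact_total)
```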