How should I use the digits function in MATLAB? - matlab

I have code that calls the double function several times to convert sym values to double. To increase precision, I want to use the digits function.
I want to know whether it is enough to call digits once at the top of the code, or whether I must call it before every double conversion.

digits sets the precision until it is changed again. Calling digits() without any input returns the current precision, so you can verify it is set correctly.
In many cases digits has absolutely no influence on symbolic variables, because an analytical solution is found. This means there are no precision errors unless you convert to double. When converting, digits should be set to at least 16, because this matches double precision.
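A minimal MATLAB sketch of this workflow (the symbolic expression is illustrative):

digits                    % display the current precision (default: 32 significant digits)
digits(32)                % set the precision once; it stays in effect until changed again

syms x
s = solve(x^2 == 2, x);   % analytical result: digits has no influence here
v = vpa(s);               % numeric evaluation honours the current digits setting
d = double(s);            % conversion to double (about 16 significant digits)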

Related

Precision of double values in Spark

I am reading some data from a CSV file, and I have custom code to parse string values into different data types. For numbers, I use:
val format = NumberFormat.getNumberInstance()
which returns a DecimalFormat, and I call parse function on that to get my numeric value. DecimalFormat has arbitrary precision, so I am not losing any precision there. However, when the data is pushed into a Spark DataFrame, it is stored using DoubleType. At this point, I am expecting to see some precision issues, however I do not. I tried entering values from 0.1, 0.01, 0.001, ..., 1e-11 in my CSV file, and when I look at the values stored in the Spark DataFrame, they are all accurately represented (i.e. not like 0.099999999). I am surprised by this behavior since I do not expect a double value to store arbitrary precision. Can anyone help me understand the magic here?
Cheers!
There are probably two issues here: the number of significant digits that a Double can represent in its mantissa; and the range of its exponent.
Roughly, a Double has about 16 (decimal) digits of precision, and the exponent can cover the range from about 10^-308 to 10^+308. (Obviously, the actual limits are set by the binary representation used by the IEEE 754 format.)
When you try to store a number like 1e-11, it can be accurately approximated within the 53 bits of precision available in the mantissa. Where you'll get accuracy issues is when you subtract two numbers that are so close together that they differ only in a small number of the least significant bits (after their mantissas have been shifted so that their exponents match).
For example, if you try (1e20 + 2) - (1e20 + 1), you'd hope to get 1, but actually you'll get zero. This is because a Double does not have enough precision to represent the 20 (decimal) digits needed. However, (1e100 + 2e90) - (1e100 + 1e90) is computed to be almost exactly 1e90, as it should be.
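This is easy to reproduce in any IEEE 754 double environment; for instance, a quick MATLAB check of the values above (Spark's DoubleType uses the same 64-bit format):

eps(1e20)                        % spacing of doubles near 1e20 is 16384, so +1 or +2 is lost
(1e20 + 2) - (1e20 + 1)          % 0, not 1: both sums round to the same double
(1e100 + 2e90) - (1e100 + 1e90)  % approximately 1e90, as expected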

Can the Postgres data type NUMERIC store signed values?

In PostgreSQL, I would like to store signed values in the range -999.9 to 9999.9.
Can I use numeric(5.1) for this?
Or what type should I use?
You can certainly use the arbitrary precision type numeric with a precision of 5 and a scale of 1, just like @Simon commented, but without the syntax error. Use a comma (,) instead of a dot (.) in the type modifier:
SELECT numeric(5,1) '-999.9' AS nr_lower
, numeric(5,1) '9999.9' AS nr_upper;
 nr_lower | nr_upper
----------+----------
   -999.9 |   9999.9
The minus sign and the dot in the string literal do not count against the allowed maximum of significant digits (precision).
If you don't need to restrict the length, just use numeric.
If you need to enforce minimum and maximum, add a check constraint:
CHECK (nr_column BETWEEN -999.9 AND 9999.9)
numeric stores your number exactly. If you don't need the absolute precision and tiny rounding errors are no problem, you might also use one of the floating point types double precision (float8) or real (float4).
Or, since you only allow a single fractional decimal digit, you can multiply by 10 and use integer, which would be the most efficient storage: 4 bytes, no rounding errors, and fastest processing. Just be sure to use and document the scaled number properly.
Details for numeric types in the manual.

MATLAB double function outputs infinity when taking a big number of type "sym"

This is literally the number I obtain (from symsum function), which is of type sym:
a=328791078344903739363762093060350430076929707044786898291940722052812676355129485878814911641516759087483581972443760841410582114920781832660013389681326267351368505696628653562484228680842650173635989588528021721039959787053654401351638478786763875479187208098871238084448485336138651690856082810553570419028927840285091142054111375001
I would like to perform mathematical operations on this number (in particular, take a natural log), so I want to convert it to double; however, the output of double(a) is simply "Inf". How do I go about this problem and convert it from "sym" to a numeric type?
Your number is ~3.3 x 10^335, but the largest number that can be represented by MATLAB's double-precision floating point numbers is ~1.8 x 10^308 (see the output of realmax). Converting your number to double precision causes overflow because the number is larger than can be represented, so MATLAB just returns Inf.
For an exhaustive overview of floating point representations and arithmetic, you can check out this PDF.
Can you count the digits and insert a decimal point before converting to double?
If so, take advantage of the fact that the natural log of a number that overflows may not itself overflow.
Using "^" for power, you can represent your number as 3.28791078344903739363762093060350430076929707044786898291940722052812676355129485878814911641516759087483581972443760841410582114920781832660013389681326267351368505696628653562484228680842650173635989588528021721039959787053654401351638478786763875479187208098871238084448485336138651690856082810553570419028927840285091142054111375001 * (10 ^ 335).
The decimal log of (10^335) is 335. Its natural log is 335*log(10).
The natural log of the original number is:
log(3.287910783449037393637620930603504300769297070447868982919407220528)
+ 335*log(10)
All inputs, intermediate results, and the final result of this calculation are in the double range.
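A short MATLAB sketch of this idea, assuming a is the positive integer sym from the question:

d = numel(char(a)) - 1;            % exponent: number of digits after the leading one
mantissa = double(a / sym(10)^d);  % in [1, 10), safely within double range
logA = log(mantissa) + d*log(10);  % natural log of the original number
% Alternatively, double(log(a)) also stays in range, because the log is
% taken symbolically before the conversion to double.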

How to use bitxor for double numbers?

I want to use XOR on my double numbers in MATLAB, but bitxor only works on integer numbers. Is there a function that can convert double to int in MATLAB?
The functions you are looking for might be int8(number), int16(number), or uint32(number). Any of them will convert a double to an integer, but you must pick the one best suited to the result you want to achieve. Remember that you cannot cast from double to integer without rounding the number.
If I understood you correctly, you could write a function that simply removes the decimal point from the double number: multiply your starting value by 2^n, cast it to an integer using any of the functions mentioned earlier, perform whatever operation you want, and then restore the decimal point by dividing by 2^n, as sketched below.
Multiplying the starting value by 2^n is a trick that decreases the rounding error.
The ideal value for n would be the number of digits after the decimal point, provided that number is relatively small.
Please also specify why you are trying to do this; it doesn't seem to be the optimal solution.
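A minimal MATLAB sketch of that scale-then-xor approach (the inputs and n are illustrative; nonnegative values are assumed since uint64 is used):

a = 1.25;  b = 2.75;
n = 4;                      % scale factor exponent; covers the fractional bits
ia = uint64(a * 2^n);       % 1.25 * 16 = 20
ib = uint64(b * 2^n);       % 2.75 * 16 = 44
ix = bitxor(ia, ib);        % bitwise XOR on the integer values
result = double(ix) / 2^n;  % restore the original scale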
You can just cast to an integer:
a = 1.003
int8(a)
ans =
1
That gives you an 8-bit signed integer. You can also get other sizes (e.g. int16) or unsigned types (e.g. uint8), depending on what you want to do.

Getting double precision in Fortran 90 using the Intel 11.1 compiler

I have a very large code that sets up and iteratively solves a system of non-linear partial differential equations, written in Fortran. I need all variables to be double precision. In the additional module that I have written for the code, I declare all variables as double precision, but my module still uses variables from the old source code that are declared as type real. So my question is: what happens when a single-precision variable is multiplied by a double-precision variable in Fortran? Is the result double precision if the variable used to store the value is declared as double precision? And what if a double-precision value is multiplied by a constant without the "D0" at the end? Can I just set a compiler option in Intel 11.1 to make all reals and constants double precision?
So my question is, what happens when a single-precision variable is multiplied by a double precision variable in Fortran? The single-precision value is promoted to double precision, and the operation is done in double precision.
Is the result double precision if the variable used to store the value is declared as double precision? Not necessarily. The right-hand side is an expression that doesn't "know" about the precision of the variable on the left-hand side into which it will be stored. If you have Double = SingleA * SingleB (using names to indicate the types), the calculation will be performed in single precision, then converted to double for storage. This will NOT gain extra precision for the calculation!
And what if a double precision value is multiplied by a constant without the "D0" at the end? This is just like the first question: the constant will be promoted to double precision and the calculation done in double precision. However, the constant itself is still single precision, and even if you write down as many digits as a double-precision constant would hold, the internal storage is single precision and cannot represent that accuracy. For example, DoubleVar * 3.14159265359 will be calculated in double precision, but the result will approximate DoubleVar * 3.14159 done in double precision.
If you want the compiler to retain many digits in a constant, you must specify the precision of the constant. The Fortran 90 way to do this is to define your own real kind with whatever precision you need, e.g., to require at least 14 decimal digits:
integer, parameter :: DoubleReal_K = selected_real_kind (14)
real (DoubleReal_K) :: A
A = 5.0_DoubleReal_K
A = A * 3.14159265359_DoubleReal_K
The Fortran standard is very specific about this; other languages are like this, too, and it's really what you'd expect. If an expression contains an operation on two floating-point variables of different precisions, then the expression is of the type of the higher-precision operand. eg,
(real variable) + (double variable) -> (double)
(double variable)*(real variable) -> (double)
(double variable)*(real constant) -> (double)
etc.
Now, if you are storing the result in a lower-precision floating point variable, it'll get down-converted again. But if you are storing it in a variable of the higher precision, it'll maintain its precision.
If there are any cases where you're concerned that a single-precision floating point variable is causing a problem, you can force it to be converted to double precision
using the DBLE() intrinsic:
DBLE(real variable) -> double
If you write numbers in the form 0.1D0, the compiler treats them as double-precision numbers; if you write 0.1 instead, precision is lost in the conversion.
Here is an example:
program main
  implicit none
  real(8) :: a, b, c
  a = 0.2D0
  b = 0.2
  c = 0.1*a
  print *, a, b, c
end program
When compiled with
ifort main.f90
I get results:
0.200000000000000 0.200000002980232 2.000000029802322E-002
When compiled with
ifort -r8 main.f90
I get results:
0.200000000000000 0.200000000000000 2.000000000000000E-002
If you use the IBM XLF compiler, the equivalent is
xlf -qautodbl=dbl4 main.f90
Jonathan Dursi's answer is correct - the other part of your question was whether there is a way to make all real variables double precision.
You can accomplish this with the ifort compiler by using the -i8 (for integers) and -r8 (for reals) options. I'm not sure if there is a way to force the compiler to interpret literals as double-precision without specifying them as such (e.g. by changing 3.14159265359 to 3.14159265359D0) - we ran into this issue a while back.