Source Qualifier allowing values larger than the defined size in Informatica workflow

One of my Informatica sessions is behaving strangely. For a particular column the precision and scale are defined as 17 and 16 respectively, and high precision is enabled.
Given those precision and scale values, the column should only accept numbers with a single digit before the decimal point. In my session, however, it accepts up to 2 digits before the decimal point and fails only for 3 digits before the decimal point, with the error "Invalid number". I am confused: why does it allow 2-digit numbers when it should reject them too?
Example: precision 17, scale 16
1.4567 -- allowed
12.4567 -- allowed
123.4567 -- rejected
In addition, I observed that the column has data type number in the source, but in the Source Qualifier the same column has data type decimal. Why did this internal conversion happen?
Can anyone help on this?

As for the second part of the question: one of the Source Qualifier transformation's functions is to convert source data types to PowerCenter data types. Whatever source you use, the Source Qualifier will convert its data types. This way you end up with the same data types everywhere, and comparing, say, Oracle and MS SQL Server sources won't be a problem.

This is what's happening in the '12.4567 -- allowed' case. Since you have high precision mode on, Informatica converts the decimal to a double automatically. A double is 8-byte floating-point data and does not enforce the declared precision and scale. So if you define a column as col(11,9) and try to insert 1234567890.1, Informatica will fail.
Summary:
If you specify a precision greater than the maximum number of digits, the Data Integration Service converts decimal values to double in high precision mode.
Per the Informatica documentation, a Decimal occupies:
8 bytes (if high precision is off or precision is greater than 38)
16 bytes (if precision <= 18 and high precision is on)
20 bytes (if precision > 18 and <= 28)
24 bytes (if precision > 28 and <= 38)
Decimal value with declared precision and scale. Scale must be less than or equal to precision.
For transformations that support precision up to 38 digits, the precision is 1 to 38 digits, and the scale is 0 to 38.
For transformations that support precision up to 28 digits, the precision is 1 to 28 digits, and the scale is 0 to 28.
If you specify the precision greater than the maximum number of digits, the Data Integration Service converts decimal values to double in high precision mode.
https://docs.informatica.com/data-integration/powercenter/10-2/developer-tool-guide/data-type-reference/transformation-data-types.html
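For intuition, the same decimal-versus-double distinction can be demonstrated in PostgreSQL (an analogy only, not Informatica's internal behavior): a strict decimal type rejects extra integer digits, while a double accepts the value without enforcing any declared precision or scale.
SELECT 12.4567::numeric(17,16);    -- ERROR: numeric field overflow
                                   -- (17,16 leaves room for one digit before the point)
SELECT 12.4567::double precision;  -- succeeds: doubles carry no precision/scale constraint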

Related

Storing values as real data type in PostgreSQL. Confusion about the concept of precision

I am confused about the real data type in PostgreSQL:
CREATE TABLE number_data_types (
numeric_column numeric(20,5),
real_column real,
double_column double precision
);
INSERT INTO number_data_types
VALUES (.7, .7, .7),
(2.13579, 2.13579, 2.13579),
(2.1357987654, 2.1357987654, 2.1357987654);
SELECT * FROM number_data_types;
The output in the 3rd row, 2nd column is 2.1357987. Since the real data type in PostgreSQL has a precision of 6, I assumed the number of digits that can be stored is 6. I expected to see 2.13579, because that has 6 digits. What's wrong with my reasoning?
In my textbook, the author writes: "On the third row, you see PostgreSQL's default behavior in those two columns, which is to output floating-point numbers using their shortest precise decimal representation rather than show the entire value."
The precision of a real is at least 6 digits. From the docs...
On all currently supported platforms, the real type has a range of around 1E-37 to 1E+37 with a precision of at least 6 decimal digits.
When displayed, a real (aka float4) can show up to 9 digits.
By default, floating point values are output in text form in their shortest precise decimal representation; the decimal value produced is closer to the true stored binary value than to any other value representable in the same binary precision. (However, the output value is currently never exactly midway between two representable values, in order to avoid a widespread bug where input routines do not properly respect the round-to-nearest-even rule.) This value will use at most 17 significant decimal digits for float8 values, and at most 9 digits for float4 values.
Floating point numbers try to cram a lot of precision into a very small space. As a result they have a lot of quirks.
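A quick way to see this in action (a minimal sketch, assuming PostgreSQL 12 or later, where the shortest-precise output format is the default):
SELECT 2.1357987654::real;                    -- 2.1357987: the shortest decimal that
                                              -- round-trips to the stored binary value
SELECT 2.1357987654::real = 2.1357987::real;  -- t: both literals parse to the same float4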

Postgres floating point math - do I need to do anything special?

I am new to PG and I'm wondering if I need to 'do anything' extra to properly handle floating-point math.
For example, in Ruby you use BigDecimal, and in Elixir you use Decimal.
Is what I have below the best solution for PG?
SELECT
COALESCE(SUM(active_service_fees.service_fee * (1::decimal - active_service_fees.withdraw_percentage_discount)), 0)
FROM active_service_fees
Data types:
service_fee integer NOT NULL
withdraw_percentage_discount numeric(3,2) DEFAULT 0.0 NOT NULL
It depends on what you want.
If you want floating point numbers you need to use the data types real or double precision, depending on your precision requirements.
These floating point numbers need a fixed space (4 or 8 bytes), are stored in binary representation and have limited precision.
If you want arbitrary precision, you can use the binary coded decimal type numeric (decimal is a synonym for it).
Such values are stored as decimal digits, and the amount of storage required depends on the number of digits.
The big advantage of floating point numbers is performance: floating point arithmetic is implemented in hardware in the processor, while arithmetic on binary coded decimals is implemented in software by PostgreSQL.
A rule of thumb would be:
If you need values that are exact up to a certain number of decimal places (like monetary data) and you don't need to do a lot of calculations, use decimal.
If you need to do number crunching and you don't need values rounded to a fixed precision, use double precision.
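To make the trade-off concrete, a small sketch of the difference (standard PostgreSQL behavior):
SELECT 0.1::numeric + 0.2::numeric;                    -- 0.3 (exact decimal arithmetic)
SELECT 0.1::double precision + 0.2::double precision;  -- 0.30000000000000004
                                                       -- (0.1 and 0.2 have no exact
                                                       -- binary representation)
Since service_fee is an integer and withdraw_percentage_discount is numeric, the query in the question is already doing exact decimal arithmetic, so no extra handling is needed.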

Efficiently Store Decimal Numbers with Many Leading Zeros in Postgresql

A number like:
0.000000000000000000000000000000000000000123456
is difficult to store without a large performance penalty with the available numeric types in Postgres. This question addresses a similar problem, but I don't feel like it came to an acceptable resolution. Currently, one of my colleagues has landed on rounding numbers like this to 15 decimal places and just storing them as:
0.000000000000001
so that the double precision type can be used, which avoids the penalty of moving to the decimal numeric type. Numbers this small are, for my purposes, more or less functionally equivalent, because they are all very small (and mean more or less the same thing). However, we are graphing these results, and when a large portion of the data set is rounded like this it looks exceptionally stupid (a flat line on the graph).
Because we are storing tens of thousands of these numbers and operating on them, the decimal numeric type is not a good option for us as the performance penalty is too large.
I am a scientist, and my natural inclination would be to store these kinds of numbers in scientific notation, but it doesn't appear that Postgres has this kind of functionality. I don't actually need all of the precision in the number; I just want to preserve 4 digits or so, so I don't even need the 15 digits that the double precision type offers. What are the advantages and disadvantages of storing these numbers in two fields like this:
1.234 (real)
-40 (smallint)
where this is equivalent to 1.234*10^-40? This would allow for ~32000 leading decimals with only 2 bytes used to store the exponent and 4 bytes to store the real value, for a total of at most 6 bytes per number (it gives me the exact number I want to store and takes less space than the existing solution, which consumes 8 bytes). It also seems like sorting these numbers would be much easier, as you'd only need to sort on the smallint field first and the real field second.
You and/or your colleague seem to be confused about what numbers can be represented using the floating point formats.
A double precision (aka float) number can store at least 15 significant digits, in the range from about 1e-307 to 1e+308. You have to think of it as scientific notation: remove all the zeroes and move them into the exponent. If whatever you have, once in scientific notation, has fewer than 15 significant digits and an exponent between -307 and +308, it can be stored as is.
That means that 0.000000000000000000000000000000000000000123456 can definitely be stored as a double precision, and you'll keep all the significant digits (123456). No need to round that to 0.000000000000001 or anything like that.
Floating point numbers have the well-known issue that they cannot exactly represent all decimal numbers (decimal numbers in base 10 do not necessarily map to exact values in base 2), but that's probably not an issue for you (it only matters if you need to do exact comparisons on such numbers).
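A minimal check of this claim in PostgreSQL:
SELECT 0.000000000000000000000000000000000000000123456::double precision;
-- 1.23456e-40: all six significant digits survive, comfortably within double
-- precision's ~15-digit, 1e-307..1e+308 envelope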
What are the advantages and disadvantages of storing these numbers in
two fields like this
You'll have to manage 2 columns instead of one.
Roughly, what you'll be doing is saving space by storing lower-precision floats. If you only need 4 digits of precision, you can go further and save 2 more bytes by using smallint + smallint (a 1000-9999 mantissa plus an exponent). Using that format, you could even cram the two smallints into one 32-bit int (exponent*2^16 + mantissa); that should work too.
That's assuming you need to save storage space and/or need to go beyond the +/-308 exponent limit of the double precision float. If that's not the case, the standard format is fine.
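A sketch of the two-column scheme from the question (table and column names are hypothetical):
CREATE TABLE tiny_values (
    mantissa real,      -- e.g. 1.234
    exponent smallint   -- e.g. -40, meaning 1.234 * 10^-40
);
-- Reconstruct the value when it fits into a double (exponents far below
-- -308 would underflow to 0 here):
SELECT mantissa * 10.0::double precision ^ exponent AS value FROM tiny_values;
-- Sorting by magnitude (assuming positive, normalised mantissas):
SELECT * FROM tiny_values ORDER BY exponent, mantissa;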

Datatype in progress 4gl

The integer and decimal datatypes accept only 10 digits; after that I get the error message "value too large to fit in integer or decimal". What is the maximum limit of the integer and decimal datatypes in Progress 4GL? Is it possible to print 100 digits after the decimal place in Progress 4GL?
No, it is not possible to print 100 digits after the decimal in Progress. But honestly, why would you?
If you don't need to do specific calculations with it, you can always use a CHARACTER field.
From the F1-help:
DECIMAL
DECIMAL data consists of decimal numbers up to 50 digits in length
including up to 10 digits to the right of the decimal point.
INTEGER
An INTEGER consists of 32-bit data (whole numbers).
(Integer must be between -2147483648 and 2147483647).
INT64
An INT64 consists of 64-bit data (whole numbers).
(INT64 must be between -9223372036854775808 and 9223372036854775807)
Note that these are absolute limits and have nothing to do with the display format of the variable/field/widget. The display format (FORMAT statement) only affects what can be displayed; it can be even more limiting (but it can also be overridden programmatically).

Meaning of Precision Vs. Range of Double Types

To begin with, allow me to say that I'm an experienced programmer with over 10 years of programming experience. However, the question I'm asking here is one that has bugged me ever since I first picked up a book on C about a decade ago.
Below is an excerpt from a book on Python, explaining about Python Floating type.
Floating-point numbers are represented using the native
double-precision (64-bit) representation of floating-point numbers on
the machine. Normally this is IEEE 754, which provides approximately
17 digits of precision and an exponent in the range of –308 to
308.This is the same as the double type in C.
What I never understood is the meaning of the phrase
" ... which provides approximately 17 digits of precision and an
exponent in the range of –308 to 308 ... "
My intuition goes astray here: I understand the meaning of precision, but how can range be something different from it? I mean, if a floating-point number can represent a value of up to 17 digits (i.e. a maximum of 100,000,000,000,000,000 - 1), then how can the exponent be +308? Wouldn't that make a 308-digit number if the exponent base is 10, or a roughly 100-digit number if the base is 2?
I hope, I'm able to express my confusion.
Suppose that we write 1500 with two digits of precision. That means we are being precise enough to distinguish 1500 from 1600 and 1400, but not precise enough to distinguish 1500 from 1510 or 1490. Telling those numbers apart would require three digits of precision.
Even though I wrote four digits, floating-point representation doesn't necessarily contain all these digits. 1500 is 1.5 * 10^3. In decimal floating-point representation, with two digits of precision, only the first two digits of the number and the exponent would be stored, which I will write (1.5, 3).
Why is there a distinction between the "real" digits and the placeholder zeros? Because it tells us how precisely we can represent numbers, that is, what fraction of their value is lost due to approximation. We can distinguish 1500 = (1.5, 3) from 1500+100 = (1.6, 3). But if we increase the exponent, we can't distinguish 15000 = (1.5, 4) from 15000+100 = (1.51, 4). At best, we can approximate numbers within +/- 10% with two decimal digits of precision. This is true no matter how small or large the exponent is allowed to be.
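The same effect is easy to reproduce with PostgreSQL's 4-byte real, which carries roughly 7 decimal digits of precision (an illustrative sketch, not the two-digit toy format above):
SELECT 1500::real + 100::real;    -- 1600: the increment is within the precision
SELECT 16777216::real + 1::real;  -- still 16777216: above 2^24 a float4 can no
                                  -- longer distinguish consecutive integers, so
                                  -- the +1 is lost to rounding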
The regular decimal representation of numbers obscures the issue. If instead, one considers them in normalised scientific notation to separate the mantissa and exponent, then the distinction is immediate. Normalisation is achieved by scaling the mantissa until it is between 0.0 and 1.0 and adjusting the exponent to avoid loss of scale.
The precision of the number is the number of digits in the mantissa. A floating point number has a limited number of bits to represent this portion of the value. It determines how accurately numbers that are similar in size can be distinguished.
The range determines the allowable values of the exponent. In your example, the range of -308 through 308 is represented independently of the mantissa value and is limited by the number of bits in the floating point number allocated to storing the exponent.
These two values can be varied independently to suit. For example, in many graphics pipelines, values are stored in reduced floating-point formats that fit into as few as 16 bits, trading both precision and range for space.
Numerical methods libraries expend large amounts of effort in ensuring that these limits are not exceeded to maintain the correctness of calculations. Casual use will not usually encounter these limits.
The choices in IEEE 754 are accepted to be a reasonable trade off between precision and range. The 32-bit single has similar, but different limits. The question What range of numbers can be represented in a 16-, 32- and 64-bit IEEE-754 systems? provides a longer summary and further links for more study.
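To see the range limit in isolation, as opposed to the precision limit, again in PostgreSQL (a small sketch):
SELECT 1e308::double precision * 10;  -- ERROR: value out of range: overflow
SELECT 1.5e-40::double precision;     -- fine: a tiny magnitude is a range question,
                                      -- not a precision one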
Per Wikipedia, double-precision floating-point numbers provide up to about 17 significant decimal digits. Their binary exponent runs from -1022 to 1023; since 2^1023 is about 1.8 * 10^308, that corresponds to a decimal exponent range of roughly -308 to +308, which is exactly the figure the book quotes.