I have been tasked with converting some Oracle DBs to PostgreSQL, using the AWS Schema Conversion Tool. Oracle only has the Number[(precision[,scale])] type. It would seem that Oracle stores integers using fixed-point. A lot of ID columns are defined as Number(19,0), which SCT converts to Numeric(19,0).
I've even seen one plain Number (so far) that gets converted to Double Precision.
Postgres has proper scalar integer types like bigint.
At first blush it seems that storing integers as fixed-point numbers would be grossly inefficient in both storage and time compared to simple integers.
Am I missing something? Does Oracle store them as efficient scalar ints under the covers?
Out of interest, what's the best type for a simple ID column in Oracle?
Oracle's number data type is a variable-length data type, which means that the value 1 uses less storage (and memory) than 123456789.
In theory number(19,0) should be mapped to bigint; however, Oracle's number(19,0) allows storing a value like 9999999999999999999, which would exceed the range of a bigint in Postgres.
The biggest value a bigint can store is 9223372036854775807 - if that is enough for you, then stick with bigint.
If you really need higher values, you will have to bite the bullet and use numeric in Postgres.
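As a quick illustration of that choice, here is a minimal sketch (table and column names are invented) of the two possible mappings on the Postgres side:

-- Map NUMBER(19,0) to bigint when the IDs stay within its range;
-- fall back to numeric(19,0) only if values with 19 nines could really occur.
CREATE TABLE orders (
    id bigint PRIMARY KEY            -- handles every value up to 9223372036854775807
    -- id numeric(19,0) PRIMARY KEY  -- only needed if values up to 9999999999999999999 can occur
);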
Related
I have a table with about 200 million rows and multiple columns of type DECIMAL(p,s) with varying precisions/scales.
Now, as far as I understand, DECIMAL(p,s) is a fixed-size column, with the size depending on the precision, see:
https://learn.microsoft.com/en-us/sql/t-sql/data-types/decimal-and-numeric-transact-sql?view=sql-server-ver16
Now, when altering the table and changing a column from DECIMAL(15,2) to DECIMAL(19,6), for example, I would have expected there to be almost no work for SQL Server to do, as the number of bytes required to store the value is the same. Yet the alter itself takes a long time - so what exactly does the server do when I execute the alter statement?
Also, is there any benefit (other than having constraints on a column) to storing a DECIMAL(15,2) instead of, for example, a DECIMAL(19,2)? It seems to me the storage requirements would be the same, but I could store larger values in the latter.
Thanks in advance!
The precision and scale of a decimal / numeric type matters considerably.
As far as SQL Server is concerned, decimal(15,2) is a different data type to decimal(19,6), and is stored differently. You therefore cannot make the assumption that just because the overall storage requirements do not change, nothing else does.
SQL Server stores decimal data types in byte-reversed (little-endian) format, with the scale being the first incrementing value, so changing the definition can require the data to be re-written. SQL Server will use an internal worktable to safely convert the data and update the values on every page.
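For illustration, a sketch of such a change (table and column names are made up). Even though decimal(15,2) and decimal(19,6) both take 9 bytes per value, SQL Server treats them as distinct types and converts the stored data:

-- Same on-disk size per value, but still a full conversion, because the
-- values are stored relative to the declared scale.
ALTER TABLE dbo.Sales
    ALTER COLUMN Amount DECIMAL(19,6);  -- add NULL/NOT NULL to match the existing column definition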
I have a column in Postgres where data looks something like this:
1.8,
3.4,
7,
1.2,
3
So it has floating-point numbers in it as well as integers...
What would be the right type for this kind of column?
Numeric data type ?
Here is a similar question: Data types to store Integer and Float values in SQL Server
Numeric should work!
Another option is to use a VARCHAR column, and store a character representation of the value.
But it seems that you would need some kind of indicator of which type of value was stored, and there are several drawbacks to this approach. One big drawback is that it would allow "invalid" values to be stored.
Another approach would be to use two columns, one INTEGER and the other FLOAT, specify a precedence, and let a NULL value in the INTEGER column indicate that the value is stored in the FLOAT column.
For all datatypes in SQL look here: Data types to store Integer and Float values in SQL Server
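If exactness matters, a plain numeric column handles both kinds of values; a small sketch (the table name is made up):

-- numeric stores whole and fractional values exactly, so a mix is fine.
CREATE TABLE measurements (value numeric);
INSERT INTO measurements (value) VALUES (1.8), (3.4), (7), (1.2), (3);
SELECT value FROM measurements;   -- returns 1.8, 3.4, 7, 1.2, 3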
I am working on an application that requires monetary calculations, so we're using BigDecimal to process such numbers.
I currently store the BigDecimals as a string in a PostgreSQL database. It made the most sense to me because I now am sure that the numbers will not lose precision as opposed to when they are stored as a double in the database.
The thing is that I cannot really run many queries against that (e.g. 'smaller than X' doesn't work on a number stored as text).
For numbers I do have to perform complex queries on, I just create a new column called indexedY (where Y is the name of the original column), e.g. I have amount (string) and indexedAmount (double). I convert amount to indexedAmount by calling toDouble() on the BigDecimal instance.
I then run the query, and when a match is found, I convert the string version of the same number to a BigDecimal and apply the filter once again (this time on the fetched object), just to make sure I didn't have any rounding errors while the double was in transit (from the application to the DB and back to the application).
I was wondering if I can avoid this extra step of creating the indexedY columns.
So my question comes down to this: is it safe to just store the outcome of a BigDecimal as a double in a (PostgreSQL) table without losing precision?
If BigDecimal is required, I would use a NUMERIC type with as much precision and scale as you need, e.g. NUMERIC(20, 6). (Note that the scale counts against the precision, so NUMERIC(20, 20) would leave no digits before the decimal point.)
However, if you only need 15 digits of precision, using a double in the database might be fine, in which case it should be fine in Java too.
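As a sketch of what that could look like (table and column names are invented), a numeric column keeps the exact value and still supports range queries, which removes the need for a parallel indexedAmount column:

-- Exact storage plus ordinary comparisons and indexing.
CREATE TABLE payments (amount numeric(20,6) NOT NULL);
CREATE INDEX payments_amount_idx ON payments (amount);
SELECT * FROM payments WHERE amount < 100.00;   -- compares numerically, no second rounding-check step needed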
In Java I can say Integer.MAX_VALUE to get the largest number that the int type can hold.
Is there a similar constant/function in Postgres? I'd like to avoid hard-coding the number.
Edit: the reason I am asking is this. There is a legacy table with an ID of type integer, backed by a sequence. A lot of rows are coming into this table. I want to calculate how much time is left before the integer runs out, so I need to know "how many IDs are left" divided by "how fast we are using them".
There's no constant for this, but I think it's more reasonable to hard-code the number in Postgres than it is in Java.
In Java, the philosophical goal is for Integer to be an abstract value, so it makes sense that you'd want to behave as if you don't know what the max value is.
In Postgres, you're much closer to the bare metal and the definition of the integer type is that it is a 4-byte signed integer.
There is a legacy table with an ID of type integer, backed by a sequence.
In that case, you can get the max value of the sequence by:
select seqmax from pg_sequence where seqrelid = 'your_sequence_name'::regclass;
This might be better than using MAX_INT, because the sequence may have been created or altered with a specific max value that differs from MAX_INT.
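As a rough sketch (the sequence name and the insert rate are placeholders), the remaining headroom and time left could be estimated like this:

-- last_value comes straight from the sequence; 2147483647 is the 4-byte integer maximum.
SELECT 2147483647 - last_value              AS ids_remaining,
       (2147483647 - last_value) / 100000.0 AS days_left_at_100k_rows_per_day
FROM   your_sequence_name;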
I want to store a 15 digit number in a table.
In terms of lookup speed should I use bigint or varchar?
Additionally, if it's populated with millions of records will the different data types have any impact on storage?
In terms of space, a bigint takes 8 bytes while a 15-character string will take up to 19 bytes (a 4-byte header plus up to another 15 bytes for the data), depending on its value. Also, generally speaking, equality checks should be slightly faster on a numeric type. Other than that, it depends largely on what you intend to do with this data. If you intend to use numerical/mathematical operations or query by range, you should use a bigint.
You can store them as BIGINT, as comparisons on integers are faster than on varchar. I would also advise creating an index on the column if you are expecting millions of records, to make data retrieval faster.
You can also check this forum:
Indexing is fastest on integer types. Char types are probably stored underneath as integers (guessing.) VARCHARs are variable allocation and when you change the string length it may have to reallocate. Generally integers are fastest, then single characters, then CHAR[x] types, and then VARCHARs.
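For what it's worth, a quick way to see the size difference yourself (the values are just examples) is pg_column_size():

-- bigint is a fixed 8 bytes; the varchar carries a length header on top of the digits.
SELECT pg_column_size(123456789012345::bigint)    AS bigint_bytes,   -- 8
       pg_column_size('123456789012345'::varchar) AS varchar_bytes;  -- 19 (4-byte header + 15 digits)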