POSTGRES SUM big someString::decimal - postgresql

I am trying to SUM and cast at the same time. I have a column with big numbers with a lot of decimals, for example: "0.0000000000000000000000000000000000000000000043232137067129047".
When I try sum(amount::decimal) I get the following error message:
org.jkiss.dbeaver.model.sql.DBSQLException: SQL Error [22003]: ERROR: value overflows numeric format Where: parallel worker
What I don't get is that the docs say "up to 131072 digits before the decimal point; up to 16383 digits after the decimal point",
and my longest cast string is 63 digits, so I don't understand the overflow.
What am I missing, and how can I make my sum work?
EDIT:
amount type is varchar(255)
EDIT2:
I found out that it only breaks when I try to CREATE a table from this query; the query itself runs fine. How can CREATE TABLE cause this?
Complete request:
create table cross_dapp_ft as (
    select sender, receiver, sum(amount::decimal), contract
    from ft_transfer_event ftce
    where receiver in (
        select account_id
        from batch.cc cc
        where classification not in ('ft')
    )
    group by sender, receiver, contract
);

As Samuel Liew suggested in the comments, some rows were corrupted. The conclusion: to be safe, don't store numbers as strings.
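To confirm which rows are the problem (a minimal sketch using the table and column names from the question; the regex only accepts plain unsigned decimals, so adjust it if signs or exponents are legitimate in your data), you can list every amount that would not survive the cast:
-- find values that are not plain decimal numbers and would break amount::decimal
select sender, receiver, contract, amount
from ft_transfer_event
where amount !~ '^[0-9]+(\.[0-9]+)?$';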

Related

How to specify on error behavior for postgresql conversion to UUID

I need to write a query to join 2 tables based on a UUID field.
Table 1 contains user_uuid of type uuid.
Table 2 has this user_uuid at the end of a url, after the last slash.
The problem is that sometimes this url contains another value that is not castable to uuid.
My workaround below works pretty well:
LEFT JOIN table2 on table1.user_uuid::text = regexp_replace(table2.url, '.*[/](.*)$', '\1')
However, I have a feeling that a better solution would be to cast to uuid before joining.
And here I have a problem. This query:
LEFT JOIN table2 on table1.user_uuid = cast (regexp_replace(table2.url, '.*[/](.*)$', '\1') as uuid)
gives ERROR: invalid input syntax for type uuid: "rfa-hl-21-014.html" SQL state: 22P02
Is there any elegant way to specify the behavior on cast error? I mean without tons of regexp checks and case-when-then-end...
Appreciate any help and ideas.
There are additional considerations when converting a uuid to text. Postgres will yield the converted value in standard form (lower case and hyphenated). However, other formats of the same uuid value could occur in your input, for example upper case or without hyphens. As text these would not compare equal, but as uuid they would. See demo here.
select *
from table1 t1
join table2 t2
on replace(t1.t_uuid::text, '-', '') = replace(lower(t2.t_stg), '-', '');
Since your data clearly contains non-uuid values, you cannot assume standard uuid format either. There are also additional formats (although apparently not used often) for a valid UUID. You may want to review the UUID Type documentation.
You could cast the uuid from table 1 to text and join that with the suffix from table 2. That will never give you a type conversion error.
This might require an extra index on the expression in the join condition if you need fast nested loop joins.
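A sketch of what that can look like, using the table and column names from the question (the index expression has to match the join expression exactly for the planner to use it):
-- index the extracted suffix so a nested loop join can use it
create index on table2 (regexp_replace(url, '.*[/](.*)$', '\1'));

select *
from table1 t1
left join table2 t2
  on t1.user_uuid::text = regexp_replace(t2.url, '.*[/](.*)$', '\1');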

How to get max length (in bytes) of a variable-length column?

I want to get max length (in bytes) of a variable-length column. One of my columns has the following definition:
shortname character varying(35) NOT NULL
userid integer NOT NULL
columncount smallint NOT NULL
I tried to retrieve some info from the pg_attribute table, but the attlen column has -1 value for all variable-length columns. I also tried to use pg_column_size function, but it doesn't accept the name of the column as an input parameter.
It can be easily done in SQL Server.
Are there any other ways to get the value I'm looking for?
You will need a CASE expression that checks pg_attribute.attlen and then calculates the maximum size in bytes depending on that. To get the max size for a varchar column, you can "steal" the expression used in information_schema.columns.character_octet_length for varchar or char columns.
Something along these lines:
select a.attname,
       t.typname,
       case
         when a.attlen <> -1 then a.attlen
         when t.typname in ('bytea', 'text') then pg_size_bytes('1GB')
         when t.typname in ('varchar', 'char') then information_schema._pg_char_octet_length(information_schema._pg_truetypid(a.*, t.*), information_schema._pg_truetypmod(a.*, t.*))
       end as max_bytes
from pg_attribute a
  join pg_type t on a.atttypid = t.oid
where a.attrelid = 'stuff.test'::regclass
  and a.attnum > 0
  and not a.attisdropped;
Note that this won't return a proper size for numeric, as that is also a variable-length type. The documentation says "the actual storage requirement is two bytes for each group of four decimal digits, plus three to eight bytes overhead".
As a side note: this seems an extremely strange thing to do. Especially with your mentioning of temp tables in stored procedures. More often than not, the use of temp tables is not needed in Postgres. Instead of blindly copying the old approach that might have worked well in SQL Server, you should understand how Postgres works and change the approach to match the best practices in Postgres.
I have seen many migrations fail or deliver mediocre performance because of the assumption that the best practices for "System A" can be applied without any change to "System B". You need to migrate your mindset as well.
If this checks the columns of a temp table, then why not simply check the actual size of the column values using pg_column_size()?
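For example (a sketch; the column names come from the question, but the table name is hypothetical since it is not given), pg_column_size() reports the size each stored value actually occupies:
-- actual stored size of the values, in bytes
select max(pg_column_size(shortname))   as max_shortname_bytes,
       max(pg_column_size(userid))      as max_userid_bytes,
       max(pg_column_size(columncount)) as max_columncount_bytes
from my_temp_table;  -- hypothetical table name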

Generating a random 10-digit ID number in PostgreSQL

I apologize if this has been answered elsewhere. I found similar posts, but nothing that exactly matches what I was looking for. I am wondering if it is possible to randomly assign a number as data is entered into a table. For example, for this table:
CREATE TABLE test (
id_number INTEGER NOT NULL,
second_number INTEGER NOT NULL)
I would like to be able to do the following:
INSERT INTO test (second_number)
VALUES (1234567)
where id_number is then populated with a random 10-digit number. I guess I want this to behave like SERIAL in the way it populates, but it needs to be 10 digits and random. Thanks very much.
You can try this expression:
CAST(1000000000 + floor(random() * 9000000000) AS bigint)
This will give you a random number between 1000000000 and 9999999999.
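A sketch of wiring that into the table from the question (not part of the original answer): the column has to be bigint rather than integer, because a 10-digit value does not fit into 4 bytes, and random() does not guarantee uniqueness, so add a UNIQUE constraint if the value must work as an ID:
create table test (
    id_number     bigint  not null
        default cast(1000000000 + floor(random() * 9000000000) as bigint),
    second_number integer not null
);

insert into test (second_number) values (1234567);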

Postgres large numeric value operations

I am trying some operations on large numeric values, such as 2^89.
Postgres's numeric data type can store up to 131072 digits to the left of the decimal point and 16383 digits to the right.
I tried something like this and it worked:
select 0.037037037037037037037037037037037037037037037037037037037037037037037037037037037037037037037037037::numeric;
But when I use an operator, it rounds off the value after 14 digits.
select (2^89)::numeric(40,0);
numeric
-----------------------------
618970019642690000000000000
(1 row)
I know the value from elsewhere is:
>>> 2**89
618970019642690137449562112
Why this strange behavior? It is not letting me enter numeric values beyond 14 digits into the database.
insert into x select (2^89-1)::numeric;
select * from x;
x
-----------------------------
618970019642690000000000000
(1 row)
Is there any way to circumvent this?
Thanks in advance.
bb23850
You should not cast the result but rather one part of the operation, to make clear that this is a numeric operation, not an integer one:
select (2^89::numeric)
Otherwise PostgreSQL takes the 2 and the 89 as type integer, and the integer ^ integer operation is carried out in double precision, which only keeps about 15 significant digits. Your cast is a cast of that already-inaccurate result, so it cannot work.
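For example, casting one operand is enough to make the whole expression use numeric arithmetic, and the result should come back exact:
select (2^89::numeric)::numeric(40,0) as exact_value;
--          exact_value
-- -----------------------------
--  618970019642690137449562112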

ltrim(rtrim(x)) leaves blanks on RTL content - anyone know of a workaround?

I have a table [Company] with a column [Address3] defined as varchar(50).
I cannot control the values entered into that table, but I need to extract the values without leading and trailing spaces. I perform the following query:
SELECT DISTINCT RTRIM(LTRIM([Address3])) Address3 FROM [Company] ORDER BY Address3
The column contains both RTL and LTR values.
Most of the data is retrieved correctly, but SOME (not all) RTL values are returned with leading and/or trailing spaces.
I attempted to perform the following query:
SELECT DISTINCT ltrim(rTRIM(ltrim(rTRIM([Address3])))) c, ltrim(rTRIM([Address3])) b, [Address3] a, rtrim(LTRIM([Address3])) Address3 FROM [Company] ORDER BY Address3
but it returned the same problem on all columns - does anyone have any idea what could cause it?
The rows that return with extraneous spaces might have a kind of space or invisible character that the trim functions don't know about. The documentation doesn't even mention what is considered "a blank" (pretty damn sloppy if you ask me). Try taking one of those rows and looking at the characters one by one to see what they are.
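One rough way to do that (a sketch; fill in a WHERE clause that picks a row you know is affected):
--dump one suspicious value character by character, with its ASCII code
DECLARE @v varchar(50)
SELECT TOP (1) @v = Address3 FROM Company  --add a WHERE clause that targets a known bad row

SELECT n AS Position
      ,SUBSTRING(@v, n, 1) AS Ch
      ,ASCII(SUBSTRING(@v, n, 1)) AS AsciiCode
FROM (SELECT TOP (DATALENGTH(@v))
             ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
      FROM sys.all_objects) AS Numbers
ORDER BY n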
Since you are using varchar, just do this to get the ASCII code of all the bad characters:
--identify the bad character
SELECT
COUNT(*) AS CountOf
,'>'+RIGHT(LTRIM(RTRIM(Address3)),1)+'<' AS LastChar_Display
,ASCII(RIGHT(LTRIM(RTRIM(Address3)),1)) AS LastChar_ASCII
FROM Company
GROUP BY RIGHT(LTRIM(RTRIM(Address3)),1)
ORDER BY 3 ASC
Do a one-time fix to the data to remove the bogus character, where xxxx is the ASCII value identified in the previous select:
--only one bad character found in previous query
UPDATE Company
SET Address3=REPLACE(Address3,CHAR(xxxx),'')
--multiple different bad characters found by previous query
UPDATE Company
SET Address3=REPLACE(REPLACE(Address3,CHAR(xxxx1),''),char(xxxx2),'')
If you have bogus chars in your data, remove them from the data rather than cleaning them up each time you select the data. You WILL have to add this REPLACE logic to all INSERTs and UPDATEs on this column, to keep any new data from having the bogus characters.
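For example, on the write path (a sketch only: CHAR(160), the non-breaking space, is just a guess at the culprit, and @NewAddress3 stands in for whatever value your application supplies):
--strip the bogus character before it ever reaches the table
INSERT INTO Company (Address3)
VALUES (LTRIM(RTRIM(REPLACE(@NewAddress3, CHAR(160), ''))))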
If you can't alter the data, you can just select it this way:
SELECT
LTRIM(RTRIM(REPLACE(Address3,CHAR(xxxx),'')))
,LTRIM(RTRIM(REPLACE(REPLACE(Address3,CHAR(xxxx1),''),char(xxxx2),'')))
...