Dividing large numbers in postgresql - postgresql

I am working with numbers that have 18 decimal places, and I have decided to store them as "NUMERIC(36)" in the database.
Now I want to present the value by doing the following division:
select (5032345678912345678::decimal / power(10, 18)::decimal )::decimal(36,18)
Result:
5.032345678912345700
Expected result:
5.032345678912345678
It works if I use a precision of 16 decimal places:
select (50323456789123456::decimal / power(10, 16)::decimal )::decimal(36,16)
Result: 5.0323456789123456
Any idea how to work with 18 decimal places without losing information?

Use a constant typed as decimal(38,18):
select 5032345678912345678::decimal / 1000000000000000000::decimal(38,18);
?column?
----------------------
5.032345678912345678
(1 row)
A constant should be a bit faster. However, the same cast should work for power(10, 18) as well.
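For completeness, a quick sketch of that second option, using the same value as in the question; casting the power() result to decimal(38,18) should give the same exact answer:
select 5032345678912345678::decimal / power(10, 18)::decimal(38,18);
-- should also return 5.032345678912345678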

Related

PostgreSQL: smaller timestamptz type?

timestamptz is 8 bytes in PostgreSQL. Is there a way to get a 6-byte timestamptz by dropping some precision?
6 bytes is pretty much out of the question, since there is no data type with that size.
With some contortions you could use a 4-byte real value:
CREATE CAST (timestamp AS bigint) WITHOUT FUNCTION;
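-- 662774400 s is the offset from the PostgreSQL timestamp epoch (2000-01-01) to 2021-01-01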
SELECT (localtimestamp::bigint / 1000000 - 662774400)::real;
float4
--------------
2.695969e+06
(1 row)
That would give you the time since 2021-01-01 00:00:00 with a precision of about a second (but of course for dates farther from that point, the precision will deteriorate).
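If you ever need the value back in readable form, here is a minimal sketch of reversing the trick, assuming the same 2021-01-01 anchor and the example value above:
SELECT timestamp '2021-01-01' + 2695969 * interval '1 second';
-- roughly 2021-02-01 04:52:49, give or take the precision lost in the real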
But the whole exercise is pretty much pointless. Trying to save 2 or 4 bytes in such a way will not be a good idea:
the space savings will be minimal; today, when you can have terabytes of storage with little effort, that seems pointless
if you don't carefully arrange your table columns, you will lose the bytes you think you have won to alignment issues
using a number instead of a proper timestamp data type will make your queries more complicated and the results hard to interpret, and it will keep you from using date arithmetic
For all these reasons, I would place this idea firmly in the realm of harmful micro-optimization.

How to Format a String with Leading Zeroes to Decimal in TSQL?

I'm trying to parse a string such that the amount field is formatted with 2 decimal places. The original data provides 2 decimal place accuracy. However, the amount field in the data file is 16 characters long with leading zeroes. For example:
0000000000407981
should convert to 4079.81.
I've tried the following in my select statement:
format((substring([Column 0],51,16)/100), '.00') as CheckAmount,
This produces an amount with 2 decimal places but rounds to the nearest whole dollar. I'm using SQL Server 2016.
How do I modify my statement to ensure the CheckAmount is formatted with 2 decimal point accuracy and contains an accurate value?
Update: I attempted to convert to integer as follows:
format(convert(int, (substring([Column 0],51,16)/100.0)), '.00') as CheckAmount,
Unfortunately, I receive an error stating:
Arithmetic overflow error converting varchar to data type numeric.
How should I remedy this?
Dividing by 100 does integer division, so you do not get decimal numbers. Dividing by 100.0 will.
declare @num nvarchar(20) = '0000000000407981';
select convert(bigint, @num)/100.0;
I was able to solve this issue with the following code snippet:
CAST((CAST(substring([Column 0],51,16) as int)/100.0) as decimal(18,2)) as CheckAmount,
Hope this helps anyone who might need to accomplish a similar task.
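One caveat on that snippet: if the 16-character field can ever exceed the int range (roughly $21 million once divided by 100), casting to bigint avoids the overflow. A sketch under that assumption:
CAST(CAST(substring([Column 0],51,16) as bigint)/100.0 as decimal(18,2)) as CheckAmount,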

Converting bigint to timestamp in presto

I have a column in my dataset that has a datatype of bigint:
Col1 Col2
1 1519778444938790
2 1520563808877450
3 1519880608427160
4 1520319586578960
5 1519999133096120
How do I convert Col2 to the following format:
year-month-day hr:mm:ss
I am not sure what format my current column is in but I know that it is supposed to be a timestamp.
Any help will be great, thanks!
Have you tried using functions like from_unixtime? You can use it to convert unix time to a timestamp, and then use date_format to display it the way you want. Notice that in your example the unix time is in microseconds, so you might want to convert it first to milliseconds.
I have not tested that but I am assuming that your code should look like:
date_format(from_unixtime(col2/1000), '%Y-%m-%d %h:%i:%s')
Notice that from_unixtime accepts also a time zone.
Please visit this page for more details about date-related functions: https://docs.starburstdata.com/latest/functions/datetime.html
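As a minimal sketch of the time-zone variant mentioned above (the zone name here is only an illustrative choice):
select from_unixtime(1519778444, 'America/New_York');
-- returns a timestamp with time zone rendered in that zone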
I believe the denominator should be 1000000, not 1000. Probably a typo. Anyway, just adding the test results here for others' reference.
-- Microseconds
select date_format(from_unixtime(cast('1519778444938790' as bigint)/1000000), '%Y-%m-%d %h:%i:%s');
2018-02-28 12:40:44
If you need to filter the data where the column is in BIGINT Unix format, you can use the following snippet to compare:
from_unixtime(d.started_on / 1000) >= CAST('2022-05-10 22:00:00' AS TIMESTAMP)
The accepted answer is a bit misleading: integer division truncates. For this data you should divide by 1000000.0 (note the decimal point), otherwise you lose the sub-second precision and are limited to whole seconds:
date_format(from_unixtime(col2/1000000.0), '%Y-%m-%d %h:%i:%s')

How to find the days having a drawdown greater than X bips?

What would be the most idiomatic way to find the days with a drawdown greater than X bips? I have worked my way through some queries, but they end up as boilerplate ... maybe there is a simpler, more elegant alternative:
q)meta quotes
c   | t f a
----| -----
date| z
sym | s
year| j
bid | f
ask | f
mid | f
then I do:
bips:50;
`jump_in_bips xdesc distinct select date,jump_in_bips from (update date:max[date],jump_in_bips:(max[mid]-min[mid])%1e-4 by `date$date from quotes where sym=accypair) where jump_in_bips>bips;
but this will give me the days for which there has been a jump of that many bips, and not only the drawdowns.
I can of course put this result in a temporary table and do several follow-up selects like:
select ... where mid=min(mid),date=X
select ... where mid=max(mid),date=X
to check that the max(mid) was before the min(mid) ... is there a simpler, more idiomatic way?
I think maxs is the key function here; it lets you maintain a running historical maximum and compare the current value against it. If you have some table quote which contains a series of mids (mid) and timestamps (date), the following query should return the days where you saw a drawdown greater than a certain value:
key select by `date$date from quote
where bips<({(maxs[x]-x)%1e-4};mid) fby `date$date
The lambda {(maxs[x]-x)%1e-4} computes, at each point, the drawdown from the historical maximum in bips, and the where clause checks whether it exceeds bips; fby lets you apply this group-wise by date. Grouping with a by on date and taking the key then returns the days on which this occurred.
If you want to preserve the information for the max drawdown you can use an update instead:
select max draw by date from
(update draw:(maxs[mid]-mid)%1e-4 by date from @[quote;`date;`date$])
where bips<draw
The date column is cast up front with an amend on quote, to avoid repeated casting inside the query.
The difference between the max and min mids for a given date may be either an increase or a drawdown, depending on whether the max mid precedes the min. Also, since a sym column exists, I assume you may have different symbols in the table and want to get drawdowns for all of them.
For example, if there are 3 quotes for a given day and sym: 1.3000 1.2960 1.3010, then the difference between the 2nd and 3rd is 50 pips, but this is an increase.
The following query can be used to get the dates and symbols with a drawdown higher than a given threshold:
select from
  (select drawdown:{max maxs[x]-x} mid by date,sym from quotes)
  where drawdown>bips*1e-4
{max maxs[x]-x} gives the maximum drawdown for a given date by subtracting each mid from the maximum of the preceding mids.

Handling oddly-formatted timestamp in Postgres?

I have about 32 million tuples of data of the format:
2012-02-22T16:46:28.9670320+00:00
I have been told that the +00:00 indicates an hour:minute timezone offset, but also that Postgres only accepts an hour offset (even with decimals), not the minutes. Would I have to process the data to remove the trailing :00 from every tuple before reading the data in as timestamps? I would like to avoid pre-processing the data file, but if Postgres will not accept the values otherwise, then I will do so.
In addition, the precision specified in the given data is 7 decimal places in the seconds part, whereas the Postgres timestamp data type allows a maximum of 6 decimal places (microseconds). Would I have to truncate the 7 decimal places to 6 for Postgres to read the records in, or will Postgres automatically convert 7 to 6 as it reads the tuples?
pgsql=# SELECT '2016-07-10 20:12:21.8372949999+02:30'::timestamp with time zone AS ts;
              ts
-------------------------------
 2016-07-10 17:42:21.837295+00
(1 row)
It seems that at least in PostgreSQL 9.4 and up (and possibly earlier), minute-level timezone offsets are processed properly even if this handling is not clearly documented. In a similar vein, if I read in a timestamp that has 7 decimal places of precision in the seconds, it is automatically converted to 6 decimal place (microsecond) precision instead.
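To illustrate with the exact format from the question, a small sketch (the displayed output assumes the session time zone is UTC):
SELECT '2012-02-22T16:46:28.9670320+00:00'::timestamptz;
-- 2012-02-22 16:46:28.967032+00: the +00:00 offset is accepted and the 7th fractional digit is dropped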