query group aggregate value per 24h - postgresql

My data looks like the following:
amount(double precision) time (timestamp without timezone)
3.53456532 "2021-03-29 09:41:09.052+00"
3.77389602 "2021-03-28 23:42:15.413+00"
3.77389602 "2021-03-28 23:42:10.176+00"
3.77389602 "2021-03-28 23:42:02.589+00"
3.77389602 "2021-03-28 23:41:57.226+00"
3.05223612 "2021-03-28 20:12:51.457+00"
21.55 "2021-03-28 18:50:35.174+00"
7.98374607 "2021-03-28 09:30:31.698+00"
What I would like to achieve is a select which would return me the following:
amount(double precision) time (timestamp without timezone)
3.53456532 "2021-03-29 00:00:00.000+00"
47.68156627 "2021-03-28 00:00:00.000+00"
So the total amount per 24h, I tried the following:
SELECT time as "time",
amount
FROM trades
GROUP BY DAY(time)
But I then get the following error:
pq: function day(timestamp with time zone) does not exist
I tried many alternatives but I am a bit stuck. Any help? Thanks!

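There is no day() function in PostgreSQL. Simply cast the time column to a date and aggregate on that. Since the error message shows the column is actually a timestamp with time zone, the cast uses the session's TimeZone setting to decide where each day starts.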
select time::date as time,
sum(amount) as amount
from trades
group by time::date
Please note: the values for 03/28 in your example sum to 47.68156627, not 45.91297.
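As a cross-check of the totals, here is a minimal Python sketch (not PostgreSQL) that applies the same idea as time::date plus sum(amount) to the sample rows from the question. The names rows and daily_totals are made up for illustration, and the timestamps are assumed to be UTC.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows copied from the question: (amount, timestamp, assumed UTC).
rows = [
    (3.53456532, "2021-03-29 09:41:09.052"),
    (3.77389602, "2021-03-28 23:42:15.413"),
    (3.77389602, "2021-03-28 23:42:10.176"),
    (3.77389602, "2021-03-28 23:42:02.589"),
    (3.77389602, "2021-03-28 23:41:57.226"),
    (3.05223612, "2021-03-28 20:12:51.457"),
    (21.55, "2021-03-28 18:50:35.174"),
    (7.98374607, "2021-03-28 09:30:31.698"),
]


def daily_totals(rows):
    """Sum amounts per calendar day, mirroring GROUP BY time::date."""
    totals = defaultdict(float)
    for amount, ts in rows:
        day = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f").date()
        totals[day] += amount
    return dict(totals)


totals = daily_totals(rows)
for day in sorted(totals, reverse=True):
    # Round for display only; the stored values are floats.
    print(day, round(totals[day], 8))
```

The per-day sums should agree with the output requested in the question, up to floating-point rounding.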

Related

TIMESTAMP- creation_date :: date between '2022-05-15' and '2022-06-15'

I just wanted to know the difference between these two codes:
select count (user_id) from tb_users where
creation_date :: date between '2022-05-15' and '2022-06-15'
Result: 41,232
select count (user_id) from tb_users where
creation_date between '2022-05-15' and '2022-06-15'
Result: 40,130
As far as I see, it is related with the timestamp, but I do not understand the difference.
Thank you!
Your creation_date column is most probably of type timestamp, e.g. '2022-05-15 00:00:00'. By adding ::date you cast the timestamp to a date: '2022-05-15'.
You can read more about casting data types here:
https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-cast/
When Postgres implicitly coerces a DATE value to a TIMESTAMP value, the hours, minutes and seconds are set to zero.
In the first query, you explicitly cast the creation date to DATE, so it is compared date-to-date with the provided DATE values.
In the second query, the creation date is of type TIMESTAMP, so PostgreSQL converts your DATE values to TIMESTAMP values and the comparison becomes
creation_date >= '2022-05-15 00:00:00' AND creation_date <= '2022-06-15 00:00:00'
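This produces a different result set than the first query: rows created on 2022-06-15 after midnight pass the date comparison but fail the timestamp comparison, which is why the second count is lower.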
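The difference can be reproduced outside the database. The following is a small Python sketch of the two comparisons, using a handful of made-up creation_date values (they are not from the asker's table) placed around the upper bound.

```python
from datetime import date, datetime, time

# Hypothetical creation_date values; only the last one falls after
# midnight on the final day of the range.
created = [
    datetime(2022, 5, 15, 0, 0, 0),
    datetime(2022, 6, 14, 23, 59, 59),
    datetime(2022, 6, 15, 0, 0, 0),
    datetime(2022, 6, 15, 8, 30, 0),
]

lo, hi = date(2022, 5, 15), date(2022, 6, 15)

# Like creation_date::date BETWEEN lo AND hi: only the date part is compared.
by_date = [c for c in created if lo <= c.date() <= hi]

# Like creation_date BETWEEN lo AND hi: the bounds become midnight timestamps.
lo_ts = datetime.combine(lo, time.min)
hi_ts = datetime.combine(hi, time.min)
by_timestamp = [c for c in created if lo_ts <= c <= hi_ts]

print(len(by_date), len(by_timestamp))
```

To include the whole last day without casting the column, a half-open range such as creation_date >= '2022-05-15' AND creation_date < '2022-06-16' is a common alternative.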

SUM Data if Duplicate in Postgres

I have a table with three columns that may contain duplicate data. If the BATCH column has duplicate values, START_S and END_S should be combined as in the example below.
CREATE TABLE "DRL_FTO3_DI1_A0_BATCH"
(
"BATCH" character varying(128),
"START_S" integer,
"END_S" integer
);
INSERT INTO "DRL_FTO3_DI1_A0_BATCH"(
"BATCH", "START_S", "END_S")
VALUES ('Batch 1_1',1451120920,1451121008),
('Batch 01_2',1451389014,1451389100),
('Batch 2_1',1451534680,1451534918),
('Batch 3_1',1451539145,1451539264),
('Parth_2',1451540990,1451541285),
('Parth_2',1451541676,1451542254);
SELECT "BATCH",((TIMESTAMP WITHOUT Time Zone 'epoch' + "START_S" * INTERVAL '1 second') AT TIME ZONE 'UTC')::TIMESTAMP WITHOUT Time Zone,
((TIMESTAMP WITHOUT Time Zone 'epoch' + "END_S" * INTERVAL '1 second') AT TIME ZONE 'UTC')::TIMESTAMP WITHOUT Time Zone
FROM "DRL_FTO3_DI1_A0_BATCH"
Now, as we can see, Parth_2 is a duplicate value, so START_S and END_S for Parth_2 should be
Parth_2 2015-12-31 11:19:50 2015-12-31 11:40:54
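You could do it using GROUP BY and the MIN/MAX aggregate functions (you can convert the results to timestamps afterwards, using the expressions from your query, in whatever format you desire). Because the table and columns were created with quoted upper-case names, they must be quoted here as well:
SELECT "BATCH", MIN("START_S"), MAX("END_S")
FROM "DRL_FTO3_DI1_A0_BATCH"
GROUP BY "BATCH"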
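To check the expected row for Parth_2, here is a minimal Python sketch of the same MIN/MAX grouping over the sample data. The helper names are illustrative, and the UTC+05:30 offset is an assumption inferred from the expected output in the question (the epoch values correspond to 05:49:50 and 06:10:54 UTC).

```python
from datetime import datetime, timedelta, timezone

# Rows copied from the INSERT statement in the question.
rows = [
    ("Batch 1_1", 1451120920, 1451121008),
    ("Batch 01_2", 1451389014, 1451389100),
    ("Batch 2_1", 1451534680, 1451534918),
    ("Batch 3_1", 1451539145, 1451539264),
    ("Parth_2", 1451540990, 1451541285),
    ("Parth_2", 1451541676, 1451542254),
]

# Assumption: the asker's session time zone is UTC+05:30.
SESSION_TZ = timezone(timedelta(hours=5, minutes=30))


def merge_batches(rows):
    """Keep the smallest START_S and largest END_S per batch name."""
    merged = {}
    for batch, start_s, end_s in rows:
        if batch in merged:
            lo, hi = merged[batch]
            merged[batch] = (min(lo, start_s), max(hi, end_s))
        else:
            merged[batch] = (start_s, end_s)
    return merged


def to_local(epoch_s):
    """Render epoch seconds as a timestamp in the assumed session zone."""
    local = datetime.fromtimestamp(epoch_s, tz=SESSION_TZ)
    return local.strftime("%Y-%m-%d %H:%M:%S")


merged = merge_batches(rows)
for batch, (start_s, end_s) in merged.items():
    print(batch, to_local(start_s), to_local(end_s))
```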

How can I have timestamp displayed in UTC+02 (same timezone) for both the queries below?

My first query is:
SELECT distinct wfc_request_job_id,wfc_request_job_info,
replace(iso_cc,';',' ') as "iso_cc",to_char(wfc_request_start_ts,'yyyy-MM-dd HH:mm:ss') as ts,
sent_message_count,
(link_object_count + poi_object_count + point_address_object_count) as request_object_count
FROM wfc_request_job
where
wfc_request_job_id=173526;
This returns ts as 2015-08-16 03:08:59
Second Query:
SELECT wfc_request_job_id,wfc_request_start_ts,wfc_request_end_ts,replace(iso_cc,';',' ') as "iso_ccs",sent_message_count,wfc_queue_name
FROM wfc_request_job
where
to_char(wfc_request_start_ts,'YYYY-MM-DD') >= to_char(to_date('08/16/2015','MM/DD/YYYY'),'YYYY-MM-DD')
and to_char(wfc_request_start_ts,'YYYY-MM-DD') <= to_char(to_date('08/16/2015','MM/DD/YYYY'),'YYYY-MM-DD')
order by wfc_request_job_id desc
This returns ts of the job id mentioned above as - "2015-08-16 15:58:59.809+02"
How can I make both queries return ts in UTC+02, i.e. in the same time zone?
The data type of wfc_request_start_ts is timestamp with time zone.
I changed the queries to use the format HH24:MI:SS, however that did not help. Please note that the webapp using these queries will be opened in both Germany and the USA.
According to the PostgreSQL manual, to_char has a TZ template pattern (and OF as of 9.4) for date/time formatting.
So add it to the format string in your query:
postgres=# select to_char(now(),'yyyy-MM-dd HH24:mm:ss TZ');
to_char
------------------------
2015-08-19 12:08:56 CEST
(1 row)
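Note that the 08 in the minutes position above is the month: in to_char, MM (in either case) means month, MI means minutes, and HH is the 12-hour clock. The full pattern should be 'YYYY-MM-DD HH24:MI:SS TZ'. This is also why your first query shows 15:58:59 as 03:08:59.
Also, make sure you specify the time zone when converting,
so instead of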
to_date('08/16/2015','MM/DD/YYYY')
use
TIMESTAMP WITH TIME ZONE '2015-08-16 00:00:00+02';
in second query.
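The underlying idea, converting the instant to one fixed zone before formatting it, can be sketched in Python as follows. This is an illustration only, not the PostgreSQL solution; in PostgreSQL the AT TIME ZONE construct plays the role that astimezone plays here. The names UTC_PLUS_2 and render are made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC+02 offset; a named zone would be needed to follow DST changes.
UTC_PLUS_2 = timezone(timedelta(hours=2))

# The instant from the question, 2015-08-16 15:58:59.809+02, stored as UTC.
ts_utc = datetime(2015, 8, 16, 13, 58, 59, 809000, tzinfo=timezone.utc)


def render(ts, tz=UTC_PLUS_2):
    """Convert an aware timestamp to tz, then format hours, minutes, seconds."""
    return ts.astimezone(tz).strftime("%Y-%m-%d %H:%M:%S%z")


print(render(ts_utc))
```

Formatting after an explicit conversion gives the same text regardless of where the client or server happens to be located.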

Postgresql Query very slow with ::date, ::time, and interval

I have a sql query that is very slow:
select number1 from mytable
where symbol = 25
and timeframe = 1
and date::date = '2008-02-05'
and date::time='10:40:00' + INTERVAL '30 minutes'
The goal is to return one value, and PostgreSQL takes 1.7 seconds to return it (always a single value). I need to execute hundreds of those queries for one task, so this gets extremely slow.
Executing the same query, but pointing to the time directly without using interval and ::date, ::time takes only 17ms:
select number1 from mytable
where symbol = 25
and timeframe = 1
and date = '2008-02-05 11:10:00'
I thought it would be faster if I would not use ::date and ::time, but when I execute a query like:
select number1 from mytable
where symbol = 25
and timeframe = 1
and date = '2008-02-05 10:40:00' + interval '30 minutes'
I get a sql error (22007). I've experimented with different variations but I couldn't get interval to work without using ::date and ::time. Date/Time Functions on postgresql.org didn't help me out.
The table has a multi-column index on symbol, timeframe, date.
Is there a fast way to execute the query with adding time, or a working syntax with interval where I do not have to use ::date and ::time? Or do I need to have a special index when using queries like these?
Postgresql version is 9.2.
Edit:
The format of the table is:
date = timestamp with time zone,
symbol, timeframe = numeric.
Edit 2:
Using
select open from ohlc_dukascopy_bid
where symbol = 25
and timeframe = 1
and date = timestamp '2008-02-05 10:40:00' + interval '30' minute
Explain shows:
"Index Scan using mcbidindex on mytable (cost=0.00..116.03 rows=1 width=7)"
" Index Cond: ((symbol = 25) AND (timeframe = 1) AND (date = '2008-02-05 11:10:00'::timestamp without time zone))"
Time is now considerably faster: 86ms on first run.
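The first version will not use a (regular) index on the column named date, because its conditions are on the expressions date::date and date::time rather than on the column itself.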
You didn't provide much information, but assuming the column named date has the datatype timestamp (and not date), then the following should work:
and date = timestamp '2008-02-05 10:40:00' + interval '30 minutes'
This should use an index on the column named date (but only if it is in fact a timestamp, not a date). It is essentially the same as yours; the only difference is the explicit timestamp keyword. That keyword matters: without it, Postgres tries to resolve the untyped string literal to the type of the other operand, an interval, which is most likely the cause of your 22007 error.
You will need to run an explain to find out if it's using an index.
And please: change the name of that column. It's bad practice to use an SQL keyword as an identifier, and it's a really horrible name, which doesn't say anything about what kind of information is stored in the column. Is it the "start date", the "end date", the "due date", ...?

Date range in PostgreSQL

When I apply a date range to my query, is there any way to display the dates used in the date range even if there is no data for those dates?
Suppose I use,
... where date between '1/12/2010' and '31/12/2010' order by date
What I want in my result is the sum of the amount column up to 1/12/2010 shown for that day, even if there is no data for that date, and the same for 31/12/2010.
Join with generate_series() to fill in the gaps.
Example:
CREATE TEMP TABLE foo AS SELECT CURRENT_DATE AS today;
SELECT
COUNT(foo.*),
generate_series::date
FROM
foo
RIGHT JOIN generate_series('2010-12-18', '2010-12-25', interval '1 day') ON generate_series = today
GROUP BY
generate_series;
Result:
0,'2010-12-18'
0,'2010-12-19'
1,'2010-12-20'
0,'2010-12-21'
0,'2010-12-22'
0,'2010-12-23'
0,'2010-12-24'
0,'2010-12-25'
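The same gap-filling idea, written as a small Python sketch for readers who want to see the logic outside SQL. The date_series helper is a stand-in for generate_series with a one-day step, and the single event dated 2010-12-20 is hypothetical data chosen to match the result above.

```python
from collections import Counter
from datetime import date, timedelta


def date_series(start, stop):
    """Yield every date from start to stop inclusive, one day at a time."""
    day = start
    while day <= stop:
        yield day
        day += timedelta(days=1)


# Hypothetical data: one row, dated 2010-12-20.
events = [date(2010, 12, 20)]
counts = Counter(events)

# Every day in the range gets a row; days without data get a count of 0,
# which is what the RIGHT JOIN against the generated series achieves.
filled = [
    (counts.get(day, 0), day)
    for day in date_series(date(2010, 12, 18), date(2010, 12, 25))
]

for count, day in filled:
    print(count, day)
```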