I have a table like this, and there are three cases:
## case a
| rec_no | read_time | id
+--------+---------------------+----
| 45139 | 2023-02-07 17:00:00 | a
| 45140 | 2023-02-07 17:15:00 | a
| 45141 | 2023-02-07 17:30:00 | a
| 45142 | 2023-02-07 18:15:00 | a
| 45143 | 2023-02-07 18:30:00 | a
| 45144 | 2023-02-07 18:45:00 | a
## case b
| rec_no | read_time | id
+--------+---------------------+----
| 21735 | 2023-02-01 19:15:00 | b
| 21736 | 2023-02-01 19:30:00 | b
| 21742 | 2023-02-01 21:00:00 | b
| 21743 | 2023-02-01 21:15:00 | b
| 21744 | 2023-02-01 21:30:00 | b
| 21745 | 2023-02-01 21:45:00 | b
## case c
| rec_no | read_time | id
+--------+---------------------+----
| 12345 | 2023-02-02 12:15:00 | c
| 12346 | 2023-02-02 12:30:00 | c
| 12347 | 2023-02-02 12:45:00 | c
| 12348 | 2023-02-02 13:15:00 | c
| 12352 | 2023-02-02 14:00:00 | c
| 12353 | 2023-02-02 14:15:00 | c
I'd like to find the missing read_time values wherever rec_no is not continuous.
read_time comes in 15-minute intervals.
The rec_no sequences are independent for each id.
I'd like something like this:
## case a
## nothing because rec_no is continuous
| read_time | id
+---------------------+----
## case b
## get five rows
| read_time | id
+---------------------+----
| 2023-02-01 19:45:00 | b
| 2023-02-01 20:00:00 | b
| 2023-02-01 20:15:00 | b
| 2023-02-01 20:30:00 | b
| 2023-02-01 20:45:00 | b
## case c
## get two rows (13:00:00 is missing but rec_no is continuous)
| read_time | id
+---------------------+----
| 2023-02-02 13:30:00 | c
| 2023-02-02 13:45:00 | c
Is there a way to do this? The output format is not too important as long as I get the correct result.
step-by-step demo: db<>fiddle
SELECT
rec_no,
id,
gs
FROM (
SELECT
*,
lead(rec_no) OVER (PARTITION BY id ORDER BY rec_no) - rec_no > 1 AS is_gap, -- 1
lead(read_time) OVER (PARTITION BY id ORDER BY rec_no) as next_read_time
FROM mytable
) s, generate_series( -- 3
read_time + interval '15 minutes', -- 4
next_read_time - interval '15 minutes',
interval '15 minutes'
) as gs
WHERE is_gap -- 2
1. Use the lead() window function to pull the next rec_no and the next read_time into the current row. With this you can check whether the difference between the current and the next rec_no is greater than 1.
2. Filter for the records with a greater difference.
3. Generate a time series at 15-minute intervals.
4. Because generate_series() includes both its start and stop values, start at the next 15-minute slot (+ interval) and stop one slot before the next recorded value (- interval).
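For reference, a minimal setup sketch to reproduce the demo locally, built from the sample rows in the question (the table name mytable matches the query above):

create table mytable (rec_no int, read_time timestamp, id text);

insert into mytable (rec_no, read_time, id) values
  (45139, '2023-02-07 17:00:00', 'a'), (45140, '2023-02-07 17:15:00', 'a'),
  (45141, '2023-02-07 17:30:00', 'a'), (45142, '2023-02-07 18:15:00', 'a'),
  (45143, '2023-02-07 18:30:00', 'a'), (45144, '2023-02-07 18:45:00', 'a'),
  (21735, '2023-02-01 19:15:00', 'b'), (21736, '2023-02-01 19:30:00', 'b'),
  (21742, '2023-02-01 21:00:00', 'b'), (21743, '2023-02-01 21:15:00', 'b'),
  (21744, '2023-02-01 21:30:00', 'b'), (21745, '2023-02-01 21:45:00', 'b'),
  (12345, '2023-02-02 12:15:00', 'c'), (12346, '2023-02-02 12:30:00', 'c'),
  (12347, '2023-02-02 12:45:00', 'c'), (12348, '2023-02-02 13:15:00', 'c'),
  (12352, '2023-02-02 14:00:00', 'c'), (12353, '2023-02-02 14:15:00', 'c');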
Related
I have three tables.
TABLE_1:
T2_ID | ver | date | boolean
---------------------------------------------------------
1 | X-20-50 | 2019-01-01 16:20:51.722336+00 | TRUE
2 | X-50-30 | 2019-02-26 16:20:51.722336+00 | TRUE
3 | X-20-32 | 2019-03-20 16:20:51.722336+00 | FALSE
1 | X-20-50 | 2019-01-09 16:20:51.722336+00 | FALSE
2 | X-20-50 | 2019-12-02 16:20:51.722336+00 | TRUE
3 | X-20-50 | 2019-01-24 16:20:51.722336+00 | TRUE
TABLE_2:
id | type | scheduler
--------------------------------------------------
1 | ABC | w1,w2,w3,w4,w5,w6,w7,w8,w9,w10,w11,w12
2 | PQR | w5,w9
3 | TRC | w1,w4,w8
TABLE_3
start_date_of_ver | end_date_of_ver | ver_name
-----------------------------------------------------------
2019-01-01 00:00:00+00 | 2019-04-01 00:00:00+00 | X-20-50
2019-02-25 00:00:00+00 | 2019-05-26 00:00:00+00 | X-50-30
2019-03-15 00:00:00+00 | 2019-06-06 00:00:00+00 | X-20-32
Table 4 should fulfill the conditions below:
It takes a version name (ver_name) as input.
From this ver_name it takes the start date and end date of the version (from table_3); if the version period is 3 months, it creates a 12-week table with id (type) as the first column and fills the twelve weeks according to the scheduler column of table_2.
Table 4 is updated as and when table 1 gets entries for that particular week which are TRUE.
Note: table 1 entries are generated on a daily basis.
Desired result: takes only ver_name as input and calculates the table below.
When table_1 doesn't have any entries, table_4 should look like this:
Table_4: X-20-50
id_of_table_2 | week_1 | week_2 | week_3 | week_4 | week_5 | week_6 | week_7 | week_8 | week_9 | week_10 | week_11 | week_12 |
------------------------------------------------------------------------------------------------------------------------------
ABC | w1 | w2 | w3 | w4 | w5 | w6 | w7 | w8 | w9 | w10 | w11 | w12 |
PQR | | | | | w5 | | | | w9 | | | |
TRC | w1 | | | w4 | | | | w8 | | | | |
When table_1 has entries, table_4 should look like this:
X-20-50
id_of_table_2 | week_1 | week_2 | week_3 | week_4 | week_5 | week_6 | week_7 | week_8 | week_9 | week_10 | week_11 | week_12 |
------------------------------------------------------------------------------------------------------------------------------
ABC | Done | Done | w3 | w4 | w5 | w6 | w7 | w8 | w9 | w10 | w11 | w12 |
PQR | | | | | w5 | | | | w9 | | | |
TRC | Done | | | w4 | | | | w8 | | | | |
You can create a function that takes the starting date of a week as input.
Example:
create function a(start_date date)
RETURNS json
LANGUAGE plpgsql
COST 100
VOLATILE
AS $BODY$
DECLARE
    outputjson json;
BEGIN
    -- json_agg() takes a single argument, so aggregate the row alias;
    -- pass the date as an EXECUTE parameter instead of concatenating it into the string
    EXECUTE 'SELECT json_agg(t) FROM table_name t
             WHERE date >= $1 AND date < $1 + 7'
       INTO outputjson
      USING start_date;
    RETURN outputjson;
END;
$BODY$;
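A hypothetical call (table_name and its date column in the body above are placeholders for your real table):

select a('2019-01-07');  -- json array of that week's rows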
Hope this will help.
Your requirement needs a little refinement. You specify retrieving weekly data yet fail to define your week. On what day does it begin? Are all weeks 7 days long? What happens when Dec 31 falls on a Tuesday: is Friday, Jan 3 in the same week (see the current year's calendar)? Then there is the issue of user input and what it represents. Is it the desired start date, so the week is that date plus the next 6 days, or is it any date within the weekly period?
The following assumes the ISO 8601 definition (google it - lots of stuff): every week begins on Monday and all weeks are 7 days long. (Thus the week containing 31-Dec-2019 also includes 3-Jan-2020.) The routine extracts the ISO year and ISO week from the user-entered date.
--setup
create table weekly_something( c1 text, c2 text, date1 timestamptz, someem boolean);
insert into weekly_something( c1, c2, date1, someem )
values ('ABC','AB-20-50','2019-11-25 16:20:51.722336+00',TRUE)
, ('PQR','AB-50-30','2019-11-26 16:20:51.722336+00',TRUE)
, ('TRC','CD-20-32','2019-11-27 16:20:51.722336+00',FALSE)
, ('ABC','AB-20-50','2019-12-02 16:20:51.722336+00',FALSE)
, ('ABC','AB-20-50','2019-12-02 16:20:51.722336+00',TRUE)
, ('JFF','yy-45-89','2019-12-31 16:20:51.722336+00',TRUE)
, ('JFF','yy-89-30','2020-01-03 16:20:51.722336+00',TRUE) ;
-- JFF Just For Fun
-- SQL Function
create function week_of(week_date date)
returns setof weekly_something
language sql stable strict
as $$
select *
from weekly_something
where (extract('isoyear' from week_date), extract('week' from week_date)) =
(extract('isoyear' from date1), extract('week' from date1));
$$;
-- test
select * from week_of('2019-11-26');
select * from week_of('2019-12-30');
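For the sample data above, the first call should return the three rows dated 25-27 Nov (they share ISO week 48 of 2019), and the second call both JFF rows, because 31-Dec-2019 and 3-Jan-2020 fall in the same ISO week (week 1 of ISO year 2020).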
I have a table person_updates in postgresql with rows like:
| id | status | person_id | modified_at |
|----|--------|-----------|------------------|
| 1 | INFO | 2 | 2019-11-01 10:00 |
| 1 | UPDATE | 2 | 2019-11-02 15:00 |
| 1 | DEBUG | 2 | 2019-11-03 12:00 |
| 3 | INFO | 4 | 2019-11-04 14:00 |
| 3 | UPDATE | 4 | 2019-11-05 16:00 |
| 5 | INFO | 6 | 2019-11-06 08:00 |
| 5 | DEBUG | 6 | 2019-11-07 07:00 |
I want to get the INFO rows that are followed by an UPDATE row:
| id | status | person_id | modified_at |
|----|--------|-----------|------------------|
| 1 | INFO | 2 | 2019-11-01 10:00 |
| 3 | INFO | 4 | 2019-11-04 14:00 |
I've attempted this with a lead() query:
select d2.id, d2.status, d2.modified_at, d2.person_id,
lead(d2.status) over (partition by d2.id order by d2.modified_at) as next_status
from person_updates d2
where d2.status = 'INFO'
This returns more rows than I want, and adding and d2.next_status = 'UPDATE' to the where clause throws an error. How do I write this query?
Like this:
select t.id, t.status, t.modified_at, t.person_id
from (
select *,
lead(status) over (partition by id order by modified_at) as next_status
from person_updates
) t
where t.status = 'INFO' and t.next_status = 'UPDATE'
See the demo.
Results:
| id | status | modified_at | person_id |
| --- | ------ | ------------------------ | --------- |
| 1 | INFO | 2019-11-01T10:00:00.000Z | 2 |
| 3 | INFO | 2019-11-04T14:00:00.000Z | 4 |
You can use the window function lead() to get the status of the next record. Since window functions are not allowed in the where clause, you need to turn the query into a subquery and then filter in the outer query, like so:
select *
from (
select
t.*,
lead(status) over(partition by id order by modified_at) lead_status
from person_updates t
) t
where status = 'INFO' and lead_status = 'UPDATE'
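As a side note, the same filter can be written without window functions: a correlated subquery that looks up the status of the immediately following row. A hedged sketch, assuming "followed by" means the next-higher modified_at for the same id:

select p.id, p.status, p.person_id, p.modified_at
from person_updates p
where p.status = 'INFO'
  and (select p2.status
       from person_updates p2
       where p2.id = p.id
         and p2.modified_at > p.modified_at
       order by p2.modified_at
       limit 1) = 'UPDATE';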
I have added a column (seq) to a table used for scheduling, so the front end can manage the order in which items are displayed. Is it possible to craft a SQL query that populates this column with an incremental counter based on the common duplicate values in the date column?
Before
------------------------------------
| name | date_time | seq |
------------------------------------
| ABC1 | 15-01-2017 11:00:00 | |
| ABC2 | 16-01-2017 11:30:00 | |
| ABC1 | 16-01-2017 11:30:00 | |
| ABC3 | 17-01-2017 10:00:00 | |
| ABC3 | 18-01-2017 12:30:00 | |
| ABC4 | 18-01-2017 12:30:00 | |
| ABC1 | 18-01-2017 12:30:00 | |
------------------------------------
After
------------------------------------
| name | date_time | seq |
------------------------------------
| ABC1 | 15-01-2017 11:00:00 | 0 |
| ABC2 | 16-01-2017 11:30:00 | 0 |
| ABC1 | 16-01-2017 11:30:00 | 1 |
| ABC3 | 17-01-2017 10:00:00 | 0 |
| ABC3 | 18-01-2017 12:30:00 | 0 |
| ABC4 | 18-01-2017 12:30:00 | 1 |
| ABC1 | 18-01-2017 12:30:00 | 2 |
------------------------------------
Solved, thanks to both answers.
To make it easier for anybody who finds this, the working code is:
UPDATE my_table f
SET seq = seq2
FROM (
SELECT ctid, ROW_NUMBER() OVER (PARTITION BY date_time ORDER BY ctid) -1 AS seq2
FROM my_table
) s
WHERE f.ctid = s.ctid;
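(For anyone finding this later: ctid is the row's physical address in Postgres. It serves here both as the join key between the subquery and the UPDATE target and as an arbitrary tiebreaker among rows with the same date_time, since the table has no primary key; any unique column would do the same job.)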
Use the window function row_number():
with my_table (name, date_time) as (
values
('ABC1', '15-01-2017 11:00:00'),
('ABC2', '16-01-2017 11:30:00'),
('ABC1', '16-01-2017 11:30:00'),
('ABC3', '17-01-2017 10:00:00'),
('ABC3', '18-01-2017 12:30:00'),
('ABC4', '18-01-2017 12:30:00'),
('ABC1', '18-01-2017 12:30:00')
)
select *,
row_number() over (partition by name order by date_time)- 1 as seq
from my_table
order by date_time;
name | date_time | seq
------+---------------------+-----
ABC1 | 15-01-2017 11:00:00 | 0
ABC1 | 16-01-2017 11:30:00 | 1
ABC2 | 16-01-2017 11:30:00 | 0
ABC3 | 17-01-2017 10:00:00 | 0
ABC1 | 18-01-2017 12:30:00 | 2
ABC3 | 18-01-2017 12:30:00 | 1
ABC4 | 18-01-2017 12:30:00 | 0
(7 rows)
Read this answer for a similar question about updating existing records with a unique integer.
Check out ROW_NUMBER().
SELECT name, date_time, ROW_NUMBER() OVER (PARTITION BY date_time ORDER BY name) - 1 AS seq FROM my_table
I have to create a crosstab table from a query where dates become the column names. These order-date columns can increase or decrease depending on the dates passed to the query. The order date is stored in Unix format and is converted to a normal date.
Query is following:
Select cd.cust_id
     , od.order_id
     , od.order_size
     , (TIMESTAMP 'epoch' + od.order_date * INTERVAL '1 second')::Date As order_date
From consumer_details cd
   , consumer_order od
Where cd.cust_id = od.cust_id
  And od.order_date Between 1469212200 And 1469212600
Order By od.order_id, od.order_date
Table as follows:
cust_id | order_id | order_size | order_date
-----------|----------------|---------------|--------------
210721008 | 0437756 | 4323 | 2016-07-22
210721008 | 0437756 | 4586 | 2016-09-24
210721019 | 10749881 | 0 | 2016-07-28
210721019 | 10749881 | 0 | 2016-07-28
210721033 | 13639 | 2286145 | 2016-09-06
210721033 | 13639 | 2300040 | 2016-10-03
Result will be:
cust_id | order_id | 2016-07-22 | 2016-09-24 | 2016-07-28 | 2016-09-06 | 2016-10-03
-----------|----------------|---------------|---------------|---------------|---------------|---------------
210721008 | 0437756 | 4323 | 4586 | | |
210721019 | 10749881 | | | 0 | |
210721033 | 13639 | | | | 2286145 | 2300040
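A hedged sketch of one way to pivot this, using filtered aggregates over the question's own query; it assumes the order-date columns are known in advance (Postgres has no native dynamic pivot, and even the tablefunc extension's crosstab() needs a fixed output column list, so truly dynamic columns require building the SQL dynamically):

Select cust_id, order_id
     , max(order_size) Filter (Where order_date = date '2016-07-22') As "2016-07-22"
     , max(order_size) Filter (Where order_date = date '2016-09-24') As "2016-09-24"
     , max(order_size) Filter (Where order_date = date '2016-07-28') As "2016-07-28"
     , max(order_size) Filter (Where order_date = date '2016-09-06') As "2016-09-06"
     , max(order_size) Filter (Where order_date = date '2016-10-03') As "2016-10-03"
From (
    Select cd.cust_id
         , od.order_id
         , od.order_size
         , (TIMESTAMP 'epoch' + od.order_date * INTERVAL '1 second')::Date As order_date
    From consumer_details cd
       , consumer_order od
    Where cd.cust_id = od.cust_id
      And od.order_date Between 1469212200 And 1469212600
) t
Group By cust_id, order_id
Order By cust_id, order_id;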
I have two tables. The first generates the condition for counting records in the second. The two tables are linked 1:1 by timestamp.
The problem is that the second table has many columns, and we need a count for each column that matches the condition from the first table.
Example:
Tables met and pot
CREATE TABLE met (
  tstamp timestamp without time zone NOT NULL,
  h1_rad double precision,
  CONSTRAINT met_pkey PRIMARY KEY (tstamp)
);

CREATE TABLE pot (
  tstamp timestamp without time zone NOT NULL,
  c1 double precision,
  c2 double precision,
  c3 double precision,
  CONSTRAINT pot_pkey PRIMARY KEY (tstamp)
);
In reality, pot has 108 columns, c1 through c108.
Table values:
+ Table met + + Table pot +
+----------------+--------+--+----------------+------+------+------+
| tstamp | h1_rad | | tstamp | c1 | c2 | c3 |
+----------------+--------+--+----------------+------+------+------+
| 20150101 00:00 | 0 | | 20150101 00:00 | 5,5 | 3,3 | 15,6 |
| 20150101 00:05 | 1,8 | | 20150101 00:05 | 12,8 | 15,8 | 1,5 |
| 20150101 00:10 | 15,4 | | 20150101 00:10 | 25,4 | 4,5 | 1,4 |
| 20150101 00:15 | 28,4 | | 20150101 00:15 | 18,3 | 63,5 | 12,5 |
| 20150101 00:20 | 29,4 | | 20150101 00:20 | 24,5 | 78 | 17,5 |
| 20150101 00:25 | 13,5 | | 20150101 00:25 | 12,8 | 5,4 | 18,4 |
| 20150102 00:00 | 19,5 | | 20150102 00:00 | 11,1 | 25,6 | 6,5 |
| 20150102 00:05 | 2,5 | | 20150102 00:05 | 36,5 | 21,4 | 45,2 |
| 20150102 00:10 | 18,4 | | 20150102 00:10 | 1,4 | 35,5 | 63,5 |
| 20150102 00:15 | 20,4 | | 20150102 00:15 | 18,4 | 23,4 | 8,4 |
| 20150102 00:20 | 6,8 | | 20150102 00:20 | 16,8 | 12,5 | 18,4 |
| 20150102 00:25 | 17,4 | | 20150102 00:25 | 25,8 | 23,5 | 9,5 |
+----------------+--------+--+----------------+------+------+------+
What I need is, for each column of pot, the number of rows where the pot value is higher than 15 and the met value with the same timestamp is also higher than 15, grouped by day.
With the data supplied we need something like:
+----------+----+----+----+
| day | c1 | c2 | c3 |
+----------+----+----+----+
| 20150101 | 3 | 2 | 1 |
| 20150102 | 2 | 4 | 1 |
+----------+----+----+----+
How can I get this?
Is it possible with a single query, even with subqueries?
Actually, the raw data is stored every minute in other tables; met and pot are summarized and filtered tables kept for performance.
If necessary, I can create tables with the data summarized by day, if that simplifies the solution.
Thanks
P.S.: Sorry for my English
You can solve this with some CASE expressions. Test for both conditions and, if both are true, return 1. Then SUM() the results, grouping by the timestamp converted to a date, to get your totals:
SELECT
date(met.tstamp),
SUM(CASE WHEN met.h1_rad > 15 AND pot.c1 > 15 THEN 1 END) as C1,
SUM(CASE WHEN met.h1_rad > 15 AND pot.c2 > 15 THEN 1 END) as C2,
SUM(CASE WHEN met.h1_rad > 15 AND pot.c3 > 15 THEN 1 END) as C3
FROM
met INNER JOIN pot ON met.tstamp = pot.tstamp
GROUP BY date(met.tstamp)
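Since pot really has 108 columns, here is a hedged sketch for generating those 108 SUM lines instead of typing them by hand (the only assumption is that the columns are named c1 through c108):

-- emits one "SUM(CASE ...) as cN" line per column; paste the output into the query above
SELECT string_agg(
           format('SUM(CASE WHEN met.h1_rad > 15 AND pot.%1$I > 15 THEN 1 END) as %1$I',
                  'c' || i),
           E',\n')
FROM generate_series(1, 108) AS i;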