sql query to count number of users based on event sequence - postgresql

I have a table called test which is sorted by time.
user_id event time
1 e1 t1
1 e3 t2
1 e2 t3
2 e2 t4
2 e1 t5
2 e5 t6
3 e2 t7
3 e4 t8
I have to find out how many unique user_id values there are for which event e1 happens before e2. Here the answer is one: user_id 1.
I am using PostgreSQL.
Any help would be much appreciated.

This is probably your solution, using a sub-select of the e2 events (aliased ev2):
WITH event(user_id,event,time) AS (
VALUES (1,'e1','t1'),
(1,'e3','t2'),
(1,'e2','t3'),
(2,'e2','t4'),
(2,'e1','t5'),
(2,'e5','t6'),
(3,'e2','t7'),
(3,'e4','t8'))
SELECT count(DISTINCT event.user_id) FROM event
JOIN (SELECT user_id, time
FROM event WHERE event = 'e2') AS ev2 ON event.user_id = ev2.user_id
WHERE event.time < ev2.time AND event.event = 'e1';
Keep only the rows that occur before e2 for the same user and whose event is e1:

SELECT e.user_id,
       count(e.event)
FROM event e
JOIN (SELECT user_id,
             time
      FROM event
      WHERE event = 'e2') AS ee
  ON e.user_id = ee.user_id
WHERE e.time < ee.time
  AND e.event = 'e1'
GROUP BY e.user_id;
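
The same check can also be phrased with EXISTS and count(DISTINCT user_id), which counts each user once even if e1 occurs several times before e2. The following is only a sketch: it runs the SQL through Python's sqlite3 module as a stand-in for PostgreSQL, and the integer times replacing t1..t8 are an assumption.

```python
import sqlite3

# Sample rows from the question; integers 1..8 stand in for t1..t8 (assumption).
ROWS = [
    (1, "e1", 1), (1, "e3", 2), (1, "e2", 3),
    (2, "e2", 4), (2, "e1", 5), (2, "e5", 6),
    (3, "e2", 7), (3, "e4", 8),
]

# Count users that have an e1 row followed later by an e2 row.
QUERY = """
SELECT count(DISTINCT e1.user_id)
FROM event AS e1
WHERE e1.event = 'e1'
  AND EXISTS (
        SELECT 1
        FROM event AS e2
        WHERE e2.user_id = e1.user_id
          AND e2.event = 'e2'
          AND e2.time > e1.time
      )
"""


def count_users_e1_before_e2(rows):
    """Load rows into an in-memory table and run the counting query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE event (user_id INTEGER, event TEXT, time INTEGER)")
    conn.executemany("INSERT INTO event VALUES (?, ?, ?)", rows)
    (n,) = conn.execute(QUERY).fetchone()
    conn.close()
    return n


user_count = count_users_e1_before_e2(ROWS)
print(user_count)
```

The SQL itself uses only standard constructs, so it should carry over to PostgreSQL unchanged, though that has not been verified here.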

Related

DB2: SQL to return all rows in a group having a particular value of a column in two latest records of this group

I have a DB2 table in which column A holds either PQR or XYZ.
I need the rows of every group (column B) whose two latest records, ordered by the date in column C, both have A = PQR.
Sample Table
A B C
--- ----- ----------
PQR Mark 08/08/2019
PQR Mark 08/01/2019
XYZ Mark 07/01/2019
PQR Joe 10/11/2019
XYZ Joe 10/01/2019
PQR Craig 06/06/2019
PQR Craig 06/20/2019
In this sample table, my output would be the Mark and Craig records.
Since 11.1 you may use the NTH_VALUE OLAP function.
Refer to the OLAP specification.
SELECT A, B, C
FROM
(
SELECT
A, B, C
, NTH_VALUE (A, 1) OVER (PARTITION BY B ORDER BY C DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) C1
, NTH_VALUE (A, 2) OVER (PARTITION BY B ORDER BY C DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) C2
FROM TAB
)
WHERE C1 = 'PQR' AND C2 = 'PQR'
For older versions:
SELECT T.*
FROM TAB T
JOIN
(
SELECT B
FROM
(
SELECT
A, B
, ROWNUMBER() OVER (PARTITION BY B ORDER BY C DESC) RN
FROM TAB
)
WHERE RN IN (1, 2)
GROUP BY B
HAVING MIN(A) = MAX(A) AND COUNT(1) = 2 AND MIN(A) = 'PQR'
) G ON G.B = T.B;
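A simple solution could be the following (note that it returns only the two most recent PQR rows of the whole table, not the two latest rows per value of B):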
SELECT A,B,C
FROM tab
WHERE A = 'PQR'
ORDER BY C DESC FETCH FIRST 2 ROWS only
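
The per-group idea (rank the rows of each B by C descending, keep the top two, and require both to be PQR) can be tried out on the sample data. This is a sketch only: it uses Python's sqlite3 module instead of DB2, spells the function ROW_NUMBER, and stores the dates as ISO strings so they sort correctly; all of these are assumptions. It needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

# Sample table from the question, dates rewritten as YYYY-MM-DD (assumption).
ROWS = [
    ("PQR", "Mark", "2019-08-08"),
    ("PQR", "Mark", "2019-08-01"),
    ("XYZ", "Mark", "2019-07-01"),
    ("PQR", "Joe", "2019-10-11"),
    ("XYZ", "Joe", "2019-10-01"),
    ("PQR", "Craig", "2019-06-06"),
    ("PQR", "Craig", "2019-06-20"),
]

# Inner query ranks rows per B; the grouped query keeps groups whose
# two latest rows are both PQR; the outer join returns all rows of those groups.
QUERY = """
SELECT t.A, t.B, t.C
FROM tab AS t
JOIN (
    SELECT B
    FROM (
        SELECT A, B,
               ROW_NUMBER() OVER (PARTITION BY B ORDER BY C DESC) AS rn
        FROM tab
    ) AS ranked
    WHERE rn <= 2
    GROUP BY B
    HAVING COUNT(*) = 2 AND MIN(A) = 'PQR' AND MAX(A) = 'PQR'
) AS g ON g.B = t.B
ORDER BY t.B, t.C DESC
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (A TEXT, B TEXT, C TEXT)")
conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", ROWS)
result = conn.execute(QUERY).fetchall()
conn.close()

for row in result:
    print(row)
```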

postgresql inner join duplicating some records

I have a large query written as a chain of CTEs. In certain parts I have to total secondary tables, using inner joins to minimize the number of records processed. Two of the subqueries are almost identical, yet one works and the other counts some of the totalled records up to 8 times.
I need to use inner joins, otherwise the response time shoots up by 15x or more.
with
p0 as (select distinct on (pventa) pventa, p.tipo tpva from lecturas l
left join puntoventa p on l.pventa=p.numero where dia between '2017-10-01' and '2017-10-31' and p.tipo in ('A','E')),
r1 as (select p.tpva, l.pventa, dia, turno from lecturas l
inner join p0 p on p.pventa=l.pventa
where dia between '2017-10-01' and '2017-10-31'),
p1 as (select pva, remision, sum(abono), count(abono) from pagosremisiones p
inner join movsgas m on p.pva=m.pventa and p.remision=m.folio
inner join r1 r on r.pventa=m.pventa and r.dia=m.dia and r.turno=m.turno group by 1,2 order by 1,2 ),
f1 as (select c.serie, c.factura, sum(abono), count(abono) from chequefactura c
inner join movsgas m on c.serie=m.serie and c.factura=m.factura
inner join r1 r on r.pventa=m.pventa and r.dia=m.dia and r.turno=m.turno group by 1,2 order by 1,2 )
select * from p1
nprem and ncheck are for debugging.
p1 and f1 both depend on r1. p1 works (as far as I've tried) without duplicate records (nprem corresponds to existing records); however, ncheck increases on some records to up to 8 times its actual value.
I'm not sure whether p1's correct results are purely coincidental, and I don't know how to correct the duplicates in f1.
I do have the alternative of using direct subqueries, but I have a didactic interest in using joins.
By the way, so far the direct subqueries are much more efficient than the joins, possibly because the joins are poorly structured.
What am I doing wrong?
What would you do to optimize the code?
Thanks in advance
Jose
The trick needed is the new subquery r2, which includes DISTINCT ON (serie, factura); if I omit it, the error persists. The duplicates in r2 do not correspond to the number of duplicates in f1, so I had no idea where so many came from. Thank you all, and again an apology for the terrible description of my problem.
with
p0 as (select distinct on (pventa) pventa, p.tipo tpva from lecturas l
left join puntoventa p on l.pventa=p.numero where dia between '2017-10-01' and '2017-10-31' and p.tipo in ('A','E')),
r1 as (select p.tpva, l.pventa, dia, turno from lecturas l
inner join p0 p on p.pventa=l.pventa
where dia between '2017-10-01' and '2017-10-31'),
r2 as (select distinct on (serie, factura) m.serie,m.factura from movsgas m
inner join chequefactura c on c.serie=m.serie and c.factura=m.factura
inner join r1 r on r.pventa=m.pventa and r.dia=m.dia and r.turno=m.turno),
p1 as (select pva, remision, sum(abono) payp from pagosremisiones p
inner join movsgas m on p.pva=m.pventa and p.remision=m.folio
inner join r1 r on r.pventa=m.pventa and r.dia=m.dia and r.turno=m.turno group by 1,2 order by 1,2 ),
f1 as (select c.serie, c.factura, sum(abono) payfr2, count(*) from chequefactura c
inner join r2 r on r.serie=c.serie and r.factura=c.factura group by 1,2 order by 1,2 )
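
A likely reading of why r2 helps, not confirmed by the poster, is join fan-out: when the joined table has several rows per key, every row being summed is repeated once per match, so sum() and count() are inflated. The sketch below shows the effect and the de-duplicated join on two hypothetical tables (payments and movements are made-up names, not the tables above), using Python's sqlite3 module as a stand-in for PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: invoice 1 has two payments and three movement rows.
conn.executescript("""
CREATE TABLE payments (invoice INTEGER, amount INTEGER);
CREATE TABLE movements (invoice INTEGER, item TEXT);
INSERT INTO payments VALUES (1, 100), (1, 50), (2, 70);
INSERT INTO movements VALUES (1, 'a'), (1, 'b'), (1, 'c'), (2, 'd');
""")

# Joining directly repeats each payment once per matching movement row.
inflated = conn.execute("""
SELECT p.invoice, sum(p.amount), count(*)
FROM payments p
JOIN movements m ON m.invoice = p.invoice
GROUP BY p.invoice
ORDER BY p.invoice
""").fetchall()

# Reducing the joined side to one row per key keeps the totals correct.
deduplicated = conn.execute("""
SELECT p.invoice, sum(p.amount), count(*)
FROM payments p
JOIN (SELECT DISTINCT invoice FROM movements) m ON m.invoice = p.invoice
GROUP BY p.invoice
ORDER BY p.invoice
""").fetchall()
conn.close()

print(inflated)
print(deduplicated)
```

In the first query invoice 1 is summed over 2 x 3 joined rows, which is why the inflation factor follows the product of matches rather than the number of duplicates on either side alone.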

Postgresql Query to split the array into rows

I am running a query which is creating a view for me with following details
id name brand_id
1 E1 {3,4}
2 E2 {5,7,8}
3 E4 {1}
I want to split each brand_id array into one row per element. Hence the above view should look like:
id name brand_id
1 E1 {3}
1 E1 {4}
2 E2 {5}
2 E2 {7}
2 E2 {8}
3 E4 {1}
Here brand_id is calculated by a subquery that matches the creation date of the record with the creation date of the brand.
SQL Query:
CREATE OR REPLACE VIEW %I AS
SELECT row_number() over(),
id,
name,
(select array(select id
from brand b
where status = true and (i.creation_date = b.creation_date)
order by b asc) ) as brand_id
FROM events i
group by id order by id
The simplest is probably:
CREATE VIEW some_name AS
SELECT i.id, i.name, b.id AS brand_id
FROM events i
JOIN brand b USING (creation_date)
WHERE b.status
ORDER BY 1;
select id, name, unnest(brand_id) from events;
Please use the query below to extract the data.
select id
,name
,'{'||replace(replace(regexp_split_to_table(brand_id, E','),'}',''),'{','')||'}'
from umang.t_st_ques
order by 1,2,3;

PostgreSQL / Hive join multiple tables

Table a:
id value0
101 a1
102 a2
103 a3
Table b:
id value1
101 b1
101 b2
101 b3
Table c:
id value2
101 c1
103 c3
103 c4
Result table:
id value0 value1 value2
101 a1 b1 0
101 a1 b2 0
101 a1 b3 0
101 a1 0 c1
102 a2 0 0
103 a3 0 c3
103 a3 0 c4
Is it possible to produce the result table from tables a, b, c with one query (without creating two tables and joining them)? Maybe there is a possibility to do it by using only left joins?
This may help you-
select t1.id, t2.id, t3.id
from tablea t1 inner join tableb t2 on t1.id = t2.id
inner join tablec t3 on t2.id=t3.id
group by t1.id, t2.id, t3.id
If you have a base table, select from that and left join the others. If none of your tables can act as a base table, you can use full joins (both work as outer joins):
select *
from table_a
full join table_b using (id)
full join table_c using (id)
This will select SQL NULLs where there is no data, but you can use COALESCE(value0, 'N/A'), etc. to select some default data.
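
Note that the result table in the question keeps the rows from b and the rows from c separate for id 101, whereas joining both b and c on id pairs every b row with every c row. One way to get the layout shown, sketched here under assumptions, is a UNION ALL of two joins plus the rows of a that match neither table. The sketch runs through Python's sqlite3 module rather than PostgreSQL or Hive, and uses the string '0' as the filler value as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Tables a, b, c with the sample data from the question.
conn.executescript("""
CREATE TABLE a (id INTEGER, value0 TEXT);
CREATE TABLE b (id INTEGER, value1 TEXT);
CREATE TABLE c (id INTEGER, value2 TEXT);
INSERT INTO a VALUES (101, 'a1'), (102, 'a2'), (103, 'a3');
INSERT INTO b VALUES (101, 'b1'), (101, 'b2'), (101, 'b3');
INSERT INTO c VALUES (101, 'c1'), (103, 'c3'), (103, 'c4');
""")

QUERY = """
-- rows of a paired with b only
SELECT a.id, a.value0, b.value1, '0' AS value2
FROM a JOIN b ON b.id = a.id
UNION ALL
-- rows of a paired with c only
SELECT a.id, a.value0, '0', c.value2
FROM a JOIN c ON c.id = a.id
UNION ALL
-- rows of a with no match in either table
SELECT a.id, a.value0, '0', '0'
FROM a
WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.id = a.id)
  AND NOT EXISTS (SELECT 1 FROM c WHERE c.id = a.id)
ORDER BY 1
"""

result_rows = conn.execute(QUERY).fetchall()
conn.close()

for row in result_rows:
    print(row)
```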

Common records for 2 fields in a table?

I have a Table which has 2 fields say A,B. Suppose A has values a1,a2.
Corresponding records for a1 in B are 1,2,3,x,y,z.
Corresponding records for a2 in B are 1,2,3,4,d,e,f
I need a query to be written in DB2 that fetches the records in B that are common to every value of A (a1 and a2).
So here the output would be :
A B
a1 1
a1 2
a1 3
a2 1
a2 2
a2 3
Can someone please help on this?
Try something like:
SELECT A, B
FROM Table t1
WHERE (SELECT COUNT(*) FROM Table t2 WHERE t2.B = t1.B)
= (SELECT COUNT(DISTINCT t3.A) FROM Table t3)
ORDER BY A, B
This might not be 100% accurate as I can't test it out in DB2 so you might have to tweak the query a little bit to make it work.
with t(num) as (select count(distinct A) from table)
select t1.A, t1.B
from table t1, table t2, t
where t1.B = t2.B
group by t1.A, t1.B, num
having count(*) = num
Basically, the idea is to join the table to itself on column B and keep only the rows whose number of matches equals the number of distinct values in column A, which indicates that the B value is common to all the A values.
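
The same counting idea can be checked against the sample values with a GROUP BY on B. This is a sketch under assumptions: it runs in Python's sqlite3 module rather than DB2, and the table name tab is a placeholder for the real table.

```python
import sqlite3

# (A, B) pairs from the question.
PAIRS = (
    [("a1", b) for b in ["1", "2", "3", "x", "y", "z"]]
    + [("a2", b) for b in ["1", "2", "3", "4", "d", "e", "f"]]
)

# Keep B values that appear with as many distinct A values as exist overall.
QUERY = """
SELECT A, B
FROM tab
WHERE B IN (
    SELECT B
    FROM tab
    GROUP BY B
    HAVING COUNT(DISTINCT A) = (SELECT COUNT(DISTINCT A) FROM tab)
)
ORDER BY A, B
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab (A TEXT, B TEXT)")
conn.executemany("INSERT INTO tab VALUES (?, ?)", PAIRS)
common = conn.execute(QUERY).fetchall()
conn.close()

for row in common:
    print(row)
```

Using COUNT(DISTINCT A) per B means the result does not depend on each (A, B) pair being unique in the table.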