Scenario: Counting active users for a time series analysis.
Need: In PostgreSQL (Redshift), count the customers that have more than X unique transactions within Y days of each date, grouped by date.
How do I achieve this?
Table: orders

date        user_id  product_id  transaction_id
2022-01-01  001      003         001
2022-01-02  002      001         002
2022-03-01  003      001         003
2022-03-01  003      002         003
...         ...      ...         ...
Outcome:

date        active_customers
2022-01-01  10
2022-01-02  12
2022-01-03  9
2022-01-04  13
You may be able to use the window functions LEAD() and LAG() here, but this solution may also work for you.
WITH data AS
(
    SELECT c.date
         , o.user_id
         , COUNT(DISTINCT o.transaction_id) tcount
    FROM (SELECT DISTINCT date FROM orders) c
    JOIN orders o
      ON o.date BETWEEN c.date - '30 DAYS'::INTERVAL AND c.date -- Y days back from the given date
    GROUP BY c.date, o.user_id
), user_transaction_count AS
(
    SELECT d.date
         , COUNT(d.user_id) FILTER (WHERE d.tcount > 1) user_count -- more than X transactions (here X = 1)
    FROM data d
    GROUP BY d.date
)
SELECT u.date
     , u.user_count active_customers
FROM user_transaction_count u
ORDER BY u.date
;
Here is a DBFiddle that demos a couple options.
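Since LEAD()/LAG() came up: one way to use LAG() here is to flag a transaction whenever the same user's X-th previous transaction still falls inside the Y-day window, then count distinct flagged users per date. A minimal sketch, assuming X = 2 and Y = 30 days (both are placeholders, not values from the question); note it only returns dates on which at least one qualifying transaction happened:

WITH flagged AS
(
    SELECT o.date
         , o.user_id
         , LAG(o.date, 2) OVER (PARTITION BY o.user_id ORDER BY o.date) AS xth_prev_date -- X = 2
    FROM orders o
)
SELECT f.date
     , COUNT(DISTINCT f.user_id) AS active_customers
FROM flagged f
WHERE f.xth_prev_date >= f.date - '30 DAYS'::INTERVAL -- the 2nd-previous transaction is inside the 30-day window
GROUP BY f.date
ORDER BY f.date
;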
I have a table in which I need to renumber a column, but the new numbering has to keep the original ordering by date.
TABLE_1
id num_seq DateTimeStamp
fb4e1683-7035-4895-b2c8-d084d9b42ce3 111 08-02-2005
e40e4c3e-65e4-47b7-b13a-79e8bce2d02d 114 10-07-2017
49e261a8-a855-4844-a0ac-37b313da2222 113 01-30-2010
6c4bffb7-a056-4a20-ae1c-5a31bdf683f2 112 04-15-2006
I want to reorder num_seq starting with 1001 through 1004 and keep the numbering in order. So 111 = 1001 and 112 = 1002 and so forth.
This is what I have so far:
DECLARE @num INT
SET @num = 0
UPDATE Table_1
SET @num = num_seq = @num + 1
GO
I know that UPDATE doesn't let me use the keyword ORDER BY. Is there a way to do this in SQL 2008 R2?
Stage the new num_seq in a CTE, then leverage that in your update statement:
declare @Table_1 table (id uniqueidentifier, num_seq int, DateTimeStamp datetime);

insert into @Table_1
values
('fb4e1683-7035-4895-b2c8-d084d9b42ce3', 111, '08-02-2005'),
('e40e4c3e-65e4-47b7-b13a-79e8bce2d02d', 114, '10-07-2017'),
('49e261a8-a855-4844-a0ac-37b313da2222', 113, '01-30-2010'),
('6c4bffb7-a056-4a20-ae1c-5a31bdf683f2', 112, '04-15-2006');

;with stage as
(
    select *,
           num_seq_new = 1000 + row_number() over (order by DateTimeStamp asc)
    from @Table_1
)
update stage
set num_seq = num_seq_new;

select * from @Table_1;
Returns:
id num_seq DateTimeStamp
FB4E1683-7035-4895-B2C8-D084D9B42CE3 1001 2005-08-02 00:00:00.000
E40E4C3E-65E4-47B7-B13A-79E8BCE2D02D 1004 2017-10-07 00:00:00.000
49E261A8-A855-4844-A0AC-37B313DA2222 1003 2010-01-30 00:00:00.000
6C4BFFB7-A056-4A20-AE1C-5A31BDF683F2 1002 2006-04-15 00:00:00.000
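If Table_1 is a real table rather than the table variable used in this demo, the same updatable-CTE pattern should carry over directly; a sketch assuming the column names shown in the question:

;with stage as
(
    select num_seq,
           num_seq_new = 1000 + row_number() over (order by DateTimeStamp asc)
    from Table_1
)
update stage
set num_seq = num_seq_new;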
I need to capture multiple changes during the day, but eliminate a change when it immediately repeats the previous row.
Below is a snippet of the sample data.
Source Data:
SEQ_ID ID LastName FirstName Updated_Time
50 1010 A A 01/06/2016 10:00
51 1010 B B 01/06/2016 11:00
52 1010 C C 01/06/2016 12:00
53 1010 D D 01/06/2016 15:00
54 1010 D D 01/06/2016 17:00
55 1010 D D 01/06/2016 18:00
56 1010 B B 01/06/2016 20:00
57 1010 B B 01/06/2016 21:00
58 1010 B B 01/06/2016 22:00
59 1010 B B 01/06/2016 23:00
100 2020 X X 01/06/2016 10:00
202 3030 TTT TTT 01/06/2016 10:00
201 3030 UUU UUU 01/06/2016 11:00
203 3030 VVV VVV 01/06/2016 12:00
210 3030 UUU UUU 01/06/2016 15:00
302 4000 KQ KQ 01/06/2016 07:00
300 4000 KQ KQ 01/06/2016 08:00
301 4000 KQ KQ 01/06/2016 09:00
303 4000 KQ KQ 02/06/2016 08:00
The result should be as below:
SEQ_ID ID LastName FirstName Updated_Time
50 1010 A A 01/06/2016 10:00
51 1010 B B 01/06/2016 11:00
52 1010 C C 01/06/2016 12:00
53 1010 D D 01/06/2016 15:00
56 1010 B B 01/06/2016 20:00
100 2020 X X 01/06/2016 10:00
202 3030 TTT TTT 01/06/2016 10:00
201 3030 UUU UUU 01/06/2016 11:00
203 3030 VVV VVV 01/06/2016 12:00
210 3030 UUU UUU 01/06/2016 15:00
302 4000 KQ KQ 01/06/2016 07:00
This is the query I could come up with:
SELECT
[ID]
,[LastName]
,[FirstName]
, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY ID, [Updated_Time])
- ROW_NUMBER() OVER (PARTITION BY ID,
CAST(HASHBYTES('SHA2_256', CONCAT(
ID
,[LastName]
,[FirstName]
)) AS binary(32)) ORDER BY ID ASC, [Updated_Time] ASC) [DWRecordGroupID]
FROM
xxxxxxx.xxxxxxx
order by ID , [Updated_Time] asc
Result of the Query:
ID LastName FirstName DWRecordGroupID
1010 A A 0
1010 B B 1
1010 C C 2
1010 D D 3
1010 D D 3
1010 D D 3
1010 B B 5
1010 B B 5
1010 B B 5
1010 B B 5
2020 X X 0
3030 TTT TTT 0
3030 UUU UUU 1
3030 VVV VVV 2
3030 UUU UUU 2
4000 KQ KQ 0
4000 KQ KQ 0
4000 KQ KQ 0
4000 KQ KQ 0
The idea is to eliminate duplicates based on ID and DWRecordGroupID. But somehow I am missing something in the part below, where the query gives two different rows the same group number and one of them gets eliminated arbitrarily, which is incorrect.
ID LastName FirstName DWRecordGroupID
3030 TTT TTT 0
3030 UUU UUU 1
3030 VVV VVV 2
3030 UUU UUU 2
Any help is really appreciated.
Thanks in advance.
I think you can try this (X1 is your table):
SELECT ID, LAST_NAME, FIRST_NAME, UPDATED_TIME FROM (
SELECT ID, LAST_NAME, FIRST_NAME, UPDATED_TIME
, LAG(LAST_NAME) OVER (PARTITION BY ID ORDER BY UPDATED_TIME, SEQ_ID) AS LNAME_prec
, LAG(FIRST_NAME) OVER (PARTITION BY ID ORDER BY UPDATED_TIME, SEQ_ID) AS FNAME_prec
FROM X1
) X2
WHERE LAST_NAME <> LNAME_prec OR FIRST_NAME <> FNAME_prec OR LNAME_prec IS NULL
Output:
ID LAST_NAME FIRST_NAME UPDATED_TIME
----------- ---------- ---------- -----------------------
1010 A A 2016-06-01 10:00:00.000
1010 B B 2016-06-01 11:00:00.000
1010 C C 2016-06-01 12:00:00.000
1010 D D 2016-06-01 15:00:00.000
1010 B B 2016-06-01 20:00:00.000
2020 X X 2016-06-01 10:00:00.000
3030 TTT TTT 2016-06-01 10:00:00.000
3030 UUU UUU 2016-06-01 11:00:00.000
3030 VVV VVV 2016-06-01 12:00:00.000
3030 UUU UUU 2016-06-01 15:00:00.000
4000 KQ KQ 2016-06-01 07:00:00.000
This will obviously need to be adapted to incorporate however many columns you have in your table:
;WITH cte ( rownum, seq_id, id, last_name, first_name, updated_time )
AS (SELECT row_number()
OVER (
ORDER BY id, updated_time),
seq_id,
id,
last_name,
first_name,
updated_time
FROM #tbl)
SELECT t.*
FROM cte l
INNER JOIN #tbl t ON l.seq_id = t.seq_id
LEFT OUTER JOIN cte p ON l.rownum - 1 = p.rownum
AND l.id = p.id
AND l.last_name = p.last_name
AND l.first_name = p.first_name
WHERE p.seq_id IS NULL
The real difficulty comes from the fact that, in the end, you have to compare every non-sequence field (i.e. not seq_id and not updated_time) from one row against every non-sequence field from another row.
Note: This solution naively assumes changes to a particular ID are to be treated as a single collection of changes. So if seq_id 548 that comes in on 01/23/2017 for id 1010 has the same first_name, last_name as seq_id 56, it will not be picked up. It could be adapted to work IF the seq_id column could be guaranteed to be in sequence order (but your sample data did not have that).
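If there are many such columns, one compact, NULL-safe way to do that row-to-row comparison in T-SQL is the EXISTS ... INTERSECT idiom. The following is only a sketch built on the same row-numbering idea, assuming the #tbl temp table and just the two name columns from the sample; extend the inner SELECT lists with the other non-sequence columns:

;WITH numbered AS
(
    SELECT *,
           rownum = ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_time, seq_id)
    FROM #tbl
)
SELECT cur.seq_id, cur.id, cur.last_name, cur.first_name, cur.updated_time
FROM numbered cur
LEFT OUTER JOIN numbered prev
       ON prev.id = cur.id
      AND prev.rownum = cur.rownum - 1
WHERE prev.seq_id IS NULL        -- first row for this id
   OR NOT EXISTS (               -- or at least one compared column differs (NULLs compare as equal)
        SELECT cur.last_name, cur.first_name
        INTERSECT
        SELECT prev.last_name, prev.first_name
      )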
You can use ROW_NUMBER() and keep the rows where it equals 1:
;with CTE as (
select *, RowN = row_number() over (partition by lastname order by seq_id) from #yourduplicates
) select * from cte where RowN = 1
Your Input table:
create table #yourDuplicates (Seq_ID int, id int, lastname varchar(10), firstname varchar(10), updated_time datetime)
insert into #yourDuplicates
(SEQ_ID , ID , LastName , FirstName , Updated_Time ) values
( 50 , 1010 ,'A ', 'A ', '01/06/2016 10:00')
, ( 51 , 1010 ,'B ', 'B ', '01/06/2016 11:00')
, ( 52 , 1010 ,'C ', 'C ', '01/06/2016 12:00')
, ( 53 , 1010 ,'D ', 'D ', '01/06/2016 15:00')
, ( 54 , 1010 ,'D ', 'D ', '01/06/2016 17:00')
, ( 55 , 1010 ,'D ', 'D ', '01/06/2016 18:00')
, ( 56 , 1010 ,'B ', 'B ', '01/06/2016 20:00')
, ( 57 , 1010 ,'B ', 'B ', '01/06/2016 21:00')
, ( 58 , 1010 ,'B ', 'B ', '01/06/2016 22:00')
, ( 59 , 1010 ,'B ', 'B ', '01/06/2016 23:00')
, ( 100 , 2020 ,'X ', 'X ', '01/06/2016 10:00')
, ( 202 , 3030 ,'TTT', 'TTT', '01/06/2016 10:00')
, ( 201 , 3030 ,'UUU', 'UUU', '01/06/2016 11:00')
, ( 203 , 3030 ,'VVV', 'VVV', '01/06/2016 12:00')
, ( 210 , 3030 ,'UUU', 'UUU', '01/06/2016 15:00')
, ( 302 , 4000 ,'KQ ', 'KQ ', '01/06/2016 07:00')
, ( 300 , 4000 ,'KQ ', 'KQ ', '01/06/2016 08:00')
, ( 301 , 4000 ,'KQ ', 'KQ ', '01/06/2016 09:00')
, ( 303 , 4000 ,'KQ ', 'KQ ', '02/06/2016 08:00')
I have two tables:
Table 1 = tbl_main:
item_id fastec_qty
001 102
002 200
003 300
004 400
Table 2 = tbl_dOrder:
order_id item_id amount
1001 001 30
1001 002 40
1002 001 50
1002 003 70
How can I write a query so that the result is as follows:
item_id amount difference
001 102 22
002 200 160
003 300 230
004 400 400
The difference column is the amount in table 1 minus the total amount disbursed from table 2.
SELECT a.item_id, a.fastec_qty AS amount, a.fastec_qty - COALESCE(q.amount, 0) AS difference
FROM tbl_main a
LEFT JOIN (
    SELECT item_id, SUM(amount) AS amount
    FROM tbl_dOrder
    GROUP BY item_id
) q ON q.item_id = a.item_id
This query first SUMs the amounts from tbl_dOrder grouped by item_id, then LEFT JOINs that result to tbl_main so that items with no orders (like 004) are kept, with COALESCE turning the missing total into 0 before the difference column is calculated.
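An equivalent formulation, assuming the same table and column names, joins the raw rows first and aggregates once; which one to use is mostly a matter of taste and of how the query will grow later:

SELECT a.item_id
     , a.fastec_qty AS amount
     , a.fastec_qty - COALESCE(SUM(d.amount), 0) AS difference
FROM tbl_main a
LEFT JOIN tbl_dOrder d ON d.item_id = a.item_id
GROUP BY a.item_id, a.fastec_qty
ORDER BY a.item_id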