How to get the sum of a COUNT-derived column in PostgreSQL?

I have a table with a shipment_id, no_of_boxes, and no_of_pallets as shown below.
shipment_id  no_of_boxes  no_of_pallets
1            23           0
1            45           0
1            0            1
2            3            0
2            165          0
2            0            10
I want to sum the no_of_boxes and no_of_pallets columns per their respective shipment_id. The no_of_boxes and no_of_pallets columns are COUNT-derived (calculated from a different table with JOINs).
I tried writing a subquery for this, but it didn't help. The subquery below is for no_of_boxes; a similar query was written for no_of_pallets.
SELECT SUM(no_of_boxes)
FROM (SELECT COUNT(si.shipment_item_id) AS no_of_boxes
      FROM shipment_item AS si
      JOIN shipment_order AS sho
        ON si.shipment_order_systemid = sho.system_id
      JOIN shipping_unit AS su
        ON su.system_id = si.shipping_unit_systemid
      WHERE su.unit LIKE 'BOX'
      GROUP BY si.shipment_item_id,
               su.unit) t
My desired result is:
shipment_id  no_of_boxes  no_of_pallets
1            68           1
2            168          10

To get the result you want, use the following query:
SELECT shipment_id, sum(no_of_boxes), sum(no_of_pallets)
FROM shipments
GROUP BY shipment_id;
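If the per-shipment counts are not stored in a table but have to be derived on the fly, both sums can also be produced in one grouped query with conditional aggregation. The sketch below reuses the table and column names visible in the question's subquery (shipment_item, shipping_unit) plus an assumed shipment_order table carrying shipment_id and system_id; adjust the names and unit labels to your actual schema:
-- shipment_order, shipment_id, and the 'PALLET' label are assumptions
-- inferred from the question's subquery; rename to match your schema.
SELECT sho.shipment_id,
       COUNT(si.shipment_item_id) FILTER (WHERE su.unit = 'BOX')    AS no_of_boxes,
       COUNT(si.shipment_item_id) FILTER (WHERE su.unit = 'PALLET') AS no_of_pallets
FROM shipment_order AS sho
JOIN shipment_item  AS si ON si.shipment_order_systemid = sho.system_id
JOIN shipping_unit  AS su ON su.system_id = si.shipping_unit_systemid
GROUP BY sho.shipment_id;
COUNT(...) FILTER (WHERE ...) needs PostgreSQL 9.4 or later; on older versions, SUM(CASE WHEN su.unit = 'BOX' THEN 1 ELSE 0 END) gives the same result.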

Related

PostgreSQL - Update table with previous values from another table

I want to update a table with sums taken from a second table.
This is table 'x', the one I want to update; it has a starting value and a closing value:
id  op_date     initial_value  end_value
1   2020-02-01  0              0
1   2020-02-02  0              0
2   2020-02-01  0              0
2   2020-02-02  0              0
Table 'y' stores the values for each day:
id  op_date     value_day
1   2020-01-29  500
1   2020-02-01  100
1   2020-02-02  200
2   2020-01-29  750
2   2020-02-01  100
2   2020-02-02  250
I want the result to look like this:
id  op_date     initial_value  end_value
1   2020-02-01  500            600
1   2020-02-02  600            800
2   2020-02-01  750            850
2   2020-02-02  850            1100
I tried this script, but it just runs and never finishes:
UPDATE x
SET initial_value = (SELECT SUM(y.value_day)
                     FROM public.y
                     WHERE y.op_date > '2020-11-01'
                       AND y.op_date < x.op_date
                       AND y.id = x.id),
    end_value = (SELECT SUM(y.value_day)
                 FROM public.y
                 WHERE y.op_date BETWEEN '2020-11-01' AND x.op_date
                   AND y.id = x.id);
You can use a window function. First, here is a query that just selects the computed values:
select id, op_date,
       sum(value_day) over (
           partition by id
           order by op_date
           rows between unbounded preceding and current row
       ) - value_day as initial_value,
       sum(value_day) over (
           partition by id
           order by op_date
           rows between unbounded preceding and current row
       ) as end_value
from y;
This is your update query.
UPDATE x
SET initial_value = s_statement.initial_value,
    end_value = s_statement.end_value
FROM (select id, op_date,
             sum(value_day) over (
                 partition by id
                 order by op_date
                 rows between unbounded preceding and current row
             ) - value_day as initial_value,
             sum(value_day) over (
                 partition by id
                 order by op_date
                 rows between unbounded preceding and current row
             ) as end_value
      from y) s_statement
WHERE x.id = s_statement.id
  AND x.op_date = s_statement.op_date;
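If you want to sanity-check the numbers before touching x, the same derived table can be joined back to x in a plain SELECT first. This is just a verification sketch built from the two tables shown above:
SELECT x.id, x.op_date, s.initial_value, s.end_value
FROM x
JOIN (select id, op_date,
             sum(value_day) over (partition by id order by op_date
                                  rows between unbounded preceding and current row) - value_day as initial_value,
             sum(value_day) over (partition by id order by op_date
                                  rows between unbounded preceding and current row) as end_value
      from y) s
  ON s.id = x.id
 AND s.op_date = x.op_date
ORDER BY x.id, x.op_date;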
Let me know if this works for you.

Select all columns with suffix _test in q/kdb+

I have a partitioned table, similar to the table below:
q)t:([]date:3#2019.01.01; a:1 2 3; a_test:2 3 4; b_test:3 4 5; c: 6 7 8);
date a a_test b_test c
----------------------------
2019.01.01 1 2 3 6
2019.01.01 2 3 4 7
2019.01.01 3 4 5 8
Now, I want to fetch the date column and all columns whose names end with the suffix "_test" from table t.
Expected output:
date a_test b_test
------------------------
2019.01.01 2 3
2019.01.01 3 4
2019.01.01 4 5
In my original table, there are more than 100 columns whose names contain _test, so the following is not a practical solution in this case:
q)select date, a_test, b_test from t where date=2019.01.01
I tried various options like the one below, but to no avail:
q)delete all except date, *_test from select from t where date=2019.01.01
If the columns you are selecting are variable, then you should use a functional qSQL statement to perform the query. The following can be used in your case:
q)query:{[tab;dt;c]?[tab;enlist (=;`date;dt);0b;(`date,c)!`date,c]}
q)query[t;2019.01.01;cols[t] where cols[t] like "*_*"]
date a_test b_test
------------------------
2019.01.01 2 3
2019.01.01 3 4
2019.01.01 4 5
In order to craft a particular functional statement, you can parse your query, putting dummy columns in place if you aren't sure what they should be:
q)parse "select date,c1,c2 from tab where date=dt"
?
`tab
,,(=;`date;`dt)
0b
`date`c1`c2!`date`c1`c2
A functional select is probably the best way to go here if you require adding further filters.
?[`t;();0b;{x!x}`date,exec c from meta t where c like "*_test"]
The functional form of any select query can be obtained by applying the -5! operator to an SQL-style statement.
In the example below I have created a table with 20 fields, each one beginning with either a or b.
I then use the functional form to define which fields I want.
q)tab:{[x] enlist x!count[x]#0}`$"_" sv ' raze string `a`b,/:\:til 10
q){[t;s]?[t;();0b;{[x] x!x} cols[t] where cols[t] like s]}[tab;"b*"]
b_0 b_1 b_2 b_3 b_4 b_5 b_6 b_7 b_8 b_9
---------------------------------------
0 0 0 0 0 0 0 0 0 0
q){[t;s]?[t;();0b;{[x] x!x} cols[t] where cols[t] like s]}[tab;"a*"]
a_0 a_1 a_2 a_3 a_4 a_5 a_6 a_7 a_8 a_9
---------------------------------------
0 0 0 0 0 0 0 0 0 0
q)-5!" select a,b from c"
?
`c
()
0b
`a`b!`a`b
Alternatively, if I don't require any filtering, I can use the # (take) operator as below:
{[x;s] (cols[x] where cols[x] like s)#x}[ tab;"a*"]

Remove duplicates based on only 1 column

My data is in the following format:
rep_id user_id other non-duplicated data
1 1 ...
1 2 ...
2 3 ...
3 4 ...
3 5 ...
I am trying to add a deduped_rep column containing 0/1 such that, for each rep_id, only the first row across its associated users gets a 1 and the rest get 0.
Expected result:
rep_id user_id deduped_rep
1 1 1
1 2 0
2 3 1
3 4 1
3 5 0
For reference, in Excel, I would use the following formula:
IF(SUMPRODUCT(($A$2:$A2=A2)*($A$2:$A2=A2))>1,0,1)
I know there is the FIXED() LoD calculation http://kb.tableau.com/articles/howto/removing-duplicate-data-with-lod-calculations, but I only see use cases of it deduplicating based on another column. However, mine are distinct.
Define a field first_reg_date_per_rep_id as
{ fixed rep_id : min(registration_date) }
Then define a field is_first_reg_date? as
registration_date = first_reg_date_per_rep_id
You can use that last Boolean field to distinguish the first record for each rep_id from later ones.
Try this query:
select
    rep_id,
    user_id,
    case when row_number() over (partition by rep_id order by user_id) = 1
         then 1
         else 0
    end as deduped_rep
from
    your_table
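If window functions are not available in your database, a correlated subquery against the smallest user_id per rep_id produces the same 0/1 flag. This is a sketch that assumes a table named your_table and that user_id defines which row counts as "first", as in the query above:
select rep_id,
       user_id,
       case when user_id = (select min(user_id)
                            from your_table t2
                            where t2.rep_id = t1.rep_id)
            then 1 else 0 end as deduped_rep
from your_table t1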

Build a query that pulls records based on a value in a column

My table has a parent/child relationship, along the lines of parent.id, id. There is also a column that contains a quantity, and another ID representing a grandparent, like so:
id parent.id qty Org
1 1 1 100
2 1 0 100
3 1 4 100
4 4 1 101
5 4 2 101
6 6 1 102
7 6 0 102
8 6 1 102
What this is supposed to show: ID 1 is the parent, IDs 2 and 3 are its children, and IDs 1, 2, and 3 all belong to grandparent 100.
I would like to know: if any child or parent has qty = 0, what are all the other IDs associated with that parent, and what are all the other parents associated with that grandparent?
For example, I would want to see a report that shows me this:
Org id parent.id qty
100 1 1 1
100 2 1 0
100 3 1 4
102 6 6 1
102 7 6 0
102 8 6 1
I'd much appreciate any help you can offer in building an MS SQL 2000 (yeah, I know) query to handle this.
Try this:
select * from tablename a
where exists (select 1 from tablename x
              where x.parent_id = a.parent_id and x.qty = 0)
Example:
;with cte as
( select 1 id,1 parent_id, 1 qty, 100 org
union all select 2,1,0,100
union all select 3,1,4,100
union all select 4,4,1,101
union all select 5,4,2,101
union all select 6,6,1,102
union all select 7,6,0,102
union all select 8,6,1,102
)
select * from cte a
where exists (select 1 from cte x
where x.parent_id = a.parent_id and qty = 0)
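If you also want to pull in every parent under a grandparent (Org) whenever any row in that Org has qty = 0, the same EXISTS pattern works one level up. This sketch runs against the CTE sample above; on an actual SQL Server 2000 instance you would query your real table instead, since CTEs only arrived in SQL Server 2005:
select * from cte a
where exists (select 1 from cte x
              where x.org = a.org and x.qty = 0)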

Insert row number repeatedly in records in T-SQL

I want to insert a row number into the records that counts rows within a repeating range of a fixed size. Example output:
RowNumber ID Name
1 20 a
2 21 b
3 22 c
1 23 d
2 24 e
3 25 f
1 26 g
2 27 h
3 28 i
1 29 j
2 30 k
I would rather use ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...), but my records don't contain a column that could partition them into groups of 1-3.
I already tried looping over each record to insert a row count of 1-3, but the loop hurts the performance of the query. The query will be used for an RDL report, which is why its performance needs to be as good as possible.
Any suggestions are welcome. Thanks.
Have you tried applying modulo to ROW_NUMBER()?
SELECT ((ROW_NUMBER() OVER (ORDER BY ID) - 1) % 3) + 1 AS RowNumber
FROM table
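Applied to the sample output above, the full query would look something like this (dbo.YourTable, ID, and Name are stand-ins for your real table and column names):
SELECT ((ROW_NUMBER() OVER (ORDER BY ID) - 1) % 3) + 1 AS RowNumber,
       ID,
       Name
FROM dbo.YourTable
ORDER BY ID;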