Conditional counting of records in PostgreSQL

I have a table like the following:
SP    MA    SL   NG
jame  j001    1  20200715  |
jame  j001   -1  20200715  | => count is 0
pink  p002    3  20200730  }
pink  p002   -3  20200730  } => count is 0
jack  j002   12  20200731  | => count is 1
jack  j002   -2  20200731  |
jack  j002   12  20200801  } => count is 1
I want to count the records and get a result like this:
SP count
jame 0
pink 0
jack 2
I could do with some help, please. Thank you!
How the result is reached:
If SP, MA and NG are the same, sum SL.
If the sum is 0, the count is 0; if the sum is not 0, the count is 1.
If NG or SP is not the same, the count is 1.

As I understand your requirements:
If SP, MA and NG are the same, sum SL.
If the sum is 0, the count is 0; if the sum is not 0, the count is 1.
If NG or SP is not the same, the count is 1.
Try the query below:
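with cte as (
  -- one row per (sp, ma, ng) group whose total is not zero
  select sp, ma, ng, sum(sl) as total
  from example
  group by sp, ma, ng
  having sum(sl) <> 0
),
cte1 as (
  -- every sp, so that names with no non-zero group still appear
  select distinct sp from example
)
select
  t1.sp,
  count(t2.sp) as count
from cte1 t1
left join cte t2 on t1.sp = t2.sp
group by t1.sp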
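The same rule can be checked without a PostgreSQL instance by using Python's built-in sqlite3 module. This is only a sketch: the table name example comes from the query above, while the in-memory database, the derived-table alias g and the variable names are assumptions made for illustration. It tests the group total with <> 0 rather than > 0, on the reading that a negative total should also count as 1.

```python
import sqlite3

# Sample rows from the question: (SP, MA, SL, NG).
rows = [
    ("jame", "j001", 1, 20200715),
    ("jame", "j001", -1, 20200715),
    ("pink", "p002", 3, 20200730),
    ("pink", "p002", -3, 20200730),
    ("jack", "j002", 12, 20200731),
    ("jack", "j002", -2, 20200731),
    ("jack", "j002", 12, 20200801),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (sp TEXT, ma TEXT, sl INTEGER, ng INTEGER)")
conn.executemany("INSERT INTO example VALUES (?, ?, ?, ?)", rows)

# Inner query: total SL per (sp, ma, ng) group.
# Outer query: count the groups per sp whose total is not zero.
query = """
SELECT sp,
       SUM(CASE WHEN total <> 0 THEN 1 ELSE 0 END) AS cnt
FROM (SELECT sp, ma, ng, SUM(sl) AS total
      FROM example
      GROUP BY sp, ma, ng) AS g
GROUP BY sp
"""
counts = dict(conn.execute(query).fetchall())
print(counts)
```

Because every sp has at least one group in the inner query, this variant does not need the separate list of distinct sp values.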


How do I write a proper query in kdb in this case?

I would like to get all the groups in my table whose only distinct price is 0, i.e. a group should be returned only if all of its prices are 0.
My query and table look something like this:
tab:([]grp:`a`b`c`c`a`a`a;price:0 20 0 1 0 0 0)
select grp from tab where distinct price = 0
The output should be only `a, since `a is the only group where all prices are 0.
Using fby is one way to achieve the result here:
q)tab:([]grp:`a`b`c`c`a`a`a;price:0 20 0 1 0 0 0)
q)select from tab where 0=(max;abs price)fby grp
grp price
---------
a 0
a 0
a 0
a 0
q)distinct exec grp from tab where 0=(max;abs price)fby grp
,`a
Another approach:
q)where exec all 0=price by grp from tab
,`a
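For readers who do not use q, the logic behind both answers (keep a group only if every one of its prices is 0) can be sketched in plain Python. The variable names below are assumptions chosen to mirror the table columns; this is a cross-check of the idea, not a replacement for the q queries.

```python
from collections import defaultdict

# Same data as the q table: one grp label per price.
grp = ["a", "b", "c", "c", "a", "a", "a"]
price = [0, 20, 0, 1, 0, 0, 0]

# Collect the prices of each group.
prices_by_grp = defaultdict(list)
for g, p in zip(grp, price):
    prices_by_grp[g].append(p)

# Keep only the groups in which every price is zero.
all_zero = [g for g, ps in prices_by_grp.items() if all(p == 0 for p in ps)]
print(all_zero)
```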

Indicator for increase over time

I'm trying to create an indicator for a value increase over time within a group. In particular, I want to flag a grp if its value ever increases by 50% over time.
I have raw data that looks like this:
id grp value_dt value
--------------------------------
1 1 11/20/20 1.4
1 1 11/21/20 0.8
1 1 11/24/20 2.8
1 1 11/25/20 2.5
1 2 11/29/20 1.5
1 2 12/1/20 1.6
2 1 11/21/20 0.8
2 2 11/26/20 0.9
2 3 12/1/20 0.9
2 3 12/3/20 2.8
You can see that for id = 1 and grp = 1 the value fluctuates, increasing and decreasing over time. Because it increased from 0.8 to 2.8 between 11/21/20 and 11/24/20 (an increase of more than 50%), I want to flag the whole of grp 1. I want my output to look like this:
id grp val_ind
-----------------------
1 1 1
1 2 0
2 1 0
2 2 0
2 3 1
I can only think of using min and max (something like below), which doesn't take the 'over time' factor into account:
select id,
grp,
min(value) as min_grp,
max(value) as max_grp,
(max_grp - min_grp) as val_diff,
case when val_diff >= min_grp * 1.5 then 1 else 0 end as val_ind
If anyone can offer their advice, I will greatly appreciate it!
I think you want to flag a group if, at any point in time, the value increases by 50%. If so, here is how you can do it.
You need a CTE and window functions. The query compares each value with the next one in date order:
WITH cte AS (
    SELECT *
         -- flag a row when the next value (by date) is at least 50% higher
         , CASE WHEN COALESCE(LEAD(value) OVER (PARTITION BY id, grp ORDER BY value_dt), 0) >= value * 1.5
                THEN 1 ELSE 0 END AS val_ind
    FROM ttt
)
SELECT id, grp, MAX(val_ind) AS val_ind
FROM cte
GROUP BY id, grp
id | grp | val_ind
-: | --: | ------:
1 | 1 | 1
1 | 2 | 0
2 | 1 | 0
2 | 2 | 0
2 | 3 | 1
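If "ever increases by 50% over time" is read as "some later value is at least 1.5 times some earlier value", then comparing only adjacent rows could miss a rise that is spread over several steps. The Python sketch below shows that alternative reading by comparing each value with the lowest value seen so far in its group. The function name flag_groups and the ratio parameter are assumptions for illustration, and the question does not say which reading is intended.

```python
from datetime import datetime

# Rows from the question: (id, grp, value_dt, value).
rows = [
    (1, 1, "11/20/20", 1.4), (1, 1, "11/21/20", 0.8),
    (1, 1, "11/24/20", 2.8), (1, 1, "11/25/20", 2.5),
    (1, 2, "11/29/20", 1.5), (1, 2, "12/1/20", 1.6),
    (2, 1, "11/21/20", 0.8), (2, 2, "11/26/20", 0.9),
    (2, 3, "12/1/20", 0.9), (2, 3, "12/3/20", 2.8),
]


def flag_groups(rows, ratio=1.5):
    """Return {(id, grp): 0 or 1}; 1 if a value reaches ratio times an earlier minimum."""
    ordered = sorted(
        rows, key=lambda r: (r[0], r[1], datetime.strptime(r[2], "%m/%d/%y"))
    )
    flags = {}
    lowest = {}  # lowest value seen so far in each (id, grp)
    for id_, grp, _, value in ordered:
        key = (id_, grp)
        flags.setdefault(key, 0)
        if key in lowest and value >= lowest[key] * ratio:
            flags[key] = 1
        lowest[key] = min(lowest.get(key, value), value)
    return flags


val_ind = flag_groups(rows)
print(val_ind)
```

On the sample data both readings flag the same two groups, (1, 1) and (2, 3).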

How to insert row data between consecutive dates in HIVE?

Sample Data:
customer txn_date tag
A 1-Jan-17 1
A 2-Jan-17 1
A 4-Jan-17 1
A 5-Jan-17 0
B 3-Jan-17 1
B 5-Jan-17 0
I need to fill in every missing txn_date within the date range (1-Jan-17 to 5-Jan-17), as shown below.
The output should be:
customer txn_date tag
A 1-Jan-17 1
A 2-Jan-17 1
A 3-Jan-17 0 (inserted)
A 4-Jan-17 1
A 5-Jan-17 0
B 1-Jan-17 0 (inserted)
B 2-Jan-17 0 (inserted)
B 3-Jan-17 1
B 4-Jan-17 0 (inserted)
B 5-Jan-17 0
select c.customer
      ,d.txn_date
      ,coalesce(t.tag,0) as tag
from (select date_add (from_date,i) as txn_date
      from (select date '2017-01-01' as from_date
                  ,date '2017-01-05' as to_date
           ) p
           lateral view
           posexplode(split(space(datediff(p.to_date,p.from_date)),' ')) pe as i,x
     ) d
     cross join (select distinct
                        customer
                 from t
                ) c
     left join t
     on t.customer = c.customer
     and t.txn_date = d.txn_date
;
c.customer d.txn_date tag
A 2017-01-01 1
A 2017-01-02 1
A 2017-01-03 0
A 2017-01-04 1
A 2017-01-05 0
B 2017-01-01 0
B 2017-01-02 0
B 2017-01-03 1
B 2017-01-04 0
B 2017-01-05 0
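The query above follows three steps: generate every date in the range, cross join those dates with the distinct customers, then left join the real rows and default a missing tag to 0. The same steps can be sketched in Python for a quick check. The names txns, days, customers and filled are assumptions for illustration only.

```python
from datetime import date, timedelta

# Existing rows keyed by (customer, txn_date), with the tag as the value.
txns = {
    ("A", date(2017, 1, 1)): 1,
    ("A", date(2017, 1, 2)): 1,
    ("A", date(2017, 1, 4)): 1,
    ("A", date(2017, 1, 5)): 0,
    ("B", date(2017, 1, 3)): 1,
    ("B", date(2017, 1, 5)): 0,
}
from_date, to_date = date(2017, 1, 1), date(2017, 1, 5)

# Step 1: every date in the range (the role of posexplode in the Hive query).
days = [from_date + timedelta(days=i) for i in range((to_date - from_date).days + 1)]

# Step 2: distinct customers (the cross join side).
customers = sorted({customer for customer, _ in txns})

# Step 3: look up the real tag, defaulting to 0 (left join plus coalesce).
filled = [(c, d, txns.get((c, d), 0)) for c in customers for d in days]

for customer, day, tag in filled:
    print(customer, day.isoformat(), tag)
```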
Put the delta content, i.e. the missing data, in a file (input.txt) that uses the same delimiter you specified when you created the table.
Then use the load data command to insert these records into the table:
load data local inpath '/tmp/input.txt' into table tablename;
Your data won't be in the order you showed; the new rows are appended at the end. You can get that order back by adding order by txn_date to the select query.

How to combine multiple queries into one single query

I have the three queries below and I need to combine them into one. Does anybody know how to do that?
select COUNT(*) from dbo.VWAnswer where questionId =2 and answer =1
select COUNT(*) from dbo.VWAnswer where questionId =3 and answer =4
select COUNT(*) from dbo.VWAnswer where questionId =5 and answer =2
I want to find the total count of people whose gender = 1, education = 4 and marital status = 2.
These are the columns of the table I am referring to (with example rows):
questionId  questionText    anwser  AnserSheetID
1           Gender          1       1
2           Qualification   4       1
3           Marital Status  2       1
1           Gender          2       2
2           Qualification   1       2
3           Marital Status  2       2
1           Gender          1       3
2           Qualification   3       3
3           Marital Status  1       3
Basically, these are questions answered by different people whose answers are stored in this table.
So, for the table entries above, I should get a total count of 1 based on the three conditions, i.e. gender = 1, education = 4 and marital status = 2.
Can someone tell me what I need to do to get this to work?
If you want to combine your three count queries, you can try the below SQL to get it done.
select
sum(case when questionId =2 and anwser=1 then 1 else 0 end) as FCount,
sum(case when questionId =3 and anwser=4 then 1 else 0 end) as SCount,
sum(case when questionId =5 and anwser=2 then 1 else 0 end) as TCount
from dbo.VWAnswer
Update 1:
select
Sum(case when questionText='Gender' and anwser='1' then 1 else 0 end) as GenderCount,
Sum(case when questionText='Qualification' and anwser='4' then 1 else 0 end) as EducationCount,
Sum(case when questionText='Marital Status' and anwser='2' then 1 else 0 end) as MaritalCount
from VWAnswer
These counts are computed row by row, so each condition is checked against a single row.
To count the answer sheets that meet all of your conditions, you can join the view to itself and count the rows that match:
Select COUNT(*) as cnt from
(
Select a.AnserSheetID
from VWAnswer a
Join VWAnswer b on a.AnserSheetID=b.AnserSheetID and b.questionId = 2 and b.anwser=4
Join VWAnswer c on a.AnserSheetID=c.AnserSheetID and c.questionId = 3 and c.anwser=2
where a.questionId=1 and a.anwser=1
) hlp
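An alternative to the self-join is to keep only the rows that satisfy any one of the three conditions, group them by AnserSheetID, and keep the sheets where all three conditions were hit. The sketch below runs that idea in Python's built-in sqlite3 module against the sample rows. It assumes each sheet has at most one row per question, and the alias matched and the variable names are hypothetical.

```python
import sqlite3

# Sample rows: (questionId, questionText, anwser, AnserSheetID).
answers = [
    (1, "Gender", 1, 1), (2, "Qualification", 4, 1), (3, "Marital Status", 2, 1),
    (1, "Gender", 2, 2), (2, "Qualification", 1, 2), (3, "Marital Status", 2, 2),
    (1, "Gender", 1, 3), (2, "Qualification", 3, 3), (3, "Marital Status", 1, 3),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE VWAnswer (questionId INTEGER, questionText TEXT, "
    "anwser INTEGER, AnserSheetID INTEGER)"
)
conn.executemany("INSERT INTO VWAnswer VALUES (?, ?, ?, ?)", answers)

# A sheet qualifies only if all three (question, answer) pairs are present.
query = """
SELECT COUNT(*) FROM (
    SELECT AnserSheetID
    FROM VWAnswer
    WHERE (questionId = 1 AND anwser = 1)
       OR (questionId = 2 AND anwser = 4)
       OR (questionId = 3 AND anwser = 2)
    GROUP BY AnserSheetID
    HAVING COUNT(*) = 3
) AS matched
"""
matching_sheets = conn.execute(query).fetchone()[0]
print(matching_sheets)
```

Only answer sheet 1 satisfies all three conditions in the sample data, which matches the expected total of 1 stated in the question.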

Conditional summarizing columns

I have the following situation:
ID Value
1 50
1 60
2 70
2 80
1 0
2 50
I need to run a query that returns the summed value, grouped by ID. The catch is that if any value in a group is 0, the entire sum for that group should be 0.
The query results would be:
ID Value
1 0
2 200
I tried:
select ID, case
when Value> 0 then sum(Value) * 1
when Value= 0 then sum(value) * 0
end
from table
but that did not work.
select ID,
sum(value)*sign(min(abs(value))) as [sum(value)]
from YourTable
group by ID
Or with a case expression, if you prefer:
select ID,
case sign(min(abs(value)))
when 0 then 0
else sum(value)
end as [sum(value)]
from YourTable
group by ID
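The trick in both queries is that min(abs(value)) is 0 exactly when the group contains a 0, so it can switch the sum off. A minimal sketch with Python's built-in sqlite3 module checks this against the sample data. It tests min(abs(Value)) = 0 directly instead of calling sign, because sign is not available in older SQLite versions. The table name YourTable comes from the answer, and the alias total is an assumption.

```python
import sqlite3

# Sample rows from the question: (ID, Value).
rows = [(1, 50), (1, 60), (2, 70), (2, 80), (1, 0), (2, 50)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (ID INTEGER, Value INTEGER)")
conn.executemany("INSERT INTO YourTable VALUES (?, ?)", rows)

# MIN(ABS(Value)) is 0 only if some row in the group is 0.
query = """
SELECT ID,
       CASE WHEN MIN(ABS(Value)) = 0 THEN 0 ELSE SUM(Value) END AS total
FROM YourTable
GROUP BY ID
ORDER BY ID
"""
totals = conn.execute(query).fetchall()
print(totals)
```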