Combine two rows into one in Redshift [SQL] - amazon-redshift

How do I combine this result set in SQL (Redshift environment):
ID      Flag?  Plan?
MX_123  0      1
MX_123  1      0
MX_456  1      0
MX_456  0      1
to become this:
ID      Flag?  Plan?
MX_123  1      1
MX_456  1      1

select id, sum(flag), sum(plan) from table group by id;
But if identical (id, flag, plan) rows can occur more than once and you want the result capped at 1, I'd use something like:
SELECT
    id,
    CASE
        WHEN SUM(flag)>0 THEN 1
        ELSE 0
    END,
    CASE
        WHEN SUM(plan)>0 THEN 1
        ELSE 0
    END
FROM table
GROUP BY id;

Thanks for these suggestions. I was able to figure it out with the MAX() aggregate function and a subquery:
SELECT a.id
,max(a.flag)
,max(a.plan)
from
(
SUBQUERY HERE - has the dupes
) a
group by a.id
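
For anyone who wants to try the difference between the SUM and MAX answers, here is a minimal sketch using Python's built-in sqlite3 module rather than Redshift. The table name `results` and the extra duplicate MX_456 row are assumptions added for illustration; they are not part of the original data.

```python
import sqlite3

# In-memory SQLite database standing in for Redshift (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id TEXT, flag INTEGER, plan INTEGER)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [
        ("MX_123", 0, 1),
        ("MX_123", 1, 0),
        ("MX_456", 1, 0),
        ("MX_456", 0, 1),
        ("MX_456", 0, 1),  # hypothetical duplicate row, added to show the difference
    ],
)

# MAX collapses each id to one row and never exceeds 1 for 0/1 columns.
max_rows = conn.execute(
    "SELECT id, MAX(flag), MAX(plan) FROM results GROUP BY id ORDER BY id"
).fetchall()

# SUM also collapses the rows, but duplicates push the value above 1.
sum_rows = conn.execute(
    "SELECT id, SUM(flag), SUM(plan) FROM results GROUP BY id ORDER BY id"
).fetchall()

print(max_rows)
print(sum_rows)
```

With 0/1 columns, MAX gives the same result as the CASE WHEN SUM(...) > 0 form, so which one to use is mostly a matter of taste.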

Related

Remove duplicate with separate column check TSQL

I have two tables with the same columns, both holding permission records.
A column named IsAllow exists in both tables.
I am combining the records of both tables using UNION.
However, if a record has IsAllow = 0 in either table, I want to skip the matching records entirely. UNION returns all of them, which is where I am stuck.
The columns are:
IsAllow, UserId, FunctionActionId
I tried UNION but it returns both records. I want to exclude any record that has IsAllow = 0 in either table.
Sample data table 1
IsAllow UserId FunctionActionId
1 2 5
1 2 8
Sample data table 2
IsAllow UserId FunctionActionId
0 2 5 (should be excluded)
1 2 15
You can try this:
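-- Ascending order ranks an IsAllow = 0 row first, so a (UserId, FunctionActionId)
-- pair that is denied in either table is dropped by the IsAllow = 1 filter.
;with cte as(select *, row_number()
over(partition by UserId, FunctionActionId order by IsAllow) rn
from
(select * from table1
union all
select * from table2) t)
select * from cte where rn = 1 and IsAllow = 1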
Version 2:
select distinct coalesce(t1.UserId, t2.UserId) as UserId,
coalesce(t1.FunctionActionId, t2.FunctionActionId) as FunctionActionId,
1 as IsAllow
from table1 t1
full join table2 t2 on t1.UserId = t2.UserId and
t1.FunctionActionId = t2.FunctionActionId
where (t1.IsAllow = 1 and t2.IsAllow = 1) or
(t1.IsAllow = 1 and t2.IsAllow is null) or
(t1.IsAllow is null and t2.IsAllow = 1)
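
A third way to express the same rule, sketched here with Python's built-in sqlite3 module instead of SQL Server, is to group the combined rows and keep a pair only when its smallest IsAllow is 1. This is a different technique from the two versions above, and it assumes the requirement is "drop the whole (UserId, FunctionActionId) pair if either table has IsAllow = 0".

```python
import sqlite3

# In-memory SQLite database loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (IsAllow INTEGER, UserId INTEGER, FunctionActionId INTEGER);
CREATE TABLE table2 (IsAllow INTEGER, UserId INTEGER, FunctionActionId INTEGER);
INSERT INTO table1 VALUES (1, 2, 5), (1, 2, 8);
INSERT INTO table2 VALUES (0, 2, 5), (1, 2, 15);
""")

# MIN(IsAllow) = 1 holds only when no row for the pair has IsAllow = 0.
allowed = conn.execute("""
SELECT UserId, FunctionActionId
FROM (SELECT * FROM table1 UNION ALL SELECT * FROM table2) AS t
GROUP BY UserId, FunctionActionId
HAVING MIN(IsAllow) = 1
ORDER BY FunctionActionId
""").fetchall()

print(allowed)
```

FunctionActionId 5 is missing from the result because table2 denies it, while 8 and 15 each appear in only one table with IsAllow = 1.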

Count 'Number of Transactions'

I'm trying to get a count of the 'number of transactions' that occurred.
The data could look like this.
Cust # Trans# TransType LineItem
42 5000 1 1
42 6000 1 1
42 6000 1 2
42 6000 2 1
42 6000 2 2
42 6000 2 3
There can be multiple transaction types for any given transaction number. In this example, my desired 'number of transactions' count is 3, as Trans# 5000 has only one distinct TransType and 6000 has two. If I do a distinct count of Trans# I get 2, and if I do a plain count I get 6.
I've tried working with:
COUNT(DISTINCT CASE Trans# WHEN ???? THEN 1 ELSE null END) AS [Num of Transactions],
But I know that I'm not quite on the right track. If anyone could point me in the right direction, it'd be much appreciated.
Try this :-
with cte as
(
Select Cust,Trans,row_number() over (partition by trans,TransType order by cust) rn
from Sample
)
Select count(*) as TransCount from cte
where rn=1
You can use the following to get the count of distinct transtype for each customer and transaction:
select cust,
trans,
count(distinct transtype) cnt
from yourtable
group by cust, trans;
Then if you want a total of that count, you can apply sum() to the query:
select sum(cnt) Total
from
(
select cust,
trans,
count(distinct transtype) cnt
from yourtable
group by cust, trans
) src
See SQL Fiddle with Demo of both queries.
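
Since the fiddle link may not be available, here is a rough local equivalent using Python's built-in sqlite3 module. The column names (cust, trans, transtype, lineitem) are assumptions based on the sample data in the question; `yourtable` is the placeholder name used in the answer.

```python
import sqlite3

# In-memory SQLite table holding the six sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable (cust INTEGER, trans INTEGER, transtype INTEGER, lineitem INTEGER)"
)
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?)",
    [
        (42, 5000, 1, 1),
        (42, 6000, 1, 1),
        (42, 6000, 1, 2),
        (42, 6000, 2, 1),
        (42, 6000, 2, 2),
        (42, 6000, 2, 3),
    ],
)

# Distinct transaction types per customer and transaction number.
per_trans = conn.execute("""
SELECT cust, trans, COUNT(DISTINCT transtype) AS cnt
FROM yourtable
GROUP BY cust, trans
ORDER BY trans
""").fetchall()

# Sum of those per-transaction counts: the desired 'number of transactions'.
total = conn.execute("""
SELECT SUM(cnt)
FROM (
    SELECT COUNT(DISTINCT transtype) AS cnt
    FROM yourtable
    GROUP BY cust, trans
) AS src
""").fetchone()[0]

print(per_trans)
print(total)
```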

Understanding Postgres Query

Regarding the difference between...
select * from table_a where id != 30 and name != 'Kevin';
and
select * from table_a where id != 30 or name != 'Kevin';
First one means, "select all rows from table_a where the id is not 30 and the name is not Kevin".
So {Id, Name} row of {30, 'Bill'} would be returned from this first query.
But, the second one means, "select all rows from table_a where the id is not 30 or the name is not 'Kevin'".
So the above {30, 'Bill'} would not be returned from this second query.
Is that right?
"select * from table_a where id != 30 and name != 'Kevin';
So {Id, Name} row of {30, 'Bill'} would be returned from this first query."
No, it wouldn't.
"select * from table_a where id != 30 or name != 'Kevin';
So the above {30, 'Bill'} would not be returned from this second query."
No, it would. You have the logic backwards. Just try it.
Nope. The second query means "select all rows where the id is not 30 or the name is not 'Kevin'", hence a name of 'Bill' qualifies the record for inclusion in the query.
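Recap, with A = (id = 30) and B = (name = 'Kevin'):
A  B  not(A)  not(B)  not(A) AND not(B)  not(A) OR not(B)
1  1  0       0       0                  0
1  0  0       1       0                  1
0  1  1       0       0                  1
0  0  1       1       1                  1
So the two queries treat a given row the same way only if:
1- id=30 and name='Kevin' (both queries exclude it)
or
2- id!=30 and name!='Kevin' (both queries return it)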
Quick logic expression transformation tip:
NOT (A AND B) == NOT A OR NOT B
NOT (A OR B) == NOT A AND NOT B
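
To "just try it" without a Postgres instance, here is a small sketch using Python's built-in sqlite3 module. The four sample rows are made up for illustration and cover every combination of id = 30 or not and name = 'Kevin' or not.

```python
import sqlite3

# In-memory SQLite table with one row per combination of the two conditions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?)",
    [(30, "Bill"), (30, "Kevin"), (10, "Kevin"), (10, "Bill")],
)

# AND: a row must fail BOTH equalities to be returned.
and_rows = conn.execute(
    "SELECT id, name FROM table_a WHERE id != 30 AND name != 'Kevin' ORDER BY id, name"
).fetchall()

# OR: a row is returned as soon as it fails EITHER equality.
or_rows = conn.execute(
    "SELECT id, name FROM table_a WHERE id != 30 OR name != 'Kevin' ORDER BY id, name"
).fetchall()

# De Morgan: NOT (A AND B) is the same predicate as NOT A OR NOT B.
not_and_rows = conn.execute(
    "SELECT id, name FROM table_a WHERE NOT (id = 30 AND name = 'Kevin') ORDER BY id, name"
).fetchall()

print(and_rows)
print(or_rows)
print(not_and_rows == or_rows)
```

The {30, 'Bill'} row shows up only in the OR result, which matches the answers above.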

Summing From Consecutive Rows

Assume we have a table and we want to do a sum of the Expend column so that the summation only adds up values of the same Week_Name.
SN  Week_Name  Exp  Sum
--  ---------  ---  ---
1   Week 1     10   0
2   Week 1     20   0
3   Week 1     30   60
4   Week 2     40   0
5   Week 2     50   90
6   Week 3     10   0
I assume we will need to 'Order By' Week_Name, then compare the Week_Name of the previous row with the Week_Name of the current row.
If both are the same, put zero in the SUM column.
If not the same, add all expenditure, where Week_Name = Week_Name(Previous row) and place in the Sum column. The final output should look like the table above.
Any help on how to achieve this in T-SQL is highly appreciated.
Okay, I was eventually able to resolve this issue, praise Jesus! If you want the exact table I gave above, you can use GilM's response below; it is perfect. If you want your table to have running cumulative totals, i.e. row 3 should have 60, row 5 should have 150, row 6 should have 160, etc., then you can use my code below:
USE CAPdb
IF OBJECT_ID ('dbo.[tablebp]') IS NOT NULL
DROP TABLE [tablebp]
GO
CREATE TABLE [tablebp] (
tablebpcCol1 int PRIMARY KEY
,tabledatekey datetime
,tableweekname varchar(50)
,expenditure1 numeric
,expenditure_Cummulative numeric
)
INSERT INTO [tablebp](tablebpcCol1,tabledatekey,tableweekname,expenditure1,expenditure_Cummulative)
SELECT b.s_tablekey,d.PK_Date,d.Week_Name,
SUM(b.s_expenditure1) AS s_expenditure1,
SUM(b.s_expenditure1) + COALESCE((SELECT SUM(s_expenditure1)
FROM source_table bs JOIN dbo.Time dd ON bs.[DATE Key] = dd.[PK_Date]
WHERE dd.PK_Date < d.PK_Date),0)
FROM source_table b
INNER JOIN dbo.Time d ON b.[Date key] = d.PK_Date
GROUP BY d.[PK_Date],d.Week_Name,b.s_tablekey,b.s_expenditure1
ORDER BY d.[PK_Date]
;WITH CTE AS (
SELECT tableweekname
,Max(expenditure_Cummulative) AS Week_expenditure_Cummulative
,MAX(tablebpcCol1) AS MaxSN
FROM [tablebp]
GROUP BY tableweekname
)
SELECT [tablebp].*
,CASE WHEN [tablebp].tablebpcCol1 = CTE.MaxSN THEN Week_expenditure_Cummulative
ELSE 0 END AS [RunWeeklySum]
FROM [tablebp]
JOIN CTE on CTE.tableweekname = [tablebp].tableweekname
I'm not sure why your SN=6 line is 0 rather than 10. Do you really not want the sum for the last Week? If having the last week total is okay, then you might want something like:
;WITH CTE AS (
SELECT Week_Name,SUM([Expend.]) as SumExpend
,MAX(SN) AS MaxSN
FROM T
GROUP BY Week_Name
)
SELECT T.*,CASE WHEN T.SN = CTE.MaxSN THEN SumExpend
ELSE 0 END AS [Sum]
FROM T
JOIN CTE on CTE.Week_Name = T.Week_Name
Based on the request in the comment for a running total in the Sum column, you could try this:
;WITH CTE AS (
SELECT Week_Name, MAX(SN) AS MaxSN
FROM T
GROUP BY Week_Name
)
SELECT T.SN, T.Week_Name,T.Exp,
CASE WHEN T.SN = CTE.MaxSN THEN
(SELECT SUM(EXP) FROM T T2
WHERE T2.SN <= T.SN) ELSE 0 END AS [SUM]
FROM T
JOIN CTE ON CTE.Week_Name = T.Week_Name
ORDER BY SN
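
Both variants (per-week total and running total, each shown only on the last row of a week) can be compared side by side with a small sketch in Python's built-in sqlite3 module. This is not T-SQL, and the column name `Expend` is an assumption chosen to avoid the `Exp`/`[Expend.]` naming differences above; the CTE logic is the same idea as in the queries above.

```python
import sqlite3

# In-memory SQLite table with the six sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (SN INTEGER PRIMARY KEY, Week_Name TEXT, Expend INTEGER)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [
        (1, "Week 1", 10),
        (2, "Week 1", 20),
        (3, "Week 1", 30),
        (4, "Week 2", 40),
        (5, "Week 2", 50),
        (6, "Week 3", 10),
    ],
)

rows = conn.execute("""
WITH cte AS (
    -- One row per week: its total and the SN of its last row.
    SELECT Week_Name, SUM(Expend) AS SumExpend, MAX(SN) AS MaxSN
    FROM T
    GROUP BY Week_Name
)
SELECT T.SN,
       -- Weekly total, shown only on the last row of each week.
       CASE WHEN T.SN = cte.MaxSN THEN cte.SumExpend ELSE 0 END AS WeekSum,
       -- Running total up to this row, shown only on the last row of each week.
       CASE WHEN T.SN = cte.MaxSN
            THEN (SELECT SUM(T2.Expend) FROM T AS T2 WHERE T2.SN <= T.SN)
            ELSE 0 END AS RunningSum
FROM T
JOIN cte ON cte.Week_Name = T.Week_Name
ORDER BY T.SN
""").fetchall()

week_sums = [r[1] for r in rows]
running_sums = [r[2] for r in rows]
print(week_sums)
print(running_sums)
```

Note that the last week (SN 6) gets its own total of 10 here rather than the 0 shown in the question's table.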

SQL select from a group

Suppose we have the following table data:
ID parent stage submitted
1 1 1 1
2 1 2 1
3 1 3 0
4 1 4 0
5 5 1 1
6 5 2 1
7 5 3 1
8 5 4 1
As you can see, we have 2 groups (rows that share the same parent). I want to select the latest stage that is submitted in each group. In the above example I want to select the IDs 2 and 8. I am completely lost, so any help would be appreciated a lot. :)
SELECT T.ID, T.PARENT, T.STAGE
from
T,
(
select PARENT, MAX( STAGE) MAX_STAGE
from T
where SUBMITTED = 1
GROUP BY PARENT
) M
where
T.STAGE = M.MAX_STAGE
AND T.PARENT = M.PARENT
Explanation:
First, isolate the max stage for each group with submitted = 1 (the inner select).
Then, join the result with the real table, to filter out the records with no max stage.
Select t.Parent, max(t.Id)
From tbl t
Inner Join
(
Select Parent, max(Stage) as Stage
from tbl
Where Submitted = 1
Group by Parent
) submitted
on t.Parent = submitted.parent and
t.stage = submitted.stage
Group by t.Parent
This should do it:
SELECT
T1.id,
T1.parent,
T1.stage,
T1.submitted
FROM
Some_Table T1
LEFT OUTER JOIN Some_Table T2 ON
T2.parent = T1.parent AND
T2.submitted = 1 AND
T2.stage > T1.stage
WHERE
T1.submitted = 1 AND
T2.id IS NULL
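
The self-join query above can be checked against the sample data with a short sketch in Python's built-in sqlite3 module. `Some_Table` is the placeholder name from the answer; running it on SQLite instead of the asker's database is an assumption made for the demo.

```python
import sqlite3

# In-memory SQLite table with the eight sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Some_Table (id INTEGER, parent INTEGER, stage INTEGER, submitted INTEGER)"
)
conn.executemany(
    "INSERT INTO Some_Table VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 1), (2, 1, 2, 1), (3, 1, 3, 0), (4, 1, 4, 0),
        (5, 5, 1, 1), (6, 5, 2, 1), (7, 5, 3, 1), (8, 5, 4, 1),
    ],
)

# Anti-join: keep a submitted row only if no later submitted stage
# exists for the same parent (the LEFT JOIN finds nothing, so T2.id is NULL).
latest_ids = [
    row[0]
    for row in conn.execute("""
SELECT T1.id
FROM Some_Table T1
LEFT OUTER JOIN Some_Table T2 ON
    T2.parent = T1.parent AND
    T2.submitted = 1 AND
    T2.stage > T1.stage
WHERE
    T1.submitted = 1 AND
    T2.id IS NULL
ORDER BY T1.id
""")
]

print(latest_ids)
```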
SELECT * FROM Table WHERE ID = 2 OR ID = 8
Is this what you want?