Get Totals from multi rows - amazon-redshift

I want to get the totals of a column, but I run into a problem when I try to do this in Redshift.
This is my code:
SELECT date(t.adjusted_booking_time),
datepart(week, t.adjusted_booking_time) as Week,
sum(CASE
WHEN t.booking_outcome IN (4,5,6) THEN 1
ELSE 0
END) AS Accepted_trips,
count(DISTINCT t.driver_id) AS Active_drivers,
count(DISTINCT t.customer_id) AS Active_customers,
Sum(t.trip_gross_fare) AS GMV,
sum(t.discount_applied) AS Discount
FROM (select x.*, dateadd(hour, 2, x.booking_time) as adjusted_booking_time from analytics_v2.trips x)t
LEFT OUTER JOIN analytics_v2.customers c ON c.user_id=t.customer_id
LEFT OUTER JOIN analytics_v2.drivers d ON d.driver_id=t.driver_id
WHERE t.adjusted_booking_time BETWEEN '{{ Start_date }}' AND '{{ End_date }}' and t.country = 'sd'
GROUP By date(t.adjusted_booking_time),week

Related

How to repeat some data points in query results?

I am trying to get the max date by account from 3 different tables and view those dates side by side. I created a separate query for each table, merged the results with UNION ALL, and then wrapped all that in a PIVOT.
The first 2 sections in the link/pic below show what I have been able to accomplish and the 3rd section is what I would like to do.
Query results by step
How can I get the results from 2 of the tables to repeat? Is that possible?
--define var_ent_type = 'ACOM'
--define var_ent_id = '52766'
--define var_dict_id = 113
SELECT
*
FROM
(
SELECT
E.ENTITY_TYPE,
E.ENTITY_ID,
'PERF_SUMMARY' as "TableName",
PS.DICTIONARY_ID,
to_char(MAX(PS.END_EFFECTIVE_DATE), 'YYYY-MM-DD') as "MaxDate"
FROM
RULESDBO.ENTITY E
INNER JOIN PERFORMDBO.PERF_SUMMARY PS ON (PS.ENTITY_ID = E.ENTITY_ID)
WHERE
1=1
-- AND E.ENTITY_TYPE = '&var_ent_type'
-- AND E.ENTITY_ID = '&var_ent_id'
AND PS.DICTIONARY_ID >= 100
AND (E.ACTIVE_STATUS <> 'N' )--and E.TERMINATION_DATE is null )
GROUP BY
E.ENTITY_TYPE,
E.ENTITY_ID,
'PERF_SUMMARY',
PS.DICTIONARY_ID
union all
SELECT
E.ENTITY_TYPE,
E.ENTITY_ID,
'POSITION' as "TableName",
0 as DICTIONARY_ID,
to_char(MAX(H.EFFECTIVE_DATE), 'YYYY-MM-DD') as "MaxDate"
FROM
RULESDBO.ENTITY E
INNER JOIN HOLDINGDBO.POSITION H ON (H.ENTITY_ID = E.ENTITY_ID)
WHERE
1=1
-- AND E.ENTITY_TYPE = '&var_ent_type'
-- AND E.ENTITY_ID = '&var_ent_id'
AND (E.ACTIVE_STATUS <> 'N' )--and E.TERMINATION_DATE is null )
GROUP BY
E.ENTITY_TYPE,
E.ENTITY_ID,
'POSITION',
1
union all
SELECT
E.ENTITY_TYPE,
E.ENTITY_ID,
'CASH_ACTIVITY' as "TableName",
0 as DICTIONARY_ID,
to_char(MAX(C.EFFECTIVE_DATE), 'YYYY-MM-DD') as "MaxDate"
FROM
RULESDBO.ENTITY E
INNER JOIN CASHDBO.CASH_ACTIVITY C ON (C.ENTITY_ID = E.ENTITY_ID)
WHERE
1=1
-- AND E.ENTITY_TYPE = '&var_ent_type'
-- AND E.ENTITY_ID = '&var_ent_id'
AND (E.ACTIVE_STATUS <> 'N' )--and E.TERMINATION_DATE is null )
GROUP BY
E.ENTITY_TYPE,
E.ENTITY_ID,
'CASH_ACTIVITY',
1
--ORDER BY
-- 2,3, 4
)
PIVOT
(
MAX("MaxDate")
FOR "TableName"
IN ('CASH_ACTIVITY', 'PERF_SUMMARY','POSITION')
)

Yes, this is possible. You only need a window function to make the value repeat across the rows that have no data.
--Assuming current query is QC
With QC as (
...
)
select code, account, grouping,
--cash,
first_value(cash) over (partition by code, account order by grouping asc rows unbounded preceding) as cash_repeat,
perf,
--pos,
first_value(pos) over (partition by code, account order by grouping asc rows unbounded preceding) as pos_repeat
from QC
;
See first_value() help here: https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/FIRST_VALUE.html#GUID-D454EC3F-370C-4C64-9B11-33FCB10D95EC
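To see the repeat behaviour in isolation, here is a minimal sketch that runs the same FIRST_VALUE pattern against an in-memory SQLite database from Python. SQLite only stands in for Oracle here; the table qc, the column grouping_id (renamed from grouping to avoid a clash with a keyword) and all dates are made up for illustration. Note that FIRST_VALUE simply takes the first row in the window order, so this relies on the row with the lowest grouping value being the one that carries cash and pos.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE qc (code TEXT, account TEXT, grouping_id INTEGER,
                 cash TEXT, perf TEXT, pos TEXT);
-- Hypothetical pivot output: cash/pos only exist on the grouping_id = 0 row.
INSERT INTO qc VALUES
  ('ACOM', '52766',   0, '2021-03-31', NULL,         '2021-03-30'),
  ('ACOM', '52766', 113, NULL,         '2021-02-28', NULL),
  ('ACOM', '52766', 114, NULL,         '2021-01-31', NULL);
""")

rows = conn.execute("""
SELECT code, account, grouping_id,
       FIRST_VALUE(cash) OVER (PARTITION BY code, account ORDER BY grouping_id
                               ROWS UNBOUNDED PRECEDING) AS cash_repeat,
       perf,
       FIRST_VALUE(pos)  OVER (PARTITION BY code, account ORDER BY grouping_id
                               ROWS UNBOUNDED PRECEDING) AS pos_repeat
FROM qc
ORDER BY grouping_id
""").fetchall()

for row in rows:
    print(row)
```

Every output row for the account carries the same cash_repeat and pos_repeat values, while perf keeps its per-row value.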

How to divide a period in columns

I am trying to create a query where the first column shows the list of the companies and the other 3 columns their revenues per month. This is what I do:
WITH time_frame AS
(SELECT date_trunc('month',NOW())-interval '0 week'),
time_frame1 AS
(SELECT date_trunc('month',NOW())-interval '1 month'),
time_frame2 AS
(SELECT date_trunc('month',NOW())-interval '2 month')
select table1.company_name,
(CASE
WHEN table2.date_of_transaction = (SELECT * FROM time_frame2) THEN sum(table2.amount)
ELSE NULL
END) AS "current week - 2",
(CASE
WHEN table2.date_of_transaction = (SELECT * FROM time_frame1) THEN sum(table2.amount)
ELSE NULL
END) AS "current week - 1",
(CASE
WHEN table2.date_of_transaction = (SELECT * FROM time_frame) THEN
sum(table2.amount)
ELSE NULL
END) AS "current week"
from table1
join table2 on table2.table1_id = table1.id
where table1.company_joined >= '04-20-2019'
group by 1
When I execute the query, this error comes out:
Error running query: column "table2.date_of_transaction" must appear in the GROUP BY clause or be used in an aggregate function
LINE 15: WHEN table2.date_of_transaction = (SELECT * FROM time_frame) TH...
Do you have any ideas on how to solve it? Thank you.
|company name|month1|month2|
|------------|------|------|
|name 1      |£233  |£343  |
|name 2      |£243  |£34   |
|name 3      |£133  |£43   |

You can simplify the statement by using the FILTER clause on the aggregates:
select t1.company_name,
sum(t2.amount) filter (where t2.date_of_transaction = date_trunc('month',NOW())-interval '2 month'),
sum(t2.amount) filter (where t2.date_of_transaction = date_trunc('month',NOW())-interval '1 month'),
sum(t2.amount) filter (where t2.date_of_transaction = date_trunc('month',NOW()))
from table1 t1
join table2 t2 on t2.table1_id = t1.id
where t1.company_joined >= date '2019-04-20'
group by t1.company_name;
If you really want to put the date ranges into a CTE, you only need one:
with dates (r1, r2, r3) as (
values
(date_trunc('month',NOW())-interval '2 month',
date_trunc('month',NOW())-interval '1 month',
date_trunc('month',NOW()))
)
select t1.company_name,
sum(t2.amount) filter (where t2.date_of_transaction = d.r1),
sum(t2.amount) filter (where t2.date_of_transaction = d.r2),
sum(t2.amount) filter (where t2.date_of_transaction = d.r3)
from table1 t1
cross join dates d
join table2 t2 on t2.table1_id = t1.id
where t1.company_joined >= date '2019-04-20'
group by t1.company_name
;
The CTE dates returns a single row with three columns and thus the cross join doesn't change the resulting number of rows.
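As a rough, runnable illustration of the FILTER approach, the sketch below uses SQLite from Python (FILTER on aggregates needs SQLite 3.30 or later). The rows and the fixed months '2021-05' and '2021-06' are invented so the result is reproducible. One assumption differs from the queries above: each transaction is bucketed by its month with strftime rather than compared for equality with the first day of the month, which only matches rows dated exactly on that day. In PostgreSQL the corresponding expression would be date_trunc('month', t2.date_of_transaction).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, company_name TEXT);
CREATE TABLE table2 (table1_id INTEGER, date_of_transaction TEXT, amount INTEGER);
-- Invented sample data, chosen to reproduce the first two rows of the expected output.
INSERT INTO table1 VALUES (1, 'name 1'), (2, 'name 2');
INSERT INTO table2 VALUES
  (1, '2021-05-03', 200), (1, '2021-05-20', 33), (1, '2021-06-11', 343),
  (2, '2021-05-09', 243), (2, '2021-06-02', 34);
""")

# One aggregate per month; FILTER restricts which rows each SUM sees.
rows = conn.execute("""
SELECT t1.company_name,
       SUM(t2.amount) FILTER (WHERE strftime('%Y-%m', t2.date_of_transaction) = '2021-05') AS month1,
       SUM(t2.amount) FILTER (WHERE strftime('%Y-%m', t2.date_of_transaction) = '2021-06') AS month2
FROM table1 t1
JOIN table2 t2 ON t2.table1_id = t1.id
GROUP BY t1.company_name
ORDER BY t1.company_name
""").fetchall()

print(rows)  # [('name 1', 233, 343), ('name 2', 243, 34)]
```

Because the date test lives inside the aggregate, date_of_transaction no longer has to appear in GROUP BY, which is what the original error complained about.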

Checking Slowly Changing Dimension 2

I have a table that looks like this:
It is a slowly changing dimension type 2, according to Kimball.
Key is just a surrogate key, there only to make rows unique.
As you can see, there are three rows for product A.
The timelines for this product are OK. Over time, the description of the product changes.
From 1-1-2020 up until 4-1-2020 the description of this product was ProdA1.
From 5-1-2020 up until 12-2-2020 the description of this product was ProdA2, etc.
If you look at product B, you see there are gaps in the timeline.
We use DB2 V12 for z/OS. How can I check whether there are gaps in the timelines for each and every product?
I tried this, but it doesn't work:
with selectie (key, tel) as
(select product, count(*)
from PROD_TAB
group by product
having count(*) > 1)
Select * from
PROD_TAB A
inner join selectie B
on A.product = B.product
Where not exists
(SELECT 1 from PROD_TAB C
WHERE A.product = C.product
AND A.END_DATE + 1 DAY = C.START_DATE
)
Does anyone know the answer?

The following query returns all gaps for all products.
The idea is to enumerate (RN column) all periods inside each product by START_DATE and join each record with its next period record.
WITH
/*
MYTAB (PRODUCT, DESCRIPTION, START_DATE, END_DATE) AS
(
SELECT 'A', 'ProdA1', DATE('2020-01-01'), DATE('2020-01-04') FROM SYSIBM.SYSDUMMY1
UNION ALL SELECT 'A', 'ProdA2', DATE('2020-01-05'), DATE('2020-02-12') FROM SYSIBM.SYSDUMMY1
UNION ALL SELECT 'A', 'ProdA3', DATE('2020-02-13'), DATE('2020-12-31') FROM SYSIBM.SYSDUMMY1
UNION ALL SELECT 'B', 'ProdB1', DATE('2020-01-05'), DATE('2020-01-09') FROM SYSIBM.SYSDUMMY1
UNION ALL SELECT 'B', 'ProdB2', DATE('2020-01-12'), DATE('2020-03-14') FROM SYSIBM.SYSDUMMY1
UNION ALL SELECT 'B', 'ProdB3', DATE('2020-03-15'), DATE('2020-04-18') FROM SYSIBM.SYSDUMMY1
UNION ALL SELECT 'B', 'ProdB4', DATE('2020-04-16'), DATE('2020-05-03') FROM SYSIBM.SYSDUMMY1
)
,
*/
MYTAB_ENUM AS
(
SELECT
T.*
, ROW_NUMBER() OVER (PARTITION BY PRODUCT ORDER BY START_DATE) RN
FROM MYTAB T
)
SELECT A.PRODUCT, A.END_DATE + 1 START_DT, B.START_DATE - 1 END_DT
FROM MYTAB_ENUM A
JOIN MYTAB_ENUM B ON B.PRODUCT = A.PRODUCT AND B.RN = A.RN + 1
WHERE A.END_DATE + 1 <> B.START_DATE
AND A.END_DATE < B.START_DATE;
The result is:
|PRODUCT|START_DT |END_DT |
|-------|----------|----------|
|B |2020-01-10|2020-01-11|
A possibly more efficient way, using LAG:
WITH MYTAB2 AS
(
SELECT
T.*
, LAG(END_DATE) OVER (PARTITION BY PRODUCT ORDER BY START_DATE) END_DATE_PREV
FROM MYTAB T
)
SELECT PRODUCT, END_DATE_PREV + 1 START_DATE, START_DATE - 1 END_DATE
FROM MYTAB2
WHERE END_DATE_PREV + 1 <> START_DATE
AND END_DATE_PREV < START_DATE;
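If you want to try the LAG variant without a Db2 system at hand, here is a small sketch that runs the same idea in SQLite from Python, using the sample rows from the commented-out MYTAB block above. SQLite is only a stand-in: dates are stored as ISO text and the date arithmetic uses SQLite's date() function instead of Db2's date expressions. The two conditions are folded into one, since for whole days "previous end + 1 day is before this start" covers both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytab (product TEXT, description TEXT, start_date TEXT, end_date TEXT);
INSERT INTO mytab VALUES
  ('A', 'ProdA1', '2020-01-01', '2020-01-04'),
  ('A', 'ProdA2', '2020-01-05', '2020-02-12'),
  ('A', 'ProdA3', '2020-02-13', '2020-12-31'),
  ('B', 'ProdB1', '2020-01-05', '2020-01-09'),
  ('B', 'ProdB2', '2020-01-12', '2020-03-14'),
  ('B', 'ProdB3', '2020-03-15', '2020-04-18'),
  ('B', 'ProdB4', '2020-04-16', '2020-05-03');
""")

gaps = conn.execute("""
WITH prev AS (
  SELECT product, start_date,
         -- end date of the previous period of the same product (NULL for the first one)
         LAG(end_date) OVER (PARTITION BY product ORDER BY start_date) AS end_date_prev
  FROM mytab
)
SELECT product,
       date(end_date_prev, '+1 day') AS gap_start,
       date(start_date, '-1 day')    AS gap_end
FROM prev
-- a gap exists when the day after the previous end is still before this start;
-- overlaps (ProdB3/ProdB4) and first periods (NULL) do not match
WHERE date(end_date_prev, '+1 day') < start_date
""").fetchall()

print(gaps)  # [('B', '2020-01-10', '2020-01-11')]
```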

Thanks Mark, I will try this one of these days.
I had never heard of LAG in DB2 V12 for z/OS.
I will read about it.
Thanks

How can I combine the two select queries on the same table horizontally in Postgresql?

Hi everyone. I am a beginner with PostgreSQL and recently ran into a question.
I have one table named 'sales'.
create table sales
(
cust varchar(20),
prod varchar(20),
day integer,
month integer,
year integer,
state char(2),
quant integer
);
insert into sales values ('Bloom', 'Pepsi', 2, 12, 2001, 'NY', 4232);
insert into sales values ('Knuth', 'Bread', 23, 5, 2005, 'PA', 4167);
insert into sales values ('Emily', 'Pepsi', 22, 1, 2006, 'CT', 4404);
insert into sales values ('Emily', 'Fruits', 11, 1, 2000, 'NJ', 4369);
insert into sales values ('Helen', 'Milk', 7, 11, 2006, 'CT', 210);
insert into sales values ('Emily', 'Soap', 2, 4, 2002, 'CT', 2549);
insert into sales values ('Bloom', 'Eggs', 30, 11, 2000, 'NJ', 559);
....
There are 498 rows in total.
Here is the overview of this table:
Now I want to compute the maximum and minimum sales quantities for each product, along with their corresponding customer (who purchased the product), dates (i.e., dates of those maximum and minimum sales quantities) and the state in which the sale transaction took place.
And the average sales quantity for the corresponding products.
The combined one should be like this:
It should have 10 rows because there are 10 distinct products in total.
I have tried:
select prod,
max(quant),
cust as MAX_CUST
from sales
group by prod;
but it returned an error saying that cust should be in the GROUP BY. However, I only want to group by the type of product.
What's more, how can I horizontally combine the max_q and its customer, date, state with min_q and its customer, date, state and also the AVG_Q by their product name?
I feel really confused!

You can use analytic function ROW_NUMBER to rank records by increasing/decreasing sales for each product in a subquery, and then do conditional aggregation:
SELECT
prod product,
MAX(CASE WHEN rn2 = 1 THEN quant END) max_quant,
MAX(CASE WHEN rn2 = 1 THEN cust END) max_cust,
MAX(CASE WHEN rn2 = 1 THEN TO_DATE(year || '-' || month || '-' || day, 'YYYY-MM-DD') END) max_date,
MAX(CASE WHEN rn2 = 1 THEN state END) max_state,
MAX(CASE WHEN rn1 = 1 THEN quant END) min_quant,
MAX(CASE WHEN rn1 = 1 THEN cust END) min_cust,
MAX(CASE WHEN rn1 = 1 THEN TO_DATE(year || '-' || month || '-' || day, 'YYYY-MM-DD') END) min_date,
MAX(CASE WHEN rn1 = 1 THEN state END) min_state,
avg_quant
FROM (
SELECT
s.*,
ROW_NUMBER() OVER(PARTITION BY prod ORDER BY quant) rn1,
ROW_NUMBER() OVER(PARTITION BY prod ORDER BY quant DESC) rn2,
AVG(quant) OVER(PARTITION BY prod) avg_quant
FROM sales s
) x
WHERE rn1 = 1 OR rn2 = 1
GROUP BY prod, avg_quant
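Here is a cut-down sketch of the same ROW_NUMBER plus conditional aggregation pattern, run against SQLite from Python so it can be tried without a server. It uses four of the sample rows from the question and leaves out the date and state columns to stay short. The names rn_asc and rn_desc are my own stand-ins for rn1 and rn2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (cust TEXT, prod TEXT, day INTEGER, month INTEGER,
                    year INTEGER, state TEXT, quant INTEGER);
INSERT INTO sales VALUES
  ('Bloom', 'Pepsi',  2, 12, 2001, 'NY', 4232),
  ('Knuth', 'Bread', 23,  5, 2005, 'PA', 4167),
  ('Emily', 'Pepsi', 22,  1, 2006, 'CT', 4404),
  ('Helen', 'Milk',   7, 11, 2006, 'CT',  210);
""")

rows = conn.execute("""
SELECT prod,
       MAX(CASE WHEN rn_desc = 1 THEN quant END) AS max_quant,
       MAX(CASE WHEN rn_desc = 1 THEN cust  END) AS max_cust,
       MAX(CASE WHEN rn_asc  = 1 THEN quant END) AS min_quant,
       MAX(CASE WHEN rn_asc  = 1 THEN cust  END) AS min_cust,
       avg_quant
FROM (
  SELECT s.*,
         ROW_NUMBER() OVER (PARTITION BY prod ORDER BY quant)      AS rn_asc,   -- 1 = smallest sale
         ROW_NUMBER() OVER (PARTITION BY prod ORDER BY quant DESC) AS rn_desc,  -- 1 = largest sale
         AVG(quant)   OVER (PARTITION BY prod)                     AS avg_quant
  FROM sales s
) x
WHERE rn_asc = 1 OR rn_desc = 1
GROUP BY prod, avg_quant
ORDER BY prod
""").fetchall()

for row in rows:
    print(row)
```

For Pepsi this yields Emily's 4404 as the maximum and Bloom's 4232 as the minimum on a single row. Products with only one sale show the same customer on both sides.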

Applying two aggregate functions (min and max) to a column and selecting the respective rows is not that straightforward. If you wanted only one aggregate function, you could do something like the example below with dense_rank (a window function). The rank has to be computed in a subquery, because a window function alias cannot be referenced in the WHERE clause of the same query level.
SELECT prod, quant, cust
FROM (SELECT prod, quant, cust,
dense_rank() OVER (PARTITION BY prod ORDER BY quant DESC) AS c_rank
FROM sales) ranked
WHERE c_rank < 2;
This gives you the rows with the maximum quant for each product. You can do the same for the minimum quant by ordering ascending. It is more complicated to do both in the same query; a simple way is to create on-the-fly tables (CTEs) for each case and join them, as shown below.
with max_quant as (
SELECT prod, quant, cust
FROM (SELECT prod, quant, cust,
dense_rank() OVER (PARTITION BY prod ORDER BY quant DESC) AS c_rank
FROM sales) ranked
WHERE c_rank < 2
),
min_quant as (
SELECT prod, quant, cust
FROM (SELECT prod, quant, cust,
dense_rank() OVER (PARTITION BY prod ORDER BY quant ASC) AS c_rank
FROM sales) ranked
WHERE c_rank < 2
),
avg_quant as (
select prod, avg(quant) as avg_quant from sales group by prod
)
select mx.prod, mx.quant as max_quant, mx.cust as max_cust, mn.quant as min_quant, mn.cust as min_cust, ag.avg_quant
from max_quant mx
join min_quant mn on mn.prod = mx.prod
join avg_quant ag on ag.prod = mx.prod;
You can't use a plain GROUP BY to select the min/max here, because you want the complete row for the min/max value of quant, which is not possible directly with GROUP BY.

How to show the maximum number for each combination of customer and product in a specific state in Postgresql?

I recently began learning PostgreSQL.
I have a table named 'sales':
create table sales
(
cust varchar(20),
prod varchar(20),
day integer,
month integer,
year integer,
state char(2),
quant integer
);
insert into sales values ('Bloom', 'Pepsi', 2, 12, 2001, 'NY', 4232);
insert into sales values ('Knuth', 'Bread', 23, 5, 2005, 'PA', 4167);
insert into sales values ('Emily', 'Pepsi', 22, 1, 2006, 'CT', 4404);
insert into sales values ('Emily', 'Fruits', 11, 1, 2000, 'NJ', 4369);
insert into sales values ('Helen', 'Milk', 7, 11, 2006, 'CT', 210);
......
It looks like this:
And there are 500 rows in total.
Now I want to use the query to implement this:
For each combination of customer and product, output the maximum sales quantities for
NY and minimum sales quantities for NJ and CT in 3 separate columns. Like the first
report, display the corresponding dates (i.e., dates of those maximum and minimum sales
quantities). Furthermore, for CT and NJ, include only the sales that occurred after 2000;
for NY, include all sales.
It should be like this:
I have tried the following query:
SELECT
cust customer,
prod product,
MAX(CASE WHEN rn3 = 1 THEN quant END) NY_MAX,
MAX(CASE WHEN rn3 = 1 THEN TO_DATE(year || '-' || month || '-' || day, 'YYYY-MM-DD') END) date,
MIN(CASE WHEN rn2 = 1 THEN quant END) NJ_MIN,
MIN(CASE WHEN rn2 = 1 THEN TO_DATE(year || '-' || month || '-' || day, 'YYYY-MM-DD') END) date,
MIN(CASE WHEN rn1 = 1 THEN quant END) CT_MIN,
MIN(CASE WHEN rn1 = 1 THEN TO_DATE(year || '-' || month || '-' || day, 'YYYY-MM-DD') END) date
FROM (
SELECT
*,
ROW_NUMBER() OVER(PARTITION BY cust, prod ORDER BY quant) rn1,
ROW_NUMBER() OVER(PARTITION BY cust, prod ORDER BY quant) rn2,
ROW_NUMBER() OVER(PARTITION BY cust, prod ORDER BY quant DESC) rn3
FROM sales
) x
WHERE rn1 = 1 OR rn2 = 1 or rn3 = 1
GROUP BY cust, prod;
This is the result:
This is wrong because it shows me the maximum and minimum across all states, not for the specific state I want. I also have no idea how to deal with the year as the question asks me to do.

We can handle this using separate CTEs, one per state, along with a CTE that lists every customer/product combination:
WITH custprod AS (
SELECT DISTINCT cust, prod
FROM sales
),
ny_sales AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY cust, prod ORDER BY quant DESC) rn
FROM sales
WHERE state = 'NY'
),
nj_sales AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY cust, prod ORDER BY quant) rn
FROM sales
WHERE state = 'NJ' AND year > 2000 -- NJ and CT: only sales after 2000
),
ct_sales AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY cust, prod ORDER BY quant) rn
FROM sales
WHERE state = 'CT' AND year > 2000
)
SELECT
cp.cust,
cp.prod,
nys.quant AS ny_max,
nys.year::text || '-' || nys.month::text || '-' || nys.day::text AS ny_date,
njs.quant AS nj_min,
njs.year::text || '-' || njs.month::text || '-' || njs.day::text AS nj_date,
cts.quant AS ct_min,
cts.year::text || '-' || cts.month::text || '-' || cts.day::text AS ct_date
FROM custprod cp
LEFT JOIN ny_sales nys
ON cp.cust = nys.cust AND cp.prod = nys.prod AND nys.rn = 1
LEFT JOIN nj_sales njs
ON cp.cust = njs.cust AND cp.prod = njs.prod AND njs.rn = 1
LEFT JOIN ct_sales cts
ON cp.cust = cts.cust AND cp.prod = cts.prod AND cts.rn = 1
ORDER BY
cp.cust,
cp.prod;
Note: You didn't provide comprehensive sample data, but the above seems to be working in a demo.
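For anyone who wants to check the per-state logic locally, here is a reduced sketch of the per-state CTE approach, run in SQLite from Python. It covers only NY (maximum, all years) and CT (minimum), and it assumes that "after 2000" means year > 2000. Apart from the Bloom/Pepsi/NY and Knuth/Bread/PA rows, the data is invented so that one customer/product pair has sales in both states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (cust TEXT, prod TEXT, day INTEGER, month INTEGER,
                    year INTEGER, state TEXT, quant INTEGER);
INSERT INTO sales VALUES
  ('Bloom', 'Pepsi',  2, 12, 2001, 'NY', 4232),
  ('Bloom', 'Pepsi',  5,  3, 2004, 'NY',  100),  -- invented
  ('Bloom', 'Pepsi',  1,  1, 2000, 'CT',   50),  -- invented, excluded by the year filter
  ('Bloom', 'Pepsi',  9,  9, 2003, 'CT',  700),  -- invented
  ('Bloom', 'Pepsi',  4,  4, 2005, 'CT',  900),  -- invented
  ('Knuth', 'Bread', 23,  5, 2005, 'PA', 4167);
""")

rows = conn.execute("""
WITH custprod AS (
  SELECT DISTINCT cust, prod FROM sales
),
ny AS (  -- rank NY sales from largest to smallest
  SELECT *, ROW_NUMBER() OVER (PARTITION BY cust, prod ORDER BY quant DESC) AS rn
  FROM sales WHERE state = 'NY'
),
ct AS (  -- rank CT sales after 2000 from smallest to largest
  SELECT *, ROW_NUMBER() OVER (PARTITION BY cust, prod ORDER BY quant) AS rn
  FROM sales WHERE state = 'CT' AND year > 2000
)
SELECT cp.cust, cp.prod, ny.quant AS ny_max, ct.quant AS ct_min
FROM custprod cp
LEFT JOIN ny ON ny.cust = cp.cust AND ny.prod = cp.prod AND ny.rn = 1
LEFT JOIN ct ON ct.cust = cp.cust AND ct.prod = cp.prod AND ct.rn = 1
ORDER BY cp.cust, cp.prod
""").fetchall()

print(rows)  # [('Bloom', 'Pepsi', 4232, 700), ('Knuth', 'Bread', None, None)]
```

Filtering by state (and year) inside each CTE before ranking is what keeps the ranks per state, and the LEFT JOINs keep combinations such as Knuth/Bread that have no sales in the requested states.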
