Hide some datetimes in a query - postgresql

I have a query that returns me this result:
| DATE                | VALUE1  | VALUE2 |
| 2016-09-20 11:15:00 | 5653856 | 37580  |
| 2016-09-20 11:16:00 | NULL    | NULL   |
| 2016-09-20 11:18:00 | NULL    | NULL   |
| 2016-09-20 11:20:00 | NULL    | NULL   |
| 2016-09-20 11:30:00 | 5653860 | 37580  |
| 2016-09-20 11:32:00 | NULL    | NULL   |
| 2016-09-20 11:34:00 | NULL    | NULL   |
In this table, only the records at xx:00, xx:15, xx:30 and xx:45 have values; all other records are null.
How can I add a condition to my query so that it returns only the records at minutes 00, 15, 30 and 45 and does not show the others?
This is the query:
SELECT t.date,
MAX(CASE WHEN t.id= '924' THEN t.value END) - MAX(CASE WHEN t.id= '925' THEN t.value END) as IMA_71,
MAX(CASE WHEN t.id= '930' THEN t.value END) as IMA_73
FROM records t
where office=10
and date between '2016-09-20 11:15:00' and '2016-10-21 11:15:00'
GROUP BY t.office,t.date order by t.date asc;

You could use extract to determine the minute, and filter on that:
where extract('minute' from t.date) in (0, 15, 30, 45)
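As a rough illustration of the same filter, here is a small sketch that runs against an in-memory SQLite database from Python. SQLite is only a stand-in here: it has no extract(), so strftime('%M', ...) is swapped in to get the minute. The simplified records table and its value column are assumptions for the demo, not the asker's real schema.

```python
import sqlite3

# In-memory stand-in for the question's data (timestamps taken from the post).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [
        ("2016-09-20 11:15:00", 5653856),
        ("2016-09-20 11:16:00", None),
        ("2016-09-20 11:18:00", None),
        ("2016-09-20 11:20:00", None),
        ("2016-09-20 11:30:00", 5653860),
        ("2016-09-20 11:32:00", None),
    ],
)

# SQLite has no extract(); strftime('%M', ...) returns the minute as text,
# so it is cast to an integer before the IN comparison.
quarter_hour_rows = conn.execute(
    """
    SELECT date, value
    FROM records
    WHERE CAST(strftime('%M', date) AS INTEGER) IN (0, 15, 30, 45)
    ORDER BY date
    """
).fetchall()

print(quarter_hour_rows)
```

Only the 11:15 and 11:30 rows survive the filter; in PostgreSQL the extract(...) condition shown above plays the same role.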

Related

Postgresql, set order by desc or asc depending on variable parse into function

I have a function that takes product pricing data from today and yesterday and works out the difference, orders it by price_delta_percentage and then limits to 5. Now currently I order by price_delta_percentage DESC which returns the top 5 products that have increased in price since yesterday.
I would like to pass in a variable, sort, to make the function sort by either DESC or ASC. I have tried IF statements, which give syntax errors, and CASE statements, which report that price_delta_percentage doesn't exist.
Script:
RETURNS TABLE(
product_id varchar,
name varchar,
price_today numeric,
price_yesterday numeric,
price_delta numeric,
price_delta_percentage numeric
)
LANGUAGE 'sql'
COST 100
STABLE STRICT PARALLEL SAFE
AS $BODY$
WITH cte AS (
SELECT
product_id,
name,
SUM(CASE WHEN rank = 1 THEN trend_price ELSE NULL END) price_today,
SUM(CASE WHEN rank = 2 THEN trend_price ELSE NULL END) price_yesterday,
SUM(CASE WHEN rank = 1 THEN trend_price ELSE 0 END) - SUM(CASE WHEN rank = 2 THEN trend_price ELSE 0 END) as price_delta,
ROUND(((SUM(CASE WHEN rank = 1 THEN trend_price ELSE NULL END) / SUM(CASE WHEN rank = 2 THEN trend_price ELSE NULL END) - 1) * 100), 2) as price_delta_percentage
FROM (
SELECT
magic_sets_cards.name,
pricing.product_id,
pricing.trend_price,
pricing.date,
RANK() OVER (PARTITION BY product_id ORDER BY date DESC) AS rank
FROM pricing
JOIN magic_sets_cards_identifiers ON magic_sets_cards_identifiers.mcm_id = pricing.product_id
JOIN magic_sets_cards ON magic_sets_cards.id = magic_sets_cards_identifiers.card_id
JOIN magic_sets ON magic_sets.id = magic_sets_cards.set_id
WHERE date BETWEEN CURRENT_DATE - days AND CURRENT_DATE
AND magic_sets.code = set_code
AND pricing.trend_price > 0.25) p
WHERE rank IN (1,2)
GROUP BY product_id, name
ORDER BY price_delta_percentage DESC)
SELECT * FROM cte WHERE (CASE WHEN price_today IS NULL OR price_yesterday IS NULL THEN 'NULL' ELSE 'VALID' END) !='NULL'
LIMIT 5;
$BODY$;
CASE Statement:
ORDER BY CASE WHEN sort = 'DESC' THEN price_delta_percentage END DESC, CASE WHEN sort = 'ASC' THEN price_delta_percentage END ASC)
Error:
ERROR: column "price_delta_percentage" does not exist
LINE 42: ORDER BY CASE WHEN sort = 'DESC' THEN price_delta_percenta...
You can't use CASE to decide between ASC and DESC like that. Those labels are not data; they are part of the SQL grammar. You would need to build the query text as a string and then execute it as a dynamic query, which means you would need to use PL/pgSQL, not SQL.
But since your column is numeric, you could just order by the product of the column and an indicator variable which is either 1 or -1.
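The indicator-variable idea can be sketched like this, using Python with an in-memory SQLite database as a stand-in for PostgreSQL. The deltas table, its rows and the top_movers helper are made up for illustration; in the original function the multiplier would presumably be derived from the sort parameter instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE deltas (product_id TEXT, price_delta_percentage REAL)"
)
conn.executemany(
    "INSERT INTO deltas VALUES (?, ?)",
    [("a", 12.5), ("b", -3.0), ("c", 40.0), ("d", 0.5)],
)


def top_movers(sort, limit=2):
    """Return the first `limit` rows ordered by the delta, ASC or DESC."""
    # -1 flips the sign, so an ascending sort of the product
    # is a descending sort of the original column.
    direction = -1 if sort.upper() == "DESC" else 1
    return conn.execute(
        "SELECT product_id, price_delta_percentage FROM deltas "
        "ORDER BY price_delta_percentage * ? LIMIT ?",
        (direction, limit),
    ).fetchall()


print(top_movers("DESC"))
print(top_movers("ASC"))
```

The ORDER BY clause itself never changes; only the sign of the sort key does, so no dynamic SQL is needed. This only works because the column is numeric.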

Count Distinct with Answer side by side instead of underneath

Here is my query:
SELECT substring(date,1,10), count(distinct id),
CASE WHEN name IS NOT NULL THEN 1 ELSE 0 END
FROM table
WHERE (date >= '2015-09-01')
GROUP BY substring(date,1,10), CASE WHEN name IS NOT NULL THEN 1 ELSE 0 END
ORDER BY substring(date,1,10)
This is my result:
substring count case
2015-09-01 20472 0
2015-09-01 7 1
2015-09-02 20465 0
2015-09-02 470 1
What I want it to look like is this:
substring count count
2015-09-01 20472 7
2015-09-02 20465 470
Thank you!
With PostgreSQL 9.4 or newer, we can filter an aggregate directly with the FILTER clause:
SELECT substring(date,1,10),
count(distinct id),
count(*) FILTER (WHERE name IS NOT NULL)
FROM table
WHERE (date >= '2015-09-01')
GROUP BY 1
ORDER BY 1
SELECT substring(date,1,10)
, count(distinct CASE WHEN name IS NOT NULL THEN id ELSE null END ) AS count1
, count(distinct CASE WHEN name IS NOT NULL THEN null ELSE id END ) AS count2
FROM event
WHERE (date >= '2015-09-01')
GROUP BY substring(date,1,10)
ORDER BY substring(date,1,10)
This gave me an answer like this: (which is exactly what I wanted so thank you so much)
substring count1 count2
2015-09-01 7 20472
2015-09-02 470 20465
Use CASE inside count to get one column per condition (here, name IS NOT NULL), like this:
SELECT substring(date,1,10)
, count(distinct CASE WHEN name IS NOT NULL THEN id ELSE null END ) AS count1
, count(distinct CASE WHEN name IS NOT NULL THEN null ELSE id END ) AS count2
FROM table
WHERE (date >= '2015-09-01')
GROUP BY substring(date,1,10)
ORDER BY substring(date,1,10)
You can also use a subquery to create the columns:
SELECT dt, Count(id1) count1, Count(distinct id2) count2
FROM (
SELECT distinct substring(date,1,10) AS dt
, CASE WHEN name IS NOT NULL THEN id ELSE null END AS id1
, CASE WHEN name IS NOT NULL THEN null ELSE id END AS id2
FROM table
WHERE (date >= '2015-09-01')) d
GROUP BY dt
ORDER BY dt
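To see the count(distinct CASE ...) approach produce the side-by-side layout, here is a small sketch run against an in-memory SQLite database from Python. The rows are invented, and substr() stands in for substring(); the event table name follows the asker's follow-up query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (date TEXT, id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO event VALUES (?, ?, ?)",
    [
        ("2015-09-01 08:00:00", 1, None),
        ("2015-09-01 09:00:00", 1, None),  # repeated id, counted once
        ("2015-09-01 10:00:00", 2, None),
        ("2015-09-01 11:00:00", 3, "x"),
        ("2015-09-02 08:00:00", 4, None),
        ("2015-09-02 09:00:00", 5, "y"),
        ("2015-09-02 10:00:00", 6, "z"),
    ],
)

# Each CASE yields the id only for rows matching its condition and NULL
# otherwise; count(DISTINCT ...) ignores the NULLs, giving one column
# per condition on the same output row.
side_by_side = conn.execute(
    """
    SELECT substr(date, 1, 10) AS day,
           count(DISTINCT CASE WHEN name IS NOT NULL THEN id END) AS count1,
           count(DISTINCT CASE WHEN name IS NULL THEN id END) AS count2
    FROM event
    WHERE date >= '2015-09-01'
    GROUP BY substr(date, 1, 10)
    ORDER BY 1
    """
).fetchall()

print(side_by_side)
```

Each day now appears once, with the named and unnamed distinct counts next to each other.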

Redshift PostgreSQL Distinct ON Operator

I have a data set that I want to analyze for multi-touch attribution. The data set is made up of leads who responded to a marketing campaign, along with their marketing source.
Each lead can respond to multiple campaigns and I want to get their first marketing source and their last marketing source in the same table.
I was thinking I could create two tables and use a select statement from both.
The first table would attempt to create a table with the most recent marketing source from every person (using email as their unique ID).
create table temp.multitouch1 as (
select distinct on (email) email, date, market_source as last_source
from sf.campaignmember
where date >= '1/1/2016' ORDER BY DATE DESC);
Then I would create a table with deduped emails but this time for the first source.
create table temp.multitouch2 as (
select distinct on (email) email, date, market_source as first_source
from sf.campaignmember
where date >= '1/1/2016' ORDER BY DATE ASC);
Finally I wanted to simply select the email and join the first and last market sources to it each in their own column.
select a.email, a.last_source, b.first_source, a.date
from temp.multitouch1 a
left join temp.multitouch2 b on b.email = a.email
Since DISTINCT ON doesn't work in Redshift's PostgreSQL dialect, I was hoping someone had an idea for solving this in another way.
EDIT 2/22: For more context, I'm dealing with people and the campaigns they've responded to. Each record is a "campaign response", and every person can have more than one campaign response with multiple sources. I'm trying to make a select statement that dedupes by person and then has columns for the first and the last campaign/marketing source they've responded to.
EDIT 2/24: Ideal output is a table with 4 columns: email, last_source, first_source, date.
The first and last source columns would be the same for people with only 1 campaign member record and different for everyone who has more than 1 campaign member record.
I believe you could use row_number() inside case expressions like this:
SELECT
email
, MIN(first_source) AS first_source
, MIN(date) first_date
, MAX(last_source) AS last_source
, MAX(date) AS last_date
FROM (
SELECT
email
, date
, CASE
WHEN ROW_NUMBER() OVER (PARTITION BY email ORDER BY date ASC) = 1 THEN market_source
ELSE NULL
END AS first_source
, CASE
WHEN ROW_NUMBER() OVER (PARTITION BY email ORDER BY date DESC) = 1 THEN market_source
ELSE NULL
END AS last_source
FROM sf.campaignmember
WHERE date >= '2016-01-01'
) s
WHERE first_source IS NOT NULL
OR last_source IS NOT NULL
GROUP BY
email
tested here: SQL Fiddle
PostgreSQL 9.3 Schema Setup:
CREATE TABLE campaignmember
(email varchar(3), date timestamp, market_source varchar(1))
;
INSERT INTO campaignmember
(email, date, market_source)
VALUES
('a#a', '2016-01-02 00:00:00', 'x'),
('a#a', '2016-01-03 00:00:00', 'y'),
('a#a', '2016-01-04 00:00:00', 'z'),
('b#b', '2016-01-02 00:00:00', 'x')
;
Query 1:
SELECT
email
, MIN(first_source) AS first_source
, MIN(date) first_date
, MAX(last_source) AS last_source
, MAX(date) AS last_date
FROM (
SELECT
email
, date
, CASE
WHEN ROW_NUMBER() OVER (PARTITION BY email ORDER BY date ASC) = 1 THEN market_source
ELSE NULL
END AS first_source
, CASE
WHEN ROW_NUMBER() OVER (PARTITION BY email ORDER BY date DESC) = 1 THEN market_source
ELSE NULL
END AS last_source
FROM campaignmember
WHERE date >= '2016-01-01'
) s
WHERE first_source IS NOT NULL
OR last_source IS NOT NULL
GROUP BY
email
Results:
| email | first_source | first_date | last_source | last_date |
|-------|--------------|---------------------------|-------------|---------------------------|
| a#a | x | January, 02 2016 00:00:00 | z | January, 04 2016 00:00:00 |
| b#b | x | January, 02 2016 00:00:00 | x | January, 02 2016 00:00:00 |
A small extension to the request: count the number of contact points.
SELECT
email
, MIN(first_source) AS first_source
, MIN(date) first_date
, MAX(last_source) AS last_source
, MAX(date) AS last_date
, MAX(numof) AS Numberof_Contacts
FROM (
SELECT
email
, date
, CASE
WHEN ROW_NUMBER() OVER (PARTITION BY email ORDER BY date ASC) = 1 THEN market_source
ELSE NULL
END AS first_source
, CASE
WHEN ROW_NUMBER() OVER (PARTITION BY email ORDER BY date DESC) = 1 THEN market_source
ELSE NULL
END AS last_source
, COUNT(*) OVER (PARTITION BY email) as numof
FROM campaignmember
WHERE date >= '2016-01-01'
) s
WHERE first_source IS NOT NULL
OR last_source IS NOT NULL
GROUP BY
email
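For anyone who wants to try the ROW_NUMBER() approach without a fiddle, here is a sketch that replays the sample data above in an in-memory SQLite database from Python. SQLite is only a stand-in for Redshift here, and it needs version 3.25 or newer for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE campaignmember (email TEXT, date TEXT, market_source TEXT)"
)
conn.executemany(
    "INSERT INTO campaignmember VALUES (?, ?, ?)",
    [
        ("a#a", "2016-01-02 00:00:00", "x"),
        ("a#a", "2016-01-03 00:00:00", "y"),
        ("a#a", "2016-01-04 00:00:00", "z"),
        ("b#b", "2016-01-02 00:00:00", "x"),
    ],
)

# The inner query tags the earliest and latest row per email; the outer
# query collapses those tagged rows into a single row per email.
first_last = conn.execute(
    """
    SELECT email,
           MIN(first_source) AS first_source,
           MIN(date) AS first_date,
           MAX(last_source) AS last_source,
           MAX(date) AS last_date,
           MAX(numof) AS numberof_contacts
    FROM (
        SELECT email, date,
               CASE WHEN ROW_NUMBER() OVER (PARTITION BY email ORDER BY date ASC) = 1
                    THEN market_source END AS first_source,
               CASE WHEN ROW_NUMBER() OVER (PARTITION BY email ORDER BY date DESC) = 1
                    THEN market_source END AS last_source,
               COUNT(*) OVER (PARTITION BY email) AS numof
        FROM campaignmember
        WHERE date >= '2016-01-01'
    ) s
    WHERE first_source IS NOT NULL OR last_source IS NOT NULL
    GROUP BY email
    ORDER BY email
    """
).fetchall()

for row in first_last:
    print(row)
```

For a#a this gives x as the first source, z as the last source and 3 contacts; b#b has a single record, so both sources are x.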
You can use the good old left join groupwise maximum.
SELECT DISTINCT c1.email, c1.date, c1.market_source
FROM sf.campaignmember c1
LEFT JOIN sf.campaignmember c2
ON c1.email = c2.email AND c2.date >= '1/1/2016'
AND (c1.date > c2.date OR (c1.date = c2.date AND c1.id > c2.id))
LEFT JOIN sf.campaignmember c3
ON c1.email = c3.email AND c3.date >= '1/1/2016'
AND (c1.date < c3.date OR (c1.date = c3.date AND c1.id < c3.id))
WHERE c1.date >= '1/1/2016'
AND (c2.email IS NULL OR c3.email IS NULL)
This assumes you have a unique id column; if (date, email) is unique, id is not needed.

Teradata Aggregate Function GROUP BY

Hi, I am getting the error GROUP BY AND WITH BY CLAUSE MAY NOT CONTAIN AGGREGATE FUNCTIONS for the query below.
SELECT
distinct CC.CASE_ID as CASE_ID,
/*FIRST_VALUE(CC.CASE_OWN_NM) OVER(PARTITION BY CC.CASE_ID )as FST_AGNT_CASE_OWN_NM,
FIRST_VALUE(CC.LSTMOD_BY_AGNT_PRFL_NM) OVER(PARTITION BY CC.CASE_ID)as FST_AGNT_PRFL_NM,
LAST_VALUE(CC.CASE_OWN_NM) OVER(PARTITION BY CC.CASE_ID) as LST_AGNT_CASE_OWN_NM,
LAST_VALUE(CC.LSTMOD_BY_AGNT_PRFL_NM) OVER(PARTITION BY CC.CASE_ID) as LST_AGNT_PRFL_NM,*/
case when CC.CASE_OWN_NM is not null then MIN(CC.REC_DTTM_PST) end as FST_AGNT_EDIT_DTTM,
case when CC.CASE_OWN_NM is not null then MAX(CC.REC_DTTM_PST) end as LST_AGNT_EDIT_DTTM,
case when CC.CASE_STS_CD='Open' then MIN(CC.REC_DTTM_PST) end as CASE_OPEN_DTTM,
case when CC.CASE_STS_CD in ( 'Closed', 'Auto Closed') then MIN(CC.REC_DTTM_PST) end as CASE_CLSE_OR_AUTO_CLSE_DTTM
--CC.PU_DTTM as LMI_PU_DTTM,
--CC.CLS_DTTM as LMI_CLS_DTTM
FROM EDW_KATAMARI_T.CNTCT_CASE CC
INNER JOIN EDW_KATAMARI_T.CNTCT_CASE_EXTN CCE
ON CC.CNTCT_CASE_APND_KEY = CCE.CNTCT_CASE_APND_KEY
INNER JOIN EDW_STAGE_COMN_SRC.STG_CNTCT_CASE_DELTA DELTA
on CC.CASE_ID = DELTA.CASE_ID
where CC.CASE_ID='23268760'
group by 1,2,3,4,5
When I use only GROUP BY 1, it still fails, saying that non-aggregate values must be part of the GROUP BY.
You need to move the CASEs into the aggregate:
MIN(CASE WHEN CC.CASE_OWN_NM IS NOT NULL THEN CC.REC_DTTM_PST END) AS FST_AGNT_EDIT_DTTM,
MAX(CASE WHEN CC.CASE_OWN_NM IS NOT NULL THEN CC.REC_DTTM_PST END) AS LST_AGNT_EDIT_DTTM,
MIN(CASE WHEN CC.CASE_STS_CD='Open' THEN CC.REC_DTTM_PST END) AS CASE_OPEN_DTTM,
MIN(CASE WHEN CC.CASE_STS_CD IN ( 'Closed', 'Auto Closed') THEN CC.REC_DTTM_PST END) AS CASE_CLSE_OR_AUTO_CLSE_DTTM
Then GROUP BY 1 will work, because every selected column other than CASE_ID is now an aggregate.
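A minimal sketch of the CASE-inside-the-aggregate pattern, run against an in-memory SQLite database from Python rather than Teradata. The case_log table, its shortened column names and its rows are hypothetical; only the shape of the query mirrors the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE case_log "
    "(case_id TEXT, case_own_nm TEXT, case_sts_cd TEXT, rec_dttm TEXT)"
)
conn.executemany(
    "INSERT INTO case_log VALUES (?, ?, ?, ?)",
    [
        ("23268760", None, "Open", "2016-01-01 09:00:00"),
        ("23268760", "alice", "Open", "2016-01-01 10:00:00"),
        ("23268760", "bob", "Closed", "2016-01-02 12:00:00"),
        ("23268760", "bob", "Auto Closed", "2016-01-03 08:00:00"),
    ],
)

# The CASE runs per row and feeds NULL to the aggregate for rows that do
# not match, so only case_id needs to appear in GROUP BY.
summary = conn.execute(
    """
    SELECT case_id,
           MIN(CASE WHEN case_own_nm IS NOT NULL THEN rec_dttm END) AS fst_agnt_edit_dttm,
           MAX(CASE WHEN case_own_nm IS NOT NULL THEN rec_dttm END) AS lst_agnt_edit_dttm,
           MIN(CASE WHEN case_sts_cd = 'Open' THEN rec_dttm END) AS case_open_dttm,
           MIN(CASE WHEN case_sts_cd IN ('Closed', 'Auto Closed')
                    THEN rec_dttm END) AS case_clse_dttm
    FROM case_log
    GROUP BY case_id
    """
).fetchone()

print(summary)
```

The result is one row per case, with each timestamp picked from only the rows that satisfy its own condition.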

Count based on Or is not differentiating the count

My results are showing both counts the same but there should be some that have different counts as CarCode is sometimes null.
SELECT distinct car.carKey,
car.Weight,
car.CarCode,
COUNT(car.carKey)OVER(PARTITION BY car.carKey) AS TotalCarKeyCount,
COUNT(Case When (car.[Weight] IS not null) and (car.CarCode is null)
then 0
else car.carKey End) OVER(PARTITION BY car.carKey) AS CarCountWithoutCode
from car
The results always show TotalCarKeyCount and CarCountWithoutCode with the same counts, as if the CASE expression isn't working.
It sounds like you might want to use SUM() instead:
SELECT distinct car.carKey,
car.Weight,
car.CarCode,
COUNT(car.carKey)OVER(PARTITION BY car.carKey) AS TotalCarKeyCount,
SUM(Case When (car.[Weight] IS not null) and (car.CarCode is null)
then 0 else 1 End) OVER(PARTITION BY car.carKey) AS CarCountWithoutCode
from car
SQL Fiddle demo showing the difference between using COUNT() and SUM():
create table test
(
id int
);
insert into test values
(1), (null), (23), (4), (2);
select
count(case when id is null then 0 else id end) [count],
sum(case when id is null then 0 else 1 end) [sum]
from test;
COUNT returns 5 and SUM returns 4. Alternatively, you can have the CASE inside COUNT() return null, so that those rows are excluded from the final count:
select
count(case when id is null then null else id end) [count],
sum(case when id is null then 0 else 1 end) [sum]
from test;
Your query would be:
SELECT distinct car.carKey,
car.Weight,
car.CarCode,
COUNT(car.carKey)OVER(PARTITION BY car.carKey) AS TotalCarKeyCount,
COUNT(Case When (car.[Weight] IS not null) and (car.CarCode is null)
then null else 1 End) OVER(PARTITION BY car.carKey) AS CarCountWithoutCode
from car
Change the then 0 to then null. Zero values are counted, nulls are not.
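The difference is easy to check with the test data from the fiddle above. Here is a small sketch using an in-memory SQLite database from Python as a stand-in for SQL Server; the column aliases are my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?)", [(1,), (None,), (23,), (4,), (2,)]
)

# COUNT counts every non-NULL value, including 0, while SUM adds the
# 0/1 flags. Returning NULL from the CASE makes COUNT skip the row.
count_zero, count_null, sum_flag = conn.execute(
    """
    SELECT count(CASE WHEN id IS NULL THEN 0 ELSE id END) AS count_with_zero,
           count(CASE WHEN id IS NULL THEN NULL ELSE id END) AS count_with_null,
           sum(CASE WHEN id IS NULL THEN 0 ELSE 1 END) AS sum_of_flags
    FROM test
    """
).fetchone()

print(count_zero, count_null, sum_flag)
```

The first count includes the row that was mapped to 0, which is why the original query always matched the total.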