I have the query below, but sometimes the 'code' value is not available, and then I have to use 'new_code' instead.
My question is: how can I change the query below so that it prefers code but falls back to new_code when code is missing? Is it possible?
with cte as (select *,
row_number() over (partition by code order by price) rn_low,
row_number() over (partition by code order by price DESC) rn_high
from t
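One possible approach, sketched under the assumption that a "missing" code is stored as NULL, is to partition by COALESCE(code, new_code) instead of code. The snippet below runs the idea through Python's sqlite3 module as a stand-in for the real database; the sample rows and the grp_code alias are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: the real table t is not shown in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (code TEXT, new_code TEXT, price REAL);
    INSERT INTO t VALUES
        ('A',  'X', 10.0),
        ('A',  'X', 30.0),
        (NULL, 'B', 5.0),
        (NULL, 'B', 7.0);
""")

# COALESCE returns code when it is not NULL, otherwise new_code,
# so rows without a code are grouped by their new_code instead.
query = """
WITH cte AS (
    SELECT *,
           COALESCE(code, new_code) AS grp_code,
           ROW_NUMBER() OVER (PARTITION BY COALESCE(code, new_code)
                              ORDER BY price)      AS rn_low,
           ROW_NUMBER() OVER (PARTITION BY COALESCE(code, new_code)
                              ORDER BY price DESC) AS rn_high
    FROM t
)
SELECT grp_code, price, rn_low, rn_high
FROM cte
ORDER BY grp_code, price
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If a missing code is stored as an empty string rather than NULL, wrapping it as COALESCE(NULLIF(code, ''), new_code) should give the same fallback behaviour.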
I am working on a query that should combine two tables and return each user as a single entry (no user should be duplicated). For the date, I need the latest value across the two tables.
table 1
table 2
Expected output (I need to combine both tables, return each user's data as a single entry, and take the latest date from the two tables):
user_id name date
----------------------------------
1 John 2020-10-29 --The latest date--
2 Tom 2020-11-15 --The latest date--
3 Peter 2020-12-10 --The latest date--
Actual Output
My PostgreSQL query
SELECT user_id, name, date
FROM
table_1
UNION
SELECT user_id, name, date
FROM
table_2
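I tried many approaches but nothing worked. The rows are duplicated when I do the UNION. Can someone help me?
Combine the two tables with UNION ALL, then apply ROW_NUMBER() partitioned by user_id and ordered by date descending, so the latest row for each user gets row number 1. Then select only those rows from the CTE. UNION ALL is used instead of UNION because it skips the duplicate-elimination step, which is not needed here since ROW_NUMBER() already keeps one row per user.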
-- PostgreSQL
WITH c_cte AS (
SELECT t.*
, ROW_NUMBER() OVER (PARTITION BY t.user_id ORDER BY t.date DESC) row_num
FROM (SELECT user_id, name, date
FROM table_1
UNION ALL
SELECT user_id, name, date
FROM table_2) t
)
SELECT user_id, name, date
FROM c_cte
WHERE row_num = 1
ORDER BY user_id
Another way to do the same thing without a CTE:
SELECT u.user_id, u.name, u.date
FROM (SELECT t.*
, ROW_NUMBER() OVER (PARTITION BY t.user_id ORDER BY t.date DESC) row_num
FROM (SELECT user_id, name, date
FROM table_1
UNION ALL
SELECT user_id, name, date
FROM table_2) t
) u
WHERE u.row_num = 1
ORDER BY u.user_id
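To see why plain UNION still leaves duplicates and how the ROW_NUMBER() filter removes them, here is a small runnable sketch using Python's sqlite3 module in place of PostgreSQL. The table contents are invented (the question's tables are not shown), chosen only so that the result matches the expected output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows: the same users appear in both tables with different dates.
conn.executescript("""
    CREATE TABLE table_1 (user_id INTEGER, name TEXT, date TEXT);
    CREATE TABLE table_2 (user_id INTEGER, name TEXT, date TEXT);
    INSERT INTO table_1 VALUES
        (1, 'John', '2020-10-29'), (2, 'Tom', '2020-11-01'), (3, 'Peter', '2020-12-10');
    INSERT INTO table_2 VALUES
        (1, 'John', '2020-10-01'), (2, 'Tom', '2020-11-15'), (3, 'Peter', '2020-12-10');
""")

# UNION only removes rows that are identical in every column,
# so a user with two different dates still appears twice.
union_rows = conn.execute("""
    SELECT user_id, name, date FROM table_1
    UNION
    SELECT user_id, name, date FROM table_2
""").fetchall()

# UNION ALL + ROW_NUMBER(): keep only the newest row per user_id.
latest = conn.execute("""
    SELECT u.user_id, u.name, u.date
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY t.user_id
                                    ORDER BY t.date DESC) AS row_num
          FROM (SELECT user_id, name, date FROM table_1
                UNION ALL
                SELECT user_id, name, date FROM table_2) AS t) AS u
    WHERE u.row_num = 1
    ORDER BY u.user_id
""").fetchall()

print(len(union_rows))  # more rows than users
print(latest)
```

The dates are stored as ISO-formatted text here, which sorts correctly as strings; in PostgreSQL a real date column would be the natural choice.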
I had a spreadsheet that looked like a prior "group by" had left many rows blank where I needed them to be filled with the data above it (see example picture below). I needed each account number to fill all the cells beneath it until the start of the next account number (i.e., A1234 needs to be in all the cells up to B4325, B4325 needs to be in all the cells up to C3452 and so on).
From this Stack Exchange answer by benjamin berhault, I found this code and tailored it to my problem:
SELECT rn, acct, FIRST_VALUE(acct) OVER(PARTITION BY grp)
FROM (SELECT rn, acct, SUM(CASE WHEN acct <> '' THEN 1 END) OVER (ORDER BY rn) AS grp
FROM
(SELECT ROW_NUMBER() OVER() rn
, acct
FROM dataset AS d) AS sub1 ) sub2;
What I don't understand about this query is the ORDER BY clause in this part:
SUM(CASE WHEN acct <> '' THEN 1 END) OVER (ORDER BY rn) AS grp
This whole line successfully creates a new grp column that is all 1's for the first account, all 2's for the second account, and so on. From there, the main query can use FIRST_VALUE with PARTITION BY grp to get the result I am looking for. What I do not understand is why ORDER BY rn causes the column to sum in that manner. I would have thought a PARTITION BY would be needed there, but that does not work.
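A likely explanation: when a window has an ORDER BY but no explicit frame, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so SUM() becomes a running total up to the current row. Each non-blank acct adds 1 and blank rows add nothing, so the total stays constant until the next account starts. Without ORDER BY the frame is the whole partition and every row gets the same grand total. The sketch below illustrates this with Python's sqlite3 module as a stand-in for PostgreSQL; the rows and the stored rn column are assumptions for the demo (the original derives rn with ROW_NUMBER() OVER ()).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: blank acct cells are stored as empty strings.
conn.executescript("""
    CREATE TABLE dataset (rn INTEGER, acct TEXT);
    INSERT INTO dataset VALUES
        (1, 'A1234'), (2, ''), (3, ''), (4, 'B4325'), (5, ''), (6, 'C3452');
""")

rows = conn.execute("""
    SELECT rn, acct, grp, total,
           -- ORDER BY rn is added here so the first value is deterministic
           FIRST_VALUE(acct) OVER (PARTITION BY grp ORDER BY rn) AS filled
    FROM (SELECT rn, acct,
                 -- running total: frame ends at the current row
                 SUM(CASE WHEN acct <> '' THEN 1 END) OVER (ORDER BY rn) AS grp,
                 -- no ORDER BY: frame is the whole table, same value everywhere
                 SUM(CASE WHEN acct <> '' THEN 1 END) OVER () AS total
          FROM dataset) AS sub
    ORDER BY rn
""").fetchall()

for row in rows:
    print(row)
```

A PARTITION BY acct would not help here, because the blank rows all share the same (empty) value and would end up in one partition regardless of which account precedes them.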
I want to get the last entry for each user, but the customer_id is a hash ('ASAG#...'), and ordering by customer_id destroys the query. Is there an alternative?
Select Distinct On (l.customer_id)
l.customer_id
,l.created_at
,l.text
From likes l
Order By l.customer_id, l.created_at Desc
Your current query already appears to be working, q.v. here:
Demo
I don't know why your current query is not generating the results you expect. It should return one record for every customer, namely the most recent one, given your ORDER BY clause.
In any case, if it does not do what you want, an alternative is to use ROW_NUMBER() with a partition by customer. The inner query numbers each customer's rows, with the value 1 going to that customer's most recent record. The outer query then retains only the latest record.
SELECT
t.customer_id,
t.created_at,
t.text
FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at DESC) rn
FROM likes
) t
WHERE t.rn = 1
To speed up the inner query which uses ROW_NUMBER() you can try adding a composite index on the customer_id and created_at columns:
CREATE INDEX yourIdx ON likes (customer_id, created_at);
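As a quick check that hash-like text ids do not get in the way, here is a minimal sketch of the ROW_NUMBER() version run through Python's sqlite3 module (SQLite has no DISTINCT ON, so only the alternative is shown). The ids, timestamps, and index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up rows with hash-like customer ids.
conn.executescript("""
    CREATE TABLE likes (customer_id TEXT, created_at TEXT, "text" TEXT);
    INSERT INTO likes VALUES
        ('ASAG#1', '2021-01-01 10:00', 'first'),
        ('ASAG#1', '2021-01-03 09:00', 'latest for 1'),
        ('ZQ9K#7', '2021-01-02 12:00', 'only one');
    -- composite index on the partition and ordering columns
    CREATE INDEX likes_customer_created_idx ON likes (customer_id, created_at);
""")

rows = conn.execute("""
    SELECT t.customer_id, t.created_at, t.text
    FROM (SELECT *,
                 ROW_NUMBER() OVER (PARTITION BY customer_id
                                    ORDER BY created_at DESC) AS rn
          FROM likes) AS t
    WHERE t.rn = 1
    ORDER BY t.customer_id
""").fetchall()

print(rows)
```

The partition only needs the ids to be comparable for equality, so any text value works as a key.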
Is there any option to get the average of the same values using the RANK() function in PostgreSQL? Here is the example of what I want to do:
This query will do the trick for you
SELECT
test_score,
row_number() OVER (ORDER BY test_score) AS rank,
rank() OVER (ORDER BY test_score)
+ (count(*) OVER (PARTITION BY test_score) - 1) / 2.0 AS "rank (with tied)"
FROM scores
SQLFiddle
Remarks:
What you believe is the "rank" is really the row_number() (i.e. a consecutive series of positive integers with no gaps and no duplicates).
That rank "with tied" that you're looking for can be calculated from the real rank() (rank with gaps) + the number of other elements of the same rank divided by two. This is a faster shortcut to calculate the average row_number() given your specific requirements.
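The shortcut can be checked on a small example. The sketch below uses Python's sqlite3 module as a stand-in for PostgreSQL, with invented scores; for a group of n tied rows starting at rank r, the row numbers are r, r+1, ..., r+n-1, whose average is r + (n - 1) / 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical scores: one single value, one pair, one triple.
conn.executescript("""
    CREATE TABLE scores (test_score INTEGER);
    INSERT INTO scores VALUES (50), (60), (60), (70), (70), (70);
""")

rows = conn.execute("""
    SELECT test_score, row_num, rank_with_ties
    FROM (SELECT test_score,
                 ROW_NUMBER() OVER (ORDER BY test_score) AS row_num,
                 RANK() OVER (ORDER BY test_score)
                   + (COUNT(*) OVER (PARTITION BY test_score) - 1) / 2.0
                   AS rank_with_ties
          FROM scores) AS ranked
    ORDER BY row_num
""").fetchall()

for row in rows:
    print(row)
```

The pair of 60s occupies row numbers 2 and 3 and gets 2.5; the three 70s occupy 4, 5, and 6 and get 5.0.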
I'm pretty sure you want row_number(), not rank(). Rank will not give repeated values in the way you presented. To get the answer you're looking for:
with rwn as (
select
test_score
,row_number() over (order by test_score) rwn
from
score
)
select
test_score
,avg(rwn) average_rank
from
rwn
group by
test_score;
Here's a SQLFiddle.
@Lukas and @jeremy already explained the difference between rank() and row_number() you seemed to be missing.
You can also compute the row number (rn), and the average over rn (avg_rn) per rank (= per group of same values) in the next step:
SELECT test_score, rn, avg(rn) OVER (PARTITION BY test_score) AS avg_rn
FROM (SELECT test_score, row_number() OVER (ORDER BY test_score) AS rn FROM tbl) sub;
You need a subquery because window functions cannot be nested on the same query level.
You need another window function (not an aggregate function, as has been suggested) to preserve all original rows.
The result is ordered by rn by default (for this simple query), but this is just an implementation detail. To guarantee an ordered result, add an explicit ORDER BY (for practically no cost):
...
ORDER BY rn;
SQL Fiddle.
I am trying to generate a result set similar to the following table, but could not achieve the goal. I want to number each row as shown in the 'I want' column of the table.
The following SQL generated the 'RowNbr' column. Any suggestions would be appreciated. Thank you
SELECT Date, Nbr, status, ROW_NUMBER () over (partition by Date,status order by date asc) as RowNbr
This is a classic "gaps and islands" problem, in case you are searching for similar solutions in the future. Basically you want the counter to reset every time you hit a new status for a given Nbr, ordered by date.
This general overall technique was developed, I believe, by Itzik Ben-Gan, and he has tons of articles and book chapters about it.
;WITH cte AS
(
SELECT [Date], Nbr, [Status],
rn = ROW_NUMBER() OVER (PARTITION BY Nbr ORDER BY [Date])
- ROW_NUMBER() OVER (PARTITION BY Nbr,[Status] ORDER BY [Date])
FROM dbo.your_table_name
)
SELECT [Date], Nbr, [Status],
[I want] = ROW_NUMBER() OVER (PARTITION BY Nbr,[Status],rn ORDER BY [Date])
FROM cte
ORDER BY Nbr, [Date];
On SQL Server 2012 and later, you may be able to achieve something similar using LAG and LEAD; I made a few honest attempts but couldn't get anywhere that would end up being anything less complex than the above.
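For anyone who wants to experiment with the row-number-difference technique, here is a small sketch run through Python's sqlite3 module rather than SQL Server. The rows are invented, and the island and i_want column names are illustrative. Note that the final partition includes the status as well as the difference, because the difference alone can coincide between two adjacent islands with different statuses (as it does for the last three rows here).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: one Nbr whose status goes A, A, B, B, A over time.
conn.executescript("""
    CREATE TABLE your_table_name (date TEXT, nbr INTEGER, status TEXT);
    INSERT INTO your_table_name VALUES
        ('2020-01-01', 1, 'A'),
        ('2020-01-02', 1, 'A'),
        ('2020-01-03', 1, 'B'),
        ('2020-01-04', 1, 'B'),
        ('2020-01-05', 1, 'A');
""")

rows = conn.execute("""
    WITH cte AS (
        SELECT date, nbr, status,
               -- constant within each run of consecutive equal statuses
               ROW_NUMBER() OVER (PARTITION BY nbr ORDER BY date)
             - ROW_NUMBER() OVER (PARTITION BY nbr, status ORDER BY date)
               AS island
        FROM your_table_name
    )
    SELECT date, nbr, status, island,
           -- counter restarts whenever (status, island) changes
           ROW_NUMBER() OVER (PARTITION BY nbr, status, island
                              ORDER BY date) AS i_want
    FROM cte
    ORDER BY nbr, date
""").fetchall()

for row in rows:
    print(row)
```

In this sample the two B rows and the final A row all have an island value of 2, so it is the (status, island) pair, not the difference by itself, that identifies each run.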