Extract amount for the minimum date in Postgres - postgresql

I have a table with dates, accounts, and the amount paid by each account on that date. I am stuck on a very basic problem: getting the amount for the minimum date per account.
Input:
Desired:
The query below obviously groups by the amount too, so it returns one row per account and amount instead of one row per account.
SELECT account_ref AS account_alias,
Min(timestamp_made) AS reg_date,
amount
FROM stg_payment_mysql
GROUP BY account_ref,
amount

Perhaps the most performant way to do this would be to use DISTINCT ON:
SELECT DISTINCT ON (account) account, date, amount
FROM stg_payment_mysql
ORDER BY account, date;
A more general ANSI SQL approach to this would use ROW_NUMBER:
WITH cte AS (
SELECT account, date, amount,
ROW_NUMBER() OVER (PARTITION BY account ORDER BY date) rn
FROM stg_payment_mysql
)
SELECT account, date, amount
FROM cte
WHERE rn = 1
ORDER BY account;
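To try the ROW_NUMBER approach without a Postgres instance, here is a minimal sketch that runs the same query shape through Python's built-in sqlite3 module. The sample rows are invented for illustration, the column names follow the answer above, and SQLite 3.25 or later is assumed for window functions. DISTINCT ON is Postgres-only, so it is not exercised here.

```python
import sqlite3

# Hypothetical sample data: (account, date, amount). Not taken from the question.
SAMPLE_ROWS = [
    ("A", "2021-01-05", 10.0),
    ("A", "2021-01-02", 25.0),
    ("B", "2021-02-01", 40.0),
    ("B", "2021-03-01", 5.0),
]

# Same shape as the ROW_NUMBER answer: rank rows per account by date, keep rank 1.
FIRST_PAYMENT_SQL = """
WITH cte AS (
    SELECT account, date, amount,
           ROW_NUMBER() OVER (PARTITION BY account ORDER BY date) AS rn
    FROM stg_payment_mysql
)
SELECT account, date, amount
FROM cte
WHERE rn = 1
ORDER BY account
"""


def first_payment_per_account(rows):
    """Return one (account, date, amount) row per account, for its earliest date."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE stg_payment_mysql (account TEXT, date TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO stg_payment_mysql VALUES (?, ?, ?)", rows)
        return conn.execute(FIRST_PAYMENT_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in first_payment_per_account(SAMPLE_ROWS):
        print(row)
```

With the sample rows, account A keeps the 2021-01-02 row (amount 25.0) and account B keeps the 2021-02-01 row (amount 40.0). If two rows of one account share the minimum date, ROW_NUMBER picks one of them arbitrarily unless a tie-breaker is added to the ORDER BY.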

Related

postgresql get weekly average of cases with daily data

I have a table called Table1. I am trying to get the weekly average, but I only have daily data. My table contains the following attributes: caseID, date, status and some other (irrelevant) attributes. With the following query, I made the following table which comes close to what I want:
However, I would like to add an average per week of the number of cases. I have looked everywhere, but I am not sure how to include that. Does anybody have any clues on how to add it?
Thanks.
To expand on @luuk's answer...
SELECT
date,
COUNT(id) as countcase,
EXTRACT(WEEK FROM date) AS weeknbr,
AVG(COUNT(id)) OVER (PARTITION BY EXTRACT(WEEK FROM date)) as weeklyavg
FROM table1
GROUP BY date, weeknbr
ORDER BY date, weeknbr
This works because the aggregation (GROUP BY) is applied before the window (analytic) function is evaluated, so the window function can average the per-day counts.
select
date,
countcase,
extract(week from date) as weeknbr,
avg(countcase) over (partition by extract(week from date)) as weeklyavg
from table1;
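The "aggregate first, then apply the window function" idea can be checked with a small sketch using Python's sqlite3 module. Everything here is an assumption made for illustration: the sample cases are invented, SQLite 3.25 or later is assumed, and strftime('%W', date) stands in for Postgres's EXTRACT(WEEK FROM date) (the two number weeks differently, so this only demonstrates the query shape).

```python
import sqlite3

# Hypothetical sample data: (id, date). 2021-01-04 and 2021-01-05 are in the
# same week; 2021-01-11 is the Monday of the following week.
SAMPLE_CASES = [
    (1, "2021-01-04"), (2, "2021-01-04"),
    (3, "2021-01-05"), (4, "2021-01-05"), (5, "2021-01-05"), (6, "2021-01-05"),
    (7, "2021-01-11"),
]

# Step 1 (CTE): count cases per day. Step 2: average those counts per week.
WEEKLY_AVG_SQL = """
WITH daily AS (
    SELECT date, COUNT(id) AS countcase
    FROM table1
    GROUP BY date
)
SELECT date,
       countcase,
       strftime('%W', date) AS weeknbr,
       AVG(countcase) OVER (PARTITION BY strftime('%W', date)) AS weeklyavg
FROM daily
ORDER BY date
"""


def daily_counts_with_weekly_avg(cases):
    """Return (date, countcase, weeknbr, weeklyavg) rows, one per day."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE table1 (id INTEGER, date TEXT)")
        conn.executemany("INSERT INTO table1 VALUES (?, ?)", cases)
        return conn.execute(WEEKLY_AVG_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in daily_counts_with_weekly_avg(SAMPLE_CASES):
        print(row)
```

One design point worth noting: partitioning by the week number alone merges the same week of different years, so for data spanning more than one year the year would probably need to be part of the partition as well.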

Get the first live record on each quarter

We have a pricing table, and I need to get the first live record in each quarter. The table structure is like this:
record_id (int)
start_date (date)
price (decimal)
live (boolean)
I need to be able to get the first "live" record in each quarter.
So far, I've been able to do this:
SELECT DISTINCT EXTRACT(QUARTER FROM start_date::TIMESTAMP) as quarter,
EXTRACT(YEAR FROM start_date::TIMESTAMP) as year,
distinct start_date,
live
FROM record_pricing rp
group by year, quarter,record_instance_uid
order by year,quarter;
I get this:
As you can see, the results contain both live and non-live records. I just need the first live record in each quarter, as highlighted in the picture above as an example.
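You can use ROW_NUMBER in a subquery and filter on it one level up (a window function cannot be referenced in the WHERE clause of the same query level). Filtering on live before ranking keeps only live records:
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY year, quarter ORDER BY start_date ASC) AS rn
FROM (
SELECT EXTRACT(QUARTER FROM start_date::TIMESTAMP) AS quarter,
EXTRACT(YEAR FROM start_date::TIMESTAMP) AS year,
record_instance_uid, live, start_date
FROM record_pricing rp
WHERE live -- rank only live records
) tab
) ranked
WHERE rn = 1
ORDER BY year, quarter;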
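Since the question asks for the first live record per quarter, the ranking has to run over live rows only and the rank has to be filtered one query level above where it is computed. Below is a minimal sketch of that shape using Python's sqlite3 module. The sample rows are invented, the columns follow the table structure listed in the question (record_id rather than record_instance_uid), live is stored as 0/1, and the year and quarter are derived with strftime because SQLite has no EXTRACT. SQLite 3.25 or later is assumed.

```python
import sqlite3

# Hypothetical sample data: (record_id, start_date, price, live).
SAMPLE_PRICES = [
    (1, "2020-01-10", 9.5, 0),   # earliest in Q1, but not live
    (2, "2020-02-01", 10.0, 1),  # first live record in Q1
    (3, "2020-03-15", 11.0, 1),
    (4, "2020-04-20", 12.0, 1),  # first live record in Q2
    (5, "2020-05-05", 12.5, 0),
]

# Inner query: keep live rows and number them per (year, quarter) by start_date.
# Outer query: keep only the first row of each partition.
FIRST_LIVE_SQL = """
SELECT year, quarter, record_id, start_date, price
FROM (
    SELECT record_id, start_date, price,
           CAST(strftime('%Y', start_date) AS INTEGER) AS year,
           (CAST(strftime('%m', start_date) AS INTEGER) + 2) / 3 AS quarter,
           ROW_NUMBER() OVER (
               PARTITION BY strftime('%Y', start_date),
                            (CAST(strftime('%m', start_date) AS INTEGER) + 2) / 3
               ORDER BY start_date
           ) AS rn
    FROM record_pricing
    WHERE live = 1
) ranked
WHERE rn = 1
ORDER BY year, quarter
"""


def first_live_record_per_quarter(rows):
    """Return (year, quarter, record_id, start_date, price) for each quarter."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE record_pricing "
            "(record_id INTEGER, start_date TEXT, price REAL, live INTEGER)"
        )
        conn.executemany("INSERT INTO record_pricing VALUES (?, ?, ?, ?)", rows)
        return conn.execute(FIRST_LIVE_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in first_live_record_per_quarter(SAMPLE_PRICES):
        print(row)
```

With the sample rows, record 1 is skipped because it is not live, so Q1 resolves to record 2 and Q2 to record 4.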

SQL SSRS aggregate functions

I am trying to figure out the aggregate functions in SQL SSRS to give me the sum of total sales for the given information by YEAR. I need to combine the year and the months within that year, and provide the total sum of sales for that year. For example: for 2018 I need to combine months 2-12 and provide the total sum, for 2019 combine months 1-12 and provide the total sum, and so on.
I'm not sure where to begin on this one as I am new to SQL SSRS. Any help would be appreciated!
UPDATE:
Ideally I want this to be the end result:
id Year Price
102140 2019 ($XXXXX.XX)
102140 2018 ($XXXXX.XX)
102140 2017 ($XXXXX.XX)
And so on.
your query:
Select customer_id
, year_ordered
--, month_ordered
--, extended_price
--, SUM(extended_price) OVER (PARTITION BY year_ordered) AS year_total
, SUM(extended_price) AS year_total
From customer_order_history
Where customer_id = '101646'
Group By
customer_id
, year_ordered
, extended_price
--, month_ordered
This returns multiple rows per year_ordered, because it still groups by extended_price, so each month's price is summed on its own.
There are two approaches.
Do this in your dataset query:
SELECT Customer_id, year_ordered, SUM(extended_price) AS Price
FROM myTable
GROUP BY Customer_id, year_ordered
This option is best when you will never need the month values themselves in the report (i.e. you don't intend to have a drill down to the month data)
Alternatively, do this in SSRS:
By default you will get a row group called "Details" (look under the main design area and you will see the row groups and column groups).
You can right-click this and add grouping for both customer_id and year_ordered. You can then change the extended_price textbox's Value property to =SUM(Fields!extended_price.Value)
You could use a window function in your SQL:
select [year], [month], [price], SUM([price]) OVER (PARTITION BY [year]) as yearTotal
from myTable
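The difference between grouping by year and grouping by year plus price can be seen on a few rows. The sketch below uses Python's sqlite3 module only as a stand-in for the SSRS dataset query. The sample orders are invented, and the table and column names are taken from the question's query.

```python
import sqlite3

# Hypothetical sample data: (customer_id, year_ordered, month_ordered, extended_price).
SAMPLE_ORDERS = [
    ("102140", 2018, 2, 100.0),
    ("102140", 2018, 3, 50.0),
    ("102140", 2019, 1, 75.0),
    ("102140", 2019, 2, 25.5),
]

# One row per customer and year: the shape the question asks for.
YEARLY_SQL = """
SELECT customer_id, year_ordered, SUM(extended_price) AS price
FROM customer_order_history
GROUP BY customer_id, year_ordered
ORDER BY year_ordered DESC
"""

# Grouping by extended_price as well splits each year into one row per price.
OVERGROUPED_SQL = """
SELECT customer_id, year_ordered, SUM(extended_price) AS year_total
FROM customer_order_history
GROUP BY customer_id, year_ordered, extended_price
"""


def run_query(sql, rows):
    """Load the sample orders into an in-memory table and run one query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE customer_order_history "
            "(customer_id TEXT, year_ordered INTEGER, "
            "month_ordered INTEGER, extended_price REAL)"
        )
        conn.executemany(
            "INSERT INTO customer_order_history VALUES (?, ?, ?, ?)", rows
        )
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_query(YEARLY_SQL, SAMPLE_ORDERS))
    print(len(run_query(OVERGROUPED_SQL, SAMPLE_ORDERS)))
```

On the sample data the first query returns two rows (one per year), while the second returns four, which mirrors the repeated year_ordered rows described in the question.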

Condition and max reference in redshift window function

I have a list of dates, accounts, and sources of data. I'm taking the latest max date for each account and using that number in my window reference.
In my window reference, I'm using row_number () to assign unique rows to each account and sources of data that we're receiving and sorting it by the max date for each account and source of data. The end result should list out one row for each unique account + source of data combination, with the max date available in that combination. The record with the highest date will have 1 listed.
I'm trying to set a condition on my window function where only rows that populate with 1 are listed in the query, while the other ones are not shown at all. This is what I have below and where I get stuck:
SELECT
date,
account,
data source,
MAX(date) max_date,
ROW_NUMBER () OVER (PARTITION BY account ORDER BY max_date) ROWNUM
FROM table
GROUP BY
date,
account,
data source
Any help is greatly appreciated. I can elaborate on anything if necessary
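If I understood your question correctly, this SQL would do the trick. The row number is computed in a subquery so that the outer query can filter on it, and the ordering is descending so that the latest date gets row number 1:
SELECT
date,
account,
data_source,
max_date
FROM (
SELECT
date,
account,
data_source,
MAX(date) AS max_date,
ROW_NUMBER() OVER (PARTITION BY account ORDER BY MAX(date) DESC) AS rownum
FROM table
GROUP BY
date,
account,
data_source
) ranked
WHERE rownum = 1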
If you do not need the row number for anything other than uniqueness then a query like this should work:
select distinct t.account, data_source, date
from table t
join (select account, max(date) max_date from table group by account) m
on t.account=m.account and t.date=m.max_date
This can still generate two records for one account if two records for different data sources have the identical date. If that is a possibility then mdem7's approach is probably best.
It's a bit unclear from the question, but if you want each combination of account and data_source with its max date and no duplicates, then grouping by both columns is enough (GROUP BY already returns one row per combination, so DISTINCT is not needed):
select account, data_source, max(date) max_date
from table t
group by account, data_source
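The two readings of the question (latest row per account versus max date per account and data source) give different results, which a small sketch can make concrete. It uses Python's sqlite3 module rather than Redshift. The sample rows are invented, account_feed is a hypothetical table name (the question only says "table"), and SQLite 3.25 or later is assumed.

```python
import sqlite3

# Hypothetical sample data: (date, account, data_source).
SAMPLE_FEED = [
    ("2021-01-01", "acct1", "crm"),
    ("2021-02-15", "acct1", "crm"),
    ("2021-03-01", "acct1", "web"),
    ("2021-02-01", "acct2", "crm"),
    ("2021-02-01", "acct2", "web"),  # same date as the crm row: a tie
]

# Reading 1: one row per account, the one with the latest date.
# data_source is added to the ORDER BY as a tie-breaker so the result is stable.
LATEST_PER_ACCOUNT_SQL = """
SELECT account, data_source, date
FROM (
    SELECT account, data_source, date,
           ROW_NUMBER() OVER (
               PARTITION BY account
               ORDER BY date DESC, data_source
           ) AS rownum
    FROM account_feed
) ranked
WHERE rownum = 1
ORDER BY account
"""

# Reading 2: one row per (account, data_source) with that combination's max date.
MAX_PER_COMBINATION_SQL = """
SELECT account, data_source, MAX(date) AS max_date
FROM account_feed
GROUP BY account, data_source
ORDER BY account, data_source
"""


def run_query(sql, rows):
    """Load the sample feed into an in-memory table and run one query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE account_feed (date TEXT, account TEXT, data_source TEXT)"
        )
        conn.executemany("INSERT INTO account_feed VALUES (?, ?, ?)", rows)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_query(LATEST_PER_ACCOUNT_SQL, SAMPLE_FEED))
    print(run_query(MAX_PER_COMBINATION_SQL, SAMPLE_FEED))
```

For acct2 both sources share the latest date, so the row-number version returns a single row (crm, because of the tie-breaker), whereas a join on the max date would return both.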

SQL (PostgreSQL): How to write a query to fetch the average salary of top 100 highest paid employees?

What I have tried to solve this is:
SELECT AVG(amount) FROM (SELECT amount FROM payment ORDER BY amount LIMIT 100);
This also did not work.
SELECT AVG(highest_amount) FROM (SELECT amount AS highest_amount FROM
payment ORDER BY amount LIMIT 100);
Sorry for asking silly questions. I am a newbie. :(
If you give the subquery in your first attempt an alias, it will run. To get the highest amounts, the subquery also has to sort in descending order:
SELECT AVG(amount)
FROM
(
SELECT amount
FROM payment
ORDER BY amount DESC
LIMIT 100
) t;
If you want an alternative method that does not use LIMIT, you can use a row number:
SELECT AVG(amount)
FROM
(
SELECT amount, ROW_NUMBER() OVER (ORDER BY amount DESC) rn
FROM payment
) t
WHERE rn <= 100;
The easiest way to solve this kind of problem is to use WITH queries, also known as Common Table Expressions (https://www.postgresql.org/docs/10/static/queries-with.html). The result is clear and easy to understand.
WITH top100 AS
(
SELECT
amount
FROM
payment
ORDER BY
amount DESC
LIMIT
100
)
SELECT
avg(amount)
FROM
top100;
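Because the question asks for the highest-paid rows, the sort direction matters: the subquery has to order by amount descending before LIMIT is applied. A minimal sketch of this with Python's sqlite3 module follows. The amounts are invented, the payment table has a single amount column for brevity, and the limit is passed as a parameter so a small N can be used on the sample data.

```python
import sqlite3

# Average of the N largest amounts: sort descending, keep N rows, then average.
TOP_N_AVG_SQL = """
SELECT AVG(amount)
FROM (
    SELECT amount
    FROM payment
    ORDER BY amount DESC
    LIMIT ?
) t
"""


def average_of_top_n(amounts, n):
    """Return the average of the n largest values in amounts."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE payment (amount REAL)")
        conn.executemany(
            "INSERT INTO payment VALUES (?)", [(a,) for a in amounts]
        )
        (avg,) = conn.execute(TOP_N_AVG_SQL, (n,)).fetchone()
        return avg
    finally:
        conn.close()


if __name__ == "__main__":
    # Hypothetical amounts 1.0 .. 10.0; the top three are 10, 9 and 8.
    sample_amounts = [float(x) for x in range(1, 11)]
    print(average_of_top_n(sample_amounts, 3))
```

With amounts 1.0 through 10.0 and n = 3 this averages 10, 9 and 8. An ascending sort would instead average 1, 2 and 3, which is the mistake in the original attempts. When the table has fewer than n rows, the average simply covers all rows.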