How do I find the maximum number of options for a month in Postgres? - postgresql

I would like to find out the maximum number of rate options in a given month for each of my users. Here is what my rates table looks like:
Member | Month | Rate
Joe | Jan | 1
Joe | Jan | 2
Joe | Jan | 3
Joe | Feb | 1
Joe | Feb | 2
Joe | Feb | 2
Joe | Mar | 1
Joe | Mar | 2
Joe | Mar | 2
Max | Jan | 1
Max | Jan | 1
Max | Jan | 1
Max | Feb | 2
Max | Feb | 2
Max | Feb | 2
Max | Mar | 3
Max | Mar | 3
Max | Mar | 3
Ben | Jan | 1
Ben | Jan | 2
Ben | Jan | 2
Ben | Feb | 1
Ben | Feb | 1
Ben | Feb | 1
Ben | Mar | 1
Ben | Mar | 1
Ben | Mar | 1
Joe, in January, has rate options [1,2,3] available for him. Joe, in February and March, only has two [1,2]. For each user, I'd like to display the maximum number of rates available in one month (compared for all months). The outcome table should look like this:
Member | Max rates in one month
Joe | 3
Max | 1
Ben | 2
How would I write this query?

First, you need to group on member and month and count the distinct rates (a plain count(*) would also count duplicate rates and return 3 for every member in the sample data):
SELECT "Member", "Month", count(DISTINCT "Rate") AS n FROM rates GROUP BY 1,2;
Now that you have the count of distinct rates per member per month, you can use the result to extract the maximum value for every member:
SELECT "Member", max(n) FROM (
SELECT "Member", "Month", count(DISTINCT "Rate") AS n FROM rates GROUP BY 1,2) X
GROUP BY "Member";
There are other methods using, for example, window functions, but the idea is always the same, and I think the code above makes it clear what's happening.

You can do this with some aggregation:
SELECT Member, MAX(rate_count) as "Max rates in one month"
FROM
(SELECT Member, Month, COUNT(DISTINCT RATE) rate_count FROM yourtable GROUP BY Member, Month) dt
GROUP BY Member;
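As a quick sanity check, the two-level aggregation can be run against the sample data from the question. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the lowercase table and column names (rates, member, month, rate) and the helper name max_rates_per_member are assumptions made for this example.

```python
import sqlite3

# Sample data from the question: (member, month, rate).
rows = [
    ("Joe", "Jan", 1), ("Joe", "Jan", 2), ("Joe", "Jan", 3),
    ("Joe", "Feb", 1), ("Joe", "Feb", 2), ("Joe", "Feb", 2),
    ("Joe", "Mar", 1), ("Joe", "Mar", 2), ("Joe", "Mar", 2),
    ("Max", "Jan", 1), ("Max", "Jan", 1), ("Max", "Jan", 1),
    ("Max", "Feb", 2), ("Max", "Feb", 2), ("Max", "Feb", 2),
    ("Max", "Mar", 3), ("Max", "Mar", 3), ("Max", "Mar", 3),
    ("Ben", "Jan", 1), ("Ben", "Jan", 2), ("Ben", "Jan", 2),
    ("Ben", "Feb", 1), ("Ben", "Feb", 1), ("Ben", "Feb", 1),
    ("Ben", "Mar", 1), ("Ben", "Mar", 1), ("Ben", "Mar", 1),
]


def max_rates_per_member(rows):
    """Return {member: max number of distinct rates in any one month}."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL here
    conn.execute("CREATE TABLE rates (member TEXT, month TEXT, rate INTEGER)")
    conn.executemany("INSERT INTO rates VALUES (?, ?, ?)", rows)
    query = """
        SELECT member, MAX(rate_count)
        FROM (
            -- inner level: distinct rates per member and month
            SELECT member, month, COUNT(DISTINCT rate) AS rate_count
            FROM rates
            GROUP BY member, month
        ) AS per_month
        GROUP BY member  -- outer level: best month per member
    """
    result = dict(conn.execute(query).fetchall())
    conn.close()
    return result


print(max_rates_per_member(rows))
```

The inner query is where DISTINCT matters: without it, every member/month pair in this data counts three rows.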

Related

How to order the result of ROLLUP by each groups total

So I have the following query that produces the following result:
actname | year | tickets
---------------+----------+---------
Join Division | 2016 | 2
Join Division | 2018 | 2
Join Division | 2020 | 3
Join Division | Total | 7 <<<
QLS | 2018 | 2
QLS | 2019 | 1
QLS | Total | 3 <<<
Scalar Swift | 2017 | 3
Scalar Swift | 2018 | 1
Scalar Swift | 2019 | 1
Scalar Swift | Total | 5 <<<
The Selecter | 2017 | 4
The Selecter | 2018 | 4
The Selecter | Total | 8 <<<
The Where | 2016 | 1
The Where | 2017 | 3
The Where | 2018 | 5
The Where | 2020 | 4
The Where | Total | 13 <<<
ViewBee 40 | 2017 | 3
ViewBee 40 | 2018 | 1
ViewBee 40 | Total | 4 <<<
The problem I have is that I want to re-order the results such that the group with the lowest Total occurs first, such that the results would look like this:
actname | year | tickets
---------------+----------+---------
QLS | 2018 | 2
QLS | 2019 | 1
QLS | Total | 3 <<<
ViewBee 40 | 2017 | 3
ViewBee 40 | 2018 | 1
ViewBee 40 | Total | 4 <<<
Scalar Swift | 2017 | 3
Scalar Swift | 2018 | 1
Scalar Swift | 2019 | 1
Scalar Swift | Total | 5 <<<
Join Division | 2016 | 2
Join Division | 2018 | 2
Join Division | 2020 | 3
Join Division | Total | 7 <<<
The Selecter | 2017 | 4
The Selecter | 2018 | 4
The Selecter | Total | 8 <<<
The Where | 2016 | 1
The Where | 2017 | 3
The Where | 2018 | 5
The Where | 2020 | 4
The Where | Total | 13 <<<
I'm obtaining the results by using the following GROUP:
GROUP BY actname, ROLLUP(year)
Which is combining all the ticket amounts of the same actname and year together.
I can provide the full query if necessary!
Thanks
Using a window function (sum() in this case), you can attach a value to every row of a group (the groups are partitioned by the actname column), so that every row of an actname group carries the same value as that group's own row where year = 'Total'.
Then simply sort by that new column, something like this:
with t(actname, year, tickets) as (
VALUES
('Join Division','2016',2),
('Join Division','2018',2),
('Join Division','2020',3),
('Join Division','Total',7),
('QLS','2018',2),
('QLS','2019',1),
('QLS','Total',3 ),
('Scalar Swift','2017',3),
('Scalar Swift','2018',1),
('Scalar Swift','2019',1),
('Scalar Swift','Total',5 ),
('The Selecter','2017',4),
('The Selecter','2018',4),
('The Selecter','Total',8 ),
('The Where','2016',1),
('The Where','2017',3),
('The Where','2018',5),
('The Where','2020',4),
('The Where','Total',13 ),
('ViewBee 40','2017',3),
('ViewBee 40','2018',1),
('ViewBee 40','Total',4 )
)
SELECT * FROM (
select *, sum(case when year = 'Total' then tickets end) over(partition by actname) sm from t
) tt
ORDER BY sm, year
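The ordering can be checked on the sample data with a small script. This is a sketch under assumptions: SQLite (via Python's sqlite3 module, version 3.25 or later for window functions) stands in for PostgreSQL, the helper name order_groups_by_total is made up for the example, and actname is added to the ORDER BY as a tiebreaker so that two acts with equal totals would not interleave.

```python
import sqlite3

# Rolled-up sample rows from the question: (actname, year, tickets).
rows = [
    ("Join Division", "2016", 2), ("Join Division", "2018", 2),
    ("Join Division", "2020", 3), ("Join Division", "Total", 7),
    ("QLS", "2018", 2), ("QLS", "2019", 1), ("QLS", "Total", 3),
    ("Scalar Swift", "2017", 3), ("Scalar Swift", "2018", 1),
    ("Scalar Swift", "2019", 1), ("Scalar Swift", "Total", 5),
    ("The Selecter", "2017", 4), ("The Selecter", "2018", 4),
    ("The Selecter", "Total", 8),
    ("The Where", "2016", 1), ("The Where", "2017", 3),
    ("The Where", "2018", 5), ("The Where", "2020", 4),
    ("The Where", "Total", 13),
    ("ViewBee 40", "2017", 3), ("ViewBee 40", "2018", 1),
    ("ViewBee 40", "Total", 4),
]


def order_groups_by_total(rows):
    """Return the rows with whole actname groups sorted by their Total."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL here
    conn.execute("CREATE TABLE t (actname TEXT, year TEXT, tickets INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    query = """
        SELECT actname, year, tickets
        FROM (
            SELECT *,
                   -- copy each group's Total onto every row of that group
                   SUM(CASE WHEN year = 'Total' THEN tickets END)
                       OVER (PARTITION BY actname) AS sm
            FROM t
        ) AS tt
        ORDER BY sm, actname, year  -- actname is an added tiebreaker
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for row in order_groups_by_total(rows):
    print(row)
```

Because year is text here, 'Total' sorts after the digit strings, which keeps each group's total row last.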

PostgreSQL Query Using Crosstab

I am currently learning PostgreSQL, and while studying I came across crosstab. I tried to apply this function to my own table, but it does not work. Please help!
This is my Table
year | type | count
------+----------+----
2015 | AS | 6
2015 | HY | 6
2015 | KR | 6
2015 | SE | 6
2016 | AS | 2
2016 | HY | 2
2016 | KR | 2
2016 | SE | 2
2017 | AS | 1
2017 | HY | 1
2017 | KR | 1
2017 | SE | 1
2018 | AS | 2
2018 | HY | 2
2018 | KR | 2
2018 | SE | 2
I want to change this table like this
year | AS | HY | KR | SE |
----------------------------------
2015 | 6 | 6 | 6 | 6 |
2016 | 2 | 2 | 2 | 2 |
2017 | 1 | 1 | 1 | 1 |
2018 | 2 | 2 | 2 | 2 |
To produce that table, I wrote a query using crosstab, but it does not work!
Please let me know what query solves this problem.
You can achieve this without crosstab by using aggregate functions with a FILTER clause.
Query :
select
year,
max(counts) filter (where type = 'AS') as "AS",
max(counts) filter (where type = 'HY') as "HY",
max(counts) filter (where type = 'KR') as "KR",
max(counts) filter (where type = 'SE') as "SE"
from
tbl
group by
year
order by
year asc
If you are trying to learn crosstab, this answer is really good and explains very well how to pivot with it:
PostgreSQL Crosstab Query
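The conditional-aggregation pivot can be tried out on the sample data with a short script. This sketch makes a few assumptions: SQLite (through Python's sqlite3 module) stands in for PostgreSQL, the table is called tbl with a counts column as in the answer, and the FILTER clause is spelled as the equivalent MAX(CASE WHEN ... END) form, which also runs on engines that lack FILTER. The helper name pivot_by_year is hypothetical.

```python
import sqlite3

TYPES = ("AS", "HY", "KR", "SE")
# Sample data from the question: every type has the same count per year.
COUNT_BY_YEAR = {2015: 6, 2016: 2, 2017: 1, 2018: 2}
rows = [(year, t, n) for year, n in COUNT_BY_YEAR.items() for t in TYPES]


def pivot_by_year(rows):
    """Return one row per year with one column per type."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL here
    conn.execute("CREATE TABLE tbl (year INTEGER, type TEXT, counts INTEGER)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    # MAX ignores NULLs, so each CASE picks out the single matching row,
    # just like max(counts) FILTER (WHERE type = '...') does.
    query = """
        SELECT year,
               MAX(CASE WHEN type = 'AS' THEN counts END) AS "AS",
               MAX(CASE WHEN type = 'HY' THEN counts END) AS "HY",
               MAX(CASE WHEN type = 'KR' THEN counts END) AS "KR",
               MAX(CASE WHEN type = 'SE' THEN counts END) AS "SE"
        FROM tbl
        GROUP BY year
        ORDER BY year
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for row in pivot_by_year(rows):
    print(row)
```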

Tableau month-based bar chart for data with date range

I have data similar to the below:
id | start | end | name
1 | 2017-01-15 | 2017-03-30 | Item 1
2 | 2017-02-01 | 2017-05-15 | Item 2
3 | 2017-02-15 | 2017-04-01 | Item 3
I want to represent this as a bar chart with Month on the horizontal axis, and count on the vertical axis, where the value is computed by how many items fall within that month. In the above data set, January would have a value of 1, February would have a value of 3, March would have a value of 3, April would have a value of 2, and May would have a value of 1.
The closest I can get right now is to represent the count of items with the start or end date, but I want the month to represent how many items fall within that month.
I haven't found a way to do this in Tableau without restructuring my data set to have each current row restated for each month, which I don't have the luxury to do. Is this possible at all?
One solution could be to have 12 calculated fields like below
id | start | end | name | Jan | Feb | Mar | Apr | May...
1 | 2017-01-15 | 2017-03-30 | Item 1 | 1 | 1 | 1 | 0 | 0
2 | 2017-02-01 | 2017-05-15 | Item 2 | 0 | 1 | 1 | 1 | 1
3 | 2017-02-15 | 2017-04-01 | Item 3 | 0 | 1 | 1 | 1 | 0
Definition of calculated fields -
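'Jan' is IIF(DATEPART('month',[start]) <= 1 AND 1 <= DATEPART('month',[end]), 1, 0)
'Feb' is IIF(DATEPART('month',[start]) <= 2 AND 2 <= DATEPART('month',[end]), 1, 0) and so on...
(DATEPART returns the month as a number, whereas DATENAME returns its name as a string, which cannot be compared with 1 or 2. The comparison assumes that start and end fall in the same calendar year.)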
Then using Pivot option in Tableau, convert it to something like
name | Month | Count
Item1 | Jan | 1
Item2 | Jan | 0
Item3 | Jan | 0
...
Item1 | Feb | 1
Item2 | Feb | 1
Item3 | Feb | 1
...
Item1 | Mar | 1
Item2 | Mar | 1
Item3 | Mar | 1
...
Drag Month to 'Columns' and SUM(Count) to 'Rows' to generate the final visualization.
Hope this helps!
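The logic behind the twelve calculated fields can be sketched outside Tableau to confirm the expected counts. The Python below is only an illustration of the month-flag idea, not Tableau code; the function names month_flags and items_per_month are made up, and it assumes, like the calculated fields, that each item starts and ends in the same calendar year.

```python
from datetime import date

# Sample data from the question: (id, start, end, name).
items = [
    (1, date(2017, 1, 15), date(2017, 3, 30), "Item 1"),
    (2, date(2017, 2, 1), date(2017, 5, 15), "Item 2"),
    (3, date(2017, 2, 15), date(2017, 4, 1), "Item 3"),
]


def month_flags(start, end):
    """Return 12 flags (Jan..Dec): 1 if the month lies within start..end."""
    return [1 if start.month <= m <= end.month else 0 for m in range(1, 13)]


def items_per_month(items):
    """Sum the per-item flags, which is what SUM(Count) does after the pivot."""
    totals = [0] * 12
    for _id, start, end, _name in items:
        for i, flag in enumerate(month_flags(start, end)):
            totals[i] += flag
    return totals


print(items_per_month(items))
```

For the three sample items this yields 1, 3, 3, 2, 1 for January through May, matching the values stated in the question.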

Is it possible to do mathematical operations on values in the same column but different rows?

Say I have this table,
year | name | score
------+---------------+----------
2017 | BRAD | 5
2017 | BOB | 5
2016 | JON | 6
2016 | GUYTA | 2
2015 | PAC | 2
2015 | ZAC | 0
How would I go about averaging the scores by year and then getting the difference between years?
year | increase
------+-----------
2017 | 1
2016 | 3
You should use a window function, lead() in this case:
select year, avg, (avg - lead(avg) over w)::int as increase
from (
select year, avg(score)::int
from my_table
group by 1
) s
window w as (order by year desc);
year | avg | increase
------+-----+----------
2017 | 5 | 1
2016 | 4 | 3
2015 | 1 |
(3 rows)
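The lead() approach can be reproduced on the sample table with a few lines of Python. This is a sketch with assumptions: SQLite (via sqlite3, version 3.25 or later) stands in for PostgreSQL, the cast is written as CAST(... AS INTEGER) instead of ::int, the window is written inline rather than as a named WINDOW clause, and the names avg_score and yearly_increase are chosen for the example.

```python
import sqlite3

# Sample data from the question: (year, name, score).
rows = [
    (2017, "BRAD", 5), (2017, "BOB", 5),
    (2016, "JON", 6), (2016, "GUYTA", 2),
    (2015, "PAC", 2), (2015, "ZAC", 0),
]


def yearly_increase(rows):
    """Return (year, average score, increase over the previous year)."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for PostgreSQL here
    conn.execute("CREATE TABLE my_table (year INTEGER, name TEXT, score INTEGER)")
    conn.executemany("INSERT INTO my_table VALUES (?, ?, ?)", rows)
    query = """
        SELECT year,
               avg_score,
               -- with years sorted descending, LEAD() is the previous year
               avg_score - LEAD(avg_score) OVER (ORDER BY year DESC) AS increase
        FROM (
            SELECT year, CAST(AVG(score) AS INTEGER) AS avg_score
            FROM my_table
            GROUP BY year
        ) AS s
        ORDER BY year DESC
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for row in yearly_increase(rows):
    print(row)
```

The oldest year has no earlier row to compare with, so its increase is NULL (None in Python), as in the output shown above.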

PostgreSQL - aggregate series interval 2 years

I have some data:
id_merchant | data | sell
11 | 2009-07-20 | 1100.00
22 | 2009-07-27 | 1100.00
11 | 2005-07-27 | 620.00
31 | 2009-08-07 | 2403.20
33 | 2009-08-12 | 4822.00
52 | 2009-08-14 | 4066.00
52 | 2009-08-15 | 295.00
82 | 2009-08-15 | 0.00
23 | 2011-06-11 | 340.00
23 | 2012-03-22 | 1000.00
23 | 2012-04-08 | 1000.00
23 | 2012-07-13 | 36.00
23 | 2013-07-17 | 2480.00
23 | 2014-04-09 | 1000.00
23 | 2014-06-10 | 1500.00
23 | 2014-07-20 | 700.50
I want to create a table (CREATE TABLE ... AS SELECT) with sums over 2-year intervals. The first date for each merchant is min(date), so I generate a series: generate_series(min(date)::date, current(date)::date, '2 years')
I want to get to table like that:
id_merchant | data | sum(sell)
23 | 2011-06-11 | 12382.71
23 | 2013-06-11 | 12382.71
23 | 2015-06-11 | 12382.71
But there is a mistake in my query, because sum(sell) is the same for every row of the series and the sum is wrong. Even if I add up all the sales, the total is about 6000, not 12382.71.
My query:
select m.id_gos_pla,
generate_series(m.min::date,dath()::date,'2 years')::date,
sum(rch.suma)
from rch, minmax m
where rch.id_gos_pla=m.id_gos_pla
group by m.id_gos_pla,m.min,m.max
order by 1,2;
Please help.
I would do it this way:
select
periods.id_merchant,
periods.date as period_start,
(periods.date + interval '2' year - interval '1' day)::date as period_end,
coalesce(sum(merchants.amount), 0) as sum
from
(
select
id_merchant,
generate_series(min(date), max(date), '2 year'::interval)::date as date
from merchants
group by id_merchant
) periods
left join merchants on
periods.id_merchant = merchants.id_merchant and
merchants.date >= periods.date and
merchants.date < periods.date + interval '2' year
group by periods.id_merchant, periods.date
order by periods.id_merchant, periods.date
We use a subquery to generate the period start dates for each id_merchant, starting from that merchant's first date and stepping by the required interval. We then join it with the merchants table on the condition that the date falls within the period, and group by id_merchant and period (periods.date, the period's start date, is enough to identify it). Finally, we select everything we need: start date, end date, merchant, and sum.
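To see what the period bucketing should produce, the same steps can be sketched in plain Python: build period starts from each merchant's first date in 2-year steps up to its last date, then sum the sales that fall in each half-open period. This is an illustration only, not the query itself; the helper names add_years and two_year_sums are hypothetical, and only merchants 11 and 23 from the question's data are included to keep it short.

```python
from datetime import date
from decimal import Decimal

# Subset of the question's data: (id_merchant, date, sell).
sales = [
    (11, date(2009, 7, 20), Decimal("1100.00")),
    (11, date(2005, 7, 27), Decimal("620.00")),
    (23, date(2011, 6, 11), Decimal("340.00")),
    (23, date(2012, 3, 22), Decimal("1000.00")),
    (23, date(2012, 4, 8), Decimal("1000.00")),
    (23, date(2012, 7, 13), Decimal("36.00")),
    (23, date(2013, 7, 17), Decimal("2480.00")),
    (23, date(2014, 4, 9), Decimal("1000.00")),
    (23, date(2014, 6, 10), Decimal("1500.00")),
    (23, date(2014, 7, 20), Decimal("700.50")),
]


def add_years(d, years):
    """Shift a date by whole years, clamping Feb 29 to Feb 28 if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def two_year_sums(sales, step_years=2):
    """Return (merchant, period_start, total) for each period per merchant."""
    by_merchant = {}
    for merchant, day, amount in sales:
        by_merchant.setdefault(merchant, []).append((day, amount))
    result = []
    for merchant in sorted(by_merchant):
        entries = by_merchant[merchant]
        start = min(d for d, _ in entries)
        last = max(d for d, _ in entries)
        while start <= last:  # same bounds as generate_series(min, max, step)
            end = add_years(start, step_years)
            # half-open period [start, end), like the join condition
            total = sum((a for d, a in entries if start <= d < end), Decimal("0"))
            result.append((merchant, start, total))
            start = end
    return result


for row in two_year_sums(sales):
    print(row)
```

Because each sale falls into exactly one period, the period totals for a merchant add up to that merchant's overall total, which is the property the original query lacked.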