Ordering the Amount range values (Ascending Order) in Postgres - postgresql

Hi, I want to show the result set in ascending order. I have created an SQL Fiddle for the same.
select amount_range as amount_range, count(*) as number_of_items,
sum(amount) as total_amount
from (
select *,case
when amount between 0.00 and 2500.00 then '<=$2,500.00'
when amount between 2500.01 and 5000.00 then '$2,500.01 - $5,000.00'
when amount between 5000.01 and 7500.00 then '$5,000.01 - $7,500.00'
when amount between 7500.01 and 10000.00 then '$7,500.01 - $10,000.00'
else '>$10,000.01' end as amount_range
from Sales ) a
group by amount_range order by amount_range;
My results should look like:
<=$2,500.00 4 5000
$2,500.01 - $5,000.00 3 12000
$5,000.01 - $7,500.00 2 13000
$7,500.01 - $10,000.00 1 10000
>$10,000.01 1 15000

The easiest method is to sort by a value from each grouping, for example the minimum amount:
select amount_range as amount_range,
count(*) as number_of_items,
sum(amount) as total_amount
from (
select *,case
when amount between 0.00 and 2500.00 then '<=$2,500.00'
when amount between 2500.01 and 5000.00 then '$2,500.01 - $5,000.00'
when amount between 5000.01 and 7500.00 then '$5,000.01 - $7,500.00'
when amount between 7500.01 and 10000.00 then '$7,500.01 - $10,000.00'
else '>$10,000.01' end as amount_range
from Sales ) a
group by amount_range
order by min(amount);
In Postgres, your subquery could also return an array where the first element is the desired position and the second is the string describing the bucket. Then, the outer query can ORDER BY your positioning value.
select amount_range[2] as amount_range,
count(*) as number_of_items,
sum(amount) as total_amount
from (
select *,case
when amount between 0.00 and 2500.00 then ARRAY['1','<=$2,500.00']
when amount between 2500.01 and 5000.00 then ARRAY['2','$2,500.01 - $5,000.00']
when amount between 5000.01 and 7500.00 then ARRAY['3', '$5,000.01 - $7,500.00']
when amount between 7500.01 and 10000.00 then ARRAY['4', '$7,500.01 - $10,000.00']
else ARRAY['5','>$10,000.01'] end as amount_range
from Sales ) a
group by amount_range
order by amount_range[1];
The first method happens to be simpler for your example. The second method would be useful if you were bucketing by something more complicated than ranges.
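For example, if you were bucketing on a hypothetical status column (not part of your Sales table; purely illustrative) and wanted a custom, non-alphabetical order, the same array trick still applies:
select bucket[2] as status_group,
       count(*) as number_of_items
from (
  select case
           when status in ('new', 'open') then ARRAY['1', 'Active']
           when status = 'on_hold'        then ARRAY['2', 'Waiting']
           else                                ARRAY['3', 'Closed']
         end as bucket
  from Sales ) a
group by bucket
order by bucket[1];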

Related

How to subtract a separate count from one grouping

I have a Postgres query like this:
select application.status as status, count(*) as "current_month" from application
where to_char(application.created, 'mon') = to_char('now'::timestamp - '1 month'::interval, 'mon')
and date_part('year',application.created) = date_part('year', CURRENT_DATE)
and application.job_status != 'expired'
group by application.status
it returns the table below that has the number of applications grouped by status for the current month. However, I want to subtract the total count of a separate but related query from the internal review number only. I want to count the number of rows with type = abc within the same table and for the same date range, and then subtract that amount from the internal review number (type is a separate field). current_month_desired is how it should look.
status          | current_month | current_month_desired
----------------+---------------+----------------------
fail            | 22            | 22
internal_review | 95            | 22
pass            | 146           | 146
UNTESTED: but maybe...
The intent here is to use a CASE expression inside SUM to conditionally count. This way, the subtraction is not needed in the first place as you are only "counting" the values needed.
SELECT application.status as status
, sum(case when type = 'abc'
and application.status = 'internal_review' then 0
else 1 end) as "current_month"
FROM application
WHERE to_char(application.created, 'mon') = to_char('now'::timestamp - '1 month'::interval, 'mon')
and date_part('year', application.created) = date_part('year', CURRENT_DATE)
and application.job_status != 'expired'
GROUP BY application.status
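As an aside (equally untested), Postgres 9.4+ also has the aggregate FILTER clause, which can express the same conditional count without a CASE expression:
SELECT application.status as status
, count(*) FILTER (WHERE NOT (type = 'abc' AND application.status = 'internal_review')) as "current_month"
FROM application
WHERE to_char(application.created, 'mon') = to_char('now'::timestamp - '1 month'::interval, 'mon')
and date_part('year', application.created) = date_part('year', CURRENT_DATE)
and application.job_status != 'expired'
GROUP BY application.status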

How to create buckets and groups within those buckets using PostgreSQL

How do I find the distribution of credit cards by year opened and completed transactions, grouping the credit cards into three buckets: less than 10 transactions, between 10 and 30 transactions, and more than 30 transactions?
The first method I tried was the width_bucket function in PostgreSQL, but the documentation says it only creates equidistant buckets, which is not what I want in this case. Because of that, I turned to case statements. However, I'm not sure how to use the case statement with a group by.
This is the data I am working with:
table 1 - credit_cards table
credit_card_id
year_opened
table 2 - transactions table
transaction_id
credit_card_id - matches credit_cards.credit_card_id
transaction_status ("complete" or "incomplete")
This is what I have gotten so far:
SELECT
CASE WHEN transaction_count < 10 THEN “Less than 10”
WHEN transaction_count >= 10 and transaction_count < 30 THEN “10 <= transaction count < 30”
ELSE transaction_count>=30 THEN “Greater than or equal to 30”
END as buckets
count(*) as ct.transaction_count
FROM credit_cards c
INNER JOIN transactions t
ON c.credit_card_id = t.credit_card_id
WHERE t.status = “completed”
GROUP BY v.year_opened
GROUP BY buckets
ORDER BY buckets
Expected output
credit card count | year opened | transaction count bucket
23421 | 2002 | Less than 10
etc
You can specify the bin sizes in width_bucket by passing a sorted array of the lower bound of each bin.
In your case, it would be array[10,30]: anything less than 10 gets bin 0, between 10 and 29 gets bin 1, and 30 or more gets bin 2.
WITH a AS (select generate_series(5,35) cnt)
SELECT cnt, width_bucket(cnt, array[10,30])
FROM a;
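Applied to the tables in the question (an untested sketch; it assumes the credit_cards/transactions columns listed above and reuses the bucket labels from the other answers), width_bucket can produce a bucket number that both drives the label and keeps the ordering correct:
with per_card as (
  select c.year_opened,
         c.credit_card_id,
         width_bucket(count(*)::int, array[10, 30]) as bkt
  from credit_cards c
  join transactions t on t.credit_card_id = c.credit_card_id
  where t.transaction_status = 'complete'
  group by c.year_opened, c.credit_card_id
)
select count(*) as credit_card_count,
       year_opened,
       case bkt when 0 then 'Less than 10'
                when 1 then 'Between [10 and 30)'
                else 'Greater than or equal to 30'
       end as buckets
from per_card
group by year_opened, bkt
order by year_opened, bkt;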
To solve this you need to count transactions per credit card to determine the right bucket, and then count the credit cards per bucket per year. There are a couple of different ways to get the final result. One way is to first join up all your data and compute the first level of aggregate values, then compute the final level of aggregate values:
with t1 as (
select year_opened
, c.credit_card_id
, case when count(*) < 10 then 'Less than 10'
when count(*) < 30 then 'Between [10 and 30)'
else 'Greater than or equal to 30'
end buckets
from credit_cards c
join transactions t
on t.credit_card_id = c.credit_card_id
where t.transaction_status = 'complete'
group by year_opened
, c.credit_card_id
)
select count(*) credit_card_count
, year_opened
, buckets
from t1
group by year_opened
, buckets;
However, it may be more performant to calculate the first level of aggregate data on the transactions table before joining it to the credit_cards table:
select count(*) credit_card_count
, year_opened
, buckets
from credit_cards c
join (select credit_card_id
, case when count(*) < 10 then 'Less than 10'
when count(*) < 30 then 'Between [10 and 30)'
else 'Greater than or equal to 30'
end buckets
from transactions
where transaction_status = 'complete'
group by credit_card_id) t
on t.credit_card_id = c.credit_card_id
group by year_opened
, buckets;
If you prefer to unroll the above query and use Common Table Expressions, you can do that too (I find this easier to read and follow along):
with bkt as (
select credit_card_id
, case when count(*) < 10 then 'Less than 10'
when count(*) < 30 then 'Between [10 and 30)'
else 'Greater than or equal to 30'
end buckets
from transactions
where transaction_status = 'complete'
group by credit_card_id
)
select count(*) credit_card_count
, year_opened
, buckets
from credit_cards c
join bkt t
on t.credit_card_id = c.credit_card_id
group by year_opened
, buckets;
Not sure if this is what you are looking for.
WITH cte
AS (
SELECT c.year_opened
,c.credit_card_id
,count(*) AS transaction_count
FROM credit_cards c
INNER JOIN transactions t ON c.credit_card_id = t.credit_card_id
WHERE t.transaction_status = 'complete'
GROUP BY c.year_opened
,c.credit_card_id
)
SELECT cte.year_opened AS "year opened"
,SUM(CASE
WHEN transaction_count < 10
THEN 1
ELSE 0
END) AS "Less than 10"
,SUM(CASE
WHEN transaction_count >= 10
AND transaction_count < 30
THEN 1
ELSE 0
END) AS "10 <= transaction count < 30"
,SUM(CASE
WHEN transaction_count >= 30
THEN 1
ELSE 0
END) AS "Greater than or equal to 30"
FROM CTE
GROUP BY cte.year_opened
and the output would be as below.
year opened | Less than 10 | 10 <= transaction count < 30 | Greater than or equal to 30
2002 | 23421 | |

PostgreSQL Sum and Percent of Total Subquery

The table has sales and I'm trying to calculate the sales dollars and the percent of total:
WITH t1 AS (
SELECT
product_id,
sum(financial_charge_cents) as cents
FROM "account_charges"
WHERE account_id = 5
AND status = 'Charged'
GROUP BY product_id
ORDER BY cents DESC
)
SELECT
product_id,
cast(cents as money) / 100.0 AS dollars,
cents / (sum(cents) OVER (PARTITION BY product_id)) as percent
FROM t1
ORDER BY dollars DESC
but this is returning 1 for all the percents and I'm not sure why.
Not sure if this is a type problem or what.
OVER (PARTITION BY product_id) makes each sum equal to that row's cents (t1 has one row per product_id), which is why every percent is 1. You do not need partitions:
SELECT
product_id,
cast(cents as money) / 100.0 AS dollars,
cents * 1.0 / (sum(cents) OVER ()) as percent
FROM t1
ORDER BY dollars DESC
Note, I have added *1.0 to get numeric results. Skip it if cents are numeric (not integer).
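A quick standalone illustration of the integer-division pitfall (not taken from your data):
select 40 / 160       as integer_division,  -- 0: integer / integer truncates
       40 * 1.0 / 160 as numeric_division;  -- 0.25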

How to select records in date order that total to an arbitrary amount?

I have a table of fuel deliveries as follows:
Date Time Qty
20160101 0800 4500
20160203 0900 6000
20160301 0810 3400
20160328 1710 5300
20160402 1201 6000
I know that on April 1st I had 10,000 litres in the tank, so now I want to select just the deliveries that make up that total. This means I want the records for 20160328, 20160301 and 20160203. I am using Postgres and I want to know how to structure a select statement that would accomplish this task.
I understand how to use the where clause to filter records whose date is less than or equal to April 1st, but I do not know how to instruct Postgres to select the records in reverse date order until the quantity selected is greater than or equal to 10,000.
The trick is to take every row whose running total is still below 10,000 and then union in the single row that pushes the total to 10,000 or more:
with d as (
select *, sum(qty) over (order by date desc, time desc) as total
from delivery
where date between '20160101' and '20160401'
)
select *
from d
where total < 10000
union
(
select *
from d
where total >= 10000
order by date desc, time desc
limit 1
)
order by date desc, time desc
;
date | time | qty | total
------------+----------+------+-------
2016-03-28 | 17:10:00 | 5300 | 5300
2016-03-01 | 08:10:00 | 3400 | 8700
2016-02-03 | 09:00:00 | 6000 | 14700
The data:
create table delivery (date date, time time, qty int);
insert into delivery (date, time, qty) values
('20160101','0800',4500),
('20160203','0900',6000),
('20160301','0810',3400),
('20160328','1710',5300),
('20160402','1201',6000);
You can create a running total using a window function based on descending order of date and time, like so:
SELECT
Date,
Time,
Qty
FROM
(
SELECT
Date,
Time,
Qty,
SUM(Qty) OVER (ORDER BY Date DESC, Time DESC) AS Running_Total
FROM
fuel_deliveries
WHERE
Date < '20160402'
) rt
WHERE
Running_Total - Qty < 10000;
The inner/sub query gets you the running total; the outer query then keeps each row whose running total before that row (Running_Total - Qty) is still below 10,000, so the delivery that pushes the total to 10,000 or more is included as well.

Postgres calculate growth rate over two months

I would like to calculate the growth rate for customers for the following data.
month | customers
-------------------------
01-2015 | 1
02-2015 | 10
03-2015 | 10
06-2015 | 15
I have used the following query to calculate the growth rate. It only works for a one-month interval and cannot give the expected output because of the gap between the 3rd and 6th months shown in the table above:
select
month, total,
(total::float / lag(total) over (order by month) - 1) * 100 growth
from (
select to_char(created, 'yyyy-mm') as month, count(id) total
from customers
group by month
) s
order by month;
I think this can be done by creating a date range and grouping by that range.
I expect two main outputs, separately:
1) the growth rate with an exact one-month difference
2) the growth rate over a two-month interval instead of a single month; in the above data, sum the results over each two-month period and group by that period instead of by month
Still not sure about the second part. Here's the growth from your adapted query, plus a two-month grouping column (m2):
select
month, total,
(total::float / lag(total) over (order by m) - 1) * 100 growth,m,m2
from (
select created, (sum(customers) over (order by m))::float total,customers,m,m2,to_char(created, 'yyyy-mm') as month
from customers c
right outer join (
select generate_series('2015-01-01','2015-06-01','1 month'::interval) m
) m1 on m=c.created
left outer join (
select generate_series('2015-01-01','2015-06-01','2 month'::interval) m2
) m2 on m2=m
order by m
) s
order by m;
Basically, the answer is to use generate_series.
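A minimal sketch of that idea (untested; it assumes the customers table has id and created columns, as in your original query): generate a gap-free series of months, left join the per-month counts onto it, and then lag() runs over the gap-free series:
with months as (
  select generate_series('2015-01-01'::date, '2015-06-01'::date, interval '1 month') as m
), per_month as (
  select date_trunc('month', created) as m, count(id) as total
  from customers
  group by 1
)
select to_char(months.m, 'yyyy-mm') as month,
       coalesce(per_month.total, 0) as total,
       (coalesce(per_month.total, 0)::float
         / nullif(lag(coalesce(per_month.total, 0)) over (order by months.m), 0) - 1) * 100 as growth
from months
left join per_month on per_month.m = months.m
order by months.m;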