Trying to aggregate price and orders, make the output less granular - postgresql

Only three months into SQL, so excuse my ignorance.
I'm stuck on something, and I'm not sure the approach I'm following is correct.
I have two values, price and quantity. I'm trying to average the price or group it into smaller ranges, then add up the quantities within each range.
Price Quantity
20289.7 0.001
21320 1.798
20259.4 1.724
20365 2.1
21066.6 0.055
20517.8 0.002
20836.9 0.037
Let's say, for every $50 range of the price column on the left, how many orders are placed in that range.
I've seen something like the following:
SELECT *
FROM (
SELECT CASE
WHERE price BETWEEN 15000 AND 18000 then '10 -18'
WHERE price BETWEEN 18000 AND 19000 then '18 - 19'
WHERE price BETWEEN 19000 AND 20000 then '19 - 20'
but I'm not sure if this is the correct path.

This is the right approach, but the syntax is incorrect (CASE branches use WHEN, not WHERE, and the expression must close with END):
SELECT
    CASE
        WHEN price < 15000 THEN '<15'
        -- Half-open ranges: each WHEN only sees prices the earlier ones rejected,
        -- so fractional prices such as 18000.50 cannot fall between two buckets.
        WHEN price < 18000 THEN '15 - 18'
        WHEN price < 19000 THEN '18 - 19'
        WHEN price < 20000 THEN '19 - 20'
        ELSE '>=20'
    END AS price_range,
    sum(quantity) AS sum_quantity
FROM
    orders
GROUP BY
    price_range;  -- PostgreSQL accepts the output column name here
Always include catch-all branches (here the first WHEN and the ELSE) in case a new price is inserted that falls outside the listed ranges.
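For the $50 ranges the question asks about, the bucket can be computed rather than listed by hand: round each price down to a multiple of 50 and sum the quantities per bucket. A minimal Python sketch of that arithmetic on the sample rows (bucket_totals and WIDTH are illustrative names, not part of the original answer):

```python
from collections import defaultdict
from decimal import Decimal

WIDTH = 50  # bucket size in dollars

# Sample rows from the question: (price, quantity), kept as strings for exact decimals.
ROWS = [
    ("20289.7", "0.001"),
    ("21320", "1.798"),
    ("20259.4", "1.724"),
    ("20365", "2.1"),
    ("21066.6", "0.055"),
    ("20517.8", "0.002"),
    ("20836.9", "0.037"),
]


def bucket_totals(rows, width=WIDTH):
    """Sum quantity per price bucket; each bucket starts at a multiple of width."""
    totals = defaultdict(Decimal)
    for price, quantity in rows:
        lower = (Decimal(price) // width) * width  # round down to the bucket start
        totals[lower] += Decimal(quantity)
    return dict(sorted(totals.items()))


totals = bucket_totals(ROWS)
for lower, quantity in totals.items():
    print(f"{lower} - {lower + WIDTH}: {quantity}")
```

In SQL the same bucket start is floor(price / 50) * 50, which can be used in both the SELECT list and the GROUP BY.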

Add condition to where clause in q/kdb+

Table Tab
minThreshold maxThreshold point
1000 10000 10
wClause,:enlist((';~:;<);`qty;Tab[`minThreshold])
I'm trying to incorporate the maxThreshold column into the where clause as well:
qty >= minThreshold
qty <= maxThreshold
Something like:
wClause,:enlist((';~:;<);`qty;Tab[`minThreshold]);Tab[`maxThreshold])
q)Tab:([] minThreshold:500 1000;maxThreshold:700 2000;point:5 10)
q)Tab
minThreshold maxThreshold point
-------------------------------
500 700 5
1000 2000 10
q)select from Tab where minThreshold>=900,maxThreshold<=2500
minThreshold maxThreshold point
-------------------------------
1000 2000 10
q)parse"select from Tab where minThreshold>=900,maxThreshold<=2500"
?
`Tab
,(((';~:;<);`minThreshold;900);((';~:;>);`maxThreshold;2500))
0b
()
q)?[Tab;((>=;`minThreshold;900);(<=;`maxThreshold;2500));0b;()]
minThreshold maxThreshold point
-------------------------------
1000 2000 10
See the whitepaper for more information on functional selects:
https://code.kx.com/q/wp/parse-trees/
Is your problem that
1. you have a Where phrase that works for functional qSQL and you want to extend it, or
2. you want to select rows of a table where the value of a quantity falls within an upper and lower bound?
If (2), you can use Join Each to get the bounds for each row, and within to test the quantity.
q)show t:([]lwr:1000 900 150;upr:10000 25000 500;qty:10 1000 450)
lwr upr qty
---------------
1000 10000 10
900 25000 1000
150 500 450
q)select from t where qty within' lwr{x,y}'upr
lwr upr qty
--------------
900 25000 1000
150 500 450
Above we use {x,y} because in qSQL queries comma does not denote Join.
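For readers less familiar with q, the row-wise test can be sketched in Python. This is only an illustration of what the query checks, using the rows of t from above; rows_within is a made-up helper name, and the comparison is inclusive at both ends, as within is.

```python
# Rows of the table t from the answer: (lwr, upr, qty).
ROWS = [
    (1000, 10000, 10),
    (900, 25000, 1000),
    (150, 500, 450),
]


def rows_within(rows):
    """Keep rows whose qty lies in the inclusive range [lwr, upr]."""
    return [(lwr, upr, qty) for lwr, upr, qty in rows if lwr <= qty <= upr]


selected = rows_within(ROWS)
print(selected)
```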

How to calculate rest of the amount after comparing current date in pyspark dataframe?

I need to calculate how much I have left in my account as of today. That is, as of the current day,
how much of my original Total_salary remains.
Below is my sample data set.
start_date end_date duration(months) Total_salary left_amount
2021-05-03 2022-05-03 12 1200 400
2019-01-01 2023-01-01 48 4800 2300
2018-01-01 2020-01-01 24 2400 0
2020-01-01 2023-01-01 36 3600 1200
2024-01-01 2027-01-01 36 3600 3600
I need the amount left as of the current date, including the case where end_date < current date.
Take the first row as an example: I agreed with a client to work for 12 months for a total
salary of 1200, so each month I receive 100. I need to know how much of my original
total_salary is left today. (100*8 = 800, 1200-800 = 400)
I don't know how to get the sum up to the current date.
I need to implement this in PySpark. Can anyone help me sort this out?
Thank you
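import datetime

from pyspark.sql import functions as F

current_date = datetime.date.today()

result = (
    df
    # Months still to run on the contract; 0 once end_date has passed.
    .withColumn('left_months', F.greatest(F.lit(0), F.months_between('end_date', F.lit(current_date))))
    # Monthly salary multiplied by the months remaining.
    .withColumn('left_amount', F.col('total_salary')/F.col('duration(months)') * F.col('left_months'))
    # A contract that has not started yet cannot have more than the total left.
    .withColumn('left_amount', F.least('total_salary', 'left_amount'))
)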
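The arithmetic behind the three columns can be checked without Spark. The sketch below is a plain-Python approximation, not the answer's code: left_amount is a hypothetical helper, it counts whole calendar months only (Spark's months_between returns fractional months), and the "current date" is fixed at 2022-01-03 so that the first row reproduces the 8-months-worked example from the question.

```python
import datetime


def left_amount(end_date, total_salary, duration_months, today):
    """Salary not yet earned as of `today`, prorated by whole calendar months."""
    months_left = (end_date.year - today.year) * 12 + (end_date.month - today.month)
    # Clamp: 0 for finished contracts, the full duration for ones not yet started.
    months_left = max(0, min(duration_months, months_left))
    return total_salary / duration_months * months_left


today = datetime.date(2022, 1, 3)  # assumed "current date", chosen for reproducibility

first_row = left_amount(datetime.date(2022, 5, 3), 1200, 12, today)
ended = left_amount(datetime.date(2020, 1, 1), 2400, 24, today)
not_started = left_amount(datetime.date(2027, 1, 1), 3600, 36, today)
print(first_row, ended, not_started)
```

The two clamps correspond to F.greatest(F.lit(0), ...) and F.least('total_salary', ...) in the PySpark version.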

How can I create a list of conversion rates and currency types from a table with converted values in BigQuery SQL?

I have a big table with many rows. A data example is the following:
Currency Value Value_in_NOK
USD 100 800
USD 200 1600
SEK 120 108
USD 400 3200
SEK 240 216
USD 300 2400
EUR 15 150
EUR 30 300
The converted value is always in NOK.
What I want is to use a SELECT statement to create a distinct list of currencies, including NOK, with the currency rate taken from the first row for each distinct Currency:
Currency Currency_Rate
USD 8.000
SEK 0.900
EUR 10.000
NOK 1.000
Assuming there is some column in your table that defines the order of rows, for example a timestamp (ts):
select Currency, array_agg(round(Value_in_NOK/Value, 3) order by ts limit 1)[offset(0)] as Currency_Rate
from your_table
group by Currency
union all
select 'NOK', 1.000
Applied to the sample data in your question, this returns the rates you listed: USD 8.0, SEK 0.9, EUR 10.0 and NOK 1.0.
This is the code I ended up with; it works perfectly.
SELECT
  Opportunity_First_year_value_Currency,
  ARRAY_AGG(
    ROUND(SAFE_CAST(Opportunity_First_year_value_converted AS NUMERIC) / SAFE_CAST(Opportunity_First_year_value AS NUMERIC), 5)
    ORDER BY Opportunity_Close_Date DESC
    LIMIT 1
  )[OFFSET(0)] AS Currency_Rate
FROM
  `JOINED_Opportunity`
WHERE
  SAFE_CAST(Opportunity_First_year_value_converted AS NUMERIC) > 0
GROUP BY
  Opportunity_First_year_value_Currency
UNION ALL
SELECT
  'NOK',
  1.00000
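The "first row per currency" logic can be sketched in Python on the sample data from the question. This is an illustration only: first_row_rates is a made-up helper, and list order stands in for the ordering column that the SQL versions rely on.

```python
# Sample rows from the question, in table order: (currency, value, value_in_nok).
ROWS = [
    ("USD", 100, 800),
    ("USD", 200, 1600),
    ("SEK", 120, 108),
    ("USD", 400, 3200),
    ("SEK", 240, 216),
    ("USD", 300, 2400),
    ("EUR", 15, 150),
    ("EUR", 30, 300),
]


def first_row_rates(rows):
    """Map each currency to the NOK rate taken from its first row."""
    rates = {}
    for currency, value, value_in_nok in rows:
        # setdefault keeps the rate from the first row seen for each currency.
        rates.setdefault(currency, round(value_in_nok / value, 3))
    rates["NOK"] = 1.0  # the target currency converts to itself at 1
    return rates


rates = first_row_rates(ROWS)
print(rates)
```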

Postgresql - Partial sum per day and overall in one query

I've got a table of different transactions with the corresponding timestamps:
Table: Transactions
Recipient Amount Date
--------------------------------------------------
Bob 52 2019-04-21 11:06:32
Jack 12 2019-06-26 12:08:11
Jill 50 2019-04-19 24:50:26
Bob 90 2019-03-20 16:34:35
Jack 81 2019-03-25 12:26:54
Jenny 53 2019-04-20 09:07:02
Jenny 5 2019-03-29 06:15:35
Now I want to get all of Jack's transactions for today and overall:
Result
Person Amount_Today Amount_Overall
-----------------------------------------------
Jack 12 93
What's the most performant way to achieve this in PostgreSQL? At the moment I run two queries; this one is for Amount_Today:
select Recipient, sum(Amount)
from Transactions
where Recipient = 'Jack'
and created_at > NOW() - INTERVAL '1 day'
But that doesn't seem like the right way.
You can use the filter clause:
select Recipient,
sum(Amount) as Amount_Overall,
sum(Amount) FILTER (WHERE created_at > NOW() - INTERVAL '1 day') as Amount_Today
from Transactions
where Recipient = 'Jack'
GROUP BY recipient;
You have probably realized this, but now() - interval '1 day' is not really today; it is the last 24 hours. You could use date_trunc('day', now()) as the lower bound if you want just today.
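The difference between "the last 24 hours" and "today" is easy to see on a small example. The Python sketch below is an illustration, not PostgreSQL: amounts is a hypothetical helper, the "now" value is fixed, and the 30 paid late yesterday is an invented row added to make the two windows differ.

```python
import datetime


def amounts(transactions, now):
    """Return (last_24_hours, today, overall) totals for (amount, timestamp) pairs."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    day_ago = now - datetime.timedelta(days=1)
    last_24_hours = sum(a for a, ts in transactions if ts > day_ago)
    today = sum(a for a, ts in transactions if ts >= midnight)
    overall = sum(a for a, _ in transactions)
    return last_24_hours, today, overall


now = datetime.datetime(2019, 6, 26, 13, 0, 0)  # assumed current time
rows = [
    (30, datetime.datetime(2019, 6, 25, 22, 0, 0)),   # hypothetical: late yesterday
    (12, datetime.datetime(2019, 6, 26, 12, 8, 11)),  # this morning
    (81, datetime.datetime(2019, 3, 25, 12, 26, 54)),  # months ago
]
result = amounts(rows, now)
print(result)
```

Yesterday's 22:00 payment counts toward the 24-hour window but not toward today, which is the distinction the date_trunc lower bound makes.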

Ordering the Amount range values (Ascending Order) in Postgres

Hi, I want to show the result set in ascending order of the amount ranges. I have created a SQL Fiddle for this.
select amount_range as amount_range, count(*) as number_of_items,
sum(amount) as total_amount
from (
select *,case
when amount between 0.00 and 2500.00 then '<=$2,500.00'
when amount between 2500.01 and 5000.00 then '$2,500.01 - $5,000.00'
when amount between 5000.01 and 7500.00 then '$5,000.01 - $7,500.00'
when amount between 7500.01 and 10000.00 then '$7,500.01 - $10,000.00'
else '>$10,000.01' end as amount_range
from Sales ) a
group by amount_range order by amount_range;
My Results should be like
<=$2,500.00 4 5000
$2,500.01 - $5,000.00 3 12000
$5,000.01 - $7,500.00 2 13000
$7,500.01 - $10,000.00 1 10000
>$10,000.01 1 15000
The easiest method is to sort on a value from each grouping, for example the minimum amount:
select amount_range as amount_range,
count(*) as number_of_items,
sum(amount) as total_amount
from (
select *,case
when amount between 0.00 and 2500.00 then '<=$2,500.00'
when amount between 2500.01 and 5000.00 then '$2,500.01 - $5,000.00'
when amount between 5000.01 and 7500.00 then '$5,000.01 - $7,500.00'
when amount between 7500.01 and 10000.00 then '$7,500.01 - $10,000.00'
else '>$10,000.01' end as amount_range
from Sales ) a
group by amount_range
order by min(amount);
In Postgres, your subquery could also return an array where the first element is the desired position and the second is the string describing the bucket. Then, the outer query can ORDER BY your positioning value.
select amount_range[2] as amount_range,
count(*) as number_of_items,
sum(amount) as total_amount
from (
select *,case
when amount between 0.00 and 2500.00 then ARRAY['1','<=$2,500.00']
when amount between 2500.01 and 5000.00 then ARRAY['2','$2,500.01 - $5,000.00']
when amount between 5000.01 and 7500.00 then ARRAY['3', '$5,000.01 - $7,500.00']
when amount between 7500.01 and 10000.00 then ARRAY['4', '$7,500.01 - $10,000.00']
else ARRAY['5','>$10,000.01'] end as amount_range
from Sales ) a
group by amount_range
order by amount_range[1];
The first method happens to be simpler for your example. The second method would be useful if you were bucketing by something more complicated than ranges.
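The idea behind the second method, pairing each label with an explicit position and sorting on the position, can be sketched in Python. This is a rough illustration: bucket and summarize are made-up names, the amounts are hypothetical because the SQL Fiddle data is not shown in the question, and amounts are assumed to be non-negative.

```python
from collections import defaultdict

# (upper bound, position, label); anything above the last bound is the catch-all.
BUCKETS = [
    (2500.00, 1, "<=$2,500.00"),
    (5000.00, 2, "$2,500.01 - $5,000.00"),
    (7500.00, 3, "$5,000.01 - $7,500.00"),
    (10000.00, 4, "$7,500.01 - $10,000.00"),
]
CATCH_ALL = (5, ">$10,000.01")


def bucket(amount):
    """Return (position, label) for a non-negative amount."""
    for upper, position, label in BUCKETS:
        if amount <= upper:
            return position, label
    return CATCH_ALL


def summarize(amounts):
    """Group amounts by bucket and return (label, count, total) in bucket order."""
    groups = defaultdict(list)
    for amount in amounts:
        groups[bucket(amount)].append(amount)
    # Tuples sort by position first, so the labels come out in bucket order.
    return [(label, len(v), sum(v)) for (_, label), v in sorted(groups.items())]


summary = summarize([15000, 1000, 6000, 4000, 10000, 2000])  # hypothetical amounts
for row in summary:
    print(row)
```

Sorting the label strings themselves would not give this order, which is why a separate position (or min(amount), as in the first method) is needed.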