How do I join multiple select results into a single table? - postgresql

I have a query that returns monthly averages from a table for one pressure level:
SELECT some_id, avg(exposure_value) monthly_avg_1000
FROM mytable
WHERE pressure_level = 1000
AND some_id = 7
GROUP BY some_id, date_trunc('month', measurement_time)
I then have the same query, but for a different pressure_level:
SELECT some_id, avg(exposure_value) monthly_avg_925
FROM mytable
WHERE pressure_level = 925
AND some_id = 7
GROUP BY some_id, date_trunc('month', measurement_time)
Both queries return 12 rows (1 per month) with the ID and the average value for the month:
some_id | monthly_avg_1000
--------------------------
1 | 0.000023
1 | 0.000051
1 | 0.000009
some_id | monthly_avg_925
--------------------------
1 | 0.000014
1 | 0.000007
1 | 0.000131
I would like to combine the two queries so that the monthly_avg_* columns all appear in the final table:
some_id | monthly_avg_1000 | monthly_avg_925
--------------------------
1 | 0.000023 | 0.000014
1 | 0.000051 | 0.000007
1 | 0.000009 | 0.000131
How can I do this?

If both result sets share the same ID and month, you can put each query in a CTE and join them:
with a as (
SELECT some_id, avg(exposure_value) monthly_avg_1000,date_trunc('month', measurement_time) d
FROM mytable
WHERE pressure_level = 1000
AND some_id = 7
GROUP BY some_id, date_trunc('month', measurement_time)
)
, b as (
SELECT some_id, avg(exposure_value) monthly_avg_925, date_trunc('month', measurement_time) d
FROM mytable
WHERE pressure_level = 925
AND some_id = 7
GROUP BY some_id, date_trunc('month', measurement_time)
)
select distinct a.some_id, monthly_avg_1000,monthly_avg_925
from a
join b on a.some_id = b.some_id and a.d = b.d
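To try the CTE join above without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. The sample rows are made up, and strftime('%Y-%m', ...) stands in for date_trunc('month', ...), which SQLite does not have. Note that an inner join drops any month that is missing from one of the two pressure levels.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "some_id INTEGER, pressure_level INTEGER, "
    "measurement_time TEXT, exposure_value REAL)"
)
# Made-up measurements: two months, two pressure levels.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (7, 1000, "2020-01-05", 2.0),
        (7, 1000, "2020-01-20", 4.0),
        (7, 1000, "2020-02-10", 6.0),
        (7, 925, "2020-01-07", 1.0),
        (7, 925, "2020-02-11", 3.0),
        (7, 925, "2020-02-25", 5.0),
    ],
)

# One CTE per pressure level, joined on id and month.
query = """
WITH a AS (
    SELECT some_id,
           strftime('%Y-%m', measurement_time) AS d,
           avg(exposure_value) AS monthly_avg_1000
    FROM mytable
    WHERE pressure_level = 1000 AND some_id = 7
    GROUP BY some_id, strftime('%Y-%m', measurement_time)
), b AS (
    SELECT some_id,
           strftime('%Y-%m', measurement_time) AS d,
           avg(exposure_value) AS monthly_avg_925
    FROM mytable
    WHERE pressure_level = 925 AND some_id = 7
    GROUP BY some_id, strftime('%Y-%m', measurement_time)
)
SELECT a.some_id, a.d, a.monthly_avg_1000, b.monthly_avg_925
FROM a
JOIN b ON a.some_id = b.some_id AND a.d = b.d
ORDER BY a.d
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Selecting the month column d alongside the averages makes it easier to see which row belongs to which month.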

Related

How to drop rows if a variable is less than x, in SQL

I have the following query code
query = """
with double_entry_book as (
SELECT to_address as address, value as value
FROM `bigquery-public-data.crypto_ethereum.traces`
WHERE to_address is not null
AND block_timestamp < '2022-01-01 00:00:00'
AND status = 1
AND (call_type not in ('delegatecall', 'callcode', 'staticcall') or call_type is null)
union all
-- credits
SELECT from_address as address, -value as value
FROM `bigquery-public-data.crypto_ethereum.traces`
WHERE from_address is not null
AND block_timestamp < '2022-01-01 00:00:00'
AND status = 1
AND (call_type not in ('delegatecall', 'callcode', 'staticcall') or call_type is null)
)
SELECT address,
sum(value) / 1000000000000000000 as balance
from double_entry_book
group by address
order by balance desc
LIMIT 15000000
"""
In the last part, I want to drop rows where "balance" is less than, let's say, 0.02 and then group, order, etc. I imagine this should be a simple code. Any help will be appreciated!
We can delete in a CTE and use RETURNING to get the ids of the rows being deleted, but the rows remain visible to the rest of the statement until it completes.
CREATE TABLE t (
id serial,
variale int);
insert into t (variale) values
(1),(2),(3),(4),(5);
with del as
(delete from t
where variale < 3
returning id)
select
t.id,
t.variale,
del.id ids_being_deleted
from t
left join del
on t.id = del.id;
id | variale | ids_being_deleted
-: | ------: | ----------------:
1 | 1 | 1
2 | 2 | 2
3 | 3 | null
4 | 4 | null
5 | 5 | null
select * from t;
id | variale
-: | ------:
3 | 3
4 | 4
5 | 5
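If the goal is only to leave low balances out of the aggregated result, rather than to delete stored rows, a HAVING clause on the grouped sum may be enough. Below is a minimal sketch with Python's sqlite3 module. The table contents are made up, the values are assumed to be already in whole units (so the division by 10^18 is omitted), and the 0.02 threshold comes from the question.

```python
import sqlite3

# In-memory SQLite database; the table mimics the double_entry_book CTE.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE double_entry_book (address TEXT, value REAL)")
# Made-up ledger entries: positive values are credits, negative are debits.
conn.executemany(
    "INSERT INTO double_entry_book VALUES (?, ?)",
    [
        ("a", 5.0),
        ("a", -1.0),
        ("b", 0.01),
        ("c", 3.0),
        ("c", -2.99),
        ("d", 0.5),
    ],
)

# HAVING filters groups after aggregation, so it can test sum(value).
query = """
SELECT address, sum(value) AS balance
FROM double_entry_book
GROUP BY address
HAVING sum(value) >= 0.02
ORDER BY balance DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

WHERE cannot be used for this condition because it runs before the grouping, when the per-address sum does not exist yet.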

How to force query to return only first row from window?

I have data:
id | price | date
1 | 25 | 2019-01-01
2 | 35 | 2019-01-01
1 | 27 | 2019-02-01
2 | 37 | 2019-02-01
Is it possible to write a query that returns only the first row from each window? Something like LIMIT 1, but per window OVER (date)?
I expect the following result:
id | price | date
1 | 25 | 2019-01-01
1 | 27 | 2019-02-01
Or ignore the whole window if its first row has a NULL price:
id | price | date
1 | NULL | 2019-01-01
2 | 35 | 2019-01-01
1 | 27 | 2019-02-01
2 | 37 | 2019-02-01
result:
1 | 27 | 2019-02-01
Order the rows by date and id, and take only the first row per date.
Then remove those where the price is NULL.
SELECT *
FROM (SELECT DISTINCT ON (date)
id, price, date
FROM mytable
ORDER BY date, id
) AS q
WHERE price IS NOT NULL;
@Laurenz, let me provide a bit more explanation.
select distinct on (<fldlist>) * from <table> order by <fldlist+>;
is equivalent (apart from the extra rn column) to the much more complex query:
select * from (
select row_number() over (partition by <fldlist> order by <fldlist+>) as rn,*
from <table>) as t
where rn = 1;
Here <fldlist> must be a leading part of (or equal to) <fldlist+>.
As Myon on IRC said:
if you want to use a window function in WHERE, you need to put it into a subselect first
So the target query is:
select * from (
select
*,
agg_function( my_field ) OVER( PARTITION BY other_field ) as agg_field
from sometable
) x
WHERE agg_field <condition>
In my case the query is:
SELECT * FROM (
SELECT *,
FIRST_VALUE( p.price ) over( PARTITION BY crate.app_period ORDER BY st.DEPTH ) AS first_price,
ROW_NUMBER() over( PARTITION BY crate.app_period ORDER BY st.DEPTH ) AS row_number
FROM st
LEFT JOIN price p ON <COND>
LEFT JOIN currency_rate crate ON <COND>
) p
WHERE p.row_number = 1 AND p.first_price IS NOT null
Here I select only the first row of each group, and only where the price IS NOT NULL.
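The row_number() pattern from the answers above can be checked against the sample data from the question. This is a minimal sketch using Python's sqlite3 module, assuming an SQLite build with window function support (3.25 or later). SQLite has no DISTINCT ON, so only the row_number() form is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, price INTEGER, date TEXT)")
# Second sample from the question: the first row of the first window is NULL.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, None, "2019-01-01"),
        (2, 35, "2019-01-01"),
        (1, 27, "2019-02-01"),
        (2, 37, "2019-02-01"),
    ],
)

# Number the rows inside each date, keep the first one,
# then discard it if its price is NULL.
query = """
SELECT id, price, date
FROM (
    SELECT id, price, date,
           row_number() OVER (PARTITION BY date ORDER BY id) AS rn
    FROM mytable
) AS q
WHERE rn = 1 AND price IS NOT NULL
ORDER BY date
"""
rows = conn.execute(query).fetchall()
print(rows)
```

The window function has to live in the subquery because WHERE is evaluated before window functions are computed.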

PL/SQL query with recursive looping

I have only one table, "tbl_test", with the fields shown below.
tbl_test table
trx_id | proj_num | parent_num|
1 | 14 | 0 |
2 | 14 | 1 |
3 | 14 | 2 |
4 | 14 | 0 |
5 | 14 | 3 |
6 | 15 | 0 |
The result I want when trx_id 5 is fetched:
It is a parent-child relationship, so:
trx_id -> parent_num
5 -> 3
3 -> 2
2 -> 1
That means output value:
3
2
1
That is, get the whole parent chain.
The query I used:
SELECT * FROM (
WITH RECURSIVE tree_data(project_num, task_num, parent_task_num) AS(
SELECT project_num, task_num, parent_task_num
FROM tb_task
WHERE project_num = 14 and task_num = 5
UNION ALL
SELECT child.project_num, child.task_num, child.parent_task_num
FROM tree_data parent Join tb_task child
ON parent.task_num = child.task_num AND parent.task_num = child.parent_task_num
)
SELECT project_num, task_num, parent_task_num
FROM tree_data
) AS tree_list ;
Can anybody help me?
There's no need to do this with pl/pgsql. You can do it straight in SQL. Consider:
WITH RECURSIVE my_tree AS (
SELECT trx_id as id, parent_num as parent, trx_id::text as path, 1 as level
FROM tbl_test
WHERE trx_id = 5 -- start value
UNION ALL
SELECT t.trx_id, t.parent_num, p.path || ',' || t.trx_id::text, p.level + 1
FROM my_tree p
JOIN tbl_test t ON t.trx_id = p.parent
)
select * from my_tree;
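For a quick check of the recursive approach against the sample table, here is a minimal sketch using Python's sqlite3 module, which also supports WITH RECURSIVE. The CTE name chain and the depth column are choices made for this sketch, and parent_num = 0 is assumed to mean "no parent", as the sample data suggests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_test ("
    "trx_id INTEGER PRIMARY KEY, proj_num INTEGER, parent_num INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO tbl_test VALUES (?, ?, ?)",
    [(1, 14, 0), (2, 14, 1), (3, 14, 2), (4, 14, 0), (5, 14, 3), (6, 15, 0)],
)

# Start at the requested row, then repeatedly step to the parent row.
# The recursion stops once no row has trx_id equal to the current parent_num.
query = """
WITH RECURSIVE chain(trx_id, parent_num, depth) AS (
    SELECT trx_id, parent_num, 1
    FROM tbl_test
    WHERE trx_id = ?
    UNION ALL
    SELECT t.trx_id, t.parent_num, c.depth + 1
    FROM chain c
    JOIN tbl_test t ON t.trx_id = c.parent_num
)
SELECT parent_num
FROM chain
WHERE parent_num <> 0
ORDER BY depth
"""
parents = [row[0] for row in conn.execute(query, (5,))]
print(parents)
```

The depth column exists only to return the parents in order from nearest to farthest.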
If you are using PostgreSQL, try using a WITH clause:
WITH regional_sales AS (
SELECT region, SUM(amount) AS total_sales
FROM orders
GROUP BY region
), top_regions AS (
SELECT region
FROM regional_sales
WHERE total_sales > (SELECT SUM(total_sales)/10 FROM regional_sales)
)
SELECT region,
product,
SUM(quantity) AS product_units,
SUM(amount) AS product_sales
FROM orders
WHERE region IN (SELECT region FROM top_regions)
GROUP BY region, product;

Iterate through rows, compare them against each other and store results in another table

I have a table that contains the following rows:
product_id | order_date
A | 12/04/12
A | 01/11/13
A | 01/21/13
A | 03/05/13
B | 02/14/13
B | 03/09/13
What I need now is a monthly overview of how many products were bought for the first time (= not bought the month before), how many are existing products (= also bought the month before), and how many were not purchased in a given month. Taking the sample above as input, the script should deliver the following result, regardless of the period of time the data covers:
month | new | existing | nopurchase
12/2012 | 1 | 0 | 0
01/2013 | 0 | 1 | 0
02/2013 | 1 | 0 | 1
03/2013 | 1 | 1 | 0
Would be great to get a first hint how this could be solved so I'm able to continue.
Thanks!
with t as (
select product_id pid, date_trunc('month', order_date)::date od
from t
group by 1, 2
)
select od,
sum(is_new::integer) "new",
sum(is_existing::integer) existing,
sum(not_purchased::integer) nopurchase
from (
select od,
lag(t_pid) over(partition by s_pid order by od) is null and t_pid is not null is_new,
lag(t_pid) over(partition by s_pid order by od) is not null and t_pid is not null is_existing,
lag(t_pid) over(partition by s_pid order by od) is not null and t_pid is null not_purchased
from (
select t.pid t_pid, s.pid s_pid, s.od
from
t
right join
(
select pid, s.od
from
t
cross join
(
select date_trunc('month', d)::date od
from
generate_series(
(select min(od) from t),
(select max(od) from t),
'1 month'
) s(d)
) s
group by pid, s.od
) s on t.od = s.od and t.pid = s.pid
) s
) s
group by 1
order by 1
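The same counts can also be reached with a different formulation than the lag() query above: build a product-by-month grid, then left join each cell to the purchases of that month and of the previous month. The sketch below uses Python's sqlite3 module; the table name orders, the ISO date format, and the output column names are assumptions of this sketch, and a recursive CTE replaces generate_series, which SQLite lacks by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (product_id TEXT, order_date TEXT)")
# Sample rows from the question, rewritten as ISO dates.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("A", "2012-12-04"),
        ("A", "2013-01-11"),
        ("A", "2013-01-21"),
        ("A", "2013-03-05"),
        ("B", "2013-02-14"),
        ("B", "2013-03-09"),
    ],
)

query = """
WITH RECURSIVE
purchases AS (  -- one row per product and month with a purchase
    SELECT DISTINCT product_id, strftime('%Y-%m', order_date) AS ym
    FROM orders
),
months(first_day) AS (  -- every month between the first and last order
    SELECT date((SELECT min(order_date) FROM orders), 'start of month')
    UNION ALL
    SELECT date(first_day, '+1 month')
    FROM months
    WHERE first_day < (SELECT date(max(order_date), 'start of month') FROM orders)
),
grid AS (  -- every product paired with every month
    SELECT p.product_id,
           strftime('%Y-%m', m.first_day) AS ym,
           strftime('%Y-%m', date(m.first_day, '-1 month')) AS prev_ym
    FROM (SELECT DISTINCT product_id FROM orders) p
    CROSS JOIN months m
)
SELECT g.ym,
       SUM(cur.product_id IS NOT NULL AND prev.product_id IS NULL) AS new_count,
       SUM(cur.product_id IS NOT NULL AND prev.product_id IS NOT NULL) AS existing_count,
       SUM(cur.product_id IS NULL AND prev.product_id IS NOT NULL) AS nopurchase_count
FROM grid g
LEFT JOIN purchases cur
       ON cur.product_id = g.product_id AND cur.ym = g.ym
LEFT JOIN purchases prev
       ON prev.product_id = g.product_id AND prev.ym = g.prev_ym
GROUP BY g.ym
ORDER BY g.ym
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A product that was bought neither this month nor the previous one falls into none of the three counters, which matches the expected table in the question.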

Joining many tables on same data and returning all rows

UPDATE:
My original attempt to use FULL OUTER JOIN did not work correctly. I have updated the question to reflect the true issue. Sorry for presenting a classic XY problem.
I'm trying to retrieve a dataset from multiple tables in a single query that is grouped by the year and month of the data.
The final result should look like this:
| Year | Month | Col1 | Col2 | Col3 |
|------+-------+------+------+------|
| 2012 | 11 | 231 | - | - |
| 2012 | 12 | 534 | 12 | 13 |
| 2013 | 1 | - | 22 | 14 |
Coming from data that looks like this:
Table 1:
| Year | Month | Data |
|------+-------+------|
| 2012 | 11 | 231 |
| 2012 | 12 | 534 |
Table 2:
| Year | Month | Data |
|------+-------+------|
| 2012 | 12 | 12 |
| 2013 | 1 | 22 |
Table 3:
| Year | Month | Data |
|------+-------+------|
| 2012 | 12 | 13 |
| 2013 | 1 | 14 |
I tried using FULL OUTER JOIN, but this doesn't quite work: no matter which table I select 'Year' and 'Month' from in my SELECT clause, there are null values.
SELECT
COALESCE(t1.year,t2.year,t3.year)
,COALESCE(t1.month,t2.month,t3.month)
,t1.data as col1
,t2.data as col2
,t3.data as col3
From t1
FULL OUTER JOIN t2
on t1.year = t2.year and t1.month = t2.month
FULL OUTER JOIN t3
on t1.year = t3.year and t1.month = t3.month
The result is something like this (it is too confusing to reproduce exactly what I would get with this demo data):
| Year | Month | Col1 | Col2 | Col3 |
|------+-------+------+------+------|
| 2012 | 11 | 231 | - | - |
| 2012 | 12 | 534 | 12 | 13 |
| 2013 | 1 | - | 22 | - |
| - | 1 | - | - | 14 |
If your data allows it (not 100 columns), this is usually a clean way of doing it:
select year, month, sum(col1) as col1, sum(col2) as col2, sum(col3) as col3
from (
SELECT t1.year, t1.month, t1.data as col1, 0 as col2, 0 as col3
From t1
union all
SELECT t2.year, t2.month, 0 as col1, t2.data as col2, 0 as col3
From t2
union all
SELECT t3.year, t3.month, 0 as col1, 0 as col2, t3.data as col3
From t3
) as data
group by year, month
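Here is a minimal sketch of the UNION ALL plus GROUP BY approach, run with Python's sqlite3 module on the sample tables from the question. The subquery alias combined is a name chosen for this sketch. Because the padding value is 0, missing combinations come back as 0 rather than as NULL or "-".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample data from the question: three tables with the same layout.
tables = {
    "t1": [(2012, 11, 231), (2012, 12, 534)],
    "t2": [(2012, 12, 12), (2013, 1, 22)],
    "t3": [(2012, 12, 13), (2013, 1, 14)],
}
for name, data in tables.items():
    conn.execute(f"CREATE TABLE {name} (year INTEGER, month INTEGER, data INTEGER)")
    conn.executemany(f"INSERT INTO {name} VALUES (?, ?, ?)", data)

# Each branch fills its own column and pads the others with 0;
# the outer query collapses the stacked rows per year and month.
query = """
SELECT year, month, sum(col1) AS col1, sum(col2) AS col2, sum(col3) AS col3
FROM (
    SELECT year, month, data AS col1, 0 AS col2, 0 AS col3 FROM t1
    UNION ALL
    SELECT year, month, 0 AS col1, data AS col2, 0 AS col3 FROM t2
    UNION ALL
    SELECT year, month, 0 AS col1, 0 AS col2, data AS col3 FROM t3
) AS combined
GROUP BY year, month
ORDER BY year, month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If a genuine 0 in the data has to be told apart from "no row", padding with NULL instead of 0 would keep that distinction, since sum() ignores NULLs.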
If you are using SQL Server 2005 or later version, you could also try this PIVOT solution:
SELECT
Year,
Month,
Col1,
Col2,
Col3
FROM (
SELECT Year, Month, 'Col1' AS Col, Data FROM t1
UNION ALL
SELECT Year, Month, 'Col2' AS Col, Data FROM t2
UNION ALL
SELECT Year, Month, 'Col3' AS Col, Data FROM t3
) f
PIVOT (
SUM(Data) FOR Col IN (Col1, Col2, Col3)
) p
;
This query can be tested and played with at SQL Fiddle.
Perhaps you are looking for the COALESCE function? It takes a list of arguments and returns the first one that is NOT NULL, or NULL if all arguments are null. In your example, you would do something like this:
SELECT COALESCE(t1.data, t2.data)
You would still need to join tables in this case. It would just cut down on the case statements.
You could derive the complete list of years and months from all the tables, then join every table to that list (using a left join):
SELECT
f.Year,
f.Month,
t1.data AS col1,
t2.data AS col2,
t3.data AS col3
FROM (
SELECT Year, Month FROM t1
UNION
SELECT Year, Month FROM t2
UNION
SELECT Year, Month FROM t3
) f
LEFT JOIN t1 ON f.year = t1.year and f.month = t1.month
LEFT JOIN t2 ON f.year = t2.year and f.month = t2.month
LEFT JOIN t3 ON f.year = t3.year and f.month = t3.month
;
You can see a live demonstration of this query at SQL Fiddle.
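The key-list approach can be reproduced locally as well. This is a minimal sketch with Python's sqlite3 module on the sample tables from the question; missing values come back as NULL (None in Python), which corresponds to the "-" cells in the desired output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample data from the question: three tables with the same layout.
tables = {
    "t1": [(2012, 11, 231), (2012, 12, 534)],
    "t2": [(2012, 12, 12), (2013, 1, 22)],
    "t3": [(2012, 12, 13), (2013, 1, 14)],
}
for name, data in tables.items():
    conn.execute(f"CREATE TABLE {name} (year INTEGER, month INTEGER, data INTEGER)")
    conn.executemany(f"INSERT INTO {name} VALUES (?, ?, ?)", data)

# UNION (without ALL) removes duplicate year/month pairs, so f holds
# each period exactly once; every table is then left joined to f.
query = """
SELECT f.year, f.month, t1.data AS col1, t2.data AS col2, t3.data AS col3
FROM (
    SELECT year, month FROM t1
    UNION
    SELECT year, month FROM t2
    UNION
    SELECT year, month FROM t3
) f
LEFT JOIN t1 ON f.year = t1.year AND f.month = t1.month
LEFT JOIN t2 ON f.year = t2.year AND f.month = t2.month
LEFT JOIN t3 ON f.year = t3.year AND f.month = t3.month
ORDER BY f.year, f.month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Joining every table to f, rather than chaining the tables to each other, avoids the split rows shown in the question, where t3 was matched against t1 only.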
If you are looking for the non-null values from either table, then you will have to add t1.data IS NOT NULL as well. I hope that I understand your question.
CREATE VIEW joined_SALES
AS SELECT t1.year, t1.month, t1.data AS data1, t2.data AS data2
FROM table1 t1, table2 t2
WHERE
t1.year = t2.year
and t1.month = t2.month
and t1.data IS NOT NULL;
This might be a better way, especially if you are going to do something with the data before returning it. Basically you are translating the table the data came from into a typeId.
declare @temp table
([year] int,
[month] int,
typeId int,
data decimal)
insert into @temp
SELECT t1.year, t1.month, 1, sum(t1.data)
From t1
group by t1.year, t1.month
insert into @temp
SELECT t2.year, t2.month, 2, sum(t2.data)
From t2
group by t2.year, t2.month
insert into @temp
SELECT t3.year, t3.month, 3, sum(t3.data)
From t3
group by t3.year, t3.month
select t.year, t.month,
sum(case when t.typeId = 1 then t.data end) as col1,
sum(case when t.typeId = 2 then t.data end) as col2,
sum(case when t.typeId = 3 then t.data end) as col3
from @temp t
group by t.year, t.month