How to select based on values "keyed" by another column? - postgresql

PostgreSQL newbie here. I have data that look like this:
+-----------+---------+-------+
| StudentID | ClassID | Grade |
+-----------+---------+-------+
| 19927 | A13 | 5 |
| 19927 | A07 | 3 |
| 19927 | B22 | 7 |
| 10001 | A13 | 2 |
| 10001 | A07 | 8 |
| 22207 | A13 | 7 |
| 22207 | A07 | 10 |
| 22207 | C80 | 2 |
| 27516 | A07 | 8 |
+-----------+---------+-------+
I'm trying to select all students which have a higher grade in class A13 than in class A07. This means only including students who actually have grades in both classes.
What's the best way to do this? Having been brought up on Stata, I would normally try:
selecting only rows where classID = A07 or A13
reshaping to wide
select using a where clause on A13 > A07
But I feel like this is very un-SQL-like.

PostgreSQL offers several ways to do this. Here is one, joining each student's A13 row to the same student's A07 row:
SELECT a13.* FROM
(SELECT * FROM table1 WHERE classid = 'A13') AS a13
INNER JOIN
(SELECT * FROM table1 WHERE classid = 'A07') AS a07
ON a13.studentid = a07.studentid AND a13.grade > a07.grade
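The reshape-to-wide idea from the question can also be written directly in SQL with conditional aggregation. This is only a sketch, assuming the table is called table1 (as in the answer) and that each student has at most one grade per class; the grade_a13 and grade_a07 aliases are made up for illustration.

```sql
-- One output row per student, with the A13 and A07 grades side by side.
-- A student missing either class gets NULL for it, the comparison then
-- yields NULL, and HAVING drops that student.
SELECT studentid,
       MAX(CASE WHEN classid = 'A13' THEN grade END) AS grade_a13,
       MAX(CASE WHEN classid = 'A07' THEN grade END) AS grade_a07
FROM table1
WHERE classid IN ('A13', 'A07')
GROUP BY studentid
HAVING MAX(CASE WHEN classid = 'A13' THEN grade END)
     > MAX(CASE WHEN classid = 'A07' THEN grade END);
```

On the sample data only student 19927 qualifies (5 in A13 versus 3 in A07).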

Related

Postgres SQL find similar

I want to query a single orders table using a Postgres SQL editor (DBeaver)
| order_id | subs_id |
| -------- | --------|
| 1 | aa |
| 2 | aa |
| 3 | aa |
| 4 | bb |
| 5 | bb |
| 6 | bb |
| 7 | aa |
| 8 | bb |
All I want to do is find all orders for a subscription by using one of its order numbers. So if I have an order id, I want to find the other related orders for that subscription.
Should be a simple process.
Find associated subs_id for supplied order_id
Find all orders for that subs_id
Here is what I tried.
select *
from orders o
where o.subs_id in (
select o2.subs_id
from orders o2
where o2.order_id = '3')
This is the desired result
| order_id | subs_id |
| -------- | --------|
| 1 | aa |
| 2 | aa |
| 3 | aa |
| 7 | aa |
Thanks!
You can join the table with itself on subs_id. For example:
select b.*
from orders a
join orders b on b.subs_id = a.subs_id
where a.order_id = '3'

Make sure every distinct value of Column1 has a row with every distinct value of Column2, by populating a table with 0s - postgresql

Here's a crude example I've made up to illustrate what I want to achieve:
table1:
| Shop | Product | QuantityInStock |
| a | Prod1 | 13 |
| a | Prod3 | 13 |
| b | Prod2 | 13 |
| b | Prod3 | 13 |
| b | Prod4 | 13 |
table1 becomes:
| Shop | Product | QuantityInStock |
| a | Prod1 | 13 |
| a | Prod2 | 0 | -- new
| a | Prod3 | 13 |
| a | Prod4 | 0 | -- new
| b | Prod1 | 0 | -- new
| b | Prod2 | 13 |
| b | Prod3 | 13 |
| b | Prod4 | 13 |
In this example, I want to represent every Shop/Product combination: every Shop {a, b} should have a row with every Product {Prod1, Prod2, Prod3, Prod4}.
QuantityInStock=13 has no significance, I just wanted a placeholder number :)
Use a calendar table cross join approach:
SELECT s.Shop, p.Product, COALESCE(t1.QuantityInStock, 0) AS QuantityInStock
FROM (SELECT DISTINCT Shop FROM table1) s
CROSS JOIN (SELECT DISTINCT Product FROM table1) p
LEFT JOIN table1 t1
ON t1.Shop = s.Shop AND
t1.Product = p.Product
ORDER BY
s.Shop,
p.Product;
The idea here is to generate an intermediate table containing all shop/product combinations via a cross join. Then, we left join this to table1. Any shop/product combinations which do not have a match in the actual table are assigned a zero stock quantity.
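The question says "table1 becomes", so if the missing rows should actually be stored rather than only shown in a result set, the same cross join can feed an INSERT. This is a sketch under that assumption, not something the answer itself proposes:

```sql
-- Add a zero-stock row for every Shop/Product pair that table1 lacks.
INSERT INTO table1 (Shop, Product, QuantityInStock)
SELECT s.Shop, p.Product, 0
FROM (SELECT DISTINCT Shop FROM table1) s
CROSS JOIN (SELECT DISTINCT Product FROM table1) p
WHERE NOT EXISTS (
    SELECT 1
    FROM table1 t1
    WHERE t1.Shop = s.Shop
      AND t1.Product = p.Product
);
```

With the example data this adds three rows: (a, Prod2), (a, Prod4) and (b, Prod1).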

Sort using auxiliary fields, start and end

In PostgreSQL, what is the best way to sort records using start and end fields in a generic way, without the need to include in the query the first record (where start_id=3)?
Example table:
+-------+----------+--------+--------+
| FK_ID | START_ID | END_ID | STRING |
+-------+----------+--------+--------+
| 77 | 1 | 9 | E |
| 82 | 5 | 2 | A |
| 77 | 7 | 1 | I |
| 77 | 3 | 7 | W |
| 82 | 9 | 5 | Q |
| 77 | 9 | 5 | X |
| 82 | 2 | 7 | G |
+-------+----------+--------+--------+
Sorted where FK_ID = 77:
+----+---+---+---+
| 77 | 3 | 7 | W |
| 77 | 7 | 1 | I |
| 77 | 1 | 9 | E |
| 77 | 9 | 5 | X |
+----+---+---+---+
Sorted where FK_ID = 82:
+----+---+---+---+
| 82 | 9 | 5 | Q |
| 82 | 5 | 2 | A |
| 82 | 2 | 7 | G |
+----+---+---+---+
Result query sequence:
+-------+----------+
| FK_ID | SEQUENCE |
+-------+----------+
| 82 | QAG |
| 77 | WIEX |
+-------+----------+
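I do not think this is the most efficient way, but you can try a recursive CTE. Each chain starts at the row whose start_id is not the end_id of any other row with the same fk_id, then follows end_id to the next start_id. A depth counter keeps the strings in chain order when they are aggregated:
WITH RECURSIVE path AS (
SELECT t1.fk_id, t1.start_id, t1.end_id, t1.string, 1 AS depth
FROM myTable AS t1
WHERE NOT EXISTS (
SELECT 1 FROM myTable AS t2 WHERE t2.fk_id = t1.fk_id AND t2.end_id = t1.start_id
)
UNION ALL
SELECT t.fk_id, t.start_id, t.end_id, t.string, p.depth + 1
FROM myTable AS t
JOIN path AS p ON p.fk_id = t.fk_id AND p.end_id = t.start_id
)
SELECT fk_id, array_to_string(array_agg(string ORDER BY depth), '') AS sequence
FROM path GROUP BY fk_id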

Merge multiple tables with a common column name

I am trying to merge multiple tables that share a column name (id); the values in that column need not be the same across the tables. For example:
-tmp1-
id dat
1 234
2 432
3 412
-tmp2-
id nom
1 jim
2
3 ryan
4 jack
-tmp3-
id pin
1 gi23
2 x4ed
3 yit42
8 hiu11
If the above are the input, the output needs to be:
id dat nom pin
1 234 jim gi23
2 432 x4ed
3 412 ryan yit42
4 jack
8 hiu11
Thanks in advance.
Environment: PostgreSQL 8.2.15 on Greenplum, queried from R (pass-through queries).
Use the FULL JOIN ... USING (id) syntax.
Please see this example: http://sqlfiddle.com/#!12/3aff2/1
This is how the different join types work (provided that tab1.row3 meets the joining condition with tab2.row1, and tab1.row4 meets it with tab2.row2):
| tab1 | tab2 | JOIN                 | LEFT JOIN            | RIGHT JOIN           | FULL JOIN            |
| ---- | ---- | -------------------- | -------------------- | -------------------- | -------------------- |
| row1 |      |                      | tab1.row1            |                      | tab1.row1            |
| row2 |      |                      | tab1.row2            |                      | tab1.row2            |
| row3 | row1 | tab1.row3, tab2.row1 | tab1.row3, tab2.row1 | tab1.row3, tab2.row1 | tab1.row3, tab2.row1 |
| row4 | row2 | tab1.row4, tab2.row2 | tab1.row4, tab2.row2 | tab1.row4, tab2.row2 | tab1.row4, tab2.row2 |
|      | row3 |                      |                      | tab2.row3            | tab2.row3            |
|      | row4 |                      |                      | tab2.row4            | tab2.row4            |
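For the three tables in the question, the FULL JOIN ... USING (id) suggestion might look like the sketch below. The aliases t1, t2 and t3 are made up here, and the query is not copied from the linked fiddle.

```sql
-- USING (id) merges the id columns of both sides into a single column,
-- so an id that appears in only one table still shows up once in the
-- output, with NULL in the columns of the tables that lack it.
SELECT id, t1.dat, t2.nom, t3.pin
FROM tmp1 t1
FULL JOIN tmp2 t2 USING (id)
FULL JOIN tmp3 t3 USING (id)
ORDER BY id;
```

The second FULL JOIN matches against the merged id from the first join, which is why ids 4 and 8 each appear on a single row.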

AVG didn't give the correct value - Postgresql

I have a table in which there are many redundant points. I want to select the distinct points (using DISTINCT) together with the average of some column (e.g. rscp) over each group of duplicates.
Here is an example:
| id | point | rscp | ci
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
| 1 | POINT(10.1192 36.8018) | 10 | 701
| 2 | POINT(10.1192 36.8018) | 11 | 701
| 3 | POINT(10.1192 36.8018) | 12 | 701
| 4 | POINT(10.4195 36.0017) | 30 | 701
| 5 | POINT(10.4195 36.0017) | 44 | 701
| 6 | POINT(10.4195 36.0017) | 55 | 701
| 7 | POINT(10.9197 36.3014) | 20 | 701
| 8 | POINT(10.9197 36.3014) | 22 | 701
| 9 | POINT(10.9197 36.3014) | 25 | 701
What I want to get is the table below (rscp_avg is the average of rscp over the redundant points):
| id | point | rscp_avg | ci
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
| * | POINT(10.1192 36.8018) | 11 | *
| * | POINT(10.4195 36.0017) | 43 | *
| * | POINT(10.9197 36.3014) | 22.33 | *
I tried this, but it gave me a wrong average:
select distinct on(point)
id,st_astext(point),avg(rscp) as rscp_avg,ci
from mesures
group by id,point,ci;
Thanks for your help (^_^)
Hamdoulah! Thank God!
I found the solution just now:
select distinct on (point)
id,st_astext(point),rscp_avg,ci
from
(select id,point,avg(rscp) over w as rscp_avg,ci
from mesures
window w as (partition by point order by id desc)
) ss
order by point,id asc;
The websites that helped me are:
http://www.postgresql.org/docs/9.1/static/tutorial-window.html
http://www.w3resource.com/PostgreSQL/postgresql-avg-function.php
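The first attempt returns a wrong average because id is in the GROUP BY list: every row then forms its own group, so avg(rscp) is just that row's rscp. If id and ci are not needed in the output (the desired table shows * for both), a plain GROUP BY on the point alone may be enough. This sketch assumes the point column can be grouped, which the DISTINCT ON (point) queries above already rely on; the point_text alias is made up for illustration.

```sql
-- One row per distinct point, averaging rscp over its duplicates.
SELECT st_astext(point) AS point_text,
       avg(rscp)        AS rscp_avg
FROM mesures
GROUP BY point;
```

For the sample rows the averages are 11, 43 and about 22.33, matching the desired table.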