Return values not present in another table - postgresql

I have two tables:
A - table shows if a given topic has been processed
B - all possible topicNames related to a given projectId
What I'd like to return is a table that shows the topics left to be processed. Assuming table B contains all possible topicNames, I want to exclude those present in table A and show only B-A (ghi, jkl). To illustrate this, please look at table C below:
I'm really struggling to get the right query. Any hints on that?
A:
fieldId | projectId | topicName
-------------------------------
1 | A | abc
1 | A | def
B:
fieldId | projectId | topicName
--------------------------------
1 | A | abc
1 | A | def
1 | A | ghi
1 | A | jkl
What I want - Table C:
C:
fieldId | projectId | topicName
-------------------------------
1 | A | ghi
1 | A | jkl

You are looking for EXCEPT, which combines two queries. Where UNION adds the rows of both queries together, EXCEPT subtracts the rows of the second query from the rows of the first.
EXCEPT returns all rows that are in the result of query1 but not in
the result of query2. (This is sometimes called the difference between
two queries.) Again, duplicates are eliminated unless EXCEPT ALL is
used.
For your case, something like:
select fieldId , projectId , topicName from B
except
select fieldId , projectId , topicName from A;
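To try this out, here is a minimal, self-contained sketch. The temp table names a and b and the column types are assumptions based on the tables in the question. The second query is an anti-join with NOT EXISTS, which is a different technique from EXCEPT and is shown only for comparison.

```sql
-- Hypothetical setup mirroring tables A and B from the question.
CREATE TEMP TABLE a (fieldId int, projectId text, topicName text);
CREATE TEMP TABLE b (fieldId int, projectId text, topicName text);

INSERT INTO a VALUES (1, 'A', 'abc'), (1, 'A', 'def');
INSERT INTO b VALUES (1, 'A', 'abc'), (1, 'A', 'def'),
                     (1, 'A', 'ghi'), (1, 'A', 'jkl');

-- Set difference: rows of b that have no identical row in a.
SELECT fieldId, projectId, topicName FROM b
EXCEPT
SELECT fieldId, projectId, topicName FROM a
ORDER BY topicName;

-- Anti-join alternative: keeps duplicate rows of b and compares with "=".
SELECT b.fieldId, b.projectId, b.topicName
FROM b
WHERE NOT EXISTS (
    SELECT 1
    FROM a
    WHERE a.fieldId   = b.fieldId
      AND a.projectId = b.projectId
      AND a.topicName = b.topicName
)
ORDER BY b.topicName;
```

With this sample data both queries return the ghi and jkl rows. They can differ on other data: EXCEPT removes duplicate rows and treats two NULLs as equal, whereas the "=" comparisons inside NOT EXISTS never match a NULL.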

Related

How to query parent child in PostgreSQL?

I have the following table structure :
place_id | parent_place_id | name
---------|-----------------|------------
1 | 2 | child
---------|-----------------|------------
2 | 3 | dad
---------|-----------------|------------
3 | | grandfather
......
I am trying to write a query so that my output data is as follows :
id_Grandfather | name_Grandfather | id_Dad | name_Dad | id_Child | name_child
----------------|------------------|--------|----------|----------|-----------
3 | grandfather | 2 | dad | 1 | child
I have tried many ways but have not gotten the expected result. Can anyone help me solve it? Thanks!
There is a way to do it with a double join. Whether it makes sense to do so is a totally different question.
SELECT
gf.place_id as id_Grandfather,
gf.name as name_Grandfather,
d.place_id as id_Dad,
d.name as name_Dad,
c.place_id as id_Child,
c.name as name_Child
FROM
your_table c
LEFT JOIN your_table d ON c.parent_place_id = d.place_id
LEFT JOIN your_table gf ON d.parent_place_id = gf.place_id
-- Add this if you want only the rows that have both the Dad and Grandfather fields populated
WHERE d.place_id IS NOT NULL
AND gf.place_id IS NOT NULL
;
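If the hierarchy is not fixed at exactly three levels, a recursive CTE is another option. This is a different technique from the double join, and it returns one row per level rather than one wide row. The sketch below assumes a hypothetical table named place with the columns from the question.

```sql
-- Hypothetical table holding the sample rows from the question.
CREATE TEMP TABLE place (
    place_id        int PRIMARY KEY,
    parent_place_id int,
    name            text
);

INSERT INTO place VALUES
    (1, 2,    'child'),
    (2, 3,    'dad'),
    (3, NULL, 'grandfather');

-- Start at the root rows (no parent) and walk down one level per iteration.
WITH RECURSIVE lineage AS (
    SELECT place_id, parent_place_id, name, 0 AS depth
    FROM place
    WHERE parent_place_id IS NULL
    UNION ALL
    SELECT p.place_id, p.parent_place_id, p.name, l.depth + 1
    FROM place p
    JOIN lineage l ON p.parent_place_id = l.place_id
)
SELECT depth, place_id, name
FROM lineage
ORDER BY depth;
```

For the sample data this lists grandfather at depth 0, dad at depth 1, and child at depth 2.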

PostgreSQL - Setting null values to missing rows in a join statement

SQL newbie here. I'm trying to write a query that generates a scoring table, setting null to a student's grades in a module for which they haven't yet taken their exams (on PostgreSQL).
So I start with tables that look something like this:
student_evaluation:
|student_id| module_id | course_id |grade |
|----------|-----------|-----------|-------|
| 1 | 1 | 1 |3 |
| 1 | 1 | 1 |7 |
| 1 | 2 | 1 |8 |
| 2 | 4 | 2 |9 |
course_module:
| module_id | course_id |
| ---------- | --------- |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
In our use case, a course is made up of several modules. Each module has a single exam, but a student who failed his exam may have a couple of retries. The same module may also be present in different courses, but an exam attempt only counts for one instance of the module (ie. student A passed module 1's exam on course 1. If course 2 also has module 1, student A has to retake the same exam for course 2 if he also has access to that course).
So the output should look like this:
|student_id| module_id | course_id |grade |
|----------|-----------|-----------|-------|
| 1 | 1 | 1 |3 |
| 1 | 1 | 1 |7 |
| 1 | 2 | 1 |8 |
| 1 | 3 | 1 |null |
| 2 | 4 | 2 |9 |
I feel like this should have been a simple task, but I think I have a very flawed understanding of how outer and cross joins work. I have tried stuff like:
SELECT se.student_id, se.module_id, se.course_id, se.grade FROM student_evaluation se
RIGHT OUTER JOIN course_module ON course_module.course_id = se.course_id
AND course_module.module_id = se.module_id
or
SELECT se.student_id, se.module_id, se.course_id, se.grade FROM student_evaluation se
CROSS JOIN course_module WHERE course_module.course_id = se.course_id
Neither worked. These all feel wrong, but I'm lost as to what would be the proper way to go about this.
Thank you in advance.
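I think you need both join types: first use a cross join to build a list of all combinations of students and course modules, then use an outer join to add the grades. The outer join has to match on all three key columns, not just course_id. Note that the cross join pairs every student with every module of every course, including courses the student has no grades in.
SELECT sc.student_id,
       sc.module_id,
       sc.course_id,
       se.grade
FROM student_evaluation se
RIGHT JOIN (SELECT s.student_id,
                   c.module_id,
                   c.course_id
            FROM (SELECT DISTINCT student_id
                  FROM student_evaluation) AS s
            CROSS JOIN course_module AS c) AS sc
USING (student_id, module_id, course_id);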
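The expected output in the question lists each student only against the modules of courses they are taking. Here is a sketch of one way to get that, under a loud assumption: a student counts as enrolled in a course once they have at least one evaluation in it, because the question shows no enrollment table. The temp tables simply reproduce the sample data.

```sql
-- Sample data from the question.
CREATE TEMP TABLE student_evaluation (student_id int, module_id int, course_id int, grade int);
CREATE TEMP TABLE course_module (module_id int, course_id int);

INSERT INTO student_evaluation VALUES (1, 1, 1, 3), (1, 1, 1, 7), (1, 2, 1, 8), (2, 4, 2, 9);
INSERT INTO course_module VALUES (1, 1), (2, 1), (3, 1), (4, 2);

-- Assumption: (student, course) pairs are derived from existing evaluations.
SELECT sc.student_id,
       cm.module_id,
       cm.course_id,
       se.grade
FROM (SELECT DISTINCT student_id, course_id
      FROM student_evaluation) AS sc
JOIN course_module AS cm
  ON cm.course_id = sc.course_id            -- every module of the student's courses
LEFT JOIN student_evaluation AS se
  ON  se.student_id = sc.student_id
  AND se.course_id  = cm.course_id
  AND se.module_id  = cm.module_id          -- grade stays NULL when no attempt exists
ORDER BY sc.student_id, cm.module_id, se.grade;
```

On the sample data this returns the five rows from the desired output, including the row for student 1, module 3 with a null grade. If a real enrollment table exists, it would be a better source for the (student, course) pairs than the evaluations.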

PostgreSQL One ID multiple values

I have a Postgres table where one id may have multiple Channel values as follows
ID |Channel | Column 3 | Column 4
_____|________|__________|_________
1 | Sports | x | null
1 | Organic| x | z
2 | Organic| null | q
3 | Arts | b | w
3 | Organic| e | r
4 | Sports | sp | t
No ID will have a duplicate channel name, and no ID will be both Sports and Arts. That is, ID 1 could have a Sports and Organic channel, or an Arts and Organic channel, but not two Sports or two Organic entries, and not both a Sports and an Arts channel. I want all IDs to be in the query, but if there is a non-organic channel I prefer that. The result I would want would be
ID |Channel | Column 3 | Column 4
_____|________|__________|_________
1 | Sports | x | null
2 | Organic| null | q
3 | Arts | b | w
4 | Sports | sp | t
I feel like there is some CTE here, a rank and partition or something that could do the trick, but I'm just not getting it. I'm only including Columns 3 and 4 to show there are extra columns.
Does anyone have any ideas on the code to deploy here?
You could use DISTINCT ON with an appropriate ORDER BY clause:
SELECT DISTINCT ON (id)
id, channel, column3, column4
FROM atable
ORDER BY id, channel = 'Organic';
This relies on the fact that FALSE < TRUE.
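A quick way to see that ordering in isolation is the sketch below. The inline VALUES list is made up for illustration and does not touch the real table.

```sql
-- Rows where the comparison is FALSE (non-organic) sort before rows where it is TRUE.
SELECT channel,
       channel = 'Organic' AS is_organic
FROM (VALUES ('Organic'), ('Sports'), ('Arts')) AS t(channel)
ORDER BY channel = 'Organic', channel;
```

This lists Arts and Sports first and Organic last, which is why DISTINCT ON (id) keeps the non-organic row whenever an id has one.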
I ended up using a ROW_NUMBER() window function:
ROW_NUMBER() OVER (PARTITION BY salesforce_id ORDER BY CASE WHEN channel = 'Organic' THEN 0 ELSE 1 END DESC, timestamp DESC) AS id_rank
I didn't include in the original question that I had a timestamp! This works now. Thanks
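Written out as a complete query, the window-function approach might look like the sketch below. The table name atable, the ts column standing in for the timestamp, and the timestamp values are all assumptions for illustration. It partitions by id rather than by the salesforce_id column mentioned above.

```sql
-- Hypothetical table: the ts column and its values are invented for this example.
CREATE TEMP TABLE atable (id int, channel text, column3 text, column4 text, ts timestamptz);

INSERT INTO atable VALUES
    (1, 'Sports',  'x',  NULL, '2020-01-01'),
    (1, 'Organic', 'x',  'z',  '2020-01-02'),
    (2, 'Organic', NULL, 'q',  '2020-01-01'),
    (3, 'Arts',    'b',  'w',  '2020-01-01'),
    (3, 'Organic', 'e',  'r',  '2020-01-03'),
    (4, 'Sports',  'sp', 't',  '2020-01-01');

-- Rank rows per id: non-organic first, then newest first; keep rank 1.
SELECT id, channel, column3, column4
FROM (
    SELECT a.*,
           ROW_NUMBER() OVER (
               PARTITION BY id
               ORDER BY (channel = 'Organic'), ts DESC
           ) AS id_rank
    FROM atable AS a
) AS ranked
WHERE id_rank = 1
ORDER BY id;
```

With these rows the query keeps Sports for id 1, Organic for id 2, Arts for id 3, and Sports for id 4, matching the desired result in the question.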

How to order rows with linked parts in PostgreSQL

I have a table A with columns id, title, condition.
I also have a table B that holds position information for some rows of table A. Table B has columns id, next_id, prev_id.
How can I sort the rows of A based on the information in table B?
For example,
Table A
id| title
---+-----
1 | title1
2 | title2
3 | title3
4 | title4
5 | title5
Table B
id| next_id | prev_id
---+-----
2 | 1 | null
5 | 4 | 3
I want to get this result:
id| title
---+-----
2 | title2
1 | title1
3 | title3
5 | title5
4 | title4
After applying this sort, I also want to sort by the condition column.
I've already spent a lot of time looking for a solution, and hope for your help.
You have to add weights to your data so that you can order by them. This example uses only next_id; you don't explain how prev_id is used, so I'm not sure whether you need it.
Anyway, here's a code example:
-- Temporary data for the test:
CREATE TEMP TABLE table_a(id integer, title text);
CREATE TEMP TABLE table_b(id integer, next_id integer, prev_id integer);
INSERT INTO table_a VALUES
(1,'title1'),
(2,'title2'),
(3,'title3'),
(4,'title4'),
(5,'title5');
INSERT INTO table_b VALUES
(2,1,null),
(5,4,3);
-- QUERY:
SELECT
  id, title,
  CASE -- Adding weight: rows without a next_id sort just after their own id
    WHEN next_id IS NULL THEN (id + 0.1)
    ELSE next_id
  END AS orden
FROM -- Joining tables
  (SELECT ta.*, tb.next_id
   FROM table_a ta
   LEFT JOIN table_b tb
     ON ta.id = tb.id) join_a_b
ORDER BY orden;
And here's the result:
id | title | orden
--------------------------
2 | title2 | 1
1 | title1 | 1.1
3 | title3 | 3.1
5 | title5 | 4
4 | title4 | 4.1

Sql Query to select missing records based on multiple hard coded ranges

My earlier question, "Creating a SQL query that performs math with variables from multiple tables", helps with the layout. To save time, I'll include the important bits here and add more detail for the scenarios relevant to this question:
A mockup of what the tables look like:
Inventory
ID | lowrange | highrange | ItemType
----------------------------------------
1 | 15 | 20 | 1
2 | 21 | 30 | 1
3 | null | null | 1
4 | 100 | 105 | 2
MissingOrVoid
ID | Item | Missing | Void
---------------------------------
1 | 17 | 1 | 0
1 | 19 | 1 | 0
4 | 102 | 0 | 1
4 | 103 | 1 | 0
4 | 104 | 1 | 0
TableWithDataEnteredForItemType1
InventoryID| ItemID | Detail1 | Detail2 | Detail3
-------------------------------------------------
1 | 16 | Some | Info | Here
1 | 18 | More | Info | Here
1 | 20 | Data | Is | Here
2 | 21 | .... | .... | ....
2 | 24 | .... | .... | ....
2 | 28 | .... | .... | ....
2 | 29 | .... | .... | ....
2 | 30 | .... | .... | ....
TableWithDataEnteredForItemType2
InventoryID| ItemID | Col1 | Col2 | Col3
----------------------------------------
4 | 101 | .... | .... | ....
I attempted this. I know it is not functional but it illustrates what I'm trying to do and I personally haven't seen anything written up like this before:
SELECT CASE WHEN (I.ItemType = 1) THEN SELECT TONE.ItemID FROM
TableWithDataEnteredForItemType1 TONE WHEN (I.ItemType = 2)
THEN SELECT TTWO.ItemID FROM TableWithDataEnteredForItemType2
TTWO END AS ItemMissing Inventory I JOIN CASE WHEN (I.ItemType = 1) THEN
TableWithDataEnteredForItem1 T WHEN (I.ItemType = 2) THEN
TableWithDataEnteredForItem2 T END ON
I.ID = T.InventoryID WHERE ItemMissing NOT BETWEEN IN (SELECT
I.lowrange FROM Inventory WHERE I.lowrange IS NOT NULL) AND IN
(SELECT I.highrange FROM Inventory WHERE I.highrange IS NOT NULL)
AND ItemMissing NOT IN (SELECT Item from MissingOrVoid)
The result should be:
ItemMissing
----
15
22
23
25
26
27
105
I know I'm probably not even going in the right direction with my query, but I was hoping I could get some direction as to fixing it to get the results that are needed.
Thanks
Edit:
Specific requirements (thought I included this but appears I didn't) - return list of all items not accounted for in the system. There are two ways of something being accounted for: 1) an entry in the corresponding ItemType table 2) located in MissingOrVoid (known items to be missing or removed).
DDL (I had to look this up, as I wasn't sure what was meant here. Since I have very little experience creating tables by scripting, this will probably be pseudo-DDL):
Inventory
ID - int, identifier/primary key, not nullable
lowrange - int, nullable
highrange - int, nullable
itemtype - int, not nullable
MissingOrVoid
ID - int, foreign key for Inventory.ID, not nullable
Item - int, identifier/primary key, not nullable
missing - bit, not nullable
void - bit, not nullable
Tables for Item types:
InventoryID - int, foreign key for Inventory.ID, not nullable
ItemID - int, primary key, not nullable
Everything else - not needed for querying, just data about the item (included
to show that the tables aren't the same content)
Edit 2: Here's an incredibly inefficient C# and Linq way of doing it but maybe of some help:
List<int> Items = new List<int>();
List<int> MoV = (from c in db.MissingOrVoid Select c.Item).ToList();
foreach (Table...ItemType1 row in db.Table...ItemType1)
Items.Add(row.ItemID);
foreach (Table...ItemType2 row in db.Table...ItemType2)
Items.Add(row.ItemID);
List<Range> InventoryRanges = new List<Range>();
foreach (Inventory row in db.Inventories)
{
if (row.lowrange != null && row.highrange != null)
InventoryRanges.Add(new Range(row.lowrange, row.highrange));
}
foreach (int item in Items)
{
foreach (Range range in InventoryRanges)
{
if (range.lowrange <= item && range.highrange >= item)
Items.Remove(item);
}
if (MoV.Contains(item))
Items.Remove(item);
}
return Items;
There's a ready-made number table called master..spt_values, which can be quite helpful in this case. Note, though, that you can use this table only if the distance between lowrange and highrange does not exceed 2047; otherwise, create, populate, and use your own number table instead.
Here's the method:
SELECT
  ItemMissing = inv.Item
FROM (
  -- Expand each Inventory range into one row per item number
  SELECT
    i.ID,
    Item = i.lowrange + v.number,
    i.ItemType
  FROM Inventory i
  INNER JOIN master..spt_values v
    ON v.type = 'P' AND v.number BETWEEN 0 AND i.highrange - i.lowrange
) inv
LEFT JOIN MissingOrVoid m
  ON inv.ID = m.ID AND inv.Item = m.Item
LEFT JOIN TableWithDataEnteredForItemType1 t1 ON inv.ItemType = 1
  AND inv.ID = t1.InventoryID AND inv.Item = t1.ItemID
LEFT JOIN TableWithDataEnteredForItemType2 t2 ON inv.ItemType = 2
  AND inv.ID = t2.InventoryID AND inv.Item = t2.ItemID
WHERE m.ID IS NULL AND t1.InventoryID IS NULL AND t2.InventoryID IS NULL
The subselect expands the Inventory table into a complete item list with item IDs as defined by lowrange and highrange (this is where the number table comes in handy). The obtained list is then compared against the other three tables to find and exclude those items that are present in them. The remaining items, then, constitute the list of 'items missing'.
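For ranges wider than 2047, the answer suggests creating your own number table. One way to do that is sketched below in T-SQL, since master..spt_values belongs to SQL Server. The table name Numbers and the upper bound of 9999 are assumptions for illustration.

```sql
-- Hypothetical helper table; the name Numbers is an assumption.
CREATE TABLE Numbers (number int NOT NULL PRIMARY KEY);

-- Build 0..9999 by combining four decimal digits.
WITH digits AS (
    SELECT d
    FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS v(d)
)
INSERT INTO Numbers (number)
SELECT ones.d + 10 * tens.d + 100 * hundreds.d + 1000 * thousands.d
FROM digits AS ones
CROSS JOIN digits AS tens
CROSS JOIN digits AS hundreds
CROSS JOIN digits AS thousands;
```

The join in the query above would then read INNER JOIN Numbers v ON v.number BETWEEN 0 AND i.highrange - i.lowrange, with the v.type = 'P' filter dropped. In PostgreSQL, generate_series(i.lowrange, i.highrange) can produce the item numbers directly without a helper table.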