How to select rows based on properties of another row? - postgresql

I have a question.
| a_id | name | r_id | message         | date  |
|------|------|------|-----------------|-------|
| 1    | bob  | 77   | bob here        | 1-jan |
| 1    | bob  | 77   | bob here again  | 2-jan |
| 2    | jack | 77   | jack here.      | 2-jan |
| 1    | bob  | 79   | in another room | 3-feb |
| 3    | gill | 79   | gill here       | 4-feb |
These are basically accounts (a_id) chatting inside different rooms (r_id).
I'm trying to find the last chat message for every room that jack (a_id = 2) is chatting in.
What I've tried so far is DISTINCT ON (r_id) ... ORDER BY r_id, date DESC.
But this incorrectly gives me the last message in every room, instead of only the last message in every room that jack belongs to:
| 2    | jack | 77   | jack here.      | 2-jan |
| 3    | gill | 79   | gill here       | 4-feb |
Is this a partitioning problem rather than a job for DISTINCT ON?
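For reference, a minimal sketch of the attempt described above, using the table name chats that the answers below assume:
-- latest message per room, but nothing here restricts the rooms
-- to those that jack (a_id = 2) participates in
select distinct on (r_id) a_id, name, r_id, message, date
from chats
order by r_id, date desc;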

I would suggest:
- grouping the rows by r_id with a GROUP BY clause;
- keeping only the groups that include a_id = 2, with a HAVING clause that aggregates the a_id values of each group: HAVING array_agg(a_id) @> array[2] (@> is the array containment operator);
- selecting the latest message of each remaining group by aggregating its rows into an array with ORDER BY date DESC and taking the first element: (array_agg(...))[1];
- converting the selected row into a JSON object, then expanding it back into the expected columns with the json_populate_record function.
The full query is:
SELECT (json_populate_record(null::my_table, (array_agg(to_json(t.*) ORDER BY date DESC))[1])).*
FROM my_table AS t
GROUP BY r_id
HAVING array_agg(a_id) @> array[2]
and the result is:
| a_id | name | r_id | message    | date       |
|------|------|------|------------|------------|
| 2    | jack | 77   | jack here. | 2022-01-02 |

Getting the date of the last message in every chat room jack is in would simply be:
select a_id, name, r_id, to_char(max(date), 'dd-mon')
from chats
where a_id = 2
group by r_id, a_id, name;
Fiddle https://www.db-fiddle.com/f/keCReoaXg2eScrhFetEq1b/0
Or, to also see the messages:
with last_message as (
  select a_id, name, r_id, to_char(max(date), 'dd-mon') as date
  from chats
  where a_id = 2
  group by r_id, a_id, name
)
select l.*, c.message
from last_message l
join chats c on (c.a_id = l.a_id and l.r_id = c.r_id and l.date = to_char(c.date, 'dd-mon'));
Fiddle https://www.db-fiddle.com/f/keCReoaXg2eScrhFetEq1b/1
Though all this complication could be avoided with a primary key on your table.
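For instance, a sketch assuming chats gained a primary key column named id (hypothetical here); the join back then uses only the key, and the subquery restricts the rooms to those jack participates in, which is what the original question asks for:
-- pick the id of the latest message per room jack is in,
-- then join back on the primary key instead of on a_id,
-- r_id and a formatted date
with last_message as (
  select distinct on (r_id) id
  from chats
  where r_id in (select r_id from chats where a_id = 2)
  order by r_id, date desc
)
select c.*
from chats c
join last_message l on l.id = c.id;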

Related

Reset column with numeric value that represents the order when destroying a row

I have a table of users that has a column called order that represents the order in which they will be elected.
So, for example, the table might look like:
| id  | name   | order |
|-----|--------|-------|
| 1   | John   | 2     |
| 2   | Mike   | 0     |
| 3   | Lisa   | 1     |
So, say that Lisa now gets destroyed. In the same transaction that destroys Lisa, I would like to update the table so the order stays consistent, with this expected result:
| id  | name   | order |
|-----|--------|-------|
| 1   | John   | 1     |
| 2   | Mike   | 0     |
Or, if Mike were the one to be deleted, the expected result would be:
| id  | name   | order |
|-----|--------|-------|
| 1   | John   | 1     |
| 3   | Lisa   | 0     |
How can I do this in PostgreSQL?
If you are just deleting one row, one option uses a CTE and the RETURNING clause to trigger the renumbering update:
-- "ord" stands in for your column: "order" is a reserved word
with del as (
  delete from mytable
  where name = 'Lisa'
  returning ord
)
update mytable
set ord = ord - 1
from del d
where mytable.ord > d.ord
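Because this is a single statement, the delete and the renumbering happen atomically, which satisfies the same-transaction requirement from the question.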
As a more general approach, I would really not recommend trying to renumber the whole table after every delete. This is inefficient, and it gets tedious for multi-row deletes.
Instead, you could build a view on top of the table:
create view myview as
-- subtract 1 so numbering stays 0-based, as in the sample data
select id, name, row_number() over (order by ord) - 1 as ord
from mytable
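With the view in place, a plain delete is enough; the numbering repairs itself on read. A sketch against the hypothetical mytable above:
delete from mytable where name = 'Lisa';
select * from myview order by ord;
-- expected, given the sample data:
--  id | name | ord
--   2 | Mike | 0
--   1 | John | 1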

Selecting value for the latest two distinct columns

I am trying to write an SQL query that returns the latest data value for each distinct tag in my table.
Currently, I select the distinct values of the tag column and afterwards iterate through them programmatically, ordering by timestamp and limiting to 1. The tags can be any number and may not always be posted together (one time only tag 1 may be posted, whereas other times tags 1, 2, and 3 can be).
Although it gives the expected outcome, this seems inefficient in a lot of ways, and because I don't have enough SQL experience, it was so far the only way I found of performing the task...
| name | tag | timestamp | data |
|------|-----|-----------|------|
| aa   | 1   | 566       | 4659 |
| ab   | 2   | 567       | 4879 |
| ac   | 3   | 568       | 1346 |
| ad   | 1   | 789       | 3164 |
| ae   | 2   | 789       | 1024 |
| af   | 3   | 790       | 3346 |
Therefore the expected outcome is {3164, 1024, 3346}
Currently what I'm doing is:
"select distinct tag from table"
Then I store all the distinct tag values programmatically and iterate programmatically through these values using
"select data from table where '"+ tags[i] +"' in (tag) order by timestamp desc limit 1"
Thanks,
This comes close, but beware: if two rows with the same tag share the maximum timestamp, you will get duplicates in the result set.
-- "mytable" stands in for your table name ("table" itself is a reserved word)
select mytable.data
from mytable
join (select tag, max(timestamp) as maxtimestamp from mytable group by tag) as latesttags
  on mytable.tag = latesttags.tag and mytable.timestamp = latesttags.maxtimestamp
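In PostgreSQL specifically, DISTINCT ON sidesteps the duplicate problem, since it keeps exactly one row per tag. A sketch, again using the placeholder name mytable:
-- one row per tag: the one with the highest timestamp
select distinct on (tag) tag, data
from mytable
order by tag, timestamp desc;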

Postgresql Split single row to multiple rows

I'm new to PostgreSQL. I'm getting the results below from a query, and now I need to split each single row to obtain multiple rows.
I have gone through the links below, but still couldn't manage it. Please help.
unpivot and PostgreSQL
How to split a row into multiple rows with a single query?
Current result
id,name,sub1code,sub1level,sub1hrs,sub2code,sub2level,sub2hrs,sub3code,sub3level,sub3hrs --continue till sub15
1,Silva,CHIN,L1,12,MATH,L2,20,AGRW,L2,35
2,Perera,MATH,L3,30,ENGL,L1,10,CHIN,L2,50
What we want
id,name,subcode,sublevel,subhrs
1,Silva,CHIN,L1,12
1,Silva,MATH,L2,20
1,Silva,AGRW,L2,35
2,Perera,MATH,L3,30
2,Perera,ENGL,L1,10
2,Perera,CHIN,L2,50
Use UNION ALL (repeat the pattern up to sub15):
select id, 1 as "#", name, sub1code, sub1level, sub1hrs
from a_table
union all
select id, 2 as "#", name, sub2code, sub2level, sub2hrs
from a_table
union all
select id, 3 as "#", name, sub3code, sub3level, sub3hrs
from a_table
order by 1, 2;
 id | # | name   | sub1code | sub1level | sub1hrs
----+---+--------+----------+-----------+---------
  1 | 1 | Silva  | CHIN     | L1        | 12
  1 | 2 | Silva  | MATH     | L2        | 20
  1 | 3 | Silva  | AGRW     | L2        | 35
  2 | 1 | Perera | MATH     | L3        | 30
  2 | 2 | Perera | ENGL     | L1        | 10
  2 | 3 | Perera | CHIN     | L2        | 50
(6 rows)
The # column is not necessary if you instead want the result sorted by subcode or sublevel.
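With fifteen column groups the repeated UNION ALL gets long. On PostgreSQL 9.3 or later, a more compact unpivot is a LATERAL join over a VALUES list; a sketch covering the first three groups of the hypothetical a_table:
select t.id, v."#", t.name, v.subcode, v.sublevel, v.subhrs
from a_table t
cross join lateral (values
  (1, t.sub1code, t.sub1level, t.sub1hrs),
  (2, t.sub2code, t.sub2level, t.sub2hrs),
  (3, t.sub3code, t.sub3level, t.sub3hrs)  -- ...continue up to sub15
) as v("#", subcode, sublevel, subhrs)
order by t.id, v."#";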
You should consider normalizing the model by splitting the data into two tables, e.g.:
create table students (
  id int primary key,
  name text);

create table hours (
  id int primary key,
  student_id int references students(id),
  code text,
  level text,
  hrs int);

T-SQL - how can I use group by on xml objects

I've written the following query, which I expect to return a data set as outlined under the query.
Query
SELECT
    RelatedRecordID AS [OrganisationID],
    Data.value('(//OpportunityViewEvent/Title)[1]','nvarchar(255)') AS OpportunityTitle,
    Data.value('(//OpportunityViewEvent/ID)[1]','int') AS OpportunityID,
    COUNT(Data.value('(//OpportunityViewEvent/ID)[1]','int')) AS Visits
FROM [Audit].[EventData]
LEFT OUTER JOIN Employed.Organisation AS ORG ON [EventData].RelatedRecordID = ORG.ID
WHERE EventTypeID = 4
GROUP BY RelatedRecordID
ORDER BY Visits DESC
Expected Result
+----------------+------------------+---------------+--------+
| OrganisationID | OpportunityTitle | OpportunityID | Visits |
+----------------+------------------+---------------+--------+
| 23             | Plumber          | 122           | 567    |
| 65             | Accountant       | 34            | 288    |
| 12             | Developer        | 81            | 100    |
| 45             | Driver           | 22            | 96     |
+----------------+------------------+---------------+--------+
I receive an error saying:
Column 'Audit.EventData.Data' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
If I then try to group by the XML data, I get a different error:
XML methods are not allowed in a GROUP BY clause.
Is there a way to work around this?
Thanks
You can do this by moving the XML extraction into a CTE:
;with cte as (
    SELECT
        RelatedRecordID AS [OrganisationID],
        Data.value('(//OpportunityViewEvent/Title)[1]','nvarchar(255)') AS OpportunityTitle,
        Data.value('(//OpportunityViewEvent/ID)[1]','int') AS OpportunityID,
        Data.value('(//OpportunityViewEvent/ID)[1]','int') AS Visit
    FROM [Audit].[EventData]
    LEFT OUTER JOIN Employed.Organisation AS ORG ON [EventData].RelatedRecordID = ORG.ID
    WHERE EventTypeID = 4
)
SELECT OrganisationID, OpportunityTitle, OpportunityID, COUNT(Visit) AS Visits
FROM cte
GROUP BY OrganisationID, OpportunityTitle, OpportunityID
ORDER BY Visits DESC
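An alternative that avoids the CTE is CROSS APPLY, which makes the extracted values referenceable by name in the GROUP BY. A sketch against the same tables and columns as in the question:
SELECT
    e.RelatedRecordID AS OrganisationID,
    x.OpportunityTitle,
    x.OpportunityID,
    COUNT(x.OpportunityID) AS Visits
FROM [Audit].[EventData] AS e
CROSS APPLY (SELECT
        e.Data.value('(//OpportunityViewEvent/Title)[1]','nvarchar(255)') AS OpportunityTitle,
        e.Data.value('(//OpportunityViewEvent/ID)[1]','int') AS OpportunityID
    ) AS x
WHERE e.EventTypeID = 4
GROUP BY e.RelatedRecordID, x.OpportunityTitle, x.OpportunityID
ORDER BY Visits DESC;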

Sum of the most recent non-null columns (window function with "ignore nulls")

I am using PostgreSQL 9.1.9.
In the project I am working on, some of the most recent records have null columns because that information was not available when those rows were created. I have a view that lists sums over the rows belonging to the members of a group. Right now, the view sums the most recent column values, using nulls if those are the most recent values. For example:
table1
group_name | member
-----------+--------
group1     | Andy
group1     | Bob
table2
name | stat_date | col1 | col2 | col3
-----+-----------+------+------+-----
Andy | 6/19/13   | null | 1    | 2
Andy | 6/18/13   | 100  | 3    | 5
Bob  | 6/19/13   | 50   | 9    | 12
Bob  | 6/18/13   | 111  | 31   | 51
-- creating the view would be something like this...
create view v_grouped as
select table1.group_name, table2.stat_date,
       sum(col1) as col1_sum, sum(col2) as col2_sum, sum(col3) as col3_sum
from table1
join table2 on table1.member = table2.name
group by table1.group_name, table2.stat_date;
The current view output looks like this:
group_name | stat_date | col1_sum | col2_sum | col3_sum
-----------+-----------+----------+----------+---------
group1     | 6/19/13   | 50       | 10       | 14
group1     | 6/18/13   | 211      | 34       | 56
Instead of 50, 150 would be a closer representation of the actual group total, despite the lack of data for 6/19: Andy's most recent non-null col1 is 100, plus Bob's 50. So I want an output of:
group_name | stat_date | col1_sum | col2_sum | col3_sum
-----------+-----------+----------+----------+---------
group1     | 6/19/13   | 150      | 10       | 14
group1     | 6/18/13   | 211      | 34       | 56
I've been looking at first_value() from the window functions as a possible candidate. I found that Oracle's first_value() supports an IGNORE NULLS option which I believe will do what I want (http://psoug.org/definition/FIRST_VALUE.htm). According to that page, about Oracle's first_value() function:
If the first value in the result set is NULL then the function returns NULL unless you specify IGNORE NULLS.
If you use the IGNORE NULLS parameter then FIRST_VALUE will return the first non-null value found in the result set. (If all values are null then it will return NULL.)
Example Syntax: FIRST_VALUE(expression [IGNORE NULLS]) OVER (analytic_clause)
But PostgreSQL's first_value() does not support such an option. Is there a way to do this in PostgreSQL? Thank you in advance!
You can use a custom "first non-null" aggregate as a Postgres variant of FIRST_VALUE(expression IGNORE NULLS), or build your own aggregate with the desired behavior.
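A minimal sketch of such an aggregate; the names first_agg and first are our own, not built-ins. It relies on the documented behavior that a STRICT transition function makes the aggregate skip null inputs, so the state ends up holding the first non-null value:

-- transition function: always keep the existing state;
-- STRICT means null inputs are skipped, and the first
-- non-null input seeds the state
create or replace function first_agg(anyelement, anyelement)
returns anyelement
language sql immutable strict as
$$ select $1 $$;

create aggregate first (anyelement) (
  sfunc = first_agg,
  stype = anyelement
);

-- usage sketch (simplified to the latest non-null value per member):
select table1.group_name,
       sum(t.last_col1) as col1_sum
from table1
join (select name, first(col1 order by stat_date desc) as last_col1
      from table2
      group by name) t on t.name = table1.member
group by table1.group_name;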
Is this what you are trying to describe?
SELECT sum(col1), sum(col2), sum(col3) FROM table2 WHERE col1 IS NOT NULL
(although I omitted the join on table1; that is an exercise for the reader)