Query to get last conversations for user inbox - postgresql

I need a SQL query that selects the last 10 conversations for a user's inbox.
The inbox shows only conversations (threads), one per other user: for each conversation it selects the latest message and displays it.
Expected result: the latest message from each of the 10 most recent conversations. Facebook shows recent conversations the same way.
One more question: how can I paginate, so that the next page shows the latest messages of the next 10 most recent conversations?
Private messages in the database look like this:
| id | user_id | recipient_id | text
| 1 | 2 | 3 | Hi John!
| 2 | 3 | 2 | Hi Tom!
| 3 | 2 | 3 | How are you?
| 4 | 3 | 2 | Thanks, good! You?

As I understand it, you need the latest message of each conversation for a given user, limited to the 10 most recent conversations.
Update: I have modified the query to pick the latest message of every conversation the user takes part in.
The query below returns the details for user_id = 2; change users.id = 2 to get them for any other user.
I hope this serves your purpose.
SELECT
user_id,
users.name,
users2.name as sent_from_or_sent_to,
subquery.text as latest_message_of_conversation
FROM
users
JOIN
(
SELECT
text,
-- LEAST/GREATEST identify the pair of users; user_id + recipient_id would merge unrelated pairs with equal sums
row_number() OVER ( PARTITION BY LEAST(user_id, recipient_id), GREATEST(user_id, recipient_id) ORDER BY id DESC) AS row_num,
user_id,
recipient_id,
id
FROM
private_messages
) AS subquery ON ( ( subquery.user_id = users.id OR subquery.recipient_id = users.id) AND row_num = 1 )
JOIN users as users2 ON ( users2.id = CASE WHEN users.id = subquery.user_id THEN subquery.recipient_id ELSE subquery.user_id END )
WHERE
users.id = 2
ORDER BY
subquery.id DESC
LIMIT 10
Info: the query returns the latest message of every conversation the user has with any other user. A message that user_id 2 sends to user_id 3 is shown as well, since it marks the start of a conversation.
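One thing to watch when partitioning by conversation is that the key should identify the pair of users regardless of who sent the message. Below is a minimal sketch, assuming the private_messages layout from the question. The rows are inlined with VALUES so the query runs on its own, and messages 5 and 6 are made up for illustration.

```sql
-- Sample rows inlined so the query is self-contained (ids 5 and 6 are hypothetical).
WITH private_messages (id, user_id, recipient_id, text) AS (
    VALUES
        (1, 2, 3, 'Hi John!'),
        (2, 3, 2, 'Hi Tom!'),
        (3, 2, 3, 'How are you?'),
        (4, 3, 2, 'Thanks, good! You?'),
        (5, 1, 4, 'Unrelated thread'),
        (6, 4, 2, 'Hello from user 4')
),
ranked AS (
    SELECT
        pm.*,
        -- LEAST/GREATEST order the pair, so (2, 3) and (3, 2) share one partition,
        -- while (1, 4) and (2, 3) stay apart even though both pairs sum to 5.
        row_number() OVER (
            PARTITION BY LEAST(user_id, recipient_id), GREATEST(user_id, recipient_id)
            ORDER BY id DESC
        ) AS row_num
    FROM private_messages AS pm
)
SELECT id, user_id, recipient_id, text
FROM ranked
WHERE row_num = 1
  AND 2 IN (user_id, recipient_id)   -- inbox of user 2
ORDER BY id DESC
LIMIT 10;
```

With these rows the query should return message 6 (the thread with user 4) followed by message 4 (the thread with user 3); message 5 belongs to a thread that user 2 is not part of.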

To solve a groupwise-max (greatest-n-per-group) problem in PostgreSQL you can use DISTINCT ON, like this:
SELECT
DISTINCT ON(pm.user_id)
pm.user_id,
pm.text
FROM
private_messages AS pm
WHERE pm.recipient_id= <my user id>
ORDER BY pm.user_id, pm.id DESC;
http://sqlfiddle.com/#!12/4021d/19
To get the latest X however we will have to use it in a subselect:
SELECT
q.user_id,
q.id,
q.text
FROM
(
SELECT
DISTINCT ON(pm.user_id)
pm.user_id,
pm.id,
pm.text
FROM
private_messages AS pm
WHERE pm.recipient_id=2
ORDER BY pm.user_id, pm.id DESC
) AS q
ORDER BY q.id DESC
LIMIT 10;
http://sqlfiddle.com/#!12/4021d/28
To get both sent and received threads:
SELECT
q.user_id,
q.recipient_id,
q.id,
q.text
FROM
(
SELECT
DISTINCT ON(pm.user_id,pm.recipient_id)
pm.user_id,
pm.recipient_id,
pm.id,
pm.text
FROM
private_messages AS pm
WHERE pm.recipient_id=2 OR pm.user_id=2
ORDER BY pm.user_id,pm.recipient_id, pm.id DESC
) AS q
ORDER BY q.id DESC
LIMIT 10;
http://sqlfiddle.com/#!12/4021d/42
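If a thread should appear only once no matter who wrote last, and the inbox needs a second page, one possible variation is to normalise the user pair inside DISTINCT ON and page by message id (keyset pagination). This is a sketch under assumptions, not part of the answer above: the params CTE, its column names, and message 5 are hypothetical.

```sql
WITH private_messages (id, user_id, recipient_id, text) AS (
    VALUES
        (1, 2, 3, 'Hi John!'),
        (2, 3, 2, 'Hi Tom!'),
        (3, 2, 3, 'How are you?'),
        (4, 3, 2, 'Thanks, good! You?'),
        (5, 4, 2, 'Hello from user 4')   -- hypothetical extra thread
),
params (me, last_seen_id) AS (
    -- hypothetical inputs: inbox owner, and the smallest message id shown on the previous page
    VALUES (2, 5)
),
threads AS (
    -- one row per pair of users: the message with the highest id in that thread
    SELECT DISTINCT ON (LEAST(pm.user_id, pm.recipient_id),
                        GREATEST(pm.user_id, pm.recipient_id))
           pm.id, pm.user_id, pm.recipient_id, pm.text
    FROM private_messages AS pm
    CROSS JOIN params AS p
    WHERE p.me IN (pm.user_id, pm.recipient_id)
    ORDER BY LEAST(pm.user_id, pm.recipient_id),
             GREATEST(pm.user_id, pm.recipient_id),
             pm.id DESC
)
SELECT t.id, t.user_id, t.recipient_id, t.text
FROM threads AS t
CROSS JOIN params AS p
WHERE t.id < p.last_seen_id   -- drop this filter to get the first page
ORDER BY t.id DESC
LIMIT 10;
```

The page filter is applied after the latest message per thread has been picked. Filtering inside the threads CTE instead would let older messages of threads already shown on the previous page reappear.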

Add an ORDER BY after your WHERE clause:
ORDER BY "ColumnName" [ASC | DESC]
UNION (see its description at W3Schools) combines the results of two statements:
SELECT "ColumnName" FROM "TableName"
UNION
SELECT "ColumnName" FROM "TableName"

For large data sets, you might try running the two statements separately and then consolidating the results, since an index scan on (user_id, id) or (recipient_id, id) ought to be very efficient at getting the 10 most recent messages of each type.
with sent_messages as (
SELECT *
FROM private_messages
WHERE user_id = my_user_id
ORDER BY id desc
LIMIT 10),
received_messages as (
SELECT *
FROM private_messages
WHERE recipient_id = my_user_id
ORDER BY id desc
LIMIT 10),
all_messages as (
select *
from sent_messages
union all
select *
from received_messages)
select *
from all_messages
order by id desc
limit 10
Edit: Actually another query worth trying might be:
select *
from private_messages
where id in (
select id
from (
-- each branch needs its own parentheses to carry ORDER BY / LIMIT
(SELECT id
FROM private_messages
WHERE user_id = my_user_id
ORDER BY id desc
LIMIT 10)
union all
(SELECT id
FROM private_messages
WHERE recipient_id = my_user_id
ORDER BY id desc
LIMIT 10)
) all_ids
order by id desc
limit 10)
order by id desc
This might be better in 9.2+, where index-only scans could fetch the ids from the indexes alone, or in cases where the number of recent messages to retrieve is very large. I'm still a bit unclear on that, though. If in doubt, I'd go for the former version.
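The two index scans this answer relies on could be backed by indexes along the following lines. This is only a sketch: the table definition is an assumption reconstructed from the columns in the question, the index names are made up, and 2 stands in for my_user_id.

```sql
-- Hypothetical table definition matching the columns shown in the question.
CREATE TABLE IF NOT EXISTS private_messages (
    id           bigserial PRIMARY KEY,
    user_id      integer NOT NULL,
    recipient_id integer NOT NULL,
    text         text    NOT NULL
);

-- One index per lookup direction. A b-tree can be scanned backwards,
-- so (user_id, id) can also serve ORDER BY id DESC for a single user_id.
CREATE INDEX IF NOT EXISTS private_messages_sender_idx
    ON private_messages (user_id, id);
CREATE INDEX IF NOT EXISTS private_messages_recipient_idx
    ON private_messages (recipient_id, id);

-- Inspect the plan for one branch of the UNION ALL.
EXPLAIN
SELECT id
FROM private_messages
WHERE user_id = 2
ORDER BY id DESC
LIMIT 10;
```

Whether the planner actually picks an index scan or an index-only scan depends on table size and statistics, so it is worth checking the EXPLAIN output against real data.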

Related

How to find the latest date and price>0 per id?

In PostgreSQL 13, I have a table of ids, dates, and prices.
I simply want the latest date on which the price is greater than 0, per id.
One row per id.
So the desired output is:
id | the_date | price
1 | 2013-08-09 | 0.45
2 | 2013-08-11 | 0.34
I have an SQL fiddle at this link :
https://dbfiddle.uk/?rdbms=postgres_13&fiddle=a89bbbc922601be5465ad764fd035161
I have tried an INNER JOIN with the MAX date unsuccessfully.
SELECT DISTINCT ON (id)
id, the_date, price
FROM inventory
WHERE price>0
ORDER BY id ASC, the_date DESC
You can do something like this:
select i.id, i.the_date, i.price
from inventory as i, (
select id, max(the_date) as max_date
from inventory
where price > 0
group by id
) as c where c.id = i.id and i.the_date = c.max_date
Demo in dbfiddle.uk
This might work for you.
SELECT inventory.id, the_date, price
FROM inventory
join (select id,max(the_date) md from inventory where price>0 group by id ) d
on inventory.id=d.id and the_date=d.md
If you also want a row for ids that have no price greater than 0, you would use a LEFT JOIN.
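The LEFT JOIN remark could look roughly like this. It is a sketch with made-up sample rows, since the real data lives only in the fiddle; id 3 is added to show an id without any positive price.

```sql
WITH inventory (id, the_date, price) AS (
    -- hypothetical rows, chosen to reproduce the desired output from the question
    VALUES
        (1, DATE '2013-08-08', 0.30),
        (1, DATE '2013-08-09', 0.45),
        (1, DATE '2013-08-10', 0.00),
        (2, DATE '2013-08-11', 0.34),
        (2, DATE '2013-08-12', 0.00),
        (3, DATE '2013-08-13', 0.00)
)
SELECT ids.id, latest.the_date, latest.price
FROM (SELECT DISTINCT id FROM inventory) AS ids
LEFT JOIN (
    -- latest row with a positive price, exactly one per id
    SELECT DISTINCT ON (id) id, the_date, price
    FROM inventory
    WHERE price > 0
    ORDER BY id, the_date DESC
) AS latest ON latest.id = ids.id
ORDER BY ids.id;
```

With these rows, ids 1 and 2 come back as in the desired output (2013-08-09 / 0.45 and 2013-08-11 / 0.34), and id 3 appears with NULL for the_date and price.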

Inner join removed from the SQL query

I have the SQL query below, which fetches three records for notification purposes.
SELECT orders.msg
FROM orders
INNER JOIN
(
SELECT id
FROM orders
WHERE type_id = 12
ORDER BY id DESC LIMIT 3 OFFSET 0
) AS items
ON orders.id = items.id;
While trying to optimize the query, I changed it as follows.
SELECT orders.msg
FROM orders
WHERE type_id = 12
ORDER BY id DESC LIMIT 3 OFFSET 0;
Is the modified query OK, or did I miss anything? Is there any other way of doing this?
The simplified version on the bottom looks logically identical, to me, to the one on top:
SELECT msg
FROM orders
WHERE type_id = 12
ORDER BY id DESC LIMIT 3;
Note that the above query could benefit from the following index:
CREATE INDEX idx ON orders (type_id, id, msg);
This index would completely cover the WHERE, ORDER BY, and SELECT clauses.
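On PostgreSQL 11 or later, a variant worth considering keeps msg out of the index key and carries it as an INCLUDE column instead. This is a sketch, not part of the answer above: the table definition is a guess at the minimum needed by the query, and the index name is made up.

```sql
-- Hypothetical minimal table; only the columns used by the query are modelled.
CREATE TABLE IF NOT EXISTS orders (
    id      bigserial PRIMARY KEY,
    type_id integer NOT NULL,
    msg     text
);

-- Key columns serve WHERE and ORDER BY; msg is stored as a non-key payload column.
CREATE INDEX IF NOT EXISTS orders_type_id_id_incl_msg_idx
    ON orders (type_id, id) INCLUDE (msg);

-- Check whether the planner can answer the query from the index.
EXPLAIN
SELECT msg
FROM orders
WHERE type_id = 12
ORDER BY id DESC
LIMIT 3;
```

Whether this results in an index-only scan depends on the planner and on how recently the table was vacuumed, so the EXPLAIN output is the thing to look at.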
You can try this also:
SELECT orders.msg
FROM orders
WHERE orders.id
IN (
SELECT id
FROM orders
WHERE type_id = 12
ORDER BY id
DESC LIMIT 3 OFFSET 0
)

Count With Conditional on PostgreSQL

I have a table with people and another with visits. I want to count all visits but if the person signed up with 'emp' or 'oth' on ref_signup then remove the first visit. Example:
These are my tables:
PEOPLE:
id | ref_signup
---------------------
20 | emp
30 | oth
23 | fri
VISITS
id | date
-------------------------
20 | 10-01-2019
20 | 10-05-2019
23 | 10-09-2019
23 | 10-10-2019
30 | 09-10-2019
30 | 10-07-2019
In this example the visit count should be 4, because the people with ids 20 and 30 have ref_signup set to emp or oth, so their first visit is excluded and only the second and later visits are counted.
This is what I have as a query:
SELECT COUNT(*) as visit_count FROM visits
LEFT JOIN people ON people.id = visits.people_id
WHERE visits.group_id = 1
Would using a CASE inside the count help here? I only want to remove one visit per person, not all of that person's visits.
Subtract from COUNT(*) the number of distinct people.id values whose ref_signup is IN ('emp', 'oth'):
SELECT
COUNT(*) -
COUNT(DISTINCT CASE WHEN p.ref_signup IN ('emp', 'oth') THEN p.id END) as visit_count
FROM visits v LEFT JOIN people p
ON p.id = v.id
See the demo.
Result:
| visit_count |
| ----------- |
| 4 |
Note: this code and demo fiddle use the column names of your sample data.
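To see where the 4 comes from, the same subtraction can be run against the sample rows inlined as VALUES. A small sketch: the date column is renamed visit_date here, and the dates are read as month-day-year, which is an assumption (the count does not depend on it).

```sql
WITH people (id, ref_signup) AS (
    VALUES (20, 'emp'), (30, 'oth'), (23, 'fri')
),
visits (id, visit_date) AS (
    VALUES
        (20, DATE '2019-10-01'),
        (20, DATE '2019-10-05'),
        (23, DATE '2019-10-09'),
        (23, DATE '2019-10-10'),
        (30, DATE '2019-09-10'),
        (30, DATE '2019-10-07')
)
SELECT
    COUNT(*) AS all_visits,
    -- one per emp/oth person that has at least one visit
    COUNT(DISTINCT CASE WHEN p.ref_signup IN ('emp', 'oth') THEN p.id END) AS first_visits_to_skip,
    COUNT(*)
      - COUNT(DISTINCT CASE WHEN p.ref_signup IN ('emp', 'oth') THEN p.id END) AS visit_count
FROM visits v
LEFT JOIN people p ON p.id = v.id;
```

This should return all_visits = 6, first_visits_to_skip = 2 and visit_count = 4.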
Premise: select the count of visits for each person, along with a synthetic column that contains 1 if the referral was from emp or oth and 0 otherwise. Then select the sum of the counts minus the sum of that column.
SELECT SUM(visit_count) - SUM(ignore_first) AS visit_count
FROM (
SELECT COUNT(*) AS visit_count,
CASE WHEN people.ref_signup IN ('emp', 'oth') THEN 1 ELSE 0 END AS ignore_first
FROM visits
LEFT JOIN people ON people.id = visits.people_id
WHERE visits.group_id = 1
GROUP BY people.id, people.ref_signup
) a
Where is "people_id" in your example?
SELECT COUNT(*) as visit_count
FROM visits v
JOIN people p ON p.id = v.people_id
WHERE p.ref_signup IN ('emp','oth');
As for "then remove the first visit": a SELECT that counts rows does not delete anything, so removing the first visit needs a separate statement.
DELETE FROM visits
WHERE id IN (
SELECT v.id
FROM visits v
JOIN people p ON p.id = v.people_id
WHERE p.ref_signup IN ('emp','oth')
ORDER BY v.id
LIMIT 1
);
First, I create the tables
create table people (id int primary key, ref_signup varchar(3));
insert into people (id, ref_signup) values (20, 'emp'), (30, 'oth'), (23, 'fri');
create table visits (people_id int not null, visit_date date not null);
insert into visits (people_id, visit_date) values (20, '10-01-2019'), (20, '10-05-2019'), (23, '10-09-2019'), (23, '10-10-2019'), (30, '09-10-2019'), (30, '10-07-2019');
You can use the row_number() window function to mark which visit is "visit number one":
select
*,
row_number() over (partition by people_id order by visit_date) as visit_num
from people
join visits
on people.id = visits.people_id
Once you have that, you can run another query over those results and use the FILTER clause to count only the rows where visit_num > 1 or ref_signup = 'fri':
-- wrap the first query in a WITH clause
with joined_visits as (
select
*,
row_number() over (partition by people_id order by visit_date) as visit_num
from people
join visits
on people.id = visits.people_id
)
select count(1) filter (where visit_num > 1 or ref_signup = 'fri')
from joined_visits;
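The condition ref_signup = 'fri' fits the sample data, but other signup sources would need to be listed too. A slightly more general sketch, assuming ref_signup is never NULL, keeps every visit of people who are not emp or oth. Dates are written in ISO form here, reading the sample values as month-day-year.

```sql
WITH people (id, ref_signup) AS (
    VALUES (20, 'emp'), (30, 'oth'), (23, 'fri')
),
visits (people_id, visit_date) AS (
    VALUES
        (20, DATE '2019-10-01'),
        (20, DATE '2019-10-05'),
        (23, DATE '2019-10-09'),
        (23, DATE '2019-10-10'),
        (30, DATE '2019-09-10'),
        (30, DATE '2019-10-07')
),
joined_visits AS (
    SELECT
        p.id,
        p.ref_signup,
        v.visit_date,
        -- 1 marks each person's earliest visit
        row_number() OVER (PARTITION BY v.people_id ORDER BY v.visit_date) AS visit_num
    FROM people p
    JOIN visits v ON p.id = v.people_id
)
SELECT count(*) FILTER (
           WHERE visit_num > 1 OR ref_signup NOT IN ('emp', 'oth')
       ) AS visit_count
FROM joined_visits;
```

For the sample rows this should again give 4: one visit each for people 20 and 30, and both visits of person 23.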
-- First get the corrected counts for all users
WITH grouped_visits AS (
SELECT
COUNT(visits.*) -
CASE WHEN people.ref_signup IN ('emp', 'oth') THEN 1 ELSE 0 END
AS visit_count
FROM visits
INNER JOIN people ON (people.id = visits.id)
GROUP BY people.id, people.ref_signup
)
-- Then sum them
SELECT SUM(visit_count)
FROM grouped_visits;
This should give you the result you're looking for.
On a side note, I can't help but think clever use of a window function could do this in a single shot without the CTE.
EDIT: No, it can't, since window functions are evaluated after the WHERE, GROUP BY, and HAVING clauses.

How to get id of the row which was selected by aggregate function? [duplicate]

I have the following data:
id | name | amount | datefrom
---------------------------
3 | a | 8 | 2018-01-01
4 | a | 3 | 2018-01-15 10:00
5 | b | 1 | 2018-02-20
I can group the result with the following query:
select name, max(amount) from table group by name
But I need the id of selected row too. Thus I have tried:
select max(id), name, max(amount) from table group by name
As expected, it returns:
id | name | amount
-----------
4 | a | 8
5 | b | 1
But I need the id to be 3 for the amount of 8:
id | name | amount
-----------
3 | a | 8
5 | b | 1
Is this possible?
PS. This is required for a billing task. On 2018-01-15 the configuration of a was changed: the user consumed a resource for 10 hours with an amount of 8 and for the remaining 14 hours with an amount of 3. I need to bill such a day by the maximum value, so the row with id = 4 is simply ignored for 2018-01-15. (For the next day, 2018-01-16, I will bill the amount of 3.)
So for billing I take the row:
3 | a | 8 | 2018-01-01
If something is wrong with it, I must report that the row with id = 3 is wrong.
But when I use an aggregate function, the information about the id is lost.
It would be great if something like this were possible:
select current(id), name, max(amount) from table group by name
select aggregated_row(id), name, max(amount) from table group by name
Here aggregated_row refers to the row that was selected by the aggregate function max.
UPD
I solved the task like this:
SELECT
(
SELECT id FROM t2
WHERE id = ANY ( ARRAY_AGG( tf.id ) ) AND amount = MAX( tf.amount )
) id,
name,
MAX(amount) ma,
SUM( ratio )
FROM t2 tf
GROUP BY name
UPD
It would be much better to use window functions
There are at least 3 ways, see below:
CREATE TEMP TABLE test (
id integer, name text, amount numeric, datefrom timestamptz
);
COPY test FROM STDIN (FORMAT csv);
3,a,8,2018-01-01
4,a,3,2018-01-15 10:00
5,b,1,2018-02-20
6,b,1,2019-01-01
\.
Method 1. using DISTINCT ON (PostgreSQL-specific)
SELECT DISTINCT ON (name)
id, name, amount
FROM test
ORDER BY name, amount DESC, datefrom ASC;
Method 2. using window functions
SELECT id, name, amount FROM (
SELECT *, row_number() OVER (
PARTITION BY name
ORDER BY amount DESC, datefrom ASC) AS __rn
FROM test) AS x
WHERE x.__rn = 1;
Method 3. using a correlated subquery
SELECT id, name, amount FROM test
WHERE id = (
SELECT id FROM test AS t2
WHERE t2.name = test.name
ORDER BY amount DESC, datefrom ASC
LIMIT 1
);
demo: db<>fiddle
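Closer in spirit to the hypothetical aggregated_row(id) from the question, an ordered aggregate can carry the id along inside a plain GROUP BY. This is a further option beyond the three methods listed, offered as a sketch; the rows are the test data from this answer inlined as VALUES, with datefrom typed as timestamp for brevity.

```sql
WITH test (id, name, amount, datefrom) AS (
    VALUES
        (3, 'a', 8, TIMESTAMP '2018-01-01 00:00'),
        (4, 'a', 3, TIMESTAMP '2018-01-15 10:00'),
        (5, 'b', 1, TIMESTAMP '2018-02-20 00:00'),
        (6, 'b', 1, TIMESTAMP '2019-01-01 00:00')
)
SELECT
    -- ids of the group sorted by amount (largest first), ties broken by earliest datefrom;
    -- the first array element is the id of the row that holds max(amount)
    (array_agg(id ORDER BY amount DESC, datefrom ASC))[1] AS id,
    name,
    max(amount) AS amount
FROM test
GROUP BY name
ORDER BY name;
```

For this data the result should be (3, a, 8) and (5, b, 1). Other aggregates, such as a SUM over another column, can sit next to it in the same select list because the query is still an ordinary GROUP BY.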
You need DISTINCT ON which filters the first row per group.
SELECT DISTINCT ON (name)
*
FROM table
ORDER BY name, amount DESC
You need an inner join against a grouped subquery, matching on both name and amount. Try this:
SELECT T.id, T2.name, T2.amount
FROM TABLE T
INNER JOIN (SELECT name, MAX(amount) amount
FROM TABLE
GROUP BY name) T2
ON T.name = T2.name AND T.amount = T2.amount

PostgreSQL Window Function "column must appear in the GROUP BY clause"

I'm trying to get a leaderboard of summed user scores from a list of user score entries. A single user can have more than one entry in this table.
I have the following table:
rewards
=======
user_id | amount
I want to add up all of the amount values for given users and then rank them on a global leaderboard. Here's the query I'm trying to run:
SELECT user_id, SUM(amount) AS score, rank() OVER (PARTITION BY user_id) FROM rewards;
I'm getting the following error:
ERROR: column "rewards.user_id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT user_id, SUM(amount) AS score, rank() OVER (PARTITION...
Isn't user_id already in an "aggregate function" because I'm trying to partition on it? The PostgreSQL manual shows the following entry which I feel is a direct parallel of mine, so I'm not sure why mine's not working:
SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary;
They're not grouping by depname, so how come theirs works?
For example, for the following data:
user_id | score
===============
1 | 2
1 | 3
2 | 5
3 | 1
I would expect the following output (I have made a "tie" between users 1 and 2):
user_id | SUM(score) | rank
===========================
1 | 5 | 1
2 | 5 | 1
3 | 1 | 3
So user 1 has a total score of 5 and is ranked #1, user 2 is tied with a score of 5 and thus is also rank #1, and user 3 is ranked #3 with a score of 1.
You need to GROUP BY user_id since it's not being aggregated. Then you can rank by SUM(score) descending as you want;
SQL Fiddle Demo
SELECT user_id, SUM(score), RANK() OVER (ORDER BY SUM(score) DESC)
FROM rewards
GROUP BY user_id;
user_id | sum | rank
---------+-----+------
1 | 5 | 1
2 | 5 | 1
3 | 1 | 3
There is a difference between window functions and aggregate functions. Some functions can be used both as a window function and an aggregate function, which can cause confusion. Window functions can be recognized by the OVER clause in the query.
In your case the query splits into two steps: first an aggregate grouped by user_id, then a window function over the resulting total_amount.
SELECT user_id, total_amount, RANK() OVER (ORDER BY total_amount DESC)
FROM (
SELECT user_id, SUM(amount) total_amount
FROM rewards
GROUP BY user_id
) q
ORDER BY total_amount DESC
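The difference between the two kinds of call can be seen side by side on the sample data from the question. A minimal sketch, with the rows inlined as VALUES and the column named amount as in the rewards table; the aliases user_total and user_rank are made up.

```sql
-- Window function only: every input row is kept, so no GROUP BY is needed.
-- This is the shape of the avg(salary) OVER (PARTITION BY depname) example from the manual.
WITH rewards (user_id, amount) AS (
    VALUES (1, 2), (1, 3), (2, 5), (3, 1)
)
SELECT user_id, amount, SUM(amount) OVER (PARTITION BY user_id) AS user_total
FROM rewards
ORDER BY user_id, amount;

-- Plain aggregate plus window function: SUM(amount) collapses rows per user_id,
-- so user_id must be in GROUP BY; RANK() then runs over the grouped rows.
WITH rewards (user_id, amount) AS (
    VALUES (1, 2), (1, 3), (2, 5), (3, 1)
)
SELECT user_id,
       SUM(amount) AS score,
       RANK() OVER (ORDER BY SUM(amount) DESC) AS user_rank
FROM rewards
GROUP BY user_id
ORDER BY user_rank, user_id;
```

The first statement returns four rows (one per reward entry), each annotated with its user's total. The second returns three rows, with users 1 and 2 tied at rank 1 and user 3 at rank 3.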
If you have
SELECT user_id, SUM(amount) ....
^^^
aggregate function (not a window function)
....
FROM .....
You need
GROUP BY user_id