MERGING TABLE VALUES - tsql

I have a single table: comments
Columns are: author_id, id, parent_id, timestamp
Example:
author_id, id, parent_id, timestamp
1, 1, NULL, 12:00 << this is the original post
2, 1234, 1, 12:04 << this is a reply made 4 minutes after
3, 5678, 1, 12:05 << this is another reply
All replies share the OP's 'id' as their 'parent_id'.
What I want is a single table or view ordered so that conversations (OPs and their replies) are grouped as above. What I have right now is a list of all the comments (OPs and replies) simply ordered by time, so lots of conversations overlap. I need to group the conversations, but not with a join, because that repeats each OP for every reply and doubles the columns I need.
Thanks

Assuming that a conversation is linked by id and parent_id, and that a new id means the start of a new conversation, you could write something like this:
select
    ISNULL(parent_id, id) as ConversationId,
    *
from Comments
order by ConversationId, timestamp
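If a reply can share its timestamp with the OP or with another reply, the ordering above may interleave them. A minimal sketch of a tie-breaker that always sorts the OP first within its conversation, assuming the same Comments table:
select
    ISNULL(parent_id, id) as ConversationId,
    *
from Comments
order by
    ConversationId,
    case when parent_id is null then 0 else 1 end, -- OP before replies on timestamp ties
    timestamp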

How to work with data values formatted [{}, {}, {}]

I apologize if this is a simple question - I had some trouble even formatting the question when I was trying to Google for help!
In one of the tables I am working with, there's a data value that looks like the one below:
Invoice ID | Status    | Product List
-----------+-----------+------------------------------------------------------------------------------------
1234       | Processed | [{"product_id":463153},{"product_id":463165},{"product_id":463177},{"pid":463218}]
I want to count how many products each order purchased. What is the proper way to count the values in the "Product List" column? I'm aware that count() is wrong and that I probably need to extract the data from the string value.
select invoice_id, count(Product_list)
from quote_table
where status = 'processed'
group by invoice_id
You can use the json_array_length function and cast the column to the json type (assuming the values are valid JSON), for example:
select invoice_id, json_array_length(Product_list::json) as count
from quote_table
where status = 'processed'
group by invoice_id;
invoice_id | count
------------+-------
1234 | 4
(1 row)
If you need to count only a specific property of the JSON column (json_array_length above counts all four elements, including the one keyed pid rather than product_id), you can use the query below.
It solves the problem with create type, json_populate_recordset, and a subquery that counts the product_id values inside the JSON data.
drop type if exists count_product;
create type count_product as (product_id int);
select
    t.invoice_id,
    t.status,
    (
        select count(*)
        from json_populate_recordset(
            null::count_product,
            t.Product_list
        )
        where product_id is not null
    ) as count_produto_id
from (
    -- symbolic data to use in query
    select
        1234 as invoice_id,
        'processed' as status,
        '[{"product_id":463153},{"product_id":463165},{"product_id":463177},{"pid":463218}]'::json as Product_list
) as t;
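Against the sample row this should return 3 rather than 4, since the fourth element uses the key pid instead of product_id. If you'd rather not create a type, a minimal sketch of the same per-key count using json_array_elements (assuming the quote_table from the question):
select
    invoice_id,
    count(*) filter (where elem ->> 'product_id' is not null) as product_id_count
from quote_table,
    json_array_elements(Product_list::json) as elem
where status = 'processed'
group by invoice_id;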

SQL query to filter where all array items in JSONB array meet condition

I made a similar post before, but deleted it as it had contextual errors.
One of the tables in my database includes a JSONB column which includes an array of JSON objects. It's not dissimilar to this example of a session table which I've mocked up below.
id | user_id | snapshot                                                                                                                      | inserted_at
---+---------+-------------------------------------------------------------------------------------------------------------------------------+----------------------------
1  | 37      | {cart: [{product_id: 1, price_in_cents: 3000, name: "product A"}, {product_id: 2, price_in_cents: 2500, name: "product B"}]}  | 2022-01-01 20:00:00.000000
2  | 24      | {cart: [{product_id: 1, price_in_cents: 3000, name: "product A"}, {product_id: 3, price_in_cents: 5500, name: "product C"}]}  | 2022-01-02 20:00:00.000000
3  | 88      | {cart: [{product_id: 4, price_in_cents: 1500, name: "product D"}, {product_id: 2, price_in_cents: 2500, name: "product B"}]}  | 2022-01-03 20:00:00.000000
The query I've worked with to retrieve records from this table is as follows.
SELECT sessions.*
FROM sessions
INNER JOIN LATERAL (
    SELECT *
    FROM jsonb_to_recordset(sessions.snapshot -> 'cart')
        AS product(
            "product_id" integer,
            "name" varchar,
            "price_in_cents" integer
        )
) AS cart ON true;
I've been trying to update the query above to retrieve only the records in the sessions table for which ALL of the products in the cart have a price_in_cents value greater than 2000.
So far I've had no success forming this query, so I'd be grateful if anyone could point me in the right direction.
You can use a JSON path expression:
select *
from sessions
...
where not sessions.snapshot @@ '$.cart[*].price_in_cents <= 2000'
There is no JSON path expression that would check that all array elements are greater than 2000. So this returns the rows where no element is less than or equal to 2000 - because that can be expressed with a JSON path expression.
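Written out in full against the mocked-up table, the filter looks like this sketch; it should keep sessions 1 and 2 and drop session 3, whose cart contains product D at 1500 cents:
SELECT *
FROM sessions
WHERE NOT (sessions.snapshot @@ '$.cart[*].price_in_cents <= 2000');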
Here is one possible solution based on the idea of your original query.
Each element of the cart JSON array is joined to its parent sessions row. You're left adding the WHERE clause conditions now that the wanted JSON array elements are exposed.
SELECT *
FROM (
    SELECT
        sess.id,
        sess.user_id,
        sess.inserted_at,
        cart_items.cart_name,
        cart_items.cart_product_id,
        cart_items.cart_price_in_cents
    FROM sessions sess,
        LATERAL (SELECT (snapshot -> 'cart') AS snapshot_cart
                 FROM sessions WHERE id = sess.id) snap_arr,
        LATERAL (SELECT
                     (value ->> 'name')::text AS cart_name,
                     (value ->> 'product_id')::int AS cart_product_id,
                     (value ->> 'price_in_cents')::int AS cart_price_in_cents
                 FROM JSONB_ARRAY_ELEMENTS(snap_arr.snapshot_cart)) cart_items
) session_snapshot_cart_product;
Explanation:
From the sessions table, the cart array is extracted and joined to its parent sessions row.
The items of the cart JSON array are then unnested by the second join, using the JSONB_ARRAY_ELEMENTS(jsonb) function.
The following worked well for me and gave me the flexibility to use comparison operators other than just == or <=.
In one of the scenarios I had to build, the WHERE in the subquery also had to compare against an array of values using the IN operator, which wasn't viable with some of the other solutions I looked at.
Leaving this here in case others run into the same issue as I did, or in case others find better solutions or want to propose suggestions to build upon this one.
SELECT *
FROM sessions
WHERE NOT EXISTS (
    SELECT 1
    FROM jsonb_to_recordset(sessions.snapshot -> 'cart')
        AS product(
            "product_id" integer,
            "name" varchar,
            "price_in_cents" integer
        )
    -- a row here is a cart item that violates the condition;
    -- NOT EXISTS keeps only sessions where every item matches
    WHERE NOT (product.name ILIKE 'Product%')
);
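An aggregate-based alternative avoids the double negation; a sketch, again assuming the mocked-up sessions table and the price condition from the question:
SELECT s.*
FROM sessions s
WHERE (
    SELECT bool_and(item.price_in_cents > 2000)  -- true only if every cart item passes
    FROM jsonb_to_recordset(s.snapshot -> 'cart')
        AS item(price_in_cents integer)
);
Note that bool_and returns NULL over an empty cart, so sessions with no cart items are filtered out as well.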

PostgreSQL get nested rows with limit

I have a self-referencing table that's only one level deep: comments and replies. A reply is just a comment with a parent id:
Comments (simplified):
- comment_id
- parentCommentId
Users have to scroll through the comments and replies, and typically 10 new rows are fetched each time. I'm trying out a recursive query for this:
WITH RECURSIVE included_childs(comment_id, parent_comment_id) AS (
    SELECT comment_id, parent_comment_id
    FROM comments
    UNION
    SELECT c.comment_id, c.parent_comment_id
    FROM included_childs ch, comments c
    WHERE c.comment_id = ch.parent_comment_id
)
SELECT *
FROM included_childs
LIMIT 10
Obviously, because of the LIMIT 10, not all the children are included this way and conversations get cut off. What I actually want is a limit on the parents, with all their children included, regardless of how many total rows that yields.
update
This is the actual query, now with limit in the first branch:
WITH RECURSIVE included_childs(comment_id, from_user_id, fk_topic_id, comment_text, parent_comment_id, created) AS (
    (SELECT comment_id, from_user_id, fk_topic_id, comment_text, parent_comment_id, created
     FROM vw_comments
     WHERE fk_topic_id = 2
       AND parent_comment_id IS NULL
     LIMIT 1)
    UNION ALL
    SELECT c.comment_id, c.from_user_id, c.fk_topic_id, c.comment_text, c.parent_comment_id, c.created
    FROM included_childs ch, vw_comments c
    WHERE c.comment_id = ch.parent_comment_id
)
SELECT *
FROM included_childs
Still, this doesn't give me the expected results: I get one comment back with no replies.
update 2
A silly mistake in the WHERE clause:
WHERE c.comment_id = ch.parent_comment_id
should have been
WHERE ch.comment_id = c.parent_comment_id
It's working now.
I think the first branch in the UNION in the recursive CTE should be something like:
SELECT comment_id, parent_comment_id
FROM comments
WHERE parent_comment_id IS NULL
LIMIT 10
Then you'd get all replies for these 10 "root" comments.
I'd expect some sort of ORDER BY in there, unless you don't care for the order.
UNION ALL should be faster than UNION, and there cannot be cycles here, right?
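Putting the pieces together - the LIMIT on the root branch, the corrected join direction from update 2, UNION ALL, and an ORDER BY - the whole query might look like this sketch against the simplified comments table:
WITH RECURSIVE included_childs(comment_id, parent_comment_id) AS (
    (SELECT comment_id, parent_comment_id
     FROM comments
     WHERE parent_comment_id IS NULL
     ORDER BY comment_id   -- or a timestamp column, if you have one
     LIMIT 10)
    UNION ALL
    SELECT c.comment_id, c.parent_comment_id
    FROM included_childs ch
    JOIN comments c ON c.parent_comment_id = ch.comment_id
)
SELECT *
FROM included_childs;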

Find last n entries ordered by association

I have a table Stories and a table Post. Each Story contains multiple Posts (Story.hasMany(models.Post);, Post.belongsTo(models.Story);)
What I'm trying to achieve is to list the first 10 Stories ordered by Posts.createdAt. So it might be possible that the first entry is the oldest Story but with a very new Post.
What I'm trying right now is the following:
var options = {
    limit: 10,
    offset: 0,
    include: [{
        model: models.sequelize.model('Post'),
        attributes: ['id', 'createdAt'],
        required: true
    }],
    order: [
        [models.sequelize.model('Post'), 'createdAt', 'DESC'],
        ['createdAt', 'DESC']
    ],
    attributes: ['id', 'title', 'createdAt']
};
Story.findAll(options)...
Which gives me this SQL query:
SELECT "Story".*, "Posts"."id" AS "Posts.id", "Posts"."createdAt" AS "Posts.createdAt"
FROM (SELECT "Story"."id", "Story"."title", "Story"."createdAt"
FROM "Stories" AS "Story"
WHERE ( SELECT "StoryId"
FROM "Posts" AS "Post"
WHERE ("Post"."StoryId" = "Story"."id") LIMIT 1 ) IS NOT NULL
ORDER BY "Story"."createdAt" DESC LIMIT 10) AS "Story"
INNER JOIN "Posts" AS "Posts" ON "Story"."id" = "Posts"."StoryId"
ORDER BY "Posts"."createdAt" DESC, "Story"."createdAt" DESC;
The problem here is that if the 11th entry has a very new post it is not displayed in the top 10 list.
How can I get a limited list of stories ordered by Posts.createdAt?
Why are you using two nested subselects? That's inefficient, might produce expensive nested loops, and still does not return what you are looking for.
As a start, you can join Stories and Posts and order by the creation timestamp from Posts (see the sketch below), but this still might scan the entire table.
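A minimal sketch of that idea, assuming the table and column names from the generated query above and that "id" is the primary key of "Stories":
SELECT "Story".*
FROM "Stories" AS "Story"
JOIN "Posts" AS "Post" ON "Post"."StoryId" = "Story"."id"
GROUP BY "Story"."id"                  -- valid in PostgreSQL because "id" is the primary key
ORDER BY max("Post"."createdAt") DESC  -- the newest post per story decides the order
LIMIT 10;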
Have a look at this presentation:
http://www.slideshare.net/MarkusWinand/p2d2-pagination-done-the-postgresql-way
But I have no idea how you can bring that into your model :-(

Store ordered items in Cassandra

I'm working on a web application which should use Apache Cassandra to store its data.
I need to store a rating for each item and then get the list of items with the highest rating.
So the task is to store some additional info for items in sorted order, to avoid client-side ordering or ordering with ORDER BY.
One of the possible options is to create index Column Family:
userId {
100_ItemId1 : null,
90__ItemId2 : null,
80__ItemId3 : null,
80__ItemId4 : null
}
Note: userId is a key of the row, 100, 90, 80 - are rating values
But there is an issue with deleting: we need to know the previous rating value to remove the index entry, and that can require storing reversed info in a second Column Family:
reversed_userId {
    ItemId1 : 100_ItemId1,
    ItemId2 : 90__ItemId2,
    ...
}
Could you please tell me whether there are patterns for storing ordered items effectively?
P.S.: I'm not going to use OrderPreservingPartitioner, since it applies to the whole KeySpace and can hurt load balancing and performance.
I'm hoping you'll be happy to know that in CQL 3 you can now use a composite key structure to sort.
http://www.datastax.com/dev/blog/whats-new-in-cql-3-0
So for example:
CREATE TABLE SortedPosts (
    post_id int,
    sort_order int,
    post_title text,
    PRIMARY KEY (post_id, sort_order)
);
sort_order will sort it, and you can run either:
SELECT * FROM SortedPosts WHERE post_id = 1 ORDER BY sort_order ASC;
SELECT * FROM SortedPosts WHERE post_id = 1 ORDER BY sort_order DESC;
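Mapped onto the rating use case from the question, a sketch with hypothetical table and column names might look like:
CREATE TABLE user_item_ratings (
    user_id int,
    rating int,
    item_id text,
    PRIMARY KEY (user_id, rating, item_id)
) WITH CLUSTERING ORDER BY (rating DESC, item_id ASC);
-- highest-rated items first for one user, no client-side sorting needed
SELECT item_id, rating FROM user_item_ratings WHERE user_id = 42;
Deleting an item's old entry still requires knowing its previous rating (part of the clustering key), so the reversed lookup table from the question may still be needed.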