I have a recursive table that contains item codes and their parents.
Table
items
(
item_code varchar2(40),
item_parent varchar2(40)
)
Problem
When I specify a level and a range of items in the query, I want each item's parents to be returned up to that level. Instead, when I specify the level, only the items at exactly that level are returned.
I want to retrieve the items from level 1 down to the specified level.
Example
select item_parent
from items
where level = 3 and item_code between item_code and item_code
I want all the items from level 1 to level 3, including the items named in the select statement,
not only the items at level 3.
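In an Oracle hierarchical query the usual change would presumably be to filter on level <= 3 instead of level = 3. The sketch below shows the same idea in a form that can be run anywhere: Python's sqlite3 module with a recursive CTE that stops descending once the requested level is reached. It assumes that root items have a NULL item_parent, which the question does not state; the function name and sample rows are made up for illustration.

```python
import sqlite3

def items_up_to_level(conn, max_level):
    """Return (item_code, level) for every item from level 1 to max_level."""
    sql = """
        WITH RECURSIVE tree(item_code, lvl) AS (
            -- level 1: assumed to be the rows without a parent
            SELECT item_code, 1 FROM items WHERE item_parent IS NULL
            UNION ALL
            -- descend one level at a time, stopping at max_level
            SELECT i.item_code, t.lvl + 1
            FROM items AS i
            JOIN tree AS t ON i.item_parent = t.item_code
            WHERE t.lvl < ?
        )
        SELECT item_code, lvl FROM tree ORDER BY lvl, item_code
    """
    return conn.execute(sql, (max_level,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item_code TEXT, item_parent TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [("A", None), ("B", "A"), ("C", "B"), ("D", "C")],
)

rows = items_up_to_level(conn, 3)
print(rows)  # levels 1 to 3; item D at level 4 is left out
```

The point is the comparison: a filter of the form "level up to N" keeps the whole path, while "level equal to N" keeps only the last row of it.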
Related
Assume I have a table sets with a field filters containing an array of key-value mappings, a table items to which a select query must be applied to extract rows based on these filters, and an association table for the M:M relation that links each set with each item. I am looking for a method or mechanism to cancel the select query if sets.filters was updated; otherwise the M:M relation will be built from filters that are already stale and will be invalid.
The concrete scenario in which the problem occurs is:
1. Receive a file with item data, parse it, and insert into items, returning the new ids (the primary keys);
2. After the insertion, select the filters from the relevant sets;
3. Take the item ids and select from items using the filters;
4. Update the M:M association table for all the items returned at step 3.
Unfortunately, between steps 3 and 4, or even earlier, an API call updates one of the sets rows and changes its filters. As a result the M:M table is invalid, because one filter was changed. Say the filters contained an expression like weight <= 100 kilos and after the update it became weight <= 50 kilos; any new items with a weight greater than 50 should then not be in the M:M table.
Is there an efficient way to cancel the select query from items during the transaction? Or maybe there is a stricter kind of query to use. My idea is to roll back the changes after the fact by checking the sets.modified_at column, but that seems like extra work that wastes disk and CPU time.
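One way to picture the modified_at idea without a rollback after the fact is an optimistic check at write time: remember which version of the set the filters were read from, and write the M:M rows only if the set still has that version. The sketch below uses Python's sqlite3 module so it can be run as is. The version column, the set_items table and the link_items function are hypothetical names, not part of the question. In PostgreSQL the check alone would not close the race; locking the sets rows for the duration of the transaction (for example with SELECT ... FOR SHARE) or a stricter isolation level may be the more direct tool.

```python
import sqlite3

def link_items(conn, set_id, item_ids, seen_version):
    """Insert M:M rows only if the set still has the version the filters came from."""
    with conn:  # one transaction: commit on success, rollback on exception
        still_current = conn.execute(
            "SELECT 1 FROM sets WHERE id = ? AND version = ?",
            (set_id, seen_version),
        ).fetchone()
        if still_current is None:
            return False  # filters changed meanwhile: re-read them and retry
        conn.executemany(
            "INSERT INTO set_items (set_id, item_id) VALUES (?, ?)",
            [(set_id, item_id) for item_id in item_ids],
        )
    return True

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sets (id INTEGER PRIMARY KEY, filters TEXT, version INTEGER)")
conn.execute("CREATE TABLE set_items (set_id INTEGER, item_id INTEGER)")
conn.execute("INSERT INTO sets VALUES (1, 'weight <= 100', 1)")
conn.commit()

first = link_items(conn, 1, [10, 11], seen_version=1)  # filters unchanged

# Simulate the API call that changes the filter and bumps the version.
conn.execute("UPDATE sets SET filters = 'weight <= 50', version = version + 1 WHERE id = 1")
conn.commit()

second = link_items(conn, 1, [12], seen_version=1)  # stale version, nothing written
linked = [row[0] for row in conn.execute("SELECT item_id FROM set_items ORDER BY item_id")]
```

When link_items returns False, the caller repeats steps 2 to 4 with the fresh filters, so no invalid rows have to be undone.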
The problem
I'm struggling with a hierarchical problem with Postgres.
I have a multi-level hierarchical structure (let's assume 3 levels for the purpose of the question)
-> category
--> subcategory
---> item
These are different tables, but I can simply convert them into one child-parent table / CTE. Each parent category can have multiple children.
I also have another table, let's say UserPreferences, which holds the relation between User and Items (the Items a user selected as preferences).
What I need is a query that returns all user preferences, BUT with the requirement that if all nodes under a parent are selected, the parent name is presented instead of the list of children.
Example
If we have the following situation:
-> category 1
--> subcategory 1
---> item 1 (selected by user)
---> item 2 (selected by user)
--> subcategory 2
---> item 3 (selected by user)
---> item 4
-> category 2
--> subcategory 3
---> item 5 (selected by user)
Then the desired output for the user is:
subcategory 1, item 3, category 2
Note that the query should work for multiple users at once, so a function is not an option.
Attempts
I made multiple attempts at writing such a query, using:
a recursive CTE: I had problems writing the proper condition, though. I'm also a bit worried about whether the performance will be good enough
GROUP BY ROLLUP plus some left lateral joins to get counts within categories: here I had problems with the "category 2" situation above, where in my case both "category 2" and "subcategory 3" would be presented instead of only "category 2"
Does anyone have any suggestions on what the best approach for this would be?
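To pin down the rule itself, here is a small application-side sketch in Python. It is not the single SQL query asked for, only an illustration of the collapse logic: a node is reported by name when every leaf beneath it is selected, otherwise whatever its children report is passed up. The dictionary layout and the function name are assumptions made for the example.

```python
def collapse(children, roots, selected):
    """Replace fully selected subtrees by the name of their topmost node.

    children: parent name -> list of child names (leaves have no entry)
    roots:    top-level category names, in display order
    selected: set of item names one user selected
    """
    def visit(node):
        kids = children.get(node, [])
        if not kids:  # leaf item
            picked = node in selected
            return picked, ([node] if picked else [])
        all_full, shown = True, []
        for kid in kids:
            kid_full, kid_shown = visit(kid)
            all_full = all_full and kid_full
            shown.extend(kid_shown)
        # Fully selected: show this node instead of anything below it.
        return all_full, ([node] if all_full else shown)

    output = []
    for root in roots:
        output.extend(visit(root)[1])
    return output

children = {
    "category 1": ["subcategory 1", "subcategory 2"],
    "subcategory 1": ["item 1", "item 2"],
    "subcategory 2": ["item 3", "item 4"],
    "category 2": ["subcategory 3"],
    "subcategory 3": ["item 5"],
}
roots = ["category 1", "category 2"]
selected = {"item 1", "item 2", "item 3", "item 5"}

result = collapse(children, roots, selected)
print(", ".join(result))  # subcategory 1, item 3, category 2
```

For several users the same function can be called once per user's set of selected items. A recursive CTE would need to express the same bottom-up comparison of selected leaves against all leaves per node.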
You should have a database design that represents this hierarchy with intervals instead of nested set. Then the query will be simple...
I wrote some papers about that:
https://sqlpro.developpez.com/cours/arborescence/#LII
But they are in French...
The query to find the upper level when all subitems are checked is:
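SELECT *
FROM INTERVAL_TREE AS IT
-- number of descendants of IT (integer division, assuming consecutive integer bounds)
WHERE (RIGHT_BOUND - LEFT_BOUND) / 2 =
(SELECT COUNT(*)
FROM INTERVAL_TREE
WHERE CHECKED = 1
-- descendants lie strictly inside the interval of IT
AND LEFT_BOUND > IT.LEFT_BOUND
AND RIGHT_BOUND < IT.RIGHT_BOUND)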
Of course the predicate "CHECKED = 1" can be a subquery...
In my batch process, data has to be selected from a SQL database and exported as an XML file. To do that, I have to select all data for one parent element, so that I can export the parent node and all its child nodes as XML.
I have a table like the following example:
|key|parent|child|
------------------
|yxc|par001|chi01|
|xcv|par001|chi02|
|cvb|par002|chi03|
|vbn|par003|chi04|
|bnm|par003|chi05|
Now I want to select every parent and its child elements, and process the parents one after another. For the example table above the order should be: par001 -> par002 -> par003. The exported XML should look like the following:
<par001>
<chi01></chi01>
<chi02></chi02>
</par001>
<par002>
<chi03></chi03>
</par002>
...
How can I select the data so that I can process the parent elements one after another? Is this possible with a JpaItemReader?
I would break the problem down into two steps:
step 1 does a select distinct(parent) from your_table and stores the result in the job execution context (the result is a list of Strings or IDs, not entire items, so it's fine to store it in the execution context in order to share it with the next step)
step 2 reads the parent IDs from the execution context and iterates over them using an item reader. An item processor enriches each item with its children before passing the enriched items to a StaxEventItemWriter
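The data flow of the two steps can be sketched outside Spring Batch. The Python below is only an illustration of that flow, using sqlite3 and xml.etree.ElementTree from the standard library in place of the job execution context, item reader, processor and StaxEventItemWriter; the table name your_table comes from the answer, the to_xml helper is made up.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE your_table ("key" TEXT, parent TEXT, child TEXT)')
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("yxc", "par001", "chi01"),
        ("xcv", "par001", "chi02"),
        ("cvb", "par002", "chi03"),
        ("vbn", "par003", "chi04"),
        ("bnm", "par003", "chi05"),
    ],
)

# Step 1: collect only the distinct parent ids (small enough to hand to step 2).
parents = [
    row[0]
    for row in conn.execute("SELECT DISTINCT parent FROM your_table ORDER BY parent")
]

# Step 2: for each parent, load its children and build one XML element.
def to_xml(parent):
    node = ET.Element(parent)
    query = "SELECT child FROM your_table WHERE parent = ? ORDER BY child"
    for (child,) in conn.execute(query, (parent,)):
        ET.SubElement(node, child)
    return ET.tostring(node, encoding="unicode")

documents = [to_xml(parent) for parent in parents]
for document in documents:
    print(document)
```

Each parent is handled as one unit, so the writer always receives a complete parent node with all of its children.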
I'm developing an application with SQLAlchemy and PostgreSQL. Users of the system modify data in 8 or so tables. Consider this contrived example schema:
I want to add visible logging to the system to record what has changed, but not necessarily how it has changed. For example: "User A modified product Foo", "User A added user B" or "User C purchased product Bar". So basically I want to store:
Who made the change
A message describing the change
Enough information to reference the object that changed, e.g. the product_id and customer_id when an order is placed, so the user can click through to that entity
I want to show each user a list of recent and relevant changes when they log in to the application (a bit like the main timeline in Facebook etc). And I want to store subscriptions, so that users can subscribe to changes, e.g. "tell me when product X is modified", or "tell me when any products in store S are modified".
I have seen the audit trigger recipe, but I'm not sure it's what I want. That audit trigger might do a good job of recording changes, but how can I quickly filter it to show recent, relevant changes to the user? Options that I'm considering:
Have one column per ID type in the log and subscription tables, with an index on each column
Use full text search, combining the ID types as a tsvector
Use an hstore or json column for the IDs, and index the contents somehow
Store references as URIs (strings) without an index, and walk over the logs in reverse date order, using application logic to filter by URI
Any insights appreciated :)
Edit: It seems what I'm talking about is an activity stream. The suggestion in this answer to filter by time first sounds pretty good.
Since the objects all use uuid for the id field, I think I'll create the activity table like this:
Have a generic reference to the target object, with a uuid column with no foreign key, and an enum column specifying the type of object it refers to.
Have an array column that stores generic uuids (maybe as text[]) of the target object and its parents (e.g. parent categories, store and organisation), and search the array for matching subscriptions. That way a subscription for a parent category can match a child in one step (denormalised).
Put a btree index on the date column, and (maybe) a GIN index on the array UUID column.
I'll probably filter by time first to reduce the amount of searching required. Later, if needed, I'll look at using GIN to index the array column (this partially answers my question "Is there a trick for indexing an hstore in a flexible way?")
Update: this is working well. The SQL to fetch a timeline looks something like this:
SELECT *
FROM (
SELECT DISTINCT ON (activity.created, activity.id)
*
FROM activity
LEFT OUTER JOIN unnest(activity.object_ref) WITH ORDINALITY AS act_ref
ON true
LEFT OUTER JOIN subscription
ON subscription.object_id = act_ref.act_ref
WHERE activity.created BETWEEN :lower_date AND :upper_date
AND subscription.user_id = :user_id
ORDER BY activity.created DESC,
activity.id,
act_ref.ordinality DESC
) AS sub
WHERE sub.subscribed = true;
Joining with unnest(...) WITH ORDINALITY, ordering by ordinality, and selecting distinct on the activity ID filters out activities that have been unsubscribed from at a deeper level. If you don't need to do that, then you could avoid the unnest and just use the array containment @> operator, and no subquery:
SELECT *
FROM activity
JOIN subscription ON activity.object_ref @> ARRAY[subscription.object_id]
WHERE subscription.user_id = :user_id
AND activity.created BETWEEN :lower_date AND :upper_date
ORDER BY activity.created DESC;
You could also join with the other object tables to get the object titles - but instead, I decided to add a title column to the activity table. This is denormalised, but it doesn't require a complex join with many tables, and it tolerates objects being deleted (which might be the action that triggered the activity logging).
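The matching rule of the timeline query can be spelled out in plain Python. This is a sketch of the logic only, not of the SQL: it assumes object_ref lists the references from the outermost parent to the target object (so a later position is a deeper level), and that one user's subscriptions are available as a mapping from object id to the subscribed flag. The function name and the sample ids are invented.

```python
from datetime import datetime

def timeline(activities, subscriptions, lower, upper):
    """Return the activities a user should see, newest first.

    activities:    dicts with 'id', 'created' and 'object_ref'
                   (assumed order: outermost parent first, target object last)
    subscriptions: object id -> subscribed flag, for a single user
    """
    visible = []
    for activity in activities:
        # Filter by time first to keep the search small.
        if not (lower <= activity["created"] <= upper):
            continue
        # The deepest reference that has a subscription row decides.
        for ref in reversed(activity["object_ref"]):
            if ref in subscriptions:
                if subscriptions[ref]:
                    visible.append(activity)
                break
    visible.sort(key=lambda a: a["created"], reverse=True)
    return visible

subscriptions = {"store-S": True, "product-X": False}  # unsubscribed from X
activities = [
    {"id": 1, "created": datetime(2024, 1, 2), "object_ref": ["store-S", "product-X"]},
    {"id": 2, "created": datetime(2024, 1, 3), "object_ref": ["store-S", "product-Y"]},
    {"id": 3, "created": datetime(2023, 6, 1), "object_ref": ["store-S", "product-Y"]},
]

shown = timeline(activities, subscriptions, datetime(2024, 1, 1), datetime(2024, 1, 31))
shown_ids = [a["id"] for a in shown]
```

Activity 1 is hidden because the deeper subscription (product-X) is switched off even though the store is subscribed; activity 3 falls outside the date range.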
I retrieve an ordered list of items from a table of items in a SQLite database. How can I swap the ids, and therefore the order, of two items in the table?
The id shouldn't determine position or ordering. It should be an immutable identifier.
If you need to represent order in a database, you need to create another column, orderNumber. Two options are (1) values that span a range or (2) a pointer to the next item (like a linked list).
For ranges: spanning a range lets you avoid rewriting the orderNumber column for every item after the insert point. For example, the first insert gets 1, the second gets the maximum of the range, and a third inserted between them gets the mid-range number; when you reposition an item, you assign it the midpoint of the two items it now sits between. One downside is that if the list gets enough churn you may have to rebalance the ranges (a large span minimizes this). The advantage of this solution is that you can get the ordered list just by ordering by this column in the SQL statement.
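A minimal sketch of the range idea, using Python's sqlite3 module. The span of 1 to 1,000,000 and the helper names add and midpoint are arbitrary choices for the example; only the orderNumber column comes from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, orderNumber INTEGER)"
)

RANGE_MIN, RANGE_MAX = 1, 1_000_000  # hypothetical span; larger means fewer rebalances

def add(name, order_number):
    """Insert an item at the given position value and return that value."""
    conn.execute(
        "INSERT INTO items (name, orderNumber) VALUES (?, ?)", (name, order_number)
    )
    return order_number

def midpoint(low, high):
    """Position halfway between two neighbours; fails when the gap is used up."""
    mid = (low + high) // 2
    if mid in (low, high):
        raise ValueError("no gap left between neighbours; rebalance the range")
    return mid

a = add("A", RANGE_MIN)       # first insert
b = add("B", RANGE_MAX)       # second insert goes to the end of the range
c = add("C", midpoint(a, b))  # inserted between A and B
d = add("D", midpoint(a, c))  # inserted between A and C

names = [row[0] for row in conn.execute("SELECT name FROM items ORDER BY orderNumber")]
print(names)  # A, D, C, B
```

No existing row is rewritten on insert; only the new row gets a value, and the ValueError marks the point where a rebalance would be needed.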
For a linked list: if the table has a next column that points to the id that comes after it in order, you need to update a couple of rows to insert something. The upside is that it's simple. The downside is that you can't get the order with a simple ORDER BY in the SQL statement; you're relying on the code that gets the list to sort it.
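The linked-list variant might look like the following sketch, again with sqlite3. The column name next_id and both helper functions are assumptions for illustration; the ordering happens in application code by following the pointers from a known head row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, next_id INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, "A", 2), (2, "B", None)],  # A -> B -> end of list
)

def insert_after(prev_id, name):
    """Insert a new item right after prev_id: one new row plus one updated row."""
    with conn:
        (old_next,) = conn.execute(
            "SELECT next_id FROM items WHERE id = ?", (prev_id,)
        ).fetchone()
        new_id = conn.execute(
            "INSERT INTO items (name, next_id) VALUES (?, ?)", (name, old_next)
        ).lastrowid
        conn.execute("UPDATE items SET next_id = ? WHERE id = ?", (new_id, prev_id))
    return new_id

def in_order(head_id):
    """Load all rows, then follow the next pointers starting at the head."""
    rows = {
        item_id: (name, next_id)
        for item_id, name, next_id in conn.execute("SELECT id, name, next_id FROM items")
    }
    names, current = [], head_id
    while current is not None:
        name, current = rows[current]
        names.append(name)
    return names

insert_after(1, "C")
names = in_order(1)
print(names)  # A, C, B
```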
One other variation is to pull the ordered-list data out of that table altogether. For example, you could have an ordered list table that has listid, itemid, orderedNumber. That allows you to have one or more logical ordered lists of the items in the table it references.
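With a separate ordered list table, swapping two items means exchanging their orderedNumber values while the ids stay untouched. A minimal sketch with sqlite3, using the column names from the paragraph above; the table name ordered_list and the swap function are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ordered_list (listid INTEGER, itemid INTEGER, orderedNumber INTEGER)"
)
conn.executemany(
    "INSERT INTO ordered_list VALUES (?, ?, ?)",
    [(1, 101, 10), (1, 102, 20), (1, 103, 30)],
)

def swap(list_id, item_a, item_b):
    """Exchange the positions of two items of one list; their ids never change."""
    with conn:  # both updates in one transaction
        positions = dict(
            conn.execute(
                "SELECT itemid, orderedNumber FROM ordered_list "
                "WHERE listid = ? AND itemid IN (?, ?)",
                (list_id, item_a, item_b),
            )
        )
        update = "UPDATE ordered_list SET orderedNumber = ? WHERE listid = ? AND itemid = ?"
        conn.execute(update, (positions[item_b], list_id, item_a))
        conn.execute(update, (positions[item_a], list_id, item_b))

swap(1, 101, 103)
order = [
    row[0]
    for row in conn.execute(
        "SELECT itemid FROM ordered_list WHERE listid = 1 ORDER BY orderedNumber"
    )
]
print(order)  # 103, 102, 101
```

If the position column carried a UNIQUE constraint, the two updates would need a temporary value in between; the sketch assumes there is none.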
Some other references:
How to store ordered items which often change position in DB
Best way to save a ordered List to the Database while keeping the ordering
https://dba.stackexchange.com/questions/5683/how-to-design-a-database-for-storing-a-sorted-list