How to find the average progress (per level) in a hierarchical tasks tree with PostgreSQL?

I'm trying to determine the average progress of each task group in a tasks tree. The problem is that only the leaf nodes (which are in a different table, a child table) actually have the progress field.
I have a Tasks (parent) table and an Activities (child) table. Both of them share IDs. I should point out that I'm using the ltree extension in my DB.
I'm trying to make a view showing the entire tree and a column with the average progress per node level. I have to start the calculations from the innermost level and build my way up the tree structure, but I can't seem to get it right.
I've tried to approach this using recursive CTEs and window functions, to no avail. Postgres doesn't allow aggregate functions in the recursive term of a recursive CTE, so that didn't work.
These are my tables (simplified):
tasks:
=======
id -- SERIAL
parent_id -- INTEGER
project_id -- INTEGER
name -- VARCHAR
task_group -- BOOL
path -- LTREE
activities:
===========
id -- INTEGER
progress -- INTEGER
EDIT
This is a screenshot of the view as it stands right now. I'm able to get the average progress of all leaf nodes, but that's not exactly what I need. The average progress should be calculated for each level in the tree and then rolled up the node hierarchy.
Any insight on this will be much appreciated. Thanks in advance.
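For what it's worth, here is a minimal sketch (not the actual view; the join between tasks and activities is assumed from the description above) of one way to get a per-node average with ltree. Note that it produces a flat average over all leaf descendants of each node, which is not quite the level-by-level average of averages described above, but it may be a useful starting point:

SELECT t.id,
       t.name,
       nlevel(t.path)  AS level,
       AVG(a.progress) AS avg_progress
FROM tasks t
JOIN tasks leaf ON leaf.path <@ t.path   -- every descendant of t (and t itself)
JOIN activities a ON a.id = leaf.id      -- only leaf tasks have an activities row
GROUP BY t.id, t.name, t.path
ORDER BY t.path;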

Related

Linking multiple columns of one table to a single column in another table - Qlik Sense

This is a Qlik Sense question.
I have the below table (project_task) for the tasks and subtasks.
The tree looks something like this (X,Y,Z,A,D are root nodes):
I have a table (task_tree_format) like this in the database, which depicts the project tasks in depth format:
I want to place name1, name2 and name3 in a pivot table so that they appear in hierarchical format (with expand and collapse buttons) in my Qlik Sense sheet.
My requirement is:
When task 'B' (NAME2 column) is selected in the pivot table, it should perform the operation based on the same 'Task Name' in the project_task table.
E.g., if I select task 'B' in the pivot table, it should trigger 'Task Name' B in the project_task table.
In short, I want to associate the NAME1, NAME2 and NAME3 columns with the 'Task Name' column in project_task.
I request your help on how to proceed with this in the data load editor.
Appreciate your help regarding this.
Thanks!
A pivot table cannot be used to represent a tree of unlimited depth. You are trying to use the wrong tool. What you need is a recursive algorithm that reads the table and spits out the tree with the desired indentation and aggregation on another sheet.
That said, you would be better off using MS Project or any proper project management tool rather than reinventing the wheel.

How to reset an auto-incremented value when another value changes?

I'm actually working on a PostgreSQL DB structure and I'm having a hard time figuring out how to solve a problem.
The DB will be recording data regarding architectural objects.
The main table, "object", has attributes that describe the object with information like type, location, etc.
One of these attributes is a serial named object_num.
Another table is called "code" which contains a code made of three letters corresponding to the town where the mission is conducted.
Example:
I'm working on an architectural inventory for the city of Paris. The code_name will be PRS and the first entry (i.e. the first architectural entity: house, bridge, etc.) will be associated with object_num 001.
So PRS001 will be a unique identifier referring to this specific architectural entity.
As things go on, I might end up with quite a few entries, for example entry PRS745.
Say this mission isn't finished yet but a new one starts for the city of Bordeaux, where BDX is going to identify the inventory. It would be great if the identifier for the first entry were BDX001 rather than BDX746 (auto-increment).
Considering this, it would also be nice if, going back to the Paris mission after a few records for the Bordeaux mission (say BDX211), the next value started back at (PRS)745 rather than (BDX)211.
So, is it possible to reset the value of a serial to 1 when using a new code?
And is it possible to resume the serial increment from the last value of a specific code?
I guess you can perform this task with constraints and checks, but I'm not really familiar with these and am a bit lost...
Thanks for your help,
Yrkoutsk
You could create separate sequences for each code_name and grab your auto-increment based on the code_name:
CREATE SEQUENCE PRS START 1;
CREATE SEQUENCE BDX START 1;

-- For a Paris record, draw the number from the PRS sequence:
INSERT INTO your_table (object_num, code_name, other_data)
VALUES ('PRS' || lpad(nextval('PRS')::text, 3, '0'), 'PRS', 'some data');
You will have to create a new sequence every time you add a new code_name, otherwise the database will throw an error when you try to access a nonexistent sequence.
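If you would rather not create those sequences by hand, one possible refinement (only a sketch, assuming PostgreSQL 9.5+ for CREATE SEQUENCE IF NOT EXISTS and a code table with a code_name column) is a trigger that creates the per-code sequence whenever a new code is registered:

-- Sketch: automatically create a sequence named after each new code_name.
CREATE OR REPLACE FUNCTION create_code_sequence() RETURNS trigger AS $$
BEGIN
    EXECUTE format('CREATE SEQUENCE IF NOT EXISTS %I START 1', lower(NEW.code_name));
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER code_sequence_trigger
    AFTER INSERT ON code
    FOR EACH ROW EXECUTE PROCEDURE create_code_sequence();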

Is there a way to include a column from one table in many other tables (while maintaining consistency) in PostgreSQL?

I'm trying to build a database (in PostgreSQL 9.6.6) that allows for one "master column" (items.id) to be replicated into many (automatically generated) tables (e.g. rank1.id, rank2.id, rank3.id, ...). Only items will have INSERTs (or DELETEs) performed, and when they are, the newly added ids should also show up in (or be removed from) the rankX table(s). To be more concrete:
items:
id | name | description
rank1:
id | rank
rank2:
id | rank
...
Where the ids are always the same, and there is always the same number of rows in each of the tables. The rankX.rank values, however, will be different (imagine users ranking how funny a series of images is: the images all have the same ids, but different users might rank them differently).
What I was thinking was that when a new user was added and a new rankX table created I would do the following:
Have rankX.id reference items.id as a foreign key (with ON DELETE CASCADE)
Copy any items.id that already exist
Auto-generate a trigger function that mirrors the INSERTs on items to the rankX table (a sketch follows below)
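For concreteness, a sketch of what step 3 could look like for a single table (assuming rank1(id, rank) with a nullable rank column; the function and trigger names are made up):

-- Mirror every new items.id into rank1; rank stays NULL until the user ranks it.
CREATE OR REPLACE FUNCTION mirror_item_to_rank1() RETURNS trigger AS $$
BEGIN
    INSERT INTO rank1 (id) VALUES (NEW.id);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER items_mirror_rank1
    AFTER INSERT ON items
    FOR EACH ROW EXECUTE PROCEDURE mirror_item_to_rank1();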
This seems cumbersome and wasteful of space since all of the xxxx.id columns are identical and I will end up with hundreds or thousands of trigger functions. As someone new to relational databases I was hoping there was an easier way to achieve this.
So, I have a few questions:
Is there a more efficient way to define my tables such that all of this copying isn't necessary?
If this the best way, can you give an example of how you would set up the triggers (and associated functions)?
Do I need to worry about running out of space on the server as I create (potentially many) sets of triggers of this type?

Postgres count(*) optimization idea

I'm currently working on a project that involves keeping track of users and their actions in my database (PostgreSQL as the RDBMS), and I have run into an issue when trying to perform COUNT(*) on occurrences of each user. What I want is to be able to count, efficiently, the number of times each user appears across all records, and also to be able to look at counts over a particular date range.
So, the problem is how to count the total number of times a user appears in the table's contents, and how to count that total over a date range.
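To make that concrete, the kind of query being optimized looks something like this (table and column names are taken from the rule shown below; the date literals are just placeholders):

SELECT "user", count(*) AS occurrences
FROM main_table
WHERE date_ >= '2017-01-01' AND date_ < '2017-02-01'   -- optional date range
GROUP BY "user";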
What I've tried
As you might know, Postgres doesn't support COUNT(*) very well using indexes, so we have to consider other ways to reduce the number of records it looks at in order to speed up the query. So my first approach is to create a table that keeps track of the number of times a user has a log message associated with them, and on what day (similar to the idea behind a materialized view, but I don't want to continually refresh a materialized view with my count query). Here is what I've come up with:
CREATE TABLE users_counts("user" varchar(65536), counter int default 0, day date);
CREATE RULE inc_user_date_count
AS ON INSERT TO main_table
DO ALSO UPDATE users_counts SET counter = counter + 1
WHERE "user" = NEW."user" AND day = NEW.date_::date;
What this does is: every time a new record is inserted into my 'main_table', we update the users_counts table, incrementing the counter of the row whose day equals the new record's date and whose user name is the same.
NOTE: the date_ column in 'main_table' is a timestamp, so I must cast the new record's date_ to a DATE type.
The problem is that if the user column value doesn't already exist in my new table 'users_counts' for the current day, nothing is updated.
Here is my question:
How do I write the rule so that it checks whether a row exists for the user on the current day; if so, increment that counter, otherwise insert a new row with the user, the day, and a counter of 1?
I would also like to know whether my approach makes sense, or whether there are any ideas I'm missing that I just haven't thought about. As my database grows, it becomes increasingly inefficient to perform counting, so I want to avoid any performance bottlenecks.
EDIT 1: I was able to figure this out by creating a separate RULE, but I'm not sure if this is correct:
CREATE RULE test_insert AS ON INSERT TO main_table
DO ALSO INSERT INTO users_counts("user", counter, day)
SELECT NEW."user", 1, NEW.date_::date
WHERE NOT EXISTS (SELECT 1 FROM users_counts
                  WHERE "user" = NEW."user" AND day = NEW.date_::date);
Basically, an insert happens if the user doesn't already exist in my cached table users_counts for that day, and the first rule above updates the count.
What I'm unsure of is how I know which rule is called first, the update rule or the insert. And there must be a better way: how do I combine the two rules? Can this be done with a function?
It is true that PostgreSQL is notoriously slow when it comes to count(*) queries. However, if you have a WHERE clause that limits the number of entries, the query will be much faster. If you are using PostgreSQL 9.2 or newer, such a query can be just as fast as in MySQL thanks to index-only scans, which were added in 9.2, but it's best to EXPLAIN ANALYZE your query to make sure.
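For example, a covering index along these lines (column names assumed from the question) can allow such a count to be answered with an index-only scan, provided the table has been vacuumed recently enough for the visibility map to be set:

CREATE INDEX main_table_user_date_idx ON main_table ("user", date_);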
Does my solution make sense?
Very much so, provided that your EXPLAIN ANALYZE shows that index-only scans are not being used. Trigger-based solutions like the one you have adopted are widely used. But, as you have realized, the problem of the initial state arises (whether to do an update or an insert).
which rule is called first
Multiple rules on the same table and same event type are applied in alphabetical name order.
(from http://www.postgresql.org/docs/9.1/static/sql-createrule.html)
The same applies to triggers. If you want a particular rule to be executed first, change its name so that it comes earlier in alphabetical order.
how do I combine the two rules?
One solution is to modify your rule to perform an upsert (look right at the bottom of that page for a sample upsert). The other is to populate the counter table with initial values; the trick is to create the trigger at the same time to avoid errors. This blog post explains it really well.
While the initial setup will be slow, each individual insert will probably be faster. The two opposing factors are the slowness of a WHERE NOT EXISTS query versus the overhead of catching an exception.
Tip: A block containing an EXCEPTION clause is significantly more expensive to enter and exit than a block without one. Therefore, don't use EXCEPTION without need.
Source: the PostgreSQL documentation page linked above.
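As a sketch of what combining the two rules could look like today: on PostgreSQL 9.5 or newer (newer than the versions discussed above) you could replace both rules with a single trigger performing an upsert via INSERT ... ON CONFLICT, assuming a unique constraint on ("user", day):

-- Requires: ALTER TABLE users_counts ADD UNIQUE ("user", day);
CREATE OR REPLACE FUNCTION bump_user_count() RETURNS trigger AS $$
BEGIN
    INSERT INTO users_counts ("user", counter, day)
    VALUES (NEW."user", 1, NEW.date_::date)
    ON CONFLICT ("user", day)
    DO UPDATE SET counter = users_counts.counter + 1;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER main_table_count_trigger
    AFTER INSERT ON main_table
    FOR EACH ROW EXECUTE PROCEDURE bump_user_count();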

Silverlight WCF RIA Service select from SQL View vs SQL Table

I have arrived at this dilemma via a tortuous and frustrating route, but I'll start with where I am right now. For information I'm using VS2010, Silverlight 5 and the latest versions of the Silverlight and RIA Toolkits, SDKs etc.
I have a view in my database (it's actually now an indexed view, but that has made no difference to the behaviour). For testing purposes (and that includes testing my sanity) I have duplicated the view as a Table (ie identical column names and definitions), and inserted all the view rows into the table. So if I SELECT * from the view or the table in Query Analyzer, I get identical results. So far so good.
I create an EDM (Entity Data Model) in my Silverlight Business Application web project, including all objects.
I create a Domain Service based on the model, and it creates ContextTypes and metadata for both the View and the Table, and associated Query objects.
If I populate a Silverlight ListBox in my Silverlight project via the Table Query, it returns all the data in the table.
If I populate the same ListBox via the View Query, it returns one row only, always the first row in the collection, however it is ordered. In fact, if I delve into the inner workings via the debugger, when it executes the ObjectContext Query in the service, it returns a result set of the correct number of rows, but all the rows are identical! If I order ascending I get n copies of the first row, descending I get n copies of the last row.
Can anyone put me out of my misery here, and tell me why the View doesn't work?
Ade
OK, well that was predictable - nearly every time I ask a question on a forum I stumble across the answer while I'm waiting for responses to flood in!
Despite having been through the metadata and model.designer files and made sure that all "view" and "table" class/method definitions etc were identical, it was still showing the exasperating difference in behaviour between view and table queries. So the problem just had to be caused by the database, right?
Sure enough, I hadn't noticed myself creating NOT NULL columns when I created the "identical" Table version of my view! Even though I was using a SELECT NEWID() to create a unique key column on the view, the database insisted that the ID column in the view was NULLABLE, and it was apparently this which was causing the problem.
To save some storage space I switched from using NEWID() to using ROW_NUMBER() to create my key column, but I still had the "NULLABLE" property problem. So I then changed it to
SELECT ISNULL(ROW_NUMBER() OVER (...), -1)
for the ID column, and at last the column in the view was created NOT NULL! Even though neither NEWID() nor ROW_NUMBER() can ever generate NULL output, it seems you have to hold SQL Server's hand and reassure it by using the ISNULL operator before it will believe itself.
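For anyone trying to reproduce this, a hypothetical reconstruction of the relevant part of the view definition (table and column names are invented) looks like:

-- The ISNULL wrapper is what makes SQL Server flag the ID column as NOT NULL
-- in the view's metadata.
CREATE VIEW dbo.MyView AS
SELECT ISNULL(ROW_NUMBER() OVER (ORDER BY t.SomeColumn), -1) AS ID,
       t.SomeColumn,
       t.AnotherColumn
FROM dbo.MyTable AS t;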
Having done this and deleted/recreated my model and service files, everything burst into glorious technicolour life without any manual additions of [Key()] properties or anything else. The problem had been with the database all along, and NOT with the Model/Service/Metadata definitions.
Hope this saves someone some time. Now all I need to do is work out why the original stored procedure method I started with two days ago doesn't work - but at least I now have a hint!
Ade