Postgresql Trigger function for setting value in column based on most recent timestamp - postgresql

I'm having troubles with creating a trigger function in postgresql which should do the following:
I have a table with category(varchar), edit_date(timestamp) and actual(boolean).
I already have a trigger which sets the actual timestamp on UPDATE or INSERT. The column 'actual' should then say TRUE, if this edit_date is the most current one for a category. So when there are two entities from category a, the one with the more recent edit_date should be set to TRUE, while the other one of this category should be set to FALSE.
Thank you so much for your help, I'm pretty stuck on this one.
Best

Related

DB Architecture question: Would using a JSONB column for storing row changes be an efficient solution?

I have an app which allows for dynamic table and column generation. One feature we are looking to implement is change tracking for each row. Anytime any changes occur, we want to keep track of the field and what the new value is.
One potential solution we are looking at is adding a "history" JSONB column to the table which would contain an ongoing array of every change. When a field is updated or multiple fields are updated, we could append a new element to the history field by using the || append syntax which will append the element to the existing jsonb vlaue:
update table set history = history || '{new node with changes}'
When viewing the "history changes", it would all be contained in this JSONB field and we'd have logic to parse out /display the changes over time.
Note: We won't need to query this JSONB column. It's simply the place to store all the row level changes. It seems like this should be an efficient way of saving updates.
My question is, is this a viable solution in regards to performance over time? It's possible this history field could become large over time, so would making any row changes become slower the larger the field? Or does the fact we are using the || operator to append data to the field mitigate any performance issues?
FYI, I am currently on Postgres 11, but could upgrade to newer versions if that had an impact.
Thanks for any feedback you can provide.

automatically change column values to lower case while inserting

I have a table in db2 which is having a varchar column. I want to insert only lower case string in the column.
Is it possible to change the case to lower whenever an insertion happens in that column. What will be the alter
Query for that if possible ?
I can not make another column which can take reference of my current column and be referenced like ucase(Current_column)
The means to ensure the effect of lower-casing the data that is inserted into a column, i.e. "to change the case to lower whenever an insertion happens in that column", is a TRIGGER.
Presumably much like Why is my “before update” trigger changing unexpected columns?, per having noted in a followup comment to the OP that "I tried making a BEFORE INSERT", a TRIGGER similar to the following apparently was implemented in that attempt?:
CREATE TRIGGER TOLOWER_BI BEFORE INSERT ON USERS
REFERENCING NEW AS N OLD AS O FOR EACH ROW MODE DB2ROW
set N.LoginId= lcase(N.LoginId)
If so, then "the application is not picking the trigger" [also from a followup comment to the OP] must be explained, because a TRIGGER is in effect at the database layer, such that an application has no choice [no picking] with regard to the effects of the trigger being enforced.

Postgres count(*) optimization idea

I'm currently working on a project involving keeping track of users and their actions with my database (PostgreSQL as the RDMS), and I have run into an issue when trying to perform COUNT(*) on occurrences of each user. What I want is to be able to, efficiently, count the number of times each user appears from every record, and also be able to achieve looking at counts on a particular date range.
So, the problem is how do we achieve counting the total number of times a user appears from the tables contents, and how do we count the total number on a date range.
What I've tried
As you might know, Postgres doesn't support COUNT(*) very well using indices, so we have to consider other ways to reduce the # of records it looks at in order to speed up the query. So my first approach is to create a table to keep track of the number of times a user has a log message associated with them, and on what day (similar to the idea behind a materialized view, but I dont want continually refresh the materialized view with my count query). Here is what I've come up with:
CREATE TABLE users_counts(user varchar(65536), counter int default 0, day date);
CREATE RULE inc_user_date_count
AS ON INSERT TO main_table
DO ALSO UPDATE users_counts SET counter = counter + 1
WHERE user = NEW.user AND day = DATE(NEW.date_);
What this does is every time a new record is inserted into my 'main_table', we update the current users_counts table to increment the records whose date is equal to the new records date, and the user names are the same.
NOTE: the date_ column in 'main_table' is a timestamp so I must cast the new records date_ to be a DATE type.
The problem is, what if the user column value doesn't already exist in my new table 'users_count' for the current day, then nothing is updated.
Here is my question:
How do I write the rule such that we check if a user exists for the current day, if so increment that counter, otherwise insert new row with user, day, and counter of 1;
I also would like to know if my approach makes sense to do, or is there any ideas I am missing that I just haven't thought about. As my database grows, it is increasingly inefficient to perform counting, so I want to avoid any performance bottlenecks.
EDIT 1: I was able to actually figure this out by creating a separate RULE but I'm not sure if this is correct:
CREATE RULE test_insert AS ON INSERT TO main_table
DO ALSO INSERT INTO users_counts(user, counter, day)
SELECT NEW.user, 1, DATE(NEW.date)
WHERE NOT EXISTS (SELECT user FROM users.log_messages WHERE user = NEW.user_);
Basically, an insert happens if the user doesn't already exist in my CACHED table called user_counts, and the first rule above updates the count.
What I'm unsure of is how do I know when which rule is called first, the update rule or insert.. And there must be a better way, how do I combine the two rules? Can this be done with a function?
It is true that postgresql is notoriously slow when it comes to count(*) queries. However if you do have a where clause that limits the number of entries the query will be much faster. If you are using postgresql 9.2 or newer this query will be just as fast as it's in mysql because of index only scans which was added in 9.2 but it's best to explain analyze your query to make sure.
Does my solution make sense?
Very much so provided that your explain analyze show that index only scans are not being used. Trigger based solutions like the one that you have adapted find wide usage. But as you have realized the problem with the initial state arises (whether to do an update or an insert).
which rule is called first
Multiple rules on the same table and same event type are applied in
alphabetical name order.
from http://www.postgresql.org/docs/9.1/static/sql-createrule.html
the same applies for triggers. If you want a particular rule to be executed first change it's name so that it comes up higher in the alphabetical order.
how do I combine the two rules?
One solution is to modify your rule to perform an upsert (Look right at the bottom of that page for a sample upsert ). The other is to populate the counter table with initial values. The trick is to create the trigger at the same time to avoid errors. This blog post explains it really well.
While the initial setup will be slow each individual insert will probably be faster. The two opposing factors being the slowness of a WHERE NOT EXISTS query vs the overhead of catching an exception.
Tip: A block containing an EXCEPTION clause is significantly more
expensive to enter and exit than a block without one. Therefore, don't
use EXCEPTION without need.
Source the postgresql documentation page linked above.

Select new or updated records com table in Postgresql

It's possible to get all the new or update records from one table in postgresql by
specified date?
something like this:
Select NEW OR UPDATED FROM anyTable WHERE dt_insert or dt_update = '2015-01-01'
tks
You can only do this if you added a trigger-maintained field that keeps track of the last change time.
There is no internal row timestamp in PostgreSQL, so in the absence of a trigger-maintained timestamp for the row, there's no way to find rows changed/added after a certain time.
PostgreSQL does have internal information on the transaction ID that wrote a row, stored in the xmin hidden column. There's no record of what transaction ID committed when, though, until PostgreSQL 9.5 which keeps track of this if and only if the new track_commit_timestamps setting is turned on. Additionally, PostgreSQL eventually clears the creator transaction ID information from a tuple because it re-uses transaction IDs, so it only works for fairly recent transactions.
In other words: it's kind-of possible in a rough way, if you understand the innards of the database, but should really only be used for forensic and recovery purposes.

Get next available auto_increment ID in PostgreSQL - A better approach?

I'm new to postgreSQL, so would really appreciate any pointers from the community.
I am updating some functionality in the CMS of a pretty old site I've just inherited. Basically, I need the ID of an article before it is inserted into the database. Is there anyway anyway to check the next value that will be used by a sequence before a database session (insert) has begun?
At first I thought I could use SELECT max(id) from tbl_name, however as the id is auto incremented from a sequence and articles are often deleted, it obviously won't return a correct id for the next value in the sequence.
As the article isn't in the database yet, and a database session hasn't started, it seems I can't use the currval() functionality of postgreSQL. Furthermore if I use nextval() it auto increments the sequence before the data is inserted (the insert also auto-incrementing the sequence ending up with the sequence being doubly incremented).
The way I am getting around it at the moment is as follows:
function get_next_id()
{
$SQL = "select nextval('table_id_seq')";
$response = $this->db_query($SQL);
$arr = pg_fetch_array($query_response, NULL, PGSQL_ASSOC);
$id = (empty($arr['nextval'])) ? 'NULL' : intval($arr['nextval']);
$new_id = $id-1;
$SQL = "select setval('table_id_seq', {$new_id})";
$this->db_query($SQL);
return $id;
}
I use SELECT nextval('table_id_seq') to get the next ID in the sequence. As this increments the sequence I then immediately use SELECT setval('table_id_seq',$id) to set the sequence back to it's original value. That way when the user submits the data and the code finally hits the INSERT statement, it auto increments and the ID before the insert and after the insert are identical.
While this works for me, I'm not too hot on postgreSQL and wonder if it could cause any problems down the line, or if their isn't a better method? Is there no way to check the next value of a sequence without auto-incrementing it?
If it helps I'm using postgresql 7.2
Folks - there are reasons to get the ID before inserting a record. For example, I have an application that stores the ID as part of the text that is inserted into another field. There are only two ways to do this.
1) Regardless of the method, get the ID before inserting to include in my INSERT statement
2) INSERT, get the the ID (again, regardless of how (SELECT ... or from INSERT ... RETURNING id;)), update the record's text field that includes the ID
Many of the comments and answers assumed the OP was doing something wrong... which is... wrong. The OP clearly stated "Basically, I need the ID of an article before it is inserted into the database". It should not matter why the OP wants/needs to do this - just answer the question.
My solution opted to get the ID up front; so I do nextval() and setval() as necessary to achieve my needed result.
Disclaimer: Not sure about 7.2 as I have never used that.
Apparently your ID column is defined to get its default value from the sequence (probably because it's defined as serial although I don't know if that was available in 7.x).
If you remove the default but keep the sequence, then you can retrieve the next ID using nextval() before inserting the new row.
Removing the default value for the column will require you to always provide an ID during insert (by retrieving it from the sequence). If you are doing that anyway, then I don't see a problem. If you want to cater for both scenarios, create a before insert trigger (does 7.x have them?) that checks if the ID column has a value, if not retrieve a new value from the sequence otherwise leave it alone.
The real question though is: why do you need the ID before insert. You could simply send the row to the server and then get the generated id by calling curval()
But again: you should really (I mean really) talk to the customer to upgrade to a recent version of Postgres