As we know details of every job are stored in rdbms in table Hsp_Job_Status. But unfortunately this table gets truncated each time we re-start services. As per business requirement we needed to keep a record of BR's launched by user and it's details. So we had developed a work around and created a trigger on the table such that it inserted each new row/update in a backup table. This was working fine uptill now.
Recently after re-start the the values of old Job_id (i.e primary key), are not appearing in order. It started series form a previous number. It was going in series of 106XX but after re-start the numbering started from 100XX. As Hsp_job_status was truncated during restart, there was no issue of duplicate primary key in that table. But it created duplicate values in backup table. And this has created issues with backup table and procedure that we use.
Usually the series is continuous one even after table truncate. So may be some thing has gone wrong during restart. Can you please suggest me as to what should i check and do to resolve this issue.
Thanks in advance.
Partial answer: the simple solution is to insert an instance prefix to the Job_Id, and on service startup increment the active instance. The instance table can then include details from startup/shutdown events to help drive SLA metrics. Unfortunately, I don't know how you would go about implementing such a scheme, since it's been many years since I've spoken any SQL dialects.
Related
I've got a system (WSO2SP) which uses PostgreSQL. It stores BLOBs and uses pg_largeobject. I don't have control over how the system uses this Postgres feature. The issue is that the table pg_largeobject is growing constantly and the only way to keep it from growing is cleaning the table using a scheduled task.
Is it possible to analyze requests, queries or another activity to understand why the table might be growing?
Answered by Mohandarshan on Siddhi Slack
As per the default behaviour, state persistence revisions are get cleaned automatically in Siddhi.
In Postgres, there is a separate table to handle the large object.
This is not cleaned even the actual data is removed. In our case, even state persistence data is removed respective pg_largeobject entry is not getting deleted.
It seems, Postgres recommends to remove them using the DB trigger or manual process. You can refer to the guide to get some understanding. Please get help from Postgres community, if you need further guidance on this.
Imagine dropping a subscription and recreating it from scratch. Is it possible to ignore existing data during the first synchronization?
Creating a subscription with (copy_data=false) is not an option because I do want to copy data, I just don't want to copy already existing data.
Example: There is a users table and a corresponding publication on the master. This table has 1 million rows and every minute a new row is added. Then we drop the subscription for a day.
If we recreate the subscription with (copy_data=true), replication will not start due to a conflict with already existing data. If we specify (copy_data=false), 1440 new rows will be missing. How can we synchronize the publisher and the subscriber properly?
You cannot do that, because PostgreSQL has no way of telling when the data were added.
You'd have to reconcile the tables by hand (or INSERT ... ON CONFLICT DO NOTHING).
Unfortunately PostgreSQL does not support nice skip options for conflicts yet, but I believe it will be enhanced in the feature.
Based on #Laurenz Albe answer which recommends the use of the statement:
INSERT ... ON CONFLICT DO NOTHING.
I believe that it would be better to use the following command which also will take care any possible updates on your data before you start the subscription again:
INSERT ... ON CONFLICT UPDATE SET...
Finally I have to say that both are dirty solutions as during the execution of the above statement and the creation of the subscription, new lines may have been arrived which will result in losing them until you perform again the custom sync.
I have seen some other suggested solutions using the LSN number from the Postgresql log file...
For me maybe is elegant and safe to delete all the data from the destination table and create the replication again!
I want to periodically export data from db2 and load it in another database for analysis.
In order to do this, I would need to know which rows have been inserted/updated since the last time I've exported things from a given table.
A simple solution would probably be to add a timestamp to every table and use that as a reference, but I don't have such a TS at the moment, and I would like to avoid adding it if possible.
Is there any other solution for finding the rows which have been added/updated after a given time (or something else that would solve my issue)?
There is an easy option for a timestamp in Db2 (for LUW) called
ROW CHANGE TIMESTAMP
This is managed by Db2 and could be defined as HIDDEN so existing SELECT * FROM queries will not retrieve the new row which would cause extra costs.
Check out the Db2 CREATE TABLE documentation
This functionality was originally added for optimistic locking but can be used for such situations as well.
There is a similar concept for Db2 z/OS - you have to check that out as I have not tried this one.
Of cause there are other ways to solve it like Replication etc.
That is not possible if you do not have a timestamp column. With a timestamp, you can know which are new or modified rows.
You can also use the TimeTravel feature, in order to get the new values, but that implies a timestamp column.
Another option, is to put the tables in append mode, and then get the rows after a given one. However, this option is not sure after a reorg, and affects the performance and space utilisation.
One possible option is to use SQL replication, but that needs extra tables for staging.
Finally, another option is to read the logs, with the db2ReadLog API, but that implies a development. Also, just appliying the archived logs into the new database is possible, however the database will remain in roll forward pending.
I know this question was asked here but 1) it's relatively old and 2) It didn't help me much.
I am running into a relatively large number of deadlocks with a few operations on my database. The setup is as follows:
Tables:
Table A with foreign key into Table B.
Operations:
Insert into table A
Insert into table B
Update row in table B
Delete row in table B
Delete row in table A
Problem:
These operations can happen essentially in any order because I have multiple worker roles so these operations must be idempotent, however, each worker role will be working with a different primary key from table A. I am still trying to wrap my head around the concept of locks on tables and from what i understand, any delete on A will first lock table B, delete relevant rows there, and then delete the row from A. I currently assume that is an atomic operation and there is no time to execute additional locks between locking table B and locking table A because I can't imagine a way to get around that.
I am currently able to catch an exception in microsoft visual studio of the following format:
Transaction (Process ID xxx) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
This exception seems like it can happen on any of the above operations.
My question is: How do i know which locks/transactions are the ones causing the deadlock? Does anyone know any queries that would be useful AFTER we get the exception?
sys.event_log is the answer here.
It lives in your server's masterdb and should contain an entry with all of the deadlock graphs your database has hit in the last month.
Armed with the deadlock graph there are many tutorials on sql server deadlock graph debugging.
Currently profiling tools for Sql Azure are practically non existent.
The locking problem shouldn't differ much between standard Sql Server and Sql Azure world thus I would suggest trying to repro the problem in the 'old' world using standard techniques such as good old Profiler: quite useful article & this.
If that approach doesn't prove to be fruitful a dirty solution could be to work on catch/retry logic.
I ran into similar issues recently.
Try using your updates with "with (UPDLOCK)".
To try and find the root cause:
Start by just running a single worker role.
Then check:
Are you locking at the right level table lock, page lock or row lock?
Are you releasing the locks?
is your system designed in such a way, that all edits to the same row will be done by the same machine?
There is a blog post on finding blocking queries here: http://blogs.msdn.com/b/sqlazure/archive/2010/08/13/10049896.aspx
I am building a database in SQL Server 2000 and need to perform data validation by testing for foreign key violations. This post is related to an earlier post I made (Trigger exits on first failed insert and cant set xact_abort OFF in SQL Server 2000) which focussed on how to port from a working SQL Server 2005 implementation to a server 2000 implementation. Following the advice received on this post indicating wholesale recoding was required, i am now re-considering the design itself - hence this post. To recap on my application, my
I receive a daily data feed containing ~5k records into a Staging table. When this insert is done a single record is then added to a table called TRIGGER_DATA.
I have created a trigger ‘on insert’ on this table which then attempts to insert the data therein into a FACT_data table one record at a time.
The FACT_data table is foreign keyed to many DIM tables which define the acceptable inputs the field can take.
If any record violates a foreign key constraint the insert should fail and the record should instead be inserted into a Load_error table (which has no foreign key and all fields are Nullable).
Given the volume of records in each insert i thought it would be a bad idea to create the trigger on the Stage_data table since this would result in ~5k trigger firing in one go each day. However since i cannot set xact_abort off in a trigger under SQL Server 2000 and therefore on the first failure it aborts in the trigger i am wondering if it might be actually be a half decent solution.
Questions:
The basic question i am now asking myself is what is the typical approach for doing this - it seems to me that this kind of data validation through checking for FK violations must be common and therefore a consensus best practise may have emerged (although i really cant find any for server 2000 platform!)
Am i correct that the trigger on the stage_data table would be bad practise given the volume of records in each insert or is it acceptable?
Is my approach of looping through each record from within the trigger and testing the insert ok?
What are your thoughts on this alternative that i have just thought of. Stop using triggers altogether and, after the Stage table is loaded, update a 'stack' table with a record saying that data had been received and was ready to be validated and loaded to the FACT table (perhaps along with a priority level indicating order in which order tasks must be processed). This stack or 'job' table would then be a register of all requested inserts along with their status (created/in-progress/completed). I would then have a stored procedure continually poll this table and process the top priority record. This would mean that all stored proc calls would happen outwith the trigger.
Many thanks
You don't need a trigger at all. Unless there is some reason that you need split-second timing of this daily data load, just schedule a job (stored proc) that runs as often as necessary to look for data in the staging table.
When it finds any, process the records one at a time and load the ones that are OK and do whatever you do with the ones that have broken FKs (delete, move to a work queue, etc.).
If you use a schedule frequency that is often enough that there is some risk of the next job starting while the last one is still running, then you should create a sentinel table that your stored proc can write in to say that the job is running. This could work one of two ways. Either you just have one record that says "running" or "not running" or, you could have one record per job (like a transaction log) that has a status code indicating whether the job is complete or not.