I need to write a trigger that will check a table column to see whether data is there or not. The trigger needs to run all the time and log a message every hour.
Basically, it should run a SELECT statement; if a result is found, sleep for an hour, otherwise log and then sleep for an hour.
What you want is a scheduled job, e.g. pgAgent (http://www.pgadmin.org/docs/1.4/pgagent.html): create an hourly job that checks for that row and then logs as required.
Edit to add:
Curious whether you've considered writing a SQL script that generates the log on the fly by reading the table, instead of a job. If you have a timestamp field, it is quite possible to have a script that returns all hourly periods that don't have a corresponding entry within that time frame (assuming the timestamp isn't updated). Why store a second log when you can generate it directly from the data?
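That could look something like this (a minimal sketch, assuming a hypothetical events table with a created_at timestamp column); it lists every hourly bucket in the last 24 hours that has no matching row:

SELECT h AS missing_hour
FROM generate_series(
       date_trunc('hour', now()) - interval '24 hours',
       date_trunc('hour', now()),
       interval '1 hour') AS h
WHERE NOT EXISTS (
    SELECT 1
    FROM events e           -- hypothetical table/column names
    WHERE e.created_at >= h
      AND e.created_at <  h + interval '1 hour'
);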
Triggers (in PostgreSQL and in every DBMS I know of) can execute before or after events such as INSERT, UPDATE, or DELETE. What you probably want here is a script launched via something like cron (if you are on a Unix system) every hour, redirecting its output to the log file.
I have used something like this many times, and it looked like this (written in Python):
#!/usr/bin/env python3
import sys
import logging
import psycopg2

# Placeholder connection settings; fill in your own.
dbname, user, hostname, passwd = "mydb", "me", "localhost", "secret"

try:
    conn = psycopg2.connect(dbname=dbname, user=user,
                            host=hostname, password=passwd)
except psycopg2.OperationalError as exc:
    # Exit the script, logging what went wrong.
    logging.error("I am unable to connect to the database!\n ->%s", exc)
    sys.exit(2)

cur = conn.cursor()
query = "SELECT whatever FROM wherever WHERE yourconditions"
try:
    cur.execute(query)
except psycopg2.ProgrammingError as exc:
    logging.error("Programming error, no result produced: %s", exc)
    sys.exit(2)

result = cur.fetchone()
if result is None:
    # result is None means the data is not in your table column;
    # do whatever you need here, e.g. log it.
    logging.warning("Expected data not found")
I used to launch my script via cron every 10 minutes; you can easily configure it to launch the script every hour, redirecting its output to the log file of your choice.
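For example, a hypothetical crontab entry (assuming the script above is saved as /usr/local/bin/check_column.py):

0 * * * * /usr/local/bin/check_column.py >> /var/log/check_column.log 2>&1

This runs at minute 0 of every hour and appends both stdout and stderr to the chosen log file.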
If you're working in a Windows environment, you'll be looking for a cron equivalent such as the Task Scheduler.
I don't think a trigger can help you with this; triggers fire only after certain events. (You could use a trigger to check after every insert whether the inserted data is the one you want to check every hour, but that's not the same; doing it via a script is the best solution in my experience.)
I have a large table on a remote server with an unknown number (millions) of rows of data. I'd like to fetch the data in batches of 100,000 rows at a time, update my local table with those fetched rows, and repeat until all rows have been fetched. Is there a way I can update a local table remotely?
Currently I have a dummy table called t on the server along with the following variables...
t:([]sym:1000000?`A`B`Ab`Ba`C`D`Cd`Dc;id:1+til 1000000)
selector:select from t where sym like "A*"
counter:count selector
divy:counter%100000
divyUP:ceiling divy
and the function below on the client, along with the variable index set to 0 and normTable, which is an empty copy of the remote table...
index:0
normTable:h"0#t"
batches:{[idx;divy;anty;seltr]
if[not idx=divy;
batch:select[(anty;100000)] from seltr;
`normTable upsert batch;
idx+::1;
divy:divy;
anty+:100000;
seltr:seltr;
batches[idx;divy;anty;seltr]];
idx::0}
I call that function using the following command...
batches[index;h"divyUP";0;h"selector"]
The problem with this approach, though, is that h"selector" fetches all the rows of data at the same time (and multiple times: once for each batch of 100,000 that it upserts to my local normTable).
I could move the batches function to the remote server but then how would I update my local normTable remotely?
Alternatively, I could break the rows into batches on the server and then pull each batch individually. But if I don't know how many rows there are, how do I know how many variables are required? For example, the following would work, but only up to the first 400k rows...
batch1:select[100000] from t where sym like "A*"
batch2:select[100000 100000] from t where sym like "A*"
batch3:select[200000 100000] from t where sym like "A*"
batch4:select[300000 100000] from t where sym like "A*"
Is there a way to generate batchX variables automatically, so that the number of variables created equals divyUP?
I would suggest a few changes, since you are connecting to a remote server:
Do not run synchronous requests, as that would slow down the server's processing. Make asynchronous requests using callbacks instead.
Do not do a full table scan (for a heavy comparison) on each call, especially for a regex match. It's possible that most of the data will still be in the cache on the next call, but that is not guaranteed, and the scan will again impact the server's normal operations.
Do not make data requests in a burst. Either use a timer or make the next data request only when the last batch of data has arrived.
The approach below is based on these suggestions. It avoids scanning the full table on columns other than the index column (which is lightweight), and makes the next request only when the last batch has arrived.
Create Batch processing function
This function will run on the server, read a small batch of data from the table using indices, and return the required data.
q) batch:{[ind;s] ni:ind+s; d:select from t where i within (ind;ni), sym like "A*";
neg[.z.w](`upd;d;$[ni<count t;ni+1;0]) }
It takes 2 arguments: the starting index and the batch size to work on.
It finally calls the upd function on the local machine asynchronously, passing it 2 arguments:
the data from the current batch request, and
the table index to start the next batch from (0 is returned once all rows are done, which stops further batch processing).
Create Callback function
The result from the batch processing function comes into this function.
If the index > 0, it means there is more data to process, and the next batch should start from this index.
q) upd:{[data;ind] t::t,data;if[ind>0;fetch ind]}
Create Main function to start process
q)fetch:{[ind] h (batch;ind;size)}
Finally, open the connection, create the table variable, and run the fetch function.
q) h:hopen `:server:port
q) t:()
q) size:100
q) fetch 0
Now, the above method is based on the assumption that the server table is static. If it is getting updated in real time, changes would be required depending on how the table is updated on the server.
Also, other optimizations can be done depending on the attributes set on the remote table, which can improve performance.
If you're OK sending sync messages, it can be simplified to something like:
{[h;i]`mytab upsert h({select from t where i in x};i)}[h]each 0N 100000#til h"count t"
And you can easily change it to control the number of batches (rather than the batch size) by instead using 10 0N# (that would do it in 10 batches).
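With that change, the same one-liner (same handle h and table names as above) would look like:

{[h;i]`mytab upsert h({select from t where i in x};i)}[h]each 10 0N#til h"count t"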
Rather than having individual variables, the cut function can split the result of the select into chunks of 100,000 rows. Indexing into the result gives each chunk as a table.
batches:100000 cut select from t where sym like "A*"
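Each chunk can then be taken by indexing, for example:

batches 0      / first chunk of up to 100000 rows, as a table
count batches  / number of chunks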
I am running a function in Postgres which takes 1123+ ms to execute.
That function consists of calls to other functions and has many queries to execute. How can I find out which query is the culprit for the function's slow execution?
I have seen that select * from pg_stat_activity; gives the output of the currently running processes.
Is it possible to get the individual query times while running the Postgres function?
I know many will say to log the query times in the database with inserts, but is there any method in Postgres by which I can get the time taken by each query?
Also, is there any way to do it without changing the Postgres config file? I don't want to restart Postgres. If not, other solutions are most welcome.
Thanks.
You can do this via the pg_stat_statements extension, though loading it does require a server restart.
After installing it, just SET pg_stat_statements.track = all (as a superuser), call your function, then SELECT * FROM pg_stat_statements.
Unless you have exclusive access to the server, you probably want to include a default of pg_stat_statements.track = none in your postgresql.conf, so that only your session is tracked.
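Putting it together, a minimal sketch (my_slow_function is a hypothetical stand-in for your function; on PostgreSQL 13+ the column is total_exec_time rather than total_time):

-- requires pg_stat_statements in shared_preload_libraries (server restart)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
SET pg_stat_statements.track = all;  -- 'all' also tracks statements nested in functions
SELECT pg_stat_statements_reset();   -- clear old counters
SELECT my_slow_function();           -- hypothetical function under test
SELECT query, calls, total_time
FROM pg_stat_statements
ORDER BY total_time DESC;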
After more than a day of digging and trying to solve this issue, it's time to ask here.
We have used Firebird for years (around 6), and with the current version 2.5.2 (updated a long time ago) we hit a problem where an UPDATE statement won't complete.
The SQL statement is OK; the WHERE condition even uses the primary key.
Problem: the update gets stuck after clicking Execute.
a) from a PHP script it returns Internal Server Error 500
b) directly from FlameRobin or IBQ it freezes and does not respond at all
hint: the SQL that wasn't working runs fine right after a clean Firebird restart (stop & start), but after a while it gets stuck again
hint: SELECTs are OK; the problem is only with UPDATE
hint: I did everything described here: https://www.ibphoenix.com/resources/documents/how_to/doc_5
(a re-dump of the database file, which was giving me errors on gfix -v -full)
after the re-dump there were no errors, but the problem still occurs
hint: we have more servers running the same configuration with more database files, but this happens only for one table in one DB file on one server.
After some investigation, I finally got this from fbtrace:
2016-11-28T14:25:28.4410 (9473:0x7f1489cb1f08) PREPARE_STATEMENT
phones.fdb (ATT_273856, VILAS:NONE, UTF8, TCPv4:10.1.1.195)
/usr/bin/flamerobin:10868
(TRA_78838, CONCURRENCY | WAIT | READ_WRITE)
Statement 422749:
-------------------------------------------------------------------------------
UPDATE TELEFONI_CISLA SET DATUM_PRIDANI = '2011-7-3' WHERE ID = '17274'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PLAN (TELEFONI_CISLA INDEX (RDB$PRIMARY4))
0 ms
2016-11-28T14:25:28.4780 (9473:0x7f1489cb1f08) EXECUTE_STATEMENT_START
phones.fdb (ATT_273856, VILAS:NONE, UTF8, TCPv4:10.1.1.195)
/usr/bin/flamerobin:10868
(TRA_78838, CONCURRENCY | WAIT | READ_WRITE)
Statement 422749:
-------------------------------------------------------------------------------
UPDATE TELEFONI_CISLA SET DATUM_PRIDANI = '2011-7-3' WHERE ID = '17274'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
PLAN (TELEFONI_CISLA INDEX (RDB$PRIMARY4))
I'm out of ideas about what to check; almost lost. Just to let you know: we didn't change anything in the Firebird or server configuration.
Thanks for any useful help.
Everything was caused by one of many scripts running in the background (from cron) that held a transaction open without committing for a few hours.
It's OK again now.
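For anyone hitting the same symptom: a quick way to spot long-open transactions is the monitoring tables (available since Firebird 2.1). A sketch, assuming the standard column names; verify against your version:

SELECT MON$TRANSACTION_ID, MON$ATTACHMENT_ID, MON$TIMESTAMP
FROM MON$TRANSACTIONS
ORDER BY MON$TIMESTAMP;  -- oldest (longest-open) transactions first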
I have a cron job that runs every 2 minutes; it takes 10 records from a Postgres table, works on them, then sets a flag when it is finished. I want to make sure that if the first run takes more than 2 minutes, the next one will work on different rows in the DB, not the same ones.
Is there any way to handle this case?
This can be solved using a Database Transaction.
BEGIN;
SELECT
id,status,server
FROM
table_log
WHERE
(direction = '2' AND status_log = '1')
LIMIT 100
FOR UPDATE SKIP LOCKED;
What are we doing?
We are selecting all rows that are available (i.e. not locked by other cron jobs that might be running), and selecting them FOR UPDATE. This means the query grabs only unlocked rows, and every selected row is then locked for this cron job only.
How do I update my locked rows?
Simply use a for loop in your processing language (Python, Ruby, PHP) and build one UPDATE per row; remember, we are sending them all in one single transaction (see the sketch below).
UPDATE table_log SET status_log = '6' ,server = '1' WHERE id = '1';
Finally, we use
COMMIT;
and all the locked rows will be updated. This prevents other queries from touching the same data at the same time. Hope it helps.
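A rough sketch of that flow in Python with psycopg2 (the connection settings are placeholders, and parameterized statements are used instead of string concatenation):

import psycopg2

conn = psycopg2.connect(dbname="mydb")  # placeholder connection settings
cur = conn.cursor()

# psycopg2 opens the transaction implicitly on the first statement (BEGIN).
cur.execute("""
    SELECT id, status, server
    FROM table_log
    WHERE direction = '2' AND status_log = '1'
    LIMIT 100
    FOR UPDATE SKIP LOCKED""")

# One UPDATE per locked row; other jobs skip these rows until COMMIT.
for (row_id, status, server) in cur.fetchall():
    cur.execute(
        "UPDATE table_log SET status_log = '6', server = '1' WHERE id = %s",
        (row_id,))

conn.commit()  # releases the row locks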
Turn your "finished" flag from binary to ternary ("needs work", "in process", "finished"). You might also want to store the pid of the "in process" worker, in case it dies and you need to clean it up, and a timestamp for when it started; a claiming sketch follows below.
Or use a queueing system that someone already wrote and debugged for you.
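A claiming sketch for the ternary-flag approach (worker_pid, started_at, and the status values are hypothetical names):

UPDATE table_log
SET    status_log = 'in_process',
       worker_pid = 4242,   -- pid of this worker, for cleanup
       started_at = now()
WHERE  id IN (SELECT id
              FROM   table_log
              WHERE  status_log = 'needs_work'
              LIMIT  10
              FOR UPDATE SKIP LOCKED)
RETURNING id;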
I have read and read over MSDN, etc. OK, so it signals the end of a batch.
What defines a batch? I don't see why I need GO when I'm pasting in a bunch of scripts to be run all at the same time.
I've never understood GO. Can anyone explain this better, and when I need to use it (after how many or what type of transactions)?
For example, why would I need GO after each UPDATE here:
UPDATE [Country]
SET [CountryCode] = 'IL'
WHERE code = 'IL'
GO
UPDATE [Country]
SET [CountryCode] = 'PT'
WHERE code = 'PT'
GO is not properly a T-SQL command.
Instead, it's a command to the specific client program that connects to an SQL server (Sybase's or Microsoft's; not sure what Oracle does), signalling to the client program that the set of commands input up to the "go" should be sent to the server to be executed.
Why/when do you need it?
GO in MS SQL Server has a "count" parameter, so you can use it as a "repeat N times" shortcut.
Extremely large updates might fill up the SQL server's log. To avoid that, they may need to be separated into smaller batches via go.
In your example, if updating the set of country codes has such a volume that it would run out of log space, the solution is to put each country code into a separate transaction, which can be done by separating them on the client with go.
Some SQL statements MUST be separated by GO from the ones that follow in order to work.
For example, you can't drop a table and re-create a table with the same name in a single batch, at least in Sybase (ditto for creating procedures/triggers):
> drop table tempdb.guest.x1
> create table tempdb.guest.x1 (a int)
> go
Msg 2714, Level 16, State 1
Server 'SYBDEV', Line 2
There is already an object named 'x1' in the database.
> drop table tempdb.guest.x1
> go
> create table tempdb.guest.x1 (a int)
> go
>
GO is not a statement, it's a batch separator.
The blocks separated by GO are sent by the client to the server for processing and the client waits for their results.
For instance, if you write
DELETE FROM a
DELETE FROM b
DELETE FROM c
, this will be sent to the server as a single 3-line query.
If you write
DELETE FROM a
GO
DELETE FROM b
GO
DELETE FROM c
, this will be sent to the server as 3 one-line queries.
GO itself does not go to the server (no pun intended). It's a pure client-side reserved word and is only recognized by SSMS and osql.
If you use a custom query tool to send it over the connection, the server won't recognize it and will issue an error.
Many commands need to be in their own batch, like CREATE PROCEDURE.
Or, if you add a column to a table, the addition should be in its own batch:
if you try to SELECT the new column in the same batch, it fails, because at parse/compile time the column does not exist.
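For instance (hypothetical table and column names):

ALTER TABLE dbo.ATable ADD NewColumn int
GO  -- without this separator, the SELECT below fails to parse
SELECT NewColumn FROM dbo.ATable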
GO is used by the SQL tools to work this out from one script; it is not a SQL keyword and is not recognised by the engine.
These are two concrete examples of day-to-day batch usage.
Edit: In your example, you don't need GO...
Edit 2, an example: you can't drop, create, and set permissions in one batch... not least because, where would the end of the stored procedure be?
IF OBJECT_ID ('dbo.uspDoStuff') IS NOT NULL
DROP PROCEDURE dbo.uspDoStuff
GO
CREATE PROCEDURE dbo.uspDoStuff
AS
SELECT Something From ATable
GO
GRANT EXECUTE ON dbo.uspDoStuff TO RoleSomeOne
GO
Sometimes there is a need to execute the same command or set of commands over and over again. This may be to insert or update test data, or it may be to put a load on your server for performance testing. Whatever the need, the easiest way to do this is to set up a while loop and execute your code, but in SQL 2005 there is an even easier way.
Let's say you want to create a test table and load it with 1000 records. You could issue the following command and it will run the same command 1000 times:
CREATE TABLE dbo.TEST (ID INT IDENTITY (1,1), ROWID uniqueidentifier)
GO
INSERT INTO dbo.TEST (ROWID) VALUES (NEWID())
GO 1000
source:
http://www.mssqltips.com/tip.asp?tip=1216
Other than that, it marks the "end" of an SQL block (e.g. in a stored procedure)... meaning you're in a "clean" state again; e.g., variables declared in the batch before the GO are reset (not defined anymore).
As everyone already said, "GO" is not part of T-SQL. "GO" is a batch separator in SSMS, a client application used to submit queries to the database. This means that declared variables and table variables will not persist from code before the "GO" to code following it.
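A minimal illustration of that scoping:

DECLARE @x int
SET @x = 1
PRINT @x  -- works: same batch
GO
PRINT @x  -- fails: Must declare the scalar variable "@x".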
In fact, GO is simply the default word used by SSMS. This can be changed in the options if you want. For a bit of fun, change the option on someone else's system to use "SELECT" as a batch separator instead of "GO". Forgive my cruel chuckle.
It is used to split logical blocks. Your code is interpreted by the SQL command-line tool, and GO indicates the start of the next block of code.
It can also be used as a repeat statement with a specific count.
Try:
exec sp_who2
go 2
Some statements have to be delimited by GO:
use DB
create view thisViewCreationWillFail as select 1 as one
-- fails: 'CREATE VIEW' must be the first statement in a query batch
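Separated with GO, the same view creates fine:

use DB
GO
-- CREATE VIEW is now the first statement of its own batch, so it succeeds
create view thisViewCreationWillFail as select 1 as one
GO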