PostgreSQL PL/sh asynchronous execution of external process

I have a postgresql-plsh function
CREATE OR REPLACE FUNCTION MDN_REG_LOCATE(MDN VARCHAR(50), CALLID VARCHAR(50)) RETURNS text AS '
#!/bin/sh
/home/infoobjects/Projects/java/execute.sh $1 $2 &
logger -t "data" "$1$2"
' LANGUAGE plsh;
execute.sh calls a Java process (method) that takes 3 minutes to execute. I have made the script asynchronous by appending & to the call, as shown above.
My problem is that the PostgreSQL function still waits for the result and does not behave asynchronously, even though the shell script itself does: the logger line in the function above logs immediately after MDN_REG_LOCATE() is called, yet the function still waits for the whole 3-minute process to finish. I don't know what I am missing; please help me with this.
Thanks in advance.

Simply backgrounding a process isn't enough; it's still attached to its parent process. See this answer.
You're likely to be much better off reworking your logger program to keep a persistent connection to the database where it LISTENs for tasks. Your client (or a trigger, or whatever would normally invoke your PL/Sh function) sends a NOTIFY with the parameters as a payload, or (for older Pg versions) INSERTs a row into a queue table then sends a NOTIFY to tell the listening client to look at the queue table.
The listening client can then run the background process with no worries about holding up the database. Best of all, the NOTIFY is transactional; it's only delivered when the transaction that sent it commits, at the time it commits.
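To make the queue-table variant concrete, here is a minimal sketch; the table, channel, and sample values (locate_queue, locate_tasks, and the parameters) are illustrative, not taken from the question:

-- run once: a queue table for the work items
CREATE TABLE locate_queue (
    id      serial PRIMARY KEY,
    mdn     varchar(50),
    callid  varchar(50),
    done    boolean NOT NULL DEFAULT false
);

-- instead of calling the PL/sh function, queue the work and wake the listener
INSERT INTO locate_queue (mdn, callid) VALUES ('5551234', 'call-42');
NOTIFY locate_tasks;

The external worker keeps an ordinary connection open, runs LISTEN locate_tasks;, and when it wakes up selects the rows WHERE NOT done, launches execute.sh for each of them, and marks them done. On PostgreSQL 9.0+ you could instead put the parameters in the NOTIFY payload itself (e.g. pg_notify('locate_tasks', mdn || ',' || callid)) and skip the table, at the cost of losing work items whenever no listener is connected.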

Related

Reading postgres NOTICE message in C++ API

I am struggling to read my Postgres NOTICE messages in my C++ API. I can only read EXCEPTION messages using the function PQresultErrorMessage(PGresult), but not lower-level messages.
PQresultErrorField(res, PG_DIAG_SEVERITY) returns null pointer.
How do I read NOTICE and other low level messages?
(Using PostgreSQL 9.2)
Set up a notice receiver or notice processor using PQsetNoticeReceiver / PQsetNoticeProcessor. Both register callbacks that are invoked when a notice message is received from the server; note that this may happen before, during, or after processing of query data.
It's safe to assume that once all query results have been returned (PQexec or equivalent has returned) and you've called PQconsumeInput to make sure nothing else is waiting, all notices for the last command have been received. The PQconsumeInput call shouldn't really be necessary; it's just being cautious.
See the documentation for libpq.
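As a rough illustration (the connection string and output format are mine, not from the question), a notice receiver can be set up like this:

#include <cstdio>
#include <libpq-fe.h>

// Called by libpq for every NOTICE/WARNING the server sends on this connection.
static void notice_receiver(void *arg, const PGresult *res)
{
    const char *severity = PQresultErrorField(res, PG_DIAG_SEVERITY);
    const char *message  = PQresultErrorField(res, PG_DIAG_MESSAGE_PRIMARY);
    std::fprintf(stderr, "[%s] %s\n",
                 severity ? severity : "NOTICE",
                 message  ? message  : "");
}

int main()
{
    PGconn *conn = PQconnectdb("dbname=test");   // connection string is illustrative
    if (PQstatus(conn) != CONNECTION_OK) return 1;

    PQsetNoticeReceiver(conn, notice_receiver, nullptr);

    // Any statement that raises a notice now triggers the callback.
    PGresult *res = PQexec(conn, "DO $$ BEGIN RAISE NOTICE 'hello'; END $$;");
    PQclear(res);
    PQfinish(conn);
    return 0;
}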

How to push tables through socket

Considering the setup of kdb+ tick, how do the tables get pushed through the sockets?
In tick, it's possible to subscribe a process (let's say a) to the tickerplant, which will then push the data for the subscribed 'tickers' to a as new data arrives.
I would like to do the same, but I was wondering how. As far as I know, inter-process communication between q processes is just the ability to send commands from one process to another, so that the commands are executed on the other side.
So how is it then possible to transport a complete table between processes?
I know the method which does this in tick is .u.pub and .u.sub, but it's not clear to me how the tables are transported between the processes.
So I have two questions:
How does kdb+ tick do this?
How can I push a table from one process to the other in general?
Let's walk through a simplified version of the process:
We have one server 'S' and one client 'C'. When 'C' calls the .u.sub function, that function connects to 'S' using its host and port and calls a specific function on 'S' (let's say 'request') with the subscription parameters.
On receiving this request, the 'request' function on 'S' adds the following entries to the subscription table it maintains for subscription requests:
-> Host and port of the client (from the incoming request)
-> Subscription params (for example, the client sends sym `VOD.L to subscribe)
Now when 'S' gets any data update from the feed, it goes through its subscription table and checks for entries whose subscription-param column value (sym in our case) matches the incoming data. It then connects to each of them using the host and port from the table and calls their 'upd' function with the new data.
The only requirement is that the client has an 'upd' function defined on its side.
This is a very basic process. kdb+ tick uses this with extra optimizations and features: for example, a more optimized structure for maintaining the subscription table, log maintenance, replaying logs, unsubscription, recovery logic, a timer for publishing, and a lot more.
For more details, you can check definition of functions in 'u' namespace.
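To make that concrete, here is a stripped-down sketch in q. It is not the actual .u.pub/.u.sub code, and instead of storing the client's host and port as described above it simply remembers the handle (.z.w) the subscription arrived on, which is what tick itself effectively does; all names are illustrative.

/ --- server, e.g. started with: q -p 5010 ---
subs:([] handle:`int$(); sym:`$())                / one row per (subscriber; sym)

request:{[s] `subs insert (.z.w;s); }             / .z.w is the caller's handle

/ d is a table with a sym column; push it asynchronously to matching subscribers
publish:{[t;d]
  hs:exec distinct handle from subs where sym in distinct d[`sym];
  {[h;t;d] (neg h)(`upd;t;d)}[;t;d] each hs; }

/ --- client ---
h:hopen `::5010                                   / connect to the server
upd:{[t;d] show (t;d)}                            / called by the server with new data
h(`request;`VOD.L)                                / subscribe to VOD.L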

ActiveRecord find_or_initialize_by race conditions

I have a scenario where 2 db connections might both run Model.find_or_initialize_by(params) and raise an error: PG::UniqueViolation: ERROR: duplicate key value violates unique constraint
I'd like to update my code so it could gracefully recover from it. Something like:
record = nil
begin
  record = Model.find_or_initialize_by(params)
rescue ActiveRecord::RecordNotUnique
  record = Model.where(params).first
end
return record
The trouble is that there's not a nice/easy way to reproduce this on my local machine, so I'm not confident that my fix actually works.
So I thought I'd get a bit creative and try calling create 2 times (locally) in a row, which should then raise the PG::UniqueViolation error; then I could rescue it and make sure everything is handled gracefully.
But I get this error: PG::InFailedSqlTransaction: ERROR: current transaction is aborted, commands ignored until end of transaction block
I get this error even when I wrap everything in individual transaction blocks
record = nil
Model.transaction do
  record = Model.create(params)
end
begin
  Model.transaction do
    record = Model.create(params)
  end
rescue ActiveRecord::RecordNotUnique
end
Model.transaction do
  record = Model.where(params).first
end
return record
My questions:
What's the right way to gracefully handle the race condition I mentioned at the very beginning of this post?
How do I test this locally?
I imagine there's probably something simple that I'm missing here, but it's late and perhaps I'm not thinking too clearly.
I'm running postgres 9.3 and rails 4.
EDIT Turns out that find_or_initialize_by should have been find_or_create_by, and the errors I was getting were from the actual save call that happened later on in execution. #VeryTiredWhenIWroteThis
Has this actually happened?
Model.find_or_initialize_by(params)
should never raise an ActiveRecord::RecordNotUnique error, as it does not save anything to the db. It just instantiates a new ActiveRecord object.
However, in the second snippet you are creating records.
create (without bang) does not raise exceptions caused by validations, but
ActiveRecord::RecordNotUnique is raised on a duplicate by both create and create!.
If you're creating records you don't need transactions at all. Postgres, being ACID compliant, guarantees that only one of the two operations succeeds, and that once it responds, its changes are durable (a single-statement query against Postgres is also a transaction). So your code above is almost fine if you replace find_or_initialize_by with find_or_create_by:
begin
  record = Model.find_or_create_by(params)
rescue ActiveRecord::RecordNotUnique
  record = Model.where(params).first
end
You can test whether the code behaves correctly by simply trying to create the same record twice in a row. However, this will not test that ActiveRecord::RecordNotUnique is actually raised correctly under race conditions.
It's also not really your app's responsibility to test that, and testing it is not easy. You would have to start Rails in multithreaded mode on your machine, or test against a multi-process staging Rails instance. WEBrick, for example, handles only one request at a time. You can use the Puma application server, but on MRI there is no true concurrency (GIL); threads only release the GIL on blocking IO. Because talking to Postgres is IO, I'd expect some concurrent requests, but to be 100% sure, the best testing scenario would be to deploy on Passenger with multiple workers and then use JMeter to run concurrent requests against the server.
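If you still want to try to provoke the race from a Rails console, something along these lines is one option. It's only a rough sketch: Model and params are placeholders, each thread checks out its own connection from the pool, and with MRI's GIL there is no guarantee the two calls actually collide.

params = { name: "duplicate-me" }   # assumes a unique index on this column

threads = 2.times.map do
  Thread.new do
    begin
      Model.find_or_create_by(params)
    rescue ActiveRecord::RecordNotUnique
      Model.where(params).first
    end
  end
end
threads.each(&:join)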

How can I tell if a kdb server is busy?

Is there a command to know if the kdb server is busy running a query? Even better, knowing what is the percentage completion of the query being run?
So far I've been looking at the top screen on linux to know which server to use...
Unfortunately, not directly. The reason is the single-threaded nature of a kdb+ process. In practice, this is easily worked around by adding some basic logging to your server: whenever a query comes in, log to a file the time it arrived and the time the result was returned to the user.
Take a look at the .z.pg and .z.ps handlers, which are called to handle synchronous and asynchronous requests, respectively. By default they are effectively just "value", which means evaluate the incoming string and return the result. Just replace this with your own function to log events to a file or a log server.
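For example, here is a minimal sketch that replaces the default handlers with a logging version (the message format is illustrative; writing to a log file instead of stdout is a straightforward change):

/ sync (h "query") and async (neg[h] "query") message handlers
.z.pg:{[q] t0:.z.p; r:value q; -1 "sync  ",string[.z.p-t0]," ",.Q.s1 q; r}
.z.ps:{[q] t0:.z.p; value q; -1 "async ",string[.z.p-t0]," ",.Q.s1 q; }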
Besides the above solution, a simpler way is to keep checking the port.
Normally all queries run against a port, and a kdb+ setup can expose multiple ports for different purposes.
Details:
Use the code below to test a port: if the port is busy, a null res is returned, and you can then kill the process on that port and restart it, or whatever the requirement is.
The code attempts to open a connection to the port with a 3-second timeout; if the process is busy, the attempt times out and null is returned.
.server.testQuery:{[inPort]
  / try to open a handle to the port with a 3-second timeout; trap failure and return null
  res:@[{hopen(x;3000)};`$":",":" sv string `,inPort;0N];
  if[not null res;hclose res];
  :res
  };
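For example, assuming the target process listens on port 5010 on the local machine and the port is passed as a symbol:

.server.testQuery[`5010]    / 0N if the connection attempt timed out, otherwise the (now closed) handle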

How do you ensure consistent client reads in an eventual consistent system?

I'm digging into CQRS and I am looking for articles on how to solve client reads in an eventually consistent system. Consider for example a web shop where users can add items to their cart. How can you ensure that the client displays items in the cart if the actual processing of the command "AddItemToCart" is done asynchronously? I understand the principles of dispatching commands asynchronously and updating the read model asynchronously based on domain events, but I fail to see how this is handled from the client's perspective.
There are a few different ways of doing it:
Wait at user till consistent
Just poll the server until you get the read model updated. This is similar to what Ben showed.
Ensure consistency through 2PC
You have a queue that supports DTC, and your commands are put there first. They are then executed, events sent, and the read model updated, all inside a single transaction. You have not actually gained anything with this method though, so don't do it this way.
Fool the client
Place the read models in local storage at the client and update them when the corresponding event is sent -- but you were expecting this event anyway, so you had already updated the javascript view of the shopping cart.
I'd recommend you have a look at the Microsoft Patterns & Practices team's guidance on CQRS. Although this is still work-in-progress they have given one solution to the issue you've raised.
Their approach for commands requiring feedback is to submit the command asynchronously, redirect to another controller action, and then poll the read model until the expected change appears or a time-out occurs. This uses the Post-Redirect-Get pattern, which works better with the browser's forward and back navigation buttons and gives the infrastructure more time to process the command before the MVC controller starts polling.
Example code from the RegistrationController using ASP.NET MVC 4 asynchronous controllers.
[HttpGet]
[OutputCache(Duration = 0, NoStore = true)]
public Task<ActionResult> SpecifyRegistrantAndPaymentDetails(Guid orderId, int orderVersion)
{
    return this.WaitUntilOrderIsPriced(orderId, orderVersion)
        .ContinueWith<ActionResult>(
        ...
        );
}
...
private Task<PricedOrder> WaitUntilOrderIsPriced(Guid orderId, int lastOrderVersion)
{
    return
        TimerTaskFactory.StartNew<PricedOrder>(
            () => this.orderDao.FindPricedOrder(orderId),
            order => order != null && order.OrderVersion > lastOrderVersion,
            PricedOrderPollPeriodInMilliseconds,
            DateTime.Now.AddSeconds(PricedOrderWaitTimeoutInSeconds));
}
I'd probably use AJAX polling instead of having a blocked web request at the server.
Post-Redirect-Get
You're hoping that the save command executes in time, before the Get is called. What if the command takes 10 seconds to complete in the back end, but the Get is called after 1 second?
Local Storage
By storing the result of the command on the client while the command goes off to execute, you're assuming that the command will go through without errors. What if the back end runs into an error while processing the command? Then what you have locally isn't consistent.
Polling
Polling seems to be the option that is actually in line with eventual consistency; you're not faking or assuming anything. Your polling mechanism can be asynchronous, as part of your page: e.g. the shopping cart page component polls until it gets an update, without refreshing the page.
Callbacks
You could introduce something like web hooks to call back to the client, if the client is capable of receiving such calls. The back end provides a correlation id once it accepts the command; when the command has finished processing, the back end notifies the front end of the command's status along with the correlation id, indicating whether the command went through successfully or not. There is no need for any kind of polling with this approach.