PostgreSQL exceptions with failed inserts - postgresql

Is there any circumstance where an exception will not be thrown if an insert statement in a stored procedure fails?
I'm using catch-all style exception handling in my PostgreSQL stored procedures via EXCEPTION WHEN OTHERS THEN. I'm wondering if that's sufficient to catch all failed inserts.

That should cover it.
I quote the manual on Trapping Errors in PL/pgSQL:
The special condition name OTHERS matches every error type except
QUERY_CANCELED. (It is possible, but often unwise, to trap
QUERY_CANCELED by name.)

Related

What does "Canceled on identification as a pivot" mean?

I use a postgres database and I see the following error message:
could not serialize access due to read/write dependencies among transactions
DETAIL: Reason code: Canceled on identification as a pivot, during conflict out checking.
HINT: The transaction might succeed if retried.
The issue is that I have two transactions which are executed concurrently and they can influence each other.
This question is about understanding the error message itself, not about fixing the issue. I would love if somebody could walk me through it. Some questions I have:
What is the mentioned "identification"?
What is a pivot in this case?
What is "conflict out checking"?

Dapper not throwing an exception when select query does not contain expected column

When using Dapper as follows:
connection.Query("select a from Foo").Select(x => new Foo(x.a, x.b));
My preference is that an exception is thrown when trying to access x.b, rather than it just returning null.
I understand that just returning null may be by design (e.g. for performance reasons), and ideally tests would exist that flag up the missing data, but if I could have both, this seems like an additional layer of safety that may be worth having.
Alternatively, is there another way to instantiate an object using a constructor in a way that will throw an exception when a needed column is missing?
Yup. Dapper often seems to err on the side of optimism. Every error is a runtime error, and as in this case, some things that you would like to be errors fly straight through.
Were you to use QueryFirst, you wouldn't even need to run the program. This would be caught as you type. You access your query results via generated POCOs, so if a column isn't in the query, it isn't in the POCO. If you use select *, then later delete a columnn from the DB, you can regenerate all your queries and the compile errors will point to the line in your code that tries to access the deleted column.
Download here.
Short video presentation here.
disclaimer: I wrote QueryFirst

When are (SELECT) queries planned?

In PostgreSQL, when are (SELECT) queries planned?
Is it:
at statement-prepare time, or
at the start of processing the SELECT, or
something else
The reason I ask is that there is a Stackoverflow question: same query, two different ways, vastly different performance
A lot of people seem to be thinking that the query is planned differently because in one case the query contains a string literal ('foo') and in another case it's a placeholder (?).
Now my thinking is that this is a red herring, because the query isn't planned at statement-prepare time, but is actually planned at SELECT time.
So, say, I could prepare a statement with a placeholder, then run the query multiple times with different bound values, and the query planner will be run for each different bound value.
I suspect that the question linked above boils down to the PostgreSQL data type of the value, which in the case of a 'foo' literal is known to be a string, but in the case of a placeholder, the type can't be divined, so is coming through to the query planner as some strange type, which it can't create an efficient plan for. In which case, the issue is not that the query is being planned differently because the value is a placeholder (at statement preparation time) per se but that the value is coming through to the query as a different PostgreSQL type, and that is what is influencing the query planner. To fix this would simply be a matter of binding the placeholder with an appropriate explicit type declaration.
I cannot talk about the client-side Perl interface itself but I can shed some light on the PostgreSQL server side.
PostgreSQL has prepared statements and unprepared statements. Unprepared statements are parsed, planned and executed immediately. They also do not support parameter substitution. On a plain psql shell you can show their query plan like this:
tmpdb> explain select * from sometable where flag = true;
On the other hand there are prepared statements: They are usually (see "exception" below) parsed and planned in one step and executed in a second step. They can be re-executed several times with different parameters, because they do support parameter substitution. The equivalent in psql is this:
tmpdb> prepare foo as select * from sometable where flag = $1;
tmpdb> explain execute foo(true);
You may see, that the plan is different from the plan in the unprepared statement, because planning did take place already in the prepare phase as described in the doc for PREPARE:
When the PREPARE statement is executed, the specified statement is parsed, rewritten, and planned. When an EXECUTE command is subsequently issued, the prepared statement need only be executed. Thus, the parsing, rewriting, and planning stages are only performed once, instead of every time the statement is executed.
This also means, that the plan is NOT optimized for the substituted parameters: In the first examples might use an index for flag because PostgreSQL knows that within a million entries only ten have the value true. This reasoning is impossible when PostgreSQL uses a prepared statement. In that case a plan is created which will work for all possible parameter values as good as possible. This might exclude the mentioned index because fetching the better part of the complete table via random access (due to the index) is slower than a plain sequential scan. The PREPARE doc confirms this:
In some situations, the query plan produced for a prepared statement will be inferior to the query plan that would have been chosen if the statement had been submitted and executed normally. This is because when the statement is planned and the planner attempts to determine the optimal query plan, the actual values of any parameters specified in the statement are unavailable. PostgreSQL collects statistics on the distribution of data in the table, and can use constant values in a statement to make guesses about the likely result of executing the statement. Since this data is unavailable when planning prepared statements with parameters, the chosen plan might be suboptimal.
BTW - Regarding plan caching the PREPARE doc also has something to say:
Prepared statements only last for the duration of the current database session. When the session ends, the prepared statement is forgotten, so it must be recreated before being used again.
Also there is no automatic plan caching and no caching/reuse over multiple connections.
EXCEPTION: I have mentioned "usually". The shown psql examples are not the stuff a client adapter like Perl DBI really uses. It uses a certain protocol. Here the term "simple query" corresponds to the "unprepared query" in psql, the term "extended query" corresponds to "prepared query" with one exception: There is a distinction between (one) "unnamed statement" and (possibly multiple) "named statements". Regarding named statements the doc says:
Named prepared statements can also be created and accessed at the SQL command level, using PREPARE and EXECUTE.
and also:
Query planning for named prepared-statement objects occurs when the Parse message is processed.
So in this case planning is done without parameters as described above for PREPARE - nothing new.
The mentioned exception is the "unnamed statement". The doc says:
The unnamed prepared statement is likewise planned during Parse processing if the Parse message defines no parameters. But if there are parameters, query planning occurs every time Bind parameters are supplied. This allows the planner to make use of the actual values of the parameters provided by each Bind message, rather than use generic estimates.
And here is the benefit: Although the unnamed statement is "prepared" (i.e. can have parameter substitution), it also can adapt the query plan to the actual parameters.
BTW: The exact handling of the unnamed statement has changed several times in the past releases of the PostgreSQL server. You can lookup the old docs for details if you really want.
Rationale - Perl / any client:
How a client like Perl uses the protocol is a completely different question. Some clients like the JDBC driver for Java basically say: Even if the programmer uses a prepared statement, the first five (or so) executions are internally mapped to a "simple query" (i.e. effectively unprepared), after that the driver switches to "named statement".
So a client has these choices:
Force (re)planning each time by using the "simple query" protocol.
Plan once, execute multiple times by using the "extended query" protocol and the "named statement" (plan might be bad because planning is done without parameters).
Parse once, plan for each execution (with current PostgreSQL version) by using the "extended query" protocol and the "unnamed statement" and obeying some more things (provide some params during "parse" message)
Play completely different tricks like the JDBC driver.
What Perl does currently: I don't know. But the mentioned "red herring" is not very unlikely.

NEventStore CommonDomain: Why does EventStoreRepository.GetById open a new stream when an aggregate doesn't exist?

Please explain the reasoning behind making EventStoreRepository.GetById create a new stream when an aggregate doesn't exist. This behavior appears to give GetById a dual responsibility which in my case could result in undesirable results. For example, my aggregates always implement a factory method for their creation so that their first event results in the setting of the aggregate's Id. If a command was handled prior to the existance of the aggregate it may succeed even though the aggregate doesn't have it's Id set or other state initialized (crash-n-burn with null reference exception is equally as likely).
I would much rather have GetById throw an exception if an aggregate doesn't exist to prevent premature creation and command handling.
That was a dumb design decision on my part. I went back and forth on this one. To be honest, I'm still back and forth on it.
The issue is that exceptions should really be used to indicate exceptional or unexpected scenarios. The problem is that a stream not being found can be a common operation and even an expected operation in some regards. I tinkered with the idea of throwing an exception, returning null, or returning an empty stream.
The way to determine if the aggregate is empty is to check the Revision property to see if it's zero.

BEST approach to share messages between data tier and application tier

I was looking for any suggestion or a good approach to handle messages between the data & application tier, what i mean with this is when using store procedures or even using direct SQL statements in the application, there should be a way the data tier notifies upper layers about statement/operation results in at least an organized way.
What i commonly use is two variables in every store procedure:
#code INT,
#message VARCHAR(1024)
I put the DML statements inside a TRY CATCH block and at the end of the try block, set both variables to certain status meaning everything went ok, as you might be thinking i do the opposite on the catch block, set both variables to some failure code if needed perform some error handling.
After the TRY CATCH block i return the result using a record:
SELECT #code AS code, #message AS message
These two variables are used in upper tiers for validation purposes as sending messages to the user, also for committing o rolling back transactions.
Perhaps i'm missing important features like RAISERROR or not cosidering better and more optimal and secure approaches.
I would appreciate your advices and good practices, not asking for cooks recipes there's no need, just the idea but if you decide to include examples they'd be more than welcome.
Thanks
The generally accepted approach for dealing with this is to have your sproc set a return values
and use raise error as has been previously suggested.
The key thing that I suspect you are missing is how to get this data back on the client side.
If you are in a .net environment, you can check for the return value as part of the ado.net code. The raise error messages can be captured by catching the SqlException object and iterating through the errors collection.
Alternatively, you could create and event handler and attach to the connection InfoMessage event. This would allow you to get the messages as they are generated from sql server instead of at the completion of the batch or stored procedure.
another option which I wouldn't recommend, would be to track all of the things you care about into a XML message and return that from the stored proc this gives you a little more structure then your free text field approach.
My experience has been to rely on the inherent workings of the procedure mechanisms. What I mean by this is that if a procedure is to fail, either the implicit raised error (if not using try/catch . . . simply CRUD stuff) or using the explicit RAISERROR. This means that if the code falls into the catch block and I want to notify the application of the error, I would re-raise the error with RAISERROR. In the past, I've have created a custom rethrow procedure so I can trap and log the error information. One could use the rethrow as an opportunity to throw a custom message/code that could be used to guide the application down a different decision path. If the calling application doesn't receive an error on execution, it's safe to assume that it succeeded - so explicit transaction committing can commence.
I've never attempted to write in my own communication mechanism for interaction between the app and the stored proc - the objects in the languages I interact with have events or mechanisms to detect the raised error from the procedure. It sounds to me like your approach doesn't necessarily require too much more code than standard error handling, but I do find it confusing as it seems to be a reinventing the wheel type of scenario.
I hope I've not misunderstood what you are doing, nor do I want to come off as critical; I do believe, however, you have this feedback mechanism available with the standard API and are probably making this too complicated. I would recommend reading up on RAISERROR and see if it's going to meet your needs (see the Using RAISERROR link for this at the bottom of the page as well).