How to avoid arithmetic errors in PostgreSQL? - postgresql

I have a PostgreSQL-powered web app that does some non-essential, simple calculations for reporting purposes: it gets values from outside sources and multiplies and divides them. Today a multiplication whose result exceeded the value domain of a numeric( 10, 4 ) field led to an application crash. It would have been much better if the relevant field had just been set to null and a notice had been generated. The way the bug worked was that a wrong value in one field caused several views to become unavailable. A missing value in that place would have been sad but no big problem, whereas the blocked view is essential for the app to work.
Now I'm aware that in this particular case, setting that field to numeric( 11, 4 ) would have prevented the bailout, but that is, of course, only postponing the issue at hand. Since the error happened in a function call, I could also have written an exception handler. Lastly, one could check either the multiplicands or the result for sane values. That is in itself a little strange, though: I would either have to guess based on magnitudes or else do the multiplication in another numeric type that can probably handle a value whose magnitude is, because it comes from external sources, in principle not known to me with certainty.
Exception handling is probably what this will boil down to, which, however, entails that all numeric calculations will have to be done via PL/pgSQL function calls, and will have to be implemented in many different places. None of the options seems particularly maintainable or elegant. So the question is: Can I somehow configure PostgreSQL to ignore some or all arithmetic errors and use default values in such cases? If so, can that be done per database or will I have to configure the server? If this is impossible or a Bad Idea, what are the best practices to avoid arithmetic errors?
Clarification: This is not a question about how to rewrite numeric( 10, 4 ) so that the field can hold values of 1e6 and above, and also not so much about error handling in the application that uses the DB. It's more about whether there is an operator, a function call, a general configuration or a general pattern that is most commonly recommended to deal with situations where a (non-essential) computation normally results in a number (or in fact another value type) except with some inputs that cause exceptions, in which case the result could perfectly well and safely be discarded. Think Excel printing out #### when a cell is too narrow for the digits to be displayed, or JavaScript giving you NaN in place of arithmetic errors. Returning null instead of raising an exception may be a bad idea in general programming but legitimate in specific cases.
Observe that PostgreSQL's error codes do have e.g. invalid_argument_for_logarithm, invalid_argument_for_ntile_function and division_by_zero all grouped together under Class 22 (Data Exception), and that PostgreSQL does allow exception handling in function bodies. So I can also ask specifically: how can I catch all class 22 exceptions short of listing all the error codes? But then I still hope for a more principled approach.

Arguably the type numeric (without type modifiers) would be the right thing for you if you want to avoid overflows (that's what you seem to mean with “arithmetic error”) as much as possible.
However, even then there is still the possibility of a value overflowing the numeric format.
There is no way to configure PostgreSQL so that it ignores a numeric overflow.
If the result of an operation cannot be represented in a data type, there should be an error. If the data supplied by the application can lead to an error, the application should be ready to handle such an error rather than “crash”. Failure to do so is an application bug.
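As a rough illustration of the application-side check mentioned in the question, here is a minimal Python sketch that tests whether a computed value would fit a numeric( 10, 4 ) column before it is sent to the database, and yields None otherwise. The function name fit_numeric and its parameters are hypothetical, and the sketch assumes the column rounds to the declared scale and rejects any value whose rounded absolute value reaches 10^(precision - scale).

```python
from decimal import Decimal, InvalidOperation, ROUND_HALF_UP, localcontext


def fit_numeric(value, precision=10, scale=4):
    """Round `value` to `scale` decimal places.

    Returns None if the rounded result would not fit a
    numeric(precision, scale) column (hypothetical helper).
    """
    if not value.is_finite():
        # NaN and infinities never fit a constrained numeric column.
        return None
    quantum = Decimal(1).scaleb(-scale)          # e.g. Decimal("0.0001")
    limit = Decimal(10) ** (precision - scale)   # e.g. 1000000
    try:
        with localcontext() as ctx:
            ctx.prec = 1000  # generous working precision for the rounding step
            rounded = value.quantize(quantum, rounding=ROUND_HALF_UP)
    except InvalidOperation:
        # The value is too large to be rounded even at this precision.
        return None
    if rounded.copy_abs() >= limit:
        return None
    return rounded


if __name__ == "__main__":
    print(fit_numeric(Decimal("12.5") * Decimal("3")))       # fits the column
    print(fit_numeric(Decimal("1234.5") * Decimal("1000")))  # too large
```

Doing the multiplication in Decimal first sidesteps the magnitude guess the question worries about, since the check happens on the exact result rather than on the inputs.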

Related

is this an SQL injection

In the apache access logs I found the following code as query string (GET), submitted multiple times each second for quite a while from one IP:
**/OR/**/ASCII(SUBSTRING((SELECT/**/COALESCE(CAST(LENGTH(rn)/**/AS/**/VARCHAR(10000))::text,(CHR(32)))/**/FROM/**/"public".belegtable/**/ORDER/**/BY/**/lv/**/OFFSET/**/1492/**/LIMIT/**/1)::text/**/FROM/**/1/**/FOR/**/1))>9
What does it mean?
Is this an attempt of breaking in via injection?
I have never seen such a statement and I don't understand its meaning. PostgreSQL is used on the server.
rn and belegtable exist. Some other attempts contain other existing fields/tables. Since the application is very custom, I don't know how strangers could know which SQL fields exist. Very weird.
**/
OR ASCII(
    SUBSTRING(
        ( SELECT COALESCE(
              CAST(LENGTH(rn) AS VARCHAR(10000))::text,
              (CHR(32))
          )
          FROM "public".belegtable
          ORDER BY lv
          OFFSET 1492
          LIMIT 1
        )::text
        FROM 1
        FOR 1
    )
) > 9
Is this an attempt of breaking in via injection?
The query in question does not have too many characteristics of an attempted SQL injection.
An SQL injection typically involves inserting an unwanted action into some section of a bigger query, under the disguise of a single value. Typically the injected part tries to guess what comes before it, neutralise it, do something malicious and secure the entire query from syntax errors by also neutralising what comes after the injected piece, which might not be visible to the attacker.
I don't see anything that could work as an escape sequence at the beginning or anything that would neutralise the remains of the query coming in after the injection. What this query does also isn't malicious. An SQL injection would attempt to extract some additional information about the database security, structure and configuration, or - if the attacker already gathered enough data - it would try to steal the data, encrypt it or tamper with it otherwise, depending on the aim and strategy of the attacker as well as the type of data found in the database. There also wouldn't be much point looping it like that.
As to the looping part: if someone attempted to put load on the database - as in DDoS - you'd likely see more than one node doing that and probably in a more elaborate and well disguised manner, using different and more demanding queries sent at different frequencies.
What does it mean?
It's likely someone's buggy code stuck in an unterminated loop, judging by the LIMIT and OFFSET mechanism, which I've seen used for looping through a set of records by taking one at a time (LIMIT 1) and incrementing which one to get next (OFFSET n). The whole expression always returns true because ASCII() returns the character code of the first character in the string. That string is either the default, a space ' ' (ASCII code 32), or the text representation of a non-negative number, namely the length of rn. Since all ASCII digits have codes between 48 and 57, it's effectively always comparing some number bigger than 9 to 9, checking if it is indeed bigger.
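The reading above can be sketched in a few lines of Python. This is only a model of the logged expression as this answer interprets it, not something run against the database; the function name condition_for is hypothetical, and None stands in for a NULL result of LENGTH(rn).

```python
def condition_for(length_of_rn):
    """Model of the logged condition for one row (hypothetical helper).

    `length_of_rn` is the value of LENGTH(rn), or None for NULL.
    """
    # COALESCE(CAST(LENGTH(rn) AS VARCHAR(10000))::text, CHR(32))
    text = str(length_of_rn) if length_of_rn is not None else chr(32)
    # SUBSTRING(text FROM 1 FOR 1): the first character only
    first_char = text[:1]
    # ASCII(first_char) > 9
    return ord(first_char) > 9


if __name__ == "__main__":
    for sample in (None, 0, 7, 42, 1492):
        print(sample, condition_for(sample))
```

Whatever the row contains, the first character is either a space or a digit, so its character code is at least 32 and the comparison against 9 cannot fail.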
The author of that code might not have predicted the loop to be able to run infinitely and might have misinterpreted what some of the functions used in that query do. Regardless of what really happened, I think it was a good idea to cut off that IP avoiding needless stress on the database. Double-checking your security setup is always a good idea but I wouldn't call this an attempted attack. At least not this query alone, as it might be a harmless piece of a bigger, more malicious operation - but that could be said about any query.

Informatica router transformation-Issue

I am facing a weird problem with the router transformation in Informatica. I am using it in my mapping, where I check a particular port's value and, based on the condition, route the row to the appropriate flow. While I debug, I see the value of the variable as expected, but the row is identified as "filtered" in the debugger. I have tried various other methods, like trimming the variable (LTRIM/RTRIM) to ensure there are no trailing spaces that make the router condition fail, but that doesn't work either. As a result, the rows that are supposed to be inserted into the target are bypassed. Has anyone faced a similar issue? I am wondering if I am missing something here.
When you run normally without the debugger are you experiencing different results?
As you know, "Filtered" means the condition is evaluating to false, so the only question should be around your condition.
What is the data type of the port and what is your exact conditional expression?
Mismatched data types can cause unexpected boolean evaluations (e.g. comparing an integer to a string without casting one side using TO_CHAR or TO_INTEGER respectively).

Which exception to throw when I find my data in inconsistent state in Scala?

I have a small Scala program which reads data from a data source. This data source is currently a .csv file, so it can contain data inconsistencies.
When implementing a repository pattern for my data, I implemented a method which will return an object by a specific field which should be unique. However, I can't guarantee that it will really be unique, as in a .csv file, I can't enforce data quality in a way I could in a real database.
So, the method checks whether there are one or zero objects with the requested field value in the repository, and that goes well. But I don't know Scala well (or Java for that matter), and the charts of the Java exception hierarchy which I found were not very helpful. Which would be the appropriate exception to throw if there are two objects with the same supposedly unique value? What should I use?
There are two handy exceptions for such cases: IllegalStateException and IllegalArgumentException. The first one is used when an object's internal state is illegal for the requested operation (say, you are calling connect twice). The second one (which seems to be more suitable to your case) is used when data that comes from the outside world does not satisfy some prescribed conditions, e.g. a negative value when the function is supposed to work only with zero and positive values.
Neither is something that should be handled programmatically on the caller side (with try/catch): they signify illegal usage of the API and/or logical errors in the program flow, and such errors have to be fixed during development (in your case, they have to inform the developer who is passing that data that the specific field has to contain only unique values).
You can always use a customized Exception, and in case this is a web API you might want to map your exception to a Bad Request (400) response code.
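The custom-exception idea can be sketched as follows. The sketch is in Python rather than Scala, purely to show the shape of the lookup; the names DuplicateKeyError and find_unique are hypothetical and the records are assumed to be plain dictionaries read from the .csv file.

```python
class DuplicateKeyError(Exception):
    """Hypothetical custom exception: a supposedly unique value occurs more than once."""


def find_unique(records, field, value):
    """Return the single record whose `field` equals `value`, or None if absent.

    Raises DuplicateKeyError when the data source violates uniqueness.
    """
    matches = [record for record in records if record.get(field) == value]
    if len(matches) > 1:
        raise DuplicateKeyError(
            f"{len(matches)} records share {field}={value!r}; expected at most one"
        )
    return matches[0] if matches else None


if __name__ == "__main__":
    rows = [
        {"id": "a", "name": "first"},
        {"id": "b", "name": "second"},
        {"id": "b", "name": "third"},
    ]
    print(find_unique(rows, "id", "a"))
    try:
        find_unique(rows, "id", "b")
    except DuplicateKeyError as error:
        print("inconsistent data:", error)
```

A dedicated exception type keeps the "bad data in the source" case distinguishable from ordinary "not found" results, which is what makes mapping it to a specific response code straightforward.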

What coredata errors should I prepare for?

I have an app getting close to release date, but it occurred to me that wherever I have core data save and/or fetch requests I'm not really handling the errors other than to check if they exist and #throw them, which I'm sure will seem almost like nails on a chalkboard to more experienced programmers, and surely there's some kind of disaster waiting to happen.
So to be specific, what kinds of errors can I expect from A) Fetches, and B) Saves, and also C) in general terms, how should I deal with these?
You can see the Core Data Constants Reference to get an idea about what kind of errors you can expect to see in general.
For fetches, the most common issue is that the fetch returns an empty array. Make sure that your view controllers, datasources and delegates can handle an empty fetch. If you dynamically construct complex predicates, make sure you catch exceptions from an invalid predicate.
Most save errors result from validation errors. You should have an error recovery path for every validation you supply. One common and somewhat hidden validation error is not providing a required relationship.
One thing that trips people up with Objective-C is that errors and exceptions are slightly different critters than they are in other languages. In Objective-C an error is something that the programmer should anticipate and plan for in the normal operation of the application, e.g. a missing file. By contrast, an exception is something exceptional that the programmer wouldn't expect the app to have to routinely handle, e.g. a corrupted file.
Therefore, in Core Data a validation failure would be a common, expected and unexceptional error, whereas a corrupted persistent store would be a rare, unexpected and highly exceptional exception.
See the Exceptions Programming Guide and the Error Handling Programming Guide for details.

Catching errors with DBD::Informix

I need to run dynamically constructed queries against Informix IDS 9.x; while the WHERE clause is mostly quite simple, the Projection clause can be quite complicated, with lots of columns and formulas applied to columns. Here is one example:
SELECT ((((table.I_ACDTIME + table.I_ACWTIME + table.I_DA_ACDTIME + table.I_DA_ACWTIME +
table.I_RINGTIME))+(table.I_ACDOTHERTIME + table.I_ACDAUXINTIME +
table.I_ACDAUX_OUTTIME)+(table.I_TAUXTIME + table.I_TAVAILTIME +
table.I_TOTHERTIME)+((table.I_AVAILTIME + table.I_AUXTIME)*
((table.MAX_TOT_PERCENTS/100)/table.MAXSTAFFED)))/(table.INTRVL*60))
FROM table
WHERE ...
The problem arises when some of the fields used contain zeroes; Informix predictably throws a division by zero error, but the error message is not very helpful:
DBD::Informix::st fetchrow_arrayref failed:
SQL: -1202: An attempt was made to divide by zero.
In this case, it is desirable to return NULL upon a failed calculation. Is there any way to achieve this other than parsing the Projection clause and enclosing each and every division attempt in CASE ... END? I would prefer to use some DBD::Informix magic if it's there.
I don't believe you'll be able to solve this with DBD::Informix or any other database client, without resorting to parsing the SQL and rewriting it. There's no option to just ignore the column with the /0 arithmetic: the whole statement fails when the error is encountered, at the engine level.
If it's any help, you can write the code to avoid /0 as a DECODE rather than CASE ... END, which is a little cleaner, i.e.:
DECODE(table.MAXSTAFFED, 0, NULL,
((table.MAX_TOT_PERCENTS/100)/table.MAXSTAFFED)))/(table.INTRVL*60)))
DBD::Informix is an interface to the Informix DBMS, and as thin as possible (which isn't anywhere near as thin as I'd like, but that's another discussion). Such behaviour cannot reasonably be mediated by DBD::Informix (or any other DBD driver accessing a DBMS); it must be handled by the DBMS itself.
IDS does not provide a mechanism to yield NULL in lieu of a divide by zero error. It might be a reasonable feature request - but it would not be implemented until the successor version to Informix 11.70 at the earliest.
Note that Informix Dynamic Server (IDS) 9.x is several years beyond the end of its supported life (10.00 is also unsupported).
From experience working with Informix, I would say you would be lucky to get that kind of functionality within IDS (earlier versions of IDS, not much earlier than your version, had barely any string manipulation functions, never mind anything complicated).
I would save yourself the time and do the calculations against an in-memory list.
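A minimal sketch of that client-side approach, in Python rather than Perl and using only a simplified, hypothetical subset of the formula from the question: fetch the raw columns, then compute in memory with a division helper that yields None in place of an error. The names safe_div and weighted_time are made up for illustration, and the raw column values are assumed to be non-NULL numbers.

```python
def safe_div(numerator, denominator):
    """Divide, yielding None (the client-side analogue of NULL) instead of failing."""
    if numerator is None or denominator is None or denominator == 0:
        return None
    return numerator / denominator


def weighted_time(row):
    """Simplified, hypothetical subset of the question's formula for one fetched row."""
    per_staff = safe_div(row["MAX_TOT_PERCENTS"] / 100, row["MAXSTAFFED"])
    if per_staff is None:
        return None
    total = (row["I_AVAILTIME"] + row["I_AUXTIME"]) * per_staff
    return safe_div(total, row["INTRVL"] * 60)


if __name__ == "__main__":
    rows = [
        {"I_AVAILTIME": 600, "I_AUXTIME": 300, "MAX_TOT_PERCENTS": 50,
         "MAXSTAFFED": 5, "INTRVL": 30},
        {"I_AVAILTIME": 600, "I_AUXTIME": 300, "MAX_TOT_PERCENTS": 50,
         "MAXSTAFFED": 0, "INTRVL": 30},  # would fail with -1202 in the engine
    ]
    for row in rows:
        print(weighted_time(row))
```

The trade-off is that the query returns more raw columns than before, but a single zero divisor then affects only its own row instead of failing the whole statement.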