Avoiding TSQL data-conversion errors

I think this is best asked in the form of a simple example. The following chunk of SQL causes a "DB-Library Error:20049 Severity:4 Message:Data-conversion resulted in overflow" message, but how come?
declare @a numeric(18,6), @b numeric(18,6), @c numeric(18,6)
select @a = 1.000000, @b = 1.000000, @c = 1.000000
select @a/(@b/@c)
go
How is this any different to:
select 1.000000/(1.000000/1.000000)
go
which works fine?

I ran into the same problem the last time I tried to use Sybase (many years ago). Coming from a SQL Server mindset, I didn't realize that Sybase would attempt to coerce the decimals out -- which, mathematically, is what it should do. :)
From the Sybase manual:
Arithmetic overflow errors occur when the new type has too few decimal places to accommodate the results.
And further down:
During implicit conversions to numeric or decimal types, loss of scale generates a scale error. Use the arithabort numeric_truncation option to determine how serious such an error is considered. The default setting, arithabort numeric_truncation on, aborts the statement that causes the error but continues to process other statements in the transaction or batch. If you set arithabort numeric_truncation off, Adaptive Server truncates the query results and continues processing.
So assuming that the loss of precision is acceptable in your scenario, you probably want the following at the beginning of your transaction:
SET ARITHABORT NUMERIC_TRUNCATION OFF
And then at the end of your transaction:
SET ARITHABORT NUMERIC_TRUNCATION ON
This is what solved it for me those many years ago ...
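Putting it together, a minimal sketch of the original batch wrapped that way (assuming Sybase ASE semantics as quoted above; variable names taken from the question):

SET ARITHABORT NUMERIC_TRUNCATION OFF

declare @a numeric(18,6), @b numeric(18,6), @c numeric(18,6)
select @a = 1.000000, @b = 1.000000, @c = 1.000000
select @a/(@b/@c)   -- the result is truncated instead of aborting the statement

SET ARITHABORT NUMERIC_TRUNCATION ON
go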

This is just speculation, but could it be that the DBMS doesn't look at the actual values of your variables, only at their potential range? A six-decimal numeric divided by a six-decimal numeric could need up to twelve decimal places, whereas with the literals the DBMS can see there is no overflow. Still not sure why the DBMS would care, though: shouldn't it be able to return the result of two six-decimal divisions as up to an 18-decimal numeric?

Because you have declared the variables in the first example, the result is expected to fit the same declaration (i.e. numeric(18,6)), but it does not.
That said, the first one worked for me in SQL Server 2005 (it returned 1.000000, the declared type), while the second one returned 1.00000000000000000000000, a completely different declaration.
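If you want to see what type SQL Server actually infers for the expression, one way (a sketch, SQL Server only; SQL_VARIANT_PROPERTY is not available in Sybase ASE) is:

declare @a numeric(18,6), @b numeric(18,6), @c numeric(18,6)
select @a = 1.000000, @b = 1.000000, @c = 1.000000
-- Inspect the inferred type, precision and scale of the result expression
select
SQL_VARIANT_PROPERTY(@a/(@b/@c), 'BaseType') as base_type,
SQL_VARIANT_PROPERTY(@a/(@b/@c), 'Precision') as result_precision,
SQL_VARIANT_PROPERTY(@a/(@b/@c), 'Scale') as result_scale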

Not directly related, but this could save someone some time with arithmetic overflow errors on Sybase ASE (12.5.0.3).
I was setting a few default values in a temporary table which I intended to update later on, and stumbled onto an arithmetic overflow error.
declare @a numeric(6,3)
select 0.000 as thenumber into #test --indirect declare
select @a = ( select thenumber + 100 from #test )
update #test set thenumber = @a
select * from #test
Shows the error:
Arithmetic overflow during implicit conversion of NUMERIC value '100.000' to a NUMERIC field.
In my head this should work, but it doesn't, because the 'thenumber' column wasn't explicitly declared (it was indirectly declared as decimal(4,3) by the 0.000 literal). So you have to indirectly declare the temp table column with the precision and scale you actually want, which in my case was 000.000.
select 000.000 as thenumber into #test --this solved it
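An equivalent and perhaps more explicit fix (a sketch of the same idea) is to cast the literal so the intended column type is spelled out:

select cast(0.000 as numeric(6,3)) as thenumber into #test --column is created as numeric(6,3)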
Hopefully that saves someone some time :)

Related

PostgreSQL insert with nested query fails with large numbers of rows

I'm trying to insert data into a PostgreSQL table using a nested SQL statement. I'm finding that my inserts work with a small (a few thousand) rows being returned from the nested query. For instance, when I attempt:
insert into the_target_table (a_few_columns, average_metric)
SELECT a_few_columns, AVG(a_metric)
FROM a_table
GROUP BY a_few_columns
LIMIT 5000
However, this same query fails when I remove my LIMIT (the inner query without limit returns about 30,000 rows):
ERROR: integer out of range
a_metric is a double precision, and a_few_columns are text. I've played around with the LIMIT rows, and it seems like the # of rows it can insert without throwing an error is 14,000. I don't know if this is non-deterministic, or a constant threshold before the error is thrown.
I've looked through a few other SO posts on this topic, including this one, and changed my table primary key data type to BIGINT. I still get the same error. I don't think it's an issue w/ numerical overflow, however, as the number of inserts I'm making is small and nowhere even close to hitting the threshold.
Anyone have any clues what is causing this error?
The issue here was an improper definition of the avg_metric field in the table I wanted to insert into. I had accidentally defined it as an integer. This normally isn't a huge issue, but I also had a handful of infinity values (inf). Once I switched the field data type to double precision I was able to insert successfully. Of course, it's probably best if my application had checked for finite values before attempting the insert; normally I'd do this programmatically via asserts, but with a nested query I hadn't bothered to check.
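The failure is easy to reproduce in isolation; as a rough sketch (PostgreSQL), casting an infinite double precision value to integer fails with the same error:

-- 'Infinity' is a valid double precision value, but it cannot be converted to integer
SELECT 'Infinity'::double precision::integer;
-- ERROR:  integer out of range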
The final query I used was
insert into the_target_table (a_few_columns, average_metric)
SELECT a_few_columns, CASE WHEN AVG(a_metric) = 'inf' THEN NULL ELSE AVG(a_metric) END
FROM a_table
GROUP BY a_few_columns
LIMIT 5000
An even better solution would have been to go through a_table and replace all inf values first.
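Something along these lines would do that (a sketch; it assumes NULL is an acceptable replacement for the infinite values):

-- Replace infinite values before aggregating and inserting
UPDATE a_table
SET a_metric = NULL
WHERE a_metric = 'Infinity'::double precision
OR a_metric = '-Infinity'::double precision;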

PostgreSQL Select into with double precision always returns null when run in PL/pgSQL function

I have a function in PL/pgSQL that is trying to back out some data for a date range. The problem I have is that I cannot seem to store the double precision value inside a variable. No matter what I do, the value is always null when running inside a function. When I run the query from the psql command line it returns the correct data. I can also run the query on another column that isn't of type double precision and it works fine. For example, if I change the column to "total_impressions_for_date_range" it returns the correct data.
I am using PostgreSQL 8.4
CREATE OR REPLACE FUNCTION rollback_date_range_revenue(campaign_id int,
begin_date timestamp, end_date timestamp, autocommit boolean)
RETURNS void AS $BODY$
DECLARE
total_impressions_for_date_range bigint;
total_clicks_for_date_range bigint;
total_revenue_for_date_range double precision;
total_cost_for_date_range double precision;
BEGIN
SELECT sum(revenue) INTO total_revenue_for_date_range
FROM ad_block_summary_hourly
WHERE ad_run_id IN (
SELECT ad_run_id FROM ad_run WHERE ad_campaign_id = campaign_id)
AND ad_summary_time >= begin_date
AND ad_summary_time < end_date
AND (revenue IS NOT NULL);
RAISE NOTICE 'Total revenue for given date range and campaign % was %',
campaign_id, total_revenue_for_date_range;
When I run this I always get a null value for the revenue
SELECT rollback_date_range_revenue(8818, '2015-07-20 18:00:00'::timestamp,
'2015-07-20 20:00:00'::timestamp, false);
NOTICE: Total revenue for given date range and campaign 8818 was <NULL>
When I run it from command line outside of the function it works completely fine
select sum(revenue) from ad_block_summary_hourly where ad_run_id in (
select ad_run_id from ad_run where ad_campaign_id = 8818)
and ad_summary_time >= '2015-07-20 18:00:00'::TIMESTAMP
and ad_summary_time < '2015-07-20 20:00:00'::TIMESTAMP;
sum
----------
3122.533
(1 row)
EDIT
Huge thanks to a_horse_with_no_name and Patrick. This was indeed a problem with a placeholder I had called revenue, which overlapped with a column name in my query. I was thrown off by the fact that the two queries that were not working were both double precision. It just happened that those two were also the placeholders that overlapped with column names.
Two things to take away from this:
I adopted the p_ naming scheme for place holders suggested by a_horse_with_no_name, so as to not run into this issue again.
Post a full code example, this could have been identified much quicker by the experts.
First of all, PostgreSQL 8.4 is no longer supported so you should upgrade to 9.4 as soon as you can. Second, your function is obviously abbreviated because some declared variables are not used and there is no END clause. These two points together make it somewhat guesswork to give you an answer, but here goes.
Try casting the double precision to text, or convert it with to_char(). RAISE NOTICE expects a string for the expressions to be inserted; possibly in 8.4 this is not automatic.
You could also improve upon your query:
...
SELECT sum(sh.revenue) INTO total_revenue_for_date_range
FROM ad_block_summary_hourly sh
JOIN ad_run r USING (ad_run_id)
WHERE r.ad_campaign_id = campaign_id
AND sh.ad_summary_time BETWEEN begin_date AND end_date;
RAISE NOTICE 'Total revenue for given date range and campaign % was %',
campaign_id, to_char(total_revenue_for_date_range, '9D999');
...
Another potential cause of the problem (guessing again due to lack of information) is a name collision between a function parameter or variable with a column name from either of the two tables.
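For illustration, a sketch of the p_ / v_ naming convention applied to an abbreviated version of the function above (names adapted for the example, not the OP's exact code):

CREATE OR REPLACE FUNCTION rollback_date_range_revenue(p_campaign_id int,
p_begin_date timestamp, p_end_date timestamp, p_autocommit boolean)
RETURNS void AS $BODY$
DECLARE
v_total_revenue double precision;
BEGIN
-- No parameter or variable shares a name with a column, so "revenue" below
-- unambiguously refers to the table column
SELECT sum(sh.revenue) INTO v_total_revenue
FROM ad_block_summary_hourly sh
JOIN ad_run r USING (ad_run_id)
WHERE r.ad_campaign_id = p_campaign_id
AND sh.ad_summary_time >= p_begin_date
AND sh.ad_summary_time < p_end_date;
RAISE NOTICE 'Total revenue for campaign % was %', p_campaign_id, v_total_revenue;
END;
$BODY$ LANGUAGE plpgsql;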

Why do I get a DATETIME conversion error in TSQL?

I know there are numerous questions about this topic, even one I asked myself a while ago (here). Now I ran into a different problem, and neither myself nor my colleagues know what the reason for the strange behaviour is.
We've got a relatively simple SQL statement quite like this:
SELECT
CONVERT(DATETIME, SUBSTRING(MyText, CHARINDEX('Date:', MyText) + 8, 16)) AS MyDate,
SomeOtherColumn,
...
FROM
MyTable
INNER JOIN MyOtherTable
ON MyTable.ID = MyOtherTable.MyTableID
WHERE
MyTable.ID > SomeValue AND
MyText LIKE 'Date: %'
This is not my database and also not my SQL statement, and I didn't create the great schema to store datetime values in varchar columns, so please ignore that bit.
The problem we are facing right now is a SQL conversion error 241 ("Conversion failed when converting date and/or time from character string.").
Now I know that the query optimiser may rearrange the execution plan so that the WHERE clause is only applied after the conversion is attempted, but the really strange thing is that I don't get any errors when I delete the entire WHERE clause.
I also don't get any errors when I add a single line to the statement above as follows:
SELECT
MyText, -- This is the added line
CONVERT(DATETIME, SUBSTRING(MyText, CHARINDEX('Date:', MyText) + 8, 16)) AS MyDate,
...
As soon as I remove it I get the conversion error again. Manually checking the values in the MyText column without trying to convert them does not show that there are any records which might cause a problem.
What is the reason for the conversion error? Why do I not run into it when I also select the column as part of the SELECT statement?
Update
Here is the execution plan, although I don't think it's going to help.
Sometimes, SQL Server aggressively optimizes by pushing conversion operations earlier in the process than they would otherwise need to be. (It shouldn't. See SQL Server should not raise illogical errors on Connect, as an example).
When you just select:
CONVERT(DATETIME, SUBSTRING(MyText, CHARINDEX('Date:', MyText) + 8, 16))
Then the optimizer decides it can perform this conversion as part of the table/index scan or seek - right at the point at which it's reading the data from the table (and, importantly, before or at the same time as the WHERE clause filter). The rest of the query can then just use the converted value.
When you select:
MyText, -- This is the added line
CONVERT(DATETIME, SUBSTRING(MyText, CHARINDEX('Date:', MyText) + 8, 16))
It decides to let the conversion happen later. Importantly, the conversion now (by happenstance) happens later than the WHERE clause filter which should, by rights, be filtering all rows before the conversion is attempted.
The only safe way to deal with this is to force the filtering to definitely occur before the conversion is attempted. If you're not dealing with aggregates, a CASE expression may be safe enough:
SELECT CASE WHEN MyText LIKE 'Date: %' THEN CONVERT(DATETIME, SUBSTRING(MyText, CHARINDEX('Date:', MyText) + 8, 16)) END
Otherwise, the even safer option is to split the query into two separate queries, and store the intermediate results in a temp table or table variable (views, CTEs and subqueries don't count, because the optimizer can "see through" such constructs)
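A sketch of that split (table and column names from the question; #Filtered is just an illustrative name):

-- Step 1: filter first, materialising only the rows that are safe to convert
SELECT MyTable.ID, MyText, SomeOtherColumn
INTO #Filtered
FROM MyTable
INNER JOIN MyOtherTable ON MyTable.ID = MyOtherTable.MyTableID
WHERE MyTable.ID > SomeValue
AND MyText LIKE 'Date: %'

-- Step 2: the conversion now only ever sees rows that passed the filter
SELECT CONVERT(DATETIME, SUBSTRING(MyText, CHARINDEX('Date:', MyText) + 8, 16)) AS MyDate,
SomeOtherColumn
FROM #Filtered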

Trouble with Dynamic T-SQL IIF Statement

This code works fine and does exactly what I want, which is to sum the Qty * Price for each instance of the dynamic query.
But when I add an IIF statement it breaks. What I am trying to do is the same thing as above, but when the transaction type is 'CO', set the sum to a negative amount.
The problem turned out to be the NVARCHAR(4000) type of @sql, which limits its length to 4,000 characters: the query got truncated at some random place after another long chunk was appended to it.
DECLARE @sql NVARCHAR(MAX) solves the problem, allowing a dynamic query of any size up to 2GB.
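The truncation is silent, which is what makes it hard to spot. A quick sketch of the pitfall (the table name is made up for illustration):

DECLARE @truncated NVARCHAR(30)
DECLARE @sql NVARCHAR(MAX)

-- Assigning a string longer than the variable silently cuts it off; no error is raised
SET @truncated = N'SELECT SomeColumn FROM SomeFairlyLongTableName WHERE SomeColumn > 0'
SET @sql = N'SELECT SomeColumn FROM SomeFairlyLongTableName WHERE SomeColumn > 0'

SELECT LEN(@truncated) AS truncated_len, LEN(@sql) AS full_len
-- EXEC sp_executesql @sql  -- always execute the NVARCHAR(MAX) version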

SQL Multiply Discrepancy

I stumbled across this oddity when multiplying DECIMAL numbers on SQL Server 2005/2008. Can anyone explain the effect?
DECLARE @a DECIMAL(38,20)
DECLARE @b DECIMAL(38,20)
DECLARE @c DECIMAL(38,20)
SELECT @a=1.0,
@b=2345.123456789012345678,
@c=23456789012345.999999999999999999
SELECT CASE WHEN @a*@b*@c = @c*@b*@a
THEN 'Product is the same'
ELSE 'Product differs'
END
It's due to precision representation and rounding errors.
The problem is due to
SELECT @a*@b --(=2345.123457)
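More concretely (per SQL Server's documented precision/scale rules, as I understand them): multiplying two decimals yields precision p1 + p2 + 1 and scale s1 + s2, so DECIMAL(38,20) * DECIMAL(38,20) would need precision 77 and scale 40. Since precision is capped at 38, the scale is cut back (not below 6 for multiplication), and the intermediate product is effectively DECIMAL(38,6), hence 2345.123457 and the order-dependent rounding. A quick sketch to confirm this (SQL Server only):

DECLARE @a DECIMAL(38,20), @b DECIMAL(38,20)
SELECT @a = 1.0, @b = 2345.123456789012345678
-- Inspect the type of the intermediate product
SELECT
SQL_VARIANT_PROPERTY(@a*@b, 'BaseType') AS base_type,
SQL_VARIANT_PROPERTY(@a*@b, 'Precision') AS result_precision,
SQL_VARIANT_PROPERTY(@a*@b, 'Scale') AS result_scale -- expect scale 6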
[Please search SO for multiple examples.]
Related: Sql Server Decimal(30,10) losing last 2 decimals