I have a DB2 data source and an Oracle 12c target.
The Oracle database has a DB link to the DB2 database defined, which works in general.
Now I have a huge table in DB2 with a timestamp column (let's call it ROW_CHANGED) that records row changes. I want to retrieve the rows that have changed after a particular time.
Running
SELECT * FROM lib.tbl WHERE ROW_CHANGED >'2016-08-01 10:00:00'
on the DB2 returns exactly 1 row after ca. 90 secs which is fine.
Now I try the same query from Oracle via the DB link:
SELECT * FROM lib.tbl@dblink_name WHERE ROW_CHANGED >TO_TIMESTAMP('2016-08-01 10:00:00')
This runs for hours and ends up in a timeout.
I read some Oracle docs and found distributed query optimization tips, but most of them refer to joining a local table to a remote table, which is not my case.
In my desperation, I tried the DRIVING_SITE hint, without effect.
Now I wonder when the WHERE part of the query is evaluated. Since I have to use Oracle syntax and not DB2 syntax for the query, is it possible that Oracle first copies the full table and applies the WHERE clause afterwards? I did some research but did not find anything that would help me in this direction.
The ROW_CHANGED is a hidden column in the DB2, if that matters.
Thx for any hint in advance.
Update
Thanks @all for the help. I'll share what did the trick for me.
First of all, I used TO_TIMESTAMP because the DB2 column is also a timestamp (not a date), and I had expected to avoid implicit conversions this way.
Without the explicit conversion I ran into ORA-28534: Heterogeneous Services preprocessing error, and I have no hope of touching the DB config within a reasonable time.
The explain plan, by the way, did not reveal much. It showed a FULL hint and no conversion on the predicates. It did show the ROW_CHANGED column as a date, though; I wonder why.
I tried Justin's suggestion to use a bind variable, but I got ORA-28534 again. The next thing I did was to wrap it into a PL/SQL block (it will run in a stored procedure later anyway).
declare
  -- the string literal is converted implicitly using the default format (DD.MM.YY here)
  v_tmstmp TIMESTAMP := '01.08.16 10:00:00';
begin
  INSERT INTO ORAUSER.TMP_TBL (SRC_PK,ROW_CHANGED)
  SELECT SRC_PK,ROW_CHANGED
  FROM lib.tbl@dblink_name
  WHERE ROW_CHANGED > v_tmstmp;
end;
This executed in the same time as on DB2 itself. The date format is DD.MM.YY here because that is, unfortunately, the default.
When changing the variable assignment to
v_tmstmp TIMESTAMP := TO_TIMESTAMP('01.08.16 10:00:00','DD.MM.YY HH24:MI:SS');
I got the same problem as before.
Meanwhile the DB2 operators have created an index on the ROW_CHANGED column, which I had requested earlier that day. This seems to have solved the problem in general. Even my original query now finishes in no time.
If you are actually using an Oracle-specific conversion function like to_timestamp, that forces the predicate to be evaluated on the Oracle side. Oracle isn't going to know how to convert a built-in function like to_timestamp into an exactly equivalent function call in DB2.
If you used a bind variable, that would be more likely to get evaluated on the DB2 side. But that may be complicated by the data type mapping between different databases-- there may not be a perfect mapping between one engine's date and another engine's timestamp data type. If this was a numeric column, a bind variable would be almost certain to get pushed. In this case, it probably involves playing around a bit to figure out exactly what data type to use for your variable that works for your framework, Oracle, and DB2.
If using a bind variable doesn't work, you can force the predicate to be evaluated on the remote server using the dbms_hs_passthrough package. That lets you send a query verbatim to the remote server which allows you to do things like use functions defined in your DB2 database. That's a bit of overkill in this situation, hopefully, but it's nice to have the hammer as your backup if the simpler solution doesn't work quickly enough.
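To make the cost difference concrete, here is a minimal sketch in Python rather than Oracle or DB2. It assumes an in-memory sqlite3 database as a stand-in for the remote table, and all names and data are made up. It only illustrates why it matters where the predicate runs; it says nothing about how Oracle actually decides.

```python
import sqlite3

# Stand-in for the remote DB2 table; names and data are hypothetical.
remote = sqlite3.connect(":memory:")
remote.execute("CREATE TABLE tbl (src_pk INTEGER PRIMARY KEY, row_changed TEXT)")
old_rows = [(i, "2016-07-%02d 09:00:00" % (i % 28 + 1)) for i in range(1, 1000)]
remote.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    old_rows + [(1000, "2016-08-01 11:30:00")],
)

cutoff = "2016-08-01 10:00:00"

# Case 1: the predicate is sent along with the query (bind variable),
# so only matching rows come back over the "link".
pushed = remote.execute(
    "SELECT src_pk, row_changed FROM tbl WHERE row_changed > ?", (cutoff,)
).fetchall()

# Case 2: the predicate cannot be sent, so every row is fetched first
# and the filter is applied on the local side.
fetched = remote.execute("SELECT src_pk, row_changed FROM tbl").fetchall()
filtered_locally = [row for row in fetched if row[1] > cutoff]

print(len(pushed), "row(s) cross the link when the predicate is pushed")
print(len(fetched), "row(s) cross the link when it is applied locally")
```

Both variants return the same single row, but the second one has to move the whole table first, which is the behaviour the question suspects.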
I was creating a function following an example from a database class which included the creation of a temporary variable (base_salary) and using a SELECT INTO to calculate its value later.
However, I did not realize I used a different order for the syntax (SELECT ... FROM ... INTO base_salary) and the function could be used later without any visible issues (values worked as expected).
Is there any difference when using the "SELECT ... FROM ... INTO" syntax order? I tried looking it up in the PostgreSQL documentation but found nothing about it. A Google search did not provide any meaningful information either. The only related thing I found was in the MySQL documentation, which only mentioned that the different order was supported in an older version.
There is no difference. From the docs of pl/pgsql:
The INTO clause can appear almost anywhere in the SQL command.
Customarily it is written either just before or just after the list of
select_expressions in a SELECT command, or at the end of the command for other command types. It is recommended that you follow
this convention in case the PL/pgSQL parser becomes stricter in future
versions.
Notice that in (non-procedural) SQL there is also a SELECT INTO command, which works like CREATE TABLE AS; in that version the INTO must come right after the SELECT clause.
I always use SELECT ... INTO ... FROM; I believe that is the standard, supported notation.
https://www.w3schools.com/sql/sql_select_into.asp
I would recommend using this order, also in case there are updates or the other order becomes unsupported, as you mentioned.
I am trying to pull data from DB2 via Informatica. I have an SQ query that pulls a few fields based on joins across 4 different tables.
When I run the query directly in the database, it returns the expected result. However, when I run it in Informatica with the debugger, I see something else.
Please note that the data in all columns matches perfectly, except for one single column.
The weird thing is that this is a calculated field based on a CASE statement:
CASE WHEN Column1='3' THEN 'N' ELSE 'Y' END.
Since this is a calculated field holding a string of length one, I connected it from the source to the SQ using a port from one of the sources that has a length of 1 character.
This returns 'Y' when executed in the database. When I copy and paste the same query into the SQ in Informatica and run it, I get the value 'E', which should never be possible as I expect only an N or a Y. I have verified the column order and that it is in the right place. This is very strange; is something going wrong because of the CASE statement?
Save yourself the hassle: put an expression transformation after the source qualifier, calculate the port value there, then forget about it.
I think I found the issue. We use Informatica PowerExchange to connect to an AS400 system (DB2), and it seems that when we set a flag in AS400 and pass it to Informatica via PowerExchange, it gets converted to binary. To solve this, there needs to be an entry in the PowerExchange configuration file.
Unfortunately, I was not aware that it could be related to PowerExchange instead of PowerCenter itself!
Thanks for your assistance! Below is the KB article about it.
https://kb.informatica.com/solution/4/Pages/17498.aspx
I primarily use CFQUERYPARAM to prevent SQL injection. Since Query-of-Queries (QoQ) does not touch the database, is there any logical reason to use CFQUERYPARAM in them? I know that values that do not match the cfsqltype and maxlength will throw an exception, but, these values should already be validated before that and display friendly messages (from a UX viewpoint).
"Since Query-of-Queries (QoQ) does not touch the database, is there any logical reason to use CFQUERYPARAM in them?"
Actually, it does touch the database, the database that you currently have stored in memory. The data in that database could still theoretically be tampered with via some sort of injection from the user. Does that affect your physical database - no. Does that affect the use of the data within your application - yes.
You did not give any specific details but I would err on the side of caution. If ANY of the data you are using to build your query comes from the client then use cfqueryparam in them. If you can guarantee that none of the elements in your query comes from the client then I think it would be okay to not use the cfqueryparam.
As an aside, using cfqueryparam also helps optimize the query for the database although I'm not sure if that is true for query of queries. It also escapes characters for you like apostrophes.
Here is a situation where it's simpler, in my opinion.
<cfquery name="NoVisit" dbtype="query">
select chart_no, patient_name, treatment_date, pr, BillingCompareField
from BillingData
where BillingCompareField not in
(<cfqueryparam cfsqltype="cf_sql_varchar"
value="#ValueList(FinalData.FinalCompareField)#" list="yes">)
</cfquery>
The alternative would be to use QuotedValueList. However, if anything in that value list contained an apostrophe, cfqueryparam will escape it. Otherwise I would have to.
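The apostrophe point is not specific to ColdFusion. As a rough illustration only, here is the same idea in Python with sqlite3 (not CFML; the table and the values are made up): a bound parameter carries the apostrophe safely, while pasting the quoted value into the SQL text breaks the statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE billing (patient_name TEXT)")
conn.executemany("INSERT INTO billing VALUES (?)", [("O'Brien",), ("Smith",)])

names = ["O'Brien"]  # hypothetical value list containing an apostrophe

# Parameterized: one placeholder per list element, values bound separately.
placeholders = ", ".join("?" for _ in names)
bound = conn.execute(
    f"SELECT patient_name FROM billing WHERE patient_name NOT IN ({placeholders})",
    names,
).fetchall()

# Naive quoting: the apostrophe ends the string literal too early.
naive_sql = (
    "SELECT patient_name FROM billing WHERE patient_name NOT IN ('%s')"
    % "', '".join(names)
)
try:
    conn.execute(naive_sql)
    naive_failed = False
except sqlite3.OperationalError:
    naive_failed = True

print(bound, naive_failed)
```

Without parameters, each value would have to be escaped by hand before being placed in the list.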
Edit starts here
Here is another example where not using query parameters causes an error.
<cfscript>
// assumed setup: a query object with a single date column
x = QueryNew("dt","date");
QueryAddRow(x,2);
QuerySetCell(x,"dt",CreateDate(2001,1,1),1);
QuerySetCell(x,"dt",CreateDate(2001,1,11),2);
</cfscript>
<cfquery name="y" dbtype="query">
select * from x
<!---
where dt in (<cfqueryparam cfsqltype="cf_sql_date" value="#ValueList(x.dt)#" list="yes">)
--->
where dt in (#ValueList(x.dt)#)
</cfquery>
The code as written throws this error:
Query Of Queries runtime error.
Comparison exception while executing IN.
Unsupported Type Comparison Exception:
The IN operator does not support comparison between the following types:
Left hand side expression type = "DATE".
Right hand side expression type = "LONG".
With the query parameter, commented out above, the code executes successfully.
I have generic code that is used to retrieve DDL information from a Firebird database (FB2.1). It generates SQL code like
SELECT * FROM MyTable where 'c' <> 'c'
I cannot change this code. Actually, if that matters, it is inside Report Builder 10.
The fact is that some tables in my database are becoming a little too populated (>1M records), and that query is starting to take too long to execute.
If I try to execute
SELECT * FROM MyTable where SomeIndexedField = SomeImpossibleValue
it will obviously use that index and run very quickly.
Well, it wouldn't be that hard for the database to figure out that this is an impossible match, apply some sort of optimization, and avoid testing it against each row.
Is there any way to make my Firebird database optimize that search?
As the filter condition is a negative proposition (and also doesn't refer to a column to search, but only compares one value to another), Firebird needs to do a full table scan (without using any index) to confirm that there aren't any records that meet your criteria.
If you can't change the query, you will need to wait for the upcoming 3.0 version, which will implement the Boolean data type and therefore should start to evaluate "constant" fake comparisons in advance (maybe the client library will do this evaluation before sending the statement to the server?).
I butter-fingered a query in SQL Server 2000 and added a period in the middle of the table name:
SELECT t.est.* FROM test
Instead of:
SELECT test.* FROM test
And the query still executed perfectly. Even SELECT t.e.st.* FROM test executes without issue.
I've tried the same query in SQL Server 2008 where the query fails (error: the column prefix does not match with a table name or alias used in the query). For reasons of pure curiosity I have been trying to figure out how SQL Server 2000 handles the table names in a way that would allow the butter-fingered query to run, but I haven't had much luck so far.
Any sql gurus know why SQL Server 2000 ran the query without issue?
Update: The query appears to work regardless of the interface used (e.g. Enterprise Manager, SSMS, OSQL) and as Jhonny pointed out below it bizarrely even works when you try:
SELECT TOP 1000 dbota.ble.* FROM dbo.table
Maybe table names are constructed from a naive concatenation of prefix and base name.
't' + 'est' == 'test'
And maybe in the later versions of SQL Server, the distinction was made more semantic/more rigorously.
{ owner = t, table = est } != { table = test }
SQL Server 2005 and up has a "proper" implementation of schemas. SQL 2000 and earlier did not. The details escape me (it's been years since I used SQL 2000); all I recall clearly is that you'd be nuts to create anything that wasn't owned by "dbo". It all ties into users and object ownership, but the 2000 and earlier model was pretty confusticated. Hopefully someone will read up on BOL, do some experimentation, and post their results here.
S-SQL reference manual:
"[dot] Can be used to combine multiple names into a name of the form A.B to refer to a column in a table, or a table in a schema. Note that you can also just use a symbol with a dot in it."
So I think if you referenced tblTest as tblT.est it would work OK as long as there isn't a column called 'est' in tblTest.
If it can't find a column name referenced with the dot I imagine it checks the parent of the object.
I found a reference to it being a bug
Note: as a result of a comparison
algorithm bug in SQL Server 2000, dot
symbols themselves have no effect on
matching, so "dbo.t" will successfully
match with tables "dbot", "d.b.o.t",
etc
from Link
It's been fixed in SQL Server 2005. Same link > Changes introduced in SQL Server 2005
Dot-related comparison bug has been fixed.
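The behaviour described in that note can be modelled in a few lines. This is only a toy sketch in Python of what "dots have no effect on matching" would mean. The function names are made up, and it does not claim to reproduce SQL Server's actual algorithm.

```python
def dot_insensitive_match(reference: str, actual: str) -> bool:
    """Toy model of the described bug: dots are ignored when comparing names."""
    return reference.replace(".", "") == actual.replace(".", "")


def part_wise_match(reference: str, actual: str) -> bool:
    """Toy model of a stricter check: names must agree part by part."""
    return reference.split(".") == actual.split(".")


# Prefixes from the queries in this thread, compared with the real table name.
for ref, table in [("t.est", "test"), ("t.e.st", "test"), ("dbota.ble", "dbo.table")]:
    print(ref, table, dot_insensitive_match(ref, table), part_wise_match(ref, table))
```

Under the first rule all three prefixes from the question match their tables, and under the second none of them do, which is consistent with the queries running on SQL Server 2000 and failing on later versions.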
Is it in the "Open table" view of SSMS or via Enterprise Manager or via an SSMS Query Window?
There is/was a SQL Server 2005 issue with SSMS so how you run the query affects how it behaves.
This is a bug.
It has to do with the internal representation of column names in SQL Server 2000, which leaked out.
You will also not be able to create a table column whose table+column concatenation collides with that of another column. For example, if you have tables User and UserDetail, you won't be able to have columns DetailAge and Age in these tables, respectively.