Fast search within strings in PostgreSQL - postgresql

Which is the fastest way to search within string in PostgreSQL (case insensivity):
SELECT col FROM table WHERE another_col ILIKE '%typed%'
or
SELECT col FROM table WHERE another_col ~* 'typed'
How can I turn on showing the time which query need to return results? Something like is on default in mySQL (I am thinking about CLI client).

Both queries are the same, PostgreSQL rewrites ILIKE to ~*. Check the results from EXPLAIN to see this behaviour.
I'm not sure about your question, but the psql-client can show you some timing of the query, using \timing.

Regarding the timing:
One solution is to use the switch for psql that Frank has already mentioned.
When you use EXPLAIN ANALZYE it also includes the total runtime of the query on the server.
I prefer this when comparing the runtime for different versions of a query as it removes the network from the equation.

Related

Is there such a thing as a "PostgreSQL-SQL-to-OtherDB-SQL" converter in PHP, or at all?

I use PostgreSQL exclusively. I have no plans to ever change this. However, I recognize that other people are not me, and they instead use MySQL, MS SQL, IBM SQL, SQLite SQL, Oracle SQL and ManyOthers SQL. I'm aware that they have different names in reality.
My queries look like:
SELECT * FROM table WHERE id = $1;
UPDATE table SET col = $1 WHERE col = $2;
INSERT INTO table (a, b, c) VALUES ($1, $2, $3);
My database wrapper functions currently support only PostgreSQL, by internally calling the pg_* functions.
I wish to support "the other databases" too. This would involve (the trivial part) to make my wrapper functions able to interact with the other databases by using the PHP functions for those.
The difficult part is to reconstruct the PostgreSQL-flavor SQL queries from the application into something that works identically yet will be understood by the other SQL database in use, such as MySQL. This obviously involves highly advanced parsing, analysis and final creation of the final query string. For example, this PostgreSQL SQL query:
SELECT * FROM table WHERE col ILIKE $1 ORDER BY random() LIMIT 1;
... will be turned into WeirdSQL like this:
SELECT * FROM table WHERE col ISEQUALTOKINDA %1 ORDER BY rnd() LIMIT 1;
I don't require support from any other input SQL flavor than PostgreSQL, but the output must be "all the big SQL database vendors".
Has anyone even attempted this? Or is it something that is never gonna happen as free software but might exist as a commercial offering? It sounds like it would be a thing. It would be insanely useful, and "crazier" projects have been attempted.
jOOQ is a Java library that aims to hide differences between databases. It has its own SQL grammar which tries to be compatible with everything (but parameter markers must be the JDBC ?), and generates DB-specific SQL from that.
There is an online translator, which generates the following from your query for Oracle:
select *
from table
where lower(cast(col as varchar2(4000))) like lower(cast(:1 as varchar2(4000)))
order by DBMS_RANDOM.RANDOM
fetch next 1 rows only
ODBC uses its own syntax on top on the database's syntax. ODBC drivers are required to convert ODBC parameter markers (?) to whatever the database uses, and to translate escape sequences for certain elements that are likely to have a non-standard syntax in the DB (time/GUID/interval literals, LIKE escape character, outer joins, procedure calls, function calls).
However, most escape sequences are optional, and this does not help with other syntax differences, such as the LIMIT 1.
ODBC drivers provide a long list of information about SQL syntax details, but it is the application's job to construct queries that conform to those restrictions, and not all differences can be described by this list. In practice, most ODBC applications restrict themselves to a commonly supported subset of SQL.

Getting a Result Set's Column Names via T-SQL

Is there a way to get the column names that an arbitrary query will return using just T-SQL that works with pre-2012 versions of Microsoft SQL Server?
What Doesn't Work:
sys.columns and INFORMATION_SCHEMA.COLUMNS work great for obtaining the column list for tables or views but don't work with arbitrary queries.
sys.dm_exec_describe_first_result would be perfect except that this management function was added in SQL Server 2012. What I'm writing needs to be backwards compatible to SQL Server 2005.
A custom CLR function could easily provide this information but introduces deployment complexities on the server side. I'd rather not go this route.
Any ideas?
So long as the arbitrary query qualifies to be used as a nested query (i.e. no CTEs, unique column names, etc.), this can be achieved by loading the query's metadata into a temp table, then retrieving column details via sys.tables:
SELECT TOP 0 * INTO #t FROM (query goes here) q
SELECT name FROM tempdb.sys.columns WHERE object_id = OBJECT_ID('tempdb..#t')
DROP TABLE #t
Thanks to #MartinSmith's for suggesting this approach!

SQL Server OpenQuery() behaving differently then a direct query from TOAD

The following query works efficiently when run directly against Oracle 11 using TOAD (with native Oracle drivers)
select ... from ... where ...
and srvg_ocd in (
select ocd
from rptofc
where eff_endt = to_date('12/31/9999','mm/dd/yyyy')
and rgn_nm = 'Boston'
) ...
;
The exact same query "never" returns if passed from SQL Server 2008 to the same Oracle database via openquery(). SQL Server has a link to the Oracle database using an Oracle Provider OLE DB driver.
select * from openquery( servername, '
select ... from ... where ...
and srvg_ocd in (
select ocd
from rptofc
where eff_endt = to_date(''12/31/9999'',''mm/dd/yyyy'')
and rgn_nm = ''Boston''
) ...
');
The query doesn't return in a reasonable amount of time, and the user kills the query. I don't know if it would eventually return with the correct result.
This result where the direct TOAD query works efficiently and the openquery() version "never" returns is reproducible.
A small modification to the openquery() gives the correct efficient result: Change eff_endt to trunc(eff_endt).
That is well and good, but it doesn't seem like the change should be necessary.
openquery() is supposed to be pass through, so how can there be a difference between the TOAD and openquery() behavior?
The reason we care is because we frequently develop complex queries with TOAD directly accessing Oracle. Once we have the query functioning and optimized, we convert it to an openquery() string for use in a SQL Server application. It is extremely aggravating to have a query suddenly fail with openquery() when we know it worked as a direct query. Then we have to search for a work-around through trial and error.
I would like to see the Oracle trace files for the two scenarios, but the Oracle server is within another organization, and we are not getting cooperation from the Oracle DBAs.
Does anyone know of any driver, or TOAD, or ??? issues that could account for the discrepancy? Is there any way to eliminate the problem such that both methods always give the same result?
I know you asked this a while ago but I just came across your question.
I agree, they should be the same. Obviously there is a difference. We need to find out where the difference is.
I am thinking out loud as I type...
What happens if you specify just a few column instead of select * from openquery?
How many rows are supposed to be returned?
What if, in the oracle select, you limit the returned rows?
How quickly does the openquery timeout?
Are TOAD and SS on the same machine? Are you RDPing into the SS and running toad from there?
Are they using the same drivers? including bit? (32/64) version?
Are they using the same account on oracle?
It is interesting that using the trunc() makes a difference. I assume [eff_endt] is one of the returned fields?
I am wondering if SS is getting all the rows back but it is choking on doing the date conversions. The date type in oracle may need to be converted to a ss date type before ss shows it to you.
What if you insert the rows from the openquery into a table where the date field is just a (n)varchar. I am thinking ss might just dump the date it is getting back from oracle into that text field without trying to convert it.
something like:
insert into mytable(f1,f2,f3,datetimeX)
select f1,f2,f3,datetimeX from openquery( servername, '
select f1,f2,f3,datetimeX from ... where ...
and srvg_ocd in (
select ocd
from rptofc
where eff_endt = to_date(''12/31/9999'',''mm/dd/yyyy'')
and rgn_nm = ''Boston''
) ...
');
What if toad or ss is modifying the query statement before sending it to oracle. You could fire up wireshark and see what toad and ss are actually sending.
I would be very curious if you get this resolved. I link ss to oracle often and have not run into this issue.
Here are basic things you can check for to see what the database is doing after it receives the query. First, check that the execution plans are the same in TOAD as when the query runs using openquery. You could plan the query yourself in TOAD using:
explain plan set statement_id = 'openquery_test' for <your query here>;
select *
from table(dbms_xplan.display(statement_id => 'openquery_test';
then have someone initiate the query using openquery() and have someone with permissions to view v$ tables to run:
select sql_id from v$session where username = '<user running the query>';
(If there's more than one connection with the same user, you'll have to find an additional attribute to isolate the row representing the session running the query.)
select *
from table(dbms_xplan.display_cursor('<value from query above'));
If those look the same then I'd move on to checking database waits and see what it's stuck on.
select se.username
, sw.event
, sw.p1text
, sw.p2text
, sw.p3text
, sw.wait_time_micro/1000000 as seconds_in_wait
, sw.state
, sw.time_since_last_wait_micro/1000000 as seconds_since_last_wait
from v$session se
inner join
v$session_wait sw
on se.sid = sw.sid
where se.username = '<user running the query>'
;
(again, if there's more than one session with the same username, you'll need another attribute to whittle it down to the one you're interested in.)
If the plans are different, then you need to find out why, or if they're the same, look into what it's waiting on (e.g. SQL*Net message to client ?) and why.
I noticed a difference using OLEDB through MS Access (2013) connecting to Oracle 10g & 11g tables, in that it did not always recognize indexes or primary keys on the Oracle tables properly. The same query through an MS Access 2000 database (using odbc) worked fine / had no problem with indexes & keys. The only way I found to fix the OLEDB version was to include all of the key fields in the SELECT -- which was not a satisfying answer, but it's all I could find. This might be an option to try through SSMS / OpenQuery(...) as well.
Besides that... you can try some alternatives to OPENQUERY, such as:
4-part names: SELECT ... FROM Server..Schema.Table
Execute AT: EXEC ('select...') at linked server
But as for why the OLEDB provider works differently than the native Oracle Provider -- the providers are not identical, and the native provider would be more likely to pave-over Oracle quirks than the more generic OLEDB provider would.

Selecting 2 tables in the same editor

In SQL I can write 2 separate SELECT statements and execute them at the same time, such as:
SELECT * FROM Table_Name
SELECT * FROM Another_Table
Is it possible to do this in PostgreSQL?
It sounds to me like you're really asking a PGAdmin question. In SQL Server's query analyzer / Enterprise Manager if you execute two select statements, two result sets will be displayed in the output area.
PGAdmin doesn't do this; you need two separate query windows to see both result sets. I know of no other free GUI admin tool for PostgreSQL. Maybe someone else will.
EDIT:
I've recently started using Squirrel-SQL and DBeaver. Either of these may be more what you're looking for. I prefer DBeaver myself.
Brian

Is it possible to use CASE with IN?

I'm trying to construct a T-SQL statement with a WHERE clause determined by an input parameter. Something like:
SELECT * FROM table
WHERE id IN
CASE WHEN #param THEN
(1,2,4,5,8)
ELSE
(9,7,3)
END
I've tried all combination of moving the IN, CASE etc around that I can think of. Is this (or something like it) possible?
try this:
SELECT * FROM table
WHERE (#param='??' AND id IN (1,2,4,5,8))
OR (#param!='??' AND id in (9,7,3))
this will have a problem using an index.
The key with a dynamic search conditions is to make sure an index is used, instead of how can I easily reuse code, eliminate duplications in a query, or try to do everything with the same query. Here is a very comprehensive article on how to handle this topic:
Dynamic Search Conditions in T-SQL by Erland Sommarskog
It covers all the issues and methods of trying to write queries with multiple optional search conditions. This main thing you need to be concerned with is not the duplication of code, but the use of an index. If your query fails to use an index, it will preform poorly. There are several techniques that can be used, which may or may not allow an index to be used.
here is the table of contents:
Introduction
The Case Study: Searching Orders
The Northgale Database
Dynamic SQL
Introduction
Using sp_executesql
Using the CLR
Using EXEC()
When Caching Is Not Really What You Want
Static SQL
Introduction
x = #x OR #x IS NULL
Using IF statements
Umachandar's Bag of Tricks
Using Temp Tables
x = #x AND #x IS NOT NULL
Handling Complex Conditions
Hybrid Solutions – Using both Static and Dynamic SQL
Using Views
Using Inline Table Functions
Conclusion
Feedback and Acknowledgements
Revision History
if you are on the proper version of SQL Server 2008, there is an additional technique that can be used, see: Dynamic Search Conditions in T-SQL Version for SQL 2008 (SP1 CU5 and later)
If you are on that proper release of SQL Server 2008, you can just add OPTION (RECOMPILE) to the query and the local variable's value at run time is used for the optimizations.
Consider this, OPTION (RECOMPILE) will take this code (where no index can be used with this mess of ORs):
WHERE
(#search1 IS NULL or Column1=#Search1)
AND (#search2 IS NULL or Column2=#Search2)
AND (#search3 IS NULL or Column3=#Search3)
and optimize it at run time to be (provided that only #Search2 was passed in with a value):
WHERE
Column2=#Search2
and an index can be used (if you have one defined on Column2)
if #param = 'whatever'
select * from tbl where id in (1,2,4,5,8)
else
select * from tbl where id in (9,7,3)