Is there any significant difference between using SELECT ... FROM ... INTO syntax instead of the standard SELECT ... INTO ... FROM? - postgresql

I was creating a function following an example from a database class which included the creation of a temporary variable (base_salary) and using a SELECT INTO to calculate its value later.
However, I did not realize I used a different order for the syntax (SELECT ... FROM ... INTO base_salary) and the function could be used later without any visible issues (values worked as expected).
Is there any difference in using the "SELECT ... FROM ... INTO" syntax order? I tried looking for it in the PostgreSQL documentation but found nothing, and a Google search did not provide any meaningful information either. The only thing I found related to it was in the MySQL documentation, which only mentions support for the different order in an older version.

There is no difference. From the PL/pgSQL docs:
"The INTO clause can appear almost anywhere in the SQL command. Customarily it is written either just before or just after the list of select_expressions in a SELECT command, or at the end of the command for other command types. It is recommended that you follow this convention in case the PL/pgSQL parser becomes stricter in future versions."
Notice that in (non-procedural) SQL there is also a SELECT INTO command, which works like CREATE TABLE AS; in that variant, the INTO must come right after the SELECT clause.
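For illustration, a minimal sketch (hypothetical employees table, not from the question) showing that the PL/pgSQL parser currently accepts both orderings:

CREATE FUNCTION get_base_salary(emp_id integer) RETURNS numeric AS $$
DECLARE
    base_salary numeric;
BEGIN
    -- conventional placement, as the docs recommend
    SELECT salary INTO base_salary FROM employees WHERE id = emp_id;
    -- the ordering from the question; currently parsed the same way
    SELECT salary FROM employees WHERE id = emp_id INTO base_salary;
    RETURN base_salary;
END;
$$ LANGUAGE plpgsql;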

I always use SELECT ... INTO ... FROM; I believe that is the standard supported notation:
https://www.w3schools.com/sql/sql_select_into.asp
I would recommend sticking to this ordering, in case future updates drop support for the other one, as you mentioned...

Related

Oracle DB link - where clause evaluation

I have a DB2 data source and an Oracle 12c target.
The Oracle has a DB link to the DB2 defined, which is working in general.
Now I have a huge table in the DB2 which has a timestamp column (let's call it ROW_CHANGED) for row changes. I want to retrieve rows which have changed after a particular time.
Running
SELECT * FROM lib.tbl WHERE ROW_CHANGED > '2016-08-01 10:00:00'
on the DB2 returns exactly 1 row after ca. 90 seconds, which is fine.
Now I try the same query from the Oracle via the DB link:
SELECT * FROM lib.tbl@dblink_name WHERE ROW_CHANGED > TO_TIMESTAMP('2016-08-01 10:00:00')
This runs for hours and ends up in a timeout.
I read some Oracle docs and found distributed query optimization tips, but most of them refer to joining a local to a remote table, which is not my case.
In my desperation, I have tried the DRIVING_SITE hint, without effect.
Now I wonder when the WHERE part of the query gets evaluated. Since I have to use Oracle syntax rather than DB2 syntax for the query, is it possible that Oracle first copies the full table and applies the WHERE clause afterwards? I did some research but did not find anything that would help me in this direction.
The ROW_CHANGED is a hidden column in the DB2, if that matters.
Thanks for any hint in advance.
Update
Thanks @all for the help. I'll share what did the trick for me.
First of all, I had used TO_TIMESTAMP since the DB2 column is also a timestamp (not a date), and I expected to circumvent implicit conversions by this.
Without the explicit conversion I ran into "ORA-28534: Heterogeneous Services preprocessing error", and I have no hope of touching the DB config within a reasonable time.
The explain plan, by the way, did not bring much. It showed a FULL hint and no conversion on the predicates. It did show the ROW_CHANGED column as a date, though; I wonder why.
I have tried Justin's suggestion to use a bind variable, but I got ORA-28534 again. The next thing I did was to wrap it into a PL/SQL block (it will run in a stored procedure anyway later).
declare
    v_tmstmp TIMESTAMP := '01.08.16 10:00:00';
begin
    INSERT INTO ORAUSER.TMP_TBL (SRC_PK, ROW_CHANGED)
    SELECT SRC_PK, ROW_CHANGED
    FROM lib.tbl@dblink_name
    WHERE ROW_CHANGED > v_tmstmp;
end;
This executed in the same time as in DB2 itself. The date format is DD.MM.YY here since that is, unfortunately, the default.
When changing the variable assignment to
v_tmstmp TIMESTAMP := TO_TIMESTAMP('01.08.16 10:00:00','DD.MM.YY HH24:MI:SS');
I got the same problem as before.
Meanwhile, the DB2 operators have created an index on the ROW_CHANGED column, which I had requested earlier that day. This seems to have solved the problem in general. Even my original query finishes in no time now.
If you are actually using an Oracle-specific conversion function like to_timestamp, that forces the predicate to be evaluated on the Oracle side. Oracle isn't going to know how to convert a built-in function like to_timestamp into an exactly equivalent function call in DB2.
If you used a bind variable, that would be more likely to get evaluated on the DB2 side. But that may be complicated by the data type mapping between the different databases: there may not be a perfect mapping between one engine's date and another engine's timestamp data type. If this were a numeric column, a bind variable would be almost certain to get pushed. In this case, it probably involves playing around a bit to figure out exactly what data type to use for your variable that works for your framework, Oracle, and DB2.
If using a bind variable doesn't work, you can force the predicate to be evaluated on the remote server using the dbms_hs_passthrough package. That lets you send a query verbatim to the remote server which allows you to do things like use functions defined in your DB2 database. That's a bit of overkill in this situation, hopefully, but it's nice to have the hammer as your backup if the simpler solution doesn't work quickly enough.
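For reference, a minimal sketch of dbms_hs_passthrough, reusing the table and link names from the question and assuming SRC_PK is numeric; the quoted statement is parsed by DB2, so DB2 syntax and literal formats apply:

DECLARE
    c    PLS_INTEGER;
    nr   PLS_INTEGER;
    v_pk NUMBER;  -- assuming SRC_PK is numeric
BEGIN
    c := DBMS_HS_PASSTHROUGH.OPEN_CURSOR@dblink_name;
    -- this statement text is sent verbatim to DB2
    DBMS_HS_PASSTHROUGH.PARSE@dblink_name(c,
        'SELECT SRC_PK FROM lib.tbl WHERE ROW_CHANGED > ''2016-08-01 10:00:00''');
    LOOP
        nr := DBMS_HS_PASSTHROUGH.FETCH_ROW@dblink_name(c);
        EXIT WHEN nr = 0;
        DBMS_HS_PASSTHROUGH.GET_VALUE@dblink_name(c, 1, v_pk);
        -- process v_pk here, e.g. insert it into a local staging table
    END LOOP;
    DBMS_HS_PASSTHROUGH.CLOSE_CURSOR@dblink_name(c);
END;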

WITH... UPDATE in PostgreSQL

I'm trying to update a table (in pgsql) with a complex expression that needs to occur several times in the UPDATE statement. WITH seems perfect for this:
WITH newtz AS (SELECT timezone FROM timezonebyzipcode WHERE zip=(SELECT zip_code FROM company WHERE id=company_id))
UPDATE cross_rental
SET return_timezone=newtz,
return_time=(return_time AT TIME ZONE return_timezone) AT TIME ZONE newtz
WHERE return_to='Vendor' AND return_timezone<>newtz
Unfortunately, it doesn't work:
ERROR: syntax error at or near "UPDATE"
LINE 2: UPDATE cross_rental
^
I've searched and couldn't find any examples of using WITH with UPDATE in this way, but I also don't see anything indicating it shouldn't work. Is this just unsupported, or am I making some silly mistake?
And, if it's unsupported, should I just copy that nasty long expression into each of the three places where I'm using "newtz" in the UPDATE clause? Or is there some better way to accomplish this update?
ERROR: syntax error at or near "UPDATE"
LINE 2: UPDATE cross_rental
This specific error message reveals that you're using PostgreSQL 9.0 or older. The two major versions before 9.1 featured CTEs and WITH, but not in the context of data-modifying queries.
This appeared in 9.1. See 7.8.2. Data-Modifying Statements in WITH in the PostgreSQL 9.1 docs.
Assuming a newer version, a CTE must be used as a table with rows and columns (not as a scalar variable), so the query should be fixed as mentioned in Richard Huxton's answer.
It's a table (or a from-recordset-source anyway).
WITH calculated AS (
SELECT ....
)
UPDATE foo
SET bar = calculated.something
FROM calculated
WHERE ...
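Applied to the query from the question, that pattern might look like the following sketch (it assumes cross_rental has an id primary key and a company_id column to join on):

WITH newtz AS (
    SELECT cr.id, tz.timezone
    FROM cross_rental cr
    JOIN company c ON c.id = cr.company_id
    JOIN timezonebyzipcode tz ON tz.zip = c.zip_code
)
UPDATE cross_rental
SET return_timezone = newtz.timezone,
    return_time = (return_time AT TIME ZONE return_timezone) AT TIME ZONE newtz.timezone
FROM newtz
WHERE cross_rental.id = newtz.id
  AND return_to = 'Vendor'
  AND return_timezone <> newtz.timezone;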
WITH (SELECT timezone FROM timezonebyzipcode WHERE zip=(SELECT zip_code FROM company WHERE id=company_id)) AS newtz
You have the alias and referent backwards...

using WITH clause for PostgreSQL?

I was planning to use the WITH clause with PostgreSQL, but it doesn't seem to support the command. Is there a substitute command?
What I want to do is, with one query, select several sub-resultsets and use parts of them to create my final SELECT.
That would have been easy using the WITH clause.
UPDATE:
Oops! I discovered that I had misunderstood the error message I got; PostgreSQL does support WITH.
PostgreSQL supports common table expressions (WITH queries) in versions 8.4 and above. See Common Table Expressions in the manual.
You should really include your PostgreSQL version, the exact text of the error message, and the exact text of any query you ran in your question. Where practical/relevant also include table definitions, sample data, and expected results.
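For the record, a minimal sketch of the pattern described in the question, using two hypothetical sub-resultsets combined in a final SELECT (table names are made up):

WITH big_orders AS (
    SELECT customer_id, total FROM orders WHERE total > 100
), active_customers AS (
    SELECT id, name FROM customers WHERE active
)
SELECT ac.name, bo.total
FROM active_customers ac
JOIN big_orders bo ON bo.customer_id = ac.id;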

SQLAlchemy and Postgresql: to_tsquery()

Does anyone know how to use the to_tsquery() function of PostgreSQL in SQLAlchemy? I searched a lot on Google but didn't find anything I could understand. Please help.
I was hoping it would be available on the filter function like this:
session.query(TableName).filter(Table.column_name.to_tsquery(search_string)).all()
The expected SQL for the above query is something like this:
SELECT column_name
FROM table_name t
WHERE t.column_name @@ to_tsquery(:search_string)
The .op() method allows you to generate SQL for arbitrary operators (func here is sqlalchemy.func):
session.query(TableName).filter(
    Table.c.column_name.op('@@')(func.to_tsquery(search_string))
).all()
For these types of arbitrary queries, you can embed the SQL directly into your query:
session.query(TableName).\
    filter("t.column_name @@ to_tsquery(:search_string)").\
    params(search_string=search_string).all()
You should also be able to parameterize t.column_name, but can't see the docs for that just now.
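Note that in newer SQLAlchemy versions, plain-string filters are deprecated and eventually rejected; the fragment has to be wrapped in text(). A sketch under that assumption:

from sqlalchemy import text

session.query(TableName).\
    filter(text("t.column_name @@ to_tsquery(:search_string)")).\
    params(search_string=search_string).all()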
This may have been added recently but worth adding as a more standard solution.
query.filter(Model.attribute.match('your search string'))
does it for you as it looks for the right operation available for your dialect.
See the official docs:
https://docs.sqlalchemy.org/en/13/dialects/postgresql.html#full-text-search
Of course, this assumes the table you are querying is a view built with a to_tsvector attribute to apply the @@ operation to.
My fifty cents in 2021, following the docs:
None of the previous answers mention how to cast the text column the way Postgres does it: to_tsvector('english', column). I have a text column indexed as tsvector, and this is the way:
select(mytable.c.id).where(
    func.to_tsvector('english', mytable.c.title)
    .match('somestring', postgresql_regconfig='english')
)
In my case, I didn't want to use to_tsquery, which ".match" forces; websearch_to_tsquery is more intuitive when you have a search input like Stack Overflow's. So I made a mix of this and jd's response.
I finally applied it as a .filter() with the following statement (note that websearch_to_tsquery takes the regconfig as its first argument):
func.to_tsvector('english', Table.column_name)\
    .op('@@')(func.websearch_to_tsquery('english', "string to search"))
I think this is the general formula, and it applies to to_tsquery too.
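Putting it together, a self-contained sketch of that filter (hypothetical Article model and session; websearch_to_tsquery requires PostgreSQL 11+):

from sqlalchemy import func

results = (
    session.query(Article)
    .filter(
        # build tsvector @@ tsquery on the fly; an expression index on
        # to_tsvector('english', article.title) would make this fast
        func.to_tsvector('english', Article.title)
        .op('@@')(func.websearch_to_tsquery('english', 'string to search'))
    )
    .all()
)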

Sqlalchemy with postgres. Try to get 'DISTINCT ON' instead of 'DISTINCT'

I need to generate a query like this:
SELECT DISTINCT ON (article.code) article.code, article.title
First I tried to do it via the ORM distinct method, sending it a list of fields, but it won't work. Second, I tried it via sqlalchemy.sql.select, and it also generates an SQL query like this:
SELECT DISTINCT article.code, article.title
I need SELECT DISTINCT ON (article.code)...
I looked at the source code and found code for generating 'DISTINCT ON' constructions in sqlalchemy.dialects.postgresql.base.PGCompiler.get_select_precolumns.
But this method does not get called. Instead, another method is called, sqlalchemy.sql.compiler.get_select_precolumns, which has code only for generating DISTINCT, not DISTINCT ON. Maybe I should configure my session so that the proper method is called?
This bug report suggests that DISTINCT ON works correctly in SQLAlchemy 0.7+. I think an upgrade is in order, unless you've uncovered a bug in 0.7.
Workarounds:
- Volunteer to help get the 0.7 package ready for Ubuntu.
- Download and install from source.
- Rewrite queries to avoid DISTINCT ON. I'm not sure whether that's possible in the most general case.
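For reference, in 0.7+ the query from the question can be produced by passing expressions to distinct(); a minimal sketch assuming an Article model mapped to the article table:

# Renders SELECT DISTINCT ON (article.code) article.code, article.title
# on the PostgreSQL dialect.
session.query(Article.code, Article.title).\
    distinct(Article.code).all()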