Why doesn't a subquery work in the ON clause in DB2? - db2

Why does this simple query work fine in Oracle but fail in DB2?
select *
from
sysibm.dual d1
left join sysibm.dual d2 on 1=1 and exists (select 1 from sysibm.dual)
Moving the subquery condition to the WHERE clause may help, but that effectively turns the outer join into an inner join.

When I try to run your query, I get a -338 error. According to the Information Center (see link), the ON clause has the following restrictions:
An ON clause associated with a JOIN operator or in a MERGE statement
is not valid for one of the following reasons.
* The ON clause cannot include any subqueries.
* Column references in an ON clause must only reference columns
of tables that are in the scope of the ON clause.
* Scalar fullselects are not allowed in the expressions of an ON clause.
* A function referenced in an ON clause of a full outer join
must be deterministic and have no external action.
* A dereference operation (->) cannot be used.
* A SQL function or SQL method cannot be used.
* The ON clause cannot include an XMLQUERY or XMLEXISTS expression.
I'm not sure if it's possible with your real query, but could you perhaps rewrite it along these lines?
select *
from
sysibm.dual d1
left join (
SELECT dl.*,
CASE WHEN EXISTS (SELECT 1 FROM sysibm.dual)
THEN 1
ELSE 0
END AS jn
FROM sysibm.dual dl
) D2
on 1=1 and 1=d2.jn

This works in DB2 V10.1!
No fixpack installed.
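To see why the derived-table rewrite keeps outer-join semantics while a WHERE filter does not, here is a minimal sketch. It uses SQLite through Python as a stand-in, since SQLite accepts a subquery in ON and so all three forms can be compared; it says nothing about DB2's own behaviour. The tables d1, d2 and flags are made up for the illustration.

```python
import sqlite3

# SQLite is only a stand-in for DB2 here; table names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE d1 (id INTEGER);
    CREATE TABLE d2 (id INTEGER);
    CREATE TABLE flags (id INTEGER);   -- left empty, so EXISTS is false
    INSERT INTO d1 VALUES (1);
    INSERT INTO d2 VALUES (10);
""")

# Condition in ON: the d1 row survives, padded with NULL for d2.
in_on = conn.execute("""
    SELECT d1.id, d2.id FROM d1
    LEFT JOIN d2 ON 1 = 1 AND EXISTS (SELECT 1 FROM flags)
""").fetchall()

# Condition in WHERE: the d1 row is filtered out entirely.
in_where = conn.execute("""
    SELECT d1.id, d2.id FROM d1
    LEFT JOIN d2 ON 1 = 1
    WHERE EXISTS (SELECT 1 FROM flags)
""").fetchall()

# Derived-table workaround: compute the flag first, then join on it.
derived = conn.execute("""
    SELECT d1.id, x.id FROM d1
    LEFT JOIN (
        SELECT d2.*,
               CASE WHEN EXISTS (SELECT 1 FROM flags) THEN 1 ELSE 0 END AS jn
        FROM d2
    ) x ON 1 = 1 AND x.jn = 1
""").fetchall()

print(in_on, in_where, derived)
```

The derived-table form returns the same rows as the ON form, whereas the WHERE form loses the unmatched left row.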

PostgreSQL reusing computed result as input to other select computations

Is there any way we can take a computed result inside the select clause and insert it into another computation inside the select clause?
For example this is what I want to have but can't so far:
select trim(leading 'https://www.amazon.com' from url) as trimmedURL,
substring(trimmedURL from position('/' in trimmedURL) for position('html' in trimmedURL))....
As you can see, I have used trimmedURL three times inside the substring function. I know the naive way: copy and paste trim(leading 'https://www.amazon.com' from url) into the substring function.
Is there any way to avoid that, so the function calls don't grow very large when the first computed value is used many times inside other functions? That would improve code readability and usability.
You could use a lateral join and place the computed fields in the lateral subquery. The lateral fields are then accessible from the main query.
Postgres documentation for lateral join
For example:
SELECT
trimmedUrl
, SUBSTRING(trimmedURL,10,20) url_part
FROM mytable
LEFT JOIN LATERAL (SELECT trim(leading 'https://www.amazon.com' from url) as trimmedURL) trmd
ON TRUE
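Also note that PostgreSQL folds unquoted column and table names to lowercase, so trimmedUrl and trimmedURL refer to the same column. Be aware too that trim(leading ... from ...) strips any of the listed characters rather than a literal prefix, so it can remove more than intended.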
Here's a self-contained example:
WITH x(col) AS (Values ('abc://cdf/def'), ('abc://xyz/pqr'))
SELECT x.col, SUBSTRING(y.col2 from position('/' in y.col2)) reusing_computation
FROM x
LEFT JOIN LATERAL (SELECT trim(leading 'abc://' from col) col2) y ON TRUE
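The same idea of computing a value once and reusing it by name can also be sketched with a plain derived table (a subquery in FROM). This swaps LATERAL for a derived table because the sketch runs on SQLite through Python, which has no LATERAL; the table name pages and the substr-based prefix removal are assumptions made for the illustration, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (url TEXT);   -- hypothetical table
    INSERT INTO pages VALUES ('abc://cdf/def'), ('abc://xyz/pqr');
""")

# The inner query computes "trimmed" once; the outer query can then
# reference it by name as often as it likes.
rows = conn.execute("""
    SELECT t.url,
           t.trimmed,
           substr(t.trimmed, instr(t.trimmed, '/')) AS path
    FROM (
        SELECT url,
               substr(url, length('abc://') + 1) AS trimmed  -- drop the literal prefix
        FROM pages
    ) AS t
    ORDER BY t.url
""").fetchall()

for row in rows:
    print(row)
```

A derived table is enough when the computed value depends on a single table; LATERAL becomes useful when the subquery must refer to columns of tables listed earlier in the FROM clause.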

What is the execution order of a query with sub queries?

Consider this query
select *
from documents d
where exists (select 1 as [1]
from (
select *
from (
select *
from ProductMediaDocuments
where d.id = MediaDocuments_Id
) as [dummy1]
) as [s2]
where exists(
select *
from ProductSkus psk
where psk.Product_Id = s2.MediaProducts_Id
)
)
Could someone tell me how SQL Server processes this? When a statement appears in parentheses, I would expect it to execute first. But does that also apply to the statement above? I don't think so, because the subqueries need values from the outer queries. So how does this work under the hood?
That's completely up to the database engine.
Since SQL is a declarative language, you specify WHAT you want, but the HOW is up to the DB engine and depends on many factors, such as which indexes exist, their type and fragmentation, row cardinality, and statistics.
That's just to mention a few; the list goes on.
Of course you can look at the execution plan, but the point is that you can't know HOW it will be executed just by reading the query.
The execution plan will tell you what the engine actually does. That is, the physical processing order. AFAIK, the query planner will rewrite your query if it finds a better way to express it to itself or the engine. If your question is, "Why is my query not working the way I think it should?", then that is where you should start.
The doc says the logical processing order is:
FROM
ON
JOIN
WHERE
GROUP BY
WITH CUBE or WITH ROLLUP
HAVING
SELECT
DISTINCT
ORDER BY
TOP
It also has this note:
The [preceding] steps show the logical processing order, or binding order, for a SELECT statement. This order determines when the objects defined in one step are made available to the clauses in subsequent steps. For example, if the query processor can bind to (access) the tables or views defined in the FROM clause, these objects and their columns are made available to all subsequent steps. Conversely, because the SELECT clause is step 8, any column aliases or derived columns defined in that clause cannot be referenced by preceding clauses. However, they can be referenced by subsequent clauses such as the ORDER BY clause. Note that the actual physical execution of the statement is determined by the query processor and the order may vary from this list.
FROM would include inline views (subqueries) or CTE aliases. Each time it finds a subquery, it should start over from the beginning and evaluate that query.
I simplified your code a bit.
SELECT *
FROM documents d
WHERE EXISTS ( SELECT 1
FROM ProductMediaDocuments s2
WHERE d.id = s2.MediaDocuments_Id
AND EXISTS (
SELECT *
FROM ProductSkus psk
WHERE psk.Product_Id = s2.MediaProducts_Id
)
)
I think this code is clearer, don't you?
SELECT d.*
FROM documents d
JOIN ProductMediaDocuments s2 ON d.id = s2.MediaDocuments_Id
JOIN ProductSkus psk ON psk.Product_Id = s2.MediaProducts_Id
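One caveat about the JOIN rewrite above: it is only equivalent to the EXISTS version when each document matches at most one row. A minimal sketch, using SQLite through Python as a stand-in for SQL Server and made-up sample data, shows a document coming back twice from the JOIN but once from EXISTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER);
    CREATE TABLE ProductMediaDocuments (MediaDocuments_Id INTEGER,
                                        MediaProducts_Id INTEGER);
    CREATE TABLE ProductSkus (Product_Id INTEGER);
    -- Sample data is invented: product 100 has two SKUs.
    INSERT INTO documents VALUES (1), (2);
    INSERT INTO ProductMediaDocuments VALUES (1, 100);
    INSERT INTO ProductSkus VALUES (100), (100);
""")

# EXISTS only asks whether a match exists, so each document appears once.
exists_rows = conn.execute("""
    SELECT d.id FROM documents d
    WHERE EXISTS (SELECT 1
                  FROM ProductMediaDocuments s2
                  WHERE d.id = s2.MediaDocuments_Id
                    AND EXISTS (SELECT 1 FROM ProductSkus psk
                                WHERE psk.Product_Id = s2.MediaProducts_Id))
""").fetchall()

# JOIN produces one output row per matching combination.
join_rows = conn.execute("""
    SELECT d.id FROM documents d
    JOIN ProductMediaDocuments s2 ON d.id = s2.MediaDocuments_Id
    JOIN ProductSkus psk ON psk.Product_Id = s2.MediaProducts_Id
""").fetchall()

print(exists_rows, join_rows)
```

If duplicates are possible in the real data, either keep EXISTS or add DISTINCT to the JOIN version.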

Left outer join using 2 of 3 tables in Postgresql

I need to show all clients entered into the system for a date range.
All clients are assigned to a group, but not necessarily to a staff.
When I run the query as such:
SELECT
clients.name_lastfirst_cs,
to_char (clients.date_intake,'MM/DD/YY')AS Date_Created,
clients.client_id,
clients.display_intake,
staff.staff_name_cs,
groups.name
FROM
public.clients,
public.groups,
public.staff,
public.link_group
WHERE
clients.zrud_staff = staff.zzud_staff AND
clients.zzud_client = link_group.zrud_client AND
groups.zzud_group = link_group.zrud_group AND
clients.date_intake BETWEEN (now() - '8 days'::interval)::timestamp AND now()
ORDER BY
groups.name ASC,
clients.client_id ASC,
staff.staff_name_cs ASC
I get 121 entries.
If I comment out the staff lines:
SELECT
clients.name_lastfirst_cs,
to_char (clients.date_intake,'MM/DD/YY')AS Date_Created,
clients.client_id,
clients.display_intake,
-- staff.staff_name_cs, -- Line Commented out
groups.name
FROM
public.clients,
public.groups,
public.staff,
public.link_group
WHERE
-- clients.zrud_staff = staff.zzud_staff AND --Line commented out
clients.zzud_client = link_group.zrud_client AND
groups.zzud_group = link_group.zrud_group AND
clients.date_intake BETWEEN (now() - '8 days'::interval)::timestamp AND now()
ORDER BY
groups.name ASC,
clients.client_id ASC,
staff.staff_name_cs ASC
I get 173 entries.
I know I need to do an outer join to capture all clients regardless of whether
a staff member is assigned, but each attempt has failed. I have done outer joins with
two tables, but adding a third has twisted my brain.
Thanks for any suggestions.
I have no way of testing this (or of knowing that it is right) but what I read in your query is that you want something similar to this:
SELECT --I just used short aliases. I choose something other than the table name so I know it is an alias "c" for client etc...
c.name_lastfirst_cs,
to_char (c.date_intake,'MM/DD/YY')AS Date_Created,
c.client_id,
c.display_intake,
s.staff_name_cs,
g.name,
l.zrud_client AS "link_client",--I'm selecting some data here so that I can debug later, you can just filter this out with another select if you need to
l.zrud_group AS "link_group" --Again, so I can see these relationships
FROM
public.clients c
LEFT OUTER JOIN staff s ON --is staff required? If it isn't then outer join (optional)
s.zzud_staff = c.zrud_staff --so we linked staff to clients here
LEFT OUTER JOIN public.link_group l ON --this looks like a lookup table to me so we select the lookup record
l.zrud_client = c.zzud_client -- this is how I define the lookup, a client id
LEFT OUTER JOIN public.groups g ON --then we use that to lookup a group
g.zzud_group = l.zrud_group --which is defined by this data here
WHERE -- the following must be true
c.date_intake BETWEEN (now() - '8 days'::interval)::timestamp AND now()
Now for the why: I've basically moved your WHERE conditions into JOIN x ON y = z syntax. In my experience this is a better way to write and maintain queries, as it lets you state the relationships between tables rather than doing one big join and then trying to filter that data in the WHERE clause. Keep in mind that in your original query every WHERE condition is REQUIRED, not optional, so (if I read this right; I don't have a schema in front of me) a client that is missing a link-table record OR a staff member gets filtered out. With LEFT OUTER JOIN those clients stay in the result, with NULLs in the columns of the missing table.
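The difference between the two styles can be sketched on a tiny made-up data set. This runs on SQLite through Python rather than PostgreSQL, the column lists are cut down, and the groups table is renamed client_groups for the sketch; all of that is assumption, only the zzud_/zrud_ key names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (zzud_client INTEGER, name TEXT, zrud_staff INTEGER);
    CREATE TABLE staff (zzud_staff INTEGER, staff_name TEXT);
    CREATE TABLE link_group (zrud_client INTEGER, zrud_group INTEGER);
    CREATE TABLE client_groups (zzud_group INTEGER, name TEXT);
    -- Ann has a staff member; Bob has none. Both belong to a group.
    INSERT INTO clients VALUES (1, 'Ann', 1), (2, 'Bob', NULL);
    INSERT INTO staff VALUES (1, 'Sam');
    INSERT INTO link_group VALUES (1, 1), (2, 1);
    INSERT INTO client_groups VALUES (1, 'Intake');
""")

# Implicit joins: every WHERE condition is required, so Bob disappears.
inner_rows = conn.execute("""
    SELECT c.name, s.staff_name, g.name
    FROM clients c, staff s, link_group l, client_groups g
    WHERE c.zrud_staff = s.zzud_staff
      AND c.zzud_client = l.zrud_client
      AND g.zzud_group = l.zrud_group
    ORDER BY c.name
""").fetchall()

# LEFT OUTER JOIN: Bob is kept, with NULL for the missing staff name.
left_rows = conn.execute("""
    SELECT c.name, s.staff_name, g.name
    FROM clients c
    LEFT OUTER JOIN staff s ON s.zzud_staff = c.zrud_staff
    LEFT OUTER JOIN link_group l ON l.zrud_client = c.zzud_client
    LEFT OUTER JOIN client_groups g ON g.zzud_group = l.zrud_group
    ORDER BY c.name
""").fetchall()

print(inner_rows)
print(left_rows)
```

The date filter from the question would stay in the WHERE clause, since it restricts clients rather than describing a relationship between tables.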
Alternatively (and possibly significantly slower), you can SELECT from the result of another SELECT, so you can chain filters like this:
SELECT
*
FROM
(
SELECT
*
FROM
public.clients
WHERE
x condition
) AS filtered_clients
WHERE
y condition
OR
SELECT * FROM x WHERE x.condition IN (SELECT * FROM y)
In your case this tactic probably won't be easier than a standard join syntax.
And some serious opinion here: I recommend the join syntax I outlined above. It is functionally the same as listing the tables and joining them in the WHERE clause, but as you noticed, if you miss a relationship that way you can end up with a Cartesian join: http://www.tutorialspoint.com/sql/sql-cartesian-joins.htm. Lastly, I tend to spell out the type of join I want. I write INNER JOIN and OUTER JOIN a lot in my queries because it helps the next person (usually me) figure out what the heck I meant. If the related row is optional, use an outer join; if it is required, use an inner join (the default).
Good luck! There are much better SQL developers out there and there's probably another way to do it.

postgres syntax error in sql

select * from (
select max(h.updated_datetime) as max, min(h.updated_datetime) as min from report r, report_history h, procedure_runtime_information PRI, study S
where
h.report_fk=r.pk and
r.study_fk=S.pk and
PRI.pk=S.procedure_runtime_fk and
extract(epoch from (max(h.updated_datetime) - min(h.updated_datetime) ) <=900 and
h.pk IN (
select pk from
(select * from report_history where report_fk=r.pk) as result
)
and r.status_fk =21 group by r.pk)as result1;
This is my query. I get a syntax error; can anyone help me fix it?
Thanks in advance.
As you didn't tell us what the error is, I have to guess that it's this line:
AND h.pk IN (SELECT pk FROM (SELECT * FROM report_history WHERE report_fk=r.pk) AS RESULT)
The nesting level for the where condition is "too deep" and I think it cannot see the r alias in the where clause.
But the nested select is totally useless in your case anyway, so you can rewrite that condition as:
AND h.pk IN (SELECT pk FROM report_history WHERE report_fk=r.pk)
Even if that doesn't solve your problem, it makes your query more readable.
You are also using an aggregate in the WHERE clause, which is not allowed. You have to move it to a HAVING clause:
having extract(epoch from (max(h.updated_datetime) - min(h.updated_datetime))) <=900
The HAVING clause comes after the GROUP BY.
You were also missing a closing ), but that is hard to spot because of your formatting (which I find very hard to read).
You should also get used to explicit JOIN syntax. The implicit joins in the WHERE clause are error-prone and no longer recommended.
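The WHERE versus HAVING point can be sketched in isolation. The example below runs on SQLite through Python, not PostgreSQL, and uses a made-up report_history table with timestamps stored as epoch seconds in a hypothetical updated_epoch column, so there is no extract(epoch from ...) step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified table: timestamps stored as epoch seconds.
    CREATE TABLE report_history (report_fk INTEGER, updated_epoch INTEGER);
    INSERT INTO report_history VALUES (1, 0), (1, 600), (2, 0), (2, 5000);
""")

# An aggregate in WHERE is rejected: WHERE filters rows before grouping.
try:
    conn.execute("""
        SELECT report_fk FROM report_history
        WHERE max(updated_epoch) - min(updated_epoch) <= 900
        GROUP BY report_fk
    """)
    where_rejected = False
except sqlite3.OperationalError:
    where_rejected = True

# The same condition in HAVING is evaluated once per group.
having_rows = conn.execute("""
    SELECT report_fk, max(updated_epoch) - min(updated_epoch) AS span_seconds
    FROM report_history
    GROUP BY report_fk
    HAVING max(updated_epoch) - min(updated_epoch) <= 900
""").fetchall()

print(where_rejected, having_rows)
```

Only the report whose history spans 900 seconds or less survives the HAVING filter.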

HAVING clause in PostgreSQL

I'm rewriting some MySQL queries for PostgreSQL. I have a table with articles and another table with categories. I need to select all categories that have at least one article:
SELECT c.*,(
SELECT COUNT(*)
FROM articles a
WHERE a."active"=TRUE AND a."category_id"=c."id") "count_articles"
FROM articles_categories c
HAVING (
SELECT COUNT(*)
FROM articles a
WHERE a."active"=TRUE AND a."category_id"=c."id" ) > 0
I don't know why, but this query is causing an error:
ERROR: column "c.id" must appear in the GROUP BY clause or be used in an aggregate function at character 8
The HAVING clause is a bit tricky to understand. I'm not sure about how MySQL interprets it. But the Postgres documentation can be found here:
http://www.postgresql.org/docs/9.0/static/sql-select.html#SQL-HAVING
It essentially says:
The presence of HAVING turns a query
into a grouped query even if there is
no GROUP BY clause. This is the same
as what happens when the query
contains aggregate functions but no
GROUP BY clause. All the selected rows
are considered to form a single group,
and the SELECT list and HAVING clause
can only reference table columns from
within aggregate functions. Such a
query will emit a single row if the
HAVING condition is true, zero rows if
it is not true.
The same is also explained in this blog post, which shows how HAVING without GROUP BY implies a SQL:1999 standard "grand total", i.e. a GROUP BY ( ) clause (which PostgreSQL did not support at the time; it arrived with grouping sets in 9.5).
Since you don't seem to want a single row, the HAVING clause might not be the best choice.
Considering your actual query and your requirement, just rewrite the whole thing and JOIN articles_categories to articles:
SELECT DISTINCT c.*
FROM articles_categories c
JOIN articles a
ON a.active = TRUE
AND a.category_id = c.id
Alternatively:
SELECT *
FROM articles_categories c
WHERE EXISTS (SELECT 1
FROM articles a
WHERE a.active = TRUE
AND a.category_id = c.id)
SELECT * FROM articles_categories c
WHERE
EXISTS (SELECT 1 FROM articles a WHERE c.id = a.category_id);
should be fine... perhaps simpler ;)
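For comparison, here is a minimal sketch of the EXISTS form next to a GROUP BY ... HAVING form that also keeps the article count from the original query. It runs on SQLite through Python rather than PostgreSQL, stores active as 0/1, and uses invented sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles_categories (id INTEGER, name TEXT);
    CREATE TABLE articles (id INTEGER, category_id INTEGER, active INTEGER);
    -- Invented data: 'news' has two active articles, 'sports' one inactive.
    INSERT INTO articles_categories VALUES (1, 'news'), (2, 'sports'), (3, 'empty');
    INSERT INTO articles VALUES (1, 1, 1), (2, 1, 1), (3, 2, 0);
""")

# EXISTS: categories with at least one active article, no count.
exists_rows = conn.execute("""
    SELECT c.id, c.name
    FROM articles_categories c
    WHERE EXISTS (SELECT 1 FROM articles a
                  WHERE a.active = 1 AND a.category_id = c.id)
    ORDER BY c.id
""").fetchall()

# GROUP BY + HAVING: every selected column is grouped, so HAVING is legal
# and the count is available in the same row.
counted_rows = conn.execute("""
    SELECT c.id, c.name, COUNT(a.id) AS count_articles
    FROM articles_categories c
    LEFT JOIN articles a ON a.active = 1 AND a.category_id = c.id
    GROUP BY c.id, c.name
    HAVING COUNT(a.id) > 0
    ORDER BY c.id
""").fetchall()

print(exists_rows, counted_rows)
```

In PostgreSQL the grouped version needs every non-aggregated column of c in the GROUP BY list (or the table's primary key), which is what the original error message was pointing at.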