Re-use a hardcoded value in multiple function calls in PostgreSQL query - postgresql

I have some functions in PostgreSQL 9.0 that return table results. The idea behind these is to return the data as it was at a certain time, e.g.:
CREATE FUNCTION person_asof(effective_time timestamp with time zone)
RETURNS SETOF person
...
CREATE FUNCTION pgroup_asof(effective_time timestamp with time zone)
RETURNS SETOF pgroup
...
I can query them almost as if they were tables, with joins and all:
SELECT *
FROM pgroup_asof('2011-01-01') g
JOIN person_asof('2011-01-01') p
ON g.id = p.group_id
This works fine, but is there any trick I can use to specify the effective time just once?
I tried to do something like this:
SELECT *
FROM (SELECT '2010-04-12'::timestamp ts) effective,
pgroup_asof(effective.ts) g
JOIN person_asof(effective.ts) p
ON g.id = p.group_id
...but that fails with "ERROR: function expression in FROM cannot refer to other relations of same query level". Putting the main query into a sub-query doesn't help, either.

I have wanted to do this in the past as well, but it does not look like it is possible yet. There may be hope on the horizon, though.
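For what it's worth, here are two sketches of how the value could be written only once. They are not something the answer above confirms. The first assumes PostgreSQL 9.3 or later, where LATERAL lets a function in FROM refer to an earlier FROM item. The second uses a prepared statement and should also work on 9.0. The alias effective comes from the question; the statement name asof_join is made up for illustration.

```sql
-- PostgreSQL 9.3+: LATERAL allows the functions to see effective.ts
SELECT *
FROM (SELECT '2010-04-12'::timestamptz AS ts) AS effective
CROSS JOIN LATERAL pgroup_asof(effective.ts) AS g
JOIN LATERAL person_asof(effective.ts) AS p
  ON g.id = p.group_id;

-- Older versions: pass the value once as a statement parameter
-- (asof_join is a hypothetical name)
PREPARE asof_join(timestamptz) AS
SELECT *
FROM pgroup_asof($1) AS g
JOIN person_asof($1) AS p
  ON g.id = p.group_id;

EXECUTE asof_join('2011-01-01');
```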

Related

PostgreSQL CTE records as parameters to function

I have a function that accepts two integers as parameters my_function(input_a, input_b). Is there an easy way to pass the results of a CTE (that returns records of input_a, input_b) into the function?
Should I be looking into writing a custom function with a for loop or is there a better approach?
If the function returns a single record then:
WITH cte AS (SELECT 1 a, 2 b)
SELECT my_function(a, b) FROM cte;
will work. However, if the function is an SRF (set-returning function), you need LATERAL. It tells the database that you want to feed the rows of the earlier tables in the FROM list to the function that appears later in it. This is accomplished like so:
WITH cte AS (SELECT 1 a, 2 b)
SELECT * FROM cte, LATERAL my_function(a, b);
The LATERAL will cause PostgreSQL to take each row from the CTE and run "my_function" with the values from that row, returning the results of that function to the overall SELECT statement.
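To see the row-by-row behaviour without defining my_function, here is a small sketch that substitutes the built-in generate_series as the set-returning function (that substitution and the second CTE row are assumptions made for illustration; LATERAL needs PostgreSQL 9.3 or later):

```sql
WITH cte AS (
    SELECT 1 AS a, 3 AS b
    UNION ALL
    SELECT 10, 11
)
SELECT cte.a, cte.b, s.n
FROM cte, LATERAL generate_series(cte.a, cte.b) AS s(n)
ORDER BY cte.a, s.n;
-- generate_series runs once per CTE row:
--  a  | b  | n
--   1 |  3 |  1
--   1 |  3 |  2
--   1 |  3 |  3
--  10 | 11 | 10
--  10 | 11 | 11
```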

Loop postgresql Function through Date Range

I have a user defined function. This question shows how to loop through dates. Using this approach, I tried this query:
select myfun(a::date) from generate_series('2015-01-01'::date,'2016-01-27','1 day') s(a)
This doesn't quite work. What it returns is a single column of the form:
(10101, "Sample", "test")
(10102, "Sample2", "test2")
When in reality there should be three columns. It merges them into one.
I noticed that this is the same behavior that you get in a vanilla query such as select mytable when I omit the asterisk. The above query doesn't have an asterisk in it, but adding one causes an error.
Place the function call in the FROM clause:
select f.*
from
generate_series('2015-01-01'::date,'2016-01-27','1 day') s(a),
myfun(a::date) f;
or using more formal syntax:
select f.*
from generate_series('2015-01-01'::date,'2016-01-27','1 day') s(a)
cross join myfun(a::date) f;
This form of the FROM clause, where a function refers to a preceding FROM item, is known as a lateral join (available in PostgreSQL 9.3 and later).

Does inner join affect order by?

I have a function a() which gives result in a specific order.
I want to do:
select final.*,tablex.name
from a() as final
inner join tablex on (final.key=tablex.key2)
My question is, can I guarantee that the join won't affect the order of rows as a() set it?
a() is:
select ....
from....
joins...
order by x,y,z
The short version:
The order of rows returned by a SQL query is not guaranteed in any way unless you use an order by.
Any order you see without an order by is pure coincidence and cannot be relied upon.
So how did I always get the correct order so far when I did select * from a()?
If your function is a SQL function, then the query inside the function is executed "as is" (it's essentially "inlined") so you only run a single query that does have an order by. If it's a PL/pgSQL function and the only thing it does is a RETURN QUERY ... then you again only have a single query that is executed which does have an order by.
Assuming you do use a SQL function, then running:
select final.*,tablex.name
from a() as final
join tablex on final.key=tablex.key2
is equivalent to:
select final.*,tablex.name
from (
-- this is your query inside the function
select ...
from ...
join ...
order by x,y,z
) as final
join tablex on final.key=tablex.key2;
In this case the order by inside the derived table doesn't make sense, as it might be "overruled" by an overall order by statement. In fact some databases would outright reject this query (and I sometimes wish Postgres would as well).
Without an order by on the overall query, the database is free to choose any order of rows that it wants.
So to get back to the initial question:
can I guarantee that the join won't affect the order of rows as a() set it?
The answer to that is a clear: NO - the order of the rows for that query is in no way guaranteed. If you need an order that you can rely on, you have to specify an order by.
I would even go so far as to remove the order by from the function. What if someone runs select * from a() order by z,y,x? I don't think Postgres will be smart enough to remove the order by inside the function.
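Put together, the query with a dependable order might look like the sketch below. It assumes that x, y and z are columns returned by a(), which the question does not show, and it refers to the function through its alias final.

```sql
select final.*, tablex.name
from a() as final
join tablex on final.key = tablex.key2
-- the order by on the outermost query is the only one that is guaranteed
order by final.x, final.y, final.z;
```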

DB2 Query Structure Using User-Defined Function as a Table

I'm a little new to DB2, and am having trouble developing a query. I have created a user-defined function that returns a table of data which I want to then join and select from in a larger select statement. I'm working on a sensitive db, so the query below isn't what I'm literally running, but it's almost exactly like it (without the other 10 joins I have to do lol).
select
A.customerId,
A.firstname,
A.lastname,
B.orderId,
B.orderDate,
F.currentLocationDate,
F.currentLocation
from
customer A
INNER JOIN order B
on A.customerId = B.customerId
INNER JOIN table(getShippingHistory(B.customerId)) as F
on B.orderId = F.orderId
where B.orderId = 35
This works great if I run this query without the where clause (or some other where clause that doesn't check for an ID). When I include the where clause, I get the following error:
Error during Prepare 58004(-901)[IBM][CLI Driver][DB2/LINUXX8664]
SQL0901N The SQL statement failed because of a non-severe system
error. Subsequent SQL statements can be processed. (Reason "Bad Plan;
Unresolved QNC found".) SQLSTATE=58004
I have tracked the issue down to the fact that I'm using one of the join criteria as the function's parameter (B.customerId). I have validated this by replacing B.customerId with a valid customerId, and the query works great. Problem is, I don't know the customerId when calling this query. I know only the orderId (in this example).
Any thoughts on how to restructure this so I can make only 1 call to get all the info? I know the plan is the problem b/c the customerId isn't getting resolved before the function is called.
So if I understand correctly, the function getShippingHistory(customerId) returns a table.
And if you call it with a single customer Id that table gets joined in your query above no problem at all.
But the way you have the query written above, you are asking db2 to call the function for every row returned by your query (i.e. every b.customerId that matches your join and where conditions).
So I'm not sure what behaviour you are expecting, because what you're asking for is a table back for every row in your query, and neither db2 nor I can figure out what the result is supposed to look like.
So in terms of restructuring your query, think about how you can change the getShippingHistory logic when multiple customer Ids are involved.
I found the best solution (given the current query structure) is to use a LEFT join instead of an INNER join. This forces the left side of the join to be evaluated first, which resolves the customerId to a value by the time it gets to the function call.
select
A.customerId,
A.firstname,
A.lastname,
B.orderId,
B.orderDate,
F.currentLocationDate,
F.currentLocation
from
customer A
INNER JOIN order B
on A.customerId = B.customerId
LEFT JOIN table(getShippingHistory(B.customerId)) as F
on B.orderId = F.orderId
where B.orderId = 35

SQL DateTime Conversion Fails when No Conversion Should be Taking Place

I'm modifying an existing query for a client, and I've encountered a somewhat baffling issue.
Our client uses SQL Server 2008 R2 and the database in question provides the user the ability to specify custom fields for one of its tables by making use of an EAV structure. All of the values stored in this structure are varchar(255), and several of the fields are intended to store dates. The query in question is being modified to use two of these fields and compare them (one is a start, the other is an end) against the current date to determine which row is "current".
The issue I'm having is that part of the query does a CONVERT(DateTime, eav.Value) in order to turn the varchar into a DateTime. The conversions themselves all succeed and I can include the value as part of the SELECT clause, but part of the query is giving me a conversion error:
Conversion failed when converting date and/or time from character string.
The real kicker is this: if I define the base for this query (getting a list of entities with the two custom field values flattened into a single row) as a view and select against the view and filter the view by getdate(), then it works correctly, but it fails if I add a join to a second table using one of the (non-date) fields from the view. I realize that this might be somewhat hard to follow, so I can post an example query if desired, but this question is already getting a little long.
I've tried recreating the basic structure in another database and including sample data, but the new database behaves as expected, so I'm at a loss here.
EDIT In case it's useful, here's the statement for the view:
create view Festival as
select
e.EntityId as FestivalId,
e.LookupAs as FestivalName,
convert(Date, nvs.Value) as ActivityStart,
convert(Date, nve.Value) as ActivityEnd
from tblEntity e
left join CustomControl ccs on ccs.ShortName = 'Activity Start Date'
left join CustomControl cce on cce.ShortName = 'Activity End Date'
left join tblEntityNameValue nvs on nvs.CustomControlId = ccs.IdCustomControl and nvs.EntityId = e.EntityId
left join tblEntityNameValue nve on nve.CustomControlId = cce.IdCustomControl and nve.EntityId = e.EntityId
where e.EntityType = 'Festival'
The failing query is this:
select *
from Festival f
join FestivalAttendeeAll fa on fa.FestivalId = f.FestivalId
where getdate() between f.ActivityStart and f.ActivityEnd
Yet this works:
select *
from Festival f
where getdate() between f.ActivityStart and f.ActivityEnd
(EntityId/FestivalId are int columns)
I've encountered this type of error before, it's due to the "order of operations" performed by the execution plan.
You are getting that error message because the execution plan for your statement (generated by the optimizer) is performing the CONVERT() operation on rows that contain string values that can't be converted to DATETIME.
Basically, you do not have control over which rows the optimizer performs that conversion on. You know that you only need that conversion done on certain rows, and you have predicates (WHERE or ON clauses) that exclude those rows (limit the rows to those that need the conversion), but your execution plan is performing the CONVERT() operation on rows BEFORE those rows are excluded.
(For example, the optimizer may be electing to do a table scan, and performing that conversion on every row, before any predicate is applied.)
I can't give a specific answer, without a specific question and specific SQL that is generating the error.
One simple approach to addressing the problem would be to use the ISDATE() function to test whether the string value can be converted to a date.
That is, replace:
CONVERT(DATETIME,eav.Value)
with:
CASE WHEN ISDATE(eav.Value) > 0 THEN CONVERT(DATETIME, eav.Value) ELSE NULL END
or:
CONVERT(DATETIME, CASE WHEN ISDATE(eav.Value) > 0 THEN eav.Value ELSE NULL END)
Note that the ISDATE() function is subject to some significant limitations, such as being affected by the DATEFORMAT and LANGUAGE settings of the session.
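Applied to the view from the question, the guard might look roughly like this. It is only a sketch: it assumes the view already exists (hence alter view) and that any value ISDATE() accepts also converts to Date.

```sql
alter view Festival as
select
    e.EntityId as FestivalId,
    e.LookupAs as FestivalName,
    -- convert only values that ISDATE() accepts; everything else becomes NULL
    case when isdate(nvs.Value) = 1 then convert(Date, nvs.Value) end as ActivityStart,
    case when isdate(nve.Value) = 1 then convert(Date, nve.Value) end as ActivityEnd
from tblEntity e
left join CustomControl ccs on ccs.ShortName = 'Activity Start Date'
left join CustomControl cce on cce.ShortName = 'Activity End Date'
left join tblEntityNameValue nvs on nvs.CustomControlId = ccs.IdCustomControl and nvs.EntityId = e.EntityId
left join tblEntityNameValue nve on nve.CustomControlId = cce.IdCustomControl and nve.EntityId = e.EntityId
where e.EntityType = 'Festival';
```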
If there is some other indication on the eav row, you could use some other test, to conditionally perform the conversion.
CASE WHEN eav.ValueIsDateTime=1 THEN CONVERT(DATETIME, eav.Value) ELSE NULL END
The other approach I've used is to try to gain some modicum of control over the order of operations of the optimizer, using inline views or Common Table Expressions, with operations that force the optimizer to materialize them and apply predicates, so that happens BEFORE any conversion in the outer query.
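If the optimizer still moves the conversion ahead of the filter, one blunt way to fix the order is to stage the rows in a temp table. Note that this is a different technique from the inline views and CTEs mentioned above, shown only as a sketch. The table and column names come from the view in the question; #DateValues and DateValue are made-up names.

```sql
-- Step 1: stage only the rows whose Value looks like a date.
select nv.EntityId, nv.CustomControlId, nv.Value
into #DateValues
from tblEntityNameValue nv
where isdate(nv.Value) = 1;

-- Step 2: convert in a separate statement, so the conversion
-- can only ever see the rows that passed the filter above.
select dv.EntityId,
       dv.CustomControlId,
       convert(DateTime, dv.Value) as DateValue
from #DateValues dv;

drop table #DateValues;
```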