PostgreSQL use function result in ORDER BY - postgresql

Is there a way to use the results of a function call in the order by clause?
My current attempt (I've also tried some slight variations).
SELECT it.item_type_id, it.asset_tag, split_part(it.asset_tag, 'ASSET', 2)::INT as tag_num
FROM serials.item_types it
WHERE it.asset_tag LIKE 'ASSET%'
ORDER BY split_part(it.asset_tag, 'ASSET', 2)::INT;
While my general assumption is that this can't be done, I wanted to know if there was a way to accomplish this that I wasn't thinking of.
EDIT: The query above gives the following error [22P02] ERROR: invalid input syntax for integer: "******"

Your query is generally OK, the problem is that for some row the result of split_part(it.asset_tag, 'ASSET', 2) is the string ******. And that string cannot be cast to an integer.
You may want to remove the order by and the cast in the select list and add a where split_part(it.asset_tag, 'ASSET', 2) = '******', for instance, to narrow down that data issue.
Once that is resolved, having such a function in the order by list is perfectly fine. The quoted section of the documentation in the comments on the question is referring to applying an order by clause to the results of UNION, INTERSECTION, etc. queries. In other words, the order by found in this query:
(select column1 as result_column1 from table1
union
select column2 from table 2)
order by result_column1
can only refer to the accumulated result columns, not to expressions on individual rows.

Related

Why does this SQL unnest query result in 2 rows rather than 4?

Relatively new SQL user question....
If my postgresql query looks like this:
select
to_timestamp((unnest(enrolled_ranges) ->> 'start_time')::float) as start_time
, to_timestamp((unnest(enrolled_ranges) ->> 'end_time')::float) as end_time
from student_inclusions
where student_id = '123456'
And the initial enrolled_ranges json data is this:
{"{\"start_time\":1536652800.00007,\"end_time\":1563981839.966626}","{\"start_time\":1563982078.624668,\"end_time\":1563989693.830777}"}
Why does sql do this
instead of this
The first answer is what I want, I just don't understand how sql knows from the query to associate the matching start and end times. Do you have any insight?
The documentation on set-returning functions describes the behavior you observed:
For each row from the underlying query, there is an output row using the first result from each set-returning function, then an output row using the second result, and so on.
See also What is the expected behaviour for multiple set-returning functions in SELECT clause?

Will Postgres' DISTINCT function always return null as the first element?

I'm selecting distinct values from tables thru Java's JDBC connector and it seems that NULL value (if there's any) is always the first row in the ResultSet.
I need to remove this NULL from the List where I load this ResultSet. The logic looks only at the first element and if it's null then ignores it.
I'm not using any ORDER BY in the query, can I still trust that logic? I can't find any reference in Postgres' documentation about this.
You can add a check for NOT NULL. Simply like
select distinct columnName
from Tablename
where columnName IS NOT NULL
Also if you are not providing the ORDER BY clause then then order in which you are going to get the result is not guaranteed, hence you can not rely on it. So it is better and recommended to provide the ORDER BY clause if you want your result output in a particular output(i.e., ascending or descending)
If you are looking for a reference Postgresql document then it says:
If ORDER BY is not given, the rows are returned in whatever order the
system finds fastest to produce.
If it is not stated in the manual, I wouldn't trust it. However, just for fun and try to figure out what logic is being used, running the following query does bring the NULL (for no apparent reason) to the top, while all other values are in an apparent random order:
with t(n) as (values (1),(2),(1),(3),(null),(8),(0))
select distinct * from t
However, cross joining the table with a modified version of itself brings two NULLs to the top, but random NULLs dispersed througout the resultset. So it doesn't seem to have a clear-cut logic clumping all NULL values at the top.
with t(n) as (values (1),(2),(1),(3),(null),(8),(0))
select distinct * from t
cross join (select n+3 from t) t2

ERROR: syntax error at or near "group"

Hello I'm writing an sql query But i am getting a syntax error on the line with the GROUP BY. What can possibly be the problem, help if you can please.
UPDATE intersection_points i
SET nbr_victimes = sum(tue+bl+bg)
FROM accident_ma a ,intersection_points i
WHERE (ST_DWithin(i.st_intersection,a.geom_acc, 10000) group by st_intersection)) ;
GROUP BY is its own clause, it's not part of a WHERE clause.
This is what you have:
WHERE (
ST_DWithin(i.st_intersection,a.geom_acc, 10000)
group by st_intersection
)
This is what you need:
WHERE ST_DWithin(i.st_intersection,a.geom_acc, 10000)
group by st_intersection
Edit: In response to comments, it sounds like your JOIN is a bit more complex than the UPDATE ... FROM syntax would need. Take a look at the "Notes" section on this page:
When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the from_list, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.
Because of this indeterminacy, referencing other tables only within sub-selects is safer, though often harder to read and slower than using a join.
Normally this would involve changing the syntax to something like:
UDPATE SomeTable
SET SomeColumn = 'Some Value'
WHERE AnotherColumn =
(SELECT AnotherColumn
FROM AnotherTable
-- etc.)
However, the use of ST_DWithin() in this query may complicate that quite a bit. Without much deeper knowledge of the table structures, relationships, and overall intent of this update there probably isn't much more help I can give. Essentially you're going to need to clarify for the database exactly what records need to be updated and how to update them, which may involve changing your query to this latter sub-select syntax in some way.
I don' t understand your data structure. I create the following tables from your query. Please check table structure.
if table's structure is this
your query must be
UPDATE intersection_points SET nbr_victimes = (SELECT SUM(a.tue+a.bl+a.bg) FROM accident_ma a WHERE st_dwithin(st_intersection, a.geom_acc, 1000));

PostgreSQL array_agg order

Table 'animals':
animal_name animal_type
Tom Cat
Jerry Mouse
Kermit Frog
Query:
SELECT
array_to_string(array_agg(animal_name),';') animal_names,
array_to_string(array_agg(animal_type),';') animal_types
FROM animals;
Expected result:
Tom;Jerry;Kerimt, Cat;Mouse;Frog
OR
Tom;Kerimt;Jerry, Cat;Frog;Mouse
Can I be sure that order in first aggregate function will always be the same as in second.
I mean I would't like to get:
Tom;Jerry;Kermit, Frog;Mouse,Cat
Use an ORDER BY, like this example from the manual:
SELECT array_agg(a ORDER BY b DESC) FROM table;
If you are on a PostgreSQL version < 9.0 then:
From: http://www.postgresql.org/docs/8.4/static/functions-aggregate.html
In the current implementation, the order of the input is in principle unspecified. Supplying the input values from a sorted subquery will usually work, however. For example:
SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
So in your case you would write:
SELECT
array_to_string(array_agg(animal_name),';') animal_names,
array_to_string(array_agg(animal_type),';') animal_types
FROM (SELECT animal_name, animal_type FROM animals) AS x;
The input to the array_agg would then be unordered but it would be the same in both columns. And if you like you could add an ORDER BY clause to the subquery.
According to Tom Lane:
... If I read it right, the OP wants to be sure that the two aggregate functions will see the data in the *same* unspecified order. I think that's a pretty safe assumption. The server would have to go way out of its way to do differently, and it doesn't.
... So it is documented behavior that an aggregate without its own ORDER BY will see the rows in whatever order the FROM clause supplies them.
So I think it's fine to assume that all the aggregates, none of which uses ORDER BY, in your query will see input data in the same order. The order itself is unspecified though (which depends on the order the FROM clause supplies rows).
Source: PostgreSQL mailing list
Do this:
SELECT
array_to_string(array_agg(animal_name order by animal_name),';') animal_names,
array_to_string(array_agg(animal_type order by animal_type),';') animal_types
FROM
animals;

SQL DateTime Conversion Fails when No Conversion Should be Taking Place

I'm modifying an existing query for a client, and I've encountered a somewhat baffling issue.
Our client uses SQL Server 2008 R2 and the database in question provides the user the ability to specify custom fields for one of its tables by making use of an EAV structure. All of the values stored in this structure are varchar(255), and several of the fields are intended to store dates. The query in question is being modified to use two of these fields and compare them (one is a start, the other is an end) against the current date to determine which row is "current".
The issue I'm having is that part of the query does a CONVERT(DateTime, eav.Value) in order to turn the varchar into a DateTime. The conversions themselves all succedd and I can include the value as part of the SELECT clause, but part of the question is giving me a conversion error:
Conversion failed when converting date and/or time from character string.
The real kicker is this: if I define the base for this query (getting a list of entities with the two custom field values flattened into a single row) as a view and select against the view and filter the view by getdate(), then it works correctly, but it fails if I add a join to a second table using one of the (non-date) fields from the view. I realize that this might be somewhat hard to follow, so I can post an example query if desired, but this question is already getting a little long.
I've tried recreating the basic structure in another database and including sample data, but the new database behaves as expected, so I'm at a loss here.
EDIT In case it's useful, here's the statement for the view:
create view Festival as
select
e.EntityId as FestivalId,
e.LookupAs as FestivalName,
convert(Date, nvs.Value) as ActivityStart,
convert(Date, nve.Value) as ActivityEnd
from tblEntity e
left join CustomControl ccs on ccs.ShortName = 'Activity Start Date'
left join CustomControl cce on cce.ShortName = 'Activity End Date'
left join tblEntityNameValue nvs on nvs.CustomControlId = ccs.IdCustomControl and nvs.EntityId = e.EntityId
left join tblEntityNameValue nve on nve.CustomControlId = cce.IdCustomControl and nve.EntityId = e.EntityId
where e.EntityType = 'Festival'
The failing query is this:
select *
from Festival f
join FestivalAttendeeAll fa on fa.FestivalId = f.FestivalId
where getdate() between f.ActivityStart and f.ActivityEnd
Yet this works:
select *
from Festival f
where getdate() between f.ActivityStart and f.ActivityEnd
(EntityId/FestivalId are int columns)
I've encountered this type of error before, it's due to the "order of operations" performed by the execution plan.
You are getting that error message because the execution plan for your statement (generated by the optimizer) is performing the CONVERT() operation on rows that contain string values that can't be converted to DATETIME.
Basically, you do not have control over which rows the optimizer performs that conversion on. You know that you only need that conversion done on certain rows, and you have predicates (WHERE or ON clauses) that exclude those rows (limit the rows to those that need the conversion), but your execution plan is performing the CONVERT() operation on rows BEFORE those rows are excluded.
(For example, the optimizer may be electing to a do a table scan, and performing that conversion on every row, before any predicate is being applied.)
I can't give a specific answer, without a specific question and specific SQL that is generating the error.
One simple approach to addressing the problem would be to use the ISDATE() function to test whether the string value can be converted to a date.
That is, replace:
CONVERT(DATETIME,eav.Value)
with:
CASE WHEN ISDATE(eav.Value) > 0 THEN CONVERT(DATETIME, eav.Value) ELSE NULL END
or:
CONVERT(DATETIME, CASE WHEN ISDATE(eav.Value) > 0 THEN eav.Value ELSE NULL END)
Note that the ISDATE() function is subject to some significant limitations, such as being affected by the DATEFORMAT and LANGUAGE settings of the session.
If there is some other indication on the eav row, you could use some other test, to conditionally perform the conversion.
CASE WHEN eav.ValueIsDateTime=1 THEN CONVERT(DATETIME, eav.Value) ELSE NULL END
The other approach I've used is to try to gain some modicum of control over the order of operations of the optimizer, using inline views or Common Table Expressions, with operations that force the optimizer to materialize them and apply predicates, so that happens BEFORE any conversion in the outer query.