Why is array_agg() returning an empty array in PostgreSQL?

I have an integer column named start, and I want to build an array from its values. It seemed easy, so I used array_agg(), but it gives an empty array as output. Here is my column data:
start
1
2
11
5
.
.
. (and so on)
And here is the query I use to build the array:
select array_agg(start) as start_array from table1;
Why is it giving an empty array?

It's not
This query cannot return an empty result unless table1 has no rows (and in that case array_agg() returns NULL rather than an empty array). Perhaps a JOIN or a WHERE clause in your real query is wrong and matches zero rows?
Also as a micro-optimization if your query is this simple,
select array_agg(start) as start_array from table1;
Then it's probably better written with the ARRAY() constructor...
SELECT ARRAY(SELECT start FROM table1) AS start_array;
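The two forms differ when there are no input rows, which may help when diagnosing the problem. The sketch below uses generate_series(1, 0) as a stand-in for an empty table (an assumption for illustration only; table1 and start are the names from the question).

```sql
-- generate_series(1, 0) yields zero rows, standing in for an empty table.
SELECT array_agg(x) AS agg_result                 -- NULL
FROM generate_series(1, 0) AS g(x);

SELECT ARRAY(SELECT x FROM generate_series(1, 0) AS g(x)) AS ctor_result;  -- {}

-- Check how many rows the real query sees before aggregating.
SELECT count(*) FROM table1;

-- If an empty array is preferred over NULL, wrap the aggregate in coalesce.
SELECT coalesce(array_agg(start), '{}') AS start_array FROM table1;
```

Some clients display NULL as a blank cell, which is easy to mistake for an empty array.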

Related

Why does this SQL unnest query result in 2 rows rather than 4?

Relatively new SQL user question....
If my postgresql query looks like this:
select
to_timestamp((unnest(enrolled_ranges) ->> 'start_time')::float) as start_time
, to_timestamp((unnest(enrolled_ranges) ->> 'end_time')::float) as end_time
from student_inclusions
where student_id = '123456'
And the initial enrolled_ranges json data is this:
{"{\"start_time\":1536652800.00007,\"end_time\":1563981839.966626}","{\"start_time\":1563982078.624668,\"end_time\":1563989693.830777}"}
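Why does SQL return this (2 rows, each start_time paired with its matching end_time)
instead of this (4 rows, every start_time combined with every end_time)?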
The first answer is what I want, I just don't understand how sql knows from the query to associate the matching start and end times. Do you have any insight?
The documentation on set-returning functions describes the behavior you observed:
For each row from the underlying query, there is an output row using the first result from each set-returning function, then an output row using the second result, and so on.
See also What is the expected behaviour for multiple set-returning functions in SELECT clause?
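A small sketch of that behavior, using literal arrays instead of the question's table. The last query is one possible rewrite that unnests only once; it assumes enrolled_ranges is a json[] (or jsonb[]) column, and the alias r(elem) is made up for illustration.

```sql
-- Two set-returning functions in the select list advance in lockstep: 2 rows.
SELECT unnest(ARRAY[1, 2]) AS a, unnest(ARRAY[10, 20]) AS b;
-- (1, 10)
-- (2, 20)

-- The same functions in the FROM clause are cross joined: 4 rows.
SELECT a, b
FROM unnest(ARRAY[1, 2]) AS a,
     unnest(ARRAY[10, 20]) AS b;

-- Unnesting once in FROM makes the pairing explicit.
SELECT to_timestamp((r.elem ->> 'start_time')::float) AS start_time,
       to_timestamp((r.elem ->> 'end_time')::float)   AS end_time
FROM student_inclusions AS si,
     unnest(si.enrolled_ranges) AS r(elem)
WHERE si.student_id = '123456';
```

With a single unnest, both timestamps are read from the same array element, so the result does not depend on how multiple set-returning functions are combined.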

How to hash a query result with sha256 in PostgreSQL?

I want to somehow hash the result of a query in PostgreSQL. I have a query like
SELECT output FROM result;
And it returns a column composed only of integers. So I somehow want to hash the result of this query. Concatenate the values and hash, or somehow hash the query output directly. Simply I need a way to put it inside SELECT sha256(...). So please note that I do not want to get hash of every column entry, but one hash that somehow corresponds to the query output. Any ideas?
PostgreSQL doesn't come with a built-in streaming hash function exposed to the user, so the easiest way is to build the string in memory and then hash it. Of course, this won't work with giant result sets. You can use digest() from the pgcrypto extension. You also need to order your rows; otherwise you might get different results for the same data from one execution to the next if the rows come back in a different order.
select digest(string_agg(output::text,' ' order by output),'sha256')
from result;
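digest() returns a bytea value, so wrapping it in encode(..., 'hex') gives the familiar hex string. As a hedged alternative, PostgreSQL 11 and later also ship a built-in sha256(bytea) function that needs no extension. The sketch below assumes the result table and output column from the question; the alias result_hash is made up.

```sql
-- With pgcrypto (install once per database).
CREATE EXTENSION IF NOT EXISTS pgcrypto;

SELECT encode(
         digest(string_agg(output::text, ' ' ORDER BY output), 'sha256'),
         'hex') AS result_hash
FROM result;

-- Without pgcrypto, assuming PostgreSQL 11 or later.
-- convert_to() turns the aggregated text into bytea for sha256().
SELECT encode(
         sha256(convert_to(string_agg(output::text, ' ' ORDER BY output), 'UTF8')),
         'hex') AS result_hash
FROM result;
```

Both queries return NULL when result has no rows, because string_agg() over zero rows is NULL.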
Replace 1234 with your column name and add [from table_name] to this query:
select encode(digest(1234::text, 'sha256'), 'hex')
Or for multiple rows use this:
select encode(
  digest(
    (select array_agg(q1)::text[]
     from (select row(R.*)::text as q1
           from (SELECT output FROM result) R) alias)::text,
    'sha256'),
  'hex')

Casting rows to arrays in PostgreSQL

I need to query a table as in
SELECT *
FROM table_schema.table_name
except that each row needs to be a TEXT[] whose elements are the column values cast to TEXT, in the same order as in SELECT *. So, assuming the table has columns a, b and c, I need the result to look like
SELECT ARRAY[a::TEXT, b::TEXT, c::TEXT]
FROM table_schema.table_name
only it shouldn't explicitly list columns by name. Ideally it should look like
SELECT as_text_array(a)
FROM table_schema.table_name AS a
The best I came up with looks ugly and relies on the hstore extension:
WITH columnz AS ( -- get ordered column name array
SELECT array_agg(attname::TEXT ORDER BY attnum) AS column_name_array
FROM pg_attribute
WHERE attrelid = 'table_schema.table_name'::regclass AND attnum > 0 AND NOT attisdropped
)
SELECT hstore(a)->(SELECT column_name_array FROM columnz)
FROM table_schema.table_name AS a
I have a feeling there must be a simpler way to achieve this.
UPDATE 1
Another query that achieves the same result, though arguably as ugly and inefficient as the first one, is inspired by the answer by @bspates. It may be even less efficient, but it doesn't rely on extensions:
SELECT r.text_array
FROM table_schema.table_name AS a
INNER JOIN LATERAL ( -- parse ROW::TEXT presentation of a row
SELECT array_agg(COALESCE(replace(val[1], '""', '"'), NULLIF(val[2], ''))) AS text_array
FROM regexp_matches(a::text, -- parse double-quoted and simple values separated by commas
'(?<=\A\(|,) (?: "( (?:[^"]|"")* )" | ([^,"]*) ) (?=,|\)\Z)', 'xg') AS t(val)
) AS r ON TRUE
It is still far from ideal
UPDATE 2
I tested all 3 options existing at the moment
Using JSON. It doesn't rely on any extensions, it is short to write, easy to understand and the speed is ok.
Using hstore. This alternative is the fastest (more than 10 times faster than the JSON approach on a 100K dataset) but requires an extension. hstore in general is a very handy extension to have, though.
Using regex to parse TEXT presentation of a ROW. This option is really slow.
A somewhat ugly hack is to convert the row to a JSON value, then unnest the values and aggregate it back to an array:
select array(select (json_each_text(to_json(t))).value) as row_value
from some_table t
Which is to some extent the same as your hstore hack.
If the order of the columns is important, json_each_text() combined with WITH ORDINALITY can be used to preserve it:
select array(select val
from json_each_text(to_json(t)) with ordinality as t(k,val,idx)
order by idx)
from the_table t
The easiest (read: hackiest) way I can think of is to convert the row to a string first and then parse that string into an array. Like so:
SELECT string_to_array(table_name::text, ',') FROM table_name
BUT depending on the size and type of the data in the table, this could perform very badly.
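One caveat with the string approach: a row cast to text keeps its surrounding parentheses, so a plain split leaves "(" on the first element and ")" on the last. A minimal sketch that strips them first, using a literal ROW for illustration (the alias t is an assumption):

```sql
-- A row cast to text is wrapped in parentheses.
SELECT ROW(1, 'foo', 'bar')::text;   -- (1,foo,bar)

-- Drop the first and last character before splitting.
SELECT string_to_array(
         substr(ROW(1, 'foo', 'bar')::text, 2,
                length(ROW(1, 'foo', 'bar')::text) - 2),
         ',');                        -- {1,foo,bar}

-- The same idea applied to a table.
SELECT string_to_array(substr(t::text, 2, length(t::text) - 2), ',')
FROM table_name AS t;
```

This still breaks for values that contain commas, double quotes, or parentheses, because the row output format quotes and escapes such values. It is only safe when the data is known to be free of them.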

Selecting a row by searching a specific value in an Array column

We have a table where one of the columns is an array. I need to select a row or many rows as long as my search value matches their values using ILIKE. My problem is that I need to search the values of an array column as well. I tried using ANY but the value needs to be exact to select a row. I need something similar to ILIKE but for that array column.
Thank you in advance.
Use the unnest function:
SELECT x.value
FROM my_table t, unnest(t.my_array_column) as x(value)
WHERE x.value ILIKE 'foo'
Since your question is also tagged elixir: to convert this to Ecto, use Ecto.Query.API.fragment/1 for the select condition and Ecto.Query.API.ilike/2 for the match.
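The unnest join above returns one row per matching array element. If the goal is to get each table row at most once, an EXISTS subquery is one possible variant. This is a sketch using the same hypothetical my_table and my_array_column names, with % wildcards added on the assumption that a substring match is wanted.

```sql
-- Return each row of my_table once if any array element matches.
SELECT t.*
FROM my_table AS t
WHERE EXISTS (
  SELECT 1
  FROM unnest(t.my_array_column) AS x(value)
  WHERE x.value ILIKE '%foo%'   -- case-insensitive substring match
);
```

Without the % wildcards, ILIKE behaves like a case-insensitive equality test.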

PostgreSQL use function result in ORDER BY

Is there a way to use the results of a function call in the order by clause?
My current attempt (I've also tried some slight variations).
SELECT it.item_type_id, it.asset_tag, split_part(it.asset_tag, 'ASSET', 2)::INT as tag_num
FROM serials.item_types it
WHERE it.asset_tag LIKE 'ASSET%'
ORDER BY split_part(it.asset_tag, 'ASSET', 2)::INT;
While my general assumption is that this can't be done, I wanted to know if there was a way to accomplish this that I wasn't thinking of.
EDIT: The query above gives the following error [22P02] ERROR: invalid input syntax for integer: "******"
Your query is generally OK, the problem is that for some row the result of split_part(it.asset_tag, 'ASSET', 2) is the string ******. And that string cannot be cast to an integer.
You may want to remove the order by and the cast in the select list and add a where split_part(it.asset_tag, 'ASSET', 2) = '******', for instance, to narrow down that data issue.
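A more general way to find the offending rows, sketched below, is to filter on a regular expression instead of a known bad value. The second query shows a guarded cast that sorts without raising the error; both reuse the table and column names from the question.

```sql
-- Rows whose suffix is not purely digits (these break the ::INT cast).
SELECT it.item_type_id, it.asset_tag
FROM serials.item_types it
WHERE it.asset_tag LIKE 'ASSET%'
  AND split_part(it.asset_tag, 'ASSET', 2) !~ '^[0-9]+$';

-- Guarded sort: non-numeric suffixes become NULL and sort last.
SELECT it.item_type_id, it.asset_tag
FROM serials.item_types it
WHERE it.asset_tag LIKE 'ASSET%'
ORDER BY CASE
           WHEN split_part(it.asset_tag, 'ASSET', 2) ~ '^[0-9]+$'
           THEN split_part(it.asset_tag, 'ASSET', 2)::INT
         END NULLS LAST;
```

The CASE expression is used because PostgreSQL does not guarantee that a WHERE filter runs before a cast elsewhere in the query. A digit string too large for INT would still raise an error.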
Once that is resolved, having such a function in the ORDER BY list is perfectly fine. The section of the documentation quoted in the comments on the question refers to applying an ORDER BY clause to the result of a UNION, INTERSECT, or EXCEPT query. In other words, the ORDER BY in this query:
(select column1 as result_column1 from table1
union
select column2 from table2)
order by result_column1
can only refer to the accumulated result columns, not to expressions on individual rows.