"Function does not exist" errors when trying to split column containing array of timestampz into delimited text string in Postgres

"Function does not exist" errors when trying to split column containing array of timestampz into delimited text string in Postgres - postgresql

I have a table with columns that contain arrays that I want converted into strings so I can split them by the delimiter into multiple columns.
I'm having trouble doing this with arrays of dates with timezones.
create materialized view matview1 as select
(location) as location,
(nullif(split_part(string_agg(distinct name,'; '),'; ',1),'')) as name1,
(nullif(split_part(string_agg(distinct name,'; '),'; ',2),'')) as name2,
(nullif(split_part(string_agg(distinct name,'; '),'; ',3),'')) as name3,
(array_agg(distinct(event_date_with_timestamp))) as event_dates
from table2 b
group by location;
In the code above I'm creating a materialized view of table to consolidate all table entries related to certain locations into single rows.
How can I create additional columns for each event_date entry like I did with the names (e.g Name1, Name2 and Name3 from the 'name' array)?
I tried changing the array to a string format with:
(nullif(split_part(array_to_string(array_agg(distinct(event_date_with_timestamp))),'; ',1),'')) as event_date1
But this throws the error:
"function array_to_string(timestamp with time zone[]) does not exist"
And casting to different datatypes always produces errors saying I can't cast from type timestampz into anything else.

I found a way to accomplish this by casting from timestampz to text and then back again like this:
(nullif(split_part(string_agg(distinct event_date::text,'; '),'; ',1),'')::date) as date1,

Related

Redshift how to split a stringified array into separate parts

Say I have a varchar column let's say religions that looks like this: ["Christianity", "Buddhism", "Judaism"] (yes it has a bracket in the string) and I want the string (not array) split into multiple rows like "Christianity", "Buddhism", "Judaism" so it can be used in a WHERE clause.
Eventually I want to use the results of the query in a where clause like this:
SELECT ...
FROM religions
WHERE name in
(
<this subquery>
)
How can one do this?

You can use the function JSON_PARSE to convert the varchar string into an array. Then you can use the strategy described in Convert varchar array to rows in redshift - Stack Overflow to convert the array to separate rows.

You can do the following.
Create a temporary table with sequence of numbers
Using the sequence and split_part function available in redshift, you can split the values based on the numbers generated in the temporary table by doing a cross join.
To replace the double quote and square brackets, you can use the regexp_replace function in Redshift.
create temp table seq as
with recursive numbers(NUMBER) as
(
select 1 UNION ALL
select NUMBER + 1 from numbers where NUMBER < 28
)
select * from numbers;
select regexp_replace(split_part(val,',',seq.number),'[]["]','') as value
from
(select '["christianity","Buddhism","Judaism"]' as val) -- You can select the actual column from the table here.
cross join
seq
where seq.number <= regexp_count(val,'[,]')+1;

Create rows from part of column names

Source data
I am working on an ELT project to load data from CSV files into PostgreSQL where I will transform it. The CSV files have many columns that are consistent across files, but also contain activity columns that are inconsistent with names like Date (05/19/2020), Type (05/19/2020), etc.
In the loading script I am merging all of the columns with dates in the column name into one jsonb column so I don't have to constantly add new columns to the raw data table.
The resulting jsonb column in the raw data table looks like this:
id
activity
12345678
{"Date (05/19/2020)": null, "Type (05/19/2020)": null, "Date (06/03/2020)": "06/01/2020", "Type (06/03/2020)": "E"}
98765432
{"Date (05/19/2020)": "05/18/2020", "Type (05/19/2020)": "B", "Date (10/23/2020)": "10/26/2020", "Type (10/23/2020)": "T"}
JSON to columns
Using the amazing create_jsonb_flat_view function from this post I can convert the jsonb to columns like this:
id
Date (05/19/2020)
Type (05/19/2020)
Date (06/03/2020)
Type (06/03/2020)
Type (10/23/2020
Date (10/23/2020)
Type (10/23/2020)
10629465
null
null
06/01/2020
E
98765432
05/18/2020
B
10/26/2020
T
Need to move part of column name to row
Now, this is where I'm stuck. I need to remove the portion of the column name that is the Activity Date (e.g. (05/19/2020)) and create a row for each id and ActivityDate with additional columns for Date and Type like this:
id
ActivityDate
Date
Type
12345678
05/19/2020
null
null
12345678
06/03/2020
06/01/2020
E
98765432
05/19/2020
05/18/2020
B
98765432
10/23/2020
10/26/2020
T

I followed your link to the create_jsonb_flat_view article yesterday and then forgot this question. While I thank you for pointing me there, I think that mentioning it worked against you.
A more conventional approach using regexp_replace() works here. I left the date values as strings, but you can convert them with to_date() if needed:
with parse as (
select id, e.k, e.v,
regexp_replace(e.k, '\s+\([0-9/]{10}\)', '') as k_no_date,
regexp_replace(e.k, '^.+([0-9/]{10}).+', '\1') as k_date_only
from rawinput
cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
k_date_only as activity_date,
min(v) filter (where k_no_date = 'Date') as date,
min(v) filter (where k_no_date = 'Type') as type
from parse
group by id, k_date_only;
db<>fiddle here

#Mike-Organek's Answer works beautifully!
However, I was curious if the regexp_replace() calls might be slowing the query down a bit and it seemed I could get the same results using a simpler function.
Since Mike gave me a great example to start with I modified it to split on the space between Date and (05/19/2020).
For 20,000 rows, it went from taking an avg of 7 sec on my local machine to an avg of .9 sec.
Here is the resulting query:
with parse as (
select id, e.k, e.v,
split_part(e.k, ' ', 1) as k_no_date,
trim(split_part(e.k, ' ', 2),'()') as k_date_only
from rawinput
cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
k_date_only as activity_date,
min(v) filter (where k_no_date = 'Date') as date,
min(v) filter (where k_no_date = 'Type') as type
from parse
group by id, k_date_only;

Extract all the values in jsonb into a row

I'm using postgresql 11, I have a jsonb which represent a row of that table, it's look like
{"userid":"test","rolename":"Root","loginerror":0,"email":"superadmin#ae.com",...,"thirdpartyauthenticationkey":{}}
is there any method that I could gather all the "values" of the jsonb into a string which is separated by ',' and without the keys?
The string I want to obtain with the jsonb above is like
(test, Root, 0, superadmin#ae.com, ..., {})
I need to keep the ORDER of those values as what their keys were in the jsonb. Could I do that with postgresql?

You can use the jsonb_populate_record function (assuming your json data does match the users table). This will force the text value to match the order of your users table:
Schema (PostgreSQL v13)
CREATE TABLE users (
userid text,
rolename text,
loginerror int,
email text,
thirdpartyauthenticationkey json
)
Query #1
WITH d(js) AS (
VALUES
('{"userid":"test", "rolename":"Root", "loginerror":0, "email":"superadmin#ae.com", "thirdpartyauthenticationkey":{}}'::jsonb),
('{"userid":"other", "rolename":"User", "loginerror":324, "email":"nope#ae.com", "thirdpartyauthenticationkey":{}}'::jsonb)
)
SELECT jsonb_populate_record(null::users, js),
jsonb_populate_record(null::users, js)::text AS record_as_text,
pg_typeof(jsonb_populate_record(null::users, js)::text)
FROM d
;
jsonb_populate_record
record_as_text
pg_typeof
(test,Root,0,superadmin#ae.com,{})
(test,Root,0,superadmin#ae.com,{})
text
(other,User,324,nope#ae.com,{})
(other,User,324,nope#ae.com,{})
text
Note that if you're building this string to insert it back into postgresql then you don't need to do that, since the result of jsonb_populate_record will match your table:
Query #2
WITH d(js) AS (
VALUES
('{"userid":"test", "rolename":"Root", "loginerror":0, "email":"superadmin#ae.com", "thirdpartyauthenticationkey":{}}'::jsonb),
('{"userid":"other", "rolename":"User", "loginerror":324, "email":"nope#ae.com", "thirdpartyauthenticationkey":{}}'::jsonb)
)
INSERT INTO users
SELECT (jsonb_populate_record(null::users, js)).*
FROM d;
There are no results to be displayed.
Query #3
SELECT * FROM users;
userid
rolename
loginerror
email
thirdpartyauthenticationkey
test
Root
0
superadmin#ae.com
[object Object]
other
User
324
nope#ae.com
[object Object]
View on DB Fiddle

You can use jsonb_each_text() to get a set of a text representation of the elements, string_agg() to aggregate them in a comma separated string and concat() to put that in parenthesis.
SELECT concat('(', string_agg(value, ', '), ')')
FROM jsonb_each_text('{"userid":"test","rolename":"Root","loginerror":0,"email":"superadmin#ae.com","thirdpartyauthenticationkey":{}}'::jsonb) jet (key,
value);
db<>fiddle
You didn't provide DDL and DML of a (the) table the JSON may reside in (if it does, that isn't clear from your question). The demonstration above therefore only uses the JSON you showed as a scalar. If you have indeed a table you need to CROSS JOIN LATERAL and GROUP BY some key.
Edit:
If you need to be sure the order is retained and you don't have that defined in a table's structure as #Marth's answer assumes, then you can of course extract every value manually in the order you need them.
SELECT concat('(',
concat_ws(', ',
j->>'userid',
j->>'rolename',
j->>'loginerror',
j->>'email',
j->>'thirdpartyauthenticationkey'),
')')
FROM (VALUES ('{"userid":"test","rolename":"Root","loginerror":0,"email":"superadmin#ae.com","thirdpartyauthenticationkey":{}}'::jsonb)) v (j);
db<>fiddle

Querying Postgres SQL JSON Column

I have a json column (json_col) in a postgres database with the following structure:
{
"event1":{
"START_DATE":"6/18/2011",
"END_DATE":"7/23/2011",
"event_type":"active with prior experience"
},
"event2":{
"START_DATE":"8/20/11",
"END_DATE":"2/11/2012",
"event_type":"active"
}
}
[example of table structure][1]
How can I make a select statement in postgres to return the start_date and end_date with a where statement where "event_type" like "active"?
Attempted Query:
select person_id, json_col#>>'START_DATE' as event_start, json_col#>>'END_DATE' as event_end
from data
where json_col->>'event_type' like '%active%';
Returns empty columns.
Expected Response:
event_start
6/18/2011
8/20/2011

It sounds like you want to unnest your json structure, ignoring the top level keys and just getting the top level values. You can do this with jsonb_each, looking at resulting column named 'value'. You would put the function call in the FROM list as a lateral join (but since it is a function call, you don't need to specify the LATERAL keyword, it is implicit)
select value->>'START_DATE' from data, jsonb_each(json_col)
where value->>'event_type' like '%active%';

Build jsonb array from jsonb field

I have column options with type jsonb , in format {"names": ["name1", "name2"]} which was created with
UPDATE table1 t1 SET options = (SELECT jsonb_build_object('names', names) FROM table2 t2 WHERE t2.id= t1.id)
and where names have type jsonb array.
SELECT jsonb_typeof(names) FROM table2 give array
Now I want to extract value of names as jsonb array. But query
SELECT jsonb_build_array(options->>'names') FROM table
gave me ["[\"name1\", \"name2\"]"], while I expect ["name1", "name2"]
How can I get value in right format?

The ->> operator will return the value of the field (in your case, a JSON array) as a properly escaped text. What you are looking for is the -> operator instead.
However, note that using the jsonb_build_array on that will return an array containing your original array, which is probably not what you want either; simply using options->'names' should get you what you want.

Actually, you don't need to use jsonb_build_array() function.
Use select options -> 'names' from table; This will fix your issue.
jsonb_build_array() is for generating the array from jsonb object. You are following wrong way. That's why you are getting string like this ["[\"name1\", \"name2\"]"].
Try to execute this sample SQL script:
select j->'names'
from (
select '{"names": ["name1", "name2"]}'::JSONB as j
) as a;

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

"Function does not exist" errors when trying to split column containing array of timestampz into delimited text string in Postgres - postgresql

I found a way to accomplish this by casting from timestampz to text and then back again like this: (nullif(split_part(string_agg(distinct event_date::text,'; '),'; ',1),'')::date) as date1,

Related

Redshift how to split a stringified array into separate parts

Create rows from part of column names

Extract all the values in jsonb into a row

Querying Postgres SQL JSON Column

Build jsonb array from jsonb field

Categories

Resources