How to explode a list of string into new columns - postgresql-14

I have a table in PostgreSQL 14 like this one:
text | classes
some string | [food, drink]
another string | [food, medicine, drink]
another random | [car]
And I want to get this as output:
text | class_1 | class_2 | class_3
some string | food | drink |
another string | food | medicine | drink
another random | car | |
So I want to strip off the [] and explode each of the strings into its own column.
I am trying:
select text, replace(replace(unnest(string_to_array(classes, ',')), '[', ' '),']','') from tbl
but I am getting each class on its own row, which duplicates the text column.
Also, is there any clean way to remove the []?

You can use translate to get rid of the square brackets. Then use split_part to get one column for each element:
select "text",
split_part(translate(classes, '[]', ''), ',', 1) as class_1,
split_part(translate(classes, '[]', ''), ',', 2) as class_2,
split_part(translate(classes, '[]', ''), ',', 3) as class_3
from the_table;
It is not possible to make this dynamic. A fundamental restriction of the SQL language is that the number, names, and data types of all result columns must be known before the database starts retrieving data.
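If the number of classes varies, one workaround is to return a single array column instead of a fixed set of columns. A minimal sketch, assuming the sample rows from the question (the inline the_table CTE and the class_list alias are illustrative only):

```sql
-- Sample rows copied from the question; the CTE stands in for the real table.
with the_table("text", classes) as (
    values ('some string',    '[food, drink]'),
           ('another string', '[food, medicine, drink]'),
           ('another random', '[car]')
)
select "text",
       -- strip the brackets, then split on commas plus any surrounding whitespace
       regexp_split_to_array(translate(classes, '[]', ''), '\s*,\s*') as class_list
from the_table;
```

In an outer query, individual elements can be read with a subscript such as class_list[1]; a subscript past the end of the array yields NULL. Splitting on '\s*,\s*' also drops the space after each comma, which a plain ',' delimiter would keep.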

Related

Create rows from part of column names

Source data
I am working on an ELT project to load data from CSV files into PostgreSQL where I will transform it. The CSV files have many columns that are consistent across files, but also contain activity columns that are inconsistent with names like Date (05/19/2020), Type (05/19/2020), etc.
In the loading script I am merging all of the columns with dates in the column name into one jsonb column so I don't have to constantly add new columns to the raw data table.
The resulting jsonb column in the raw data table looks like this:
id | activity
12345678 | {"Date (05/19/2020)": null, "Type (05/19/2020)": null, "Date (06/03/2020)": "06/01/2020", "Type (06/03/2020)": "E"}
98765432 | {"Date (05/19/2020)": "05/18/2020", "Type (05/19/2020)": "B", "Date (10/23/2020)": "10/26/2020", "Type (10/23/2020)": "T"}
JSON to columns
Using the amazing create_jsonb_flat_view function from this post I can convert the jsonb to columns like this:
id | Date (05/19/2020) | Type (05/19/2020) | Date (06/03/2020) | Type (06/03/2020) | Date (10/23/2020) | Type (10/23/2020)
10629465 | null | null | 06/01/2020 | E | |
98765432 | 05/18/2020 | B | | | 10/26/2020 | T
Need to move part of column name to row
Now, this is where I'm stuck. I need to remove the portion of the column name that is the Activity Date (e.g. (05/19/2020)) and create a row for each id and ActivityDate with additional columns for Date and Type like this:
id | ActivityDate | Date | Type
12345678 | 05/19/2020 | null | null
12345678 | 06/03/2020 | 06/01/2020 | E
98765432 | 05/19/2020 | 05/18/2020 | B
98765432 | 10/23/2020 | 10/26/2020 | T
I followed your link to the create_jsonb_flat_view article yesterday and then forgot this question. While I thank you for pointing me there, I think that mentioning it worked against you.
A more conventional approach using regexp_replace() works here. I left the date values as strings, but you can convert them with to_date() if needed:
with parse as (
select id, e.k, e.v,
regexp_replace(e.k, '\s+\([0-9/]{10}\)', '') as k_no_date,
regexp_replace(e.k, '^.+([0-9/]{10}).+', '\1') as k_date_only
from rawinput
cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
k_date_only as activity_date,
min(v) filter (where k_no_date = 'Date') as date,
min(v) filter (where k_no_date = 'Type') as type
from parse
group by id, k_date_only;
db<>fiddle here
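The answer leaves the date values as strings and mentions to_date(). A small sketch of that conversion, assuming every value uses the MM/DD/YYYY layout seen in the sample data (the literals and aliases are illustrative):

```sql
-- Illustrative literals in the same MM/DD/YYYY layout as the question's data.
select to_date('05/19/2020', 'MM/DD/YYYY')  as activity_date,
       to_date(null::text, 'MM/DD/YYYY')    as missing_date;  -- NULL in, NULL out
```

In the query above, the same call could wrap k_date_only and the two min(v) expressions.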
@Mike-Organek's answer works beautifully!
However, I was curious whether the regexp_replace() calls might be slowing the query down, and it seemed I could get the same results with a simpler function.
Since Mike gave me a great example to start with, I modified it to split on the space between Date and (05/19/2020).
For 20,000 rows, the average run time on my local machine went from 7 seconds to 0.9 seconds.
Here is the resulting query:
with parse as (
select id, e.k, e.v,
split_part(e.k, ' ', 1) as k_no_date,
trim(split_part(e.k, ' ', 2),'()') as k_date_only
from rawinput
cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
k_date_only as activity_date,
min(v) filter (where k_no_date = 'Date') as date,
min(v) filter (where k_no_date = 'Type') as type
from parse
group by id, k_date_only;
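To see what the two expressions produce for a single key, here is a standalone sketch using one of the sample keys. It assumes each key contains exactly one space, between the label and the parenthesised date:

```sql
select split_part('Date (05/19/2020)', ' ', 1)             as k_no_date,    -- Date
       trim(split_part('Date (05/19/2020)', ' ', 2), '()') as k_date_only;  -- 05/19/2020
```

A label that itself contained a space would break this split, whereas the regexp_replace() version would still handle it.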

Set numeric column to equal formatted varchar currency column in PostgreSQL

I have a VARCHAR(1000) column of prices with dollar signs (e.g. $100) and I have created a new NUMERIC(15,2) column, which I'd like to set equal to the prices in the VARCHAR column.
This is what worked for me in MySQL:
UPDATE product_table
SET cost = REPLACE(REPLACE(price, '$', ''), ',','');
but in PostgreSQL it throws an error:
ERROR: column "cost" is of type numeric but expression is of type character
LINE 2: SET cost = REPLACE(REPLACE(price, '$', ''), ',','');
^
HINT: You will need to rewrite or cast the expression.
I tried to follow the hint and tried some Google searches for examples, but my small brain hasn't been able to figure it out.
In PostgreSQL you can do this in one swoop, rather than replacing '$' and ',' in separate calls:
UPDATE product_table
SET cost = regexp_replace(price, '[$,]', '', 'g')::numeric(15,2);
In regexp_replace, the pattern [$,] matches either '$' or ',' and replaces it with the replacement string (the empty string '' in this case); the 'g' flag means every match is replaced, not just the first.
Then you need to cast the resulting string to a numeric(15,2) value.
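A single value that is not numeric after cleaning makes the whole UPDATE fail, so it may help to look for such rows first. A sketch, assuming prices are plain decimal numbers with an optional leading minus sign (the validation pattern is an assumption about the data, not something stated in the question):

```sql
-- Lists prices that would still not cast to numeric after removing '$' and ','.
select price
from product_table
where regexp_replace(price, '[$,]', '', 'g') !~ '^-?[0-9]+(\.[0-9]+)?$';
```

Rows where price is NULL are not listed; they simply produce a NULL cost.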
Simply cast the result of REPLACE with cast .. as numeric.
Try this:
UPDATE product_table
SET cost = CAST(REPLACE(REPLACE(price, '$', ''), ',','') AS NUMERIC);
I wouldn't suggest this table structure, though, because it can lead to anomalies (the cost value can drift out of sync with the price value).
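One way to avoid that kind of drift, if the server is PostgreSQL 12 or later, is a stored generated column that is always derived from price. This is only a sketch; the column name cost_generated is hypothetical, and the question does not say which version is in use:

```sql
-- Hypothetical column name; generated columns require PostgreSQL 12 or later.
ALTER TABLE product_table
    ADD COLUMN cost_generated numeric(15,2)
    GENERATED ALWAYS AS (regexp_replace(price, '[$,]', '', 'g')::numeric(15,2)) STORED;
```

With this in place the numeric value is recomputed whenever price changes, so no separate UPDATE is needed.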

postgres coalesce fields, sum and group by

Maybe someone can help me out with a Postgres query.
The table structure looks like this:
nummer nachname vorname cash
+-------+----------+----------+------+
2 Bert Brecht 0,758
2 Harry Belafonte 1,568
3 Elvis Presley 0,357
4 Mark Twain 1,555
4 Ella Fitz 0,333
…
How can I coalesce the fields where "nummer" are the same and sum the cash values?
My output should look like this:
2 Bert, Brecht 2,326
Harry, Belafonte
3 Elvis, Presley 0,357
4 Mark, Twain 1,888
Ella, Fitz
I think the part to coalesce should work something like this:
array_to_string(array_agg(nachname|| ', ' ||coalesce(vorname, '')), '<br />') as name,
Thanks for any help,
tony
SELECT
nummer,
string_agg(nachname||CASE WHEN vorname IS NULL THEN '' ELSE ', '||vorname END, E'\n') AS name,
sum(cash) AS total_cash
FROM Table1
GROUP BY nummer;
See this SQLFiddle; note that it doesn't display the newline characters between names, but they're still there.
The CASE expression is used instead of coalesce so you don't get a trailing comma on entries with a last name but no first name. If you don't mind a trailing comma, use format('%s, %s', nachname, vorname) instead and avoid all that ugly string concatenation business:
SELECT
nummer, string_agg(format('%s, %s', nachname, vorname), E'\n'),
sum(cash) AS total_cash
FROM Table1
GROUP BY nummer;
If string_agg doesn't work, get a newer PostgreSQL, or mention the version in your questions so it's clear you're using an obsolete version. The query is trivially rewritten to use array_to_string and array_agg anyway.
If you're asking how to sum numbers that're actually represented as text strings like 1,2345 in the database: don't do that. Fix your schema. Format numbers on input and output instead, store them as numeric, float8, integer, ... whatever the appropriate numeric type for the job is.
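If the values really are stored as text with a decimal comma, a stopgap until the schema is fixed is to convert them inside the aggregate. A sketch using the sample figures from the question (the inline table1 CTE stands in for the real table, and the text column type is an assumption):

```sql
-- Sample values from the question, assumed to be stored as text.
with table1(nummer, cash) as (
    values (2, '0,758'), (2, '1,568'), (3, '0,357'), (4, '1,555'), (4, '0,333')
)
select nummer,
       sum(replace(cash, ',', '.')::numeric) as total_cash  -- 2.326, 0.357, 1.888
from table1
group by nummer
order by nummer;
```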

Postgres query: array_to_string with empty values

I am trying to combine rows and concatenate two columns (name, vorname) in a Postgres query.
This works well like this:
SELECT nummer,
array_to_string(array_agg(name|| ', ' ||vorname), '\n') as name
FROM (
SELECT DISTINCT
nummer, name, vorname
FROM myTable
) AS m
GROUP BY nummer
ORDER BY nummer;
Unfortunately, if "vorname" is empty, I get no result even though name has a value.
Is it possible to get this working:
array_to_string(array_agg(name|| ', ' ||vorname), '\n') as name
even if one column is empty?
Use coalesce to convert NULL values to something that you can concatenate:
array_to_string(array_agg(name|| ', ' ||coalesce(vorname, '<missing>')), '\n')
Also, you can concatenate strings directly without collecting them to an array by using the string_agg function.
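A sketch of the string_agg variant, with two made-up rows standing in for myTable. Wrapping the comma and first name together in coalesce means a missing first name leaves no trailing comma:

```sql
-- Made-up rows; the second one has no first name.
with mytable(nummer, name, vorname) as (
    values (1, 'Brecht', 'Bert'),
           (1, 'Twain',  null)
)
select nummer,
       -- ', ' || vorname is NULL when vorname is NULL, so coalesce drops both
       string_agg(name || coalesce(', ' || vorname, ''), E'\n') as name
from mytable
group by nummer;
```

The E'\n' form is an escape string that yields a real newline; with default settings in current versions, a plain '\n' is a literal backslash followed by n.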
If you have 9.1 or later, you can pass a third parameter to array_to_string, the string to substitute for NULL elements:
select array_to_string(array_agg(name), ',', '<missing>') from bbb;

Split a column value into two columns in a SELECT?

I have a string value in a varchar column. It is a string that has two parts. Splitting it before it hits the database is not an option.
The column's values look like this:
one_column:
'part1 part2'
'part1 part2'
So what I want is a result set that looks like:
col1,col2:
part1,part2
part1,part2
How can I do this in a SELECT statement? I found a pgsql function to split the string into an array but I do not know how to get it into two columns.
select split_part(one_column, ' ', 1) AS part1,
split_part(one_column, ' ', 2) AS part2 ...
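Since the question mentions already having a function that splits the string into an array, the same result can also be reached by subscripting that array. A sketch, assuming the function in question is string_to_array (the inline CTE t is illustrative):

```sql
with t(one_column) as (
    values ('part1 part2')
)
select (string_to_array(one_column, ' '))[1] as col1,
       (string_to_array(one_column, ' '))[2] as col2
from t;
```

The parentheses around the function call are required before the subscript.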