Redshift how to split a stringified array into separate parts - postgresql

Say I have a varchar column, let's say religions, whose values look like this: ["Christianity", "Buddhism", "Judaism"] (yes, the brackets are part of the string). I want the string (not an array) split into multiple rows like "Christianity", "Buddhism", "Judaism" so it can be used in a WHERE clause.
Eventually I want to use the results of the query in a WHERE clause like this:
SELECT ...
FROM religions
WHERE name in
(
<this subquery>
)
How can one do this?

You can use the JSON_PARSE function to convert the varchar string into a SUPER array. Then you can use the strategy described in Convert varchar array to rows in redshift - Stack Overflow to unnest the array into separate rows.
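A minimal sketch of that approach, assuming a hypothetical table profiles with the varchar column religions (Redshift can unnest a SUPER array by navigating it in the FROM clause):
-- Assumed names: profiles / religions. JSON_PARSE turns the varchar into a SUPER array,
-- and the FROM-clause navigation (parsed AS p, p.religions_arr AS r) unnests it into rows.
WITH parsed AS (
    SELECT JSON_PARSE(religions) AS religions_arr
    FROM profiles
)
SELECT r::varchar AS religion
FROM parsed AS p, p.religions_arr AS r;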

You can do the following.
Create a temporary table with a sequence of numbers.
Using that sequence and the split_part function available in Redshift, you can split the values by cross joining against the numbers generated in the temporary table.
To strip the double quotes and square brackets, you can use the regexp_replace function in Redshift.
create temp table seq as
with recursive numbers(NUMBER) as
(
select 1 UNION ALL
select NUMBER + 1 from numbers where NUMBER < 28
)
select * from numbers;
select regexp_replace(split_part(val,',',seq.number),'[]["]','') as value
from
(select '["christianity","Buddhism","Judaism"]' as val) -- You can select the actual column from the table here.
cross join
seq
where seq.number <= regexp_count(val,'[,]')+1;
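If the stringified array lives in an actual table, a sketch of plugging this back into the IN clause from the question might look like the following (the table name profiles and its column religions_str are assumed names):
-- Assumed names: profiles holds the stringified array in a varchar column religions_str.
SELECT r.*
FROM religions r
WHERE r.name IN
(
    SELECT regexp_replace(split_part(p.religions_str, ',', seq.number), '[]["]', '')
    FROM profiles p
    CROSS JOIN seq
    WHERE seq.number <= regexp_count(p.religions_str, '[,]') + 1
);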

Related

Split the column value and make key as column name in postgres query

I have the table with the column value as below:
data_as_of_date:20210202 unique_cc:3999
data_as_of_date:20220202 unique_cc:1999
i need to convert this column into like this:
data_as_of_date unique_cc
20210202 3999
20220202 1999
Sample data:
create table test (val varchar);
insert into test(val) values ('data_as_of_date:20210202 unique_cc:3999');
insert into test(val) values ('data_as_of_date:20220202 unique_cc:1999');
I have tried unnest with string_to_array and the crosstab function, but it is not working.
You don't need unnest or a crosstab for this. A simple regular expression should do the trick:
select substring(the_column from 'data_as_of_date:([0-9]{8})') as data_as_of_date,
substring(the_column from 'unique_cc:([0-9]{4})') as unique_cc
from the_table;
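With the two sample rows above, this should return something like:
 data_as_of_date | unique_cc
-----------------+-----------
 20210202        | 3999
 20220202        | 1999
(2 rows)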

How to re-map array column values in select in Postgresql?

Is it possible to re-map integer values from a Postgres array column in the select? This is what I have:
select unnest(tag_ids) from mention m where id = 288201;
unnest
---------
-143503
-143564
125192
143604
137694
tag_ids is an integer[] column.
I would like to translate those numbers. Functions like abs(unnest(..)) work, but I found I cannot use a CASE expression. Tx.
If you want to do anything non-trivial with the elements of an array after unnesting it, use the set-returning function like a table in the FROM clause:
select u.tag_id
from mention m
cross join unnest(m.tag_ids) as u(tag_id)
where m.id = 288201;
Now, u.tag_id is an integer column that you can use like any other column, e.g. in a CASE expression.
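For example, a sketch of re-mapping the unnested values with a CASE expression (the id-to-label mapping below is invented purely for illustration):
-- The labels are made up; only the ids come from the sample output above.
select case u.tag_id
         when 125192 then 'news'
         when 143604 then 'sports'
         else 'other'
       end as tag_label
from mention m
cross join unnest(m.tag_ids) as u(tag_id)
where m.id = 288201;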

Fetch rows from postgres table which contains a specific id in jsonb[] column

I have a details table with adeet column defined as jsonb[]
A sample of the data stored in the adeet column is shown in an image in the original post.
I want to return the rows which satisfy id=26088, i.e. rows 1 and 3.
I have tried array operations and JSON operations but it doesn't work as required. Any pointers?
Apparently the column adeet is not really of type JSON/JSONB but rather VARCHAR, so we should fix the format in order to convert it to JSONB. I used the replace() and rtrim()/ltrim() functions for this conversion, and preferred to derive an array so that the jsonb_array_elements() function can be used:
WITH t(jobid,adeet) AS
(
SELECT jobid, replace(replace(replace(adeet,'\',''),'"{','{'),'}"','}')
FROM tab
), t2 AS
(
SELECT jobid, ('['||rtrim(ltrim(adeet,'{'), '}')||']')::jsonb as adeet
FROM t
)
SELECT t.*
FROM t2 t
CROSS JOIN jsonb_array_elements(adeet) j
WHERE (j.value ->> 'id')::int = 26088
Demo
You want to combine JSONB's <@ (is contained by) operator with the generic-array ANY construct.
select * from foobar where '{"id":26088}' <@ ANY (adeet);

PostgreSQL calculate prefix combinations after split

I have a string as input, of the form foo:bar:something:221. I'm looking for a way to generate a table with all prefixes for this string, like:
foo
foo:bar
foo:bar:something
foo:bar:something:221
I wrote the following query to split the string, but can't figure out where to go from there:
select unnest(string_to_array('foo:bar:something:221', ':'));
An option is to simulate a loop over all elements, then take the sub-array from the input for each element index:
with data(input) as (
values (string_to_array('foo:bar:something:221', ':'))
)
select array_to_string(input[1:g.idx], ':')
from data
cross join generate_series(1, cardinality(input)) as g(idx);
generate_series(1, cardinality(input)) generates as many rows as the array has elements, and the expression input[1:g.idx] takes the "sub-array" starting with the first element up to the "idx"-th one. As the output is an array, I use array_to_string to re-create the string representation with the : separator.
You can use string_agg as a window function. The default frame is from the beginning of the partition to the current row:
SELECT string_agg(s, ':') OVER (ORDER BY n)
FROM unnest(string_to_array('foo:bar:something:221', ':')) WITH ORDINALITY AS u(s, n);
string_agg
-----------------------
foo
foo:bar
foo:bar:something
foo:bar:something:221
(4 rows)

Help with T-SQL IN statement with int

I am trying to create the following select statement in a stored proc
@dealerids nvarchar(256)
SELECT *
FROM INVOICES as I
WHERE convert(nvarchar(20), I.DealerID) in (@dealerids)
I.DealerID is an INT in the table, and the parameter for dealerids would be formatted such as
(8820, 8891, 8834)
When I run this with the parameter provided I get no rows back. I know these dealerIDs should return rows, because if I query them individually I get back what I expect.
I think I am doing
WHERE convert(nvarchar(20), I.DealerID) in (@dealerids)
incorrectly. Can anyone point out what I am doing wrong here?
Use a table-valued parameter (new in SQL Server 2008). Set it up by creating the table type:
CREATE TYPE IntTableType AS TABLE (ID INTEGER PRIMARY KEY)
Your procedure would then be:
Create Procedure up_TEST
@Ids IntTableType READONLY
AS
SELECT *
FROM ATable a
WHERE a.Id IN (SELECT ID FROM @Ids)
RETURN 0
GO
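A sketch of how a caller might pass the IDs in (the variable name @DealerIds is invented for illustration):
-- Hypothetical caller: fill a variable of the table type, then hand it to the procedure.
DECLARE @DealerIds IntTableType;
INSERT INTO @DealerIds (ID) VALUES (8820), (8891), (8834);
EXEC up_TEST @Ids = @DealerIds;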
If you can't use table-valued parameters, see "Arrays and Lists in SQL Server 2005 and Beyond, When Table Value Parameters Do Not Cut it" by Erland Sommarskog; there are many ways to split a string in SQL Server, and that article covers the pros and cons of just about every method. In general, you need to create a split function. This is how a split function can be used:
SELECT
*
FROM YourTable y
INNER JOIN dbo.yourSplitFunction(@Parameter) s ON y.ID=s.Value
I prefer the numbers-table approach to splitting a string in T-SQL, but there are numerous ways to split strings in SQL Server; see the previous link, which explains the pros and cons of each.
For the numbers-table method to work, you need to do this one-time table setup, which will create a table Numbers containing the rows 1 to 10,000:
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)
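A quick sanity check on the helper table (it should return 1, 10000 and 10000 if the setup above succeeded):
-- Verify the Numbers table was populated as expected.
SELECT MIN(Number) AS MinNumber, MAX(Number) AS MaxNumber, COUNT(*) AS TotalRows
FROM Numbers;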
Once the Numbers table is set up, create this split function:
CREATE FUNCTION [dbo].[FN_ListToTable]
(
@SplitOn char(1) --REQUIRED, the character to split the @List string on
,@List varchar(8000)--REQUIRED, the list to split apart
)
RETURNS TABLE
AS
RETURN
(
----------------
--SINGLE QUERY-- --this will not return empty rows
----------------
SELECT
ListValue
FROM (SELECT
LTRIM(RTRIM(SUBSTRING(List2, number+1, CHARINDEX(@SplitOn, List2, number+1)-number - 1))) AS ListValue
FROM (
SELECT @SplitOn + @List + @SplitOn AS List2
) AS dt
INNER JOIN Numbers n ON n.Number < LEN(dt.List2)
WHERE SUBSTRING(List2, number, 1) = @SplitOn
) dt2
WHERE ListValue IS NOT NULL AND ListValue!=''
);
GO
You can now easily split a CSV string into a table and join on it:
Create Procedure up_TEST
@Ids VARCHAR(MAX)
AS
SELECT * FROM ATable a
WHERE a.Id IN (SELECT ListValue FROM dbo.FN_ListToTable(',',@Ids))
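To see what the split function returns on its own, a quick check with the dealer IDs from the question should produce one row per value:
-- Each comma-separated value comes back as a row in the ListValue column.
SELECT ListValue FROM dbo.FN_ListToTable(',', '8820,8891,8834');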
You can't use @dealerids like that; you need to use dynamic SQL, like this:
@dealerids nvarchar(256)
EXEC('SELECT *
FROM INVOICES as I
WHERE convert(nvarchar(20), I.DealerID) in (' + @dealerids + ')')
The downside is that you open yourself up to SQL injection attacks unless you specifically control the data going into @dealerids.
There are better ways to handle this depending on your version of SQL Server, which are documented in this great article.
Split @dealerids into a table, then JOIN:
SELECT *
FROM INVOICES as I
JOIN
ufnSplit(@dealerids) S ON I.DealerID = S.ParsedIntDealerID
Assorted split functions here (I'd probably use a numbers table in this case, for a small string).