How to use Redshift regex to extract the numbers from an array-like string

In Redshift, I have a column that contains an array-like string such as [1,2,3], and I want to return 1,2,3 using Redshift's regex functionality. How can I do this? I don't want to do this:
SELECT LISTAGG(option_name , ',') WITHIN GROUP (ORDER BY option_name) as pets_names
FROM reference.vital_options
WHERE option_id in
(
-- this nested CTE splits the json string array into comma separated pet ids
with NS AS (
SELECT vo.option_id + 1 as n
FROM <column with number id> as vo
WHERE upper(vo.country) = 'US'
...
)
select TRIM(JSON_EXTRACT_ARRAY_ELEMENT_TEXT(u.pets_vital, NS.n - 1)) AS val
FROM NS
INNER JOIN go_prod.users AS u ON NS.n <= JSON_ARRAY_LENGTH(u.pets_vital)
WHERE u.id = %(user_id)s
)
AND ...

Are you just trying to remove the square brackets? If so, the translate() function is likely what you want to use. For example:
create table test as (select '[1,2,3]'::text as A);
select a, translate(a, '][', '') as b from test;
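Since the question asks specifically about regex, here is a minimal sketch of the same result with regexp_replace(). It uses a string literal instead of the test table, and it assumes Redshift's rule that a backslash in a pattern literal must be doubled:

```sql
-- Sketch: remove every '[' or ']' with a regular expression.
-- The backslashes are doubled because Redshift treats a single
-- backslash inside a string literal as an escape character.
select regexp_replace('[1,2,3]', '\\[|\\]', '') as b;
```

This should return 1,2,3, the same as the translate() version. For a fixed set of single characters, translate() is probably the simpler choice.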

Related

How to get a row for each occurrence of Id in IN clause?

I have a list of repeating ids, and I need to fetch some additional data for each of them to populate an XLS spreadsheet. How can I do that? The IN clause returns only one match per id, but I need a row for each occurrence of an id. I looked at PIVOT, thinking I could create a select list and then do an inner join.
Select m.Id, m.LegalName, m.OtherId
from MyTable m
where m.OtherId in (1,2,1,1,3,1,4,4,2,1)
You can use a VALUES clause:
SELECT t.id as OtherId, m.id, m.LegalName
FROM ( VALUES (1),(2),(1),(1),(3),(1),(4),(4),(2),(1)
) t(ID) INNER JOIN
MyTable m
ON m.OtherId = t.id;

Split and sequentially join string parts in Postgresql

I need to create a DB view that contains every cumulative prefix built from the "_"-separated parts of a source column. Example:
IN:
tag
--------
A_B_C_D
X_Y_Z
OUT:
subtag
--------
A
A_B
A_B_C
A_B_C_D
X
X_Y
X_Y_Z
The answer seems to be somewhere around WITH RECURSIVE, but I cannot put it all together.
SELECT
array_to_string( -- 3
array_agg(t.value) OVER (PARTITION BY tag ORDER BY t.number), -- 2
'_'
) AS subtag
FROM
tags,
regexp_split_to_table(tag, '_') WITH ORDINALITY as t(value, number) -- 1
1. Split the string into one row per element. WITH ORDINALITY adds a row number that can be used to keep the original order of the elements.
2. Use array_agg() as a window function to aggregate the elements. The ORDER BY makes the aggregation cumulative.
3. Convert the array back into a string.
You can use a recursive query:
WITH RECURSIVE s AS (
SELECT tag FROM tag
UNION
SELECT regexp_replace(tag, '_[^_]*$', '') FROM s
)
SELECT * FROM s;
tag
---------
A_B_C_D
X_Y_Z
A_B_C
X_Y
A_B
X
A
(7 rows)
The idea is to successively cut off _* at the end.
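To see what a single step does, it may help to run the replacement on its own. This is a small sketch that uses literal strings instead of the tag table and assumes PostgreSQL's regexp_replace():

```sql
-- One recursion step: remove the last '_'-delimited part.
SELECT regexp_replace('A_B_C_D', '_[^_]*$', '');  -- A_B_C

-- With no underscore left, the pattern does not match
-- and the string comes back unchanged.
SELECT regexp_replace('A', '_[^_]*$', '');        -- A
```

Because UNION (unlike UNION ALL) discards rows that were already produced, the unchanged 'A' adds nothing new and the recursion stops there.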
Thanks a lot, @laurenz-albe! Your code has no explicit condition to stop the recursion, so I ended up with this:
WITH RECURSIVE s AS (
SELECT tag FROM tag
UNION
SELECT regexp_replace(tag, '_[^_]*$', '')
FROM s
WHERE tag LIKE '%\_%'
)
SELECT * FROM s;

Fuzzy string search in array with postgresql

This is how I do fuzzy string search in postgresql:
select * from table where levenshtein(name, 'value') < 2;
But what can I do if the 'name' column contains an array?
P.S.: The query has to use an index, and that is what makes this case different.
You can use unnest() over the array:
select * from
(
select unnest(name) as name_in_array, id from
(
select 1 as id, ARRAY['value1','valu','lav'] as name
union all
select 2 as id, ARRAY['value2','orange','yellow'] as name
)t1
) t2
where levenshtein(name_in_array, 'value') < 2;

How to count the frequency of integers in a set of querystrings in postgres

I have a column in a postgres database which logs search querystrings for a page on our website.
The column contains data like
"a=2&b=4"
"a=2,3"
"b=4&a=3"
"a=4&a=3"
I'd like to work out the frequency of each value for a certain parameter (a).
value | freq
------|------
3 | 3
2 | 2
4 | 1
Is there any way to do this in a single SQL statement?
Something like this:
with all_values as (
select string_to_array(split_part(parameter, '=', 2), ',') as query_params
from the_table d,
unnest(string_to_array(d.querystring, '&')) as x(parameter)
where x.parameter like 'a=%'
)
select t.value, count(*)
from all_values av, unnest(av.query_params) as t(value)
group by t.value
order by t.value;
Online example: http://rextester.com/OXM67442
Try something like this:
select data_value,count(*) from (
select data_name,unnest(string_to_array(data_values,',')) data_value from (
select split_part(data_array,'=',1) data_name ,split_part(data_array,'=',2) data_values from (
select unnest(string_to_array(mydata,'&')) data_array from mytable
) a
) b
) c where data_name='a' group by 1 order by 1
Assuming the table that keeps the counts is called paramcount:
WITH vals(v) AS
(SELECT regexp_replace(p, '^.*=', '')
FROM regexp_split_to_table(
'b=4&a=3,2',
'&|,'
) p(p)
)
INSERT INTO paramcount (value, freq)
SELECT v, 1 FROM vals
ON CONFLICT (value)
DO UPDATE SET freq = paramcount.freq + 1
WHERE paramcount.value = EXCLUDED.value;
1. Get the comma-separated integers after 'a='.
2. Split them into individual numbers.
3. Count the values.
select v, count(*) from (
SELECT c,unnest(string_to_array(unnest(regexp_matches(c,'a=([0-9,]+)','g')),',')) as v FROM qrs
) x group by v;
Parametrize:
WITH argname(aname) as (values ('a'::TEXT))
select v, count(*) from (
SELECT c,unnest(string_to_array(unnest(regexp_matches(c,aname||'=([0-9,]+)','g')),',')) as v
FROM qrs,argname
) x group by v;

T-SQL Query, combine columns from multiple rows into single column

I have seen some examples of what I am trying to do using COALESCE and FOR XML (which seems like the better solution). I just can't quite get the syntax right.
Here is what I have (I will shorten the fields to only the key ones):
Table             Fields
----------------  -------------------------------
Requisition       ID, Number
IssuedPO          ID, Number
Job               ID, Number
Job_Activity      ID, JobID (fkey)
RequisitionItems  ID, RequisitionID (fkey), IssuedPOID (fkey), Job_ActivityID (fkey)
I need a query that lists ONE Requisition per line with its associated Jobs and IssuedPOs. (Requisition numbers start with "R-" and Job numbers start with "J-".)
Example:
R-123 | "PO1; PO2; PO3" | "J-12345; J-6780"
Sure thing Adam!
Here is a query that returns multiple rows. I have to use outer joins, since not all Requisitions have RequisitionItems that are assigned to Jobs and/or IssuedPOs (in that case their fkey IDs would just be null of course).
SELECT DISTINCT Requisition.Number, IssuedPO.Number, Job.Number
FROM Requisition
INNER JOIN RequisitionItems on RequisitionItems.RequisitionID = Requisition.ID
LEFT OUTER JOIN Job_Activity on RequisitionItems.Job_ActivityID = Job_Activity.ID
LEFT OUTER JOIN Job on Job_Activity.JobID = Job.ID
LEFT OUTER JOIN IssuedPO on RequisitionItems.IssuedPOID = IssuedPO.ID
Here's one way to do it using subqueries:
select 'R-' + cast(r.number as varchar(32)) as RequisitionNumber
, (
select 'PO' + CAST(ip.number as varchar(32)) + ';'
from IssuedPO ip
join RequisitionItems ri
on ip.id = ri.IssuedPOID
where ri.RequisitionID = r.id
for xml path('')
) as POList
, (
select 'J-' + CAST(j.number as varchar(32)) + ';'
from Job j
join Job_Activity ja
on j.id = ja.JobID
join RequisitionItems ri
on ri.Job_ActivityID = ja.id
where ri.RequisitionID = r.id
for xml path('')
) as JobList
from Requisition r
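Note that the subqueries above leave a trailing ';' on each list. One common way to get the "PO1; PO2; PO3" shape from the question is to put the separator in front of each item and strip the first one with STUFF(). A minimal sketch, with hypothetical literal values standing in for the IssuedPO rows:

```sql
-- Sketch: the values PO1..PO3 are made-up sample data, not real rows.
-- Each item is prefixed with '; ', then STUFF removes the first
-- two characters (the leading '; ') from the concatenated string.
select stuff(
    (
        select '; ' + t.v
        from (values ('PO1'), ('PO2'), ('PO3')) as t(v)
        order by t.v
        for xml path('')
    ), 1, 2, '') as POList;
-- POList: PO1; PO2; PO3
```

The same wrapper should fit around each of the two correlated subqueries in the answer, with the concatenation changed from value + ';' to '; ' + value.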