Extract all matches with a specific pattern in DuckDB - postgresql

I am trying to translate a query from Postgres to DuckDB that does the following: for a given string the query returns
All numbers
All pairs of consecutive tokens
The original Postgres queries are:
select (regexp_matches('34 121 adelaide st melbourne 3000', '[\d]+', 'g'))[1];
select (regexp_matches ( '34 121 adelaide st melbourne 3000', '[a-z0-9]+ [a-z0-9]+', 'g' ))[1] union select (regexp_matches ( regexp_replace ( '34 121 adelaide st melbourne 3000', '[a-z0-9]+', '' ), '[a-z0-9]+ [a-z0-9]+', 'g' ))[1];
For example, given the string '34 121 adelaide st melbourne 3000':
Return a table with row values 34, 121, 3000
Return a table with row values '34 121', '121 adelaide', 'adelaide st', 'st melbourne', 'melbourne 3000']
Using the regexp_extract function I can only return the first match. E.g.,
select regexp_extract('34 121 adelaide st melbourne 3000', '[\d]+');
produces '34' but none of the other digits.
Similarly select regexp_extract('34 121 adelaide st melbourne 3000', '[a-z0-9]+ [a-z0-9]+'); produces '34 121'
The second query I can re-write using a join to produce the correct results (although I would still prefer to do this in a simpler way).
Would anyone be able to assist?
I tried `select regexp_extract('34 121 adelaide st melbourne 3000', '[\d]+');' which results in a table containing only '34' and none of the other numbers.

Related

Pivoting results from CTE in Postgres

I have a large SQL statements(PostgreSQL version 11) with many CTE's, i want to use the results from an intermediary CTE to create a PIVOTed set of results and join it with other CTE.
Below is a small part of my query and the CTE "previous_months_actual_sales" is the one i need to PIVOT.
,last_24 as
(
SELECT l_24m::DATE + (interval '1' month * generate_series(0,24)) as last_24m
FROM last_24_month_start LIMIT 24
)
,previous_months_actual_sales as
(
SELECT TO_CHAR(created_at,'YYYY-MM') as dates
,b.code,SUM(quantity) as qty
FROM base b
INNER JOIN products_sold ps ON ps.code=b.code
WHERE TO_CHAR(created_at,'YYYY-MM')
IN(SELECT TO_CHAR(last_24m,'YYYY-MM') FROM last_24)
GROUP BY b.code,TO_CHAR(created_at,'YYYY-MM')
)
SELECT * FROM previous_months_actual_sales
The results of this CTE "previous_months_actual_sales" is shown below,
dates code qty
"2018-04" "0009" 23
"2018-05" "0009" 77
"2018-06" "0008" 44
"2018-07" "0008" 1
"2018-08" "0009" 89
The expected output based on the above result is,
code. 2018-04. 2018-05. 2018-06. 2018-07. 2018-08
"0009". 23 77 89
"0008". 44 1
Is there a way to achieve this?

split_part function from nth split till end of string

After the second space, I need to fetch the values till the particular position in the string.
Source:
"8 115 MACKIE STREET VICTORIA PARK WA 6100 AU"
"6A CAMBOON ROAD MORLEY WA 6062 AU"
output:
"MACKIE STREET VICTORIA PARK"
"CAMBOON ROAD MORLEY"
I'm trying to split the street name and suburb from the unit #,street# present in the beginning and the state, postcode, country present in the end.
t=# with s(v) as (values('6A CAMBOON ROAD MORLEY WA 6062 AU'),('8 115 MACKIE STREET VICTORIA PARK WA 6100 A'))
, split as (select *,count(1) over (partition by v) from s, regexp_matches(v,'( [A-Z]+)','g') with ordinality t(m,o))
select distinct v,string_agg(m[1],'') over (partition by v) from split where o <= count-(3-1);
v | string_agg
---------------------------------------------+------------------------------
8 115 MACKIE STREET VICTORIA PARK WA 6100 A | MACKIE STREET VICTORIA PARK
6A CAMBOON ROAD MORLEY WA 6062 AU | CAMBOON ROAD MORLEY
(2 rows)
I excluded index (or any not fitting mask [A-Z]+) thus cutting not three positions from the end, but two (3-1) where 1 is ahead known index.
Also I start not from the second space as it would be against your desired result...

results mismatched when retrieved dates from column of type character varying

I have two tables,i want to get the min and max date stored in table1 cfrange column which is of type character varying.
table1 and table2 is mapped using sid. i want to get the max and min date range when compared with sid of table2.
table1:
sid cfrange
100 3390
101 8000
102 5/11/2010
103 11/12/2016
104 01/03/2016
105 4000
106 4000
107 03/12/2017
108 03/11/2016
109 4/04/2018
110 10/12/2016
table2:
sid description
102 success
103 success
104 Proceeding
107 success
108 success
I tried as below but its not giving the correct min and max value.Please advice.
select max(t1.cfrange),min(t1.cfrange) from table1 t1,table2 t2 where t1.sid=t2.sid;
You should join two tables and cast cfrange as a date and cross your fingers. (May be you must format it as a date before to cast it).
create table table1 (sid int, cfrange varchar(30));
insert into table1 values
(100, '3390'),
(101, '8000'),
(102, '5/11/2010'),
(103, '11/12/2016'),
(104, '01/03/2016'),
(105, '4000'),
(106, '4000'),
(107, '03/12/2017'),
(108, '03/11/2016'),
(109, '4/04/2018'),
(110, '10/12/2016');
create table table2 (sid int, description varchar(30));
insert into table2 values
(102, 'success'),
(103, 'success'),
(104, 'Proceeding'),
(107, 'success'),
(108, 'success');
select 'Min' as caption, min(cfrange) as value
from (select table1.sid, table1.cfrange::date
from table1
inner join table2
on table1.sid = table2.sid) tt
UNION ALL
select 'Max' as caption, max(cfrange) as value
from (select table1.sid, table1.cfrange::date
from table1
inner join table2
on table1.sid = table2.sid) tt;
caption | value
:------ | :---------
Min | 2010-11-05
Max | 2017-12-03
dbfiddle here

PostgreSQL How to check range of integer in case statement

I am having problem to fetch query in which i have a check user score in range to display the grade if user score between 75 and 100 then its A. If user score between 60- 75 then its B and .. so on .
I am getting this values
CASE users.points_earned
WHEN 75-100 THEN
'A+'
WHEN 60-75 THEN
'A'
WHEN 40-60 THEN
'B+'
WHEN 1--40 THEN
'B'
ELSE
'Absent'
end as rank
buts not working how to check range in case statement of postgresql
You can use BETWEEN for check ranges.
WITH users(points_earned) as(
select 75
union all
select 90
union all
select 200
)
SELECT CASE
WHEN users.points_earned BETWEEN 40 AND 75 THEN 'A+'
WHEN users.points_earned BETWEEN 76 AND 100 THEN 'A'
ELSE 'Absent'
END as rank
FROM users

PGSQL duplicate record in same column

i have a table and i want to know where duplicate records are present for same columns. These are my columns and i want to get record where group_id or week are different for same code and fweek and newcode
Id newcode fweek code group_id week
1 343001 2016-01 343 100 8
2 343002 2016-01 343 100 8
3 343001 2016-01 343 101 08
Required record is
Id newcode fweek code group_id week
3 343001 2016-01 343 101 08
To find the duplicate values i have joined the table with itself.
and we need to group the results with code,fweek and newcode to get more than one duplicate rows if they exist. i have used max() to get last inserted row.
you don't need to use is distinct from (it is same for inequality + NULL). if you don't want to compare NULL ones, use <> operator.
You find more information about here info
select r.*
from your_table r
where r.id in (select max(r.id)
from your_table r
join your_table r2 on r2.code = r.code and r2.fweek = r.fweek and r2.newcode = r.newcode
where
r2.group_id is distinct from r.group_id or
r2.week is distinct from r.week
group by r.code,
r.fweek,
r.newcode
having count(*) > 1)