Extract all matches with a specific pattern in DuckDB

I am trying to translate a query from Postgres to DuckDB. For a given string, the query returns:
All numbers
All pairs of consecutive tokens
The original Postgres queries are:
select (regexp_matches('34 121 adelaide st melbourne 3000', '[\d]+', 'g'))[1];
select (regexp_matches ( '34 121 adelaide st melbourne 3000', '[a-z0-9]+ [a-z0-9]+', 'g' ))[1] union select (regexp_matches ( regexp_replace ( '34 121 adelaide st melbourne 3000', '[a-z0-9]+', '' ), '[a-z0-9]+ [a-z0-9]+', 'g' ))[1];
For example, given the string '34 121 adelaide st melbourne 3000':
Return a table with row values 34, 121, 3000
Return a table with row values '34 121', '121 adelaide', 'adelaide st', 'st melbourne', 'melbourne 3000'
Using the regexp_extract function I can only return the first match. E.g.,
select regexp_extract('34 121 adelaide st melbourne 3000', '[\d]+');
produces '34' but none of the other digits.
Similarly, select regexp_extract('34 121 adelaide st melbourne 3000', '[a-z0-9]+ [a-z0-9]+'); produces only '34 121'.
The second query I can re-write using a join to produce the correct results (although I would still prefer to do this in a simpler way).
Would anyone be able to assist?
I tried `select regexp_extract('34 121 adelaide st melbourne 3000', '[\d]+');`, which results in a table containing only '34' and none of the other numbers.
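
One possible approach, sketched under the assumption that a DuckDB version providing regexp_extract_all is available: that function returns a list of every match, and unnest turns the list into rows. For the pairs, the sketch swaps the regex for a plain split on single spaces (an assumption about the input format, not the original Postgres logic) and pairs each token with its successor.

```sql
-- All numbers: regexp_extract_all returns a LIST of matches; unnest expands it to rows.
-- The values come back as VARCHAR; cast them if integers are needed.
select unnest(regexp_extract_all('34 121 adelaide st melbourne 3000', '\d+')) as num;

-- All pairs of consecutive tokens.
-- Assumption: tokens are separated by exactly one space.
with s as (
    select string_split('34 121 adelaide st melbourne 3000', ' ') as toks
),
idx as (
    -- range(1, n) yields 1 .. n - 1: one index per adjacent pair (DuckDB lists are 1-based)
    select toks, unnest(range(1, len(toks))) as i
    from s
)
select toks[i] || ' ' || toks[i + 1] as pair
from idx;
```

The second query avoids the overlapping-match problem entirely, because a regex engine never returns two matches that share a token, which is why the Postgres version needed a union of two passes.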

Related

Pivoting results from CTE in Postgres

I have a large SQL statement (PostgreSQL version 11) with many CTEs. I want to use the results of an intermediate CTE to create a pivoted result set and join it with another CTE.
Below is a small part of my query; the CTE "previous_months_actual_sales" is the one I need to pivot.
,last_24 as
(
SELECT l_24m::DATE + (interval '1' month * generate_series(0,24)) as last_24m
FROM last_24_month_start LIMIT 24
)
,previous_months_actual_sales as
(
SELECT TO_CHAR(created_at,'YYYY-MM') as dates
,b.code,SUM(quantity) as qty
FROM base b
INNER JOIN products_sold ps ON ps.code=b.code
WHERE TO_CHAR(created_at,'YYYY-MM')
IN(SELECT TO_CHAR(last_24m,'YYYY-MM') FROM last_24)
GROUP BY b.code,TO_CHAR(created_at,'YYYY-MM')
)
SELECT * FROM previous_months_actual_sales
The result of the CTE "previous_months_actual_sales" is shown below:
 dates     | code   | qty
-----------+--------+-----
 "2018-04" | "0009" | 23
 "2018-05" | "0009" | 77
 "2018-06" | "0008" | 44
 "2018-07" | "0008" | 1
 "2018-08" | "0009" | 89
The expected output based on the above result is:
 code   | 2018-04 | 2018-05 | 2018-06 | 2018-07 | 2018-08
--------+---------+---------+---------+---------+---------
 "0009" | 23      | 77      |         |         | 89
 "0008" |         |         | 44      | 1       |
Is there a way to achieve this?
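
A minimal sketch of one way to do this, using conditional aggregation with FILTER (available in PostgreSQL 9.4 and later). The CTE below is a stand-in built from the sample rows in the question, and the month columns are hard-coded, which is an assumption: a truly dynamic column list would need generated SQL or another technique.

```sql
-- Stand-in for the real CTE, filled with the sample rows from the question.
with previous_months_actual_sales(dates, code, qty) as (
    values ('2018-04', '0009', 23),
           ('2018-05', '0009', 77),
           ('2018-06', '0008', 44),
           ('2018-07', '0008', 1),
           ('2018-08', '0009', 89)
)
select code,
       -- each column sums only the rows of its own month; months with no rows give NULL
       sum(qty) filter (where dates = '2018-04') as "2018-04",
       sum(qty) filter (where dates = '2018-05') as "2018-05",
       sum(qty) filter (where dates = '2018-06') as "2018-06",
       sum(qty) filter (where dates = '2018-07') as "2018-07",
       sum(qty) filter (where dates = '2018-08') as "2018-08"
from previous_months_actual_sales
group by code
order by code desc;
```

Because the pivot is an ordinary SELECT, it can be wrapped in a further CTE and joined with the others on code.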

split_part function from nth split till end of string

I need to fetch the values that start after the second space and run up to a particular position in the string.
Source:
"8 115 MACKIE STREET VICTORIA PARK WA 6100 AU"
"6A CAMBOON ROAD MORLEY WA 6062 AU"
output:
"MACKIE STREET VICTORIA PARK"
"CAMBOON ROAD MORLEY"
I'm trying to separate the street name and suburb from the unit number and street number at the beginning and from the state, postcode and country at the end.

t=# with s(v) as (values('6A CAMBOON ROAD MORLEY WA 6062 AU'),('8 115 MACKIE STREET VICTORIA PARK WA 6100 A'))
, split as (select *,count(1) over (partition by v) from s, regexp_matches(v,'( [A-Z]+)','g') with ordinality t(m,o))
select distinct v,string_agg(m[1],'') over (partition by v) from split where o <= count-(3-1);
v | string_agg
---------------------------------------------+------------------------------
8 115 MACKIE STREET VICTORIA PARK WA 6100 A | MACKIE STREET VICTORIA PARK
6A CAMBOON ROAD MORLEY WA 6062 AU | CAMBOON ROAD MORLEY
(2 rows)
The mask [A-Z]+ skips the postcode (and any other token that does not fit it), so I cut two positions from the end rather than three (3-1, where the 1 is the postcode, which is known in advance).
I also don't start from the second space, because that would contradict your desired result...

results mismatched when retrieved dates from column of type character varying

I have two tables. I want to get the min and max dates stored in table1's cfrange column, which is of type character varying.
table1 and table2 are linked by sid. I want the min and max dates only for rows whose sid also appears in table2.
table1:
sid cfrange
100 3390
101 8000
102 5/11/2010
103 11/12/2016
104 01/03/2016
105 4000
106 4000
107 03/12/2017
108 03/11/2016
109 4/04/2018
110 10/12/2016
table2:
sid description
102 success
103 success
104 Proceeding
107 success
108 success
I tried the query below, but it does not give the correct min and max values. Please advise.
select max(t1.cfrange),min(t1.cfrange) from table1 t1,table2 t2 where t1.sid=t2.sid;

You should join the two tables, cast cfrange to a date, and cross your fingers. (You may have to format the value as a date before casting it.)
create table table1 (sid int, cfrange varchar(30));
insert into table1 values
(100, '3390'),
(101, '8000'),
(102, '5/11/2010'),
(103, '11/12/2016'),
(104, '01/03/2016'),
(105, '4000'),
(106, '4000'),
(107, '03/12/2017'),
(108, '03/11/2016'),
(109, '4/04/2018'),
(110, '10/12/2016');
create table table2 (sid int, description varchar(30));
insert into table2 values
(102, 'success'),
(103, 'success'),
(104, 'Proceeding'),
(107, 'success'),
(108, 'success');
select 'Min' as caption, min(cfrange) as value
from (select table1.sid, table1.cfrange::date
from table1
inner join table2
on table1.sid = table2.sid) tt
UNION ALL
select 'Max' as caption, max(cfrange) as value
from (select table1.sid, table1.cfrange::date
from table1
inner join table2
on table1.sid = table2.sid) tt;
caption | value
:------ | :---------
Min | 2010-11-05
Max | 2017-12-03
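
As a hedged alternative to the bare cast, the sketch below parses the strings with an explicit format and filters out values that are not shaped like dates. It assumes the values are day/month/year, which is what the Min and Max shown above imply; a plain ::date cast instead depends on the session's DateStyle setting. The CTEs simply repeat the sample data.

```sql
with table1(sid, cfrange) as (
    values (100, '3390'), (101, '8000'), (102, '5/11/2010'), (103, '11/12/2016'),
           (104, '01/03/2016'), (105, '4000'), (106, '4000'), (107, '03/12/2017'),
           (108, '03/11/2016'), (109, '4/04/2018'), (110, '10/12/2016')
),
table2(sid) as (
    values (102), (103), (104), (107), (108)
)
select min(to_date(t1.cfrange, 'DD/MM/YYYY')) as min_date,
       max(to_date(t1.cfrange, 'DD/MM/YYYY')) as max_date
from table1 t1
inner join table2 t2 on t1.sid = t2.sid
-- keep only values shaped like d/m/yyyy, so plain numbers such as '3390' are never parsed
where t1.cfrange ~ '^\d{1,2}/\d{1,2}/\d{4}$';
```

The original query compared the values as text, so max and min followed string ordering rather than calendar ordering.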

PostgreSQL How to check range of integer in case statement

I am having a problem writing a query that checks which range a user's score falls in and displays the corresponding grade: if the score is between 75 and 100 the grade is A, if it is between 60 and 75 the grade is B, and so on.
I am using this expression:
CASE users.points_earned
WHEN 75-100 THEN
'A+'
WHEN 60-75 THEN
'A'
WHEN 40-60 THEN
'B+'
WHEN 1--40 THEN
'B'
ELSE
'Absent'
end as rank
but it's not working. How do I check a range in a CASE statement in PostgreSQL?

You can use BETWEEN to check ranges.
WITH users(points_earned) as(
select 75
union all
select 90
union all
select 200
)
SELECT CASE
WHEN users.points_earned BETWEEN 40 AND 75 THEN 'A+'
WHEN users.points_earned BETWEEN 76 AND 100 THEN 'A'
ELSE 'Absent'
END as rank
FROM users
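
For reference, here is a sketch that maps all the ranges from the question using the searched form of CASE. The ranges in the question overlap at 75, 60 and 40; assigning each boundary value to the higher grade is an assumption, as are the sample scores and the column alias grade.

```sql
with users(points_earned) as (
    values (90), (75), (60), (40), (10), (0)
)
select points_earned,
       -- searched CASE: each WHEN is a full boolean condition, tested top to bottom
       case
           when points_earned between 75 and 100 then 'A+'
           when points_earned between 60 and 74  then 'A'
           when points_earned between 40 and 59  then 'B+'
           when points_earned between 1  and 39  then 'B'
           else 'Absent'
       end as grade
from users;
```

The attempt in the question fails because CASE users.points_earned WHEN 75-100 compares the score for equality with the arithmetic result of 75-100, that is -25, not with a range.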

PGSQL duplicate record in same column

I have a table and I want to find where duplicate records are present for the same columns. These are my columns; I want to get the records where group_id or week differ for the same code, fweek and newcode.
Id newcode fweek code group_id week
1 343001 2016-01 343 100 8
2 343002 2016-01 343 100 8
3 343001 2016-01 343 101 08
Required record is
Id newcode fweek code group_id week
3 343001 2016-01 343 101 08

To find the duplicate values I joined the table with itself.
We need to group the results by code, fweek and newcode to catch cases where more than one duplicate row exists. I used max() to get the last inserted row.
You don't have to use IS DISTINCT FROM (it behaves like an inequality that also handles NULL). If you don't want to compare NULL values, use the <> operator.
select r.*
from your_table r
where r.id in (select max(r.id)
from your_table r
join your_table r2 on r2.code = r.code and r2.fweek = r.fweek and r2.newcode = r.newcode
where
r2.group_id is distinct from r.group_id or
r2.week is distinct from r.week
group by r.code,
r.fweek,
r.newcode
having count(*) > 1)
