How to find the first and last occurrences of a specific character inside a string in PostgreSQL - postgresql

I want to find the first and the last occurrences of a specific character inside a string. As an example, consider the string "2010-####-3434", and suppose the character to be searched for is "#". The first occurrence of the hash inside the string is at the 6th position, and the last occurrence is at the 9th position.

Well...
Select position('#' in '2010-####-3434');
will give you the first.
If you want the last, just run that again on the reverse of your string. The query below assumes a user-defined PL/pgSQL function named reverse_string(); PostgreSQL 9.1 and later also provide a built-in reverse().
Select length('2010-####-3434') - position('#' in reverse_string('2010-####-3434')) + 1;
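On PostgreSQL 9.1 or later the same idea can be sketched with the built-in reverse(), so no helper function is needed. This is only a minimal sketch: it assumes a single-character search string, the aliases s, c, first_pos and last_pos are made up for the example, and the CASE branch is there so that a missing character yields 0 instead of length + 1.

```sql
-- First and last position of a single character, counting from 1.
-- s and c are illustrative aliases, not names from the question.
SELECT strpos(s, c) AS first_pos,                        -- 6 for the sample row
       CASE WHEN strpos(s, c) = 0 THEN 0                 -- character not present
            ELSE length(s) - strpos(reverse(s), c) + 1   -- 9 for the sample row
       END AS last_pos
FROM (VALUES ('2010-####-3434', '#')) AS t(s, c);
```

Because strpos() does plain substring matching, regex metacharacters such as '.' need no escaping here.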

My example:
reverse(substr(reverse(newvalue),0,strpos(reverse(newvalue),',')))
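This returns the text after the last comma, in three steps:
Reverse the whole string.
Take the substring that comes before the first comma.
Reverse the result.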

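In the case where char = '.', an escape is needed. Note that a backslash prefix only suits punctuation: in front of a letter or digit it forms a regex escape of its own (for example \d). So the function can be written: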
CREATE OR REPLACE FUNCTION last_post(text,char)
RETURNS integer LANGUAGE SQL AS $$
select length($1)- length(regexp_replace($1, E'.*\\' || $2,''));
$$;

9.5+ with array_positions
Using basic PostgreSQL array functions, we call string_to_array() to split the string into single characters and then feed the result to array_positions(), like this: array_positions(string_to_array(str,null), c)
SELECT
arrpos[array_lower(arrpos,1)] AS first,
arrpos[array_upper(arrpos,1)] AS last
FROM ( VALUES
('2010-####-3434', '#')
) AS t(str,c)
CROSS JOIN LATERAL array_positions(string_to_array(str,null), c)
AS arrpos;

I do not know how to do that, but the regular expression functions like regexp_matches, regexp_replace, and regexp_split_to_array may be an alternative route to solving your problem.

This pure SQL function will provide the last position of a char inside the string, counting from 1. It returns 0 if not found ... But (big disclaimer) it breaks if the character is some regex metacharacter ( .$^()[]*+ )
CREATE FUNCTION last_post(text,char) RETURNS integer AS $$
select length($1)- length(regexp_replace($1, '.*' || $2,''));
$$ LANGUAGE SQL IMMUTABLE;
test=# select last_post('hi#-#-#byte','#');
last_post
-----------
7
test=# select last_post('hi#-#-#byte','a');
last_post
-----------
0
A more robust solution would involve PL/pgSQL, as in rfusca's answer.

Another way to get the last position is to split the string into an array, using the character you are looking for as the delimiter, and then subtract the length of the last array element from the length of the whole string:
CREATE FUNCTION last_pos(char, text) RETURNS INTEGER AS
$$
select length($2) - length(a.arr[array_length(a.arr,1)])
from (select string_to_array($2, $1) as arr) as a
$$ LANGUAGE SQL;
For the first position it is easier to use
select position('#' in '2010-####-3434');

Related

How to get a list of quoted strings from the output of a SELECT query that has that list of quoted strings in it, but is of type string?

The following code is not a full setup that you can run to check. It shall just make it a bit clearer what the question is about.
With an example function like this (the variadic example is taken from PostgreSQL inserting list of objects into a stored procedure or PostgreSQL - Passing Array to Stored Function):
CREATE OR REPLACE function get_v_from_v(variadic _v text[]) returns table (v varchar(50)) as
$F$
declare
begin
return query
select t.v
from test t
where t.v = any(_v);
end;
$F$
language plpgsql
;
If you copy the one-value output of a select string_agg... query, 'x','y','z', by hand and put it as the argument of the function, the function works:
SELECT v FROM get_v_from_v(
'x','y','z'
);
The 'x','y','z' gets read into the function as variadic _v text[] so that the function can check its values with where t.v = any(_v).
If you instead put the (select string_agg...) query that is behind that 'x','y','z' output in the same place, the function does not work:
select v from get_v_from_v(
(select string_agg(quote_literal(x.v), ',') from (select v from get_v_from_name('something')) as x)
);
That means: the "one-value output field" 'x','y','z' that comes from the (select string_agg...) query is not the same as the text[] list type: 'x','y','z'.
With get_v_from_name('something') as another function that returns a table of one column and the "v" values in the rows, and after running the string_agg() on its output, you get the 'x','y','z' output. I learnt this working function string_agg() at How to make a list of quoted strings from the string values of a column in postgresql?. The full list of such string functions is in the postgreSQL guide at 9.4. String Functions and Operators.
I guess that the format of the select query output is just a string, not a list, so that the input is not seen as a list of quoted strings, but rather like a string as a whole: ''x','y','z''. The get_v_from_v argument does not need just one string of all values, but a list of quoted strings, since the argument for the function is of type text[] - which is a list.
It seems as if this question does not depend on the query that is behind the output. It seems rather just a general thing that the output in a tuple of a table and taken as the argument of a function is not the same as the same output hand-copied as the same argument.
Therefore, the question. What needs to be done to make the output of a select query the same as the hand-copy of its output, so that the output is just the list 'x','y','z', as if it was just copied and pasted?
PS: I guess that this way of making lists of quoted strings from the one-column table output only to pass it to the function is not best practice. For example, in TSQL/SQL Server, you should pass "table valued parameters", so that you pass values as a table that you select from within the function to get the values, see How do I pass a list as a parameter in a stored procedure?. Not sure how this is done in postgreSQL, but it might be what is needed here.
CREATE OR REPLACE function get_v_from_v(_v text[]) returns table (v varchar(50)) as
$F$
declare
begin
return query
select t.v
from test t
where t.v = any((select * from unnest(_v)));
end;
$F$
language plpgsql
;
With get_v_from_name('something') as another function that returns a table of one column and the "v" values in the rows (this was said in the question), the following works:
select v from get_v_from_v(
(select array_agg(x.v) from (select v from get_v_from_name('something')) as x)
);
Side remark:
array_agg(quote_literal(x.v), ',') instead of array_agg(x.v) fails because array_agg() does not accept a second argument.
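If the function keeps its variadic signature, an array can still be handed over by marking the argument with the VARIADIC keyword at the call site. The sketch below uses a hypothetical helper, pg_temp.count_args, that only reports how many elements arrive; it is not part of the question's schema, and the VALUES list stands in for the output of get_v_from_name('something').

```sql
-- Hypothetical helper: returns the number of elements the function receives.
CREATE FUNCTION pg_temp.count_args(VARIADIC _v text[]) RETURNS integer
LANGUAGE sql AS $$ SELECT array_length(_v, 1) $$;

-- Three separate arguments: the function sees a 3-element array.
SELECT pg_temp.count_args('x', 'y', 'z');                 -- 3

-- One string that merely looks like a list: a 1-element array.
SELECT pg_temp.count_args('''x'',''y'',''z''');           -- 1

-- An array passed through VARIADIC: again 3 elements.
SELECT pg_temp.count_args(VARIADIC (
    SELECT array_agg(t.v) FROM (VALUES ('x'), ('y'), ('z')) AS t(v)
));                                                       -- 3
```

The second call mirrors what string_agg() produces, which is why the function in the question finds no match.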

Combining function parameter with a like_regex function in postgres when searching jsonb GIN index

I am trying to do a case-insensitive partial search (contains string) on a property value - stored in a jsonb field in postgres.
The search is looking for a value within the title column of table destination which has an array of elements as follows:
[{"key": "EN", "text":"london and milk"},{"key": "FR", "text":"Edinburgh with milk and honey"}]
I have created a GIN index on the title field and a function to deal with the search.
CREATE OR REPLACE FUNCTION search(query_string character varying)
RETURNS SETOF destination
LANGUAGE 'plpgsql'
AS $BODY$
begin
return query select *
from destination
--where title @? '$.* ? (@ like_regex ' || query_string || ' flag "i")';
where title @? '$.* ? (@ like_regex ".*milk.*" flag "i")';
end;
$BODY$;
So the function works nicely if the regexp string is hardcoded (as shown above), but the search should be based on the incoming query_string. The commented line in the function shows an attempt to try to include the parameter in the query. (this will result in unterminated string constant error)
How can I replace the hard-coded milk with the parameter query_string?
Are there other (simpler) ways that would yield the same end result?
Your problem is one of precedence. @? and || have the same precedence and are processed left to right, so you are applying @? to only a fragment of the string, not to the completely built string. Then you are trying to concatenate things to the Boolean result of @?. You can fix this by constructing the string inside parentheses. A side effect of this is that you then have to cast it to jsonpath explicitly.
where title @? ( '$.* ? (@ like_regex "' || query_string || '" flag "i")' )::jsonpath;
But I think it would be cleaner to construct the jsonpath in a variable, rather than on the fly in the query itself. Could someone inject something into the jsonpath string that could do something nasty? I don't know enough about jsonpath to rule that out.
(code part of the suggested solution edited by question author to include the double quotes missing - see comment)
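One way to reduce the injection worry is to let to_json() produce the quoted pattern instead of adding the double quotes by hand. This is a hedged sketch, not a vetted defence: it assumes that JSON string escaping is enough for a jsonpath string literal, it uses PostgreSQL's jsonpath operator @? with @ as the current item, and the jsonb literal and the search term 'MILK' are sample values. Regex metacharacters in the search term stay active.

```sql
-- to_json(...)::text wraps the term in double quotes and escapes embedded
-- quotes and backslashes, so the term cannot close the string literal early.
SELECT '[{"key": "EN", "text": "london and milk"}]'::jsonb
       @? ('$.* ? (@ like_regex '
           || to_json('MILK'::text)::text
           || ' flag "i")')::jsonpath AS matches;
```

Inside the function, the same expression could be assigned to a variable of type jsonpath first and then used in the WHERE clause.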

PLpgSQL function not returning matching titles

I am trying to return the movie name and the number of cast and crew when given a text. When I input the string and am using ilike, my query returns no matching titles. I created a view previously that has the movie titles and the number of crew to be input in the function.
My code is:
create or replace view movies_crew as
select movies.id, movies.title, principals.role
from movies
join principals on principals.movie_id=movies.id
where principals.role <> 'producer'
;
create or replace view movie_makers as
select movies_crew.title, count(movies_crew.title) as ncrew
from movies_crew
where movies_crew.title = 'Fight Club'
group by movies_crew.title;
CREATE or REPLACE function Q11(partial_title text)
RETURNS SETOF text
AS $$
DECLARE
title text;
BEGIN
for title in
select movie_makers.title, movie_makers.ncrew
from movie_makers
where movie_makers.title ilike '%$1%'
loop
return next movie_makers.title||'has'||movie_makers.ncrew||'cast and crew';
end loop;
if(not found) then
return next 'No matching titles';
end if;
END;
$$ LANGUAGE plpgsql;
select * from q11('Fight Club')
My database is: https://drive.google.com/file/d/1NVRLiYBVbKuiazynx9Egav7c4_VHFEzP/view?usp=sharing
Your immediate quoting issue aside (has been addressed properly by Jeff), the function can be much simpler and faster like this:
CREATE or REPLACE FUNCTION q11(partial_title text)
RETURNS SETOF text
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY
SELECT m.title || ' has ' || m.ncrew || ' cast and crew'
FROM movie_makers m
WHERE m.title ~* $1;
IF NOT FOUND THEN
RETURN NEXT 'No matching titles';
END IF;
END
$func$;
Major points:
Your function was still broken. References to movie_makers.title and movie_makers.ncrew wouldn't work that way. I fixed it.
Use RETURN QUERY instead of the loop. This way we also do not need to use or even declare any variables at all. See:
How to return result of a SELECT inside a function in PostgreSQL?
Optionally use the case insensitive regular expression match operator ~*. (Simpler, not faster.)
Difference between LIKE and ~ in Postgres
Either way, you may want to escape special characters. See:
Escape function for regular expression or LIKE patterns
Aside: hardly makes sense to filter on a view that already selects 'Fight Club' as its only row. For a meaningful search, you wouldn't use these views ...
ilike '%$1%'
$1 is not interpolated when inside single quotes, so you are searching for the literal characters $ and 1.
You could instead do:
ilike '%'||$1||'%'
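The difference can be reproduced outside the function. The following is a small sketch using a prepared statement; the statement name title_match and the sample strings are made up for illustration.

```sql
-- Inside quotes, $1 is just the two characters '$' and '1'.
SELECT 'Fight Club' ILIKE '%$1%' AS quoted_dollar_one;    -- false

-- Concatenation keeps $1 outside the quotes, so the parameter is used.
PREPARE title_match(text) AS
    SELECT 'Fight Club' ILIKE '%' || $1 || '%' AS concatenated;
EXECUTE title_match('club');                              -- true
DEALLOCATE title_match;
```

The || operator binds more tightly than ILIKE, so no extra parentheses are needed around the concatenated pattern.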

Pg sql use ilike operator to search text in array

I have query like
_search_text := 'ind'
select * from table where (_search_text ILIKE ANY (addresses))
The addresses column has values like [india,us,uk,pak,bang].
It should return every row where any item of the addresses column contains the string _search_text.
Currently it returns a row only if I give the full word india in _search_text.
What should I change?
I was also thinking of using unnest, but since this will be a sub-clause of a very long WHERE clause, I would like to avoid that.
Thanks
I'm afraid this is not possible: LIKE and ILIKE accept an array only on the right-hand side, and that side is treated as the search pattern, not as the string to search. You can write your own SQL function:
CREATE OR REPLACE FUNCTION public.array_like(text[], text)
RETURNS boolean
LANGUAGE sql
IMMUTABLE STRICT
AS $function$
select bool_or(v like $2) from unnest($1) v
$function$
and the usage can look like:
WHERE array_like(addresses, _search_text)
So the readability of the query stays good, but I am worried about performance: no index can be used on this expression. That is the result of a slightly poor design (the data are not normalized).
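The same test can also be written inline, without a helper function, as an EXISTS over unnest(). This is only a sketch: the VALUES list stands in for the real table, the names a, id and addr are invented for the example, and it uses ILIKE with wildcards on both sides to get the case-insensitive "contains" behaviour the question asks for.

```sql
-- Rows whose addresses array has at least one element containing 'ind'.
SELECT a.id
FROM (VALUES (1, ARRAY['india', 'us', 'uk']),
             (2, ARRAY['pak', 'bang'])) AS a(id, addresses)
WHERE EXISTS (
    SELECT 1
    FROM unnest(a.addresses) AS addr
    WHERE addr ILIKE '%' || 'ind' || '%'   -- the array element is on the left
);                                          -- returns id 1 only
```

Like the function version, this cannot use an index on the array column.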
ILIKE only ignores case (the difference between upper and lower case); it does not by itself search for strings containing your value.
In your case you can use:
_search_text := '%ind%'
select * from table where (_search_text ILIKE ANY (addresses))
ANY will not work here because the arguments are in the wrong order for ILIKE.
You can define a custom operator to make this work. But this will most likely suffer from poor performance:
create table addresses(id integer primary key, states varchar[]);
insert into addresses values (1,'{"belgium","france","united states"}'),
(2,'{"belgium","ireland","canada"}');
CREATE OR REPLACE FUNCTION pg_catalog."rlike"(
leftarg text,
rightarg text)
RETURNS boolean
LANGUAGE 'sql'
COST 100
IMMUTABLE STRICT PARALLEL SAFE SUPPORT pg_catalog.textlike_support
AS $BODY$
SELECT pg_catalog."like"(rightarg,leftarg);
$BODY$;
ALTER FUNCTION pg_catalog."rlike"(text, text)
OWNER TO postgres;
CREATE OPERATOR ~~~ (
leftarg = text,
rightarg = text,
procedure = rlike
);
select * from addresses where 'bel%'::text ~~~ ANY (states);
It should be possible to define this function in C to make it faster.

Filtering on multiple values in function

The non-functional version of my problem is easy: I want to filter values based upon whether a text variable has some specific values:
SELECT routine_name, data_type
FROM information_schema.routines
WHERE data_type IN ('boolean', 'integer');
Since I want to do many variations on what I'm filtering on, I want to have a function that accepts the values to filter on. My attempt at turning this into a function looks like this:
CREATE FUNCTION get_fns_by_data_type(data_types text[])
RETURNS TABLE(routine_name text, data_type text) AS $$
BEGIN
RETURN QUERY SELECT routine_name, data_type
FROM information_schema.routines
WHERE data_type IN (array_to_string($1, ','));
END;
$$ LANGUAGE plpgsql;
When I call it:
SELECT * FROM get_fns_by_data_type(ARRAY['boolean', 'integer'])
I get no results.
I suspect that somehow I should be quoting the values, but I'm not sure of the best approach to this, nor how to debug the problem.
How do I use the array in my WHERE clause?
array_to_string returns a single string, not a list of strings, so in your function you are actually running:
where data_type IN ('boolean,integer')
(which is clearly not what you intended)
You don't need to convert the array in the first place. You can use it directly:
where data_type = any ($1)
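Put together, the function might look like the sketch below. It goes a little beyond the answer above: the table alias r and the casts to text are assumptions added so that the selected columns cannot be confused with the output column names and so that they match the declared result types, and LANGUAGE sql is used instead of plpgsql for brevity.

```sql
-- What the original WHERE clause compared against: one single string.
SELECT array_to_string(ARRAY['boolean', 'integer'], ',');   -- boolean,integer

-- Sketch of the function using the array directly.
CREATE FUNCTION get_fns_by_data_type(data_types text[])
RETURNS TABLE(routine_name text, data_type text)
LANGUAGE sql STABLE AS $$
    SELECT r.routine_name::text, r.data_type::text
    FROM information_schema.routines AS r
    WHERE r.data_type = ANY (data_types);
$$;

SELECT * FROM get_fns_by_data_type(ARRAY['boolean', 'integer']);
```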