How to remove special characters from a string in PostgreSQL

I am trying to use REGEXP_REPLACE to remove the following special characters: "[]{}
from the following text field: [{"x":"y","s":"G_1","cn":"C8"},{"cn":"M2","gn":"G_2","cn":"CA99"},{"c":"ME3","gn":"G_3","c":"CA00"}]
and replace them with nothing, not even a space.
*Needless to say, this is just an example string, and I need to find a consistent solution for similar but different strings.
I was trying to run the following: SELECT REGEXP_REPLACE('[{"x":"y","s":"G_1","cn":"C8"},{"cn":"M2","gn":"G_2","cn":"CA99"},{"c":"ME3","gn":"G_3","c":"CA00"}] ','[{[}]":]','')
But I got back pretty much the same string.
Thanks in advance!

You need to escape the square brackets inside the bracket expression with a backslash (\), and pass the 'g' flag so that every match is replaced; otherwise the replacement stops at the first match:
SELECT REGEXP_REPLACE(
'[{"x":"y","s":"G_1","cn":"C8"},{"cn":"M2","gn":"G_2","cn":"CA99"},{"c":"ME3","gn":"G_3","c":"CA00"}] ',
'[{\[}\]":]',
'',
'g');
regexp_replace
--------------------------------------------------
xy,sG_1,cnC8,cnM2,gnG_2,cnCA99,cME3,gnG_3,cCA00
(1 row)
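Why the original pattern changed nothing: in '[{[}]":]' the bracket expression ends at the first ], so the pattern means "one of {, [ or } followed by the literal text ":]", and that sequence never occurs in the sample. The sketch below repeats the fix on a shortened sample and sets it next to translate(), which is a swapped-in alternative and not part of the answer above. It assumes PostgreSQL with standard_conforming_strings = on (the default in current versions), so the backslashes reach the regex engine unchanged.

```sql
-- Escaped brackets inside the bracket expression, plus the 'g' flag.
SELECT regexp_replace('[{"x":"y"},{"cn":"M2"}]', '[{\[}\]":]', '', 'g');
-- xy,cnM2

-- Alternative without regular expressions (assumption: a fixed set of
-- single characters is all that has to go). translate() deletes every
-- character of the second argument that has no counterpart in the third.
SELECT translate('[{"x":"y"},{"cn":"M2"}]', '[]{}":', '');
-- xy,cnM2
```

Both statements also strip the colon, as the pattern in the answer does; drop : from the character list if colons should stay.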

Related

Postgres replacing 'text' with e'text'

I inserted a bunch of rows with a text field like content='...\n...\n...'.
I didn't use e in front, like content=e'...\n...\n...', so now \n is not actually displayed as a newline - it's printed as text.
How do I fix this, i.e. how to change every row's content field from '...' to e'...'?
The syntax variant E'string' makes Postgres interpret the given string as a POSIX escape string. \n encoding a newline is only one of many interpreted escape sequences (even if the most common one). See:
Insert text with single quotes in PostgreSQL
To "re-evaluate" your POSIX escape string, you could use a simple function with dynamic SQL like this:
CREATE OR REPLACE FUNCTION f_eval_posix_escapes(INOUT _string text)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE 'SELECT E''' || _string || '''' INTO _string;
END
$func$;
WARNING 1: This is inherently unsafe! We have to evaluate input strings dynamically without quoting and escaping, which allows SQL injection. Only use this in a safe environment.
WARNING 2: Don't apply repeatedly. Or it will misinterpret your actual string with genuine \ characters, etc.
WARNING 3: This simple function is imperfect as it cannot cope with nested single quotes properly. If you have some of those, consider instead:
Unescape a string with escaped newlines and carriage returns
Apply:
UPDATE tbl
SET content = f_eval_posix_escapes(content)
WHERE content IS DISTINCT FROM f_eval_posix_escapes(content);
Note the added WHERE clause to skip updates that would not change anything. See:
How do I (or can I) SELECT DISTINCT on multiple columns?
Use REPLACE in an update query. Something like this (I'm on mobile, so please ignore any typos or syntax errors):
UPDATE tbl
SET
content = REPLACE(content, '\n', E'\n');
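To see what the REPLACE call above actually swaps, it can help to compare the two kinds of literal directly. This is a small sketch that assumes standard_conforming_strings = on (the PostgreSQL default in current versions); with that setting off, a plain literal would interpret the backslash too.

```sql
-- Plain literal: the backslash and the n stay two separate characters.
SELECT length('a\nb');   -- 4

-- Escape string: \n becomes one newline character.
SELECT length(E'a\nb');  -- 3

-- replace() turns the two-character sequence into a real newline.
SELECT replace('a\nb', '\n', E'\n') = E'a\nb';  -- true
```

Unlike the dynamic-SQL function, this handles only the one escape sequence it is told about, but it does not execute the stored text as SQL.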

How to remove double Quotes In DataStage using a transformer stage?

We are receiving input data like below
“VENKATA,KRISHNA”
I want output like below
VENKATA,KRISHNA
Can anyone help me with this?
Check out the Ereplace function: it replaces occurrences of a substring, so you could replace " with '' (the empty string).
An alternative is TRIM: you can specify which character to trim and whether to remove all occurrences (All) or only those at both ends of the string (Both), among other options.

Strip out the characters which is non numeric, dashes and pipes

I am trying to find a solution but somehow I am getting the wrong output (I referred to some online solutions and confused myself). Please advise where I am going wrong.
I need to strip out any character that is not numeric, a dash "-", or a pipe "|" using PL/SQL.
As an example:
if I need to filter the string 0094-78556232_imk*.ext|4444; the output should be 0094-78556232|4444
Use REGEXP_REPLACE:
SELECT
col,
REGEXP_REPLACE (col, '[^0-9|-]', '') AS col_updated
FROM yourTable;
Don't use regexp_replace, especially if performance is important.
Instead use the standard string function TRANSLATE. Like so:
select col,
translate(col, '0123456789|-' || col, '0123456789|-') as col_updated
from yourTable;
This translates each character in the col value, according to the following scheme: 0 is translated to itself, ...., - is translated to itself. Any other character in col, which is not in this list already, is "translated" to nothing, since there is nothing for it to be translated to in the third argument to the function. So those characters that are NOT on the list are simply removed from the string.
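The scheme described above can be tried on the sample string from the question. This is a sketch in Oracle syntax (hence FROM dual), with a literal standing in for the col column; the second and third arguments share the same 12-character prefix so that each kept character maps to itself.

```sql
-- Keep digits, pipe and dash; every other character of the input has no
-- counterpart in the third argument and is therefore removed.
SELECT translate('0094-78556232_imk*.ext|4444;',
                 '0123456789|-' || '0094-78556232_imk*.ext|4444;',
                 '0123456789|-') AS col_updated
FROM dual;
-- 0094-78556232|4444
```

The third argument is left non-empty on purpose: in Oracle an empty string is NULL, and translate returns NULL when any argument is NULL.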

removing leading zero and hyphen in Postgres

I need to remove leading zeros and hyphens from a column value in Postgresql database, for example:
121-323-025-000 should look like 12132325
060579-0001 => 605791
482-322-004 => 4823224
timely help will be really appreciated.
See the PostgreSQL string functions documentation.
For more advanced string editing, regular expressions can be very powerful. Be aware that complex regular expressions may not be considered maintainable by people not familiar with them.
CREATE TABLE testdata (id text, expected text);
INSERT INTO testdata (id, expected) VALUES
('121-323-025-000', '12132325'),
('060579-0001', '605791'),
('482-322-004', '4823224');
SELECT id, expected, regexp_replace(id, '(^|-)0*', '', 'g') AS computed
FROM testdata;
How regexp_replace works: in this case we look for the beginning of the string or a hyphen as a place to start matching. We include any zeros that follow as part of the match. Next we replace that match with an empty string. Finally, the global flag tells the function to repeat the search until it reaches the end of the string.
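The role of each part of the pattern shows up when the same expression is run with and without the global flag. A short sketch on literal values, assuming PostgreSQL:

```sql
-- Without 'g', only the first match (the leading zero) is removed.
SELECT regexp_replace('060579-0001', '(^|-)0*', '');       -- 60579-0001

-- With 'g', each hyphen and the zeros right after it are removed as well.
SELECT regexp_replace('060579-0001', '(^|-)0*', '', 'g');  -- 605791

-- Zeros that do not start a hyphen-separated group are kept.
SELECT regexp_replace('100-200', '(^|-)0*', '', 'g');      -- 100200
```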

PostgreSQL regexp.replace all unwanted chars

I have registration codes in my PostgreSQL table which are written messy, like MU-321-AB, MU/321/AB, MU 321-AB and so forth...
I would need to clear all of this to get MU321AB.
For this I use the following expression:
SELECT DISTINCT regexp_replace(ccode, '([^A-Za-z0-9])', ''), ...
This expression works as expected in .NET but not in PostgreSQL, where it clears only the first occurrence of an unwanted character.
How would I modify the regular expression so that it replaces all unwanted characters in the string and leaves a clean code with only letters and numbers?
Use the global flag, but without any capture groups:
SELECT DISTINCT regexp_replace(ccode, '[^A-Za-z0-9]', '', 'g'), ...
Note that PostgreSQL's regexp_replace replaces only the first match unless the 'g' flag is given, whereas .NET's Regex.Replace replaces every match by default; that is why the same expression behaved differently. Also, since you do not want to extract anything from the string, only to remove some characters, there is no need for a capture group ().
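A quick way to check the behaviour is to run the expression on the spellings from the question. In this sketch the VALUES list is a stand-in for the real table, and the alias clean_code is a made-up name for illustration.

```sql
-- Without the flag only the first unwanted character is removed.
SELECT regexp_replace('MU/321 AB', '[^A-Za-z0-9]', '');       -- MU321 AB

-- With 'g' every unwanted character is removed.
SELECT regexp_replace('MU/321 AB', '[^A-Za-z0-9]', '', 'g');  -- MU321AB

-- All three spellings collapse to the single value MU321AB.
SELECT DISTINCT regexp_replace(ccode, '[^A-Za-z0-9]', '', 'g') AS clean_code
FROM (VALUES ('MU-321-AB'), ('MU/321/AB'), ('MU 321-AB')) AS t(ccode);
```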