PostgreSQL regexp.replace all unwanted chars - postgresql

I have registration codes in my PostgreSQL table which are written messy, like MU-321-AB, MU/321/AB, MU 321-AB and so forth...
I would need to clear all of this to get MU321AB.
For this I uses following expression:
SELECT DISTINCT regexp_replace(ccode, '([^A-Za-z0-9])', ''), ...
This expression work as expected in 'NET' but not in PostgreSQL where it 'clears' only first occurrence of unwanted character.
How would I modify regular expression which will replace all unwanted chars in string to get clear code with only letters and numbers?

Use the global flag, but without any capture groups:
SELECT DISTINCT regexp_replace(ccode, '[^A-Za-z0-9]', '', 'g'), ...
Note that the global flag is part of the standard regular expression parser, so .NET is not following the standard in this case. Also, since you do not want anything extracted from the string - you just want to replace some characters - you should not use capture groups ().

Related

Postgres replacing 'text' with e'text'

I inserted a bunch of rows with a text field like content='...\n...\n...'.
I didn't use e in front, like conent=e'...\n...\n..., so now \n is not actually displayed as a newline - it's printed as text.
How do I fix this, i.e. how to change every row's content field from '...' to e'...'?
The syntax variant E'string' makes Postgres interpret the given string as Posix escape string. \n encoding a newline is only one of many interpreted escape sequences (even if the most common one). See:
Insert text with single quotes in PostgreSQL
To "re-evaluate" your Posix escape string, you could use a simple function with dynamic SQL like this:
CREATE OR REPLACE FUNCTION f_eval_posix_escapes(INOUT _string text)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE 'SELECT E''' || _string || '''' INTO _string;
END
$func$;
WARNING 1: This is inherently unsafe! We have to evaluate input strings dynamically without quoting and escaping, which allows SQL injection. Only use this in a safe environment.
WARNING 2: Don't apply repeatedly. Or it will misinterpret your actual string with genuine \ characters, etc.
WARNING 3: This simple function is imperfect as it cannot cope with nested single quotes properly. If you have some of those, consider instead:
Unescape a string with escaped newlines and carriage returns
Apply:
UPDATE tbl
SET content = f_eval_posix_escapes(content)
WHERE content IS DISTINCT FROM f_eval_posix_escapes(content);
db<>fiddle here
Note the added WHERE clause to skip updates that would not change anything. See:
How do I (or can I) SELECT DISTINCT on multiple columns?
Use REPLACE in an update query. Something like this: (I'm on mobile so please ignore any typo or syntax erro)
UPDATE table
SET
column = REPLACE(column, '\n', e'\n')

How to remove special characters from a string in postgresql

I am trying to remove using REGEXP_REPLACE the following special characters: "[]{}
from the following text field: [{"x":"y","s":"G_1","cn":"C8"},{"cn":"M2","gn":"G_2","cn":"CA99"},{"c":"ME3","gn":"G_3","c":"CA00"}]
and replace them with nothing, not even a space.
*Needless to say, this is just an example string, and I need to find a consistent solution for similar but different strings.
I was trying to run the following: SELECT REGEXP_REPLACE('[{"x":"y","s":"G_1","cn":"C8"},{"cn":"M2","gn":"G_2","cn":"CA99"},{"c":"ME3","gn":"G_3","c":"CA00"}] ','[{[}]":]','')
But received pretty much the same string..
Thanks in advance!
You need to escape the special characters (\), and to specify that you want to repeat the operation for every characters ('g') else it will stop at the 1st match
SELECT REGEXP_REPLACE(
'[{"x":"y","s":"G_1","cn":"C8"},{"cn":"M2","gn":"G_2","cn":"CA99"},{"c":"ME3","gn":"G_3","c":"CA00"}] ',
'[{\[}\]":]',
'',
'g');
regexp_replace
--------------------------------------------------
xy,sG_1,cnC8,cnM2,gnG_2,cnCA99,cME3,gnG_3,cCA00
(1 row)

By entering values into table, command line does not work [duplicate]

I have a table test(id,name).
I need to insert values like: user's log, 'my user', customer's.
insert into test values (1,'user's log');
insert into test values (2,''my users'');
insert into test values (3,'customer's');
I am getting an error if I run any of the above statements.
If there is any method to do this correctly please share. I don't want any prepared statements.
Is it possible using sql escaping mechanism?
String literals
Escaping single quotes ' by doubling them up → '' is the standard way and works of course:
'user's log' -- incorrect syntax (unbalanced quote)
'user''s log'
Plain single quotes (ASCII / UTF-8 code 39), mind you, not backticks `, which have no special purpose in Postgres (unlike certain other RDBMS) and not double-quotes ", used for identifiers.
In old versions or if you still run with standard_conforming_strings = off or, generally, if you prepend your string with E to declare Posix escape string syntax, you can also escape with the backslash \:
E'user\'s log'
Backslash itself is escaped with another backslash. But that's generally not preferable.
If you have to deal with many single quotes or multiple layers of escaping, you can avoid quoting hell in PostgreSQL with dollar-quoted strings:
'escape '' with '''''
$$escape ' with ''$$
To further avoid confusion among dollar-quotes, add a unique token to each pair:
$token$escape ' with ''$token$
Which can be nested any number of levels:
$token2$Inner string: $token1$escape ' with ''$token1$ is nested$token2$
Pay attention if the $ character should have special meaning in your client software. You may have to escape it in addition. This is not the case with standard PostgreSQL clients like psql or pgAdmin.
That is all very useful for writing PL/pgSQL functions or ad-hoc SQL commands. It cannot alleviate the need to use prepared statements or some other method to safeguard against SQL injection in your application when user input is possible, though. #Craig's answer has more on that. More details:
SQL injection in Postgres functions vs prepared queries
Values inside Postgres
When dealing with values inside the database, there are a couple of useful functions to quote strings properly:
quote_literal() or quote_nullable() - the latter outputs the unquoted string NULL for null input.
There is also quote_ident() to double-quote strings where needed to get valid SQL identifiers.
format() with the format specifier %L is equivalent to quote_nullable().
Like: format('%L', string_var)
concat() or concat_ws() are typically no good for this purpose as those do not escape nested single quotes and backslashes.
According to PostgreSQL documentation (4.1.2.1. String Constants):
To include a single-quote character within a string constant, write
two adjacent single quotes, e.g. 'Dianne''s horse'.
See also the standard_conforming_strings parameter, which controls whether escaping with backslashes works.
This is so many worlds of bad, because your question implies that you probably have gaping SQL injection holes in your application.
You should be using parameterized statements. For Java, use PreparedStatement with placeholders. You say you don't want to use parameterised statements, but you don't explain why, and frankly it has to be a very good reason not to use them because they're the simplest, safest way to fix the problem you are trying to solve.
See Preventing SQL Injection in Java. Don't be Bobby's next victim.
There is no public function in PgJDBC for string quoting and escaping. That's partly because it might make it seem like a good idea.
There are built-in quoting functions quote_literal and quote_ident in PostgreSQL, but they are for PL/PgSQL functions that use EXECUTE. These days quote_literal is mostly obsoleted by EXECUTE ... USING, which is the parameterised version, because it's safer and easier. You cannot use them for the purpose you explain here, because they're server-side functions.
Imagine what happens if you get the value ');DROP SCHEMA public;-- from a malicious user. You'd produce:
insert into test values (1,'');DROP SCHEMA public;--');
which breaks down to two statements and a comment that gets ignored:
insert into test values (1,'');
DROP SCHEMA public;
--');
Whoops, there goes your database.
In postgresql if you want to insert values with ' in it then for this you have to give extra '
insert into test values (1,'user''s log');
insert into test values (2,'''my users''');
insert into test values (3,'customer''s');
you can use the postrgesql chr(int) function:
insert into test values (2,'|| chr(39)||'my users'||chr(39)||');
When I used Python to insert values into PostgreSQL, I also met the question: column "xxx" does not exist.
The I find the reason in wiki.postgresql:
PostgreSQL uses only single quotes for this (i.e. WHERE name = 'John'). Double quotes are used to quote system identifiers; field names, table names, etc. (i.e. WHERE "last name" = 'Smith').
MySQL uses ` (accent mark or backtick) to quote system identifiers, which is decidedly non-standard.
It means PostgreSQL can use only single quote for field names, table names, etc. So you can not use single quote in value.
My situation is: I want to insert values "the difference of it’s adj for sb and it's adj of sb" into PostgreSQL.
How I figure out this problem:
I replace ' with ’, and I replace " with '. Because PostgreSQL value does not support double quote.
So I think you can use following codes to insert values:
insert into test values (1,'user’s log');
insert into test values (2,'my users');
insert into test values (3,'customer’s');
If you need to get the work done inside Pg:
to_json(value)
https://www.postgresql.org/docs/9.3/static/functions-json.html#FUNCTIONS-JSON-TABLE
You must have to add an extra single quotes -> ' and make doubling quote them up like below examples -> ' ' is the standard way and works of course:
Wrong way: 'user's log'
Right way: 'user''s log'
problem:
insert into test values (1,'user's log');
insert into test values (2,''my users'');
insert into test values (3,'customer's');
Solutions:
insert into test values (1,'user''s log');
insert into test values (2,'''my users''');
insert into test values (3,'customer''s');

removing leading zero and hyphen in Postgres

I need to remove leading zeros and hyphens from a column value in Postgresql database, for example:
121-323-025-000 should look like 12132325
060579-0001 => 605791
482-322-004 => 4823224
timely help will be really appreciated.
Postgresql string functions.
For more advanced string editing, regular expressions can be very powerful. Be aware that complex regular expressions may not be considered maintainable by people not familiar with them.
CREATE TABLE testdata (id text, expected text);
INSERT INTO testdata (id, expected) VALUES
('121-323-025-000', '12132325'),
('060579-0001', '605791'),
('482-322-004', '4823224');
SELECT id, expected, regexp_replace(id, '(^|-)0*', '', 'g') AS computed
FROM testdata;
How regexp_replace works. In this case we look for the beginning of the string or a hyphen for a place to start matching. We include any zeros that follow that as part of the match. Next we replace that match with an empty string. Finally, the global flag tells us to repeat the search until we reach the end of the string.

list trigger no system ending with "_BI"

I want to list the trigger no system ending with "_BI" in firebird database,
but no result with this
select * from rdb$triggers
where
rdb$trigger_source is not null
and (coalesce(rdb$system_flag,0) = 0)
and (rdb$trigger_source not starting with 'CHECK' )
and (rdb$trigger_name like '%BI')
but with this syntaxs it gives me a "_bi" and "_BI0U" and "_BI0U" ending result
and (rdb$trigger_name like '%BI%')
but with this syntaxs it gives me null result
and (rdb$trigger_name like '%#_BI')
thank you beforehand
The problem is that the Firebird system tables use CHAR(31) for object names, this means that they are padded with spaces up to the declared length. As a result, use of like '%BI') will not yield results, unless BI are the 30th and 31st character.
There are several solutions
For example you can trim the name before checking
trim(rdb$trigger_name) like '%BI'
or you can require that the name is followed by at least one space
rdb$trigger_name || ' ' like '%BI %'
On a related note, if you want to check if your trigger name ends in _BI, then you should also include the underscore in your condition. And as an underscore in like is a single character matcher, you need to escape it:
trim(rdb$trigger_name) like '%\_BI' escape '\'
Alternatively you could also try to use a regular expressions, as you won't need to trim or otherwise mangle the lefthand side of the expression:
rdb$trigger_name similar to '%\_BI[[:SPACE:]]*' escape '\'