removing leading zero and hyphen in Postgres - postgresql

I need to remove leading zeros and hyphens from a column value in Postgresql database, for example:
121-323-025-000 should look like 12132325
060579-0001 => 605791
482-322-004 => 4823224
timely help will be really appreciated.

Postgresql string functions.
For more advanced string editing, regular expressions can be very powerful. Be aware that complex regular expressions may not be considered maintainable by people not familiar with them.
CREATE TABLE testdata (id text, expected text);
INSERT INTO testdata (id, expected) VALUES
('121-323-025-000', '12132325'),
('060579-0001', '605791'),
('482-322-004', '4823224');
SELECT id, expected, regexp_replace(id, '(^|-)0*', '', 'g') AS computed
FROM testdata;
How regexp_replace works. In this case we look for the beginning of the string or a hyphen for a place to start matching. We include any zeros that follow that as part of the match. Next we replace that match with an empty string. Finally, the global flag tells us to repeat the search until we reach the end of the string.

Related

Postgres replacing 'text' with e'text'

I inserted a bunch of rows with a text field like content='...\n...\n...'.
I didn't use e in front, like conent=e'...\n...\n..., so now \n is not actually displayed as a newline - it's printed as text.
How do I fix this, i.e. how to change every row's content field from '...' to e'...'?
The syntax variant E'string' makes Postgres interpret the given string as Posix escape string. \n encoding a newline is only one of many interpreted escape sequences (even if the most common one). See:
Insert text with single quotes in PostgreSQL
To "re-evaluate" your Posix escape string, you could use a simple function with dynamic SQL like this:
CREATE OR REPLACE FUNCTION f_eval_posix_escapes(INOUT _string text)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE 'SELECT E''' || _string || '''' INTO _string;
END
$func$;
WARNING 1: This is inherently unsafe! We have to evaluate input strings dynamically without quoting and escaping, which allows SQL injection. Only use this in a safe environment.
WARNING 2: Don't apply repeatedly. Or it will misinterpret your actual string with genuine \ characters, etc.
WARNING 3: This simple function is imperfect as it cannot cope with nested single quotes properly. If you have some of those, consider instead:
Unescape a string with escaped newlines and carriage returns
Apply:
UPDATE tbl
SET content = f_eval_posix_escapes(content)
WHERE content IS DISTINCT FROM f_eval_posix_escapes(content);
db<>fiddle here
Note the added WHERE clause to skip updates that would not change anything. See:
How do I (or can I) SELECT DISTINCT on multiple columns?
Use REPLACE in an update query. Something like this: (I'm on mobile so please ignore any typo or syntax erro)
UPDATE table
SET
column = REPLACE(column, '\n', e'\n')

By entering values into table, command line does not work [duplicate]

I have a table test(id,name).
I need to insert values like: user's log, 'my user', customer's.
insert into test values (1,'user's log');
insert into test values (2,''my users'');
insert into test values (3,'customer's');
I am getting an error if I run any of the above statements.
If there is any method to do this correctly please share. I don't want any prepared statements.
Is it possible using sql escaping mechanism?
String literals
Escaping single quotes ' by doubling them up → '' is the standard way and works of course:
'user's log' -- incorrect syntax (unbalanced quote)
'user''s log'
Plain single quotes (ASCII / UTF-8 code 39), mind you, not backticks `, which have no special purpose in Postgres (unlike certain other RDBMS) and not double-quotes ", used for identifiers.
In old versions or if you still run with standard_conforming_strings = off or, generally, if you prepend your string with E to declare Posix escape string syntax, you can also escape with the backslash \:
E'user\'s log'
Backslash itself is escaped with another backslash. But that's generally not preferable.
If you have to deal with many single quotes or multiple layers of escaping, you can avoid quoting hell in PostgreSQL with dollar-quoted strings:
'escape '' with '''''
$$escape ' with ''$$
To further avoid confusion among dollar-quotes, add a unique token to each pair:
$token$escape ' with ''$token$
Which can be nested any number of levels:
$token2$Inner string: $token1$escape ' with ''$token1$ is nested$token2$
Pay attention if the $ character should have special meaning in your client software. You may have to escape it in addition. This is not the case with standard PostgreSQL clients like psql or pgAdmin.
That is all very useful for writing PL/pgSQL functions or ad-hoc SQL commands. It cannot alleviate the need to use prepared statements or some other method to safeguard against SQL injection in your application when user input is possible, though. #Craig's answer has more on that. More details:
SQL injection in Postgres functions vs prepared queries
Values inside Postgres
When dealing with values inside the database, there are a couple of useful functions to quote strings properly:
quote_literal() or quote_nullable() - the latter outputs the unquoted string NULL for null input.
There is also quote_ident() to double-quote strings where needed to get valid SQL identifiers.
format() with the format specifier %L is equivalent to quote_nullable().
Like: format('%L', string_var)
concat() or concat_ws() are typically no good for this purpose as those do not escape nested single quotes and backslashes.
According to PostgreSQL documentation (4.1.2.1. String Constants):
To include a single-quote character within a string constant, write
two adjacent single quotes, e.g. 'Dianne''s horse'.
See also the standard_conforming_strings parameter, which controls whether escaping with backslashes works.
This is so many worlds of bad, because your question implies that you probably have gaping SQL injection holes in your application.
You should be using parameterized statements. For Java, use PreparedStatement with placeholders. You say you don't want to use parameterised statements, but you don't explain why, and frankly it has to be a very good reason not to use them because they're the simplest, safest way to fix the problem you are trying to solve.
See Preventing SQL Injection in Java. Don't be Bobby's next victim.
There is no public function in PgJDBC for string quoting and escaping. That's partly because it might make it seem like a good idea.
There are built-in quoting functions quote_literal and quote_ident in PostgreSQL, but they are for PL/PgSQL functions that use EXECUTE. These days quote_literal is mostly obsoleted by EXECUTE ... USING, which is the parameterised version, because it's safer and easier. You cannot use them for the purpose you explain here, because they're server-side functions.
Imagine what happens if you get the value ');DROP SCHEMA public;-- from a malicious user. You'd produce:
insert into test values (1,'');DROP SCHEMA public;--');
which breaks down to two statements and a comment that gets ignored:
insert into test values (1,'');
DROP SCHEMA public;
--');
Whoops, there goes your database.
In postgresql if you want to insert values with ' in it then for this you have to give extra '
insert into test values (1,'user''s log');
insert into test values (2,'''my users''');
insert into test values (3,'customer''s');
you can use the postrgesql chr(int) function:
insert into test values (2,'|| chr(39)||'my users'||chr(39)||');
When I used Python to insert values into PostgreSQL, I also met the question: column "xxx" does not exist.
The I find the reason in wiki.postgresql:
PostgreSQL uses only single quotes for this (i.e. WHERE name = 'John'). Double quotes are used to quote system identifiers; field names, table names, etc. (i.e. WHERE "last name" = 'Smith').
MySQL uses ` (accent mark or backtick) to quote system identifiers, which is decidedly non-standard.
It means PostgreSQL can use only single quote for field names, table names, etc. So you can not use single quote in value.
My situation is: I want to insert values "the difference of it’s adj for sb and it's adj of sb" into PostgreSQL.
How I figure out this problem:
I replace ' with ’, and I replace " with '. Because PostgreSQL value does not support double quote.
So I think you can use following codes to insert values:
insert into test values (1,'user’s log');
insert into test values (2,'my users');
insert into test values (3,'customer’s');
If you need to get the work done inside Pg:
to_json(value)
https://www.postgresql.org/docs/9.3/static/functions-json.html#FUNCTIONS-JSON-TABLE
You must have to add an extra single quotes -> ' and make doubling quote them up like below examples -> ' ' is the standard way and works of course:
Wrong way: 'user's log'
Right way: 'user''s log'
problem:
insert into test values (1,'user's log');
insert into test values (2,''my users'');
insert into test values (3,'customer's');
Solutions:
insert into test values (1,'user''s log');
insert into test values (2,'''my users''');
insert into test values (3,'customer''s');

In DB2 SQL RegEx, how can a conditional replacement be done without CASE WHEN END..?

I have a DB2 v7r3 SQL SELECT statement with three instances of REGEXP_SUBSTR(), all with the same regex pattern string, each of which extract one of three groups.
I'd like to change the first SUBSTR to REGEXP_REPLACE() to do a conditional replacement if there's no match, to insert a default value similarly to the ELSE section of a CASE...END. But I can't make it work. I could easily use a CASE, but it seems more compact & efficient to use RegEx.
For example, I have descriptions of food containers sizes, in various states of completeness:
12X125
6X350
1X1500
1500ML
1000
The last two don't have the 'nnX' part at the beginning, in which case '1X' is assumed and needs to be inserted.
This is my current working pattern string:
^(?:(\d{1,3})(?:X))?((?:\d{1,4})(?:\.\d{1,3})?)(L|ML|PK|Z|)$
The groups returned are: quantity, size, and unit.
But only the first group needs the conditional replacement:
(?:(\d{1,3})(?:X))?
This RexEgg webpage describes the (?=...) operator, and it seems to be what I need, but I'm not sure. It's in the list of operators for my version of DB2, but I can't make it work. Frankly, it's a bit deeper than my regex knowledge, and I can't even make it work in my favorite online regex tester, Regex101.
So...does anyone have any idea or suggestions..? Thanks.
Try this (replace "digits not followed by X_or_digit"):
with t(s) as (values
'12X125'
, '6X350'
, '1X1500'
, '1500'
, '1125'
)
select regexp_replace(s, '^([\d]+(?![X\d]))', '1X\1')
from t;

PostgreSQL regexp.replace all unwanted chars

I have registration codes in my PostgreSQL table which are written messy, like MU-321-AB, MU/321/AB, MU 321-AB and so forth...
I would need to clear all of this to get MU321AB.
For this I uses following expression:
SELECT DISTINCT regexp_replace(ccode, '([^A-Za-z0-9])', ''), ...
This expression work as expected in 'NET' but not in PostgreSQL where it 'clears' only first occurrence of unwanted character.
How would I modify regular expression which will replace all unwanted chars in string to get clear code with only letters and numbers?
Use the global flag, but without any capture groups:
SELECT DISTINCT regexp_replace(ccode, '[^A-Za-z0-9]', '', 'g'), ...
Note that the global flag is part of the standard regular expression parser, so .NET is not following the standard in this case. Also, since you do not want anything extracted from the string - you just want to replace some characters - you should not use capture groups ().

PostgreSQL Trim excessive trailing zeroes: type numeric but expression is of type text

I'm trying to clean out excessive trailing zeros, I used the following query...
UPDATE _table_ SET _column_=trim(trailing '00' FROM '_column_');
...and I received the following error:
ERROR: column "_column_" is of
expression is of type text.
I've played around with the quotes since that usually is what it barrels down to for text versus numeric though without any luck.
The CREATE TABLE syntax:
CREATE TABLE _table_ (
id bigint NOT NULL,
x bigint,
y bigint,
_column_ numeric
);
You can cast the arguments from and the result back to numeric:
UPDATE _table_ SET _column_=trim(trailing '00' FROM _column_::text)::numeric;
Also note that you don't quote column names with single quotes as you did.
Postgres version 13 now comes with the trim_scale() function:
UPDATE _table_ SET _column_ = trim_scale(_column_);
trim takes string parameters, so _column_ has to be cast to a string (varchar for example). Then, the result of trim has to be cast back to numeric.
UPDATE _table_ SET _column_=trim(trailing '00' FROM _column_::varchar)::numeric;
Another (arguably more consistent) way to clean out the trailing zeroes from a NUMERIC field would be to use something like the following:
UPDATE _table_ SET _column_ = CAST(to_char(_column_, 'FM999999999990.999999') AS NUMERIC);
Note that you would have to modify the FM pattern to match the maximum expected precision and scale of your _column_ field. For more details on the FM pattern modifier and/or the to_char(..) function see the PostgreSQL docs here and here.
Edit: Also, see the following post on the gnumed-devel mailing list for a longer and more thorough explanation on this approach.
Be careful with all the answers here. Although this looks like a simple problem, it's not.
If you have pg 13 or higher, you should use trim_scale (there is an answer about that already). If not, here is my "Polyfill":
DO $x$
BEGIN
IF count(*)=0 FROM pg_proc where proname='trim_scale' THEN
CREATE FUNCTION trim_scale(numeric) RETURNS numeric AS $$
SELECT CASE WHEN trim($1::text, '0')::numeric = $1 THEN trim($1::text, '0')::numeric ELSE $1 END $$
LANGUAGE SQL;
END IF;
END;
$x$;
And here is a query for testing the answers:
WITH test as (SELECT unnest(string_to_array('1|2.0|0030.00|4.123456000|300000','|'))::numeric _column_)
SELECT _column_ original,
trim(trailing '00' FROM _column_::text)::numeric accepted_answer,
CAST(to_char(_column_, 'FM999999999990.999') AS NUMERIC) another_fancy_one,
CASE WHEN trim(_column_::text, '0')::numeric = _column_ THEN trim(_column_::text, '0')::numeric ELSE _column_ END my FROM test;
Well... it looks like, I'm trying to show the flaws of the earlier answers, while just can't come up with other testcases. Maybe you should write more, if you can.
I'm like short syntax instead of fancy sql keywords, so I always go with :: over CAST and function call with comma separated args over constructs like trim(trailing '00' FROM _column_). But it's a personal taste only, you should check your company or team standards (and fight for change them XD)