Why does this substitution with a regular expression in PostgreSQL not work?

I cannot understand why this line is not working:
UPDATE <table_name> SET "Raw_data" = regexp_replace("Raw_data", '\s', '_')
I just want to replace every whitespace character in the cells of the field "Raw_data" with an underscore.
The query runs without any error, but it does not work.

Related

Alphanumeric substitution with vim

I'm using the VS Code Vim plugin. I have a bunch of lines that look like:
Terry,169,80,,,47,,,22,,,6,,
I want to remove all the alphanumeric characters after the first comma so I get:
Terry,,,,,,,,,,,,,
In command mode I tried:
s/^.+\,[a-zA-Z0-9-]\+//g
But this does not appear to do anything. How can I get this working?
edit:
s/^[^,]\+,[a-zA-Z0-9-]\+//g
\+ is greedy; ^.\+, eats the entire line up to the last ,.
Instead of the dot (which means "any character") use [^,] which means "any but a comma". Then ^[^,]\+, means "any characters up to the first comma".
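To see the difference between the two prefixes outside of Vim, here is a small Python sketch. It is only an illustration: Python's re module writes + without the backslash, and the variable names are made up for this example. It applies both patterns to the sample line from the question:

```python
import re

line = "Terry,169,80,,,47,,,22,,,6,,"

# Greedy dot: runs on to the LAST comma, which here is the end of the line.
greedy_prefix = re.match(r"^.+,", line).group(0)

# Negated class: cannot cross a comma, so it stops at the FIRST one.
first_field_prefix = re.match(r"^[^,]+,", line).group(0)

print(greedy_prefix)
print(first_field_prefix)
```

The first pattern swallows the whole line, which is why nothing sensible is left for the rest of the substitution to work on.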
The problem with your requirement is that you want to anchor at the beginning using ^ so you cannot use flag g — with the anchor any substitution will be done once. The only way I can solve the puzzle is to use expressions: match and preserve the anchored text and then use function substitute() with flag g.
I managed with the following expression:
:s/\(^[^,]\+\)\(,\+\)\(.\+\)$/\=submatch(1) . submatch(2) . substitute(submatch(3), '[^,]', '', 'g')/
Let me split it in parts. Searching:
\(^[^,]\+\) - first, match one or more non-commas, anchored at the start of the line
\(,\+\) - one or more commas
\(.\+\)$ - all remaining chars to the end of the line
Substituting:
\= - the substitution is an expression
See http://vimdoc.sourceforge.net/htmldoc/change.html#sub-replace-expression
submatch(1) - replace with the first match (non-commas anchored with ^)
submatch(2) - replace with the second match (commas)
substitute(submatch(3), '[^,]', '', 'g') - replace in the rest of the string
The last call to substitute() is simple, it replaces all non-commas with empty strings.
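The same three-group idea can be sketched in Python, purely as an illustration and not as a Vim replacement. The function name strip_after_first_field is hypothetical; the groups mirror submatch(1), submatch(2) and submatch(3) above:

```python
import re

def strip_after_first_field(line):
    """Keep the first field and every comma; drop all other characters."""
    # group 1: first field, group 2: the commas right after it, group 3: the rest
    m = re.match(r"([^,]+)(,+)(.+)$", line)
    if m is None:
        # No comma followed by more text: leave the line untouched.
        return line
    # Equivalent of substitute(submatch(3), '[^,]', '', 'g')
    rest = re.sub(r"[^,]", "", m.group(3))
    return m.group(1) + m.group(2) + rest

sample = "Terry,169,80,,,47,,,22,,,6,,"
result = strip_after_first_field(sample)
print(result)
```

The result keeps exactly as many commas as the input line had, so the column layout is preserved.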
PS. Tested in real vim, not vscode.

LTRIM does not remove leading space in SQL

I load data from a text file and it seems that it does not contain leading space, however when I SELECT from a table, I see the leading space but cannot remove it with a LTRIM function:
SELECT ltrim(DATA) FROM MYTABLE WHERE LineNumber = 4
I'm getting the following:
T000000000004
with a single leading space before T
When I do select convert(varbinary,data) from mytable, that's what I get:
0x0A54303030303030303030303034
In the file it looks ok: T000000000004 - no leading space and it starts from the first character in the file. However, in the table it's inserted with a leading space.
How can I fix it?
As HABO mentioned, your value doesn't start with a space, it doesn't actually have any white space in it at all, it has a leading Line Feed (character 10, or 0X0A).
To remove these, and any carriage returns you might have too, you can use REPLACE:
REPLACE(REPLACE(data,CHAR(10),''),CHAR(13),'')
LTRIM and RTRIM only remove leading/trailing spaces (character 32). They do not remove line feeds, carriage returns or anything else.
If there could be a range of leading characters, and you want to remove all of them up to say the first alphanumerical character, you can use PATINDEX and STUFF:
SELECT STUFF(V.[data],1,PATINDEX('%[A-Za-z0-9]%',V.[data])-1,'')
FROM (VALUES(CHAR(10) + CHAR(13) + ' -T000000129B'))V([data])
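As a cross-check outside the database, the varbinary dump from the question can be decoded in a few lines of Python. This is only a sketch: the variable names are illustrative, and the regular expression stands in for the PATINDEX/STUFF approach rather than reproducing it exactly.

```python
import re

# The varbinary value from the question, without the 0x prefix.
raw = bytes.fromhex("0A54303030303030303030303034")
text = raw.decode("ascii")
print(repr(text))  # the first character is a line feed, not a space

# Same idea as the nested REPLACE: drop LF and CR wherever they occur.
cleaned = text.replace("\n", "").replace("\r", "")

# Same idea as PATINDEX + STUFF: drop everything before the first
# letter or digit.
sample = "\n\r -T000000129B"
stripped = re.sub(r"^[^A-Za-z0-9]+", "", sample)
print(cleaned, stripped)
```

Printing with repr() makes the invisible character visible, which is usually the quickest way to tell a space from a control character.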

What does this command mean? sed 's,^.*/,,'

This is a really unusual use of sed for me. I'm used to 's/pattern1/pattern2/g'.
Can someone help explain it?
The input string is just like the following:
path1/path2/path3/fileA path1/path2/path3/fileB path1/path2/path3/fileC
the output is fileA fileB fileC.
It's a substitute command using ',' instead of '/' as a separator - probably because there's a '/' in the pattern. It's equivalent to
s/^.*\///
which says remove everything from beginning of line to the last forward slash.
When you use 's' the next character is used as the separator. So you could also write it as
s!^.*/!!
s#^.*/##
etc
using a different separator saves you having to escape instances of the separator in your patterns.
Your example input:
path1/path2/path3/fileA
'^' means 'from the start of the string', and '.*' means 'match anything'. It is 'greedy', so it tries to match as much of the string as possible. '.*/' greedily matches anything so long as it's followed by a '/', and because it's greedy, that includes other slashes, so it matches path1/path2/path3/. The replacement is '', i.e. nothing, so the command removes everything from the start of the string to the last '/', leaving just fileA.
TL;DR: It means "remove path information and leave just the filename"
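For comparison, the same pattern behaves identically in Python's re module. The sketch below is an illustration, not sed itself, and it assumes each path is processed as its own line:

```python
import re

paths = [
    "path1/path2/path3/fileA",
    "path1/path2/path3/fileB",
    "path1/path2/path3/fileC",
]

# ^.*/ greedily matches from the start of the string through the LAST slash;
# replacing that match with nothing leaves only the file name.
names = [re.sub(r"^.*/", "", p, count=1) for p in paths]
print(names)
```

count=1 mirrors the sed command, which has no g flag and therefore substitutes once per line.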

meaning of the following regular expressions written in perl

Here is a piece of code
while($l=~/(\\\s*)$/) {
statements;
}
$l contains a line of text taken from a file; in effect this code goes through the lines of the file.
Questions:
I don't clearly understand what the condition in the while is doing. I think it is trying to match a group consisting of a \ followed by some number of whitespace characters at the end of the line, and that the loop should stop whenever a line ends with \ and maybe some whitespace. I am not sure of it.
I came across the statement $a =~ s/^(.*$)/$1/ . As I understand it, ^ forces matching at the beginning of the string, but (.*$) would mean match all the characters up to the end of the string. Does it mean that the statement is trying to find out whether any group of characters at the end is the same as a group of characters at the beginning of the text?
It is interesting to note that this statement:
while ( $l =~ /(\\\s*)$/ ) {
Is an infinite loop unless $l is altered inside the loop so that the regex no longer matches. As has already been mentioned by others, this is what it matches:
( ... ) a capture group, captures string to $1 (that's the number one, not lower case L)
\\ matches a literal backslash
\s* matches 0 or more whitespace characters.
$ matches at the end of the string, or just before a newline at the end of the string.
Since you do not have the /g modifier, this regex will not iterate through matches, it will simply check if there is a match, resetting the regex each iteration, thereby causing an endless loop.
The statement
$a =~ s/^(.*$)/$1/
Looks rather pointless. It captures a string of characters up until end of string, then replaces it with itself. The captured text is stored in $1 and is simply replaced. The only marginally useful thing about this regex is that:
It matches up until newline \n, and nothing further, which may be of some use to a parser. A period . matches any character except newline, unless the /s modifier is present on the regex.
It captures the line in $1 for future use. However, a simple /^(.*$)/ would do the same.
1. the while
Usually while (regex) is used with the /g modifier, otherwise, if it matches, you get an infinite loop (unless you exit the loop, like using last).
statements would be executed continuously in an infinite loop.
In your case, adding the g
while($l=~/(\\\s*)$/g)
will have the while make only one loop, due to the $ - making a match unique (whatever matches up to the end of string is unique, as $ marks the end, and there is nothing after...).
2. $a =~ s/^(.*$)/$1/
This is a substitution. If the pattern ^.*$ matches (and it will, since ^.*$ matches almost anything) the match is replaced with $1, i.e. whatever is inside the (), which is the match itself, since the match runs from the first char to the end of the string
^ means beginning of string
(.*) means all chars
$ end of string
so that will replace $a with itself - probably not what you want.
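A quick way to convince yourself that the substitution changes nothing is to run the equivalent in another regex engine. The sketch below uses Python's re.sub as a stand-in for Perl's s///; the function name is hypothetical, and count=1 mirrors the missing /g modifier:

```python
import re

def identity_substitution(text):
    # Capture everything from the start to the end and put it straight back.
    return re.sub(r"^(.*$)", r"\1", text, count=1)

samples = ["hello world", "", "ends with newline\n"]
results = [identity_substitution(s) for s in samples]
print(results == samples)
```

Every sample comes back unchanged, which is why the statement is only useful for its side effect of filling the capture group.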
It matches a literal backslash followed by 0 or more whitespace characters followed by the end of the line.
It executes statements for all the lines in that text file that contain a \, followed by zero or more whitespace characters ( \s* ), at the end of the line ($).
It matches lines that end with a backslash character, ignoring any trailing whitespace characters.
Ending a line with a backslash is used in some languages and data files to indicate that the line is being continued on the next line. So I suspect this is part of a parser that merges these continuation lines.
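If that guess is right, the loop body presumably strips the trailing backslash and appends the next line, which is also what makes the loop terminate. A minimal Python sketch of such a merger, assuming that behaviour (merge_continuations is a hypothetical name, and none of this is taken from the original Perl code):

```python
import re

# Same pattern as in the question: a backslash, optional whitespace, end of line.
CONTINUATION = re.compile(r"(\\\s*)$")

def merge_continuations(lines):
    """Join lines that end in a backslash with the line that follows them."""
    merged = []
    it = iter(lines)
    for line in it:
        # The loop ends because `line` is changed on every pass,
        # so the pattern eventually stops matching.
        while CONTINUATION.search(line):
            line = CONTINUATION.sub("", line)  # drop the backslash and trailing whitespace
            following = next(it, None)
            if following is None:
                break  # a backslash on the last line has nothing to join with
            line += following
        merged.append(line)
    return merged

merged_lines = merge_continuations(["a = 1 + \\  ", "2", "b = 3"])
print(merged_lines)
```

Without the line that rewrites `line`, the while condition would stay true forever, which is the infinite loop the other answers warn about.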
If you enter a regular expression at RegExr and hover your mouse over the pieces, it displays the meaning of each piece in a tooltip.
(\\\s*)$ this regex means: a \ followed by zero or more whitespace characters, followed by the end of the line. Since you have your regex in (...), you can extract what you matched using $1, if you need.
http://rubular.com/r/dtHtEPh5DX
EDIT -- based on your update
$a =~ s/^(.$)/$1/ : this is search and replace. Your regex matches a line which contains exactly one character (since you use . http://www.regular-expressions.info/dot.html), which can be anything except a newline character. Since you use (...), the character which matched the regex is captured in $1 and then written back into $a unchanged.
EDIT -- you changed your regex so here is the updated answer
$a =~ s/^(.*$)/$1/ : same as above, except now it matches zero or more characters (except newline)

How to replace a character with null

I have a string
"8.53" and I want my resulting string to be "853".
I have tried
the following code:
tr|.||;
but it does not replace anything; it still gives 8.53.
I have tried another way using
tr|.|NULL|;
but that gives 8N53. Can anyone please suggest how to use tr to replace a character with nothing?
Thanks
You need to specify the d modifier to delete chars with no corresponding char:
tr/.//d;
Or you could use the (slower but more familiar) substitution operator:
s/\.//g;
You don't want tr because that transliterates characters from the 1st list with the corresponding character in the 2nd list (which was N in your example since that was the first character). You'll want the substitution operator.
my $var = "8.53";
$var =~ s/\.//;
print $var;
Add the g flag if there are multiple instances you want to replace (s/\.//g).
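The three behaviours discussed above (delete, transliterate, substitute) can be compared side by side in Python, shown here only as an analogy to the Perl operators. str.translate plays the role of tr, and since str.maketrans needs lists of equal length, the "N" is written out explicitly where Perl would simply ignore the rest of NULL:

```python
value = "8.53"

# Like tr/.//d : characters in the third argument are deleted.
deleted = value.translate(str.maketrans("", "", "."))

# Like tr|.|NULL| : "." is mapped to the first replacement character, "N".
mapped = value.translate(str.maketrans(".", "N"))

# Like s/\.//g : every literal dot is replaced with the empty string.
substituted = value.replace(".", "")

print(deleted, mapped, substituted)
```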