Perl search and replace: issues caused by "\"

I am parsing a text document in Perl and replacing some text. Lines without a "\" are found and replaced with no issues.
I have a string like the one below:
Path=S:\2014 March\Test Scenarios\load\2014 March
It contains "\", and that backslash is the problem. I am using a simple search-and-replace line of code:
$nExit =~ s/$sMatchPattern/$sFullReplacementString/;
How should I do it?

I suspect that you're trying to match a literal string, and therefore need to escape regex special characters.
You can use quotemeta or the escape codes \Q ... \E to do that:
$nExit =~ s/\Q$sMatchPattern/$sFullReplacementString/;
The above variable $sMatchPattern will be interpolated, but then any special characters will be escaped before the regex is compiled. Therefore the value of $sMatchPattern will be treated like a literal string.
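A minimal sketch of this, using the path from the question. The replacement value and the $line and $result variables are made up for illustration.

```perl
use strict;
use warnings;

# Path from the question; the replacement path is a made-up example.
my $sMatchPattern          = 'S:\2014 March\Test Scenarios\load\2014 March';
my $sFullReplacementString = 'T:\archive';

my $line = 'Path=S:\2014 March\Test Scenarios\load\2014 March';

# \Q...\E escapes the backslashes (and every other non-word character)
# in the interpolated value, so the path is matched literally.
(my $result = $line) =~ s/\Q$sMatchPattern\E/$sFullReplacementString/;

print "$result\n";   # Path=T:\archive
```

The replacement side needs no quoting: a variable interpolated there is inserted as-is.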

Is this string read as input, or is it embedded in your program? If it is embedded, you could use forward slashes to get rid of the backslash characters:
my $path = "S:/2014 March/Test Scenarios/load/2014 March";
By the way, it's best not to have spaces in file and path names. They can be a bit problematic in certain situations. If you can't eliminate them, it's understandable.
Two things you should look at:
Use quotemeta which can help quote special characters in strings and allow you to use them in substitutions. Even if you had backslashes in your strings, quotemeta will handle them.
You don't have to use / as separators in match and substitutions. Instead, you can substitute various other characters.
These are all the same:
$string =~ s/$regex/$replace/;
$string =~ s#$regex#$replace#;
$string =~ s|$regex|$replace|;
You can also use parentheses, square brackets, or curly braces:
$string =~ s($regex)($replace);
$string =~ s[$regex][$replace]; # Not really recommended because `[...]` is common inside regexes
$string =~ s{$regex}{$replace};
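The advantage of bracketing delimiters is that they nest: the pattern and replacement can themselves contain the delimiter characters, as long as they are balanced. Interpolated variables are not scanned for delimiters at all, so this works:
my $string = "I have (parentheses) in my string";
my $regex = quotemeta "(parentheses)"; # escape the parentheses so they match literally
my $replace = "{curly braces}";
$string =~ s($regex)($replace);
print "$string\n"; # Still works. This will be "I have {curly braces} in my string"
Even if the pattern or replacement contains these types of characters, everything still works as long as any that are typed literally between the delimiters are balanced.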
For yours:
my $sMatchPattern = quotemeta 'S:\2014 March\Test Scenarios\load\2014 March'; # Quotes all metacharacters in the pattern
$nExit =~ s($sMatchPattern)($sFullReplacementString);
That should work for you.

If you want a literal \ in your replacement string or match string, don't forget to put another backslash in front of it, because the backslash is the escape character:
$sFullReplacementString = "\\";
That sets the string to a single \
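A short sketch of what that escaping produces. The forward-slash path and the variable names are assumptions chosen for the example.

```perl
use strict;
use warnings;

my $double = "\\";   # double-quoted: two typed backslashes, one stored
my $single = '\\';   # single-quoted: also one stored backslash
print length($double), " ", length($single), "\n";

# Use the one-character string as a replacement (made-up path).
(my $path = 'S:/load/2014 March') =~ s{/}{$double}g;
print "$path\n";
```

Both lengths come out as 1, and each forward slash in $path is replaced by a single backslash.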

Related

Perl q function or single quote doesn't return the string literal of UNC path correctly

Perl's q function or single quote is supposed to return the string literal as typed (except \'). But it doesn't work as expected for the following scenario.
I want to print the following UNC path
\\dir1\dir2\dir3
So I have used
my $path = q(\\dir1\dir2\dir3);
OR
my $path = '\\dir1\dir2\dir3';
But this skips one backslash at the front.
So if I print it i.e. print $path; it prints
\dir1\dir2\dir3
I want to know why? I have to type 3 or 4 backslashes at the beginning of the UNC path to make it work as expected. What am I missing?
From perldoc perlop:
q/STRING/
'STRING'
A single-quoted, literal string. A backslash represents a backslash unless followed by the delimiter or another backslash, in which case the delimiter or backslash is interpolated.
Change:
my $path = q(\\dir1\dir2\dir3);
to:
my $path = q(\\\dir1\dir2\dir3);
As for why, it's because Perl lets you include the quote delimiter in your string by escaping it with a backslash:
my $single_quote = 'This is a single quote: \'';
But if a backslash before the delimiter always escaped the delimiter, there would be no way to end a string with a backslash:
my $backslash = 'This is a backslash: \'; # nope
Allowing backslashes to be escaped too takes care of that:
my $backslash = 'This is a backslash: \\';
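To see the rule in action, this sketch compares two, three, and four leading backslashes in q(); the variable names are only illustrative.

```perl
use strict;
use warnings;

my $two   = q(\\dir1\dir2\dir3);     # leading \\ collapses to one backslash
my $three = q(\\\dir1\dir2\dir3);    # \\ becomes \, then \d stays as typed
my $four  = q(\\\\dir1\dir2\dir3);   # two escaped pairs, two backslashes

print "$two\n$three\n$four\n";
```

The second and third lines printed are identical UNC paths, which is why both three and four backslashes appear to work.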
Interestingly enough, there is only one way to type a double backslash in a Perl string literal without it being collapsed into a single backslash.
As the other answers showed, every quote operator collapses a backslash that is directly followed by another backslash into a single one.
The only way to get the double backslashes to come out exactly as you typed them is to use a single-quoted here-doc.
my $path = <<'VISTA';
\\dir1\dir2\dir3
VISTA
chomp $path;
print $path."\n";
Would print it exactly as you've typed it in.

Java like trim function for perl

Is there a Java-like trim function for Perl?
I am looking for a function in Perl that removes all leading and trailing characters at or below 0x20, like Java's trim does.
Consider calling the function on the following string:
my $string = "\N{U+0020}\N{U+001f}\N{U+001e}\N{U+001d}\N{U+001c}\N{U+001b}\N{U+001a}\N{U+0019}\N{U+0018}\N{U+0017}\N{U+0016}\N{U+0015}\N{U+0014}\N{U+0013}\N{U+0012}\N{U+0011}Hello Moto\N{U+0010}\N{U+000f}\N{U+000e}\N{U+000d}\N{U+000c}\N{U+000b}\N{U+000a}\N{U+0009}\N{U+0008}\N{U+0007}\N{U+0006}\N{U+0005}\N{U+0004}\N{U+0003}\N{U+0002}\N{U+0001}\N{U+0000}";
Only "Hello Moto" should be left.
The trim from String::Util only removes the leading space (\N{U+0020}).
The traditional ASCII way was to use
$string =~ s/^\s+|\s+$//g;
(i.e. remove whitespace (\s) from the beginning (^) and end ($) of the string).
U+001f is not whitespace, it's a control character. You can use Unicode properties in regular expressions with \p:
my $drop = qr/[\p{Space}\p{Cc}]+/;
$whitespace =~ s/^$drop|$drop$//g;
Or, more verbose:
$drop = qr/[\p{White_Space}\p{Cntrl}]+/;
Here $whitespace is the string being trimmed ($string in the question); you should probably change the name of the variable.
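If the goal is to mirror Java's trim exactly (strip only code points at or below U+0020, rather than every Unicode space or control character), an explicit character range is one option. This is a sketch; java_trim is a hypothetical helper name, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: strip leading and trailing characters whose
# code point is at or below U+0020, leaving interior ones alone.
sub java_trim {
    my ($s) = @_;
    $s =~ s/\A[\x00-\x20]+|[\x00-\x20]+\z//g;
    return $s;
}

my $trimmed = java_trim("\x20\x1f\x1eHello Moto\x10\x0a\x00");
print "[$trimmed]\n";   # [Hello Moto]
```

Unlike \p{Cc}, the range leaves U+007F through U+009F untouched, which matches the "at or below 0x20" rule.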

How to search for a string that contains no whitespace in perl

my $string3 = "anima ls";
my $t3 = $string3 =~ /[^\s]+/;
print "$t3\n";
I wanted to write a regex that matches only strings containing no whitespace. The code above matches even when the string contains a space.
The regex [^\s]+ searches for at least one character that is not whitespace, so it succeeds as soon as it finds any run of non-whitespace. It is better written as \S+, though. A regex that matches only strings that contain no whitespace character is instead
/^\S+$/
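A quick check of the anchored pattern on a few made-up inputs. One caveat worth knowing: $ also matches just before a trailing newline, so the sketch compares it with the stricter \A ... \z anchors.

```perl
use strict;
use warnings;

my @candidates = ("animals", "anima ls", "animals\n");

# /^\S+$/ as in the answer above; "$" also matches before a final newline.
my @with_dollar = map { /^\S+$/ ? 1 : 0 } @candidates;

# \A and \z anchor at the very start and very end of the string.
my @exact = map { /\A\S+\z/ ? 1 : 0 } @candidates;

print "@with_dollar\n@exact\n";
```

The first line printed is "1 0 1" and the second is "1 0 0": only the \z version rejects the string that ends in a newline.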

how to delete single quotes but not apostrophes in perl

I would like to know how to delete single quotes but not apostrophes in perl.
For example:
'It's raining again!'
should print
It's raining again!
Thanks so much
If you assume that a single-quote is always preceded or followed by whitespace, the following pair of regular expressions should work:
$line =~ s/\s'/ /g; #preceded by whitespace
$line =~ s/'\s/ /g; #followed by whitespace
You also need to account for a string that starts or ends with a single quote:
$line =~ s/^'//; #at the start of a string
$line =~ s/'$//; #at the end of a string
foreach (<DATA>) {
s/(?:^\s*'|'$)//g;
print;
}
__DATA__
'It's raining again!'
OUTPUT
It's raining again!
EXPLANATIONS
There's more than one way to do it.
(?:...) groups without creating an unneeded capture.
Tricky one. Some single quotes come after or before letters, but you want to keep the ones that act as apostrophes, such as those between letters. Perhaps something like this, using negative lookarounds:
s/(?<![\pL\s])'|'(?![\pL\s])//g;
This removes any single quote that has no letter or whitespace before it, or no letter or whitespace after it. Lots of negations to keep track of there. The expanded version:
s/
(?<![\pL\s])' # no letters or whitespace before single quote
| # or
'(?![\pL\s]) # no letters or whitespace after single quote
//gx;
This will cover words like - as Eli Algranti pointed out in a comment - boys' toys and that's, but language is always tricky to predict. For example, it will be next to impossible to solve something like:
'She looked at him and said, 'That's impossible!''
Of course, if you expect your single quotes to appear only at the end or beginning of the string, you don't need to be this fancy: you can just remove the first and last character by any means necessary. For example, as sputnik just suggested:
s/^'|'$//g;
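To get a feel for how the lookaround version behaves, here is a small sketch that runs it over a few sample strings. The samples other than the one from the question are made up.

```perl
use strict;
use warnings;

my @samples = ("'It's raining again!'", "the boys' toys", "'that's'");
my @cleaned;

for my $line (@samples) {
    # Remove a quote with no letter/whitespace before it,
    # or with no letter/whitespace after it.
    (my $clean = $line) =~ s/(?<![\pL\s])'|'(?![\pL\s])//g;
    push @cleaned, $clean;
    print "$clean\n";
}
```

The enclosing quotes are removed, while the apostrophes in It's, boys' and that's survive.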

How do I escape special characters for a substitution in a Perl one-liner?

Is there some way to replace a string containing characters such as # or * or ? or & without needing to put a "\" before each one?
Example:
perl -pe 'next if /^#/; s/\#d\&/new_value/ if /param5/' test
In this example I need to replace a #d& with new_value but the old value might contain any character, how do I escape only the characters that need to be escaped?
You have several problems:
You are using \b incorrectly
You are replacing code with shell variables
You need to quote metacharacters
From perldoc perlre
A word boundary ("\b") is a spot between two characters that has a "\w" on one side of it
Neither # nor & is a \w character, so your match is guaranteed to fail. You may want to use something like s/(^|\s)\#d\&(\s|$)/${1}new text$2/
(^|\s) says to match either the start of the string (^) or a whitespace character (\s).
(\s|$) says to match either the end of the string ($) or a whitespace character (\s).
To solve the second problem, you should use %ENV.
To solve the third problem, you should use the \Q and \E escape sequences to escape the value in $ENV{a}.
Putting it all together we get:
#!/bin/bash
export a='#d&'
export b='new text'
echo 'param5 #d&' |
perl -pe 'next if /^#/; s/(^|\s)\Q$ENV{a}\E(\s|$)/$1$ENV{b}$2/ if /param5/'
Which prints
param5 new text
As discussed at perldoc perlre:
...Today it is more common to use the quotemeta() function or the "\Q" metaquoting
escape sequence to disable all metacharacters' special meanings like this:
/$unquoted\Q$quoted\E$unquoted/
Beware that if you put literal backslashes (those not inside interpolated variables) between "\Q" and "\E", double-quotish backslash interpolation may
lead to confusing results. If you need to use literal backslashes within "\Q...\E", consult "Gory details of parsing quoted constructs" in perlop.
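You can also use a ' as the delimiter in the s/// operation, which turns off variable interpolation in both the pattern and the replacement (regex metacharacters such as . and * keep their special meaning):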
my $text = '#';
$text =~ s'#'1';
print $text;
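A small sketch of what single-quote delimiters do and do not change; the sample text and variable names are invented for the example.

```perl
use strict;
use warnings;

my $text = 'a.b axb';

# "." is still a regex metacharacter, so both "a.b" and "axb" match.
(my $any = $text) =~ s'a.b'X'g;

# The replacement is not interpolated: "$1" is inserted as typed.
(my $literal = $text) =~ s'a\.b'$1';

print "$any\n$literal\n";
```

The first result is "X X"; the second contains a literal dollar sign followed by 1. So this form helps with $ and @, but characters like . * ? still need a backslash or \Q ... \E.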
In your example, you can do (note the single quotes around the Perl code):
perl -pe 's/\Q#d&\E/new_value/g if m/param5/ and not /^ *#/'
The other answers have covered the question, now here's your meta-problem: Leaning Toothpick Syndrome. It's when the delimiters and escapes start to blur together:
s/\/foo\/bar\\/\/bar\/baz/
The solution is to use a different delimiter. You can use just about anything, but balanced braces work best. Most editors can parse them and you generally don't have to worry about escaping.
s{/foo/bar\\}{/bar/baz}
Here's your regex with braced delimiters.
s{\#d\&}{new_value}
Much easier on the eyeholes.
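If you really want to avoid typing the backslashes, put your search string into a variable and then use that in your regex instead. For this particular string you don't need quotemeta or \Q ... \E, because neither # nor & is a regex metacharacter (without the /x modifier); a string containing * or ? would still need them. For example: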
my $s = '#d&';
s/$s/new_value/g;
If you must use this in a one-liner, bear in mind that you will have to escape the $ characters if you wrap your Perl code in double quotes, or escape the ' characters if you wrap it in single quotes.
If you have a string like
my $var1 = 'abc$123';
and you want to replace it with abcd inside another string ($var2 here), then you have to use \Q \E. Without it the $ is treated as an end-of-string anchor, so Perl never finds a match and nothing is replaced.
This is the only thing that worked for me.
$var2 =~ s/\Q$var1\E/abcd/g;
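The difference is easy to check side by side. In this sketch the contents of $var2 and the $plain and $quoted copies are made up for illustration.

```perl
use strict;
use warnings;

my $var1 = 'abc$123';        # single quotes keep the literal $
my $var2 = 'id=abc$123;';    # made-up text to search in

# Without \Q the "$" in the interpolated value is an end-of-string
# anchor, so the pattern cannot match and nothing is replaced.
(my $plain = $var2) =~ s/$var1/abcd/g;

# With \Q...\E the value is matched as literal text.
(my $quoted = $var2) =~ s/\Q$var1\E/abcd/g;

print "$plain\n$quoted\n";
```

$plain comes out unchanged, while $quoted becomes id=abcd;.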