how to delete single quotes but not apostrophes in perl - perl

I would like to know how to delete single quotes but not apostrophes in perl.
For example:
'It's raining again!'
print
It's raining again!
Thanks so much

If you assume that a single-quote is always preceded or followed by whitespace, the following pair of regular expressions should work:
$line =~ s/\s'/ /g; #preceded by whitespace
$line =~ s/'\s/ /g; #followed by whitespace
You also need to account for a string that starts or ends with a single quote:
$line =~ s/^'//; #at the start of the string
$line =~ s/'$//; #at the end of the string
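Putting the four substitutions together, a minimal sketch might look like the following. The helper name strip_quotes and the second sample sentence are made up for illustration; only the first sample comes from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: removes quote marks at the string edges or next to whitespace.
sub strip_quotes {
    my ($line) = @_;
    $line =~ s/^'//;       # quote at the start of the string
    $line =~ s/'$//;       # quote at the end of the string
    $line =~ s/\s'/ /g;    # quote preceded by whitespace
    $line =~ s/'\s/ /g;    # quote followed by whitespace
    return $line;
}

print strip_quotes("'It's raining again!'"), "\n";         # It's raining again!
print strip_quotes("He said 'hello' to Bob's dog"), "\n";  # He said hello to Bob's dog
```

Note that this assumption also strips the apostrophe from a plural possessive such as boys' toys, because that apostrophe is followed by whitespace.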

foreach (<DATA>) {
s/(?:^\s*'|'$)//g;
print;
}
__DATA__
'It's raining again!'
OUTPUT
It's raining again!
EXPLANATIONS
There's more than one way to do it.
(?:...) groups without capturing, which avoids an unneeded capture.

Tricky one. Quotation marks can also come right before or after letters, but the ones you want to keep are the apostrophes inside or attached to words. Perhaps something like this, using negative lookarounds:
s/(?<![\pL\s])'|'(?![\pL\s])//g;
This removes any single quote that is not preceded by a letter or whitespace, or that is not followed by a letter or whitespace. Lots of negations to keep track of there. The expanded version:
s/
(?<![\pL\s])' # no letters or whitespace before single quote
| # or
'(?![\pL\s]) # no letters or whitespace after single quote
//gx;
This will cover words like - as Eli Algranti pointed out in a comment - boys' toys and that's, but language is always tricky to predict. For example, it will be next to impossible to solve something like:
'She looked at him and said, 'That's impossible!''
Of course, if you expect your single quotes to appear only at end or beginning of string, you don't need to be this fancy, you can just remove the last and first character, with any means necessary. Such as, for example, as sputnik just suggested:
s/^'|'$//g;
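To see where the lookaround version succeeds and where it gives up, here is a small sketch that runs it over the strings mentioned above. The array names are only for illustration.

```perl
use strict;
use warnings;

my @samples = (
    "'It's raining again!'",
    "the boys' toys",
    "'She looked at him and said, 'That's impossible!''",
);

my @cleaned;
for my $text (@samples) {
    # Copy first so the original sample stays untouched.
    (my $copy = $text) =~ s/(?<![\pL\s])'|'(?![\pL\s])//g;
    push @cleaned, $copy;
    print "$copy\n";
}
# It's raining again!
# the boys' toys
# She looked at him and said, 'That's impossible!
```

The last line shows the limitation: the inner opening quote has whitespace before it and a letter after it, so it looks just like an apostrophe to the regex and is kept.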

Related

Need Regular expression - perl

I am looking for a regex for the expression below:
my $text = "1170 KB/s (244475 bytes in 2.204s)"; # I want to retrieve last ‘2.204’ from this String.
$text =~ m/\d+[^\d*](\d+)/; #Regexp
my $num = $1;
print " $num ";
Output:
204
But I need 2.204 as output, please correct me.
Can any one help me out?
The regex is doing exactly what you asked it to: it matches digits \d+, followed by one character that is neither a digit nor a star [^\d*], followed by digits \d+. The only place in your string that fits is 2.204, and since the parentheses enclose only the last \d+, $1 holds just 204.
If you want a quick fix, you can just move the parentheses:
m/(\d+[^\d*]\d+)/
This would (with the above input) match what you want. A more exact way to put it would be:
m/(\d+\.\d+)/
Of course this will match any floating-point number, so if your string can contain more than one of those, that's not a good idea. You can shore it up by using an anchor, like so:
m/(\d+\.\d+)s\)/
Where s\) forces the match to occur at only that place. Further strictures:
m/\(\d+\D+(\d+\.\d+)s\)/
You might also want to account for the possibility of your target number not being a float:
m/\(\d+\D+(\d+\.?\d*)s\)/
By using ? and * we allow for those parts not to match at all. This is not recommended to do unless you are using anchors. You can also replace everything in the capture group with [\d.]+.
If you are not fond of matching the parentheses, you can match the text:
m/bytes in ([\d.]+)s/
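As a rough illustration of why the anchoring text matters, the sketch below compares the bare float pattern with the "bytes in" version. The second input string is invented for the comparison; only the first comes from the question.

```perl
use strict;
use warnings;

my $text  = "1170 KB/s (244475 bytes in 2.204s)";
my $other = "1.5 MB/s (244475 bytes in 2.204s)";   # hypothetical input with two floats

# The unanchored pattern returns the first float it finds.
my ($loose_text)  = $text  =~ /(\d+\.\d+)/;
my ($loose_other) = $other =~ /(\d+\.\d+)/;

# Anchoring on the surrounding text pins the match to the elapsed time.
my ($anchored_text)  = $text  =~ /bytes in ([\d.]+)s/;
my ($anchored_other) = $other =~ /bytes in ([\d.]+)s/;

print "$loose_text $loose_other\n";        # 2.204 1.5
print "$anchored_text $anchored_other\n";  # 2.204 2.204
```

Matching in list context, as with my ($x) = ..., assigns the capture directly instead of going through $1.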
I'd go with the second marker as indicator where you are in the string:
my ($num) = ($text =~ /(\d+\.\d+)s/);
with explanations:
/( # start of matching group
\d+ # first digits
\. # a literal '.', take \D if you want non-numbers
\d+ # second digits
)/x # close the matching group and the regex
You had the capture group in the wrong place. Also, [^\d] is a bit excessive: you can generally negate the backslashed character classes (\d, \h, \s and \w) by using the corresponding uppercase letter, so [^\d] is simply \D.
Try this regex:
$text =~ m/\d+[^\d]*(\d+\.?\d*)s/;
That should match 1+ digits, a decimal point if there is one, 0 or more decimal places, and make sure it's followed by a "s".

Inserting newline between matched text is overwriting

I have a text file. I am trying to search for a comma followed by a date, like ",08/18/2014".
It looks like my code finds it, but it replaces it with a comma and a new line and removes everything after it.
if ($firstline =~ s|\,(\d{2}\/.*)|\,\n|g){
print "$firstline";
How do I add a new line between the comma and the date and not remove my text and date?
Thanks.
You can reference your capture groups in your replace expression. I also expanded the regular expression to match the date exactly:
if ($firstline =~ s|,(\d{2}/\d{2}/\d{4})|,\n$1|g) {
print "$firstline\n";
}
Using look ahead/behind:
s|(?<=,)(?=\d{2}\/)|\n|g
or
s|,\K(?=\d{2}\/)|\n|g # Probably faster, but requires 5.10+
Similar to Hunter's answer, without the extended regex, like so:
(Oh, and I don't think you need the backslash before the comma):
$firstline = "hello world ,10/20/1987, sounds great to me"; # testing...
if ($firstline =~ s|,(\d{2}/)|,\n$1|g){
print "$firstline";
}
output:
hello world ,
10/20/1987, sounds great to me
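For comparison, here is a small sketch that applies both the capture-group form and the \K form to the same test line, matching the full MM/DD/YYYY date. The variable names are illustrative, and \K needs Perl 5.10 or later.

```perl
use strict;
use warnings;

my $firstline = "hello world ,10/20/1987, sounds great to me";

# Capture the date and put it back after the inserted newline.
(my $with_capture = $firstline) =~ s|,(\d{2}/\d{2}/\d{4})|,\n$1|g;

# Keep the comma with \K and only assert the date with a lookahead.
(my $with_keep = $firstline) =~ s|,\K(?=\d{2}/\d{2}/\d{4})|\n|g;

print "$with_capture\n";
print "same result\n" if $with_capture eq $with_keep;
```

In both cases nothing from the original line is lost: the date is either re-inserted through $1 or never consumed in the first place.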

Perl search and replace: issue caused by "\"

I am parsing a text doc and replacing some text. Lines of text without the "\" seem to be found and replaced no issues.
By the way this is to be done in Perl
I have a string like below:
Path=S:\2014 March\Test Scenarios\load\2014 March
that contains "\" that slash is an issue. I am using a simple search and replace line of code
$nExit =~ s/$sMatchPattern/$sFullReplacementString/;
How should I do it?
I suspect that you're trying to match a literal string, and therefore need to escape regex special characters.
You can use quotemeta or the escape codes \Q ... \E to do that:
$nExit =~ s/\Q$sMatchPattern\E/$sFullReplacementString/;
The above variable $sMatchPattern will be interpolated, but then any special characters will be escaped before the regex is compiled. Therefore the value of $sMatchPattern will be treated like a literal string.
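A minimal sketch of the idea, using the path from the question. The pattern and replacement values here are assumptions picked for illustration, since the question does not show them.

```perl
use strict;
use warnings;

my $nExit                  = 'Path=S:\2014 March\Test Scenarios\load\2014 March';
my $sMatchPattern          = 'S:\2014 March';   # assumed search text
my $sFullReplacementString = 'T:\2015 April';   # assumed replacement text

# \Q...\E escapes the backslash, colon and space before the regex is compiled.
$nExit =~ s/\Q$sMatchPattern\E/$sFullReplacementString/;
print "$nExit\n";   # Path=T:\2015 April\Test Scenarios\load\2014 March

# quotemeta does the same escaping ahead of time.
my $quoted = quotemeta $sMatchPattern;
print "still matches literally\n" if 'S:\2014 March' =~ /^$quoted$/;
```

Only the first occurrence is replaced because there is no /g flag. The replacement side needs no escaping: the contents of an interpolated variable are inserted as-is.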
Is this string read as input, or is it embedded in your program? If it is embedded, you could use forward slashes to get rid of the backslash character:
my $path = "S:/2014 March/Test Scenarios/load/2014 March";
By the way, it's best not to have spaces in file and path names. They can be a bit problematic in certain situations. If you can't eliminate them, it's understandable.
Two things you should look at:
Use quotemeta which can help quote special characters in strings and allow you to use them in substitutions. Even if you had backslashes in your strings, quotemeta will handle them.
You don't have to use / as separators in match and substitutions. Instead, you can substitute various other characters.
These are all the same:
$string =~ s/$regex/$replace/;
$string =~ s#$regex#$replace#;
$string =~ s|$regex|$replace|;
You can also use parentheses, square braces, or curly brackets:
$string =~ s($regex)($replace);
$string =~ s[$regex][$replace]; # Not really recommended because `[...]` is a common regex construct
$string =~ s{$regex}{$replace};
The advantage of these as regular expression quote-like characters is that they must be balanced, so if I had this:
my $string = "I have (parentheses) in my string";
my $regex = quotemeta "(parentheses)"; # Escape the parentheses so they match literally
my $replace = "{curly braces}";
$string =~ s($regex)($replace);
print "$string\n"; # Still works. This will be "I have {curly braces} in my string"
Even if my string contains these types of characters, as long as they're balanced, everything will still work.
For yours:
my $Path = 'S:\2014 March\Test Scenarios\load\2014 March';
$sMatchPattern = quotemeta $Path; # Quotes all metacharacters in the search text...
$nExit =~ s($sMatchPattern)($sFullReplacementString);
That should work for you.
If you want a literal \ in a double-quoted replacement or match string, don't forget to put another backslash in front of it, because the backslash is an escape character:
$sFullReplacementString = "\\";
That sets the string to a single \

How do I replace all occurrences of certain characters with their predecessors?

$s = "bla..bla";
$s =~ s/([^%])\./$1/g;
I think it should replace every . that does not follow a % with the character that comes before that dot.
But $s is then bla.bla, when
it should be blabla. Where is the problem? I know I can use quantifiers, but I need to do it this way.
When a global regular expression is searching a string it will not find overlapping matches.
The first match in your string will be a., which is replaced with a. When the regex engine resumes searching it starts at the next . so it sees .bla as the rest of the string, and your regex requires a character to match before the . so it cannot match again.
Instead, use a negative lookbehind to perform the assertion that the previous character is not %:
$s =~ s/(?<!%)\.//g;
Note that if you use a positive lookbehind like (?<=[^%]), you will not replace the . if it is the first character in the string.
The problem is that even with the /g flag, each substitution starts looking where the previous one left off. You're trying to replace a. with a and then a. with a, but the second replacement doesn't happen because the a has already been "swallowed" by the previous replacement.
One fix is to use a zero-width lookbehind assertion:
$s =~ s/(?<=[^%])\.//g;
which will remove any . that is not the first character in the string, and that is not preceded by %.
But you might actually want this:
$s =~ s/(?<!%)\.//g;
which will remove any . that is not preceded by %, even if it is the first character in the string.
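The difference between the three patterns is easier to see side by side. This is just a sketch; the second test string is made up so that it has a leading dot and a dot after %.

```perl
use strict;
use warnings;

my $s     = "bla..bla";
my $mixed = ".a.b%.c";   # hypothetical input: leading dot, plain dot, dot after %

# The capturing version consumes the character before the dot,
# so the second of two adjacent dots is never matched.
(my $consuming = $s) =~ s/([^%])\./$1/g;

# Lookbehinds match only the dot itself.
(my $fixed    = $s)     =~ s/(?<!%)\.//g;
(my $positive = $mixed) =~ s/(?<=[^%])\.//g;
(my $negative = $mixed) =~ s/(?<!%)\.//g;

print "$consuming\n";   # bla.bla
print "$fixed\n";       # blabla
print "$positive\n";    # .ab%.c
print "$negative\n";    # ab%.c
```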
Much simpler than look-behinds is to use:
$s =~ s/([^%])\.+/$1/g;
This replaces any string of one or more dots after a character other than % by nothing.

Perl- How do I insert a space before each capital letter except for the first occurrence or existing?

I have a string like:
SomeCamel WasEnteringText
I have found various ways of splitting up the string and inserting spaces with PHP's str_replace, but I need it in Perl.
Sometimes there may be a space before the string, sometimes not. Sometimes there will be a space inside the string, sometimes not.
I tried:
my $camel = "SomeCamel WasEnteringText";
#or
my $camel = " SomeCamel WasEntering Text";
$camel =~ s/^[A-Z]/\s[A-Z]/g;
#and
$camel =~ s/([\w']+)/\u$1/g;
and many more combinations of =~s//g; after much reading.
I need a guru to direct this camel towards an oasis of answers.
OK, based on the input below I now have:
$camel =~ s/([A-Z])/ $1/g;
$camel =~ s/^ //; # Strip out starting whitespace
$camel =~ s/([^[:space:]]+)/\u$1/g;
Which gets it done but seems excessive. Works though.
s/(?<!^)[A-Z][a-z]*+(?!\s+)\K/ /g;
And the less "screw this horsecrap" version:
s/
(?<!^) #Something not following the start of line,
[A-Z][a-z]*+ #That starts with a capital letter and is followed by
#Zero or more lowercased letters, not giving anything back,
(?!\s+) #Not followed by one or more spaces,
\K #Better explained here [1]
/ /gx; #"Replace" it with a space.
EDIT: I noticed that this also adds extra whitespace when you add punctuation into the mix, which probably isn't what the OP wants; thankfully, the fix is simply changing the negative look ahead from \s+ to \W+. Although now I'm beginning to wonder why I actually added those pluses. Drats, me!
EDIT2: Erm, apologies, originally forgot the /g flag.
EDIT3: Okay, someone downvote me. I really dropped the ball on this one: there is no need for the negative lookbehind for ^. Hopefully fixed:
s/[A-Z][a-z]*+(?!\W)\K/ /gx;
1: http://perldoc.perl.org/perlre.html
Try:
$camel =~ s/(?<! )([A-Z])/ $1/g; # Search for "(?<!pattern)" in perldoc perlre
$camel =~ s/^ (?=[A-Z])//; # Strip out extra starting whitespace followed by A-Z
Please note that the obvious attempt, $camel =~ s/([^ ])([A-Z])/$1 $2/g;, has a bug: it doesn't work if capital letters follow one another, because each match consumes two characters (e.g. "ABCD" will be transformed into "A BC D" and not "A B C D").
Try :
s/(?<=[a-z])(?=[A-Z])/ /g
This inserts a space after a lowercase character (i.e. not after a space or at the start of the string) and before an uppercase character.
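Here is a quick sketch running that substitution over the two sample strings from the question. The brackets in the output are only there to make leading whitespace visible.

```perl
use strict;
use warnings;

my @inputs = ("SomeCamel WasEnteringText", " SomeCamel WasEntering Text");

my @spaced;
for my $camel (@inputs) {
    # Zero-width match: the position between a lowercase and an uppercase letter.
    (my $copy = $camel) =~ s/(?<=[a-z])(?=[A-Z])/ /g;
    push @spaced, $copy;
    print "[$copy]\n";
}
# [Some Camel Was Entering Text]
# [ Some Camel Was Entering Text]
```

Because both assertions are zero-width, nothing is consumed and existing spaces are left alone. A leading space in the input is kept, so strip it separately if that matters.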
Improving ...
... on Hughmeir's, this works also with numbers and words starting with lower-case letters.
s/[a-z0-9]+(?=[A-Z])\K/ /gx
Tests (with _ as the replacement instead of a space, so the insertions are visible)
myBrainIsBleeding => my_Brain_Is_Bleeding
MyBrainIsBleeding => My_Brain_Is_Bleeding
myBRAInIsBLLEding => my_BRAIn_Is_BLLEding
MYBrainIsB0leeding => MYBrain_Is_B0leeding
0My0BrainIs0Bleeding0 => 0_My0_Brain_Is0_Bleeding0