How to search for a string that contains no whitespace in Perl

my $string3 = "anima ls";
my $t3 = $string3 =~ /[^\s]+/;
print "$t3\n";
I wanted to write a regex that matches only strings containing no whitespace. The code above still reports a match even when the string contains a space.

The regex [^\s]+ searches for at least one character that is not whitespace, anywhere in the string, so it happily matches the "anima" in "anima ls". It is better written as \S+, though. To match only strings that contain no whitespace character at all, anchor the pattern at both ends:
/^\S+$/
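To make the difference concrete, here is a minimal sketch comparing the unanchored and the anchored pattern. The sample strings and the hash names are made up for illustration. Note that $ also allows a single trailing newline; if that matters, \z is the stricter end anchor.

```perl
use strict;
use warnings;

# Sample inputs chosen for illustration only.
my @samples = ("animals", "anima ls", " animals", "");

my %partial;   # result of the unanchored match
my %whole;     # result of the anchored match
for my $s (@samples) {
    # Unanchored: succeeds if a run of non-whitespace exists anywhere.
    $partial{$s} = ($s =~ /\S+/)   ? 1 : 0;
    # Anchored: succeeds only if the entire string is non-whitespace.
    $whole{$s}   = ($s =~ /^\S+$/) ? 1 : 0;
    printf "%-12s unanchored=%d anchored=%d\n", "'$s'", $partial{$s}, $whole{$s};
}
```

Only "animals" passes the anchored test; "anima ls" and " animals" pass only the unanchored one.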

Related

Perl - can't remove trailing characters at the end of string

I have some trailing characters at the end of a string peregrinevwap^_^_
print "JH 4 - app: $application \n";
app: peregrinevwap^_^_
Do you know why they are there and how I can remove them? I tried chomp, but that didn't work.
Check out the tr//cd operator to get rid of unwanted characters.
It's documented in "perldoc perlop"
$application =~ tr/a-zA-Z//cd;
Will remove everything except letters from the string and
$application =~ tr/^_//d;
Will remove all "^" and "_" characters.
If you only want to remove certain characters when they are at the end of the string, use the s/// search-and-replace operator with a regular expression and the $ anchor to match the end of the string.
Here's an example:
s/[\^_]*$//;
Let's hope the underscores do not occur at the end of your strings, otherwise you can't automatically separate them from these unwanted characters.
Are you sure these characters are actually ^ and _ characters?
^_ could also indicate Ctrl-Underscore, ASCII character 0x1F (Unit Separator). (Not a character I've ever seen used, but you never know.)
If this is in fact the case, you can remove them with something like:
$application =~ s/\x1F//g;
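One way to settle which case applies is to print the character codes before deciding what to strip. The following is only a sketch: the sample value assumes the trailing characters really are two 0x1F bytes, and the variable names $codes and $cleaned are made up.

```perl
use strict;
use warnings;

# Hypothetical value: the name followed by two Unit Separator bytes (0x1F).
my $application = "peregrinevwap\x1F\x1F";

# Show each character's code in hex to see what is really in the string.
my $codes = join ' ', map { sprintf '%02X', ord($_) } split //, $application;
print "$codes\n";

# Strip control characters only where they trail the string.
(my $cleaned = $application) =~ s/[[:cntrl:]]+\z//;
print "$cleaned\n";
```

If the dump shows 5E 5F instead of 1F, the characters are a literal ^ and _, and the s/[\^_]*$// approach is the one to use.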

Perl search and replace: issue caused by "\"

I am parsing a text document in Perl and replacing some text. Lines without a "\" are found and replaced with no issues.
I have a string like below:
Path=S:\2014 March\Test Scenarios\load\2014 March
The "\" characters in it are the problem. I am using a simple search-and-replace line of code:
$nExit =~ s/$sMatchPattern/$sFullReplacementString/;
How should I do it?
I suspect that you're trying to match a literal string, and therefore need to escape regex special characters.
You can use quotemeta or the escape codes \Q ... \E to do that:
$nExit =~ s/\Q$sMatchPattern\E/$sFullReplacementString/;
The above variable $sMatchPattern will be interpolated, but then any special characters will be escaped before the regex is compiled. Therefore the value of $sMatchPattern will be treated like a literal string.
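As a rough illustration, here is how that could look with the path from the question. The replacement value and the names $quoted, $escaped and $viaQuotemeta are invented for the example. Without the escaping, sequences such as \2 and \T in the path would be parsed as regex escapes rather than literal text.

```perl
use strict;
use warnings;

# Values modeled on the question; the replacement text is made up.
my $sMatchPattern          = 'S:\2014 March\Test Scenarios\load\2014 March';
my $sFullReplacementString = 'S:\archive';
my $nExit                  = 'Path=S:\2014 March\Test Scenarios\load\2014 March';

# \Q...\E escapes the interpolated value, so backslashes match literally.
(my $quoted = $nExit) =~ s/\Q$sMatchPattern\E/$sFullReplacementString/;
print "$quoted\n";

# quotemeta applies the same escaping as an explicit function call.
my $escaped = quotemeta $sMatchPattern;
(my $viaQuotemeta = $nExit) =~ s/$escaped/$sFullReplacementString/;
print "$viaQuotemeta\n";
```

Both substitutions give the same result, so which form to use is mostly a matter of taste.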
Is this string read as input, or is it embedded in your program? If it is embedded, you could use forward slashes to get rid of the backslash characters:
my $path = "S:/2014 March/Test Scenarios/load/2014 March";
By the way, it's best not to have spaces in file and path names. They can be a bit problematic in certain situations. If you can't eliminate them, it's understandable.
Two things you should look at:
Use quotemeta which can help quote special characters in strings and allow you to use them in substitutions. Even if you had backslashes in your strings, quotemeta will handle them.
You don't have to use / as separators in match and substitutions. Instead, you can substitute various other characters.
These are all the same:
$string =~ s/$regex/$replace/;
$string =~ s#$regex#$replace#;
$string =~ s|$regex|$replace|;
You can also use parentheses, square brackets, or curly braces:
$string =~ s($regex)($replace);
$string =~ s[$regex][$replace]; # Not really recommended because `[...]` is a common regex construct
$string =~ s{$regex}{$replace};
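The advantage of these bracketing delimiters is that they come in balanced pairs, and Perl finds them before it interpolates any variables, so bracket characters inside $regex or $replace never clash with the delimiters:
my $string = "I have (parentheses) in my string";
my $regex = "(parentheses)";
my $replace = "{curly braces}";
$string =~ s(\Q$regex\E)($replace);
print "$string\n"; # Prints "I have {curly braces} in my string"
Brackets typed literally in the pattern are also fine as long as they are balanced; otherwise, escape them with a backslash. Note the \Q...\E: without it, the parentheses in $regex would be treated as a capture group rather than as literal characters.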
For yours:
my $sMatchPattern = 'S:\2014 March\Test Scenarios\load\2014 March';
$sMatchPattern = quotemeta $sMatchPattern; # Quotes all metacharacters in the pattern
$nExit =~ s($sMatchPattern)($sFullReplacementString);
That should work for you.
If you want a literal \ in your replacement string or match string, don't forget to put another backslash in front of it, because the backslash is the escape character in double-quoted strings:
$sFullReplacementString = "\\";
That sets the string to a single \

split string with "."

I am trying to split a string with "." but getting nothing in the array. File name is "Head-First-Java-2nd-edition.pdf" After splitting I want to extract extension, but don't know why it is giving blank array.
my @fileInfo = split(/./, $filename);
&logMsg("Array is: @fileInfo");
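The split is giving an empty list because you are splitting on the wildcard character ".". The period is a metacharacter that matches any character, so every character in the string is treated as a separator, every field is empty, and split discards trailing empty fields. If you want to split on a literal period, you need to escape it: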
my @fileInfo = split(/\./, $filename);
Also, the syntax for calling a subroutine is NAME(LIST). Using the & prefix has a certain hidden feature, in that it circumvents prototypes. Read more in perldoc perlsub.
. in a regular expression means any character except \n. To split on a literal ., you need to escape it:
split /\./, $filename;
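Put together with the file name from the question, a small sketch might look like this. The variable names @wrong and $extension are made up, and taking the last field as the extension assumes the name contains at least one period.

```perl
use strict;
use warnings;

my $filename = "Head-First-Java-2nd-edition.pdf";

# Unescaped dot: every character is a separator, so all fields are empty
# and split drops them, leaving an empty list.
my @wrong = split /./, $filename;

# Escaped dot: split on the literal period only.
my @fileInfo  = split /\./, $filename;
my $extension = $fileInfo[-1];   # last field

print scalar(@wrong), " fields from /./\n";
print "extension: $extension\n";
```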

Splitting on whitespace character and strip empty fields

($red, $tapinfo) = split(/:/, $line);
@fields = split(/\s+/, $tapinfo);
In the array fields, I see that even space gets added. I want to eliminate the space so that fields only contains non-space characters. Please comment on what can be going wrong.
I assume you are talking about leading whitespace remaining, so that @fields looks something like:
$VAR1 = [
'', # empty field
'foo',
'bar'
];
This is because you are using /\s+/ for your split when you should be using the default ' ' (a single blank space character). This default behaviour will strip leading whitespace before splitting the string. In other words, you should do:
@fields = split(' ', $tapinfo);
This is documented in perldoc -f split:
As another special case, "split" emulates the default behavior
of the command line tool awk when the PATTERN is either omitted
or a *literal string* composed of a single space character (such
as ' ' or "\x20", but not e.g. "/ /"). In this case, any leading
whitespace in EXPR is removed before splitting occurs, and the
PATTERN is instead treated as if it were "/\s+/"; in particular,
this means that *any* contiguous whitespace (not just a single
space character) is used as a separator. However, this special
treatment can be avoided by specifying the pattern "/ /" instead
of the string " ", thereby allowing only a single space
character to be a separator.
What split does by default is the same as
my @list = $string =~ /\S+/g;
i.e. it finds all the contiguous substrings of non-whitespace characters.
You could use the regex, but to get the default behaviour from split, pass a single literal space character as the first parameter: a string, not a regex. This is the special case described in the perldoc -f split passage quoted above.
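Here is a short side-by-side sketch of the three approaches mentioned in these answers. The input value is invented; it just assumes $tapinfo starts with whitespace, as it would after splitting a line like "red:  foo   bar" on the colon.

```perl
use strict;
use warnings;

# Hypothetical input with leading whitespace.
my $tapinfo = "  foo   bar ";

my @regex_fields = split /\s+/, $tapinfo;   # leading empty field is kept
my @awk_fields   = split ' ',   $tapinfo;   # leading whitespace is skipped
my @matched      = $tapinfo =~ /\S+/g;      # collect non-whitespace runs

print scalar(@regex_fields), " fields with /\\s+/\n";
print scalar(@awk_fields),   " fields with ' '\n";
print scalar(@matched),      " fields with /\\S+/g\n";
```

With this input, the /\s+/ version returns three fields (the first one empty), while the other two return just foo and bar.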

How does Perl split work with string exactly?

Quoted from perldoc -f split:
As a special case, specifying a PATTERN of space (' ' ) will split on
white space just as split with no arguments does. Thus, split(' ') can
be used to emulate awk's default behavior, whereas split(/ /) will
give you as many initial null fields (empty string) as there are
leading spaces.
The above is all that's mentioned about how split deals with a string delimiter, but what's the general case? Are empty leading fields always deleted for string delimiters?
No, only when the delimiter is a string that is a single space. In any other case, the delimiter is interpreted as a regex pattern.
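A quick sketch of what that means in practice, using made-up inputs and variable names. Only the single-space string gets the awk-style treatment; a string such as ',' or '.' behaves exactly like the corresponding regex.

```perl
use strict;
use warnings;

my $line = "  a b";

# The string ' ' is the special case: leading whitespace is skipped.
my @awk = split ' ', $line;

# The regex / / is not special: each leading space yields an empty field.
my @regex = split / /, $line;

# Any other string is compiled as a pattern; leading empty fields stay.
my @comma = split ',', ",a,b";

# The string '.' therefore means "any character", not a literal period.
my @dot = split '.', "a.b";

printf "awk=%d regex=%d comma=%d dot=%d\n",
    scalar(@awk), scalar(@regex), scalar(@comma), scalar(@dot);
```

The counts come out as 2, 4, 3 and 0 respectively: two leading empty fields for / /, one for ',', and an empty list for '.'.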