Searching for a string between two strings using a regular expression in Perl - perl

I want to retrieve a string that falls between two specified strings, and this happens multiple times in a file.
I tried this, but it doesn't work:
/(?m)"String 1"!.*?"String2":/;
I want everything that falls between "String 1" and "String 2".
Please help.

Assuming your input string is like this:
$str = 'String 1GIANT FISHString 2';
this will work:
($wanted) = $str =~ /String 1(.*)String 2/;
$wanted is now "GIANT FISH".
Edit: for multiple lines from a file, assuming this input:
String 1Line oneString 2
String 1GIANT FISHString 2
String 1String 2
this will get all the strings:
(@wanted) = $str =~ /String 1(.*)String 2/g;
@wanted has three entries:
('Line one','GIANT FISH','')
In the second regex, the g (global) modifier finds all matches in the string. Because . does not match a newline by default, each line is matched separately.
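A minimal sketch of the same idea for text that has already been read into one string, assuming the delimiters are the literal strings "String 1" and "String 2". The sub name between is hypothetical. It uses a non-greedy .*? so that two pairs on the same line stay separate, and \Q...\E so that the delimiters are matched literally:

```perl
use strict;
use warnings;

# Hypothetical helper: return every substring found between $start and $end.
# Call it in list context; in scalar context a /g match only reports success.
sub between {
    my ($text, $start, $end) = @_;
    # \Q...\E quotes regex metacharacters in the delimiters.
    # /g returns all captures in list context; /s lets "." cross newlines.
    return $text =~ /\Q$start\E(.*?)\Q$end\E/gs;
}

my $str = "String 1Line oneString 2\nString 1GIANT FISHString 2\nString 1String 2\n";
my @wanted = between($str, 'String 1', 'String 2');
print scalar(@wanted), " matches\n";   # 3 matches
print "[$_]\n" for @wanted;            # [Line one], [GIANT FISH], []
```

The /s modifier is an addition for the case where the text between the delimiters may itself contain newlines; drop it to keep matches within a single line.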

The following will do it:
perl -lne 'push @a, /string(.*?)string/g; END { print "@a" }'
Here both delimiters are "string", and anything lying between them is stored as an array element. Below is the example I tested; you can change the two delimiters to whichever strings you need.
Tested:
> cat temp
string123stringstring234string
string456stringstring789string
> perl -lne 'push @a, /string(.*?)string/g; END { print "@a" }' temp
123 234 456 789

Related

Append string in the beginning and the end of a line containing certain string

all
I want to know how to append a string at the beginning and at the end of a line containing a certain string, using Perl.
So for example, my line contains:
%abc %efd;
and I want to append 123 at the beginning of the line and 456 at the end of the line, so it would look like this:
123 %abc %efd 456
8/30/16 UPDATE--------------------------------
So far I have done something like this:
foreach file (`find . -type f`)
perl -ne 's/^\%abc\s+(\S*)/**\%abc $1/; print;' $file > tmp; mv tmp $file
end
foreach file (`find . -type f`)
perl -ne 's/$\%def\;\s+(\S*)/\%def\;**\n $1/; print;' $file > tmp; mv tmp $file
end
This works pretty well, except when %abc and %def are not on the same line.
For example:
%abc
something something something
%def
This turns into:
%abc
something something something
%def;
which is not what I want.
Thank you
In your case, you want to add strings when a line of the file matches a certain string, which means match and replace.
First, read each line of your input file.
Second, check whether the line matches the string you are looking for.
Then replace the matched string with a new string made of the prefix, the matched string, and the suffix.
use strict;
use warnings;
my $input_file    = 'your file name here';
my $search_string = '%abc %efd';
my $add_begin     = '123';
my $add_end       = '456';
# Read the file
open(my $IN, '<', $input_file) or die "cannot open file $input_file: $!";
# Check each line of the file
while (my $row = <$IN>) {
    chomp $row;
    # \Q...\E makes the search string match literally
    $row =~ s/^(\Q$search_string\E)$/$add_begin $1 $add_end/;
    print "$row\n";
}
close $IN;
Try it with the input file below:
%abc %efd
asdahsd
234234
%abc
%efd
%abc%efd
You will get the expected result:
123 %abc %efd 456
asdahsd
234234
%abc
%efd
%abc%efd
Modify the code to suit your requirements, and contact me if there is any issue.
Use the m modifier to match the beginning and the end of each line:
s/^\%abc/123 $&/mg;
s/\%def$/$& 456/mg;
Used together, as /ms, they let the "." match any character whatsoever, while still allowing "^" and "$" to match, respectively, just after and just before newlines within the string. (Source: perlre)
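As a small illustration of the /m behaviour, here is a sketch that assumes the whole file has been read into one string. The sample text and the strings 123 and 456 come from the question; the sub name wrap_lines is made up for this example, and capture groups are used in place of $&:

```perl
use strict;
use warnings;

# Hypothetical helper: prefix lines that start with %abc and
# suffix lines that end with %def.
sub wrap_lines {
    my ($text) = @_;
    $text =~ s/^(%abc)/123 $1/mg;   # /m: ^ matches at the start of every line
    $text =~ s/(%def)$/$1 456/mg;   # /m: $ matches before every newline
    return $text;
}

print wrap_lines("%abc\nsomething something something\n%def\n");
```

Without /m, ^ and $ would only match at the very start and end of the string, so only a %abc on the first line or a %def on the last line would be changed.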
Welcome to StackOverflow. We strive to help people solve problems in their existing code and learn languages, rather than simply answer one-off questions, the solutions to which can be easily found in 101 tutorials and documentation. The type of question you've posted doesn't leave a lot of room for learning, and doesn't do much to help future learners. It would help us greatly if you could post a more complete example, including what you've tried so far to get it working.
All that being said, there are two main ways to prepend and append to a string in Perl: 1. the concatenation operator, . and 2. string interpolation.
Concatenation
Use a . to join two strings together. You can chain operations together to compose a longer string.
my $str = '%abc %efd';
$str = '123 ' . $str . ' 456';
say $str; # prints "123 %abc %efd 456" with a trailing newline
Interpolation
Enclose a string in double quotes to instruct Perl to interpolate (i.e. find and evaluate) any Perl-style variables enclosed within the string.
my $str = '%abc %efd';
$str = "123 $str 456";
say $str; # prints "123 %abc %efd 456" with a trailing newline
You'll notice that in both examples we prepended and appended to the existing string. You can also create new variable(s) to hold the result(s) of these operations. Other methods of manipulating and building strings include the printf and sprintf functions, the substr function, the join function, and regular expressions, all of which you will encounter as you continue learning Perl.
As far as looking to see if a string contains a certain substring before performing the operation, you can use the index function or a regular expression:
if (index($str, '%abc %efd') >= 0) { ... }
# or...
if ($str =~ /%abc %efd/) { ... }
Remember to use strict; at the top of your Perl scripts and always (at least while you're learning) declare variables with my. If you're having trouble with the say function, you may need to add the statement use feature 'say'; to the top of your script.
You can find an index of excellent Perl tutorials at learn.perl.org. Good luck and have fun!
UPDATE Here is (I believe) a complete answer to your revised question:
find . -type f -exec perl -i.bak -pe 's/^(%abc)\s+(\S*)\s+(%def;)$/**$1 $2 $3**/' {} +
This will modify the files in place and create backup files with the extension .bak. Keep in mind that the expression \S* will only match non-whitespace characters; if you need to match strings that contain whitespace, you will need to update this expression (something like .*? might be workable for you).

perl - how to remove specific word from string?

I have a string "/project/pkt/sw/tool/xxx" and want to remove "sw/tool/xxx" from it.
Please suggest how to do it.
Input:
"/project/pkt/sw/tool/xxx";
Desired Output
"/project/pkt/"
Code
my $str = "project/pkt/sw/tool/xxx";
$str =~ s|\w*/\w*/\w*$||;
print $str;
I am getting same original string here, please let me know how to remove last three words from the original string.
The following regex modifies $str to remove the last three words as defined in the question.
$str =~ s|\w*/\w*/\w*$||;
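A short sketch applying the substitution to the path from the question. Note that \w only matches letters, digits and underscore, so the version below uses [^/]* instead; that change is an assumption about what real path components might contain (for example "." or "-"), not something stated in the question. The sub name is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the last three "/"-separated components.
sub strip_last_three {
    my ($path) = @_;
    # [^/]* matches any run of non-slash characters, so names such as
    # "tool-1.2" are handled as well as plain words.
    $path =~ s{[^/]*/[^/]*/[^/]*$}{};
    return $path;
}

print strip_last_three("/project/pkt/sw/tool/xxx"), "\n";   # /project/pkt/
```

If the string comes from a file or from user input, chomp it first; the $ anchor tolerates one trailing newline, but any other trailing character will stop the pattern from matching.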

Counting occurrences of the word "the" giving me all different answers

So I have a simple script to read in a text file from the command line, and I want to count the number of "the"s but I've been getting weird numbers.
while(<>){
$wordcount= split(/\bthe\b/, $_);}
print "\"the\" occurs $wordcount times in $ARGV";
So using that I get 10 occurrences, but if I use /\bthe\b/i I get 12. /\Bthe\b/ gives me 6 I believe. There are 11 occurrences in my test txt. Am I just an idiot? Should $wordcount just be started at 1 or 0? Also is it bad practice to use split this way? The code works fine for actually counting the words, but not when counting an exact string. New to perl so any and all abuse is appreciated. Thanks
Edit: also I know it's not adding, but now I get that $wordcount is being treated more like an array, so it worked for a previous iteration, though it was definitely poor form.
Use a regex in a list context to pull the count of matches:
my $wordcount = 0;
while (<>) {
$wordcount += () = /\bthe\b/g;
}
print qq{"the" occurs $wordcount times in $ARGV\n};
Reference: perlfaq4 - How can I count the number of occurrences of a substring within a string?
split splits the string into a list based on the regex provided. Your count comes from the fact you've put split in scalar context. From perldoc -f split:
split Splits the string EXPR into a list of strings and returns the
list in list context, or the size of the list in scalar context.
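Given the string "The quick brown fox jumps over the lazy dog" and your case-sensitive pattern, I'd expect your $wordcount to be 2: the capitalized "The" is not matched, so the line is split once, into two fields. That equals the number of times the word appears if you ignore case, but only by coincidence.
The quick brown fox jumps over the lazy dog
===============================^^^========= -> two fields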
However if you had "A bird and the quick brown fox jumps over the lazy dog" you'd end up with 3 which is not correct.
A bird and the quick brown fox jumps over the lazy dog
===========^^^============================^^^========= -> three fields
First of all, you absolutely want \b, as that matches a word boundary. \B matches where there is no word boundary, so /\Bthe\b/ would match words that end in "the" (such as "clothe") rather than the word "the" itself.
Secondly, you just want to count the occurrences. You do that by counting the matches over the entire string:
$wordcount = () = $string =~ /\bthe\b/gi;
The empty list () puts the match in list context, and assigning that list assignment to $wordcount (scalar context) yields the number of matches; the list is empty because you don't need the matches themselves. $string is the string to match against. You're matching "the" at word boundaries; g matches throughout the whole string (global) and i makes the match case insensitive.
With the /i flag, 'The' would be included, but not without it.
\B is a non-word boundary, so would only find things like "clothe", and not "the".
Yes, it is bad practice to use split that way. Properly, if you just want a count, do this:
$wordcount = () = split ...;
split in scalar context does something that seemed like a good idea originally, but doesn't seem so good anymore, so avoid it. The above incantation calls it in list context but assigns the number of elements found to $wordcount.
But the elements produced by splitting on the aren't what you want; you want a count of times the was found. So do (possibly with /ig instead of just /g):
$wordcount = () = /\bthe\b/g;
Note that you probably want +=, not =, to get a total for all lines.
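To make the difference concrete, here is a small sketch comparing the number of fields produced by split with the number of matches, using the example sentence from the earlier answer. The sub name count_the is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: count whole-word, case-insensitive occurrences of "the".
sub count_the {
    my ($text) = @_;
    # The empty-list assignment forces list context; assigning that to a
    # scalar yields the number of matches.
    my $count = () = $text =~ /\bthe\b/gi;
    return $count;
}

my $line   = "A bird and the quick brown fox jumps over the lazy dog";
my @fields = split /\bthe\b/i, $line;

print "fields from split: ", scalar(@fields), "\n";    # 3
print "matches:           ", count_the($line), "\n";   # 2
```

The field count is one more than the match count here because split returns the pieces around the separators, not the separators themselves.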
sample.txt
Ajith
kumar
Ajith
my name is Ajith and Ajith
lastname is kumar
code
use strict;
use warnings;
use Data::Dumper;
print "Enter your string = ";
my $input = <STDIN>;   ## User input
chomp $input;          ## chomp() removes the trailing newline, if any
my %count;
open my $fh, '<', 'sample.txt' or die $!;   ## To read the data from a file
my @data = <$fh>;
for my $d (@data) {
    my @array = split ' ', $d;   ## Split the line into whitespace-separated words
    for my $word (@array) {
        $count{$word}++;         ## Counter
    }
}
print Dumper "Result: " . ($count{$input} // 0);
The above code gets the input via the command prompt, searches for the word in the text file "sample.txt", and displays how many times it appears in that file.
Note: the user input is case sensitive.
INPUT from the user
Enter your string = Ajith
OUTPUT
$VAR1 = 'Result: 4';
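use strict;
use warnings;
print "Enter the string: ";
chomp(my $string = <STDIN>);
open(my $fh, '<', 'filename.txt') or die "Error opening file: $!";
my @mt;
while (<$fh>) {
    push @mt, split;    # split the line in $_ into whitespace-separated words
}
# grep in scalar context returns the number of words that contain $string (case insensitive)
my $s = grep { /\Q$string\E/i } @mt;
print "Total no. of $string is: $s\n";
This gives the output you expect.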

Using Perl, search and replace a number string working backwards on that string

How can I do a global search and replace on a string of numbers starting from the end of the string and reading backwards?
Starting at the front of the string I can do this:
someword: 12345
s/someword: [0-9][0-9]/someword: ==/g;
someword: ==345
But that will only work if the string is five numbers long. Regardless of the length of the number string, I want to keep the last three numbers.
Thank you
I would use an executable substitution.
This code finds multiple digits that are followed by three more digits, and replaces them with the same number of equals = signs
my $s = 'someword: 12345678';
$s =~ s/ (\d+) (?=\d{3}) / '=' x length $1 /xe;
print $s;
output
someword: =====678
Just use a positive lookahead assertion:
my $string = 'someword: 12345';
$string =~ s/\d(?=\d{3})/=/g;
print "$string\n";
Outputs:
someword: ==345
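A quick sketch that runs the lookahead substitution over number strings of different lengths, to show that the last three digits are always kept. The sub name mask_digits is hypothetical and the sample numbers are only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every digit that is followed by at least
# three more digits with "=", leaving the last three digits of each run.
sub mask_digits {
    my ($text) = @_;
    $text =~ s/\d(?=\d{3})/=/g;
    return $text;
}

print mask_digits("someword: $_"), "\n" for qw(123 12345 12345678);
```

The lookahead (?=\d{3}) checks that three digits follow without consuming them, which is why those digits are never replaced.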

Extracting alphanumeric phrase from a string

Trying to extract the alphanumeric characters from this string:
A_phase_I-II,_open-req_project_id_PX15RAD001
The problem is: the term PX15RAD001 can occur anywhere in the string.
I am trying to extract the alphanumeric part using the expression below, but it returns the entire string. I thought Alnum was a valid keyword for alphanumerics. Is that not the case?
(my $string = $line ) =~ s/\P{Alnum}//g;
print $string;
How can I extract the alphanumeric part of the afore mentioned string?
Thanks in advance.
-simak
At the end as per your input:
> echo "A_phase_I-II,_open-req_project_id_PX15RAD001"|perl -lne 'print $1 if(/id_([A-Z0-9]*)/)'
PX15RAD001
In the middle:
> echo "A_phase_I-II,_open-req_id_PX15RAD001_project" | perl -lne 'print $1 if(/id_([A-Z0-9]*)/)'
PX15RAD001
or in your terms:
$line=~m/id_([A-Z0-9]*)/g;
print $1;
Here are some test cases, based on the comments on @Vijay's answer:
my @line = (
    'A_phase_I-II,_open-req_project_id_PX15RAD001',
    '_PX15RAD001_A_phase_I-II,_open-req_project_id',
    'A_pha3333se_I-II,_ope_PX15RAD001_n-req_project',
    'A_phase_I-II,_PX15RAD001_open-req_projec123123123t_id',
    'A_phase_I-II_PX15RAD001_roject_id'
);
foreach my $string (@line) {
    # Print the capture only when this string actually matched
    print "$1\n" if $string =~ m{_([^_]{10})_?};
}
These kinds of questions are hard to answer because there is not enough information. What information we have is:
You say your target string is "alphanumeric", but the entire input string is alphanumeric, except for some punctuation, so that really doesn't tell us anything.
You say it is 12 characters long, but the sample you show is 10 characters long.
You seem to think that "alphanumeric" does not include underscore.
So, the reliable information I can sense from you is:
Target string is always delimited by underscore _
Target string is 10-12 characters, all alphanumeric except underscore.
The "reliable" solution based on this rather skimpy information is:
my $str = "A_phase_I-II,_open-req_project_id_PX15RAD001";
for my $field (split /_/, $str) {
    if (length($field) <= 12 and
        length($field) >= 10 and    # field is 10-12 characters
        $field !~ /\W/) {           # and contains no non-alphanumerics
        # do something
    }
}
By splitting on underscore, we can easily isolate each field in the string and perform simpler tests on it, such as the ones above.
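The same idea can be sketched as a small filter that returns the matching fields instead of acting on them inside the loop. The sub name candidate_ids is hypothetical, and restricting the field to ASCII letters and digits is an assumption based on the sample ID in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: return underscore-delimited fields that are 10 to 12
# characters long and consist only of ASCII letters and digits.
sub candidate_ids {
    my ($str) = @_;
    return grep { /\A[A-Za-z0-9]{10,12}\z/ } split /_/, $str;
}

print "$_\n" for candidate_ids("A_phase_I-II,_open-req_project_id_PX15RAD001");
```

Because the length check is part of the pattern, longer fields such as "projec123123123t" from the test cases above are rejected as well.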