Search for a match, then add 4 to the number that follows it: is this possible in Perl? - perl

I am a beginner in Perl and I need to modify a text file, keeping all existing data and changing only one thing: adding 4 to every number associated with a specific tag (< COMPRESSED-SIZE >). The file has many lines and tags and looks like the sample below. I need to find all the < COMPRESSED-SIZE > tags and add 4 to the number next to each tag:
< SOURCE-START-ADDRESS >01< /SOURCE-START-ADDRESS >
< COMPRESSED-SIZE >132219< /COMPRESSED-SIZE >
< UNCOMPRESSED-SIZE >229376< /UNCOMPRESSED-SIZE >
So I guess I need to search for the keyword (the match), store the number 132219 in a variable, add 4 to it, and replace 132219 with the result, 132223. The rest of the file must remain unchanged; only the numbers related to this tag must change. I cannot search for the number instead of the tag, because the number can change while the tag always stays the same. I also need to do this for every tag with this name in the file. I already have code for finding something after a keyword, because I needed to search for another tag as well, but that script does something else: it adds a number in front of a keyword. I think I could reuse this code, but I do not know how to do the calculation and keep the rest of the file intact, or whether it is possible in Perl.
while (my $row = <$inputFileHandler>)
{
if(index($row,$Data_Pattern) != -1){
my $extract = substr($row, index($row,$Data_Pattern) + length($Data_Pattern), length($row));
my $counter_insert = sprintf "%08d", $counter;
my $spaces = " " x index($row,$Data_Pattern);
$data_to_send ="what i need to add" . $extract;
print {$outs} $spaces . $Data_Pattern . $data_to_send;
$counter = $counter + 1;
}
else
{
print {$outs} $row;
next;
}
}
Maybe you could help me with a block of code for my needs; $Data_Pattern is the match. Thank you very much!

This is a classic one-liner Perl task. Basically you would do something like
$ perl -i.bak -pe's/^< COMPRESSED-SIZE >\K(\d+)/$1 + 4/e' yourfile.txt
Which will in essence copy and replace your file with a new, edited file. This can be very dangerous, especially if you are a Perl newbie. The -i switch is here used with the .bak extension which saves a backup in yourfile.txt.bak. This does not make this operation safe, however, as running the command twice will overwrite the backup.
It is advisable to make a separate backup of the target file before using this command.
-i.bak edit "in-place": the file is overwritten, and a backup of the original is created with extension .bak.
-p wraps the code in a loop that reads the files named as arguments line by line, runs the code on each line, and prints the line back.
s/ // the substitution operator, which is applied to every line of the file.
^ inside the regex anchors the match to the beginning of the line.
\K keep what has been matched to its left: the tag must match, but it is not part of the text that gets replaced.
(\d+) capture () 1 or more digits \d+ and store them in $1
/e treat the right hand side of the substitution operator as an expression and use the result as the replacement string. In this case it adds 4 to your number and returns the sum.
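The long version of this command is
while (<>) {
    s/^< COMPRESSED-SIZE >\K(\d+)/$1 + 4/e;
    print;
}
Which can be placed in a file and run with the -i switch. The print is what -p adds implicitly; without it, nothing would be written back to the file.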
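As a sketch of the same substitution in a standalone script that never touches the original file, the version below reads one file and writes the result to a second one. The sub name add_to_compressed_size, the file-name arguments and the script name bump.pl are assumptions made for illustration. The ^ anchor is left out here on the assumption that the tag may be indented.

```perl
use strict;
use warnings;

# Hypothetical helper: add $delta to the number following the opening tag.
sub add_to_compressed_size {
    my ($line, $delta) = @_;
    # \K drops the tag from the matched text, so only the digits are replaced.
    $line =~ s/< COMPRESSED-SIZE >\K(\d+)/$1 + $delta/e;
    return $line;
}

# Assumed usage: perl bump.pl input.txt output.txt
if (@ARGV == 2) {
    my ($in_name, $out_name) = @ARGV;
    open my $in,  '<', $in_name  or die "Cannot read $in_name: $!";
    open my $out, '>', $out_name or die "Cannot write $out_name: $!";
    print {$out} add_to_compressed_size($_, 4) while <$in>;
    close $out or die "Cannot close $out_name: $!";
}
```

Writing to a separate output file means a second run cannot silently add another 4 to numbers that were already updated.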

Related

My Perl variable to variable substitutions do not work

I have a substitution to make in a Perl script, which I do not seem to get working. I have a string in a text file which has the form:
T+30H
The string T+30H has to be written in many files and has to change from file to file. It is two digits and sometimes three digits. First I define the variable:
my $wrffcr=qr{T+\d+H};
After reading the file containing the string, I have the following substitution command (starting with the file capture)
@scrptlines=<$NCLSCRPT>;
foreach $scrptlines (@scrptlines) {
$scrptlines =~ s/$wrffcr/T+$fcrange2[$jj]H/g;
}
$fcrange2[$jj] is defined and I confirm its value by printing its value just before the above 4 lines of code.
print "$fcrange2[$jj]\n";
When I run my script, nothing changes for this particular substitution. I suspect it is to do with the way I define the string to be substituted.
I will appreciate any assistance.
Zilore Mumba
Watch out for the first + in my $wrffcr=qr{T+\d+H};. It'll make it match 1 or more Ts, not T followed by a +. You probably want
my $wrffcr=qr{T\+\d+H};
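The difference between the two patterns can be checked with a small sketch. The helper name set_forecast_hour and the sample text are made up for the demonstration; only the two qr{} patterns come from the question and the answer.

```perl
use strict;
use warnings;

my $unescaped = qr{T+\d+H};    # one or more "T", then digits, then "H"
my $escaped   = qr{T\+\d+H};   # "T", a literal "+", digits, then "H"

# Hypothetical helper wrapping the substitution from the question.
sub set_forecast_hour {
    my ($text, $pattern, $hours) = @_;
    (my $copy = $text) =~ s/$pattern/T+${hours}H/g;
    return $copy;
}

print set_forecast_hour("valid at T+30H", $unescaped, 48), "\n";   # text is unchanged
print set_forecast_hour("valid at T+30H", $escaped,   48), "\n";   # T+30H is replaced
```

When the text to match is held in a variable rather than written by hand, quotemeta or \Q...\E inside the pattern escapes such characters automatically.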

Using perl to split over multiple lines

I'm trying to write a perl script to process a log4net log file. The fields in the log file are separated by a semi-colon. My end goal is to capture each field and populate a mysql table.
Usually I have lines that look a little like this (all on a single line)
DEBUG;2017-06-13T03:56:38,316-05:00;2017-06-13 08:56:38,316;79ab0b95-7f58-44a8-a2c6-1f8feba1d72d;(null);WorkerStartup 1;"Starting services."
These are easy to process. I can simply split by semicolon to get the information I need.
However, occasionally the "message" field at the end may span several lines, especially if there is a stack trace. I would want to capture the entire message as a single column. I cannot use split by semicolon, because the next lines would typically look like:
at some.random.classname
at another.classname
...
Can someone give some tips how to solve this problem?
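The following solution relies on the fact that a complete record contains an even number of " characters. The test $p=~y/"//%2 is true while the count is odd, that is, while the record is still incomplete. This condition may be replaced by any other test that indicates the record is not complete.
The number of columns split is fixed at 7 (so that a ; inside the last field is kept) and may be changed, for example to a regex-based extraction such as @array = map { s/;$//r } $p =~ /\G(?:"[^"]*"|[^;])*;/g; (the /r flag, which returns the modified copy, needs Perl 5.14 or later).
The file is read line by line, but a record is passed to sub process only once it is complete. The variable $p holds the pending record, and the last record is processed in the END block.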
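perl -ne '
sub process {
    return unless length $p;       # nothing buffered yet (first line)
    @array = split /;/, $p, 7;
    # do something with @array
    print ((join "\n---\n", @array), "\n");
}
if ($p =~ y/"// % 2) {             # odd number of quotes: record is incomplete
    $p .= $_;
    next;
}
process;
$p = $_;
END { process }
' < logfile.txt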
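The same idea can be sketched as a standalone script under strict. This is a variation, not the one-liner itself: the helper name group_records is hypothetical, the sample lines are shortened stand-ins for real log entries, and the approach assumes the message field is always wrapped in double quotes with no escaped quotes inside it.

```perl
use strict;
use warnings;

# Hypothetical helper: merge physical lines into logical records. A record
# is treated as complete once it holds an even number of double quotes.
sub group_records {
    my @lines = @_;
    my @records;
    my $pending = '';
    for my $line (@lines) {
        $pending .= $line;
        # tr/"// counts the quotes without modifying the string.
        next if ($pending =~ tr/"//) % 2;
        push @records, $pending;
        $pending = '';
    }
    push @records, $pending if length $pending;   # unterminated tail
    return @records;
}

my @records = group_records(
    qq{DEBUG;a;b;c;(null);Worker 1;"Starting services."\n},
    qq{ERROR;a;b;c;(null);Worker 2;"Failed\n},
    qq{   at some.random.classname\n},
    qq{   at another.classname"\n},
);
for my $record (@records) {
    my @fields = split /;/, $record, 7;   # limit 7 keeps ";" inside the message
    print scalar(@fields), " fields: $fields[0]\n";
}
```

Each element of @fields could then be bound to a placeholder in an insert statement for the MySQL table.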

What is the right regex to match a relative path to an image file?

I have this path ../../Capture.jpg. So far I've figured out this incomplete regex: '[../]+'. I want to check if user puts in the right path like ../../image file name. The file extensions can be jpg, png, ..
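Your [../]+ is not sufficient or correct for the job at hand, if you REALLY want to match a bunch of ../ at the start of a filename. Square brackets define a character class, so [../]+ matches any run of dots and slashes (for example ./// or ....), not the sequence ../ repeated.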
It's not completely clear what you want to do exactly, but the following will match one or more ../ at the start of a string:
/^((?:\.\.\/)+)/
basically:
^ to anchor to the start of the string being tested - will not match any ../ INSIDE the string
( and the balancing ) at the end: capture the contents within. All your ../../ will be available in a variable called $1
then I'm using (?: ) to wrap the next content. This groups the bit inside, but does NOT save the value inside a $1, $2, etc. More information soon...
The REAL pattern of interest is
\.\.\/
Since . is a regex metacharacter and / is the pattern delimiter here, they need 'escaping' with a backslash. This tells Perl that the . and / do NOT have a special meaning at this point.
I've used the (?: ) wrapper to group them together, so that the + operates on all 3 characters of interest. The + operator means "one or more repetitions".
So, my pattern will match one or more repetitions of ../ which are anchored to the start of the string. Furthermore, the exact contents matched will be available in $1 if you are interested in doing something with that (eg count how many ../ you have)
Please ask if you have further questions, or I have misunderstood your goals.
EDIT: to suit your new requirements, and add a bit of bonus:
m!^\.\./\.\./(([^/]+)\.([^./]+))$!
Note first that I've used m!pattern! instead of /pattern/. Firstly, if Perl sees /pattern/ it assumes it's m/pattern/ but you can use an alternative character to wrap the patterns. This is useful if you actually want to use / in your pattern without having to go nuts with backslashes.
so:
^ exactly match only from the start
followed by exactly ../../
next I've used ( ) wrappers to capture the bits following. Explanation after...
ignoring the ( and ) now:
[^/]+ one or more repetitions (+) of any character that isn't /
\. literally a dot - the one before the extension
[^./]+ one or more repetitions of any character that isn't . or /
Notice how the [^/]+ allows for any character including . but prevents another directory part from sneaking in. Thus, the filename could be foo.bar.jpg and it will be collected properly.
Notice how [^./]+ allows for any character in the extension except a dot - and also excluding / to prevent another directory segment from sneaking in.
Finally, $ is used to ensure we've reached the end of the string.
as for the captures:
$1 will contain all of foo.bar.jpg
$2 will contain foo.bar
$3 will contain jpg (not .jpg) but I'll leave it up to you to figure out what to change if you wish to capture the dot as well.
FINALLY - in a typical script, you might do something like:
if($filename =~ m!^\.\./\.\./(([^/]+)\.([^./]+))$!) {
print "You correctly entered ../../$1 giving basename=$2 and extension=$3 - Bravo!\n";
}
else {
print "you've failed to read the instructions properly\n";
}
As a bonus, I even tested that, and found 2 spolling mistaiks you'll never have to see
cheers.
# convert relative file paths to md links ...
# file paths and names with letters , nums - and _ s supported
$str =~ s! (\.\.\/([a-zA-Z0-9_\-\/\\]*)[\/\\]([a-zA-Z0-9_\-]*)\.([a-zA-Z0-9]*)) ! [$3]($1) !gm
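If you don't care about the path prefix, use:
$path =~ /\.(jpg|png)$/
or, with the smartmatch operator (experimental since Perl 5.18 and deprecated as of Perl 5.38, so best avoided in new code):
substr($path, -4) ~~ ['.jpg', '.png']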
With exactly '../../', use:
$path =~ m!^\.\./\.\./[^/]*\.(jpg|png)$!
With any number of '../'s, use:
$path =~ m!^(\.\./)*[^/]*\.(jpg|png)$!
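The patterns above can be wrapped in a small validation routine. This is only a sketch: the sub name is_relative_image_path is hypothetical, and the extension list (jpg, jpeg, png, gif, matched case-insensitively) is an assumption, since the question only says "jpg, png, ..". Unlike the last pattern, it requires at least one leading ../.

```perl
use strict;
use warnings;

# Hypothetical helper: accept "../" repeated one or more times, then a
# file name ending in one of the allowed extensions (case-insensitive).
sub is_relative_image_path {
    my ($path) = @_;
    return $path =~ m!^(?:\.\./)+[^/]+\.(?:jpe?g|png|gif)$!i ? 1 : 0;
}

for my $path ('../../Capture.jpg', '../photo.PNG', 'Capture.jpg',
              '../../dir/x.jpg', '../../notes.txt') {
    printf "%-20s %s\n", $path, is_relative_image_path($path) ? 'ok' : 'rejected';
}
```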

How to "jump" to a line of a file, rather than read file line by line, using Perl

I am opening a file containing a single but very long column. I want to retrieve from it just a short segment, starting at a specified line and ending at another specified line. Currently, my script is reading the file line by line until the desired lines are found. I am using:
my ( $from, $to ) = ( some line number, some larger line number );
my $count = 1;
my @seq = ();
while ( <SEQUENCE> ) {
print "$_ for $count\n";
$count++;
while ( $count >= $from && $count <= $to ) {
push( @seq, $_ );
last;
}
}
print "seq is: @seq\n";
Input looks like:
A
G
T
C
A
G
T
C
.
.
.
How might I "jump" to where I want to be?
You'll need to use seek to move to the correct portion of the file. ref: http://perldoc.perl.org/functions/seek.html
This works on bytes, not on lines, so generally, if you need to seek by line, it's not an option. However, since you're working with fixed-length lines (2 or 3 bytes depending on your platform's EOL encoding), you can multiply the line length by the line number you want (0-indexed) and you'll be at the correct location for reading.
If you happen to know that all the lines are of exactly the same length (accounting for line ending characters, generally 1 byte on Unix/Linux and 2 on Windows), you can use seek to go directly to a specified point in the file.
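A minimal sketch of the fixed-length approach, assuming every line is exactly one character plus a one-byte newline (2 bytes in total). The sub name read_line_range and the temporary demo file are made up for illustration; a file with Windows line endings would need a line length of 3 instead.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical helper: return lines $from..$to (1-based, inclusive) from a
# file in which every line is exactly $line_len bytes, newline included.
sub read_line_range {
    my ($path, $from, $to, $line_len) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    seek($fh, ($from - 1) * $line_len, SEEK_SET) or die "Cannot seek in $path: $!";
    my @lines;
    while (defined(my $line = <$fh>)) {
        push @lines, $line;
        last if @lines == $to - $from + 1;
    }
    close $fh;
    return @lines;
}

# Demo file with one letter per line, as in the question.
my ($tmp, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "$_\n" for qw(A G T C A G T C);
close $tmp;

print read_line_range($tmp_name, 3, 5, 2);   # prints lines 3 to 5
```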
The seek function lets you specify a file position in bytes/characters, not in lines. In the general case, the only way to go to a specified line number is to read from the beginning and skip that many lines (minus one).
Unless you have an index mapping line numbers to byte offsets; then you can look up the specified line number in the index and use seek to jump to that location. To do this, you have to build the index separately (a process that will require reading through the entire file) and make sure the index is always up to date. If the file changes frequently, this is likely to be impractical.
I'm not aware of any existing tools for building and using such an index, but I wouldn't be surprised if they exist. But it should be easy enough to roll your own.
But unless scanning the file to find the line number you want is a significant performance bottleneck, I wouldn't bother with the extra complexity.
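For completeness, rolling your own index might look like the sketch below. The sub names build_line_index and fetch_line are hypothetical, the index is kept in memory rather than stored on disk, and the demo uses a temporary file with made-up content. Building the index still reads the whole file once; only later lookups are direct.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical helper: record the byte offset at which each line starts.
# $offsets->[0] is the offset of line 1, and so on.
sub build_line_index {
    my ($fh) = @_;
    my @offsets = (0);
    seek($fh, 0, SEEK_SET) or die "Cannot seek: $!";
    while (defined(my $line = <$fh>)) {
        push @offsets, tell $fh;
    }
    pop @offsets;    # last entry is end of file, not the start of a line
    return \@offsets;
}

# Hypothetical helper: jump straight to line $line_no (1-based) and read it.
sub fetch_line {
    my ($fh, $offsets, $line_no) = @_;
    seek($fh, $offsets->[$line_no - 1], SEEK_SET) or die "Cannot seek: $!";
    return scalar <$fh>;
}

my ($fh, $name) = tempfile(UNLINK => 1);   # opened for reading and writing
binmode $fh;
print {$fh} "alpha\n", "be\n", "gamma\n";

my $index = build_line_index($fh);
print fetch_line($fh, $index, 3);          # third line, lines may differ in length
```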

Perl: pattern match a string and then print next line/lines

I am using Net::Whois::Raw to query a list of domains from a text file and then parse through this to output relevant information for each domain.
It was all going well until I hit Nominet results as the information I require is never on the same line as that which I am pattern matching.
For instance:
Name servers:
ns.mistral.co.uk 195.184.229.229
So what I need to do is pattern match for "Name servers:" and then display the next line or lines but I just can't manage it.
I have read through all of the answers on here but they either don't seem to work in my case or confuse me even further as I am a simple bear.
The code I am using is as follows:
while ($record = <DOMAINS>) {
$domaininfo = whois($record);
if ($domaininfo=~ m/Name servers:(.*?)\n/){
print "Nameserver: $1\n";
}
}
I have tried an example of Stackoverflow where
<DOMAINS>;
will take the next line but this didn't work for me and I assume it is because we have already read the contents of this into $domaininfo.
EDIT: Forgot to say thanks!
how rude.
So, the $domaininfo string contains your domain?
What you probably need is to match across the line break. $domaininfo is a multi-line string, so you can match on the \n character directly. The m modifier at the end of the regular expression makes ^ and $ match at line boundaries; it is harmless here, but it is not what allows \n to match. This works for me:
my $domaininfo =<<DATA;
Name servers:
ns.mistral.co.uk 195.184.229.229
DATA
$domaininfo =~ m/Name servers:\n(\S+)\s+(\S+)/m;
print "Server name = $1\n";
print "IP Address = $2\n";
Now, I can match the \n at the end of the Name servers: line and capture the name and IP address which is on the next line.
This might have to be munged a bit to get it to work in your situation.
This is half a question and perhaps half an answer (the question's in here as I am not yet allowed to write comments...). Okay, here we go:
Name servers:
ns.mistral.co.uk 195.184.229.229
Is this what an entry in the file you're parsing looks like? What will follow immediately afterwards - more domain names and IP addresses? And will there be blank lines in between?
Anyway, I think your problem may (in part?) be related to your reading the file line by line. Once you get to the IP address line, the info about 'Name servers:' having been present will be gone. Multiline matching will not help if you're looking at your file line by line. Thus I'd recommend switching to paragraph mode:
{
local $/ = ''; # one paragraph instead of one line constitutes a record
while ($record = <DOMAINS>) {
# $record will now contain all consecutive lines that were NOT separated
# by blank lines; once there are >= 1 blank lines $record will have a
# new value
# do stuff, e.g. pattern matching
}
}
But then you said
I have tried an example of Stackoverflow where
<DOMAINS>;
will take the next line but this didn't work for me and I assume it is because we have already read the contents of this into $domaininfo.
so maybe you've already tried what I have just suggested? An alternative would be to just add another variable ($indicator or whatever) which you'll set to 1 once 'Name servers:' has been read, and as long as it's equal to 1 all following lines will be treated as containing the data you need. Whether this is feasible, however, depends on you always knowing what else your data file contains.
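The indicator idea can be sketched on the text returned by whois rather than on the input file. Everything here is an assumption for illustration: the sub name name_servers, the rule that a blank line ends the block, and the sample text, in which only the first server line comes from the question and the rest is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the lines that follow "Name servers:".
# Assumes the block ends at the next blank line or at the end of the text.
sub name_servers {
    my ($whois_text) = @_;
    my @servers;
    my $in_block = 0;                   # the indicator variable
    for my $line (split /\n/, $whois_text) {
        if ($line =~ /^\s*Name servers:/) {
            $in_block = 1;
            next;
        }
        next unless $in_block;
        last if $line !~ /\S/;          # blank line ends the block
        $line =~ s/^\s+|\s+$//g;        # trim surrounding whitespace
        push @servers, $line;
    }
    return @servers;
}

my $sample = <<'END_WHOIS';
    Registrar:
        Example Registrar Ltd

    Name servers:
        ns.mistral.co.uk 195.184.229.229
        ns2.mistral.co.uk 195.184.229.230

    WHOIS lookup made at 12:00:00
END_WHOIS

print "Nameserver: $_\n" for name_servers($sample);
```

In the question's loop this would be called as name_servers($domaininfo) in place of the single regex match.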
I hope something in here has been helpful to you. If there are any questions, please ask :)