Using perl to split over multiple lines - perl

I'm trying to write a perl script to process a log4net log file. The fields in the log file are separated by a semi-colon. My end goal is to capture each field and populate a mysql table.
Usually I have lines that look a little like this (all on a single line):
DEBUG;2017-06-13T03:56:38,316-05:00;2017-06-13 08:56:38,316;79ab0b95-7f58-44a8-a2c6-1f8feba1d72d;(null);WorkerStartup 1;"Starting services."
These are easy to process. I can simply split by semicolon to get the information I need.
However, occasionally the "message" field at the end may span several lines, especially if there is a stack trace. I want to capture the entire message as a single column. I cannot simply split by semicolon, because the following lines would typically look like:
at some.random.classname
at another.classname
...
Can someone give some tips how to solve this problem?

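The following solution relies on the fact that a complete record contains an even number of " characters. The test $p=~y/"//%2 is true while the count is odd, which means the record is not yet complete; it can be replaced by any other condition that detects an incomplete record.
The split is limited to 7 columns so that a ; inside the last field is preserved. This can be changed, for example to a regex-based split such as @array = map { s/;$//r } $p=~/\G(?:"[^"]*"|[^;])*;/g; (the /r modifier needs Perl 5.14 or later, and this pattern only returns fields that end with a ;).
The file is read line by line, but a record is only passed to the process sub once it is complete. The $p variable holds the record collected so far, and the last record is processed in the END block.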
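perl -ne '
    # Split the completed record into at most 7 fields and handle it
    sub process {
        return unless length $p;    # nothing buffered yet (first line)
        @array = split /;/, $p, 7;
        # do something with @array
        print join("\n---\n", @array), "\n";
    }
    # Odd number of quotes: the record continues on this line
    if ($p =~ y/"//%2) {
        $p .= $_;
        next;
    }
    process;
    $p = $_;
    END { process }
' < logfile.txt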
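For use inside a larger script, the same idea can be sketched as a function. This is only an illustration under the same assumption as above (double quotes appear only around the message field); the name read_records and the sample log text are made up for the demo, and the input is an in-memory filehandle so the snippet runs on its own.

```perl
use strict;
use warnings;

# Group physical lines into logical records. A record is treated as
# incomplete while it contains an odd number of double quotes
# (assumption: quotes only appear around the message field).
sub read_records {
    my ($fh) = @_;
    my @records;
    my $pending = '';
    while (my $line = <$fh>) {
        $pending .= $line;
        my $quotes = ($pending =~ tr/"//);   # tr counts without modifying
        next if $quotes % 2;                 # still inside a quoted message
        chomp $pending;
        push @records, [ split /;/, $pending, 7 ];
        $pending = '';
    }
    # Keep a trailing, unterminated record rather than dropping it
    push @records, [ split /;/, $pending, 7 ] if length $pending;
    return @records;
}

# Hypothetical demo input held in a string; a real log file is opened
# the same way, with a file name instead of the scalar reference.
my $log = join '',
    qq{DEBUG;t1;t2;id1;(null);WorkerStartup 1;"Starting services."\n},
    qq{ERROR;t3;t4;id2;(null);Worker 2;"Boom\n},
    qq{   at some.random.classname\n},
    qq{   at another.classname"\n};
open my $fh, '<', \$log or die "Cannot open in-memory file: $!";
my @records = read_records($fh);
close $fh;

printf "%d records; last message spans %d lines\n",
    scalar @records, ($records[-1][6] =~ tr/\n//) + 1;
```

Each record comes back as an array reference of up to 7 fields, which maps directly onto the columns of a table insert.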

Related

Search for a match, after the match is found take the number after the match and add 4 to it, is it possible in Perl?

I am a beginner in Perl and I need to modify a text file, keeping all the existing data and only adding 4 to every number associated with a specific tag (< COMPRESSED-SIZE >). The file has many lines and tags and looks like the sample below. I need to find all the < COMPRESSED-SIZE > tags and add 4 to the number next to each tag:
< SOURCE-START-ADDRESS >01< /SOURCE-START-ADDRESS >
< COMPRESSED-SIZE >132219< /COMPRESSED-SIZE >
< UNCOMPRESSED-SIZE >229376< /UNCOMPRESSED-SIZE >
So I guess I need to do something like this: search for the keyword (the match), store the number 132219 in a variable, add 4 to it, and replace 132219 with the result, 132223. The rest of the file must remain unchanged; only the numbers related to this tag must change. I cannot search for the number instead of the tag, because the number could change while the tag always remains the same. I also need to find all the tags with this name and add 4 to the number next to each of them. I already have code for finding something after a keyword, because I also needed to search for another tag, but that script does something else: it adds a number in front of a keyword. I think I could adapt this code for what I need, but I do not know how to do the calculation and keep the rest of the file intact, or whether it is possible in Perl.
while (my $row = <$inputFileHandler>)
{
    if(index($row,$Data_Pattern) != -1){
        my $extract = substr($row, index($row,$Data_Pattern) + length($Data_Pattern), length($row));
        my $counter_insert = sprintf "%08d", $counter;
        my $spaces = " " x index($row,$Data_Pattern);
        $data_to_send ="what i need to add" . $extract;
        print {$outs} $spaces . $Data_Pattern . $data_to_send;
        $counter = $counter + 1;
    }
    else
    {
        print {$outs} $row;
        next;
    }
}
Maybe you could help me with a block of code for my needs, $Data_Pattern is the match. Thank you very much!
This is a classic one-liner Perl task. Basically you would do something like
$ perl -i.bak -pe's/^< COMPRESSED-SIZE >\K(\d+)/$1 + 4/e' yourfile.txt
Which will in essence copy and replace your file with a new, edited file. This can be very dangerous, especially if you are a Perl newbie. The -i switch is here used with the .bak extension which saves a backup in yourfile.txt.bak. This does not make this operation safe, however, as running the command twice will overwrite the backup.
It is advisable to make a separate backup of the target file before using this command.
-i.bak edit "in-place", the file is overwritten, a backup of the original is created with extension .bak.
-p wraps the code in a loop that reads each line of the named file(s) into $_, runs the code on it, and prints $_ back.
s/ // the substitution operator, which is applied to all lines of the file.
^ inside the regex looks for beginning of line.
\K keeps everything matched to its left (the tag) out of the part that gets replaced, so only the digits are substituted.
(\d+) capture () 1 or more digits \d+ and store them in $1
/e treat the right hand side of the substitution operator as an expression and use the result as the replacement string. In this case it will increase your number and return the sum.
The long version of this command is
while (<>) {
    s/^< COMPRESSED-SIZE >\K(\d+)/$1 + 4/e;
    print;
}
This can be placed in a file and run with the -i switch.
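To try the substitution without touching any file, it can be wrapped in a small function and run against sample lines. This is a sketch, not the one-liner itself: the name bump_compressed_size is made up, and it uses capture groups instead of \K and tolerates leading whitespace, both of which are my own choices rather than part of the answer above.

```perl
use strict;
use warnings;

# Add 4 to the number that follows the opening tag and leave the rest
# of the line untouched. The tag spelling (with spaces inside the angle
# brackets) is copied from the sample in the question.
sub bump_compressed_size {
    my ($line) = @_;
    # /e evaluates the replacement as code: tag text, then number + 4
    $line =~ s/^(\s*< COMPRESSED-SIZE >)(\d+)/$1 . ($2 + 4)/e;
    return $line;
}

my @sample = (
    "< SOURCE-START-ADDRESS >01< /SOURCE-START-ADDRESS >\n",
    "< COMPRESSED-SIZE >132219< /COMPRESSED-SIZE >\n",
    "< UNCOMPRESSED-SIZE >229376< /UNCOMPRESSED-SIZE >\n",
);
print bump_compressed_size($_) for @sample;
```

Only the second sample line should change; the < UNCOMPRESSED-SIZE > line is left alone because the pattern is anchored at the start of the line.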

Read from csv file and write output to a file

I am new to Perl and would like your help with the following scenario.
I have a CSV file with the following contents, and I am trying to build key-value pairs from it.
Line 1: List,ID
Line 2: 1,2,3
Line 3: 4,5,6
Line 4: List,Name
Line 5: Tom, Peter, Joe
Line 6: Jim, Harry, Tim
I need to format the above CSV file to get an output in a new file like below:
Line 1: ID:1,2,3 4,5,6
Line 2: Name:Tom,Peter,Joe Jim, Harry, Tim
Can you please direct me on how I can use Perl functions for this scenario.
You're in luck, this is extremely easy in Perl.
There's a great library called Text::CSV which is available on CPAN, docs are here: https://metacpan.org/pod/Text::CSV
The synopsis at the top of the page gives a really good example which should let you do what you want with minor modifications.
I don't think the issue here is the CSV format so much as the fact that you have different lists broken up with header lines. I haven't tried this code yet, but I think you want something like the following:
while (<>) {                                   # Loop over stdin one line at a time
    chomp;                                     # Strip off trailing newline
    my ($listToken, $listName) = split(',');
    next unless $listToken;                    # Skip over blank lines
    if ($listToken =~ /^List/) {               # This is a header row
        print "\n$listName: ";                 # End previous list, start new one
    } else {                                   # The current list continues
        print "$_ ";                           # Append the entire row to the output
    }
}
print "\n";                                    # Terminate the last line
Note that this file format is a little dubious, as there is no way to have a data row where the first value is the literal "List". However, I'm assuming that either you have no choice in file format or you know that List is not a legal value.
(Note - I fixed a mistake where I used $rest as a variable; that was caused by my renaming them as I went along and missing one)
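If the grouped lists are needed as data rather than printed as they are read, a variant along these lines might help. It is a sketch under the same assumption as above (a row starting with List, is always a header); the name group_lists is hypothetical and the sample rows are adapted from the question with the stray spaces removed.

```perl
use strict;
use warnings;

# Collect data rows under the most recent "List,<name>" header and
# return one "name:row row ..." string per list, in input order.
sub group_lists {
    my (@lines) = @_;
    my (@order, %rows);
    my $current;
    for my $line (@lines) {
        chomp $line;
        next unless length $line;               # skip blank lines
        if ($line =~ /^List,(.*)$/) {            # header row starts a new list
            $current = $1;
            push @order, $current unless exists $rows{$current};
            $rows{$current} ||= [];
        } elsif (defined $current) {             # data row for the current list
            push @{ $rows{$current} }, $line;
        }
    }
    return map { "$_:" . join(' ', @{ $rows{$_} }) } @order;
}

print "$_\n" for group_lists(
    "List,ID",   "1,2,3",         "4,5,6",
    "List,Name", "Tom,Peter,Joe", "Jim,Harry,Tim",
);
```

Rows that appear before any header are silently ignored here; whether that is the right behaviour depends on the real file.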

How to "jump" to a line of a file, rather than read file line by line, using Perl

I am opening a file containing a single but very long column. I want to retrieve from it just a short segment, starting at a specified line and ending at another specified line. Currently, my script is reading the file line by line until the desired lines are found. I am using:
my ( $from, $to ) = ( some line number, some larger line number );
my $count = 1;
my @seq = ();
while ( <SEQUENCE> ) {
    print "$_ for $count\n";
    $count++;
    while ( $count >= $from && $count <= $to ) {
        push( @seq, $_ );
        last;
    }
}
print "seq is: @seq\n";
Input looks like:
A
G
T
C
A
G
T
C
.
.
.
How might I "jump" to where I want to be?
You'll need to use seek to move to the correct portion of the file. ref: http://perldoc.perl.org/functions/seek.html
This works on bytes, not on lines, so in general it's not an option when you need to seek by line. However, since you're working with fixed-length lines (2 or 3 bytes depending on your platform's EOL encoding), you can multiply the line length by the line you want (0 indexed) and you'll be at the correct location for reading.
If you happen to know that all the lines are of exactly the same length (accounting for line ending characters, generally 1 byte on Unix/Linux and 2 on Windows), you can use seek to go directly to a specified point in the file.
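A minimal sketch of that calculation, assuming every line really is the same number of bytes. The function name read_line_range and its parameters are made up for illustration, and the demo reads from an in-memory filehandle with "\n" line endings (2 bytes per line) so that it runs without an input file.

```perl
use strict;
use warnings;

# Read lines $from..$to (1-based, inclusive) from a file whose lines
# all have the same byte length $line_len, newline included.
sub read_line_range {
    my ($fh, $from, $to, $line_len) = @_;
    seek($fh, ($from - 1) * $line_len, 0)    # 0 = offset from start of file
        or die "seek failed: $!";
    my @lines;
    for ($from .. $to) {
        my $line = <$fh>;
        last unless defined $line;            # ran past end of file
        chomp $line;
        push @lines, $line;
    }
    return @lines;
}

# Demo data: one base per line, so each line is 2 bytes long.
my $data = join '', map { "$_\n" } qw(A G T C A G T C);
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @seq = read_line_range($fh, 3, 5, 2);
close $fh;
print "seq is: @seq\n";
```

With a real file on Windows, the line length would be 3 and the handle should probably be put in binmode so that the byte offsets match what is on disk.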
The seek function lets you specify a file position in bytes/characters, not in lines. In the general case, the only way to go to a specified line number is to read from the beginning and skip that many lines (minus one).
Unless you have an index mapping line numbers to byte offsets; then you can look up the specified line number in the index and use seek to jump to that location. To do this, you have to build the index separately (a process that will require reading through the entire file) and make sure the index is always up to date. If the file changes frequently, this is likely to be impractical.
I'm not aware of any existing tools for building and using such an index, but I wouldn't be surprised if they exist. But it should be easy enough to roll your own.
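Rolling your own could look roughly like the following. This is a sketch rather than a tested tool: build_line_index and fetch_line are hypothetical names, the index is kept in memory instead of being saved to disk, and nothing here detects that the file has changed since the index was built.

```perl
use strict;
use warnings;

# Build a list where element N-1 holds the byte offset of line N.
# This reads the whole file once.
sub build_line_index {
    my ($fh) = @_;
    my @offsets;
    seek($fh, 0, 0) or die "seek failed: $!";
    while (1) {
        my $pos  = tell($fh);                 # offset before reading the line
        my $line = <$fh>;
        last unless defined $line;
        push @offsets, $pos;
    }
    return \@offsets;
}

# Jump straight to a 1-based line number using the index.
sub fetch_line {
    my ($fh, $index, $n) = @_;
    return if $n < 1 || $n > @$index;         # line number out of range
    seek($fh, $index->[$n - 1], 0) or die "seek failed: $!";
    my $line = <$fh>;
    chomp $line;
    return $line;
}

# Demo with lines of different lengths, read from an in-memory file.
my $text = "alpha\nbe\ngamma delta\nz\n";
open my $in, '<', \$text or die "Cannot open in-memory file: $!";
my $index = build_line_index($in);
print fetch_line($in, $index, 3), "\n";
close $in;
```

Unlike the fixed-length approach, this works for lines of any length, at the cost of one full pass over the file to build the index.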
But unless scanning the file to find the line number you want is a significant performance bottleneck, I wouldn't bother with the extra complexity.

Perl get array count so can start foreach loop at a certain array element

I have a file that I am reading in. I'm using Perl to reformat the date. It is a comma-separated file. In one of the files, I know that element.0 is a zipcode and element.1 is a counter. Each row can have 1-n cities. I need to know the number of elements from element.3 to the end of the line so that I can reformat them properly. I wanted to use a foreach loop starting at element.3 to format the other elements into a single string.
Any help would be appreciated. Basically I am trying to read in a csv file and create a cpp file that can then be compiled on another platform as a plug-in for that platform.
Best Regards
Michael Gould
You can do something like this to get the fields from a line:
my @fields = split /,/, $line;
To access all elements from 3 to the end, do this:
foreach my $city (@fields[3..$#fields])
{
    #do stuff
}
(Note, based on your question I assume you are using zero-based indexing. Thus "element 3" is the 4th element).
Alternatively, consider Text::CSV to read your CSV file, especially if you have things like escaped delimiters.
Well, if your line is being read into an array, you can get the number of elements in the array by evaluating it in scalar context, for example
my $elems = @line;
or to be really sure
my $elems = scalar(@line);
Although in that case the scalar is redundant, it's handy for forcing scalar context where it would otherwise be list context. You can also find the index of the last element of the array with $#line.
After that, if you want to get everything from element 3 onwards you can use an array slice:
my @threeonwards = @line[3 .. $#line];
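Putting the split, the slice and the count together might look like this. It is only a sketch: the name cities_from_row, the sample row and the ' | ' separator are invented for the example, and a plain split on commas assumes no field contains a quoted comma.

```perl
use strict;
use warnings;

# Split one comma-separated row and return the fields from index 3
# to the end. The range 3 .. $#fields is empty when the row has
# fewer than four fields, so the slice is empty too.
sub cities_from_row {
    my ($line) = @_;
    chomp $line;
    my @fields = split /,/, $line;
    return @fields[3 .. $#fields];
}

# Hypothetical row: zip code, counter, one more field, then the cities.
my @cities = cities_from_row("30301,2,GA,Atlanta,Decatur\n");
printf "%d cities: %s\n", scalar @cities, join(' | ', @cities);
```

The count comes from evaluating the array in scalar context, and the join builds the single string asked for in the question.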

Unwanted line breaks when using print when expression

I'm currently using the Print When Expression function on my fields. Whenever a field is excluded because of the print when condition, it leaves a blank space instead of just being skipped so that the next field moves up.
So I'm trying to find a way to ignore that line break and keep the entire list uniform.
Here is my Print When Expression condition (which may or may not help you in answering my question): $F{clicks} < 1
Apply the Print When expression to the whole band, not to the fields.