Find matches in a log file based on the time and ID - perl

I have a radius log file which is comma separated.
"1/3/2013","00:52:23","NASK","Stop","15444111111","200","15444111111","15444111111","10.142.98.190","moen",,,,,"D89BA1F93E5DC400",,,"31026","216.155.166.8","310260010265999",,"10.184.81.145","780246","18","ATGGSN17","2","7",,,"1385772885",,
"1/3/2013","00:52:23","NASK","Start","15444111111","200","15444111111","15444111111","10.142.98.190","moen",,,,,"D89BA1F93E5DC500",,,"31026","216.155.166.8","310260010265999",,"10.184.81.145","780246","18","A","2","7",,,"1385772885",,
Is it possible through any Linux command line tool like awk to count the number of occurrences where the second column (the time) and the seventh column (the number) are the same, and a Start event follows a Stop event?
I want to find the occurrences where a Stop is followed by a Start at the same time for the same number.
There will be other entries as well with the same timestamp between these cases.

You don't say very clearly what kind of result you want, but you should use Perl with Text::CSV to process CSV files.
This program just prints the three relevant fields from all lines of the file where the event is Start or Stop and the time and the ID string are duplicated.
use strict;
use warnings;

use Text::CSV;

my $csv = Text::CSV->new;

open my $fh, '<', 'text.csv' or die $!;

my %data;
while (my $row = $csv->getline($fh)) {
    my ($time, $event, $id) = @{$row}[1, 3, 6];
    next unless $event eq 'Start' or $event eq 'Stop';
    push @{ $data{"$time/$id"} }, $row;
}

for my $lines (values %data) {
    next unless @$lines > 1;
    print "@{$_}[1,3,6]\n" for @$lines;
    print "\n";
}
output
00:52:23 Stop 15444111111
00:52:23 Start 15444111111
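If what you actually need is a count rather than the matching lines, here is a minimal sketch along the same lines (same file name and column indices as above; it tallies a pair whenever a Start for a given time and number comes after a Stop for the same time and number):
use strict;
use warnings;
use Text::CSV;

my $csv = Text::CSV->new;
open my $fh, '<', 'text.csv' or die $!;

my %stop_seen;
my $count = 0;
while (my $row = $csv->getline($fh)) {
    my ($time, $event, $num) = @{$row}[1, 3, 6];
    if ($event eq 'Stop') {
        $stop_seen{"$time/$num"} = 1;
    }
    elsif ($event eq 'Start' and delete $stop_seen{"$time/$num"}) {
        $count++;    # a Stop followed by a matching Start
    }
}
print "$count\n";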

I have tried the following using GNU sed & awk
sed -n '/Stop/,/Start/{/Stop/{h};/Start/{H;x;p}}' text.csv \
| awk -F, 'NR%2 != 0 {prev=$0;time=$2;num=$7} \
NR%2 == 0 {if($2==time && $7==num){print prev,"\n", $0}}'
The sed part selects pairs of Stop and Start lines. There may (or may not) be other lines between the two, and if there are multiple Stop lines before a Start line, the last Stop line is selected (this may not be necessary in this case...).
The awk part compares the pairs selected by the sed part; if the second and seventh columns are identical, the pair is printed out.
My test is as below:
text.csv:
"1/3/2013","00:52:20","NASK","Stop","15444111111","200","15444111111","15444111111","10.142.98.190","moen",,,,,"D89BA1F93E5DC400",,,"31026","216.155.166.8","310260010265999",,"10.184.81.145","780246","18","ATGGSN17","2","7",,,"1385772885",,
"1/3/2013","00:52:23","NASK","XXXX","15444111111","200","15444111111","15444111111","10.142.98.190","moen",,,,,"D89BA1F93E5DC400",,,"31026","216.155.166.8","310260010265999",,"10.184.81.145","780246","18","ATGGSN17","2","7",,,"1385772885",,
"1/3/2013","00:52:23","NASK","Stop","15444111111","200","15444111111","15444111111","10.142.98.190","moen",,,,,"D89BA1F93E5DC400",,,"31026","216.155.166.8","310260010265999",,"10.184.81.145","780246","18","ATGGSN17","2","7",,,"1385772885",,
"1/3/2013","00:52:23","NASK","XXXX","15444111111","200","15444111111","15444111111","10.142.98.190","moen",,,,,"D89BA1F93E5DC400",,,"31026","216.155.166.8","310260010265999",,"10.184.81.145","780246","18","ATGGSN17","2","7",,,"1385772885",,
"1/3/2013","00:52:23","NASK","Start","15444111111","200","15444111111","15444111111","10.142.98.190","moen",,,,,"D89BA1F93E5DC500",,,"31026","216.155.166.8","310260010265999",,"10.184.81.145","780246","18","A","2","7",,,"1385772885",,
"1/3/2013","00:52:28","NASK","Stop","15444111111","200","15444111111","15444111111","10.142.98.190","moen",,,,,"D89BA1F93E5DC400",,,"31026","216.155.166.8","310260010265999",,"10.184.81.145","780246","18","ATGGSN17","2","7",,,"1385772885",,
"1/3/2013","00:52:29","NASK","Start","15444111111","200","15444111111","15444111111","10.142.98.190","moen",,,,,"D89BA1F93E5DC500",,,"31026","216.155.166.8","310260010265999",,"10.184.81.145","780246","18","A","2","7",,,"1385772885",,
The output:
"1/3/2013","00:52:23","NASK","Stop","15444111111","200","15444111111","15444111111","10.142.98.190","moen",,,,,"D89BA1F93E5DC400",,,"31026","216.155.166.8","310260010265999",,"10.184.81.145","780246","18","ATGGSN17","2","7",,,"1385772885",,
"1/3/2013","00:52:23","NASK","Start","15444111111","200","15444111111","15444111111","10.142.98.190","moen",,,,,"D89BA1F93E5DC500",,,"31026","216.155.166.8","310260010265999",,"10.184.81.145","780246","18","A","2","7",,,"1385772885",,

If the "stop" line is followed immediately by the "start" line, you could try the following:
awk -f cnt.awk input.txt
where cnt.awk is
BEGIN {
    FS=","
}
$4=="\"Stop\"" {
    key=($2 $5)            # time and number (fields keep their quotes)
    stopline=$0
    getline
    if ($4=="\"Start\"") {
        if (key==($2 $5)) {
            print stopline
            print $0
        }
    }
}
Update
If there can be other lines between a "Stop" and a "Start" line, you could try:
BEGIN {
    FS=","
}
$4=="\"Stop\"" {
    a[($2 $5)]=$0
    next
}
$4=="\"Start\"" {
    key=($2 $5)
    if (key in a) {
        sl[++i]=a[key]
        el[i]=$0
    }
}
END {
    nn=i
    for (i=1; i<=nn; i++) {
        print sl[i]
        print el[i]
    }
}

search for a key value pair and append the value to other keys in unix

I need to search for a key and append the value to every key:value pair in a Unix file
Input file data:
1A:trans_ref_id|10:account_no|20:cust_name|30:trans_amt|40:addr
1A:trans_ref_id|10A:ccard_no|20:cust_name|30:trans_amt|40:addr
My desired Output:
account_no|1A:trans_ref_id
account_no|10:account_no
account_no|20:cust_name
account_no|30:trans_amt
account_no|40:addr
ccard_no|1A:trans_ref_id
ccard_no|10A:ccard_no
ccard_no|20:cust_name
ccard_no|30:trans_amt
ccard_no|40:addr
Basically, I need the value of 10 or 10A appended to every key:value pair and split into new lines. To be clear, this won't always be the second field.
I am new to sed, awk and perl. I started with extracting the value using awk:
awk -v FS="|" -v key="59" '$2 == key {print $2}' target.txt
I need the value of 10 or 10A appended to every key:value pair
Going by these requirements, you may try this awk:
awk '
BEGIN { FS = OFS = "|" }
match($0, /\|10A?:[^|]+/) {
    s = substr($0, RSTART, RLENGTH)
    sub(/.*:/, "", s)
}
{
    for (i=1; i<=NF; ++i)
        print s, $i
}' file
account_no|1A:trans_ref_id
account_no|10:account_no
account_no|20:cust_name
account_no|30:trans_amt
account_no|40:addr
ccard_no|1A:trans_ref_id
ccard_no|10A:ccard_no
ccard_no|20:cust_name
ccard_no|30:trans_amt
ccard_no|40:addr
# Looks for 10 or 10A
perl -F'\|' -lane'my ($id) = map /^10A?:(.*)/s, @F; print "$id|$_" for @F'
# Looks for 10 or 10<non-digit><maybe more>
perl -F'\|' -lane'my ($id) = map /^10(?:\D[^:]*)?:(.*)/s, @F; print "$id|$_" for @F'
-n executes the program for each line of input.
-l removes LF on read and adds it on print.
-a splits the line on | (specified by -F) into @F.
The first statement extracts what follows : in the field with id 10 or 10-plus-something.
The second statement prints a line for each field.
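For readers less used to one-liner switches, here is a hand-expanded sketch of roughly what the first one-liner does (not literal B::Deparse output):
use strict;
use warnings;

# Rough equivalent of:
#   perl -F'\|' -lane'my ($id) = map /^10A?:(.*)/s, @F; print "$id|$_" for @F'
while (defined(my $line = <>)) {
    chomp $line;                          # -l strips the newline on read
    my @F = split /\|/, $line;            # -a with -F'\|' splits each line on |
    my ($id) = map /^10A?:(.*)/s, @F;     # value after the 10: (or 10A:) key
    print "$id|$_\n" for @F;              # -l normally restores the newline on print
}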
See "Specifying file to process to Perl one-liner" for how to pass the input file name to these.
If you are still stuck on where to get started: setting the field separator and output field separator (FS and OFS) to '|' splits each record into fields at every '|'. The fields are then available as $1, $2, ... $NF. You want, e.g., account_no from field two ($2), so you split() field two on ':', saving the pieces in an array (a below). The second element, a[2], becomes the new first field of every output line.
The rest is just looping over each field and printing a[2], a separator, and then the current field. You can do that with:
awk 'BEGIN{FS=OFS="|"} {split ($2,a,":"); for(i=1;i<=NF;i++) print a[2],$i}' file
Example Use/Output
With your example input in file, the result would be:
account_no|1A:trans_ref_id
account_no|10:account_no
account_no|20:cust_name
account_no|30:trans_amt
account_no|40:addr
ccard_no|1A:trans_ref_id
ccard_no|10A:ccard_no
ccard_no|20:cust_name
ccard_no|30:trans_amt
ccard_no|40:addr
Which appears to be what you are after. Let me know if you have further questions.
"10" or "10A" at Unknown Field
You can handle the fields containing "10" and "10A" appearing in any position. You just add a loop over the fields to determine which one holds "10" or "10A", and save the 2nd element of the array produced by split() on that field. The rest is the same, e.g.
awk '
BEGIN { FS=OFS="|" }
{
    for (i=1;i<=NF;i++) {
        split ($i,a,":")
        if (a[1]=="10" || a[1]=="10A") {
            key=a[2]
            break
        }
    }
    for (i=1;i<=NF;i++)
        print key, $i
}
' file1
Example Input
1A:trans_ref_id|10:account_no|20:cust_name|30:trans_amt|40:addr
1A:trans_ref_id|20:cust_name|30:trans_amt|10A:ccard_no|40:addr
Example Use/Output
awk '
> BEGIN { FS=OFS="|" }
> {
>     for (i=1;i<=NF;i++) {
>         split ($i,a,":")
>         if (a[1]=="10" || a[1]=="10A") {
>             key=a[2]
>             break
>         }
>     }
>     for (i=1;i<=NF;i++)
>         print key, $i
> }
> ' file1
account_no|1A:trans_ref_id
account_no|10:account_no
account_no|20:cust_name
account_no|30:trans_amt
account_no|40:addr
ccard_no|1A:trans_ref_id
ccard_no|20:cust_name
ccard_no|30:trans_amt
ccard_no|10A:ccard_no
ccard_no|40:addr
This picks up the proper new field 1 for output from the 4th field containing "10A" in the second line above.
Let me know if this is what you needed.
EDIT: To find the 10 or 10A value anywhere in the line and print accordingly, try the following.
awk '
BEGIN{
    FS=OFS="|"
}
match($0,/(10|10A):[^|]*/){
    split(substr($0,RSTART,RLENGTH),arr,":")
}
{
    for(i=1;i<=NF;i++){
        print arr[2],$i
    }
}' Input_file
Explanation: Adding detailed explanation for above.
awk '                                         ##Starting awk program from here.
BEGIN{                                        ##Starting BEGIN section of this program.
    FS=OFS="|"                                ##Setting FS and OFS to | here.
}
match($0,/(10|10A):[^|]*/){                   ##Using match function to match either 10: till | OR 10A: till | here.
    split(substr($0,RSTART,RLENGTH),arr,":")  ##Splitting matched sub string into array arr with delimiter of : here.
}
{
    for(i=1;i<=NF;i++){                       ##Running for loop for each field for each line.
        print arr[2],$i                       ##Printing 2nd element of arr, along with current field.
    }
}' Input_file                                 ##Mentioning Input_file name here.
With your shown samples, please try the following.
awk '
BEGIN{
    FS=OFS="|"
}
{
    split($2,arr,":")
    print arr[2],$1
    for(i=2;i<=NF;i++){
        print arr[2],$i
    }
}
' Input_file
Perl script implementation
use strict;
use warnings;
use feature 'say';

my $fname = shift || die "run as 'script.pl input_file key0 key1 ... key#'";
open my $fh, '<', $fname or die $!;

while ( <$fh> ) {
    chomp;
    my %data = split(/[:\|]/, $_);
    for my $key (@ARGV) {
        if ( $data{$key} ) {
            say "$data{$key}|$_" for split(/\|/, $_);
        }
    }
}
close $fh;
Run as script.pl input_file 10 10A
Output
account_no|1A:trans_ref_id
account_no|10:account_no
account_no|20:cust_name
account_no|30:trans_amt
account_no|40:addr
ccard_no|1A:trans_ref_id
ccard_no|10A:ccard_no
ccard_no|20:cust_name
ccard_no|30:trans_amt
ccard_no|40:addr
Here's an alternate perl solution:
perl -pe '($id) = /(?<![^|])10A?:([^|]+)/; s/([^|]+)[|\n]/$id|$1\n/g'
($id) = /(?<![^|])10A?:([^|]+)/ captures the string after 10: or 10A: and saves it in the $id variable; the first such match in the line is used.
s/([^|]+)[|\n]/$id|$1\n/g then prefixes every field with the value in $id and a | character, putting each field on its own line.
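Assuming the sample data is in a file named file (name assumed here), it would be run as:
perl -pe '($id) = /(?<![^|])10A?:([^|]+)/; s/([^|]+)[|\n]/$id|$1\n/g' file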

perl line count a file containing a specific text

Basically I want to count the number of lines which contain the word Out.
my $lc1 = 0;
open my $file, "<", "LNP_Define.cfg" or die($!);
#return [ grep m|Out|, <$file> ]; (I tried something with return to but also failed)
#$lc1++ while <$file>;
#while <$file> {$lc1++ if (the idea of the if statement is to count lines if it contains Out)
close $file;
print $lc1, "\n";
The command line might be a potential option for you too:
perl -ne '$lc1++ if /Out/; END { print "$lc1\n"; } ' LNP_Define.cfg
The -n assumes a while loop for all your code before END.
The -e expects code surrounded by ' '.
The $lc1++ will count only if the following if statement is true.
The if statement runs per line looking for "Out".
The END { } statement is for processing after the while loops ends. Here is where you can print the count.
Or without the command line:
my $lc1 = 0;
while ( readline ) {
    $lc1++ if /Out/;
}
print "$lc1\n";
Then run on the command line:
$ perl count.pl LNP_Define.cfg
Use index:
0 <= index $_, 'Out' and $lc1++ while <$file>;
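If you prefer that spelled out as a complete script rather than a statement modifier, here is a minimal sketch (file name taken from the question):
use strict;
use warnings;

# Count lines containing the literal substring "Out" using index().
my $lc1 = 0;
open my $fh, '<', 'LNP_Define.cfg' or die $!;
while (<$fh>) {
    $lc1++ if index($_, 'Out') >= 0;    # index returns -1 when not found
}
close $fh;
print "$lc1\n";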

AWK how to print more than one information and redirect all to a text file

Need help on this
I am creating a script that will analyse a file, and I want to use awk to print two pieces of information to an output txt file.
I am able to print the first piece of information I am looking for on my screen, but how do I print another piece of information with the same awk (for example the number of lines of the file analysed) and output both to a file called test.txt?
I tried this code and got the error: operator expected
#!/usr/bin/perl
if ($#ARGV ==-1)
{
print "Saisissez un nom de fichier a nalyser \n";
}
else
{
$fname = $ARGV[0];
open(FILE, $fname) || die ("cant open \n");
}
while($ligne=<FILE>)
{
chop ($ligne);
my ($elemnt1, $ellement2, $element3) = split (/ /, $ligne_);
}
system("awk '{print \$2 > "test.txt"}' $fname");
Try escaping the quotes on your last line, so test.txt is actually in the string passed to system.
system("awk '{print \$2 > \"test.txt\"}' $fname");
Edit: Adding the number of lines to the same file
The Awk variable NR ends up holding the number of lines in the input while the END rule is executing. Try this:
$outfile = '"test.txt"';
system("awk '{print \$2 > $outfile} END {print NR > $outfile}' $fname");
Notes:
Watch out that $outfile doesn't have any funny characters in the name.
Unlike in shell, it's perfectly safe to use > both times. See here.
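As an aside, since the wrapper is already Perl, the same output file can be produced without shelling out to awk at all. A rough sketch (not the original code; it assumes whitespace-separated fields, as awk does by default):
use strict;
use warnings;

# Write the second whitespace-separated field of every line to test.txt,
# followed by the total line count, mirroring the awk command above.
my $fname = shift or die "usage: $0 file\n";
open my $in,  '<', $fname     or die "can't open $fname: $!";
open my $out, '>', 'test.txt' or die "can't write test.txt: $!";
my $count = 0;
while (my $line = <$in>) {
    $count++;
    my @fields = split ' ', $line;
    print {$out} "$fields[1]\n" if defined $fields[1];
}
print {$out} "$count\n";
close $in;
close $out;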

How to compress 4 consecutive blank lines into one single line in Perl

I'm writing a Perl script to read a log and rewrite it to a new file, removing blank lines wherever there are 4 or more consecutive blank lines. In other words, any run of 4 or more consecutive blank lines has to be compressed into a single line, but runs of 1, 2 or 3 blank lines must keep their current format. I have tried to find a solution online, but all I can find is
perl -00 -pe ''
or
perl -00pe0
Also, I found a vim example to delete blocks of 4 empty lines, :%s/^\n\{4}//, which matches what I'm looking for, but that is vim, not Perl. Can anyone help with this? Thanks.
To collapse 4+ consecutive Unix-style EOLs to a single newline:
$ perl -0777 -pi.bak -e 's|\n{4,}|\n|g' file.txt
An alternative flavor using look-behind:
$ perl -0777 -pi.bak -e 's|(?<=\n)\n{3,}||g' file.txt
use strict;
use warnings;

my $cnt = 0;

sub flush_ws {
    $cnt = 1 if ($cnt >= 4);
    while ($cnt > 0) { print "\n"; $cnt--; }
}

while (<>) {
    if (/^$/) {
        $cnt++;
    } else {
        flush_ws();
        print $_;
    }
}
flush_ws();
Your -0 hint is a good one, since you can use -0777 to slurp the whole file in -p mode. Read more about these in perlrun. So this one-liner should do the trick:
$ perl -0777 -pe 's/\n{5,}/\n\n/g'
If there are up to four newlines in a row, nothing happens. Five newlines or more (four or more empty lines) are replaced by two newlines (one empty line). Note the /g switch, which replaces every match rather than only the first.
Deparsed code:
BEGIN { $/ = undef; $\ = undef; }
LINE: while (defined($_ = <ARGV>)) {
    s/\n{5,}/\n\n/g;
}
continue {
    die "-p destination: $!\n" unless print $_;
}
HTH! :)
One way using GNU awk, setting the record separator to NUL:
awk 'BEGIN { RS="\0" } { gsub(/\n{5,}/,"\n")}1' file.txt
This assumes that your definition of empty excludes whitespace.
This will do what you need
perl -ne 'if (/\S/) {$n = 1 if $n >= 4; print "\n" x $n, $_; $n = 0} else {$n++}' myfile
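Spelled out, that one-liner is roughly equivalent to this sketch:
use strict;
use warnings;

my $n = 0;                        # blank lines seen so far
while (<>) {
    if (/\S/) {                   # a non-blank line
        $n = 1 if $n >= 4;        # a run of 4+ blank lines collapses to one
        print "\n" x $n, $_;      # emit the kept blank lines, then this line
        $n = 0;
    }
    else {
        $n++;                     # just count blank lines for now
    }
}
# Note: like the one-liner, this drops any blank lines at the very end of the file.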

perl + numeration word or parameter in file

I need help with how to number text in a file.
I have a Linux machine and I need to write the script in Perl.
I have a file named file_db.txt.
This file has parameters like name, ParameterFromBook, NumberPage, BOOK_From_library, price etc.
Each parameter is equal to something, as in name=elephant.
My question: how can I do this in Perl?
I want to give a number to each parameter (before the "=") that is repeated in the file, increasing the number by 1 on each new occurrence of that parameter until EOF.
lidia
For example
file_db.txt before numbering
parameter=1
name=one
parameter=2
name=two
file_db.txt after parameters numbering
parameter1=1
name1=one
parameter2=2
name2=two
other examples
Example1 before
name=elephant
ParameterFromBook=234
name=star.world
ParameterFromBook=200
name=home_room1
ParameterFromBook=264
Example1 after parameters numbering
name1=elephant
ParameterFromBook1=234
name2=star.world
ParameterFromBook2=200
name3=home_room1
ParameterFromBook3=264
Example2 before
file_db.txt before numbering
lines_and_words=1
list_of_books=3442
lines_and_words=13
list_of_books=344224
lines_and_words=120
list_of_books=341
Example2 after
file_db.txt after parameters numbering
lines_and_words1=1
list_of_books1=3442
lines_and_words2=13
list_of_books2=344224
lines_and_words3=120
list_of_books3=341
It can be condensed to a one line perl script pretty easily, though I don't particularly recommend it if you want readability:
#!/usr/bin/perl
s/(.*)=/$k{$1}++;"$1$k{$1}="/e and print while <>;
This version reads from a specified file, rather than using the command line:
#!/usr/bin/perl
open IN, "/tmp/file";
s/(.*)=/$k{$1}++;"$1$k{$1}="/e and print while <IN>;
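Spelled out, the substitution is roughly equivalent to this sketch:
use strict;
use warnings;

my %k;                                        # per-key occurrence counters
while (my $line = <>) {
    if ($line =~ s/(.*)=/ $k{$1}++; "$1$k{$1}=" /e) {
        print $line;                          # only lines containing '=' are printed
    }
}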
The way I look at it, you probably want to number blocks and not just occurrences. So you probably want the number on each of the keys to be at least as great as the earliest repeating key.
my $in  = \*::DATA;
my $out = \*::STDOUT;

my %occur;
my $num = 0;

while ( <$in> ) {
    if ( my ( $pre, $key, $data ) = m/^(\s*)(\w+)=(.*)/ ) {
        $num++ if $num < ++$occur{$key};
        print { $out } "$pre$key$num=$data\n";
    }
    else {
        $num++;
        print;
    }
}
__DATA__
name=elephant
ParameterFromBook=234
name=star.world
ParameterFromBook=200
name=home_room1
ParameterFromBook=264
However, if you just wanted to give each key its particular count, this is enough:
my %occur;
while ( <$in> ) {
    my ( $pre, $key, $data ) = m/^(\s*)(\w+)=(.*)/;
    $occur{$key}++;
    print { $out } "$pre$key$occur{$key}=$data\n";
}
in pretty much pseudo code:
open(my $fh, "<", "file") or die $!;
my @lines = <$fh>;
close($fh);

my %tags;
foreach my $line (@lines)
{
    chomp $line;
    my ($name, $value) = split(/=/, $line, 2);
    $tags{$name}++;
    print "$name$tags{$name}=$value\n";
}
This looks like a CS101 assignment. Is it really good to ask for complete solutions instead of asking specific technical questions if you have difficulty?
If Perl is not a must, here's an awk version
$ cat file
name=elephant
ParameterFromBook=234
name=star.world
ParameterFromBook=200
name=home_room1
ParameterFromBook=264
$ awk -F"=" '{s[$1]++}{print $1s[$1],$2}' OFS="=" file
name1=elephant
ParameterFromBook1=234
name2=star.world
ParameterFromBook2=200
name3=home_room1
ParameterFromBook3=264