Pattern matching and dealing with data in Expect?

contents of expect_out(buffer)
GigabitEthernet1/0/9 unassigned YES unset up up
GigabitEthernet1/0/10 unassigned YES unset down down
GigabitEthernet1/0/11 unassigned YES unset down down
GigabitEthernet1/0/23 unassigned YES unset down down
GigabitEthernet1/0/24 unassigned YES unset down down
GigabitEthernet1/1/1 unassigned YES unset down down
GigabitEthernet1/1/2 unassigned YES unset down down
GigabitEthernet1/1/3 unassigned YES unset down down
GigabitEthernet1/1/4 unassigned YES unset down down
Te1/1/1 unassigned YES unset down down
Te1/1/2 unassigned YES unset down down
FastEthernet2/0/1 unassigned YES unset down down
FastEthernet2/0/2 unassigned YES unset down down
FastEthernet2/0/24 unassigned YES unset down down
GigabitEthernet2/0/1 unassigned YES unset up up
GigabitEthernet2/0/2 unassigned YES unset down down
I have the data above and I need to count the number of entries of each type
so that i can have the info like :
GigabitEthernet1 : 20
GigabitEthernet2 : 20
Te1 : 2
FastEthernet2 : 4
FastEthernet1 : 4
total : 50
How can I do it?
Any help would be appreciated because I don't know in which direction to proceed because I am a novice as far as expect/tcl is concerned.
I tried to use the split function to parse it, using newline as the delimiter, so that I could use a regex inside a for loop, but it seems that because $expect_out(buffer) is a variable it might not have any lines in it.
Moreover, can I use awk or sed inside Expect? Then it would not be so difficult, I guess. But a pure Expect solution would be preferable.

Based on your current input data, this one-liner:
awk -F'/' '{a[$1]++}END{for(x in a){print x" : "a[x];t+=a[x];}print "total : "t}' file
gives:
FastEthernet2 : 3
GigabitEthernet1 : 9
GigabitEthernet2 : 2
Te1 : 2
total : 16

Since Expect is based on Tcl/Tk, you should familiarize yourself with that language, since it offers numerous string-handling commands. Here is some code that hopefully sets you on the right track.
set str $expect_out(buffer)
# Strip everything from the first slash to the end of each line
regsub -all -line {/.*} $str "" str2
puts $str2 ;# just to see what you have so far
# Convert the string into a list of lines
set li [split $str2 "\n"]
# Count the occurrences of each name in an associative array.
# The first time a name is seen its entry is set to one;
# afterwards it is increased by one.
foreach name $li {
    set name [string trim $name]
    if {$name eq ""} continue
    if {[info exists arr($name)]} {
        incr arr($name)
    } else {
        set arr($name) 1
    }
}
# Now get the statistics
puts [array get arr]
# prints something like this for your example (the order may vary)
# GigabitEthernet2 2 Te1 2 FastEthernet2 3 GigabitEthernet1 9
And you should tag this question with Tcl and Tk too.
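The counting logic shared by both answers above (cut each line at the first slash, then tally the prefixes) can also be sketched in Python. This is only an illustration, not part of either answer: the sample buffer is a shortened copy of the question's data, the "\r\n" line endings are an assumption about how the device output arrives, and count_prefixes is a hypothetical helper name.

```python
from collections import Counter

# Shortened sample of the question's data. The "\r\n" endings are an
# assumption; splitlines() copes with both "\n" and "\r\n".
buffer = (
    "GigabitEthernet1/0/9 unassigned YES unset up up\r\n"
    "GigabitEthernet1/0/10 unassigned YES unset down down\r\n"
    "Te1/1/1 unassigned YES unset down down\r\n"
    "FastEthernet2/0/1 unassigned YES unset down down\r\n"
    "GigabitEthernet2/0/1 unassigned YES unset up up\r\n"
)


def count_prefixes(text):
    """Count lines by the part of the interface name before the first slash."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines so they are not counted as a type
        counts[line.split("/", 1)[0]] += 1
    return counts


counts = count_prefixes(buffer)
for name in sorted(counts):
    print(f"{name} : {counts[name]}")
print(f"total : {sum(counts.values())}")
```

Skipping blank lines matters here: without it, an empty trailing line would show up as its own "type".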

Related

Change one line to column

I have a file like this:
20180127200000
DEFAULT, Proc_m0_s1
DEFAULT, Proc_m0_s2
DEFAULT, Proc_m0_s3
20180127200001
DEFAULT, Proc_m0_s1
DEFAULT, Proc_m0_s2
DEFAULT, Proc_m0_s3
I'd like to convert to:
20180127200000 DEFAULT, Proc_m0_s1
20180127200000 DEFAULT, Proc_m0_s2
20180127200000 DEFAULT, Proc_m0_s3
20180127200001 DEFAULT, Proc_m0_s1
20180127200001 DEFAULT, Proc_m0_s2
20180127200001 DEFAULT, Proc_m0_s3
The time stamp appears every 4 lines.
Awk should be both faster and simpler than a shell loop for this type of task.
awk 'NR%4==1 { time=$1; next }
{ print time " " $0 }' file
NR is the line number. If the line number modulo 4 is one, remember the first field but don't print. Otherwise, print with the remembered value and a space as the prefix.
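# Pure-shell alternative: read a prefix line, then prefix the next
# (prefix_lines - 1) lines with it
iconvert() {
    while true; do
        IFS= read -r linePrefix || return 0
        for i in $(seq $((prefix_lines - 1))); do
            IFS= read -r line || return 0
            echo "$linePrefix" "$line"
        done
    done
}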
You just call iconvert like this:
prefix_lines=4
iconvert < file.txt
which gives your desired output.
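For comparison, the same "remember the first line of every group, prefix the rest" idea can be sketched in Python. This is an illustrative translation of the awk logic, not code from either answer; prefix_with_timestamp and its group_size parameter are hypothetical names, and the sample list is the question's input.

```python
def prefix_with_timestamp(lines, group_size=4):
    """Yield each data line prefixed by the timestamp that heads its group.

    group_size counts the timestamp line itself (4 in the question's sample).
    """
    timestamp = None
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        if index % group_size == 0:
            timestamp = line.strip()  # remember it, but do not emit it
        else:
            yield f"{timestamp} {line}"


sample = [
    "20180127200000",
    "DEFAULT, Proc_m0_s1",
    "DEFAULT, Proc_m0_s2",
    "DEFAULT, Proc_m0_s3",
    "20180127200001",
    "DEFAULT, Proc_m0_s1",
    "DEFAULT, Proc_m0_s2",
    "DEFAULT, Proc_m0_s3",
]

result = list(prefix_with_timestamp(sample))
for row in result:
    print(row)
```

Note that enumerate() counts from zero, so the awk test NR%4==1 becomes index % group_size == 0 here.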

Perl one liner to simulate awk script

I'm new to both awk and perl, so please bear with me.
I have the following awk script:
awk '/regex1/{p = 0;} /regex2/{p = 1;} p'
What this basically does is print all lines starting from a line matching regex2 until a line matching regex1 is found.
Example:
regex1
regex2
line 1
line 2
regex1
regex2
regex1
Output:
regex2
line 1
line 2
regex2
Is it possible to simulate this using a perl one-liner? I know I can do it with a script saved in a file.
Edit:
A practical example:
24 May 2017 17:00:06,827 [INFO] 123456 (Blah : Blah1) Service-name:: Single line content
24 May 2017 17:00:06,828 [INFO] 567890 (Blah : Blah1) Service-name:: Content( May span multiple lines)
24 May 2017 17:00:06,829 [INFO] 123456 (Blah : Blah2)
Service-name: Multiple line content. Printing Object[ ID1=fac-adasd
ID2=123231
ID3=123108 Status=Unknown
Code=530007 Dest=CA
]
24 May 2017 17:00:06,830 [INFO] 123456 (Blah : Blah1) Service-name:: Single line content
24 May 2017 17:00:06,831 [INFO] 567890 (Blah : Blah2) Service-name:: Content( May span multiple lines)
Given the search key 123456 I want to extract the following:
24 May 2017 17:00:06,827 [INFO] 123456 (Blah : Blah1) Service-name:: Single line content
24 May 2017 17:00:06,829 [INFO] 123456 (Blah : Blah2)
Service-name: Multiple line content. Printing Object[ ID1=fac-adasd
ID2=123231
ID3=123108 Status=Unknown
Code=530007 Dest=CA
]
24 May 2017 17:00:06,830 [INFO] 123456 (Blah : Blah1) Service-name:: Single line content
The following awk script does the job:
awk '/[0-9]{2}\s\w+\s[0-9]{4}/{n = 0} /123456/ {n =1}n' file
perl -ne 'print if (/regex2/ .. /regex1/) =~ /^\d+$/'
This is slightly crazy, but here's how it works:
-n adds an implicit loop over the input lines
the current line is in $_
the two bare regex matches (/regex2/, /regex1/) implicitly test against $_
we use .. in scalar context, which turns it into a stateful flip-flop operator
By that I mean: X .. Y starts out in the "false" state. In the "false" state it only evaluates X. If X returns a false value, it remains in the "false" state (and returns false itself). Once X returns a true value, it moves into the "true" state and returns true.
In the "true" state it only evaluates Y. If Y returns false, it remains in the "true" state (and returns true itself). Once Y returns a true value, it moves into the "false" state but it still returns true.
had we just used print if /regex2/ .. /regex1/, it would have printed all the terminating regex1 lines, too
a close reading of Range Operators in perldoc perlop reveals that you can distinguish the end points of the range
the "true" value returned by .. is actually a sequence number starting from 1, so the start of a range can be identified by checking for 1
when the end of the range is reached (i.e. we're about to move from the "true" state to the "false" state again), the return value gets a "E0" tacked on to the end
Adding "E0" to an integer doesn't affect its numeric value. Perl implicitly converts strings to numbers when needed, and something like "5E0" is just scientific notation (meaning 5 * 10**0, which is 5 * 1, which is 5).
the "false" value returned by .. is the empty string, ""
We check that the result of .. matches the regex /^\d+$/, i.e. is all digits. This excludes the empty string (because we require at least one digit to match), so we don't print lines outside of the range. It also excludes the last line in our range, because E is not a digit.
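To make the state machine described above concrete, here is a small Python simulation of the two-dot flip-flop. It is a sketch for illustration only: FlipFlop is a hypothetical class, not a Perl or Python API, and it assumes the behaviour documented in perlop, including that the two-dot form may test the right operand on the same line on which the left operand became true.

```python
import re


class FlipFlop:
    """Simulate the return values of Perl's scalar `X .. Y` on a line stream."""

    def __init__(self, start, end):
        self.start = re.compile(start)
        self.end = re.compile(end)
        self.seq = 0  # 0 means we are in the "false" state

    def __call__(self, line):
        if self.seq == 0:
            if not self.start.search(line):
                return ""  # still outside the range
            self.seq = 1  # the range starts on this line
        else:
            self.seq += 1
        if self.end.search(line):
            value = f"{self.seq}E0"  # last line of the range
            self.seq = 0
            return value
        return str(self.seq)


lines = ["regex1", "regex2", "line 1", "line 2", "regex1", "regex2", "regex1"]
flip = FlipFlop(r"regex2", r"regex1")
values = [flip(line) for line in lines]

# Keep only lines whose flip-flop value is all digits, as /^\d+$/ does.
printed = [line for line, value in zip(lines, values) if re.fullmatch(r"\d+", value)]
for line in printed:
    print(line)
```

Running it on the question's example yields the flip-flop values "", 1, 2, 3, 4E0, 1, 2E0, so the all-digits filter drops both the lines outside the range and the terminating regex1 lines.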
Not sure if awk prints both the start and end of the range, but Perl does:
perl -ne 'if(/regex2/ ... /regex1/){print}' file
Edit: Awk also has a range pattern (it is part of standard awk, not only GNU awk), so this could have been done more simply as:
awk '/regex2/,/regex1/' file

perl how to force boolean to 0/1

I'm trying to get a 0 or 1 in a value for true/false. Here's the code:
use strict;
my %h = (Y => "y");
my $bool_x = 1 & exists $h{X};
my $bool_y = 1 & exists $h{Y};
print("x $bool_x y $bool_y\n");
I needed to add the "1 &" to force it to not be the empty string. Is there a better way to do this? I realize that it's an artifact of the way perl prints the false value, but I need it to be a 0 not the empty string.
The most efficient way to do this is just:
$bool || 0
Your code is far from clear. The & operator is a bitwise operator that behaves differently on numbers and strings, but exists returns a value that will obligingly be the number zero or the empty string depending on what is required of it. perldoc perlop has this to say:
Although no warning is currently raised, the result is not well defined when this operation is performed on operands that aren't either numbers (see Integer Arithmetic) nor bitstrings (see Bitwise String Operators)
So, because the result of exists is one such value, your code is on shaky ground.
There is no need to write something arcane, and if you want to represent Perl's internal true and false values by a different pair of values then the conditional operator is the correct choice
my $bool_x = exists $h{X} ? 1 : 0;
I believe that's the most readable without being verbose, and that's all that matters here. It's also applicable to any other pair of values that you may choose, such as
my $bool_x = exists $h{X} ? 'Y' : 'N';
There are many ways to "numify" a true/false value.
If $var can contain 1 or any false value, all of these will evaluate to either "0" or "1":
0 + $var
0 | $var
$var || 0
1 * $var
1 & $var (you discovered this one, already)
chr(48+$var)
sprintf "%d", $var
These constructions return a 0/1 value when $var can contain any true or false value:
0 + !!$var ( !! true => 1, !! false => "" )
1 - !$var
$var ? 1 : 0

Trouble using 'while' loop to evaluate multiple lines, Perl

Thank you in advance for indulging an amateur Perl question. I'm extracting some data from a large, unformatted text file, and am having trouble combining the use of a 'while' loop and regular expression matching over multiple lines.
First, a sample of the data:
01-034575 18/12/2007 258,750.00 11,559.00 36 -2 0 6 -3 2 -2 0 2 1 -1 3 0 5 15
-13 -44 -74 -104 -134 -165 -196 -226 -257 -287 -318 -349 -377 -408 -438
-469 -510 -541 -572 -602 -633 -663
Atraso Promedio ---> 0.94
The first sequence, XX-XXXXXX is a loan ID number. The date and the following two numbers aren't important. '36' is the number of payments. The following sequence of positive and negative numbers represent how late/early this client was for this loan at each of the 36 payment periods. The '0.94' following 'Atraso Promedio' is the bank's calculation for average delay. The problem is it's wrong, since they substitute all negative (i.e. early) payments in the series with zeros, effectively over-stating how risky a client is. I need to write a program that extracts ID and number of payments, and then dynamically calculates a multi-line average delay.
Here's what I have so far:
#Create an output file
open(OUT, ">out.csv");
print OUT "Loan_ID,Atraso_promedio,Atraso_alt,N_payments,\n";
open(MYINPUTFILE, "<DATA.txt");
while(<MYINPUTFILE>){
chomp($_);
if($ID_select != 1 && m/(\d{2}\-\d{6})/){$Loan_ID = $1, $ID_select = 1}
if($ID_select == 1 && m/\d{1,2},\d{1,3}\.00\s+\d{1,2},\d{1,3}\.00\s+(\d{1,2})/) {$N_payments = $1, $Payment_find = 1};
if($Payment_find == 1 && $ID_select == 1){
while(m/\s{2,}(\-?\d{1,3})/g){
$N++;
$SUM = $SUM + $1;
print OUT "$Loan_ID,$1\n"; #THIS SHOWS ME WHAT NUMBERS THE CODE IS GRABBING. ACTUAL OUTPUT WILL BE WRITTEN BELOW
print $Loan_ID,"\n";
}
if(m/---> *(\d*.\d*)/){$Atraso = $1, $Atraso_select = 1}
if($ID_select == 1 && $Payment_find == 1 && $Atraso_select == 1){
...
There's more, but the while loop is where the program is breaking down. The problem is with the pattern modifier, 'g,' which performs a global search of the string. This makes the program grab numbers that I don't want, such as the '1' in loan ID and the '36' for the number of payments. I need the while loop to start from wherever the previous line in the code left off, which should be right after it has identified the number of payments. I've tried every pattern modifier that I've been able to look up, and only 'g' keeps me out of an infinite loop. I need the while loop to go to the end of the line, then start on the next one without combing over the parts of the string already fed through the program.
Thoughts? Does this make sense? Would be immensely grateful for any help you can offer. This work is pro-bono, unpaid: just trying to help out some friends in a micro-lending institution conduct a risk analysis.
Cheers,
Aaron
The problem is probably easier using split, for instance something like this:
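use strict;
use warnings;

open my $fh, '<', 'DATA.txt' or die "Cannot open DATA.txt: $!";
my @payments;
my $numberOfPayments;
my $loanNumber;
while (<$fh>)
{
    if (/\b\d{2}-\d{6}\b/)
    {
        # First line of a record: ID, date, two amounts, count, then delays
        ($loanNumber, undef, undef, undef, $numberOfPayments, @payments) = split;
    }
    elsif (/Atraso Promedio/)
    {
        my (undef, undef, undef, $atrasoPromedio) = split;
        # Calculate average of payments and print results
        my $sum = 0;
        $sum += $_ for @payments;
        my $average = @payments ? $sum / @payments : 0;
        print "$loanNumber,$atrasoPromedio,$average,$numberOfPayments\n";
    }
    else
    {
        # Continuation line: more delays for the current loan
        push(@payments, split);
    }
}
close $fh;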
If the data's clean enough, I might approach it by using split instead of regular expressions. The first line is identifiable if field[0] matches the form of a loan number and field[1] matches the format of a date; then the payment delays are the array slice @field[5..$#field] (in Perl, 5..-1 is an empty range, so @field[5..-1] would select nothing). Similarly, testing the first field of each line tells you where you are in the data.
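The field-based approach described above can be sketched in Python as follows. This is an illustration under assumptions, not the answerer's code: parse_loans is a hypothetical helper, the sample is the record from the question, and the field positions (count in field 4, delays from field 5 on) are taken from that single sample.

```python
import re

LOAN_ID = re.compile(r"^\d{2}-\d{6}$")

sample = """\
01-034575 18/12/2007 258,750.00 11,559.00 36 -2 0 6 -3 2 -2 0 2 1 -1 3 0 5 15
-13 -44 -74 -104 -134 -165 -196 -226 -257 -287 -318 -349 -377 -408 -438
-469 -510 -541 -572 -602 -633 -663
Atraso Promedio ---> 0.94
"""


def parse_loans(lines):
    """Yield (loan_id, bank_average, n_payments, delays) for each record."""
    loan_id, n_payments, delays = None, None, []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if LOAN_ID.match(fields[0]):
            # First line of a record: ID, date, two amounts, count, delays
            loan_id = fields[0]
            n_payments = int(fields[4])
            delays = [int(x) for x in fields[5:]]
        elif fields[:2] == ["Atraso", "Promedio"]:
            yield loan_id, float(fields[3]), n_payments, delays
        elif loan_id is not None:
            # Continuation line: more delays for the current loan
            delays.extend(int(x) for x in fields)


records = list(parse_loans(sample.splitlines()))
loan_id, bank_average, n_payments, delays = records[0]
average = sum(delays) / len(delays)
print(loan_id, n_payments, len(delays), bank_average, average)
```

Comparing len(delays) with n_payments is a cheap sanity check that no continuation line was missed.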
Peter van der Heijden's answer is a nice simplification for a solution.
To answer the OP's question about getting the regexp to continue where it left off, see Perl operators - regexp-quote-like operators, specifically the section "Matching in list context" and the "\G assertion" section just after that.
Essentially, you can use m//gc along with the \G assertion to make a regexp match start where the previous match left off.
The example in the "\G assertion" section about lex-like scanners would seem to apply to this question.
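The "continue where the last match stopped" idea can be illustrated in Python, where a compiled pattern's match(string, pos) anchors the match at pos in much the same way \G does in Perl. This is an analogy sketched for illustration, not a translation of any code in this thread; HEADER, NUMBER and parse_first_line are hypothetical names, and the header pattern assumes the layout of the question's sample line.

```python
import re

# Loan ID, three fields to skip (date and two amounts), then the payment count
HEADER = re.compile(r"(\d{2}-\d{6})\s+\S+\s+\S+\s+\S+\s+(\d+)")
NUMBER = re.compile(r"\s+(-?\d+)")


def parse_first_line(line):
    """Return (loan_id, n_payments, delays) or None if the header is absent."""
    header = HEADER.match(line)
    if header is None:
        return None
    loan_id, n_payments = header.group(1), int(header.group(2))
    pos = header.end()  # scanning resumes right after the payment count
    delays = []
    while True:
        m = NUMBER.match(line, pos)  # anchored at pos, like \G in Perl
        if m is None:
            break
        delays.append(int(m.group(1)))
        pos = m.end()
    return loan_id, n_payments, delays


line = "01-034575 18/12/2007 258,750.00 11,559.00 36 -2 0 6 -3 2 -2 0 2 1 -1 3 0 5 15"
parsed = parse_first_line(line)
print(parsed)
```

Because the number scan starts at the end of the header match, digits inside the loan ID and the payment count are never picked up as delays, which is the problem the question describes with a plain global match.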

Pass zero in to Getopt::Std

I am using Getopt::Std in a Perl script, and would like to pass in a zero as value. I am checking that values are set correctly using unless(). At the moment unless() is rejecting the value as being unset.
Is there a way to get unless() to accept zero as a valid value (any non-negative integer is valid).
This is probably perfectly simple, but I've never touched Perl before a few days ago!
Rich
You need to use unless defined <SOMETHING> instead of unless <SOMETHING> , because zero is false in Perl.
Perl 5 has several false values: 0, "0", "", undef, ().
It is important to note that some things may look like they should be false, but aren't. For instance, 0.0 is false because it is a number that is equivalent to 0, but "0.0" is not (the only strings which are false are the empty string ("") and "0").
It also has the concept of definedness. A variable that has a value (other than undef) assigned to it is said to be defined and will return true when tested with the defined function.
Given that you want an argument to be a non-negative integer, it is probably better to test for that:
unless (defined $value and $value =~ /^[0-9]+$/) {
#blah
}
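The same distinction between "unset" and "false" exists in other languages. As a loose analogy only, here is the check sketched in Python, where None plays the role of undef and the string "0" must still be accepted; is_valid_count is a hypothetical helper, and option values are assumed to arrive as strings.

```python
import re


def is_valid_count(value):
    """True if value is set (not None) and is a non-negative integer string."""
    return value is not None and re.fullmatch(r"[0-9]+", value) is not None


for candidate in ("0", "12", "", "-1", None):
    print(repr(candidate), is_valid_count(candidate))
```

Testing the format explicitly, rather than relying on truthiness, is what lets "0" through while still rejecting an empty or missing value.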