perl print formatting question - perl

I want to display a table in Perl whose row and column names are of variable length, and I want the columns to be neatly aligned. Because the row and column headings vary in length, the alignment shifts from file to file.
Here is the code I am using to format the output:
print "\n ";
foreach (keys(%senseToSenseCountHash))
{
printf "%15s",$_;
}
print "\n";
print "------------------------------------------------------------\n";
my $space = "---";
foreach my $realSense (keys(%actualSenseToWronglyDisambiguatedSense))
{
printf "%s",$realSense;
foreach (keys(%senseToSenseCountHash))
{
if(exists($actualSenseToWronglyDisambiguatedSense{$realSense}[0]{$_}))
{
printf "%15s",$actualSenseToWronglyDisambiguatedSense{$realSense}[0]{$_};
}
else
{
printf "%15s",$space;
}
}
print "\n";
}
The outputs I get are as follows (for different files that I have to test on) :
       Microsoft            IBM
------------------------------------------------------------
Microsoft            896            120
IBM             66            661
         SERVE12         SERVE2         SERVE6        SERVE10
------------------------------------------------------------
SERVE12            319             32             19              8
SERVE2             44            159             39             25
SERVE6             22              9            102              1
SERVE10             14             16             12            494
           HARD3          HARD2          HARD1
------------------------------------------------------------
HARD3             68              7             27
HARD2              6             60             90
HARD1             37             69            937
I want to make this output aligned regardless of the row and column name. Can anyone please help?
Thanks so much!

This line:
printf "%s",$realSense;
has no specific width, and is throwing off the alignment.
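One way to give the row label a width is to measure the longest label first and pass that width to printf through `*`. The following is only a sketch: the sub name `format_table`, the `@labels` array and the `%counts` hash are made up for illustration and stand in for the hashes in the question. It also assumes that no count is wider than the longest label.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical sample data standing in for the hashes in the question.
my @labels = qw(SERVE12 SERVE2 SERVE10);
my %counts = (
    SERVE12 => { SERVE12 => 319, SERVE2 => 32 },
    SERVE2  => { SERVE12 => 44,  SERVE2 => 159, SERVE10 => 25 },
    SERVE10 => { SERVE2  => 16,  SERVE10 => 494 },
);

# Build the whole table as one string; every cell uses the same width.
sub format_table {
    my ($labels, $counts) = @_;
    my $w   = max(map { length } @$labels);        # widest label
    my $out = sprintf "%-*s", $w, '';              # empty corner cell
    $out .= sprintf " %*s", $w, $_ for @$labels;   # column headings
    $out .= "\n";
    for my $row (@$labels) {
        $out .= sprintf "%-*s", $w, $row;          # row label, fixed width
        for my $col (@$labels) {
            # Missing combinations are shown as '---', as in the question.
            $out .= sprintf " %*s", $w, $counts->{$row}{$col} // '---';
        }
        $out .= "\n";
    }
    return $out;
}

print format_table(\@labels, \%counts);
```

Because the width is computed from the data, the same code lines up for short labels such as IBM and for longer ones such as SERVE12.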

I found the answer and am pasting it here in case anyone wants to use it.
printf "%10s %-2s",'----------','|';
foreach(keys(%senseToSenseCountHash))
{
printf "%s",'----------------';
}
print "\n";
printf "%10s %-2s",' ','|';
foreach(keys(%senseToSenseCountHash))
{
printf "%-14s",$_;
}
print "\n";
printf "%10s %-2s",'----------','|';
foreach(keys(%senseToSenseCountHash))
{
printf "%s",'----------------';
}
print "\n";
foreach my $key (sort { $senseToSenseCountHash{$b} <=>
$senseToSenseCountHash{$a} } keys %senseToSenseCountHash )
{
$maxSense = $senseToSenseCountHash{$key};
last;
}
my $space = "---";
foreach my $realSense (keys(%actualSenseToWronglyDisambiguatedSense))
{
printf "%-10s %-2s",$realSense,'|';
foreach (keys(%senseToSenseCountHash))
{
if(exists($actualSenseToWronglyDisambiguatedSense{$realSense}[0]{$_}))
{
printf "%-15s",$actualSenseToWronglyDisambiguatedSense{$realSense}[0]{$_};
}
else
{
printf "%-15s",$space;
}
}
print "\n";
}
printf "%10s %-2s",'----------','|';
foreach(keys(%senseToSenseCountHash))
{
printf "%s",'----------------';
}
print "\n";
Output :
---------- | ------------------------------------------------
           | HARD3         HARD2         HARD1
---------- | ------------------------------------------------
HARD3      | 68             7              27
HARD2      | 6              60             90
HARD1      | 37             69             937
---------- | ------------------------------------------------
---------- | ----------------------------------------------------------------
           | SERVE12       SERVE2        SERVE6        SERVE10
---------- | ----------------------------------------------------------------
SERVE12    | 319            32             19             8
SERVE2     | 44             159            39             25
SERVE6     | 22             9              102            1
SERVE10    | 14             16             12             494
---------- | ----------------------------------------------------------------

Related

How to resolve this warning in Perl

I asked a similar question previously but didn't provide the full code.
I am reading the file below, finding the maximum word width in each column, and then writing the data to another file with proper alignment.
id0 id1 id2 batch
0 34 56 70
2 3647 58 72 566
4 39 616 75 98 78 78987 9876 7899 776
89 40 62 76
8 42 64 78
34 455 544 565
My code:
unlink "temp1.log";
use warnings;
use strict;
use feature 'say';
my $log1_file = "log1.log";
my $temp1 = "temp1.log";
open(IN1, "<$log1_file" ) or die "Could not open file $log1_file: $!";
my @col_lens;
while (my $line = <IN1>) {
    my @fs = split " ", $line;
    my @rows = @fs ;
    @col_lens = map (length, @rows) if $.==1;
    for my $col_idx (0..$#rows) {
        my $col_len = length $rows[$col_idx];
        if ($col_lens[$col_idx] < $col_len) {
            $col_lens[$col_idx] = $col_len;
        }
    };
};
close IN1;
open(IN1, "<$log1_file" ) or die "Could not open file $log1_file: $!";
open(tempp1,"+>>$temp1") or die "Could not open file $temp1: $!";
while (my $line = <IN1>) {
    my @fs = split " ", $line;
    my @az;
    for my $h (0..$#fs) {
        my $len = length $fs[$h];
        my $blk_len = $col_lens[$h]+1;
        my $right = $blk_len - $len;
        $az[$h] = (" ") . $fs[$h] . ( " " x $right );
    }
    say tempp1 (join "|",@az);
};
My warning
Use of uninitialized value in numeric lt (<) at new.pl line 25, <IN1> line 3.
Use of uninitialized value in numeric lt (<) at new.pl line 25, <IN1> line 4.
Use of uninitialized value in numeric lt (<) at new.pl line 25, <IN1> line 4.
Use of uninitialized value in numeric lt (<) at new.pl line 25, <IN1> line 4.
Use of uninitialized value in numeric lt (<) at new.pl line 25, <IN1> line 4.
Use of uninitialized value in numeric lt (<) at new.pl line 25, <IN1> line 4.
I am getting the output correctly but don't know how to remove this warnings.
$col_idx can be up to the number of fields on a line, minus one. For the third line, this is more than the highest index of @col_lens, which at that point holds only the 4 elements taken from the header line. So comparing against an element that does not exist yet makes no sense:
if ($col_lens[$col_idx] < $col_len) {
    $col_lens[$col_idx] = $col_len;
}
Replace it with
if (!defined($col_lens[$col_idx]) || $col_lens[$col_idx] < $col_len) {
    $col_lens[$col_idx] = $col_len;
}
With this, there's really no point checking for $. == 1 anymore.
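The defined check can be wrapped in a small function that does the whole width pass. This is a sketch that assumes the lines are already in memory; the sub name `column_widths` is made up for illustration.

```perl
use strict;
use warnings;

# Return the maximum field length seen in each column.
# Rows may have different numbers of fields.
sub column_widths {
    my @widths;
    for my $line (@_) {
        my @fields = split ' ', $line;
        for my $i (0 .. $#fields) {
            my $len = length $fields[$i];
            # A column seen for the first time has no width yet.
            $widths[$i] = $len
                if !defined $widths[$i] || $widths[$i] < $len;
        }
    }
    return @widths;
}

my @widths = column_widths("id0 id1 id2 batch", "2 3647 58 72 566");
print "@widths\n";   # prints "3 4 3 5 3"
```

The fifth column exists only on the second line, and it still gets a width without any warning.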
You're getting the uninitialized warning because, when the condition $col_lens[$col_idx] < $col_len is checked, $col_lens[$col_idx] is undef for every column that did not appear on the first line.
Solution 1:
Handle a column that is seen for the first time separately: record its length and move on with next, so the comparison is never made against undef.
for my $col_idx (0..$#rows) {
    my $col_len = length $rows[$col_idx];
    unless (defined $col_lens[$col_idx]) {
        $col_lens[$col_idx] = $col_len;
        next;
    }
    if ($col_lens[$col_idx] < $col_len) {
        $col_lens[$col_idx] = $col_len;
    }
}
Solution 2 (not advised):
You can simply silence the "Use of uninitialized value" warnings by putting this line at the top of your script or inside the block where they occur. The pragma is lexically scoped, so it only affects the enclosing block or file.
no warnings 'uninitialized';
For more information, see perldoc warnings.
The following code demonstrates one of many possible solutions to this task:
read line by line
get length of each field
compare with stored earlier
adjust to max length
form $format string for print
print formatted data
use strict;
use warnings;
use feature 'say';
my(@data,@length,$format);
while ( <DATA> ) {
    my @e = split ' ';
    my @l = map{ length } @e;
    $length[$_] = ($length[$_] // 0) < $l[$_] ? $l[$_] : $length[$_] for 0..$#e;
    push @data,\@e;
}
$format = join ' ', map{ '%'.$_.'s' } @length;
$format .= "\n";
for my $row ( @data ) {
    printf $format, map { $row->[$_] // '' } 0..$#length;
}
__DATA__
id0 id1 id2 batch
0 34 56 70
2 3647 58 72 566
4 39 616 75 98 78 78987 9876 7899 776
89 40 62 76
8 42 64 78
34 455 544 565
Output
id0  id1 id2 batch
  0   34  56    70
  2 3647  58    72 566
  4   39 616    75  98 78 78987 9876 7899 776
 89   40  62    76
  8   42  64    78
 34  455 544   565

Number date to Alphabetic in awk

I have a file:
ifile.txt
83080603 55 72
87090607 83 87
88010612 82 44
89080603 55 72
00110607 83 87
01030612 82 44
05120618 84 44
The 1st column shows the date and time with the format YYMMDDHH.
I would like to print the first column as hrHDDMMMYYYY
ofile.txt
03H06Aug1983 55 72
07H06Sep1987 83 87
12H06Jan1988 82 44
and so on
I am not able to convert the numeric month to its text name. My script is
awk '{printf "%s%5s%5s\n",
substr($0,7,2)"H"substr($0,5,2)substr($0,3,2)substr($0,1,2), $2, $3}' ifile.txt
EDIT: Adding solution as per OP's comment for adding year as per condition here.
awk -v curr_year=$(date +%y) '
BEGIN{
num=split("Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec",arr,",")
for(j=1;j<=num;j++){
key=sprintf("%02d",j)
months[key]=arr[j]
}
}
{
year=substr($1,1,2)>=00 && substr($1,1,2)<=curr_year?"20":"19"
$1=substr($1,length($1)-1)"H"substr($1,length($1)-3,2) months[substr($1,3,2)] year substr($1,1,2)
print
year=""
}
' Input_file
Could you please try the following. It was written on mobile and tested successfully at https://ideone.com/4zYMu7. It assumes that the first two characters of the first field in your Input_file denote the year and that every year should be printed with a 19 prefix. If there is other logic for determining the exact year, please mention it.
awk '
BEGIN{
num=split("Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec",arr,",")
for(j=1;j<=num;j++){
key=sprintf("%02d",j)
months[key]=arr[j]
}
}
{
$1=substr($1,length($1)-1)"H"substr($1,length($1)-3,2) months[substr($1,3,2)] "19" substr($1,1,2)
print
}
' Input_file
With GNU awk:
awk -v OFS=" " '
function totime(t, c, y, m, d, h) {
y = substr(t, 1 ,2)
m = substr(t, 3, 2)
d = substr(t, 5, 2)
h = substr(t, 7, 2)
c = y >= 69 ? 19 : 20 # GNU date treats years 69-99 as 19xx else 20xx
return mktime(c y " " m " " d " " h " 0 0")
}
function transformTime(t) {
return strftime("%HH%d%b%Y", totime(t))
}
{
$1 = transformTime($1)
print
}
' ifile.txt
03H06Aug1983 55 72
07H06Sep1987 83 87
12H06Jan1988 82 44
03H06Aug1989 55 72
07H06Nov2000 83 87
12H06Mar2001 82 44
18H06Dec2005 84 44
Or, Perl
perl -MTime::Piece -lane '
$F[0] = Time::Piece->strptime($F[0], "%y%m%d%H")->strftime("%HH%d%b%Y");
print join(" ", @F);
' ifile.txt
Using any awk in any shell on every UNIX box:
$ cat tst.awk
BEGIN {
split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec",mths)
}
{
for (i=1; i<length($1); i+=2) {
t[(i-1)/2+1] = substr($1,i,2)
}
$1 = t[4] "H" t[3] mths[t[2]+0] "19" t[1]
print
}
.
$ awk -f tst.awk file
03H06Aug1983 55 72
07H06Sep1987 83 87
12H06Jan1988 82 44
03H06Aug1989 55 72
07H06Nov1900 83 87
12H06Mar1901 82 44
18H06Dec1905 84 44
{
for (i=0; i<4; i++) {
t[i] = substr($1, 2*i+1, 2);
}
d = t[0]"-"t[1]"-"t[2]" "t[3]":00";
cmd = "date --date='"d"' +%HH%d%b%Y";
cmd | getline $1;
close(cmd);
print
}
Save the above as script.awk and run as awk -f script.awk ifile.txt. It uses the date command for date format conversion.
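The answers above differ mainly in how they pick the century. The Perl sketch below makes that choice an explicit parameter and does not run date once per line. The sub name `convert_stamp` and the `$pivot` argument are assumptions for illustration and are not taken from any answer: two-digit years above the pivot are read as 19xx, the rest as 20xx.

```perl
use strict;
use warnings;

my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Convert a YYMMDDHH stamp to the form HH"H"DDMonYYYY.
# $pivot is a hypothetical cut-off for choosing the century.
sub convert_stamp {
    my ($stamp, $pivot) = @_;
    my ($yy, $mm, $dd, $hh) = $stamp =~ /^(\d\d)(\d\d)(\d\d)(\d\d)$/
        or return;                        # not an 8-digit stamp
    return if $mm < 1 || $mm > 12;        # no such month
    my $century = $yy > $pivot ? 19 : 20;
    return sprintf "%sH%s%s%d%s", $hh, $dd, $MONTHS[$mm - 1], $century, $yy;
}

print convert_stamp("83080603", 20), "\n";   # prints 03H06Aug1983
print convert_stamp("05120618", 20), "\n";   # prints 18H06Dec2005
```

The day and hour are copied through as text, so they are not validated here.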

Modifying a script to include a count of each time a name appears in a table

The script below reads FILE1 and prints a row of FILE2 only if column 10 of that row matches a name in the first column of FILE1. This part works great. What I am having a bit of difficulty with is adding a count to the output: for every name, I want to know how many of the printed rows have it in column 10. My real list has 12 names; the example below uses four.
FILE1:
name1 15
name2 15
name2 30
name5 15
name4 10
name2 5
name2 5
FILE2:
23 15 5.4 1.3 5 55 128 21799 + 32 name2 1 77 0 1
23 20 5.4 1.3 5 55 128 7998 + 18 name4 1 77 0 1
23 20 5.4 1.3 6 55 128 9984 + 13 name4 1 77 1 1
23 20 5.4 1.3 7 55 128 7998 + 14 name5 1 77 2 1
23 20 5.4 1.3 6 55 128 994 + 14 name1 1 77 3
23 20 5.4 1.3 9 55 128 984 + 5 name7 1 77 4 1
23 20 5.4 1.3 5 55 128 99 + 5 name8 1 77 5 1
Expected Output
$VAR1 = {
'name1' => 1,
'name2' => 4,
'name4' => 1,
'name5' => 1,
};
5 55 128 21799 32 name2 77 0 1
5 55 128 7998 18 name4 77 0 1
6 55 128 9984 13 name4 77 1 1
7 55 128 7998 14 name5 77 2 1
6 55 128 994 14 name1 77 3 1
name1 1
name2 1
name4 2
name5 1
You can test the script it works. The part I am having difficulty with is inserting the count of each name based on the output. The print \%x is a way of checking if my original list was truly used as I am working with a much larger set of data. If someone could point me the right direction on how to modify my script without changing it drastically that would be great. I feel like this script fulfills the majority of my needs even if it is not the most efficient way of doing it.
use strict;
use Data::Dumper;
my %x;
open(FILE1, $ARGV[0]) or die "Cannot open the file: $!";
while (my $line = <FILE1>) {
    my @array = split(" ", $line);
    $x{$array[0]}++;
}
close FILE1;
print Dumper( \%x );
my %count;
open(FILE2, $ARGV[1]) or die "Cannot open the file: $!";
while (my $line = <FILE2>) {
    my @name = split(" ", $line);
    my $y = $name[9];
    if ( $x{ $y } ) {
        print join(" ", @name[4,5,6,7,9,11,12,13]), "\n";
        $count{$y}++;
    }
}
print Dumper (\%count);
close FILE2;
exit;
Script now counts. Just need to debug.
The "minimal" change would be to set the elements of %x to 0 in the FILE1 loop, then check for exists $x{$y} in the FILE2 loop and do ++$x{$y} inside the condition body. At the end, %x holds the number of occurrences of each name.
The usual way (as mentioned in the comments of the question) would be to declare an additional %count and perform the same ++$count{$y} inside the if block as in the above method.
The first has the advantage and disadvantage (depending on your needs) of reporting the count even when the name has zero found occurrences.
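The first variant can be sketched as follows. The sub name `count_names` and the sample rows are made up for illustration; the rows are simplified so that the name sits in the 10th field, as the question describes.

```perl
use strict;
use warnings;

# Count how often each wanted name appears in the 10th field of the rows.
# Every wanted name starts at 0, so names never found are still reported.
sub count_names {
    my ($wanted, $rows) = @_;
    my %count = map { $_ => 0 } @$wanted;
    for my $row (@$rows) {
        my @fields = split ' ', $row;
        my $name = $fields[9];
        # Skip short rows and names that are not in the wanted list.
        next unless defined $name && exists $count{$name};
        ++$count{$name};
    }
    return \%count;
}

# Hypothetical rows with the name in the 10th field.
my $count = count_names(
    [qw(name1 name2 name4)],
    [
        "23 15 5.4 1.3 5 55 128 21799 32 name2 1 77 0 1",
        "23 20 5.4 1.3 5 55 128 7998 18 name4 1 77 0 1",
        "23 20 5.4 1.3 6 55 128 9984 13 name4 1 77 1 1",
        "23 20 5.4 1.3 9 55 128 984 5 name7 1 77 4 1",
    ],
);
printf "%s %d\n", $_, $count->{$_} for sort keys %$count;
```

Here name1 is reported with a count of 0, and name7 is ignored because it is not in the list.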

Represent first 8 characters of the string as hex numbers separated by spaces

I need to represent first 8 characters of the string as hex numbers separated by spaces.
For example:
"This is the test!" converts to "54 68 69 73 20 69 73 20"
I use the following code to do it. Is there a better (simpler) way to do it in Perl?
my $hex = unpack( "H16", $string );
my $hexOut = "";
for ( my $i = 0 ; $i < length($hex) ; $i += 2 )
{
$hexOut .= substr( $hex, $i, 2 ) . " ";
}
$hexOut = substr( $hexOut, 0, -1 );
I can't resist submitting a Perl one-liner!
my $string = "This is a test";
print(join(' ', unpack("(A2)*", unpack( "H16", $string ))) . "\n");
If you split on an empty pattern, you get a list of bytes. Then just print them in hexadecimal.
use strict;
use warnings;
my $string = shift // 'This is the test!';
my @bytes = split //, $string;
for my $i (0..7) {
    printf "%02X ", ord $bytes[$i];
}
print "\n";
But if you really want characters rather than bytes, then unpack.
my @chars = unpack "C0U*", $string;
for my $i (0..7) {
    printf "%02X ", $chars[$i];
}
print "\n";
For the test string, it's the same
$ ./leon01.pl
54 68 69 73 20 69 73 20
54 68 69 73 20 69 73 20
but in general, it's not
$ ./leon01.pl 'A Møøse once bit my sister.'
41 20 4D C3 B8 C3 B8 73
41 20 4D F8 F8 73 65 20
$ ./leon01.pl '① ② ③ ④ ⑤ ⑥ ⑦ ⑧ ⑨ ⑩'
E2 91 A0 20 E2 91 A1 20
2460 20 2461 20 2462 20 2463 20
my $string = "This is the test!";
my $hex_string = sprintf("%vx", substr($string, 0, 8));
$hex_string =~ y/./ /;
print $hex_string, "\n";
(The v flag is a Perl-specific extension to printf formats, introduced in Perl 5.6.)
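Note that %vx does not zero-pad, so a character below 0x10, such as a tab, comes out as a single digit. A variant that pads every value to two digits can be sketched with ord and join; the sub name `hex_prefix` is made up for illustration.

```perl
use strict;
use warnings;

# Hex-dump the first $n characters of $string, two digits each,
# separated by single spaces.
sub hex_prefix {
    my ($string, $n) = @_;
    return join ' ',
        map { sprintf '%02x', ord($_) }
        split //, substr($string, 0, $n);
}

print hex_prefix("This is the test!", 8), "\n";   # prints 54 68 69 73 20 69 73 20
```

For strings shorter than $n, substr simply returns what is there, so no padding or warnings are produced.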
I'll let you decide if this is better or not. Just another way to do it. ;-)
#! /usr/bin/perl -w
$string = "This is the test!";
$strLength = length($string);
@bytes = unpack(A2 x $strLength,unpack("H16",$string));
print "@bytes\n";
# Also could change it back to a string w/spaces:
$pretty = join(" ",@bytes);
print $pretty;

how to parse a table using perl

Name Mark1 Mark2 Mark3
Student 1 41 51 61
Student 2 42 52 62
Student 3 43 53 63
Student 4 44 54 64
Student 5 45 55 65
When I give a Name as input, I need to output the three Mark columns. How can I do this?
Assuming this is an array called @arr, where each entry is a line, and assuming the number of the student you're looking for is in $num, you can use:
foreach (@arr) {
    if (/^Student \b$num\b\s+(\d.*\d)/) {
        print "$1\n";
    }
}
This iterates over all the entries in the array. It looks for lines that:
begin with "Student"
are followed by the exact number $num (the \bs around it specify word boundaries, so this can't be part of another number)
are followed by some whitespace
have a pattern beginning and ending with a number that is as long as possible.
If so, the pattern beginning and ending with a number is captured and printed. In this case, it corresponds exactly to Mark1, Mark2 & Mark3.
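The same matching rules can be put into a function that returns the marks as a list. This is a sketch: the sub name `marks_for` and the sample lines are assumptions for illustration. It relies on the whitespace required after the number, instead of \b, to stop 1 from matching 12.

```perl
use strict;
use warnings;

# Return the marks for student number $num, or an empty list if not found.
sub marks_for {
    my ($num, @lines) = @_;
    for my $line (@lines) {
        # \Q...\E quotes $num so that it is matched literally.
        if ($line =~ /^Student\s+\Q$num\E\s+(\d.*\d)/) {
            return split ' ', $1;
        }
    }
    return;
}

my @table = (
    "Name Mark1 Mark2 Mark3",
    "Student 1 41 51 61",
    "Student 2 42 52 62",
);
print join(' ', marks_for(2, @table)), "\n";   # prints 42 52 62
```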
#!/usr/bin/perl
use warnings;
use strict;
my %hash = ();
print "Student No:"; #Eg:Student 1
chomp ( my $input = <>);
while (<DATA>) {
    next if /^Name/;
    chomp;
    my ($student, $no, @marks) = split;
    $hash{ "$student " . "$no" } = \@marks;
}
print join " ", @{$hash{ "$input" }};
__DATA__
Name Mark1 Mark2 Mark3
Student 1 41 51 61
Student 2 42 52 62
Student 3 43 53 63
Student 4 44 54 64
Student 5 45 55 65