I have a file:
ifile.txt
83080603 55 72
87090607 83 87
88010612 82 44
89080603 55 72
00110607 83 87
01030612 82 44
05120618 84 44
The 1st column shows the date and time with the format YYMMDDHH.
I would like to print the first column as hrHDDMMMYYYY
ofile.txt
03H06Aug1983 55 72
07H06Sep1987 83 87
12H06Jan1988 82 44
and so on
I am not able to convert the numeric month to a text month or the two-digit year to a four-digit year. My script is
awk '{printf "%s%5s%5s\n",
substr($0,7,2)"H"substr($0,5,2)substr($0,3,2)substr($0,1,2), $2, $3}' ifile.txt
EDIT: Adding a solution, per the OP's comment, that chooses the century based on the current year.
awk -v curr_year=$(date +%y) '
BEGIN{
num=split("Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec",arr,",")
for(j=1;j<=num;j++){
key=sprintf("%02d",j)
months[key]=arr[j]
}
}
{
year=substr($1,1,2)>=00 && substr($1,1,2)<=curr_year?"20":"19"
$1=substr($1,length($1)-1)"H"substr($1,length($1)-3,2) months[substr($1,3,2)] year substr($1,1,2)
print
year=""
}
' Input_file
Could you please try the following. It was written on mobile and successfully tested at https://ideone.com/4zYMu7. It assumes that the first two characters of the first field in your Input_file denote the year and that every year belongs to the 1900s, so it always prints 19 as the century; if there is other logic for determining the exact year, please mention it.
awk '
BEGIN{
num=split("Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec",arr,",")
for(j=1;j<=num;j++){
key=sprintf("%02d",j)
months[key]=arr[j]
}
}
{
$1=substr($1,length($1)-1)"H"substr($1,length($1)-3,2) months[substr($1,3,2)] "19" substr($1,1,2)
print
}
' Input_file
With GNU awk:
awk -v OFS=" " '
function totime(t, c, y, m, d, h) {
y = substr(t, 1 ,2)
m = substr(t, 3, 2)
d = substr(t, 5, 2)
h = substr(t, 7, 2)
c = y >= 69 ? 19 : 20 # GNU date treats years 69-99 as 19xx else 20xx
return mktime(c y " " m " " d " " h " 0 0")
}
function transformTime(t) {
return strftime("%HH%d%b%Y", totime(t))
}
{
$1 = transformTime($1)
print
}
' ifile.txt
03H06Aug1983 55 72
07H06Sep1987 83 87
12H06Jan1988 82 44
03H06Aug1989 55 72
07H06Nov2000 83 87
12H06Mar2001 82 44
18H06Dec2005 84 44
Or, Perl
perl -MTime::Piece -lane '
$F[0] = Time::Piece->strptime($F[0], "%y%m%d%H")->strftime("%HH%d%b%Y");
print join(" ", @F);
' ifile.txt
Using any awk in any shell on every UNIX box:
$ cat tst.awk
BEGIN {
split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec",mths)
}
{
for (i=1; i<length($1); i+=2) {
t[(i-1)/2+1] = substr($1,i,2)
}
$1 = t[4] "H" t[3] mths[t[2]+0] "19" t[1]
print
}
.
$ awk -f tst.awk file
03H06Aug1983 55 72
07H06Sep1987 83 87
12H06Jan1988 82 44
03H06Aug1989 55 72
07H06Nov1900 83 87
12H06Mar1901 82 44
18H06Dec1905 84 44
{
for (i=0; i<4; i++) {
t[i] = substr($1, 2*i+1, 2);
}
d = t[0]"-"t[1]"-"t[2]" "t[3]":00";
cmd = "date --date='"d"' +%HH%d%b%Y";
cmd | getline $1;
close(cmd);
print
}
Save the above as script.awk and run it as awk -f script.awk ifile.txt. It calls the date command once per input line to do the conversion; the --date option used here is a GNU date feature.
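For readers who want to cross-check the expected output outside awk, here is a minimal Python sketch of the same conversion. It relies on the two-digit-year pivot that Python's %y applies (69-99 become 19xx, 00-68 become 20xx), so it agrees with the GNU awk answer rather than with the variants that hard-code "19". The names convert_stamp and convert_line are made up for illustration.

```python
from datetime import datetime

# English month abbreviations, listed explicitly so the result does not
# depend on the current locale (strftime("%b") would).
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def convert_stamp(stamp):
    """Turn a YYMMDDHH string into the hrHDDMMMYYYY form."""
    dt = datetime.strptime(stamp, "%y%m%d%H")
    return f"{dt.hour:02d}H{dt.day:02d}{MONTHS[dt.month - 1]}{dt.year}"


def convert_line(line):
    """Rewrite the first whitespace-separated field and keep the rest."""
    first, *rest = line.split()
    return " ".join([convert_stamp(first), *rest])


if __name__ == "__main__":
    for sample in ("83080603 55 72", "00110607 83 87"):
        print(convert_line(sample))
```

If a different century rule is needed (for example "anything up to the current year is 20xx"), only convert_stamp would have to change.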
This question is similar to How can I find the missing integers in a unique and sequential list (one per line) in a unix terminal?.
The difference is that I want to know whether it is possible to specify a starting number for the range.
I have noted the following provided solutions:
awk '{for(i=p+1; i<$1; i++) print i} {p=$1}' file1
and
perl -nE 'say for $a+1 .. $_-1; $a=$_'
file1 is as below:
5
6
7
8
15
16
17
20
Running both solutions, it gives the following output:
1
2
3
4
9
10
11
12
13
14
18
19
Note that the output starts printing from 1.
The question is how to pass an arbitrary starting/minimum number and, if nothing is provided, assume 1 as the starting/minimum number. For example, starting from the least number in the list, the output would be:
9
10
11
12
13
14
18
19
Yes, sometimes you will want the starting number to be 1 but sometimes you will want the starting number as the least number from the list.
You can use your awk script, slightly modified, and pass it an initial p value with the -v option:
$ awk 'BEGIN{p=p<1?1:p} {for(i=p; i<$1; i++) print i} {p=p<=$1?$1+1:p}' file1
1
2
3
4
9
10
11
12
13
14
18
19
$ awk -v p=10 'BEGIN{p=p<1?1:p} {for(i=p; i<$1; i++) print i} {p=p<=$1?$1+1:p}' file1
10
11
12
13
14
18
19
The BEGIN block initializes p to 1 if it is not specified or is set to 0 or a negative value. The loop starts at p instead of p+1, and the last block assigns $1+1 to p (instead of $1) if and only if p is less than or equal to $1.
This assumes that the default (1) is the minimum starting number you would want. If you would like to start from 0 or even from a negative number just replace BEGIN{p=p<1?1:p} by BEGIN{p=(p==""?1:p)}:
$ awk -v p=-2 'BEGIN{p=(p==""?1:p)} {for(i=p; i<$1; i++) print i} {p=p<=$1?$1+1:p}' file1
-2
-1
0
1
...
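The bookkeeping in that awk one-liner can be sketched in Python as follows. This is only an illustration: the function name missing_numbers is hypothetical, the input is assumed to be sorted ascending as in file1, and the default of 1 mirrors the second variant (no clamping of zero or negative starts).

```python
def missing_numbers(numbers, start=1):
    """Yield the integers >= start that are absent from a sorted input."""
    p = start                      # next integer we still expect to see
    for n in numbers:
        yield from range(p, n)     # everything that was skipped before n
        if p <= n:                 # only move forward, never backwards
            p = n + 1


nums = [5, 6, 7, 8, 15, 16, 17, 20]
print(list(missing_numbers(nums)))            # starts at the default, 1
print(list(missing_numbers(nums, start=10)))  # starts at 10
```

As in the awk version, input values below the starting point produce no output because range(p, n) is empty when n is not greater than p.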
Slight variations of those one-liners to include a start point:
awk
# Optionally include start=NN before the first filename
$ awk 'BEGIN { start= 1 }
$1 < start { next }
$1 == start { p = start }
{ for (i = p + 1; i < $1; i++) print i; p = $1}' start=5 file1
9
10
11
12
13
14
18
19
$ awk 'BEGIN { start= 1 }
$1 < start { next }
$1 == start { p = start }
{ for (i = p + 1; i < $1; i++) print i; p = $1}' file1
1
2
3
4
9
10
11
12
13
14
18
19
perl
# Optionally include -start=NN before the first file and after the --
$ perl -snE 'BEGIN { $start //= 1 }
if ($_ < $start) { next }
if ($_ == $start) { $a = $start }
say for $a+1 .. $_-1; $a=$_' -- -start=5 file1
9
10
11
12
13
14
18
19
$ perl -snE 'BEGIN { $start //= 1 }
if ($_ < $start) { next }
if ($_ == $start) { $a = $start }
say for $a+1 .. $_-1; $a=$_' -- file1
1
2
3
4
9
10
11
12
13
14
18
19
Using Raku (formerly known as Perl_6)
raku -e 'my @a=lines.map: *.Int; .put for (@a.Set (^) @a.minmax.Set).sort.map: *.key;'
Sample Input:
5
6
7
8
15
16
17
20
Sample Output:
9
10
11
12
13
14
18
19
Here's an answer coded in Raku, a member of the Perl family of programming languages. No, it doesn't address the OP's request for a user-definable starting point. Instead, the code above is a general solution that computes the input's minimum Int and counts up from there, returning any missing Ints found up to the input's maximum Int.
Really need a user-defined lower limit? Try the following code, which allows you to set a $init variable:
~$ raku -e 'my @a=lines.map: *.Int; my $init = 1; .put for (@a.Set (^) ($init..@a.max).Set).sort.map: *.key;'
1
2
3
4
9
10
11
12
13
14
18
19
For explanation and shorter code (including single-line return and/or return without sort), see the link below.
https://stackoverflow.com/a/72221301/7270649
https://raku.org
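The same set-based idea can be sketched in Python. Note one deliberate difference: the Raku code uses a symmetric difference, (^), whereas this sketch uses a plain set difference, so input values below the chosen lower limit are never reported. The function name missing_in_range and its init parameter are hypothetical.

```python
def missing_in_range(numbers, init=None):
    """Return the sorted integers in [low, max(numbers)] absent from the input.

    low is init when given, otherwise the smallest input value.
    """
    present = set(numbers)
    low = min(present) if init is None else init
    return sorted(set(range(low, max(present) + 1)) - present)


nums = [5, 6, 7, 8, 15, 16, 17, 20]
print(missing_in_range(nums))          # counts up from the input minimum
print(missing_in_range(nums, init=1))  # counts up from a user-defined limit
```

Unlike the streaming awk and perl one-liners, this reads the whole input into memory first, but it does not require the input to be sorted.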
Not as elegant as I hoped:
< file | mawk '
BEGIN { _= int(_)^(\
( ORS = "")<_)
} { ___[ __= $0 ] }
END {
do {
print _ in ___ \
? "" : _ "\n"
} while(++_ < __) }' \_=10
10
11
12
13
14
18
19
I asked this type of question previously but didn't provide the full code.
I am reading below file and checking the max word width present in each column and then write it to another file with proper alignment.
id0 id1 id2 batch
0 34 56 70
2 3647 58 72 566
4 39 616 75 98 78 78987 9876 7899 776
89 40 62 76
8 42 64 78
34 455 544 565
My code:
unlink "temp1.log";
use warnings;
use strict;
use feature 'say';
my $log1_file = "log1.log";
my $temp1 = "temp1.log";
open(IN1, "<$log1_file" ) or die "Could not open file $log1_file: $!";
my @col_lens;
while (my $line = <IN1>) {
my @fs = split " ", $line;
my @rows = @fs ;
@col_lens = map (length, @rows) if $.==1;
for my $col_idx (0..$#rows) {
my $col_len = length $rows[$col_idx];
if ($col_lens[$col_idx] < $col_len) {
$col_lens[$col_idx] = $col_len;
}
};
};
close IN1;
open(IN1, "<$log1_file" ) or die "Could not open file $log1_file: $!";
open(tempp1,"+>>$temp1") or die "Could not open file $temp1: $!";
while (my $line = <IN1>) {
my @fs = split " ", $line;
my @az;
for my $h (0..$#fs) {
my $len = length $fs[$h];
my $blk_len = $col_lens[$h]+1;
my $right = $blk_len - $len;
$az[$h] = (" ") . $fs[$h] . ( " " x $right );
}
say tempp1 (join "|",@az);
};
My warning
Use of uninitialized value in numeric lt (<) at new.pl line 25, <IN1> line 3.
Use of uninitialized value in numeric lt (<) at new.pl line 25, <IN1> line 4.
Use of uninitialized value in numeric lt (<) at new.pl line 25, <IN1> line 4.
Use of uninitialized value in numeric lt (<) at new.pl line 25, <IN1> line 4.
Use of uninitialized value in numeric lt (<) at new.pl line 25, <IN1> line 4.
Use of uninitialized value in numeric lt (<) at new.pl line 25, <IN1> line 4.
I am getting the correct output but don't know how to get rid of these warnings.
$col_idx can be up to the number of fields on a line, minus one. For the third line, this is more than the highest index of @col_lens, which at that point holds only the 4 elements taken from the first line. So comparing against an element that doesn't exist yet makes no sense:
if ($col_lens[$col_idx] < $col_len) {
$col_lens[$col_idx] = $col_len;
}
Replace it with
if (!defined($col_lens[$col_idx]) || $col_lens[$col_idx] < $col_len) {
$col_lens[$col_idx] = $col_len;
}
With this, there's really no point checking for $. == 1 anymore.
You're getting the uninitialized warning because, when the $col_lens[$col_idx] < $col_len condition is checked, $col_lens[$col_idx] is undef for any column that did not exist on the first line.
Solution 1:
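You can handle the undefined case first and skip the comparison with a next statement. The width is stored before skipping; otherwise the extra columns would never get a width and the second loop would warn instead.
for my $col_idx (0..$#rows) {
    my $col_len = length $rows[$col_idx];
    unless (defined $col_lens[$col_idx]) {
        $col_lens[$col_idx] = $col_len;   # first time this column appears
        next;
    }
    if ($col_lens[$col_idx] < $col_len) {
        $col_lens[$col_idx] = $col_len;
    }
}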
Solution 2: (Not advised):
You can simply ignore Use of uninitialized value.. warnings by putting this line at top of your script. This will disable uninitialized warnings in a block.
no warnings 'uninitialized';
For more information, see the documentation for the warnings pragma (perldoc warnings).
The following code demonstrates one of many possible solutions to this task:
read line by line
get length of each field
compare with stored earlier
adjust to max length
form $format string for print
print formatted data
use strict;
use warnings;
use feature 'say';
my(@data,@length,$format);
while ( <DATA> ) {
    my @e = split ' ';
    my @l = map{ length } @e;
    $length[$_] = ($length[$_] // 0) < $l[$_] ? $l[$_] : $length[$_] for 0..$#e;
    push @data,\@e;
}
$format = join ' ', map{ '%'.$_.'s' } @length;
$format .= "\n";
for my $row ( @data ) {
    printf $format, map { $row->[$_] // '' } 0..$#length;
}
__DATA__
id0 id1 id2 batch
0 34 56 70
2 3647 58 72 566
4 39 616 75 98 78 78987 9876 7899 776
89 40 62 76
8 42 64 78
34 455 544 565
Output
id0  id1 id2 batch
  0   34  56    70
  2 3647  58    72 566
  4   39 616    75  98 78 78987 9876 7899 776
 89   40  62    76
  8   42  64    78
 34  455 544   565
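The steps listed in that answer (track the longest field per column, then print with those widths) can be sketched in Python as well. This is an illustration under assumptions: the helper names column_widths and format_rows are made up, fields are right-aligned as in the Perl printf format, and short rows are simply printed without padding for the columns they lack.

```python
def column_widths(rows):
    """Return the maximum field length seen in each column of ragged rows."""
    widths = []
    for row in rows:
        for i, field in enumerate(row):
            if i == len(widths):          # first time this column appears
                widths.append(len(field))
            else:
                widths[i] = max(widths[i], len(field))
    return widths


def format_rows(rows):
    """Right-align every field to the width of its column."""
    widths = column_widths(rows)
    return [" ".join(f"{f:>{w}}" for f, w in zip(row, widths)) for row in rows]


text = """id0 id1 id2 batch
0 34 56 70
2 3647 58 72 566
4 39 616 75 98 78 78987 9876 7899 776
89 40 62 76
8 42 64 78
34 455 544 565"""

rows = [line.split() for line in text.splitlines()]
print("\n".join(format_rows(rows)))
```

Because the width list grows whenever a longer row shows up, there is no undefined entry to compare against, which is the situation that triggered the warnings in the original script.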
I have some data in hexdump form.
The left-hand side is the decimal value and the right-hand side is the corresponding hexdump bytes.
16 = 10
51 = 33
164 = A4 01
388 = 84 03
570 = BA 04
657 = 91 05
1025 = 81 08
246172 = 9C 83 0F
How can I convert any such hexdump to decimal?
In Perl, I tried the ord() function, but it doesn't work.
Update
I don't know what this is called. It looks like 7-bit data. I tried to build a formula in Excel that looks like this:
DEC = hex2dec(X) + (128^1 * hex2dec(Y-1)) + (128^2 * hex2dec(Z-1)) + ...
What you have is a variable-length encoding. The length is encoded using a form of sentinel value: each byte of the encoded number except the last has its high bit set. The remaining seven bits of each byte form the binary encoding of the number, with the 7-bit groups in little-endian order (least significant group first).
0xxxxxxx ⇒ 0xxxxxxx
1xxxxxxx 0yyyyyyy ⇒ 00yyyyyy yxxxxxxx
1xxxxxxx 1yyyyyyy 0zzzzzzz ⇒ 000zzzzz zzyyyyyy yxxxxxxx
etc
The following can be used to decode a stream:
use strict;
use warnings;
use feature qw( say );
sub extract_first_num {
$_[0] =~ s/^([\x80-\xFF]*[\x00-\x7F])//
or return;
my $encoded_num = $1;
my $num = 0;
for (reverse unpack 'C*', $encoded_num) {
$num = ( $num << 7 ) | ( $_ & 0x7F );
}
return $num;
}
my $stream_buf = "\x10\x33\xA4\x01\x84\x03\xBA\x04\x91\x05\x81\x08\x9C\x83\x0F";
while ( my ($num) = extract_first_num($stream_buf) ) {
say $num;
}
die("Bad data") if length($stream_buf);
Output:
16
51
164
388
570
657
1025
246172
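For comparison, here is a minimal Python sketch of the same decoding, working byte by byte instead of with a regex. The function name decode_stream is made up for illustration, and the error handling (raising on a truncated final number) is an assumption rather than something taken from the Perl code.

```python
def decode_stream(data):
    """Decode a byte string of little-endian 7-bit groups into integers.

    A set high bit means "more bytes follow"; a clear high bit ends a number.
    """
    numbers = []
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift   # place the 7 payload bits
        if byte & 0x80:                   # continuation bit set
            shift += 7
        else:                             # last byte of this number
            numbers.append(value)
            value, shift = 0, 0
    if shift:
        raise ValueError("stream ends in the middle of a number")
    return numbers


stream = bytes.fromhex("10 33 A4 01 84 03 BA 04 91 05 81 08 9C 83 0F")
print(decode_stream(stream))
```

Taking 9C 83 0F as a worked case: the payloads are 0x1C, 0x03 and 0x0F, giving 28 + 3*128 + 15*16384 = 246172.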
I need to represent first 8 characters of the string as hex numbers separated by spaces.
For example:
"This is the test!" converts to "54 68 69 73 20 69 73 20"
I use the following code to do it. Is there a better (simpler) way to do it in Perl?
my $hex = unpack( "H16", $string );
my $hexOut = "";
for ( my $i = 0 ; $i < length($hex) ; $i += 2 )
{
$hexOut .= substr( $hex, $i, 2 ) . " ";
}
$hexOut = substr( $hexOut, 0, -1 );
I can't resist submitting a Perl one-liner!
my $string = "This is a test";
print(join(' ', unpack("(A2)*", unpack( "H16", $string ))) . "\n");
If you split on null, you get a list of bytes. Then just print them in hexadecimal.
use strict;
use warnings;
my $string = shift // 'This is the test!';
my @bytes = split //, $string;
for my $i (0..7) {
printf "%02X ", ord $bytes[$i];
}
print "\n";
But if you really want characters rather than bytes, then unpack.
my @chars = unpack "C0U*", $string;
for my $i (0..7) {
printf "%02X ", $chars[$i];
}
print "\n";
For the test string, it's the same
$ ./leon01.pl
54 68 69 73 20 69 73 20
54 68 69 73 20 69 73 20
but in general, it's not
$ ./leon01.pl 'A Møøse once bit my sister.'
41 20 4D C3 B8 C3 B8 73
41 20 4D F8 F8 73 65 20
$ ./leon01.pl '① ② ③ ④ ⑤ ⑥ ⑦ ⑧ ⑨ ⑩'
E2 91 A0 20 E2 91 A1 20
2460 20 2461 20 2462 20 2463 20
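The bytes-versus-characters distinction shown above can be reproduced with a short Python sketch. The names hex_bytes and hex_chars are hypothetical, and the byte view assumes the string is encoded as UTF-8, as in the terminal output shown.

```python
def hex_bytes(text, count=8):
    """Hex of the first `count` bytes of the UTF-8 encoding of text."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8")[:count])


def hex_chars(text, count=8):
    """Hex code points of the first `count` characters of text."""
    return " ".join(f"{ord(c):02X}" for c in text[:count])


for sample in ("This is the test!", "A Møøse once bit my sister."):
    print(hex_bytes(sample))
    print(hex_chars(sample))
```

For pure ASCII input the two lines are identical; they diverge as soon as a character needs more than one byte in UTF-8.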
my $string = "This is the test!";
my $hex_string = sprintf("%vx", substr($string, 0, 8));
$hex_string =~ y/./ /;
print $hex_string, "\n";
(The v flag is a Perl-specific extension to printf formats, introduced in Perl 5.6.)
I'll let you decide if this is better or not. Just another way to do it. ;-)
#! /usr/bin/perl -w
$string = "This is the test!";
$strLength = length($string);
@bytes = unpack(A2 x $strLength,unpack("H16",$string));
print "@bytes\n";
# Also could change it back to a string w/spaces:
$pretty = join(" ",@bytes);
print $pretty;
I want to display a table in perl, the rows and column names for which will be of variable length. I want the columns to be neatly aligned. The problem is the row and column heading are of variable length, so the alignment shifts off for different files.
Here is the code I am using to format :
print "\n ";
foreach (keys(%senseToSenseCountHash))
{
printf "%15s",$_;
}
print "\n";
print "------------------------------------------------------------\n";
my $space = "---";
foreach my $realSense (keys(%actualSenseToWronglyDisambiguatedSense))
{
printf "%s",$realSense;
foreach (keys(%senseToSenseCountHash))
{
if(exists($actualSenseToWronglyDisambiguatedSense{$realSense}[0]{$_}))
{
printf "%15s",$actualSenseToWronglyDisambiguatedSense{$realSense}[0]{$_};
}
else
{
printf "%15s",$space;
}
}
print "\n";
}
The outputs I get are as follows (for different files that I have to test on) :
Microsoft IBM
------------------------------------------------------------
Microsoft 896 120
IBM 66 661
SERVE12 SERVE2 SERVE6 SERVE10
------------------------------------------------------------
SERVE12 319 32 19 8
SERVE2 44 159 39 25
SERVE6 22 9 102 1
SERVE10 14 16 12 494
HARD3 HARD2 HARD1
------------------------------------------------------------
HARD3 68 7 27
HARD2 6 60 90
HARD1 37 69 937
I want to make this output aligned regardless of the row and column name. Can anyone please help?
Thanks so much!
This line:
printf "%s",$realSense;
has no specific width, and is throwing off the alignment.
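One way to avoid hard-coded widths altogether is to derive them from the data. The Python sketch below illustrates that idea under assumptions: render_table is a hypothetical helper, the table is passed as a dict of row label to a dict of column to count, and "---" stands in for missing cells as in the question.

```python
def render_table(columns, rows, missing="---"):
    """Return aligned text lines for a label-by-column table.

    columns: list of column names
    rows:    dict mapping row label -> dict mapping column name -> value
    """
    label_w = max(len(label) for label in rows)           # widest row label
    cells = [str(v) for r in rows.values() for v in r.values()]
    col_w = max(len(s) for s in [*columns, missing, *cells]) + 2

    lines = [" " * label_w + "".join(f"{c:>{col_w}}" for c in columns)]
    for label, values in rows.items():
        body = "".join(f"{str(values.get(c, missing)):>{col_w}}" for c in columns)
        lines.append(f"{label:<{label_w}}" + body)
    return lines


table = render_table(
    ["Microsoft", "IBM"],
    {"Microsoft": {"Microsoft": 896, "IBM": 120},
     "IBM": {"Microsoft": 66, "IBM": 661}},
)
print("\n".join(table))
```

The key point is the same as in the answer above: the row label gets an explicit width too, so every line has the same length regardless of how long the names are.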
Found the answer, pasting it here in case any one wants to use it.
printf "%10s %-2s",'----------','|';
foreach(keys(%senseToSenseCountHash))
{
printf "%s",'----------------';
}
print "\n";
printf "%10s %-2s",' ','|';
foreach(keys(%senseToSenseCountHash))
{
printf "%-14s",$_;
}
print "\n";
printf "%10s %-2s",'----------','|';
foreach(keys(%senseToSenseCountHash))
{
printf "%s",'----------------';
}
print "\n";
foreach my $key (sort { $senseToSenseCountHash{$b} <=>
$senseToSenseCountHash{$a} } keys %senseToSenseCountHash )
{
$maxSense = $senseToSenseCountHash{$key};
last;
}
my $space = "---";
foreach my $realSense (keys(%actualSenseToWronglyDisambiguatedSense))
{
printf "%-10s %-2s",$realSense,'|';
foreach (keys(%senseToSenseCountHash))
{
if(exists($actualSenseToWronglyDisambiguatedSense{$realSense}[0]{$_}))
{
printf "%-15s",$actualSenseToWronglyDisambiguatedSense{$realSense}[0]{$_};
}
else
{
printf "%-15s",$space;
}
}
print "\n";
}
printf "%10s %-2s",'----------','|';
foreach(keys(%senseToSenseCountHash))
{
printf "%s",'----------------';
}
print "\n";
Output :
---------- | ------------------------------------------------
| HARD3 HARD2 HARD1
---------- | ------------------------------------------------
HARD3 | 68 7 27
HARD2 | 6 60 90
HARD1 | 37 69 937
---------- | ------------------------------------------------
---------- | ----------------------------------------------------------------
| SERVE12 SERVE2 SERVE6 SERVE10
---------- | ----------------------------------------------------------------
SERVE12 | 319 32 19 8
SERVE2 | 44 159 39 25
SERVE6 | 22 9 102 1
SERVE10 | 14 16 12 494
---------- | ----------------------------------------------------------------