How to extract a value from a variable within if condition - perl

I am trying to extract the timestamp portion from the variable, but somehow the substr function isnt working
This is what I have tried
while(<INFILE>){
chomp;
if(/timestamp:(.+$)/){
$ts = $1;
$ts =~ substr($ts, 10);
print $ts;
}
close(INFILE);
This is how the line is in the file
timestamp: 25JUN2019_02:55:02.234
somedata..
..
..
..
timestamp: 25JUN2019_07:00:28.718
I want the output to be
02:55:02.234
07:00:28.718
But instead the output is
25JUN2019_02:55:02.234
25JUN2019_07:00:28.718

Several issues:
You are using the bind operator =~ instead of the assignment operator =
You should always use strict and use warnings
You should ignore whitespace before your match
If there is additional data on the line, substr will return it as well. You should scope your substr to only include want you want.
Revised code:
use strict;
use warnings;
while(<DATA>) {
chomp;
if (/timestamp:\s*(.+$)/) {
my $ts = substr($1, 10, 12); # only include length of data you want
print $ts;
}
}
__DATA__
timestamp: 25JUN2019_02:55:02.234
Output:
02:55:02.234

Two problems:
=~ is the binding operator, you probably want normal assignment.
substr $ts, 10 returns the substring form position 10 to the end of $ts. To only extract 12 characters, use
$ts = substr $ts, 10, 12;
You can also extract the timestamp directly:
if(my ($ts) = /timestamp: [^_]+_(\S+)/){
print $ts, "\n";
}

Related

How to find the number of vowels in a string using Perl

sub Solution{
my $n=$_[0];
my $m=lc $_[1];
my #chars=split("",$m);
my $result=0;
my #vowels=("a","e","i","o","u");
#OUTPUT [uncomment & modify if required]
for(my $i=0;$i<$n;$i=$i+1){
for(my $j=0;$j<5;$j=$j+1){
if($chars[$i]==$vowels[$j]){
$result=$result+1;
last;
}
}
}
print $result;
}
#INPUT [uncomment & modify if required]
my $n=<STDIN>;chomp($n);
my $m=<STDIN>;chomp($m);
Solution($n,$m);
So I wrote this solution to find the number of vowels in a string. $n is the length of the string and $m is the string.
However, for the input 3 nam I always get the input as 3.
Can someone help me debug it?
== compares numbers. eq compares strings. So instead of $chars[$i]==$vowels[$j] you should write $chars[$i] eq $vowels[$j]. If you had used use warnings;, which is recommended, you'd have gotten a warning about that.
And by the way, there's no need to work with extra variables for the length. You can get the length of a string with length() and of an array for example with scalar(). Also, the last index of an array #a can be accessed with $#a. Or you can use foreach to iterate over all elements of an array.
A better solution is using a tr operator which, in scalar context, returns the number of replacements:
perl -le 'for ( #ARGV ) { $_ = lc $_; $n = tr/aeiouy//; print "$_: $n"; }' Use Perl to count how many vowels are in each string
use: 2
perl: 1
to: 1
count: 2
how: 1
many: 2
vowels: 2
are: 2
in: 1
each: 2
string: 1
I included also y, which is sometimes a vowel, see: https://simple.wikipedia.org/wiki/Vowel
Let me suggest a better approach to count letters in a text
#!/usr/bin/env perl
#
# vim: ai:ts=4:sw=4
#
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
my $debug = 0; # debug flag
my %count;
my #vowels = qw/a e i o u/;
map{
chomp;
my #chars = split '';
map{ $count{$_}++ } #chars;
} <DATA>;
say Dumper(\%count) if $debug;
foreach my $vowel (#vowels) {
say "$vowel: $count{$vowel}";
}
__DATA__
So I wrote this solution to find the number of vowels in a string. $n is the length of the string and $m is the string. However, for the input 3 nam I always get the input as 3.
Can someone help me debug it?
Output
a: 7
e: 18
i: 12
o: 12
u: 5
Your code is slightly modified form
#!/usr/bin/env perl
#
# vim: ai:ts=4:sw=4
#
use strict;
use warnings;
use feature 'say';
my $input = get_input('Please enter sentence:');
say "Counted vowels: " . solution($input);
sub get_input {
my $prompt = shift;
my $input;
say $prompt;
$input = <STDIN>;
chomp($input);
return $input;
}
sub solution{
my $str = lc shift;
my #chars=split('',$str);
my $count=0;
my #vowels=qw/a e i o u/;
map{
my $c=$_;
map{ $count++ if $c eq $_} #vowels;
} #chars;
return $count;
}

How can I expand ranges like n-m, in Perl? [duplicate]

This question already has answers here:
How to replace `4-7` into `4,5,6,7` in Bash
(4 answers)
Closed 1 year ago.
Being a C developer I am fairly new to Perl. My requirement is that I need to convert the value like
3-6,9
to:
3,4,5,6,9
Also if in case the value is
3-6,9-18
to:
3,4,5,6,9,10,11,12,13,14,15,16,17,18"
Need to split twice to be able to detect missing range-end in both 3,9-12 and 3-6,9
use warnings;
use strict;
use feature 'say';
foreach my $string (q(3-6,9), q(3-6,9-12), q(3,9-12))
{
my #ranges = split /,/, $string; #/
my #result;
for (#ranges) {
my ($beg, $end) = split /-/;
push #result, ($end ? $beg .. $end : $beg);
}
my $res = join ',', #result;
say $res;
}
prints†
3,4,5,6,9
3,4,5,6,9,10,11,12
3,9,10,11,12
This works for more ranges in your strings 1-3,8,10-12,... etc. See range operator (..)
Another way to handle a possibly missing end of range is
push #result, $beg .. $end // $beg;
where // is the defined-or operator
† The test of $end in the ternary operator is for "truth," and it fails for either of undef, '' (empty string), 0 (number zero), and "0" (string). In this problem $end should strictly be a positive integer and this is implicitly used. However, undef is expected and it is often better to be specific
push #result, (defined $end ? $beg .. $end : $beg);
On the other hand, testing for all truth-cases may catch unexpected ones as well (like '').
Note that "truth" in Perl has a few more interesting cases. Thanks to Silvar for a comment.
One way to expand the ranges:
use strict;
use warnings;
my #newvals = ();
my #retrns = qw(3-6 9 89-99);
findRanges(#retrns);
sub findRanges
{
my #vals = #_;
#newvals = map { if($_=~m/(\d+)\-(\d+)/) { ($1 .. $2); } else { $_; } } #vals;
}
print join "\n", #newvals;

replace a string of characters with the line number

I have a text file that has approximately 3,000 lines. 99% of the time I need all 3,000 lines. However, periodically I will grep out the lines I need and direct the output to another text file to use.
The only problem I have in doing so, is: Embedded in the text file is a 6 character string of numbers that indicate the line number. In order to use the file, this area needs to be correctly renumbered...(I don't need to re-sort the data, but I need to replace the current six characters with the new line number. and it must be padded with zeros! Unfortuantely the entire rows is one long row of data with no field separators!
For example, my first three rows might look something like:
20130918082020ZZ000001RANDOMDATAFOLLOWSAFTERTHISABCDEFGH
20130810112000ZZ000999MORERANDOMDATAFOLLOWSAFTERTHISABCD
20130810112000ZZ000027SILLMORERANDOMDATAFOLLOWSAFTERTHIS
The six characters at positions 17-22 (Immediately following the "ZZ"), need be renumbered based on the current row number...so the above needs to look like:
20130918082020ZZ000001RANDOMDATAFOLLOWSAFTERTHISABCDEFGH
20130810112000ZZ000002MORERANDOMDATAFOLLOWSAFTERTHISABCD
20130810112000ZZ000003SILLMORERANDOMDATAFOLLOWSAFTERTHIS
Any ideas would be greatly appreciated!
Thanks,
KSL.
Here's the solution I came up with Perl. It assumes that the numbering is always 6 digits after the ZZ sequence.
In convert.pl:
use strict;
use warnings;
my $i = 1; # or the value you want to start numbering
while (<STDIN>) {
my $replace = sprintf("%06d", $i++);
$_ =~ s/ZZ\d{6}/ZZ$replace/g;
print $_;
}
In data.dat:
20130918082020ZZ000001RANDOMDATAFOLLOWSAFTERTHISABCDEFGH
20130810112000ZZ000999MORERANDOMDATAFOLLOWSAFTERTHISABCD
20130810112000ZZ000027SILLMORERANDOMDATAFOLLOWSAFTERTHIS
To run:
cat data.dat | perl convert.pl
Output
20130918082020ZZ000001RANDOMDATAFOLLOWSAFTERTHISABCDEFGH
20130810112000ZZ000002MORERANDOMDATAFOLLOWSAFTERTHISABCD
20130810112000ZZ000003SILLMORERANDOMDATAFOLLOWSAFTERTHIS
If I would solve this, I would create a simple python script to read those lines by filtering as grep does and using a internal counter from inside the python script.
As simple hints you can read each line in a string and access them using variablename[17:22] (17:22 is the position of the string you are trying to use).
Now, there is a method in the string in python which does the replace, just replace the values by the counter you create.
I hope this helps.
To do this in awk:
awk '{print substr($0,1,16) sprintf("%06d", NR) substr($0,23)}'
or
gawk 'match($0,/^(.*ZZ)[0-9]{6}(.*)/,a) {print a[1] sprintf("%06d",NR) a[2]}'
This is exactly the type of thing where unpack is useful.
#!/usr/bin/env perl
use v5.10.0;
use strict;
use warnings;
while( my $line = <> ){
chomp $line;
my #elem = unpack 'A16 A6 A*', $line;
$elem[1] = sprintf '%06d', $.;
# $. is the line number for the last used file handle
say #elem;
}
Actually looking at the lines, it looks like there is date information stored in the first 14 characters.
Assuming that at some point you might want to parse the lines for some reason you can use the following as an example of how you could use unpack to split up the lines.
#!/usr/bin/env perl
use v5.10.0; # say()
use strict;
use warnings;
use DateTime;
my #date_elem = qw'
year month day
hour minute second
';
my #elem_names = ( #date_elem, qw'
ZZ
line_number
random_data
');
while( my $line = <> ){
chomp $line;
my %data;
#data{ #elem_names } = unpack 'A4 (A2)6 A6 A*', $line;
# choose either this:
$data{line_number} = sprintf '%06d', $.;
say #data{#elem_names};
# or this:
$data{line_number} = $.;
printf '%04d' . ('%02d'x5) . "%2s%06d%s\n", #data{ #elem_names };
# the choice will affect the contents of %data
# this just shows the contents of %data
for( #elem_names ){
printf qq'%12s: "%s"\n', $_, $data{$_};
}
# you can create a DateTime object with the date elements
my $dt = DateTime->new(
(map{ $_, $data{$_} } #date_elem),
time_zone => 'floating',
);
say $dt;
print "\n";
}
Although it would be better to use a regular expression, so that you could throw out bogus data.
use v5.14; # /a modifier
...
my $rdate = join '', map{"(\\d{$_})"} 4, (2)x5;
my $rx = qr'$rdate (ZZ) (\d{6}) (.*)'xa;
while( my $line = <> ){
chomp $line;
my %data;
unless( #data{ #elem_names } = $line =~ $rx ){
die qq'unable to parse line "$line" ($.)';
}
...
It would be better still; to use named capture groups added in 5.10.
...
my $rx = qr'
(?<year> \d{4} ) (?<month> \d{2} ) (?<day> \d{2} )
(?<hour> \d{2} ) (?<minute> \d{2} ) (?<second> \d{2} )
ZZ
(?<line_number> \d{6} )
(?<random_data> .* )
'xa;
while( my $line = <> ){
chomp $line;
unless( $line =~ $rx ){
die qq'unable to parse line "$line" ($.)';
}
my %data = %+;
# for compatibility with previous examples
$data{ZZ} = 'ZZ';
...

Cutting apart string in Perl

I have a string in Perl that is 23 digits long. I need to cut it apart into different pieces. First 2 digits in one variable, next 3 in another variable, next 4 into another variable, etc. Basically the 23 digits needs to end up as 6 separate variables (2,3,4,4,3,7) characters, in that order.
Any ideas how I can cut the string up like this?
There are lots of ways to do it, but the shortest is probably unpack:
my $string = '1' x 23;
my #values = unpack 'A2A3A4A4A3A7', $string;
If you need separate variables, you can use a list assignment:
my ($v1, $v2, $v3, $v4, $v5, $v6) = unpack 'A2A3A4A4A3A7', $string;
Expanding on Alex's method, rather than specify each start and end, use the list you gave of lengths.
#!/usr/bin/env perl
use strict;
use warnings;
my $string = "abcdefghijklmnopqrstuvw";
my $pos = 0;
my #split = map {
my $start = $pos;
my $end = $_;
$pos += $end;
substr( $string, $start, $end);
} (2,3,4,4,3,7);
print "$_\n" for #split;
This said you probably should look at unpack which is used for fixed width fields. I have no experience with it though.
You could use a regex, viz:
$string =~ /\d{2}\d{3}\d{4}\d{4}\d{3}\d{7}/
and capture each part by surrounding with brackets ().
You then find each capture in the variables $1, $2 ...
or get them all in the returned list
See perldoc perlre
You want to use perldoc substr.
$substring = substr($string, $start, $length);
I'd also use `map' on a list of [start, length] pairs to make your life easier:
$string = "123456789";
#values = map {substr($string, $_->[0], $_->[1])} ([1, 3], [4, 2] , ...);
Here's a sub that will do it, using the already discussed unpack.
sub string_slices {
my $str = shift;
return unpack( join( 'A', '', #_ ), $str );
}

How can add values in each row and column and print at the end in Perl?

Below is the sample csv file
date,type1,type2,.....
2009-07-01,n1,n2,.....
2009-07-02,n21,n22,....
and so on...
I want to add the values in each row and each column and print at the end and bottom of each line. i.e.
date,type1,type2
2009-07-01,n1,n2,.....row_total1
2009-07-02,n21,n22,....row_total2
Total,col_total1,col_total1,......total
Please suggest.
Less elegant and shorter:
$ perl -plaF, -e '$r=0;$r+=$F[$_],$c[$_]+=$F[$_]for 1..$#F;$_.=",$r";END{$c[0]="Total";print join",",#c}'
Quick and dirty, but should do the trick in basic cases. For anything more complex, use Text::CSV and an actual script.
An expanded version as it's getting a little hairy:
#! perl -plaF,
$r=0;
$r+=$F[$_], $c[$_]+=$F[$_] for 1..$#F;
$_.=",$r";
END { $c[0]="Total"; print join ",", #c }'
Here is a straightforward way which you can easily build upon depending on your requirements:
use strict;
use warnings;
use 5.010;
use List::Util qw(sum);
use List::MoreUtils qw(pairwise);
use Text::ParseWords;
our ($a, $b);
my #header = parse_csv( scalar <DATA> );
my #total = (0) x #header;
output_csv( #header, 'row_total' );
for my $line (<DATA>) {
my #cols = parse_csv( $line );
my $label = shift #cols;
push #cols, sum #cols;
output_csv( $label, #cols );
#total = pairwise { $a + $b } #total, #cols;
}
output_csv( 'Total', #total );
sub parse_csv {
chomp( my $data = shift );
quotewords ',', 0, $data;
}
sub output_csv { say join ',' => #_ }
__DATA__
date,type1,type2
2009-07-01,1,2
2009-07-02,21,22
Outputs the expected:
date,type1,type2,row_total
2009-07-01,1,2,3
2009-07-02,21,22,43
Total,22,24,46
Some things to take away from above is the use of List::Util and List::MoreUtils:
# using List::Util::sum
my $sum_of_all_values_in_list = sum #list;
# using List::MoreUtils::pairwise
my #two_arrays_added_together = pairwise { $a + $b } #array1, #array2;
Also while I've used Text::ParseWords in my example you should really look into using Text::CSV. This modules covers more bizarre CSV edge cases and also provides correct CSV composition (my output_csv() sub is pretty naive!).
/I3az/
Like JB's perlgolf candidate, except prints the end line totals and labels.
#!/usr/bin/perl -alnF,
use List::Util qw(sum);
chomp;
push #F, $. == 1 ? "total" : sum(#F[1..$#F]);
print "$_,$F[-1]";
for (my $i=1;$i<#F;$i++) {
$totals[$i] += $F[$i];
}
END {
$totals[0] = "Total";
print join(",",#totals);
};
Is this something that needs to be done for sure in a Perl script? There is no "quick and dirty" method to do this in Perl. You will need to read the file in, accumulate your totals, and write the file back out (processing input and output line by line would be the cleanest).
If this is a one-time report, or you are working with a competent user base, the data you want can most easily be produced with a spreadsheet program like Excel.
Whenever I work with CSV, I use the AnyData module. It may add a bit of overhead, but it keeps me from making mistakes ("Oh crap, that date column is quoted and has commas in it!?").
The process for you would look something like this:
use AnyData;
my #columns = qw/date type1 type2 type3/; ## Define your input columns.
my $input = adTie( 'CSV', 'input_file.csv', 'r', {col_names => join(',', #columns)} );
push #columns, 'total'; ## Add the total columns.
my $output = adTie( 'CSV', 'output_file.csv', 'o', {col_names => join(',', #columns)} );
my %totals;
while ( my $row = each %$input ) {
next if ($. == 1); ## Skip the header row. AnyData will add it to the output.
my $sum = 0;
foreach my $col (#columns[1..3]) {
$totals{$col} += $row->{$col};
$sum += $row->{$col};
}
$totals{total} += $sum;
$row->{total} = $sum;
$output->{$row->{date}} = $row;
}
$output->{Total} = \%totals;
print adDump( $output ); ## Prints a little table to see the data. Not required.
undef $input; ## Close the file.
undef $output;
Input:
date,type1,type2,type3
2009-07-01,1,2,3
2009-07-03,31,32,33
2009-07-06,61,62,63
"Dec 31, 1969",81,82,83
Output:
date,type1,type2,type3,total
2009-07-01,1,2,3,6
2009-07-03,31,32,33,96
2009-07-06,61,62,63,186
"Dec 31, 1969",81,82,83,246
Total,174,178,182,534
The following in Perl does what you want, its not elegant but it works :-)
Call the script with the inputfile as argument, results in stdout.
chop($_ = <>);
print "$_,Total\n";
while (<>) {
chop;
split(/,/);
shift(#_);
$sum = 0;
for ($n = 0; 0 < scalar(#_); $n++) {
$c = shift(#_);
$sum += $c;
$sums[$n] += $c;
}
$total += $sum;
print "$_,$sum\n";
}
print "Total";
for ($n = 0; $n <= $#sums; $n++) {
print "," . $sums[$n];
}
print ",$total\n";
Edit: fixed for 0 values.
The output is like this:
date,type1,type2,type3,Total
2009-07-01,1, 2, 3,6
2009-07-02,4, 5, 6,15
Total,5,7,9,21