I have an input file as below:
#
volume stats
start_time 1
length 2
--------
ID
0x00a,1,2,3,4
0x00b,11,12,13,14
0x00c,21,22,23,24

volume stats
start_time 2
length 2
--------
ID
0x00a,31,32,33,34
0x00b,41,42,43,44
0x00c,51,52,53,54

volume stats
start_time 3
length 2
--------
ID
0x00a,61,62,63,64
0x00b,71,72,73,74
0x00c,81,82,83,84
#
I need output in the format below:
1 33 36 39 42
2 123 126 129 132
3 213 216 219 222
#
Below is my code:
#!/usr/bin/perl
use strict;
use warnings;
#use File::Find;
# Define file names and its location
my $input = $ARGV[0];
# Grab the vols stats for different intervals
open (INFILE,"$input") or die "Could not open sample.txt: $!";
my $date_time;
my $length;
my $col_1;
my $col_2;
my $col_3;
my $col_4;
foreach my $line (<INFILE>)
{
if ($line =~ m/start/)
{
my @date_fields = split(/ /,$line);
$date_time = $date_fields[1];
}
if ($line =~ m/length/i)
{
my @length_fields = split(/ /,$line);
$length = $length_fields[1];
}
if ($line =~ m/0[xX][0-9a-fA-F]+/)
{
my @volume_fields = split(/,/,$line);
$col_1 += $volume_fields[1];
$col_2 += $volume_fields[2];
$col_3 += $volume_fields[3];
$col_4 += $volume_fields[4];
#print "$col_1\n";
}
if ($line =~ /^$/)
{
print "$date_time $col_1 $col_2 $col_3 $col_4\n";
$col_1=0;$col_2=0;$col_3=0;$col_4=0;
}
}
close (INFILE);
#
My code's result is:
1
33 36 39 42
2
123 126 129 132
#
Basically, for each time interval, it should sum each column over all the lines and display the column totals against that time interval.
$/ is your friend here. Try setting it to '' to enable paragraph mode (separating your data by blank lines).
#!/usr/bin/env perl
use strict;
use warnings;
local $/ = '';
while ( <> ) {
my ( $start ) = m/start_time\s+(\d+)/;
my ( $length ) = m/length\s+(\d+)/;
my @row_sum;
for ( m/(0x.*)/g ) {
my ( $key, @values ) = split /,/;
for my $index ( 0..$#values ) {
$row_sum[$index] += $values[$index];
}
}
print join ( "\t", $start, @row_sum ), "\n";
}
Output:
1 33 36 39 42
2 123 126 129 132
3 213 216 219 222
NB: the output is tab-separated. You can use printf or sprintf if you need more control over the formatting.
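As a rough illustration of the sprintf option mentioned above, here is a minimal sketch. The helper name format_row and the column widths (4 and 6) are assumptions chosen for the example, not anything from the original script.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: format one output row with fixed-width columns.
sub format_row {
    my ( $start, @sums ) = @_;

    # %-4s left-justifies the start time in 4 characters; every sum is
    # right-justified in 6 characters. The format is repeated once per sum.
    my $format = '%-4s' . ( '%6d' x @sums );
    return sprintf( $format, $start, @sums );
}

my $row = format_row( 1, 33, 36, 39, 42 );
print "$row\n";
```

Inside the paragraph-mode loop, the final print could then use such a helper in place of join.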
I would also suggest that instead of:
my $input = $ARGV[0];
open (my $input_fh, '<', $input) or die "Could not open $input: $!";
You would be better off with:
while ( <> ) {
Because <> is the magic filehandle in Perl: it opens the files specified on the command line and reads them one at a time, and if there are none, it reads STDIN. This is just like how grep/sed/awk do it.
So you can still run this with scriptname.pl sample.txt or you can do curl http://somewebserver/sample.txt | scriptname.pl or scriptname.pl sample.txt anothersample.txt moresample.txt
Also - if you want to open the file yourself, you're better off using lexical vars and 3 arg open:
open ( my $input_fh, '<', $ARGV[0] ) or die $!;
And you really shouldn't ever be using 'numbered' variables like $col_1 etc. If there are numbers in the names, then an array is almost always better.
Basically, a block begins with start_time and ends with a line of whitespace. If the end of a block is always guaranteed to be an empty line instead, you can change the test below.
It helps to use arrays instead of variables with integer suffixes.
When you hit the start of a new block, record the start_time value. When you hit a stat line, update column sums, and when you hit a line of whitespace, print the column sums, and clear them.
This way, you keep your program's memory footprint proportional to the longest line of input as opposed to the largest block of input. In this case, there isn't a huge difference, but, in real life, there can be. Your original program was reading the entire file into memory as a list of lines, which would really cause your program's memory footprint to balloon with large inputs.
#!/usr/bin/env perl
use strict;
use warnings;
my $start_time;
my @cols;
while (my $line = <DATA>) {
if ( $line =~ /^start_time \s+ ([0-9]+)/x) {
$start_time = $1;
}
elsif ( $line =~ /^0x/ ) {
my ($id, @vals) = split /,/, $line;
for my $i (0 .. $#vals) {
$cols[ $i ] += $vals[ $i ];
}
}
elsif ( !($line =~ /\S/) ) {
# guard against the possibility of
# multiple blank/whitespace lines between records
if ( @cols ) {
print join("\t", $start_time, @cols), "\n";
@cols = ();
}
}
}
# in case there is no blank/whitespace line after last record
if ( @cols ) {
print join("\t", $start_time, @cols), "\n";
}
__DATA__
volume stats
start_time 1
length 2
--------
ID
0x00a,1,2,3,4
0x00b,11,12,13,14
0x00c,21,22,23,24

volume stats
start_time 2
length 2
--------
ID
0x00a,31,32,33,34
0x00b,41,42,43,44
0x00c,51,52,53,54

volume stats
start_time 3
length 2
--------
ID
0x00a,61,62,63,64
0x00b,71,72,73,74
0x00c,81,82,83,84
Output:
1 33 36 39 42
2 123 126 129 132
3 213 216 219 222
When I run your code, I get warnings:
Use of uninitialized value $date_time in concatenation (.) or string
I fixed it by using \s+ instead of / /.
I also added a print after your loop in case the file does not end with a blank line.
Here is minimally-changed code to produce your desired output:
use strict;
use warnings;
# Define file names and its location
my $input = $ARGV[0];
# Grab the vols stats for different intervals
open (INFILE,"$input") or die "Could not open sample.txt: $!";
my $date_time;
my $length;
my $col_1;
my $col_2;
my $col_3;
my $col_4;
foreach my $line (<INFILE>)
{
if ($line =~ m/start/)
{
my @date_fields = split(/\s+/,$line);
$date_time = $date_fields[1];
}
if ($line =~ m/length/i)
{
my @length_fields = split(/\s+/,$line);
$length = $length_fields[1];
}
if ($line =~ m/0[xX][0-9a-fA-F]+/)
{
my @volume_fields = split(/,/,$line);
$col_1 += $volume_fields[1];
$col_2 += $volume_fields[2];
$col_3 += $volume_fields[3];
$col_4 += $volume_fields[4];
}
if ($line =~ /^$/)
{
print "$date_time $col_1 $col_2 $col_3 $col_4\n";
$col_1=0;$col_2=0;$col_3=0;$col_4=0;
}
}
print "$date_time $col_1 $col_2 $col_3 $col_4\n";
close (INFILE);
__END__
1 33 36 39 42
2 123 126 129 132
3 213 216 219 222
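The difference between splitting on / / and on \s+ can be seen in isolation with a small sketch. It assumes a line shaped like the sample (a name, a single space, a value, then a newline); the variable names are illustrative only.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $line = "start_time 1\n";

# Splitting on one literal space leaves the newline attached to the last field.
my @by_space = split / /, $line;

# Splitting on runs of whitespace treats the newline as a separator as well,
# and split discards the empty trailing field that this produces.
my @by_ws = split /\s+/, $line;

print "split / /   gives [$by_space[1]]\n";
print "split /\\s+/ gives [$by_ws[1]]\n";
```

With / /, the value still carries its newline, which would explain why the start time and the sums ended up on separate lines in the original output.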
I have a file that I want to read and, for each 'word' found, delete the line containing 'word' together with the next 2 lines.
The structure of the file is something like this:
1
2
3
word
321
3213
412
word
132
1231
This is what I have so far:
open FILE, "$localDir\\x.txt" or die $!;
@fileLines = <FILE>;
close FILE;
$output = 'y.txt';
open my $outfile, '>', $output or die "Can't write to $output: $!";
for ($i = 0; $i < scalar(@fileLines); $i++) {
next if ($fileLines[$i] =~ /'word/);
print $outfile $_ ;
}
thanks
I'd do it something like this:
#!/usr/bin/env perl
use strict;
use warnings;
#iterate one line at a time.
while ( <DATA> ) {
#if we hit the delimiter, read and discard two more lines.
if ( m/word/ ) { <DATA>; <DATA> ; }
#otherwise print it.
else { print; };
}
__DATA__
1
2
3
word
321
3213
412
word
132
1231
Which gives:
1
2
3
412
The excellent Tie::File module often seems to be forgotten. It is ideal for this sort of thing:
use strict;
use warnings;
use File::Copy qw/ copy /;
use Tie::File;
my $local_dir = '.';
copy "$local_dir/x.txt", 'y.txt';
tie my @file, 'Tie::File', 'y.txt';
for ( my $i = 0; $i < @file; ) {
if ( $file[$i] eq 'word' ) {
splice @file, $i, 3;
}
else {
++$i;
}
}
output
1
2
3
412
I am trying to transpose the data contained in a file. The data is as follows:
1 2 3 4 5
2 3 4 5 6
4 5 6 7 9
4 3 7 6 9
The result I am getting is incorrect: the last column is not transposed properly. I cannot find the error in the code that causes this. Any solution?
Code:
#!/usr/bin/perl
use strict;
use warnings;
my @dependent; # matrix of dependent variable
# Reading the data from text file to the matrix
open( DATA, "<example.txt" ) or die "Couldn't open file , $!"; #depenedent
# Storing data into the array in matrix form
while ( my $linedata = <DATA> ) {
push @dependent, [ split '\t', $linedata ];
}
my $m = @dependent;
#print "$m\n";
my $n = @{ $dependent[1] };
#print $n;
#print "Matrix of dependent variables Y \n";
for ( my $i = 0; $i < $m; $i++ ) {
for ( my $j = 0; $j < $n; $j++ ) {
#print $dependent[$i][$j]," ";
}
#print "\n";
}
my #transpose;
for ( my $i = 0; $i < $n; $i++ ) {
for ( my $j = 0; $j < $m; $j++ ) {
$transpose[$i][$j] = $dependent[$j][$i];
}
}
for ( my $i = 0; $i < $n; $i++ ) {
for ( my $j = 0; $j < $m; $j++ ) {
print $transpose[$i][$j], " ";
}
print "\n";
}
chomp your data when you read it, before you split it; your strange output is caused by the last element of each row of the input still having a newline attached.
Just as a side note, DATA isn't a very good name to pick for a filehandle; perl already defines a special builtin filehandle named DATA for reading data that's embedded in a script or a module, so using that name for yourself can lead to confusion :)
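The chomp advice above can be sketched as follows. This is only an illustration: the two sample rows are hard-coded instead of being read from example.txt, tab-separated input is assumed as in the question, and transpose is a hypothetical helper name.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: transpose a list of array references.
sub transpose {
    my @rows = @_;
    my @transposed;
    for my $i ( 0 .. $#rows ) {
        for my $j ( 0 .. $#{ $rows[$i] } ) {
            $transposed[$j][$i] = $rows[$i][$j];
        }
    }
    return @transposed;
}

# Stand-in for lines read from the input file (tab-separated, newline-terminated).
my @lines = ( "1\t2\t3\n", "4\t5\t6\n" );

my @matrix;
for my $line (@lines) {
    chomp $line;    # strip the newline before splitting
    push @matrix, [ split /\t/, $line ];
}

print join( "\t", @$_ ), "\n" for transpose(@matrix);
```

Without the chomp, the last field of each row would still end in a newline, which is what breaks the final row of the transposed output.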
The input I am handling is as follows.
Q9NRG9 15
Q9NRG9 160
Q9NRG9 56
Q9NRG9 89
Q16613 26
Q16613 63
Q16613 102
O95477 19
O95477 91
O95477 78
O95477 86
O95477 16
O95477 203
O95477 66
P78363 18
P78363 159
P78363 88
I want the output as:
Q9NRG9 15,160,56,89
Q16613 26,63,102
O95477 78,86,16,203,66
I tried with a Perl program, but I couldn't get the output I want.
Using perl from the command line:
perl -lane '
push @{ $h{$F[0]} }, $F[1]
}{
$" = ",";
print "$_ @{ $h{$_} }" for keys %h
' file
O95477 19,91,78,86,16,203,66
Q9NRG9 15,160,56,89
P78363 18,159,88
Q16613 26,63,102
To maintain the order, you can do:
perl -lane '
$k{$F[0]}++ or push @r, $F[0];
push @{ $h{$F[0]} }, $F[1]
}{
$" = ",";
print "$_ @{ $h{$_} }" for @r
' file
Try this:
open (FILE, "text.txt") or die "cannot open file".$!;
my %data;
while(<FILE>){
chomp($_);
my ($key, $value) = split(/\s+/,$_);
push(@{$data{$key}}, $value);
}
foreach (keys %data){
print $_." ".join(",",@{$data{$_}})."\n";
}
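Note that keys returns the hash keys in no particular order, so the script above may print the groups in a different order from the input. A possible order-preserving variant is sketched below; group_lines is a hypothetical helper, and the sample lines are hard-coded rather than read from text.txt.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: group values by key, keeping keys in first-seen order.
sub group_lines {
    my @lines = @_;
    my ( %data, @order );
    for my $line (@lines) {
        my ( $key, $value ) = split ' ', $line;
        next unless defined $value;    # skip blank or malformed lines
        push @order, $key unless exists $data{$key};
        push @{ $data{$key} }, $value;
    }
    my @out;
    push @out, "$_ " . join( ',', @{ $data{$_} } ) for @order;
    return @out;
}

print "$_\n"
    for group_lines( "Q9NRG9 15\n", "Q9NRG9 160\n", "Q16613 26\n", "Q16613 63\n" );
```

The separate @order array plays the same role as @r in the one-liner: it records each key the first time it is seen.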
in blah.txt:
/a/b/c-test
in blah.pl
my @dirs;
$ws = '/a/b/c-test/blah/blah';    # <--- trying to match this
sub blah{
    my $err;
    open(my $fh, "<", "blah.txt") or $err = "catn do it\n";
    if ($err) {
        print $err;
        return;
    } else {
        while(<$fh>){
            chomp;
            push @dirs, $_;
        }
    }
    close $fh;
    print "successful\n";
}


blah();

foreach (@dirs) {
    print "$_\n"; #/a/b/c-test
    if ($_ =~ /$ws/ ) {    # <--- didnt match it
        print "GOT IT!\n";
    } else {
        print "didnt get it\n";
    }
}
perl blah.pl
successful
/a/b/c-test
didnt get it
I am not quite sure why it is not matching.
Anyone know?
Consider,
if ($ws =~ /$_/ ) {
instead of,
if ($_ =~ /$ws/ ) {
as /a/b/c-test/blah/blah contains the string /a/b/c-test, not the other way around.
As side notes:
use at least strict and warnings
read and process file in while() loop instead of filling array first
if you must fill array, use my @dirs = <$fh>; chomp(@dirs);
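Putting the corrected match direction and the side notes together, one possible sketch looks like this. The helper path_starts_with, the anchoring at the start of the path and the use of \Q...\E are assumptions added for illustration; the directory list is hard-coded in place of reading blah.txt.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: true if $dir occurs at the start of $path.
sub path_starts_with {
    my ( $path, $dir ) = @_;

    # \Q...\E quotes regex metacharacters so $dir is matched literally.
    return $path =~ /^\Q$dir\E/ ? 1 : 0;
}

my $ws   = '/a/b/c-test/blah/blah';
my @dirs = ("/a/b/c-test\n");    # stands in for the lines of blah.txt
chomp @dirs;

for my $dir (@dirs) {
    my $verdict = path_starts_with( $ws, $dir ) ? 'GOT IT!' : 'didnt get it';
    print "$verdict\n";
}
```

Quoting the directory matters once a path contains characters such as . or +, which would otherwise be interpreted as regex syntax.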