Perl simple file handling of text

This program is meant to read a text file which looks like this:
Item \t\t Price
apple \t\t 20
orange \t\t 50
lime \t\t 30
I'm using the split function to split these two columns, then applying a 25% discount to all items and printing the result to a new file. My code so far does what I want, but the new text file has a '0' value under the last number in the price column. I also get two warnings if I run it with "use warnings":
Use of uninitialized value $item in multiplication * ...
Use of uninitialized value $item[0] in concatenation (.) ...
I should also report the total number of items processed, but I get five 1's instead of 5 (11111 instead of 5).
use strict;
use warnings;
my $filename = 'shop.txt';
if (-e $filename){
    open (IN, $filename);
}
else{
    die "Can't open input file for reading: $!";
}
open (OUT,">","discount.txt") or die "Can't open output file for writing: $!";
my $header = <IN>;
print OUT $header;
while (<IN>) {
    chomp;
    my @items = split(/\t\t/);
    foreach my $item ($items[1]){
        my $discount = $item * (0.75);
        print OUT "$items[0]\t\t$discount\n";
    }
}

This is too complicated: it's not clear what you are doing in the foreach loop, and you are not skipping empty lines. Keep it simple:
use warnings;
use strict;
use v5.10;
<>; # skip header
while(my $line = <>)
{
    chomp $line;
    next unless ($line);
    my ($title, $price ) = split /\s+/, $line;
    if( $title && defined $price )
    {
        $price *= 0.75;
        say "$title\t\t$price";
    }
}
and run it like:
perl script.pl <input.txt >output.txt
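If you also want the total item count from the question, one minimal extension of the answer above (a sketch, not the original answer's code) is to increment a counter inside the if block and print it to STDERR, so the redirected output file still contains only the data rows:
use warnings;
use strict;
use v5.10;
<>; # skip header
my $count = 0;
while (my $line = <>)
{
    chomp $line;
    next unless ($line);
    my ($title, $price) = split /\s+/, $line;
    if ($title && defined $price)
    {
        $price *= 0.75;
        say "$title\t\t$price";
        $count++;          # one more valid item processed
    }
}
say STDERR "$count items"; # goes to the terminal, not to output.txt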

use strict;
use warnings;
my $filename = 'shop.txt';
if (-e $filename){
    open (IN, $filename);
}
else{
    die "Can't open input file for reading: $!";
}
open (OUT,">","discount.txt") or die "Can't open output file for writing: $!";
my $header = <IN>;
my $item;
my $price;
print OUT $header;
while (<IN>) {
    chomp;
    ($item, $price) = split(/\t\t/);
    my $discount = $price*0.75;
    print OUT "$item $discount\n";
}
This should help! :)

If the total item count isn't very important to you:
$ perl -wane '$F[1] *= 0.75 if $. > 1; print join("\t", @F), "\n";' input.txt
Output:
Item Price
apple 15
orange 37.5
lime 22.5
If you really need the total item count:
$ perl -we 'while (<>) { @F = split; if ($. > 1) { $F[1] *= 0.75; $i++ } print join("\t", @F), "\n"; } print "$i items\n";' input.txt
Output:
Item Price
apple 15
orange 37.5
lime 22.5
3 items

I'd use this approach
#!/usr/bin/perl
use strict;
use warnings;
my %items;
my $filename = 'shop.txt';
my $discount = 'discount.txt';
open my $in, '<', $filename or die "Failed to open file! : $!\n";
open my $out, ">", $discount or die "Can't open output file for writing: $!";
print $out "Item\t\tPrice\n";
my $cnt = 0;
while (my $line = <$in>) {
    chomp $line;
    if (my ($item,$price) = $line =~ /(\w.+)\s+([0-9.]+)/){
        $price = $price * (0.75);
        print $out "$item\t\t$price\n";
        $items{$item} = $price;
        $cnt++;
    }
}
close($in);
close($out);
my $total = keys %items;
print "Total items - $total \n";
print "Total items - $cnt\n";
Using regex capture groups to grab the item and price (with \w.+ in case the item is two words, like apple sauce) also prevents empty lines from being printed to the file.
I also hard-coded the Item and Price header, which is probably a good idea if you are going to be using a consistent header.
Hope it helps
--- Update ---
I added two examples of a total count to my script. The first uses a hash and prints out the hash size; the second uses a counter. The hash option works except when your list has two items with the same name, in which case that hash key is overwritten by the last item found with the name. The counter is the simpler solution.
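To illustrate the difference (a made-up three-line example, not the question's data): with two items named apple, the hash ends up with a single key while the counter counts every line.
use strict;
use warnings;
my %items;
my $cnt = 0;
for my $pair ( [ 'apple', 20 ], [ 'apple', 25 ], [ 'lime', 30 ] ) {
    my ($item, $price) = @$pair;
    $items{$item} = $price;   # the second "apple" overwrites the first
    $cnt++;
}
print scalar(keys %items), "\n"; # prints 2: the hash collapsed the duplicate
print "$cnt\n";                  # prints 3: the counter saw every line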

Related

Printing value from split result Perl

Here I have an abc.txt file:
aaa,1000,kevin
bbb,2000,john
ccc,3000,jane
ddd,4000,kevin
Then I want to print out:
kevin
john
jane
My Perl script is:
open (INFILE, $ARGV[1]) or die "An input file is required as argument\n";
@store=();
while(<INFILE>)
{
    chomp();
    @data=split(/,/);
    #
    #
    #
    if (%store ne "0")
    {
        print "Printing users:\n";
        foreach $key (keys %store)
        {print $key."\n";}
    }
    print "Printing users:\n";
    foreach $key (keys %store)
    {print $key."\n";}
}
My idea is to store the values in a hash, creating a key for each value. What should I do at the ### lines?
You have declared @store but then use %store. I don't understand why you did that, but the code below will give you the desired output: first read the input file, split the data, and then remove the duplicates.
use strict;
use warnings;
my $infile = $ARGV[0];
open my $fh, "<", $infile or die "An input file is required as argument: $!";
my %store;
while(my $line = <$fh>)
{
    chomp($line);
    my @data = split /,/, $line;
    my @removeduplicate = (grep { !$store{$_}++ } @data)[2];
    foreach(@removeduplicate){
        if( defined $_ && $_ ne '' ){
            print "$_\n";
        }
    }
}
close $fh;
Output:
kevin
john
jane
Hmm, it depends what you want. Maybe this example will help you:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper; #for debug if you want
my $infile='abc.txt'; #or ARGV[0] whatever it is
my $fh;
open $fh,'<',$infile or die "problem with $infile $# $!";
my $inputline;
my %Storage;
my @Values;
while (defined($inputline=<$fh>)) {
    chomp $inputline;
    @Values = split ',', $inputline;
    if (@Values != 3) {
        warn "$inputline is badly formatted";
        next;
    }
    #warn if exists $Storage{$Values[1]}; #optional warning for detected duplicates
    $Storage{$Values[1]} = [ @Values[0,2] ]; #create hash data
    #duplicates will be removed automatically
}
close $fh;
print Dumper \%Storage; #print how perl stores it
foreach my $Key (keys %Storage) { #example loop
    print @{ $Storage{$Key} }, "\n"; #do anything
}
I hope this template will be enough for you.

How to count the number of specific characters in each line of a file?

I'm trying to count the number of 'N's in a FASTA file which is:
>Header
AGGTTGGNNNTNNGNNTNGN
>Header2
AGNNNNNNNGNNGNNGNNGN
So in the end I want the count of 'N's. Each header is a read, and I want to make a histogram, so the final output would be something like this:
# of N's # of Reads
0 300
1 240
etc...
So there are 300 sequences (reads) that have zero 'N's.
use strict;
use warnings;
my $file = shift;
my $output_file = shift;
my $line;
my $sequence;
my $length;
my $char_N_count = 0;
my #array;
my $count = 0;
if (!defined ($output_file)) {
    die "USAGE: Input FASTA file\n";
}
open (IFH, "$file") or die "Cannot open input file $!\n";
open (OFH, ">$output_file") or die "Cannot open output file $!\n";
while($line = <IFH>) {
    chomp $line;
    next if $line =~ /^>/;
    $sequence = $line;
    @array = split ('', $sequence);
    foreach my $element (@array) {
        if ($element eq 'N') {
            $char_N_count++;
        }
    }
    print "$char_N_count\n";
}
Try this. I changed a few things like using scalar file handles. There are many ways to do this in Perl, so some people will have other ideas. In this case I used an array which may have gaps in it - another option is to store results in a hash and key by the count.
Edit: Just realised I'm not using $output_file, because I have no idea what you want to do with it :) Just change the 'print' at the end to 'print $out_fh' if your intent is to write to it.
use strict;
use warnings;
my $file = shift;
my $output_file = shift;
if (!defined ($output_file)) {
    die "USAGE: $0 <input_file> <output_file>\n";
}
open (my $in_fh, '<', $file) or die "Cannot open input file '$file': $!\n";
open (my $out_fh, '>', $output_file) or die "Cannot open output file '$output_file': $!\n";
my @results = ();
while (my $line = <$in_fh>) {
    next if $line =~ /^>/;
    my $num_n = ($line =~ tr/N//);
    $results[$num_n]++;
}
print "# of N's\t# of Reads\n";
for (my $i = 0; $i < scalar(@results); $i++) {
    unless (defined($results[$i])) {
        $results[$i] = 0;
        # another option is to 'next' if you don't want to show the zero totals
    }
    print "$i\t\t$results[$i]\n";
}
close($in_fh);
close($out_fh);
exit;
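The other option mentioned above, keying a hash by the count instead of using an array, might look roughly like this (a sketch reusing the tr/N// idea; it reads the FASTA file named on the command line and prints to STDOUT):
use strict;
use warnings;
my %histogram;
while (my $line = <>) {
    next if $line =~ /^>/;          # skip FASTA headers
    my $num_n = ($line =~ tr/N//);  # count N's in the sequence line
    $histogram{$num_n}++;
}
print "# of N's\t# of Reads\n";
for my $n (sort { $a <=> $b } keys %histogram) {
    print "$n\t\t$histogram{$n}\n";
}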

How can I delete the last 10 lines of a file in perl

I am taking the total number of lines as user input and then deleting that many lines from the file.
I saw learn.perl.org/faq/perlfaq5.html#How-do-I-count-the-number-of-lines-in-a-file- and then I tried the simple logic below.
Logic:
Get the total number of lines
Subtract the number entered by the user
Print the remaining lines
Here is my code:
#!/usr/bin/perl -w
use strict;
open IN, "<", "Delete_line.txt"
or die " Can not open the file $!";
open OUT, ">", "Update_delete_line.txt"
or die "Can not write in the file $!";
my ($total_line, $line, $number, $printed_line);
print"Enter the number of line to be delete\n";
$number = <STDIN>;
while ($line = <IN>) {
    $total_line = $.;    # Total number of lines in the file
}
$printed_line = $total_line - $number;
while ($line = <IN>) {
    print OUT $line unless $. == $printed_line;
}
Well, I am getting neither an error nor any output, and I don't know why.
Can anyone give me a suggestion?
A Perl solution that's efficient for large files requires the use of File::ReadBackwards
use File::ReadBackwards qw( );
my $num_lines = 10;
my $qfn = 'file.txt';
my $pos = do {
    my $fh = File::ReadBackwards->new($qfn)
        or die $!;
    $fh->readline() for 1..$num_lines;
    $fh->tell()
};
truncate($qfn, $pos)
    or die $!;
This does not read the whole file twice (unlike the OP's method).
This does not read the whole file (unlike the Tie::File solutions).
This does not read the whole file into memory.
Yet another way is to use Tie::File
#!/usr/bin/env perl
use strict;
use warnings;
use Tie::File;
tie my @lines, 'Tie::File', 'myfile' or die "$!\n";
$#lines -= 10;
untie @lines;
This has the advantage of not loading the file into memory while acting like it does.
Here is a solution that passes through a stream and prints all but the last n lines, where n is a command-line argument:
#!/usr/bin/perl
my @cache;
my $n = shift @ARGV;
while(<>) {
    push @cache, $_;
    print shift @cache if @cache > $n;
}
or the one-liner version:
perl -ne'BEGIN{$n=shift@ARGV}push@c,$_;print shift@c if@c>$n' NUMBER
After finishing reading from IN, you have to reopen it or seek IN, 0, 0 to reset its position. You also have to set $. to zero again.
Also, the final condition should be changed to unless $. > $printed_line so you skip all the lines over the threshold.
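A sketch of what that advice looks like applied to the question's code (same file names as the question; the seek, the $. reset, and the changed condition are the only substantive changes):
#!/usr/bin/perl
use strict;
use warnings;
open my $in,  '<', 'Delete_line.txt'        or die "Can not open the file $!";
open my $out, '>', 'Update_delete_line.txt' or die "Can not write in the file $!";
print "Enter the number of lines to be deleted\n";
my $number = <STDIN>;
my $total_line;
while (my $line = <$in>) {
    $total_line = $.;          # total number of lines in the file
}
seek $in, 0, 0;                # rewind to the start of the file
$. = 0;                        # reset the line counter as well
my $printed_line = $total_line - $number;
while (my $line = <$in>) {
    print {$out} $line unless $. > $printed_line;
}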
The "more fun" answer: use Tie::File!
use strict;
use warnings;
use Tie::File;
tie my @file, 'Tie::File', 'filename' or die "$!";
$#file -= 10;
Just read the file into an array and delete the last n lines:
open my $filehandle, "<", "info.txt";
my @file = <$filehandle>;
splice(@file, -10);
print @file;
Note: This loads the entire file into memory.
You could just keep a rolling buffer of 10 lines, so that the last 10 never get printed out.
use English qw<$INPLACE_EDIT>;
{ local @ARGV = $name_of_file_to_edit;
    local $INPLACE_EDIT = '.bak';
    my @buffer;
    for ( 1..$num_lines_to_trim ) {
        push @buffer, scalar <>;   # read one line at a time into the buffer
    }
    while ( <> ) {
        print shift @buffer;
        push @buffer, $_;
    }
}
You could also do this with File::Slurp::edit_file_lines:
use File::Slurp qw(edit_file_lines);
my @buffer;
my $limit_reached = 0;
edit_file_lines {
    push @buffer, $_;
    $_ = ( $limit_reached ||= @buffer > $num_lines_to_trim ) ? shift @buffer
                                                             : '';
} $name_of_file;
my $num_lines = 10;
my $qfn = 'file.txt';
system('head', '-n', -$num_lines, '--', $qfn);
die "Error" if $?;
Easy with a C-style for loop:
#!/usr/bin/perl -w
use strict;
open(my $in, "<", "Delete_line.txt") or die "Can not open the file $!";
open(my $out, ">", "Update_delete_line.txt") or die "Can not write in the file $!";
print "Enter the number of lines to be deleted\n";
my $number = <STDIN>;
my @file = <$in>;
for (my $i = 0; $i < $#file - $number + 1; $i++) {
    print $out $file[$i];
}
close $in;
close $out;
#
# Reads a file, trims the given number of lines from the top and the bottom,
# and returns the remaining content as a string
# stolen from : http://stackoverflow.com/a/9330343/65706
# usage :
# my $StrCatFile = $objFileHandler->ReadFileReturnTrimmedStrAtTopBottom (
# $FileToCat , $NumOfRowsToRemoveAtTop , $NumOfRowsToRemoveAtBottom) ;
use Carp qw( cluck ) ;
sub ReadFileReturnTrimmedStrAtTopBottom {
    my $self = shift ;
    my $file = shift ;
    my $NumOfLinesToRemoveAtTop = shift ;
    my $NumOfLinesToRemoveAtBottom = shift ;
    my @cache ;
    my $StrTmp = '' ;
    my $StrReturn = '' ;
    my $fh ;
    open($fh, "<", "$file") or cluck ( "can't open file : $file for reading: $!" ) ;
    while (<$fh>) {
        if ($. >= $NumOfLinesToRemoveAtTop + 1) {
            $StrTmp .= $_ ;
        }
    }
    close $fh;
    my $sh ;
    open( $sh, "<", \$StrTmp) or cluck( "can't open string : $StrTmp for reading: $!" ) ;
    while(<$sh>) {
        push ( @cache, $_ ) ;
        $StrReturn .= shift @cache if @cache > $NumOfLinesToRemoveAtBottom;
    }
    close $sh ;
    return $StrReturn ;
}
#eof ReadFileReturnTrimmedStrAtTopBottom
#

Extracting specific data from text file in Perl

I am new to Perl and am trying to extract specific data from a file, which looks like this:
Print of 9 heaviest strained elements:
Element no Max strain
20004 9.6 %
20013 0.5 %
11189 0.1 %
20207 0.1 %
11157 0.1 %
11183 0.0 %
10665 0.0 %
20182 0.0 %
11160 0.0 %
==================================================
I would like to extract the element numbers only (20004, 20013 etc.) and write these to a new file. The reading of the file should end as soon as the line (=========) is reached, as there are more element numbers with the same heading later on in the file.
Hope that makes sense.
Any advice much appreciated!
I now have this code, which gives me a list of the numbers, maximum 10 in a row:
my $StrainOut = "PFP_elem"."_$loadComb"."_"."$i";
open DATAOUT, ">$StrainOut" or die "can't open $StrainOut"; # Open the file for writing.
open my $in, '<', "$POSTout" or die "Unable to open file: $!\n";
my $count = 0;
while(my $line = <$in>) {
    last if $line =~ / ={10}\s*/;
    if ($line =~ /% *$/) {
        my @columns = split " ", $line;
        $count++;
        if($count % 10 == 0) {
            print DATAOUT "$columns[1]\n";
        }
        else {
            print DATAOUT "$columns[1] ";
        }
    }
}
close (DATAOUT);
close $in;
What needs changing is the "my @columns = split..." line. At the moment it splits up the $line scalar wherever there is whitespace. As the number of digits in the element numbers can vary, this is a poor way of extracting the data. Is it possible to just read from left to right, omitting all spaces and recording numbers only until the numbers are followed by more spaces (that way the percentage value is ignored)?
#!/usr/bin/perl
use strict;
use warnings;
while (<>) {                     # read the file line by line
    if (/% *$/) {                # if the line ends in a percent sign
        my @columns = split;     # create columns
        print $columns[0], "\n"; # print the first one
    }
    last if /={10}/;             # end of processing
}
A one-liner using the flip-flop operator:
perl -ne '
if ( m/\A\s*(?i)element\s+no/ .. ($end = /\A\s*=+\s*\Z/) ) {
printf qq[$1\n] if m/\A\s*(\d+)/;
exit 0 if $end
}
' infile
Result:
20004
20013
11189
20207
11157
11183
10665
20182
11160
#!/usr/bin/perl
use strict;
use warnings;
while (my $f = shift) {
    open(F, $f) or (warn("While opening $f: $!"), next);
    my $foundstart = 0;
    while(<F>) {
        ($foundstart++, next) if /^\s*Element/;
        last if /\s*=+/;
        print $_ if $foundstart;
    }
    $foundstart = 0;
    close(F);
}
#!/usr/bin/perl
use strict;
use warnings;
open my $rh, '<', 'input.txt' or die "Unable to open file: $!\n";
open my $wh, '>', 'output.txt' or die "Unable to open file: $!\n";
while (my $line = <$rh>) {
    last if $line =~ /^ ={50}/;
    next unless $line =~ /^ {6}(\d+)/;
    print $wh "$1\n";
}
close $wh;
You could do it by running this one-liner in a command shell.
On *nix:
cat in_file.txt | perl -ne 'print "$1\n" if ( m/\s*(\d+)\s*\d+\.\d+/ )' > out_file.txt
On Windows:
type in_file.txt | perl -ne "print qq{$1\n} if ( m/\s*(\d+)\s*\d+\.\d+/ )" > out_file.txt

Perl: increasing a counter number every time the script runs

I have a script to compare two files and print out the matching lines. I want to add logic to help me identify how long these devices have been matched. Currently I have added the starting point 1, and I want to increase that number every time the script runs and finds a match.
Example.
Input files:
retiredDevice.txt
Alpha
Beta
Gamma
Delta
prodDevice.txt
first
second
third
forth
Gamma
Delta
Output file:
final_result.txt
1 Delta
1 Gamma
My objective is to add a counter stamp on each matching line to identify how long "Delta" and "Gamma" have matched. The script runs every week, so every time it runs it should add 1. When I audit final_result.txt, the result should look like:
Delta 4
Gamma 3
The result tells me Delta has matched for the last 4 weeks and Gamma for the last 3 weeks.
#! /usr/local/bin/perl
my $ndays = 1;
my $f1 = "/opt/retiredDevice.txt ";
my $f2 = "prodDevice.txt";
my $outfile = "/opt/final_result.txt";
my %results = ();
open FILE1, "$f1" or die "Could not open file: $! \n";
while (my $line = <FILE1>) {
    $results{$line} = 1;
}
close(FILE1);
open FILE2, "$f2" or die "Could not open file: $! \n";
while (my $line = <FILE2>) {
    $results{$line}++;
}
close(FILE2);
open (OUTFILE, ">$outfile") or die "Cannot open $outfile for writing \n";
foreach my $line (keys %results) {
    my $x = $ndays;
    $x++;
    print OUTFILE "$x : ", $line if $results{$line} != 1;
}
close OUTFILE;
Thanks in advance for any help!
Based on your earlier question and comments, perhaps this might work.
use strict;
use warnings;
use autodie;
my $logfile = 'int.txt';
my $f1 = shift || "/opt/test.txt";
my $f2 = shift || "/opt/test1.txt";
my %results;
open my $file1, '<', $f1;
while (my $line = <$file1>) {
    chomp $line;
    $results{$line} = 1;
}
open my $file2, '<', $f2;
while (my $line = <$file2>) {
    chomp $line;
    $results{$line}++;
}
{ ############ added part
    my %c;
    for (keys %results) {
        $c{$_} = $results{$_} if $results{$_} > 1;
    }
    %results = %c;
} ############ end added part
my (%log, $log);
if ( -e $logfile ) {
    open $log, '<', $logfile;
    while (<$log>) {
        my ($num, $key) = split;
        $log{$key} = $num;
    }
}
open $log, '>', $logfile or die $!;
for my $key (keys %results) {
    my $old = ( $log{$key} || 0 );        # keep old count, or 0 otherwise
    my $new = ( $results{$key} ? 1 : 0 ); # 1 if it exists, 0 otherwise
    print $log $old + $new, " $key\n";
}
Perform this computation in two steps.
Each time you run the comparison between retired and prod, produce an output file that you save with a unique file name, e.g. result-XXX where XXX denotes when you ran the comparison.
Then write a script which iterates over all of the result-XXX files and produces a summary.
I would name the files result-YYYY-MM-DD where YYYY-MM-DD is the date that the comparison was created. Then it will be relatively easy to iterate over a subset of the files (e.g. ones for a certain month).
Or store the data in a relational database.
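A rough sketch of the summary step, assuming the weekly comparison files are named result-YYYY-MM-DD and contain one matched device name per line (both the naming scheme and the format are assumptions, not part of the question):
#!/usr/bin/perl
use strict;
use warnings;
# Count how many weekly result files each device appears in.
my %weeks_matched;
for my $file (glob 'result-*') {
    open my $fh, '<', $file or die "Can't open $file: $!";
    while (my $device = <$fh>) {
        chomp $device;
        next unless length $device;
        $weeks_matched{$device}++;
    }
    close $fh;
}
for my $device (sort keys %weeks_matched) {
    print "$device $weeks_matched{$device}\n";
}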