File handle array - Perl

I wanted to choose what data to put into which file depending on the index. However, I seem to be stuck with the following.
I have created the files using an array of file handles:
my @file_h;
my $file;
foreach $file (0..11)
{
    $file_h[$file] = new IT::File ">seq.$file.fastq";
}
$file = index;
print $file_h[$file] "$record_r1[0]$record_r1[1]$record_r1[2]$record_r1[3]\n";
However, I get an error for some reason in the last line. Help anyone....?

That should simply be:
my @file_h;
for my $file (0..11) {
    open($file_h[$file], ">", "seq.$file.fastq")
        || die "cannot open seq.$file.fastq: $!";
}
# then later load up $some_index and then print
print { $file_h[$some_index] } @record_r1[0..3], "\n";

You can always use the object-oriented syntax:
$file_h[$file]->print("$record_r1[0]$record_r1[1]$record_r1[2]$record_r1[3]\n");
Also, you can print out the array more simply:
$file_h[$file]->print(@record_r1[0..3], "\n");
Or like this, if those four elements are actually the whole thing:
$file_h[$file]->print("@record_r1\n");
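For context, here is a minimal self-contained sketch of that object-oriented style using the core IO::File module; this assumes the question's IT::File behaves like IO::File, and the $some_index and @record_r1 values here are placeholders standing in for whatever the real script computes:
use strict;
use warnings;
use IO::File;
my @file_h;
for my $i (0..11) {
    # IO::File->new accepts the same ">file" mode-plus-name string as the question
    $file_h[$i] = IO::File->new(">seq.$i.fastq")
        or die "cannot open seq.$i.fastq: $!";
}
# placeholder values, assumed to be set elsewhere in the real script
my $some_index = 0;
my @record_r1  = ("\@SEQ_ID\n", "ACGT\n", "+\n", "IIII\n");
# object-oriented print on the chosen handle
$file_h[$some_index]->print(@record_r1[0..3]);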

Try assigning the $file_h[$file] to a temporary variable first:
my @file_h;
my $file;
my $current_file;
foreach $file (0..11)
{
    $file_h[$file] = new IT::File ">seq.$file.fastq";
}
$file = index;
$current_file = $file_h[$file];
print $current_file "$record_r1[0]$record_r1[1]$record_r1[2]$record_r1[3]\n";
As far as I remember, Perl doesn't recognize an array element as an output handle there: print wants a bare filehandle, a simple scalar variable holding one, or a block that returns one (the print { ... } form above), and anything more complex makes it complain about invalid syntax.

Writing to a file inside if statement not working in Perl

I've looked around here a bit and found similar questions, but nothing exactly like this. If there is one, I apologize; please point me to it.
I have the following code. I'm trying to create a CSV file containing simply an ID pulled from each filename and the filename itself. This is the ENTIRE script.
use strict;
use warnings;
use File::Find;
find( \&findAllFiles, '.');
exit;
sub findAllFiles {
    my @fp1;
    my @fp2;
    my $patId;
    my $filename;
    my $testvar = "hello again";
    $filename = $File::Find::name;
    if ($filename =~ /\.pdf$/) {
        open (my $fh, '>', 'filenames.csv') or die "Failed to open - $!\n";
        print $fh "starting...$testvar\n" or die "Failed to print to file - $!\n";
        @fp1 = split('/', $filename);
        @fp2 = split('_', $fp1[-1]);
        $patId = $fp2[-1];
        $patId =~ s/\.pdf$//;
        print "Adding $patId, file = $filename\n";
        print $fh "$patId,$filename\n" or die "File print error: $!";
        close $fh or warn "close failed! - $!";
    }
    return;
}
The line that prints to the screen, prints perfectly.
If I take the file open/close and the first print statement out of the if block, it prints that line into the file, but not the data inside the block.
I've tried every combo I can think of and it doesn't work. I've alternated between '>' and '>>' (it clearly needs append mode since it's looping over filenames), but neither works inside the if block.
Even this code above doesn't throw the die errors! It just ignores those lines! I'm figuring there's something obvious I'm missing.
Quoting File::Find::find's documentation:
Additionally, for each directory found, it will chdir() into that directory
It means that when you open inside findAllFiles, you are potentially opening a file filenames.csv inside a subdirectory of your initial directory. You can run something like find . -name filenames.csv from your terminal, and you'll see plenty of filenames.csv files. You can change this behavior by passing the no_chdir option to find:
find( { wanted => \&findAllFiles, no_chdir => 1}, '.');
(and additionally changing > to >> in your open)
However, personally, I'd avoid repeatedly opening and closing filenames.csv when you could open it just once before calling find. If you don't want to have your filehandle globally defined, you can always pass it as an argument to findAllFiles:
{
    open my $fh, '>', 'filenames.csv' or die "Failed to open 'filenames.csv': $!";
    find(sub { findAllFiles($fh) }, '.');
}
sub findAllFiles {
    my ($fh) = @_;
    ...
filenames.csv will be created in the directory where the pdf is found, since find() changes directories as it searches. If that's not what you want, use an absolute path to open it (or open it before calling find, which seems like a better idea).
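One way to do the absolute-path variant, as a sketch of a rewritten callback (the $csv_path name is just for illustration; the use strict/warnings/File::Find header and the find() call stay as in the question), is to resolve the path once with the core File::Spec module before find() starts changing directories:
use File::Spec;
# compute an absolute path once, before find() starts chdir()-ing around,
# so every open() inside the callback appends to the same file
my $csv_path = File::Spec->rel2abs('filenames.csv');
sub findAllFiles {
    my $filename = $File::Find::name;
    return unless $filename =~ /\.pdf$/;
    # (the $patId extraction from the original goes here)
    open my $fh, '>>', $csv_path or die "Failed to open '$csv_path': $!";
    print $fh "$filename\n" or die "Failed to print to file - $!\n";
    close $fh or warn "close failed! - $!";
    return;
}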

Picking a specific line with a specific string

I am trying, in Perl, to pick the complete line containing "CURRENT_RUN_ID" out of a whole document. I have been using the code below to accomplish this, but I am unable to enter the while loop.
my $sSuccessString = "CURRENT_RUN_ID";
open(LOG, "$slogfile") or die("Can't open $slogfile\n");
my $sLines;
{
    local $/ = undef;
    $sLines = <LOG>;
}
my $spool = 0;
my @matchingLines;
while (<LOG>)
{
    print OUTLOG "in while loop\n";
    if (m/$sSuccessString/i) {
        print OUTLOG "in if loop\n";
        $spool = 1;
        print map { "$_ \n" } @matchingLines;
        @matchingLines = ();
    }
    if ($spool) {
        push (@matchingLines, $_);
    }
}
You are already done reading from the filehandle LOG after you have slurped it into $sLines. <LOG> in the head of the while loop will return undef because it has reached EOF. You either have to use that variable $sLines in your while loop or get rid of it; you're not using it anyway.
If you only want to print the line that matches, all you need to do is this:
use strict;
use warnings;
open my $fh_in,  '<', 'input_file'  or die $!;
open my $fh_out, '>', 'output_file' or die $!;
while (my $line = <$fh_in>) {
    print $fh_out $line if $line =~ m/CURRENT_RUN_ID/;
}
close $fh_in;
close $fh_out;
When you execute this code:
$sLines=<LOG>;
it reads all of the data from LOG into $sLines and it leaves the file pointer for LOG at the end of the file. So when you next try to read from that file handle with:
while (<LOG>)
nothing is returned as there is no more data to read.
If you want to read the file twice, then you will need to use the seek() function to reset the file pointer before your second read.
seek LOG, 0, 0;
But, given that you never do anything with $sLines I suspect that you can probably just remove that whole section of the code.
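If you really did need both the slurped copy and a second line-by-line pass, a minimal sketch of that (staying with the question's bareword LOG handle and assuming $slogfile is set as in your script) would be:
open(LOG, '<', $slogfile) or die "Can't open $slogfile: $!";
my $sLines;
{
    local $/ = undef;
    $sLines = <LOG>;    # first pass: slurp the whole file
}
seek LOG, 0, 0;         # rewind so the handle can be read from again
while (<LOG>) {         # second pass: line by line
    print "matched: $_" if /CURRENT_RUN_ID/i;
}
close LOG;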
The whole thing with $spool and @matchingLines seems strange too. What were you trying to achieve there?
I think your code can be simplified to just:
my $sSuccessString = "CURRENT_RUN_ID";
open(LOG, $slogfile) or die("Can't open $slogfile\n");
while (<LOG>) {
    print OUTLOG if /$sSuccessString/i;
}
Personally, I'd make it even simpler, by reading from STDIN and writing to STDOUT.
my $sSuccessString = 'CURRENT_RUN_ID';
while (<>) {
    print if /$sSuccessString/i;
}
And then using Unix I/O redirection to connect up the correct files.
$ ./this_filter.pl < your_input.log > your_output.log

Perl read text in paragraphs and pattern match

I have an index file which keeps an index of each object, with the objects separated by blank lines. Now I have to search for a keyword in each object and, if present, dump that object to another file, instead of rebuilding the entire index from scratch. Here is the piece of code:
# @files is an array that contains the list of packages in the index
open("FH", $indexfile) or die;
my @linearray = <FH>;
close("FH");
open(NFH, '>', "$tmpfile") or die "cannot create";
foreach my $pattern (@files)
{
    if (my @matches = grep /$pattern/, @linearray) {
        print NFH "@matches";
    } else {
        push @newpkgs, $pattern;
    }
}
close (NFH);
But this is not working as expected. How can I get a paragraph as an element in an array?
Modify $/
use Data::Dumper;
my @linearray;
{
    open("FH", $indexfile) or die;
    local $/ = '';
    @linearray = <FH>;
    close("FH");
    print Dumper @linearray;
}
This will give you the required output.
I used the braces {} to limit the scope of the input record separator ($/) change to the block where the objects are read into the array. You can extend it according to your requirement.
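Put together with your original loop, a minimal sketch (assuming @files, $indexfile, $tmpfile and @newpkgs are set up as in your script) would look like this:
my @linearray;
{
    open my $fh, '<', $indexfile or die "cannot open $indexfile: $!";
    local $/ = '';    # paragraph mode: one blank-line-separated object per element
    @linearray = <$fh>;
    close $fh;
}
open my $nfh, '>', $tmpfile or die "cannot create $tmpfile: $!";
foreach my $pattern (@files) {
    if (my @matches = grep { /$pattern/ } @linearray) {
        print $nfh @matches;    # dump every matching object (paragraph)
    }
    else {
        push @newpkgs, $pattern;
    }
}
close $nfh;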

perl file read, truncate

I am trying to modify a config file.
I first read it into @buffer, keeping lines depending on a regex match.
The modified buffer then gets written back to disk; in case the file got smaller, a truncation is done.
Unfortunately this does not work, and it already crashes at fseek, but as far as I can tell my usage of fseek conforms to the Perl docs.
open (my $file, "+<", "somefilethatexists.txt");
flock ($file, LOCK_EX);
foreach my $line (<$file>) {
    if ($line =~ m/(something)*/) {
        push (@buffer, $line);
    }
}
print "A\n";
seek($file, 0, 0);  # seek to the beginning, we read some data already
print "B\n";  # never appears
write($file, join('\n', @buffer));  # write new data
truncate($file, tell($file));  # get rid of everything beyond the just-written data
flock($file, LOCK_UN);
close ($file);
perlopentut says this about Mixing Reads and Writes:
... when it comes to updating a file ... you probably don't want to
use this approach for updating.
You should use Tie::File for this. It opens the file for both read and write on the same filehandle and allows you to treat a file as an array of lines.
use strict;
use warnings;
use Tie::File;
tie my @file, 'Tie::File', 'somefilethatexists.txt' or die $!;
for (my $i = 0; $i < @file; ) {
    if ($file[$i] =~ m/(something)*/) {
        $i++;
    }
    else {
        splice @file, $i, 1;
    }
}
untie @file;
Where are your fseek(), fwrite() and ftruncate() functions defined? Perl doesn't have those functions. You should be using seek(), print() (or syswrite()) and truncate(). We can't really help you if you're using functions that we know nothing about.
You also don't need (and probably don't want) that explicit call to unlock the file or the call to close the file. The filehandle will be closed and unlocked as soon as your $file variable goes out of scope.
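If you do want to keep the single read-write handle approach, a minimal sketch with the real built-ins (not a drop-in for your exact filtering logic) looks like this:
use strict;
use warnings;
use Fcntl qw(:flock);
open my $fh, '+<', 'somefilethatexists.txt' or die "cannot open: $!";
flock $fh, LOCK_EX or die "cannot lock: $!";
# read and filter the current contents
my @buffer = grep { /something/ } <$fh>;
# rewind, rewrite, then cut off whatever is left of the old contents
seek $fh, 0, 0 or die "seek failed: $!";
print {$fh} @buffer;
truncate $fh, tell($fh) or die "truncate failed: $!";
close $fh;    # closing also releases the lock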
Maybe you can try Perl's in-place editing instead; setting $^I turns it on and keeps a .bak backup of the original:
$^I = '.bak';
@ARGV = 'somefilethatexists.txt';
while (<>) {
    if (/(something)*/) {
        print;
    }
}

Is it possible to read multiple files with a single filehandle in Perl?

I have a few log files like these:
/var/log/pureftpd.log
/var/log/pureftpd.log-20100328
/var/log/pureftpd.log-20100322
Is it possible to load all of them into a single filehandle or will I need to load each of them separately?
One ugly hack would be this:
local @ARGV = qw(
    /var/log/pureftpd.log
    /var/log/pureftpd.log-20100328
    /var/log/pureftpd.log-20100322
);
while (<>) {
    # do something with $_;
}
You could also use a pipe to virtually concatenate these files into a single stream.
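For example, a sketch using Perl's list-form pipe open (this assumes a Unix-like system with cat available):
my @logs = (
    '/var/log/pureftpd.log',
    '/var/log/pureftpd.log-20100328',
    '/var/log/pureftpd.log-20100322',
);
# let cat concatenate the files; read its combined output through one handle
open my $fh, '-|', 'cat', @logs or die "cannot run cat: $!";
while (<$fh>) {
    # do something with $_;
}
close $fh;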
It's not terribly hard to do the same thing with a different filehandle for each file:
foreach my $file ( @ARGV )
{
    open my($fh), '<', $file or do { warn '...'; next };
    while( <$fh> )
    {
        ...
    }
}