Extract specific data from multiple text files in a directory - perl

This is my first program in Perl.
I have more than 1000 files and I want to extract specific data from each file. The structure of all the files is the same.
It's really difficult to open every file and copy out the specific data by hand,
so how can I achieve this using Perl?
The structure looks like this.
LensMode=Normal
MicronMarker=500
DataDisplayCombine=1
Voltage=0 Volt
PixelSize=1.586612
I want to extract MicronMarker and PixelSize from each file.
Any help in the right direction is appreciated.
The location is D:\Files\Folder1.

Try this
Use glob to read the directory:
while (my $file = glob('D:/Files/Folder1/*'))
{
    open my $handler, '<', $file or die "Cannot open $file: $!";
    my @extract = grep { m/^(?:MicronMarker|PixelSize)/ } <$handler>;
    print @extract;
    close($handler);
}
Or extract the lines in a while loop, using opendir to read the directory:
my $path = 'D:/Files/Folder1';
opendir(my $dir, $path) or die "Cannot open $path: $!";
while (my $ech = readdir($dir))
{
    next if $ech =~ /^\.\.?$/;    # skip . and ..
    open my $handler, '<', "$path/$ech" or die "Cannot open $path/$ech: $!";
    while (my $l = <$handler>)
    {
        if ($l =~ m/^(?:MicronMarker|PixelSize)/)
        {
            print $l;
        }
    }
    close($handler);
}
An easy way to extract the matching lines from each file is grep:
while (my $ech = readdir($dir))
{
    next if $ech =~ /^\.\.?$/;
    open my $handler, '<', "$path/$ech" or die "Cannot open $path/$ech: $!";
    my @extract = grep { m/^(?:MicronMarker|PixelSize)/ } <$handler>;
    print @extract;
    close($handler);
}
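If you also need to know which file each value came from, a small variation collects both settings per file; this is just a sketch, and the hash layout is one possible choice:
use strict;
use warnings;

my %data;
for my $file (glob('D:/Files/Folder1/*')) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    while (my $line = <$fh>) {
        # capture the value after "MicronMarker=" or "PixelSize="
        if ($line =~ /^(MicronMarker|PixelSize)=(\S+)/) {
            $data{$file}{$1} = $2;
        }
    }
    close $fh;
}

for my $file (sort keys %data) {
    print "$file\tMicronMarker=$data{$file}{MicronMarker}\tPixelSize=$data{$file}{PixelSize}\n";
}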

Related

7Zip execute 2 extractions from 2 different subfolders (only first executes)

Okay, I'm at a total loss.
I am trying to extract all the XMLs and PDFs from a 7zip file.
There is more stuff inside the file, so I just want to extract from the PDF folder and the XML folder, leaving the file structure behind and not searching any other folders.
I am using the 7-Zip command line to do this.
I have two subroutines, almost identical, that I execute:
sub Extract_pdfs_from_this
{
    my ($file, $destination) = @_;
    my $sevenzip_executable = '\\\\server\7-Zip\7z.exe';
    my $extract_pdfs = "$sevenzip_executable e -y -o$destination $file output\\JETPDF\\DISB\\*.pdf ";
    print STDOUT "\n\nExtracting PDFs From $file \n>>$extract_pdfs \n";
    eval { system($extract_pdfs) };
    print STDOUT "Finished Extracting PDFs \n";
    return;
}
..
sub Extract_xmls_from_this
{
    my ($file, $destination) = @_;
    my $sevenzip_executable = '\\\\server\7-Zip\7z.exe';
    my $extract_xmls = "$sevenzip_executable e -y -o$destination $file staging\\DISB\\OnBase\\*.xml ";
    print STDOUT "\n\nExtracting XMLs From $file \n>>$extract_xmls \n";
    eval { system($extract_xmls) };
    print STDOUT "Finished Extracting XMLs \n";
    return;
}
and I use it like so...
my $in_extraction_directory = dirname(__FILE__);
my $input_subdirectory = "$directory\\$subdirectory";
my @in_seven_zip_files = Get_all_sevenzips_in($input_subdirectory);
foreach my $sevenzip_file (@in_seven_zip_files)
{
    $sevenzip_file = "$input_subdirectory\\$sevenzip_file";
    Extract_pdfs_from_this($sevenzip_file, $in_extraction_directory);
    Extract_xmls_from_this($sevenzip_file, $in_extraction_directory);
}
When executed, the PDFs get extracted but not the XMLs.
I get an error: there are no files to process.
I feel like 7zip is hung up on the file from the previous call. Is there a way to close it or release the file?
Any help appreciated, much time wasted on this.
Thanks!
Check the exit status in $? if you feel it's hung.
Also, try extracting the XMLs first and then the PDFs, to make sure whether the PDF extraction command is the one causing the issue.
Share the console output, which can show more details.
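For instance, a minimal sketch of checking the exit status right after the system call (this goes inside either subroutine):
system($extract_pdfs);
# $? holds the child's wait status; the high byte is the exit code
if ($? == -1) {
    print STDOUT "Failed to run 7z: $!\n";
}
elsif (my $code = $? >> 8) {
    print STDOUT "7z exited with code $code\n";    # nonzero means 7-Zip reported a problem
}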
User error... Works just how it should.
I had a condition:
unless ($number_of_pdfs == $number_of_xmls)
{
    print STDOUT "The number of PDFs and XMLs did not match!\n\n";
    print STDOUT "PDFs: $number_of_pdfs \nXMLs: $number_of_xmls\nFile: $sevenzip_file \nExtraction Directory: $output_directory\n\n";
    die;
}
and in the first file I was extracting, the XML was not in the correct path... Someone didn't follow the pattern. Very embarrassing; thanks for the response.

Perl script to parse log files from different test locations, taking the dynamic path of the test cases

I want a Perl script that will go into every test folder and parse the log file in it.
Eg:
results/testcases/?/test.log
The above path must change dynamically, with the different test folder names in place of the ? mark.
I am using results/testcases/@array/test.log,
where @array has the test names.
My suggestion would be:
my $path = "results/testcases";
opendir(my $tempdir, $path) or die "err1";
my @dir = grep { -d "$path/$_" } readdir $tempdir;
foreach (@dir)
{
    if ($_ !~ /^\./)    # skip . and ..
    {
        open(my $fileHandle, '<', "$path/$_/test.log") or die "err2";
        # parsing log file
        close $fileHandle or die "err2-2";
    }
}
closedir $tempdir or die "err1-2";
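What goes in place of the # parsing log file comment depends on your logs; as a purely illustrative example, assuming you want the lines that mention PASS or FAIL:
while (my $line = <$fileHandle>) {
    chomp $line;
    # hypothetical pattern; adjust to whatever your test.log actually contains
    print "$_: $line\n" if $line =~ /\b(?:PASS|FAIL)\b/;
}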
First, you need to read the folder "results/testcases" for the current correct folder names. Second, you need to open the files one by one, instead of putting @array in the middle of the path. Third, you should read up on basic Perl, otherwise you won't be able to parse the files properly. Fourth, you really should read through HOW TO ASK: include your code so that we can be more helpful and your question can help others as well.
If your test folders' relative paths are stored in @array, you can do the following:
my @testlogs = grep { -e $_ } map { "results/testcases/".$_."/test.log" } @array;
The new array @testlogs now contains the list of paths to existing 'test.log' files.
Then, you can parse each file like this:
map { ... parsing call ... } @testlogs;
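If the parsing call is a subroutine of your own, a for statement modifier reads more naturally than map used for side effects; parse_log here is purely hypothetical:
# hypothetical helper; replace the body with your real parsing logic
sub parse_log {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        # ... inspect $line here ...
    }
    close $fh;
}

parse_log($_) for @testlogs;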

How to upload multiple files in perl?

I need to upload multiple files using Perl CGI.
I used a form with
enctype="multipart/form-data"
and also set
multiple='multiple' on the file input.
I just need to know what we should write on the server side.
Can anybody tell me how to upload multiple files using perl?
The following piece of code is enough; it uploads the files present in the params to the /storage location:
use CGI;
my $cgi = CGI->new;
my @files = $cgi->param('multi_files[]');
my @io_handles = $cgi->upload('multi_files[]');
foreach my $upload (@files) {
    print "Filename: $upload<br>";
    my $file_temp_path = "/storage";
    my $upload_file = shift @io_handles;
    open(my $out, '>', "$file_temp_path/$upload") or print "File Open Error";
    binmode $out;
    while (<$upload_file>) {
        print $out $_;
    }
    close $out;
}
print "Files Upload done";
On the server side, you first retrieve the uploaded file's filehandle like this:
use CGI;
my $q = CGI->new();
my $myfh = $q->upload('field_name');
Now you have a filehandle to the temporary storage where the file was uploaded.
The uploaded file name can be had using the param() method:
$filename = $q->param('field_name');
and the temporary file can be accessed directly via:
$filename = $q->param('uploaded_file');
$tmpfilename = $q->tmpFileName($filename);
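Putting those pieces together for multiple files: in list context, upload() returns one filehandle per file submitted under that field name. A rough sketch (the field name multi_files and the target directory /storage are assumptions):
use strict;
use warnings;
use CGI;

my $q = CGI->new();

# list context: one lightweight filehandle per uploaded file
for my $fh ($q->upload('multi_files')) {
    my $name = "$fh";            # the handle stringifies to the client-side filename
    $name =~ s/[^\w.-]/_/g;      # crude sanitization; tighten for production use
    open my $out, '>', "/storage/$name" or die "Cannot write $name: $!";
    binmode $out;
    print {$out} $_ while <$fh>;
    close $out;
}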
I highly recommend giving the CGI.pm docs a good solid read, a couple of times. While not trivial, it's all rather straightforward.
Something like this should handle multiple file uploads:
my @fhs = $Cgi->upload('files');
foreach my $fh (@fhs) {
    if (defined $fh) {
        my $type = $Cgi->uploadInfo($fh)->{'Content-Type'};
        next unless ($type eq "image/jpeg");
        my $io_handle = $fh->handle;
        open(my $outfile, '>>', '/var/directory/'.$fh) or die "Cannot open output file: $!";
        while (my $bytesread = $io_handle->read(my $buffer, 1024)) {
            print $outfile $buffer;
        }
        close($outfile);
    }
}
Of course, 'files' is the name of the file upload field.

Perl script that copies files listed in text file only copies the last file successfully

I have a Perl script that reads in a list of files from a file, accessFiles.txt, and copies them to another location.
open my $accessFiles, "$scriptDir\\accessFiles.txt" or die "Could not open access file list $!";
while (my $accessFile = <$accessFiles>) {
    my ($file, $dir, $ext) = fileparse($accessFile, qr/\.[^.]*/);
    my $accessDir = "$localDir\\AccessFiles\\$file";
    my $accessCopy = "$accessDir\\$file$ext";
    system("rmdir", "/S", "/Q", $accessDir);
    system("mkdir", $accessDir);
    system("copy", $accessFile, $accessCopy);
}
The output of the copy command says it copied one file for each file in accessFiles.txt, but only the last file gets copied.
I've added input statements before and after the copy, and I cannot see any of the other files in the copied directory at any time.
Now, if I change the script to read from an array of files, then it works perfectly.
my @files = ('\\\\sourceshare\acc1.accdb', '\\\\sourceshare\acc2.accdb');
foreach my $accessFile (@files) {
    my ($file, $dir, $ext) = fileparse($accessFile, qr/\.[^.]*/);
    my $accessDir = "$localDir\\AccessFiles\\$file";
    my $accessCopy = "$accessDir\\$file$ext";
    system("rmdir", "/S", "/Q", $accessDir);
    system("mkdir", $accessDir);
    system("copy", $accessFile, $accessCopy);
}
Thanks in advance.
You didn't remove the trailing newline. Add:
chomp($accessFile);
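That is, as the first statement of the loop body. Every line read from the file still carries its trailing newline, which then gets embedded in $accessDir and $accessCopy; the last line often has no trailing newline, which is why only that file was copied:
while (my $accessFile = <$accessFiles>) {
    chomp($accessFile);    # strip the trailing newline before building any paths
    # ... rest of the loop unchanged ...
}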

File handle array

I wanted to choose what data to put into which file depending on the index. However, I seem to be stuck with the following.
I have created the files using an array of file handles:
my @file_h;
my $file;
foreach $file (0..11)
{
    $file_h[$file] = new IO::File ">seq.$file.fastq";
}
$file = index;
print $file_h[$file] "$record_r1[0]$record_r1[1]$record_r1[2]$record_r1[3]\n";
However, I get an error for some reason on the last line. Help, anyone?
That should simply be:
my @file_h;
for my $file (0..11) {
open($file_h[$file], ">", "seq.$file.fastq")
|| die "cannot open seq.$file.fastq: $!";
}
# then later load up $some_index and then print
print { $file_h[$some_index] } @record_r1[0..3], "\n";
You can always use the object-oriented syntax:
$file_h[$file]->print("$record_r1[0]$record_r1[1]$record_r1[2]$record_r1[3]\n");
Also, you can print out the array more simply:
$file_h[$file]->print(@record_r1[0..3], "\n");
Or like this, if those four elements are actually the whole thing:
$file_h[$file]->print("#record_r1\n");
Try assigning the $file_h[$file] to a temporary variable first:
my @file_h;
my $file;
my $current_file;
foreach $file (0..11)
{
    $file_h[$file] = new IO::File ">seq.$file.fastq";
}
$file = index;
$current_file = $file_h[$file];
print $current_file "$record_r1[0]$record_r1[1]$record_r1[2]$record_r1[3]\n";
As far as I remember, Perl doesn't recognize it as an output handle otherwise, complaining about invalid syntax.