Read one worksheet from Excel workbook in Perl - perl

I have a Perl script that reads an Excel file and parses it using the Perl module Spreadsheet::ParseExcel::Simple. I want the script to read only one tab instead of all the tabs in the spreadsheet. Here is the relevant portion of the script:
# ### Use a Spreadsheet CPAN module for parsing Excel spreadsheet
require Spreadsheet::ParseExcel::Simple;
my $xls = Spreadsheet::ParseExcel::Simple->read('/opt/test.xls');
$Output_Circuit_file = "/opt/result.text";
############################################################################
$err_idx = 0;
$out_idx = 0;
@date_var = localtime;
$date = sprintf("%02d",($date_var[4] + 1)) . '/' . sprintf("%02d",$date_var[3]) . '/' . ($date_var[5] + 1900);
## #############
foreach my $sheet ($xls->sheets) {
while ($sheet->has_data) {
my @words = $sheet->next_row;
$ctr++;
Can anyone help? Where should I modify the script so that it reads only tab "A" and ignores everything else?

The loop foreach my $sheet ($xls->sheets) is what makes you read every sheet (worksheets are just another word for the tabs in Excel). Take only the first element of $xls->sheets and you've got it:
my @sheets = $xls->sheets;
my $sheet = $sheets[0];
if($sheet->has_data) {
....
}

Add the following lines inside the foreach loop to read only the worksheet with a specific name:
$sheetname= $sheet->{sheet}->{Name};
if ($sheetname !~ 'Test') { next; }
Thanks all for your effort to help me!!
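Putting the two answers together, a minimal sketch of picking tab "A" by name could look like the following. The helper name sheet_named and the command-line handling are made up for illustration. It assumes a version of Spreadsheet::ParseExcel recent enough that the worksheet object has a get_name method, and it relies on the sheet accessor of Spreadsheet::ParseExcel::Simple to reach that worksheet object.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first sheet wrapper whose underlying
# worksheet has the given name, or undef if there is none. Each element of
# @sheets is expected to be what Spreadsheet::ParseExcel::Simple's sheets()
# returns.
sub sheet_named {
    my ($name, @sheets) = @_;
    my ($match) = grep { $_->sheet->get_name eq $name } @sheets;
    return $match;
}

# Run as: perl read_tab.pl /opt/test.xls
if (@ARGV) {
    require Spreadsheet::ParseExcel::Simple;   # CPAN module, loaded on demand
    my $xls = Spreadsheet::ParseExcel::Simple->read($ARGV[0])
        or die "Cannot read $ARGV[0]\n";
    my $sheet = sheet_named('A', $xls->sheets)
        or die "No tab named 'A' in $ARGV[0]\n";

    # Only the rows of tab "A" are visited; all other tabs are ignored.
    while ($sheet->has_data) {
        my @words = $sheet->next_row;
        print join("\t", map { defined $_ ? $_ : '' } @words), "\n";
    }
}
```

Matching with eq selects exactly one tab, whereas the regex test above would also accept names that merely contain the pattern.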

Related

Perl Script did not find the newest file

I use the following Perl script from "https://exchange.nagios.org/directory/Plugins/Operating-Systems/Linux/Check-Newest-files-age-and-size-in-Diredtory/"
But the script has an error: it does not show the newest file. Can someone find the mistake? Someone wrote in the comments on that site that the mistake is in line 22, but I can't find it.
Here is the code:
# Check that file exists (can be directory or link)
unless (-d $opt_f) {
print "FILE_AGE CRITICAL: Folder not found - $opt_f\n";
exit $ERRORS{'CRITICAL'};
}
my $list = opendir DIRHANDLE, $opt_f or die "Cant open directory: $!";
while ($_ = readdir(DIRHANDLE))
{
$file=sprintf("%s/%s",$opt_f,$_);
$attrs = stat("$file");
$diff = time()-$attrs->mtime;
if($temp == 0)
{
#$temp=$diff;
$new=$file;
}
if($_ ne "." && $_ ne "..")
{
if($diff<$temp)
{
$temp=$diff;
$new=$_;
}
else
{
$temp=$diff; $new=$_;
}
}
}
$st = File::stat::stat($opt_f."/".$new);
$age = time - $st->mtime;
$size = $st->size;
Example:
I have some files on a filer (backups as .img files). I use this script to check the size of the newest file. If I create a new folder with a new file, the check looks at the correct file. But if I create a second file, the check still looks at the old file. If I create a third file, the check looks at the correct file. With the fourth file it is wrong, and with the fifth file it is correct again (and so on).
An easy (easier?) way to do this would be to use the built-in glob function to read the directory instead of opening it, and then use the file test operators to sort the files by modification time:
my @files = sort { -M($a) <=> -M($b) } glob "*"; # or -C for inode change time (not creation time on Unix)
# $files[0] is the newest file
A list of file test operators is at
https://users.cs.cf.ac.uk/Dave.Marshall/PERL/node69.html
Note that -C and -M relate to when the script started, so for long-running or daemon scripts you might need to do something a bit different.
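For the long-running case, one option is to sort on the absolute modification time from stat instead of on -M. This is only a sketch: the function name files_newest_first is made up, and glob as used here assumes the directory path contains no spaces or wildcard characters.

```perl
use strict;
use warnings;

# Hypothetical helper: return the plain files in $dir, newest first, ordered
# by absolute mtime (seconds since the epoch) rather than by -M, which is
# measured relative to the script's start time.
sub files_newest_first {
    my ($dir) = @_;
    # Schwartzian transform: stat each file once, sort on mtime, unwrap.
    return map  { $_->[0] }
           sort { $b->[1] <=> $a->[1] }
           map  { [ $_, (stat $_)[9] ] }
           grep { -f $_ }
           glob("$dir/*");
}

my $dir   = @ARGV ? $ARGV[0] : '.';
my @files = files_newest_first($dir);
print "Newest file: $files[0]\n" if @files;
```

Because the mtime is read afresh on every call, this can be repeated inside a daemon loop without the results drifting.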
You want to find the latest mtime, so we're talking about a simple comparison of the latest mtime found so far with the mtime of the current file. But there's so much code beyond that in what you posted... and the first thing you do with the value you want to compare is change it? What?
Let's start over.
use File::stat;       # stat() now returns an object with ->mtime
use feature 'say';
# $dir_qfn is the directory to scan ($opt_f in the question)
opendir(my $dh, $dir_qfn)
    or die("Can't open directory \"$dir_qfn\": $!\n");
my $latest_mtime = -1;
my $latest_qfn;
while (defined( my $fn = readdir($dh) )) {
    next if $fn =~ /^\.\.?\z/;
    my $qfn = "$dir_qfn/$fn";
    my $stats = stat($qfn)
        or warn("Can't stat \"$qfn\": $!\n"), next;
    my $mtime = $stats->mtime;
    # A larger mtime means a more recently modified file
    if ($mtime > $latest_mtime) {
        $latest_mtime = $mtime;
        $latest_qfn = $qfn;
    }
}
if (defined($latest_qfn)) {
    say $latest_qfn;
}
The biggest issue with the script seems to be that line 12 calls the core version of stat but line 13 expects the output to be that of File::stat::stat(). I suspect that testing for '.' or '..' should be done at the top of the while loop and all the variables should be defined before they are used.
As Jeremy has said, you're better off sorting an array of the files and shifting/popping the first/last value, depending on what you're looking for.

Using Net::Google::Drive::Simple to download all a spreadsheet's sheets as CSV

I'm writing a script that pulls down spreadsheets from Google Drive using Net::Google::Drive::Simple for further processing. I'd like to parse them with Perl, so I'd prefer to download them as CSV files and spare myself some processing trouble.
Unfortunately, I'm finding that only the first sheet is getting exported as a CSV file. Is there a way to use this interface to get the other sheets?
EDIT:
Sample of the code, but it's honestly pretty rudimentary:
my $gd = Net::Google::Drive::Simple->new();
my $children = $gd->children( "/My Spreadsheets" );
foreach my $character ( @$children ) {
next if $character->is_folder;
print "\nFILE: " . $character->title . "\n";
foreach my $type (keys %{$character->exportLinks()}) {
print "TYPE: $type, LINK: " . $character->exportLinks()->{$type} . "\n";
}
}
That list of exports produces a single CSV file that represents the first sheet. I'm not seeing any indication of how to get to the second or subsequent sheets.

BioPerl with clustalw - outputting file

I have a Perl script to automate many multiple alignments. (I'm writing the script first for only one file and one multiple alignment, a big one though; I can then modify it for multiple files.) I want to output the resulting file, but I am unsure how to do it with AlignIO. So far I have:
use warnings;
use strict;
use Bio::AlignIO;
use Bio::SeqIO;
use Bio::Tools::Run::Alignment::Clustalw;
my $file = shift or die; # Get filename from command prompt.
my $factory = Bio::Tools::Run::Alignment::Clustalw->new(-matrix => 'BLOSUM');
my $ktuple = 3;
$factory->ktuple($ktuple);
my $inseq = Bio::SeqIO->new(
-file => "<$file",
-format => $format
);
my $seq;
my @seq_array;
while ($seq = $inseq->next_seq) {
push(@seq_array, $seq);
}
# Now we do the actual alignment.
my $seq_array_ref = \@seq_array;
my $aln = $factory->align($seq_array_ref);
Once the alignment is done I have $aln which is the alignment I want to get out of the process as a fasta file - I tried something like:
my $out = Bio::AlignIO->new(-file => ">outputalignmentfile",
-format => 'fasta');
while( my $outaln = $aln->next_aln() ){
$out->write_aln($outaln);
}
but it didn't work, presumably because the method next_aln() only applies to AlignIO things, which $aln is probably not. So I need to know what it is that is generated by the line my $aln = $factory->align($seq_array_ref); and how to get the aligned sequences output into a file. My next step is tree estimation or network analysis.
Thanks,
Ben.
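$factory->align returns a single Bio::SimpleAlign object rather than an AlignIO stream, so there is nothing to iterate over with next_aln(). Drop the while loop; one line is all that is needed to write the object returned by the clustalw call to the output stream:
$out->write_aln($aln);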

Perl Loop Output to Excel Spreadsheet

I have a Perl script, the relevant bits of which are posted below.
# Pull values from cells
ROW:
for my $row ( $row_min + 1 .. $row_max ) {
my $target_cell = $worksheet->get_cell( $row, $target_col);
my $response_cell = $worksheet->get_cell( $row, $response_col);
if ( defined $target_cell && defined $response_cell ) {
my $target = $target_cell->value();
my $response = $response_cell->value();
# Determine relatedness
my $value = $lesk->getRelatedness($target, $response);
# Copy output to new Excel spreadsheet, 'data.xls'
my $workbook1 = Spreadsheet::WriteExcel->new('data.xls');
my $worksheet1 = $workbook1->add_worksheet();
$worksheet1->set_column(0, 3, 18);
my $row = 0;
foreach ($target) {
$row++;
$worksheet1->write( $row, 0, "Target = $target\n");
$worksheet1->write( $row, 1, "Response = $response\n");
$worksheet1->write( $row, 2, "Relatedness = $value\n");
}
}
}
This script uses the Perl modules ParseExcel and WriteExcel. The input data spreadsheet is a list of words under two columns, one labelled 'Target' and the other labelled 'Response.' The script takes each target word and each response word and computes a value of relatedness between them (that's what the $lesk->getRelatedness section of code is doing; it is calling a Perl module called WordNet::Similarity that computes a measure of relatedness between words).
All of this works perfectly fine. The problem is I am trying to write the output (the measure of similarity, or $value in this script) into a new Excel file. No matter what I do with the code, the only output it will give me is the relatedness between the LAST target and response words. It ignores all of the rest.
However, this only occurs when I am trying to write to an Excel file. If I use the 'print' function instead, I can see all of the outputs in the command window. I can always just copy and paste this into Excel, but it would be much easier if I could automate this. Any idea what the problem is?
You're resetting the value of $row each time to 0.
Problem is solved. I just needed to move the
my $workbook1 = Spreadsheet::WriteExcel->new('data.xls');
my $worksheet1 = $workbook1->add_worksheet();
lines to another part of the script. Since they were in the 'for' statement, the program kept overwriting the 'data.xls' file every time it ran through the loop.
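The fix described above can be sketched roughly like this. The @results list is hypothetical stand-in data in place of the ParseExcel and WordNet::Similarity parts of the original script. The two points it shows are that the workbook is created once, before the loop, and that the output row counter is never reset inside the loop.

```perl
use strict;
use warnings;
use Spreadsheet::WriteExcel;

# Hypothetical stand-in data: [ target, response, relatedness ] triples.
# In the real script these come from the input sheet and getRelatedness().
my @results = (
    [ 'dog',  'cat',   0.51 ],
    [ 'car',  'road',  0.33 ],
    [ 'tree', 'cloud', 0.08 ],
);

# Create the workbook and worksheet once, outside the loop, so the file
# is not re-created (and emptied) on every iteration.
my $workbook = Spreadsheet::WriteExcel->new('data.xls')
    or die "Problems creating data.xls: $!";
my $worksheet = $workbook->add_worksheet();
$worksheet->set_column(0, 2, 18);
$worksheet->write_row(0, 0, [ 'Target', 'Response', 'Relatedness' ]);

# The output row counter lives outside the loop and only ever increases.
my $out_row = 1;
for my $result (@results) {
    $worksheet->write_row($out_row, 0, $result);
    $out_row++;
}

$workbook->close();
```

Each iteration now lands on its own row instead of replacing the previous one.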

How do I apply formatting to a particular word in a docx file using Win32::Ole in Perl?

For example, my docx file contains the following sentences:
This is a Perl example
This is a Python example
This is another Perl example
I want to apply bold style to all the occurrences of the word "Perl" like so:
This is a Perl example
This is a Python example
This is another Perl example
I've so far come up with the following script:
use strict; use warnings;
use Win32::OLE::Const 'Microsoft Word';
my $file = 'E:\test.docx';
my $Word = Win32::OLE->new('Word.Application', 'Quit');
$Word->{'Visible'} = 0;
my $doc = $Word->Documents->Open($file);
my $paragraphs = $doc->Paragraphs() ;
my $enumerate = new Win32::OLE::Enum($paragraphs);
while(defined(my $paragraph = $enumerate->Next())) {
my $text = $paragraph->{Range}->{Text};
my $sel = $Word->Selection;
my $font = $sel->Font;
if ($text =~ /Perl/){
$font->{Bold} = 1;
}
$sel->TypeText($text);
}
$Word->ActiveDocument->Close ;
$Word->Quit;
But it has applied bold style to the whole paragraph and it does not edit the sentences in their original place. It gives me both the modified version and the original version like this:
This is a Perl example
This is a Python example
This is another Perl example
This is a Perl example
This is a Python example
This is another Perl example
How should I fix my problem? Any pointers? Thanks like always :)
UPDATE
Problem solved! Big thanks to @Zaid, and @cjm :)
Here's the code that works lovely:
while ( defined (my $paragraph = $enumerate->Next()) ) {
my $words = Win32::OLE::Enum->new( $paragraph->{Range}->{Words} );
while ( defined ( my $word = $words->Next() ) ) {
my $font = $word->{Font};
$font->{Bold} = 1 if $word->{Text} =~ /Perl/;
}
}
Try using the Words method instead of Text.
Untested:
while ( defined (my $paragraph = $enumerate->Next()) ) {
my $words = Win32::OLE::Enum->new( $paragraph->{Range}->{Words} );
while ( defined ( my $word = $words->Next() ) ) {
my $font = $word->{Font};
$font->{Bold} = 1 if $word->{Text} =~ /Perl/;
}
}
I don't know anything about Perl, but you could look at Office Open XML.
You can treat the .docx file as a zip file and do a simple search and replace, which works far quicker than the interop, and you don't have to worry about the many things that can go wrong with it either.
Rename your .docx file to .zip and open it, and you will see what I mean.
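To see the zip structure from Perl without renaming anything, here is a small sketch using the core module IO::Uncompress::Unzip. The function name docx_document_xml is made up, and the member name word/document.xml is the usual location of the main body in a Word .docx package. It only reads the XML; writing a modified copy back would need a zip writer as well. A word can also be split across several XML runs, so a plain text search may miss some occurrences.

```perl
use strict;
use warnings;
use IO::Uncompress::Unzip qw($UnzipError);

# Hypothetical helper: return the raw XML of the main document part stored
# inside a .docx file (which is an ordinary zip archive).
sub docx_document_xml {
    my ($docx_path) = @_;
    my $member = 'word/document.xml';
    my $unzip = IO::Uncompress::Unzip->new($docx_path, Name => $member)
        or die "Cannot read $member from $docx_path: $UnzipError\n";
    my $xml = do { local $/; <$unzip> };   # slurp the whole member
    $unzip->close();
    return $xml;
}

# Run as: perl docx_peek.pl E:\test.docx
if (@ARGV) {
    my $xml  = docx_document_xml($ARGV[0]);
    my $hits = () = $xml =~ /Perl/g;
    print "'Perl' appears $hits time(s) in the document XML\n";
}
```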