How to use the Windows "hook function" with Perl Win32::GUI? - perl

I am working on Windows XP SP 3, Strawberry Perl.
I would like to let the user of my Perl program select a file; but when using Win32::GUI::GetOpenFileName(), I would like the Windows file selection dialog to open in the "Details" file list option, and not in the default "List" file list option.
Googling on the Net, it seems that I have to use the Windows "hook" function, and send certain messages to the file selection control. The documentation about that is MSDN, and I don't seem to master how to apply it in Perl.
Can anyone recommend what should be the right call syntax in Perl?
Here is my code sample, where the file selection dialog opens with the (default) "List" option:
use strict;
use warnings;
use 5.014;
use Win32::Console;
use Win32::GUI();
use autodie;
use warnings qw< FATAL utf8 >;
use Carp::Always;
use Win32API::File::Time qw{:win};
use Image::ExifTool qw(:Public);
use Date::Parse;
# ...
my ( $FileName, $ImageDir, $DIR, $TopDir);
# ...
$TopDir = 'D:\My Documents';
$ImageDir = Win32::GUI::BrowseForFolder( -root => $TopDir, -includefiles => 1,);
unless ($ImageDir) {
say '$DirName not defined after calling Win32::GUI::BrowseForFolder, ',
'Photo date set line'.__LINE__;
exit;
}
else {
say "Identified directory: $ImageDir";
}
# now select a file
$FileName = Win32::GUI::GetOpenFileName( -title => 'Select an image file', -directory => $ImageDir,
-file => "\0" . " " x 256,
-filter => ["Image files (*.jpg)" => "*.jpg;*.jpeg", "All files", "*.*", ],);
unless ($FileName) {
say '$FileName not defined after calling Win32::GUI::GetOpenFileName, ',
'Photo date set line'.__LINE__;
}
else {
say "Identified image file: $FileName";
}
# ...
Note: (somewhat) similar post at: http://www.perlmonks.org/?node_id=989418

Unfortunately the Win32::GUI API exposes neither the OFN_ENABLEHOOK flag bit or the lpfnHook field of the GetOpenFileName options.
You could perhaps get it working using the Win32::API module to work at an even lower level, but you would have to build the entire OPENFILENAME structure yourself using pack and write some XS code for the hook handler.

Related

Manipulating a Word doc with PERL and Win32::OLE works on command line, but not from a BuildForge Step

I have functioning PERL scripts which are used to update Word documents.
The PERL scripts are coded to work with MS Office 2003 or 2007.
The machine I'm trying to do the updates via BF on has Office 2003 on, the appropriate template is installed, the Macro Security settings have been updated.
When I run the exact command I want BF to use on the command line, it works as expected.
When I run it via a BF step I get "*** Unable to open doc at \servername\projectname\bin\updateVer.pl line 94" (the line number is the croak in the Perl script).
The script looks like this up to the croak:
# enable Sanity checking and make the variable names meaningful
use strict;
use warnings;
use English;
use Win32::OLE;
# Gain access to MS Word 'wd' constants
use Win32::OLE::Const ('Microsoft Word');
use FindBin qw($RealDir);
use lib ($RealDir, "$RealDir/..", "$RealDir/../lib");
# include the common and log utilities
use SCCM::Common;
use SCCM::Logs;
# use command line inputs
use Getopt::Long qw(:config auto_abbrev permute ignore_case pass_through);
# set up logs and process logfile options
logOptions(qw(-log now));
my $bookmark_update_result = "";
my $update_ref_result = "";
# Get input from user
my $path;
my $bookmarkName;
my $bookmarkValue;
my $Word;
my $newWord = 0;
GetOptions("path=s" => \$path,
"bookmarkName=s" => \$bookmarkName,
"bookmarkValue=s" => \$bookmarkValue);
unless ( defined($path) )
{ croakf "%[Fail] Path and filename of SVD are required\n"; }
unless ( defined($bookmarkName) && defined($bookmarkValue) )
{ croakf "%[Fail] bookmarkName and bookmarkValue parameters are both required.\n"; }
# Start Word in a safer way, checking to see if user has it open first.
eval
{
$Word = Win32::OLE->GetActiveObject('Word.Application');
if (! $Word)
{
$newWord = 1;
$Word = Win32::OLE->new('Word.Application', 'Quit');
}
};
croakf "%[Fail] -- unable to start Word Engine: $#\n", Win32::OLE->LastError() if ($# || ! $Word);
my $dispAlerts = $Word->{'DisplayAlerts'};
$Word->{'DisplayAlerts'} = wdAlertsNone;
if ($newWord)
{
$Word->{'Visible'} = 0;
}
my $doc = $Word->Documents->Open($path) or
croakf ("%[Fail] Unable to open doc ", Win32::OLE->LastError() );
The script is being called like this:
ccperl \servername\projectname\bin\updateVer.pl -path "C:\BuildForgeBuilds\BFProjectName\BFProjectName_0177\MyDocument.doc" -bookmarkName REV_Baseline -bookmarkValue My_Baseline_10.20.30
Can I get some direction to convince BF that it's OK to open my work doc?
Thank you!
Turns out after I removed the 'Archive' bit from the template on the machine the job was running on, BF was able to successfully run its job.
Funny I could run it directly on the server with the 'Archive' bit set on the template. But hey, I'm no longer stuck on this one.
Thanks to anybody who read my question and even considered what might have been the problem.

Handling unicode directory and filenames in Perl on Windows

I have an encoding problem with Perl and Windows. On a Windows 7 running Perl (strawberry 5.16) and a simple TK GUI I need to open files and/or access directories with non-english characters in their name/path. For opening files I've come out with this solution which seems to work fine:
#!/usr/bin/perl -w
use strict;
use warnings;
use Win32::Unicode::File;
use Encode;
use Tk;
my $mw = Tk::MainWindow->new;
my $tissue_but = $mw->Button(
-text => 'Open file',
-command => [ \&select_unicode_file ],
);
$tissue_but->grid( -row => 3, -column => 1 );
Tk::MainLoop();
sub select_unicode_file{
my $types = [ ['Txt', '.txt'],
['All Files', '*'],];
my $input_file= $mw->getOpenFile(-filetypes => $types);
my $fh = Win32::Unicode::File->new;
if ($fh->open('<', $input_file)){
while (my $line = $fh->readline()){
print "\n$line\n";
}
close $fh;
}
else{
print "Couldn't open file: $!\n";
}
}
This correctly opens files such as Поиск/Поиск.txt
What I CANNOT do is to simply get a directory path and than process it. I think I should use use Win32::Unicode::Dir but I really can't understand the documentation.
It should be something like this:
#!/usr/bin/perl -w
use strict;
use warnings;
use Win32::Unicode::Dir;
use Encode;
use Tk;
my $mw = Tk::MainWindow->new;
my $tissue_but = $mw->Button(
-text => 'Open file',
-command => [ \&select_unicode_directory ],
);
$tissue_but->grid( -row => 3, -column => 1 );
Tk::MainLoop();
sub select_unicode_directory{
my $dir = $mw->chooseDirectory( );
my $wdir = Win32::Unicode::Dir->new;
my $dir = $wdir->open($dir) || die $wdir->error;
my $dir_complete = "$dir/a.txt";
open (MYFILE, $dir_complete );
while (<MYFILE>) {
chomp;
print "$_\n";
}
close (MYFILE);
}
There is a logical error in:
my $dir = $wdir->open($dir) || die $wdir->error;
my $dir_complete = "$dir/a.txt";
$wdir->open('path') returns an object, not a string. You can't use it like a path. But that is not the worst of it. Sadly, it seems like the Tk implementation does not yet have support for Unicode file names (including chooseDirectory). I guess you will have to write a custom dir selector, but I'm not sure it's even possible.
This is capable of listing files in an ascii-chars folder (and ->fetch can list utf-8 files), and crashes when opening a folder with utf-8 chars. Well, to be fair it crashes when opening ??????.
use strict;
use warnings;
use Win32::Unicode::Dir;
use Win32::Unicode::Console;
use Encode;
use Tk;
my $mw = Tk::MainWindow->new;
my $tissue_but = $mw->Button(
-text => 'Select dir',
-command => [ \&select_unicode_directory ],
);
$tissue_but->grid( -row => 3, -column => 1 );
Tk::MainLoop();
sub select_unicode_directory {
my $wdir = Win32::Unicode::Dir->new;
my $selected = $mw->chooseDirectory(-parent =>$mw);
# http://search.cpan.org/dist/Tk/pod/chooseDirectory.pod#CAVEATS
$selected = encode("utf-8", $selected);
print "selected: $selected\n";
$wdir->open($selected) || die $wdir->error;
print "\$mw->chooseDirectory: $selected\n";
print "\$wdir->open(\$selected): $wdir\n";
# CRASH HERE, presumably because winders can't handle '?' in a file (dir) name
for ($wdir->fetch) {
# http://search.cpan.org/~xaicron/Win32-Unicode-0.38/lib/Win32/Unicode/Dir.pm
next if /^\.{1,2}$/;
my $path = "$selected/$_";
if (file_type('f', $path)) { print "file: $path\n"; }
elsif (file_type('d', $path)) { print " dir: $path\n"; }
}
print "closing \n";
$wdir->close || die $wdir->error;
}
Sample out (opening Поиск/):
Both samples below were run using: Strawberry Perl 5.12.3 built for MSWin32-x64-multi-thread
selected: C:/cygwin/home/jaroslav/tmp/so/perl/open-file-tk/?????
$mw->chooseDirectory: C:/cygwin/home/jaroslav/tmp/so/perl/open-file-tk/?????
$wdir->open($selected): Win32::Unicode::Dir=HASH(0x2e38158)
>>> perl crash <<<
Sample out (opening parent of Поиск):
selected: C:/cygwin/home/jaroslav/tmp/so/perl/open-file-tk
$mw->chooseDirectory: C:/cygwin/home/jaroslav/tmp/so/perl/open-file-tk
$wdir->open($selected): Win32::Unicode::Dir=HASH(0x2b92c10)
file: C:/cygwin/home/jaroslav/tmp/so/perl/open-file-tk/.select_uni_dir.pl.swp
file: C:/cygwin/home/jaroslav/tmp/so/perl/open-file-tk/o
file: C:/cygwin/home/jaroslav/tmp/so/perl/open-file-tk/o.dir
file: C:/cygwin/home/jaroslav/tmp/so/perl/open-file-tk/select_uni_dir.pl
file: C:/cygwin/home/jaroslav/tmp/so/perl/open-file-tk/select_uni_file.pl
dir: C:/cygwin/home/jaroslav/tmp/so/perl/open-file-tk/Поиск
Conclusion
The Tk dir selector returns ????? instead of Поиск. While looking for a way to enable Unicode in Tk, I found this:
http://search.cpan.org/dist/Tk/pod/UserGuide.pod#Perl/Tk_and_Unicode :
(...) Unfortunately, there are still places in Perl ignorant of
Unicode. One of these places are filenames. Consequently, the file selectors
in Perl/Tk do not handle encoding of filenames properly. Currently they
suppose that filenames are in iso-8859-1 encoding, at least on Unix systems.
As soon as Perl has a concept of filename encodings, then Perl/Tk will also
implement such schemes.
So at first glance it seems what you're trying to do is impossible (unless you
write or find a custom dir-selector). Actually, it may not be a bad idea to
submit this bug, because the GUI did show "Поиск" so the error is in the return value.

Perl reading zip files with IO::Uncompress::AnyUncompress

We are moving from our current build system (which is a mess) to one that uses Ant with Ivy. I'm cleaning up all the build files, and finding the jar dependencies. I thought it might be easier if I could automate it a bit, by going through the jars that are checked into the project, finding what classes they contain, then matching those classes with the various import statements in the Java code.
I have used Archive::Tar before, but Archive::Zip isn't a standard Perl module. (My concern is that someone is going to try my script, call me in the middle of the night and tell me it isn't working.)
I noticed that IO::Uncompress::AnyUncompress is a standard module, so I thought I could try IO::Uncompress::AnyUncompressor at leastIO::Uncompress::Unzip` which is also a standard module.
Unfortunately, the documentation for these modules give no examples (According to the documentation, examples are a todo).
I'm able to successfully open my jar and create an object:
my $zip_obj = IO::Uncompress::AnyUncompress->new ( $zip_file );
Now, I want to see the contents. According to the documentation:
getHeaderInfo
Usage is
$hdr = $z->getHeaderInfo();
#hdrs = $z->getHeaderInfo();
This method returns either a hash reference (in scalar context) or a list or hash references (in array context) that contains information about each of the header fields in the compressed data stream(s).
Okay, this isn't an object like Archive::Tar or Archive::Zip returns, and there are no methods or subroutines mentioned to parse the data. I'll use Data::Dumper and see what hash keys are contained in the reference.
Here's a simple test program:
#! /usr/bin/env perl
use 5.12.0;
use warnings;
use IO::Uncompress::AnyUncompress;
use Data::Dumper;
my $obj = IO::Uncompress::AnyUncompress->new("testng.jar")
or die qq(You're an utter failure);
say qq(Dump of \$obj = ) . Dumper $obj;
my #header2 = $obj->getHeaderInfo;
say qq(Dump of \$header = ) . Dumper $headers->[0];
And here's my results:
Dump of $obj = $VAR1 = bless( \*Symbol::GEN0, 'IO::Uncompress::Unzip' );
Dump of $header = $VAR1 = {
'UncompressedLength' => 0,
'Zip64' => 0,
'MethodName' => 'Stored',
'Stream' => 0,
'Time' => 1181224440,
'MethodID' => 0,
'CRC32' => 0,
'HeaderLength' => 43,
'ExtraFieldRaw' => '¦- ',
'ExtraField' => [
[
'¦-',
''
]
],
'FingerprintLength' => 4,
'Type' => 'zip',
'TrailerLength' => 0,
'CompressedLength' => 0,
'Name' => 'META-INF/',
'Header' => 'PK
+N¦6 META-INF/¦- '
};
Some of that looks sort of useful. However, all of my entries return `'Name' => 'META-INF/``, so it doesn't look like a file name.
Is it possible to use IO::Uncompress::AnyUncompress (or even IO::Uncompress:Unzip) to read through the archive and see what files are in its contents. And, if so, how do I parse that header?
Otherwise, I'll have to go with Archive::Zip and let people know they have to download and install it from CPAN on their systems.
The files in the archive are compressed in different data streams, so you need to iterate through the streams to get the individual files.
use strict;
use warnings;
use IO::Uncompress::Unzip qw(unzip $UnzipError);
my $zipfile = 'zipfile.zip';
my $u = new IO::Uncompress::Unzip $zipfile
or die "Cannot open $zipfile: $UnzipError";
die "Zipfile has no members"
if ! defined $u->getHeaderInfo;
for (my $status = 1; $status > 0; $status = $u->nextStream) {
my $name = $u->getHeaderInfo->{Name};
warn "Processing member $name\n" ;
if ($name =~ /\/$/) {
mkdir $name;
}
else {
unzip $zipfile => $name, Name => $name
or die "unzip failed: $UnzipError\n";
}
}

How to Extract text from MS Word?

I'm trying to open a word document and just extract all the text that is in the document and display it to the user using Win32::OLE
#usr/bin/perl
#OLEWord.pl
#Use string and print warnings
use strict;use warnings;
#Using OLE + OLE constants for Variants and OLE enumeration for Enumerations
use Win32::OLE;
use Win32::OLE::Const 'Microsoft Word';
$Win32::OLE::Warn = 3;
#set the file to be opened
my $file = '/work/Test.docx';
#Create a new instance of Win32::OLE for the Word application, die if could not open the application
my $MSWord = Win32::OLE->new('Word.Application','Quit') and "Opened Word" or die "Unable to open document ", Win32::OLE->LastError();
#Set the screen to Visible, so that you can see what is going on
$MSWord->{'Visible'} = 1;
#open the request file or die and print warning message
my $Doc = $MSWord->Documents->Open('C:\work\Test.docx') or die "Could not open ", $file, " Error:", Win32::OLE->LastError();
#$MSWord->ActiveDocument->SaveAs({Filename => 'AlteredTest.docx',
#FileFormat => wdFormatDocument});
sub ShowObjs {
my $obj = shift;
foreach (sort keys %$obj) {
print "Keys: $_ - $obj->{$_}\n"; }
}
my $paragraphs = $Doc->Paragraphs;
ShowObjs($paragraphs);
# Get and print the Text inside the opened file
my $paragraphs = $Doc->Paragraphs;
my $txt = $Doc->Range->Text;
print $txt;
$MSWord->ActiveDocument->Close;
$MSWord->Quit;
I'm getting this error code:
OLE exception from "Microsoft Word":
Command Failed
Win32::OLE(0.1709) error ox800a1066
in METHOD/PROPERTYGET "Open" at OLEWord.pl line 20
Update: I can open up the Word application fine, it's just when I try to open up the file that is the problem
I have few scripts to convert DOC to various output format using Win32::OLE. They usually start like this:
use Win32::OLE;
use Win32::OLE::Const 'Microsoft Word';
my $wr = Win32::OLE->new('Word.Application')
or die "Failure - word. \n";
$wr->{DisplayAlerts} = wdAlertsNone;
$wr->{Visible} = 0;
my $Doc = $wr->Documents->Open({
FileName => $input_file_path,
ConfirmConversions => 0,
AddToRecentFiles => 0,
Revert => 0,
ReadOnly => 1,
OpenAndRepair => 0,
}) or exit;
...
Please note that $input_file_path has to contain absolute path to your file. You can also enable Visible and DisplayAlerts to see any error Word might give you.
Edit: You can traverse paragraphs using in enumerator:
use Win32::OLE qw(in);
...
my $paragraphs = $Doc->Paragraphs;
for my $par (in $paragraphs) {
print $par->Range->Text();
}
Or you can use Word's own exporting method and save document as one of supported formats:
$Doc->SaveAs({
FileName => 'c:\\work\\Test.txt',
FileFormat => wdFormatEncodedText,
});
The advantage of latter method is that formatting is retained if possible, which produces better results for bullets, numbering and such.
Win32::OLE can be a little funny about interaction. if anything triggers a prompt, you may get a message like this. Typically it may be that it wants to open the file read-only, for example, and put up a dialog, but these dialogs can break under the default initialization of Win32::OLE.
If this is the case, calling
Win32::OLE->Initialize(Win32::OLE::COINIT_OLEINITIALIZE);
before you do anything like instantiate any objects (i.e., before Win32::OLE->new) might do the trick.

perl html parsing lib/tool

Is there some powerful tools/libs for perl like BeautifulSoup to python?
Thanks
HTML::TreeBuilder::XPath is a decent solution for most problems.
I never used BeautifulSoup, but from quick skim over its documentation you might want HTML::TreeBuilder. It can process even broken documents well and allows traverse over parsed tree or query items - look at look_down method in HTML::Element.
If you like/know XPath, see daxim's recommendation. If you like to pick items via CSS selector, have a look at Web::Scraper or Mojo::DOM.
As you're looking for power, you can use XML::LibXML to parse HTML. The advantage then is that you have all the power of the fastest and best XML toolchain (excecpt MSXML, which is MS only) available to Perl to process your document, including XPath and XSLT (which would require a re-parse if you used another parser than XML::LibXML).
use strict;
use warnings;
use XML::LibXML;
# In 1.70, the recover and suppress_warnings options won't shup up the
# warnings. Hence, a workaround is needed to keep the messages away from
# the screen.
sub shutup_stderr {
my( $subref, $bufref ) = #_;
open my $fhbuf, '>', $bufref;
local *STDERR = $fhbuf;
$subref->(); # execute code that needs to be shut up
return;
}
# ==== main ============================================================
my $url = shift || 'http://www.google.de';
my $parser = XML::LibXML->new( recover => 2 ); # suppress_warnings => 1
# Note that "recover" and "suppress_warnings" might not work - see above.
# https://rt.cpan.org/Public/Bug/Display.html?id=58024
my $dom; # receive document
shutup_stderr
sub { $dom = $parser->load_html( location => $url ) }, # code
\my $errmsg; # buffer
# Now process document as XML.
my #nodes = $dom->getElementsByLocalName( 'title' );
printf "Document title: %s\n", $_->textContent for #nodes;
printf "Lenght of error messages: %u\n", length $errmsg;
print '-' x 72, "\n";
print $dom->toString( 1 );