Too late for -CSD - perl

Trying to run this little perl program from parsCit:
parsCit-client.pl e1.txt
Too late for -CSD option at [filename] line 1
e1.txt is here: http://dl.dropbox.com/u/10557283/parserProj/e1.txt
I'm running the program from win7 cmd, not Cygwin.
filename is parsCit-client.pl - entire program is here:
#!/usr/bin/perl -CSD
#
# Simple SOAP client for the ParsCit web service.
#
# Isaac Councill, 07/24/07
#
use strict;
use encoding 'utf8';
use utf8;
use SOAP::Lite +trace=>'debug';
use MIME::Base64;
use FindBin;
my $textFile = $ARGV[0];
my $repositoryID = $ARGV[1];
if (!defined $textFile || !defined $repositoryID) {
print "Usage: $0 textFile repositoryID\n".
"Specify \"LOCAL\" as repository if using local file system.\n";
exit;
}
my $wsdl = "$FindBin::Bin/../wsdl/ParsCit.wsdl";
my $parsCitService = SOAP::Lite
->service("file:$wsdl")
->on_fault(
sub {
my($soap, $res) = #_;
die ref $res ? $res->faultstring :
$soap->transport->status;
});
my ($citations, $citeFile, $bodyFile) =
$parsCitService->extractCitations($textFile, $repositoryID);
#print "$citations\n";
#print "CITEFILE: $citeFile\n";
#print "BODYFILE: $bodyFile\n";

From perldoc perlrun, about the -C switch:
Note: Since perl 5.10.1, if the -C option is used on the "#!" line, it
must be specified on the command line as well, since the standard
streams are already set up at this point in the execution of the perl
interpreter. You can also use binmode() to set the encoding of an I/O
stream.
Which is presumably what the compiler means by it being "too late".
In other words:
perl -CSD parsCit-client.pl

Because command-line options in a #! "shebang" are not passed consistently across all operating systems (see this answer), and Perl has already opened streams before parsing the script shebang, and so cannot compensate for this in some older OSs, it was decided in bug 34087 to forbid -C in the shebang. Of course, not everyone was happy with this "fix", particularly if it would have otherwise worked on their OS and they don't want to think about anything other than UTF-8.
If you think binmode() is ugly and unnecessary (and doesn't cover command-line arguments), you might like to consider the utf8::all package which has a similar effect to perl -CSDL.
Or were you using *nix, I would suggest export PERL_UNICODE="SDA" in the enclosing script to get Perl to realise it's in a UTF-8 environment.

Related

Perl - reconcile (on Windows) checksum of file generated on Unix? [duplicate]

I am looking for ways to get file checksums in Perl but not by executing the system command cksum -- would like to do it in Perl itself because the script needs to be portable between UNIX and Windows. cksum <FILENAME> | awk '{ print $1 }' works on UNIX but obviously not in Windows. I have explored MD5 but it seems like getting a file handle is necessary and generally it doesn't seem like a very compact way to get that data (one-liner preferable).
Is there a better way?
Here are three different ways depending on which modules you have available:
use Digest::MD5 qw(md5_hex);
use File::Slurp;
print md5_hex(read_file("filename")), "\n";
use IO::All;
print md5_hex(io("filename")->all), "\n";
use IO::File;
print md5_hex(do { local $/; IO::File->new("filename")->getline }), "\n";
Not completely one-line but pretty close.
Replace Digest::MD5 with any hash algorithm you want, e.g. SHA1.
IO::File is in core and should be available everywhere, but that's the solution I personally dislike the most. Anyway, it works.
I couldn't make any of the above work for me in windows, I would always get an incorrect MD5. I got suspicious that it was being caused by differences in linebreak, but converting the file to DOS or to unix made no difference. The same code with the same file would give me the right answer on linux and the wrong one in windows. Reading the documentation, I finally found something that would work both in windows and linux:
use Digest::MD5;
open ($fh, '<myfile.txt');
binmode ($fh);
print Digest::MD5->new->addfile($fh)->hexdigest;
I hope this helps other people having difficulty in windows, I find it so weird that I didn't find any mentions to problems on windows...
This also works:
use Digest::MD5 qw(md5_base64);
...
open(HANDLE, "<", $dirItemPath);
my $cksum = md5_base64(<HANDLE>);
print "\nFile checksum = ".$cksum;

Compact way of getting file checksum in Perl

I am looking for ways to get file checksums in Perl but not by executing the system command cksum -- would like to do it in Perl itself because the script needs to be portable between UNIX and Windows. cksum <FILENAME> | awk '{ print $1 }' works on UNIX but obviously not in Windows. I have explored MD5 but it seems like getting a file handle is necessary and generally it doesn't seem like a very compact way to get that data (one-liner preferable).
Is there a better way?
Here are three different ways depending on which modules you have available:
use Digest::MD5 qw(md5_hex);
use File::Slurp;
print md5_hex(read_file("filename")), "\n";
use IO::All;
print md5_hex(io("filename")->all), "\n";
use IO::File;
print md5_hex(do { local $/; IO::File->new("filename")->getline }), "\n";
Not completely one-line but pretty close.
Replace Digest::MD5 with any hash algorithm you want, e.g. SHA1.
IO::File is in core and should be available everywhere, but that's the solution I personally dislike the most. Anyway, it works.
I couldn't make any of the above work for me in windows, I would always get an incorrect MD5. I got suspicious that it was being caused by differences in linebreak, but converting the file to DOS or to unix made no difference. The same code with the same file would give me the right answer on linux and the wrong one in windows. Reading the documentation, I finally found something that would work both in windows and linux:
use Digest::MD5;
open ($fh, '<myfile.txt');
binmode ($fh);
print Digest::MD5->new->addfile($fh)->hexdigest;
I hope this helps other people having difficulty in windows, I find it so weird that I didn't find any mentions to problems on windows...
This also works:
use Digest::MD5 qw(md5_base64);
...
open(HANDLE, "<", $dirItemPath);
my $cksum = md5_base64(<HANDLE>);
print "\nFile checksum = ".$cksum;

Non-Ascii data behaves differently with different Perl installations

I have the following script which behaves differently on two different Perl installations I have. One is Perl 5.8.5 and the other is Perl 5.8.8.
Here is the script:
#!/usr/bin/perl
use FindBin(qw($Bin));
use lib $Bin;
use lib "$Bin/../lib";
use XML::LibXML;
use strict; # quote strings, declare variables
use warnings; # on by default
use warnings qw(FATAL utf8); # fatalize encoding glitches
use open qw(:std :utf8); # undeclared streams in UTF-8
my $xml =<<EOS;
<?xml version="1.0" encoding="UTF8"?>
<foo>Привет, мир!</foo>
EOS
my $parser = new XML::LibXML;
my $doc = '';
eval { $doc = $parser->parse_string($xml); };
if ($#) {
die "Error: $#";
}
my $root = $doc->getDocumentElement();
print "XML after parsing: ", $root->toString(), "\n";
On my 5.8.8 Perl installation I get:
XML after parsing: <foo>Привет, мир!</foo>
On my 5.8.5 Perl installation I get:
XML after parsing: <foo>Привет, мир!</foo>
I want my 5.8.5 installation to behave like the 5.8.8 one in this regard. Is this a matter of just upgrading my Perl, or setting some special compilation flag?
First of all, both outputs are equivalent. XML::LibXML is free to generate either one, and it shouldn't matter to the receiving parser. Of course, XML is suppose to be human readable, and this is probably what concerns you.
No, XML::LibXML does not have an option to control which characters it escapes. In fact, I've only known it to escape only when needed, which is the first behaviour.
No need to upgrade Perl. Upgrading XML::LibXML or libxml2 (the underlying library used by XML::LibXML) will do the trick.
# XML::LibXML's version
>perl -MXML::LibXML -E"say $XML::LibXML::VERSION"
1.70
# libxml2's version
>perl -MXML::LibXML -E"say XML::LibXML::LIBXML_DOTTED_VERSION"
2.7.7
Off-topic tips:
I presume your source code is encoded using UTF-8? If so, I would add use utf8; to let Perl know that. If you do, you'll need to change
my $xml = <<EOS;
to
my $xml = encode_utf8(<<EOS);
Using
<<'EOI'
instead of
<<EOI
will prevent Perl from messing with your XML (prevent interpolation and interpretation of \ sequences).

perl mktemp and echo

i am trying to put some word in tempfile via commandline
temp file creat but word not past in tempfile
#!/usr/bin/perl -w
system ('clear');
$TMPFILE = "mktemp /tmp/myfile/devid.XXXXXXXXXX";
$echo = "echo /"hello world/" >$TMPFILE";
system ("$TMPFILE");
system ("$echo");
Please Help to Solve This
To capture the name output by mktemp, do this instead:
chomp($TMPFILE = `mktemp /tmp/myfile/devid.XXXXXXXXXX`);
But Perl can do all the things you are doing without resorting to the shell.
Avoid using external commands from perl script as much as possible.
you can use: File::Temp module in this case, see this
Here's a specific demonstration of the advice that others have given you: where possible, use Perl directly rather than invoking system. Also, you should get in the habit of including use strict and use warnings in your Perl scripts.
use strict;
use warnings;
use File::Temp;
my $ft = File::Temp->new(
UNLINK => 0,
TEMPLATE => '/tmp/myfile/devid.XXXXXXXXXX',
);
print "Writing to temp file: ", $ft->filename, "\n";
print $ft "Hello, world.\n";

Can I run a Perl script from stdin?

Suppose I have a Perl script, namely mytest.pl. Can I run it by something like cat mytest.pl | perl -e?
The reason I want to do this is that I have a encrypted perl script and I can decrypt it in my c program and I want to run it in my c program. I don't want to write the decrypted script back to harddisk due to secruity concerns, so I need to run this perl script on-the-fly, all in memory.
This question has nothing to do with the cat command, I just want to know how to feed perl script to stdin, and let perl interpreter to run it.
perl < mytest.pl
should do the trick in any shell. It invokes perl and feeds the script in via the shell redirection operator <.
As pointed out, though, it seems a little unnecessary. Why not start the script with
#!/usr/bin/perl
or perhaps
#!/usr/bin/env perl
? (modified to reflect your Perl and/or env path)
Note the Useless Use of Cat Award. Whenever I use cat I stop and think whether the shell can provide this functionality for me instead.
Sometimes one needs to execute a perl script and pass it an argument. The STDIN construction perl input_file.txt < script.pl won't work. Using the tip from How to assign a heredoc value to a variable in Bash we overcome this by using a "here-script":
#!/bin/bash
read -r -d '' SCRIPT <<'EOS'
$total = 0;
while (<>) {
chomp;
#line = split "\t";
$total++;
}
print "Total: $total\n";
EOS
perl -e "$SCRIPT" input_file.txt
perl mytest.pl
should be the correct way. Why are you doing the unnecessary?
cat mytest.pl | perl
…is all you need. The -e switch expects the script as a command line argument.
perl will read the program from STDIN if you don't give it any arguments.
So you could theoretically read an encrypted file, decrypt it, and run it, without saving the file anywhere.
Here is a sample program:
#! /usr/bin/perl
use strict;
use warnings;
use 5.10.1;
use Crypt::CBC;
my $encrypted = do {
open my $encrypted_file, '<', 'perl_program.encrypted';
local $/ = undef;
<$encrypted_file>;
};
my $key = pack("H16", "0123456789ABCDEF");
my $cipher = Crypt::CBC->new(
'-key' => $key,
'-cipher' => 'Blowfish'
);
my $plaintext = $cipher->decrypt($encrypted);
use IPC::Run qw'run';
run [$^X], \$plaintext;
To test this program, I first ran this:
perl -MCrypt::CBC -e'
my $a = qq[print "Hello World\n"];
my $key = pack("H16", "0123456789ABCDEF");
my $cipher = Crypt::CBC->new(-key=>$key,-cipher=>"Blowfish");
my $encrypted = $cipher->encrypt($a);
print $encrypted;
' > perl_program.encrypted
This still won't stop dedicated hackers, but it will prevent most users from looking at the unencrypted program.