Changing the location of a program leads to a syntax error - perl

I have a program called shuffle.pl. When I run it with perl shuffle.pl Input Shuffled, it works and shows no error.
I created a directory called .tools under my home directory and added it to the path in my .cshrc, so that I can run the program without typing perl. (This is my first time doing this, so I may have done something wrong here.)
But when I move shuffle.pl to ~/.tools and run it, I get an error on line 5. If I use perl ~/.tools/shuffle.pl, it works. So the program itself should have no syntax error. Why doesn't it work after I put it in ~/.tools?
error message
.tools/shuffle.pl: 5: Syntax error: "(" unexpected
.cshrc
set path = (. ~ ~/.tools /sbin /bin /usr/sbin /usr/bin /usr/games /usr/local/sbin /usr/local/bin )
thanks
here is my program
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(first max maxstr min minstr reduce shuffle sum);
open(my $fh,"<","$ARGV[0]");
my @Lines = readline($fh);
my @Shuffled = shuffle(@Lines);
close $fh;
open(my $shuf,">","$ARGV[1]");
print $shuf @Shuffled;
close $shuf;

The shebang is used to tell which interpreter should be used for this script. For this to work, the magic number #! has to appear at the immediate beginning of the file. Otherwise, the default interpreter is used.
In this case, the shebang was preceded by a few empty lines. They have to be removed.
The shebang is not used to select the interpreter when an explicit interpreter is given to execute the file, e.g. in $ perl script.pl.
It is only important when launched as executable: ./script.pl. In that case, the kernel is left to figure out what to do with it: Load into memory as compiled program? Launch an interpreter? Which one? Magic numbers like #! resolve this.
In general, if the shebang doesn't work, the following possible errors can be checked:
A UTF byte order mark precedes the #!.
Diagnosis: A hexdump shows EF BB BF (UTF-8 BOM) at the beginning; FE FF or FF FE would indicate a UTF-16 BOM.
Solution: configure your editor to store files without a BOM
The script is encoded in such a way that the beginning does not decode to #! as ASCII.
Diagnosis: The file does not begin with #! when opened as ASCII or does not begin with 23 21 in a hexdump. Or your editor shows UTF-16 or UTF-32 as the encoding.
Solution: Store the script in ASCII-compatible encoding. UTF-8 is an especially good choice.
Non-native line endings can be confused to be part of the executable name. E.g. with windows line endings, the shebang in
#!/usr/bin/perl
print 1;
could be taken as the interpreter name "/usr/bin/perl\r". Many filesystems allow line endings inside filenames.
Diagnosis: A hexdump shows something other than a space (20) or newline (0A) after the executable name, such as a carriage return (0D).
Solution: Convert line endings to Unix.
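The checks above can be sketched as a small Perl script that inspects the first bytes of a file. This is only an illustration: the sub name diagnose_shebang, the messages, and the 256-byte read size are assumptions, not part of any standard tool.

```perl
use strict;
use warnings;

# Inspect the leading bytes of a script and report common shebang problems.
# Takes the raw bytes from the start of the file, returns a list of messages.
sub diagnose_shebang {
    my ($head) = @_;
    my @problems;

    if ($head =~ /\A\xEF\xBB\xBF/) {
        push @problems, 'UTF-8 byte order mark before #!';
    }
    elsif ($head =~ /\A(?:\xFE\xFF|\xFF\xFE)/) {
        push @problems, 'UTF-16 byte order mark (not ASCII-compatible)';
    }
    elsif ($head !~ /\A#!/) {
        push @problems, 'file does not start with #! (leading blank lines?)';
    }

    # A carriage return on the shebang line becomes part of the interpreter name.
    if ($head =~ /#![^\n]*\r/) {
        push @problems, 'carriage return (0D) on the shebang line';
    }
    return @problems;
}

# Usage: perl check_shebang.pl some_script.pl
if (@ARGV) {
    open(my $fh, '<:raw', $ARGV[0]) or die "Cannot open $ARGV[0]: $!";
    read($fh, my $head, 256);
    close $fh;
    my @problems = diagnose_shebang($head // '');
    if (@problems) {
        print "$_\n" for @problems;
    }
    else {
        print "No obvious shebang problem\n";
    }
}
```

For the script in the question, the leading empty lines would trigger the "does not start with #!" message.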

A few general tips:
You should have -w on the shebang line to catch warnings.
You should probably use strict; also.
Don't put double quotes around "$ARGV[0]" and "$ARGV[1]" because they serve no purpose.
Use "do or die" syntax on the file opens, e.g.:
open (File, "<", $ARGV[0]) || die "File open error: $!";
Do those things and I am pretty sure the solution will appear rapidly.
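As a rough sketch of these tips applied to the program from the question: the sub name shuffle_file and the wording of the error messages are made up for illustration, and the shebang line is left out here because it has to be the very first line of the real file.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Read every line of $in, shuffle the lines, and write them to $out.
# Returns the number of lines processed.
sub shuffle_file {
    my ($in, $out) = @_;

    open(my $fh, '<', $in) or die "Cannot open $in for reading: $!";
    my @lines = readline($fh);
    close $fh;

    open(my $shuf, '>', $out) or die "Cannot open $out for writing: $!";
    print {$shuf} shuffle(@lines);
    close($shuf) or die "Cannot close $out: $!";

    return scalar @lines;
}

# Usage: shuffle.pl Input Shuffled
shuffle_file(@ARGV) if @ARGV == 2;
```

With the checks in place, a missing input file produces a message naming the file and the system error instead of failing silently.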

Related

Perl newbie first experience with Unicode (in filename, -e operator, open operator, and cmd window)

I have a Windows Perl (5.16.1 32 bit) program that opens a media file and (using ffmpeg) it extracts segments of audio - the purpose of which is to convert a single album music track (containing multiple songs) into multiple individual song files.
When the name of the media file to be processed is all ASCII characters, this all works rather well.
I recently tried this program against a filename that includes Russian characters, and the program fails miserably in several areas.
While this must have to do with Unicode, and as I have never previously needed to do anything with Unicode - I am rather confused about the various aspects of failures that I am experiencing here, nor do I know the fix for the variety of issues I am now facing.
I have distilled this down to the minimum to demonstrate the problems.
If I open a cmd window, and type 'chcp', the return value is 437.
If I do a 'dir' command, this is what is shown for me:
04/01/2019 11:46 AM 71,982,427 IC3PEAK альбом Сладкая.mkv
06/10/2020 10:42 PM 275 test.pl
(Note how in my cmd window, the Russian characters do display as Russian characters.)
My 'test.pl' Perl script is here:
use open ":std", ":encoding(UTF-8)";
$media = "IC3PEAK альбом Сладкая.mkv";
if (-e $media) {
print "Media file does exist\n";
} else {
print "Media file does NOT exist\n";
}
open(IN, $media) || die "Media file ($media) can not be opened!\n";
When this Perl script runs, using default chcp value of 437, I get this as output:
Media file does NOT exist
Media file (IC3PEAK альбом Сладкая.mkv) can not be opened!
If I run 'chcp 1250' in my cmd window, and I re-run this Perl script, I get this as output:
Media file does NOT exist
Media file (IC3PEAK Ă°Ă»ÑŒĂ±ĂÂľĂÂĽ Ă¡Ă»Ă°Ă´ĂÂşĂ°Ñ.mkv) can not be opened!
Problem 1: I am told the media file does not exist.
Problem 2: When I print the media file name to STDOUT, notice how the displayed file name no longer matches how it looks when I did the 'dir' command?
Can anyone suggest how to fix these two problems?
PS - Noting, when I change the disk file name to pure ASCII 'IC3PEAK.mkv', and change the $media variable to also equal 'IC3PEAK.mkv', running the modified Perl script gives:
Media file does exist
Following code was tested in Windows 10 1903, perl -MWin32 -e"CORE::say Win32::GetACP()" returns ACP 1252 (Win 10 North America) with Win32 strawberry-perl 5.30.2.1 #1 Tue Mar 17 03:21:32 2020 x64.
Initial attempt to install cpan Win32::Unicode::File failed with t/04_print.t (Wstat: 768 Tests: 13 Failed: 3) message.
A quick search on Google led to the following post on Perl Monks. It looks like the problem with installing Win32::Unicode::File has been known for some time.
NOTE: ikegami pointed out that the module can be force-installed and the failed test can be ignored. Please see his comment below.
Following test code confirms that a forced installation cpan -f -i Win32::Unicode::File produces desired outcome.
use strict;
use warnings;
use feature 'say';
use utf8;
use Win32::Console;
use Win32::Unicode::File;
Win32::Console::OutputCP( 65001 );
binmode STDOUT, ':encoding(UTF-8)';
binmode STDERR, ':encoding(UTF-8)';
my $fname = 'Доброе утро Россия.mkv';
my $fh = Win32::Unicode::File->new;
open $fh, '<:encoding(UTF-8)', $fname
or die "Can't open $fname $!";
while( <$fh> ) {
say;
}
close $fh;
Content of input file Доброе утро Россия.mkv is
Доброе утро Россия
As suggested in above mentioned post I resorted to try Win32::LongPath as an alternative. Installation of the module went successfully through.
use strict;
use warnings;
use feature 'say';
use utf8;
use Win32::Console;
use Win32::LongPath;
Win32::Console::OutputCP( 65001 );
binmode STDOUT, ':encoding(UTF-8)';
binmode STDERR, ':encoding(UTF-8)';
my $fname = 'IC3PEAK альбом Сладкая.mkv';
my $fh;
openL \$fh, '<:encoding(UTF-8)', $fname
or die "Can't open $fname ($^E)";
while( <$fh> ) {
# process input
say;
}
close $fh;
Instead of real file IC3PEAK альбом Сладкая.mkv a text file with same name was used in the test with following content
Привет Москва
Note: use openL \$fh, '<', $fname on real mkv file to read content of the file
Three fixes are needed.
Non-ASCII source without use utf8;
Your source contains non-ASCII characters.
$media = "IC3PEAK альбом Сладкая.mkv";
Perl expects source code to be encoded using ASCII, unless you use use utf8;. Encode your source using UTF-8 and use use utf8;.
use utf8;
# String of decoded text (aka string of Unicode Code Points).
# Length = 26
my $media = "IC3PEAK альбом Сладкая.mkv";
Assuming your file was encoded using UTF-8, what you had was equivalent to the following:
use utf8;
use Encode qw( encode );
# String of text encoded using UTF-8 (aka string of bytes).
# Length = 39
my $media = encode("UTF-8", "IC3PEAK альбом Сладкая.mkv");
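The difference between the two strings can be checked by printing their lengths. This is a minimal sketch that assumes the script file is saved as UTF-8; it prints only ASCII, so the result does not depend on the terminal's code page.

```perl
use strict;
use warnings;
use utf8;                      # the source file itself must be saved as UTF-8
use Encode qw( encode );

# Decoded text: one element per Unicode code point.
my $media = "IC3PEAK альбом Сладкая.mkv";

# The same text encoded as UTF-8: one element per byte.
my $bytes = encode("UTF-8", $media);

printf "characters: %d\n", length($media);   # 26
printf "bytes:      %d\n", length($bytes);   # 39
```

The 13 Cyrillic letters take two bytes each in UTF-8, which accounts for the 13 extra bytes.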
Incorrect output encoding
Your code contains
use open ":std", ":encoding(UTF-8)";
This tells Perl the following:
Decode bytes received from STDIN using UTF-8.
Encode characters sent to STDOUT and STDERR using UTF-8.
Do the same for file handles opened in the current lexical scope.
The problem is that your terminal isn't expecting UTF-8. It's expecting cp437 (before chcp 1250) or cp1250 (after chcp 1250).
Solution 1:
Adjust the encoding specified in the use open line. This shows how this can be done without hardcoding the encoding.
Of course, you'll only be able to print the Cyrillic characters if the terminal's OEM code page (as set using chcp) supports the characters. This brings us to a second solution.
Solution 2:
Adjust the terminal to provide/expect UTF-8. This can be done using the following:
chcp 65001
Limitation of builtin functions that accept file names
Windows provides two versions of each function that accepts strings:
The "UNICODE" version (suffixed with "W" for "wide") accepts/returns strings encoded using UTF-16le. This version supports all Unicode characters.
The "ANSI" version (suffixed with "A") accepts/returns strings encoded using the Active Code Page (ACP). The "A" version only supports a small subset of the Unicode characters.
You can obtain the ACP for your system using the following:
perl -MWin32 -e"CORE::say Win32::GetACP()"
Unfortunately, Perl functions (named operators) use the "A" version of system calls and expect/return text encoded using the ACP. This severely limits which file names can be passed to them.
For example, my system's ACP is 1252, so the "A" version of system calls would not support Cyrillic characters. This means there is nothing I can do to make open, -e, etc work with file names containing Cyrillic characters. ouch.
[Upd: I now recommend Win32::LongPath instead.] The Win32-Unicode distribution can help with this. For example, -e is just a call to stat, and Win32::Unicode::File provides statW, a version of stat that accepts file names as decoded text. Similarly, it provides a replacement for open.

Perl IO::Stringy and IO::InnerFile install test failure on Windows OS (and how to fix it)

After clean installation of Active Perl (64-bit edition, version 5.24.3) on Windows 8.1 PC I needed to add Spreadsheet::Read Perl module. However, its CPAN installation failed.
Analysis of the console report showed that the root cause of failure is IO::InnerFile module, which was not installed. Or – better said – failure of all seven automated tests of this module. The test script is named IO_InnerFile.t and (in my case) it is located in the C:\Perl64\cpan\build\IO-stringy-2.111-0\t directory.
The approach by jmasa didn't work for me, but this did:
In IO_InnerFile.t, change the block right after # Create a test file to:
# Create a test file
do {
open(OUT, '>t/dummy-test-file') || die("Cannot write t/dummy-test-file: $!");
local $\ = "\n"; ## Use `print` vs. `say` to avoid extra blank lines.
binmode OUT, ':raw'; ## Force output of UNIX line terminators.
print OUT <<'EOF';
Here is some dummy content.
Here is some more dummy content
Here is yet more dummy content.
And finally another line.
EOF
close(OUT);
};
This localizes the change to $\ (the output record separator, set here to the UNIX single-character line terminator) and uses it to output the test text to the dummy file.
Then run
dmake test
dmake install
Soon I realized that (because of seeking) the test script IO_InnerFile.t can be used only on platforms where a line terminator is a single byte.
In case of M$ Windows, where the line terminator consists of two bytes \r\n sequence, seeking to absolute position does not work – simply the test itself is not portable.
The possible fix is adding PerlIO layer ":crlf":
Open the file IO_InnerFile.t and find the line reading:
my $fh = IO::File->new('<t/dummy-test-file');
and change it to:
my $fh = IO::File->new('<:crlf t/dummy-test-file');
Note the space between “<:crlf” and “t/dummy-test-file”
Open console window and switch to the module build directory (C:\Perl64\cpan\build\IO-stringy-2.111-0 in my case)
Run manually:
dmake test
dmake install
Note: I don’t bother with proper PATH settings and absolute file positions, which may vary.

How to read a .conf file in Perl

I just created a text test.conf file with some information. How can I read it on Perl?
I am new to Perl and I am not sure what I need to do.
I tried the following:
C:\Perl\Perl_Project>perl
#!/usr/local/bin/perl
open (MYFILE, 'test.conf');
while (<MYFILE>)
{ chomp; print "$_\n"; }
close (MYFILE);
I tried installing Perl on my laptop that has Windows 7 OS, and using command line.
Instead of typing the program at the command line, write it in a file (you can use any editor; I would suggest Notepad++) and save it as myprogram.pl in the same directory as your .conf file.
use warnings;
use strict;
open my $fh, "<", "test.conf" or die $!;
while (<$fh>)
{
chomp;
print "$_\n";
}
close $fh;
Now open a command prompt, go to the directory that contains both myprogram.pl and test.conf, and run your program by typing this:
perl myprogram.pl
You can also give the full path of the input file inside the program; then you can run the program from any directory by giving its full path at the command prompt:
perl path\to\myprogram.pl
Side note: Always use use warnings; and use strict; at the top of your program and to open file always use lexical filehandle with three arguments with error handling.
This is an extended comment more than an answer, as I believe @serenesat has given you everything you need to execute your program.
When you do "command line" Perl, it's typically stuff that is relatively brief or trivial, such as:
perl -e "print 2 ** 16"
Anything that goes beyond a few lines, and you're probably better off putting that in a file and having Perl run the file. You certainly can put larger programs on the command line, but when it comes to going back in and editing lines, it becomes more of a hassle than a shortcut.
Also, for what it's worth the -n and -p parameters allow you to process the contents of a stream, meaning you could do something like this:
perl -ne "print if /oracle/i" test.conf
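For comparison, here is roughly what that one-liner looks like as a script file. perlrun documents -n as wrapping the code in a while (<>) loop; the sub name wanted is only an illustrative choice.

```perl
use strict;
use warnings;

# Decide whether a line should be printed; mirrors the one-liner's test.
sub wanted {
    my ($line) = @_;
    return $line =~ /oracle/i ? 1 : 0;
}

# <> reads the files named on the command line, line by line,
# which is the loop that the -n switch supplies for a one-liner.
# Usage: perl filter.pl test.conf
if (@ARGV) {
    while (my $line = <>) {
        print $line if wanted($line);
    }
}
```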

How to delete a bunch of lines in perl (adapting a known one-liner)?

context: I'm a beginner in Perl and struggling, please be patient, thanks.
the question: there is a one-liner that seems to do the job I want (in a cygwin console it does fine on my test file). So now I would need to turn it into a script, but I can't manage that unfortunately.
The one-liner in question is provided in the answer by Aki here Delete lines in perl
perl -ne 'print unless /HELLO/../GOODBYE/' <file_name>
Namely I would like to have a script that opens my file "test.dat" and removes the lines between some strings HELLO and GOODBYE. Here is what I tried and which fails (the path is fine for cygwin):
#!/bin/perl
use strict;
use warnings;
open (THEFILE, "+<test.dat") || die "error opening";
my $line;
while ($line =<THEFILE>){
next if /hello/../goodbye/;
print THEFILE $line;
}
close (THEFILE);
Many thanks in advance!
Your one-liner is equivalent to the following
while (<>) {
print unless /HELLO/../GOODBYE/;
}
Your code does something quite different. You should not attempt to read and write to the same file handle, that usually does not do what you think. When you want to quickly edit a file, you can use the -i "in-place edit" switch:
perl -ni -e 'print unless /HELLO/../GOODBYE/' file
Do note that changes to the file are irreversible, so you should make backups. You can use the backup option for that switch, e.g. -i.bak, but be aware that it is not flawless, as running the same command twice will still overwrite your backup (by saving to the same file name twice).
The simplest and safest way to do it, IMO, is to simply use shell redirection
perl script.pl file.txt > newfile.txt
While using the script file I showed at the top.
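If you would rather have the script open the files itself, a minimal sketch could read the input, drop the block, and write a separate output file. The sub name strip_block and the two-argument usage are assumptions made for illustration.

```perl
use strict;
use warnings;

# Return the lines that fall outside every HELLO .. GOODBYE block.
# The range (flip-flop) operator turns true on a line matching HELLO and
# stays true up to and including the next line matching GOODBYE.
# Its state persists between calls, so an unterminated block carries over.
sub strip_block {
    my @kept;
    for my $line (@_) {
        next if $line =~ /HELLO/ .. $line =~ /GOODBYE/;
        push @kept, $line;
    }
    return @kept;
}

# Usage: perl strip.pl test.dat test.new
if (@ARGV == 2) {
    my ($in, $out) = @ARGV;

    open(my $src, '<', $in) or die "Cannot open $in: $!";
    my @lines = <$src>;
    close $src;

    open(my $dst, '>', $out) or die "Cannot open $out: $!";
    print {$dst} strip_block(@lines);
    close($dst) or die "Cannot close $out: $!";
}
```

Writing to a second file keeps the original untouched, which serves the same purpose as the shell redirection above.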

What is the significance of -T or -w in #!/usr/bin/perl?

I googled #!/usr/bin/perl, but I could not find a satisfactory answer. I know it's a pretty basic thing, but could you explain to me the significance of #!/usr/bin/perl in Perl? Moreover, what does -w or -T signify in #!/usr/bin/perl? I am a newbie to Perl, so please be patient.
The #! is commonly called a "shebang" and it tells the computer how to run a script. You'll also see lots of shell-scripts with #!/bin/sh or #!/bin/bash.
So, /usr/bin/perl is your Perl interpreter and it is run and given the file to execute.
The rest of the line is options for Perl. The "-T" is tainting (it means input is marked as "not trusted" until you check its format). The "-w" turns warnings on.
You can find out more by running perldoc perlrun (perldoc is Perl's documentation reader, might be installed, might be in its own package).
For scripts you write I would recommend starting them with:
#!/usr/bin/perl
use warnings;
use strict;
This turns on lots of warnings and extra checks - especially useful while you are learning (I'm still learning and I've been using Perl for more than 10 years now).
Both -w and -T are sort of "foolproof" flags.
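-w is roughly the same as a use warnings statement in your code (-w applies to the whole program, while use warnings applies only to the enclosing file or block), and it is the equivalent of the warning option in many compilers. The simplest example would be a warning about using an uninitialized variable: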
#!/usr/bin/perl -w
print "$A\n";
print "Hello, world!\n";
Will print:
Name "main::A" used only once: possible typo at ./perl-warnings line 3.
Use of uninitialized value $A in concatenation (.) or string at
./perl-warnings line 3.
Hello, world!
The -T flag means that any value that came from the outside world (as opposed to being calculated inside the program) is considered a potential threat, and Perl disallows the use of such values in system-related operations, like writing files, executing system commands, etc. (That's why Perl activates "taint" mode automatically when the script is running under setuid/setgid.)
Taint mode forces you to double-check the value inside the script.
E.g., the code:
#!/usr/bin/perl -T
$A = shift;
open FILE, ">$A";
print "$A\n";
close FILE;
Will produce a fatal error (terminating the program):
$ ./perl-tainted jkjk
Insecure dependency in open while running with -T switch at
./perl-tainted line 3.
And that's only because the argument value came from "outside" and was not "double-checked". The "taint" mode is drawing your attention to that fact. Of course, it's easy to fool it, e.g.:
#!/usr/bin/perl -T
$A = shift;
$A = $1 if $A =~ /(^.*$)/;
open FILE, ">$A";
print "$A\n";
close FILE;
In this case everything worked fine. You "fooled" the "taint mode". Well, the assumption is that the programmer's intention is to make the program safer, so the programmer wouldn't just work around the error, but would rather take some security measures. One of Perl's nicknames is "the glue and the duct tape of system administrators". It's not unlikely that a system administrator would create a Perl script for his own needs and run it with root permissions. Think of this script doing something normal users are not allowed to do... you probably want to double-check things which are not part of the program itself, and you want Perl to remind you about them.
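By contrast, a check that actually validates the value might look like the sketch below. The sub name untaint_filename and the set of allowed characters are assumptions to adapt to your own needs; the -T switch is left off the first line here so the file also runs with a plain perl script.pl, so pass -T on the command line to see taint mode in effect.

```perl
use strict;
use warnings;

# Return an untainted file name, or undef if the name looks unsafe.
# Only letters, digits, dot, underscore and hyphen are accepted, and the
# name must start with a letter, digit or underscore (no paths, no "..").
sub untaint_filename {
    my ($name) = @_;
    return undef unless defined $name;
    # Capturing with a restrictive pattern is what clears the taint flag.
    return $name =~ /\A([A-Za-z0-9_][A-Za-z0-9._-]*)\z/ ? $1 : undef;
}

# Usage: perl -T untaint.pl jkjk
if (@ARGV) {
    my $file = untaint_filename($ARGV[0]);
    die "Refusing to use suspicious file name\n" unless defined $file;
    open(my $fh, '>', $file) or die "Cannot open $file: $!";
    print "$file\n";
    close $fh;
}
```

Unlike the /(^.*$)/ pattern, this rejects names such as ../etc/passwd instead of passing everything through.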
Hope it helps.
About taint mode (-T):
The behavior of require and use statements changes when taint mode is turned on.
The list of paths searched for libraries/modules no longer contains . (the current directory).
So if you load any libraries or modules relative to the current working directory without explicitly specifying the path, your script will break under taint mode.
For ex: Consider perl_taint_ex.pl
#!/usr/bin/perl -T
require "abc.pl";
print "Done";
would fail like this
D:\perlex>perl perl_taint_ex.pl
"-T" is on the #! line, it must also be used on the command line
at perl_taint_ex.pl line 1.
D:\perlex>perl -T perl_taint_ex.pl
Can't locate abc.pl in @INC (@INC contains: C:/Perl/site/lib C:/Perl/lib)
at perl_taint_ex.pl line 3.
So when taint mode is on, you must tell the require statement explicitly where to load the library from, since . is removed from the @INC array during taint mode.
@INC contains a list of valid paths to read library files and modules from.
If taint mode is on, you would simply do the following:
D:\perlex>perl -ID:\perlex -T perl_taint_ex.pl
Done
-ID:\perlex will include directory D:\perlex in @INC.
You can try other ways of adding a path to @INC; this is just one example.
It's called a shebang. On Unix based systems (OSX, Linux, etc...) that line indicates the path to the language interpreter when the script is run from the command line. In the case of perl /usr/bin/perl is the path to the perl interpreter. If the hashbang is left out the *nix systems won't know how to parse the script when invoked as an executable. It will instead try to interpret the script in whatever shell the user happens to be running (probably bash) and break the script.
http://en.wikipedia.org/wiki/Hashbang
The -w and -T are arguments that control the way the perl interpreter operates. They are the same arguments that you could pass when calling the perl interpreter directly from the command line.
-w shows warnings (useful as debugging information).
-T turns on taint / security checking.