Why can't I set $LIST_SEPARATOR in Perl? - perl

I want to set the LIST_SEPARATOR in perl, but all I get is this warning:
Name "main::LIST_SEPARATOR" used only once: possible typo at ldapflip.pl line 7.
Here is my program:
#!/usr/bin/perl -w
@vals;
push @vals, "a";
push @vals, "b";
$LIST_SEPARATOR='|';
print "@vals\n";
I am sure I am missing something obvious, but I don't see it.
Thanks

Only the mnemonic is available
$" = '|';
unless you
use English;
first.
As described in perlvar. Read the docs, please.
The following names have special meaning to Perl. Most punctuation names have reasonable mnemonics, or analogs in the shells. Nevertheless, if you wish to use long variable names, you need only say
use English;
at the top of your program. This aliases all the short names to the long names in the current package. Some even have medium names, generally borrowed from awk. In general, it's best to use the
use English '-no_match_vars';
invocation if you don't need $PREMATCH, $MATCH, or $POSTMATCH, as it avoids a certain performance hit with the use of regular expressions. See English.

perlvar is your friend:
• $LIST_SEPARATOR
• $"
This is like $, except that it applies to array and slice values interpolated into a double-quoted string (or similar interpreted string). Default is a space. (Mnemonic: obvious, I think.)
$LIST_SEPARATOR is only available if you use English;. If you don't want to use English; in all your programs, use $" instead. It is the same variable, just with a terser name.
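To see that the two names refer to the same variable, here is a minimal sketch. The helper names join_with_punct and join_with_english are made up for illustration, and it assumes the core English module is loaded with -no_match_vars, as the perlvar excerpt suggests.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);   # makes $LIST_SEPARATOR an alias for $"

# Hypothetical helper: interpolate a list using the punctuation name.
sub join_with_punct {
    my ($sep, @items) = @_;
    local $" = $sep;              # restored automatically when the sub returns
    return "@items";
}

# Hypothetical helper: the same thing through the English alias.
sub join_with_english {
    my ($sep, @items) = @_;
    local $LIST_SEPARATOR = $sep;
    return "@items";
}

my @vals = ('a', 'b');
print join_with_punct('|', @vals), "\n";     # prints a|b
print join_with_english('|', @vals), "\n";   # prints a|b
print "@vals\n";                             # prints a b (default separator is back)
```

Using local keeps the change confined to the helper, so code elsewhere still sees the default single space.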

Slightly off-topic (the question is already well answered), but I don't get the attraction of English.
Cons:
A lot more typing
Names not more obvious (ie, I still have to look things up)
Pros:
?
I can see the benefit for other readers - especially people who don't know Perl very well at all. But in that case, if it's a question of making code more readable later, I would rather this:
{
local $" = '|'; # Set interpolated list separator to '|'
# fun stuff here...
}

You SHOULD use the strict pragma:
use strict;
You might also want to use the diagnostics pragma to get additional hints about the warnings (which you have already enabled with the -w flag):
use diagnostics;

Related

The ?PATTERN? operator is not working for Matching Only Once in Perl

I am new to Perl and am practising some programs. I have encountered a syntax error. Please help me.
My Perl program
#!/usr/bin/perl
@list = qw/ food foosball subeo footnote terfoot canic footbridge /;
foreach ( @list ) {
$first = $1 if ?(foo.*)?;
$last = $1 if /(foo.*)/;
}
print "First: $first, Last: $last\n";
Output
syntax error at MatchingOnlyOnce.pl line 9, near "if ?"
Execution of MatchingOnlyOnce.pl aborted due to compilation errors.
Output of perl -v
This is perl 5, version 24, subversion 1 (v5.24.1) built for MSWin32-x64-multi-thread
Copyright 1987-2017, Larry Wall
Perl may be copied only under the terms of either the Artistic License or the
GNU General Public License, which may be found in the Perl 5 source kit.
Complete documentation for Perl, including FAQ lists, should be found on
this system using "man perl" or "perldoc perl". If you have access to the
Internet, point your browser at http://www.perl.org/, the Perl Home Page.
Use
$first=$1 if m?(foo.*)?;
?PATTERN? could be used as a shortcut for m?PATTERN?, but you can no longer omit the match operator's leading m when you use ? as the delimiter.
5.14 deprecated the ability to omit the leading m from m?PATTERN?flags.
5.22 removed the ability to omit the leading m from m?PATTERN?flags.
5.22 and 5.24's perlop lists both m?PATTERN?flags and ?PATTERN?flags, but only the former is legal in these versions.
5.26's documentation will be free of all mentions of ?PATTERN? (as opposed to m?PATTERN?).
If you are not using the default / pattern delimiters, you must specify the match operation as in $x =~ m{...}, $x =~ m!...! etc.
?...? is different than those other alternative delimiters as ?...? does something different than /.../. perldoc perlreref currently states:
?pattern? is like m/pattern/ but matches only once. No alternate delimiters can be used. Must be reset with reset().
That is misleading as Perl used to recognize the plain ?...?, but support for that was completely removed a few years ago:
[perl #120912] [PATCH] Remove support for ?PATTERN? without explicit 'm' operator
…
This has issued a deprecation warning since Perl v5.14 (commit
725a61d70), and precludes using ? as an operator after a unary operator
that defaults to $_, such as:
ref ? $_ : [$_]
Here is the motivation for the deprecation and eventual removal:
Deprecate ?PATTERN?, recommending the equivalent m?PATTERN? syntax, in
order to eventually allow the question mark to be used in new operators
that would currently be ambiguous.
If you are just beginning to learn Perl, you ought to enable strict and warnings. Declare your variables in the smallest applicable scope.
#!/usr/bin/env perl
use strict;
use warnings;
my @list = qw/food foosball subeo footnote terfoot canic footbridge/;
my ($first, $last);
foreach my $item (@list) {
$item =~ m?(foo.*)?
and $first = $1;
$item =~ /(foo.*)/
and $last = $1;
}
print "First: $first, Last: $last\n";
Output:
$ perl tt.pl
First: food, Last: footbridge
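If the one-shot match sits in code that runs more than once, it has to be re-armed with reset, as the perlreref excerpt mentions. A minimal sketch, assuming a made-up helper name first_and_last and that everything lives in a single package (reset only affects the current package):

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first and last "foo..." captures.
sub first_and_last {
    my @words = @_;
    my ($first, $last);
    for my $word (@words) {
        $first = $1 if $word =~ m?(foo.*)?;   # succeeds at most once until reset
        $last  = $1 if $word =~ /(foo.*)/;    # succeeds on every matching word
    }
    reset;   # re-arm the m?...? above so the next call starts fresh
    return ($first, $last);
}

my ($first, $last) =
    first_and_last(qw/food foosball subeo footnote terfoot canic footbridge/);
print "First: $first, Last: $last\n";   # First: food, Last: footbridge
```

Without the reset, a second call would leave $first undefined, because the m?...? has already had its one successful match.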
A lot of things about Perl are unusual for a programming language, and you have happened upon a corner of the language that is rarely used
Perl was designed by the linguist Larry Wall, and there are many similarities between Perl and spoken languages. Perl allows abbreviations of some constructs; for instance, a pattern match like
/abc/
is equivalent to
$_ =~ m/abc/
That is to say, if it looks like a regex pattern it will be treated as one, regardless of whether the leading m is there or not. Also, many Perl operators work on $_ by default, which allows you to write a few lines of code without explicitly mentioning a variable
But m?...? is an old-fashioned construct that has really been edged out by lexical variables. If you declare a variable using my then there is no need for the one-shot match, or the corresponding reset operator. It is your declaration that defines the lifetime of the variable
If you are just starting with Perl, I recommend that you
Always start every program with use strict and use warnings 'all'. This isn't optional as it's the first line of defence against simple mistakes and errors
Always declare every variable using my as close as possible to its first use. Occasionally you may want to declare variables before a loop so that its value is kept across iterations, but generally variables should be temporary and useful only for a few lines of code
Forget about m?...?. I have never seen a program that uses it in twenty years of writing Perl professionally
I hope this helps
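For what it's worth, the same first/last result can be had with an ordinary match and lexical variables, no one-shot match or reset needed. This is only a sketch; the name first_and_last_plain is made up, and the //= operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: first and last "foo..." captures without m?...?.
sub first_and_last_plain {
    my @words = @_;
    my ($first, $last);
    for my $word (@words) {
        next unless $word =~ /(foo.*)/;
        $first //= $1;   # set only on the first successful match
        $last    = $1;   # overwritten on every successful match
    }
    return ($first, $last);
}

my ($first, $last) =
    first_and_last_plain(qw/food foosball subeo footnote terfoot canic footbridge/);
print "First: $first, Last: $last\n";   # First: food, Last: footbridge
```

Because $first and $last are declared inside the sub, every call starts from a clean state.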

Evaluating escape sequences in perl

I'm reading strings from a file. Those strings contain escape sequences which I would like to have evaluated before processing them further. So I do:
$t = eval("\"$t\"");
which works fine. But I have doubts about the performance. If eval forks a perl process each time, it will be a performance killer. Another approach I considered was a regex, for which I have found related questions on SO.
My question: is there a better, more efficient way to do it?
EDIT: before calling eval in my example, $t contains \064\065\x20a\n. It is evaluated to 45 a<LF>.
It's not quite clear what the strings in the file look like and what you do to them before passing off to eval. There's something missing in the explanation.
If you simply want to undo C-style escaping (as also used in Perl), use Encode::Escape:
use Encode qw(decode);
use Encode::Escape qw();
my $string_with_unescaped_literals = decode 'ascii-escape', $string_with_escaped_literals;
If you have placeholders in the file which look like Perl variables that you want to fill with values, then you are abusing eval as a poor man's templating engine. Use a real one that does not have the dangerous side effect of running arbitrary code.
$string =~ s/\\([rnt'"\\])/"qq|\\$1|"/gee
String eval can solve the problem too, but it brings up a host of security and maintenance issues, like @ in the string.
Oh gah, don't use eval for this; that's dangerous if someone provides it with input like "system('sync;reboot');".
But, you could do something like this:
#!/usr/bin/perl
$string = 'foo\"ba\\\'r';
printf("%s\n", $string);
$string =~ s/\\([\"\'])/$1/g;
printf("%s\n", $string);
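For the exact input in the question (\064\065\x20a\n), a substitution with a small lookup table is enough and never runs the input as code. This is a sketch under assumptions: the name unescape is made up, and it handles only \xHH, one- to three-digit octal, and a handful of single-character escapes. Anything else is left untouched.

```perl
use strict;
use warnings;

# Single-character escapes this sketch understands.
my %simple = (
    n    => "\n",
    t    => "\t",
    r    => "\r",
    '\\' => '\\',
    '"'  => '"',
    "'"  => "'",
);

# Hypothetical helper: expand a limited set of C-style escapes.
sub unescape {
    my ($text) = @_;
    $text =~ s{
        \\ (?:
            x([0-9A-Fa-f]{1,2})    # hex escape, one or two digits
          | ([0-7]{1,3})           # octal escape, one to three digits
          | ([nrt\\"'])            # single-character escape
        )
    }{
        defined($1) ? chr(hex($1))
      : defined($2) ? chr(oct($2))
      :               $simple{$3}
    }gex;
    return $text;
}

print unescape('\064\065\x20a\n');   # prints 45 a followed by a newline
```

The /e here evaluates only the fixed replacement expression, not the input, so a string containing system(...) stays a string.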

Why use strict and warnings?

It seems to me that many of the questions in the Perl tag could be solved if people would use:
use strict;
use warnings;
I think some people consider these to be akin to training wheels, or unnecessary complications, which is clearly not true, since even very skilled Perl programmers use them.
It seems as though most people who are proficient in Perl always use these two pragmas, whereas those who would benefit most from using them seldom do. So, I thought it would be a good idea to have a question to link to when encouraging people to use strict and warnings.
So, why should a Perl developer use strict and warnings?
For starters, use strict; (and to a lesser extent, use warnings;) helps find typos in variable names. Even experienced programmers make such errors. A common case is forgetting to rename an instance of a variable when cleaning up or refactoring code.
Using use strict; use warnings; catches many errors sooner than they would be caught otherwise, which makes it easier to find the root causes of the errors. The root cause might be the need for an error or validation check, and that can happen regardless of programmer skill.
What's good about Perl warnings is that they are rarely spurious, so there's next to no cost to using them.
Related reading: Why use my?
use strict should (arguably must) be used when you want Perl to hold you to coding properly: forcing variable declarations, being explicit about strings and subs (i.e., no barewords), and using references with caution. Note: if strict finds a violation, it aborts execution.
use warnings; helps you find mistakes in a program, such as a missing semicolon, writing 'elseif' instead of 'elsif', or using deprecated syntax or functions. Note: use warnings only reports warnings and continues execution; it does not abort.
Anyway, it would be better if we go into details, which I am specifying below
From perl.com (my favourite):
use strict 'vars';
which means that you must always declare variables before you use them.
If you don't declare you will probably get an error message for the undeclared variable:
Global symbol "$variablename" requires explicit package name at scriptname.pl line 3
This error means Perl is not exactly clear about what the scope of the variable is. So you need to be explicit about your variables, which means either declaring them with my, so they are restricted to the current block, or referring to them with their fully qualified name (for example, $main::variablename).
So, a compile-time error is triggered if you attempt to access a variable that hasn't met at least one of the following criteria:
Predefined by Perl itself, such as @ARGV, %ENV, and all the global punctuation variables such as $. or $_.
Declared with our (for a global) or my (for a lexical).
Imported from another package. (The use vars pragma fakes up an import, but use our instead.)
Fully qualified using its package name and the double-colon package separator.
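The criteria above can be checked with a small sketch. It relies on string eval inheriting the enclosing use strict; the helper name strict_vars_error is made up, and the snippets passed to it are fixed literals, not user input.

```perl
use strict;
use warnings;

# Hypothetical helper: compile a snippet under strict and return the
# error text, or an empty string if it compiled and ran cleanly.
sub strict_vars_error {
    my ($code) = @_;
    my $ok = eval "$code; 1";   # string eval inherits the enclosing 'use strict'
    return $ok ? '' : $@;
}

# Undeclared: rejected at compile time.
print strict_vars_error('$undeclared = 1');

# Declared with my, declared with our, or fully qualified: all accepted.
print "my: ok\n"        if strict_vars_error('my $declared = 1') eq '';
print "our: ok\n"       if strict_vars_error('our $global = 1') eq '';
print "qualified: ok\n" if strict_vars_error('$main::qualified = 1') eq '';
```

The first call prints the familiar Global symbol "$undeclared" requires explicit package name message.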
use strict 'subs';
Consider two programs
# prog 1
$a = test_value;
print "First program: ", $a, "\n";
sub test_value { return "test passed"; }
Output: First program: test_value
# prog 2
sub test_value { return "test passed"; }
$a = test_value;
print "Second program: ", $a, "\n";
Output: Second program: test passed
In both cases we have a test_value() sub and we want to put its result into $a. And yet, when we run the two programs, we get two different results:
In the first program, at the point we get to $a = test_value;, Perl doesn't know of any test_value() sub, and test_value is interpreted as the string 'test_value'. In the second program, the definition of test_value() comes before the $a = test_value; line, so Perl treats test_value as a sub call.
The technical term for isolated words like test_value that might be subs and might be strings depending on context, by the way, is bareword. Perl's handling of barewords can be confusing, and it can cause bugs in a program.
That bug is what we encountered in our first program. Remember that Perl won't look forward to find test_value(), so since it hasn't already seen test_value(), it assumes that you want a string. So if you add use strict 'subs';, it will cause this program to die with an error:
Bareword "test_value" not allowed while "strict subs" in use at
./a6-strictsubs.pl line 3.
Solution to this error would be
Use parentheses to make it clear you're calling a sub. If Perl sees $a = test_value();, it knows this must be a sub call, even if the definition comes later in the file.
Declare your sub before you first use it
use strict;
sub test_value; # Declares that there's a test_value() coming later ...
my $a = test_value; # ...so Perl will know this line is okay.
.......
sub test_value { return "test_passed"; }
And if you mean to use it as a string, quote it.
So, this stricture makes Perl treat all barewords as syntax errors. A bareword is any bare name or identifier that has no other interpretation forced by context. (Context is often forced by a nearby keyword or token, or by predeclaration of the word in question.) So if you mean to use it as a string, quote it, and if you mean to use it as a function call, predeclare it or use parentheses.
Barewords are dangerous because of this unpredictable behavior. use strict; (or use strict 'subs';) makes them predictable, because barewords that might cause strange behavior in the future will make your program die before they can wreak havoc
There's one place where it's OK to use barewords even when you've turned on strict subs: when you are assigning hash keys.
$hash{sample} = 6; # Same as $hash{'sample'} = 6
%other_hash = ( pie => 'apple' );
Barewords in hash keys are always interpreted as strings, so there is no ambiguity.
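The bareword rules above can be put together in one sketch. All the names are made up for illustration (later_sub, strict_subs_error, no_such_sub_anywhere), and it relies on string eval inheriting the enclosing use strict.

```perl
use strict;
use warnings;

# Parentheses: unambiguously a call, even though the definition comes later.
my $with_parens = later_sub();

# Hypothetical helper: compile a snippet containing a bareword that is not
# a known sub, and return the resulting error text.
sub strict_subs_error {
    my $ok = eval q{ my $x = no_such_sub_anywhere; 1 };
    return $ok ? '' : $@;
}

# Bareword hash keys are always strings, so strict allows them.
my %hash;
$hash{sample} = 6;
my %other_hash = ( pie => 'apple' );

sub later_sub { return 'test passed' }

print "$with_parens\n";       # prints test passed
print strict_subs_error();    # Bareword "no_such_sub_anywhere" not allowed while "strict subs" in use ...
```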
use strict 'refs';
This generates a run-time error if you use symbolic references, intentionally or otherwise.
A value that is not a hard reference is then treated as a symbolic reference. That is, the reference is interpreted as a string representing the name of a global variable.
use strict 'refs';
$ref = \$foo; # Store "real" (hard) reference.
print $$ref; # Dereferencing is ok.
$ref = "foo"; # Store name of global (package) variable.
print $$ref; # WRONG, run-time error under strict refs.
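Here is a runnable sketch of the same idea, with the run-time error caught by eval and a narrowly scoped opt-out for the rare case where looking up a package variable by name is really wanted. The helper names symbolic_deref_error and lookup_by_name are made up for illustration.

```perl
use strict;
use warnings;

our $foo = 'package value';

# Hypothetical helper: try to dereference a plain string under strict refs
# and return the error text (empty string if nothing went wrong).
sub symbolic_deref_error {
    my ($name) = @_;
    my $ok = eval { my $value = ${$name}; 1 };   # dies under strict refs
    return $ok ? '' : $@;
}

# Hypothetical helper: deliberate symbolic lookup of a package variable.
sub lookup_by_name {
    my ($name) = @_;
    no strict 'refs';                            # opt out only inside this sub
    return ${ __PACKAGE__ . "::$name" };
}

print symbolic_deref_error('foo');   # Can't use string ("foo") as a SCALAR ref while "strict refs" in use ...
print lookup_by_name('foo'), "\n";   # prints package value
```

In most programs a hash keyed by name is the simpler choice, and then strict refs never gets in the way.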
use warnings;
This lexically scoped pragma permits flexible control over Perl's built-in warnings, both those emitted by the compiler as well as those from the run-time system.
From perldiag:
So the majority of warning messages from the classifications below, i.e., W, D, and S can be controlled using the warnings pragma.
(W) A warning (optional)
(D) A deprecation (enabled by default)
(S) A severe warning (enabled by default)
Below I have listed some of the warning messages that occur often, by classification. For detailed information on these and other messages, refer to perldiag.
(W) A warning (optional):
Missing argument in %s
Missing argument to -%c
(Did you mean &%s instead?)
(Did you mean "local" instead of "our"?)
(Did you mean $ or @ instead of %?)
'%s' is not a code reference
length() used on %s
Misplaced _ in number
(D) A deprecation (enabled by default):
defined(@array) is deprecated
defined(%hash) is deprecated
Deprecated use of my() in false conditional
$# is no longer supported
(S) A severe warning (enabled by default)
elseif should be elsif
%s found where operator expected
(Missing operator before %s?)
(Missing semicolon on previous line?)
%s never introduced
Operator or semicolon missing before %s
Precedence problem: open %s should be open(%s)
Prototype mismatch: %s vs %s
Warning: Use of "%s" without parentheses is ambiguous
Can't open %s: %s
These two pragmas can automatically identify bugs in your code.
I always use this in my code:
use strict;
use warnings FATAL => 'all';
FATAL makes the code die on warnings, just like strict does.
For additional information, see: Get stricter with use warnings FATAL => 'all';
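To illustrate what FATAL does, here is a minimal sketch that promotes only the 'numeric' warning category to an exception inside one sub. The sub name add_fatal and the sample input are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: inside this sub, 'numeric' warnings are fatal.
sub add_fatal {
    use warnings FATAL => 'numeric';   # lexically scoped to this block
    my ($x, $y) = @_;
    return $x + $y;
}

my $clean = add_fatal(3, 4);                       # 7, no warning involved
my $bad   = eval { add_fatal('3 apples', 4) };     # dies instead of warning
my $error = $@;

print "clean: $clean\n";     # prints clean: 7
print "error: $error";       # Argument "3 apples" isn't numeric in addition (+) ...
```

With plain use warnings, the second call would print the same message to STDERR and carry on with the value 7.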
Also... The strictures, according to Seuss
There's a good thread on perlmonks about this question.
The basic reason obviously is that strict and warnings massively help you catch mistakes and aid debugging.
Source: Different blogs
use imports functions and variable names into the calling namespace by
calling the module's import() function.
A pragma is a module which influences some aspect of the compile time
or run time behavior of Perl. Pragmas give hints to the compiler.
use warnings - Perl complains about variables used only once, improper conversions of strings into numbers, and attempts to write to files that are not open. Some of these checks happen at compile time, others at run time. It is used to control warnings.
use strict - makes you declare the scope of your variables. It is used to impose some discipline on the script. Barewords that would otherwise be silently interpreted as strings are rejected. All variables should be given a scope with my or our.
The "use strict" directive tells Perl to do extra checking during the compilation of your code. Using this directive will save you time debugging your Perl code because it finds common coding bugs that you might overlook otherwise.
Strict and warnings push you away from accidental global variables: under strict, a variable must be declared with my or our (or be fully qualified) before use.
It is much neater to have variables private to individual subs rather than having to keep track of each and every variable name.
$_, or no variable at all for certain functions, can also be useful for writing more compact code quickly.
Keep in mind, though, that $_ is a global variable whether or not you use strict and warnings!
use strict;
use warnings;
strict and warnings set the mode in which a Perl program is checked. They restrict what Perl will accept, so the code ends up more formal and a coding standard becomes easier to enforce.
use warnings does much the same as -w on the shebang line, except that it is lexically scoped: it reports the warnings generated by the Perl program, which are displayed in the terminal.

Check if given string matches one of set of prefixes, effectively

What algorithm to use to check if a given string matches one of set of prefixes, and which prefix from that set?
Other variation: given path and a set of directories, how to check if path is in one of set of directories (assuming that there are no symbolic links, or they do not matter)?
I'm interested in description or name of algorithm, or Perl module which solves this (or can be used to solve this).
Edit
Bonus points for solution which allow to effectively find 'is prefix of' relation between set of strings (set of directories)
For example, given set of directories: foo, foo/bar, foo/baz, quux, baz/quux, baz/quux/plugh the algorithm is to find that foo is prefix of foo/bar and foo/baz, and that baz/quux is prefix of baz/quux/plugh... hopefully without O(n^2) time.
The efficient way to do this would be using a Trie:
http://en.wikipedia.org/wiki/Trie
There is a package for it on CPAN:
https://metacpan.org/pod/Tree::Trie
(never used that package myself though)
You need to consider which operations need to be the most efficient. The lookup is very cheap in a Trie, but if you only build the trie for one lookup, it might not be the fastest way...
The first function in the List::Util Core module can find if a prefix matches a string. It searches through the list of prefixes, and returns as soon as it finds a match. It does not search through the whole list if it is not necessary:
first returns the first element where the
result from BLOCK is a true value. If
BLOCK never returns true or LIST was
empty then undef is returned.
You pose an interesting question, but as I went out to look for such a thing (in List::MoreUtils for example), I kept coming back to, how is this any different than a grep. So here it is, my basic implementation based on grep. If you don't mind searching the whole list, or want all the matches here is an example:
#!/usr/bin/perl
use strict;
use warnings;
my @prefixes = qw/ pre1 pre2 pre3 /;
my $test = 'pre1fixed';
my @found = grep { $test =~ /^$_/ } @prefixes;
print "$_ is a prefix of $test\n" for @found;
I also imagine that there must be some way to use the smart-match operator ~~ to do this in a short-circuiting way. Also, as toolic points out, the List::Util function first could be used for this too. This stops the search after a match is found.
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw/first/;
my @prefixes = qw/ pre1 pre2 pre3 /;
my $test = 'pre1fixed';
my $found = first { $test =~ /^$_/ } @prefixes;
print "$found is the prefix of $test\n";
The only algorithm I am aware of is the Aho-Corasick though I will leave it as an exercise to the reader (i.e. I don't know) to see if this will help you. I see that there is a module (Algorithm::AhoCorasick). I also believe I have read somewhere that this and trie structures are implemented in Perl's matching under certain circumstances. Perhaps someone knows where I read that? Edit: found it in SO question on matching alternatives
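For the directory variation, a plain hash lookup may already be enough. It is not a trie or Aho-Corasick, just a sketch that walks up the path one component at a time and checks each ancestor against a hash of the known directories. The helper names containing_dir and prefix_pairs are made up, and paths are assumed to be '/'-separated with no symbolic links, as in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: longest directory in @dirs that contains $path
# (or equals it); returns nothing if there is none.
sub containing_dir {
    my ($path, @dirs) = @_;
    my %is_dir = map { $_ => 1 } @dirs;
    my @parts  = split m{/}, $path;
    while (@parts) {
        my $candidate = join '/', @parts;
        return $candidate if $is_dir{$candidate};
        pop @parts;                       # move up one level
    }
    return;
}

# Hypothetical helper: every [ancestor, dir] pair within one set of
# directories, found with one hash lookup per path component.
sub prefix_pairs {
    my @dirs   = @_;
    my %is_dir = map { $_ => 1 } @dirs;
    my @pairs;
    for my $dir (@dirs) {
        my @parts = split m{/}, $dir;
        pop @parts;                       # skip the directory itself
        while (@parts) {
            my $ancestor = join '/', @parts;
            push @pairs, [ $ancestor, $dir ] if $is_dir{$ancestor};
            pop @parts;
        }
    }
    return @pairs;
}

my @dirs = qw(foo foo/bar foo/baz quux baz/quux baz/quux/plugh);
print containing_dir('foo/bar/file.txt', @dirs), "\n";    # prints foo/bar
print "$_->[0] is a prefix of $_->[1]\n" for prefix_pairs(@dirs);
```

The cost is proportional to the number of directories times their depth, not to the square of the number of directories. Because it compares whole path components, foo/bar is not reported as containing foo/barbaz, which a naive string-prefix test would do.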

When should you use a package variable vs a lexical variable (and what's the difference)?

I'm looking at some older Perl code on Perl Monks to figure out programming with Win32::OLE and MS Word. Scattered throughout the code are variables with names like $MS::Word and the like, without a 'my' included in their declaration. After reading a bit on Google, I understand that these are called 'package variables' versus 'lexical variables' declared using my.
My first question is 'What are package variables good for?'. I (think) I understand what lexical variables are, but I don't understand the purpose of package variables or how their use differs from lexicals, so my second question would be, 'What is the difference between lexical and package variables?'
You should read Coping with Scoping by MJD.
perldoc perlmod would also be useful reading.
The code is out of this world ugly. It tramples on all sorts of namespaces without a concern in the world just because the author seems to think $author::email is cool.
A better way would have been to use a hash:
my %author = (
email => 'author@example.com',
...
);
Trampling all over the symbol table is not necessary.
I do have a few Win32::OLE examples: http://www.unur.com/comp/ which are no works of art but I believe are improvements on this style. See also Why are the number of pages in a Word document different in Perl and Word VBA?
I am going to rant a little:
@pgm::runtime_args = @ARGV ;
So, we give up on the standard @ARGV array to trample on the pgm namespace. Not only that, every Perl programmer knows what @ARGV is. In any case, @pgm::runtime_args is not used again in the script.
$pgm::maxargs = $#pgm::runtime_args + 1 ;
Of course @pgm::runtime_args in scalar context would give us the number of elements in that array. I have no idea why $pgm::maxargs might be needed, but if it were, then this line should have been:
$pgm::maxargs = @pgm::runtime_args;
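A tiny sketch of the difference between the two forms, using a lexical array and a made-up helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: element count and last index of its argument list.
sub count_and_last_index {
    my @args  = @_;
    my $count = @args;     # array in scalar context: number of elements
    my $last  = $#args;    # index of the last element, i.e. $count - 1
    return ($count, $last);
}

my ($count, $last) = count_and_last_index(qw(a b c));
print "$count $last\n";    # prints 3 2
```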
I am not going to quote more of this stuff. I guess this is what happens when Cobol programmers try to write Perl.
$program::copyright = "Copyright (c) 02002 - Kenneth Tomiak : All rights reserved.";
I am glad he allocated five digits for the year. Ya never know!
PS: I believe my excerpts constitute fair use.
A package variable lives in a symbol table, so given its name, it's possible to read or modify it from any other package or scope. A lexical variable's scope is determined by the program text. The section "Private Variables via my()" in the perlsub manpage gives more detail about defining lexicals.
Say we have the following MyModule.pm:
package MyModule;
# these are package variables
our $Name;
$MyModule::calls = "I do not think it means what you think it means.";
# this is a lexical variable
my $calls = 0;
sub say_hello {
++$calls;
print "Hello, $Name!\n";
}
sub num_greetings {
$calls;
}
1;
Notice that it contains a package $calls and a lexical $calls. Anyone can get to the former, but the module controls access to the latter:
#! /usr/bin/perl
use warnings;
use strict;
use MyModule;
foreach my $name (qw/ Larry Curly Moe Shemp /) {
$MyModule::Name = $name;
MyModule::say_hello;
}
print MyModule::num_greetings, "\n";
print "calls = $MyModule::calls\n";
The program's output is
Hello, Larry!
Hello, Curly!
Hello, Moe!
Hello, Shemp!
4
calls = I do not think it means what you think it means.
As you can see, package variables are globals, so all the usual gotchas and advice against apply. Unless explicitly provided access, it's impossible for code outside the MyModule package to access its lexical $calls.
The rule of thumb is you very nearly always want to use lexicals. Perl Best Practices by Damian Conway is direct: "Never make variables part of a module's interface" (emphasis in original).
Package variables are global variables; they're visible everywhere in the entire program (even other modules). They're useful when you want or need that level of visibility and/or external influence. For example the Text::Wrap module uses them to allow a single configuration point for the number of columns at which to wrap text. Furthermore, package variables allow you to use something called "dynamic scoping" -- but that's a somewhat advanced and slightly esoteric concept.
For your second question, see What is the difference between my and our in Perl?
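As a rough illustration of the dynamic scoping mentioned above: local gives a package variable a temporary value that every sub called in the meantime can see, and the old value comes back when the enclosing block exits. The names $Indent, render and render_nested are made up for this sketch.

```perl
use strict;
use warnings;

our $Indent = 0;   # package variable used as a configuration point (hypothetical)

# Reads whatever value $Indent has at the time of the call.
sub render {
    my ($text) = @_;
    my $pad = ' ' x $Indent;
    return $pad . $text;
}

sub render_nested {
    my ($text) = @_;
    local $Indent = $Indent + 2;   # visible to render() until this sub returns
    return render($text);
}

print render('top'), "\n";            # prints "top"
print render_nested('inner'), "\n";   # prints "  inner"
print render('top again'), "\n";      # prints "top again" (old value restored)
```

A lexical declared with my could not do this, because render would not be able to see a variable declared inside render_nested.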