I've found some Perl code that makes the following check:
if ($^O eq 'MSWin32')
What is the $^0 variable? It looks like it contains the architecture/OS of the machine it's running on, but I don't know if that's the result of assigning it elsewhere in the script (maybe it's the result of a regex match, though I can't see any match operations performed before that point), or whether that variable always has a value related to the machine it's running on.
I'd like to check the OS and bitness of the machine running the script, and would like to know if I can use $^0 to help me with that (if not, then I'm still curious what it is).
I'd rather not publish other parts of the Perl script, as it's proprietary.
This strikes me as the sort of question that should have been asked before, but Google isn't much use, thanks to the special characters (I often think the inability to Google Perl code led in part to its demise), and Stack Overflow doesn't have any useful suggested questions either.
There's no match for $^0 or $^1 or $^# on the perlvar page, and I'm not convinced that $^N or $^X are related.
It is not a zero.
It is the letter "O", capitalized.
In perlvar, search for OSNAME, which is the long form name of the variable when you use English:
The name of the operating system under which this copy of Perl was
built, as determined during the configuration process.
There is no special "dollar caret zero" variable.
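As a minimal sketch of how the OS and bitness check might look: the core Config module exposes the build configuration alongside $^O. The helper name describe_platform is hypothetical. Note the assumption built into this approach: these values describe the perl binary that is running, so a 32-bit perl on a 64-bit machine would still report 32.

```perl
use strict;
use warnings;
use Config;                       # core module exposing perl's build configuration
use English qw(-no_match_vars);   # makes $OSNAME available as a long name for $^O

# describe_platform() is a hypothetical helper name used only for this sketch.
sub describe_platform {
    return (
        os           => $^O,                   # e.g. 'MSWin32', 'linux', 'darwin'
        archname     => $Config{archname},     # architecture string of this perl build
        pointer_bits => $Config{ptrsize} * 8,  # pointer width of the perl binary, in bits
    );
}

my %platform = describe_platform();
print "OS: $platform{os}\n";
print "Architecture: $platform{archname}\n";
print "Perl binary is $platform{pointer_bits}-bit\n";
print "Running on Windows\n" if $OSNAME eq 'MSWin32';
```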
Related
I've seen many (code-golf) Perl programs out there and even if I can't read them (Don't know Perl) I wonder how you can manage to get such a small bit of code to do what would take 20 lines in some other programming language.
What is the secret of Perl? Is there a special syntax that allows you to do complex tasks in few keystrokes? Is it the mix of regular expressions?
I'd like to learn how to write powerful and yet short programs like the ones you know from the code-golf challenges here. What would be the best place to start out? I don't want to learn "clean" Perl - I want to write scripts even I don't understand anymore after a week.
If there are other programming languages out there with which I can write even shorter code, please tell me.
There are a number of factors that make Perl good for code golfing:
No data typing. Values can be used interchangeably as strings and numbers.
"Diagonal" syntax. Usually referred to as TMTOWTDI (There's more than one way to do it.)
Default variables. Most functions act on $_ if no argument is specified. (A few act on @_.)
Functions that take multiple arguments (like split) often have defaults that
let you omit some arguments or even all of them.
The "magic" readline operator, <>.
Higher order functions like map and grep
Regular expressions are integrated into the syntax (i.e. not a separate library)
Short-circuiting operators return the last value tested.
Short-circuiting operators can be used for flow control.
Additionally, without strictures (which are off by default):
You don't need to declare variables.
Barewords auto-quote to strings.
undef becomes either 0 or '' depending on context.
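To make a few of these points concrete, here is a small sketch that counts words longer than three characters, first spelled out and then leaning on $_, the split defaults, and map/grep. Both sub names are made up for illustration, and strictures are left on to show that the defaults work either way.

```perl
use strict;
use warnings;

# Spelled-out version: every variable and argument is explicit.
sub count_long_words_verbose {
    my (@lines) = @_;
    my $count = 0;
    foreach my $line (@lines) {
        my @words = split(' ', $line);
        foreach my $word (@words) {
            if (length($word) > 3) {
                $count = $count + 1;
            }
        }
    }
    return $count;
}

# Terse version: split and length both default to $_,
# and grep in scalar context returns the number of matches.
sub count_long_words_terse { scalar grep { length > 3 } map { split } @_ }

my @lines = ("the quick brown fox", "jumps over");
print count_long_words_verbose(@lines), "\n";
print count_long_words_terse(@lines), "\n";
```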
Now that that's out of the way, let me be very clear on one point:
Golf is a game.
It's great to aspire to the level of perl-fu that allows you to be good at it, but in the name of $DEITY do not golf real code. For one, it's a horrible waste of time. You could spend an hour trying to trim out a few characters. Golfed code is fragile: it almost always makes major assumptions and blithely ignores error checking. Real code can't afford to be so careless. Finally, your goal as a programmer should be to write clear, robust, and maintainable code. There's a saying in programming: Always write your code as if the person who will maintain it is a violent sociopath who knows where you live.
So, by all means, start golfing; but realize that it's just playing around and treat it as such.
Most people miss the point of much of Perl's syntax and default operators. Perl is largely a "DWIM" (do what I mean) language. One of its major design goals is to "make the common things easy and the hard things possible".
As part of that, Perl designers talk about Huffman coding of the syntax and think about what people need to do instead of just giving them low-level primitives. The things that you do often should take the least amount of typing, and functions should act like the most common behavior. This saves quite a bit of work.
For instance, the split has many defaults because there are some use cases where leaving things off uses the common case. With no arguments, split breaks up $_ on whitespace because that's a very common use.
my @bits = split;
A bit less common but still frequent case is to break up $_ on something else, so there's a slightly longer version of that:
my @bits = split /:/;
And, if you wanted to be explicit about the data source, you can specify the variable too:
my @bits = split /:/, $line;
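A runnable sketch of the three forms side by side, with made-up sub names and sample data, in case it helps to see what each one returns:

```perl
use strict;
use warnings;

# No arguments: split $_ on whitespace, ignoring leading whitespace.
sub fields_default {
    local $_ = shift;
    my @bits = split;
    return @bits;
}

# Pattern only: split $_ on the given pattern.
sub fields_colon_implicit {
    local $_ = shift;
    my @bits = split /:/;
    return @bits;
}

# Pattern and explicit data source.
sub fields_colon_explicit {
    my ($line) = @_;
    my @bits = split /:/, $line;
    return @bits;
}

print join('|', fields_default("  alpha beta\tgamma\n")), "\n";
print join('|', fields_colon_implicit('root:x:0')), "\n";
print join('|', fields_colon_explicit('root:x:0')), "\n";
```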
Think of this as you would normally deal with life. If you have a common task that you perform frequently, like talking to your bartender, you have a shorthand for it that covers the usual case:
The usual
If you need to do something slightly different, you expand that a little:
The usual, but with onions
But you can always note the specifics
A dirty Bombay Sapphire martini shaken not stirred
Think about this the next time you go through a website. How many clicks does it take for you to do the common operations? Why are some websites easy to use and others not? Most of the time, the good websites require you to do the least amount of work to do the common things. Unlike my bank which requires no fewer than 13 clicks to make a credit card bill payment. It should be really easy to give them money. :)
This doesn't answer the whole question, but in regard to writing code you won't be able to read in a couple of days, here are a few languages that will encourage you to write short, virtually unreadable code:
J
K
APL
Golfscript
Perl has a lot of single-character special variables that provide a lot of shortcuts, e.g. $. $_ $@ $/ $1 etc. I think it's that, combined with the built-in regular expressions, that allows you to write some very concise but unreadable code.
Perl's special variables ($_, $., $/, etc.) can often be used to make code shorter (and more obfuscated).
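A short sketch of three of those variables at work, reading from an in-memory filehandle so it runs without any input file. The sub names number_lines and slurp_all are invented for this example.

```perl
use strict;
use warnings;

# Prefix each line with its line number.
sub number_lines {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    my @out;
    while (<$fh>) {                    # each line is read into $_
        chomp;                         # chomp defaults to $_
        push @out, join ': ', $., $_;  # $. is the line number of the last-read handle
    }
    close $fh;
    return @out;
}

# Read the whole input in one go.
sub slurp_all {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    local $/;                          # undefined record separator: no line splitting
    return scalar <$fh>;
}

print "$_\n" for number_lines("first\nsecond\n");
print length(slurp_all("first\nsecond\n")), " characters slurped\n";
```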
I'd guess that the "secret" is in providing native operations for often repeated tasks.
In the domain that perl was originally envisioned for you often have to
Take input linewise
Strip off whitespace
Rip lines into words
Associate pairs of data
...
and perl simply provided operators to do these things. The short variable names and use of defaults for many things is just gravy.
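The four tasks listed above can be sketched in a few lines. This assumes input of the form "key value", one pair per line; parse_pairs is a hypothetical name.

```perl
use strict;
use warnings;

# Turn "key value" lines into a hash.
sub parse_pairs {
    my ($text) = @_;
    my %value_for;
    for (split /\n/, $text) {        # take input linewise
        s/^\s+|\s+$//g;              # strip leading and trailing whitespace
        next unless length;          # skip blank lines
        my ($key, $value) = split;   # rip the line into words
        $value_for{$key} = $value;   # associate pairs of data
    }
    return %value_for;
}

my %stock = parse_pairs("  apple 3 \nbanana 7\n\n");
print "$_ => $stock{$_}\n" for sort keys %stock;
```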
Nor was perl the first language to go this way. Many of the features of perl were stolen more-or-less intact (or often slightly improved) from sed and awk and various shells. Good for Larry.
Certainly perl wasn't the last to go this way, you'll find similar features in python and php and ruby and ... People liked the results and weren't about to give them up just to get more regular syntax.
What's Java's secret of copying a variable in only one line, without worrying about buses and memory? Answer: the code is transformed to bigger code. Same for every language ever invented.
I am using Devel::LeakTrace::Fast to debug a memory leak in a perl script designed as a daemon which runs an infinite loop with sleeps until interrupted. I am having some trouble both reading the output and finding documentation to help me understand the output. The perldoc doesn't contain much information on the output. Most of it makes sense, such as pointing to globals in DBI. Intermingled with the output, however, are several
leaked SV(<LOCATION>) from (eval #) line #
Where the numbers are numbers and <LOCATION> is a location in memory. The script itself is not using eval at any point - I have not investigated each used module to see if evals are present. Mostly what I want to know is how to find these evals (if possible).
I also find the following entries repeated over and over again
leaked SV(<LOCATION>) from line #
Where line # is always the same #. Not very helpful in tracking down what file that line is in.
You may not be using eval anywhere directly, but some module you are using likely is. Additionally, there could be a problem in some XS code you are linking into.
Have you tried reducing your script bit by bit, cutting out parts that you think might be suspect (or even parts that you think are not), and seeing how your results change? If you can split your script up into discrete pieces (which is a good idea to do anyway, from an architectural and maintainability standpoint), you might be able to find which area is the culprit, and then drill down from there.
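Perl labels code compiled from a string eval as "(eval N)", so one rough way to narrow down candidates is to scan the source files of the loaded modules for eval calls that are not followed by a block. The sketch below is only a heuristic: find_string_evals is a made-up helper, it knows nothing about Devel::LeakTrace::Fast, and it will report false positives in POD and string literals.

```perl
use strict;
use warnings;

# Report "file:line: source" for lines that look like string evals.
sub find_string_evals {
    my (@files) = @_;
    my @hits;
    for my $file (@files) {
        open my $fh, '<', $file or next;        # skip files we cannot read
        while (my $line = <$fh>) {
            chomp $line;
            next if $line =~ /^\s*#/;           # ignore full-line comments
            next unless $line =~ /\beval\b(?!\s*\{)/;   # eval not followed by a block
            push @hits, sprintf('%s:%d: %s', $file, $., $line);
        }
        close $fh;
    }
    return @hits;
}
```

After the daemon has loaded its modules, something like `print "$_\n" for find_string_evals(grep { defined && -f } values %INC);` would list the candidate lines.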
Assume that the following Perl code is given:
my $user_supplied_string = &retrieved_from_untrusted_user();
$user_supplied_string =~ s/.../.../g; # filtering done here
my $output = `/path/to/some/command '${user_supplied_string}'`;
The code is clearly insecure, but assume that the only thing that can be changed is the filtering code on line #2.
My question:
What is the minimal set of characters that needs to be filtered on line #2 to make the above code secure?
Please note:
Whitelisting is not an option in this case, so please keep your answer focused on what to filter out to make it secure. And more specifically; what is the minimal set of characters to filter out to make it secure? Everything else is off-topic.
Make sure your answer addresses the question stated ("What is the minimal set of characters that needs to be filtered on line #2 to make the above code secure?"). If your answer does not address that very specific question then don't post. Thanks.
First, given that you are concerned with security, I suggest you look into taint mode. As for the minimal set of characters to allow to be visible to the shell, you are better off not letting any characters be seen by the shell:
my $output = do {
local $/;
open my $pipe, "-|", "/path/to/some/command", $user_supplied_string
or die "could not run /path/to/some/command: $!";
<$pipe>;
};
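To see that the list form of open really does bypass the shell, here is a sketch that uses the running perl ($^X) as a stand-in for /path/to/some/command and has it echo its first argument. The wrapper name run_without_shell and the sample input are invented for the example.

```perl
use strict;
use warnings;

# Run a command with its arguments passed as a list, so no shell parses them.
sub run_without_shell {
    my (@command) = @_;
    open my $pipe, '-|', @command
        or die "could not run $command[0]: $!";
    local $/;                       # read all of the output at once
    my $output = <$pipe>;
    close $pipe or die "command failed with status $?\n";
    return $output;
}

# Input that would break out of the single quotes in the backtick version.
my $hostile = q{'; echo pwned; '};
my $echoed  = run_without_shell($^X, '-e', 'print $ARGV[0]', $hostile);
print "argument arrived intact\n" if $echoed eq $hostile;
```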
The set of characters that you allow depends on what the application in that system call is going to do with them. There are the shell special characters, but that's only one part of the problem. You also have to ensure that the value you give to the command is valid input, and that requires some more work.
See, for instance, my chapter on security in Mastering Perl where I go into the gory details of the problem.
Perhaps you can explain why your problem ties both your hands behind your back and blindfolds you. Your problem isn't technical if those are your constraints.
After a little research, the following may be the minimal set you're looking for, at least on a subset of UNIX-like systems. Of course, I have not personally tested it, so YMMV:
&;`'\"|*?~<>^()[]{}$\n\r
In a regex:
s/[\&\;\`\'\\\"\|\*\?\~\<\>\^\(\)\[\]\{\}\$\n\r]//g
I don't think actually using this in real code would be a good idea, but I can see how it could be interesting out of pure curiosity.
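For the curious, the same character list wrapped in a sub, written with only the escapes a character class actually needs. This is purely a sketch of the filter above, with a made-up name, and carries the same untested caveat.

```perl
use strict;
use warnings;

# Remove every character from the list above; returns a cleaned copy.
sub strip_shell_metachars {
    my ($input) = @_;
    (my $clean = $input) =~ s/[&;`'\\"|*?~<>^()\[\]{}\$\n\r]//g;
    return $clean;
}

print strip_shell_metachars(q{a'b;c$(d)`e`}), "\n";
```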
I'm trying to understand someone else's Perl code without knowing much Perl myself. I would appreciate your help.
I've encountered a Perl function along these lines:
MyFunction($arg1,$arg2__size,$arg3)
Is there a meaning to the double-underscore syntax in $arg2, or is it just part of the name of the second argument?
There is no specific meaning to the use of a __ inside of a perl variable name. It's likely programmer preference, especially in the case that you've cited in your question. You can see more information about perl variable naming here.
As in most languages underscore is just part of an identifier; no special meaning.
But are you sure it's Perl? There aren't any sigils on the variables. Can you post more context?
As far as the interpreter is concerned, an underscore is just another character allowed in identifiers. It can be used as an alternative to concatenation or camel case to form multi-word identifiers.
A leading underscore is often used to mean an identifier is for local use only, e.g. for non-exported parts of a module. It's merely a convention; the interpreter doesn't care.
In the context of your question, the double underscore doesn't have any programmatic meaning. Double underscores do mean something special for a limited number of values in Perl, most notably __FILE__ & __LINE__. These are special literals that aren't prefixed with a sigil ($, % or @) and are only interpolated outside of quotes. They contain the name of the file currently being executed and the number of the line being executed. See the section on 'Special Literals' in perldata or this post on Perl Monks
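A small sketch contrasting an ordinary variable that happens to contain a double underscore with the special literals; the names here are placeholders.

```perl
use strict;
use warnings;

my $arg2__size = 42;   # the double underscore is simply part of this variable's name

# __FILE__ and __LINE__ are replaced by the current file name and line number.
sub current_location {
    return sprintf '%s line %d', __FILE__, __LINE__;
}

my $inside_quotes = "__LINE__";   # inside quotes it stays the literal text __LINE__

print current_location(), "\n";
print "$inside_quotes\n";
print "arg2__size holds $arg2__size\n";
```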
I'm fairly certain arg2__size is just the name of a variable.
Mark's answer is of course correct, it has no special meaning.
But I want to note that your example doesn't look like Perl at all. Perl variables aren't barewords. They have the sigils, as you will see from the links above. And Perl doesn't have "functions", it has subroutines.
So there may be some confusion about which language we're talking about.
You will need to tell the interpreter that "$arg2" is the name of the variable, and not "$arg2__size". For this you will need to use curly braces. (This usage is similar to that seen in shell.)
This should work
MyFunction($arg1,${arg2}__size,$arg3)
--Binu
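A sketch of where the braces make a difference, assuming both $arg2 and $arg2__size exist as separate variables: inside a double-quoted string the braces mark where the variable name ends. As far as I can tell, this delimiting is only meaningful in interpolated strings.

```perl
use strict;
use warnings;

my $arg2       = 'buffer';
my $arg2__size = 1024;

my $whole_name = "$arg2__size";     # interpolates the variable $arg2__size
my $braced     = "${arg2}__size";   # interpolates $arg2, then appends the text __size

print "$whole_name\n";
print "$braced\n";
```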
I have a framework written in Perl that sets a bunch of environment variables to support interprocess (typically it is sub process) communication. We keep a set of key/value pairs in XML-ish files. We tried to make the key names camel-case somethingLikeThis. This all works well.
Recently we have had occasion to pass control (chain) processes from Windows to UNIX. When we spit out the %ENV hash to a file from Windows the somethingLikeThis key becomes SOMETHINGLIKETHIS. When the Unix process picks up the file and reloads the environment and looks up the value of $ENV{somethingLikeThis} it does not exist since UNIX is case sensitive (from the Windows side the same code works fine).
We have since gone back and changed all the keys to UPPERCASE and solved the problem, but that was tedious and caused pain to the users. Is there a way to make Perl on Windows preserve the character case of the keys of the environment hash?
I believe that you'll find the Windows environment variables are actually case insensitive, thus the keys are uppercase in order to avoid confusion.
This way Windows scripts which don't have any concept of case sensitivity can use the same variables as everything else.
As far as I remember, using ALL_CAPS for environment variables is the recommended practice in both Windows and *NIX worlds. My guess is Perl is just using some kind of legacy API to access the environment, and thus only retrieves the upper-case-only name for the variable.
In any case, you should never rely on something like that, even more so if you are asking your users to set up the variables; just imagine how much aggravation and confusion a simple misspelt variable would produce! You have to remember that some OSes that will remain nameless still have not learned how to do case-sensitive files...
First, to solve your problem, I believe using backticks around set and parsing it yourself will work. On my Windows system, this script worked just fine.
my %env = map {/(.*?)=(.*)/;} `set`;
print join(' ', sort keys %env);
In the camel book, the advice in Chapter 25: Portable Perl, the System Interaction section is "Don't depend on a specific environment variable existing in %ENV, and don't assume that anything in %ENV will be case sensitive or case preserving. Don't assume Unix inheritance semantics for environment variables; on some systems, they may be visible to all other processes."
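One way to follow that advice on the reading side is to look keys up without regard to case. The sketch below assumes the environment has already been loaded into a plain hash; env_lookup is a hypothetical helper, and it simply lets the last key win if two keys differ only by case.

```perl
use strict;
use warnings;

# Case-insensitive lookup in a hash of environment-style settings.
sub env_lookup {
    my ($env, $name) = @_;
    my %by_upper;
    $by_upper{ uc $_ } = $env->{$_} for keys %$env;   # normalise every key to upper case
    return $by_upper{ uc $name };
}

my %from_windows = (SOMETHINGLIKETHIS => 'some value');
print env_lookup(\%from_windows, 'somethingLikeThis'), "\n";
```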
Jack M.: Agreed, it is not a problem on Windows. If I create an environment variable Foo I can reference it in Perl as $ENV{FOO} or $ENV{fOO} or $ENV{foo}. The problem is: I create it as Foo and dump the entire %ENV to a file and then read in the file from *NIX to recreate the environment hash and use the same script to reference $ENV{Foo}, that hash value does not exist (the $ENV{FOO} does exist).
We had adopted the all UPPERCASE workaround that davidg suggested. I was just wondering if there was ANY way to "preserve case" when writing out the keys to the %ENV hash from Perl on Windows.
To the best of my knowledge, there is not. It seems that you may be better off using another hash instead of %ENV. If you are calling many outside modules and want to track the same variables across them, a Factory pattern may work so that you're not breaking DRY, and are able to use a case-sensitive hash across multiple modules. The only trick would then be to keep these variables updated across all objects from the Factory, but I'm sure you can work that out.
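A minimal sketch of the separate-hash idea, leaving the Factory part aside: the settings live in an ordinary hash, which preserves key case on any platform, and are written to and read from a simple key=value file. The sub names and the file format are assumptions for illustration, and values are assumed not to contain newlines.

```perl
use strict;
use warnings;

# Write settings as key=value lines; ordinary hash keys keep their case.
sub dump_settings {
    my ($path, %settings) = @_;
    open my $out, '>', $path or die "cannot write $path: $!";
    print {$out} "$_=$settings{$_}\n" for sort keys %settings;
    close $out or die "cannot close $path: $!";
    return;
}

# Read the file back into a hash, splitting each line on the first '='.
sub load_settings {
    my ($path) = @_;
    open my $in, '<', $path or die "cannot read $path: $!";
    my %settings;
    while (my $line = <$in>) {
        chomp $line;
        next unless $line =~ /=/;
        my ($key, $value) = split /=/, $line, 2;
        $settings{$key} = $value;
    }
    close $in;
    return %settings;
}
```

The chained process would then call load_settings and consult that hash instead of %ENV.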