How to understand this Perl multi-line writing command

I am trying to understand the perl commands below:
$my = << EOU;
This is an example.
Example too.
EOU
What is this construct called? Could somebody explain more about this "multi-line writing" command?

Essentially, the syntax lets you choose any unique word as a marker so that it won't conflict with your contents. You can do this:
$my = <<ABCDEFG;
This is an example.
Example too.
BLAH
ABCDEFG
Everything from "This..." through "BLAH" will be assigned to the variable. Note that you shouldn't have a space after the << symbols, otherwise you will get a syntax error. It saves you from adding explicit newline characters or the append operator (.) everywhere, and it is useful when passing data into another application (e.g. an ftp session). The correct term for this is a here-document (heredoc).
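As a minimal sketch of that point (the variable names are illustrative, not from the question), the here-document below builds the same string as the hand-written concatenation, newline characters included:

```perl
use strict;
use warnings;

# Here-document: the terminator ABCDEFG must appear alone on its own line.
my $heredoc = <<ABCDEFG;
This is an example.
Example too.
BLAH
ABCDEFG

# The same string built by hand with "\n" and the concatenation operator.
my $by_hand = "This is an example.\n"
            . "Example too.\n"
            . "BLAH\n";

print "identical\n" if $heredoc eq $by_hand;
```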

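Everything between the <<EOU line and the closing EOU line is a multi-line string. It's nothing fancy; think of the two markers as start and end quote marks. One caveat: a bare <<EOU behaves like a double-quoted string, so variables and backslash escapes inside it are still interpolated. If you want the text to be literally what you typed, with nothing requiring escapes, quote the marker with single quotes: <<'EOU'.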
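A short sketch of the difference between the two marker styles, assuming a hypothetical variable $name that is not part of the question:

```perl
use strict;
use warnings;

my $name = 'world';   # hypothetical variable, for illustration only

# Bare (or double-quoted) marker: behaves like a double-quoted string.
my $interpolated = <<EOU;
Hello, $name!
EOU

# Single-quoted marker: the text is taken literally.
my $literal = <<'EOU';
Hello, $name!
EOU

print $interpolated;   # prints: Hello, world!
print $literal;        # prints: Hello, $name!
```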

Related

Single quotes in a variable name in Perl?

I was writing some Perl code in vim and accidentally typed a single quote character in a variable name and noticed that it highlighted it in a different color than normal single quoted strings.
I thought that was odd, so I wrote a small test program (shown above) and tried to run it to see how Perl would handle it and I got this error:
"my" variable $var::with::apostrophes can't be in a package
What exactly is going on here? Are there situations where single quotes in variable names are actually valid? If so, what meaning do single quotes have when used in this context?
The single quote is the namespace separator used in Perl 4, replaced by the double colon :: in Perl 5. Because Perl is mostly backwards compatible, this still works. It's great for golfing, but not much else.
Here's an article about it on perl.com that doesn't explain it.
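For illustration only, here is a small sketch of the two spellings side by side. The package name Counter is made up, and it assumes a perl where the legacy apostrophe separator is still enabled (newer releases may warn that it is deprecated):

```perl
use strict;
use warnings;

package Counter;          # hypothetical package name, for illustration
our $total = 42;

package main;

# Both spellings name the same package variable.
my $modern = $Counter::total;
my $legacy = $Counter'total;

print "same variable\n" if $modern == $legacy;
```

This also explains the error in the question: a variable declared with my is lexical, so it cannot carry a package qualifier in either spelling.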

sed seems to match pattern properly only when newline inserted

I am currently running the following sed command:
sed 's/P(\(.*\))\\mid(\(.*\))/\\condprob{\1}{\2}/g' myfile.tex
Essentially, I have inherited an oddly formatted tex file, and want to replace everything like this:
P(<foo>)\mid(<bar>)
With this
\condprob{<foo>}{<bar>}
The file I am trying to run sed on contains the following line:
P(\vec{m}_i)\mid(t,h,\alpha) = \prod_{u\in\mathcal{U}} P(\vec{m}_{iu})\mid(t,h,\alpha)
Which I would like to change to this:
\condprob{\vec{m}_i}{t,h,\alpha} = \prod_{u\in\mathcal{U}}\condprob{\vec{m}_{iu}}{t,h,\alpha}
However, sed keeps missing the first \mid and instead gives me this:
\condprob{\vec{m}_i)\mid(t,h,\alpha) = \prod_{u\in\mathcal{U}} P(\vec{m}_{iu}}{t,h,\alpha}
If I add a line break at the = sign, it matches everything fine.
Can someone please a) help me resolve this, and b) perhaps tell me why it is happening?
Thanks.
Edit: thanks choroba and Sloopjon, you've both answered my why, and Sloopjon's solution is actually exactly what I was needing. choroba: I guess I will have to wait another day to learn perl.
For those that are interested Sloopjon's solution when translated into my problem looks like this (match everything that isn't a closing parenthesis):
sed 's/P(\([^)]*\))\\mid(\([^)]*\))/\\condprob{\1}{\2}/g' myfile.tex
It looks like you expect P(\(.*\)) to match only P(\vec{m}_i), but the * quantifier is greedy, so it actually matches P(\vec{m}_i)\mid...P(\vec{m}_{iu}). There are two common fixes for this: use a non-greedy quantifier if your tool supports it, or change the pattern so that it only matches what you expect. For example, if you know that parentheses won't nest in this P() construct, change .* to [^)]*.
Edit: I also suggest that you look for a regex visualizer or debugger when you have a problem like this. For example, pasting your example into debuggex.com makes it clear what's happening.
The problem is the greediness of the * quantifier. It matches as many times as it can, i.e. it doesn't stop at the first ).
You can try Perl, which features a "non-greedy" (frugal, lazy) quantifier, *?:
perl -pe 's/P\((.*?)\)\\mid\((.*?)\)/\\condprob{$1}{$2}/g'
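To compare the approaches discussed in the answers, the sketch below runs the greedy pattern, the non-greedy pattern, and the negated character class against the line from the question. The variable names are illustrative.

```perl
use strict;
use warnings;

# The problematic line from the question (single quotes keep the backslashes).
my $line = 'P(\vec{m}_i)\mid(t,h,\alpha) = \prod_{u\in\mathcal{U}} P(\vec{m}_{iu})\mid(t,h,\alpha)';

# Greedy .* runs to the last ")\mid(" on the line, swallowing the first one.
(my $greedy = $line) =~ s/P\((.*)\)\\mid\((.*)\)/\\condprob{$1}{$2}/g;

# Non-greedy .*? stops at the first place the rest of the pattern can match.
(my $lazy = $line) =~ s/P\((.*?)\)\\mid\((.*?)\)/\\condprob{$1}{$2}/g;

# Negated class [^)]* cannot cross a closing parenthesis at all.
(my $negated = $line) =~ s/P\(([^)]*)\)\\mid\(([^)]*)\)/\\condprob{$1}{$2}/g;

print "greedy:  $greedy\n";
print "lazy:    $lazy\n";
print "negated: $negated\n";
```

The negated class is the only one of the three that carries over to plain sed, which has no non-greedy quantifier.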

What is print <<EOF; and how is it working? [duplicate]

Possible Duplicate:
Help me understand this Perl statement with <<'ESQ'
What is the statement in https://stackoverflow.com/questions/4151279/perl-print-eof doing exactly? I came across the previous post but didn't understand what the author is trying to explain. What is that PETE? Can anyone explain every line? How does the code work?
print <<EOF;
This is
a multiline
string
EOF
print <<PETE;
This is
a multiline
string
PETE
What is the difference and similarity between these two? In place of PETE I have used many other words like DOG and it works the same every time.
This is called a here-doc. It grabs everything from the next line up to (but not including) an end marker line. In a shell, that text is presented as standard input to the program you're running; in Perl, it becomes a string value, which print then outputs. The end marker line is controlled by the text following the <<.
As an example, in bash (which I'm more familiar with than Perl), the command:
cat <<EOF
hello
goodbye
EOF
will run cat and then send two lines to its standard input (the hello and goodbye lines). Perl also has this feature, though the syntax is slightly different (as you would expect, given it's a different language) and the here-doc yields a string rather than standard input. Still, it's close enough for the explanation to hold.
Wikipedia has an entry for this which you probably would have found had you known it was called a here-doc, but otherwise it would be rather hard to figure it out.
In your particular case, there is no difference between using EOF and PETE. The only requirement is that the heredoc marker (the bit following <<) matches the line that ends the text.
For example, if one of your input lines was EOF, you couldn't really use that as a marker since the standard input would be terminated prematurely:
cat <<EOF
This section contains the line ...
EOF
but then has more stuff
and this line following is the real ...
EOF
In that case, you could use PETE (or anything else that doesn't appear in the text on its own line).
There are other options such as using quotes around the marker (so the indentation can look better) and the use of single or double quotes to control variable substitution.
If you go to the perlop page and search for <<EOF, it will hopefully all become clear.
See Quote and Quote-like Operators (it's pretty well explained).
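A minimal Perl sketch of the marker point, using the strings from the question (the variable names are illustrative):

```perl
use strict;
use warnings;

# The marker name is arbitrary: both here-docs produce the same string.
my $with_eof = <<EOF;
This is
a multiline
string
EOF

my $with_pete = <<PETE;
This is
a multiline
string
PETE

print "same text\n" if $with_eof eq $with_pete;

# A different marker only matters when the text itself contains a line "EOF".
my $contains_eof = <<PETE;
This section contains the line ...
EOF
but then has more stuff
PETE

print $contains_eof;
```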

How do I define a Windows file path as a variable?

I am playing around with a tiny script, and one thing I am trying to figure out is how to make a Perl variable refer to an executable, for example:
$putty = C:\putty.exe;
Whenever I run it like this, it tells me "C:\ is not a recognizable command". What am I doing wrong? I have also tried surrounding it in quotes, and that didn't help.
You should be quoting literal strings, for example like
my $putty = 'C:\putty.exe';
If this is news to you, you might have been missing out on the strict pragma before. I highly recommend having a look at that and using it in all of your code.
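A small sketch of why the quoting style matters for Windows paths. The C:\tools path is hypothetical and only there to show the escape problem:

```perl
use strict;
use warnings;

# Single quotes keep the backslash as typed.
my $putty = 'C:\putty.exe';

# In double quotes a literal backslash has to be doubled.
my $putty_dq = "C:\\putty.exe";

print "same path\n" if $putty eq $putty_dq;

# Hypothetical path: "\t" inside double quotes is a tab character,
# not a backslash followed by "t".
my $broken = "C:\tools";
print "contains a tab\n" if $broken =~ /\t/;
```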

Help me understand this Perl statement with <<'ESQ'

substr($obj_strptime,index($strptime,"sub")+6,0) = <<'ESQ';
shift; # package
....
....
ESQ
What is this ESQ and what is it doing here? Please help me understand these statements.
It marks the end of a here-doc section.
EOF is more traditional than ESQ though.
This construct is known as a here-doc (in a shell, because you're getting standard input from a document here rather than an external document on the file system somewhere).
It basically reads everything from the next line up to but excluding an end marker line. A shell uses that text as standard input to the program or command that you're running; Perl turns it into a string, which in your statement is assigned through substr. The end marker line is controlled by the text following the <<.
As an example, in bash (which I'm more familiar with than Perl), the command:
cat <<EOF
hello
goodbye
EOF
will run cat and then send two lines to its standard input (the hello and goodbye lines). Perl also has this feature, though the syntax is slightly different (as you would expect, given it's a different language) and the here-doc yields a string rather than standard input. Still, it's close enough for the explanation to hold.
Wikipedia has an entry for this which you probably would have found had you known it was called a here-doc, but otherwise it would be rather hard to figure it out.
You can basically use any suitable marker. For example, if one of your input lines was EOF, you couldn't really use that as a marker since the standard input would be terminated prematurely:
cat <<EOF
This section contains the line ...
EOF
but then has more stuff
and this line following is the real ...
EOF
In that case, you could use DONE (or anything else that doesn't appear in the text on its own line).
There are other options such as using quotes around the marker (so the indentation can look better) and the use of single or double quotes to control variable substitution.
If you go to the perlop page and search for <<EOF, it will hopefully all become clear.
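To tie this back to the statement in the question, here is a hedged sketch of what such a line does. The variable $code is a made-up stand-in for $obj_strptime, and the inserted text is shortened. With a length of 0, substr used as an lvalue inserts the here-doc text without overwriting anything, and the single-quoted 'ESQ' marker keeps that text literal:

```perl
use strict;
use warnings;

# Hypothetical source text standing in for $obj_strptime in the question.
my $code = "sub demo {\n    return 1;\n}\n";

# Insert just past the opening "{\n" of the sub body.
my $insert_at = index($code, "{") + 2;

# Length 0 means nothing is replaced; the here-doc text is inserted.
substr($code, $insert_at, 0) = <<'ESQ';
    shift; # package
ESQ

print $code;
```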