Perl command for same behaviour as readlink?

What is the equivalent Perl command to the GNU coreutils command readlink -f?
If any component of the file name except the last one is missing or unavailable, readlink produces no output and exits with a nonzero exit code. A trailing slash is ignored.

You can use Cwd:
use Cwd 'abs_path';
my $path = "/some/arbitrary/path";
print abs_path($path);
Test:
for q in exists imaginary imaginary/imaginary ; do
echo "$q"
echo -n "readlink -f: " ; readlink -f "$q"
echo -n "abs_path: " ; perl -MCwd=abs_path -E'say abs_path $ARGV[0]' "$q"
echo
done
Output:
exists
readlink -f: /home/eric/exists
abs_path: /home/eric/exists
imaginary
readlink -f: /home/eric/imaginary
abs_path: /home/eric/imaginary
imaginary/imaginary
readlink -f: abs_path:

As a total Perl rookie, I'm happy to say I have figured out this STDIN solution all by myself (after several tries, remember that Perl's learning curve IS known to be steep).
devnull's solution was great with no doubt, but it was a little too "scriptish" for my taste - whereas I'd sometimes just want to pipe a perl one-liner to an echo'ed string, like this:
echo "/home/user/somesymlinkedpath" | perl -MCwd=abs_path -nle 'print abs_path $_'
Since other people may also want to know how to write this kind of piped form (making perl read the argument from STDIN), I've decided to post it here too.

Related

perl one-liner to keep only desired lines

I have a text file (input.txt) like this:
NP_414685.4: 15-26, 131-138, 441-465
NP_418580.2: 493-500
NP_418780.2: 36-48, 44-66
NP_418345.2:
NP_418473.3: 1-19, 567-1093
NP_418398.2:
I want a perl one-liner that keeps only those lines in file where ":" is followed by number range (that means, here, the lines containing "NP_418345.2:" and "NP_418398.2:" get deleted). For this I have tried:
perl -ni -e "print unless /: \d/" -pi.bak input.txt
del input.txt.bak
But it shows exactly same output as the input file.
What will be the exact pattern that I can match here?
Thanks
First, print unless means print if not -- opposite to what you want.
More to the point, it doesn't make sense using both -n and -p, and when you do -p overrides the other. While both of them open the input file(s) and set up the loop over lines, -p also prints $_ for every iteration. So with it you are reprinting every line. See perlrun.
Finally, you seem to be deleting the .bak file right away. If so, don't create it: use just -i.
Altogether:
perl -i -ne 'print if /:\s*\d+\s*-\s*\d+/' input.txt
If you do want to keep the backup file use -i.bak instead of -i
You can see the code equivalent to a one-liner with particular options with B::Deparse (via O module)
Try: perl -MO=Deparse -ne 1 and perl -MO=Deparse -pe 1
This way:
perl -i.bak -ne 'print if /:\s+\d+-\d/' input.txt
This:
perl -ne 'print if /:\s*(\d+\s*-\s*\d+\s*,?\s*)+\s*$/' input.txt
Prints:
NP_414685.4: 15-26, 131-138, 441-465
NP_418580.2: 493-500
NP_418780.2: 36-48, 44-66
NP_418473.3: 1-19, 567-1093
I'm not sure if you want to match lines that are possibly like this:
NP_418580.2: 493-500, asdf
or this:
NP_418580.2: asdf
This answer will not print these lines, if given to it.

grep regex to perl or awk

I have been using a Linux environment and recently migrated to Solaris. Unfortunately, one of my bash scripts requires grep with the -P switch (PCRE support). As Solaris doesn't support the PCRE option for grep, I have to find another solution. pcregrep seems to have an obvious loop bug, and the sed -r option is unsupported!
I hope that using perl or nawk will solve the problem on Solaris.
I have not yet used perl in my scripts and am unfamiliar with both its syntax and its flags.
Since it is PCRE, I believe a Perl scripter can help me out in a matter of minutes. The patterns should match over multiple lines.
Which would be the better solution in terms of efficiency, awk or perl?
Thanks for the replies.
These are some grep to perl conversions you might need:
grep -P PATTERN FILE(s) ---> perl -nle 'print if m/PATTERN/' FILE(s)
grep -Po PATTERN FILE(s) ---> perl -nle 'print $1 while m/(PATTERN)/g' FILE(s)
That's my guess as to what you're looking for, if grep -P is out of the question.
Here's a shorty:
grep -P /regex/ ====> perl -ne 'print if /regex/;'
The -n takes each line of the file as input. Each line is put into a special perl variable called $_ as Perl loops through the whole file.
The -e says the Perl program is on the command line instead of passing it a file.
The Perl print command automatically prints out whatever is in $_ if you don't specify for it to print out anything else.
The if /regex/ matches the regular expression against whatever line of your file is in the $_ variable.

Perl command line search and replace with multiple expressions

I am using Perl to search and replace multiple regular expressions:
When I execute the following command, I get an error:
prompt> find "*.cpp" | xargs perl -i -pe 's/##(\W)/\1/g' -pe 's/(\W)##/\1/g'
syntax error at -e line 2, near "s/(\W)##/\1/g"
Execution of -e aborted due to compilation errors.
xargs: perl: exited with status 255; aborting
Having multiple -e is valid in Perl, then why is this not working? Is there a solution to this?
Several -e's are allowed.
You are missing the ';'
find "*.cpp" | xargs perl -i -pe 's/##(\W)/\1/g;' -pe 's/(\W)##/\1/g;'
Perl statements have to be separated by ;.
Only the final statement in a block doesn't need a terminating semicolon.
So a single -e without ; will work, but you have to add ; when you use multiple -e options.
Having multiple -e values is valid, but is it useful? The values from the multiple -e are merely combined into one program, and it's up to you to ensure that together they make a syntactically correct program. The B::Deparse module can show you what perl thinks the program is:
$ perl -MO=Deparse -e 'print' -e 'q(Hello' -e ')'
print "Hello\n";
-e syntax OK
A curious thing to note is that a newline snuck in there. Think about how it got there to see what else perl is doing to combine multiple -e values.
In your program, you are substituting on the current line, then taking the modified line and substituting again. That's better written as:
prompt> find "*.cpp" | xargs perl -i -pe 's/##(\W)/\1/g; s/(\W)##/\1/g'
Now, if you are building up this command line by adding more and more -e through some automated process and you don't know ahead of time what you get, maybe those -e make sense. However, you might consider that you can do the same thing to build up the string you give to -e. I don't know what might be better because you didn't explain why you are doing it that way.
But, I suspect that in some cases, people are actually thinking about having only one substitution work. They want to try one and if its pattern doesn't work, try a different one until one succeeds. In that case you don't want to separate the substitutions by semicolons. Use the short-circuiting || instead. The s/// returns the number of substitutions it made and || will stop (short circuit) when it finds a true value:
prompt> find "*.cpp" | xargs perl -i -pe 's/##(\W)/\1/g || s/(\W)##/\1/g'
And note, you only need one -p. It only does its job once. Here's the program with multiple -p deparsed:
$ perl -MO=Deparse -i -pe 's/##(\W)/\1/g;' -pe 's/(\W)##/\1/g;'
BEGIN { $^I = ""; }
LINE: while (defined($_ = readline ARGV)) {
s/##(\W)/$1/g;
s/(\W)##/$1/g;
}
continue {
die "-p destination: $!\n" unless print $_;
}
-e syntax OK
It's the same thing as having only one -p:
$ perl -MO=Deparse -pi -e 's/##(\W)/\1/g;' -e 's/(\W)##/\1/g;'
BEGIN { $^I = ""; }
LINE: while (defined($_ = readline ARGV)) {
s/##(\W)/$1/g;
s/(\W)##/$1/g;
}
continue {
die "-p destination: $!\n" unless print $_;
}
-e syntax OK
Thanks so much! You helped me reduce my ascii / decimal / 8-bit binary table printer enough to fit in a tweet:
for i in {32..126};do printf "'\x$(printf %x $i)'(%3i) = " $i; printf '%03o\n' $i | perl \
-pe 's#0#000#g;' -pe 's#1#001#g;' -pe 's#2#010#g;' -pe 's#3#011#g;' \
-pe 's#4#100#g;' -pe 's#5#101#g;' -pe 's#6#110#g;' -pe 's#7#111#g' ; done | \
perl -pe 's#= 0#= #'

Only print matching lines in perl from the command line

I'm trying to extract all ip addresses from a file. So far, I'm just using
cat foo.txt | perl -pe 's/.*?((\d{1,3}\.){3}\d{1,3}).*/\1/'
but this also prints lines that don't contain a match. I can fix this by piping through grep, but this seems like it ought to be unnecessary, and could lead to errors if the regexes don't match up perfectly.
Is there a simpler way to accomplish this?
Try this:
cat foo.txt | perl -ne 'print if s/.*?((\d{1,3}\.){3}\d{1,3}).*/\1/'
or:
<foo.txt perl -ne 'print if s/.*?((\d{1,3}\.){3}\d{1,3}).*/\1/'
It's the shortest alternative I can think of while still using Perl.
However this way might be more correct:
<foo.txt perl -ne 'if (/((\d{1,3}\.){3}\d{1,3})/) { print $1 . "\n" }'
If you've got grep, then just call grep directly:
grep -Po "(\d{1,3}\.){3}\d{1,3}" foo.txt
You've already got a suitable answer of using grep to extract the IP addresses, but just to explain why you were seeing non-matches being printed:
perldoc perlrun will tell you about all the options you can pass Perl on the command line.
Quoting from it:
-p causes Perl to assume the following loop around your program, which makes it
iterate over filename arguments somewhat like sed:
LINE:
while (<>) {
... # your program goes here
} continue {
print or die "-p destination: $!\n";
}
You could have used the -n switch instead, which does something similar but does not automatically print. Adding -l makes print append a newline after each address, for example:
cat foo.txt | perl -nle '/((?:\d{1,3}\.){3}\d{1,3})/ and print $1'
Also, there's no need to use cat; Perl will open and read the filenames you give it, so you could say e.g.:
perl -nle '/((?:\d{1,3}\.){3}\d{1,3})/ and print $1' foo.txt
ruby -0777 -ne 'puts $_.scan(/((?:\d{1,3}\.){3}\d{1,3})/)' file

Why does Perl and /bin/sha1 give different results?

I'm confused as to why the following return different SHA1s
$ perl -MDigest::SHA1 -E'say Digest::SHA1::sha1_hex("http://i.aultec.com/v/8066/Originals/1FTVX12585NA9832010.jpg");'
e1133fa3b7ea0bfb8ffa4d877932ed6c6fa10cef
$ echo "http://i.aultec.com/v/8066/Originals/1FTVX12585NA9832010.jpg" | sha1sum
5c3731e83ae0184ed93b595b9f5604863dd331e6 -
Which one is right? Am /I/ doing it wrong?
$ perl -MDigest::SHA -E'say Digest::SHA::sha1_hex("http://i.aultec.com/v/8066/Originals/1FTVX12585NA9832010.jpg");'
e1133fa3b7ea0bfb8ffa4d877932ed6c6fa10cef
Digest::SHA, the successor to Digest::SHA1, gives the same digest, so the Perl result is right.
Both are right. Your echo command includes a newline at the end, and the Perl string doesn't. Try with echo -n ...
Perl is giving you the hash of the literal string you entered, whereas echo is appending a newline. If you tell echo to not add a newline, you'll get the same result:
drewfus:~$ perl -MDigest::SHA1 -E'say Digest::SHA1::sha1_hex("foo");'
0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33
drewfus:~$ echo -n "foo" | sha1sum
0beec7b5ea3f0fdbc95d0dd47f3c5bc275da8a33 -
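Since echo -n is not portable across every shell, printf is a safer way to feed a string without a trailing newline. A minimal sketch, assuming sha1sum from coreutils is available:

```shell
# printf '%s' writes exactly the bytes of the string, no newline.
without_newline=$(printf '%s' foo | sha1sum | cut -d' ' -f1)

# echo appends a newline, so a different byte sequence gets hashed.
with_newline=$(echo foo | sha1sum | cut -d' ' -f1)

echo "printf: $without_newline"
echo "echo:   $with_newline"
```

The printf digest should match the 0beec7b5... value shown above, while the echo digest differs.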
This is such a frequent mistake, and I've made it many times: the echo command also outputs a trailing newline.