I'm trying to access a series of webpages in Perl and write them to a series of files. The code I have looks like this:
open IN , "AbsoluteFinalData.txt"; #Each line has a number and ID name separated by a tab.
while(my $line = <IN>){
chop $line; #removes newline at the end
my @first_split = split(/\t/, $line);
my $IDnum = $first_split[0];
my $Uniprot = $first_split[1];
system('Uniprot=$Uniprot; curl -o "$Uniprot.html" http://pfam.xfam.org/\protein/.$Uniprot'); #More stuff after
The program, however, is giving me fits when I try to use $Uniprot in system(). Is there any way to use a variable defined in the Perl script inside system()?
system('Uniprot=$Uniprot; curl -o "$Uniprot.html" http://pfam.xfam.org/\protein/.$Uniprot');
You used single quotes, which don't interpolate, so the literal command
Uniprot=$Uniprot; curl -o "$Uniprot.html" http://pfam.xfam.org/\protein/.$Uniprot
is executed as written.
You want to interpolate your variables, which means using double quotes (and escaping any double quotes inside the string):
system("Uniprot=$Uniprot; curl -o \"$Uniprot.html\" http://pfam.xfam.org/\protein/.$Uniprot");
Or the qq quote-like operator which functions like the double quote but avoids needing to escape contained double quotes:
system(qq(Uniprot=$Uniprot; curl -o "$Uniprot.html" http://pfam.xfam.org/\protein/.$Uniprot));
Given that you're not relying on the system command to perform any shell interpretation or file I/O redirection, you can achieve everything you want safely like this:
system 'curl', '-o', "$Uniprot.html", "http://pfam.xfam.org/protein/.$Uniprot";
The "list" version of system is safer to use than the single string version because it prevents shell command injection attacks.
Note also the use of double quotes to enable Perl's own variable interpolation. There's also no need to create the shell-local variable Uniprot=$Uniprot: curl doesn't use it, and it was only there as an attempt to perform the variable interpolation yourself.
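As a rough illustration of why the list form is safer, the sketch below passes a deliberately awkward argument to a child process without involving a shell. It assumes a Unix-like system; the sample argument and the EXPECTED_ARG environment variable are made up for the demonstration, and the child is simply the running perl ($^X).

```perl
use strict;
use warnings;

# Hypothetical argument full of shell metacharacters.
my $arg = q{it's "odd"; $HOME `date` *};

# Hand the expected value to the child through the environment so the
# child can compare it with what it actually received in @ARGV.
$ENV{EXPECTED_ARG} = $arg;

# List form: no shell is involved, so $arg reaches the child verbatim.
my $status = system($^X, '-e',
    'exit($ARGV[0] eq $ENV{EXPECTED_ARG} ? 0 : 1)', $arg);

my $message = $status == 0
    ? "argument arrived unchanged"
    : "argument was altered";
print "$message\n";
```

Had the same text been pasted into a single command string, the shell would have tried to interpret the quotes, the semicolon, the backticks and the asterisk.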
Perl only interpolates variables within double quotes ("..."), not single quotes ('...').
system("Uniprot=$Uniprot; curl -o \"$Uniprot.html\" http://pfam.xfam.org/\protein/.$Uniprot");
Will do the substitution you're looking for.
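A small sketch of the three quoting styles side by side may make the difference concrete. The ID value is hypothetical and nothing is executed; the strings are only printed.

```perl
use strict;
use warnings;

my $Uniprot = 'P12345';    # hypothetical ID, for illustration only

my $single = 'curl -o "$Uniprot.html"';      # no interpolation
my $double = "curl -o \"$Uniprot.html\"";    # interpolated, quotes escaped
my $qq     = qq(curl -o "$Uniprot.html");    # interpolated, no escaping needed

print "$single\n";    # curl -o "$Uniprot.html"
print "$double\n";    # curl -o "P12345.html"
print "$qq\n";        # curl -o "P12345.html"
```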
Suppose I have a text file with content like below:
'Jack', is a boy
'Jenny', is a girl
...
...
...
I'd like to use Perl on the command line to capture only the names between pairs of single quotes:
cat text| perl -ne 'print $1."\n" if/\'(\w+?)\'/'
That is the command I ran, but it didn't work. It seems the ' character confused the shell.
I know there are other options, like writing a Perl script, but given my circumstances I'd like to find a way to do this on the shell command line.
Please advise.
The shell has the interesting property of concatenating quoted strings. Or rather, '...' or "..." should not be considered strings, but modifiers for available escapes. The '...'-surrounded parts of a command have no escapes available. Outside of '...', a single quote can be passed as \'. Together with the concatenating property, we can embed a single quote like
$ perl -E'say "'\''";'
'
into the -e code. The first ' exits the no-escape zone, \' is our single quote, and ' re-enters the escapeless zone. What perl saw was
perl // argv[0]
-Esay "'"; // argv[1]
This would make your command
cat text| perl -ne 'print $1."\n" if/'\''(\w+?)'\''/'
(quotes don't need escaping in regexes), or
cat text| perl -ne "print \$1.qq(\n) if/'(\w+?)'/"
(using double quotes to surround the command, but using qq// for double quoted strings and escaping the $ sigil to avoid shell variable interpolation).
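Another way to sidestep the problem, sketched here as a small script rather than a one-liner, is to avoid typing a literal single quote in the pattern at all: \x27 is the hex escape for ' in a Perl regex. The sample lines come from the question; the @lines and @names arrays exist only for the demonstration.

```perl
use strict;
use warnings;

my @lines = ("'Jack', is a boy", "'Jenny', is a girl");

my @names;
for my $line (@lines) {
    # \x27 matches a single quote without writing one in the source.
    push @names, $1 if $line =~ /\x27(\w+?)\x27/;
}
print "$_\n" for @names;
```

The same pattern should work unchanged inside a single-quoted one-liner, for example perl -ne 'print "$1\n" if /\x27(\w+?)\x27/' text, because it contains no ' for the shell to trip over.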
Here are some methods that do not require manually escaping the perl statement:
(Disclaimer: I'm not sure how robust these are – they haven't been tested extensively)
Cat-in-the-bag technique
perl -ne "$(cat)" text
You will be prompted for input. To terminate cat, press Ctrl-D.
One shortcoming of this: The perl statement is not reusable. This is addressed by the variation:
pline=$(cat)
perl -ne "$pline" text
The bash builtin, read
Multiple lines:
read -rd'^[' pline
Single line:
read -r pline
Reads user input into the variable pline.
The meaning of the switches:
-r: stop read from interpreting backslashes (e.g. by default read interprets \w as w)
-d: determines what character ends the read command.
^[ is the character corresponding to Esc, you insert ^[ by pressing Ctrl-V then Esc.
Heredoc and script.
(You said no scripts, but this is quick and dirty, so might as well...)
cat << 'EOF' > scriptonite
print $1 . "\n" if /'(\w+)'/
EOF
then you simply
perl -n scriptonite text
#!/usr/bin/perl
$command = "SetBaseStationParam(\\\"PDP_ACTIVATION_REJECT\\\",0);";
system("boa.exp $command");
The boa.exp script takes this command, logs in to a Linux machine, and executes it there.
#!/usr/bin/expect
set timeout 5
set arg1 [lindex $argv 0]
spawn ssh root@10.xx.xx.xx
expect "password:"
send "pass\r"
expect "$"
send "$arg1\r"
expect "$"
But this script removes the first double quote in the command. The output is
SetBaseStationParam(\PDP_ACTIVATION_REJECT",0);
Expected output is
SetBaseStationParam("PDP_ACTIVATION_REJECT",0);
Please let me know if there is any solution for this
When you use double backslashes it escapes the backslash, so the proper way to escape a quote is \".
However, a better solution is to use qq(). It can be used with a great variety of characters as delimiter, such as | for example:
$command = qq|SetBaseStationParam("PDP_ACTIVATION_REJECT",0)|;
Or in your case, even use single quotes
$command = 'SetBaseStationParam("PDP_ACTIVATION_REJECT",0)';
You should also be aware that omitting
use strict;
use warnings;
is a very bad idea indeed.
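To see the quoting options from this answer next to each other, here is a short sketch that only builds and prints the strings. The variable names are made up for the comparison.

```perl
use strict;
use warnings;

# Three equivalent ways to write the command with plain double quotes in it.
my $escaped = "SetBaseStationParam(\"PDP_ACTIVATION_REJECT\",0);";
my $qq      = qq|SetBaseStationParam("PDP_ACTIVATION_REJECT",0);|;
my $single  = 'SetBaseStationParam("PDP_ACTIVATION_REJECT",0);';

# The over-escaped version from the question keeps literal backslashes.
my $over    = "SetBaseStationParam(\\\"PDP_ACTIVATION_REJECT\\\",0);";

print "$escaped\n";    # SetBaseStationParam("PDP_ACTIVATION_REJECT",0);
print "$over\n";       # SetBaseStationParam(\"PDP_ACTIVATION_REJECT\",0);
```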
I'm trying to use GNU date to get the seconds between two dates. The reason I'm using GNU date is performance (in testing it was 10x faster than Perl). However, one of my arguments is a Perl variable, like this:
my $b_row="2012-01-05 20:20:22";
my $exec =qx'CUR_DATE=`echo $(date +"%F %T")` ; echo $(($(date -d "$CUR_DATE" +%s)-$(date -d "$b_row" +%s)))';
The problem is that $b_row is not being expanded. I've tried a couple of different solutions (IPC::System::Simple being one) and tried adjusting the backticks, with no success. Any ideas how to do this properly? The main thing is that I need to capture the output from the bash command.
Make it easier on yourself and do the minimum amount of work in the shell. This works for me:
my $b_row = '2012-01-05 20:20:22';
my $diff = qx(date -d "\$(date +'%F %T')" +%s) -
qx(date -d "$b_row" +%s);
Just be absolutely sure $b_row doesn't have any shell metacharacters in it.
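One way to be sure, sketched below, is to check the value against a strict pattern before it goes anywhere near a shell. The helper name is_safe_timestamp is hypothetical, and the pattern assumes the "YYYY-MM-DD HH:MM:SS" layout used in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: accept only "YYYY-MM-DD HH:MM:SS", nothing else.
sub is_safe_timestamp {
    my ($value) = @_;
    return $value =~ /\A\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\z/ ? 1 : 0;
}

for my $candidate ('2012-01-05 20:20:22', '2012-01-05 $(id)') {
    my $verdict = is_safe_timestamp($candidate) ? 'ok' : 'rejected';
    print "$candidate: $verdict\n";
}
```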
That's because you use ' as the qx delimiter. From perlop:
Using single-quote as a delimiter protects the command from Perl's double-quote interpolation, passing it on to the shell instead:
$perl_info = qx(ps $$); # that's Perl's $$
$shell_info = qx'ps $$'; # that's the new shell's $$
qx has the feature of letting you choose a convenient delimiter, including the option of whether to interpolate the string or not (by choosing ' as the delimiter). For this use case, sometimes you want interpolation and sometimes you don't, so qx (and backticks) may not be the right tool for the job.
readpipe is probably a better tool. Like the system EXPR command, it takes an arbitrary scalar as input, and you have all of Perl's tools at your disposal to construct that scalar. One way to do it is:
my $exec = readpipe
'CUR_DATE=`echo $(date +"%F %T")` ;' # interp not desired
. ' echo $(($(date -d "$CUR_DATE" +%s)-$(date -d '
. qq/"$b_row"/ # now interp is desired
. ' +%s)))'; # interp not desired again
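The same mix of interpolated and non-interpolated pieces can be tried with a harmless command. This is only a sketch and assumes a Unix-like /bin/sh; the shell variable X and the echo command stand in for the date arithmetic above.

```perl
use strict;
use warnings;

my $b_row = '2012-01-05 20:20:22';

my $command = 'X=from_shell; echo "$X" '    # single quotes: $X is left for the shell
            . qq("$b_row");                 # qq: $b_row is filled in by Perl

my $output = readpipe($command);
chomp $output;
print "$output\n";    # from_shell 2012-01-05 20:20:22
```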
I want to run some commands using the system() function. I do it this way:
execute_command_error("trash-put '/home/$filename'");
Where execute_command_error will report if there was an error with whatever system command it ran. I know I could just unlink the file using Perl commands, but I want to delete stuff using trash-put as it's a type of recycling program.
My problem is that $filename will sometimes have apostrophes, quotes, and other weird characters in it that mess up the system command or Perl itself.
Generate the command name and arguments as an array, and pass that to system:
my(@command) = ("trash-put", 'home/$filename');
system @command;
This means that Perl does not invoke the shell to do any metacharacter expansion (or I/O redirection, or command piping, or ...). It does mean it does exactly what you told it to do.
sub execute_command_error
{
system @_;
}
Borrowing information from the copious collection of comments:
Which is clearly documented in perldoc -f system or at perldoc.perl.org/functions/system.html (@Ether).
(See also the discussion of 'exec' below which is closely related.)
Did you mean to put $filename in single quotes? (@mobrule).
I did intend to use single quotes - I'm demonstrating that the $filename does not get expanded by Perl or Shell...In my test script, I used 'my.$file', and that gave me a file with a $ in the name - as I intended.
I think the desired quoting if you do want to invoke the shell (for example if you want some piping) is $command_line = "\"$command\" \"$arg1\" \"$arg2\"...". (@Jefromi).
Adding double quotes around the arguments won't help with embedded $, backtick [1], '$(...)' and related notations. You more nearly need single quotes around things, but then you need to rewrite embedded single quotes as "'\''", which generates a single quote to terminate the current single-quoted argument, a backslash-quote combination to represent a single quote, and another single quote to resume the single-quoted argument.
This would be a good solution if I used the system command directly; however, I am using webmin's execute_command function, which is a bit over my head, so I wouldn't know how to edit it to allow for arrays. Could you expand on the rewrite of embedded single quotes as "'\''"? This is what I will use, for now. (@Brian)
Roughly speaking, the way the (Unix) shells treat single quotes is "everything from the first single quote up to the next is literal text, no metacharacters". So, to get the shell to treat something as literal text, enclose it in single quotes. That deals with everything except single quotes themselves. As my comment says, you have to use the 4-character replacement string to get a single quote embedded into the middle of a single-quoted argument.
There is probably a neater way to do it than this (using one or two map operations, perhaps), but this should work:
for (my $i = 0; $i < scalar(@command); $i++)
{
$command[$i] =~ s/'/'\\''/g; # Replace single quotes by the magic sequence
$command[$i] = "'$command[$i]'"; # Wrap value in single quotes
}
You can then join the array to make a single string for transmission to execute_command.
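The map-based variant hinted at above might look like the following sketch. The helper name shell_quote and the sample file name are made up; the result is only printed here, not passed to execute_command.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap one argument in single quotes for a POSIX shell.
sub shell_quote {
    my ($arg) = @_;
    $arg =~ s/'/'\\''/g;    # each ' becomes the four characters '\''
    return "'$arg'";
}

my @command = ('trash-put', "/home/it's a file.txt");    # made-up file name
my $command_line = join ' ', map { shell_quote($_) } @command;
print "$command_line\n";    # 'trash-put' '/home/it'\''s a file.txt'
```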
It's better to write that as system { $command[0] } @command to handle the case where @command has one element. This is one of the things I talk about in the "Secure Programming Techniques" chapter of Mastering Perl. (@briandfoy).
As a general rule, I'll accept this correction. I'm not sure it is crucial in this instance, though, where the command name is provided by the program and it is only the arguments that are possibly provided by the user. The command name 'trash-put' is safe from shell expansions (IFS is reset to default by the shell when it starts, so that avenue of attack is not available).
This issue is discussed in the 'perldoc -f exec' man page:
If you don't really want to execute the first argument, but want to lie to the program you are executing about its own name, you can specify the program you actually want to run as an "indirect object" (without a comma) in front of the LIST. (This always forces interpretation of the LIST as a multivalued list, even if there is only a single scalar in the list.)
Example:
$shell = '/bin/csh';
exec $shell '-sh'; # pretend it's a login shell
or, more directly,
exec {'/bin/csh'} '-sh'; # pretend it's a login shell
When the arguments get executed via the system shell, results are subject to its quirks and capabilities. See "STRING" in perlop for details.
Using an indirect object with exec or system is also more secure. This usage (which also works fine with system()) forces interpretation of the arguments as a multivalued list, even if the list had just one argument. That way you're safe from the shell expanding wildcards or splitting up words with whitespace in them.
@args = ( "echo surprise" );
exec @args; # subject to shell escapes
# if @args == 1
exec { $args[0] } @args; # safe even with one-arg list
The first version, the one without the indirect object, ran the echo program, passing it "surprise" as an argument. The second version didn't; it tried to run a program named "echo surprise", didn't find it, and set $? to a non-zero value indicating failure.
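The difference can be observed with system instead of exec, so that the script keeps running afterwards. This is a sketch that assumes a Unix-like system with an echo program on the PATH; the variable names are made up.

```perl
use strict;
use warnings;
no warnings 'exec';    # silence the expected "Can't exec" warning

my @args = ('echo surprise');

# Indirect object: the single element is treated as a program name,
# so Perl looks for a program literally called "echo surprise".
my $indirect = system { $args[0] } @args;

# Plain form: the single string is split into words and echo is run.
my $plain = system @args;

print "indirect: ", ($indirect == 0 ? "ran" : "failed"),
      ", plain: ",  ($plain    == 0 ? "ran" : "failed"), "\n";
```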
1 How do you get a back-tick to display in Markdown?
As stated in perldoc -f system:
If there is more than one argument in LIST, or if LIST is an array with more than one value, starts the program given by the first element of the list with arguments given by the rest of the list. If there is only one scalar argument, the argument is checked for shell metacharacters, and if there are any, the entire argument is passed to the system's command shell for parsing (this is "/bin/sh -c" on Unix platforms, but varies on other platforms). If there are no shell metacharacters in the argument, it is split into words and passed directly to "execvp", which is more efficient.
I like to use IPC::System::Simple's version of system(), which gives more control, such as being able to capture various exceptions and handle certain error codes as "bad" and others as "good":
use IPC::System::Simple qw(system);
system("cat *.txt"); # will die on failure
I'm working on a Perl script in which I was hoping to capture a string entered on the command line without having to enter the quotes (similar to bash's "$@" ability). I'll be using this command quite a bit, so I was hoping that this is possible. If I have:
if ($ARGV) {
I have to put the command line string in quotes. I'd rather do the command something like this:
htmlencode some & HTML <> entities
Without the quotes. Is there a way to do this in Perl?
The @ARGV array contains the arguments to the Perl script - no quotes needed.
That said, the question asks about:
I have to put the command line string in quotes. I'd rather do the command something like this:
htmlencode some & HTML <> entities
Without the quotes. Is there a way to do this in perl?
Well, if the command shown is written at the shell command line, you have to obey shell conventions - which means escaping the '&' and '<>' to prevent the shell from interpreting them. Likewise, within a Perl script, that sequence would need to be protected from Perl. Maybe you'd write:
system "htmlencode", "some", "&", "HTML", "<>", "entities";
That is, everything would have to be in quotes - but that notation would avoid executing the shell and having the shell interpret the commands.
Alternatively again, if you put the arguments into an array (with quotes at the time the array is loaded), then you could pass that array to system and not use any quotes:
my @args = ( "htmlencode", "some", "&", "HTML", "<>", "entities" );
system @args;
But I think the question is confused.
You put quotes around $@ in bash so that it expands to have each element in the array quoted. The reason to do that is so that each element of the array continues to be treated as a single argument when you pass them all to the next command.
The analogue to that in Perl is when you want to pass those parameters to another external command. If you're running the external program with the backtick operators, then you'd need to quote each parameter, but if you use system, then Perl will take care of keeping all the parameters separate for you.
In fact, separate parameters are the way programs are executed on Unix anyway. The single-string command-line format is there because we need to be able to type things at the command prompt. Like all shells, bash has special rules about how to split that single string into multiple arguments. But if you already have them separated in a Perl array, don't put them back into a single string. Keep them separate.
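On the receiving side, a script like the hypothetical htmlencode from the question could simply rejoin whatever words arrive in @ARGV. The encoder below is a minimal sketch covering only &, < and >; it is an assumption for illustration, not the asker's actual script.

```perl
use strict;
use warnings;

# Minimal encoder for illustration; handles only &, < and >.
sub html_encode {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must come first so later entities are not re-encoded
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Each shell word arrives as its own element; rejoin them with spaces.
if (@ARGV) {
    my $encoded = html_encode(join ' ', @ARGV);
    print "$encoded\n";
}
```

The shell still has to be kept away from the special characters when the script is called, for example with htmlencode 'some & HTML <> entities' or htmlencode some \& HTML \<\> entities.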