I am trying to use variable interpolation in a replacement string including $1, $2,...
However, I can't get it to expand $1 into the replacement. I eventually will have the
$pattern and $replacement variables be read from a configuration file, but even
setting them manually doesn't work.
In the example script, you can see that the $1 (which should be 'DEF') is not
expanded in $new_name, but it is in $new_name2 (without variables).
Adding an 'e' flag to the substitution doesn't help.
How do I fix this?
Matt
EXAMPLE CODE:
#!/usr/local/bin/perl
use strict;
my $old_name = 'ABC_DEF_GHI';
my $pattern = 'ABC_(...)_GHI';
my $replacement = 'CBA_${1}_IHG';
# using variables - doesn't work
my $new_name = $old_name;
$new_name =~ s|$pattern|$replacement|;
printf("%s --> %s\n", $old_name, $new_name);
# not using variables - does work
my $new_name2 = $old_name;
$new_name2 =~ s|ABC_(...)_GHI|CBA_${1}_IHG|;
printf("%s --> %s\n", $old_name, $new_name2);
OUTPUT:
ABC_DEF_GHI --> CBA_${1}_IHG
ABC_DEF_GHI --> CBA_DEF_IHG
You need to make these changes in your code:
my $replacement = '"CBA_$1_IHG"'; #note the single and double quotes
# ...
$new_name =~ s|$pattern|$replacement|ee; #note the double "ee", double evaluation
See this SO answer for more information
/e treats $replacement as Perl code. The Perl code $replacement simply returns the value it contains.
If you want to evaluate the contents of $replacement as Perl code, you need
s/$search/ my $s = eval $replacement; die $@ if $@; $s /e
which can be written as
s/$search/$replacement/ee
Note that since $replacement is expected to contain Perl code, it means that this can be used to execute arbitrary Perl code.
A better solution is to realise you are writing your own subpar templating system, and use an existing one instead. String::Interpolate understands the templating syntax you are currently using:
use String::Interpolate qw( interpolate );
s/$search/interpolate $replace/e
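If the replacement only ever needs $1, $2 and so on, one way to avoid eval entirely is to treat the configured string as a tiny template and fill in the captures by hand. The following is a minimal sketch in core Perl, not something taken from the answers above; expand_template and substitute_with_template are hypothetical names, and it assumes the template contains only $N or ${N} placeholders.

```perl
use strict;
use warnings;

# Hypothetical helper: replace $N or ${N} in a template with the N-th capture.
# Placeholders that refer to a missing group expand to the empty string.
sub expand_template {
    my ($template, @captures) = @_;
    $template =~ s/\$(?:\{(\d+)\}|(\d+))/
        my $n = defined $1 ? $1 : $2;
        ($n >= 1 && defined $captures[$n - 1]) ? $captures[$n - 1] : '';
    /ge;
    return $template;
}

# Hypothetical helper: apply $pattern once and splice in the filled template.
sub substitute_with_template {
    my ($text, $pattern, $template) = @_;
    return $text unless $text =~ /$pattern/;

    # Remember where the whole match starts and ends.
    my ($start, $end) = ($-[0], $+[0]);

    # Collect captures 1..N from the offset arrays (no eval, no symbolic refs).
    my @captures = map {
        defined $-[$_] ? substr($text, $-[$_], $+[$_] - $-[$_]) : undef
    } 1 .. $#+;

    substr($text, $start, $end - $start, expand_template($template, @captures));
    return $text;
}

print substitute_with_template('ABC_DEF_GHI', 'ABC_(...)_GHI', 'CBA_${1}_IHG'), "\n";
# prints "CBA_DEF_IHG"
```

Because the template is never evaluated as code, a hostile configuration file can at worst produce an odd string. The pattern itself is still interpolated as a regex, so it should also come from a source you trust.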
Hopefully you can help a scientist decipher what's wrong with the code I'm trying to run to clean up some NGS results. The Perl file itself comes from https://github.com/mtokuyama/ERVmap, though I am posting the code below for reference. The other Perl files in the package work just fine and, while I have built up a passing ability to use the Linux terminal, Perl is a little beyond me.
The Linux system I'm using is currently running: Ubuntu 16.04.6 LTS
This is the Perl code I'm trying to run, using the following command line on Linux as instructed by their GitHub page:
perl clean_htseq.pl ./ c c2 __
#!/usr/bin/env perl
#$Id: run_clean_htseq.pl,v 1.2 2015/03/02 17:24:35 yk336 Exp $
#
# create pbs file
#
use warnings;
use strict;
use File::Basename;
use POSIX;
my $dir = shift;
my $e1 = shift;
my $e2 = shift;
my $stop = shift;
die "$e1 eq $e2" if ($e1 eq $e2);
my $find = "find $dir -name \"*${e1}\"";
my $out = `$find`;
my @files = split(/\n/, $out);
for my $f (@files) {
my $o = $f;
$o =~ s/${e1}$/$e2/;
my $cmd = "./clean_htseq.pl $stop $f > $o";
print "$cmd\n";
system($cmd);
}
The first error I had was that _clean_htseq.pl_ wasn't found (line 30, already changed to the fix in the code above), which I solved by adding ./ in front of it and giving the software permission to use the script file.
My current issue with the code/command line is the following error:
Use of uninitialized value $e2 in string eq at ./clean_htseq.pl line 18.
find: warning: Unix filenames usually don't contain slashes (though pathnames do). That means that '-name ‘*./SRR7251667.c’' will probably evaluate to false all the time on this system. You might find the '-wholename' test more useful, or perhaps '-samefile'. Alternatively, if you are using GNU grep, you could use 'find ... -print0 | grep -FzZ ‘*./SRR7251667.c’'.
This has been tracked down to the "__" at the end of the command line. While I'm sure this is supposed to mean something to the script, I removed it, which resulted in the following error:
Use of uninitialized value $stop in concatenation (.) or string at clean_htseq.pl line 30.
./clean_htseq.pl ./SRR7251667.c > ./SRR7251667.c2
Use of uninitialized value $e1 in string eq at ./clean_htseq.pl line 18.
Use of uninitialized value $e2 in string eq at ./clean_htseq.pl line 18.
Use of uninitialized value $e1 in concatenation (.) or string at ./clean_htseq.pl line 18.
Use of uninitialized value $e2 in concatenation (.) or string at ./clean_htseq.pl line 18.
eq at ./clean_htseq.pl line 18.
An error also occurs when I remove the "." from "./": it complains that it cannot find the _clean_htseq.pl_ file, which is in the working directory.
Your problem seems to be here:
my $dir = shift;
my $e1 = shift;
my $e2 = shift;
my $stop = shift;
Outside of a subroutine, shift works on @ARGV, the array that holds the command line arguments. You shift four times, so you need four arguments:
perl clean_htseq.pl ./ c c2 __
The inner call only seems to give it two, and $stop has no value (so you are effectively giving it fewer than two):
./clean_htseq.pl $stop $f
You can't just remove arguments and hope things still work out. Likely you're going to have to look at the source to see what those things mean (which should motivate you as a scientist to use good variable names and document code—Best Practices for Scientific Computing).
A first step may be to set defaults. The defined-or operator does well here:
use v5.10;
my $dir = shift // 'default_dir';
my $e1 = shift // 'default_value';
my $e2 = shift // 'default_value';
my $stop = shift // 'default_value';
Or, you could just give up if there aren't enough arguments. An array in scalar context gives you the number of elements in the array (although it doesn't guarantee anything about their values):
die "Need four arguments!\n" unless @ARGV == 4;
There are various other improvements which would help this script, some of which I go through in the "Secure Programming Techniques" chapter in Mastering Perl. Taking unchecked user input and passing it to another program is generally not a good idea.
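As a rough illustration of the points above, here is a sketch that checks the argument count up front and collects the input files with the core File::Find module instead of interpolating user input into a find command. It is not the ERVmap authors' code; parse_args, files_with_suffix and output_name are hypothetical helper names, and the meaning of the fourth argument is still something you would have to confirm from the package's source.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: insist on exactly four arguments and name them.
sub parse_args {
    my @args = @_;
    die "Usage: $0 DIR SUFFIX_IN SUFFIX_OUT STOP\n" unless @args == 4;
    my ($dir, $e1, $e2, $stop) = @args;
    die "Input and output suffixes must differ\n" if $e1 eq $e2;
    return ($dir, $e1, $e2, $stop);
}

# Hypothetical helper: list plain files under $dir whose names end in $suffix,
# without passing anything to a shell. \Q...\E makes the suffix literal.
sub files_with_suffix {
    my ($dir, $suffix) = @_;
    my @found;
    find(sub {
        push @found, $File::Find::name if -f $_ && $_ =~ /\Q$suffix\E\z/;
    }, $dir);
    my @sorted = sort @found;
    return @sorted;
}

# Hypothetical helper: swap the input suffix for the output suffix.
sub output_name {
    my ($file, $e1, $e2) = @_;
    (my $out = $file) =~ s/\Q$e1\E\z/$e2/;
    return $out;
}

# Only runs when arguments are supplied, e.g.: perl sketch.pl ./ c c2 __
if (@ARGV) {
    my ($dir, $e1, $e2, $stop) = parse_args(@ARGV);
    print "$_ -> ", output_name($_, $e1, $e2), "\n" for files_with_suffix($dir, $e1);
}
```

With this shape, a missing argument produces one clear usage message instead of a cascade of "uninitialized value" warnings.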
I am working on a script that has some variables which are passed into a string and then printed out. The initial string was only 6 lines, so I didn't need an external file for it, but I now have a new string which can fill over 1000 lines. The new string also has some fields that are to be replaced by variables declared in the script.
The text file has something like:
Hello $name
The code is supposed to have several parts to it.
Declaration of variable
my $name = 'Foo';
Open file and read it into a string.
my $content;
open(my $fh, '<', $filename) or die "cannot open file $filename";
{
local $/;
$content = <$fh>;
}
close($fh);
Print string
print $content
Expected outcome:
Hello Foo
I am wondering if it's possible to read "Hello $name" from a file but print it as "Hello Foo" since the variable name is declared as Foo.
So you want your file to be a template. Why not use a proper template language like this one?
use Template qw( );
my %vars = (
name => "Foo",
);
my $tt = Template->new();
$tt->process($qfn, \%vars)
or die($tt->error());
Template:
Hello [% name %]
The output can be captured instead of printed by using ->process's third arg.
Simplest way:
my $foo = 'Fred';
my $bar = 'Barney';
my $string = 'Say hello to $foo and $bar';
say eval qq{"$string"}
The correct answer to the question (as you've already seen) is to use a proper templating system instead.
But it's worth noting that this is answered in the Perl FAQ.
How can I expand variables in text strings?
If you can avoid it, don't, or if you can use a templating system, such as Text::Template or Template Toolkit, do that instead. You might even be able to get the job done with sprintf or printf:
my $string = sprintf 'Say hello to %s and %s', $foo, $bar;
However, for the one-off simple case where I don't want to pull out a full templating system, I'll use a string that has two Perl scalar variables in it. In this example, I want to expand $foo and $bar to their variable's values:
my $foo = 'Fred';
my $bar = 'Barney';
$string = 'Say hello to $foo and $bar';
One way I can do this involves the substitution operator and a double /e flag. The first /e evaluates $1 on the replacement side and turns it into $foo. The second /e starts with $foo and replaces it with its value. $foo, then, turns into 'Fred', and that's finally what's left in the string:
$string =~ s/(\$\w+)/$1/eeg; # 'Say hello to Fred and Barney'
The /e will also silently ignore violations of strict, replacing undefined variable names with the empty string. Since I'm using the /e flag (twice even!), I have all of the same security problems I have with eval in its string form. If there's something odd in $foo, perhaps something like @{[ system "rm -rf /" ]}, then I could get myself in trouble.
To get around the security problem, I could also pull the values from a hash instead of evaluating variable names. Using a single /e, I can check the hash to ensure the value exists, and if it doesn't, I can replace the missing value with a marker, in this case ??? to signal that I missed something:
my $string = 'This has $foo and $bar';
my %Replacements = (
foo => 'Fred',
);
# $string =~ s/\$(\w+)/$Replacements{$1}/g;
$string =~ s/\$(\w+)/
exists $Replacements{$1} ? $Replacements{$1} : '???'
/eg;
print $string;
If you're going to be using Perl, then it's really worth your while to spend an afternoon getting to know the FAQ.
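Applied to the original question, the hash-based approach from the FAQ could look roughly like the sketch below. It assumes the template file contains only simple $name style placeholders; fill_template and slurp are hypothetical helper names, and an in-memory filehandle stands in for the real file so the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: replace $word placeholders from a hash of values.
# Unknown placeholders become ??? so that gaps are easy to spot.
sub fill_template {
    my ($content, $values) = @_;
    $content =~ s/\$(\w+)/exists $values->{$1} ? $values->{$1} : '???'/ge;
    return $content;
}

# Hypothetical helper: read everything from an open filehandle at once.
sub slurp {
    my ($fh) = @_;
    local $/;    # undefine the record separator so <$fh> reads the whole file
    return scalar <$fh>;
}

# An in-memory file stands in for the real template file here.
my $template_text = "Hello \$name\n";
open my $template_fh, '<', \$template_text or die "cannot open template: $!";
print fill_template(slurp($template_fh), { name => 'Foo' });
close $template_fh;
# prints "Hello Foo"
```

Nothing from the file is ever evaluated as Perl code, so a 1000-line template is no riskier than a 6-line one.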
$test='abc="def"';
$replacement='$1="ghj"';
$test =~ s/(.+)="(.+)"/$replacement/;
print $test;
It prints:
$1="ghj"
How can I treat $replacement to be interpreted?
You add the /e modifier to your regex. You need to modify your replacement string too, so that it is evaluated correctly. Double evaluation is needed to interpolate the variable.
my $test='abc="def"';
my $replacement='"$1=ghj"';
$test =~ s/(.+)="(.+)"/$replacement/ee;
print $test;
Output:
abc=ghj
It should be noted that this is somewhat unsafe, especially if others can affect the value of your replacement. Then they can execute arbitrary code on your system.
There are approximately 3 answers to this question.
Your replacement "string" is actually code to be evaluated at match time to generate the replacement string. That is, it is better represented as a function:
my $test = 'abc="def"';
my $replacement = sub { $1 . '="ghj"' };
$test =~ s/(.+)="(.+)"/$replacement->()/e;
print $test;
If you don't need the full power of arbitrary Perl expressions (or if your replacement string comes from an external source), you can also treat it as a template to be filled in with the match results. There is a module that encapsulates this in the form of a JavaScript-like replace function, Data::Munge:
use Data::Munge qw(replace);
my $test = 'abc="def"';
my $replacement = '$1="ghj"';
$test = replace $test, qr/(.+)="(.+)"/, $replacement;
print $test;
Finally, you can represent Perl code as a string to be eval'd. This is not only inefficient but also fraught with quoting issues (you have to make sure everything in $replacement is syntactically valid Perl) and security holes (if $replacement is generated at runtime, especially if it comes from an external source). My least favorite approach:
my $test = 'abc="def"';
my $replacement = '$1 . "=\\"ghj\\""';
$test =~ s/(.+)="(.+)"/eval $replacement/e;
print $test;
(The s//eval $foo/e part can also be written as s//$foo/ee. I don't like to do that because eval is evil and shouldn't be more hidden than it already is.)
I am getting an error while reading a file. Below is the script.
#!/bin/bash
$file = "SampleLogFile.txt"; #--- line 2
open(MYINPUTFILE,$file); #--- line 3
while(<MYINPUTFILE>) {
# Good practice to store $_ value because
# subsequent operations may change it.
my($line) = $_;
# Good practice to always strip the trailing
# newline from the line.
chomp($line);
# Convert the line to upper case.
print "$line" if $line =~ /sent/;
}
close (MYINPUTFILE);
Output :
PerlTesting_New.ksh[2]: =: not found
PerlTesting_New.ksh[3]: syntax error at line 3 : `(' unexpected
Any idea what the issue is ?
Change
#!/bin/bash
to
#!/usr/bin/perl
Otherwise Perl will not be interpreting your script. Change the path as appropriate for your system.
Okay, whoever is teaching you to write Perl like this needs to move out of the nineties.
#!/usr/bin/perl
use strict; # ALWAYS
use warnings; # Also always.
# When you learn more you can selectively turn off bits of strict and warnings
# functionality on an as needed basis.
use IO::File; # A nice OO module for working with files.
my $file_name = "SampleLogFile.txt"; # note that we have to declare $file now.
my $input_fh = IO::File->new( $file_name, '<' ); # Open the file read-only using IO::File.
# You can avoid assignment through $_ by assigning to a variable, even when you use <$fh>
while( my $line = $input_fh->getline() ) {
# chomp($line); # Chomping is usually a good idea.
# In this case it does nothing but screw up
# your output, so I commented it out.
# This does nothing of the sort:
# Convert the line to upper case.
print "$line" if $line =~ /sent/;
}
You can also do this with a one liner:
perl -pe '$_ = "" unless /sent/;' SampleLogFile.txt
See perlrun for more info on one-liners.
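If you would rather not load IO::File, the same filter can be written with a lexical filehandle and three-argument open, both core Perl. This is only a sketch; matching_lines is a hypothetical helper name, and the script expects the log file names on the command line rather than hard-coding SampleLogFile.txt.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the lines read from $fh that match $pattern.
sub matching_lines {
    my ($fh, $pattern) = @_;
    my @hits;
    while (my $line = <$fh>) {
        push @hits, $line if $line =~ $pattern;
    }
    return @hits;
}

# Usage: perl filter.pl SampleLogFile.txt
for my $file_name (@ARGV) {
    open my $input_fh, '<', $file_name
        or die "Cannot open $file_name: $!";
    print matching_lines($input_fh, qr/sent/);
    close $input_fh;
}
```

Checking the return value of open means a wrong file name fails immediately with a readable message instead of silently printing nothing.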
hmm, your first line : #!/bin/bash
/bin/bash : This is the Bash shell.
You may need to change to
#!/usr/bin/perl
I'm running into a little trouble with Perl's built-in split function. I'm creating a script that edits the first line of a CSV file which uses a pipe for column delimitation. Below is the first line:
KEY|H1|H2|H3
However, when I run the script, here is the output I receive:
Col1|Col2|Col3|Col4|Col5|Col6|Col7|Col8|Col9|Col10|Col11|Col12|Col13|
I have a feeling that Perl doesn't like the fact that I use a variable to actually do the split, and in this case, the variable is a pipe. When I replace the variable with an actual pipe, it works perfectly as intended. How could I go about splitting the line properly when using pipe delimitation, even when passing in a variable? Also, as a silly caveat, I don't have permissions to install an external module from CPAN, so I have to stick with built-in functions and modules.
For context, here is the necessary part of my script:
our $opt_h;
our $opt_f;
our $opt_d;
# Get user input - filename and delimiter
getopts("f:d:h");
if (defined($opt_h)) {
&print_help;
exit 0;
}
if (!defined($opt_f)) {
$opt_f = &promptUser("Enter the Source file, for example /qa/data/testdata/prod.csv");
}
if (!defined($opt_d)) {
$opt_d = "\|";
}
my $delimiter = "\|";
my $temp_file = $opt_f;
my @temp_file = split(/\./, $temp_file);
$temp_file = $temp_file[0]."_add-headers.".$temp_file[1];
open(source_file, "<", $opt_f) or die "Err opening $opt_f: $!";
open(temp_file, ">", $temp_file) or die "Error opening $temp_file: $!";
my $source_header = <source_file>;
my @source_header_columns = split(/${delimiter}/, $source_header);
chomp(@source_header_columns);
for (my $i=1; $i<=scalar(@source_header_columns); $i++) {
print temp_file "Col$i";
print temp_file "$delimiter";
}
print temp_file "\n";
while (my $line = <source_file>) {
print temp_file "$line";
}
close(source_file);
close(temp_file);
The first argument to split is a compiled regular expression or a regular expression pattern. If you want to split on the literal text |, you'll need to pass a pattern that matches |.
quotemeta creates a pattern from a string that matches that string.
my $delimiter = '|';
my $delimiter_pat = quotemeta($delimiter);
split $delimiter_pat
Alternatively, quotemeta can be accessed as \Q..\E inside double-quoted strings and the like.
my $delimiter = '|';
split /\Q$delimiter\E/
The \E can even be omitted if it's at the end.
my $delimiter = '|';
split /\Q$delimiter/
I mentioned that split also accepts a compiled regular expression.
my $delimiter = '|';
my $delimiter_re = qr/\Q$delimiter/;
split $delimiter_re
If you don't mind hardcoding the regular expression, that's the same as
my $delimiter_re = qr/\|/;
split $delimiter_re
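To see why the unescaped delimiter produced thirteen columns in the question, it may help to compare both forms side by side. The sketch below uses the header line from the question; the variable names are only illustrative.

```perl
use strict;
use warnings;

my $header    = "KEY|H1|H2|H3\n";
my $delimiter = '|';

# Unescaped: /|/ is an alternation of two empty patterns, so it matches the
# empty string and split breaks the line into single characters
# (including the pipes and the trailing newline).
my @unescaped = split /$delimiter/, $header;

# Escaped with \Q...\E: the pipe is matched literally.
chomp(my $line = $header);
my @escaped = split /\Q$delimiter\E/, $line;

printf "%d fields unescaped, %d fields escaped\n",
    scalar @unescaped, scalar @escaped;
# prints "13 fields unescaped, 4 fields escaped"
```

The header line is 13 characters long including the newline, which is where Col1 through Col13 came from.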
First, the | isn't special inside doublequotes. Setting $delimiter to just "|" and then making sure it is quoted later would work or possibly setting $delimiter to "\\|" would be ok by itself.
Second, the | is special inside regex so you want to quote it there. The safest way to do that is ask perl to quote your code for you. Use the \Q...\E construct within the regex to mark out data you want quoted.
my @source_header_columns = split(/\Q${delimiter}\E/, $source_header);
see: http://perldoc.perl.org/perlre.html
It seems all you want to do is count the fields in the header and print the header. Might I suggest something a bit simpler than using split?
my $str="KEY|H1|H2|H3";
my $count=0;
$str =~ s/\w+/"Col" . ++$count/eg;
print "$str\n";
This works with almost any delimiter (except alphanumerics and underscore). It also saves the number of fields in $count, in case you need it later.
Here's another version. This one uses character class brackets instead, to specify "any character but this", which is just another way of defining a delimiter. You can specify the delimiter from the command line. You can use your getopts as well, but I just used a simple shift.
my $d = shift || '[^|]';
if ( $d !~ /^\[/ ) {
$d = '[^' . $d . ']';
}
my $str="KEY|H1|H2|H3";
my $count=0;
$str =~ s/$d+/"Col" . ++$count/eg;
print "$str\n";
By using the brackets, you do not need to worry about escaping most metacharacters such as | or ., although ], \, ^ and - are still special inside a character class.
#!/usr/bin/perl
use Data::Dumper;
use strict;
my $delimeter="\\|";
my $string="A|B|C|DD|E";
my @arr=split(/$delimeter/,$string);
print Dumper(@arr)."\n";
output:
$VAR1 = 'A';
$VAR2 = 'B';
$VAR3 = 'C';
$VAR4 = 'DD';
$VAR5 = 'E';
It seems you need to define the delimiter as \\|