Perl readline problem - perl

I'd like to read a file, e.g. test.test, which contains
#test:testdescription\n
#cmd:binary\n
#return:0\n
#stdin:|\n
echo"toto"\n
echo"tata"\n
#stdout:|\n
toto\n
tata\n
#stderr:\n
I succeeded in extracting the values that come after #test:, #cmd:, etc.
But for stdin or stdout, I want to take all the lines before the next # and store them in the arrays @stdin and @stdout.
I loop with while ($line = <TEST>), so it looks at each line. If I see the pattern /^#stdin:|/, I want to move to the next line and push its value onto an array until I see the next #.
How do I move to the next line in the while loop?

This file format can be easily handled with some creativity in selecting the appropriate value for $/:
use strict; use warnings;
my %parsed;
{
local $/ = '#';
while ( my $line = <DATA> ) {
chomp $line;
my $content = (split /:/, $line, 2)[1];
next unless defined $content;
$content =~ s/\n+\z//;
if ( my ($chan) = $line =~ /^(std(?:err|in|out))/ ) {
$content =~ s/^\|\n//;
$parsed{$chan} = [ split /\n/, $content];
}
elsif ( my ($var) = $line =~ /^(cmd|return|test)/ ) {
$parsed{ $var } = $content;
}
}
}
use YAML;
print Dump \%parsed;
__DATA__
#test:testdescription
#cmd:binary
#return:0
#stdin:|
echo"toto"
echo"tata"
#stdout:|
toto
tata
#stderr:
Output:
---
cmd: binary
return: 0
stderr: []
stdin:
- echo"toto"
- echo"tata"
stdout:
- toto
- tata
test: testdescription

UPDATED as per the user's comments
If I understand the question correctly, you want to read one more line within a loop?
If so, you can either:
just do another line read inside the loop.
my $another_line = <TEST>;
Keep some state flag and use it next iteration of the loop, and accumulate lines between stdins in a buffer:
my $last_line_was_stdin = 0;
my @line_buffer = ();
while (my $line = <TEST>) {
    if ($line =~ /^#stdin:\|/) {
        #
        # Some code to process all lines accumulated since the last "stdin"
        #
        @line_buffer = ();
        $last_line_was_stdin = 1;
        next;
    }
    push @line_buffer, $line;
}
This solution may not do 100% of what you need, but it defines the pattern to follow in your state machine implementation: read a line, check your current state (if it matters), and, based on the current state and the pattern in the line, decide what to do with the current line (add it to the buffer? change the state? if changing state, process the buffer based on the last state?).
Also, as per your comment, you have a bug in your regex - the pipe (| character) means "OR" in regex, so you are saying "if line starts with #stdin OR matches an empty regex" - the latter part is always true so your regex will match 100% of time. You need to escape the "|" via /^#stdin:\|/ or /^#stdin:[|]/
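The state-machine pattern and the escaped pipe can be combined in a small sketch. The sub name parse_sections and the in-memory filehandle are illustrative assumptions, not part of the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: collect the lines that follow "#stdin:|" or
# "#stdout:|" until the next line that starts with "#".
sub parse_sections {
    my ($fh) = @_;
    my %section;
    my $current;    # name of the block being collected, if any
    while ( my $line = <$fh> ) {
        chomp $line;
        if ( $line =~ /^#(stdin|stdout):\|/ ) {    # note the escaped pipe
            $current = $1;
            $section{$current} = [];
        }
        elsif ( $line =~ /^#/ ) {
            undef $current;                        # any other tag ends the block
        }
        elsif ( defined $current ) {
            push @{ $section{$current} }, $line;
        }
    }
    return \%section;
}

# Sample input held in a string so the sketch needs no external file.
my $text = "#cmd:binary\n#stdin:|\necho\"toto\"\necho\"tata\"\n"
         . "#stdout:|\ntoto\ntata\n#stderr:\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $parsed = parse_sections($fh);
close $fh;

print "stdin:  @{ $parsed->{stdin} }\n";
print "stdout: @{ $parsed->{stdout} }\n";
```

The only state kept between iterations is $current, so the loop never needs to read ahead with a second <$fh> call.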

Related

Extract and filter a range of lines from the input using Perl

I'm quite new to Perl and I have some problems in skipping lines using a foreach loop. I want to copy some lines of a text file to a new one.
When the first words of a line are FIRST ITERATION, skip two more lines and print everything following until the end of the file or an empty line is encountered.
I've tried to find out a similar post but nobody talks about working with text files.
This is the form I thought of
use 5.010;
use strict;
use warnings;
open( INPUT, "xxx.txt" ) or die("Could not open log file.");
open( OUT, ">>yyy.txt" );
foreach my $line (<INPUT>) {
if ( $line =~ m/^FIRST ITERATION/ ) {
# print OUT
}
}
close(OUT);
close(INFO);
I tried using next and $line++ but my program prints only the line that begins with FIRST ITERATION.
I may try to use a for loop but I don't know how many lines my file may have, nor do I know how many lines there are between "First Iteration" and the next empty line.
The simplest way is to process the file a line at a time and keep a state flag that is set to 1 if the current line begins with FIRST ITERATION and to 0 if it is blank; otherwise, if it is already positive, it is incremented so that it provides a count of the line number within the current block.
This solution expects the path to the input file as a parameter on the command line and prints its output to STDOUT, so you will need to redirect the output to a file on the command line as necessary.
Note that the regex pattern /\S/ checks whether there is a non-blank character anywhere in the current line, so not /\S/ is true if the line is empty or consists entirely of blank characters.
use strict;
use warnings;
my $lines = 0;
while ( <> ) {
if ( /^FIRST ITERATION/ ) {
$lines = 1;
}
elsif ( not /\S/ ) {
$lines = 0;
}
elsif ( $lines > 0 ) {
++$lines;
}
print if $lines > 3;
}
This can be simplified substantially by using Perl's built-in range operator, which keeps its own internal state and returns the number of times it has been evaluated. So the above may be written
use strict;
use warnings;
while ( <> ) {
my $s = /^FIRST ITERATION/ ... not /\S/;
print if $s and $s > 3;
}
And the last can be rewritten as a one-line command line program like this
$ perl -ne '$s = /^FIRST ITERATION/ ... not /\S/; print if $s and $s > 3' myfile.txt
Use additional counter, that will say on which condition print line. Something like this:
$skipCounter = 3;
And in foreach:
if ($skipCounter == 2) {
# print OUT
}
if ( $line =~ m/^FIRST ITERATION/) {
$skipCounter = 0;
}
$skipCounter++;
Advice: use STDIN and STDOUT instead of named files; this allows you to change the input and output without modifying the script.
Code:
#!/usr/bin/perl
use 5.010;
use strict;
use warnings;
open(INPUT, "xxx.txt" ) or die "Could not open log file: $!.";
open(OUT, ">yyy.txt") or die "Could not open output file: $!";
while( my $line = <INPUT> )
{
if ( $line =~ m/^FIRST ITERATION/) {
<INPUT>; # skip line
<INPUT>; # skip line
while( $line = <INPUT>) # print till empty line
{
last if $line eq "\n";
print OUT $line;
}
};
};
close (OUT);
close (INPUT);
You're on the right track. What you need to use is the flip-flop operator .. (which is basically the range operator). It will toggle for you between two matches, so you get everything in between. After that, it's a matter of keeping track of the lines you want to skip.
So basically we are checking for FIRST ITERATION and for an empty line, and grab everything in between those. $skip is used to remember how many lines were skipped. It starts at 0 and gets incremented for the first two lines after we start being in the flip-flop if block. In the else case, where we are after the flip-flop, it gets reset to 0 so we can start over with the next block.
Since you know how to open and write files, I'll skip that.
use strict;
use warnings;
my $skip = 0;
while (<DATA>) {
if (/^FIRST ITERATION$/ .. /^$/) {
next if $skip++ <= 2;
print $_;
} else {
$skip = 0;
}
}
__DATA__
FIRST ITERATION
skip1
skip2
foo
bar
baz

don't print this
The output of this is:
foo
bar
baz
To stick with your own code, here's a very verbose solution that uses a foreach and no flip-flop. It does the same thing, just with a lot more words.
my $skip = 0; # skip lines
my $match = 0; # keep track of if we're in between the borders
foreach my $line (<DATA>) {
if ( $line =~ m/^FIRST ITERATION/ ) {
$match = 1; # we are inside the match
next;
}
if ($line =~ m/^$/) {
$match = 0; # we are done matching
next;
}
if ($match) {
$skip++; # count skip-lines
if ($skip <= 2) {
next; # ... and skip the first two
}
print $line; # this is the content we want
}
}
Using paragraph mode (which returns blocks separated by blank lines rather than lines):
local $/ = ""; # Paragraph mode.
while (<>) {
s/\n\K\n+//; # Get rid of trailing empty lines.
print /^FIRST ITERATION\n[^\n]*\n[^\n]*\n(.*)/ms;
}
Using the flip-flop operator:
while (<>) {
if (my $line_num = /^FIRST ITERATION$/ .. /^$/) {
print if $line_num > 3 && $line_num !~ /E0/;
}
}
$line_num =~ /E0/ is true on the last line of the range, when the flip-flop is flopping (i.e. for the first empty line after FIRST ITERATION). The code checks $line_num !~ /E0/ to avoid printing that blank line.
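To see what the flip-flop returns on each line, here is a minimal sketch that records its value for a small, made-up list of lines (the sample data is an assumption chosen to mirror the question):

```perl
use strict;
use warnings;

# Sample lines: one before the block, the block itself, the blank
# line that ends it, and one line after.
my @lines = (
    "before\n", "FIRST ITERATION\n", "skip1\n", "skip2\n",
    "foo\n",    "\n",                "after\n",
);

my @seq;
for (@lines) {
    # In scalar context .. is the flip-flop: it returns the empty string
    # while false, and a sequence number while true.
    my $n = /^FIRST ITERATION$/ .. /^$/;
    push @seq, $n eq '' ? '(false)' : $n;
}
print join( ', ', @seq ), "\n";
```

The sequence number of the closing line carries an "E0" suffix, which is what the /E0/ test relies on; numerically it still compares as a plain number.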

Grep using perl

I'm trying to grep multiple patterns from a log file using Perl. For the first pattern I get the desired matches via the read-only capture variables ($1, $2, ...). But for the next pattern, the capture variable returns the previous value rather than the value matching the second pattern.
Here is the code:
$tmp = `grep "solo_video_channel_.*(0): queueing" $log`;
chomp($tmp);
$tmp =~ m/(.*):.*solo_video_channel_write(.*): queueing page (.*).*/;
$chnl = $2;
$page = $3;
$timestamp = $1;
$tmp1 = `grep "(0): DUMP GO" $log`;
chomp($tmp1);
$tmp1 =~ m/(.*): solo_video_channel_write(0): DUMP GO/;
$dmp = $1;
print "dump go time = $1\n";
The value of $tmp1 after the grep is as expected, but $1 still holds the previous value.
Any suggestions?
Always make sure that you verify that a regex matched before using a captured variable.
Additionally, there is no reason to shell out to grep. Use Perl's file processing instead:
use strict;
use warnings;
local @ARGV = $log;
while (<>) {
chomp;
if (/solo_video_channel_.*\(0\): queueing/) {
if ( my ( $timestamp, $chnl, $page ) = m/(.*):.*solo_video_channel_write(.*): queueing page (.*).*/ ) {
print "$. - $timestamp, $chnl, $page\n";
}
}
if ( my ($dmp) = m/(.*): solo_video_channel_write\(0\): DUMP GO/ ) {
print "dump go time = $dmp\n";
}
}
Note, your first set of if's could almost certainly be combined into a single if statement, but I left it as is for now.
Why not use Pure Perl? It's faster than running external greps. Plus, you can grep both regular expressions at once. Faster than looping through the file twice.
Always check the result of your regexp match. Here I'm using if statements to do this. Note too that I am printing all lines that don't match, prefixed with UNMATCHED LINES. You can remove the else when you see that everything is working, or simply redirect with 2> /dev/null.
use strict;
use warnings;
use autodie;
use feature qw(say);
my $log = "log.txt";
open my $log_fh, "<", $log;
while ( my $line = <$log_fh> ) {
my $timestamp;
my $channel;
my $page;
my $gotime;
if ( $line =~ /(.*):.*solo_video_channel_(.*):\s+queueing page (.*)/ ) {
$timestamp = $1;
$channel = $2;
$page = $3;
say qq(Timestamp = "$timestamp" Channel = "$channel" Page = "$page");
}
elsif ( $line =~ /(.*): solo_video_channel_write\(0\): DUMP GO/ ) {
$gotime = $1;
say "Dump Go Time = $1";
}
else {
say STDERR qq(UNMATCHED LINES: "$line");
}
}
close $log_fh;
In the second regexp you need to escape the literal brackets
$tmp1 =~ m/(.*): solo_video_channel_write\(0\): DUMP GO/
This is because the expression \(0\) matches the exact pattern (0)
In the example given in this answer this would include strings such as
37: solo_video_channel_write(0): DUMP GO
In contrast, the expression (0) matches the exact pattern 0 and sets a capture group.
With the regexp given in your original question
$tmp1 =~ m/(.*): solo_video_channel_write(0): DUMP GO/;
matching would occur on strings such as
37: solo_video_channel_write0: DUMP GO
Of course, in the original program the strings are not in this format, so they do not match and $1 is not set.
The regular expression syntax of the shell program grep is (confusingly) different.
There, round brackets must be escaped with a backslash to set a capture group, which is the opposite of the syntax in Perl.
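The difference between the two Perl patterns can be checked with a short sketch. The sample line is the one used in this answer; the variable names are illustrative:

```perl
use strict;
use warnings;

my $line = '37: solo_video_channel_write(0): DUMP GO';

# Unescaped parentheses form a capture group that matches a bare "0",
# so this pattern fails on a line containing a literal "(0)".
my $unescaped_matched =
    $line =~ /(.*): solo_video_channel_write(0): DUMP GO/ ? 1 : 0;

# Escaped parentheses match the literal "(0)". Assigning the match in
# list context leaves $dmp undefined when the match fails, which avoids
# reading a stale $1.
my ($dmp) = $line =~ /(.*): solo_video_channel_write\(0\): DUMP GO/;

print "unescaped pattern matched: $unescaped_matched\n";
print defined $dmp ? "dump go time = $dmp\n" : "no match\n";
```

Capturing into a lexical variable and testing it with defined is one way to follow the advice about never using $1 without checking the match first.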

find a match and replace next line in perl

I am working on a Perl script and need some help with it. The requirement is that I have to find a label and, once the label is found, replace a word in the line immediately following the label. For example, if the label is ABC:
ABC:
string to be replaced
some other lines
ABC:
string to be replaced
some other lines
ABC:
string to be replaced
I want to write a script to match the label (ABC) and once the label is found, replace a word in the next line immediately following the label.
Here is my attempt:
open(my $fh, "<", "file1.txt") or die "cannot open file:$!";
while (my $line = <$fh>))
{
next if ($line =~ /ABC/) {
$line =~ s/original_string/replaced_string/;
}
else {
$msg = "pattern not found \n ";
print "$msg";
}
}
Is this correct..? Any help will be greatly appreciated.
The following one-liner will do what you need:
perl -pe '++$x and next if /ABC:/; $x-- and s/old/new/ if $x' inFile > outFile
The code sets a flag and gets the next line if the label is found. If the flag is set, it's unset and the substitution is executed.
Hope this helps!
You're doing this in your loop:
next if ($line =~ /ABC/);
So, you're reading the file, and if a line contains ABC anywhere in that line, you skip the line. However, for every other line, you do the replacement. In the end, you're replacing the string on all other lines and printing that out, and you're not printing out your labels.
Here's what you said:
I have to read the file until I find a line with the label:
Once the label is found
I have to read the next line and replace the word in a line immediately following the label.
So:
You want to read through a file line-by-line.
If a line matches the label
read the next line
replace the text on the line
Print out the line
Following these directions:
use strict;
use warnings; # Hope you're using strict and warnings
use autodie; # Program automatically dies on failed opens. No need to check
use feature qw(say); # Allows you to use say instead of print
open my $fh, "<", "file1.txt"; # Removed parentheses. It's the latest style
while (my $line = <$fh>) {
chomp $line; # Always do a chomp after a read.
if ( $line eq "ABC:" ) { # Use 'eq' to ensure an exact match for your label
say "$line"; # Print out the current line
$line = <$fh>; # Read the next line
chomp $line; # Remove its newline, since say adds one
$line =~ s/old/new/; # Replace that word
}
say "$line"; # Print the line
}
close $fh; # Might as well do it right
Note that when I use say, I don't have to put the \n on the end of the line. Also, by doing my chomp after my read, I can easily match the label without worrying about the \n on the end.
This is done exactly as you said it should be done, but there are a couple of issues. The first is that when we do $line = <$fh>, there's no guarantee we are really reading a line. What if the file ends right there?
Also, it's bad practice to read a file in multiple places. It makes it harder to maintain the program. To get around this issue, we'll use a flag variable. This allows us to know if the line before was a tag or not:
use strict;
use warnings; # Hope you're using strict and warnings
use autodie; # Program automatically dies on failed opens. No need to check
use feature qw(say); # Allows you to use say instead of print
open my $fh, "<", "file1.txt"; # Removed parentheses. It's the latest style
my $tag_found = 0; # Flag isn't set
while (my $line = <$fh>) {
chomp $line; # Always do a chomp after a read.
if ( $line eq "ABC:" ) { # Use 'eq' to ensure an exact match for your label
$tag_found = 1; # We found the tag!
}
elsif ( $tag_found ) { # The previous line was the tag
$line =~ s/old/new/; # Replace that word
$tag_found = 0; # Reset our flag variable
}
say "$line"; # Print the line
}
close $fh; # Might as well do it right
Of course, I would prefer to eliminate mysterious values. For example, the tag should be a variable or constant. Same with the string you're searching for and the string you're replacing.
You mentioned this was a word, so your regular expression replacement should probably look like this:
$line =~ s/\b$old_word\b/$new_word/;
The \b marks a word boundary. This way, if you're supposed to replace the word cat with dog, you don't get tripped up on a line that says:
The Jeopardy category is "Say what".
You don't want to change category to dogegory.
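A minimal sketch of the word-boundary substitution, using the cat/dog example above. The \Q...\E quoting is an addition of this sketch; it keeps any regex metacharacters in $old_word from being interpreted:

```perl
use strict;
use warnings;

my ( $old_word, $new_word ) = ( 'cat', 'dog' );
my @lines = (
    'The cat sat down.',
    'The Jeopardy category is "Say what".',
);

my @changed;
for my $line (@lines) {
    # Copy the line, then substitute on the copy so @lines stays intact.
    ( my $copy = $line ) =~ s/\b\Q$old_word\E\b/$new_word/;
    push @changed, $copy;
}
print "$_\n" for @changed;
```

Only the standalone word is replaced; the line containing "category" comes through unchanged.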
Your problem is that reading in a file does not work like that. You're doing it line by line, so when your regex tests true, the line you want to change isn't there yet. You can try adding a boolean variable to check if the last line was a label.
#!/usr/bin/perl
use strict;
use warnings;
my $found;
my $replacement = "Hello";
while(my $line = <>){
if($line =~ /ABC/){
$found = 1;
next;
}
if($found){
$line =~ s/^.*?$/$replacement/;
$found = 0;
print $line, "\n";
}
}
Or you could use File::Slurp and read the whole file into one string:
use File::Slurp;
$x = read_file( "file.txt" );
$x =~ s/^(ABC:\s*$ [\n\r]{1,2}^.*?)to\sbe/$1to was/mgx;
print $x;
using /m to make the ^ and $ match embedded begin/end of lines
x is to allow the space after the $ - there is probably a better way
Yields:
ABC:
string to was replaced
some other lines
ABC:
string to was replaced
some other lines
ABC:
string to was replaced
Also, relying on perl's in-place editing:
use File::Slurp qw(read_file write_file);
use strict;
use warnings;
my $file = 'fakefile1.txt';
# Initialize Fake data
write_file($file, <DATA>);
# Enclosed is the actual code that you're looking for.
# Everything else is just for testing:
{
local @ARGV = $file;
local $^I = '.bac';
while (<>) {
print;
if (/ABC/ && !eof) {
$_ = <>;
s/.*/replaced string/;
print;
}
}
unlink "$file$^I";
}
# Compare new file.
print read_file($file);
1;
__DATA__
ABC:
string to be replaced
some other lines
ABC:
string to be replaced
some other lines
ABC:
string to be replaced
ABC:
outputs
ABC:
replaced string
some other lines
ABC:
replaced string
some other lines
ABC:
replaced string
ABC:

Perl --Change in all matches in previous lines according to match in current line

File I want to parse:
input Pattern;
input SDF;
input ABC
input Pattern;
output Pattern;
output XYZ;
In Perl, the usual operation is to scan line by line.
I want to check whether
the current line has output Pattern; and a previous line (or any of the previous lines) has input Pattern;
and, if so, change all the matching previous lines to "input Pattern 2;" and the current line to "output Pattern2;".
It is complicated; I hope I have explained it properly.
Is it possible in Perl to scan and change previous lines after they have been read?
Thanks
If this is your data:
my $sfile =
'input Pattern;
input SDF;
input ABC
input Pattern;
output Pattern;
output XYZ;' ;
then, the following snippet will read the whole file and change text accordingly:
open my $fh, '<', \$sfile or die $!;
local $/ = undef; # set file input mode to 'slurp'
my $content = <$fh>;
close $fh;
$content =~ s{ ( # open capture group
input \s+ (Pattern); # find occurrence of input pattern
.+? # skip some text
output \s+ \2 # find same for output
) # close capture group
}
{ # replace by evaluated expression
do{ # within a do block
local $_=$1; # get whole match to $_
s/($2)/$1 2/g; # substitute Pattern by Pattern 2
$_ # return substituted text
} # close do block
}esgx;
Then, you may close your file and check the string:
print $content;
=>
input Pattern 2;
input SDF;
input ABC
input Pattern 2;
output Pattern 2;
output XYZ;
You may even include a counter $n which will be incremented after each successful match (by a code assertion, (?{ ... })):
our $n = 1;
$content =~ s{ ( # open capture group
input \s+ (Pattern); # find occurrence of input pattern
.+? # skip some text
output \s+ \2 # find same for output
) # close capture group
(?{ $n++ }) # ! update match count
}
{ # replace by evaluated expression
do{ # within a do block
local $_=$1; # get whole match to $_
s/($2)/$1 $n/g; # substitute Pattern by Pattern and count
$_ # return substituted text
} # close do block
}esgx;
The substitution will now start with input Pattern 2; and increment subsequently.
I think this will do what you need, but try it on a 'scratch' file first (a copy of the original) since it actually changes the file:
use Modern::Perl;
open my $fh_in, '<', 'parseThis.txt' or die $!;
my @fileLines = <$fh_in>;
close $fh_in;
for ( my $i = 1 ; $i < scalar @fileLines ; $i++ ) {
next
if $fileLines[$i] !~ /output Pattern;/
or $fileLines[ $i - 1 ] !~ /input Pattern;/;
$fileLines[$i] =~ s/output Pattern;/output Pattern2;/g;
$fileLines[$_] =~ s/input Pattern;/input Pattern 2;/g for 0 .. $i - 1;
}
open my $fh_out, '>', 'parseThis.txt' or die $!;
print $fh_out @fileLines;
close $fh_out;
Results:
input Pattern 2;
input SDF;
input ABC;
input Pattern 2;
output Pattern2;
output XYZ;
Hope this helps!
#!/usr/bin/env perl
$in1 = 'input Pattern';
$in2 = 'input Pattern2';
$out1 = 'output Pattern';
$out2 = 'output Pattern2';
undef $/;
$_ = <DATA>;
if (/^$in1\b.*?^$out1\b/gms) {
s/(^$in1\b)(?=.*?^$out1\b)/$in2/gms;
s/^$out1\b/$out2/gms;
}
print;
__DATA__
input Pattern;
input SDF;
input ABC;
input Pattern;
output Pattern;
output XYZ;
Will there be additional "Input Pattern1" lines following an occurrence of "Output Pattern1"?
Are there going to be multiple patterns to search for, or will it just be "if we find Output Pattern1, then perform the replacement"?
Will the "Output Pattern" occur multiple times, or just once?
I would perform this task in two (or multiple) passes:
Pass1 - read the file, looking for the matching output lines, store the line number in memory.
Pass 2 - read the file, and based on the line numbers in the set of matches, perform the replacement on the appropriate Input lines.
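So, as a semi-Perlish, untested sketch:
use strict;
use warnings;

my $inputfile = shift or die "usage: $0 input_file\n";

# Pass 1: remember the line number of every matching output line.
my @matches;
open my $fh, '<', $inputfile or die "Cannot open $inputfile: $!";
while (<$fh>) {
    push @matches, $. if /Output Pattern1/;
}
close $fh;

# Pass 2: rewrite the input lines that precede the last match.
my $last_match = @matches ? $matches[-1] : 0;
open $fh, '<', $inputfile or die "Cannot open $inputfile: $!";
while (<$fh>) {
    s/Input Pattern1/Input Pattern2/ if $. <= $last_match;
    print;
}
close $fh;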
You run this like:
$ replace_pattern.pl input_file > output_file
You'll need to adjust it a little to meet your exact needs, but that should get you close.
You cannot go back and change lines once they have been read. What you can do is open the file in read mode, find out which line has the pattern (say the 5th line) while reading the entire file into an array, close it, open it again in write mode, modify the contents of the array up to the 5th line, dump that array into the file, and close it. Something like this (assuming each file will have at most one output pattern):
my @arr;
my @files = ();
my ( @lines, $curindex );
while (<>) {
    if ( $. == 1 ) {    # first line of a new file
        $curindex = undef;
        @lines = ();
        push @files, $ARGV;
    }
    push @lines, $_;
    if (/output pattern/) { $curindex = $. }
    if (eof) {
        push @arr, [ [@lines], $curindex ];    # store a copy of the lines
        close ARGV;                            # resets $. for the next file
    }
}
my $currentfilenumber = 0;
for my $file (@files) {
    open my $out, '>', $file or die "Cannot write $file: $!";
    my @currentfiledetails  = @{ $arr[ $currentfilenumber++ ] };
    my @currentcontents     = @{ $currentfiledetails[0] };
    my $currentoutputmarker = $currentfiledetails[1];
    if ($currentoutputmarker) {
        for ( 0 .. $currentoutputmarker - 2 ) {
            $currentcontents[$_] =~ s/input pattern/input pattern2/g;
        }
        $currentcontents[ $currentoutputmarker - 1 ] =~
            s/output pattern/output pattern2/g;
    }
    print $out $_ for @currentcontents;
    close $out;
}

Perl and Environment Variables

Some of the environment variables which we use in Unix are as below (just an example):
VAR1=variable1
VAR2=variable2
VAR3=variable3
# and so on
Now, I have a Perl script (let's call it test.pl) which reads a tab-delimited text file (let's call it test.txt) and pushes its contents column by column into separate arrays. The first column of test.txt contains, for example, the following information (the strings in the first column are delimited by /, but I do not know how many / a string will contain or at what position an environment variable will appear):
$VAR1/$VAR2/$VAR3
$VAR3/some_string/SOME_OTHER_STRING/and_so_on/$VAR2
$VAR2/$VAR1/some_string/some_string_2/some_string_3/some_string_n/$VAR2
The extract of the script is as below:
use strict;
my $input0 = shift or die "must provide test.txt as the argument 0\n";
open(IN0,"<",$input0) || die "Cannot open $input0 for reading: $!";
my @first_column;
while (<IN0>)
{
chomp;
my @cols = split(/\t/);
my $first_col = `eval $cols[0]`; #### but this does not work
# here goes the push stmt to populate the array
### more code here
}
close(IN0);
Question: How can I access environment variables in such a situation so that the array is populated as below:
$first_column[0] = variable1/variable2/variable3
$first_column[1] = variable3/some_string/SOME_OTHER_STRING/and_so_on/variable2
$first_column[2] = variable2/variable1/some_string/some_string_2/some_string_3/some_string_n/variable2
I think you are looking for a way to process configuration files. I like Config::Std for that purpose although there are many others on CPAN.
Here is a way of processing just the contents of $cols[0] to show in an explicit way what you need to do with it:
#!/usr/bin/perl
use strict; use warnings;
# You should not type this. I am assuming the
# environment variables are defined in the environment.
# They are here for testing.
@ENV{qw(VAR1 VAR2 VAR3)} = qw(variable1 variable2 variable3);
while ( my $line = <DATA> ) {
last unless $line =~ /\S/;
chomp $line;
my @components = split qr{/}, $line;
for my $c ( @components ) {
if ( my ($var) = $c =~ m{^\$(\w+)\z} ) {
if ( exists $ENV{$var} ) {
$c = $ENV{$var};
}
}
}
print join('/', @components), "\n";
}
__DATA__
$VAR1/$VAR2/$VAR3
$VAR3/some_string/SOME_OTHER_STRING/and_so_on/$VAR2
$VAR2/$VAR1/some_string/some_string_2/some_string_3/some_string_n/$VAR2
Instead of the split/join, you can use s/// to replace patterns that look like variables with the corresponding values in %ENV. For illustration, I put a second column in the __DATA__ section which is supposed to stand for a description of the path, and turned each line in to a hashref. Note, I factored out the actual substitution to eval_path so you can try alternatives without messing with the main loop:
#!/usr/bin/perl
use strict; use warnings;
# You should not type this. I am assuming the
# environment variables are defined in the environment.
# They are here for testing.
@ENV{qw(VAR1 VAR2 VAR3)} = qw(variable1 variable2 variable3);
my @config;
while ( my $config = <DATA> ) {
last unless $config =~ /\S/;
chomp $config;
my @cols = split /\t/, $config;
$cols[0] = eval_path( $cols[0] );
push @config, { $cols[1] => $cols[0] };
}
use YAML;
print Dump \@config;
sub eval_path {
my ($path) = @_;
$path =~ s{\$(\w+)}{ exists $ENV{$1} ? $ENV{$1} : $1 }ge;
return $path;
}
__DATA__
$VAR1/$VAR2/$VAR3 Home sweet home
$VAR3/some_string/SOME_OTHER_STRING/and_so_on/$VAR2 Man oh man
$VAR2/$VAR1/some_string/some_string_2/some_string_3/some_string_n/$VAR2 Can't think of any other witty remarks ;-)
Output:
---
- Home sweet home: variable1/variable2/variable3
- Man oh man: variable3/some_string/SOME_OTHER_STRING/and_so_on/variable2
- Can't think of any other witty remarks ;-): variable2/variable1/some_string/some_string_2/some_string_3/some_string_n/variable2
I think you just want to do this:
my @cols = map { s/(\$(\w+))/ $ENV{$2} || $1 /ge; $_ } split /\t/;
What this does is, after the split, take each sequence of '$' followed by word characters and check whether there is an environment variable for the word portion; if there is none, the text is left as is.
The e switch on a substitution allows you to execute code for the replacement value.
If you expect a '0' for any environment variable value, you are better off using the defined-or operator, which came in with Perl 5.10.
my @cols = map { s|(\$(\w+))| $ENV{$2} // $1 |ge; $_ } split /\t/;
(Ignore the markup. // is a defined-or, not a C-comment)
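The effect of the e switch together with defined-or can be seen in a small sketch. The helper name expand_vars and the hash passed in place of %ENV are assumptions made so that the example does not depend on the real environment:

```perl
use strict;
use warnings;

# Hypothetical helper: replace each $NAME with its value from the given
# hash, leaving names that have no defined value untouched.
sub expand_vars {
    my ( $text, $env ) = @_;
    # /e evaluates the replacement as code; // (Perl 5.10+) falls back to
    # the original text only when the value is undefined, so a value of
    # '0' is still substituted.
    $text =~ s{(\$(\w+))}{ $env->{$2} // $1 }ge;
    return $text;
}

my %env = ( VAR1 => 'variable1', ZERO => '0' );
my $expanded = expand_vars( '$VAR1/$ZERO/$UNSET', \%env );
print "$expanded\n";
```

With || in place of //, the $ZERO component would be left unexpanded because '0' is false in Perl.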
If you want to allow for full shell expansions, one option is to use the shell to do the expansion for you, perhaps via echo:
$ cat input
$FOO
bar
${FOO//cat/dog}
$ FOO=cat perl -wpe '$_ = qx"echo $_"' input
cat
bar
dog
If you cannot trust the contents of the environment variable, this introduces a security risk, as invoking qx on a string may cause the shell to invoke commands embedded in the string. As a result, this scriptlet will not run under taint mode (-T).
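Perl keeps its environment variables in the %ENV hash, so a single variable can be looked up by name like so (this works only when $cols[0] holds a bare name such as VAR1, without the leading $):
my $first_col = $ENV{ $cols[0] };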