ExtUtils::MakeMaker PERL_MM_OPT split on whitespace work-around

ExtUtils::MakeMaker splits PERL_MM_OPT on whitespace, which means that something like the following will not work.
export PERL_MM_OPT='LIBS="-L/usr/sfw/lib -lssl -lcrypto" INC=/usr/sfw/include'
Is there a known workaround for this, or will I have to avoid using PERL_MM_OPT in this scenario?
-- update --
mobrule came up with the excellent suggestion to use tabs instead of spaces. mobrule is right about it splitting on spaces only. However, the solution doesn't work because it looks like tabs are converted to spaces in environment variables.
> cat tmp.sh
export PERL_MM_OPT='LIBS="-L/usr/sfw/lib -lssl -lcrypto" INC=-I/usr/sfw/include'
echo $PERL_MM_OPT | perl -pe 's/\t/[t]/g' | perl -pe 's/ /[s]/g'
> head -1 tmp.sh | perl -pe 's/\t/[t]/g' | perl -pe 's/ /[s]/g'
export[s]PERL_MM_OPT='LIBS="-L/usr/sfw/lib[t]-lssl[t]-lcrypto"[s]INC=-I/usr/sfw/include'
> bash tmp.sh
LIBS="-L/usr/sfw/lib[s]-lssl[s]-lcrypto"[s]INC=-I/usr/sfw/include
-- update 2 --
So the tab suggestion did work (I was misled by the behavior of echo and came to the wrong conclusion about why it failed), but it doesn't solve the problem.
Now the problem is that ExtUtils/Liblist/Kid.pm isn't expecting a leading double quote (the same thing happens with a single quote).
Unrecognized argument in LIBS ignored: '"-L/usr/sfw/lib
So, it seems that the solution to this problem (if one exists) can't depend on quotes.

Actually, MakeMaker.pm splits on spaces but not on all whitespace. Could you use tabs?
export PERL_MM_OPT='LIBS="-L/usr/sfw/libTab-lsslTab-lcrypto" INC=/usr/sfw/include'
I think the way you have set the environment variable with tabs is OK. It is the echo command (more precisely, the shell's word splitting of the unquoted variable passed to echo) that is converting tabs to spaces:
$ VAR='abc^Idef'
$ echo $VAR | od -c
0000000 a b c d e f \n
0000010
That looks like it didn't work. But wait:
$ export VAR
$ perl -e 'print $ENV{VAR}' | od -c
0000000 a b c \t d e f
0000007
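The difference between the two od outputs above comes from quoting, not from the variable itself. A minimal sketch, assuming a POSIX-style shell with the default IFS (the names unquoted and quoted are only for illustration):

```shell
# A value containing a literal tab between "abc" and "def".
VAR=$(printf 'abc\tdef')

# Unquoted: the shell splits the value at the tab and echo rejoins
# the two words with a single space.
unquoted=$(echo $VAR)

# Quoted: the value reaches echo unchanged, tab included.
quoted=$(echo "$VAR")

# Dump both so the separator character is visible.
printf '%s\n' "$unquoted" | od -c
printf '%s\n' "$quoted" | od -c
```

Quoting the expansion in the original test (echo "$PERL_MM_OPT") would have shown the tabs directly.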
This still might or might not work in ExtUtils::MakeMaker depending on how the parameters in $ENV{PERL_MM_OPT} get passed to a subprocess (via system, exec, open |, etc.):
system("gcc helloworld.c -lssl\t-lcrypto\t-L/usr/sfw/lib") ### 1 ###
system("gcc", "helloworld.c", "-lssl\t-lcrypto\t-L/usr/sfw/lib") ### 2 ###
System call 1 will work: with a single argument, Perl either passes the command to the shell (if it contains shell metacharacters) or splits it into words itself, and in both cases the tabs are treated as argument separators.
System call 2 fails because multi-argument system always bypasses the shell, and gcc is stuck looking for a library with the unlikely name of "libssl^I-lcrypto^I-L/usr/sfw/lib.a". If ExtUtils::MakeMaker is using this calling style to run the compiler, then this workaround won't get the job done.
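The same distinction can be reproduced in the shell alone, as a rough analogy rather than a trace of what MakeMaker does. In this sketch, count_args is a hypothetical helper that reports how many arguments it received:

```shell
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

# Three compiler flags joined by literal tabs.
flags=$(printf '%s\t%s\t%s' -lssl -lcrypto -L/usr/sfw/lib)

# Quoted: passed on as one argument, like the multi-argument system() call.
as_one_word=$(count_args "$flags")

# Unquoted: the shell splits at the tabs, like the single-string system() call.
after_splitting=$(count_args $flags)

echo "one word: $as_one_word, after splitting: $after_splitting"
```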

Awk inside of qsub

I have a bash script containing several qsub calls. Each of them waits for a previous qsub to finish before starting.
My first qsub sends the files in a certain directory to a Perl program and has the output files written to a new directory. At the end, I echo the array with all my job names. This part works as intended.
mkdir -p /perl_files_dir
for ID_FILES in `ls Infiles_dir/*.txt`;
do
JOB_ID=`echo "perl perl_scirpt.pl $ID_FILES" | qsub -j oe `
JOB_ID_ARRAY="${JOB_ID_ARRAY}:$JOB_ID"
done
echo $JOB_ID_ARRAY
My second qsub is meant to sort all my previous files made with my perl script in a new outfile and to start after all these jobs are done (about 100 jobs) with depend=afterany. Again, this part is working fine.
SORT_JOB=`echo "sort -m -n perl_files_dir/*.txt >>sorted_file.txt" | qsub -j oe -W depend=afterany$JOB_ID_ARRAY`
SORT_ARRAY="${SORT_ARRAY}:$SORT_JOB"
My issue is that in my sorted file, I have a few columns I wish to remove (2 to 6), so I came up with this last line using awk piped to sed with another depend=afterany
SED=`echo "awk '{\$2="";\$3="";\$4="";\$5="";\$6=""; print \$0}' sorted_file.txt \
| sed 's/ //g' >final_file.txt" | qsub -j oe -W depend=afterany$SORT_ARRAY`
This last step creates final_file.txt, but leaves it empty. I added SED= before my echo because it would otherwise give me Command not found.
I tried without the pipe so it would just print everything. Unfortunately it prints nothing.
I assume it is not opening my sorted file and this is why my final file is empty after my sed. If it's the case, then why won't awk read it?
In my script, I use variables to define my directories and files (with the correct paths). I know my issue is not about finding my files or directories, since they are correctly defined at the beginning and used throughout the script. I tried writing the whole path instead of a variable and got the same results.

for ID_FILES in `ls Infiles_dir/*.txt`
Simplify this to
for ID_FILES in Infiles_dir/*.txt
ls lists the files you pass it (except when you pass it directories, then it lists their content). Rather than telling it to display a list of files and parse the output, use the list of files you already have! This is more reliable (parsing the output of ls will fail if the file names contain whitespace or wildcard characters), clearer and faster. Don't parse the output of ls.
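To see the failure mode concretely, here is a small sketch that counts loop iterations both ways. The temporary directory and file names are made up for the demonstration:

```shell
# Scratch directory with two files, one of which has a space in its name.
indir=$(mktemp -d)
touch "$indir/sample one.txt" "$indir/sample_two.txt"

# Looping over the glob: one iteration per file.
glob_count=0
for f in "$indir"/*.txt; do
  [ -e "$f" ] || continue   # skip the literal pattern if nothing matched
  glob_count=$((glob_count + 1))
done

# Looping over the output of ls: the name with a space is split in two.
ls_count=0
for f in $(ls "$indir"/*.txt); do
  ls_count=$((ls_count + 1))
done

echo "glob: $glob_count iterations, ls: $ls_count iterations"
rm -rf "$indir"
```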
SORT_JOB=`echo "sort -m -n perl_files_dir/*.txt >>sorted_file.txt" | qsub -j oe -W depend=afterany$JOB_ID_ARRAY`
You'd make your life simpler if you used the right form of quoting in the right place. Don't use backquotes, because it's difficult to know how to quote things inside. Use $(…) instead, it's exactly equivalent except that it is parsed in a sane way.
I recommend using a here document for the shell snippet that you're feeding to qsub. You have fewer quoting issues to worry about, and it's more readable.
While we're at it, always put double quotes around variable substitutions and command substitutions: "$some_variable", "$(some_command)". Annoyingly, $var in shell syntax doesn't mean “take the value of the variable var”, it means “take the value of the variable var, split it into whitespace-separated words, treat each word as a wildcard pattern, and replace each pattern by the list of matching files if there are matching files”. This extra stuff is turned off if the substitution happens inside double quotes (or in a here document, by the way): "$var" means “take the value of the variable var”.
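A short sketch of that difference, assuming a scratch directory whose path contains no whitespace (count_words and the file names are hypothetical):

```shell
# Hypothetical helper: print the number of arguments received.
count_words() { echo "$#"; }

# Scratch directory with two files for the pattern to match.
workdir=$(mktemp -d)
touch "$workdir/one.txt" "$workdir/two.txt"

# The variable holds a pattern, not a list of files.
pattern="$workdir/*.txt"

# Unquoted: the pattern is expanded to the matching file names.
unquoted_count=$(count_words $pattern)

# Quoted: the value is passed through as a single, unexpanded word.
quoted_count=$(count_words "$pattern")

echo "unquoted: $unquoted_count, quoted: $quoted_count"
rm -rf "$workdir"
```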
SORT_JOB=$(qsub -j oe -W depend="afterany$JOB_ID_ARRAY" <<'EOF'
sort -m -n perl_files_dir/*.txt >>sorted_file.txt
EOF
)
We now get to the snippet where the quoting was actually causing a problem.
SED=`echo "awk '{\$2="";\$3="";\$4="";\$5="";\$6=""; print \$0}' sorted_file.txt \
| sed 's/ //g' >final_file.txt" | qsub -j oe -W depend=afterany$SORT_ARRAY`
The string that becomes the argument to the echo command is:
awk '{$2=;$3=;$4=;$5=;$6=; print $0}' sorted_file.txt | sed 's/ //g' >final_file.txt
This is syntactically incorrect, and that's why you're not getting any output.
You didn't escape the double quotes inside what was meant to be the awk snippet. It's a lot clearer if you use a here document. Also, you don't need the SED= part. You added it because you had a command substitution (a command between backquotes), which substitutes the output of a command. But since you aren't interested in the output of the qsub command, don't capture its output; just execute it.
qsub -j oe -W depend="afterany$SORT_ARRAY" <<'EOF'
awk '{$2="";$3="";$4="";$5="";$6=""; print $0}' sorted_file.txt |
sed 's/ //g' >final_file.txt
EOF
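To check what a job would actually receive without submitting anything, cat can stand in for qsub. The sketch below does that, with a shortened awk program for the double-quoted case, and compares the two forms:

```shell
# Double-quoted echo: the inner "" pairs close and reopen the string,
# so the empty-string literals disappear from the awk program.
mangled=$(echo "awk '{\$2="";\$3=""; print \$0}'")

# Quoted here document: the text is passed through verbatim.
# cat stands in for qsub here, only to capture the job text.
job_script=$(cat <<'EOF'
awk '{$2="";$3="";$4="";$5="";$6=""; print $0}' sorted_file.txt |
sed 's/ //g' >final_file.txt
EOF
)

printf 'mangled:    %s\n' "$mangled"
printf 'job script:\n%s\n' "$job_script"
```

Because the delimiter is written as 'EOF', the shell performs no expansion inside the here document, so the awk program needs no backslashes at all.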
I'm not familiar with qsub, but presumably there's a way to get the error output and the return status of the commands it runs. Inspect that error output; it should contain the errors from awk.

The version of awk that I am using does not like the backslash escapes:
awk --version
GNU Awk 3.1.7
spuder@cent64$ awk '{\$2="";\$3="";\$4=""; print \$0}' foo.txt
awk: {\$2="";\$3="";\$4=""; print \$0}
awk: ^ backslash not last character on line
Try the following syntax
awk '{for(i=2;i<=6;i++) $i="";print}' foo.txt
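For a quick check of the loop form on a single line, here is a sketch assuming the columns to blank are 2 to 6, as in the question. The sample line is invented, and the tr step only squeezes the leftover separators for display:

```shell
# Invented sample line with eight whitespace-separated columns.
line='c1 c2 c3 c4 c5 c6 c7 c8'

# Blank columns 2 through 6; awk rebuilds the line with its output separator.
blanked=$(printf '%s\n' "$line" | awk '{for (i = 2; i <= 6; i++) $i = ""; print}')

# The blanked fields leave their separators behind; squeeze runs of spaces.
compact=$(printf '%s\n' "$blanked" | tr -s ' ')

printf '%s\n' "$compact"
```

Squeezing with tr -s keeps one space between the surviving columns, whereas sed 's/ //g' would remove every space and fuse them together.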
As a side note, if you are using Torque 4.x you may not be able to use a comma-separated list of jobs with -W depend=; instead, you may need to add a separate PBS directive (-W) for each job.
For example:
#Invalid syntax in newer versions of torque
qsub -W depend=foo,bar
Resources
backslash in gawk fields
Print all but the first three columns
http://docs.adaptivecomputing.com/torque/help.htm#topics/commands/qsub.htm#-W

Clarification of 'sed' usage

I just blindly followed a command from a tutorial to rename several folders at a time. Can anyone explain the meaning of "p;s" given as the argument to sed's -e option?
[root@LinuxD delsure]# ls
ar1 ar2 ar3 ar4 ar5 ar6 ar7
[root@LinuxD delsure]# find . -type d -name "ar*"|sed -e "p;s/ar/AR/g"|xargs -n2 mv
[root@LinuxD delsure]# ls
AR1 AR2 AR3 AR4 AR5 AR6 AR7

A sed script (the bit following the -e option) can contain multiple commands, separated by semicolons.
The script in your example uses the p command to print the pattern space (i.e. the line just read from the input), followed by the s command to perform a substitution on the pattern space.
By default (unless the pattern space is cleared or the -n option is given to sed), the current pattern space is printed again after each line has been processed, so the result of the substitution will be printed too.
Another way to write the same thing would be:
sed -e "p" -e "s/ar/AR/g"
This separates the commands into two scripts. Another way would be:
sed "p;s/ar/AR/g"
because if the only argument to sed is a script, then the -e option is not needed.

The argument to the -e option is a script consisting of two commands. The first is p, which prints the unadulterated input, the second is a standard, global substitution. So for input ar1, this should output
ar1
AR1
The other part of this trick is the -n2 option on xargs, which forces it to only use two arguments at a time (instead of as many as it can handle, which would produce very different results).
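A dry run makes the pairing visible. In this sketch, printf supplies two sample names instead of find, and echo is placed in front of mv so that nothing is renamed:

```shell
# sed emits each name twice (original, then substituted);
# xargs -n2 hands those two lines to one command at a time.
pairs=$(printf '%s\n' ar1 ar2 | sed -e 'p;s/ar/AR/g' | xargs -n2 echo mv)

# Each output line is the mv command that would have been run.
printf '%s\n' "$pairs"
```

Without -n2, xargs would typically pass all four names to a single mv, which would then treat the last one as the destination.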

One way in bash:
$ ls
ar6 ar7
$ find . -name 'ar*' | while IFS= read -r file; do echo mv "$file" "${file^^}"; done
mv ./ar6 ./AR6
mv ./ar7 ./AR7
Get rid of the "echo" when you're happy with the output.

How do I do unbuffered substitution in a perl oneliner?

I've got a bash script that wraps mvn (Apache Maven) to add colour to its output. A cut-down version of what it does is:
mvn "$@" | sed -e "s/^\[INFO\] \-.*/$bldblu&$rst/g"
where $bldblu is the ANSI color escape characters for bold blue, and $rst resets the colours.
The issue I'm having is that sometimes mvn writes a line that doesn't end in a newline, thus (as far as I can tell) sed keeps waiting for input and never prints the prompt (which makes it seem like Maven is hanging). I've tried adding -u to sed but that just forces sed to do line-by-line buffering instead of buffering more than one line - not helpful for me.
So far this is what I've come up with:
mvn "$@" | perl -pe "$| = 1; s/^(\[INFO\] \-.*)/$bldblu\$1$rst/g"
but I think the use of -p is not correct here. Any help?

A substitution may be overkill, especially when the replacement pattern has special characters in it. How about this?
export bldblu
export rst
mvn "$@" | perl -pe 'if(/^.INFO. -/){ $_=$ENV{bldblu}.$_.$ENV{rst} }'
or, rather than reinventing the wheel:
mvn "$@" | perl -MTerm::ANSIColor -pe '$_=color("bold blue").$_.color("reset") if /^.INFO. -/'

(workaround) Use sed --unbuffered
I couldn't figure out the solution but thankfully this is good enough for my particular usage:
cat - | sed --unbuffered 's/.*?from//g'
But I too would like to know the answer. Perl one line substitution is a key idiom in my toolbelt.
BSD
Looks like there is no common flag for GNU and BSD. For the latter, you'd need:
-l Make output line buffered.

How do I run a Perl one liner from a makefile?

I know the perl one liner below is very simple, works and does a global substitution, A for a; but how do I run it in a makefile?
perl -pi -e "s/a/A/g" filename
I have tried (I now think the rest of the post is junk as the shell command does a command line expansion - NOT WHAT I WANT!) The question above still stands!
APP = $(shell perl -pi -e "s/a/A/g" filename)
with and without the following line
EXE = $(APP)
and I always get the following error
make: APP: Command not found
which I assume comes from the line that starts APP
Thanks

If you want to run perl as part of a target's action, you might use
$ cat Makefile
all:
	echo abc | perl -pe 's/a/A/g'
$ make
echo abc | perl -pe 's/a/A/g'
Abc
(Note that there's a TAB character before echo.)
Perl's -i option is for editing files in-place, but that will confuse make (unless perhaps you're writing a phony target). A more typical pattern is to make targets from sources. For example:
$ cat Makefile
all: bAr
bAr: bar.in
	perl -pe 's/a/A/g' bar.in > bAr
$ cat bar.in
bar
$ make
perl -pe 's/a/A/g' bar.in > bAr
$ cat bAr
bAr
If you let us know what you're trying to do, we'll be able to give you better, more helpful answers.

You should show the smallest possible Makefile which demonstrates your problem, and show how you are calling it. Assuming your Makefile looks something like this, I get the error message. Note that there is a tab character preceding the APP in the all: target.
APP = $(shell date)
all:
	APP
Perhaps you meant to do this instead:
APP = $(shell date)
all:
	$(APP)
I did not use your perl command because it does not run for me as-is.
Do you really mean to use Perl's substitution operator? perl -pi -e "s/a/A/g"
Here is a link to GNU make documentation.

Using Sed to expand environment variables inside files

I'd like to use Sed to expand variables inside a file.
Suppose I exported a variable VARIABLE=something, and have a "test" file with the following:
I'd like to expand this: "${VARIABLE}"
I've been trying commands like the following, but to no avail:
cat test | sed -e "s/\(\${[A-Z]*}\)/`eval "echo '\1'"`/" > outputfile
The result is the "outputfile" with the variable still not expanded:
I'd like to expand this: "${VARIABLE}"
Still, running eval "echo '${VARIABLE}'" in a bash console results in the value "something" being echoed. Also, I tested the pattern and it is truly being matched.
The desired output would be
I'd like to expand this: "something"
Can anyone shed a light on this?

Consider your trial version:
cat test | sed -e "s/\(\${[A-Z]*}\)/`eval "echo '\1'"`/" > outputfile
The reason this doesn't work is because it requires prescience on the part of the shell. The sed script is generated before any pattern is matched by sed, so the shell cannot do that job for you.
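One way to see this is to build the script string first and print it before sed ever runs. The sketch below is a simplified stand-in for the original command: it uses $(…) and printf instead of backquotes and echo to keep the quoting predictable, and the variable names are illustrative:

```shell
# The inner command runs immediately, with no match available yet,
# so all it can produce is the two literal characters \1.
replacement=$(eval "printf '%s' '\1'")

# This is the complete script that sed will be given.
script="s/\(\${[A-Z]*}\)/$replacement/"
printf 'sed script: %s\n' "$script"

# The script replaces each match with itself, so the text is unchanged.
input='I'\''d like to expand this: "${VARIABLE}"'
output=$(printf '%s\n' "$input" | sed -e "$script")
printf '%s\n' "$output"
```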
I've done this a couple of ways in the past. Normally, I've had a list of known variables and their values, and I've done the substitution from that list:
for var in PATH VARIABLE USERNAME
do
echo 's%${'"$var"'}%'$(eval echo "\$$var")'%g'
done > sed.script
cat test | sed -f sed.script > outputfile
If you want to map variables arbitrarily, then you either need to deal with the whole environment (instead of the fixed list of variable names, use the output from env, appropriately edited), or use Perl or Python instead.
Note that if the value of an environment variable contains a slash in your version, you'd run into problems using the slash as the field separator in the s/// notation. I used the '%' since relatively few environment variables use that - but there are some found on some machines that do contain '%' characters and so a complete solution is trickier. You also need to worry about backslashes in the value. You probably have to use something like '$(eval echo "\$$var" | sed 's/[\%]/\\&/g')' to escape the backslashes and percent symbols in the value of the environment variable. Final wrinkle: some versions of sed have (or had) a limited capacity for the script size - older versions of HP-UX had a limit of about 100. I'm not sure whether that is still an issue, but it was as recently as 5 years ago.
The simple-minded adaptation of the original script reads:
env |
sed 's/=.*//' |
while read var
do
echo 's%${'"$var"'}%'$(eval echo "\$$var" | sed 's/[\%]/\\&/g')'%g'
done > sed.script
cat test | sed -f sed.script > outputfile
However, a better solution uses the fact that you already have the values in the output from env, so we can write:
env |
sed 's/[\%]/\\&/g;s/\([^=]*\)=\(.*\)/s%${\1}%\2%/' > sed.script
cat test | sed -f sed.script > outputfile
This is altogether safer because the shell never evaluates anything that should not be evaluated - you have to be so careful with shell metacharacters in variable values. This version can only possibly run into any trouble if some output from env is malformed, I think.
Beware - writing sed scripts with sed is an esoteric occupation, but one that illustrates the power of good tools.
All these examples are remiss in not cleaning up the temporary file(s).
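A variant of the fixed-list approach that also removes its temporary file might look like the sketch below. It assumes mktemp is available and that the values contain no newlines. The file name held in script_file, the single-entry variable list and the sample input are all illustrative, and escaping & is an addition to the character class used above, since & is special in a sed replacement:

```shell
# Illustrative variable to expand.
VARIABLE=something
export VARIABLE

# Temporary file for the generated sed script.
script_file=$(mktemp)

for var in VARIABLE; do
  # Look up the value by name, then escape \, % and & for use in s%...%...%.
  value=$(eval "printf '%s' \"\$$var\"" | sed 's/[\\%&]/\\&/g')
  # Emit one substitution command per variable, e.g. s%${VARIABLE}%something%g
  printf 's%%${%s}%%%s%%g\n' "$var" "$value"
done > "$script_file"

# Apply the generated script to a sample line.
expanded=$(printf '%s\n' 'I'\''d like to expand this: "${VARIABLE}"' | sed -f "$script_file")
printf '%s\n' "$expanded"

# Clean up the temporary file.
rm -f "$script_file"
```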

Maybe you can get by without using sed:
$ echo $VARIABLE
something
$ cat test
I'd like to expand this: ${VARIABLE}
$ eval "echo \"`cat test`\"" > outputfile
$ cat outputfile
I'd like to expand this: something
Let shell variable interpolation do the work.