How to use LSB_JOBINDEX in bsub array job arguments in Platform LSF?

I would like to pass LSB_JOBINDEX as an argument to my script instead of using an environment variable.
This makes my script more LSF agnostic and avoids creating a helper script that uses the environment variable.
However, I was not able to use LSB_JOBINDEX in arguments: it only works as part of the initial command string.
For example, from a bash shell, I use the test command:
bsub -J 'myjobname[1-4]' -o bsub%I.log \
'echo $LSB_JOBINDEX' \
'$LSB_JOBINDEX' \
\$LSB_JOBINDEX \
'\$LSB_JOBINDEX' \
"\$LSB_JOBINDEX"
and the output of say bsub2.log is:
2 $LSB_JOBINDEX $LSB_JOBINDEX $LSB_JOBINDEX $LSB_JOBINDEX
So in this case, only the first $LSB_JOBINDEX got expanded, but not any of the following ones.
But I would rather not pass the entire command as a single huge string, as the 'echo $LSB_JOBINDEX' in this example does. I would prefer to just use separate arguments as in a regular bash command.
I've also tried to play around with %I but it only works for -o and related bsub options, not for the command itself.
Related: Referencing job index in LSF job array
Tested in LSF 10.1.0. Related documentation: https://www.ibm.com/support/knowledgecenter/en/SSWRJV_10.1.0/lsf_admin/job_array_cl_args.html

bsub will add single quotes around an argument if the argument starts with $. For example, if the bsub command line is
bsub command -a $ARG1 -b $ARG2
then bsub will add quotes around the 2nd and 4th parameters. The command is stored like this:
command -a '$ARG1' -b '$ARG2'
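If you want to verify what bsub actually stored, the long listing from bjobs includes the command; the job ID below is made up:
$ bjobs -l 1234 | grep Command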
One way to prevent this is to put the commands in a script. Like this:
$ cat cmd
echo $LSB_JOBINDEX
echo "line 2"
echo $LSB_JOBINDEX
Then run your job like this:
$ bsub -I < cmd
Job <2669> is submitted to default queue <normal>.
<<Waiting for dispatch ...>>
<<Starting on hostA>>
0
line 2
0
Note that the -I is not needed; it's just so you can see the job output on bsub's stdout.
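Applying that to the job array from the question: put the call in a small wrapper script, and your real script (myscript.pl here stands in for your own program) stays LSF-agnostic because it just receives a plain argument:
$ cat cmd
myscript.pl "$LSB_JOBINDEX"
$ bsub -J 'myjobname[1-4]' -o bsub%I.log < cmd
Each array element runs the wrapper with its own LSB_JOBINDEX value, so element 2 effectively runs myscript.pl 2.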
EDIT
OK, looks like this works, but it's not really a serious answer since it's so ugly. The thing is that bsub will surround an argument with single quotes if the argument starts with $. So the strategy is to find some way to make sure that the first character of the argument isn't a $. One way is to put any character other than $ as the first character of the argument, follow it with a literal backspace, and then the $. Note that it needs to be the actual backspace character, not ^ followed by H. Use Ctrl-V followed by Ctrl-H to get the literal onto the command line.
$ bsub -I echo "x^H\$LSB_JOBINDEX" "x^H\$LSB_JOBINDEX"
Job <2686> is submitted to default queue <normal>.
<<Waiting for dispatch ...>>
<<Starting on hostA>>
0 0
EDIT2
A tab literal also works, not that it's much better.
$ bsub -I echo " \$LSB_JOBINDEX" " \$LSB_JOBINDEX"
Job <2687> is submitted to default queue <normal>.
<<Waiting for dispatch ...>>
<<Starting on hostA>>
0 0
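If typing literal control characters is awkward, the outer shell can generate the leading junk for you with command substitution (an untested sketch: printf emits a real backspace, so the argument bsub sees starts with x rather than $):
$ bsub -I echo "x$(printf '\b')\$LSB_JOBINDEX" "x$(printf '\b')\$LSB_JOBINDEX"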

setup new database in ubuntu using a script [duplicate]

I have a script where I need to start a command, then pass some additional commands as commands to that command. I tried
su
echo I should be root now:
who am I
exit
echo done.
... but it doesn't work: The su succeeds, but then the command prompt is just staring at me. If I type exit at the prompt, the echo and who am i etc start executing! And the echo done. doesn't get executed at all.
Similarly, I need for this to work over ssh:
ssh remotehost
# this should run under my account on remotehost
su
## this should run as root on remotehost
whoami
exit
## back
exit
# back
How do I solve this?
I am looking for answers which solve this in a general fashion, and which are not specific to su or ssh in particular. The intent is for this question to become a canonical for this particular pattern.
Adding to tripleee's answer:
It is important to remember that the section of the script formatted as a here-document for another shell is executed in a different shell with its own environment (and maybe even on a different machine).
If that block of your script contains parameter expansion, command substitution, and/or arithmetic expansion, then you must use the here-document facility of the shell slightly differently, depending on where you want those expansions to be performed.
1. All expansions must be performed within the scope of the parent shell.
Then the delimiter of the here document must be unquoted.
command <<DELIMITER
...
DELIMITER
Example:
#!/bin/bash
a=0
mylogin=$(whoami)
sudo sh <<END
a=1
mylogin=$(whoami)
echo a=$a
echo mylogin=$mylogin
END
echo a=$a
echo mylogin=$mylogin
Output:
a=0
mylogin=leon
a=0
mylogin=leon
2. All expansions must be performed within the scope of the child shell.
Then the delimiter of the here document must be quoted.
command <<'DELIMITER'
...
DELIMITER
Example:
#!/bin/bash
a=0
mylogin=$(whoami)
sudo sh <<'END'
a=1
mylogin=$(whoami)
echo a=$a
echo mylogin=$mylogin
END
echo a=$a
echo mylogin=$mylogin
Output:
a=1
mylogin=root
a=0
mylogin=leon
3. Some expansions must be performed in the child shell, and some in the parent.
Then the delimiter of the here document must be unquoted and you must escape those expansion expressions that must be performed in the child shell.
Example:
#!/bin/bash
a=0
mylogin=$(whoami)
sudo sh <<END
a=1
mylogin=\$(whoami)
echo a=$a
echo mylogin=\$mylogin
END
echo a=$a
echo mylogin=$mylogin
Output:
a=0
mylogin=root
a=0
mylogin=leon
A shell script is a sequence of commands. The shell will read the script file, and execute those commands one after the other.
In the usual case, there are no surprises here; but a frequent beginner error is assuming that some commands will take over from the shell, and start executing the following commands in the script file instead of the shell which is currently running this script. But that's not how it works.
Basically, scripts work exactly like interactive commands, but how exactly they work needs to be properly understood. Interactively, the shell reads a command (from standard input), runs that command (with input from standard input), and when it's done, it reads another command (from standard input).
Now, when executing a script, standard input is still the terminal (unless you used a redirection) but the commands are read from the script file, not from standard input. (The opposite would be very cumbersome indeed - any read would consume the next line of the script, cat would slurp all the rest of the script, and there would be no way to interact with it!) The script file only contains commands for the shell instance which executes it (though you can of course still use a here document etc to embed inputs as command arguments).
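A tiny demonstration of that separation (a sketch; save it as demo.sh and run it from a terminal):
#!/bin/sh
# 'read' consumes standard input (your terminal), NOT the next line of this file
read answer
echo "the next script line still ran; you typed: $answer"
The echo line is executed as a command rather than being eaten by read, which is exactly the behavior described above.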
In other words, these "misunderstood" commands (su, ssh, sh, sudo, bash etc) when run alone (without arguments) will start an interactive shell, and in an interactive session, that's obviously fine; but when run from a script, that's very often not what you want.
All of these commands have ways to accept commands by ways other than in an interactive terminal session. Typically, each command supports a way to pass it commands as options or arguments:
su root -c 'who am i'
ssh user@remote uname -a
sh -c 'who am i; echo success'
Many of these commands will also accept commands on standard input:
printf 'uname -a; who am i; uptime' | su
printf 'uname -a; who am i; uptime' | ssh user@remote
printf 'uname -a; who am i; uptime' | sh
which also conveniently allows you to use here documents:
ssh user@remote <<'____HERE'
uname -a
who am i
uptime
____HERE
sh <<'____HERE'
uname -a
who am i
uptime
____HERE
For commands which accept a single command argument, that command can be sh or bash with multiple commands:
sudo sh -c 'uname -a; who am i; uptime'
As an aside, you generally don't need an explicit exit because the command will terminate anyway when it has executed the script (sequence of commands) you passed in for execution.
If you want a generic solution which will work for any kind of program, you can use the expect command.
Extract from the manual page:
Expect is a program that "talks" to other interactive programs according to a script. Following the script, Expect knows what can be expected from a program and what the correct response should be. An interpreted language provides branching and high-level control structures to direct the dialogue. In addition, the user can take control and interact directly when desired, afterward returning control to the script.
Here is a working example using expect:
set timeout 60
spawn sudo su -
expect "*?assword" { send "*secretpassword*\r" }
send_user "I should be root now:"
expect "#" { send "whoami\r" }
expect "#" { send "exit\r" }
send_user "Done.\n"
exit
The script can then be launched with a simple command:
$ expect -f custom.script
You can view a full example in the following page: http://www.journaldev.com/1405/expect-script-example-for-ssh-and-su-login-and-running-commands
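If you'd rather not keep a separate script file, expect can also read the same commands from a here document (same placeholder password and prompts as above):
expect <<'____HERE'
set timeout 60
spawn sudo su -
expect "*?assword" { send "*secretpassword*\r" }
expect "#" { send "whoami\r" }
expect "#" { send "exit\r" }
____HERE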
Note: The answer proposed by @tripleee would only work if standard input could be read once at the start of the command, or if a tty had been allocated, and won't work for any interactive program.
Example of errors if you use a pipe
echo "su whoami" |ssh remotehost
--> su: must be run from a terminal
echo "sudo whoami" |ssh remotehost
--> sudo: no tty present and no askpass program specified
In SSH, you might force a TTY allocation with multiple -t parameters, but when sudo asks for the password, it will fail.
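For example (remotehost is a placeholder), even with a forced allocation like this, the password prompt still can't be satisfied from the pipe:
echo "sudo whoami" | ssh -tt remotehost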
Without the use of a program like expect, any call to a function/program which might read information from stdin will make the next command fail:
ssh user@host <<'____HERE'
echo "Enter your name:"
read name
echo "ok."
____HERE
--> the `echo "ok."` line gets consumed as input by the read command
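Since the remote shell and the read share the same input stream, you can also turn this behavior around and place the input on the line right after the read (user@host is a placeholder):
ssh user@host <<'____HERE'
echo "Enter your name:"
read name
Alice
echo "ok, $name."
____HERE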

how to handle quotes and semicolon in bsub command

I'm trying to submit commands to the LSF scheduler with bsub but this command includes a parameter value that must be quoted and contains a semicolon.
Here is a simple command to illustrate my problem
bsub -o t.o -e t.e echo "foo;bar"
it fails with "line 8: bar: command not found", so I thought I could escape the semicolon but this
bsub -o t.o -e t.e echo "foo\;bar"
causes the same error, so does this
bsub -o t.o -e t.e echo 'foo;bar'
I know I can get around it by writing the command to a script file and executing that as the bsub command but in this case I am going to test a number of parameters and it would be so much handier to just modify the bsub command rather than editing a shell script each time.
Thanks for your help!
One simple way I can think of to do this is to use bsub's subshell interface: simply execute bsub <options> from your command line without specifying a command. bsub will then prompt you for a command in a subshell, and you can use quotes in this subshell.
Send the subshell an end-of-file (CTRL+D) to let it know you're done. Here's an example run using something similar to your case but running interactively instead of using -o to capture the output:
% bsub -I
bsub> echo "foo;bar"
bsub> <================[### Hit CTRL+D here ###]
Job <5841> is submitted to default queue <normal>.
<<Waiting for dispatch ...>>
<<Starting on hb05b10>>
foo;bar
%
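If you still want the output and error files from your original command rather than an interactive session, the same stdin interface works non-interactively with a here document (a sketch reusing the options from the question):
bsub -o t.o -e t.e <<'EOF'
echo "foo;bar"
EOF
The quoted EOF delimiter keeps the submitting shell from touching anything inside, so the semicolon and quotes reach the job untouched.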

Returning exit code of piped command in a LSF command

I hope my problem is not too specific...
There are lots of questions and answers about how to return the exit code for a command that is piped into another command, but my case is a little different...
I have a generic command whose output I pipe to a syntax-coloring script. This command is executed via LSF's bsub. Something like this:
bsub <switches> "command | colorize"
Assume the command returns a non-zero exit value. The bsub is returning a zero exit value because of the colorize command, which is the last command in the pipe.
If I don't pipe it--
bsub <switches> "command"
the exit value is the correct non-zero value from command.
Is there a way to get the non-zero value with the pipe?
For full disclosure, this bsub is actually being called via a system() call in perl. As long as the bsub returns nonzero, the system call should return non-zero and all is good.
I looked at how to get exit codes from piped commands via $PIPESTATUS, but I don't think it works in this case because 1) I'm running from perl and not a shell, and 2) I don't know if bsub would return that.
Following on Mr. Llama's comment that:
Some shells like bash offer an option like -o pipefail which will cause a pipe chain to return the first non-zero return code (if any).
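A quick way to see what pipefail changes, before involving LSF at all:
$ bash -c 'false | cat'; echo $?
0
$ bash -c 'set -o pipefail; false | cat'; echo $?
1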
You could put your pipeline into a script like so:
#!/bin/bash
set -o pipefail
command | colorize
Then submit your job by spooling the script directly into bsub:
bsub <switches> < yourscript.sh
As a side note, you can also define <switches> inside your script like so:
#!/bin/bash
#BSUB -n 4
#BSUB -o outfile.txt
set -o pipefail
command | colorize
Then spool it into bsub this way:
bsub < yourscript.sh

Awk inside of qsub

I have a bash script in which I have a few qsubs. Each of them waits for a previous qsub to be done before starting.
My first qsub consists of sending files in a certain directory to a perl program and having the outfiles printed in a new directory. At the end, I echo the array with all my job names. This script works as intended.
mkdir -p /perl_files_dir
for ID_FILES in `ls Infiles_dir/*.txt`;
do
JOB_ID=`echo "perl perl_script.pl $ID_FILES" | qsub -j oe `
JOB_ID_ARRAY="${JOB_ID_ARRAY}:$JOB_ID"
done
echo $JOB_ID_ARRAY
My second qsub is meant to sort all the previous files made with my perl script into a new outfile, and to start after all these jobs are done (about 100 jobs) with depend=afterany. Again, this part is working fine.
SORT_JOB=`echo "sort -m -n perl_files_dir/*.txt >>sorted_file.txt" | qsub -j oe -W depend=afterany$JOB_ID_ARRAY`
SORT_ARRAY="${SORT_ARRAY}:$SORT_JOB"
My issue is that in my sorted file, I have a few columns I wish to remove (2 to 6), so I came up with this last line using awk piped to sed with another depend=afterany
SED=`echo "awk '{\$2="";\$3="";\$4="";\$5="";\$6=""; print \$0}' sorted_file.txt \
| sed 's/ //g' >final_file.txt" | qsub -j oe -W depend=afterany$SORT_ARRAY`
This last step creates final_file.txt, but leaves it empty. I added SED= before my echo because it would otherwise give me Command not found.
I tried without the pipe so it would just print everything. Unfortunately it prints nothing.
I assume it is not opening my sorted file, and this is why my final file is empty after my sed. If that's the case, then why won't awk read it?
In my script, I am using variables to define my directories and files (with the correct path). I know my issue is not about finding my files or directories since they are perfectly defined at the beginning and used throughout the script. I tried to write the whole path instead of a variable and I get the same results.
for ID_FILES in `ls Infiles_dir/*.txt`
Simplify this to
for ID_FILES in Infiles_dir/*.txt
ls lists the files you pass it (except when you pass it directories, then it lists their content). Rather than telling it to display a list of files and parse the output, use the list of files you already have! This is more reliable (parsing the output of ls will fail if the file names contain whitespace or wildcard characters), clearer and faster. Don't parse the output of ls.
SORT_JOB=`echo "sort -m -n perl_files_dir/*.txt >>sorted_file.txt" | qsub -j oe -W depend=afterany$JOB_ID_ARRAY`
You'd make your life simpler if you used the right form of quoting in the right place. Don't use backquotes, because it's difficult to know how to quote things inside. Use $(…) instead; it's exactly equivalent except that it is parsed in a sane way.
I recommend using a here document for the shell snippet that you're feeding to qsub. You have fewer quoting issues to worry about, and it's more readable.
While we're at it, always put double quotes around variable substitutions and command substitutions: "$some_variable", "$(some_command)". Annoyingly, $var in shell syntax doesn't mean “take the value of the variable var”, it means “take the value of the variable var, parse it as a list of wildcard patterns, and replace each pattern by the list of matching files if there are matching files”. This extra stuff is turned off if the substitution happens inside double quotes (or in a here document, by the way): "$var" means “take the value of the variable var”.
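A quick illustration of the difference, assuming the current directory contains a.txt and b.txt:
var="*.txt"
echo $var      # unquoted: the shell expands the pattern and prints: a.txt b.txt
echo "$var"    # quoted: prints the literal value: *.txt
With all of that in mind, the sort submission can be rewritten as: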
SORT_JOB=$(qsub -j oe -W depend="afterany$JOB_ID_ARRAY" <<'EOF'
sort -m -n perl_files_dir/*.txt >>sorted_file.txt
EOF
)
We now get to the snippet where the quoting was actually causing a problem.
SED=`echo "awk '{\$2="";\$3="";\$4="";\$5="";\$6=""; print \$0}' sorted_file.txt \
| sed 's/ //g' >final_file.txt" | qsub -j oe -W depend=afterany$SORT_ARRAY`
The string that becomes the argument to the echo command is:
awk '{$2=;$3=;$4=;$5=;$6=; print $0}' sorted_file.txt | sed 's/ //g' >final_file.txt
This is syntactically incorrect, and that's why you're not getting any output.
You didn't escape the double quotes inside what was meant to be the awk snippet. It's a lot clearer if you use a here document. Also, you don't need the SED= part. You added it because you had a command substitution (a command between backquotes), which substitutes the output of a command. But since you aren't interested in the output of the qsub command, don't take its output, just execute it.
qsub -j oe -W depend="afterany$SORT_ARRAY" <<'EOF'
awk '{$2="";$3="";$4="";$5="";$6=""; print $0}' sorted_file.txt |
sed 's/ //g' >final_file.txt
EOF
I'm not familiar with qsub, but presumably there's a way to get the error output and the return status of the commands it runs. Inspect that error output; you should have seen the errors from awk.
The version of awk that I am using does not like the character escapes:
awk --version
GNU Awk 3.1.7
spuder@cent64$ awk '{\$2="";\$3="";\$4=""; print \$0}' foo.txt
awk: {\$2="";\$3="";\$4=""; print \$0}
awk: ^ backslash not last character on line
Try the following syntax
awk '{for(i=2;i<=7;i++) $i="";print}' foo.txt
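A quick test of that version (the blanked fields leave runs of spaces behind, which the sed 's/ //g' in the original pipeline then removes):
$ echo 'c1 c2 c3 c4 c5 c6 c7 c8' | awk '{for(i=2;i<=7;i++) $i="";print}'
c1       c8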
As a side note, if you are using Torque 4.x you may not be able to use a comma-separated list of jobs with -W depend=; instead you may need to create a new PBS declarative (-W) for each job, e.g.:
#Invalid syntax in newer versions of torque
qsub -W depend=foo,bar
Resources
backslash in gawk fields
Print all but the first three columns
http://docs.adaptivecomputing.com/torque/help.htm#topics/commands/qsub.htm#-W

Escaping whitespace within nested shell/perl scripts

I'm trying to run a perl script from within a bash script (I'll change this design later on, but for now, bear with me). The bash script receives the argument that it will run. The argument to the script is as follows:
test.sh "myscript.pl -g \"Some Example\" -n 1 -p 45"
Within the bash script, I simply run the argument that was passed:
#!/bin/sh
$1
However, in my perl script the -g argument only gets "Some (that's with the quote included) instead of Some Example. Even if I quote it, it cuts off at the whitespace.
I tried escaping the whitespace, but it doesn't work... any ideas?
To run it as posted (test.sh "myscript.pl -g \"Some Example\" -n 1 -p 45"), do this:
#!/bin/bash
eval "$1"
This causes the $1 argument to be parsed by the shell so the individual words will be broken up and the quotes removed.
Or if you want you could remove the quotes and run test.sh myscript.pl -g "Some Example" -n 1 -p 45 if you changed your script to:
#!/bin/bash
"$#"
The "$#" gets replaced by all the arguments $1, $2, etc., as many as were passed in on the command line.
Quoting is normally handled by the parser, which doesn't see the quotes when you substitute the value of $1 in your script.
You may have more luck with:
#!/bin/sh
eval "$1"
which gives:
$ sh test.sh 'perl -le "for (@ARGV) { print; }" "hello world" bye'
hello world
bye
Note that simply forcing the shell to interpret the quoting with "$1" won't work, because then it tries to treat the first argument (i.e., the entire command) as the name of the command to be executed. You need to pass it through eval to get proper quoting and then re-parsing of the command.
This approach is (obviously?) dangerous and fraught with security risks.
I would suggest you name the perl script in a separate word, then you can quote the parameters when referring to them, and still easily extract the script name without needing the shell to split the words, which is the fundamental problem you have.
test.sh myscript.pl "-g \"Some Example\" -n 1 -p 45"
and then
#!/bin/sh
$1 "$2"
if you really have to do this (for whatever reason) why not just do:
sh test.sh "'Some Example' -n 1 -p 45"
in test.sh:
RUN=myscript.pl
echo `$RUN $1`