SLURM Job Array output file in command - hpc

I have a command list like this
bedtools intersect -a BED1 -b BED2 >BED1_BED2_overlaps.txt
...
with over 100 files.
Here is the header for my job submission
#SBATCH -t 0-08:00
#SBATCH --job-name=JACCARD_DNase
#SBATCH -o /oasis/scratch/XXX/XXX/temp_project/logs/JACCARD_DNase_%a_out
#SBATCH -e /oasis/scratch/XXX/XXX/temp_project/logs/JACCARD_DNase_%a_err
#SBATCH --array=1-406%50
When I submit the job I get this error
Error: Unable to open file >BED1_BED2_overlaps.txt Exiting.
I tried to pipe an echo command like this
bedtools intersect -a BED1 -b BED2 | echo "BED1 BED2"
And I got
Error: Unable to open file |. Exiting.
So what gives? How can I submit array jobs with Bash syntax like > output and | pipes?

It looks like you are missing the shebang; your submission script should start with
#!/bin/bash
or the path to any other shell you like. Without a shell interpreting the script, redirections and pipes can end up being passed to bedtools as literal arguments, which would explain why it tries to open a file called > or |.
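For reference, here is a minimal sketch of a complete array submission script, assuming the 406 bedtools commands are stored one per line in a file called commands.txt (that file name is a placeholder, not from the original post). With the shebang in place, each task runs its line through bash, so > redirection and | pipes behave as they would interactively.
#!/bin/bash
#SBATCH -t 0-08:00
#SBATCH --job-name=JACCARD_DNase
#SBATCH -o /oasis/scratch/XXX/XXX/temp_project/logs/JACCARD_DNase_%a_out
#SBATCH -e /oasis/scratch/XXX/XXX/temp_project/logs/JACCARD_DNase_%a_err
#SBATCH --array=1-406%50
# Take the command on line $SLURM_ARRAY_TASK_ID of commands.txt and run it in this shell.
cmd=$(sed -n "${SLURM_ARRAY_TASK_ID}p" commands.txt)
eval "$cmd"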

Related

Samtools/hpc/truncated file

I have tried to submit the script below to an HPC cluster:
#!/bin/bash
#PBS -N bwa_mem_tumor
#PBS -q batch
#PBS -l walltime=02:00:00
#PBS -l nodes=2:ppn=2
#PBS -j oe
sample=x
ref=absolute/path/GRCh38.p13.genome.fa
fwd=absolutepath/forward_read.fq.gz
rev=absolutepath/reverse_read.fq.gz
module load bio/samtools/1.9
bwa mem $ref $fwd $rev > $sample.tumor.sam && samtools view -S $sample.tumor.sam -b > $sample.tumor.bam && samtools sort $sample.tumor.bam > $sample.tumor.sorted.bam
However, the only output I get is $sample.tumor.sam, and the log file says:
Lmod has detected the following error: The following module(s) are unknown:
"bio/samtools/1.9"
Please check the spelling or version number. Also try "module spider ..."
It is also possible your cache file is out-of-date; it may help to try:
$ module --ignore-cache load "bio/samtools/1.9"
Also make sure that all modulefiles written in TCL start with the string
#%Module
However, when I run module avail it shows that bio/samtools/1.9 is on the list.
Also, when I use module --ignore-cache load "bio/samtools/1.9"
the result is the same.
If I try to continue working with the SAM file and enter the command manually
samtools view -b RS0107.tumor.sam > RS0107.tumor.bam
it shows
[W::sam_read1] Parse error at line 200943
[main_samview] truncated file.
What could be wrong with the samtools module, or with the script?
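For what it's worth, one way to separate the two problems (the module that will not load versus the truncated SAM file) is to make the script abort on the first failure instead of letting the && chain run against a half-written file. A minimal sketch of guard lines to place before the pipeline (these are not in the original script):
set -e                         # abort the job as soon as any command fails
module load bio/samtools/1.9   # if this fails, the job stops here rather than later
command -v bwa samtools        # exits non-zero if either tool is missing from PATH
If the job then dies before samtools ever runs, the SAM file is most likely incomplete because bwa was cut short (for example by the 02:00:00 walltime), not because samtools itself is broken.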

How to configure Slurm to email the output file?

I'm a newbie to Slurm and I'm trying to configure my bash script so that, if a job fails, the notification email contains the corresponding standard output. I've managed to configure email notifications, but how can I make the body of the email contain the standard output?
#!/bin/bash
#SBATCH -n 2 # two cores
#SBATCH --mem=3G
#SBATCH --time=48:00:00 # total run time limit (HH:MM:SS)
#SBATCH --mail-user=rylansch
#SBATCH --mail-type=FAIL
export PYTHONPATH=.
python -u model_train.py # -u flushes output buffer immediately
I don't see answers in How to configure the content of slurm notification emails? or How to let SBATCH send stdout via email?
See my solution below:
#!/bin/bash
#SBATCH -J MyModel
#SBATCH -n 1 # Number of cores
#SBATCH -t 1-00:00 # Runtime in D-HH:MM
#SBATCH -o JOB%j.out # File to which STDOUT will be written
#SBATCH -e JOB%j.out # File to which STDERR will be written
#SBATCH --mail-type=BEGIN
#SBATCH --mail-user=my@email.com
secs_to_human(){
echo "$(( ${1} / 3600 )):$(( (${1} / 60) % 60 )):$(( ${1} % 60 ))"
}
start=$(date +%s)
echo "$(date -d #${start} "+%Y-%m-%d %H:%M:%S"): ${SLURM_JOB_NAME} start id=${SLURM_JOB_ID}\n"
### exec task here
( << replace with your task here >> ) \
&& (cat JOB$SLURM_JOB_ID.out |mail -s "$SLURM_JOB_NAME Ended after $(secs_to_human $(($(date +%s) - ${start}))) id=$SLURM_JOB_ID" my#email.com && echo mail sended) \
|| (cat JOB$SLURM_JOB_ID.out |mail -s "$SLURM_JOB_NAME Failed after $(secs_to_human $(($(date +%s) - ${start}))) id=$SLURM_JOB_ID" my#email.com && echo mail sended && exit $?)
You can also edit this to send separate stdout/stderr logs or attach them as files.
This snippet is also shared as a GitHub gist.
As a regular user, you do not get to choose the contents of the email sent to you. Only the administrators can do that.
But you could add a command at the end of your submission script to send you an email, as explained here.
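For example, a minimal version of that idea for the script above (the address is a placeholder, and slurm-${SLURM_JOB_ID}.out is the default output file name when -o is not set) could be:
# Mail the job's output file to yourself only if the training step fails.
python -u model_train.py || mail -s "Job ${SLURM_JOB_ID} failed" you@example.com < "slurm-${SLURM_JOB_ID}.out"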

Slurm task id as Matlab's function argument

I want to create a job array in Slurm that calls a Matlab function depending on the array task ID. I tried
#!/bin/bash
#SBATCH -J TEST
#SBATCH -p slims
#SBATCH -o o
#SBATCH -e e
matlab -r "test(${SLURM_ARRAY_TASK_ID})"
where test.m is the Matlab function that I want to run. This throws the error "Not enough arguments in line 7 test.m ..."
How should I do it?
It looks like $SLURM_ARRAY_TASK_ID was not defined, and there is no --array parameter in your submission file. So unless you provided that argument on the command line
sbatch --array ... <yourscript.sh>
you did not tell Slurm to create an array.
Either add #SBATCH --array ... to your submission script or specify it on the command line.
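A sketch of the corrected submission script (the range 1-10 is only an example; %a in the log file names inserts the task index so tasks do not overwrite each other's logs):
#!/bin/bash
#SBATCH -J TEST
#SBATCH -p slims
#SBATCH --array=1-10
#SBATCH -o o_%a
#SBATCH -e e_%a
matlab -r "test(${SLURM_ARRAY_TASK_ID})"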

LSF job submission - stdout & stderr redirection

I've submitted my job by the following command:
bsub -e error.log -o output.log ./myScript.sh
I have a question: why are the output and error logs available only once the job has ended?
Thanks
LSF doesn't stream the output back to the submission host. If the submission host and the execution host have a shared file system, and the JOB_SPOOL_DIR is in that shared file system (the spool directory is $HOME/.lsbatch by default), then you should see the stdout and stderr there. After the job finishes, the files there are copied back to the location specified by bsub.
Check bparams -a | grep JOB_SPOOL_DIR to see if the admin has changed the location of the spool dir. With or without the -o/-e options, while the job is running its stdout/err will be captured in the job's spool directory. When the job is finished, the stdout/stderr is copied to the filenames specified by bsub -o/-e. The location of the files in the spool dir is $JOB_SPOOL_DIR/<jobsubmittime>.<jobid>.out or $JOB_SPOOL_DIR/<jobsubmittime>.<jobid>.err
[user1@beta ~]$ cat log.sh
LINE=1
while :
do
echo "line $LINE"
LINE=$((LINE+1))
sleep 1
done
[user1@beta ~]$ bsub -o output.log -e error.log ./log.sh
Job <930> is submitted to default queue <normal>.
[user1@beta ~]$ tail -f .lsbatch/*.930.out
line 1
line 2
line 3
...
According to the LSF documentation the behaviour is configurable:
If LSB_STDOUT_DIRECT is not set and you use the bsub -o option, the standard output of a job is written to a temporary file and copied to the file you specify after the job finishes.
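If the spool directory is not shared (or you just want a quick look), LSF's bpeek command prints whatever stdout/stderr a running job has produced so far, for example:
[user1@beta ~]$ bpeek 930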

Torque qsub -o option does not work

I made a test script test.qsub:
#!/bin/bash
#PBS -q batch
#PBS -o output.txt
#PBS -e Error.err
echo "hello world"
When I run qsub test.qsub it generates neither the output.txt file nor the error file. I believe the other options do not work either; I'd appreciate your help! It is said that you should configure torque.cfg, but in my installation that file was not generated and is not in /var/spool/torque.
Try "#PBS -k oe". This directs pbs to keep stdout and stderr.