LSF moving files into created output dir - hpc

When executing a job on LSF you can specify the working directory and create an output directory, e.g.
bsub -cwd /home/workDir -outdir /home/%J program inputfile
where it will look for inputfile in the specified working directory. The -outdir option creates a new directory named after the JobId.
What I'm wondering is how you get the results the run creates in the working directory into the newly created output dir.
You can't add a command like
mv * /home/%J
as the underlying OS has no understanding of the %J identifier. Is there an option in LSF for moving the data from inside the job, where it knows the JobId?

You can use the environment variable $LSB_JOBID.
mv * /data/${LSB_JOBID}/
If you copy the data inside your job script, then the job will hold the compute resource during the data copy. If you're copying a small amount of data then it's not a problem, but if it's a large amount of data you can use bsub -f so that other jobs can start while the data copy is ongoing.
bsub -outdir "/data/%J" -f "/data/%J/final < bigfile" sh script.sh
bigfile is the file that your job creates on the compute host. It will be copied to /data/%J/final after the job finishes. It even works on a non-shared filesystem.
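Putting the first approach together, here is a minimal sketch of a job script that does the move itself; job.sh and the /data prefix are assumptions that simply mirror the example paths above:
#!/bin/bash
# job.sh (hypothetical): run the program, then move its results into the
# per-job directory. LSF exports LSB_JOBID to the running job, so the script
# knows its own JobId even though the shell cannot expand %J.
program inputfile
mv * "/data/${LSB_JOBID}/"
Submitted with something along the lines of bsub -cwd /home/workDir -outdir "/data/%J" sh job.sh, the job runs in the working directory and its results end up in the directory LSF created for the JobId.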

Related

sed -n function calling in same line repeatedly

I'm a complete novice with regard to Unix and writing shell scripts, so apologies if the solution to my problem is quite banal.
Essentially, I'm working on a shell script that reads from a TextEdit file called "sursecout.txt" and runs it through another script called "sursec.x" (where sursec.x is simply a series of FORTRAN integrations). It then creates a folder named after a certain Jacobi integral ("CJ =") and stores the ten SurSec[n] files there (where n = integer). My problem is that the different folders are created correctly with appropriate names, but are each filled with identical output files. My suspicion is that something is wrong with my sed command, in that it's reading the same two lines over and over again (whereas it should be reading the first two lines of sursecout.txt, then the next two, etc.).
Here are the first two folders I want to make, but I have 30, so any help would be appreciated.
./sursec.x < ./sursecout.txt
sed -n '1,2p;3q' sursecout.txt
cd ..
mv ./data ./CJ=3.029990
mkdir data
cd SurSec
./sursec.x < ./sursecout.txt
sed -n '3,4p;5q' sursecout.txt
cd ..
mv ./data ./CJ=3.030659
mkdir data
cd SurSec
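One possible reading of the intent is to feed each successive pair of lines from sursecout.txt into sursec.x, rather than redirecting the whole file every time as the script above does. A hedged sketch of such a loop, assuming the 30 CJ values are supplied one per line in a separate list (cj_values.txt is hypothetical):
#!/bin/bash
# Sketch only: on run n, pipe lines 2n-1..2n of sursecout.txt into sursec.x,
# then rename the data folder after the matching CJ value.
n=1
while read -r cj; do
    start=$(( 2 * n - 1 ))
    end=$(( 2 * n ))
    sed -n "${start},${end}p;$(( end + 1 ))q" sursecout.txt | ./sursec.x
    cd ..
    mv ./data "./CJ=${cj}"
    mkdir data
    cd SurSec
    n=$(( n + 1 ))
done < cj_values.txt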

How to correctly use `gsutil -q stat` in scripts?

I am creating a KSH script to check whether a subdirectory exists in a GCS bucket. I am writing the script like this:
#!/bin/ksh
set -e
set -o pipefail
gsutil -q stat ${DESTINATION_PATH}/
PATH_EXIST=$?
if [ ${PATH_EXIST} -eq 0 ] ; then
# do something
fi
The weird thing is that when ${DESTINATION_PATH}/ does not exist, the script exits without evaluating PATH_EXIST=$?. If ${DESTINATION_PATH}/ exists, the script runs normally as expected.
Why does that happen? How can I do better?
The statement set -e means that your script exits as soon as any command returns a non-zero status.
The gsutil stat command can be used to check whether an object exists:
gsutil -q stat gs://some-bucket/some-object
It has an exit status of 0 for an existing object and 1 for a non-existent object.
However, using it with subdirectories is advised against:
Note: Unlike the gsutil ls command, the stat command does not support operations on sub-directories. For example, if you run the command:
gsutil -q stat gs://some-bucket/some-subdir/
gsutil will look for information about an object called some-subdir/ (with a trailing slash) inside the bucket some-bucket, as opposed to operating on objects nested under gs://some-bucket/some-subdir/. Unless you actually have an object with that name, the operation will fail.
The reason your command does not fail when ${DESTINATION_PATH}/ exists is that if you created the folder using the Cloud Console, i.e. the UI, a placeholder object was created with that name. But to be clear, folders don't exist in Google Cloud Storage; they are just a visualization of the bucket's object hierarchy.
So if you upload an object named newFolder/object to your bucket and newFolder does not exist, it will be "created", but gsutil -q stat ${DESTINATION_PATH}/ will return exit code 1. However, if you create the folder using the UI and run the same command, it will return exit code 0. So follow the documentation and avoid using stat to check whether a directory exists.
Instead, if you want to check whether a subdirectory exists, just check whether it contains any object:
gsutil -q stat ${DESTINATION_PATH}/*
This returns 0 if there is any object under the subdirectory and 1 otherwise.
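Putting that together with the set -e issue from the question, a minimal sketch of the check (the paths are placeholders) could look like:
#!/bin/ksh
set -e
set -o pipefail

# Wrapping the command in an if means its non-zero exit status is tested
# rather than aborting the script under set -e. The wildcard matches any
# object stored under the "folder" prefix.
if gsutil -q stat "${DESTINATION_PATH}/*" ; then
    # do something: the subdirectory contains at least one object
    echo "path exists"
else
    echo "path does not exist"
fi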

Writing to /proc

I have an FPGA setup that is connected to a folder within /proc. I need to write to a file there, but when I do, the file size ends up being 0 and the file is not written, though no error is issued. Oddly, this behavior does not occur with scp.
I can echo to file successfully: echo -ne "\000\000\000\000" > /proc/file
I can scp a file from a remote machine to /proc/file
I cannot copy a local file to it: cp localfile /proc/file
sftp also gives a 0 file size
My question is: what is different between cp, scp, and sftp, probably at a pretty low level, such that one works and the others don't?
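To see what actually differs at that level, one approach is to trace the system calls each tool makes against the /proc entry and compare them. A sketch, with file names as placeholders:
# Record the relevant syscalls for cp and for a working echo redirect, then diff.
strace -f -e trace=openat,write,lseek,ftruncate -o cp.trace cp localfile /proc/file
strace -f -e trace=openat,write,lseek,ftruncate -o echo.trace sh -c 'echo -ne "\000\000\000\000" > /proc/file'
diff cp.trace echo.trace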

Script response if md5sum returns FAILED

Say I had a script that checked honeypot locations using md5sum.
#!/bin/bash
#cryptocheck.sh
#Designed to check md5 CRC's of honeypot files located throughout the filesystem.
#Must develop file with specific hashes and create crypto.chk using following command:
#/opt/bin/md5sum * > crypto.chk
#After creating file, copy honeypot folder out to specific folders
locations=("/share/ConfData" "/share/ConfData/Archive" "/share/ConfData/Application"
"/share/ConfData/Graphics")
for i in "${locations[@]}"
do
cd "$i/aaaCryptoAudit"
/opt/bin/md5sum -c /share/homes/admin/crypto.chk
done
And the output looked like this:
http://pastebin.com/b4AU4s6k
Where would you start in trying to recognize the output and perhaps trigger some sort of response by the system if there is a 'FAILED'?
I've worked a bit with Perl trying to parse log files before, but my attempts typically failed miserably for one reason or another.
This may not be the proper way to go about it, but I'd want to put this script into a cron job that runs every minute. Some people have told me that an inotify job or script (I'm not familiar with this) would be better than doing it this way.
Any suggestions?
--- edit
I made another script to call the script above and send its output to a file. The new script then runs grep -q on 'FAILED' and, if it picks anything up, sounds the alarm (TBD what the alarm will be).
#!/bin/bash
#cryptocheckinit.sh
#
#rm /share/homes/admin/cryptoalert.warn
/share/homes/admin/cryptocheck.sh > /share/homes/admin/cryptoalert.warn
grep -q "FAILED" /share/homes/admin/cryptoalert.warn && echo "LIGHT THE SIGNAL FIRES"
Use:
if ! /opt/bin/md5sum -c /share/homes/admin/crypto.chk
then
# Do something
fi
Or pipe the output of the loop:
for i in "${locations[@]}"
do
cd "$i/aaaCryptoAudit"
/opt/bin/md5sum -c /share/homes/admin/crypto.chk
done | grep -q FAILED && echo "LIGHT THE SIGNAL FIRES"
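A hedged variant of the same idea folds the check back into cryptocheck.sh itself, so the cron entry can react to the script's exit status instead of grepping a temp file (paths as in the question):
#!/bin/bash
# cryptocheck.sh (sketch): exit non-zero if any honeypot location fails verification.
locations=("/share/ConfData" "/share/ConfData/Archive" "/share/ConfData/Application"
"/share/ConfData/Graphics")
status=0
for i in "${locations[@]}"
do
    cd "$i/aaaCryptoAudit" || { status=1; continue; }
    if ! /opt/bin/md5sum -c /share/homes/admin/crypto.chk
    then
        status=1
    fi
done
exit "$status"
The wrapper (or the cron line itself) can then simply run /share/homes/admin/cryptocheck.sh || echo "LIGHT THE SIGNAL FIRES" without the intermediate .warn file.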

using wget to overwrite file but use temporary filename until full file is received, then rename

I'm using wget in a cron job to fetch a .jpg file into a web server folder once per minute (with the same filename each time, overwriting). This folder is "live" in that the web server also serves that image from there. However, if someone browses to that page while the image is being fetched, the browser sees a JPEG with errors and says so. So what I need, similar to when Firefox is downloading a file, is for wget to write to a temporary file, either in /var or in the destination folder but with a temporary name, until it has the whole thing, and then rename it in an atomic (or at least negligible-duration) step.
I've read the wget man page and there doesn't seem to be a command line option for this. Have I missed it? Or do I need to do two commands in my cron job, a wget and a move?
There is no way to do this purely with GNU Wget.
wget's job is to download files and it does that. A simple one line script can achieve what you're looking for:
$ wget -O myfile.jpg.tmp example.com/myfile.jpg && mv myfile.jpg{.tmp,}
Since mv is atomic, at least on Linux when the source and destination are on the same filesystem, you get an atomic switch to the completed file.
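As a usage sketch for the once-per-minute cron job in the question (the URL and the web server paths are placeholders):
# Fetch to a temporary name, then rename atomically in the same directory.
* * * * * wget -q -O /var/www/html/live.jpg.tmp http://example.com/live.jpg && mv /var/www/html/live.jpg.tmp /var/www/html/live.jpg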
Just wanted to share my solution:
alias wget='func(){ (wget --tries=0 --retry-connrefused --timeout=30 -O download_pkg.tmp "$1" && mv download_pkg.tmp "${1##*/}") || rm download_pkg.tmp; unset -f func; }; func'
It creates a function that receives a URL parameter and downloads the file to a temporary name. If the download succeeds, the file is renamed to the correct filename, extracted from parameter $1 with ${1##*/}; if it fails, the temp file is deleted. If the operation is aborted, the temp file will be replaced on the next run. Finally, unset -f removes the function definition once the alias has executed.
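For example, with the alias in place and a placeholder URL:
wget https://example.com/files/archive.tar.gz
downloads to download_pkg.tmp in the current directory and, on success, renames it to archive.tar.gz.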