Is there a way to see how many nodes are allocated to a user (me) in the qsub system?
Also, how can one see how many nodes/cores a process is actually using?
You can list a user's jobs by calling
qstat -u user_name
The output shows the allocated nodes in one of its columns.
If you want detailed information about a single job, call:
qstat -f job_id
This prints all the attributes of the given job.
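To turn that output into a per-node core count, something like the following sketch might help. It assumes a Torque/PBS-style exec_host value of the form host/slot+host/slot (for example node01/0+node01/1+node02/0) copied from the qstat -f output; other schedulers and newer formats (slot ranges, multipliers) are not handled, and the function name cores_per_node is made up for illustration.

```python
from collections import Counter


def cores_per_node(exec_host):
    """Count slots per host in a Torque-style exec_host string.

    Assumes entries look like "host/slot" joined by "+".
    Range ("host/0-3") and multiplier ("host/0*4") forms are NOT handled.
    """
    counts = Counter()
    for entry in exec_host.split("+"):
        # Everything before the first "/" is taken as the host name.
        host = entry.split("/", 1)[0].strip()
        if host:
            counts[host] += 1
    return dict(counts)


def summarize(exec_host):
    """Return (number_of_nodes, number_of_slots) for one job."""
    per_node = cores_per_node(exec_host)
    return len(per_node), sum(per_node.values())
```

For example, summarize("node01/0+node01/1+node02/0") reports two nodes and three slots. Note that this is what the scheduler allocated, not what the process really uses; for actual usage you would still have to look at the compute nodes themselves.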
I have a few CloudWatch log groups that receive a lot of logs. Because the ingestion volume is large, I want to reduce it. My log level is already INFO, so the next step is to comment out some of the INFO log statements.
Since I cannot comment out all of them, I need to find out which log patterns are repeated most often and check whether any of those can be removed. Is there a way to get the log patterns with the highest repeat counts using Logs Insights?
I need to know how many tasks are reserved by all running consumers.
For a single consumer, this is determined by the worker_prefetch_multiplier parameter.
To get the total number, I need to know how many consumers are currently running. How do I find out how many of them there are?
I know that I can do this using
rabbitmqctl list_queues name consumers, but how do I do this in the code?
I found the answer.
app.control.inspect().stats() contains all the necessary information.
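A minimal sketch of how that reply could be summarized, assuming stats() returns a dict keyed by worker name (or None when no worker replies) and that each entry carries a prefetch_count field. The helper names summarize_workers and fetch_stats are made up for illustration, and app stands for your own Celery application object.

```python
def summarize_workers(stats):
    """Return (worker_count, total_prefetch) from an inspect().stats() reply.

    `stats` is expected to map worker names to their stats dicts;
    None (no worker replied) is treated as "no workers".
    """
    stats = stats or {}
    # prefetch_count is assumed to be present in each worker's stats;
    # workers that do not report it contribute 0.
    total_prefetch = sum(info.get("prefetch_count", 0) for info in stats.values())
    return len(stats), total_prefetch


def fetch_stats(app, timeout=1.0):
    """Ask all running workers for their stats (needs a reachable broker)."""
    return app.control.inspect(timeout=timeout).stats()


# Usage, with `app` being your Celery application:
#   workers, reserved = summarize_workers(fetch_stats(app))
```

Workers that do not answer within the timeout are simply missing from the reply, so the count is a snapshot rather than a guarantee.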
I've got a large computational task, consisting of several steps, that I run on a PC cluster, managed by LSF.
Part of this task includes launching several parallel jobs with identical names. Jobs are somewhat different, therefore it is hard to transform them to a job array.
The next step of this computation, following these jobs, summarizes their results, therefore it must wait until all of them are finished.
I'm trying to use -w ended(job-name) command line switch for bsub, as usual, to specify job dependencies.
However, admins of the cluster have set JOB_DEP_LAST_SUB = 1 in lsb.params.
According to the LSF manual, this makes LSF wait only for the most recently submitted job with the supplied name, instead of all jobs with that name.
Is it possible to override this behavior for my task only without asking admins to reconfigure the whole cluster (this cluster is used by many people, it is very unlikely that they agree)?
I cannot find any clues in the manual.
Looks like it cannot be overridden.
Instead, I made the job names unique by appending a random value to each, and then changed the condition to -w ended(job-name-*)
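That workaround could be scripted roughly as follows. This is only a sketch: it builds the bsub argument lists without running them, the helper name build_bsub_commands is invented here, and it assumes the cluster accepts a trailing * wildcard in a quoted job name inside the dependency expression.

```python
import uuid


def build_bsub_commands(prefix, commands, summary_command):
    """Build bsub argument lists for parallel jobs plus a dependent summary job.

    Each parallel job gets a unique name "<prefix>-<random hex>", and the
    summary job waits on all of them through the wildcard "<prefix>-*".
    """
    submissions = []
    for command in commands:
        # A random suffix keeps names unique, so JOB_DEP_LAST_SUB = 1
        # no longer collapses the dependency to a single job.
        job_name = f"{prefix}-{uuid.uuid4().hex}"
        submissions.append(["bsub", "-J", job_name, command])
    dependency = f'ended("{prefix}-*")'
    submissions.append(["bsub", "-w", dependency, summary_command])
    return submissions


# Usage: pass each list to subprocess.run(args, check=True) on the cluster.
```

Building the commands as lists avoids shell quoting problems around the parentheses and the wildcard in the dependency expression.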
What's the easiest way to run a configurable number of identical jobs on Kubernetes but give each of them different parameters (like job number)?
1) You could have a template job and use bash expansion to generate multiple job specifications based on that template.
As shown in the official Parallel Processing using Expansions user guide:
mkdir ./jobs
for i in apple banana cherry
do
cat job.yaml.txt | sed "s/\$ITEM/$i/" > ./jobs/job-$i.yaml
done
kubectl create -f ./jobs
2) Or you could create a queue and have a specified number of parallel workers/jobs to empty the queue. The contents of the queue would then be the input for each worker and Kubernetes could spawn parallel jobs. That's best described in the Coarse Parallel Processing using a Work Queue user guide.
The first approach is simple and straightforward, but lacks flexibility.
The second requires a message queue as "overhead", but you gain flexibility.
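If the number of jobs should be configurable from a script rather than a shell loop, the first approach can also be sketched in Python. This assumes the same kind of template file with a literal $ITEM placeholder; the function names render_jobs and write_jobs are illustrative only.

```python
from pathlib import Path


def render_jobs(template, items, placeholder="$ITEM"):
    """Return {file name: manifest text}, one entry per item.

    Unlike the sed command above, str.replace substitutes every
    occurrence of the placeholder, not just the first one on each line.
    """
    return {
        f"job-{item}.yaml": template.replace(placeholder, str(item))
        for item in items
    }


def write_jobs(rendered, out_dir="jobs"):
    """Write the rendered manifests into out_dir, creating it if needed."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    for name, text in rendered.items():
        (target / name).write_text(text)
    return sorted(rendered)


# Usage for N numbered jobs:
#   template = Path("job.yaml.txt").read_text()
#   write_jobs(render_jobs(template, range(10)))
#   then: kubectl create -f ./jobs
```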
I want to start a process, get its PID, and write it to a PID file. I then want to check that file, get the PID, and check if the process is running with kill 0.
If the process is not running, I want to start it, get its PID, and write it to the PID file. If the process is already running, then I want to ignore it.
How can I start a process so that it keeps running and I can check its status with Perl?
It is traditional on UNIX for a process to manage its own PID file if it is understood that other processes will need its PID as a way to interact with it.
But if you use fork/exec to start the process, the parent receives the PID of the child process upon a successful fork().
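The check-then-start logic itself is short. Here is a sketch of it in Python rather than Perl (Perl's kill 0, $pid performs the same test); it is Unix-only, and the helper names is_running, read_pid and ensure_running are made up for illustration.

```python
import os
import subprocess


def is_running(pid):
    """Probe a PID with signal 0: nothing is delivered, only existence is checked."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if missing or malformed."""
    try:
        with open(pid_file) as handle:
            return int(handle.read().strip())
    except (FileNotFoundError, ValueError):
        return None


def ensure_running(command, pid_file):
    """Start `command` (a list of arguments) unless the recorded PID is alive."""
    pid = read_pid(pid_file)
    if pid is not None and is_running(pid):
        return pid  # already running: leave it alone
    # start_new_session detaches the child from our terminal session.
    child = subprocess.Popen(command, start_new_session=True)
    with open(pid_file, "w") as handle:
        handle.write(str(child.pid))
    return child.pid
```

Two caveats: a PID can be reused by an unrelated process after the original one exits, and a child that has exited but has not been reaped by its parent still passes the signal 0 test. Both are reasons why it is more robust for the service to manage its own PID file.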
If you give us more detail, we can give more precise help.
--------------------- 2014-11-04 -----------------
Your web services 'should' be creating their own PID files (many commercially available server solutions do this already). But you don't say how those services are started, nor what kind of processes they are: apache, iis, node, websphere, etc.
In general, this feels like an XY problem. You tell us you want to do X, but the bigger picture is that you're doing Y, and there's a better way to do Y than just doing X.
So please tell us about the environment and the software.