Running NetLogo headless and repeating the experiment - netlogo

I plan to run my model on a cluster so I can repeat the run (at least 40 times) on multiple cores. I found useful info on Stack Overflow and on the NetLogo website, but I don't see the line of code to specify the number of repetitions. Is it something like --repetition 40?
--model ~/myproject/MyModel.nlogo \
--experiment MyExperiment \
--threads 8 \
-Dnetlogo.extensions.dir="path"
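As far as I know, headless NetLogo has no --repetition flag: the number of repetitions is part of the BehaviorSpace experiment itself, either saved inside the .nlogo file via the BehaviorSpace dialog or supplied as an XML setup file with --setup-file. A minimal sketch of such a file, where the metric and time limit are placeholders rather than anything from the actual model:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE experiments SYSTEM "behaviorspace.dtd">
<experiments>
  <!-- repetitions="40" makes BehaviorSpace run the experiment 40 times -->
  <experiment name="MyExperiment" repetitions="40" runMetricsEveryStep="false">
    <setup>setup</setup>
    <go>go</go>
    <timeLimit steps="1000"/>
    <metric>count turtles</metric>
  </experiment>
</experiments>

It would then be passed along with the other flags, e.g. --setup-file my-experiments.xml.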

Spark (v2) does not generate output if the size is more than 2 GB

My Spark application writes outputs that range from a few KB to several GB. I have been facing a problem generating output in cases where the file size appears to be more than 2 GB: nothing seems to happen and I hardly see any CPU usage. However, when the output size is less than 2 GB, such as 1.3 GB, the same application works flawlessly. Note that writing the output is the last stage, and all the computations on the data to be written complete correctly (as can be seen from the debug output), so the driver holding the data is not the issue. The executor memory is also not the issue, as I increased it to as much as 90 GB, while 30 GB also seems adequate. The following is the code I am using to write the output. Please suggest a way to fix it.
var output = scala.collection.mutable.ListBuffer[String]()
...
output.toDF().coalesce(1).toDF().write.mode("overwrite")
.option("parserLib","univocity").option("ignoreLeadingWhiteSpace","false")
.option("ignoreTrailingWhiteSpace","false").format("csv").save(outputPath)
The other relevant parameters passed to spark-submit are as follows:
--driver-memory 150g \
--executor-cores 4 \
--executor-memory 30g \
--conf spark.cores.max=252 \
--conf spark.local.dir=/tmp \
--conf spark.rpc.message.maxSize=2047 \
--conf spark.driver.maxResultSize=50g \
The issue was observed on two different systems, one a standalone installation and the other a Spark cluster.
Based on Gabio's idea of repartitioning, I solved the problem as follows (note that the coalesce(1) call is gone, so the write keeps the DataFrame's existing partitions):
val tDF = output.toDF()
println("|#tDF partitions = " + tDF.rdd.partitions.size.toString)
tDF.write.mode("overwrite")
.option("parserLib","univocity").option("ignoreLeadingWhiteSpace","false")
.option("ignoreTrailingWhiteSpace","false").format("csv").save(outputPath)
With this change the output ranged between 2.3 GB and 14 GB, so the source of the problem lay elsewhere, and perhaps not in spark.driver.maxResultSize.
A big thank you to @Gabio!
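For completeness, the explicit form of the repartitioning idea would look roughly like the sketch below; the partition count of 32 is an illustrative value, not something from the original post:

val tDF = output.toDF()
// Spread the rows over several partitions instead of collapsing them into one
// with coalesce(1); each partition is written as its own part file.
tDF.repartition(32)
  .write.mode("overwrite")
  .format("csv")
  .save(outputPath)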

How to get Locust to evenly spread distributions among tasks?

I have created a locustfile with 28 tasks.
When I run locust it only runs a subset of the tasks.
Here is the command I am using:
locust -f $locustfile.py --headless -u 5 -r 1 --run-time 15m --logfile locust.log
I am running it for 15 minutes and each task takes just a few seconds to a minute to run.
Upon completion, it reports that tasks ran 187 times, yet only 8 of the 28 tasks were executed.
Of the 8 it did run, each ran anywhere from 17 to 31 times.
I'm decorating all of the tasks with "@task", so they should all be weighted the same.
Can anyone tell me why the task selection is so limited and how to make it spread out more?
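For reference, a minimal locustfile sketch of how weighting works; the user class, endpoints, and the weighted third task are illustrative placeholders, not the original 28 tasks:

from locust import HttpUser, task, between

class MyUser(HttpUser):
    wait_time = between(1, 5)

    @task            # bare @task gives the task a weight of 1
    def task_a(self):
        self.client.get("/a")

    @task            # same weight, so task_a and task_b should be picked about equally often over a long run
    def task_b(self):
        self.client.get("/b")

    @task(3)         # an explicit weight makes this task three times as likely to be picked
    def task_c(self):
        self.client.get("/c")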
It turns out that I had a bug in my error event handling that was causing the failures not to be reported. I fixed this and am now getting all of the test results. It was Solowalker's comment that led me to this discovery.
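The original handler isn't shown, so purely as an illustration of the kind of hook involved, a request-event listener in recent Locust versions looks roughly like this; the print is a placeholder for whatever reporting the real handler does:

from locust import events

@events.request.add_listener
def on_request(request_type, name, response_time, exception, **kwargs):
    # Report failures instead of silently swallowing them
    if exception:
        print(f"FAILED {request_type} {name}: {exception}")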

Spark Streaming does not clean files in /tmp

I have a Spark Streaming job (2.3.1 - standalone cluster):
~50K RPS
10 executors (2-core, 7.5 GB RAM, 10 GB disk GCP nodes)
Data rate is ~20 Mb/sec, and each 1 s batch takes ~0.5 s to process while the job is running.
The problem is that the temporary files that Spark writes to /tmp are never cleaned up, as the JVM on the executor never terminates. Short of some clever batch job to clean the /tmp directory, I am looking for the proper way to keep my job from crashing (and restarting) on "no space left on device" errors.
I have the following SPARK_WORKER_OPTS set as follows:
SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.interval=1200 -Dspark.worker.cleanup.appDataTtl=345600"
I have experimented with both the CMS and G1GC collectors; neither seemed to have an impact other than changing GC times.
I have been through most of the documented settings and searched around, but have not been able to find any additional directions to try. I have to believe that people are running much bigger jobs with long-running, stable streams and that this is a solved problem.
Other config bits:
spark-defaults:
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.broadcast.compress true
spark.streaming.stopGracefullyOnShutdown true
spark.ui.reverseProxy true
spark.cleaner.periodicGC.interval 20
...
spark-submit: (nothing special)
/opt/spark/bin/spark-submit \
--master spark://6.7.8.9:6066 \
--deploy-mode cluster \
--supervise \
/opt/application/switch.jar
As it stands, the job runs for ~90 minutes before the drives fill up and we crash. I could spin the nodes up with larger drives, but 90 minutes should give the cleanup settings I've tried, which run at 20-minute intervals, a chance to kick in. Larger drives would likely just prolong the issue.
Drives filling up turned out to be a red herring. A cached Dataset inside the job was not being released from memory after ds.cache. Eventually it spilled to disk and our no-space-left-on-device condition resulted. In other words, bitten by premature optimization. Removing the explicit call to cache resolved the issue, and performance was not impacted.
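If the cache had actually been needed, the usual alternative to dropping it entirely is to release it explicitly once the cached data has been used; a rough sketch, in which ds, the aggregation, and outputPath are placeholders rather than the job's real code:

val cached = ds.cache()                      // materialize once, reuse below
val counts = cached.groupBy("key").count()   // whatever computation reuses the cached data
counts.write.mode("overwrite").parquet(outputPath)
cached.unpersist()                           // release the cached blocks (memory and spilled disk)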

Spark repartition() function increases the number of tasks per executor; how to increase the number of executors

I'm working on an IBM server with 30 GB RAM (12-core engine). I have provided all the cores to Spark, but it still uses only one core. I tried it while loading the file and succeeded with the command
val name_db_rdd = sc.textFile("input_file.csv",12)
and was able to provide all 12 cores to the processing of the initial jobs, but I want the intermediate operations to be split across executors as well, so that all 12 cores can be used.
[image: Spark UI screenshot showing a single executor handling all the tasks]
val new_rdd = rdd.repartition(12)
As you can see in this image, only one executor is running, and the repartition function split the data into many tasks on that one executor.
It depends on how you're launching the job, but you probably want to add --num-executors to the command line when you launch your Spark job.
Something like
spark-submit \
--num-executors 10 \
--driver-memory 2g \
--executor-memory 2g \
--executor-cores 1 \
might work well for you.
Have a look at Running Spark on YARN for more details, though some of the switches mentioned there are YARN-specific.

MPI+OpenMP job submission script on LSF

I am very new to LSF. I have 4 nodes with 2 sockets per node. Each node has 8 cores. I have developed a hybrid MPI+OpenMP code. I am submitting the job like the following, which asks each core to perform one MPI task, so I lose the power of OpenMP.
##BSUB -n 64
I wish to submit the job so that each socket runs one MPI task rather than each core, so that the cores inside the socket can be used for OpenMP. How can I build up a job submission script to exploit the hybrid parallelism in my code?
First of all, the BSUB sentinels have to be preceded by a single # sign, otherwise they are skipped over as regular comments.
The correct way to start a hybrid job with older LSF versions is to pass the span resource request and request the nodes exclusively. To start a job with 8 MPI processes and 4 OpenMP threads each, you should use the following:
#BSUB -n 8
#BSUB -x
#BSUB -R "span[ptile=2]"
The parameters are as follows:
-n 8 - requests 8 slots for MPI processes
-x - requests nodes exclusively
-R "span[ptile=2]" - instructs LSF to span the job over two slots per node
You should request nodes exclusively, otherwise LSF will schedule other jobs to the same nodes since only two slots per node will be used.
Then you have to set the OMP_NUM_THREADS environment variable to 4 (the number of cores per socket), tell the MPI library to pass the variable to the MPI processes, and make the library limit each MPI process to its own CPU socket. This is unfortunately very implementation-specific, e.g.:
Open MPI 1.6.x or older:
export OMP_NUM_THREADS=4
mpiexec -x OMP_NUM_THREADS --bind-to-socket --bysocket ./program.exe
Open MPI 1.7.x or newer:
export OMP_NUM_THREADS=4
mpiexec -x OMP_NUM_THREADS --bind-to socket --map-by socket ./program.exe
Intel MPI (not sure about this one as I don't use IMPI very often):
mpiexec -genv OMP_NUM_THREADS 4 -genv I_MPI_PIN 1 \
-genv I_MPI_PIN_DOMAIN socket -genv I_MPI_PIN_ORDER scatter \
./program.exe
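Putting the pieces together, a complete submission script for the Open MPI 1.7+ case might look roughly like this; the job name, output file, and program name are placeholders:

#!/bin/bash
#BSUB -J hybrid_mpi_omp        # job name (placeholder)
#BSUB -o hybrid.%J.out         # output file (placeholder)
#BSUB -n 8                     # 8 slots -> 8 MPI processes
#BSUB -x                       # exclusive use of the nodes
#BSUB -R "span[ptile=2]"       # 2 MPI processes per node, i.e. one per socket

export OMP_NUM_THREADS=4       # 4 cores per socket -> 4 OpenMP threads per process
mpiexec -x OMP_NUM_THREADS --bind-to socket --map-by socket ./program.exe

It would be submitted with bsub < jobscript.lsf; without the input redirection the #BSUB lines are ignored.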