When generating a network of 500 nodes, BehaviorSpace goes wrong. How can I solve it? - netlogo

I use the NW extension to generate small-world networks and run an experiment in BehaviorSpace to control the number of nodes. In the experiment, I set ["nb-nodes" 50 100 500] (nb-nodes: the number of nodes).
Everything goes well until nb-nodes = 500: CPU usage gets too high and the program becomes unresponsive. But when I run a single simulation with n = 500 from the interface, it keeps working. Only when I run it in BehaviorSpace does it go wrong.
How can I solve it?
Solution:
Thanks to Jasper and Steve Railsback for their advice, it helped a lot :)
In the FAQ (http://ccl.northwestern.edu/netlogo/docs/faq.html#how-big-can-my-model-be-how-many-turtles-patches-procedures-buttons-and-so-on-can-my-model-contain) it says:
"If you are using BehaviorSpace, note that doing runs in parallel will multiply your RAM usage accordingly"
So I reduced the number of parallel runs from 16 to 8. I know this slows the experiment down, but it also reduces RAM usage, and it works.
Of course, increasing the RAM allocation is another way, and it can accommodate higher computational demands.
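If you run the experiment headless, the same knob is available on the command line; a sketch, where the model path, experiment name, and output file are placeholders:

    netlogo-headless.sh --model mymodel.nlogo --experiment my-experiment --threads 8 --table results.csv

The --threads option caps the number of simultaneous runs, just like the parallel-runs setting in the BehaviorSpace dialog.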

I agree that you should start by increasing NetLogo's memory allocation as directed in the FAQ that Jasper referred you to. (At least on Windows, you must edit the NetLogo.cfg file with administrator privileges.) You can start by doubling or quadrupling the allocation; the only real limitation is how much RAM your machine has.
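Concretely, the setting in question is the JVM's maximum-heap option. In NetLogo.cfg you look for a line like

    -Xmx1024m

and raise the value, for example to

    -Xmx4096m

(the exact default and the surrounding contents of the file vary by NetLogo version, so treat this as an approximation). Remember that with 16 simultaneous BehaviorSpace runs, total usage is roughly 16 times that of a single run, which is why reducing the number of parallel runs also worked.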
We keep notes and a publication on NetLogo performance issues here:
http://www.railsback-grimm-abm-book.com/jasss-models/

Related

Define the minimal configuration for a program

So I have my .exe ready to deploy, and for distribution I need to know the minimal requirements for my program to run on a machine... and I really don't know how to do that.
Is there a way to know that? Some kind of benchmark? Or must I just set things as I think they'll work?
Maybe I should just buy every existing component until I find the minimum? :')
Well, thanks for your answers.
Start by determining the earliest Windows version you can deploy on (Windows XP? Vista?).
If your program is CPU or GPU intensive and has a fixed time loop (e.g. a game), then you'll have to do benchmarks.
Look at several old vs. new CPUs/GPUs and try to "guess", based on specs posted online, what the minimal requirement is. For example, if your program can't run on an old CPU but runs blazingly fast on a new one, try to find the model that barely runs it, which will be one somewhere in the middle.
If your program requires other special things, specify them (e.g. USB 3.0, supported controllers...).
Otherwise, if your program loads slowly but doesn't have runtime issues, the minimum specs should be whatever gives a reasonable loading time (a minute seems to be the standard now, sadly).
Additionally, if your program is memory hungry (hard drive, RAM, or both), you must indicate this.
For hard drive space, simply state your program's size, along with the files included with it.
For RAM, use a profiler - it will tell you how much memory your program is using.
I've completely skipped over the fact that in some computers the bottleneck might be the CPU, while in others it might be the GPU. You need to know which is the bottleneck to make your judgement.
Finding out is a rather simple process: remove expensive GPU operations (lower texture resolutions, turn off shaders). If the program still runs slowly, the bottleneck is the CPU.
edit: this is a simplification of the problem, and hardware is a little more complicated than this (slower multi-core CPUs vs. faster single-core CPUs vary in performance depending on how many cores a program uses and how; a program may require a GPU with less memory but more processing power, or the opposite... even heat dissipation can affect component efficiency: your program might run fine for 20 minutes but start to slow down if the CPU isn't cooled properly), but "minimal hardware requirements" aren't exactly precise, so this method is appropriate.
tl;dr:
In short, so many factors affect performance that you can't measure it exactly, so a rough estimate is good enough.

Growing memory usage in MATLAB

I use MATLAB for programming some meta-heuristics. Recently, I have been working on an algorithm for solving an industrial engineering problem. My problem with MATLAB is getting "out of memory" errors. I'm now trying some suggestions from MathWorks and Stack Overflow (I hope they will work). However, there is one thing I did not understand.
While the algorithm runs in MATLAB (it takes 4000-5000 CPU seconds for a medium-sized problem), I observe that its memory usage grows continuously, even though I preallocate variables and the code neither resizes arrays dynamically nor adds new variables. The main function calls some other functions written by me. What could be the reason for the increase in memory usage?
The computer I use to run the algorithm has 8 GB of memory and 64-bit Windows 8 installed.
The only way to figure this out is to see where the memory is going.
I think you may be accidentally storing results that you don't need, or underestimating the size of your output/intermediate variables.
Here is how I would proceed (a short MATLAB sketch follows these steps):
Turn on dbstop if error
Run the code until you get the out-of-memory error
See how much memory is being used (make sure to check all workspaces)
By now you probably know where the extra memory is going. If you don't find much memory being used, continue with this:
Check the memory command to see how much memory is still available
Carefully look at the line being executed, perhaps you actually need a huge amount of memory for it
If all else fails, share your findings here and others can help you look for it.
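A minimal sketch of those steps, assuming your entry point is a hypothetical function called myAlgorithm:

    % Stop in the debugger the moment any error (including out-of-memory) occurs.
    dbstop if error

    % Run the algorithm; execution pauses at the error with all workspaces intact.
    myAlgorithm        % hypothetical entry point

    % At the K>> debug prompt, inspect what is eating the memory:
    whos               % variables and their sizes in the current workspace
    memory             % how much memory MATLAB and the OS have left (Windows only)

Use dbup and dbdown at the debug prompt to walk through the other workspaces on the call stack.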
The reason for the memory usage growth was CPLEX. I tried many alternatives, but I couldn't find any solution more useful than increasing virtual memory to several hundred GBs. If you don't have a special reason to insist on CPLEX (commercial usage, licensing, etc.), I would suggest that anyone who encounters this problem use Gurobi. It is free and unrestricted for academic use, and fully integrable with MATLAB. That's the solution I found for my problem with CPLEX. I hope it works for everybody.

How to fully use the CPU in Matlab [Improving performance of a repetitive, time-consuming program]

I'm working on an adaptive, fully automatic segmentation algorithm for varying light conditions. The core of this algorithm uses Particle Swarm Optimization (PSO) to tune the fuzzy system, and believe me, it's very time consuming :| For only 5 particles and 100 iterations I have to wait 2 to 3 hours! And that's just to process one image from my data set of over 100 photos!
I'm using MATLAB R2013 with an Intel Core i7-2670QM @ 2.2 GHz, 8.00 GB RAM, and a 64-bit operating system.
The problem is: when the program starts, it uses only 12%-16% of my CPU and only one core is working!
I've searched a lot and came across matlabpool, so I added this line to my code:
matlabpool open 8
Now when I start the program, the task manager shows 98% CPU usage, but only for a few seconds! After that it goes back to 12-13% CPU usage :|
Do you have any idea how I can make this code run faster?!
12 percent sounds like MATLAB is using only one thread/core, and that one at full load, which is normal (one of your i7's eight logical cores is 12.5% of total capacity).
matlabpool open 8 is not enough; this simply opens workers. You have to use commands like parfor to assign work to them.
Further to Daniel's suggestion, ideally to apply PARFOR you'd find a time-consuming FOR loop in your algorithm where the iterations are independent and convert that to PARFOR. Generally, PARFOR works best when applied at the outermost level possible. It's also definitely worth using the MATLAB profiler to help you optimise your serial code before you start adding parallelism.
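A minimal sketch of that conversion, parallelizing over the images in the data set; the folder name and the per-image function processOneImage are hypothetical stand-ins for the poster's own code:

    % Open a pool of workers (matlabpool is the pre-R2013b command;
    % newer releases use parpool instead).
    matlabpool open 4

    imageFiles = dir('dataset/*.jpg');        % hypothetical data set location
    results = cell(1, numel(imageFiles));

    % parfor distributes independent iterations across the workers;
    % no iteration may depend on the result of another.
    parfor k = 1:numel(imageFiles)
        img = imread(fullfile('dataset', imageFiles(k).name));
        results{k} = processOneImage(img);    % hypothetical PSO/fuzzy tuning per image
    end

    matlabpool close

Parallelizing over the images (the outermost independent loop) usually pays off far more than trying to parallelize inside a single PSO run.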
With my own simulations I find that I cannot recode them using parfor; the for loops I have are too intertwined to take advantage of multiple cores.
HOWEVER:
You can open a second (and third, and fourth, etc.) instance of MATLAB and tell each additional instance to run another job. Each open instance of MATLAB will use a different core. So if you have a quad-core, you can have 4 instances open and get 100% utilization by running code in all 4.
So, I gained efficiency by having multiple instances of MATLAB open at the same time, each running a job. My jobs took 8 to 27 hours at a time, and, as one might imagine, without liquid cooling I burnt out my CPU fan and had to replace it.
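On Linux or macOS, launching those extra instances can be scripted from a shell; runJob here is a hypothetical function standing in for your own entry point (on Windows you would use start matlab ... instead):

    matlab -nodesktop -nosplash -r "runJob(1); exit" &
    matlab -nodesktop -nosplash -r "runJob(2); exit" &

Each command starts an independent MATLAB process, and the OS scheduler spreads the processes across the cores.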
Also, do look into optimizing your MATLAB code; I recently optimized mine and now it runs 40% faster.

Is it possible to build a Large Memory System for MATLAB?

I am suffering from the out-of-memory problem in MATLAB.
Is it possible to build a large-memory system for MATLAB (e.g. 64 GB RAM)?
If yes, what do I need?
@Itamar gives good advice about how MATLAB requires contiguous memory to store arrays, and about good practices in memory management such as chunking your data. In particular, the technical note on memory management that he links to is a great resource. However much memory your machine has, these are always sensible things to do.
Nevertheless, there are many applications of MATLAB that will never be solved by these tips, as the datasets are just too large; and it is also clearly true that having a machine with much more RAM can address these issues.
(By the way, it's also sometimes the case that it's cheaper to just buy a new machine with more RAM than it is to pay the MATLAB developer to make all the memory optimizations they could - but that's for you to decide).
It's not difficult to access large amounts of memory with MATLAB. If you have a Windows or Linux machine with 64 GB (or more) - it will obviously need to be running a 64-bit OS - MATLAB will be able to access it. I've come across plenty of MATLAB users who are doing this. If you know what you're doing you can build your own machine, or nowadays you can just buy a machine that size off the shelf from Dell.
Another option (depending on your application) would be to look into getting a small cluster, and using Parallel Computing Toolbox together with MATLAB Distributed Computing Server.
When you try to allocate an array in MATLAB, MATLAB must have a contiguous block of memory the size of the array. If not enough contiguous memory is available, you will get an out-of-memory error, no matter how much RAM you have on your computer.
From my experience, the solution will not come from dealing directly with the memory-related properties of your hardware, but from writing your code in a way that prevents the allocation of overly large arrays (cutting data into chunks, etc.; see the sketch below). If you can describe your code and the task you are trying to solve, it might be possible to guide you in that direction.
You can read more here: http://www.mathworks.com/support/tech-notes/1100/1106.html
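A minimal illustration of the chunking idea; the sizes are made up, and rand stands in for reading a slice of real data from disk or computing it on the fly:

    % Process a large data set in fixed-size chunks instead of holding
    % one huge array in memory at once.
    nTotal    = 1e8;      % total number of samples (illustrative)
    chunkSize = 1e6;      % samples held in memory per iteration
    total     = 0;

    for i0 = 1:chunkSize:nTotal
        i1    = min(i0 + chunkSize - 1, nTotal);
        chunk = rand(1, i1 - i0 + 1);    % stand-in for loading one slice
        total = total + sum(chunk);      % fold the chunk into a running result
    end

Only chunkSize elements are ever alive at once, so the contiguous-block requirement applies to an 8 MB array instead of an 800 MB one.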

Is Scala doing anything in parallel on its own?

I have a little program that creates a maze. It uses lots of collections (the default variant, which is immutable, or at least used as immutable).
The program calculates 30 mazes of increasing dimensions, using a for comprehension over (1 to 30).
Since the parallel collections framework became available in the latest versions, I thought I'd give it a spin, hoping for some performance gain.
This failed and when I investigated a little, I found the following:
When run without any call to anything remotely parallel, it still showed a processor load of about 30% on each of the 4 cores of my machine.
When I replaced the Range 1 to 30 with (1 to 30).par, CPU load went up to about 80% on all cores (which I expected). The order in which the mazes completed became more or less random (which I expected). The total time for all mazes stayed the same.
Replacing some of the internally used collections with their parallel counterparts did seem to have an effect.
I now have 2 questions:
Why do I have all 4 cores spinning, although there isn't anything that runs in parallel?
What might be likely reasons for the program to still take the same time, whether running in parallel or not? There are no obvious bottlenecks other than CPU cycles (no IO, no network, plenty of memory via the -Xmx setting).
Any ideas on this?
The 30%-per-core version is just a poor scheduler (sounds like Windows 7) migrating the process from core to core very frequently. It's probably closer to 25% per core (1/4) for your process, plus miscellaneous other load making up the 30%. If you ran the same example under Linux, you would probably see one core pegged.
When you converted to (1 to 30).par, you started really using threads across all cores but the synchronization overhead of distributing such a small amount of work and then collecting the results cancelled out the parallelism gains. You need to break your work into larger independent chunks.
EDIT: If each of 1..30 represents some larger amount of work (solving a maze, say), then automatic parallelization will work much better if each unit of work is about the same size. Imagine you had 29 easy mazes and one very, very hard maze: the run would still be (very nearly) serial, because the hard maze dominates the total time. If your mazes increase in complexity by number, try spawning them in the order (30 to 1 by -1) so that the biggest tasks go first. Think of it as a braindead solution to the knapsack problem.