I am currently facing a new batch of hyperspectral images in MATLAB. Usually I have no problems, but this time images of around 2000x2000x324 (double) have flooded me with "Out of memory" messages for all the image processing and statistical analyses I was happily doing before.
My PC is relatively good (Intel Core i7 with 15.8 GB of total RAM, though only about 7 GB shows as available), so I am surprised.
I was wondering if you have any suggestions or strategies for dealing with such files, or whether I am missing something here.
It has been suggested that I switch to Python for larger files, but I have been working closely with MATLAB for a while now.
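For scale, here is my back-of-the-envelope arithmetic, plus the kind of chunked, on-disk processing I have been considering (the file name, variable name, and per-band statistic below are just placeholders):

    % One 2000x2000x324 double cube is already larger than the free RAM:
    cubeGB = 2000 * 2000 * 324 * 8 / 1e9    % ~10.4 GB, versus ~7 GB available

    % So any operation that makes even one temporary copy cannot fit in memory.
    % Idea: keep the cube in a v7.3 MAT-file and work band by band.
    m = matfile('cube.mat');                % assumes the cube was saved as variable "cube"
    sz = size(m, 'cube');                   % [2000 2000 324], without loading the data
    nb = sz(3);
    bandMeans = zeros(nb, 1);
    for k = 1:nb
        band = m.cube(:, :, k);             % one 2000x2000 slice, about 32 MB
        bandMeans(k) = mean(band(:));       % placeholder per-band statistic
    end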
Thank you very much for your support.
E
Related
Some users (right now 4) are running the same rather large and heavy MATLAB (R2010b) script on the same Windows server.
There seems to be a rather big drop in MATLAB performance (a factor of 5 in running time in some quick benchmarking) when more users are running the same script on the server (on many different datasets). Depending on the size of the dataset, the running time is between a few hours and 1-2 weeks.
There are plenty of CPU and RAM resources available on the server, so this is not the bottleneck: the server has 64 cores and 128 GB of RAM, while the program uses no more than 10% of the CPU (most of the time less) and about 1 GB of RAM while running.
It does not seem to be a hardware bottleneck, as the server is in general running other programs without any significant slowdown; only MATLAB seems to be slowing down.
Is there some kind of internal resource in MATLAB that is being used up and creating a bottleneck, and if so, is there a way to get around this?
Edit, extra info
When running "bench" while the scripts are running I also get extremely slow speed from this internal machine benchmarking, worse for the heavier tests ... this indicates to me it is not directly related to reading/writing files, it might be indirectly related if matlab writes some temporary files.
I also just tried increasing the Java heap memory to 10 GB. It does improve performance a bit, but there is still a very clear slowdown with each new instance of the script that is started.
Update: upgrading to MATLAB R2015b didn't change much. We have improved the code a lot so it runs much faster now, but the issue still remains, even though the problem is smaller since the script now runs for shorter amounts of time for each user.
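One thing I still plan to rule out (just a guess on my part): whether implicit multithreading makes each MATLAB instance try to grab all 64 cores, so the instances end up fighting over the same resources. Something like:

    % Cap the computational threads used by this instance, e.g. to 4:
    maxNumCompThreads(4);

    % Or start each user's MATLAB session with a single computational thread:
    %   matlab -singleCompThread -r "myScript"
    % ("myScript" is a placeholder for the actual script name.)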
I'm not sure what is going on. I am running my neural network simulations on my laptop, which has MATLAB R2013a on it.
The code runs fast on my desktop (R2012a, though), but very, very slowly on the laptop. I profiled it because this seems abnormal; here are the screenshots I took of the functions spending the most time:
This is located in the codeHints.m file, so it isn't something I wrote. Is there any way I can disable this? I googled it, but maybe I am not searching for the right things; I couldn't find anything. I can't get any work done because it is so slow :(
Would appreciate some advice!
Update: I have also attempted to run it on my desktop at work (same MATLAB version as laptop, also 8GB of RAM), and I get the same issue. I checked the resource monitor and it seems like the process is triggering a lot of memory faults (~40/sec), even though not even half of my RAM is being used.
I typed in "memory" in MATLAB and got the following information:
Maximum possible array: 11980 MB (1.256e+10 bytes) *
Memory available for all arrays: 11980 MB (1.256e+10 bytes) *
Memory used by MATLAB: 844 MB (8.849e+08 bytes)
Physical Memory (RAM): 8098 MB (8.491e+09 bytes)
So it seems like there should be sufficient room. I will try to put together a sample file.
Update #2: I ran my code on 2012a on the work computer with the following "memory" info:
Maximum possible array: 10872 MB (1.140e+10 bytes) *
Memory available for all arrays: 10872 MB (1.140e+10 bytes) *
Memory used by MATLAB: 846 MB (8.874e+08 bytes)
Physical Memory (RAM): 8098 MB (8.491e+09 bytes)
The run with more iterations than above (15000 as opposed to 10000) completed much faster and there are no extraneous calls for memory allocation:
So it seems to me that it is an issue exclusively with 2013a. For now I will use 2012a (because I need this finished), but if anyone has ideas on what to do with 2013a to stop those calls to codeHints, I would appreciate it.
Though this would scream memory problems at first sight, it seems like your tests have made a lack of memory improbable. In this case, the only reasonable explanation I can think of is that the computer is actually trying to do two different things, thus taking more time.
Some possibilities:
Actually not using exactly the same inputs
Actually not using exactly the same functions
The first point can be detected by putting some breakpoints in the code while running it on the two computers and verifying that the inputs are exactly the same. (Consider using visdiff if you have a lot of variables.)
The second one could almost only be caused by an overloaded zeros function. Make sure to stop at this line and see which function is actually being called.
If neither of these points solves the problem, try reducing the code as much as possible until you have only one or a few lines that create the difference. If it turns out that the difference comes from just that one line, try calling zeros with the right size input on both computers and time the result with the timeit File Exchange submission.
If you find that you are using the builtin function on both computers, with plenty of memory, and there is still a huge performance difference, it is probably time to contact MathWorks support and hear what they have to say about it.
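As a concrete illustration of both checks (a sketch; the array size below is just an example):

    % Is zeros shadowed by a user function on either machine?
    which zeros -all        % the builtin should be the first entry on both computers

    % Time the suspect allocation in isolation (timeit was a File Exchange
    % submission for these releases; it became a builtin in R2013b):
    f = @() zeros(5000);    % example size -- use the size from your own code
    t = timeit(f)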
I need to spec out a new computer. I'm only going to use this computer for developing Scala software. I'm going to be running IntelliJ, doing builds with Maven and SBT, and perhaps firing up a couple of virtual machines. I'm going to be building a mixture of fairly large Play Framework apps and micro-services. What is a reasonable machine for doing this work?
The Scala compiler still has poor parallelisation. I doubt that's going to change before you'll be due an upgrade. For this reason I would suggest, as a minimum, a Haswell 4670. Going up to an i7 will probably be of doubtful benefit. If you want to spend extra money, overclock a 4770K or a 4670K. If you've really got money to burn, use an Ivy Bridge 4960X, but you won't see much benefit for that extra money. Intel beats AMD on core-for-core performance. Make sure you've got a motherboard with 4 memory slots. Two 8 GB DDR3-1600 sticks are probably more than sufficient, but allow for an upgrade to 32 GB in a year or so's time, when hopefully memory has come down in price.
As already stated, get a decent SSD. Run your operating system, your IDE and your projects off the SSD. You'll also want a SATA drive for mass storage.
Anyway, above $1500 or so for the base unit, diminishing returns set in rapidly, unless you've really got money to burn.
You'll probably want a graphics card to run multiple monitors; an AMD 7790 should do the job. I'm assuming that a budget of 1000 to 1500 dollars for a base unit is not an issue. Personally I find three 24" 1920x1200 monitors just right for civilised development.
You should turn your focus to these key components:
CPU
Since Scala (and the compiler) parallelizes well, go for more cores; the more the better. Depending on your budget, you could consider multi-CPU systems.
RAM
A lot of RAM helps a lot. I am pretty happy with 16 GB; depending on the size of the projects you are planning you might need more, but 16 GB is a decent amount. You can also think about using RAM for a RAM disk for faster compilation, etc.
HDD
You definitely want a fast SSD. Look for one with high IOPS; the transfer rate is not that important for development. If you have a large budget, you can go for 2 SSDs in RAID 0, but be aware that some RAID controllers are not fast enough and will not give you the full performance of your SSD RAID.
I have been trying to load a data file (CSV) into 64-bit MATLAB running on Windows 7 (64-bit), but I get memory-related errors. The file is about 3 GB, containing dates (dd/mm/yyyy hh:mm:ss) in the first column and bid and ask prices in the other two columns. The memory command returns the following:
Maximum possible array: 19629 MB (2.058e+010 bytes) *
Memory available for all arrays: 19629 MB (2.058e+010 bytes) *
Memory used by MATLAB: 522 MB (5.475e+008 bytes)
Physical Memory (RAM): 16367 MB (1.716e+010 bytes)
* Limited by System Memory (physical + swap file) available.
Can somebody here please explain: if the maximum possible array size is 19.6 GB, then why would MATLAB throw a memory error while importing a data array that is just about 3 GB? Apologies if this is a simple question to the experienced, as I have little experience in process/app memory management.
I would also greatly appreciate it if someone could suggest a way to load this dataset into the MATLAB workspace.
Thank you.
I am no expert in memory management, but from experience I can tell you that you will run into all kinds of problems if you're importing/exporting 3 GB text files. While a text file of that size is being parsed, it typically needs several times as much memory as it occupies on disk (for example, the dates end up as a large cell array of strings, plus intermediate copies), before the final numeric arrays even exist.
I would either use an external tool to split your data before you read it, or look into storing the data in another format that is better suited to large datasets. Personally, I have used HDF5 in the past; it is designed for large sets of data and is also supported by MATLAB.
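If you do stay with the CSV, here is a sketch of reading it in blocks rather than all at once (the file name and block size are placeholders; the date and time are read as two strings because of the space inside the first column):

    fid = fopen('ticks.csv', 'r');                 % placeholder file name
    blockSize = 1e6;                               % rows per chunk -- tune to taste
    fmt = '%s %s %f %f';                           % date, time, bid, ask
    while ~feof(fid)
        C = textscan(fid, fmt, blockSize, 'Delimiter', ',');
        t   = datenum(strcat(C{1}, {' '}, C{2}), 'dd/mm/yyyy HH:MM:SS');
        bid = C{3};
        ask = C{4};
        % ... process or summarize this chunk, or append it to a MAT-file ...
    end
    fclose(fid);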
In the meantime, these links may help:
Working with a big CSV file in MATLAB
Handling Large Data Sets Efficiently in MATLAB
I've posted before showing how to use memmapfile() to read huge text files in matlab. This technique may help you as well.
I am suffering from the out of memory problem in MATLAB.
Is it possible to build a large-memory system for MATLAB (e.g. 64 GB of RAM)?
If yes, what do I need?
@Itamar gives good advice about how MATLAB requires contiguous memory to store arrays, and about good practices in memory management such as chunking your data. In particular, the technical note on memory management that he links to is a great resource. However much memory your machine has, these are always sensible things to do.
Nevertheless, there are many applications of MATLAB that will never be solved by these tips, as the datasets are just too large; and it is also clearly true that having a machine with much more RAM can address these issues.
(By the way, it's also sometimes the case that it's cheaper to just buy a new machine with more RAM than it is to pay the MATLAB developer to make all the memory optimizations they could - but that's for you to decide).
It's not difficult to access large amounts of memory with MATLAB. If you have a Windows or Linux machine with 64 GB or more (it will obviously need to be running a 64-bit OS), MATLAB will be able to access it. I've come across plenty of MATLAB users who are doing this. If you know what you're doing you can build your own machine, or nowadays you can just buy a machine of that size off the shelf from Dell.
Another option (depending on your application) would be to look into getting a small cluster, and using Parallel Computing Toolbox together with MATLAB Distributed Computing Server.
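For a rough idea of what that looks like in code, a sketch (the cluster profile name and array size are placeholders, and older releases use matlabpool instead of parpool):

    % Spread one large array across the memory of several workers
    % (requires Parallel Computing Toolbox, plus MDCS for a real cluster).
    parpool('myClusterProfile', 8);   % placeholder cluster profile and pool size
    D = distributed.rand(50000);      % ~20 GB of doubles, split across the workers
    s = sum(sum(D));                  % reductions run where the data lives
    total = gather(s);                % bring the scalar result back to the client
    delete(gcp);                      % shut the pool down when finished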
When you try to allocate an array in MATLAB, MATLAB must have enough contiguous memory for the size of the array; if not enough contiguous memory is available, you will get an out of memory error, no matter how much RAM you have on your computer.
From my experience, the solution will not come from dealing directly with memory-related properties of your hardware, but from writing your code in a way that prevents allocation of overly large arrays (cutting the data into chunks, etc.). If you can describe your code and the task you are trying to solve, it might be possible to guide you in that direction.
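Two quick checks along those lines (a sketch; the array size is just an example):

    % On Windows, "memory" reports the largest contiguous block available:
    u = memory;
    fprintf('Largest possible array: %.1f GB\n', u.MaxPossibleArrayBytes / 2^30);

    % Halving the footprint is often the cheapest fix: single uses 4 bytes
    % per element instead of 8, if your analysis tolerates the lower precision.
    x = zeros(2000, 2000, 'single');   % example size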
You can read more here: http://www.mathworks.com/support/tech-notes/1100/1106.html