Matlab: Is it possible to save in the workspace a vector that contains 4 millions of values? - matlab

I am calculating something and, as result I have a vector 4 millions elements. I am not still finished to calculate it. I´ve calculate it will take 2 and a half hours more. When I will have finished I could save it?
It is not possible, what can I do?
Thank you.

On Windows 32-bit you can have at maximum a double array of 155-200 millions of elements. Check other OSs on Mathworks support page.

Yes, just use the command save. If you just need it for later Matlab computations, then it is best to save it in .mat format.
save('SavedFile.mat','largeVector')
You can then load your file whenever you need it using the load function.
load('SavedFile.mat')

On my machine it takes 0.01 sec to get a random vector with 4 million elements, with whos you can see that it takes (only) 32 MB.
It would take only few seconds to save such amount of data with MATLAB. If you are working with post-R2007b then maybe it is better to save with '-v7.3' option, newer MATLAB versions use by default general HDF5 format but there could be some performance/disc usage issues.

Related

How to write "Big Data" to a text file using Matlab

I am getting some readings off an accelerometer connected to an Arduino which is in turn connected to MATLAB through serial communication. I would like to write the readings into a text file. A 10 second reading will write around 1000 entries that make the text file size around 1 kbyte.
I will be using the following code:
%%%%%// Communication %%%%%
arduino=serial('COM6','BaudRate',9600);
fopen(arduino);
fileID = fopen('Readings.txt','w');
%%%%%// Reading from Serial %%%%%
for i=1:Samples
scan = fscanf(arduino,'%f');
if isfloat(scan),
vib = [vib;scan];
fprintf(fileID,'%0.3f\r\n',scan);
end
end
Any suggestions on improving this code ? Will this have a time or Size limit? This code is to be run for 3 days.
Do not use text files, use binary files. 42718123229.123123 is 18 bytes in ASCII, 4 bytes in a binary file. Don't waste space unnecessarily. If your data is going to be used later in MATLAB, then I just suggest you save in .mat files
Do not use a single file! Choose a reasonable file size (e.g. 100Mb) and make sure that when you get to that many amount of data you switch to another file. You could do this by e.g. saving a file per hour. This way you minimize the possible errors that may happen if the software crashes 2 minutes before finishing.
Now knowing the real dimensions of your problem, writing a text file is totally fine, nothing special is required to process such small data. But there is a problem with your code. You are writing a variable vid which increases over time. That may cause bad performance because you are not using preallocation and it may consume a lot of memory. I strongly recommend not to keep this variable, and if you need the dater read it afterwards.
Another thing you should consider is verification of your data. What do you do when you receive less samples than you expect? Include timestamps! Be aware that these timestamps are not precise because you add them afterwards, but it allows you to identify if just some random samples are missing (may be interpolated afterwards) or some consecutive series of maybe 100 samples is missing.

Matlab .mat file saving

I have identical code in Matlab, identical data that was analyzed using two different computers. Both are Win 7 64 bit. Both Matlabs are 2014-a version. After the code finishes its run, I save the variables using save command and it outputs .mat file.
Is it possible to have two very different memory sizes for these files? Like one being 170 MB, and the other being 2.4 GB? This is absurd because when I check the variables in matlab they add up to maybe 1.5 GB at most. What can be the reason for this?
Does saving to .mat file compress the variables (still with the regular .mat extension)? I think it does because when I check the individual variables they add up to around 1.5 GB.
So why would one output smaller file size, but the other just so huge?
Mat in recent versions is HDF5, which includes gzip compression. Probably on one pc the default mat format is changed to an old version which does not support compression. Try saving specifying the version, then both PCs should result in the same size.
I found the reason for this based on the following stackoverflow thread: MATLAB: Differences between .mat versions
Apparently one of the computers was using -v7 format which produces much smaller files. - v7.3 just inflates the files significantly. But this is ironical in my opinion since -v7.3 enables saving files larger than 2 GB, which means they will be much much larger when saved in .mat file.
Anyway this link is very useful.
Update:
I implemented the serialization mentioned in the above link, and it increased the file size. In my case the best option will be using -v7 format since it provides the smallest file size, and is also able to save structures and cell arrays that I use a lot.

Load netcdf subset in Matlab

G'day,
I have ocean model output in the form of netCDF files. The netCDF files are approximately 21GB, and the variables that I want to load are also pretty big (~ 120 * 31 * 300 * 400 sized matrices).
I want to load some of these variables from a netCDF file into MATLAB. Usually, I would do this via:
ncload('filename.nc',var1)
Which would load the variables var1 into similarly named MATLAB variables. However, since I only need a single column of var1, I only want to load a subset of var1 - This should speed up the loading process. For example, say,
size(var1)
>> var1 120x31x260x381
I only want the 31st column, and loading the other 30 columns, and discarding the information seems like a waste of time. In other words, this is what I want to accomplish: ncload('filename.nc',var1(:,31,:,:)).
I know there are a few different netCDF toolboxes floating around, and I have heard that one can use a stride flag to only load every xth entry... but I'm not sure if it's possible to do what I want. Does anyone know of a way to do this?
Cheers
If you have a current version of MATLAB, look for NCREAD and the example therein.

File access speed

I have got a theoretical question:
I think to use files that store an object's position.(each line contains coordinates x and y)
The data is going to be read from the file 3 times per second.
Is the delay 0.3 s of reading a coordinates from the file is too small? Will my program get necessary information in time?
Thanks.
540 objects in array is not too much, if it is just texts / numbers. Just do the read-write job in memory. You can write the array to file after the 3-minute.
Technically, I'd imagine that you could easily read this amount of data from a file at 3 times per second - but this seems like an odd design approach? Perhaps you can expand on what you're trying to achieve for some different ideas?

Using HDD memory for the MATLAB

As in my previous question I have the following problem. I have a matrix P nxn which elements are matrices P{i,j} which are also nxn. So the total amount of elements is n^4. For n=100 there is an error about the lack of memory. I calculate this matrix only one time and then operate with it. Could you advise me, how to store matrices P{i,j} on the HDD?
I mean that maybe it is possible to store each of them in a file like "data_i_j.dat" and then load it while doing computations in a loop for i and j?
The save function will write data to a file, and the load function will read it back again. save(filename,varname,varname,varname...), followed by S = load(filename) and referring to S.varname (there's also a version of load that just dumps stuff into your current workspace, but that seems like poor practice).