Load netcdf subset in Matlab - matlab

G'day,
I have ocean model output in the form of netCDF files. The netCDF files are approximately 21GB, and the variables that I want to load are also pretty big (~ 120 * 31 * 300 * 400 sized matrices).
I want to load some of these variables from a netCDF file into MATLAB. Usually, I would do this via:
ncload('filename.nc',var1)
Which would load the variables var1 into similarly named MATLAB variables. However, since I only need a single column of var1, I only want to load a subset of var1 - This should speed up the loading process. For example, say,
size(var1)
>> var1 120x31x260x381
I only want the 31st column, and loading the other 30 columns, and discarding the information seems like a waste of time. In other words, this is what I want to accomplish: ncload('filename.nc',var1(:,31,:,:)).
I know there are a few different netCDF toolboxes floating around, and I have heard that one can use a stride flag to only load every xth entry... but I'm not sure if it's possible to do what I want. Does anyone know of a way to do this?
Cheers

If you have a current version of MATLAB, look for NCREAD and the example therein.

Related

How to write "Big Data" to a text file using Matlab

I am getting some readings off an accelerometer connected to an Arduino which is in turn connected to MATLAB through serial communication. I would like to write the readings into a text file. A 10 second reading will write around 1000 entries that make the text file size around 1 kbyte.
I will be using the following code:
%%%%%// Communication %%%%%
arduino=serial('COM6','BaudRate',9600);
fopen(arduino);
fileID = fopen('Readings.txt','w');
%%%%%// Reading from Serial %%%%%
for i=1:Samples
scan = fscanf(arduino,'%f');
if isfloat(scan),
vib = [vib;scan];
fprintf(fileID,'%0.3f\r\n',scan);
end
end
Any suggestions on improving this code ? Will this have a time or Size limit? This code is to be run for 3 days.
Do not use text files, use binary files. 42718123229.123123 is 18 bytes in ASCII, 4 bytes in a binary file. Don't waste space unnecessarily. If your data is going to be used later in MATLAB, then I just suggest you save in .mat files
Do not use a single file! Choose a reasonable file size (e.g. 100Mb) and make sure that when you get to that many amount of data you switch to another file. You could do this by e.g. saving a file per hour. This way you minimize the possible errors that may happen if the software crashes 2 minutes before finishing.
Now knowing the real dimensions of your problem, writing a text file is totally fine, nothing special is required to process such small data. But there is a problem with your code. You are writing a variable vid which increases over time. That may cause bad performance because you are not using preallocation and it may consume a lot of memory. I strongly recommend not to keep this variable, and if you need the dater read it afterwards.
Another thing you should consider is verification of your data. What do you do when you receive less samples than you expect? Include timestamps! Be aware that these timestamps are not precise because you add them afterwards, but it allows you to identify if just some random samples are missing (may be interpolated afterwards) or some consecutive series of maybe 100 samples is missing.

Need a method to store a lot of data in Matlab

I've asked this before, but I feel I wasn't clear enough so I'll try again.
I am running a network simulation, and I have several hundreds output files. Each file holds the simulation's test result for different parameters.
There are 5 different parameters and 16 different tests for each simulation. I need a method to store all this information (and again, there's a lot of it) in Matlab with the purpose of plotting graphs using a script. suppose the script input is parameter_1 and test_2, so I get a graph where parameter_1 is the X axis and test_2 is the Y axis.
My problem is that I'm not quite familier to Matlab, and I need to be directed so it doesn't take me forever (I'm short on time).
How do I store this information in Matlab? I was thinking of two options:
Each output file is imported separately to a different variable (matrix)
All output files are merged to one output file and imprted together. In the resulted matrix each line is a different output file, and each column is a different test. Problem is, I don't know how to store the simulation parameters
Edit: maybe I can use a dataset?
So, I would appreciate any suggestion of how to store the information, and what functions might help me fetch the only the data I need.
If you're still looking to give matlab a try with this problem, you can iterate through all the files and import them one by one. You can create a list of the contents of a folder with the function
ls(name)
and you can import data like this:
A = importdata(filename)
if your data is in txt files, you should consider this Prev Q
A good strategy to avoid cluttering your workspace is to import them all into a single matrix. SO if you have a matrix called VAR, then VAR{1,1}.{1,1} could be where you put your test results and VAR{1,1}.{2,1} could be where you put your simulation parameters of the first file. I think that is simpler than making a data structure. Just make sure you uniformly place the information in the same indexes of the arrays. You could also organize your VAR row v col by parameter vs test.
This is more along the lines of your first suggestion
Each output file is imported separately to a different variable
(matrix)
Your second suggestion seems unnecessary since you can just iterate through your files.
You can use the command save to store your data.
It is very convenient, and can store as much data as your hard disk can bear.
The documentation is there:
http://www.mathworks.fr/help/techdoc/ref/save.html
Describe the format of text files. Because if it has a systematic format then you can use dlmread or similar commands in matlab and read the text file in a matrix. From there, you can plot easily. If you try to do it in excel, it will be much slower than reading from a text file. If speed is an issue for you, I suggest that you don't go for Excel.

File access speed

I have got a theoretical question:
I think to use files that store an object's position.(each line contains coordinates x and y)
The data is going to be read from the file 3 times per second.
Is the delay 0.3 s of reading a coordinates from the file is too small? Will my program get necessary information in time?
Thanks.
540 objects in array is not too much, if it is just texts / numbers. Just do the read-write job in memory. You can write the array to file after the 3-minute.
Technically, I'd imagine that you could easily read this amount of data from a file at 3 times per second - but this seems like an odd design approach? Perhaps you can expand on what you're trying to achieve for some different ideas?

Matlab: Is it possible to save in the workspace a vector that contains 4 millions of values?

I am calculating something and, as result I have a vector 4 millions elements. I am not still finished to calculate it. I´ve calculate it will take 2 and a half hours more. When I will have finished I could save it?
It is not possible, what can I do?
Thank you.
On Windows 32-bit you can have at maximum a double array of 155-200 millions of elements. Check other OSs on Mathworks support page.
Yes, just use the command save. If you just need it for later Matlab computations, then it is best to save it in .mat format.
save('SavedFile.mat','largeVector')
You can then load your file whenever you need it using the load function.
load('SavedFile.mat')
On my machine it takes 0.01 sec to get a random vector with 4 million elements, with whos you can see that it takes (only) 32 MB.
It would take only few seconds to save such amount of data with MATLAB. If you are working with post-R2007b then maybe it is better to save with '-v7.3' option, newer MATLAB versions use by default general HDF5 format but there could be some performance/disc usage issues.

Using HDD memory for the MATLAB

As in my previous question I have the following problem. I have a matrix P nxn which elements are matrices P{i,j} which are also nxn. So the total amount of elements is n^4. For n=100 there is an error about the lack of memory. I calculate this matrix only one time and then operate with it. Could you advise me, how to store matrices P{i,j} on the HDD?
I mean that maybe it is possible to store each of them in a file like "data_i_j.dat" and then load it while doing computations in a loop for i and j?
The save function will write data to a file, and the load function will read it back again. save(filename,varname,varname,varname...), followed by S = load(filename) and referring to S.varname (there's also a version of load that just dumps stuff into your current workspace, but that seems like poor practice).