Fetching blocks of data from a huge file in Matlab

There is a huge file mask.txt containing floating point numbers arranged in column format (a single column of approximately 2 million numbers). I want to extract the data in blocks of 512*512. How do I fetch the next block of data? I have done the following, but it is erroneous:
rawData=dlmread('mask.txt');
a1=reshape(rawData(1:262144),512,512);
a2=reshape(rawData(262145:524289),512,512);
What should I do? Please help me resolve the problem. Thank you.

Your method is correct; it's simply your numbers that are wrong. You made the classic off-by-one mistake of not counting the first number. The index range should be [n : n+512^2-1], not [n : n+512^2] as you wrote. So to fix it, just do
a2=reshape(rawData(262145:524288),512,512);
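If you need every block rather than hand-numbered variables a1, a2, ..., a loop avoids the off-by-one entirely. A minimal sketch, assuming rawData contains a whole number of 512*512 blocks:
rawData = dlmread('mask.txt');
blockSize = 512^2;                          % 262144 numbers per block
nBlocks = floor(numel(rawData)/blockSize);  % ignore any trailing partial block
blocks = cell(nBlocks,1);
for k = 1:nBlocks
    idx = (k-1)*blockSize+1 : k*blockSize;  % [n : n+512^2-1]
    blocks{k} = reshape(rawData(idx),512,512);
end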

Related

kdb - issue with reading floating numbers

I am reading covariance data from flat files. Not being able to fully read the floating-point numbers results in the covariance matrix not satisfying the positive semi-definite requirement.
For instance, this is one of the inputs from the raw text: -5.801050672E-01
When I read this into kdb and cast it with F, it results in -0.50810507. When I do this for all entries and check the covariance, it unfortunately does not satisfy the PSD constraints. The other hack I have been using is to add small noise on the identity matrix…
Apart from this hack, is there a way to read the above data as a proper floating-point number up to the 9th digit? I tried \P and .Q.f, but these only seem to affect the display.
Thank you
Sorry, this does not seem to be a kdb issue after all. I was exporting the data into different software, and the floating-point precision was lost during that process. Thanks for the pointer.

Big numbers and long loops in matlab?

How can I store a matrix with 2^100 rows in MatLab? It is my search space and I really need to do it.
In your opinion, is it possible? If yes, please help me with how I can do it.
2^100 is about 10^30, which is much too large to fit in memory (even at one byte per row, that is on the order of 10^30 bytes), so you won't be able to store this matrix.
A couple of alternatives that you might want to think about -
Are many of the entries in the matrix zero? If so, you could consider using a sparse matrix which is much more memory efficient.
Do you need to be able to access the rows in an arbitrary order, or sequentially? If sequentially, you can generate the rows on an as-needed basis, perhaps in blocks of ten thousand at a time (see the sketch after this list).
Do you need to look at all the rows at all? If not, perhaps you can define a function which generates the entries on the fly, as they are requested.
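As a minimal sketch of the second alternative (generateRows and scoreRows are hypothetical placeholders for your own generator and objective):
chunkSize = 10000;                       % rows to generate per pass
nChunks   = 1000;                        % however much of the space you can realistically visit
bestScore = -Inf;
for c = 1:nChunks
    rows   = generateRows(c, chunkSize); % hypothetical: build chunk c of the search space
    scores = scoreRows(rows);            % hypothetical: evaluate each row
    bestScore = max(bestScore, max(scores));
end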

Need a method to store a lot of data in Matlab

I've asked this before, but I feel I wasn't clear enough so I'll try again.
I am running a network simulation, and I have several hundred output files. Each file holds the simulation's test results for different parameters.
There are 5 different parameters and 16 different tests for each simulation. I need a method to store all this information (and again, there's a lot of it) in Matlab so that I can plot graphs using a script. Suppose the script input is parameter_1 and test_2; then I get a graph where parameter_1 is the X axis and test_2 is the Y axis.
My problem is that I'm not very familiar with Matlab, and I need some direction so it doesn't take me forever (I'm short on time).
How do I store this information in Matlab? I was thinking of two options:
Each output file is imported separately to a different variable (matrix)
All output files are merged into one output file and imported together. In the resulting matrix, each line is a different output file and each column is a different test. The problem is, I don't know how to store the simulation parameters.
Edit: maybe I can use a dataset?
So, I would appreciate any suggestion on how to store the information, and what functions might help me fetch only the data I need.
If you're still looking to give matlab a try with this problem, you can iterate through all the files and import them one by one. You can create a list of the contents of a folder with the function
ls(name)
and you can import data like this:
A = importdata(filename)
If your data is in txt files, you should consider this previous question.
A good strategy to avoid cluttering your workspace is to import them all into a single cell array. So if you have a cell array called VAR, then VAR{1,1}.results could be where you put your test results and VAR{1,1}.params could be where you put the simulation parameters of the first file (the field names are up to you). I think that is simpler than making a separate data structure. Just make sure you place the information uniformly in the same fields and indexes. You could also organize VAR row vs. column by parameter vs. test.
This is more along the lines of your first suggestion
Each output file is imported separately to a different variable (matrix)
Your second suggestion seems unnecessary since you can just iterate through your files.
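A minimal sketch of that iteration, assuming the outputs are plain-text files in a hypothetical folder results/ and using the field names suggested above:
files = dir(fullfile('results','*.txt'));        % list the output files
VAR = cell(numel(files),1);
for k = 1:numel(files)
    data = importdata(fullfile('results',files(k).name));
    VAR{k}.results = data;                       % test results of file k
    VAR{k}.params  = files(k).name;              % e.g. parameters encoded in the filename
end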
You can use the command save to store your data.
It is very convenient, and can store as much data as your hard disk can hold.
The documentation is here:
http://www.mathworks.fr/help/techdoc/ref/save.html
Describe the format of your text files, because if they have a systematic format then you can use dlmread or similar commands in Matlab to read a text file into a matrix. From there, you can plot easily. If you try to do it in Excel, it will be much slower than reading from a text file; if speed is an issue for you, I suggest you don't go with Excel.
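For instance, if each merged file were a plain numeric table, a minimal sketch (the filename and column choices are hypothetical):
M = dlmread('simulation_output.txt');    % numeric matrix, one row per simulation run
plot(M(:,1), M(:,2));                    % e.g. parameter_1 on X, test_2 on Y
xlabel('parameter_1'); ylabel('test_2');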

File access speed

I have a theoretical question:
I am thinking of using files that store an object's position (each line contains the coordinates x and y).
The data is going to be read from the file 3 times per second.
Is 0.3 s too little time to read the coordinates from the file? Will my program get the necessary information in time?
Thanks.
540 objects in an array is not too much if it is just text/numbers. Just do the read-write job in memory; you can write the array to a file after the 3 minutes.
Technically, I'd imagine that you could easily read this amount of data from a file 3 times per second, but this seems like an odd design approach. Perhaps you can expand on what you're trying to achieve, so we can suggest some different ideas?
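If in doubt, it is easy to measure on your own machine; a minimal sketch in Matlab (the filename is hypothetical):
tic;
coords = dlmread('positions.txt');   % two columns: x and y
t = toc;
fprintf('Read %d coordinate pairs in %.4f s\n', size(coords,1), t);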

Matlab: Is it possible to save in the workspace a vector that contains 4 million values?

I am calculating something, and as a result I will have a vector of 4 million elements. I have not finished calculating it yet; I estimate it will take 2 and a half hours more. When I have finished, will I be able to save it?
If it is not possible, what can I do?
Thank you.
On 32-bit Windows you can have at most a double array of 155-200 million elements. Check the limits for other OSs on the MathWorks support page.
Yes, just use the command save. If you just need it for later Matlab computations, then it is best to save it in .mat format.
save('SavedFile.mat','largeVector')
You can then load your file whenever you need it using the load function.
load('SavedFile.mat')
On my machine it takes 0.01 s to create a random vector with 4 million elements, and with whos you can see that it takes (only) 32 MB.
It would take only a few seconds to save that amount of data with MATLAB. If you are working with a post-R2007b version, it may be better to save with the '-v7.3' option, which uses the general HDF5 format, though there could be some performance/disk-usage issues.
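A minimal sketch of that option, reusing the variable name from the earlier answer:
save('SavedFile.mat','largeVector','-v7.3')   % HDF5-based MAT-file, suited to large variables
load('SavedFile.mat')                         % reads it back as before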