I'm playing with large pointcloud data in Octave (different files ranging from [10^5 to 10^7, 4] elements) and I'm looking for ways to optimize the code.
Right now I am trying to save the data into a .mat file as I've read somewhere (confirmation needed) that loading from a .mat file is much faster than loading the actual data.txt file every time.
save -ascii myfile data works fine, since I only want to store numerical values, but
load('myfile.mat') brings up a 1x1 matrix containing all the values instead of an nx4 matrix, which is strange because when I use load('data.txt') I get a full nx4 matrix.
The problem seems to be with the save syntax. Any way I can save the file so I can load it with its original dimensions? Or do I have to manipulate the resulting 1x1 variable somehow?
Bonus question:
Browsing through some answers I kinda got the feeling that working with the transpose matrix instead of the nx4 would improve runtime considerably. Is that true?
Use a binary format if speed matters. Below is a small speed comparison:
a = rand (1e6, 4);
fn = tempname ();
tic; save ("-ascii", fn, "a"); toc;
tic; load ("-ascii", fn); toc;
stat (fn).size
tic; save ("-v7", fn, "a"); toc;
tic; load ("-v7", fn); toc;
stat (fn).size
tic; save ("-v6", fn, "a"); toc;
tic; load ("-v6", fn); toc;
stat (fn).size
tic; save ("-binary", fn, "a"); toc;
tic; load ("-binary", fn); toc;
stat (fn).size
which gives
Elapsed time is 2.82237 seconds.
Elapsed time is 6.28686 seconds.
ans = 61000000
Elapsed time is 1.54074 seconds.
Elapsed time is 0.252718 seconds.
ans = 30192558
Elapsed time is 0.030833 seconds.
Elapsed time is 0.047183 seconds.
ans = 32000184
Elapsed time is 0.116342 seconds.
Elapsed time is 0.0523431 seconds.
ans = 32000045
As you can see, -v6 is much faster than -ascii, for both saving and loading.
EDIT: Also keep in mind that "-ascii" only writes values with single-precision accuracy, so doubles do not round-trip exactly.
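For the original question (getting the nx4 shape back after loading), here is a minimal sketch, assuming a binary format as suggested above. The variable name data, the 1000x4 size, and the temporary file name are stand-ins for illustration only.

```octave
data = rand (1000, 4);            % stand-in for the nx4 point cloud
fn = [tempname() '.mat'];         % hypothetical scratch file name
save ('-v6', fn, 'data');         % binary format stores name, type and shape
loaded = load (fn);               % struct with one field per saved variable
disp (size (loaded.data));        % same dimensions as the saved matrix
delete (fn);                      % clean up the scratch file
```

Assigning the output of load to a struct keeps the loaded variable from silently overwriting anything in the workspace.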
Related
I am experiencing some strange behaviour in a piece of code that reads a matrix from the hard disk.
I first write a binary file to the hard disk containing a matrix a:
N=17000;
a=rand(N);
fid=fopen('a','Wb');
fwrite(fid, a, 'double');
fclose(fid);
Now I time opening, reading, and closing the file to see how long the whole procedure takes. Each iteration should take the same amount of time.
for ii=1:10
tic
fid = fopen('a', 'r');
matrix=fread(fid, [N, N], 'double');
fclose(fid);
toc
end
Interestingly, the time it takes to read the file alternates between fast and slow! Is there any explanation for this? Here are the timings for the above loop:
Elapsed time is 1.259988 seconds.
Elapsed time is 2.454427 seconds.
Elapsed time is 1.534250 seconds.
Elapsed time is 2.453246 seconds.
Elapsed time is 1.535322 seconds.
Elapsed time is 2.454762 seconds.
Elapsed time is 1.534847 seconds.
Elapsed time is 2.449777 seconds.
Elapsed time is 1.534265 seconds.
Elapsed time is 2.449074 seconds.
That's very interesting. I wonder if it has to do with the overhead of opening and closing the file. One way to test that possibility would be to add pause(2) in the loop and then see whether the times even out.
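A rough sketch of that experiment, under some assumptions: the matrix is much smaller than in the question so it runs quickly, the file name comes from tempname, and the wait is shortened. The pause sits outside the timed region, so it does not inflate the measurements.

```octave
N = 500;                          % much smaller than the 17000 in the question
a = rand (N);
fn = tempname ();                 % hypothetical scratch file
fid = fopen (fn, 'w');
fwrite (fid, a, 'double');
fclose (fid);

n_runs = 4;
wait_s = 0.2;                     % assumed wait; the comment above suggests 2
elapsed = zeros (1, n_runs);      % one timing per iteration
for ii = 1:n_runs
  t = tic;
  fid = fopen (fn, 'r');
  matrix = fread (fid, [N, N], 'double');
  fclose (fid);
  elapsed(ii) = toc (t);          % stop the timer before waiting
  pause (wait_s);
end
disp (elapsed);
delete (fn);                      % clean up the scratch file
```

If the alternating pattern disappears with the wait in place, that would point at something happening between iterations rather than at the read itself.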
I am collecting data from a potentiometer connected to an Arduino. In the script, I tell MATLAB to keep collecting data for 2 minutes. But I need to tell it that if the user does not move the potentiometer for 10 consecutive seconds, it should stop the loop and move on to the next step (writing the data to an Excel file). Does anybody have ideas on how to achieve this?
Probably tic and toc can help you.
tic starts a stopwatch timer. The function records the internal time at execution of the tic command.
toc reads the elapsed time from the stopwatch timer started by the tic function.
tic;
while toc < 10
% Do your loopy things
if variable_changed
tic; % Restart stopwatch
end
end
Furthermore, to be sure this tic won't interfere with other timers in your code, you should store its value like this:
% First start stopwatch
time_since_last_movement = tic;
while toc(time_since_last_movement) < 10
% Do your loopy things
if variable_changed
time_since_last_movement = tic; % Restart stopwatch
end
end
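Putting the two limits from the question together, a sketch might look like the following. Everything about the hardware is assumed: read_pot is a hypothetical stand-in for the real Arduino read, tolerance is an assumed threshold for what counts as movement, and the time limits are shortened so the sketch finishes quickly.

```octave
read_pot    = @() 0.5;   % hypothetical stand-in: reading never changes
total_limit = 2;         % seconds; the question uses 120
idle_limit  = 0.5;       % seconds; the question uses 10
tolerance   = 0.01;      % assumed smallest change that counts as movement

samples = [];
last_value = read_pot ();
t_total = tic;           % stopwatch for the whole session
t_idle  = tic;           % stopwatch since the last movement
while toc (t_total) < total_limit && toc (t_idle) < idle_limit
  value = read_pot ();
  samples(end+1) = value;
  if abs (value - last_value) > tolerance
    last_value = value;
    t_idle = tic;        % movement detected: restart the idle stopwatch
  end
  pause (0.01);          % avoid polling in a tight loop
end
stopped_because_idle = toc (t_total) < total_limit;
```

Because each tic is stored in its own variable, the two stopwatches run independently and the loop ends as soon as either limit is reached.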
I need to filter 6 signals with 60000000 samples each. The data are stored in a matrix data(60000000,6). There are several approaches to do that:
data=randn(60000000,6);
b=ones(1,1000)/1000;
tic
R=filter(b,1,data);
toc
tic
for i=1:6
R2(:,i)=filter(b,1,data(:,i));
end
toc
tic
parfor i=1:6
R2(:,i)=filter(b,1,data(:,i));
end
toc
According to the documentation, the 1st form is recommended as the fastest one, but in my case it is the slowest.
Elapsed time is 172.235919 seconds.
Elapsed time is 45.354810 seconds.
Elapsed time is 59.250638 seconds.
In Process Explorer, the 1st form uses only one CPU thread. According to the documentation, it should run on multiple threads by default. Has anyone experienced the same problem?
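Before comparing timings, it may be worth confirming on a small sample that the matrix form and the column loop produce the same output. The sketch below assumes much smaller sizes than in the question, and it preallocates R2 so that array growth is not part of what the loop does. It does not explain the timing difference.

```octave
data = randn (10000, 6);          % small stand-in for the 60000000x6 matrix
b = ones (1, 100) / 100;          % shorter moving-average filter

R = filter (b, 1, data);          % matrix form: filters each column

R2 = zeros (size (data));         % preallocate the result
for i = 1:size (data, 2)
  R2(:,i) = filter (b, 1, data(:,i));
end

max_diff = max (abs (R(:) - R2(:)));
disp (max_diff);
```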
/edit: See here for an interesting discussion of the topic. Thanks @Dan
Using a(m,n) = 0 appears to be faster, depending on the size of matrix a, than a = zeros(m,n). Are both variants the same when it comes to pre-allocation before a loop?
They are definitely not the same.
Though there are ways to beat the performance of a=zeros(m,n), simply doing a(m,n) = 0 is not a safe way to do it. If any entries in a already exist they will keep existing.
See this for some nice options, also consider doing the loop backwards if you don't mind the risk.
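A small illustration of why a(m,n) = 0 is not a safe substitute, assuming a already exists in the workspace:

```octave
a = ones (2, 2);      % a left over from earlier code
a(3, 3) = 0;          % grows a to 3x3 but keeps the existing ones
b = zeros (3, 3);     % always starts from a clean 3x3 matrix of zeros

disp (a);
disp (b);
```

Here a still carries four nonzero entries in its upper-left corner, while b is entirely zero.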
I think it depends on your m and n. You can check the time for yourself
tic; b(2000,2000) = 0; toc;
Elapsed time is 0.004719 seconds.
tic; a = zeros(2000,2000); toc;
Elapsed time is 0.004399 seconds.
tic; a = zeros(2,2); toc;
Elapsed time is 0.000030 seconds.
tic; b(2,2) = 0; toc;
Elapsed time is 0.000023 seconds.
I have a program running a loop. I want to have two time counters: one for the loop, which will tell me how long one iteration took, and one for the entire program. To the best of my knowledge, tic and toc will work only once.
You're only familiar with this tic toc syntax:
tic; someCode; elapsed = toc;
But there is another syntax:
start = tic; someCode; elapsed = toc(start);
The second syntax makes the same time measurement, but allows you the option of running more than one stopwatch timer concurrently. You assign the output of tic to a variable tStart and then use that same variable when calling toc. MATLAB measures the time elapsed between the tic and its related toc command and displays the time elapsed in seconds. This syntax enables you to time multiple concurrent operations, including the timing of nested operations (matlab documentation of tic toc).
Here's how to use it in your case. Let's say that this is your code:
for i = 1:N
someCode;
end
Insert the tic and toc like this:
startLoop = tic;
for i = 1:N
startIteration = tic;
someCode;
endIteration = toc(startIteration);
end
endLoop = toc(startLoop);
You can also use the above syntax to create a vector for which the ith element is the time measurement for the ith iteration. Like this:
startLoop = tic;
for i = 1:N
startIteration(i) = tic;
someCode;
endIteration(i) = toc(startIteration(i));
end
endLoop = toc(startLoop);
You can use tic and toc to time nested operations. From the MATLAB help for tic:
tStart=tic; any_statements; toc(tStart); makes the same time measurement, but allows you the option of running more than one stopwatch timer concurrently. You assign the output of tic to a variable tStart and then use that same variable when calling toc. MATLAB measures the time elapsed between the tic and its related toc command and displays the time elapsed in seconds. This syntax enables you to time multiple concurrent operations, including the timing of nested operations
I'm not able to try this right now, but you should be able to use multiple tic and toc statements if you store the tic values into variables.
Read Matlab's documentation on this, there is even a section on nesting them. Here is a rough example:
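tStartOverall = tic;              % start the overall stopwatch
% ...
tStartLoop = tic;                 % start the loop stopwatch
% <your loop code here>
tEndLoop = toc(tStartLoop);       % seconds spent in the loop
% ...
tEndOverall = toc(tStartOverall); % seconds spent overall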