I am trying to simulate trajectories of the Lorenz system in MATLAB, currently using the following code:
clear all
clf;
clc;
% Solution
[t1,x1] = ode45('g',[0 30],[0;2;0]);
[t2,x2] = ode45('g2',[0 30],[0;2.001;0]);
[C,h] = size(x2);
ang = 0;
for j = 1:C
    p1(j,:) = x1(j,:);
    p2(j,:) = x2(j,:);
    % Plot
    plot3(p1(:,1),p1(:,2),p1(:,3),'k', p2(:,1),p2(:,2),p2(:,3),'r'); hold on;
    plot3(p1(j,1),p1(j,2),p1(j,3),'ko','markerfacecolor','k');
    plot3(p2(j,1),p2(j,2),p2(j,3),'rd','markerfacecolor','r'); hold off
    axis([-20 20 -40 40 0 50])
    axis off
    set(gca,'color','none')
    % Rotation
    camorbit(ang,0,[p1(1,1),p1(1,2),p1(1,3)])
    ang = ang + (360/C);
    % Record
    set(gcf,'units','normalized','outerposition',[0 0 1 1])
    F(j) = getframe(gcf);
end
movie(F)
clf;
close;
with the functions g and g2 defined identically:
function xdot = g(t,x)
xdot = zeros(3,1);
sig = 10;
rho = 28;
bet = 8/3;
xdot(1) = sig*(x(2)-x(1));
xdot(2) = rho*x(1)-x(2)-x(1)*x(3);
xdot(3) = x(1)*x(2)-bet*x(3);
This is the Lorenz system. The purpose of this code is to make a movie of the trajectories of two initial states that differ very slightly, in order to demonstrate the chaotic behaviour of this system. The code itself does in fact work, but it takes all of my computer's memory, and in an attempt to make a .avi file of the trajectory, MATLAB complained about exceeding 7.5 GB - which is of course way too much for this simulation.
My question consists of two parts:
(1) How do I manage this code in order to make it run more smoothly?
(2) How can I make a .avi file of the trajectory? I tried to find a way on the internet for a long time, but either MATLAB or my computer gave up every time.
Thanks in advance!
As already mentioned in my comment above: your code runs quite smoothly on my laptop (an "old" i5 processor, 8 GB memory). Approximately 102 % CPU load is generated, and about 55 % of my memory is used during the frame-generation process.
To write your frames to a video file, I used the following commands:
v = VideoWriter('LorenzAnimation.avi');
open(v);
writeVideo(v,F);
close(v);
This outputs a file of 47 seconds duration (C = 1421 frames at 30 frames per second), with frames of size 1364 × 661 pixels each. The file is about 38 MB. Both generating the frames and writing the video took about 3 minutes on my machine (measured with tic/toc).
I cannot tell you much about CPU load during the video-writing process (it varied between 5 and 400 %). It used up to about 82 % of my memory, so better not to touch your machine while this is running.
Note: make sure that you do not change the size of the figure window, as all frames must be the same size, else MATLAB will return an error.
Things that might influence the "smoothness":
you are using a bigger frame size than me
you are not using compressed video - what was your approach to writing the video file?
the scheduler of your operating system does a bad/good job
your machine is even slower than mine (unlikely)
Edit: initializing the variables you are operating on (e.g. vectors and matrices) often speeds things up, as you are pre-allocating memory. I have tried this for the frame-generation process (where 540, 436, 3 should be replaced by your frame dimensions - determined manually or automatically):
G = struct('cdata', uint8( zeros(540, 436, 3) ), 'colormap', []);
G = repmat( G, 1, C );
This gave me a little speed-up, though I am not sure if that's the perfect way to initialize a struct array.
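An alternative that I believe also works is to index the last element first, which implicitly preallocates the whole struct array in one statement (again assuming C frames, as above):
G(C) = struct('cdata', [], 'colormap', []); % elements 1..C-1 are created implicitly
Either way, the point is simply to avoid growing the frame array inside the loop.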
I am doing mean-shift color-based image segmentation on video frames.
Here is my code:
% v is a VideoReader object and k the frame counter (k = 1 before the loop)
while hasFrame(v)
    if k == 1
        s(k).cdata = readFrame(v);
        a = s(k).cdata;
        I = imresize(a,[50,50]);
        [means, Ims, Nms] = Ms(I,bw); % mean shift on the first frame
        Ims = im2uint8(Ims);
        s(k).cdata = Ims;
    else
        s(k).cdata = readFrame(v);
        a = s(k).cdata;
        I = imresize(a,[50,50]);
        [Ims,data2cluster] = MeanShiftCluster2(I,means); % simple segmentation based on norm, using the means of the first frame
        Ims = im2uint8(Ims);
        Ims = imresize(Ims,[500,720]);
        s(k).cdata = Ims;
    end
    k = k + 1;
end
I am sending the first frame for the mean-shift implementation and then using the same resulting means for all other frames to compute their respective clusters on the basis of Euclidean distance (my frames have only minor changes).
Problems:
The profiler tells me that the imresize and VideoReader functions are taking too long to execute. Are there any substitutes I can use?
imresize may be the slowest step of your processing.
But here are several ideas to speed up the process.
imresize does what is called interpolation. This can be a slow process, and the speed depends on the quality you want for the output. The default in MATLAB is bicubic. You can try bilinear or nearest, e.g.:
[...] = imresize(...,'nearest');
In my personal experimentation, I have also found that imresize, like equivalent functions, has some overhead. You could probably go "much" faster by calling the function only once for all your video frames. You will need enough memory to do this. Suppose that you have all your frames in a 3-D matrix dataMovie. When constructing this matrix, preallocation (by getting the number of frames) will help gain some speed!
k = 0.5; % scaling parameter
tform = affine2d([k 0 0;0 k 0;0 0 1]);
dataTform = imwarp(dataMovie,tform,'nearest');
And then, apply your processing to the resized movie frame by frame. You can also provide the type of interpolation: nearest, linear or bicubic.
If you are working on a color movie, you need to stack the 3 color layers of all the frames together, and get them back using the proper indexing.
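Here is a rough sketch of that stacking idea (my assumptions: v is a VideoReader for the input file, the scaling factor k = 0.5 as above, and the names dataMovie/dataTform are just illustrative):
v = VideoReader('input.avi');                             % 'input.avi' is a placeholder
nFrames = floor(v.Duration * v.FrameRate);                % estimate of the frame count
dataMovie = zeros(v.Height, v.Width, 3*nFrames, 'uint8'); % preallocate the stack
f = 0;
while hasFrame(v)
    f = f + 1;
    dataMovie(:, :, 3*f-2:3*f) = readFrame(v);  % stack the R,G,B layers of frame f
end
tform = affine2d([0.5 0 0; 0 0.5 0; 0 0 1]);
dataTform = imwarp(dataMovie, tform, 'nearest'); % one call resizes every plane
frame1 = dataTform(:, :, 1:3);                   % proper indexing recovers frame 1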
I have a MATLAB code where I am using parfor to reduce the amount of time taken by for to do some image processing tasks. Basically, it takes two images and, after doing some mathematical calculations, produces a scalar quantity called EucDist. For this, one image is kept fixed and the other image is generated by a FORTRAN code, which takes around 20 seconds to do that. Below is the outline of my code:
matlabpool open
gray1 = some_image(8192,200);
dep = 0.04:0.01:0.40; % Parameter 1
vel = 1.47:0.01:1.72; % Parameter 2
dist = zeros(length(dep),length(vel));
tic
parfor i = 1:length(dep)
    ans = zeros(1,length(vel));
    for j = 1:length(vel)
        % Updating the Input.txt file
        fname = sprintf('Input_%.2d%s',i,'.txt');
        fid = fopen(fname,'w');
        fprintf(fid,'%-5.2f\n%-5.2f\n%.2d',dep(i),vel(j),i);
        fclose(fid);
        % Running my FORTRAN code to generate another .dat file
        % (Note that I have already compiled this code outside these loops)
        system(['./editcrfl ' fname]);
        % Calling the IMAGE_GEN script incorporating the above .dat file
        system('IMAGE_GEN');
        system(sprintf('IMAGE_GEN %d',i));
        gray2 = some_image(8192,200);
        % Doing some mathematical calculations and getting a value, say 'EucDist'
        - - - - - - -
        - - - - - - -
        ans(j) = EucDist;
    end
    dist(i,:) = ans;
    fclose('all');
end
fprintf('Total time taken: %f\n',toc);
matlabpool close
There are two major problems that I am facing with the above code.
First, the dist matrix is not storing all the EucDist values generated. Ideally dist should be of size 37 × 26, but it is only 37 × 1 and all its values are zeros. I have checked that all 37 × 26 values are being calculated, but I don't know why they are not being stored in dist.
Second, the total time taken when I am using parfor is around 9.5 hours, whereas a normal for loop takes only 5.5 hours.
Can someone please help me get rid of the above two problems?
Thanks in advance.
So I'm trying to perform an STFT on a piano recording using MATLAB, but I get the following error.
Warning: Input arguments must be scalar.
In test3 at 35
??? Error using ==> zeros
Out of memory. Type HELP MEMORY for your options.
Error in ==> test3 at 35
song = cat(1,song,zeros(n_of_padding,1));
The code I've used is taken from a sample found on the net.
clc;
clear all;
[song,FS] = wavread('c scale fast.wav');
song = sum(song,2);
song = song/max(abs(song));
wTime = 0.05;
ZP_exp = 1;
P_OL = 50;
% Number of STFT samples per STFT slice
N_window = floor(wTime*FS);
% Number of overlapping points
window_overlap = floor(N_window*(P_OL/100));
wTime = N_window/FS;
%size checking
%make sure there are integer number of windows if not zero pad until they are
L = size(song);
%determine the number of times-1 the overlapping window will fit the song length
N_of_windows = floor(L - N_window/(N_window - window_overlap));
%determine the remainder
N_of_points_left = L - (N_window + N_of_windows*(N_window - window_overlap));
%Calculate the number of points to zero pad
n_of_padding = (N_window - window_overlap) - N_of_points_left;
%append the zeros to the end of the song
song = cat(1,song,zeros(n_of_padding,1));
clear n_of_windows n_of_points_left n_of_padding
n_of_windows = floor((L - N_window)/(N_window - window_overlap))+1;
windowing = hamming(N_window);
N_padding = 2^(nextpow2(N_window)+ZP_exp);
parfor k = 1:N_of_windows
    starting = (k-1)*(N_window - window_overlap) + 1;
    ending = starting + N_window - 1;
    % Define the time of the window, i.e., the center of the window
    times(k) = (starting + ceil(N_window/2))/FS;
    % Apply the windowing function
    frame_sample = song(starting:ending).*windowing;
    % Take the FFT of the sample and apply zero padding
    F_trans = fft(frame_sample,N_padding);
    % Store the FFT data for later
    STFT_out(:,k) = F_trans;
end
Based on some assumptions, I would reason that:
- n_of_padding should be smaller than N_window
- N_window is much smaller than FS
- FS is not too high (it is the sampling rate of your sound, so typically around 44100)
- your zeros matrix will not be huge
This should mean that the problem is not that you are creating a matrix that is too large, but that you had already filled up the memory before this call.
How to deal with this?
First, type dbstop if error.
Then run your code.
When it stops, check all variable sizes to see where the space has gone.
If you don't see anything strange (and the big storage is really needed) then you may be able to process your song in parts.
In line 35 you are trying to make an array that exceeds your available memory. Note that an n-by-1 array of zeros alone is n*8 bytes in size. This means that if you make such an array, call it x, and check it with whos('x'), like:
x = zeros(10000,1);
whos('x');
You will likely find that x is 80000 bytes. Maybe by adding such an array to your song variable is adding the last bytes that breaks the memory-camel's back. Using and whos('variableName') take whatever the size of song is before line 35, separately add the size of zeros(n_of_padding,1), convert that to MB, and see if it exceeds your maximum possible memory given by help memory.
The most common cause of Out of memory errors in MATLAB is that it is unable to allocate memory due to the lack of a contiguous block. This article explains the various reasons that can cause an Out of memory error in MATLAB.
The Out of memory error often points to a faulty implementation of code that expands matrices on the fly (concatenating, out-of-range indexing). In such scenarios, MATLAB creates a copy in memory, i.e. memory twice the size of the matrix is consumed with each such occurrence.
On Windows this problem can be alleviated to some extent by passing the /3GB /USERVA=3030 switch during boot, as explained here. This enables additional virtual memory to be addressed by the application (MATLAB in this case).
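A small illustration of the copy-on-grow behaviour described above (sizes are only for demonstration):
n = 1e6;
tic
a = [];
for i = 1:n
    a(end+1) = i; % grows on the fly: repeated reallocation and copying
end
toc
tic
b = zeros(n,1);   % preallocated once
for i = 1:n
    b(i) = i;
end
toc
The second loop is dramatically faster, because the array is never reallocated.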
I have written the following code in MATLAB to process large images on the order of 3000 × 2500 pixels. Currently the operation takes more than half an hour to complete. Is there any scope for improving the code so that it takes less time? I have heard parallel processing can make things faster, but I have no idea how to implement it. How do I do it, given the following code?
function dirvar(subfn)
[fn,pn] = uigetfile({'*.TIF; *.tiff; *.tif; *.TIFF; *.jpg; *.bmp; *.JPG; *.png'}, ...
'Select an image', '~/');
I = double(imread(fullfile(pn,fn)));
ld = input('Enter the lag distance = '); % prompt for lag distance
fh = eval(['@' subfn]); % Function handle from the input name
I2 = uint8(nlfilter(I, [7 7], fh));
imshow(I2); % Texture Layer Image
imwrite(I2,'result_mat.tif');
% Zero Degree Variogram
function [gamma] = ewvar(I)
    c = (size(I)+1)/2;     % Finds the central pixel of the moving window
    EW = I(c(1),c(2):end); % Values from the central pixel to the margin of the window
    h = length(EW) - ld;   % Number of lags
    gamma = 1/(2 * h) * sum((EW(1:ld:end-1) - EW(2:ld:end)).^2);
end
end
The input lag distance is usually 1.
You really need to use the profiler to get some improvements out of it. My first guess (as I haven't run the profiler, which you should, as suggested already) would be to use as few length operations as possible. Since you are processing every image with a [7 7] window, you can precalculate some parts so that you won't repeat these actions:
function dirvar(subfn)
[fn,pn] = uigetfile({'*.TIF; *.tiff; *.tif; *.TIFF; *.jpg; *.bmp; *.JPG; *.png'}, ...
'Select an image', '~/');
I = double(imread(fullfile(pn,fn)));
ld = input('Enter the lag distance = '); % prompt for lag distance
fh = eval(['@' subfn]); % Function handle from the input name
%% precalculations
wind = [7 7];
center = (wind+1)/2; % Finds the central pixel of moving window
EWlength = (wind(2)+1)/2;
h = EWlength - ld; % Number of lags
%% calculations
I2 = nlfilter(I, wind, fh);
imshow(I2); % Texture Layer Image
imwrite(I2,'result_mat.tif');
% Zero Degree Variogram
function [gamma] = ewvar(I)
    EW = I(center(1),center(2):end); % Values from the central pixel to the margin of the window
    gamma = 1/(2 * h) * sum((EW(1:ld:end-1) - EW(2:ld:end)).^2);
end
end
Note that by doing so, you trade some clarity of your code for performance, and introduce coupling (between the function dirvar and the nested function ewvar). However, since I haven't profiled your code (you should do that yourself using your own inputs), only the profiler can tell you which line of your code consumes the most time.
For batch processing, I would also recommend leaving out any input, imshow, imwrite and uigetfile calls. Those are commands that you typically call from a higher-level function/script, and they will force you to enter these inputs even when you want them to stay the same. So instead, make each of the variables they produce (/process) a parameter (/return value) of your function. That way, you could leave MATLAB running over the weekend to process everything (without having to manually enter all those values), even if you are unable to speed up the code.
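A minimal sketch of that refactoring (the function name dirvar_batch and the filter handle someFilter are hypothetical):
function I2 = dirvar_batch(I, fh)
    % I: an already-loaded image matrix, fh: the filter function handle
    I2 = uint8(nlfilter(I, [7 7], fh));
end
% Caller script (in a separate file): all I/O lives out here
files = dir('*.tif');
for f = 1:numel(files)
    I = double(imread(files(f).name));
    I2 = dirvar_batch(I, @someFilter); % someFilter is a placeholder
    imwrite(I2, ['result_' files(f).name]);
end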
A few general purpose tricks:
1 - use the MATLAB profiler to determine all the computational bottlenecks
2 - parallel processing can make things faster and there are a lot of tools that you can use, but it depends on how your entire code is set up and whether the code is optimized for it. By far the easiest trick to learn is parfor, where you can replace the top-level for loop by parfor (see the sketch after this list). This does mean you must open the MATLAB pool with matlabpool open.
3 - If you have a rather recent Nvidia GPU as well as MATLAB 2011, you can also write some CUDA code.
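As a minimal parfor sketch for trick 2 (the file list and the anonymous filter are placeholders; the loop body must be independent across iterations):
matlabpool open                            % use parpool on newer releases
files = {'img1.tif','img2.tif','img3.tif'};
results = cell(size(files));
parfor i = 1:numel(files)                  % simply replaces the top-level for
    I = double(imread(files{i}));
    results{i} = nlfilter(I, [7 7], @(x) mean(x(:))); % placeholder filter
end
matlabpool close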
All in all 30 mins to me is peanuts, so don't fret it too much.
First of all, I strongly suggest you follow the advice by @Egon: write a separate function that collects a list of files (the excellent UIPICKFILES from the FEX is your friend here), and then runs your filtering code in a loop for each image. Note that you should definitely keep the call to imwrite in your filtering code: in case the analysis crashes at image 48 (e.g. due to a power failure), you don't want to lose all the previous work.
Running in batch mode like this has two big advantages: (1) you can start running your code and go home for the weekend, and (2) you can easily parallelize this outer loop using PARFOR. However, with only a dual-core machine, it is unlikely that you will get any significant improvement from parallelization - your OS also wants to run stuff at times, and the overhead of parallelization might be more than the gain from running two workers. Also, 2.5 GB of RAM is seriously limiting.
As to your specific code: in my experience, using IM2COL is often faster than NLFILTER. im2col creates an nElementsInMask-by-nMasks array out of your image, so that you can apply the filtering in one single operation. With a 7x7 window, the output of im2col will be 3000*2500*49 elements, which is close to 350 MB at one byte per element (and eight times that for double data). Thus, it should just work if you watch your memory. All you need to do is rewrite ewvar so that it works on a 49-by-1 array of the pixels that make up your mask, which will require some index juggling, if I understand your code correctly.
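A rough im2col sketch of what that rewrite might look like (my assumptions: 'sliding' blocks, lag distance ld = 1, and no border padding, so the output is slightly smaller than the input):
I = double(imread('image.tif'));       % 'image.tif' is a placeholder
cols = im2col(I, [7 7], 'sliding');    % 49-by-nMasks, one window per column
EW = cols(25:7:end, :);                % row 25 is the window center; stepping by 7 walks east
h = size(EW,1) - 1;                    % number of lags for ld = 1
gamma = sum((EW(1:end-1,:) - EW(2:end,:)).^2, 1) / (2*h);
out = col2im(gamma, [7 7], size(I), 'sliding'); % reshape back to image layout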
FFT and changing frequency and vectorizing for loop
Greetings All
I can increase and decrease the frequency of a signal using the combination of an fft and a Fourier-series-expansion FOR loop in the code below, but if the signal/array is too large, it becomes extremely slow: an array that's 1x44100 takes about 2 mins to complete. I'm sure it has to do with the for loop, but I'm not exactly sure how to vectorize it to improve performance. Please note that this will be used with audio signals that are 3 to 6 mins long; the 1x44100 array is only one second, and it already takes about 2 mins to complete.
Any recommendations?
%create signal
clear all, clc,clf,tic
x= linspace(0,2*pi,44100)';
%Used in exporting to ycalc audio file make sure in sync with above
freq_orig=1;
freq_new=4;
vertoff=0;
vertoffConj=0;
vertoffInv=0;
vertoffInvConj=0;
phaseshift=(0)*pi/180 ; %can use mod to limit to 180 degrees
y=sin(freq_orig*(x));
[size_r,size_c]=size(y);
N=size_r; %to test make 50
T=2*pi;
dt=T/N;
t=linspace(0,T-dt,N)';
phase = 0;
f0 = 1/T; % Exactly, one period
y=(y/max(abs(y))*.8)/2; %make the max amplitude here
C = fft(y)/N; % Fourier coefficients, scaled by N
A = real(C);
B = imag(C)*-1; %I needed to multiply by -1 to get the correct sign
% Single-Sided (f >= 0)
An = [A(1); 2*A(2:round(N/2)); A(round(N/2)+1)];
Bn = [B(1); 2*B(2:round(N/2)); B(round(N/2)+1)];
pmax=N/2;
ycalc=zeros(N,1); %preallocating space for ycalc
w=0;
for p = 2:pmax
    % 1 step) re-create the signal using the Fourier series equation
    ycalc = ycalc + An(p)*cos(freq_new*(p-1).*t - phaseshift) ...
                  + Bn(p)*sin(freq_new*(p-1).*t - phaseshift) + (vertoff/pmax);
    w = w + (360/(pmax-1)); % used to create the phase shift
    phaseshift = w;
end;
fprintf('\n- Completed in %4.4fsec or %4.4fmins\n',toc,toc/60);
subplot(2,1,1), plot(y),title('Orginal Signal');
subplot(2,1,2),plot(ycalc),title('FFT new signal');
Here's a pic of the plot if someone wants to see the output, which is correct; the FOR loop is just really, really slow.
It appears as though you are basically shifting the signal upwards in the frequency domain, and then your "series expansion" is simply implementing the inverse DFT on the shifted version. As you have seen, the naive iDFT is going to be exceedingly slow. Try changing that entire loop into a call to ifft, and you should be able to get a tremendous speedup.
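A minimal sketch of that idea (my assumptions: freq_new is a positive integer, and the per-harmonic phaseshift and vertoff terms from the loop are left out for clarity):
C = fft(y);                      % full two-sided spectrum, length N
Cnew = zeros(N,1);
pmax = floor((N/2)/freq_new);    % highest harmonic that still fits below Nyquist
p = (2:pmax)';                   % bin p holds harmonic p-1
Cnew((p-1)*freq_new + 1) = C(p); % move harmonic k to bin k*freq_new
ycalc = 2*real(ifft(Cnew));      % single-sided rebuild, matching the An/Bn sum
On a 1x44100 signal this runs in milliseconds rather than minutes: ifft is O(N log N), while the loop is effectively O(N^2).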