Efficient computation and redoing graphics actions - matlab

Are there any general guidelines on how Matlab handles graphics-based commands that ultimately result in no action being taken? A simple illustrative example--note the actual computational cost here is quite negligible:
fig=figure;
ax=axes;
for i=1:10
data=myFunction(i) %e.g. rand(i)
plot(data)
hold(ax,'on') %perform this repeatedly even though it's only needed once
end
versus:
fig=figure;
ax=axes;
for i=1:10
data=myFunction(i) %e.g. rand(i)
plot(data)
if ~ishold(ax)
hold(ax,'on') %perform this only if it is needed
end
end
If Matlab internally determines whether the hold(ax,'on') command is needed before actually doing it, then presumably the computational cost is similar or lower for the first form. The coding is also simpler to implement and read. But, if the action is carried out in full, then there are cases where it would be better, from a computational cost standpoint, to use the second form.
It's worth noting that the definition of "no action" here is deliberately vague; there's a lot of nuance. For instance, it's easy to create an example where Matlab must perform some level of computation before it can evaluate whether the graphics command would have no effect. In colormap(myColormapFunction), Matlab would have to call myColormapFunction in order to evaluate whether what it returns is the same as the existing colormap. Thanks.

As far as I know, there are no official guidelines on how "no action" commands are handled by the built-in MATLAB functions. However, MathWorks does provide guidelines for optimizing graphics performance, which I feel is a much more important thing to consider.
Graphics Performance Documentation Page
Now I apologize in advance if the following section doesn't answer your question. But if you are truly curious about the behind the scenes and real-world performance, you should use the provided Code Profiling Tool, and built-in timing functions.
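As a rough sketch of that kind of measurement, the two forms from the question can be timed with tic and toc. The repetition count nReps and the invisible figure are assumptions made for illustration, and the numbers will vary with MATLAB release and hardware, so treat this as a starting point rather than a benchmark.

```matlab
% Sketch: compare an unconditional hold with a guarded hold.
fig = figure('Visible', 'off');   % invisible figure, assumed here to keep the test quiet
ax = axes('Parent', fig);
hold(ax, 'on');                   % hold is already on, so both loops do "no action"
nReps = 1e4;                      % hypothetical repetition count

tic
for k = 1:nReps
    hold(ax, 'on');               % first form: always call hold
end
tAlways = toc / nReps;

tic
for k = 1:nReps
    if ~ishold(ax)                % second form: check before calling hold
        hold(ax, 'on');
    end
end
tGuarded = toc / nReps;

fprintf('unconditional hold: %.2e s per call\n', tAlways);
fprintf('guarded hold:       %.2e s per call\n', tGuarded);
close(fig)
```

If the two per-call times turn out to be of the same order, the simpler unconditional form costs little in practice.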
With that said, the next section is about optimizing graphics performance in general.
For example, in the code you've provided, I would strongly recommend against putting hold and plot in the for-loop to begin with. From my experience, these are never necessary and can be optimized away.
Here, I'm guessing that you are trying to animate MATLAB plots; in which case, try updating the plot markers instead of using the plot function. For example:
figure
% Plot an empty plot with placeholder values (NaNs) the size of your data
% Save handle to plot object as `h`
h = plot( nan(size(data)) );
for i = 1:10
[Xdata, Ydata] = MyFunction(...);
% Update your plot markers with the handle update method
h.XData = Xdata;
h.YData = Ydata;
drawnow % drawnow to see animation
end
In this case, I don't even need to use the hold function, since I'm just updating the same plot, which is much faster.
Even if your function is outputting new data series, and you want to plot them on top of the old data series, you can use the same trick; it will only require you to pre-allocate the plot handles and data arrays (which, honestly, is good programming practice in general).
figure
hold on % Hold the plot now
N = 100; % Number of data series you expect to have
H = cell(N,1); % Preallocate N Cells for all the plot handles
for i = 1:N
% Save plot handles to cell array in a loop, if you have so many series
H{i} = plot ( nan(size(data)) );
end
% Your iterative function calls
for t = 1:100
...
% Iteratively update all plot handles with a syntax like this
% (Not entirely sure, this is off the top of my head)
for i = 1:N
H{i}.XData = newX;
H{i}.YData = newY;
end
end
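A self-contained variant of the same idea might look like the sketch below. The number of series N, the grid x, and the sine data are made up for illustration, and gobjects is used here instead of a cell array to preallocate the handles; this is one possible arrangement, not the only one.

```matlab
% Sketch: preallocate line objects once, then only update their data.
N = 5;                               % hypothetical number of data series
x = linspace(0, 1, 50);              % hypothetical common x grid
figure
ax = axes;
hold(ax, 'on')                       % hold once, before any plotting
axis(ax, [0 1 -1 1])

H = gobjects(N, 1);                  % preallocated array of graphics handles
for i = 1:N
    H(i) = plot(ax, x, nan(size(x)));    % placeholder lines (NaNs are not drawn)
end

for t = 1:20
    for i = 1:N
        % Update the existing line instead of calling plot again
        H(i).YData = sin(2*pi*(i*x + t/20));
    end
    drawnow limitrate                % refresh the display, skipping frames if needed
end
```

Because no new objects are created inside the time loop, the axes keep exactly N children however many iterations run.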

Related

Signal smoothing algorithm (Matlab's moving average)

I have written a simple code that performs a 3-point moving average smoothing algorithm. It is meant to follow the same basic algorithm as Matlab's smooth(...) function as described here.
However, the result of my code is very different from that of Matlab's. Matlab's 3-point filter appears to perform a much more aggressive smoothing.
Here is a comparison of a noisy data smoothed using my code (red) and Matlab's function (blue):
Here is my code written in the form of a function:
function [NewSignal] = smoothing(signal)
NewSignal = signal;
for i = 2 : length(signal)-1
NewSignal(i,:) = (NewSignal(i,:)+NewSignal(i-1,:)+NewSignal(i+1,:))./3;
end
end
Matlab's function is used as follows:
signal = smooth(time, signal, 3, 'moving');
As far as I understand Matlab's function works the same way; it averages 3 adjacent bins to a single bin. So I expected both algorithms to produce the same results.
So, what is the reason for the discrepancy? And how could I tweak my code to produce the same results?
Edit:
My sample data can be found here. It can be accessed using:
M = csvread('DS0009.csv');
time = M(:,1);
signal = M(:,2);
Here is the new result (red plot) using rinkert's correction:
One reason for the difference could be that you are partially using your smoothed signal during smoothing. In your loop, you store the smoothed value in NewSignal(i,:), and for the next sample to smooth this value will be called by NewSignal(i-1,:).
Let NewSignal be determined by the original signal only:
function [NewSignal] = smoothing(signal)
NewSignal = signal;
for i = 2 : length(signal)-1
NewSignal(i,:) = (signal(i,:)+signal(i-1,:)+signal(i+1,:))./3;
end
end
Update: To show that the function above in fact does the same as Matlab's smooth function, let's consider this MCVE:
t = (0:0.01:10).'; % time vector
y = sin(t) + 0.5*randn(size(t));
y_smooth1 = smooth(t,y,3,'moving');
y_smooth2 = smoothing(y);
difference_methods = abs(y_smooth1-y_smooth2);
So we create a sine wave, add some noise, and determine the absolute difference between the two methods. If you take the sum of all the differences you will see that this adds up to something like 7.5137e-14, which cannot explain the differences you see.
Plotting the smooth signal (blue original, red smoothed):
figure(1); clf; hold on
plot(t,y)
plot(t,y_smooth2)
And then plotting the difference between the two methods:
figure(2); clf; hold on;
plot(t,y_smooth1-y_smooth2)
As you can see, the difference is of the order 1e-16, so it is determined by the floating-point relative accuracy (see eps).
To answer your question in the comments: the functions filter and smooth are arithmetically the same when they are used for a moving average. However, they handle the special cases at the beginning and end points differently.
This is also stated in the documentation of smooth: "Because of the way smooth handles endpoints, the result differs from the result returned by the filter function."
Here you see it in an example:
%generate random data
signal=rand(1,50);
%plot data
plot(signal,'LineWidth',2)
hold on
%plot filtered data
plot(filter(ones(3,1)/3,1,signal),'r-','LineWidth',2)
%plot smoothed data
plot( smooth(signal,3,'moving'),'m--','LineWidth',2)
%plot smoothed and delayed
plot([zeros(1,1); smooth(signal,3,'moving')],'k--','LineWidth',2)
hold off
legend({'Data','Filter','Smooth','Smooth-Delay'})
As you can see, the filtered data (in red) is just a delayed version of the smoothed data (in magenta). Additionally, they differ at the beginning. Delaying the smoothed data gives a waveform identical to the filtered data (apart from the beginning). As rinkert pointed out, your approach overwrites the data points which you are accessing in the next step. This is a different issue.
In the next example you will see that rinkert's implementation (smooth-rinkert) is identical to MATLAB's smooth, and that your approach differs from both due to overwriting the values:
So it is your function which low-pass filters the input more strongly (as pointed out by Cris).
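The overwriting effect can also be reproduced without any toolbox functions. The sketch below uses a made-up noisy sine and compares the out-of-place average (written in vectorized form) with the in-place loop from the question; the variable names fixed and recursive are chosen here for illustration.

```matlab
% Sketch: out-of-place vs in-place 3-point moving average.
t = (0:0.01:10).';
signal = sin(t) + 0.5*randn(size(t));    % hypothetical noisy test signal

% Out-of-place: every average reads only the original samples
fixed = signal;
fixed(2:end-1) = (signal(1:end-2) + signal(2:end-1) + signal(3:end)) / 3;

% In-place (the loop from the question): sample i-1 has already been smoothed
recursive = signal;
for i = 2:numel(signal)-1
    recursive(i) = (recursive(i) + recursive(i-1) + recursive(i+1)) / 3;
end

maxDiff = max(abs(fixed - recursive));   % nonzero: the two schemes differ
fprintf('largest difference between the two schemes: %g\n', maxDiff);
```

The in-place loop feeds each smoothed value back into the next average, so it behaves like a recursive filter rather than a plain 3-point moving average.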

Select and plot value above a threshold

I have a plot that contains a few noise components. I would like to select the data above a threshold, which in my case is 2.009 on the Y axis, and plot only the parts of the line that go above it. Anything below the threshold should be plotted as 0.
As can be seen in the figure:
t1=t(1:length(t)/5);
t2=t(length(t)/5+1:2*length(t)/5);
t3=t(2*length(t)/5+1:3*length(t)/5);
t4=t(3*length(t)/5+1:4*length(t)/5);
t5=t(4*length(t)/5+1:end);
X=(length(prcdata(:,4))/5);
a = U(1 : X);
b = U(X+1: 2*X);
c = U(2*X+1 : 3*X);
d = U(3*X+1 : 4*X);
e = U(4*X+1 : 5*X);
figure;
subplot (3,2,2)
plot(t1,a);
subplot (3,2,3)
plot(t2,b);
subplot(3,2,4)
plot(t3,c);
subplot(3,2,5)
plot(t4,d);
subplot(3,2,6)
plot(t5,e);
subplot(3,2,1)
plot(t,prcdata(:,5));
figure;
A=a(a>2.009,:);
plot (t1,A);
This code splits the data (shown in the image) into 5 segments of 2.8 seconds each; I am planning to apply the thresholding to the first 2.8 seconds. I also had another piece of code, but I am not sure whether it works, as it took a long time to run:
figure;
A=a(a>2.009,:);
plot (t1,A);
for k=1:length(a)
if a(k)>2.009
plot(t1,a(k)), hold on
else
plot(t1,0), hold on
end
end
hold off
The problem is that you are calling plot potentially several thousand times and adding thousands of points to the plot one at a time, which causes severe memory and graphics issues on your computer. One thing you can do is preprocess all of the information and then plot it all at once, which will take significantly less time.
figure
threshold = 2.009;
A=a>threshold; %Finds all locations where the vector is above your threshold
plot_vals = a.*A; %multiplies by logical vector, this sets invalid values to 0 and leaves valid values untouched
plot(t1,plot_vals)
Because MATLAB is a highly vectorized language, this format will not only be faster to compute due to a lack of for loops, it is also much less intensive on your computer as the graphics engine does not need to process thousands of points individually.
MATLAB represents each plotted line as a graphics object with its own handle. When you plot a vector, MATLAB creates a single line object that holds the whole vector and draws it in one go. When each point is plotted individually, every plot call creates a separate object, and each of those objects has to be stored, managed, and rendered on its own.
Per request here is the edit
plot(t1(A),plot_vals(A))
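To try the two variants without the original data, a small synthetic example can be used. The vectors t1 and a below are stand-ins invented for illustration (a noisy signal around 2 over 2.8 seconds); only the masking logic is taken from the answer.

```matlab
% Sketch: threshold a signal with a logical mask and plot it in one call.
t1 = linspace(0, 2.8, 500).';            % hypothetical time vector
a = 2 + 0.02*randn(size(t1));            % hypothetical noisy signal
threshold = 2.009;

A = a > threshold;                       % logical mask of samples above the threshold
plot_vals = a .* A;                      % samples below the threshold become 0

figure
plot(t1, plot_vals)                      % variant 1: below-threshold samples drawn as 0
hold on
plot(t1(A), a(A), 'r.')                  % variant 2: only the samples above the threshold
hold off
```

Both variants create a single graphics object each, regardless of how many samples pass the threshold.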

MATLAB Fitting Function

I am trying to fit a line to some data without using polyfit and polyval. I got some good help already on how to implement this and I have gotten it to work with a simple sin function. However, when applied to the function I am trying to fit, it does not work. Here is my code:
clear all
clc
lb=0.001; %lowerbound of data
ub=10; %upperbound of data
step=.1; %step-size through data
a=.03;
la=1482/120000; %1482 is speed of sound in water and 120kHz
ep1=.02;
ep2=.1;
x=lb:step:ub;
r_sq_des=0.90; %desired value of r^2 for the fit of data without noise present
i=1;
for x=lb:step:ub
G(i,1)= abs(sin((a/la)*pi*x*(sqrt(1+(1/x)^2)-1)));
N(i,1)=2*rand()-1;
Ghat(i,1)=(1+ep1*N(i,1))*G(i,1)+ep2*N(i,1);
r(i,1)=x;
i=i+1;
end
x=r;
y=G;
V=[x.^0];
Vfit=[x.^0];
for i=1:1:1000
V = [x.^i V];
c = V \ y;
Vfit = [x.^i Vfit];
yFit=Vfit*c;
plot(x,y,'o',x,yFit,'--')
drawnow
pause
end
The first two sections are just defining variables and the function. The second for loop is where I am making the fit. As you can see, I have it pause after every nth order in order to see the fit.
I changed your fit formula a bit, I got the same answers but quickly got
a warning that the matrix was singular. No sense in continuing past
the point that the inversion is singular.
Depending on what you are doing you can usually change out variables or change domains.
This doesn't do a lot better, but it seemed to help a little bit.
I increased the number of samples by a factor of 10 since the initial part of the curve
didn't look sampled highly enough.
I added a weighting variable but it is set to equal weight in the code below. Attempts
to deweight the tail didn't help as much as I hoped.
Probably not really a solution, but perhaps will help with a few more knobs/variables.
...
step=.01; %step-size through data
...
x=r;
y=G;
t=x.*sqrt(1+x.^(-2));
t=log(t);
V=[ t.^0];
w=ones(size(t));
for i=1:1:1000
% Trying to solve for value of c
% c that
% yhat = V*c approximates y
% or y = V*c
% V'*y = V'*V * c
% c = (V'*V) \ V'*y
V = [t.^i V];
c = (V'*diag(w.^2)*V ) \ (V'*diag(w.^2)*y) ;
yFit=V*c;
subplot(211)
plot(t,y,'o',t,yFit,'--')
subplot(212)
plot(x,y,'o',x,yFit,'--')
drawnow
pause
end
It looks like more of a frequency estimation problem, and trying to fit an unknown frequency
with a polynomial tends to be touch and go. Replacing the polynomial basis with a quick
sin/cos basis didn't seem to do too badly.
V = [sin(t*i) cos(t*i) V];
Unless you specifically need a polynomial basis, you can apply your knowledge of the problem domain to find other potential basis functions for your fit, or to attempt to make the domain in which you are performing the fit more linear.
As dennis mentioned, a different set of basis functions might do better. However, you can improve the polynomial fit by using QR factorisation, rather than just \, to solve the matrix equation. It is a badly conditioned problem no matter what you do, however, and using smooth basis functions won't allow you to accurately reproduce the sharp corners in the actual function.
clear all
close all
clc
lb=0.001; %lowerbound of data
ub=10; %upperbound of data
step=.1; %step-size through data
a=.03;
la=1482/120000; %1482 is speed of sound in water and 120kHz
ep1=.02;
ep2=.1;
x=logspace(log10(lb),log10(ub),100)';
r_sq_des=0.90; %desired value of r^2 for the fit of data without noise present
y=abs(sin(a/la*pi*x.*(sqrt(1+(1./x).^2)-1)));
N=2*rand(size(x))-1;
Ghat=(1+ep1*N).*y+ep2*N;
V=[x.^0];
xs=(lb:.01:ub)';
Vfit=[xs.^0];
for i=1:1:20%length(x)-1
V = [x.^i V];
Vfit = [xs.^i Vfit];
[Q,R]=qr(V,0);
c = R\(Q'*y);
yFit=Vfit*c;
plot(x,y,'o',xs,yFit)
axis([0 10 0 1])
drawnow
pause
end
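The conditioning problem mentioned in both answers can be made visible by computing the condition number of the polynomial design matrix for a few degrees. This is only a sketch: it reuses lb, ub and step from the question, while the list of degrees is an arbitrary choice made here.

```matlab
% Sketch: condition number of the monomial (Vandermonde-style) basis.
lb = 0.001; ub = 10; step = 0.1;         % same grid as in the question
x = (lb:step:ub).';
degrees = [2 5 10 15 20];                % hypothetical degrees to inspect
condV = zeros(size(degrees));

for k = 1:numel(degrees)
    % Columns x.^d, ..., x.^1, x.^0 (implicit expansion, R2016b or later)
    V = x .^ (degrees(k):-1:0);
    condV(k) = cond(V);
end

disp([degrees(:) condV(:)])              % degree in column 1, cond(V) in column 2
```

The condition number grows rapidly with the degree, which is consistent with the singular-matrix warning appearing after only a few iterations. Rescaling x to a smaller interval before building V often helps, although it does not remove the underlying problem.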

Implementing multiple syntaxes for a MATLAB plot function

Many of the plotting functions in MATLAB and toolboxes (though not all) allow both the following syntaxes:
plotfcn(data1, data2, ...)
plotfcn(axes_handle, data1, data2, ...)
The first plots into the current axes (gca) or creates and plots into a new axes if none exists. The second plots into the axes with handle axes_handle.
Having looked into the internals of several MATLAB and toolbox plotting functions, it looks like there isn't really a standardised way that MathWorks do this. Some plotting routines use the internal, but open, function axescheck to parse the input arguments; some do a simple check on the first input argument; and some use a more complex input-parsing subfunction that can handle a larger variety of input syntaxes.
Note that axescheck appears to use an undocumented syntax of ishghandle - the doc says that ishghandle takes only one input, returning true if it is any Handle Graphics object; but axescheck calls it as ishghandle(h, 'axes'), which returns true only if it's specifically an axes object.
Is anyone aware of a best practice or standard for implementing this syntax? If not, which way have you found to be most robust?
In case anyone is still interested, four years after I posted the question, this is the pattern that I have mostly settled on.
function varargout = myplotfcn(varargin)
% MYPLOTFCN Example plotting function.
%
% MYPLOTFCN(...) creates an example plot.
%
% MYPLOTFCN(AXES_HANDLE, ...) plots into the axes object with handle
% AXES_HANDLE instead of the current axes object (gca).
%
% H = MYPLOTFCN(...) returns the handle of the axes of the plot.
% Check the number of output arguments.
nargoutchk(0,1);
% Parse possible axes input.
[cax, args, ~] = axescheck(varargin{:});
% Get handle to either the requested or a new axis.
if isempty(cax)
hax = gca;
else
hax = cax;
end
% At this point, |hax| refers either to a supplied axes handle,
% or to |gca| if none was supplied; and |args| is a cell array of the
% remaining inputs, just like a normal |varargin| input.
% Set hold to on, retaining the previous hold state to reinstate later.
prevHoldState = ishold(hax);
hold(hax, 'on')
% Do the actual plotting here, plotting into |hax| using |args|.
% Set the hold state of the axis to its previous state.
if ~prevHoldState
hold(hax,'off')
end
% Output a handle to the axes if requested.
if nargout == 1
varargout{1} = hax;
end
Not sure I understand the question.
What I do is to separate the plotting of data from the generation / setup of plots. So if I want to plot a histogram in a standardized way I have a function called setup_histogram(some, params) which will return the appropriate handles. Then I have a function update_histogram(with, some, data, and, params) which will write the data into the appropriate handles.
This works very well, if you have to plot lots of data the same way.
Two recommendations from the sideline:
Don't go undocumented if you don't need to.
If a simple check is sufficient, this would have my personal preference.
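For the "simple check" route, a minimal sketch that relies only on documented functions could look like this. The helper name parseAxesArg is hypothetical; it uses isgraphics(h,'axes') rather than axescheck, and it does not try to cover every input form the built-in plotting functions accept.

```matlab
function [hax, args] = parseAxesArg(varargin)
% PARSEAXESARG  Hypothetical helper: split an optional leading axes handle
% from the remaining inputs using only documented functions.
if nargin > 0 && isscalar(varargin{1}) && isgraphics(varargin{1}, 'axes')
    hax = varargin{1};          % an axes handle was supplied
    args = varargin(2:end);     % everything after it is plot data/options
else
    hax = gca;                  % fall back to the current axes (creates one if needed)
    args = varargin;
end
end
```

A plotting function would then start with [hax, args] = parseAxesArg(varargin{:}); and plot into hax using args, much as in the axescheck pattern above.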

MATLAB scatter3, plot3 speed discrepancies

This is about how MATLAB can take very different times to plot the same thing — and why.
I generate 10000 points in 3D space:
X = rand(10000, 1);
Y = rand(10000, 1);
Z = rand(10000, 1);
I then used one of four different methods to plot this, to create a plot like so:
I closed all figures and cleared the workspace between each run to try to ensure fairness.
Bulk plotting using scatter3:
>> tic; scatter3(X, Y, Z); drawnow; toc
Elapsed time is 0.815450 seconds.
Individual plotting using scatter3:
>> tic; hold on;
for i = 1:10000
scatter3(X(i), Y(i), Z(i), 'b');
end
hold off; drawnow; toc
Elapsed time is 51.469547 seconds.
Bulk plotting using plot3:
>> tic; plot3(X, Y, Z, 'o'); drawnow; toc
Elapsed time is 0.153480 seconds.
Individual plotting using plot3:
>> tic; hold on
for i = 1:10000
plot3(X(i), Y(i), Z(i), 'o');
end
drawnow; toc
Elapsed time is 5.854662 seconds.
What is it that MATLAB does behind the scenes in the 'longer' routines to take so long? What are the advantages and disadvantages of using each method?
Edit:
Thanks to advice from Ben Voigt (see answers), I have included drawnow commands in the timing — but this has made little difference to the times.
The main difference between the time required to run scatter3 and plot3 comes from the fact that plot3 is compiled, while scatter3 is interpreted (as you'll see when you edit the functions). If scatter3 was compiled as well, the speed difference would be small.
The main reason plotting in a loop is slower than plotting in one go is that each call adds a new plot handle as a child of the axes (have a look at the output of get(gca,'Children')), so you're growing a complicated array inside a loop, which we all know to be slow. Furthermore, you're calling several functions many times instead of just once, and thus incur the function-call overhead repeatedly.
Recalculation of the axes limits isn't an issue here. If you run
for i = 1:10000
plot3(X(i), Y(i), Z(i), 'o');
drawnow;
end
which forces Matlab to update the figure at every iteration (and which is A LOT slower), you'll see that the axes limits don't change at all (since the default axes limits are 0 and 1). However, even if the axes limits started out differently, it wouldn't take many iterations for them to converge with these data. Compare with omitting the hold on, which makes plotting take longer, because axes are recalculated at every step.
Why have these different functions? scatter3 allows you to plot points with different marker sizes, and colors under a single handle, while you'd need a loop and get a handle for each point using plot3, which is not only costly in terms of speed, but also in terms of memory. However, if you need to interact with different points (or groups of points) individually - maybe you want to add a separate legend entry for each, maybe you want to be able to turn them on and off separately etc - using plot3 in a loop may be the best (though slow) solution.
For a faster approach, consider this third option (directly uses the low-level function LINE):
line([X,X], [Y,Y], [Z,Z], 'LineStyle','none', 'Marker','o', 'Color','b')
view(3)
Here are some articles discussing plotting performance issues:
Performance: scatter vs. line
Plot performance
Well, if you wanted control over the color of each point, bulk scatter would be faster, because you'd need to call plot separately.
Also, I'm not sure your timing information is accurate because you haven't called drawnow, so the actual drawing could take place after toc.
In summary:
plot3 is fastest because it draws the same marker at many different locations
scatter3 draws many different markers, since size and color of the marker (are allowed to) vary with each point
calling in a loop is really slow, because argument parsing and so forth have to take place repeatedly; in addition, as points are added to the plot, the axes have to be recalculated
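The object-count argument made in the answers can be checked directly. The sketch below uses a smaller point count than the question (n = 500, an arbitrary choice to keep it quick) and invisible figures; the timings are machine-dependent and shown only for orientation.

```matlab
% Sketch: bulk plot3 vs plot3 in a loop, counting the objects created.
n = 500;                                  % hypothetical, smaller than the question's 10000
P = rand(n, 3);

fig1 = figure('Visible', 'off');
ax1 = axes('Parent', fig1);
tic
plot3(ax1, P(:,1), P(:,2), P(:,3), 'o'); % one call, one line object
drawnow
tBulk = toc;
nBulk = numel(ax1.Children);

fig2 = figure('Visible', 'off');
ax2 = axes('Parent', fig2);
hold(ax2, 'on')
tic
for i = 1:n
    plot3(ax2, P(i,1), P(i,2), P(i,3), 'o');   % one line object per point
end
drawnow
tLoop = toc;
nLoop = numel(ax2.Children);

fprintf('bulk: %d object(s), %.3f s\n', nBulk, tBulk);
fprintf('loop: %d object(s), %.3f s\n', nLoop, tLoop);
close([fig1 fig2])
```

The bulk call leaves one child in the axes, while the loop leaves n of them, which is where the extra memory and bookkeeping come from.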