Is there an elegant way to connect the nearest points in a scatter plot with lines?
The reason I ask is that plot connects points based on the row index of Y (the data); that is, it connects the points that sit in the same row of Y. This sometimes produces a jump, because a missing value in one row causes all the following data to be mislabeled and shifted by one row, and this cannot be avoided for practical reasons.
Here is a minimal example of the problem.
xrange=linspace(-2,2,100);
Y=repmat(-xrange.^2,4,1)+repmat((-4:-1)',1,100);
Y(Y<-5)=0;
for i=1:100
[~,~,v]=find(Y(:,i));
Y(1:length(v),i)=v;
end
Y(Y==0)=nan;
%jump due to missing data
figure;
plot(xrange,Y);
figure;
%by eye, we can see that there are four lines
for i=1:4
scatter(xrange,Y(i,:),'b');
hold on
end
The undesirable result from using plot is:
The jump is due to the missing data and this is unavoidable in practice.
However, if we use scatter, we can see by eye that there are actually four lines.
So what I want to achieve is to connect the nearest points without introducing a discontinuity, given an imperfect data set that is missing some data. The only thing I can come up with is to preprocess the data before plotting, but this is not always possible: the experimental situation is complicated, and it is hard to predict in advance which data points will be missing.
Any comments and answers are highly appreciated!
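X = repmat(xrange, size(Y,1), 1); % one x value per entry of Y, so one mask can index both
idx = ~isnan(Y);                  % keep only the entries that hold data
plot(X(idx), Y(idx), '.')         % markers only; the entries are not ordered curve by curve
or, better yet, since you know you want to get rid of everything less than -5:
idx = Y >= -5;                    % keep everything that is not below -5
plot(X(idx), Y(idx), '.')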
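A different idea, not taken from the answer above, is to track each curve from column to column and assign every value to the curve whose previous value is nearest. The following is only a minimal sketch under stated assumptions: the names T, last, alive and tol are made up for illustration, tol = 0.5 is an assumed threshold that must be larger than the step along one curve and smaller than the spacing between curves, and unique is used to drop the exact duplicates that the compaction loop in the question leaves behind.

```matlab
% Rebuild the example data from the question.
xrange = linspace(-2, 2, 100);
Y0 = repmat(-xrange.^2, 4, 1) + repmat((-4:-1)', 1, 100);
Y = Y0;
Y(Y < -5) = 0;
for i = 1:100
    [~, ~, v] = find(Y(:, i));
    Y(1:length(v), i) = v;
end
Y(Y == 0) = nan;

tol = 0.5;                          % assumed: max step along a curve < tol < curve spacing
T = nan(size(Y));                   % T(k,:) will hold curve number k
last = nan(size(Y, 1), 1);          % most recent value assigned to each curve
alive = false(size(last));          % curves that received a value in the previous column
for j = 1:size(Y, 2)
    vals = unique(Y(~isnan(Y(:, j)), j)).';   % values in this column, duplicates removed
    used = false(size(last));
    for v = vals
        d = abs(last - v);
        d(used | ~alive) = Inf;     % only continue curves seen in the previous column
        [dmin, k] = min(d);
        if ~(dmin <= tol)
            k = find(isnan(last), 1);   % nothing close enough: start a new curve
        end
        T(k, j) = v;
        last(k) = v;
        used(k) = true;
    end
    alive = used;
end
figure;
plot(xrange, T);                    % each row of T is now one continuous curve
```

A greedy assignment like this can fail when curves cross or come closer than tol, so it is a starting point rather than a general solution.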
So I'm trying to plot a graph and this is my code so far:
T=10; %set the values given to us
t(1)=0;
delta=0.01;
mu=0.05;
sigma=linspace(0,1,10001); %set the spacing for sigma
v=zeros(length(sigma),T); %preallocate v with zeros
k=1;
for sigma=linspace(0.01,1,1000)
t(k)=k*delta;
eta=randn(1); %define eta
S(2)=1+mu*delta+sigma*sqrt(delta)*eta; %set S
S(T+1)=S(10)+mu*delta*S(10)+sigma*sqrt(delta)*eta*S(10);
end
t(10000)=1; %set the rounding error
plot (S,sigma) %plot the graph
xlabel 'S' %label the axis
ylabel 'sigma'
I've tried using the element-wise dot operators (such as .*) so that the inner matrix dimensions agree (for S), but this hasn't worked. I've been going round in circles for a while now and can't figure it out.
I've tried your code and analyzed the problem, and I have a few doubts about it. It would be very helpful to know what problem you are trying to solve. Focusing on what we have, though, one thing I noticed concerns the loop. It doesn't make sense to me: you've used a for loop that iterates over sigma. That's OK; let's go step by step through what's happening:
for sigma=linspace(0.01,1,1000)
t(k)=k*delta;
eta=randn(1); %define eta
S(2)=1+mu*delta+sigma*sqrt(delta)*eta; %set S
S(T+1)=S(10)+mu*delta*S(10)+sigma*sqrt(delta)*eta*S(10);
end
First of all, you can define t(k) outside of the loop, because its value doesn't change from one iteration to the next (k is never incremented). Next, you assign S(2) and S(T+1) inside the loop. Iterating is fine, but every pass overwrites the same two elements of S, so nothing accumulates. It could also help to preallocate S with its expected size:
S=zeros(i,j); % i and j are the matrix dimensions
or, as some people prefer, simply declare the variable (not necessary in MATLAB):
S=[];
This depends on the problem you're solving. I'd really like to help, but I need to know more about the problem. I don't know how you usually work, but what I always do is solve the problem on paper first. If it's a numerical problem that can't be solved by hand, just sketch your ideas and check things such as whether the matrix operations are consistent (that is, whether the matrix sizes agree).
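To illustrate the points about preallocating and not overwriting, here is a minimal sketch. It assumes, without confirmation from the question, that the goal is a single simulated path of S for one fixed sigma, with each step built from the previous one using the same update formula as in the question. The values sigma = 0.2 and nSteps = 100 are made up for the example.

```matlab
% Sketch only: assumes one path of S for a single fixed sigma.
mu = 0.05;
delta = 0.01;
sigma = 0.2;                    % assumed value, not from the question
nSteps = 100;                   % assumed number of time steps

t = (0:nSteps) * delta;         % time grid, built once outside the loop
S = zeros(1, nSteps + 1);       % preallocate the whole path
S(1) = 1;                       % starting value, as implied by S(2) in the question

for k = 1:nSteps
    eta = randn;                % fresh random draw for every step
    % each step writes a new element instead of overwriting the same one
    S(k+1) = S(k) + mu*delta*S(k) + sigma*sqrt(delta)*eta*S(k);
end

plot(t, S);
xlabel('t');
ylabel('S');
```

Because t and S have the same length, plot no longer runs into a size mismatch.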
Hope I've been helpful, and have a nice day :)
I am trying to find the power spectrum of a signal. The signal has 100000 points and the sample frequency is 1000 Hz. I computed the power spectrum using two approaches. The first takes the whole signal as one segment and computes its power spectrum. The second reshapes the signal into a 100-by-1000 matrix, computes the spectrum of each row, and then takes the mean over all rows. I expect both approaches to give the same answer, but I get different answers. I do not know what the error in my code is.
N=100000;
SF=1000;
a=0.1;
b=0.3;
amplitude1=1;
amplitude2=0.5;
t=0:1/SF:100;
f1=SF*a;
f2=SF*b;
A=amplitude1*sin(2*pi*f1*t)+amplitude2*sin(2*pi*f2*t);
Y=2*randn(1,length(A))+A;
bin=[0 :N/2];
fax_Hz=(bin*SF)/N;
FFT=fft(Y);
spectra=2/(SF*length(Y))*(FFT.*conj(FFT));
plot(fax_Hz,spectra(1,1:50001));
D=reshape(Y(1,1:100000),[100,1000]);
M=length(D(1,:));
for i=1:100
FFT_1(i,:)=fft(D(i,:));
S(i,:)=(2/(SF*M))*(FFT_1(i,:).*conj(FFT_1(i,:)));
end
S_f=mean(S);
figure
plot (S_f);
I just updated the code. I'm not sure why, but when I add noise to the signal, the two plots look shifted.
The main problem is with reshape. You treat each row as a separate sequence, but reshape fills the first column completely before moving on to the second one.
You can use the following instead.
D=reshape(A(1,1:100000),[1000,100]).';
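The column-first behavior is easy to see on a tiny vector. This small illustration uses made-up names (v, byColumns, byRows) and is not part of the original answer:

```matlab
v = 1:6;

% reshape fills column by column, so consecutive samples run DOWN the columns
byColumns = reshape(v, [2, 3]);      % [1 3 5; 2 4 6]

% to get consecutive samples along each ROW, swap the sizes and transpose
byRows = reshape(v, [3, 2]).';       % [1 2 3; 4 5 6]

disp(byColumns);
disp(byRows);
```

The same pattern, reshape to [segmentLength, numberOfSegments] and then transpose, is what the corrected line above does for the 100 segments of 1000 samples.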
Normalization is another problem. You can either use ifft instead of fft, since it is normalized by default (not sure why), or keep your normalization and use sum instead of mean; the mismatch may come from a mistake in the normalization. There still seems to be a small discrepancy in the amplitudes, and I am not sure where that comes from.
Finally, to plot, use the following:
bin=[0 :N];
fax_Hz=(bin*SF)/N;
FFT=ifft(A);
spectra=FFT.*conj(FFT);
plot(fax_Hz,spectra); hold on
D=reshape(A(1,1:100000),[1000,100]).';
M=length(D(1,:));
for i=1:100
FFT_1(i,:)=ifft(D(i,:));
S(i,:)=FFT_1(i,:).*conj(FFT_1(i,:));
end
S_f=mean(S);
plot(fax_Hz(1:100:end-1), S_f);
Note: the fax_Hz(1:100:end-1) is a hacky way of getting the length of the vectors to be the same.
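As an alternative to subsampling the long frequency axis, the averaged spectrum can be given its own axis built from the segment length. The sketch below is an assumption-laden variant, not the answer's code: it uses a noise-free version of the signal, calls fft along the rows with an explicit 1/M^2 scaling (equivalent to the ifft scaling used above), and the names x, P, Pavg and fseg are made up.

```matlab
SF = 1000;                                   % sample frequency from the question
N = 100000;                                  % number of samples
M = 1000;                                    % samples per segment

t = (0:N-1) / SF;
x = sin(2*pi*100*t) + 0.5*sin(2*pi*300*t);   % noise-free test signal

D = reshape(x, [M, N/M]).';                  % each row is one contiguous segment
P = abs(fft(D, [], 2)).^2 / M^2;             % power per segment, FFT taken along rows
Pavg = mean(P, 1);                           % average over the 100 segments

fseg = (0:M-1) * SF / M;                     % frequency axis for one segment (1 Hz steps)

[peakPower, k] = max(Pavg(1:M/2));           % strongest bin below the Nyquist frequency
peakHz = fseg(k);

plot(fseg(1:M/2), Pavg(1:M/2));
xlabel('Frequency (Hz)');
ylabel('Power');
```

With this axis the averaged spectrum has a resolution of SF/M = 1 Hz, which is coarser than the full-length spectrum but needs no index trickery to plot.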
I want to make a plot that is discontinuous at one point using MATLAB.
This is what the plot looks like using scatter:
However, I would like the plot to be a smooth curve rather than scattered dots. If I use plot, it gives me:
I don't want the vertical line.
I think I can break the function manually into two pieces and draw them separately on one figure, but the problem is that I don't know beforehand where the breaking point is.
Is there a good solution to this? Thanks.
To find the jump in the data, you can search for the place where the derivative of the function is largest in magnitude:
[~,ind] = max(abs(diff(y)));
One way to plot the function would be to set that point to NaN and plot the function as usual:
y(ind) = NaN;
plot(x,y);
This comes with the disadvantage of losing a data point. To avoid this, you could add a data point with value NaN in the middle:
xn = [x(1:ind), mean([x(ind),x(ind+1)]), x(ind+1:end)];
yn = [y(1:ind), NaN, y(ind+1:end)];
plot(xn,yn);
Another solution would be to split the vectors for the plot:
plot(x(1:ind),y(1:ind),'-b', x(ind+1:end),y(ind+1:end),'-b')
All of the approaches so far handle just one jump. To handle an arbitrary number of jumps in the function, one would need some knowledge of how large those jumps will be or how many there are. The solution would be similar, though.
You should iterate through your data and find the index where the distance between two consecutive points is largest. Split your array at that index into two separate arrays and plot them separately.
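For several jumps, one possible extension is sketched below. It assumes that the jumps are known to be larger than some threshold; jumpTol = 0.5 and the staircase test data are made up for the example, and x and y are assumed to be row vectors.

```matlab
% Made-up test data: a staircase with two jumps of height 1.
x = linspace(0, 2.99, 300);
y = floor(x) + 0.1*x;

jumpTol = 0.5;                               % assumed minimum size of a jump
ind = find(abs(diff(y)) > jumpTol);          % last index before each jump

xn = x;
yn = y;
for k = numel(ind):-1:1                      % work backwards so earlier indices stay valid
    i = ind(k);
    xn = [xn(1:i), (xn(i) + xn(i+1))/2, xn(i+1:end)];
    yn = [yn(1:i), NaN, yn(i+1:end)];        % NaN breaks the line without dropping data
end

plot(xn, yn);
```

Every original data point is kept, and plot leaves a gap at each inserted NaN.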
In my project I have huge surfaces of 20,000 points computed by an algorithm. This algorithm sometimes makes an error, computing one or more points in a small area incorrectly.
This error can not be solved in the algorithm, but needs to be detected afterwards.
The error can be seen in the next figure:
As you can see, there is a wrongly computed point that not only breaks the otherwise homogeneous surface, but also spoils the aesthetics of the plot (which is also important in the project).
Sometimes it can be more than one point, but in general no more than 5 or 6. The error is always in the Z axis, so there is no need to check X and Y.
I have been racking my brain to find a reasonably generic algorithm to detect these points.
I thought about taking patches of the surface, averaging Z, and then detecting the points that fall outside the variance... but I don't think it will always work.
Any ideas?
NOTE: I don't want someone to write code for me, just an idea.
P.S.: relevant code for the above image:
[x,y] = meshgrid([-2:.07:2]);
Z = x.*exp(-x.^2-y.^2);
subplot(1,2,1)
surf(x,y,Z,gradient(Z))
subplot(1,2,2)
Z(35,35)=Z(35,35)+0.3;
surf(x,y,Z,gradient(Z))
The standard trick is to use a Laplacian, looking for the largest outliers. (This is not unlike what Mohsen posed for an answer, but is actually a bit easier.) You could even probably do it with conv2, so it would be pretty efficient.
I could offer a few ways to implement the idea. A simple one is to use my gridfit tool, found on the File Exchange. (Gridfit essentially uses a Laplacian for its smoothing operation.) Fit the surface with all points included, then look for the single point that was perturbed the most by the fit. Exclude it, then rerun the fit, again looking for the largest outlier. (With gridfit, you can use weights to give points a zero weight, a simple way to exclude a point or list of points.) When the largest perturbation that was needed is small enough, you can decide to stop the process. A nice thing is gridfit will also impute new values for the outliers, filling in all of the holes.
A second approach is to use the Laplacian directly, in more of a filtering approach. Here, you simply compute a value at each point that is the average of its neighbors to the left, right, above, and below. The single value that disagrees most with its computed average is replaced with a new value. Or, you can use a weighted average of the new value and the old one. Again, iterate until the process does not generate any change larger than some tolerance. (This is the basis of an old outlier detection and correction scheme that I recall from the Fortran IMSL libraries, which probably dates back roughly 30 years.)
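A single pass of this neighbor-average idea can be sketched with conv2 on the example surface from the question. This is only an illustration under assumptions: the kernel K, the variable names, and the choice to ignore the border (where zero padding distorts the average) are mine, and a real implementation would iterate until the largest disagreement falls below a tolerance.

```matlab
% Example surface from the question, with one corrupted point.
[x, y] = meshgrid(-2:.07:2);
Z = x .* exp(-x.^2 - y.^2);
Z(35, 35) = Z(35, 35) + 0.3;

K = [0 1 0; 1 0 1; 0 1 0] / 4;       % average of the four direct neighbors
Zavg = conv2(Z, K, 'same');          % neighbor average at every grid point

R = abs(Z - Zavg);                   % disagreement between value and neighbor average
R([1 end], :) = 0;                   % ignore the border, where conv2 pads with zeros
R(:, [1 end]) = 0;

[~, idx] = max(R(:));                % the single worst point
[r, c] = ind2sub(size(R), idx);

Zfixed = Z;
Zfixed(r, c) = Zavg(r, c);           % replace it with its neighbor average
```

For a patch of several bad points, the detect-and-replace step would be repeated, since the neighbors of a bad point also see an inflated disagreement.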
Since your function seems to vary smoothly, these abrupt changes can be detected by looking at the derivatives. You can:
Take the derivative in one direction.
Calculate the mean and standard deviation of the derivative.
Find the outliers by looking for points that are further from the mean than a certain multiple of the standard deviation.
Here is the code:
U=diff(Z);
V=(U-mean(U(:)))/std(U(:));
surf(x(2:end,:),y(2:end,:),V)
V=[zeros(1,size(V,2)); V];
V(abs(V)<10)=0;
V=sign(V);
W=cumsum(V);
[I,J]=find(W);
outliers = [I, J];
For your example, you get this plot for V, with a peak at around 21.7, while the second peak is at around 1.9528, so a threshold of 10 is probably OK.
Running the code returns
outliers =
35 35
The cumsum is needed for cases where you have a patch of adjacent points that are all incorrect.
So I have this matrix here, and it is of size 13 x 8198. (I have called it 'blah').
This is a sparse matrix, in that, most of its entries are 0. When I do an imagesc(blah), I get the following image:
Clearly this is worthless because I cannot clearly see the non-zero elements. I have tried playing around with the color scaling, but to no avail.
Anyway, I was wondering if there might be a nicer way to visualize this matrix in MATLAB? I am designing an algorithm and would like to be able to see certain things in the matrix.
Thanks!
Try spy; it's intended for exactly that.
The problem is that spy makes the axes equal, and your data is 13 x 8198, so the first axis is almost invisible compared to the second one. daspect can fix that.
>> spy(blah)
>> daspect([400 1 1])
spy doesn't have an option to plot positive and negative entries differently. One option would be to edit the source to add that capability (it's implemented in MATLAB, and you can get the source by running edit spy). An easier hack, though, is to just spy the positive and negative parts separately:
>> daspect([400 1 1]);
>> hold on;
>> spy(max(blah, 0), 'b');
>> spy(min(blah, 0), 'r');
This has the unfortunate side effect of making places where positives and negatives are close together appear dominated by the second one plotted, here the negatives (e.g. in the top rows of your matrix). I'm not sure what to do about that other than maybe fiddling with marker sizes. You could of course do it in both orders and compare.
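One way to try the marker-size idea is to draw the first set with larger markers and the second set with smaller ones, so that overlapping regions show both colors. The sketch below uses a made-up stand-in matrix of the same size as blah, and the marker sizes 12 and 6 are arbitrary; it relies on spy accepting a line specification followed by a marker size.

```matlab
% Stand-in data with the same size as the matrix in the question.
blah = zeros(13, 8198);
blah(3, 100:200:8000) = 1;          % some positive entries
blah(3, 150:200:8000) = -1;         % some negative entries nearby

pos = max(blah, 0);                 % positive part only
neg = min(blah, 0);                 % negative part only

figure;
hold on;
spy(pos, 'b.', 12);                 % larger blue markers drawn first
spy(neg, 'r.', 6);                  % smaller red markers drawn on top
daspect([400 1 1]);                 % stretch the short axis, as above
```

Swapping the two sizes and the plotting order gives the complementary view.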