if (pbcg(k+M) > pbcg(k-1+M) && pbcg(k+M) > pbcg(k+1+M) && pbcg(k+M) > threshold)
peaks_y(Counter) = pbcg(k+M);
peaks_x(Counter) = k + M;
py = peaks_y(Counter);
px = peaks_x(Counter);
plot(px,py,'ro');
Counter = (Counter + 1)-1;
fid = fopen('y1.txt','a');
fprintf(fid, '%d\t%f\n', px, py);
fclose(fid);
end
end
this code previously doesn't have any issue on finding the peak..
the main factor for it to find the only peak is this
if (pbcg(k+M) > pbcg(k-1+M) && pbcg(k+M) > pbcg(k+1+M) && pbcg(k+M) > threshold)
but right now it keep show me all the peak that is above the threshold instead of the particular highest peak..
UPDATE: what if the highest peaks have 4nodes that got the same value?
EDIT:
If multiple peaks with the same value surface, I will take the value at the middle and plot.
What I mean by that is for example [1,1,1,4,4,4,2,2,2]
I will take the '4' at the 5th position, so the plot will be at the center of the graph u see
It will be much faster and much more readable to use the built-in max function, and then test if the max value is larger than the threshold.
[C,I] = max(pbcg);
if C > threshold
...
%// I is the index of the maximal value, and C is the maximal value.
end
As alternative solution, you may evaluate the idea of using the built-in function findpeaks, which encompasses several methods to ascertain the existance of peaks within a given signal. Within thos methods you may call
findPeaks = findpeaks(data,'threshold',threshold_resolution);
The only limit I see is that findpeaks is only available with the Signal Processing Toolbox.
EDIT
In case of multiple peaks over the defined threshold, I would just call max to figure the highest peak, as follows
max(peaks);
Assuming you have a vector with peaks pbcg
Here is how you can get the middle one:
highestPeakValue = max(pbcg)
f = find(pbcg == highestPeakValue);
middleHighestPeakLocation = f(ceil(length(f)/2))
Note that you can still make it more robust for cases where you have no peaks, and can adjust it to give different behavior when there are two middle peaks (now it will take the second one)
Related
I've implement a moving maximum filter in matlab and now I have to determine if it's LTI or not. I've already prove it's linear but I can't prove if it's time ivnariant. To prove this I have to give as an input a signal x[n] and get the output X[n]. After that I'm giving the same signal but this time it have to delay it that is x[n-a] and the output must be X[n-a] the same as the X[n] but delayed. My problem is that I don't know what input to give. Also, the moving maximum is a LTI system or not?
Thanks in advance.
Max-Filter Using For-Loop and Testing System for Linear and Time-Invariance
Filter Construction:
To filter an array/matrix/signal a for-loop can be applied. Within this for-loop, portions of the signal that overlaps the window must be stored in this case called Window_Sample. To apply the maximum filter the max() of the Window_Sample must be taken upon each iteration of the loop. Upon each iteration of the loop the window moves. Here I push all the code that calculates the maximum filtered signal into a function called Max_Filter().
Time-Invariant or Not-Time Invariant:
With proper padding, I think the maximum filter is time-invariant/shift-invariant. Concatenating zeros to the start and ends of the signal is a good way to combat this. The amount of padding I used on each end is the size of the window, the requirement would be half the window but there's no harm in having a little bit of buffer. Unfortunately, without proper padding, this may appear to break down in MATLAB since the max filter will be introduced to points early. In this case without padding the ramp the max filter will take the 5th point as the max right away. Here you can apply a scaling factor to test if the system is linear. You'll see that the system is not linear since the output is not a scaled version of the input.
clf;
%**********************************************************%
%ADJUSTABLE PARAMETERS%
%**********************************************************%
Window_Size = 3; %Window size for filter%
Delay_Size = 5; %Delay for the second test signal%
Scaling_Factor = 1; %Scaling factor for the second test signal%
n = (0:10);
%Ramp function with padding%
Signal = n; %Function for test signal%
%**********************************************************%
Signal = [zeros(1,Window_Size) Signal zeros(1,Window_Size)];
Filtered_Result_1 = Max_Filter(Signal,Window_Size);
%Ramp function with additional delay%
Delayed_Signal = Scaling_Factor.*[zeros(1,Delay_Size) Signal];
Filtered_Result_2 = Max_Filter(Delayed_Signal,Window_Size);
plot(Filtered_Result_1);
hold on
plot(Filtered_Result_2);
xlabel("Samples [n]"); ylabel("Amplitude");
title("Moving Maximum Filtering Signal");
legend("Original Signal","Delayed Signal");
xticks([0: 1: length(Filtered_Result_2)]);
xlim([0 length(Filtered_Result_2)]);
grid;
function [Filtered_Signal] = Max_Filter(Signal,Filter_Size)
%Initilizing an array to store filtered signal%
Filtered_Signal = zeros(1,length(Signal));
%Filtering array using for-loop and acccesing chunks%
for Sample_Index = 1: length(Signal)
Start_Of_Window = Sample_Index - floor(Filter_Size/2);
End_Of_Window = Sample_Index + floor(Filter_Size/2);
if Start_Of_Window < 1
Start_Of_Window = 1;
end
if End_Of_Window > length(Signal)
End_Of_Window = length(Signal);
end
Window_Sample = Signal(Start_Of_Window:End_Of_Window);
Filtered_Signal(1,Sample_Index) = max(Window_Sample);
end
end
Alternatively Using Built-In Functions:
To test the results of the created filter it is a good idea to compare this to the output results of MATLAB's built-in function movmax(). Here the results appear to be the same as the for-loop implementation.
clf;
Window_Size = 3;
n = (0:10);
%Ramp function with padding%
Signal = n;
Signal = [zeros(1,Window_Size) Signal zeros(1,Window_Size)];
Filtered_Result_1 = movmax(Signal,Window_Size);
%Ramp function with additional delay%
Delay_Size = 5;
Delayed_Signal = [zeros(1,Delay_Size) Signal];
Filtered_Result_2 = movmax(Delayed_Signal,Window_Size);
plot(Filtered_Result_1);
hold on
plot(Filtered_Result_2);
xlabel("Samples [n]"); ylabel("Amplitude");
title("Moving Maximum Filtering Signal");
legend("Original Signal","Delayed Signal");
xticks([0: 1: length(Filtered_Result_2)]);
xlim([0 length(Filtered_Result_2)]);
grid;
Ran using MATLAB R2019b
I have a scatter plot of approximately 30,000 pts, all of which lie above a horizontal line which I've visually defined in my plot. My goal now is to sum the vertical distance of all of these points to this horizontal line.
The data was read in from a .csv file and is already saved to the workspace, but I also need to check whether a value is NaN, and ignore these.
This is where I'm at right now:
vert_deviation = 0;
idx = 1;
while idx <= numel(my_data(:,5)) && isnan(idx) == 0
vert_deviation = vert_deviation + ((my_data(idx,5) - horiz_line_y_val));
idx = idx + 1;
end
I know that a prerequisite of using the && operator is having two logical statements I believe, but I'm not sure how to rewrite this loop in this way at the moment. I also don't understant why vert_deviation returns NaN at the moment, but I assume this might have to do with the first mistake I described...
I would really appreciate some guidance here - thank you in advance!
EDIT: The 'horizontal line' is a slight oversimplification - in reality the lower limit I need to find the distance to consists of 6 different line segments
I should have specified that the lower limit to which I need to calculate the distance for all scatterplot points varies for different x values (the horizontal line snippet was meant to be a simplification but may have been misleading... apologies for that)
I first modified the data I had already read into the workspace by replacing all NaNvalues with 0. Next, I wrote a while loop which defines the number if indexes to loop through, and defined an && condition to filter out any zeroes. I then wrote a nested if loop which checks what range of x values the given index falls into, and subsequently takes the delta between the y values of a linear line lower limit for that section of the plot and the given point. I repeated this for all points.
while idx <= numel(my_data(:,3)) && not(my_data(idx,3) == 0)
...
if my_data(idx,3) < upper_x_lim && my_data(idx,5) > lower_x_lim
vert_deviation = vert_deviation + (my_data(idx,4) - (m6 * (my_data(idx,5)) + b6))
end
...
m6 and b6 in this case are the slope and y intercept calculated for one section of the plot. The if loop is repeated six times for each section of the lower limit.
I'm sure there are more elegant ways to do this, so I'm open to any feedback if there's room for improvement!
Your loop doesn't exclude NaN values becuase isnan(idx) == 0 checks to see if the index is NaN, rather than checking if the data point is NaN. Instead, check for isnan(my_data(idx,5)).
Also, you can simplify your code using for instead of while:
vert_deviation = 0;
for idx=1:size(my_data,1)
if !isnan(my_data(idx,5))
vert_deviation = vert_deviation + ((my_data(idx,5) - horiz_line_y_val));
end
end
As #Adriaan suggested, you can remove the loop altogether, but it seems that the code in the OP is an oversimplification of the problem. Looking at the additional code posted, I guess it is still possible to remove the loops, but I'm not certain it will be a significant speed improvement. Just use a loop.
Apologies for the long post but this takes a bit to explain. I'm trying to make a script that finds the longest linear portion of a plot. Sample data is in a csv file here, it is stress and strain data for calculating the shear modulus of 3D printed samples. The code I have so far is the following:
x_data = [];
y_data = [];
x_data = Data(:,1);
y_data = Data(:,2);
plot(x_data,y_data);
grid on;
answer1 = questdlg('Would you like to load last attempt''s numbers?');
switch answer1
case 'Yes'
[sim_slopes,reg_data] = regr_and_longest_part(new_x_data,new_y_data,str2num(answer2{3}),str2num(answer2{2}),K);
case 'No'
disp('Take a look at the plot, find a range estimate, and press any button to continue');
pause;
prompt = {'Eliminate values ABOVE this x-value:','Eliminate values BELOW this x-value:','Size of divisions on x-axis:','Factor for similarity of slopes:'};
dlg_title = 'Point elimination';
num_lines = 1;
defaultans = {'0','0','0','0.1'};
if isempty(answer2) < 1
defaultans = {answer2{1},answer2{2},answer2{3},answer2{4}};
end
answer2 = inputdlg(prompt,dlg_title,num_lines,defaultans);
uv_of_x_range = str2num(answer2{1});
lv_of_x_range = str2num(answer2{2});
x_div_size = str2num(answer2{3});
K = str2num(answer2{4});
close all;
iB = find(x_data > str2num(answer2{1}),1,'first');
iS = find(x_data > str2num(answer2{2}),1,'first');
new_x_data = x_data(iS:iB);
new_y_data = y_data(iS:iB);
[sim_slopes, reg_data] = regr_and_longest_part(new_x_data,new_y_data,str2num(answer2{3}),str2num(answer2{2}),K);
end
[longest_section0, Midx]= max(sim_slopes(:,4)-sim_slopes(:,3));
longest_section=1+longest_section0;
long_sec_x_data_start = x_div_size*(sim_slopes(Midx,3)-1)+lv_of_x_range;
long_sec_x_data_end = x_div_size*(sim_slopes(Midx,4)-1)+lv_of_x_range;
long_sec_x_data_start_idx=find(new_x_data >= long_sec_x_data_start,1,'first');
long_sec_x_data_end_idx=find(new_x_data >= long_sec_x_data_end,1,'first');
long_sec_x_data = new_x_data(long_sec_x_data_start_idx:long_sec_x_data_end_idx);
long_sec_y_data = new_y_data(long_sec_x_data_start_idx:long_sec_x_data_end_idx);
[b_long_sec, longes_section_reg_data] = robustfit(long_sec_x_data,long_sec_y_data);
plot(long_sec_x_data,b_long_sec(1)+b_long_sec(2)*long_sec_x_data,'LineWidth',3,'LineStyle',':','Color','k');
function [sim_slopes,reg_data] = regr_and_longest_part(x_points,y_points,x_div,lv,K)
reg_data = cell(1,3);
scatter(x_points,y_points,'.');
grid on;
hold on;
uv = lv+x_div;
ii=0;
while lv <= x_points(end)
if uv > x_points(end)
uv = x_points(end);
end
ii=ii+1;
indices = find(x_points>lv & x_points<uv);
temp_x_points = x_points((indices));
temp_y_points = y_points((indices));
if length(temp_x_points) <= 2
break;
end
[b,stats] = robustfit(temp_x_points,temp_y_points);
reg_data{ii,1} = b(1);
reg_data{ii,2} = b(2);
reg_data{ii,3} = length(indices);
plot(temp_x_points,b(1)+b(2)*temp_x_points,'LineWidth',2);
lv = lv+x_div;
uv = lv+x_div;
end
sim_slopes = NaN(length(reg_data),4);
sim_slopes(1,:) = [reg_data{1,1},0,1,1];
idx=1;
for ii=2:length(reg_data)
coff =sim_slopes(idx,1);
if abs(reg_data{ii,1}-coff) <= K*coff
C=zeros(ii-sim_slopes(idx,3)+1,1);
for kk=sim_slopes(idx,3):ii
C(kk)=reg_data{kk,1};
end
sim_slopes(idx,1)=mean(C);
sim_slopes(idx,2)=std(C);
sim_slopes(idx,4)=ii;
else
idx = idx + 1;
sim_slopes(idx,1)=reg_data{ii,1};
sim_slopes(idx,2)=0;
sim_slopes(idx,3)=ii;
sim_slopes(idx,4)=ii;
end
end
end
Apologies for the code not being well optimized, I'm still relatively new to MATLAB. I did not use derivatives because my data is relatively noisy and derivation might have made it worse.
I've managed to get the get the code to find the longest straight part of the plot by splitting the data up into sections called x_div_size then performing a robustfit on each section, the results of which are written into reg_data. The code then runs through reg_data and finds which lines have the most similar slopes, determined by the K factor, by calculating the average of the slopes in a section of the plot and makes a note of it in sim_slopes. It then finds the longest interval with max(sim_slopes(:,4)-sim_slopes(:,3)) and performs a regression on it to give the final answer.
The problem is that it will only consider the first straight portion that it comes across. When the data is plotted, it has a few parts where it seems straightest:
As an example, when I run the script with answer2 = {'0.2','0','0.0038','0.3'} I get the following, where the black line is the straightest part found by the code:
I have the following questions:
It's clear that from about x = 0.04 to x = 0.2 there is a long straight part and I'm not sure why the script is not finding it. Playing around with different values the script always seems to pick the first longest straight part, ignoring subsequent ones.
MATLAB complains that Warning: Iteration limit reached. because there are more than 50 regressions to perform. Is there a way to bypass this limit on robustfit?
When generating sim_slopes there might be section of the plot whose slope is too different from the average of the previous slopes so it gets marked as the end of a long section. But that section sometimes is sandwiched between several other sections on either side which instead have similar slopes. How would it be possible to tell the script to ignore one wayward section and to continue as if it falls within the tolerance allowed by the K value?
Take a look at the Douglas-Peucker algorithm. If you think of your (x,y) values as the vertices of an (open) polygon, this algorithm will simplify it for you, such that the largest distance from the simplified polygon to the original is smaller than some threshold you can choose. The simplified polygon will be the set of straight lines. Find the two vertices that are furthest apart, and you're done.
MATLAB has an implementation in the Mapping Toolbox called reducem. You might also find an implementation on the File Exchange (but be careful, there is also really bad code on there). Or, you can roll your own, it's quite a simple algorithm.
You can also try using the ischange function to detect changes in the intercept and slope of the data, and then extract the longest portion from that.
Using the sample data you provided, here is what I see from a basic attempt:
>> T = readtable('Data.csv');
>> T = rmmissing(T); % Remove rows with NaN
>> T = groupsummary(T,'Var1','mean'); % Average duplicate timestamps
>> [tf,slopes,intercepts] = ischange(T.mean_Var2, 'linear', 'SamplePoints', T.Var1); % find changes
>> plot(T.Var1, T.mean_Var2, T.Var1, slopes.*T.Var1 + intercepts)
which generates the plot
You should be able to extract the longest segment based on the indices given by find(tf).
You can also tune the parameters of ischange to get fewer or more segments. Adding the name-value pair 'MaxNumChanges' with a value of 4 or 5 produces more linear segments with a tighter fit to the curve, for example, which effectively removes the kink in the plot that you see.
I have written a function in MATLAB to count the number of zero crossings given a vector of signal data. If I find a zero crossing, I also check whether the absolute difference between the two vector indices involved is greater than a threshold value - this is to try to reduce the influence of signal noise.
zc = [];
thresh = 2;
for i = 1:length(v)-1
if ( (v(i)>0 && v(i+1)<0) || (v(i)<0 && v(i+1)>0) ) && abs(v(i)-v(i+1)) >= thresh
zc = [zc; i+1];
end
end
zcCount = length(zc);
I used the vector from the zero crossings function here to test it: http://hips.seas.harvard.edu/content/count-zero-crossings-matlab
A = [-0.49840598306643,
1.04975509964655,
-1.67055867973620,
-2.01437026154355,
0.98661592496732,
-0.06048256273708,
1.19294080740269,
2.68558025885591,
0.85373360483580,
1.00554850567375];
It seems to work fine but is there a more efficient way of achieving the same result? E.g. on the above webpage, they simply use the following line to calculate zero crossings:
z=find(diff(v>0)~=0)+1;
Is there a way to incorporate the threshold check into something similarly efficient?
How about
zeroCrossIndex=diff(v>0)~=0
threshholdIndex = diff(v) >= thresh;
zcCount = sum(zeroCrossIndex & threshholdIndex)
In Matlab I've got two matrices,L1 and L2, containing in each row the coordinates (row, column) of several points in 2D space:
L1=[1,1;2,2;3,3];
L2=[4,4;5,5;6,5;7,6;8,7];
After plotting I've got this:
I'm trying to implement an algorithm that could fusion lines that are similarly orientated. I've tried for quite a while. I guess that the simplest way of solving this is to follow these steps:
-First: suppose that L1 and L2 are two segments of the same line(L3).
-Next: Starting from (1,1) (or (8,7)) evaluate the orientation of the next point. In other words, the orientation that point (2,2) has from (1,1), point (3,3,) from (2,2), etc. And save those values.
-Next: Calculate from all orientation values the average value.
-Next: evaluate if the fusion points between fibers, that in this case are (3,3) and (4,4), follow a similar orientation.
-Results: If previous stage is TRUE, then fusion the fibers. If FALSE, do nothing.
One key point here is to establish a reference from which orientation angles can be measured. Maybe this approach is too complicated. I guess there is a simpler way and less memory consuming way. Thank you.
Code -
% We need to set a tolerance value for the similarity of slopes between the
% main data and the "fusion" data. This tolerance is in degrees, so basically
% means that the fiber must be within TOL degrees left or right of the overall data average.
TOL = 10;
% Slightly different and a more general data probably
L1=[2,3;3,5;4,10];
L2=[7,15;8,19;9,21];
L_fiber = [L1(end,:);L2(1,:)];
% Slopes calculation
a1 = diff(L1);
m1 = a1(:,2)./a1(:,1);
a2 = diff(L2);
m2 = a2(:,2)./a2(:,1);
% Overall slope for the main data
m = mean([m1;m2]);
a_fiber = diff(L_fiber);
m_fiber = a_fiber(:,2)./a_fiber(:,1);
m_fiber_mean = mean(m_fiber);
% Checking if the fiber mean is within the limits set by TOL
deg_max = atan(m)*(180/pi) + TOL;
deg_min = atan(m)*(180/pi) - TOL;
slope_max = tan(deg_max*pi/180);
slope_min = tan(deg_min*pi/180);
if m_fiber_mean >= slope_min && m_fiber_mean <= slope_max
out = true;
disp('Yes the fusion matches the overall data');
else
out = false;
disp('No the fusion does not match the overall data');
end
Hope this settles it!