How to quickly/easily merge and average data in matrix in MATLAB? - matlab

I have got a matrix of AirFuelRatio values at certain engine speeds and throttlepositions. (eg. the AFR is 14 at 2500rpm and 60% throttle)
The matrix is now 25x10, and the engine speed ranges from 1200-6000rpm with interval 200rpm, the throttle range from 0.1-1 with interval 0.1.
Say i have measured new values, eg. an AFR of 13.5 at 2138rpm and 74,3% throttle, how do i merge that in the matrix? The matrix closest values are 2000 or 2200rpm and 70 or 80% throttle. Also i don't want new data to replace the older data. How can i make the matrix take this value in and adjust its values to take the new value in account?
Simplified i have the following x-axis values(top row) and 1x4 matrix(below):
2 4 6 8
14 16 18 20
I just measured an AFR value of 15.5 at 3 rpm. If you interpolate the AFR matrix you would've gotten a 15, so this value is out of the ordinary.
I want the matrix to take this data and adjust the other variables to it, ie. average everything so that the more data i put in the more reliable and accurate the matrix becomes. So in the simplified case the matrix would become something like:
2 4 6 8
14.3 16.3 18.2 20.1
So it averages between old and new data. I've read the documentation about concatenation but i believe my problem can't be solved with that function.
EDIT: To clarify my question, the following visual clarification.
The 'matrix' keeps the same size of 5 points whil a new data point is added. It takes the new data in account and adjusts the matrix accordingly. This is what i'm trying to achieve. The more scatterd data i get, the more accurate the matrix becomes. (and yes the green dot in this case would be an outlier, but it explains my case)
Cheers

This is not a matter of simple merge/average. I don't think there's a quick method to do this unless you have simplifying assumptions. What you want is a statistical inference of the underlying trend. I suggest using Gaussian process regression to solve this problem. There's a great MATLAB toolbox by Rasmussen and Williams called GPML. http://www.gaussianprocess.org/gpml/

This sounds more like a data fitting task to me. What you are suggesting is that you have a set of measurements for which you wish to get the best linear fit. Instead of producing a table of data, what you need is a table of values, and then find the best fit to those values. So, for example, I could create a matrix, A, which has all of the recorded values. Let's start with:
A=[2,14;3,15.5;4,16;6,18;8,20];
I now need a matrix of points for the inputs to my fitting curve (which, in this instance, lets assume it is linear, so is the set of values 1 and x)
B=[ones(size(A,1),1), A(:,1)];
We can find the linear fit parameters (where it cuts the y-axis and the gradient) using:
B\A(:,2)
Or, if you want the points that the line goes through for the values of x:
B*(B\A(:,2))
This results in the points:
2,14.1897 3,15.1552 4,16.1207 6,18.0517 8,19.9828
which represents the best fit line through these points.
You can manually extend this to polynomial fitting if you want, or you can use the Matlab function polyfit. To manually extend the process you should use a revised B matrix. You can also produce only a specified set of points in the last line. The complete code would then be:
% Original measurements - could be read in from a file,
% but for this example we will set it to a matrix
% Note that not all tabulated values need to be present
A=[2,14; 3,15.5; 4,16; 5,17; 8,20];
% Now create the polynomial values of x corresponding to
% the data points. Choosing a second order polynomial...
B=[ones(size(A,1),1), A(:,1), A(:,1).^2];
% Find the polynomial coefficients for the best fit curve
coeffs=B\A(:,2);
% Now generate a table of values at specific points
% First define the x-values
tabinds = 2:2:8;
% Then generate the polynomial values of x
tabpolys=[ones(length(tabinds),1), tabinds', (tabinds').^2];
% Finally, multiply by the coefficients found
curve_table = [tabinds', tabpolys*coeffs];
% and display the results
disp(curve_table);

Related

How to use 3D surface data from cftool in Simulink lookup table?

I am designing a battery model with an internal resistance which is dependant on two variables: SoC and temperature.
I have interpolated the data I have (x,y and z basically - a total of 131 points each) with MATLAB's curve fitting toolbox and was able to generate the desired 3D map of that dependence (see the picture below):
My question is how can I use that map now for my Simulink model? As input parameters I will have SoC and temperature and the resistance in ohm should be the output. However, I have not been able to find a convenient way to export the data in a suitable lookup table (or similarly useful, my first guess was that I should use a 2-D lookup table in this case) in Simulink. However, I am quite new to this and I do not know how to generate the the table data for the Simulink LUT.
Simulink LUT:
Table data is your interpolated z-data from curve fitting. I guess it will have a value for every combination of breakpoints (i.e. it covers every grid intersection in your first diagram). So if Breakpoint 1 is 100 elements and Breakpoint 2 is 40 elements, Table data is 100x40.
If you can't get the data out from the GUI-based interactive curve fit, I guess you can extract the data from the command line. The following is an excerpt of Mathworks' curve fitting documentation. It would be good to verify this because I don't have the toolbox to test it though.
•Interpolation: fittedmodel = fit([Time,Temperature], Energy, 'cubicinterp');
•Evaluation: fittedmodel(80, 40)
Based on your LUT inputs u1 and u2, the table will interpolate or extrapolate the grid to get your output value.
Hope that helps.
I did find a solution after all, thanks Tom for your help, the fittedmodel() function was indeed the key of it. I then used two FOR loops to populate my matrix which was 49x51 (as seen by the grid in the image) after the cftool interpolation. After that it was all a matter of two for loops in one another to populate my matrix with the z values of my T and SoC parameters.
for x = 1:49
for y = 1:51
TableData(x,y)=fittedmodel(B_SoC(x),B_Temp(y));
end
end
Where TableData is the 49x51 matrix required for my LUT, B_SoC and B_Temp being [0:2.083:100] and [-10:1.1:45] respectively (determined as the desired start and end of my x and y axis with the spacing taken from the image with the data cursor).

How to apply a moving median filter on a time series of 2D scans in Matlab?

I have a huge set of data of a timelapse of 2D laser scans of waves running up and down stairs (see fig.1fig.2fig.3).
There is a lot of noise in the scans, since the water splashes a lot.
Now I want to smoothen the scans.
I have 2 questions:
How do I apply a moving median filter (as recommended by another study dealing with a similar problem)? I can only find instructions for single e.g. (x,y) or (t,y) plots but not for x and y values that vary over time. Maybe an average filter would do it as well, but I do not have a clue on that either.
The scanner is at a fixed point (222m) so all the data spikes point towards that point at the ceiling. Is it possible or necessary to include this into the smoothing process?
This is the part of the code (I hope it's enough to get it):
% Plot data as real time profile
x1=data.x;y1=data.y;
t=data.t;
% add moving median filter here?
h1=plot(x1(1,:),y1(1,:));
axis([210 235 3 9])
ht=title('Scanner data');
for i=1:1:length(t);
set(h1,'XData',x1(i,:),'YData',y1(i,:));set(ht,'String',sprintf('t = %5.2f
s',data.t(i)));pause(.01);end
The data.x values are stored in a (mxn) matrix in which the change in time is arranged vertically and the x values i.e. "laser points" of the scanner are horizontally arranged. The data.y is stored in the same way. The data.t values are stored in a (mx1) matrix.
I hope I explained everything clearly and that somebody can help me. I am already pretty desperate about it... If there is anything missing or confusing, please let me know.
If you're trying to apply a median filter in the x-y plane, then consider using medfilt2 from the Image Processing Toolbox. Note that this function only accepts 2-D inputs, so you'll have to loop over the third dimension.
Also note that medfilt2 assumes that the x and y data are uniformly spaced, so if your x and y data don't fall onto a uniformly spaced grid you may have to manually loop over indices, extract the corresponding patches, and compute the median.
If you can/want to apply an averaging filter instead of a median filter, and if you have uniformly spaced data, then you can use convn to compute a k x k moving average by doing:
y = convn(x, ones(k,k)/(k*k), 'same');
Note that you'll get some bias on the boundaries because you're technically trying to compute an average of k^2 pixels when you have less than that number of values available.
Alternatively, you can use nested calls to movmean since the averaging operation is separable:
y = movmean(movmean(x, k, 2), k, 1);
If your grid is separable, but not uniform, you can still use movmean, just use the SamplePoints name-value pair:
y = movmean(movmean(x, k, 2, 'SamplePoints', yv), k, 1, 'SamplePoints', xv);
You can also control the endpoint handling in movmean with the Endpoints name-value pair.

How can I classify my data for K-Means Clustering

A proof of concept prototype I have to do for my final year project is to implement K-Means Clustering on a big data set and display the results on a graph. I only know object-oriented languages like Java and C# and decided to give MATLAB a try. I notice that with a functional language the approach to solving problems is very different, so I would like some insight on a few things if possible.
Suppose I have the following data set:
raw_data
400.39 513.29 499.99 466.62 396.67
234.78 231.92 215.82 203.93 290.43
15.07 14.08 12.27 13.21 13.15
334.02 328.79 272.2 306.99 347.79
49.88 52.2 66.35 47.69 47.86
732.88 744.62 687.53 699.63 694.98
And I picked row 2 and 4 to be the 2 centroids:
centroids
234.78 231.92 215.82 203.93 290.43 % Centroid 1
334.02 328.79 272.2 306.99 347.79 % Centroid 2
I want to now compute the euclidean distances of each point to each centroid, then assign each point to it's closest centroid and display this on a graph. Let's say I want I want to classify the centroids as blue and green. How can I do this in MATLAB? If this was Java I would initialise each row as an object and add to separate ArrayLists (representing the clusters).
If rows 1, 2 and 3 all belong to the first centroid / cluster, and rows 4, 5 and 6 belong to the second centroid / cluster - how can I classify these to display them as blue or green points on a graph? I am new to MATLAB and really curious about this. Thanks for any help.
(To begin with, Matlab has a flexible distance measuring function, pdist2 and also kmeans implementation, but I'm assuming that you want to build your code from scratch).
In Matlab, you try to implement everything as matrix algebra, without loops over elements.
In your case, if R is the raw_data matrix and C is the centroids matrix,
you can shift the dimension that represents centroid number to the 3rd place by
permC=permute(C,[3 2 1]); Then the bsxfun function allows you to subtract C from R while expanding R's third dimension as necessary: D=bsxfun(#minus,R,permC). Element-wise square followed by summation across columns SqD=sum(D.^2,2) will give you the squared distances of each observation from each centroid. Performing all these operations within a single statement and shifting the third (centroid) dimension back to the 2nd place will look like this:
SqD=permute(sum(bsxfun(#minus,R,permute(C,[3 2 1])).^2,2),[1 3 2])
Picking the centroid of minimal distance is now straightforward: [minDist,minCentroid]=min(SqD,[],2)
If this looks complex, I recommend inspecting the product of each sub-step and reading the help of each command.

3-D Plotting with MATLAB for Galton's Skewness and Moor's Kurtosis

I know there are many plotting documents for Matlab online and I am pretty sure that it has been asked many times. I aplogize in advance for any inconvenience.
I am dealing with a new distribution and I need to draw 3D plot for different values of parameters (I can do it with Excel or any other programs, however, since my other graphs is drawn with MATLAB, and I need to put this 3D in Matlab, too, to publish it as an article). I calculated the result using MATLAB loops, however, plotting gives me the hardest time. I had no other choice but to ask for your assistance. I have these equations for different alphas and betas with a constant sigma and calculate Galton's Skewness and Moor's Kurtosis given with the last two equations.
median=sqrt(2*(sigma^2)*beta*gammaincinv(0.5,alpha));
q1=sqrt(2*(sigma^2)*beta*gammaincinv((6/8),alpha));
q3=sqrt(2*(sigma^2)*beta*gammaincinv((2/8),alpha));
q4=sqrt(2*(sigma^2)*beta*gammaincinv((7/8),alpha));
q5=sqrt(2*(sigma^2)*beta*gammaincinv((5/8),alpha));
q6=sqrt(2*(sigma^2)*beta*gammaincinv((3/8),alpha));
q7=sqrt(2*(sigma^2)*beta*gammaincinv((1/8),alpha));
galtonskewness=(q1-2*median+q3)/(q1-q3);
moorskurtosis=(q4-q5+q6-q7)/(q1-q3);
Let's assume that,
sigma=1
beta=[0.1 0.2 0.5 1 2 5];
alpha=[0.1 0.2 0.5 1 2 5];
I have used mesh(X,Y,Z) for the same range of alphas and betas with the same increment but I take the error "these values cannot be complex". I just want to draw something like the one below.
It must be something easy that I am missing out, but I do not understand where the mistake is. I appreciate any help. Thank you!
I ran the above code for a 2D mesh of points for alpha and beta between 0.1 and 5 for both dimensions and I got results for both.
I suspect it's due to your alpha and beta declaration. You are only providing a few points, and if you try to use mesh, it won't get good results. Therefore, define a meshgrid of points for both alpha and beta, then vectorize your MATLAB code to produce the kurotsis and skewness curves. Only under certain situations should you use for loops. In general, you should avoid using them whenever possible.
How meshgrid works is that given a range of X and Y values, it will produce two (or three if you want 3D co-ordinates) arrays where each location in each array gives you the spatial co-ordinate at that particular location. Therefore, if we did something like:
[X,Y] = meshgrid(1:3, 1:3);
This is what we get:
X =
1 2 3
1 2 3
1 2 3
Y =
1 1 1
2 2 2
3 3 3
Notice that in a 2D grid, for the top-left corner, (x,y) = (1,1), and so for the corresponding location in X, we get 1 and Y we get 1. If you do the same logic for any other position in the 2D grid, you simply look at the X and Y values in each array and it will tell you what the component is for each dimension.
As such, instead of looping through all possible points in your grid, generate them all using meshgrid, then vectorize the computation by calculating your values all at once rather than individually. Once you do this, you have the right structure to be able to put this into mesh.
Therefore, try doing this instead:
%// Define meshgrid of points
[alpha,beta] = meshgrid(0.1:0.1:5, 0.1:0.1:5);
%// From your code
sigma = 1;
%// Calculate quantities - Notice that this is all vectorized
med=sqrt(2*(sigma^2)*beta.*gammaincinv(0.5,alpha));
q1=sqrt(2*(sigma^2)*beta.*gammaincinv((6/8),alpha));
q3=sqrt(2*(sigma^2)*beta.*gammaincinv((2/8),alpha));
q4=sqrt(2*(sigma^2)*beta.*gammaincinv((7/8),alpha));
q5=sqrt(2*(sigma^2)*beta.*gammaincinv((5/8),alpha));
q6=sqrt(2*(sigma^2)*beta.*gammaincinv((3/8),alpha));
q7=sqrt(2*(sigma^2)*beta.*gammaincinv((1/8),alpha));
galtonskewness=(q1-2*med+q3)./(q1-q3);
moorskurtosis=(q4-q5+q6-q7)./(q1-q3);
%// Show our meshes
figure;
mesh(alpha, beta, galtonskewness);
figure;
mesh(alpha, beta, moorskurtosis);
Also take note that I renamed your median variable to med. MATLAB has a function called median and so you don't want to unintentionally shadow over this function with a variable of the same name.
This is what I get:
Take note that I'm not getting the plots that you have placed in your post. It may be because I'm choosing the wrong variables to define the mesh, or perhaps your equations may be incorrect. Double check what you know in theory to what you have here in code and try again.
This should hopefully give you enough to start with though!

How to resample with interp1 in Matlab when input vectors are of different length

I have two variables in a .mat file here:
https://www.yousendit.com/download/UW13UGhVQXA4NVVQWWNUQw
testz is a vector of cumulative distance (in meters, monotonically and regularly increasing)
testSDT is a vector of integrated (cumulative) sound wave travel time (in milliseconds) generated using the distance vector and a vector of velocities
(there is an intermediate step of creating interval travel times)
Since velocity is a continuously variable function the resulting interval travelt times and also the integrated travel times are non integers and variable in magnitude
What I want is to resample the distance vector at regular time intervals (e.g. 1 ms, 2 ms, ..., n ms)
What makes it difficult is that the maximum travel time, 994.6659, is less than the number of samples in the 2 vectors, therefore it is not straightforward to use interp1.
i.e.:
X=testSDT -> 1680 samples
Y=testz -> 1680 samples
XI=[1:1:994] -> 994 samples
This is the code I've come up with. It is a working code and it is not too bad I think.
%% Initial chores
M=fix(max(testSDT));
L=(1:1:M);
%% Create indices
% this loops finds the samples in the integrated travel time vector
% that are closest to integer milliseconds and their sample number
for i=1:M
[cl(i) ind(i)] = min(abs(testSDT-L(i)));
nearest(i) = testSDT(ind(i));
end
%% Remove duplicates
% this is necessary to remove duplicates in the index vector (happens in this test).
% For example: 2.5 ms would be the closest to both 2 ms and 2 ms
[clsst,ia,ic] = unique(nearest);
idx=(ind(ia));
%% Interpolation
% this uses the index vectors to resample the depth vectors at
% integer times
newz=interp1(clsst,testz(idx),[1:1:length(idx)],'cubic')';
As far as I can see there is one issue with this code:
I rely on the vector idx as my XI for interpolation. Vector idx is 1 sample shorter than vector ind (one duplicate was removed).
Therefore my new times will stop one millisecond short. This is a very small issue, and duplicate are unlikely but I am wondering if anybody can think of a workaround, or of a different way to approach the problem altogether.
Thank you
If I understand you correctly, you want to extrapolate to that extra point.
you can do this is many ways, one is to add that extra point to the interp1 line.
If you have some function you expect to follow your data you can use it by fitting it to the data and then obtaining that extra point or with a tool like fnxtr.
But I have a problem understanding what you want because of the way you used the line. The third argument you use, [1:1:length(idx)], is just the series [1 2 3 ...], usually when interpolating, one uses some vector x_i of points of interest, though I doubt your points of interest happen to be the series of integers 1:length(idx), what you want is just [1:length(idx) xi], where xi is that extra point x-axis value.
EDIT:
Instead of the loop just produce matrix forms out of L and testSDT, then matrix operation is somewhat faster in doing the min(abs(...:
MM=ones(numel(testSDT),1)*L;
TT=testSDT*ones(1,numel(L));
[cl ind]=(min(abs(TT-MM)));
nearest=testSDT(ind);