Is it possible to implement a custom distance measure in MATLAB's knnclassify function?
In particular, I want to classify an example using histogram intersection as the distance between two vectors (the vectors are treated as histograms). For two N-dimensional vectors w1 and w2, the distance is:
$\mathrm{dist}(w_1, w_2) = \sum_{i=1}^{N} \min\big(w_1(i), w_2(i)\big)$
Looking at the source of knnclassify, it relies on knnsearch, and the distance parameter you give knnclassify is supplied to that function. knnsearch accepts a custom distance function, passed as a function handle, that compares vectors of the same length taken from the data sets you apply knnclassify to. So you can certainly implement this distance yourself, either as a named function or anonymously:
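function [d] = histogramIntersection(w1, w2)
% w1 is a 1-by-N row; w2 is one or more rows of length N.
% Returns the histogram intersection of w1 with each row of w2.
d = sum(bsxfun(@min, w2, w1), 2);
... or you can do this anonymously:
f = @(w1,w2) sum(bsxfun(@min, w2, w1), 2);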
However, to use this with knnclassify you will have to modify its source and add a case to the switch statement so that histogram intersection becomes a valid choice. Once that is done, you can either pass f to the knnsearch call, or define a histogramIntersection function as above and pass @histogramIntersection to knnclassify in place of the string that normally names the distance measure.
tl;dr: You can do it, but only by modifying the knnclassify source. Alternatively, work out what knnclassify is doing, pull out just the calls relevant to your case, put your histogram intersection function in the right place, and save the result as a new file that you run instead. That way you don't need to touch MATLAB's original source.
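As a rough illustration of the "pull out the relevant calls" route, here is a minimal nearest-neighbour sketch that uses only base MATLAB. It is not the knnclassify implementation. The data, labels and k are made up, and it assumes the histograms are normalised to sum to 1 so that 1 minus the intersection can serve as a distance (the intersection itself grows as histograms become more alike).

```matlab
% Hypothetical training histograms (one per row, each summing to 1) and labels.
train  = [0.7 0.2 0.1; 0.6 0.3 0.1; 0.1 0.2 0.7; 0.2 0.1 0.7];
labels = [1; 1; 2; 2];
query  = [0.65 0.25 0.10];   % histogram to classify
k      = 3;                  % number of neighbours (assumed)

% Intersection of one row ZI with every row of ZJ.
histInt  = @(ZI, ZJ) sum(bsxfun(@min, ZJ, ZI), 2);
% Assumption: normalised histograms, so 1 - intersection behaves like a distance.
histDist = @(ZI, ZJ) 1 - histInt(ZI, ZJ);

d = histDist(query, train);             % distance from the query to each training row
[~, order] = sort(d);                   % nearest rows first
predicted = mode(labels(order(1:k)));   % majority vote among the k nearest
```

If the Statistics Toolbox is available, the same histDist handle should also be usable as the 'Distance' argument of knnsearch, since it takes one row and a matrix of rows and returns one distance per row.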
Related
Suppose I have two functions on a grid [X,Y]=meshgrid(x,y),
f=f(x,y) and g=g(x,y)
and I want to calculate
sqrt(f^2+g^2)
in two different ways:
Expanding f^2 and g^2 symbolically first and then applying
V1=(f.^2+g.^2).^(1/2).
Using f=f(X,Y), g=g(X,Y) (that is, computing f and g on the grid first) and then applying the pointwise operation to the matrices
V2=(f.^2+g.^2).^(1/2)
Unfortunately, when I calculate abs(V1-V2), I don't get zero in all positions.
In my case, the functions have the form
f=f(x,y,cos(x),cos(y),cos(x)cos(y))
and the same for g.
What is the best way to calculate these operations so that the result is as close as possible to the analytical function, without changing the grid?
I have a data set that contains both categorical and numerical features for each row. I would like to select a different similarity metric for each feature (column) and perform hierarchical clustering on the data. Is there a way to do that in MATLAB?
Yes, this is actually fairly straightforward: linkage, which creates the tree, takes as input a dissimilarity matrix. Consequently, in the example workflow below
Y = pdist(X,'cityblock');
Z = linkage(Y,'average');
T = cluster(Z,'cutoff',c)   % c is the cutoff threshold you choose
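you simply replace the call to pdist with a call to your own function that calculates the pairwise dissimilarity between rows. Its output must be in the same vector form that pdist returns (squareform converts a square dissimilarity matrix to that form); everything else stays the same.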
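One way to get a per-column metric without writing the pairwise loop yourself is to hand pdist a custom distance function. The sketch below is only an illustration: the column layout, the integer coding of the categorical column and the choice of metrics (city-block for numeric columns, mismatch count for categorical ones) are all assumptions, and pdist, linkage and cluster require the Statistics Toolbox.

```matlab
% Hypothetical data: columns 1-2 numeric, column 3 categorical (integer codes).
X = [0.1 0.2 1;
     0.2 0.1 1;
     0.9 0.8 2;
     1.0 0.9 2];
numCols = [1 2];
catCols = 3;

% Custom distance in the form pdist expects: XI is one row, XJ is a block
% of rows, and the result has one distance per row of XJ.
% Numeric part: city-block distance. Categorical part: number of mismatches.
mixedDist = @(XI, XJ) ...
    sum(abs(bsxfun(@minus, XJ(:, numCols), XI(:, numCols))), 2) + ...
    sum(bsxfun(@ne, XJ(:, catCols), XI(:, catCols)), 2);

Y = pdist(X, mixedDist);         % pairwise dissimilarities in pdist's vector form
Z = linkage(Y, 'average');       % build the tree
T = cluster(Z, 'maxclust', 2);   % cut the tree into two clusters
```

In practice the numeric columns would usually be scaled first so that a single categorical mismatch does not dominate or vanish next to them.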
I want to plot some equations and inequalities such as x>=50, y>=0, 4x-5y>=8, x=40, x=60, y=25, y=45 in MATLAB and get the area of the region produced by intersecting them. Is this possible in MATLAB? If so, can someone point me to a guide? If not, is there other software that can do this?
Integrals would work for your purposes, provided you know the points at which the curves intersect (something Matlab is also able to compute). Take a look at the documentation on the integral function.
q = integral(fun,xmin,xmax) approximates the integral of function fun
from xmin to xmax using global adaptive quadrature and default error
tolerances.
EDIT: As an additional resource, take a look at the code provided by user Grzegorz Konz on the Mathworks blog.
EDIT #2: I'm not familiar with any Matlab functions that'll take a vector of functions and return the points of intersection (if any) between all the curves. Users have produced functions that return the set of intersection points between two curves. You could run this function for each pair of equations in your list and use a function like polyarea to compute the area of the enclosed region if the curves are all straight lines.
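For straight lines, the pairwise approach can be sketched with base MATLAB alone. The example below uses two of the lines from the question to show an intersection, then the four axis-parallel lines to show polyarea. Which region the full set of constraints actually encloses is not worked out here, so the vertex list is purely illustrative.

```matlab
% Each line is written as a*x + b*y = c.
% Intersection of 4x - 5y = 8 with x = 60: solve the 2-by-2 linear system.
A = [4 -5;
     1  0];
c = [8; 60];
p = A \ c;                 % p(1) is x, p(2) is y

% Vertices of the rectangle bounded by x = 40, x = 60, y = 25, y = 45,
% listed in order around the boundary.
xv = [40 60 60 40];
yv = [25 25 45 45];
regionArea = polyarea(xv, yv);

% Visual check of the region.
fill(xv, yv, [0.8 0.9 1.0]);
xlabel('x'); ylabel('y');
```

Repeating the A \ c step for each pair of bounding lines gives the corner points; only the corners that satisfy all inequalities belong in the vertex list passed to polyarea.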
I would like to use knnimpute to fill in some missing values in my dataset. The thing is, I would like to use my own distance function instead of the typical ones (Euclidean, Manhattan...).
From what I've read, knnimpute allows me to pass a function handle that calculates the distance, in my case according to the Heterogeneous Euclidean-Overlap Metric (HEOM).
I've implemented this metric as a regular function, but not as a function handle. So I cannot simply use the distance matrix from my "normal" function, because the computation has to happen inside knnimpute, somehow, through a handle...
I'm confused; can someone help me understand what I need to do?
As long as your implementation of a distance function has the same signature as the standard distance functions, then you should be able to easily pass your function in.
The knnimpute documentation (matlab knnimpute) states that you can pass "A handle to a distance function, specified using @, for example, @distfun." It then refers the reader to the pdist function, whose documentation (matlab pdist) gives more details about the custom distance function:
A distance function specified using @:
D = pdist(X,@distfun)
A distance function must be of form
d2 = distfun(XI,XJ)
taking as arguments a 1-by-n vector XI, corresponding to a single row of X, and an m2-by-n matrix XJ, corresponding to multiple rows of X. distfun must accept a matrix XJ with an arbitrary number of rows. distfun must return an m2-by-1 vector of distances d2, whose kth element is the distance between XI and XJ(k,:).
So as long as your distance function, as defined in your *.m file, matches this signature and can therefore handle these inputs, there shouldn't be any problems.
Suppose your distance function is in the file mydistFunc.m and its signature matches the requirements above. Then all you should need to do is:
% call knnimpute with the data and your function
knnimpute(inputData,'Distance',@mydistFunc);
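To make the required signature concrete, here is a minimal HEOM-style sketch written as anonymous functions. It is not the asker's implementation: the column layout, the integer coding of the categorical column and the numeric ranges are hypothetical, and missing values are assumed to be stored as NaN and given the maximum per-attribute distance of 1.

```matlab
% Hypothetical layout: columns 1-2 numeric, column 3 categorical (integer codes).
numCols = [1 2];
catCols = 3;
ranges  = [10 4];   % assumed (max - min) of each numeric column

% Range-normalised absolute difference for the numeric columns.
numDiff = @(XI, XJ) bsxfun(@rdivide, ...
    abs(bsxfun(@minus, XJ(:, numCols), XI(:, numCols))), ranges);
% max ignores NaN, so a missing entry is replaced by the indicator value 1.
numPart = @(XI, XJ) max(numDiff(XI, XJ), double(isnan(numDiff(XI, XJ))));

% Overlap: 0 when the codes match, 1 otherwise (NaN never matches, so missing gives 1).
catPart = @(XI, XJ) double(bsxfun(@ne, XJ(:, catCols), XI(:, catCols)));

% HEOM: square root of the sum of squared per-attribute distances.
heom = @(XI, XJ) sqrt(sum(numPart(XI, XJ).^2, 2) + sum(catPart(XI, XJ).^2, 2));

% Direct call in the required form: one row XI against several rows XJ.
XI = [2 1 1];
XJ = [7 3   1;
      2 NaN 2];
d2 = heom(XI, XJ);   % one distance per row of XJ
```

Because heom is already a function handle, it would be passed without the @, for example as the 'Distance' argument in the knnimpute call above. It is worth checking how knnimpute orients your data before relying on a row-based column layout like this one.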
If I have an RC circuit with transfer function 1/(1+sRC) how do I draw the transfer function using MATLAB?
Num2=[1];
Den2=[R*C 1];
RCcirc=tf(Num2,Den2);
How do I declare the R and the C so that there are no errors?
tf is the wrong tool for plotting the transfer function. Try these instead:
Use linspace to generate a range of values for s. Give R and C reasonable values of your choice.
Read up on arithmetic operations in MATLAB, especially ./
Look at how to use plot and familiarize yourself with the command using some simple examples from the docs.
With these you should be able to plot the transfer function in MATLAB :)
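The steps above can be sketched as follows. The component values are arbitrary, and the sketch assumes you want the magnitude of the frequency response, i.e. the transfer function evaluated at s = jw, which the original question does not spell out.

```matlab
R = 1e3;    % resistance in ohms (assumed value)
C = 1e-6;   % capacitance in farads (assumed value)

w = linspace(0, 1e4, 1001);   % angular frequency in rad/s
s = 1j * w;                   % evaluate along the imaginary axis
H = 1 ./ (1 + s * R * C);     % element-wise division over the whole range
mag = abs(H);                 % magnitude of the response

plot(w, mag);
xlabel('\omega (rad/s)');
ylabel('|H(j\omega)|');
title('RC low-pass magnitude response');
```

With these values the corner frequency is 1/(RC) = 1000 rad/s, where the magnitude should have dropped to about 0.707.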
First of all, you need to understand which transfer function you want. Without defined values of R and C you won't get a plottable transfer function. Compare it with plotting a sine wave, x = sin(w*t): I hope you agree that you cannot plot such a function (including axes) unless I specify, for example, that t is the time, ranging from 0 seconds to 10 seconds, and w is an angular frequency of 1 rad/s. It's exactly the same with your RC network: without any values, it is impossible for numerical software such as MATLAB to come up with a plot.
If you fill in those values, you can use the tf function to display the transfer function in whatever way you like (e.g. a Bode plot).
On the other hand, if you just want the expression 1/(1+s*R*C), take a look at the Symbolic Math Toolbox, which can handle such expressions. But to make a plot, you will still have to fill in the R and C values (and, in this case, even values for your Laplace variable).