I'm using a kd-tree to perform quick nearest neighbor search queries. I'm using the following piece of code to generate the kd-tree and perform queries on it:
% 3 dimensional vertex data
x = [1 2 2 1 2 5 6 3 4;
3 2 3 2 2 7 6 5 2;
1 2 9 9 7 5 8 9 3]';
% create the kd-tree
kdtree = createns(x, 'NSMethod', 'kdtree');
% perform a nearest neighbor search
nearestNeighborIndex = knnsearch(kdtree, [1 1 1]);
This works well enough when the data is static. However, every once in a while I need to translate every vertex in the kd-tree. I know that changing the whole data set means I need to re-generate the whole tree before performing a nearest neighbor search again. With a couple of thousand vertices per kd-tree, re-generating the whole tree from scratch seems like overkill, as it takes a significant amount of time. Is there a way to translate the kd-tree without re-generating it from scratch? I tried accessing and changing the X property (which holds the actual vertex data) of the kd-tree, but it seems to be read-only, and it probably wouldn't have worked even if it weren't, since there is a lot more going on behind the curtains.
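One workaround worth noting (a sketch, not a modification of the tree itself): a translation applied to every vertex preserves all relative distances, so instead of rebuilding the tree you can apply the inverse translation to the query point and search the original tree.

```matlab
% Suppose every vertex is conceptually translated by t. Searching the
% ORIGINAL tree with (queryPoint - t) returns the same index as searching
% a rebuilt tree of translated vertices with queryPoint would.
t = [10 -5 2];                 % example translation vector (hypothetical)
queryPoint = [1 1 1];
nearestNeighborIndex = knnsearch(kdtree, queryPoint - t);
% equivalent to: knnsearch(createns(x + t, 'NSMethod', 'kdtree'), queryPoint)
```

(The `x + t` in the comment relies on implicit expansion, available from R2016b; use bsxfun on older versions.)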
I'm sure this is a trivial question for a signals person. I need to find the function in Matlab that averages contiguous, non-overlapping segments of window size l of a vector, e.g.
origSignal: [1 2 3 4 5 6 7 8 9];
windowSize = 3;
output = [2 5 8]; % i.e. [(1+2+3)/3 (4+5+6)/3 (7+8+9)/3]
EDIT: Neither of the options presented in How can I (efficiently) compute a moving average of a vector? seems to work, because I need the window of size 3 to slide along without including any of the previous elements... Maybe I'm missing it. Take a look at my example...
Thanks!
If the size of the original data is always a multiple of windowSize:
mean(reshape(origSignal,windowSize,[]));
Else, in one line:
mean(reshape(origSignal(1:end-mod(length(origSignal),windowSize)),windowSize,[]))
This is the same as before, but the signal is truncated so that the trailing values that do not fill a complete window are dropped.
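A quick check with the example values from the question:

```matlab
origSignal = [1 2 3 4 5 6 7 8 9];
windowSize = 3;
% reshape stacks each segment into a column; mean averages down each column
output = mean(reshape(origSignal, windowSize, []));
% output = [2 5 8]
```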
So I am writing a function that plots matrix data from n different cells. If n is 10, it should display 10 equally spaced plots on a single figure. If n is 7, it should try to space them out as equally as possible (so 3x2 or 2x3 plots with a plot by itself).
I am able to get these graphs drawn using subplot() and plot() but I'm having a hard time finding out how to initialise the dimensions for the subplot.
The number of subplots will be changing after each run so I can't initialise it to specific dimensions.
Can anyone point me in the right direction?
I am afraid problems like this tend to be messy; they normally need to be handled case by case.
if (mod(n,2) && n<8)
% Do something
elseif (~mod(n,2) && n < 11)
% Do something else
elseif ...
....
end
The conditions are chosen a bit arbitrarily, since the specifications in the OP seemed a bit arbitrary too. You probably get the point and can set your own conditions.
There are two reasons why I recommend this approach.
1) This makes the code simpler to write. You do not have to come up with a complicated general solution which may break after some time.
2) By adding cases you can protect yourself against a rampant number of plots. When the number of plots gets too large, you typically do not want them all in the same figure. It is also possible to wrap this in a function and apply it to X plots at a time in a loop, with each iteration typically producing a separate figure.
It is not easy to elaborate further, since you have not yet specified how many cases you expect or what should happen to the last plot for odd numbers. Still, this may be a good hint.
Good luck!
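A minimal sketch of the case-based approach described above (the function name and the break points are hypothetical choices; tune them to your own needs):

```matlab
% subplotGrid is a hypothetical helper: it maps a plot count n to a
% grid size using explicit, hand-picked cases.
function [rows, cols] = subplotGrid(n)
    if n <= 3
        rows = 1; cols = n;           % a single row for very few plots
    elseif n <= 8
        rows = 2; cols = ceil(n/2);   % two rows for up to 8 plots
    elseif n <= 12
        rows = 3; cols = ceil(n/3);   % three rows for up to 12 plots
    else
        error('Too many plots for one figure; split across several figures.');
    end
end
```

You would then call [rows, cols] = subplotGrid(n) once and use subplot(rows, cols, k) inside the plotting loop.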
Another simple solution would be using round and ceil on the square root:
for n=1:20
[n, round(sqrt(n))*ceil(sqrt(n)), round(sqrt(n)), ceil(sqrt(n))]
end
output:
%(n, total_plots, x, y)
1 1 1 1
2 2 1 2
3 4 2 2
4 4 2 2
5 6 2 3
6 6 2 3
7 9 3 3
8 9 3 3
9 9 3 3
10 12 3 4
Usage example:
n = 7
subplot(round(sqrt(n)), ceil(sqrt(n)), plot_nr_x) % switch first 2 params to have either a slightly longer or slightly wider subplot
I ran into a very similar problem today and had a lot of trouble defining a subplot grid that would fit everything. My reasoning is mostly a hack, but it can help. If you have to display at most n figures, you can think of the layout as a square grid of sqrt(n) * sqrt(n). To be safe, we add an extra row, so the final grid is (sqrt(n) + 1) * sqrt(n). I hope this helps solve your problem.
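A sketch of that idea (the data is a placeholder); computing the number of columns from the number of rows avoids the safety row while still guaranteeing enough cells:

```matlab
n = 7;                      % number of plots (example)
rows = ceil(sqrt(n));
cols = ceil(n / rows);      % rows * cols >= n always holds
figure;
for k = 1:n
    subplot(rows, cols, k);
    plot(rand(10, 1));      % placeholder data
end
```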
In my code I have 2 nested loops: the outer loop opens a figure for each element kk, and the inner loop plots a particular graph at position x within the array.
for kk=1:length(some_file_list)
% Load data
% do some math
% get data as a cell array with things we care about in data(3,)
array_size = size(data(3,:),2);
for x = 1:array_size
% do more math and get things ready to plot matrix_A scaled by range_A
figure(kk); % open figure
grid_rows = round((sqrt(array_size)+1));
grid_cols = round(sqrt(array_size));
% plot
subplot(grid_rows, grid_cols, x);
imagesc(matrix_A,range_A); %plot in position
colormap(gray);
end
end
I found the details and an implementation of the Local Ternary Pattern (LTP) in Calculating the Local Ternary Pattern of an image?. I would like to ask for more detail: what is the best way to choose the threshold t? I am also confused about the role of reorder_vector = [8 7 4 1 2 3 6 9];
Unfortunately, there isn't a principled way to figure out the threshold for LTPs; it's mostly trial and error or experimentation. However, I would suggest making the threshold adaptive. You can use Otsu's algorithm to dynamically determine the best threshold for your image. This assumes that the distribution of intensities in the image is bimodal, i.e. there is a clear separation between objects and background. MATLAB implements this in the graythresh function. However, this generates a threshold between 0 and 1, so you will need to multiply the result by 255, assuming that the type of your image is uint8.
Therefore, do:
t = 255*graythresh(im);
im is the image for which you want to compute the LTPs. Now, I can certainly provide insight into what the reorder_vector is doing. Look at the following figure on how to calculate LTPs:
(source: hindawi.com)
When we generate the ternary code matrix (matrix in the middle), we need to generate an 8 element sequence that doesn't include the middle of the neighbourhood. We start from the east most element (row 2, column 3), then traverse the elements in counter-clockwise order. The reorder_vector variable allows you to select those specific elements that respect that order. If you recall, MATLAB can access matrices using column-major linear indices. Specifically, given a 3 x 3 matrix, we can access an element using a number from 1 to 9 and the memory is laid out like so:
1 4 7
2 5 8
3 6 9
Therefore, the first element of reorder_vector is index 8, which is the east most element. Next is index 7, which is the top right element, then index 4 which is the north facing element, then 1, 2, 3, 6 and finally 9.
If you follow these numbers, you will determine how I got the reorder_vector:
reorder_vector = [8 7 4 1 2 3 6 9];
By using this variable for accessing each 3 x 3 local neighbourhood, we would thus generate the correct 8 element sequence that respects the ordering of the ternary code so that we can proceed with the next stage of the algorithm.
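To make the traversal concrete, here is a small sketch on one example 3 x 3 neighbourhood (the window values and threshold are made up for illustration):

```matlab
window = [50 40 30;
          60 55 20;
          70 80 90];          % example neighbourhood; centre pixel is 55
t = 10;                      % example threshold
reorder_vector = [8 7 4 1 2 3 6 9];
% window(reorder_vector) reads the 8 neighbours starting at the east
% element and proceeding counter-clockwise (column-major linear indexing)
diffs = window(reorder_vector) - window(2,2);  % [-35 -25 -15 -5 5 15 25 35]
ternary = zeros(1, 8);
ternary(diffs >  t) =  1;    % clearly brighter than the centre
ternary(diffs < -t) = -1;    % clearly darker than the centre
% ternary = [-1 -1 -1 0 0 1 1 1]
```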
I'm pretty new to Simulink and I was wondering if the following thing was possible somehow:
I have a signal of let's say 10000 data points.
On this signal I want to run a certain algorithm, however said algorithm needs exactly 1000 samples to work properly.
Now, with normal Matlab functions this is no problem: you cut the signal into 10 pieces, run the algorithm on each one, stitch the processed parts back together, and you have your result.
In Simulink, however, this creates a problem, since (to my current understanding) Simulink blocks work sample by sample (one sample in, one sample out). So I don't have the necessary data to perform the algorithm within a block.
Is there any way to increase the number of processed samples per block?
Reshape the 10,000 data points into frames of 1,000 samples each, arranged column-wise.
For example, convert
data = [1 2 3 4 5 6]
into
data = [1 4
        2 5
        3 6]
Now, define a sample time of, let's say, 1 sec, and use From Workspace blocks, one per column (2 in this example). In each block, supply an array of the form [t data(:,1)], [t data(:,2)], where t = [1 2 3], i.e. multiples of the sample time.
In the Simulink model, set the stop time to 3 sec, since there are 3 data points per column, and store the output via a To Workspace block.
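The reshape step described above can be sketched like this (variable names are examples):

```matlab
data = [1 2 3 4 5 6];
cols = reshape(data, 3, []);   % cols = [1 4; 2 5; 3 6], one frame per column
Ts = 1;                        % chosen sample time, in seconds
t = (1:size(cols, 1))' * Ts;   % t = [1; 2; 3]
in1 = [t cols(:, 1)];          % array for the first From Workspace block
in2 = [t cols(:, 2)];          % array for the second From Workspace block
```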
I have a dataset that looks like this:
Feature 1 Feature 2 Feature 3 Feature 4 Feature 5 Class
Obj 2 2 2 8 5 1
Obj 2 8 3 3 4 2
Obj 1 7 4 4 8 1
Obj 4 3 5 9 7 2
The rows contain objects, which have a number of features. I have put 5 features for demonstration purposes but there is approximately 50 features per object, with the final column being the class label for each object.
I want to create and run the nearest neighbour classifier algorithm on this data set and retrieve the error rate. I have managed to get the NN algorithm working for each individual feature; a short pseudocode example is below. For each feature, I loop through each object, assigning object j according to its nearest neighbours.
for i = 1:Number of features
for j = 1:Number of objects
compute distances between data(j,i) and the other values of feature i
order by shortest distance
sum the class labels of the k shortest distances
assign the class with the largest number of labels
end
error = mean(labels~=assigned)
end
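A runnable sketch of the per-feature leave-one-out procedure described by the pseudocode above (the data and labels are the example values from the question's table; k = 1 for simplicity):

```matlab
data   = [2 2 2 8 5;
          2 8 3 3 4;
          1 7 4 4 8;
          4 3 5 9 7];          % objects in rows, features in columns
labels = [1; 2; 1; 2];
[nObj, nFeat] = size(data);
errorRate = zeros(1, nFeat);
for i = 1:nFeat
    assigned = zeros(nObj, 1);
    for j = 1:nObj
        d = abs(data(:, i) - data(j, i)); % distances within feature i
        d(j) = inf;                       % leave object j itself out
        [~, nn] = min(d);                 % nearest neighbour's index
        assigned(j) = labels(nn);
    end
    errorRate(i) = mean(assigned ~= labels);
end
```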
The issue I have is how to apply the 1-NN algorithm across multiple features. Say I have selected features 1, 2 and 3 from my dataset, and I want to calculate the error rate after adding feature 5 to my set of selected features, using 1-NN. Would I find the nearest value to feature 5 among all of my selected features 1-3?
For example, for my data set above:
Adding feature 5 - for object 1 of feature 5, the closest number is object 4 of feature 3. As this has a class label of 2, I would assign object 1 of feature 5 the class 2. This is obviously a misclassification, but I would continue to classify all the other objects in feature 5 and compare the assigned and actual values.
Is this the correct way to perform the 1NN against multiple features?