Median filter's output is wrong. What is the right median filter algorithm? - matlab

I want to write a 1-D median filter to eliminate glitches from a signal.
I wrote my median filter in MATLAB and compared it with the output of the medfilt1 function. My median filter is not working.
My median filter order is 8.
In my implementation, when data comes in, I fill an array of size 8. When 8 samples have arrived, I take the middle value and write it to the median filter output array. Then I wait for the next 8 samples, take the middle value of those, write it to the output array, and so on. (I implemented a sorting algorithm and tested it; it works correctly.)
Here are my screenshots:
my incoming data is red,
MATLAB's medfilt1 output is green,
my median filter's output is blue.
[Image: overall picture]
[Image: blown-up image]
I think my algorithm is wrong, but I don't know what the right algorithm is.

Your implementation is wrong, probably in two ways (hard to tell as you didn't show us your code).
You should be scrolling 1 element at a time, not 8 elements at a time. That is, you should drop just the oldest element and add just the newest element, before taking the median. (Note that your output has a frequency 8 times too high because you are replacing all 8 elements.)
You say that you take the middle value. The middle value is not the median. But perhaps you forgot to tell us that you do a sort first?
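To make the "slide by one element" idea concrete, here is a minimal sketch in Python (not the asker's MATLAB code, which was never posted). The function name `sliding_median` and the test signal are made up for illustration. It is a causal running median with a shorter window at the start, so its alignment and edge handling will not match medfilt1 exactly; compare results only away from the ends. Note also that with an even window such as 8, the median is the mean of the two middle sorted values, not a single "middle" element.

```python
from statistics import median

def sliding_median(samples, window=8):
    """Causal running median: one output value per input sample."""
    buffer = []   # holds at most `window` of the most recent samples
    output = []
    for sample in samples:
        buffer.append(sample)      # add just the newest element
        if len(buffer) > window:
            buffer.pop(0)          # drop just the oldest element
        # statistics.median sorts internally; for an even count it
        # returns the mean of the two middle values.
        output.append(median(buffer))
    return output

# Hypothetical signal with a single glitch at index 3.
glitchy = [1, 1, 1, 50, 1, 1, 1, 1, 1, 1]
print(sliding_median(glitchy, window=3))
```

The output has the same length as the input, which is the key difference from taking one value per block of 8 samples.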

Related

How come dice coefficient comes in bigger than 1

I want to evaluate my automatic image segmentation results. I compute Dice coefficients with a function written in MATLAB. The link to the code follows.
mathworklink
I am comparing the segmented patch with the manually cropped patch; interestingly, the Dice coefficient comes out greater than one. I debugged the code many times, for example by taking the absolute value of the patches (to get rid of negative pixels), but could not find the reason. How can the sums of the individual sets be, say, 3 and 5, while their union comes out as, say, 45? The maximum of the union must be 8.
Can somebody point me to more precise sources for implementing Dice coefficients?
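function [Dice] = evaldem(man, auto)
% Compute a Dice score for every pair of manual/automatic patches.
for i = 1:size(auto,1)
    for j = 1:size(auto,2)
        % Resize both patches to a common size before comparing them.
        auto(i,j).autosegmentedpatch = imresize(auto(i,j).autosegmentedpatch, [224 224]);
        man(i,j).mansegmentedpatch = imresize(man(i,j).mansegmentedpatch, [224 224]);
        % sevaluate is the FEX submission linked above.
        Dice(i,j) = sevaluate(man(i,j).mansegmentedpatch, auto(i,j).autosegmentedpatch);
    end
end
end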
Since I have many automatically segmented patches and manually segmented patches, I stored them in structures (man and auto). The structures' size is [i,j]. I definitely have to call imresize so that the patches are equal in size! Then I call the FEX submission file. As for negative pixels, these patches do have some. Note that I take the absolute value of the patches when computing 'common' and 'union' for Dice. All in all, I still get Dice values greater than one.
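For reference, the Dice coefficient of two sets A and B is 2|A∩B| / (|A| + |B|), which is bounded between 0 and 1 when both inputs are true binary masks. A value above 1 usually suggests that the inputs are not binary (for example, grayscale or negative values, or values interpolated by a resize). The following is a minimal Python sketch under that assumption, not the FEX function; the name `dice_coefficient` and the 0.5 threshold are hypothetical choices for illustration.

```python
def dice_coefficient(mask_a, mask_b, threshold=0.5):
    """Dice overlap of two 2-D masks (lists of rows), binarized first."""
    # Binarize so that gray or negative pixel values cannot inflate the sums.
    flat_a = [value > threshold for row in mask_a for value in row]
    flat_b = [value > threshold for row in mask_b for value in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / total

manual = [[1, 1], [0, 0]]
automatic = [[1, 0], [0, 0]]
print(dice_coefficient(manual, automatic))
```

Because both masks are reduced to booleans before counting, the result cannot exceed 1 regardless of the original pixel values.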

MATLAB: Using CONVN for moving average on Matrix

I'm looking for a bit of guidance on using CONVN to calculate moving averages in one dimension on a 3d matrix. I'm getting a little caught up on the flipping of the kernel under the hood and am hoping someone might be able to clarify the behaviour for me.
A similar post that still has me a bit confused is here:
CONVN example about flipping
The Problem:
I have daily river and weather flow data for a watershed at different source locations.
So the matrix is as so,
dim 1 (the rows) represent each site
dim 2 (the columns) represent the date
dim 3 (the pages) represent the different type of measurement (river height, flow, rainfall, etc.)
The goal is to try and use CONVN to take a 21 day moving average at each site, for each observation point for each variable.
As I understand it, I should just be able to use a kernel such as:
ker = ones(1,21) ./ 21.;
mat = randn(150,365*10,4);
avgmat = convn(mat,ker,'valid');
I tried playing around and created another kernel which should also work (I think) and set ker2 as:
ker2 = [zeros(1,21); ker; zeros(1,21)];
avgmat2 = convn(mat,ker2,'valid');
The question:
The results don't quite match and I'm wondering if I have the dimensions incorrect here for the kernel. Any guidance is greatly appreciated.
Judging from the context of your question, you have a 3D matrix and you want to find the moving average of each row independently over all 3D slices. The code above should work (the first case). However, the 'valid' flag returns only those outputs for which the kernel fits entirely inside the matrix, so the output is smaller than the input. Take a look at the first point of the post that you linked to for more details.
Specifically, each row of the output will be 20 entries shorter because of the 'valid' flag. Only at the 21st entry of each row is the 21-element kernel completely contained inside the row, and it's from that point on that you get valid results (no pun intended). If you'd like to see the entries at the boundaries, use the 'same' flag to keep the output the same size as the input, or the 'full' flag (the default), which gives you the output starting from the most extreme outer edges. Bear in mind that near the edges the moving average is then computed with a bunch of zeroes, so those boundary entries wouldn't be what you expect anyway.
However, if I'm interpreting your question correctly, the 'valid' flag is what you want, but bear in mind that each row of the output will have 20 fewer entries to account for the edge cases. All in all, your code should work, but be careful about how you interpret the results.
BTW, you have a symmetric kernel, and so flipping should have no effect on the convolution output. What you have specified is a standard moving averaging kernel, and so convolution should work in finding the moving average as you expect.
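As a rough illustration of what 'valid' means for one row, here is a small Python sketch (not MATLAB, and not a replacement for convn); `moving_average_valid` is a made-up helper name. It keeps only the windows that fit entirely inside the row, so a row of length n yields n - 21 + 1 averages. By the same size rule, a 3-row kernel like ker2 would also shrink dim 1 by two rows under 'valid' (148 rows instead of 150), which may be why the two results do not line up directly.

```python
def moving_average_valid(row, window=21):
    """Average every full window of `window` samples; no padding at the edges."""
    count = len(row) - window + 1   # number of windows that fit entirely
    if count <= 0:
        return []
    return [sum(row[i:i + window]) / window for i in range(count)]

row = list(range(3650))             # stand-in for ten years of daily data
averages = moving_average_valid(row)
print(len(row) - len(averages))     # how many entries 'valid' drops per row
```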
Good luck!

What element of the array would be the median if the size of the array was even and not odd?

I read that it's possible to make quicksort run in O(n log n).
The algorithm says to choose the median as the pivot on each step,
but, suppose we have this array:
10 8 39 2 9 20
which value will be the median?
In math, if I remember correctly, the median is (39+2)/2 = 41/2 = 20.5
I don't have a 20.5 in my array though
thanks in advance
You can choose either of them; asymptotically, as the input size grows, the choice does not matter.
We're talking about the exact wording of the description of an algorithm here, and I don't have the text you're referring to. But I think in context by "median" they probably meant, not the mathematical median of the values in the list, but rather the middle point in the list, i.e. the median INDEX, which in this case would be 3 or 4. As coffNjava says, you can take either one.
The median is actually found by sorting the array first, so in your example, the median is found by arranging the numbers as 2 8 9 10 20 39 and the median would be the mean of the two middle elements, (9+10)/2 = 9.5, which doesn't help you at all. Using the median is sort of an ideal situation, but would work if the array were at least already partially sorted, I think.
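A small Python sketch of that calculation, assuming the usual definition (sort, then take the middle element, or the mean of the two middle elements for an even count); `median_of` is just an illustrative name.

```python
def median_of(values):
    """Mathematical median: sort first, then look at the middle."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: single middle element
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: mean of the two middle ones

print(median_of([10, 8, 39, 2, 9, 20]))  # 9.5, which is not an element of the array
```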
With an array that has an even number of elements, there is no single exact middle element to use as the pivot, so I believe you can use either of the two middle numbers. It'll throw off the efficiency a bit, but not substantially unless you always ended up sorting even-length arrays.
Finding the median of an unsorted set of numbers can be done in O(N) time, but it's not really necessary to find the true median for the purposes of quicksort's pivot. You just need to find a pivot that's reasonable.
As the Wikipedia entry for quicksort says:
In very early versions of quicksort, the leftmost element of the partition would often be chosen as the pivot element. Unfortunately, this causes worst-case behavior on already sorted arrays, which is a rather common use-case. The problem was easily solved by choosing either a random index for the pivot, choosing the middle index of the partition or (especially for longer partitions) choosing the median of the first, middle and last element of the partition for the pivot (as recommended by R. Sedgewick).
Finding the median of three values is much easier than finding it for the whole collection of values, and for collections that have an even number of elements, it doesn't really matter which of the two 'middle' elements you choose as the potential pivot.
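Here is one way the median-of-three idea could look in Python. This is a sketch, not a tuned implementation: it uses a simple Lomuto-style partition, and the helper names are made up. For an even-length partition it simply takes the lower of the two middle indices.

```python
def median_of_three_index(a, lo, hi):
    """Index of the median among a[lo], a[mid], a[hi]."""
    mid = lo + (hi - lo) // 2   # for even lengths this is the lower middle index
    candidates = sorted([(a[lo], lo), (a[mid], mid), (a[hi], hi)])
    return candidates[1][1]

def quicksort(a, lo=0, hi=None):
    """In-place quicksort using a median-of-three pivot."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    p = median_of_three_index(a, lo, hi)
    a[p], a[hi] = a[hi], a[p]   # park the pivot at the end of the partition
    pivot = a[hi]
    store = lo
    for i in range(lo, hi):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi] = a[hi], a[store]   # move the pivot to its final position
    quicksort(a, lo, store - 1)
    quicksort(a, store + 1, hi)

data = [10, 8, 39, 2, 9, 20]
quicksort(data)
print(data)
```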

Quicksort pivot choice

I've read that the pivot can be the median of 3 numbers: bottom, middle and top. But could that cause an overflow? What happens if the median returns a value larger than the array size?
I assume that this choice relies on the array values not being larger than the array size.
I think I'm confused about what a pivot really is.
The pivot is just the value that you compare other values against: lower values go to the left, higher values go to the right. The pivot can be chosen by taking any of the existing values in the array. If the array is completely unsorted, it won't matter which value you choose. If it is somewhat sorted, you should choose a value from the middle of the array.
UPDATE: Some reading informs me that a better pivot choice may be to choose the median value of 3 values in the array (such as middle, bottom and top or 3 random positions). Some people advocate taking the median of 5 values. The worst-case performance of quicksort occurs when pivot is close to the smallest or largest value in the array, and this tactic is intended to defend against that occurring. This is just an optimisation for certain kinds of data - it is not a necessity.
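To illustrate that the pivot is a value taken from the array and is never used as an index, here is a tiny Python sketch; `partition_by_value` is a hypothetical helper, and it builds new lists rather than partitioning in place. The values are deliberately much larger than the array length, and nothing overflows because they are only compared.

```python
def partition_by_value(values, pivot):
    """Split values into those below, equal to, and above the pivot value."""
    lower = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]
    higher = [v for v in values if v > pivot]
    return lower, equal, higher

values = [500, 3, 1000, 42, 7]   # 5 elements, values far beyond 5
lower, equal, higher = partition_by_value(values, pivot=42)
print(lower, equal, higher)
```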

Why is my filter output not accurate?

I am simulating a 4-stage digital filter.
The stages are:
CIC
half-band
OSR 128
The input is 4 bits and the output is 24 bits. I am confused about the 24-bit output.
I use MATLAB to generate a 4-bit signed sinusoid input (using the SD tool), and simulate with ModelSim. So the output should also be a sinusoid. The issue is that the output contains only 4 different values.
For a 24-bit output, shouldn't we get 2^24-1 different values?
What's the reason for this? Is it due to the internal bit width?
I'm not familiar with ModelSim, and I don't understand the filter terminology you used, but... are your filters linear systems? If so, an input at a given frequency will cause an output at the same frequency, though possibly with a different amplitude and phase. If your input signal is a single tone, sampled such that there are four values per cycle, the output will still have four values per cycle. Unless one of the stages performs sample rate conversion, the system is behaving as expected. As Donnie DeBoer pointed out, the word width of the calculation doesn't matter as long as it can represent the four values of the input.
Again, I am not familiar with the particulars of your system so if one of the stages does indeed perform sample rate conversion, this doesn't apply.
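The point about a periodic input can be checked with a toy example. The Python sketch below runs a 4-samples-per-cycle tone through a plain FIR filter with no rate conversion; the coefficients are made up and have nothing to do with the asker's CIC or half-band stages. Once the filter is past its start-up samples, the output repeats every 4 samples, so it can contain at most 4 distinct values, however wide the output word is.

```python
def fir_filter(signal, coefficients):
    """Direct-form FIR; returns only outputs where all taps see real input."""
    taps = len(coefficients)
    return [
        sum(c * signal[n - k] for k, c in enumerate(coefficients))
        for n in range(taps - 1, len(signal))
    ]

one_cycle = [0, 7, 0, -7]          # 4-bit signed tone, four samples per cycle
signal = one_cycle * 16
coefficients = [3, 5, 9, 5, 3]     # hypothetical integer taps
output = fir_filter(signal, coefficients)
distinct_values = sorted(set(output))
print(len(distinct_values))        # never more than 4 for this input
```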
Forgive my lack of filter knowledge, but does one of the filter stages interpolate between the input values? If not, then you're only going to get a maximum of 2^4 output values (based on the input resolution), regardless of your output resolution. Just because you output to 24-bit doesn't mean you're going to have 2^24 values... imagine running a digital square wave into a D->A converter. You have all the output resolution in the world, but you still only have 2 values.
It's actually pretty simple:
Even though you have 4 bits of input, your filter coefficients may be more than 4 bits.
Every math stage you do adds bits. If you add two 4-bit values, the answer is a 5-bit number, so that adding 0xf and 0xf doesn't overflow. When you multiply two 4-bit values, you actually need 8 bits of output to hold the answer without the possibility of overflow. By the time all the math is done, your 4-bit input apparently needs 24 bits to hold the maximum possible output.
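The bit-growth arithmetic above is easy to check for unsigned values. This is only a quick Python sketch; `unsigned_bits` is an illustrative helper, and signed arithmetic would need an extra sign bit.

```python
def unsigned_bits(value):
    """Number of bits needed to hold a non-negative integer."""
    return max(1, value.bit_length())

largest_4bit = 0xF
sum_bits = unsigned_bits(largest_4bit + largest_4bit)      # 0xF + 0xF = 30
product_bits = unsigned_bits(largest_4bit * largest_4bit)  # 0xF * 0xF = 225
print(sum_bits, product_bits)
```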