I want to perform a Census transform in MATLAB at the center pixels of each filter's window as shown below:
If the image does not appear, here is an alternative link: https://i.ibb.co/9Y6LfSL/Shared-Screenshot.jpg
My initial attempt at the code is:
function output = census(img, census_size)
img_gray = rgb2gray(img);
[y,x] = size(img_gray);
borders = floor(census_size/2); % limit to exclude image borders when filtering
img_out = zeros(y-2*borders, x-2*borders); % pre-allocate the output
for iy = 1+borders : y-borders
    for ix = 1+borders : x-borders
        f = img_gray(iy-borders:iy+borders, ix-borders:ix+borders);
        iix = ix-borders;
        iiy = iy-borders;
        % shift = bitsll(img_out(iiy,iix),1);
        img_out(iiy,iix) = 0; % Must be implemented with census
    end
end
normalised_image = img_out ./ max(max(img_out));
output = img_out;
imshow(normalised_image);
end
iix and iiy in the second for loop represent my current location for the center pixel. f is my current filter window.
In addition to the comparison operation with the window's other pixels, I need to set each comparison result to logical 1/0, extend the total result (by shifting, I guess) to 8-bit data, and then convert this binary number to a decimal number. How can I do this in a practical way in MATLAB?
I have checked this in Python: https://stackoverflow.com/a/38269363/12173333
But I could not reproduce it in MATLAB.
If you have the Image Processing Toolbox, you can use blockproc:
%Load your image
I = imread('https://i.stack.imgur.com/9oxaQ.png');
%Creation of the census transform function
fun = @(B) [128 64 32 16 0 8 4 2 1]*(B.data(:)>B.data(2,2));
%Process the image, block-by-block with overlap, force the result to be of type uint8
I2 = uint8(blockproc(I.',[1 1],fun,'BorderSize',[1,1],'TrimBorder',0)).'
Here blockproc is configured for a 3x3 window (with overlap) and works on grayscale images. The function fun checks which parts of the block are strictly greater than the center of the block. We obtain a 1x9 logical vector. Then I multiply this vector by [128 64 32 16 0 8 4 2 1] (binary-to-decimal transformation).
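To see the bit ordering concretely, here is a toy block of my own (not part of the original answer):
B = [5 1 9; 2 4 7; 3 8 6];            % one 3x3 block, center pixel is 4
bits = B(:) > B(2,2);                 % column-major unroll: [1 0 0 0 0 1 1 1 1]'
val = [128 64 32 16 0 8 4 2 1]*bits   % 128 + 8 + 4 + 2 + 1 = 143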
Update:
Optimisation with linear algebra
For an arbitrary window size you can use:
I = imread('https://i.stack.imgur.com/9oxaQ.png');
w = 5; % window size, any odd number between 3 and 31.
b2d = 2.^[w^2-1:-1:ceil(w^2/2),0,floor(w^2/2):-1:1] % binary to decimal vector
cen = ceil(w/2); % center position
%Creation of the census transform function
fun = @(B) b2d*(B.data(:)>B.data(cen,cen));
%Process the image, block-by-block with overlap
I2 = blockproc(I.',[1 1],fun,'BorderSize',[cen-1,cen-1],'TrimBorder',0).'/sum(b2d)
Input:
Output:
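For completeness, if you don't have the toolbox, the loop from the question can also be finished directly. Here is a minimal sketch (the bit ordering and the normalised display are my assumptions, not part of the question's code):
function output = census(img, census_size)
img_gray = double(rgb2gray(img));
[y,x] = size(img_gray);
b = floor(census_size/2); % border to exclude when filtering
img_out = zeros(y-2*b, x-2*b);
for iy = 1+b : y-b
    for ix = 1+b : x-b
        f = img_gray(iy-b:iy+b, ix-b:ix+b);  % current window
        bits = f(:) > f(b+1,b+1);            % compare every pixel with the center
        bits(ceil(numel(bits)/2)) = [];      % drop the center's own comparison
        weights = 2.^(numel(bits)-1:-1:0);   % binary-to-decimal weights (8 bits for 3x3)
        img_out(iy-b, ix-b) = weights*bits;  % decimal census value
    end
end
output = img_out;
imshow(img_out ./ max(img_out(:)));          % normalised display
end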
I've been working on bilinear interpolation based on the wiki example, in MATLAB. I followed the example to a T, but when comparing the outputs from my function and the built-in MATLAB function, the results are vastly different and I can't figure out why or how that happens.
Using the built-in MATLAB function:
Result of my function below:
function T = bilinear(X,h,w)
%pre-allocating the output size
T = uint8(zeros(h,w));
%padding the original image with 0 so i don't go out of bounds
X = padarray(X,[2,2],'both');
%calculating dimension ratios
hr = h/size(X,1);
wr = w/size(X,2);
for row = 3:h-3
    for col = 3:w-3
        %for calculating equivalent position on the original image
        o_row = ceil(row/hr);
        o_col = ceil(col/wr);
        %getting the intensity values from horizontal neighbors
        Q12 = X(o_row+1,o_col-1);
        Q22 = X(o_row+1,o_col+1);
        Q11 = X(o_row-1,o_col-1);
        Q21 = X(o_row-1,o_col+1);
        %calculating the relative positions to the enlarged image
        y2 = round((o_row-1)*hr);
        y  = round(o_row*hr);
        y1 = round((o_row+1)*hr);
        x1 = round((o_col-1)*wr);
        x  = round(o_col*wr);
        x2 = round((o_col+1)*wr);
        %interpolating on 2 first axis and the result between them
        R1 = ((x2-x)/(x2-x1))*Q11+((x-x1)/(x2-x1))*Q21;
        R2 = ((x2-x)/(x2-x1))*Q12+((x-x1)/(x2-x1))*Q22;
        P = round(((y2-y)/(y2-y1))*R1+((y-y1)/(y2-y1))*R2);
        T(row,col) = P;
        T = uint8(T);
    end
end
end
The arguments passed to the function are step4 = bilinear(Igray,1668,1836); (scale factor of 3).
You are finding the pixel nearest to the point you want to interpolate, then finding 4 of this pixel's neighbors and interpolating between them:
o_row = ceil(row/hr);
o_col = ceil(col/wr);
Q12=X(o_row+1,o_col-1);
Q22=X(o_row+1,o_col+1);
Q11=X(o_row-1,o_col-1);
Q21=X(o_row-1,o_col+1);
Instead, find the 4 pixels nearest the point you want to interpolate:
o_row = ceil(row/hr);
o_col = ceil(col/wr);
Q12=X(o_row,o_col-1);
Q22=X(o_row,o_col);
Q11=X(o_row-1,o_col-1);
Q21=X(o_row-1,o_col);
The same pixel's coordinates then need to be used when computing distances. The easiest way to do that is to separate out the floating-point coordinates (o_row,o_col) of the output pixel (row,col) mapped into the input image, and the location (fo_row,fo_col) of the nearest pixel in the input image. Then, the distances are simply d_row = o_row - fo_row and 1-d_row, etc.
This is how I would write this function:
function T = bilinear(X,h,w)
% Pre-allocating the output size
T = zeros(h,w,'uint8'); % Create the matrix in the right type, rather than cast !!
% Calculating dimension ratios
hr = h/size(X,1); % Not with the padded sizes!!
wr = w/size(X,2);
% Padding the original image with 0 so I don't go out of bounds
pad = 2;
X = padarray(X,[pad,pad],'both');
% Loop
for col = 1:w % Looping over the row in the inner loop is faster!!
    for row = 1:h
        % For calculating equivalent position on the original image
        o_row = row/hr;
        o_col = col/wr;
        fo_row = floor(o_row); % Code is simpler when using floor here !!
        fo_col = floor(o_col);
        % Getting the intensity values from horizontal neighbors
        Q11 = double(X(fo_row  +pad, fo_col  +pad)); % Indexing taking padding into account !!
        Q21 = double(X(fo_row+1+pad, fo_col  +pad)); % Casting to double might not be necessary, but MATLAB does weird things with integer computation !!
        Q12 = double(X(fo_row  +pad, fo_col+1+pad));
        Q22 = double(X(fo_row+1+pad, fo_col+1+pad));
        % Calculating the relative positions to the enlarged image
        d_row = o_row - fo_row;
        d_col = o_col - fo_col;
        % Interpolating on 2 first axis and the result between them
        R1 = (1-d_row)*Q11 + d_row*Q21;
        R2 = (1-d_row)*Q12 + d_row*Q22;
        T(row,col) = round((1-d_col)*R1 + d_col*R2);
    end
end
end
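As a quick sanity check against MATLAB's own resizer (the test image and sizes are my choice, not from the original post):
Igray = imread('cameraman.tif');              % any grayscale test image, 256x256
T1 = bilinear(Igray, 768, 768);               % 3x upscale with the function above
T2 = imresize(Igray, [768 768], 'bilinear');  % MATLAB's own bilinear resize
imshowpair(T1, T2, 'montage')                 % the two results should look very similar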
I want to implement two dimensional matched filter for blood vessel extraction according to the paper "Detection of Blood Vessels in Retinal Images Using Two-Dimensional Matched Filters" by Chaudhuri et al., IEEE Trans. on Medical Imaging, 1989 (there's a PDF on the author's web site).
A brief description is that a blood vessel's cross-section has a Gaussian profile, and therefore I want to use a Gaussian matched filter to increase the SNR. Such a kernel may be expressed mathematically as:
K(x,y) = -exp(-x^2/(2*sigma^2)) for |x| <= 3*sigma, |y| <= L/2
Here L is the length of the vessel segment assumed to have a fixed orientation. Experimentally, sigma = 1.5 and L = 7.
My MATLAB code for this part is:
s = 1.5; %sigma
t = -3*s:3*s;
theta=0:15:165; %different rotations
%one dimensional kernel
x = 1/sqrt(6*s)*exp(-t.^2/(2*s.^2));
L=7;
%two dimensional gaussian kernel
x2 = repmat(x,L,1);
Consider the response of this filter for a pixel belonging to the background retina. Assuming the background to have constant intensity with zero mean additive Gaussian white noise, the expected value of the filter output should ideally be zero. The convolution kernel is, therefore, modified by subtracting the mean value of s(t) from the function itself. The mean value of the kernel is determined as: m = Sum(K(x,y))/(number of points).
Thus, the convolutional mask used in this algorithm is given by: K(x, y) = K(x,y) - m.
My MATLAB code:
m = sum(x2(:))/(size(x2,1)*size(x2,2));
x2 = x2-m;
A vessel may be oriented at any angle 0 < theta < 180, and the matched filter response is maximum when it is aligned at theta ± 90° (the cross-section distribution is Gaussian, not the vessel itself).
Thus we need to rotate the matched filter 12 times with a 15-degree increment.
My MATLAB code is attached here but I don't get a desirable result. Any help is appreciated.
%apply rotated matched filter on image
r = {};
for k = 1:12
    x3 = imrotate(x2,theta(k),'crop'); %figure;imagesc(x3);colormap gray;
    r{k} = conv2(img,x3);
end
w = [];
h = zeros(584,565);
for i = 1:565
    for j = 1:584
        for k = 1:12 % one response per rotation
            w = [w, r{k}(j,i)];
        end
        h(j,i) = max(abs(w));
        w = [];
    end
end
%show result
figure('Name','after matched filter');imagesc(h);colormap gray
For rotation I used imrotate, which seems more sensible to me, but in the paper it is different: suppose p = [x,y] is a discrete point in the kernel. To compute the coefficients of the rotated kernel we have [u,v] = p*Rotation_Matrix, where
Rotation_Matrix=[cos(theta),sin(theta);-sin(theta),cos(theta)]
And the kernel is:
K(x,y) = -exp(-u^2/(2*s^2))
But the new kernel doesn't have a Gaussian shape anymore, whereas using imrotate preserves the Gaussian shape. So what is the benefit of using the rotation matrix?
Input image is:
Output:
Matched filtering helps increase SNR but background noise is amplified too.
Am I right to use imrotate to rotate the kernel? My main problem is with the rotation matrix: why is it used, and what is the right code to implement it?
The reason to build the filter from its analytic expression for each rotation, rather than using imrotate, is that the filter extent is not circular, and therefore rotating brings in "new" pixel values and pushes some other pixels out of the kernel. Furthermore, rotating a kernel constructed as here (smooth transition along one direction, step edge along the other dimension) requires different interpolation methods along each dimension, which imrotate cannot do. The resulting rotated kernel will always be wrong.
Both these issues can be easily seen when displaying the kernel you make together with two rotated versions:
This display brings an additional issue to the front: the kernel is not centered on a pixel, causing it to shift the output by half a pixel.
Note also that, when subtracting the mean, it is important that this mean be computed only over the original domain of the filter, and that any zeros used to pad this domain to a rectangular shape remain zero (these should not become negative).
The rotated kernels can be constructed as follows:
m = max(ceil(3*s),(L-1)/2);
[x,y] = meshgrid(-m:m,-m:m); % non-rotated coordinate system, contains (0,0)
t = pi/6; % angle in radian
u = cos(t)*x - sin(t)*y; % rotated coordinate system
v = sin(t)*x + cos(t)*y; % rotated coordinate system
N = (abs(u) <= 3*s) & (abs(v) <= L/2); % domain
k = exp(-u.^2/(2*s.^2)); % kernel
k = k - mean(k(N));
k(~N) = 0; % set kernel outside of domain to 0
This is the result for the three rotations used in the example above (the grey around the edges of the kernel corresponds to the value 0, the black pixels have a negative value):
Another issue is that you use conv2 with the default 'full' output shape; you should be using 'same' here, so that the output of the filter matches the input.
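For example, with the image size used in this question:
img = zeros(584,565);     % image size from the question
k = ones(7);              % any 7x7 kernel
size(conv2(img,k))        % 590 571 -- 'full' output is larger than the input
size(conv2(img,k,'same')) % 584 565 -- matches the input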
Note that, instead of computing all filter responses, and computing the max afterwards, it is much easier to compute the max as you compute each filter response. All of the above leads to the following code:
img = im2double(rgb2gray(img));
s = 1.5; %sigma
L = 7;
theta = 0:15:165; %different rotations
out = zeros(size(img));
m = max(ceil(3*s),(L-1)/2);
[x,y] = meshgrid(-m:m,-m:m); % non-rotated coordinate system, contains (0,0)
for t = theta
t = t / 180 * pi; % angle in radian
u = cos(t)*x - sin(t)*y; % rotated coordinate system
v = sin(t)*x + cos(t)*y; % rotated coordinate system
N = (abs(u) <= 3*s) & (abs(v) <= L/2); % domain
k = exp(-u.^2/(2*s.^2)); % kernel
k = k - mean(k(N));
k(~N) = 0; % set kernel outside of domain to 0
res = conv2(img,k,'same');
out = max(out,res);
end
out = out/max(out(:)); % force output to be in [0,1] interval that MATLAB likes
imwrite(out,'so_result.png')
I get the following output:
I am implementing a simple algorithm to do in-painting on a "damaged" image. I have a predefined mask that specifies the area which needs to be fixed. My strategy is to start at the border of the masked area and in-paint each pixel with the mean of its neighboring non-zero pixels, repeating until there are no unknown pixels left.
function R = inPainting(I, mask)
H = [1 2 1; 2 0 2; 1 2 1];
R = I;
n = 1;
[row,col,~] = find(~mask); %Find zeros in mask (area to be inpainted)
unknown = horzcat(row, col)';
while size(unknown,2) > 0
    new_unknown = [];
    new_R = R;
    for u = unknown
        r = u(1);
        c = u(2);
        nb = R(max((r-n), 1):min((r+n), end), max((c-n),1):min((c+n),end));
        nz = nb~=0;
        nzs = sum(nz(:));
        if nzs ~= 0 %We have non-zero neighbouring pixels. In-paint with average.
            new_R(r,c) = sum(nb(:)) / nzs;
        else
            new_unknown = horzcat(new_unknown, u);
        end
    end
    unknown = new_unknown;
    R = new_R;
end
This works well, but it's not very efficient. Is it possible to vectorize such an approach, using mostly matrix operations? Does someone know of a more efficient way to implement this algorithm?
If I understand your problem statement, you are given a mask and you wish to fill in the pixels in this mask with the mean of the neighbourhood pixels that surround each of them. Another constraint is that the image is defined such that any pixels belonging to the mask are zero at the same spatial locations in the image. You start from the border of the mask and propagate information towards the inside of the mask. Given this algorithm, there is unfortunately no way you can do this with standard filtering techniques, as the current iteration is dependent on the previous iteration.
Image filtering mechanisms, like imfilter or conv2, can't work here because of this dependency.
As such, what I can do is help you speed up what is going on inside your loop, and hopefully this will give you some speedup overall. I'm going to introduce you to a function called im2col. This is from the Image Processing Toolbox, and given that you can use imfilter, we can use this function too.
im2col creates a 2D matrix such that each column is a pixel neighbourhood unrolled into a single vector. It grabs each pixel neighbourhood in column-major order: we take the neighbourhood at the top-left corner of the image, then move down one row, then another row, and keep going until we reach the last row; we then move one column over and repeat the same process. Each pixel neighbourhood gets unrolled into a single vector, and the output is an MN x K matrix, where each neighbourhood has size M x N and there are K neighbourhoods in total.
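As a toy illustration of what im2col produces (my own 4x4 example, not part of the algorithm):
A = reshape(1:16, 4, 4)            % 4x4 test image
cols = im2col(A, [3 3], 'sliding') % 9x4: one column per 3x3 neighbourhood,
                                   % unrolled in column-major order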
Therefore, at each iteration of your loop, we can unroll the current inpainted image's pixel neighbourhoods into single vectors, determine which pixel neighborhoods are non-zero and from there, determine how many zero values there are for each of these selected pixel neighbourhood. After, we compute the mean for these non-zero columns disregarding the zero elements. Once we're done, we update the image and move to the next iteration.
What we're going to need to do first is pad the image with a 1 pixel border so that we're able to grab neighbourhoods that extend beyond the borders of the image. You can use padarray, also from the image processing toolbox.
Therefore, we can simply do this:
function R = inPainting(I, mask)
R = double(I); %// For precision
n = 1;
%// Change - column major indices
unknown = find(~mask); %Find zeros in mask (area to be inpainted)
%// Until we have searched all unknown pixels
while numel(unknown) ~= 0
    new_R = R;
    %// Change - take image at current iteration and
    %// create columns of pixel neighbourhoods
    padR = padarray(new_R, [n n], 'replicate');
    cols = im2col(padR, [2*n+1 2*n+1], 'sliding');
    %// Change - Access the right pixel neighbourhoods
    %// denoted by unknown
    nb = cols(:,unknown);
    %// Get total sum of each neighbourhood
    nbSum = sum(nb, 1);
    %// Get total number of non-zero elements per pixel neighbourhood
    nzs = sum(nb ~= 0, 1);
    %// Replace the right pixels in the image with the mean
    new_R(unknown(nzs ~= 0)) = nbSum(nzs ~= 0) ./ nzs(nzs ~= 0);
    %// Find new unknown pixels to look at
    unknown = unknown(nzs == 0);
    %// Update image for next iteration
    R = new_R;
end
%// Cast back to the right type
R = cast(R, class(I));
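A quick way to exercise the function (a synthetic test of my own, not from the original question):
I = im2double(imread('cameraman.tif'));
mask = true(size(I));
mask(100:140, 120:180) = false;  % area to be inpainted (zeros in the mask)
I(~mask) = 0;                    % the damaged image is zero inside the mask
R = inPainting(I, mask);
imshowpair(I, R, 'montage')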
I wonder whether it would be possible to extract only hands from a video with MATLAB. In the video, hands perform some gesture. Because the first frames are only background, I tried this:
readerObj = VideoReader('VideoWithHands.mp4');
nFrames = readerObj.NumberOfFrames;
fr = get(readerObj, 'FrameRate');
writerObj = VideoWriter('Hands.mp4', 'MPEG-4');
set(writerObj, 'FrameRate', fr);
open(writerObj);
bg = read(readerObj, 1); %background
for k = 1 : nFrames
    frame = read(readerObj, k);
    hands = imabsdiff(frame,bg);
    writeVideo(writerObj,hands);
end
close(writerObj);
But I realized that the colors of the hands are not "real" and they are transparent. Is there a better way to extract them from the video, keeping colors and opacity, by exploiting the first frames (background)?
EDIT: Well, I have found a good setting for the vision.ForegroundDetector object; now the hands are white logical regions, but when I try to visualize them with:
videoSource = vision.VideoFileReader('VideoWithHands.mp4', 'VideoOutputDataType', 'uint8');
detector = vision.ForegroundDetector('NumTrainingFrames', 46, 'InitialVariance', 4000, 'MinimumBackgroundRatio', 0.2);
videoplayer = vision.VideoPlayer();
hands = uint8(zeros(720,1280,3));
while ~isDone(videoSource)
    frame = step(videoSource);
    fgMask = step(detector, frame);
    [m,n] = find(fgMask);
    a = [m n];
    if isempty(a)
        hands(:,:,:) = uint8(zeros(720,1280,3));
    else
        hands(m,n,1) = frame(m,n,1);
        hands(m,n,2) = frame(m,n,2);
        hands(m,n,3) = frame(m,n,3);
    end
    step(videoplayer, hands)
end
release(videoplayer)
release(videoSource)
or put them into a videofile with:
readerObj = VideoReader('Video 9.mp4');
nFrames = readerObj.NumberOfFrames;
fr = get(readerObj, 'FrameRate');
writerObj = VideoWriter('hands.mp4', 'MPEG-4');
set(writerObj, 'FrameRate', fr);
detector = vision.ForegroundDetector('NumTrainingFrames', 46, 'InitialVariance', 4000, 'MinimumBackgroundRatio', 0.2);
open(writerObj);
bg = read(readerObj, 1);
frame = uint8(zeros(size(bg)));
for k = 1 : nFrames
    frame = read(readerObj, k);
    fgMask = step(detector, frame);
    [m,n] = find(fgMask);
    hands = uint8(zeros(720,1280));
    if isempty([m n])
        hands(:,:) = uint8(zeros(720,1280));
    else
        hands(m,n) = frame(m,n);
    end
    writeVideo(writerObj,hands);
end
close(writerObj);
...my PC crashes. Any suggestions?
So you're trying to cancel out the background, making it black, right?
The easiest way to do this should be to filter it; you can do that by comparing your difference data to a threshold value and then using the result as indices to set a custom background.
filtered = imabsdiff(frame,bg);
bgindex = find( filtered < 10 );
frame(bgindex) = custombackground(bgindex);
where custombackground is whatever image you want to put into the background. If you want it to be just black or white, use 0 or 255 instead of custombackground(bgindex). Note that the numbers depend on your video data's format and could be inaccurate (except 0, which should always be right). If too much gets filtered out, lower the 10 above; if too much remains unfiltered, increase it.
At the end, you write your altered frame back into the video, so it just replaces the hands variable in your code.
Also, depending on your format, you might have to do the comparison across RGB values. This is slightly more complicated as it involves checking 3 values at the same time and doing some magic with the indices. This is the RGB version (works with anything containing 3 color bands):
filtered = imabsdiff(frame,bg); % differences at each pixel in each color band
totalfiltered = sum(filtered,3); % sums up the differences
% in each color band (RGB)
bgindex = find( totalfiltered < 10 ); % extracts indices of pixels
% with color close to bg
allind = sub2ind( [numel(totalfiltered),3] , repmat(bgindex,1,3) , ...
repmat(1:3,numel(bgindex),1) ); % index magic
frame(allind) = custombackground(allind); % copy custom background into frame
EDIT:
Here's a detailed explanation of the index magic.
Let's assume a 50x50 image. Say the pixel at row 2, column 5 is found to be background, then bgindex will contain the number 202 (linear index corresponding to [2,5] = (5-1)*50+2 ). What we need is a set of 3 indices corresponding to the matrix coordinates [2,5,1], [2,5,2] and [2,5,3]. That way, we can change all 3 color bands corresponding to that pixel. To make calculations easier, this approach actually assumes linear indexing for the image and thus converts it to a 2500x1 image. Then it expands the 3 color bands, creating a 2500x3 matrix. We now construct the indices [202,1], [202,2] and [202,3] instead.
To do that, we first construct a matrix of indices by repeating our values. repmat does this for us, it creates the matrices [202 202 202] and [1 2 3]. If there were more pixels in bgindex, the first matrix would contain more rows, each repeating the linear pixel coordinates 3 times. The second matrix would contain additional [1 2 3] rows. The first argument to sub2ind is the size of the matrix, in this case, 2500x3, so we calculate the number of pixels with numel applied to the sum vector (which collapses the image's 3 bands into 1 value and thus has 1 value per pixel) and add a static 3 in the second dimension.
sub2ind now takes each element from the first matrix as a row index, each corresponding element from the second matrix as a column index and converts them to linear indices into a matrix of the size we determined earlier. In our example, this results in the indices [202 2702 5202]. sub2ind preserves the shape of the inputs, so if we had 10 background pixels, this result would have the size 10x3. But since linear indexing doesn't care about the shape of the index matrix, it just takes all of those values.
To confirm this is correct, let's revert the values in the example. The original image data would have the size 50x50x3. For an NxMxP matrix, a linear index to the subscript [n m p] can be calculated as ind = (p-1)*M*N + (m-1)*N + n. Using our values, we get the following:
[2 5 1] => 202
[2 5 2] => 2702
[2 5 3] => 5202
ind2sub confirms this.
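You can reproduce these numbers in a few lines:
bgindex = 202;                                       % linear index of pixel [2,5] in a 50x50 image
allind = sub2ind([2500 3], repmat(bgindex,1,3), 1:3) % [202 2702 5202]
[n,m,p] = ind2sub([50 50 3], allind)                 % n = [2 2 2], m = [5 5 5], p = [1 2 3]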
Yes, there is a better way. The computer vision system toolbox includes a vision.ForegroundDetector object that does what you need. It implements the Gaussian Mixture Model algorithm for background subtraction.
I have a binary barcode image, and I want to apply a Sobel filter that I created so that I can detect the positive/negative vertical edges of the barcode lines, which I can then use to determine the start/end of each barcode line.
The idea of implementing my own Sobel filter is to preserve the polarity (for the start/end of each barcode line) and then plot different colored lines over the positive/negative values, marking the beginning and end of each bar in the barcode.
My problem is that I am not getting every line from my original barcode in my output image when I apply my filter. Can anyone see the problem in my code?
Image = imread('barcode.jpg')
I = im2double(Image);
G = rgb2gray(I);
avgI = mean(mean(G))
Threshold = avgI;
T = (G < Threshold);
figure()
New = T.*G
imshow(New);
G = [1 0 -1
     2 0 -2
     1 0 -1];
Output = I;
for n = 1:1000
    Output = conv2(New,G); % 2D Convolution Function
end
subplot(1,2,1), imshow (New);
subplot(1,2,2), imshow (Output);
Use the convolution operation directly on the thresholded image. Also make sure to convert it to double, since it will be of class logical. Lastly, I don't know why you were performing the convolution 1000 times, but overall I think you put in a good effort and were relatively close to a solution.
Code:
Image =imread('barcode.jpg');
I = im2double(Image);
G = rgb2gray(I);
avgI=mean(mean(G));
Threshold = avgI;
T=(G<Threshold);
figure
G = [1 0 -1
2 0 -2
1 0 -1];
Output = conv2(double(T),G); % 2D Convolution Function
subplot(1,2,1), imshow (T,[]);
subplot(1,2,2), imshow (Output,[]);
Output:
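To recover the edge polarity the question asks about, here is one possible follow-up sketch (the 'same' output shape and the half-max thresholds are my assumptions, not part of the answer): recompute the response with 'same' so coordinates line up with the image, then mark positive and negative columns with different colors.
Output = conv2(double(T), G, 'same');  % 'same' keeps coordinates aligned with T
row = Output(round(end/2), :);         % one horizontal scan line of the response
posEdges = find(row > 0.5*max(row));   % strong responses of one edge polarity
negEdges = find(row < 0.5*min(row));   % strong responses of the other polarity
imshow(T, []); hold on
for x = posEdges, plot([x x], [1 size(T,1)], 'g'); end % green lines on positive edges
for x = negEdges, plot([x x], [1 size(T,1)], 'r'); end % red lines on negative edges
hold off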