I have several video sequences exhibiting light intensity flickering (under fluorescent light sources).
This happens when the exposure time and/or the frame interval is not a whole multiple of the lamp's flicker period, for example when shooting video at a 1/50 second shutter speed under 60 Hz mains.
In general, I need to solve this without knowing the sampling rate, the electrical frequency, or the video frame rate. I just see the flickering and need to fix it.
The video scenes include moving objects as well (some move slowly, some as fast as the rapid change in intensity due to the flickering).
Is there a well known method of dealing with such flickering?
Thanks!
The common method for removing flickering is along the following lines. In the difference image between consecutive frames, the flickering should appear as a strong periodic signal along the vertical axis of the image, so it should have a strong coefficient in the frequency domain. The flickering can therefore be detected and removed by finding the frequency-domain coefficients that represent it in the difference image, removing their contribution, and transforming back to the spatial domain.
In pseudocode this algorithm looks like this:
imDiff = I_{t+1} - I_t (compute the difference between subsequent video frames)
imDiffFiltered = FilterImDiff(imDiff)
imDiffRowSum = RowSum(imDiffFiltered) (sum each row of the filtered difference image)
dctCoef = DiscreteCosineTransform(imDiffRowSum)
flickeringDctCoef = SomeHeuristicToFindFlickeringCoef(dctCoef) (keep the flickering coefficients, set all others to zero)
flickeringIm = CloneColumn(InverseDiscreteCosineTransform(flickeringDctCoef), numCols) / numCols
fixedimDiff = imDiff - flickeringIm
fixedI_{t+1} = I_t + fixedimDiff
where:
RowSum(x) takes an m x n image as input and returns a column vector of size m x 1 where element i contains the sum of the i'th row in the image x.
CloneColumn(x, n) takes a column vector x of size m and clones it n times in order to create an m x n matrix.
numCols is the number of columns in the input image. Dividing by it turns the row sums back into a per-pixel flicker estimate.
A simple algorithm for SomeHeuristicToFindFlickeringCoef is to choose the first couple of largest coefficients, if they are greater than a certain threshold.
FilterImDiff should discard stuff from the difference image that doesn't contain flickering, such as movement of foreground objects. For example, pixels that have a temporal difference that is greater than the maximal magnitude of flickering. Also, pixels that are too bright or too dark usually don't have flickering in them.
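The steps above can be sketched in MATLAB on two synthetic grayscale frames. This is only an illustration under assumptions: it relies on dct and idct from the Signal Processing Toolbox, the frame size, flicker shape, maxFlicker and the 0.01 coefficient threshold are made-up values, and it uses the row mean over the pixels that survive the filter instead of the row sum, so no extra division by numCols is needed.

```matlab
% Two synthetic frames: a flat scene, and the same scene plus row-wise flicker.
m = 64; n = 48;
prev = 0.5 * ones(m, n);                                  % I_t
rows = (0:m-1)';
flicker = 0.05 * cos(pi * (2*rows + 1) * 6 / (2*m));      % one pure DCT basis function
curr = prev + repmat(flicker, 1, n);                      % I_{t+1}

imDiff = curr - prev;

% FilterImDiff: ignore pixels whose change is too large to be flicker (assumed limit).
maxFlicker = 0.1;
valid = abs(imDiff) <= maxFlicker;
rowMean = sum(imDiff .* valid, 2) ./ max(sum(valid, 2), 1);

% Heuristic: keep the couple of largest DCT coefficients that exceed a threshold.
coef = dct(rowMean);
[~, order] = sort(abs(coef), 'descend');
keep = order(1:2);
keep = keep(abs(coef(keep)) > 0.01);
flickCoef = zeros(size(coef));
flickCoef(keep) = coef(keep);

% Rebuild the flicker pattern and subtract it from the new frame.
flickeringIm = repmat(idct(flickCoef), 1, n);
fixed = curr - flickeringIm;
```

On real footage the valid mask would also exclude very bright and very dark pixels, and the thresholds would need tuning per sequence.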
Related
I am trying to write a program that uses computer vision techniques to detect (and track) tiny blobs in a stream of very noisy images. The image stream comes from a dual X-ray imaging setup, which outputs left and right views (of different sizes because they are collimated differently). My data is of two types: one set of images is not so noisy, which I am just using to try different techniques, and the other set is noisier, and this is where the detection needs to work in the end. The image stream is at 60 Hz. This is an example of a raw image from the X-ray imager:
Here are some cropped out samples of the regions of interest. The blobs that need to be detected are the small black spots near the center of the image.
Initially I started off with simple contour/blob detection techniques in OpenCV, which were not very helpful. Eventually I moved on to techniques such as "opening" the image using morphological operators, and subsequently performing a Laplacian of Gaussian blob detection to detect areas of interest. This gave me better results for the low-noise versions of the images, but it fails on the high-noise ones: it gives me too many false positives. Here is a result from a low-noise image (please note the input image was inverted).
The code for my current LoG based approach in MATLAB goes as below:
while ~isDone(videoReader)
    frame = step(videoReader);
    roi_frame = imcrop(frame, [660 410 120 110]);
    I_roi = rgb2gray(roi_frame);
    I_roi = imcomplement(I_roi);
    I_roi = wiener2(I_roi, [5 5]);
    background = imopen(I_roi,strel('disk',3));
    I2 = imadjust(I_roi - background);
    K = imgaussfilt(I2, 5);
    level = graythresh(K);   % note: computed but not used below
    bw = im2bw(I2);          % uses the default threshold of 0.5
    sigma = 3;
    % Filter image with LoG
    I = double(bw);
    h = fspecial('log',sigma*30,sigma);
    Ifilt = -imfilter(I,h);
    % Threshold for points of interest
    Ifilt(Ifilt < 0.001) = 0;
    % Dilate to obtain local maxima
    Idil = imdilate(Ifilt,strel('disk',50));
    % This is the final image
    P = (Ifilt == Idil) .* Ifilt;
end
Is there any way I can improve my current detection technique to make it work for images with a lot of background noise? Or are there techniques better suited for images like this?
The approach I would take:
- Average background subtraction.
- Aggressive Gaussian smoothing. This filter should be shaped based on your target object: off the top of my head, I think you want the sigma to be about half the smallest cross section of your object, but you may want to fiddle with this. Basically, the goal is to blur the noise as much as possible without completely losing your target objects (based on shape and size).
- Edge detection. Try to be specific to the object if possible (basically, look at what the object's edge looks like after Gaussian smoothing and set your edge detection to look for that width and contrast shift).
- You may consider running a closing operation here.
- Search the whole image for islands (fully enclosed regions), then filter based on size and then on shape.
My hunch is that, despite the incredibly low signal-to-noise ratio, the granularity of your noise is significantly smaller than your object size. (If your noise has both equivalent contrast and the same ballpark size as your object... you are sunk and need to re-evaluate your acquisition, imo.)
Another note, based on your speed needs: you can save a great deal of processing by using the last known positions and searching locally around them, and by knowing where new targets can enter the image from.
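A rough MATLAB sketch of that pipeline on synthetic data follows. It is an illustration under assumptions, not a tuned detector: the frame stack, blob size, sigma, threshold and area limits are all made up, a plain threshold stands in for the edge-detection step, and bwconncomp and regionprops come from the Image Processing Toolbox.

```matlab
% Synthetic stack: noisy static background, plus one dark 5x5 blob in the last frame.
rng(0);
H = 64; W = 64; N = 20;
bg = repmat(linspace(0.4, 0.6, W), H, 1);
stack = repmat(bg, 1, 1, N) + 0.05 * randn(H, W, N);
frame = stack(:, :, end);
frame(30:34, 40:44) = frame(30:34, 40:44) - 0.3;

% Average background subtraction (dark blobs become positive).
background = mean(stack(:, :, 1:end-1), 3);
d = background - frame;

% Aggressive Gaussian smoothing, sigma about half the blob width (assumption).
sigma = 2;
r = ceil(3 * sigma);
g = exp(-(-r:r).^2 / (2 * sigma^2));
g = g / sum(g);
smoothed = conv2(g, g, d, 'same');       % separable Gaussian: columns, then rows

% Threshold (stand-in for edge detection), then keep islands of plausible size.
bw = smoothed > 0.1;
cc = bwconncomp(bw);
stats = regionprops(cc, 'Area', 'Centroid');
areas = [stats.Area];
keep = stats(areas >= 10 & areas <= 200);
```

Each entry of keep then holds a candidate blob; its Centroid could seed the local search in the next frame.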
I am a beginner in the field of digital image processing. I am currently working on a project where I have to decompose an image into two frequency components (low and high) using the DCT. I searched a lot on the web and found that MATLAB has a built-in function for the Discrete Cosine Transform, which is used like this:
dct_img = dct2(img);
where img is input image and dct_img is resultant DCT of img.
Question
My question is, "How can I decompose the dct_img into two frequency components namely low and high frequency components".
As you've mentioned, dct2 and idct2 will do most of the job for you. The question that remains is then: What is high frequency and what is low frequency content? The coefficients after the 2 dimensional transform will actually represent two frequencies each (one in x- and one in y-direction). The following figure shows the bases for each coefficient in an 8x8 discrete cosine transform:
Therefore, the question of low vs. high can be answered in different ways. A common way, which is also used in JPEG encoding, proceeds diagonally from zero frequency down to the maximum, as shown above. This ordering is motivated by the fact that the energy of natural images is largely located in the "top left" corner of "low" frequencies, as the following example shows. It is certainly worth looking at the result of dct2 and playing around with the actual choice of your regions for high and low.
In the following I'm dividing the spectrum diagonally and also plotting the DCT coefficients, in logarithmic scale because otherwise we would just see one big peak around (1,1). In the example the cut assigns far more than half of the coefficients to the high-frequency part (adjustable with cutoff), so the high-frequency part ("HF") still contains some relevant image information. If you set cutoff to 0 or below, only noise of small amplitude will be left.
%// Load an image
Orig = double(imread('rice.png'));
%// Transform
Orig_T = dct2(Orig);
%// Split between high- and low-frequency in the spectrum (*)
cutoff = round(0.5 * 256);
High_T = fliplr(tril(fliplr(Orig_T), cutoff));
Low_T = Orig_T - High_T;
%// Transform back
High = idct2(High_T);
Low = idct2(Low_T);
%// Plot results
figure, colormap gray
subplot(3,2,1), imagesc(Orig), title('Original'), axis square, colorbar
subplot(3,2,2), imagesc(log(abs(Orig_T))), title('log(DCT(Original))'), axis square, colorbar
subplot(3,2,3), imagesc(log(abs(Low_T))), title('log(DCT(LF))'), axis square, colorbar
subplot(3,2,4), imagesc(log(abs(High_T))), title('log(DCT(HF))'), axis square, colorbar
subplot(3,2,5), imagesc(Low), title('LF'), axis square, colorbar
subplot(3,2,6), imagesc(High), title('HF'), axis square, colorbar
(*) Note on tril: The lower-triangle function operates with respect to the mathematical diagonal from top-left to bottom-right. Since I want the other diagonal, I'm flipping left-right before and afterwards.
Also note that these kinds of operations are not usually applied to entire images, but rather to blocks of e.g. 8x8. Have a look at blockproc and this article.
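A block-wise variant could look roughly like the following sketch. It assumes the Image Processing Toolbox (blockproc, dct2, idct2 and the rice.png demo image); the 4x4 "low" region inside each 8x8 block is an arbitrary choice made for illustration.

```matlab
img = double(imread('rice.png'));        % 256x256 demo image shipped with the toolbox

mask = zeros(8);
mask(1:4, 1:4) = 1;                      % coefficients treated as "low" (arbitrary choice)

% blockproc hands each block to the function as a struct with a .data field.
lowFun = @(blk) idct2(dct2(blk.data) .* mask);
Low = blockproc(img, [8 8], lowFun);

% The DCT is linear, so whatever is not in Low is the high-frequency part.
High = img - Low;
```

A diagonal mask, as in the whole-image example, can be substituted for the square one without changing the rest.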
An easy example:
I2 = dct_img;
I2(8:end,8:end) = 0;
I3 = idct2(I2);
imagesc(I3)
I3 can be seen as the image after a low-pass filter (every coefficient whose row and column indices are both 8 or higher is set to zero), and idct2(dct_img - I2) can then be viewed as the high-frequency part.
I've got a frame in JPEG format.
I want to determine whether it's a fade in/out frame (most of which are black frames) or not.
I read an article and tried to do exactly what it describes, but it won't work properly.
This is the idea:
First, the frame feature vector is defined:
color histogram is computed only from the Hue component, which represents the dominant spectral component color in its pure form (Manjunath et al., 2001). Moreover, the quantization of the color histogram is set to 16 color bins, aiming at reducing significantly the amount of data without loosing important information
Then it computes the standard deviation of the frame feature vector. The standard deviation of monochromatic frames is equal to zero or a sufficiently small value close to zero. This information is used by VSUMM to remove these frames. This step is also employed by Furini et al. (2010).
The code is in MATLAB:
str = num2str(50);
filename1=strcat('pics\' , str , '.jpeg');
Im1 = imread(filename1);
hsv = rgb2hsv(Im1);
hn1 = hsv(:,:,1);
hn1 = hn1/norm(hn1);
f=std2(hn1)
According to the idea, f should be equal to zero or a sufficiently small value close to zero. That holds for all fade in/out frames, but the result is sometimes also a small value close to zero for an ordinary frame, which is wrong. What is wrong with it?
As an example, I uploaded 4 pictures:
The results for the first two pictures, which are fade in/out frames, are 9.3340e-04 and 9.9959e-04, and the result for the 3rd image, which is a normal frame, is 0.23. All of these are correct. But the result for some frames like the 4th one, which is a normal frame, is 8.2447e-04, which is wrong.
Honestly, this particular code isn't that important. I just want code that distinguishes normal frames from fade in/out frames.
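One simple alternative, sketched here in MATLAB, is to look at how much the brightness varies across the frame rather than at the hue. This is not the method from the quoted article: the helper name isMonochrome, the synthetic frames and the tolerance are all assumptions for illustration.

```matlab
% Hypothetical helper: true when the frame's brightness is nearly uniform.
% Brightness here is the plain mean of the three colour channels on a 0..1 scale.
isMonochrome = @(rgb, tol) std(reshape(mean(im2double(rgb), 3), [], 1)) < tol;

blackFrame  = zeros(48, 64, 3, 'uint8');                        % fully faded frame
normalFrame = uint8(repmat(linspace(0, 255, 64), 48, 1, 3));    % horizontal ramp

tol = 0.02;    % assumed tolerance; tune on real footage
flagBlack  = isMonochrome(blackFrame, tol);
flagNormal = isMonochrome(normalFrame, tol);
```

On real video the frame would come from imread, and partially faded frames may need a looser tolerance or an additional check on the mean brightness.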
I have two images: a mannequin with and without a garment.
Please refer to the sample images below. Ignore the jewels and footwear on the mannequin; imagine the second mannequin has only the dress.
I want to extract only the garment from the two images for further processing.
The complication is that the camera position is slightly displaced between the two pictures. Because of this, simple subtraction to generate the garment mask will not work.
Can anyone tell me how to handle it?
I think I need to do registration between the two images so that I can extract only the garment from the image?
Any references to blogs, articles and code are highly appreciated.
--
Thanks
Idea
This is an idea of how you could do it. I haven't tested it, but my gut tells me it might work. I'm assuming that there will be slight differences in the pose of the mannequin as well as in the camera attitude.
Let the original image be A, and the clothed image be B.
Take the difference D = |A - B|, apply a median filter that is proportional to the largest deviation you expect from pose and camera attitude error: Dmedian = Median(D, kernelsize).
Quantize Dmedian into a binary mask Dmask = Q(Dmedian, threshold) using appropriate threshold values to obtain an approximate mask for the garment (this will be smaller than the garment itself due to the median filter). Reject any shapes in Dmask whose area is too small by setting their pixels to 0.
Expand the shape(s) in Dmask proportionally to the size of the median kernel into Emask=expand(Dmask, k*kernelsize). Then construct the difference in the masks Fmask=|Dmask - Emask| which now contains areas of pixels where the garment edge is expected to be. For every pixel in Fmask which is in this area, find the correlation Cxy between A and B using a small neighbourhood, store the correlations into an image C=1.0 - Corr(A,B, Fmask, n).
Your final garment mask will be M=C+Dmask.
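The first steps (difference, median filter, threshold, small-shape rejection and expansion) could be sketched in MATLAB as below. The two images are synthetic stand-ins, kernelsize, threshold, k and the minimum area of 20 pixels are guesses, and medfilt2, bwareaopen, imdilate and strel are assumed to be available from the Image Processing Toolbox.

```matlab
% Synthetic stand-ins: A = bare mannequin, B = same scene shifted by one pixel plus a garment.
A = zeros(80, 80);
A(20:60, 30:50) = 0.5;                   % grey body on a black background
B = circshift(A, [0 1]);                 % slight camera displacement to the right
B(30:50, 28:53) = 1.0;                   % bright garment

kernelsize = 5;  threshold = 0.25;  k = 1;       % all guesses

D = abs(A - B);
Dmedian = medfilt2(D, [kernelsize kernelsize]);  % removes the thin misalignment lines
Dmask = Dmedian > threshold;
Dmask = bwareaopen(Dmask, 20);                   % reject shapes smaller than 20 pixels

% Expand the mask and keep only the new band, where the true edge is expected.
Emask = imdilate(Dmask, strel('disk', k * kernelsize));
Fmask = xor(Dmask, Emask);
```

Fmask then marks the pixels on which the more expensive correlation test needs to run.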
Explanation
Since your image has nice and continuous swatches of colour, the difference between the two similar images will be thin lines and small gradients where the pose and camera attitude is different. When taking a median filter of the difference image over a sufficiently large kernel, these lines will be removed because they are in a minority of the pixels.
The garment, on the other hand, will (hopefully) differ significantly in colour from the unclothed version and will generate a bigger difference. Thresholding the difference after the median filter should give you a rough mask of the garment that is undersized, because some of the pixels on the edge are rejected when their median values are too low. You could stop here if the approximation is good enough for you.
By expanding the mask we obtained above, we get a probable region for the "true" edge. The above process has served to narrow our search region for the true edge considerably, and we can apply a more costly correlation search between the images along this edge to find where the garment is. High correlation means no garment and low correlation means garment.
We use the inverted correlation as an alpha value together with the initially smaller mask to obtain an alpha-valued mask of the garment that can be used for extracting it.
Clarification
Expand: What I mean by "expanding the mask" is to find the contour of the mask region and outsetting/growing/enlarging it to make it larger.
Corr(A,B,Fmask,n): This is just an arbitrarily chosen correlation function that gives the correlation between pixels in A and B that are selected by the mask Fmask, using a region of size n. It yields 1.0 for a perfect match and 0.0 for completely uncorrelated patches; because the correlation is squared, an inverted patch also scores 1.0. A good choice is this pseudocode, which fills C directly:
foreach px_pos in Fmask where Fmask[px_pos] == 1
    Ap = subregion(A, px_pos, n) - mean(mean(A));
    Bp = subregion(B, px_pos, n) - mean(mean(B));
    Cxy = sum(sum(Ap .* Bp))*sum(sum(Ap .* Bp)) / (sum(sum(Ap.*Ap))*sum(sum(Bp.*Bp)));
    C[px_pos] = 1.0 - Cxy;
end
where subregion selects a region of size n around the pixel with position px_pos.
You can see that if Ap == Bp then Cxy = 1, so C is 0 at that pixel.
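A runnable MATLAB version of that loop might look like the sketch below. It departs from the pseudocode in two labelled ways: the mean is taken over each local patch rather than the whole image, and eps is added to the denominator to avoid dividing by zero on flat patches. The test images, the mask and n (used here as a half-width) are made up.

```matlab
rng(0);
A = rand(40);
B = A;
B(15:25, 15:25) = rand(11);              % B differs from A only inside this square

Fmask = false(40);
Fmask(10:30, 10:30) = true;              % pixels to test
n = 2;                                   % half-width of the neighbourhood (assumption)

C = zeros(size(A));
[rows, cols] = find(Fmask);
for i = 1:numel(rows)
    r = rows(i);  c = cols(i);
    rr = max(r - n, 1):min(r + n, size(A, 1));   % clip the patch at the borders
    cc = max(c - n, 1):min(c + n, size(A, 2));
    Ap = A(rr, cc);  Ap = Ap - mean(Ap(:));      % local mean, not the global one
    Bp = B(rr, cc);  Bp = Bp - mean(Bp(:));
    Cxy = sum(Ap(:) .* Bp(:))^2 / (sum(Ap(:).^2) * sum(Bp(:).^2) + eps);
    C(r, c) = 1 - Cxy;                           % low correlation -> high alpha
end
```

Pixels whose neighbourhood is identical in both images end up near 0 in C, while pixels inside the changed square end up near 1.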
Attached are two consecutive frames captured by a CMOS camera with an IR filter. The checkerboard object was stationary while the images were captured, but the two images differ in nearly 31000 pixels, which could affect my results. Can you tell me what kind of noise this is and how I can remove it? Please suggest any algorithms or functions that could remove this noise.
Thank you.Sorry for my poor English.
Image1 : [1]: http://i45.tinypic.com/2wptqxl.jpg
Image2: [2]: http://i45.tinypic.com/v8knjn.jpg
That noise appears to come from the camera sensor (Bayer-to-RGB conversion): the checkerboard pattern of the sensor is still visible.
Lossy JPEG compression also contributes a lot to it. You should first get access to the raw images.
For those particular images, I'd first try edge detection filters (horizontal and vertical Sobel) to make a mask that selects between some median filter or local histogram equalization for the flat areas and some checkerboard-reducing filter for the edges. The point is that probably no single filter does a good job on both the JPEG ringing artifacts and the jagged edges. Then the real question is: what other kinds of images should be processed?
From the comments: if the corner points are to be made exact, then the more likely solution is to search for features (corner points with subpixel resolution), map one set of points to the other image's set of corners, and search for the best affine transformation matrix that converts these sets to each other. With this matrix one can then resample the other image.
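Finding the best affine matrix from matched corner points is a linear least-squares problem. A minimal MATLAB sketch, with made-up corner coordinates and a made-up ground-truth transform:

```matlab
% Hypothetical corner coordinates (N-by-2, [x y]) detected in the first image.
P = [10 10; 200 12; 15 180; 205 185; 100 95];

% Made-up 3x2 affine transform, applied as [x y 1] * Ttrue, to generate the
% matching corners of the second image.
Ttrue = [1.01 0.02; -0.015 0.99; 0.8 -0.5];
Ph = [P ones(size(P, 1), 1)];            % homogeneous coordinates
Q = Ph * Ttrue;

% Least-squares estimate of the transform from the point pairs.
T = Ph \ Q;
residual = Ph * T - Q;                   % per-point fitting error
```

With noisy real corners the residual will not be zero, and its size gives a quick check on how well a single affine transform explains the displacement.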
Fortunately, one can estimate motion vectors with subpixel resolution without brute-force searching all possible subpixel locations: when calculating a matched filter, one gets local maxima as potential candidates for exact matches. But this is not all there is. One can calculate a more precise approximation of the peak location by studying the matched filter outputs in the nearby pixels. For an exact match the output should be symmetric. Otherwise the 'energies' of the matched filter are biased towards the second-best location. (A 2nd-degree polynomial fit followed by finding its maximum can work.)
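The 2nd-degree polynomial fit has a closed form when only the peak sample and its two neighbours are used. A small MATLAB sketch with made-up filter outputs:

```matlab
% Matched-filter outputs at the integer peak (x = 0) and its two neighbours.
% The numbers are samples of a parabola whose true maximum lies at x = 0.3.
x = [-1 0 1];
y = -(x - 0.3).^2;

% Vertex of the parabola through the three samples, relative to the integer peak.
offset = 0.5 * (y(1) - y(3)) / (y(1) - 2*y(2) + y(3));
```

In 2-D the same formula can be applied separately along rows and columns around the integer peak.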
Looking closely at these images, I must agree with @Aki Suihkonen.
In my view, the main noise comes from the JPEG compression, which causes sharp edges to "ring". I'd try a "de-speckle" type of filter on the images and see if this makes a difference. Some info that can help you implement this can be found in this link.
In a more quick and dirty fashion, you can apply one of the many standard tools. For example, given the images a and b:
(i) Just smooth the images with a Gaussian filter; this can reduce noise differences between the images by an order of magnitude. For example:
h=fspecial('gaussian',15,2);
a=conv2(a,h,'same');
b=conv2(b,h,'same');
(ii) Reduce Noise By Adaptive Filtering
a = wiener2(a,[5 5]);
b = wiener2(b,[5 5]);
(iii) Adjust Intensity Values Using Histogram Equalization
a = histeq(a);
b = histeq(b);
(iv) Adjust Intensity Values to a Specified Range
a = imadjust(a,[0 0.2],[0.5 1]);
b = imadjust(b,[0 0.2],[0.5 1]);
If your images are supposed to be black and white but you have captured them in gray scale, there could be differences due to noise.
You can convert the images to black and white by defining a threshold: any pixel with a value less than the threshold is assigned 0 and anything at or above the threshold is assigned 1, or whatever the top of your gray scale range is (maybe 255).
Assume your image is I, its gray levels run from 0 to 255, and you choose a threshold of 100:
dark = I < 100;   % logical mask, computed before I is modified
I(dark) = 0;
I(~dark) = 255;
Now you have a black and white image. Do the same thing for the other image and you should get a very small difference if the camera and the subject have not moved.