How to enhance Tesseract automatic text rotation capabilities for OCR? - python-imaging-library

I have a set of PIL images, where some pages are correctly rotated, while others have a rotation close to 180°. This means that automatic orientation detection may fail: instead of a 178° rotation it recognizes a 2° orientation.
Unfortunately, Tesseract sometimes cannot tell the difference between a 2° and a 178° orientation, so in the latter case the output is completely wrong.
A simple im.rotate(180) fixes this, but the step is manual, and I would like Tesseract to understand automatically whether the text is upside-down or not.
Some approaches I have looked at rely on the Hough transform to estimate the prevalent orientation in the document. In this case, however, they may fail because of the peculiar orientation of these scanned documents.
What options for automatic rotation are available, without relying on third-party scripts, staying within Python libraries?

I am new to StackOverflow, so please forgive any mistakes or inaccuracies in this answer. In case anyone is still looking for an answer, pytesseract's image_to_osd function gives information about the orientation. It only reports the orientation as 0°, 90°, 180° or 270°, i.e. it determines the orientation accurately when the text is aligned with the axes, but it still outputs one of those four angles even when the actual orientation is different.
So if you are working with minute angle differences like 2° or so, this should solve the issue: first we align (deskew) the text and then use the function.
Here is the code in Python:
import re
import pytesseract
# `image` is one of the PIL images from the question

while True:
    osd_rotated_image = pytesseract.image_to_osd(image)
    # using regex we search for the rotation angle (as a string) in the OSD report
    angle_rotated_image = re.search(r'(?<=Rotate: )\d+', osd_rotated_image).group(0)
    if angle_rotated_image == '0':
        # break the loop once we get the correctly oriented image
        break
    elif angle_rotated_image == '90':
        # PIL's rotate stands in for the original rotate(image, angle, background) helper;
        # expand keeps the whole page and fillcolor paints any exposed background white
        image = image.rotate(90, expand=True, fillcolor=(255, 255, 255))
    elif angle_rotated_image == '180':
        image = image.rotate(180, expand=True, fillcolor=(255, 255, 255))
    elif angle_rotated_image == '270':
        # rotate by the reported angle in one go (the original rotated 90° and let
        # the loop re-check, which also converges, just with extra passes)
        image = image.rotate(270, expand=True, fillcolor=(255, 255, 255))
And to align the text first, the deskew Python library is the best option in my opinion.
Thank you.

Related

Making the "Region of Interest" (ROI) transparent in MATLAB

I've already made a function to cut out the image, and the part I cut out has a black background. I'm trying to make the black part transparent so that I can generate an image sequence and create a video from it. I've tried converting the image to a double and then replacing the 0 values with NaN:
J = imread('imgExample.jpg');
J2 = im2double(J);
J2(J2 == 0) = NaN;
imwrite(J2, 'newImg.jpg');
but when I convert it into a video, the transparency doesn't seem to stay. Is there any way to get the black part of the image to be transparent?
From clarifications in the comments, you are trying to create a video format that supports alpha transparency using MATLAB.
In general this seems impossible using MATLAB alone (at least in MATLAB 2013, which is the version I use). If you'd like to check whether the newest MATLAB supports videos with alpha transparency, type doc VideoWriter and have a look at the available formats. If you see anything with transparency options there, take it from there. But the most I see on mine is 24-bit RGB video (i.e. three channels, no transparency).
So MATLAB does not have the ability to produce native .avi video with alpha transparency.
However, note that this is a very rare video format anyway, and even if you did manage to produce such a video, you would still have to find a suitable viewer which supports playing videos with transparency!
It's therefore important for you to tell us your particular use case, because it may be that you're actually trying to do something much simpler which may or may not be solvable via MATLAB (i.e. a case of the XY Problem).
E.g. you may be trying to create a video with transparency for the web instead, like here: https://developers.google.com/web/updates/2013/07/Alpha-transparency-in-Chrome-video
If this is the case, then I would recommend attempting the method outlined there; you can create individual .png "frames" with transparency in MATLAB using the imwrite function. Have a look at its documentation, particularly the section about PNG images and the 'Alpha' property. But beyond that, you'd need an external tool to combine them into a .webm file, since MATLAB doesn't seem to have one built in (at least none that I can see at a glance; there might be a third-party toolbox if you look on the web).
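For example, a minimal sketch of the per-frame PNG step could look like this (the mask logic and the frame file name are my own placeholders, assuming the cut-out region is marked by pure black pixels):

J = imread('imgExample.jpg');            % the cut-out image with a black background
% build an alpha matrix: 0 (transparent) where all channels are black, 1 elsewhere
alphaMask = double(any(J > 0, 3));
% PNG supports an alpha channel; JPEG does not, so write .png frames
imwrite(J, 'frame_0001.png', 'Alpha', alphaMask);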
Hope this helps.

How to stop MATLAB clipping the title of a figure when I print

When I create a figure with a title in MATLAB, then use the File|Print option to print the figure, the title is clipped. Please try this code as an example:
t = linspace(0,2*pi,1000);
s = sin(t);
figure
plot(t,s)
titleString = sprintf('Multi\nLine\nTitle');
title(titleString)
disp('Now press File|Print Preview on the figure and observe that the title is clipped.')
disp('This happens with all titles, the multi line title makes it more obvious.')
disp('I know I can fix it with Fill Page or Center, but I should not have to.')
You can also see the problem in Print Preview. As I say in the example code, I know I can get round the problem using Print Preview and then Fill Page or Center, but I don't want people using my code to have to use a workaround.
I have observed this problem with R2014a and R2015b. I assume other releases are also affected.
Is there a setting I can make before creating the figure that centres the plot or fills the page and makes the problem go away? Is there some other setting I should make to avoid the problem?
Here is a little more debug information. If I press File|Print Preview, MATLAB reports Left 0.64, Top -0.59, Width 20.32, Height 15.24. I guess the problem is related to the negative Top value. These are defaults from MATLAB; I have not made any attempt to change these values.
One extra thing: I am in the UK, so my default paper/printer setting will be for A4 paper, if that makes a difference.
Edit:
It looks like my problems are caused by two lines further up in my program:
set(0,'DefaultFigurePaperOrientation','landscape')
set(0,'DefaultFigurePaperType','A4')
I think that because figures expect to be on paper with a portrait orientation, I am seeing these problems.
Perhaps I should revise my question to: what do I need to change in MATLAB figures so that they print correctly on landscape A4 paper (ideally in the centre, scaled to fill the page, but with the correct orientation), all without using Print Preview?
But I am going to do this instead to code around my problem.
set(0,'DefaultFigurePaperOrientation','portrait')
set(0,'DefaultFigurePaperType','A4')
I can't seem to reproduce your problem on the computer I'm currently on (see the values I'm getting by default - Top is 8.11):
However, if your problem is what I think it is (I'm getting something that fits this description on another computer I'm working on), try adding _{ } at the end of your string. This is a TeX string meaning "subscripted space" which pushes the rest of the text slightly upward. You can also use ^{ } on the first line if the clipping is happening from the top. I found this workaround to work on axis titles and labels as well.
Exaggerated, the workaround looks like this:
titleString = sprintf('^{^{^{^{^{^{ }}}}}}Multi\nLine\nTitle');
Which shows the word "Multi" even for Top = -0.59.
If the above is not what you're looking for, you might want to look at the robust export_fig.
I can confirm that my clipping problems are caused by this line:
set(0,'DefaultFigurePaperOrientation','landscape')
I have revised my program to start with this instead.
set(0,'DefaultFigurePaperOrientation','portrait')
set(0,'DefaultFigurePaperType','A4')
And the problem has gone away.
Users can still print in landscape if they use the print preview feature.
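If you do still need landscape A4 output without going through Print Preview, one option (an untested sketch using standard figure paper properties, reusing t, s and titleString from the example above) is to set the paper position explicitly so the figure fills the page:

fig = figure;
plot(t, s)
title(titleString)
% keep landscape A4, but make the figure occupy the whole sheet so nothing is clipped
set(fig, 'PaperOrientation', 'landscape', 'PaperType', 'A4', 'PaperUnits', 'centimeters');
paperSize = get(fig, 'PaperSize');                      % [width height] in cm
set(fig, 'PaperPositionMode', 'manual', ...
         'PaperPosition', [0 0 paperSize]);             % [left bottom width height]
print(fig, '-dpdf', 'testFigure.pdf')                   % file name is just a placeholder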

Detecting text in image using MSER

I'm trying to follow this tutorial http://www.mathworks.com/help/vision/examples/automatically-detect-and-recognize-text-in-natural-images.html to detect text in an image using MATLAB.
As a first step, the tutorial uses detectMSERFeatures to detect textual regions in the image. However, when I use this step on my image, the textual regions aren't detected.
Here is the snippet I'm using:
colorImage = imread('demo.png');
I = rgb2gray(colorImage);
% Detect MSER regions.
[mserRegions] = detectMSERFeatures(I, ...
'RegionAreaRange',[200 8000],'ThresholdDelta',4);
figure
imshow(I)
hold on
plot(mserRegions, 'showPixelList', true,'showEllipses',false)
title('MSER regions')
hold off
And here is the original image
and here is the image after the first step
Update
I've played around with the parameters but none seem to detect the textual regions perfectly. Is there a better way to accomplish this than tweaking numbers? Tweaking the parameters won't work for the wide array of images I might have.
Some parameters I've tried and their results:
[mserRegions] = detectMSERFeatures(I, ...
'RegionAreaRange',[30 100],'ThresholdDelta',12);
[mserRegions] = detectMSERFeatures(I, ...
'RegionAreaRange',[30 600],'ThresholdDelta',12);
Disclaimer: completely untested.
Try reducing MaxAreaVariation, since your text & background have very little variation (reduce false positives). You should be able to crank this down pretty far since it looks like the text was digitally generated (wouldn't work as well if it was a picture of text).
Try reducing the minimum value for RegionAreaRange, since small characters may be smaller than 200 pixels (increase true positives). At 200, you're probably filtering out most of your text.
Try increasing ThresholdDelta, since you know there is stark contrast between the text and background (reduce false positives). This won't be quite as effective as MaxAreaVariation for filtering, but should help a bit.
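Put together, those tweaks might look something like this (the exact values are guesses to be tuned against your images, not tested):

colorImage = imread('demo.png');
I = rgb2gray(colorImage);
% smaller MaxAreaVariation and area minimum, larger ThresholdDelta than before
mserRegions = detectMSERFeatures(I, ...
    'RegionAreaRange', [30 8000], ...     % let small characters through
    'ThresholdDelta', 8, ...              % exploit the strong text/background contrast
    'MaxAreaVariation', 0.1);             % text and background vary very little
figure, imshow(I), hold on
plot(mserRegions, 'showPixelList', true, 'showEllipses', false)
title('MSER regions (tuned parameters)')
hold off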

artifacts in processed images

This question is related to my previous post Image Processing Algorithm in Matlab on Stack Overflow, for which I already got the results I wanted.
But now I am facing another problem: I am getting some artefacts in the processed images. In my original images (a stack of 600 images) I can't see any artefacts; please see the original image of a fingernail:
But in my 10 processed results I can see these lines:
I really don't know where they come from.
Also, if they come from the camera's sensor, why can't I see them in my original images? Any ideas?
Edit:
I have added the following code suggested by @Jonas. It reduces the artefacts, but does not completely remove them.
% averaging of images
im = D{1}(:,:);
for i = 2:100
    im = imadd(im,D{i}(:,:));
end
im = im/100;
imshow(im,[]);
for i = 1:100
    SD{i}(:,:) = imsubtract(D{i}(:,:),im(:,:));
end
@belisarius has asked for more images, so I am going to upload 4 images of my finger with the speckle pattern and 4 images of the black background, size 1280x1024:
And here is the black background:
Your artifacts are in fact present in your original image, although not visible.
Code in Mathematica:
i = Import@"http://i.stack.imgur.com/5hM3u.png"
EntropyFilter[i, 1]
The lines are faint, but you can see them by binarization with a very low level threshold:
Binarize[i, .001]
As for what is causing them, I can only speculate. I would start tracing from the camera output itself. Also, you may post two or three images "as they come straight from the camera" to allow us to do some experimenting.
The camera you're using most likely has a CMOS chip. Since these have independent column (and possibly row) amplifiers, which may have slightly different electronic properties, you can get the signal from one column amplified more than from another.
Depending on the camera, this variability in column intensity can be stable. In that case, you're in luck: take ~100 dark images (tape something over the lens), average them, and then subtract the average from each image before running the analysis. This should make the lines disappear. If the lines do not disappear (or if there are additional lines), use the post-processing scheme proposed by Amro to remove the lines after binarization.
EDIT
Here's how you'd do the background subtraction, assuming that you have taken 100 dark images and stored them in a cell array D with 100 elements:
% take the mean; convert to double for safety reasons
meanImg = mean( double( cat(3,D{:}) ), 3);
% then you can subtract the mean from the original (non-dark-frame) image
correctedImage = rawImage - meanImg; % (maybe you need to re-cast meanImg first)
Here is an answer that in my opinion will remove the lines more gently than the above-mentioned methods:
im = imread('image.png');    % Original image
imFiltered = im;             % The filtered image will end up here
imChanged = false(size(im)); % To document the filter performance
% 1)
% Compute the histograms for each column in the lower part of the image
% (where the columns are most clear) and compute the mean and std of each
% bin in the histogram.
histograms = hist(double(im(501:520,:)),0:255);
colMean = mean(histograms,2);
colStd = std(histograms,0,2);
% 2)
% Now loop through each gray level above zero and...
for grayLevel = 1:255
    % Find the columns where the number of 'grayLevel' pixels is larger than
    % mean_n_graylevel + 3*std_n_graylevel, i.e. columns that contain
    % statistically 'many' pixels with the current 'grayLevel'.
    lineColumns = find(histograms(grayLevel+1,:)>colMean(grayLevel+1)+3*colStd(grayLevel+1));
    % Now remove all 'grayLevel' pixels in lineColumns in the original image
    if(~isempty(lineColumns))
        for col = lineColumns
            imFiltered(:,col) = im(:,col).*uint8(~(im(:,col)==grayLevel));
            imChanged(:,col) = im(:,col)==grayLevel;
        end
    end
end
imshow(imChanged)
figure,imshow(imFiltered)
Here is the image after filtering
And this shows the pixels affected by the filter
You could use some sort of morphological opening to remove the thin vertical lines:
img = imread('image.png');
SE = strel('line',2,0);
img2 = imdilate(imerode(img,SE),SE);
subplot(121), imshow(img)
subplot(122), imshow(img2)
The structuring element used was:
>> SE.getnhood
ans =
1 1 1
Without really digging into your image processing, I can think of two reasons for this to happen:
The processing introduced these artifacts. This is unlikely, but it's an option. Check your algorithm and your code.
This is a side-effect because your processing reduced the dynamic range of the picture, just like quantization. So in fact, these artifacts may have already been in the picture itself prior to the processing, but they couldn't be noticed because their level was very close to the background level.
As for the source of these artifacts, it might even be the camera itself.
This is a VERY interesting question. I used to deal with this type of problem with live IR imagers (video systems). We actually had algorithms built into the cameras to deal with this problem before the user ever saw or got their hands on the image. A couple of questions:
1) Are you dealing with RAW images, or with already pre-processed grayscale (or RGB) images?
2) What is your ultimate goal with these images? Is the goal simply to get rid of the lines regardless of the resulting quality in the rest of the image, or is the point to preserve the absolute best image quality? Are you going to perform other processing afterwards?
I agree that those lines are most likely in ALL of your images. There are two reasons for such lines to show up in an image. One would be a bright scene where the op-amps for some columns get saturated, causing whole columns of your image to take the brightest value the camera can output. Another reason could be bad op-amps or ADCs (analog-to-digital converters) themselves (most likely not an ADC, as normally there is essentially one ADC for the whole sensor, which would make the whole image bad, not your case). The saturation case is actually much more difficult to deal with (and I don't think this is your problem). Note: too much saturation on a sensor can cause bad pixels and columns to arise in your sensor (which is why they say never to point your camera at the sun).
The bad-column problem can be dealt with. In another answer above, someone had you averaging images. While this may be good for finding out where the bad columns (or bad single pixels, or the noise matrix of your sensor) are (and you would have to average while pointing the camera at black, white, essentially solid colours), it isn't the correct answer for getting rid of them. By the way, what I am explaining with the black and white frames, averaging, finding bad pixels, etc. is called calibrating your sensor.
OK, so assuming you are able to get this calibration data, you WILL be able to find out which columns, and even which single pixels, are bad.
If you have this data, one way that you could erase the bad columns is:
for each bad column
for each pixel (x, y) on the bad column
pixel(x, y) = Average(pixel(x+1,y),pixel(x+1,y-1),pixel(x+1,y+1),
pixel(x-1,y),pixel(x-1,y-1),pixel(x-1,y+1))
What this essentially does is replace each bad pixel with a new pixel which is the average of the 6 remaining good pixels around it. The above is an over-simplified version of the algorithm. There are certainly cases where a single bad pixel could be right next to the bad column and shouldn't be used for averaging, or where two or three bad columns sit right next to each other. One could imagine that you would calculate the values for a bad column, then consider that column good in order to move on to the next bad column, and so on.
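As a rough MATLAB sketch of that idea (the image and the list of bad columns are placeholders that would come from your calibration; border rows/columns and the special cases mentioned above are not handled):

img = double(imread('frame.png'));   % hypothetical single frame
badColumns = [137 512];              % hypothetical columns found by calibration
filtered = img;
for col = badColumns
    % average the six good neighbours from the columns to the left and right
    left  = img(:, col-1);
    right = img(:, col+1);
    filtered(2:end-1, col) = ( left(1:end-2) + left(2:end-1) + left(3:end) + ...
                               right(1:end-2) + right(2:end-1) + right(3:end) ) / 6;
end
imshow(uint8(filtered))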
Now, the reason I asked about RAW versus B/W or RGB: if you were processing RAW, depending on the build of the sensor itself, it could be that only one sub-pixel (if you will) of the Bayer-filtered image sensor has the bad op-amp. If you could detect this, then you wouldn't necessarily have to throw out the other good sub-pixels' data. Secondly, if you are using an RGB sensor to take a grayscale photo, and you shot it in RAW, then you may be able to calculate your own grayscale pixels. Many sensors, when giving back a grayscale image from an RGB sensor, will simply pass back the green pixel as the overall pixel, because it effectively serves as the luminance of the image. This is why most image sensors implement 2 green sub-pixels for every red or blue sub-pixel. If this is what they are doing (not ALL sensors do this), then you may have better luck getting rid of just the bad channel column and performing your own grayscale conversion using:
gray = (0.299*r + 0.587*g + 0.114*b)
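Applied to an RGB image in MATLAB, that conversion would be something along these lines (rgbImage is just a placeholder name):

rgbImage = imread('raw_converted.png');   % hypothetical RGB frame
r = double(rgbImage(:,:,1));
g = double(rgbImage(:,:,2));
b = double(rgbImage(:,:,3));
gray = uint8(0.299*r + 0.587*g + 0.114*b);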
Apologies for the long-winded answer, but I hope this is still informative to someone :-)
Since you cannot see the lines in the original image, they are either present with a low intensity difference compared with the original range of the image, or they were added by your processing algorithm.
The shape of the disturbance hints at the first option... (Unless you have an algorithm that processes each row separately.)
It seems like your sensor's columns are not uniform. Try taking a picture without the finger (background only) using the same exposure (and other) settings, then subtract it from the photo of the finger (prior to other processing). (Make sure the background is uniform before taking both images.)
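A minimal sketch of that subtraction (the file names are placeholders):

background = double(imread('background_only.png'));   % same exposure, no finger
finger     = double(imread('finger.png'));
corrected  = finger - background;                      % removes the fixed column offsets
imshow(corrected, [])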

Detecting line gaps in binary image with MATLAB

I have an image like this:
There are gaps in some lines. How can I detect the positions of the gaps in the image?
This is the result. It seems closing creates new pixels.
May I assume that the final goal is to close the gap?
Then you might want to use morphological operations. To close the gap you just need the so-called "closing". This is done by applying a "dilation" and then an "erosion".
So how do you find a position where the gap was closed? You can just compare the before and after images and look at the changes.
EDIT: after your post I decided to update the answer. So I tried a little piece of code in MATLAB.
originalBW = imread('Je3ud.jpg');
imshow(originalBW);
se = strel('line',8, 0);   % a straight line of 8 pixels
closeBW = imclose(originalBW,se);
figure, imshow(closeBW)
subtractedBW = closeBW - originalBW;
figure, imshow(subtractedBW)
It produces the following resulting image:
So basically we have found the right positions, but unfortunately we have also got a lot of false positives. I think you can get the result you want by treating each of them as a candidate match and discarding the false positives. One important property of the false positives seems to be that if you check their vertical neighbourhood (in the original image) you will find white pixels there, because the white line was not really disconnected at that point (so they cannot be the right solution). You can try to use that to discard the false positives.
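An untested sketch of that last idea, continuing from the code above (the threshold and window size are guesses to be tuned):

% treat every pixel added by the closing as a candidate gap position
candidates = subtractedBW > 128;                 % threshold guards against JPEG noise
[rows, cols] = find(candidates);
window = 3;                                      % how far to look up and down
isGap = false(size(rows));
for k = 1:numel(rows)
    r = rows(k);  c = cols(k);
    r1 = max(1, r - window);
    r2 = min(size(originalBW, 1), r + window);
    % a real gap has no white pixels directly above or below it in the original
    isGap(k) = ~any(originalBW(r1:r2, c) > 128);
end
gapRows = rows(isGap);
gapCols = cols(isGap);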