I have been using Tesseract for two months now, along with OpenCV to reduce the dots/noise in the images, but I am trying to solve this issue at the Tesseract level.
Is there any Tesseract parameter to remove the background dots?
Or can I tell Tesseract not to recognize the dots (depending on their size)?
I would be very thankful if anyone could guide me on this issue.
For the image below:
https://i.stack.imgur.com/9TjN6.png
I am getting output like this:
lb ane a a a ee ee
Ee ah Tani ANOTES tsi Ca Ee RR
RAT TE CORRE NE Re ele TTR a ee Tol a te es
see © Students should schedile 21 points of work sech years...
fen Es ee EE i ea
| fdvenced Coreral Sciemes ©. |. eroral Home Feonomits (limited to.
Co mlgebras i ULE LE cl BE unions andi sentors) Dh
7od 1 Art’ SpeelaliAvt [for those tC meman Ta GET
Lhd recommended by Art Supervisor. ii Industrial Arts hal
I am using the following command to run Tesseract:
tesseract --psm 6 --oem 1 image.png output_text_file
There is no noise-removal option at the Tesseract level, since preprocessing methods can't be generalized across all images. You can use denoising methods in OpenCV such as fastNlMeansDenoising, dilation, erosion, etc.
Tesseract is an OCR engine, not an image-manipulation tool.
I am doing a pansharpening fusion with one multispectral image and one panchromatic image from LANDSAT 8.
The problem happens when I try to apply the AWLP algorithm from the Pansharpening Toolbox (author: Gemine Vivone et al.), because the algorithm internally uses a 2D non-decimated wavelet transform function called ndwt2.
http://www.codeforge.com/read/255069/ndwt2.m__html <==== this file doesn't work for me
That function was apparently in the MATLAB Wavelet Toolbox two or more years ago. I don't know how the algorithm works, so I need to know how to use swt2, dwt2, etc. to do the same operation that ndwt2 did.
I've been searching the whole web, even the third page of Google results, but have not found a solution.
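For context, the operation ndwt2 performs is a non-decimated (stationary) 2D wavelet transform, i.e. the subbands keep the full image size instead of being downsampled as in dwt2. A minimal sketch of that idea with PyWavelets' `swt2` (the Haar wavelet and single level are assumptions for illustration, not what AWLP necessarily uses):

```python
import numpy as np
import pywt

# 8x8 test image; swt2 requires dimensions divisible by 2**level.
image = np.arange(64, dtype=float).reshape(8, 8)

# One-level stationary (non-decimated) 2D wavelet transform.
coeffs = pywt.swt2(image, wavelet="haar", level=1)
approx, (horiz, vert, diag) = coeffs[0]

# Unlike dwt2, every subband has the same shape as the input image.
print(approx.shape, horiz.shape)
```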
I am using pytesseract to parse digits extracted from browser screenshots. This is my first experience with OCR, so please correct me if I'm going about this the wrong way. I get very low accuracy on images that look easily interpretable to me. Sometimes I get an empty string; rarely, I also get wrong digits.
Elsewhere, people suggested filtering and enlarging the image. I did so, and it improved things, going from an accuracy of almost 0 to around 50%, but that's still poor. I am working on a Selenium-extracted screenshot; some code is reported below. Sorry if it's messy; I included the image loading and processing part to show what I was doing, but I didn't want to give away the page I'm loading.
Here is an image in which I show what the images look like after processing, and the result of parsing and converting to float.
from selenium import webdriver
from PIL import Image
import pytesseract, cv2, time, numpy as np
# load the page, enlarge, save as png, load as usable image
driver = webdriver.Firefox()  # driver setup was omitted in the original snippet
driver.get("https://a-page-I-wont-tell-you-sorry")
time.sleep(5) # wait for loading
driver.execute_script('document.body.style.MozTransform = "scale(1.50)";') # enlarge
enlarged_screenshot = driver.get_screenshot_as_png()
with open("enlarged_screenshot.png", "wb") as file:
    file.write(enlarged_screenshot)
enlarged_screenshot = Image.open("enlarged_screenshot.png")
# method for cropping and filtering
def crop_and_filter(image, coordinates, filter_level):
width, height = image.size
x0, y0, x1, y1 = coordinates
cropped_image = image.crop((width*x0, height*y0, width*x1, height*y1))
image_l = cropped_image.convert("L")
image_array = np.array(image_l)
_, filtered_image_array = cv2.threshold(image_array, filter_level, 255, cv2.THRESH_BINARY)
print("*"*100); print("Filtered image:")
display(Image.fromarray(filtered_image_array))  # display() requires IPython/Jupyter
return filtered_image_array
# example of how I call and parse it
x0 = 0.51; y0 = 0.43; delta_x = 0.05; delta_y = 0.025
filtered_image_array = crop_and_filter(enlarged_screenshot, (x0, y0, x0+delta_x, y0+delta_y), 125)
number = pytesseract.image_to_string(filtered_image_array, config="-c tessedit_char_whitelist=0123456789.\t%")
This started as, but was too long for, a comment:
Your question was slightly unclear but in the end I figured you wanted to run Tesseract over the actual image you posted at https://i.stack.imgur.com/m5WJQ.png
The command I used was
tesseract --oem 1 -l eng --psm 11 m5WJQ.png stdout
This produced the following output:
ek ok ek ok ok ok ok ok ok ok ok ok
Filtered image:
65
HAA
Filtered image:
3
HAA
Filtered image:
3.5
HAA
Filtered image:
2.64
HAA
Filtered image:
75
HAA
Filtered image:
3.1
HAA
Filtered image:
3.6
HAA
Filtered image:
2.68
PARSED NUMBERS:
[nan, nan,
3.5, 2.64, nan,
3.1, 3.6, 2.68]
As per your comments on your original question, this looks good to you.
I am running Tesseract on macOS 10.13.6 High Sierra built from source (but you don't have to do this).
tesseract --version
tesseract 5.0.0-alpha-371-ga9227
leptonica-1.78.0
libgif 5.1.4 : libjpeg 9c : libpng 1.6.37 : libtiff 4.0.10 : zlib 1.2.11 : libwebp 1.0.3 : libopenjp2 2.3.1
Found AVX2
Found AVX
Found FMA
Found SSE
Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6
See if you can also reproduce this and comment if you can't. I will see if I can get the corresponding output from pytesseract.
Also, since you (sometimes) know what the numbers should be, you can use tools like ocreval (https://github.com/eddieantonio/ocreval - I'm not affiliated with it) to see how well your run is doing compared to the known/input/"ground" truth.
HTH
Hi everybody. I'd like to use Caffe to train a 5-class detection task with "SSD: Single Shot MultiBox Detector", so I changed num_classes from 21 to 6. However, I get the following error:
"Check failed: num_priors_ * num_classes_ == bottom[1]->channels() (52392 vs. 183372) Number of priors must match number of confidence predictions."
I can understand this error, and I found that 52392/6 = 183372/21; in other words, even though I changed num_classes to 6, the number of confidence predictions is still 183372. How can I solve this problem? Thank you very much!
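The numbers in the error message can be checked directly: both ratios give the same prior count, which confirms that the confidence layers were never resized for the new class count. A quick sketch (pure arithmetic):

```python
# Recover the prior-box count from the error message: channels / classes.
num_priors = 183372 // 21   # = 8732, the standard SSD300 prior count

old_channels = num_priors * 21  # what the unmodified model still outputs
new_required = num_priors * 6   # what num_classes = 6 expects

print(old_channels, new_required)  # → 183372 52392, matching the error
```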
Since SSD depends on the number of labels not only for the classification output, but also for the bounding-box prediction, you would need to change num_output in several other places in the model.
I would strongly suggest not doing that manually, but rather using the Python scripts provided in the 'examples/ssd' folder. For instance, you can change line 277 in 'examples/ssd/ssd_pascal_speed.py' to:
num_classes = 5 # instead of 21
And then use the model files this script provides.
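For reference, the layers the script resizes for you are the *_mbox_conf convolutions, whose num_output is priors-per-location times num_classes. A sketch of what one of them looks like after the change (the layer name and the 4-priors-per-location figure follow the standard SSD300 VOC model; verify against your own prototxt):

```
layer {
  name: "conv4_3_norm_mbox_conf"
  type: "Convolution"
  ...
  convolution_param {
    # 4 priors per location at this scale:
    # was 4 * 21 = 84 for VOC, becomes 4 * 6 = 24 for six classes
    num_output: 24
    kernel_size: 3
    pad: 1
  }
}
```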
I have the following code, which works on MATLAB R2015 until the concatenation into I2D, but stops at imadjust because I'm out of memory. I'm processing 2249 images of 1984x1984 pixels on a 16 GB RAM, 64-bit machine.
When I run the same code on a more powerful computer (24 GB RAM), but with the R2009 version of MATLAB, the code stops at the cat step, saying the argument dimensions are not consistent.
I don't get why it works on R2015 and not on R2009. Note that I will have to run the code on R2009 (but that's another story). So the out-of-memory problem may not matter in the end, but I don't know yet, because the code is not working on the R2009 version (24 GB RAM).
%reading data and inverting the GV
for k=1:n
I = imread(fullfile(rep,list1(k).name),ext(3:end));
I = 255 - I; % invert the gray values (vectorized, same result as a pixel-by-pixel loop)
nomFichier=sprintf('Xa1_3D_%04d.bmp',k);
imwrite(I,fullfile(rep2,nomFichier),ext(3:end));
end
%Now, we want to read all the images at once to have all the available
%informations for the histogram adjustement
chemin = fullfile(rep2,ext);
list2 = dir(chemin);
o=numel(list2);
I2{o} = ones; % grow the cell array to its final size before the loop
for k=1:o
I2{k} = imread(fullfile(rep2,list2(k).name),ext(3:end));
end
%Combination of images in 2D matrix and hist adjustement
I2D=cat(2,I2{:});
I2D=imadjust(I2D);
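This is just back-of-the-envelope, but the memory numbers alone explain the R2015 out-of-memory failure: the concatenated uint8 matrix is about 8.2 GiB, and imadjust needs at least a second copy of that, which exceeds 16 GB of RAM. A quick check of the arithmetic (no MATLAB needed):

```python
# Size of the concatenated 2-D matrix: 1984 rows, 2249 * 1984 columns of uint8.
rows = 1984
cols = 2249 * 1984
bytes_total = rows * cols  # 1 byte per uint8 pixel

gib = bytes_total / 2**30
print(f"I2D alone: {gib:.1f} GiB; imadjust input + output: {2 * gib:.1f} GiB")
```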
There are several example data sets in Matlab, for example wind and mri. If you execute the command load wind you will load the data in the data set wind. Some are included in toolboxes and some appear to be included in standard Matlab. These example data sets are valuable as test data when developing algorithms.
Where can one find a list of all such data sets included in Matlab?
You can enter demo in MATLAB to get a list. The wind table is part of Example — Stream Line Plots of Vector Data, etc.
For the tables on your computer, have a look at:
C:\Program Files\MATLAB\R2007b\toolbox\matlab\demos
The example data is located in .mat files in ../toolbox/matlab/demos.
The following data is available in MATLAB 2014a:
% in matlab run:
> H=what('demos')
> display(H.mat)
You can also use your favorite Linux console:
/usr/local/MATLAB/R2014a/toolbox/matlab/demos$ ls *.mat -1 | sed -e "s/.mat//g"
This is my list for readers who cannot try it on their machine while reading this answer:
accidents
airfoil
cape
census
clown
detail
dmbanner
durer
earth
flujet
gatlin
gatlin2
integersignal
logo
mandrill
membrane
mri
patients
penny
quake
seamount
spine
stocks
tetmesh
topo
topography
trimesh2d
trimesh3d
truss
usapolygon
usborder
vibesdat
west0479
wind
xpmndrll
The command demo in MATLAB 2018b will start a help browser with some demos.
You can find a list of all available datasets and their descriptions at the following link:
https://www.mathworks.com/help/stats/sample-data-sets.html