Tesseract training character size and frequency - tesseract

The documentation of Tesseract says:
Make sure there are a minimum number of samples of each character. 10
is good, but 5 is OK for rare characters.
There should be more samples of the more frequent characters - at
least 20.
I assume the last sentence means: at least 20 samples of more frequent characters would be OK. But what will be a good frequency?
Also:
Tesseract works best on images which have a DPI of at least 300 dpi,
so it may be beneficial to resize images. For more information see the
FAQ.
Why does Tesseract work best at 300 DPI? Isn't DPI just a setting that tells at what scale an image is printed? Why DPI and not just a minimum height in pixels?
Also, what would be a good height for a character in pixels?

Related

Why don't all pixels have the same weight?

I don't understand this:
The value 00001111 (15) occupies one byte, and an RGB pixel such as (220,180,155) occupies 3 bytes whatever its values.
So why, when I reduce the values of my pixels (with a bit shift or otherwise), is the size of
my image no longer equal to the number of pixels x 3? By "number of pixels" I mean the number of pixels that are not fully black.
How does the mechanism work? Is the size counted in bits and then divided by eight as an average?
If I have a 3 MB picture and apply a bit shift (factor 2 on each of the 3 RGB channels), I end up with a 300 KB picture.
Don't tell me 90% of my pixels turned fully black.
Thanks.
If you shift all the pixel values right by 2 places, you will have around 1/4 as many shades of red as before, and around 1/4 as many greens and likewise for blues. That means overall you will have vastly fewer colours. That means your image may well have fewer than 256 colours which means it can be palettised. It also means it is likely to compress better because there will be more repetition of fewer unique sequences.
You can check if your image is palettised in several ways:
open it with PIL and check if image.mode contains a P
run exiftool on it and check if Colour Type is Palette
run ImageMagick on it with magick identify -verbose YOURIMAGE
You can count the number of unique colours in your image with ImageMagick using:
magick identify -format %k YOURIMAGE
Or you can do it in Python with the last part (entitled "Update") of this answer.
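As a rough illustration of the answer above, here is a minimal standard-library sketch. It uses synthetic random pixels rather than a real image, zlib stands in for whatever compression the image format actually applies, and the helper names are made up for this example. It shows that shifting leaves the raw size at pixels x 3 bytes, while the number of distinct colours and the compressed size go down.

```python
import random
import zlib

def shift_right(pixels, bits):
    """Shift every channel of every (r, g, b) pixel right by `bits`."""
    return [(r >> bits, g >> bits, b >> bits) for r, g, b in pixels]

def count_unique_colours(pixels):
    """Number of distinct (r, g, b) triples."""
    return len(set(pixels))

def raw_and_compressed_size(pixels):
    """Return (uncompressed byte count, zlib-compressed byte count)."""
    raw = bytes(channel for pixel in pixels for channel in pixel)
    return len(raw), len(zlib.compress(raw))

random.seed(0)  # fixed seed so the run is repeatable
# Assumption: 10,000 uniformly random pixels as a stand-in for image data.
original = [tuple(random.randrange(256) for _ in range(3)) for _ in range(10_000)]
shifted = shift_right(original, 2)

for name, pixels in (("original", original), ("shifted", shifted)):
    raw, packed = raw_and_compressed_size(pixels)
    print(name, count_unique_colours(pixels), "colours,",
          raw, "raw bytes,", packed, "compressed bytes")
```

Real photographs are far less random than this test data, so the difference in compressed size is usually much larger in practice.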

Matlab figure size formatting for Word

I'm trying to create MATLAB figures to put into a paper. The paper has very specific sizing instructions for figures that I'm having trouble matching in MATLAB. The figures need to be no greater than 3.5" width, >300 DPI, with 8pt font.
In my code, I use the following to try to set the parameters:
set(gcf,'PaperUnits','inches');
set(gcf,'PaperPosition',[0 0 3.5 3.5]);
xlabel('x-axis label','FontSize',8);ylabel('y-axis label','FontSize',8);
set(gca,'FontSize',8);
print('-djpeg','-r300','filename.jpg')
This should be giving me a 300 DPI, 3.5"x3.5" JPEG image with an 8pt font size. However, when I import the image into Word, it becomes 6.5" x 6.5" and the font size is larger than Word's 8pt font. Even if I resize the image, the font size is still too large, though it should maintain the same DPI. Are the FontSize and PaperPosition parameters not working as I expect they should or is Word doing something strange for importing?
The font size issue was caused by different fonts being used in MATLAB and Word. Once I learned about set(gca,'FontName'), the font size seemed to be correct when the image was manually resized to 3.5" x 3.5".
The image size issue seemed to be related to saving it as a JPEG. Once I switched to PNG, the image was the correct size by default. Looking into the JPEG properties, it had the correct number of pixels for a DPI of 300 at 3.5"; the sole issue was that it had to be manually resized. Thanks for the comments that led me to a solution.

BMP image header - biXPelsPerMeter

I have read a lot about the BMP file format structure, but I still don't understand the real meaning of the fields "biXPelsPerMeter" and "biYPelsPerMeter". I mean in a practical way: how are they used, or how can they be utilized? Any example or experience? Thanks a lot
biXPelsPerMeter
Specifies the horizontal print resolution, in pixels per meter, of the target device for the bitmap.
biYPelsPerMeter
Specifies the vertical print resolution.
It's not very important. You can leave them at 2835; it's not going to ruin the image.
(72 DPI × 39.3701 inches per meter yields 2834.6472)
Think of it this way: The image bits within the BMP structure define the shape of the image using that much data (that much information describes the image), but that information must then be translated to a target device using a measuring system to indicate its applied resolution in practical use.
For example, if the BMP is 10,000 pixels wide, and 4,000 pixels high, that explains how much raw detail exists within the image bits. However, that image information must then be applied to some target. It uses the relationship to the dpi and its target to derive the applied resolution.
If it were printed at 1000 dpi then it's only going to give you an image with 10" x 4" but one with extremely high detail to the naked eye (more pixels per square inch). By contrast, if it's printed at only 100 dpi, then you'll get an image that's 100" x 40" with low detail (fewer pixels per square inch), but both of them have the same overall number of bits within. You can actually scale an image without scaling any of its internal image data by merely changing the dpi to non-standard values.
Also, using 72 dpi is a throwback to ancient printing techniques (https://en.wikipedia.org/wiki/Twip) which are not really relevant in moving forward (except to maintain compatibility with standards) as modern hardware devices often use other values for their fundamental relationships to image data. For video screens, for example, Macs use 72 dpi as the default. Windows uses 96 dpi. Others are similar. In theory you can set it to whatever you want, but be warned that not all software honors the internal settings and will instead assume a particular size. This can affect the way images are scaled within the app, even though the actual image data within hasn't changed.
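To make the relationship concrete, here is a minimal sketch that writes and reads the two fields with Python's struct module. It assumes the common 40-byte BITMAPINFOHEADER layout (biWidth at byte offset 18, biXPelsPerMeter at offset 38); the function names are made up for this example, and the demo builds only the 54 header bytes, not a complete image.

```python
import struct

INCHES_PER_METRE = 39.3701  # same factor as in the 72 DPI calculation above

def dpi_to_pels_per_metre(dpi):
    return round(dpi * INCHES_PER_METRE)

def pels_per_metre_to_dpi(ppm):
    return ppm / INCHES_PER_METRE

def make_header(width, height, dpi):
    """Build a 14-byte file header plus a 40-byte info header (no pixel data)."""
    ppm = dpi_to_pels_per_metre(dpi)
    file_header = struct.pack("<2sIHHI", b"BM", 54, 0, 0, 54)
    info_header = struct.pack("<IiiHHIIiiII",
                              40, width, height, 1, 24, 0, 0, ppm, ppm, 0, 0)
    return file_header + info_header

def read_size_and_resolution(data):
    """Return (width, height, x_ppm, y_ppm) from BMP header bytes."""
    if data[:2] != b"BM":
        raise ValueError("not a BMP file")
    width, height = struct.unpack_from("<ii", data, 18)
    x_ppm, y_ppm = struct.unpack_from("<ii", data, 38)
    return width, height, x_ppm, y_ppm

# Same pixel data, two different resolution settings, two print sizes.
for dpi in (1000, 100):
    width, height, x_ppm, y_ppm = read_size_and_resolution(
        make_header(10_000, 4_000, dpi))
    print(dpi, "dpi:", x_ppm, "pixels per metre,",
          round(width / pels_per_metre_to_dpi(x_ppm), 1), "x",
          round(height / pels_per_metre_to_dpi(y_ppm), 1), "inches")
```

Software that ignores these fields would simply skip the last step and assume its own resolution.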

Image sizing issues (not fitting proportionally)

I created an 8.5x11.0 inch image at a 300 dpi setting in Photoshop.
When I use it as a background image in the report designer, the image looks huge.
It doesn't fit within the 8.5x11.0 page.
Is there a way to resize this image so that it fits correctly within an 8.5x11.0 letter-size page?
Thanks in advance,
With the information you gave, I believe your problem is probably in the size/dpi pairing.
You saved an image of 8.5 x 11 inches at 300 dpi (dots per inch), which works out to an image of 2550 x 3300 pixels.
Now, if your "report designer" software looks only at the size in pixels and assumes a dpi value different from the one you used, say 72 dpi, your 2550 x 3300 pixel image would actually be about 35.4 x 45.8 inches.
So my advice is to find out which settings your software is expecting; apparently it is not 300 dpi.
If you can't find the information, try commonly used values such as 72 dpi or 150 dpi.
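The arithmetic in this answer can be sketched in a few lines of Python. The candidate dpi values are guesses at what a report designer might assume, and the helper names are made up for this example.

```python
def pixels_for(inches, dpi):
    """Pixel count needed to cover `inches` at `dpi`."""
    return round(inches * dpi)

def inches_for(pixels, dpi):
    """Physical size of `pixels` when displayed or printed at `dpi`."""
    return pixels / dpi

width_px = pixels_for(8.5, 300)    # 2550
height_px = pixels_for(11.0, 300)  # 3300

# Size the same pixels would take up under different assumed resolutions.
for assumed_dpi in (72, 96, 150, 300):
    print(assumed_dpi, "dpi:",
          round(inches_for(width_px, assumed_dpi), 1), "x",
          round(inches_for(height_px, assumed_dpi), 1), "inches")
```

Whichever assumed value reproduces the oversized result seen in the report designer is the resolution to export at.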

FreeType2: Get global font bounding box in pixels?

I'm using FreeType2 for font rendering, and I need to get a global bounding box for all fonts, so I can align them in a nice grid. I call FT_Set_Char_Size followed by extracting the global bounds using
int pixels_x = ::FT_MulFix((face->bbox.xMax - face->bbox.xMin), face->size->metrics.x_scale );
int pixels_y = ::FT_MulFix((face->bbox.yMax - face->bbox.yMin), face->size->metrics.y_scale );
return Size (pixels_x / 64, pixels_y / 64);
which works, but it's quite a bit too large. I also tried to compute using doubles (as described in the FreeType2 tutorial), but the results are practically the same. Even using just face->bbox.xMax results in bounding boxes which are too wide. Am I doing the right thing, or is there simply some huge glyph in my font (Arial.ttf in this case?) Any way to check which glyph is supposedly that big?
Why not calculate the min/max from the characters that you are using in the string that you want to align? Just loop through the characters and store the maximum and minimum from the characters that you are using. You can store these values after you rendered them so you don't need to look it up every time you render the glyphs.
I have a similar problem using freetype to render a bunch of text elements that will appear in a grid. Not all of the text elements are the same size, and I need to prerender them before I know where they would be laid out. The different sizes were the biggest problem when the heights changed, such as for letters with descending portions (like "j" or "Q").
I ended up using the height that is on the face (kind of like you did with the bbox). But like you mentioned, that value was much too big. It's supposed to be the baseline-to-baseline distance, but it appeared to be about twice that distance. So, I took the easy way out and divided the reported height by 2 and used that as a general height value. Most likely, the height is too big because there are some characters in the font that go way high or way low.
I suppose a better way might be to loop through all the characters expected to be used, get their glyph metrics and store the largest height found. But that doesn't seem all that robust either.
Your code is right.
The box is not too large.
The font contains many special symbols that are much larger than the ASCII characters.
It's easy to traverse all Unicode character codes to find those large symbols.
If you only need ASCII, my hack is:
FT_MulFix(face_->units_per_EM, face_->size->metrics.x_scale ) >> 6
FT_MulFix(face_->units_per_EM, face_->size->metrics.y_scale ) >> 6
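To see what the two expressions above evaluate to, here is a small Python emulation of the fixed-point arithmetic. It is only a sketch: ft_mulfix approximates FT_MulFix for non-negative inputs, scale_for approximates how the 16.16 scale factor relates to the pixel size, and the font numbers (2048 units per em, a 6000-unit-wide bounding box, 16 pixels per em) are hypothetical, not taken from Arial.

```python
def ft_mulfix(a, b):
    """Approximate FT_MulFix for non-negative inputs: (a * b) / 0x10000, rounded."""
    return (a * b + 0x8000) >> 16

def scale_for(ppem, units_per_em):
    """16.16 factor that turns font units into 26.6 fixed-point pixels."""
    return ((ppem * 64) << 16) // units_per_em

UNITS_PER_EM = 2048   # hypothetical font design grid
BBOX_WIDTH = 6000     # hypothetical global bbox width in font units
PPEM = 16             # hypothetical pixel size set on the face

scale = scale_for(PPEM, UNITS_PER_EM)

em_pixels = ft_mulfix(UNITS_PER_EM, scale) >> 6   # the hack: one em in pixels
bbox_pixels = ft_mulfix(BBOX_WIDTH, scale) >> 6   # the global bbox in pixels

print("em box:", em_pixels, "px; global bbox:", bbox_pixels, "px")
```

With these numbers the hack simply yields the pixel size of the em square, which is why it is a tighter fit for ASCII text than the global bounding box.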