How to train/fine tune with images containing multiple languages?

When extracting WordStr box files, I pass all the languages the image contains, for example "eng+deu". However, the lstmf files are created per language.
tesseract.exe image.png image --psm 4 -l eng+deu wordstrbox
So what is the correct path forward here? Should I use the same images in training for both English and German? Or something else?
Thanks
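
For reference, the extraction command above can be repeated over many images with a small helper. This is only a sketch: it does not settle which language the resulting files should be used to train, and the function name and its defaults are hypothetical.

```python
from pathlib import Path


def build_wordstrbox_command(image_path, languages, psm=4, exe="tesseract"):
    """Return the argument list for one wordstrbox extraction run.

    Mirrors the command from the question:
        tesseract image.png image --psm 4 -l eng+deu wordstrbox
    """
    image = Path(image_path)
    # The output base is the image path without its extension,
    # as in the question ("image.png" -> "image").
    output_base = image.with_suffix("")
    return [
        exe,
        str(image),
        str(output_base),
        "--psm", str(psm),
        "-l", "+".join(languages),  # ["eng", "deu"] -> "eng+deu"
        "wordstrbox",
    ]
```

Each returned list can be passed to `subprocess.run(..., check=True)`, once per image.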

Related

Convert a folder containing asciidocs and pictures to pdf

I would like to convert the book Mastering the Lightning Network, which is freely available on GitHub, to a PDF for personal use.
Unfortunately, I have only figured out how to "translate" single files using asciidoc or asciidoctor-pdf. The options for folders don't seem to work with the configuration of the repository.
There has to be an easy way to translate everything, including all files and pictures. I would be very thankful if somebody could help me out.

As far as I know, it is not possible to convert a whole folder of AsciiDoc files to a PDF directly. A simple script could do it, but the problem is the order in which the files should be converted.
The simplest solution is to create your own content.adoc file and use the include directive to select which files to convert and in what order. It could look something like this:
= Mastering the Lightning Network
include::01_introduction.asciidoc[]
include::02_getting_started.asciidoc[]
include::03_how_ln_works.asciidoc[]
include::04_node_client.asciidoc[]
include::05_node_operations.asciidoc[]
include::06_lightning_architecture.asciidoc[]
include::07_payment_channels.asciidoc[]
include::08_routing_htlcs.asciidoc[]
include::09_channel_operation.asciidoc[]
include::10_onion_routing.asciidoc[]
include::11_gossip_channel_graph.asciidoc[]
include::12_path_finding.asciidoc[]
include::13_wire_protocol.asciidoc[]
include::14_encrypted_transport.asciidoc[]
include::15_payment_requests.asciidoc[]
include::16_security_privacy_ln.asciidoc[]
include::17_conclusion.asciidoc[]
Then convert it with: asciidoctor-pdf content.adoc
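
If typing the include lines by hand is tedious, a short script could generate the master file. The sketch below assumes that sorting by file name gives the right order, which holds only because the chapters carry zero-padded numeric prefixes; the function names are hypothetical. It also puts a blank line between include lines, which Asciidoctor's documentation recommends so that adjacent files do not run together.

```python
from pathlib import Path


def build_master_document(source_dir, title, pattern="*.asciidoc"):
    """Return AsciiDoc text that includes every matching file in name order."""
    # Assumption: file names sort into reading order (01_, 02_, ...).
    chapters = sorted(p.name for p in Path(source_dir).glob(pattern))
    lines = [f"= {title}"]
    for name in chapters:
        lines.append("")  # blank line keeps each include a separate block
        lines.append(f"include::{name}[]")
    return "\n".join(lines) + "\n"


def write_master_document(source_dir, title, output_name="content.adoc"):
    """Write the master file next to the chapters and return its path."""
    output_path = Path(source_dir) / output_name
    output_path.write_text(build_master_document(source_dir, title), encoding="utf-8")
    return output_path
```

Files that belong in a different position (a preface or glossary, for example) would still need to be moved by hand in the generated content.adoc.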

You could try using imagemagick:
magick *.jpg out.pdf

IBM Watson Visual recognition{"code":400,"error":"Cannot execute learning task. : no classifier name given"}

When I try to train a classifier with two positive classes and with the API key (each class contains around 1200 images) in Watson Visual Recognition, it returns that "no classifier name is given" - but that I have already provided. This is the code:
$ curl -X POST -F "blank_positive_examples=@C:\Users\rahansen\Desktop\Altmuligt\training\no_ocd\no_ocd.zip" -F "OCD_positive_examples=@C:\Users\rahansen\Desktop\Altmuligt\training\ocd\ocd.zip" -F "name=disease" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers?api_key={X}&version=2016-05-20"
{"code":400,"error":"Cannot execute learning task. : no classifier name given"}
What I have done so far:
Removed all special characters in the file names, as I thought that might be the problem.
Tried other names for the classifier, e.g. "name=ocd".
I also tried to train it on a smaller dataset, around 40 images in each positive class, and then it actually works fine. So maybe the size of the dataset is the problem. However, according to the Watson training guidelines, I comply with the size limits: https://www.ibm.com/watson/developercloud/doc/visual-recognition/customizing.html I have a free subscription.
Does anyone have any recommendations for how to solve this classifier training problem?

This can occur when there's a problem processing the zip files. I would try simplifying your training files. For instance, use just 100 examples per class; you can add more via retraining later. It's always good to train, measure performance, and then add more training samples.

@Rasmus, check that the image files are named cleanly, with no special symbols, spaces, and so on in the file names. The error appears to be related to special characters in the input. The API expects only letters and numbers in classifier names. It also requires that the images in your zip files end with an image file extension such as .jpg, .jpeg, .gif or .png.
So, after you rename the images, check that they all have a correct extension, such as .jpg or .png, and are in a format that Visual Recognition supports.
Replace {api-key} with the service credentials you copied in the first step.
Modify the location of the {class}_positive_examples to point to where you saved the .zip files.
And, use your cURL like:
curl -X POST \
  -F "blank_positive_examples=@C:\Users\rahansen\Desktop\Altmuligt\training\no_ocd\no_ocd.zip" \
  -F "OCD_positive_examples=@C:\Users\rahansen\Desktop\Altmuligt\training\ocd\ocd.zip" \
  -F "name=disease" \
  "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers?api_key={api-key}&version=2016-05-20"
Note: the cause may be something else; see the other question about the classifier name error.
An example that works on my PC:
curl -X POST -F "dog_positive_examples=@c:\Dogs.zip" -F "negative_examples=@c:\Cats.zip" -F "name=dogs" "https://gateway-a.watsonplatform.net/visual-recognition/api/v3/classifiers?api_key={API KEY}&version=2016-05-20"
See the official reference here.
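
The file name and extension checks described above could be automated before uploading. The following is a rough sketch using only the Python standard library; the function name is hypothetical, and the exact set of characters the service accepted is an assumption (letters, digits, underscore and hyphen here).

```python
import re
import zipfile

SUPPORTED_EXTENSIONS = (".jpg", ".jpeg", ".gif", ".png")
# Assumption: only these characters are treated as "safe" in a file name.
SAFE_NAME = re.compile(r"^[A-Za-z0-9_-]+$")


def find_problem_files(zip_path):
    """Return (member_name, reason) pairs for entries that may upset training."""
    problems = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Look only at the file name, not at any folder inside the zip.
            base = info.filename.rsplit("/", 1)[-1]
            stem, dot, ext = base.rpartition(".")
            if not dot or ("." + ext.lower()) not in SUPPORTED_EXTENSIONS:
                problems.append((info.filename, "unsupported or missing extension"))
            elif not SAFE_NAME.match(stem):
                problems.append((info.filename, "special characters in name"))
    return problems
```

An empty result from `find_problem_files("ocd.zip")` does not guarantee that training succeeds; it only rules out the naming issues mentioned in this answer.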

Generate PNG file with pre-known CRC

Is it possible to create a PNG file with a predefined CRC? (kind of a programming challenge..)
I have a python script to generate hex codes with the target CRC, but I'm not sure how to make a valid PNG out of it.
BTW - it may be that I'm talking nonsense, but it sounds possible in theory (right?)

You can use spoof.c to do that, either at the level of a PNG chunk or at the level of the entire file. (Note that a PNG file does not contain a CRC of the whole thing, only CRCs of the chunks.)
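
To illustrate the point about per-chunk CRCs, here is a minimal sketch that builds a tiny PNG and recomputes each chunk's CRC. It does not reproduce what spoof.c does; it only shows where the CRCs live. In a PNG, each chunk's CRC-32 covers the chunk type and chunk data, not the length field. The helper names are hypothetical.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """Build one PNG chunk: length, type, data, CRC-32 of type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png_bytes):
    """Yield (type, data, stored_crc, computed_crc) for each chunk."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    pos = len(PNG_SIGNATURE)
    while pos < len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[pos:pos + 4])
        chunk_type = png_bytes[pos + 4:pos + 8]
        data = png_bytes[pos + 8:pos + 8 + length]
        (stored,) = struct.unpack(">I", png_bytes[pos + 8 + length:pos + 12 + length])
        yield chunk_type, data, stored, zlib.crc32(chunk_type + data) & 0xFFFFFFFF
        pos += 12 + length


def minimal_png():
    """Return a 1x1 8-bit grayscale PNG built from three chunks."""
    # IHDR: width, height, bit depth, colour type, compression, filter, interlace.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    # One scanline: filter byte 0 followed by a single black pixel.
    idat = zlib.compress(b"\x00\x00")
    return (PNG_SIGNATURE + make_chunk(b"IHDR", ihdr)
            + make_chunk(b"IDAT", idat) + make_chunk(b"IEND", b""))
```

A script that forces a chosen CRC on a chunk would have to produce chunk data for which `zlib.crc32(chunk_type + data)` equals the target value; `iter_chunks` can then confirm that the stored and computed values agree.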

How to find parameters supported in Tesseract OCR config file

I want to know what parameters the config file used by Tesseract OCR accepts, how to write a config file, etc.
I can't find any documentation about this on their site. How can I determine what parameters are supported, and what they mean?

I found these instructions at the link below. They cover how to write the config file and where to place it:
The config file is a simple text file without a BOM and with Unix end-of-line marks (on Windows you can use an advanced text editor such as Notepad++ to achieve this).
If you use the tesseract executable, this is the only way to change Tesseract parameters.
The config file should be located in your tessdata/configs directory. Have a look there for some examples.
There is a list of all the variables plus descriptions of each one in http://www.sk-spell.sk.cx/tesseract-ocr-parameters-in-302-version. Note it's for Tesseract 3.02, things may be different in other versions.
Edit: Also adding a pastebin link in case the above link becomes dead.

Tesseract v3.04 now offers the command line option --print-parameters, so you can call tesseract --print-parameters to get a list of the 678 (!) configurable parameters, their default values, and a short description:
Tesseract parameters:
editor_image_xpos 590 Editor image X Pos
editor_image_ypos 10 Editor image Y Pos
editor_image_menuheight 50 Add to image height for menu bar
editor_image_word_bb_color 7 Word bounding box colour
editor_image_blob_bb_color 4 Blob bounding box colour
editor_image_text_color 2 Correct text colour
...and many, many more
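
With this many parameters, it can help to load the listing into a dictionary and search it. The sketch below is an illustration under assumptions: it expects each line to hold a name, a default value, and a description, separated by tabs or, failing that, by spaces as shown above. The function name is hypothetical.

```python
def parse_parameters(listing):
    """Map parameter name -> (default value, description)."""
    params = {}
    for line in listing.splitlines():
        line = line.rstrip()
        if not line or line.startswith("Tesseract parameters"):
            continue
        # Assumption: fields are tab-separated when tabs are present;
        # otherwise split on the first two runs of whitespace.
        parts = line.split("\t") if "\t" in line else line.split(None, 2)
        if len(parts) < 3:
            continue  # not a "name value description" line
        name = parts[0]
        if not name.replace("_", "").isalnum():
            continue  # skip anything that does not look like a parameter name
        params[name] = (parts[1], " ".join(parts[2:]))
    return params
```

The input could come from `subprocess.run(["tesseract", "--print-parameters"], capture_output=True, text=True).stdout`, after which something like `[n for n in params if "editor" in n]` narrows the list down.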

It's just a plain text file containing space-delimited key/value pairs for Tesseract config variables, each on a separate line; for instance:
interactive_display_mode T
tessedit_display_outwords T
There are several standard config files, such as digits and hocr, in Tesseract's tessdata/configs folder.
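
A config file in this format can also be written from a script, which avoids the BOM and line-ending pitfalls on Windows. This is a minimal sketch; the function name is hypothetical, and writing booleans as T/F simply follows the two-line example above.

```python
from pathlib import Path


def write_tesseract_config(path, settings):
    """Write one 'name value' pair per line, UTF-8 without BOM, LF endings."""
    lines = []
    for name, value in settings.items():
        if isinstance(value, bool):
            value = "T" if value else "F"  # booleans written as T/F
        lines.append(f"{name} {value}")
    # newline="\n" stops Python from writing CRLF on Windows;
    # encoding "utf-8" (not "utf-8-sig") writes no byte order mark.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write("\n".join(lines) + "\n")
    return Path(path)
```

For example, `write_tesseract_config("myconfig", {"interactive_display_mode": True})` produces a one-line file; placed in tessdata/configs, it would be named as the last argument on the tesseract command line, the same way the standard configs are.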

gdcm library - transfer syntax

Does GDCM library support the following DICOM transfer syntaxes:
1.2.840.10008.1.2.4.53 JPEG Spectral Selection, Nonhierarchical (Processes 6 & 8)
1.2.840.10008.1.2.4.55 JPEG Full Progression, Nonhierarchical (Processes 10 & 12)
If yes, could anyone link me sample pictures encoded with these transfer syntaxes? I've been searching everywhere, but I found nothing...
Thanks for your reply.
I have already seen the link you provided, but it has no list of Transfer Syntax UIDs, so I wasn't sure whether GDCM supports exactly the transfer syntaxes I mentioned in my question.
Still, I need to test it in practice, so I am still looking for files encoded with those transfer syntaxes. I searched over 20k pictures and didn't find any, but I know there must be example files somewhere.

Yes, GDCM does support both of those; see the Short Presentation.
