Is user_words functional in Tesseract 4.x - tesseract

Can I supply a user_words list to augment the existing dictionary in Tesseract v4 as was the case in v3.x or is re-training the LTSM model the only approach to introducing new vocabulary?

Related

External DSL for CRUD code generation for multiple frameworks

I am developing a general purpose CRUD code generator application. The idea is that codes/files (model, controller, view) for common insert, update, list, delete etc. operations will be automatically generated from model definition (like the definition used in Grails). But the generated code can be for any framework, e.g. Play (Scala or Java version) or Django or Grails or whatever framework user wants to use it for, even AngularJs. That is, same model definition can be used for generating code for any framework.
My question is, what can I use for this task - Scala or Groovy or some DSL specialized tools like Xtext?
This seems like a good case for a DSL. A DSL can be summed up as the following 3 elements:
Abstract Syntax: the concepts of your DSL. Here you want to specify CRUD applications.
Concrete Syntax(es): a way to materialize your Abstract Syntax. As a programmer, the first thought is often text-based syntaxes, but you could also use a graphical or tree-like syntax, or even simply a GUI with text fields and checkboxes.
Semantics: the meanings of your DSL. Here you want to generate code.
I'll now suggest some solutions which are based on Java and come from the Eclipse Modeling ecosystem.
Eclipse EMF implements standards for the definition of so-called "metamodels" (basically abstract syntaxes). In the Eclipse world, EMF is the base for a lot of tooling.
Assuming you have an EMF metamodel, textual syntaxes can be specified using Eclipse Xtext, and graphical syntaxes using Eclipse Sirius. Note that you can also develop your own GUI in Java and create your model using the EMF Java APIs. Also note that Xtext can create your metamodel for you based on the grammar you want for your text-based syntax. This is nice if you don't want to dive too deep into EMF it self (thus steps 1 and 2 are one and the same).
Eclipse Acceleo provides a template language specifically designed to generate text, including code. Once again, you can also write your code generator using plain Java, or any JVM-based language thanks to the EMF Java APIs. If you use Xtext, there is also a facility for including an Xtend-based code generator alonside your syntax.

Grammar and/or lexer for Swift AST output

I've started a project to transpile a subset of Swift into Kotlin (for sharing business logic between iOS and Android apps) and am using the Abstract Syntax Tree output of the Swift compiler (i.e. as produced by the -dump-ast option) as a starting point.
I was wondering if anyone knew of an "official" grammar and/or lexer for the AST output itself. It's not that complicated and I've created one myself for now, but I'd feel more comfortable if I was relying on something supported.

F# for the VM in a MVVM WPF environment?

Obviously F# would rule the model part, but would the choice be as clear for the VM part ?
Would the tooling support lost (any?) be compensated by the gain in the flexibility in the langage in a large application ?
Given that you don't rely on any major features of some MVVM framework, I'd say mostly it's as much as the choice of F# over C#, if that's the case of comparison.
The current F# compiler misses one feature that is useful in VM code though -- the support for CallerMemberName attribute, which is nice to use for INPC object properties. But this is probably a minor shortcoming compared to benefits you get (e.g. all the cool stuff in F#).
Since VM is often about transforming and processing the M data to use it in the V, I don't yet see why F# would be a bad fit.

C++ Libraries on iPhone/Android: Generation of Wrapper classes

I have the following question:
Let's assume I have many different C++ libraries(algorithms), which are written in the same style . (They need some inputs and give back some outputs).
I've done some research and wanted to ask if its possible to auto-generate Wrapper classes (by using an algorithm which are given the input and the outputs of the c++ algorithm), which can be easily used in Objective-C/Java (iOS/Android) then .
The app-programming part isn't really time-consuming.
You'll want to look at SWIG. This generates bindings for other languages from a C based API. Objective-C support is in there as is Java.
I'm not sure what happened to objective-C support in the later versions, but its in v1.1 and you can see the branch where it was added.

What is practical use of IDEA MPS and Eclipse Xtext

Both of those frameworks deal with meta-model:
XText (Eclipse)
MPS (JetBrain)
Do you have example of practical applications based on meta-model transformation with those tools?
We created whole bug tracker using MPS. Code generation is not the goal but mean to get some executable code. The goal is to give a tool to developer that allows creating DSLs with minimum effort.
Cool thing about MPS is that it also provides you with an IDE for your language. And different DSLs you create are compatible, i.e. you can create DSL that extends Java with closures and another DSL that enables external methods, and these extensions will work together.
They are different in term of document storing the metamodel.
Regarding XText, this article illustrates one usage, when it comes to y create your own programming languages and domain-specific languages (DSLs).
Once you have a language, you want to process it and this means usually to transform your model into another representation.
The facility responsible for this transformation is called generator and consists of a bunch of transformation templates (e.G. XPand) and some code executing them. On some event, the model is read in and the transformations are applied to produce code.
Example of such a model transformation:
dot3zest, which comes with a DOT to Zest interpreter (which now uses the Xtext switch API generated for the DOT grammar) is support for ad-hoc DOT edge definitions.
Regarding MPS, you have here a serie of practical examples,
like this code generation to GPL such as Java, C#, C++ or XML:
(source: googlecode.com)
I think the main usage of XText is firstly to create a DSL from the grammer you defined and an eclipse workbench auto-generated for you. Secondly, it can transform the scrpit written in your DSL into java. The built-in expressions from XText2 is a plus.
The framework gives you a free IDE to support your writing DSL you created. And the DSL is the ulimate product to provide. It can be used to abstract the rules and logics from the real world. For example, in our project, the product config rule. Only specialist knows them, so they write some in the DSL you create.