Saving Converted OCRed File using ABBYY in commandLine


Hi, I want to integrate ABBYY FineReader into my custom application.
I use the command line FineCMD.exe MyDocument /lang french /send MsWord
It runs OCR on my document and passes the result to MS Word. I want FineCMD to save the converted (OCRed) Word document to a folder instead. How can I do that with a command-line argument?
I am using ABBYY FineReader 12 on Windows 10.

ABBYY FineReader does not provide a command-line interface for saving the result from a script or for batch processing. FineReader is a desktop application intended for UI-driven use, not for black-box integration. The ABBYY products suited to that task are Recognition Server or, for true development, Engine SDK, which by the way includes a precompiled CLI sample.
(Source: I am a former ABBYY dev/support tech and currently an independent ABBYY technology integrator)

Related

Create a desktop app that generate ms word file

I'm currently working with tons of MS Word files and trying to find a way to improve my workflow.
I'm wondering if there's a way to create a desktop app that can preview certain parts of a Word file, select them, and generate a new file with control over Word's text style, paragraph formatting, etc.
I suppose this would require the MS Word API and some particular framework. I've been using Electron/Node.js to create cross-platform applications and wonder whether it could handle this as well. Or is there any reference I can dig into?
Sorry if this sounds like a rookie question. I've searched but still can't figure out where to start.

There are three possible ways to get the job done:
1. Automate MS Word itself. See Automate MS Office Applications using Python win32com module for more information. For example:
import win32com.client
word = win32com.client.Dispatch("Word.Application")  # start (or attach to) Word through COM
2. Use the Open XML SDK to generate Word documents at runtime. See Welcome to the Open XML SDK 2.5 for Office for more information.
3. Use third-party components.
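
To make the second option a bit more concrete: a .docx file is a ZIP package of XML parts, which is what the Open XML SDK reads and writes. Below is a minimal sketch in Python using only the standard library, not the Open XML SDK itself. The function name build_docx is hypothetical, and real documents usually carry further parts (styles, settings, and so on) that are left out here.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Package-level part: declares the content type of each part in the ZIP.
CONTENT_TYPES = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="rels" '
    'ContentType="application/vnd.openxmlformats-package.relationships+xml"/>'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" '
    'ContentType="application/vnd.openxmlformats-officedocument.'
    'wordprocessingml.document.main+xml"/>'
    '</Types>'
)

# Package-level relationships: points at the main document part.
RELS = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">'
    '<Relationship Id="rId1" '
    'Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" '
    'Target="word/document.xml"/>'
    '</Relationships>'
)

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_docx(paragraphs):
    """Return the bytes of a bare-bones .docx with one plain paragraph per string."""
    body = "".join(
        f'<w:p><w:r><w:t xml:space="preserve">{escape(text)}</w:t></w:r></w:p>'
        for text in paragraphs
    )
    document = (
        '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
        f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr("[Content_Types].xml", CONTENT_TYPES)
        package.writestr("_rels/.rels", RELS)
        package.writestr("word/document.xml", document)
    return buffer.getvalue()
```

Writing the returned bytes to a file with a .docx extension should give a document that Word can open. For anything beyond plain paragraphs (styles, tables, headers), a proper library or the SDK is the more practical route.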

If you are on Windows, there seems to be a way to access Word files in Python: https://www.blog.pythonlibrary.org/2010/07/16/python-and-microsoft-office-using-pywin32/. Maybe in Node too.

Word 2013 to OpenXML Converter

Is there any converter available for Word 2013?
I used a tool two years ago that converted a Word 2010 document to OpenXML tags and C# code.
I can't recall its name. We just need to create a Word document with the format we need and then use that tool to convert it to OpenXML tags.
Does anyone know of this kind of tool?
I have downloaded the tools below, but they do not work with Word 2013.
Odf-AddInForWordSetup-en-1.0.exe
OdfAddInForOfficeSetup-en_4.0.5309.exe
Thank you,

I think you are looking for Open XML SDK 2.5 for Microsoft Office. In the link that I provided above its features are described:
Features include the ability to generate Open XML SDK 2.5 source code based on document content, compare source and target Open XML documents to reveal differences and to generate source code to create the target from the source, validate documents, and display documentation for the Open XML SDK 2.5 Classes, the ECMA376v1 standard, and the Microsoft Office implementation notes.

Can I convert .docx Word documents using the DocX .NET Library?

I am currently attempting to convert a couple of .NET desktop applications that I have developed into a web application harnessing AngularJS and RESTful services.
One of the key features of these applications is their ability to generate Word documents on the fly using a .dotx Word template. I am currently exploring the possibility of using a third-party library called DocX to generate these Word documents without resorting to a template.
I guess my question is: Can I use this library to read an existing Word document in .docx format and generate a source code representation of the document? If this is possible could someone point me in the direction of any code samples that I could use? I have looked around and have been unable to find anything that could help me get started.

Generating a code representation of the document and using it with DocX seems like a time-consuming effort to me. Why not use a template instead and fill it with data at runtime?
I have some experience with Docentric, which is a third-party OpenXML toolkit. It features a Word add-in for template design and libraries for document generation and manipulation. It took me less than a week to generate pretty complex documents. If I were in your shoes, I would definitely try some third-party toolkits. They cost money but save time, so do the math and see if they can be useful for you.

It is possible to read an existing Word document in .docx format with the following code:
DocX document = DocX.Load(filename);
However, it is not possible to generate a source code representation of a document.
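
For comparison, here is a rough sketch of what "reading an existing .docx" amounts to underneath any such library. It is written in Python with only the standard library and does not use DocX. The helper name paragraph_texts is hypothetical, and the sketch assumes the main part is stored at word/document.xml, which is the usual location.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_texts(docx_file):
    """Return the plain text of each paragraph in the main document part."""
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    texts = []
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread over runs; join every <w:t> inside it.
        pieces = (node.text or "" for node in paragraph.iter(f"{{{W_NS}}}t"))
        texts.append("".join(pieces))
    return texts
```

paragraph_texts accepts a file path or a file-like object. It only recovers text, not formatting, so it is a starting point for inspection rather than a way to regenerate the document.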

Programmatically convert Doc(x) files to PDF using Microsoft Word

We are developing a Java application that needs to programmatically convert .rtf, .doc and .docx files to PDF files.
Formatting is important to us, so we need the page numbers to be the same between a source file and a target PDF file, and the contents of each page being the same as the original file.
We have tried out open source solutions, such as JODConverter to invoke a LibreOffice or OpenOffice installation, Docx4j and XDocReport. The best formatting was achieved with LibreOffice. However, even in that case, the page counts differed (for example, an 87-page .rtf file resulted in an 80-page PDF file).
So we think the ideal way to do the conversion would be to somehow invoke Microsoft Word through our Java application and let Word do the conversion. That would produce PDF files with the same formatting as the original files.
Is this possible in any of the following ways:
An API that can be invoked directly from Java?
An API that can be invoked from a .NET language, which we would then use with something like JACOB?
A 3rd party library that uses a Microsoft Word installation under the hood (something like JODConverter for Word)?
A CLI interface supported by Word (relevant question)?
Something else?

Programmatically convert DITA to FrameMaker

Is there a toolkit available (paid or otherwise) to help with programmatically converting a DITA document to a FrameMaker one?
I'm attempting to make an application that converts to multiple formats from DITA. I know I can use the DITA Open Toolkit for most of my needs, but I need to be able to create a native FrameMaker document as well.
The programming language doesn't matter, although I prefer Java as my application will be web-based.

Arbortext import-export is industrial strength and very flexible. You could also try MIFtoGo. Conversion is tricky because source documents are rarely consistent. Conversion without cleanup before and after is next to impossible.

DITA-FMx is what you need:
http://leximation.com/dita-fmx/
Using DITA-FMx, you generate a FrameMaker book from your ditamap (and then save the FM book as PDF).
There is a movie on YouTube that shows you how the process goes. Just search for "PDF Publishing with DITA-FMx 1.1" (Stack Overflow does not allow me to post a second URL here yet)
If you like to see an example, just send me a small sample of a ditamap and I'll generate a FrameMaker book for you.

The disclaimer is that if you're converting a lot of documents, you'll be better off with a supported, well-integrated solution that fully uses FrameMaker's DITA support. If you're looking to do it on the cheap though (and who isn't), you can do this conversion using plain XSLT and FrameMaker templates.
First create the FrameMaker template to handle the appearance of the document, then use XSLT to map your DITA content to the content tags you've used in FrameMaker.
You can use the free Saxon XSLT processor to do the actual conversion.
Here is some of Adobe's reference material on authoring new DITA documents:
http://help.adobe.com/en_US/FrameMaker/8.0/05h/dita.html
Info on FrameMaker's native XML support is here:
http://www.adobe.com/products/framemaker/pdfs/xml_fm7.pdf
The FrameMaker manuals also cover the subject quite extensively. Hope that helps.

Indeed, FM supports loading DITA files (I tried FM10 and newer), but to automate conversion to .fm format you have to use either the internal scripting mechanism, which still involves some manual work, or an external tool.
I have found an existing free utility that can do the most basic operations, like opening a file, saving it as another format, and closing it.
The tool's name is DZBatcher.
Example DZBatcher batch file:
Open "c:\My Dita Files\Doc1.dita"
SaveAs -d "c:\My FM Files\Doc1.fm"
Close "c:\My FM Files\Doc1.fm"
Open "c:\My Dita Files\Doc2.dita"
SaveAs -d "c:\My FM Files\Doc2.fm"
Close "c:\My FM Files\Doc2.fm"
Exit
Adobe has a framework called RoboHelp which is probably the infrastructure for this, but I didn't dig deeper as this utility did the job perfectly.
In my workflow, I generated this batch file with a Python script that scans all the documents in an input directory and adds three lines per file, as seen above.
I used FM2015 for this task.
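
The generator script itself is not shown in the answer, so the following is only a sketch of what such a script might look like. The function names find_dita and batch_lines are hypothetical; the Open, SaveAs -d, Close and Exit commands are copied from the example batch file above, and nothing else about DZBatcher's syntax is assumed.

```python
from pathlib import Path, PureWindowsPath


def find_dita(input_dir):
    """Return the .dita files directly inside input_dir, sorted by name."""
    return sorted(str(path) for path in Path(input_dir).glob("*.dita"))


def batch_lines(dita_files, fm_dir):
    """Build DZBatcher commands: Open / SaveAs -d / Close per file, then Exit.

    Paths are formatted Windows-style because FrameMaker runs on Windows.
    """
    lines = []
    for dita in dita_files:
        source = PureWindowsPath(dita)
        target = PureWindowsPath(fm_dir) / (source.stem + ".fm")
        lines.append(f'Open "{source}"')
        lines.append(f'SaveAs -d "{target}"')
        lines.append(f'Close "{target}"')
    lines.append("Exit")
    return lines


if __name__ == "__main__":
    # Directory names taken from the example batch file above.
    files = find_dita(r"c:\My Dita Files")
    print("\n".join(batch_lines(files, r"c:\My FM Files")))
```

Run on the Windows machine that holds the files, the printed output can be redirected to a file and handed to DZBatcher. How DZBatcher is told which batch file to run is not covered in the answer, so that step is left out here.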

Bryan, after a decade's experience converting Frame, Word, Interleaf, etc to XML, I'll tell you that Adobe doesn't have it fully covered. The DITA support in FrameMaker works best if you also have the Leximation plug-in or know how to program the Adobe proprietary EDD. You can't do DITA specialization without programming the EDD in FrameMaker.

FrameMaker has excellent support for DITA. You can open DITA topics, and save them (if you wish) as .fm files. You could also open DITAMAPS, and save them as FrameMaker book files, or as composite (monolithic) .fm files. There is no need to write a parser.
PS: I am talking about FM 12.
