Store Kofax 10.2 Batch configuration in version control

When you export a batch configuration from the Kofax 10.2 UI, it generates a cab file.
There are a bunch of binary files, like DLLs, in that cab file, which pretty much kills the ability to store it in a version control system.
Having those configuration files in version control would allow better and easier code sharing, testing, deployment, and automation.
So I have 3 questions:
Is there a way to export a version-control-friendly batch configuration?
Is there a way to integrate Kofax with version control directly?
Are there any plans to add this functionality in future versions?
Thanks.

Unfortunately the short answers to all of your questions are No.
Despite the fact that it has no granularity, you should store the whole cab file in source control, since that is what you would use if you needed to restore your configuration to a previous state.
Within the cab file, the primary item that holds the batch configuration is the admin.xml file. If you really felt the need, you could extract the contents of the cab file and also store these in source control. If you were to diff versions of admin.xml, you may be able to determine context about what changed in the batch class. However, you would still only be able to restore the full cab file.
Additionally, you mentioned DLLs in the cab file, so I assume that you have Validation Scripts or something similar. Not only the built DLLs, but also the source code would be within the cab, in folders like Scripts\00000001[DocumentClassName]. So again, keeping the extracted contents in source control might be a good way to be able to diff changes, etc. But you still do need to keep the full cab, since that is the only way you can import the batch class configuration.
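If you want to automate that "commit the cab plus its extracted contents" idea, here is a minimal sketch in Python. It assumes Windows, since it shells out to the built-in expand.exe tool, and the paths are just placeholders:

    # Extract a Kofax export cab next to itself so admin.xml and the
    # script sources are diffable, while the cab itself stays importable.
    import subprocess
    import sys
    from pathlib import Path

    def extract_cab(cab_path: Path, out_dir: Path) -> None:
        out_dir.mkdir(parents=True, exist_ok=True)
        # expand -F:* extracts every file in the cabinet (Windows built-in).
        subprocess.run(["expand", "-F:*", str(cab_path), str(out_dir)],
                       check=True)

    if __name__ == "__main__":
        cab = Path(sys.argv[1])                  # e.g. MyBatchClass.cab
        extract_cab(cab, cab.with_suffix("") / "extracted")

After running it, you would commit both the cab and the extracted folder in the same revision.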

Everything that Stephen said in his answer, and...
For some of the types of configuration management, version control, and troubleshooting tasks in the Kofax environment, I have found Beyond Compare by Scooter Software to be supremely helpful in comparing the contents of two .cab files and reconciling differences between them.
I'm speaking specifically of comparing cab files containing Kofax batch classes, which also contain the document class information for the document types in the batch class, as well as other things like assigned users, etc.
This will work best if your cab files each contain only one batch class, and the same one in both, e.g., before and after cab snapshots of the same batch class.
In Beyond Compare (BC) (I'm using the 4.x version), from Windows Explorer you select one .cab file for the left side, and the .cab file you are comparing it to for the right side. BC will show you the files inside each cab file, and as Stephen said, the admin.xml is the one with the details.
You can actually copy XML lines from one side to the other in BC, and save the result, but the real value is in seeing what settings changed between versions of the batch class.
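If you would rather script that comparison than do it in a GUI, a rough sketch using Python's difflib (assuming you have already extracted admin.xml from each cab into before/ and after/ folders) looks like this:

    # Print a unified diff of two versions of admin.xml.
    import difflib
    from pathlib import Path

    old = Path("before/admin.xml").read_text(encoding="utf-8").splitlines()
    new = Path("after/admin.xml").read_text(encoding="utf-8").splitlines()
    for line in difflib.unified_diff(old, new,
                                     "before/admin.xml", "after/admin.xml",
                                     lineterm=""):
        print(line)

That gives you the same "what changed between versions" view, in a form you can log or attach to a ticket.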
If Kofax had some sort of scriptable automation API for the admin module, that would be amazing and potentially enable many of the capabilities you describe, but if Kofax does have such an API, I am unaware of it. I'm currently running Kofax Capture 10.1.
In Kofax version 11, they did add some features for keeping versions of batch classes automatically for you, so you can audit changes that were made in the admin module. I didn't notice anything about an automation API for the admin module in Kofax 11, though.

Related

Importing source files and folders into IAR Workbench

I have a couple of source files in a certain folder structure in my file system. I want to use this structure for a project in the IAR Workbench. Thinking of Eclipse, that would be so easy! But in the IAR Workbench, the folders become "Groups", which are only a kind of virtual folder. The Workbench doesn't care about folders.
Is there some easy and fast way to import them?
Up to now I have had to add each group manually and then add the files to the groups, and that's really annoying!
Is there maybe a tool to generate a proper project file (*.ewp) out of a file/folder structure path?
This would help me a lot!
You should have a look at IAR's Project > Add Project Connection command.
Although IAR doesn't seem to have any public documentation on the xml syntax (or at least I couldn't find any), you can find Infineon DAVE (Config.xml) and Freescale PE (ProjectInfo.xml) files if you search around. These can be used as examples to figure out how to write your own xml files in one of these formats, which lets you specify where all your C, header, assembly, and library files are, wherever they may be in your file system. They also allow you to define preprocessor includes for the compiler/assembler, and DAVE allows you to define a path variable, which is also very useful.
See: https://mcuoneclipse.com/2013/11/01/iar-arm-v6-7-comes-with-improved-processor-expert-support/
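To give a feel for the approach (note: the element names below are hypothetical placeholders, since the real schema is undocumented; copy it from a DAVE Config.xml or PE ProjectInfo.xml sample), a small Python script can walk a source tree and generate the file list for such a connection file:

    # Emit a skeleton project-connection xml listing all sources in a tree.
    # "ProjectConnection"/"Files"/"File" are placeholder names, not the
    # documented schema -- replace them with what the DAVE/PE samples use.
    import os
    import sys
    import xml.etree.ElementTree as ET

    root_dir = sys.argv[1]
    conn = ET.Element("ProjectConnection")
    files = ET.SubElement(conn, "Files")
    for dirpath, _, names in os.walk(root_dir):
        for name in names:
            if name.endswith((".c", ".h", ".s", ".a")):
                ET.SubElement(files, "File",
                              path=os.path.join(dirpath, name))
    ET.ElementTree(conn).write("connection.xml", encoding="utf-8",
                               xml_declaration=True)

The point is that the file list is generated from the file system rather than maintained by hand in the Workbench.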
I have modified a DAVE Config.xml file and found it EXTREMELY useful for managing and migrating even just a handful of project files. For example, to upgrade to a new release where all files have a new directory root, you just change a single line in the xml file (defining the new root), and all source files, compiler includes, etc. are updated to the new level. No more manually editing the preprocessor includes or replacing all the files in the project. And no more fiddling around with ../../ file system navigation; you just specify directly (or indirectly via a path variable) where the files are, rather than relative to wherever your project happens to be. VERY NICE.
IAR should consider opening this up (documenting it) for general users, as it is very useful for project management and migration. While at it, they should also consider generalizing the xml syntax a little bit, allowing for the definition of IAR group heading names, specifying the linker file name, and definitely allowing multiple xml files to be included (connected), so that subprojects can be easily added or removed without affecting the other subproject definition files, and a few basic things like that.
If they were to do a bang-up job on this, they might consider allowing most or all aspects of IAR project configuration that a subproject might require to be defined in these xml files; then entire (sub)projects could just be plopped down anywhere and be up and running extremely quickly (OK, just let me dream a bit :)
For anyone who happens upon this you may want to check out https://github.com/IARSystems/project-migration-tools. They have a tool for pulling in file trees here.

Custom Eclipse (CDT) project layout, different from folder structure

A good hello to you, fellow Stack Overflow people.
I am stuck with a small dilemma here.
At my work we used to work with UltraEdit projects, but we want to migrate to Eclipse CDT. (We're not using its compiler/build options; we need an external SDK for this.)
On the hard disk we have a specific folder structure to keep things separate between two teams: the 'productcode' + 'applicationcode' group and the 'drivercode' group.
Both groups have their own folder where they place source code.
application
drivercode
productcode
The filenames are given a specific prefix, denoting to which 'layer' they belong.
os (operating system)
application
system
unit
component
IO
hardware
All of these files (except for application, which is only allowed in the application folder) can be in the productcode or drivercode folder.
In UltraEdit all of these files are grouped under their respective layer. So our project has the following folders:
0 Operating System
1 Application Layer
2 System Safety Layer
3 Unit Layer
4 Component Layer
5 IO Layer
6 Hardware Layer
Generic
XML
The virtual folder '0 Operating System' holds all os_xxx files from the real folders 'drivercode' and 'productcode', and the same goes for 2, 3, 4, 5, and 6.
TL;DR:
Is it possible to get the same (virtual) folder structure within Eclipse CDT?
To make things more complex, this whole folder structure is divided over 3 projects, e.g., proj-1, proj-2, proj-3, and there is also a shared folder that holds code shared among the projects.
I had a similar situation. Rather than a bunch of hunt-and-peck for linked resources, which tends to break the ability to reuse the .*project files elsewhere, I made a 'workspace setup' script that just symlinked the sources into the directories where their projects were. That way the default Eclipse mechanisms (build all source within a tree) just work out of the box.
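A minimal sketch of such a setup script, using the folder names from the question (the source root and the project-to-folder mapping are assumptions; adjust them to your own tree):

    # Symlink the real source folders into each Eclipse project directory
    # so CDT's default "build everything under the project" just works.
    import os

    SOURCE_ROOT = "/work/src"   # assumed location of the real folders
    PROJECTS = {
        "proj-1": ["application", "drivercode", "productcode"],
    }

    for project, folders in PROJECTS.items():
        os.makedirs(project, exist_ok=True)
        for folder in folders:
            target = os.path.join(SOURCE_ROOT, folder)
            link = os.path.join(project, folder)
            if not os.path.islink(link):
                os.symlink(target, link)

The .project files then contain no machine-specific paths, so they can be committed and reused as-is.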
I have found one way, but it is quite cumbersome.
I can create the structure I want using linked resource folders and files.
However, this means I need to go through all the dialogs for each folder/file in order to add them to the list. I hope there is another way, though, so I'll not accept my own answer as of yet.
Eclipse CDT plays well with existing projects.
I guess you probably also have a manually maintained Makefile? Then you only need to use File -> Import -> C/C++ -> Existing Code as Makefile Project.
This will leave all your source where it was, and team members who prefer not to use Eclipse can still use whatever they want and build from the command line.

Should I put my output files in source control?

I've been asked to put every single file in my project under source control, including the database file (not the schema, the complete file).
This seems wrong to me, but I can't explain why. Every resource I find about source control tells me not to put generated output files in a source control system. And I understand; they're not "source" files.
However, I've been presented with the following reasoning:
Who cares? We have plenty of bandwidth.
I don't mind having to resolve a conflict each time I get the latest revision, it's just one click
It's so much more convenient than having to think about good ignore files
Also, if I have to add an external DLL to the bin folder now, I can't forget to put it in source control, since the bin folder is no longer ignored.
The simple solution for the last bullet-point is to add the file in a libraries folder and reference it from the project.
Please explain if and why putting generated output files under source control is wrong.
You haven't explained what "the database file" is.
I would certainly include 3rd-party libraries in source control, as they're necessary for the build and it's good to have a way of reproducing a build at a later time with the library versions you used at that particular moment. But yes, those libraries should be included from a "libraries" folder rather than the output directory.
I wouldn't generally include my own libraries built from the sources elsewhere in the same repository - although I have been in situations where that's been worth doing, where some projects didn't use the "latest and greatest" version of a common library, but just occasionally updated.
The most important practical argument I'd give against including everything, in a world where disk, processor and network are considered free and instantaneous, is that it makes it harder to tell what really changed for any given commit. It's easier to look down a list of 3 source files than 3 source files and 150 binaries from the obj/bin directories.
Generated output files (in general) are "dangerous" in a VCS because:
what you actually need to version is how to regenerate them: the day you need to update them, chances are you won't remember how to do it
they can contain some privately generated file which makes them work on the committer's desktop, but not on a client's ("works on my machine" TM syndrome)
some generated files are not easily stored as deltas (binaries especially), making them consume a lot of space (and the topic of cleaning up that space will come up someday...)
External libraries are not generated directly by your project, and can be put in a VCS, although external repositories like a public Maven repo are better at this kind of management.
Do we also put in compiled object files such as class files, executables, and DLLs built from our source? What about when we're doing serious volume testing and that database becomes many gigabytes or terabytes in size?
The clue is in the name: it's Source Code Management System.
I can understand the simplicity of putting everything in; it makes it less likely that a developer forgets some important file. But if you're doing regular automated builds, then surely that gets picked up anyway?
I think the key phrase is here:
It's so much more convenient than having to think about good ignore files
Are you explicitly forbidden from having good ignore files? My guess is that you are already excluding .exe and .class (or whatever) files. Suppose you did take the trouble to exclude your database; would that be a problem? Why? It's a conscious action that you are choosing to take for the common good. In Eclipse it's a couple of seconds' work to add a new file type to the workspace's CVS ignore rules for all projects.
A rule of "no ignore files" is almost self-evidently absurd. Once you have the freedom to have some ignore files, why not just use them intelligently to exclude the DB? Who is inconvenienced? Only yourself, if anyone, and you're prepared to do the extra work.
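If you want to check what build output has already slipped into the repository, a quick audit sketch (assuming a git repo; the extension list is just an example) is:

    # List tracked files that look like generated output.
    import subprocess

    tracked = subprocess.run(["git", "ls-files"], capture_output=True,
                             text=True, check=True).stdout.splitlines()
    suspicious = [f for f in tracked
                  if f.endswith((".dll", ".exe", ".obj", ".class"))
                  or "/bin/" in f or "/obj/" in f]
    for f in suspicious:
        print(f)

Anything it prints is a candidate for removal plus an ignore rule.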

Do you version "derived" files?

Using online interfaces to a version control system is a nice way to have a published location for the most recent versions of code. For example, I have a LaTeX package here (which is released to CTAN whenever changes are verified to actually work):
http://github.com/wspr/pstool/tree/master
The package itself is derived from a single file (in this case, pstool.tex) which, when processed, produces the documentation, the readme, the installer file, and the actual files that make up the package as it is used by LaTeX.
In order to make it easy for users who want to download this stuff, I include all of the derived files mentioned above in the repository itself as well as the master file pstool.tex. This means that I'll have double the number of changes every time I commit because the package file pstool.sty is a generated subset of the master file.
Is this a perversion of version control?
@Jon Limjap raised a good point:
Is there another way for you to publish your generated files elsewhere for download, instead of relying on your version control to be your download server?
That's really the crux of the matter in this case. Yes, released versions of the package can be obtained from elsewhere, so it really does make more sense to only version the non-generated files.
On the other hand, @Madir's comment that:
the convenience, which is real and repeated, outweighs cost, which is borne behind the scenes
is also rather pertinent in that if a user finds a bug and I fix it immediately, they can then head over to the repository and grab the file that's necessary for them to continue working without having to run any "installation" steps.
And this, I think, is the more important use case for my particular set of projects.
We don't version files that can be automatically generated using scripts included in the repository itself. The reason is that after a checkout, these files can be rebuilt with a single click or command. In our projects we always try to make this as easy as possible, thus preventing the need to version these files.
One scenario I can imagine where versioning them could be useful is 'tagging' specific releases of a product for use in a production environment (or any non-development environment) where the tools required for generating the output might not be available.
We also use targets in our build scripts that can create and upload archives with a released version of our products. These can be uploaded to a production server, or to an HTTP server for downloading by users of your products.
I am using TortoiseSVN for small-system ASP.NET development. Most code is interpreted ASPX, but there are around a dozen binary DLLs generated by a manual compile step. While it doesn't make a lot of sense in theory to have these versioned as source, it certainly makes it convenient to ensure they are correctly mirrored from the development environment onto the production system (one click). Also, in case of disaster, the rollback to the previous step is again one click in SVN.
So I bit the bullet and included them in the SVN archive - the convenience, which is real and repeated, outweighs cost, which is borne behind the scenes.
Not necessarily, although best practices for source control advise that you do not include generated files, for obvious reasons.
Is there another way for you to publish your generated files elsewhere for download, instead of relying on your version control to be your download server?
Normally, derived files should not be stored in version control. In your case, you could build a release procedure that created a tarball that includes the derived files.
As you say, keeping the derived files in version control only increases the amount of noise you have to deal with.
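As a concrete sketch of such a release step (the build/ directory and the version string are assumptions; wire them to your real build):

    # Bundle the derived files into a versioned tarball for distribution,
    # instead of committing them to the repository.
    import tarfile
    from pathlib import Path

    version = "1.2.3"  # e.g. read from a tag or a version file
    with tarfile.open(f"pstool-{version}.tar.gz", "w:gz") as tar:
        for f in Path("build").iterdir():
            tar.add(f, arcname=f.name)

Users grab the tarball, and the repository stays limited to the true sources.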
In some cases we do, but it's more of a sysadmin type of use case, where the generated files (say, DNS zone files built from a script) have intrinsic interest in their own right, and the revision control is more linear audit trail than branching-and-tagging source control.

Solution deployment, CM, InstallShield

People,
We have 4 or 5 utilities that work in conjunction with our application. These utilities are .bat files, VB apps, PowerBuilder apps, etc. I am trying to manage these utils in source control, and am trying to figure out a better way to assign versions to them. Right now, the developers use the version control system's metadata -- specifically labels -- to store the version number of the tool.
My goal is to have individual InstallShield packages for each utility, and an easy means to manage and assign version numbers to these packages.
Would you recommend a separate .ini file with the info, storing the info in the InstallShield .ism file itself, or just using the metadata from the version control tool?
UPDATE:
I like the idea, Orion. I have one concern, though: the script that increments the version number can't be intelligent enough to increment the major number, right? E.g., if one of the utils is at version 1.2.3 and we are at a point where the new version should be 2.0.0, the script may not be able to handle this.
I think this has a lot to do with our branching techniques -- we don't have any. The folks thought that since the utils are so small, the source may not need branches.
PowerBuilder in particular has a nice trick you can do to incorporate the build number from an ini file into the compiled application.
Details here: http://www.pbdr.com/pbtips/ex/autorev.htm
We have an ini file in source control that stores the build number, and its value is used in our build scripts to determine what label to apply to the source tree after a successful build. This works very nicely for our needs. When we branch, we do have to manually edit the file to the proper number, though.
I managed our build system at my last job, which seemed to have some parallels to what you're asking.
There were ~30 C++ projects which needed compiling, and various .NET/Java things, and the odd perl script.
This was all built on our build machine using NAnt - If I were doing it today I'd use rake, but the idea is the same.
We basically had an auto-incrementing build number which was stored in a version.txt file in the root of the repository.
Each time we did a build (automatically each night, or on demand if necessary), the script would increment this number and check the file back into source control.
All the other apps referenced this file for their version number, or, for things that didn't support working like this, the script would set environment variables or perform other workarounds.
I'm pretty sure that our InstallShield programs referenced an environment variable for their version number, but we deprecated them in favour of WiX, as InstallShield really did suck.
In the case of Visual Studio, we'd grep/replace the number within the .csproj files and check them back in.
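A minimal sketch of that nightly bump (assuming version.txt holds a plain integer build number and the repo is Subversion; the major.minor part stays a manual decision, which also addresses the concern in the question's update):

    # Increment the shared build number and commit it back.
    import subprocess
    from pathlib import Path

    vfile = Path("version.txt")
    build = int(vfile.read_text().strip()) + 1
    vfile.write_text(f"{build}\n")
    subprocess.run(["svn", "commit", "-m", f"Automated build {build}",
                    str(vfile)], check=True)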
Hope this gives you some ideas
Using the metadata from your version control system should keep things simpler. It's how your developers already use the system, and there is no additional file to maintain. My personal experience has taught me to version the satellite applications with the same version as the main app. K.I.S.S.