Cannot run UIMA Ruta programs (imports not working?) - eclipse

I am completely new to Ruta (and to Java).
I installed Eclipse, the Maven plugin and UIMA Ruta on my computer. I followed the instructions from the UIMA Ruta Guide and Reference. Eclipse shows I have got UIMA Runtime 2.10.2, UIMA Tools 2.10.2, UIMA Ruta Workbench 2.6.1 and UIMA-AS Deployment Descriptor Editor 2.10.3.
But now it turns out that I cannot write (well, compile/run) even the simplest program using Ruta because something is wrong with the imports.
When I write "PACKAGE uima.ruta.example", a red circle appears saying "The package declaration does not match the project structure" -- even if there is no other line in my program.
When I try to compile and run a simple program on an input file (right click on file > UIMA Ruta > Quick Ruta), nothing happens.
I suppose some important files simply haven't been downloaded onto my computer. When I explore the directory where I (think I) installed everything, I see there are loads of different "uima" and "uimaj" packages in there, but I cannot find any packages called 'ruta' or 'ruta.example' or similar.
What should I do? Where can I get the 'ruta' library? Does the 'ruta.example' library really exist or is it used in the book just as an example?
(Actually I would also be happy to receive an answer to the question "Why in God's name should I download an environment, install a plugin for it, install a subplugin for it, create a project for it and adjust its settings before writing some programs, instead of just installing and/or compiling some single stuff and just running my program with it in the command line?", but since such a way has not appeared yet officially (has it?), I suppose there should be some serious reasons for that.)

If you use a declaration like PACKAGE uima.ruta.example; the file must also be located in the matching package (directory). So, if your Ruta file is named MyExample.ruta, it should be located at script/uima/ruta/example/MyExample.ruta. If the Ruta Workbench is installed correctly and you create a default UIMA Ruta project, you can execute the Ruta file via the file's context menu. No additional resources have to be downloaded.
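To make the package/directory rule concrete, here is a minimal Java sketch that maps a PACKAGE declaration and a script name to the location described above. RutaScriptLocation and expectedPath are hypothetical names made up for illustration; they are not part of UIMA Ruta. The sketch assumes the default project layout with a "script" source folder.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration only; not part of UIMA Ruta. */
final class RutaScriptLocation {

    /**
     * Returns the path (relative to the project root) where a script with the
     * given PACKAGE declaration is expected. Assumes the default "script" folder.
     */
    static Path expectedPath(String packageName, String scriptName) {
        Path dir = Paths.get("script");
        if (!packageName.isEmpty()) {
            // Each dot-separated package segment becomes one directory level.
            for (String segment : packageName.split("\\.")) {
                dir = dir.resolve(segment);
            }
        }
        return dir.resolve(scriptName + ".ruta");
    }

    public static void main(String[] args) {
        // Prints the expected location using the platform's path separator.
        System.out.println(expectedPath("uima.ruta.example", "MyExample"));
    }
}
```

If the file sits anywhere else (for example directly in the script folder), the editor reports that the package declaration does not match the project structure.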
Don't forget to add example input files into the "input" directory. These will be processed and the result will be placed in the "output" directory.
If you use the Ruta library in a Maven project (that is, a custom project rather than one run through the Ruta Workbench in Eclipse), you have to add Ruta as a dependency in the pom.xml. Maven then downloads all transitive dependencies.
<dependency>
    <groupId>org.apache.uima</groupId>
    <artifactId>ruta-core</artifactId>
    <version>2.6.1</version>
</dependency>

Related

Define a Java 9 multi-moduled project in Eclipse

I'm trying out Java 9 Jigsaw module system (no module experience yet) and would like to use it for capsuling the classes within my project, but it's confusing.
According to this article it should be possible to have multiple modules within ONE project. I made a new project in Eclipse Oxygen (Java 9 is supported) with the same structure as shown in the article. But Eclipse keeps telling me that I must not have more than one module-info.java in a project.
I really don't know how to tell Eclipse that it should use the "multi-module-mode". And I really would appreciate not having to create a new project for every single module.
This works:
This does not:
But according to this article something like that should work:
And how about deployment of a modularized project with Eclipse? There is nothing to see about the new jmod extension. Do I still export it as a runnable JAR file like before?
Note that my questions are about working in the IDE, not on the command line (this should be possible with an IDE, right?). Thank you for enlightening me.
Currently, Eclipse requires you to create a separate project for each module (e.g., because each module has its own Java Build Path).
To understand this design decision, consider that Java modules correspond to OSGi bundles / Eclipse plug-ins, and it has always been the convention to have a separate project for each bundle/plug-in. If you come from the Maven world, you would probably expect a deeper folder structure instead. But modules are self-contained, and combining several modules into one project would only add an additional folder level without meaning. However, Eclipse supports nested projects and so-called working sets if you need an additional folder level.
Exporting modules as images is planned for Eclipse 2019-03 (4.11), to be released on March 20, 2019 (see Eclipse bug 518445). Exporting modules as JARs that can be used on the module path (-p) already works (see my video).
I don't know if this question is still open for an answer, but you can solve this problem by simply removing all source folders from the build path. At least this works in Eclipse 2021-12.
As you can see this is a demo project from the Official Gradle Guide Book and it has multiple modules. Each module has its own module-info.java.
[Screenshot: project structure in IntelliJ IDEA]
If I open this project in Eclipse it will give me the 'duplicated entries on module-info.java' error.
[Screenshot: Eclipse shows the error]
But if I delete all the source folders on the build path, the error is gone and the project can be built and run without problem.
[Screenshot: project properties, Java build path]
The only problem is that you have to build the project with Gradle so that it produces the .jar of each module, and you then have to include those .jar files in the libraries.
[Screenshot: include all the .jar files in libraries]
I think this is probably the same solution mentioned by howlger above.

Apache UIMA Ruta Workbench with custom ruta-core

In our corpus we often find data that is alphanumeric and needs to be parsed as a single token (for example file hashes, email addresses, etc.). We have created our own ruta-core version by reworking the JFlex definition. Is there a way we can still work with this new version of ruta-core in the Workbench?
If you use simple Ruta projects, you would need to replace the ruta.engine plugin with a different jar containing your ruta-core version. The clean way would be to build a complete update site with your version.
You could maybe also set your ruta-core jar in the classpath of your ruta launch configurations.
If you use maven-based projects, you can set the dependency to your version of ruta-core, which should then be used in the launch delegate.
For your use case, I would not use your own version of ruta-core at all. You could simply write your own version of the TokenLexer, as you probably did. Then, you can configure the utilized TokenLexer in the RutaEngine as there is a configuration parameter for setting it. Thus, there is already some functionality to customize the JFlex definition without building your own ruta-core.
DISCLAIMER: I am a developer of UIMA Ruta

Problems loading Ruta TYPESYSTEM

I'm having trouble importing a typesystem into Ruta. I have two projects in my workspace:
UIMA project located at ./workspace/UIMA_NLP/
Ruta project located at ./workspace/RUTA_CLARIFY/
I'm trying to load the Type System Definition file: ./workspace/UIMA_NLP/descriptors/type_system/nlpTypes.xml created in the UIMA project into the Ruta script.
I've been able to do this successfully if I copy the Type System Definition into the Ruta project at ./workspace/RUTA_CLARIFY/descriptor/nlpTypes.xml and load it in the Ruta script with the following:
TYPESYSTEM nlpTypes;
However, when trying to import directly from the UIMA_NLP project I get 'error nlpTypes not found' in the editor. I've tried adding the fully qualified directory of the Type System Descriptor to the descriptorPaths field in the generated ruta engine without any success.
I've tried the following type system imports in the script after adding the path to the descriptor paths:
TYPESYSTEM type_system.nlpTypes;
TYPESYSTEM descriptors.type_system.nlpTypes;
TYPESYSTEM UIMA_NLP.descriptors.type_system.nlpTypes;
What is strange is that I can add the nlpTypes.xml Type System Descriptor in the Type System generated by the Ruta script using the Imported Types and Import By Location and the types defined by the imported nlpTypes.xml appear in the Types. I can also type them in the editor when using auto-complete and the types appear. However, I will still get an error in the editor that 'Type "typename" is not defined in this script/block' even after using the auto-complete to complete the type name. Because of this I suspect I am not using the TYPESYSTEM import correctly for this case.
Am I using the TYPESYSTEM import incorrectly? Or is the only way to use my predefined Type System Descriptor to copy it to the Ruta project?
Adding the absolute path to the folder of the type system to the descriptorPaths configuration parameter of your analysis engine descriptor should work. However, in which XML descriptor did you add it? If it is the generated descriptor of your script, then the modification will be overwritten by the Workbench. You need to add the additional path to the template descriptor BasicEngine.xml of the project.
If descriptorPaths contains the path to the descriptors folder of the other project, then the correct import would read: TYPESYSTEM type_system.nlpTypes;
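The lookup described here can be sketched in a few lines of Java: the dotted import name is turned into a relative file path, which is then searched for under each descriptor path. This is only an illustration of the rule as stated in the answer, not Ruta's actual implementation; TypeSystemLookup, toRelativePath and resolve are hypothetical names, and the two folders in main are taken from the question.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Hypothetical illustration of the import lookup; not Ruta's own code. */
final class TypeSystemLookup {

    /** Turns "type_system.nlpTypes" into the relative path type_system/nlpTypes.xml. */
    static Path toRelativePath(String importName) {
        return Paths.get(importName.replace('.', '/') + ".xml");
    }

    /** Returns the first matching descriptor file under the given descriptor paths, if any. */
    static Optional<Path> resolve(String importName, List<Path> descriptorPaths) {
        Path relative = toRelativePath(importName);
        return descriptorPaths.stream()
                .map(base -> base.resolve(relative))
                .filter(Files::isRegularFile)
                .findFirst();
    }

    public static void main(String[] args) {
        // Assumed descriptorPaths: the Ruta project's own folder plus the UIMA project's folder.
        List<Path> descriptorPaths = Arrays.asList(
                Paths.get("workspace/RUTA_CLARIFY/descriptor"),
                Paths.get("workspace/UIMA_NLP/descriptors"));
        Optional<Path> found = resolve("type_system.nlpTypes", descriptorPaths);
        System.out.println(found.map(Path::toString).orElse("not found on descriptorPaths"));
    }
}
```

Under this reading, TYPESYSTEM nlpTypes; only matches a file placed directly in one of the listed folders, while the variants prefixed with descriptors or UIMA_NLP would look for subfolders that do not exist below the listed path.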
Normally, you would reference the UIMA project from the Ruta project: Right-click on the Ruta project->Properties->Project References->Check the UIMA project
The default folders of referenced projects are automatically included in the descriptorPaths when the analysis engine descriptor is built by the Workbench. In the case of UIMA PEAR projects, this would be the desc folder. For Java projects, this would be the output folder, e.g., bin or target/classes.
The error you report is indeed strange. It sounds like a problem with the project setup or with descriptors that are not up to date. Try to clean the project: Menu->Project->Clean...
The error may be a false positive due to the project setup. Can you launch the script and get results in the output folder?
I personally recommend using simple Ruta projects only for prototyping. For serious rule projects, especially if there are dependencies on other projects, I'd rather recommend a Maven-built project. There is also an archetype for Ruta projects to ease the setup.
DISCLAIMER: I am a developer of UIMA Ruta

UIMA Ruta Workbench picks up an incorrect descriptor path

I am new to UIMA Ruta. I am trying to run Main.ruta in the ExampleProject that came with the Ruta 2.1.0 source release, using Eclipse 3.7 and following the instructions. It seems the Ruta Workbench picks up an incorrect descriptor path. I tried setting new arguments in the "Run Configurations", but that doesn't work.
Due to some restrictions at work, the UIMA Ruta plugins were installed manually.
Please help. Thank you!
An error because of incorrect descriptor paths indicates that the plugins are correctly installed, but the descriptors themselves are not updated.
The descriptors use some absolute paths in their parameter configuration. Try to build the project twice in order to regenerate all descriptors, e.g., with Menu->Project->Clean. Then investigate the descriptor ExampleProject/descriptor/uima/ruta/example/MainEngine.xml and check if the absolute paths are correct (pointing to your workspace project), especially the parameter scriptPaths.
Thanks Peter. It turns out that the Ruta Workbench can't handle spaces in descriptor paths. I fixed the space problem and then rebuilt the project. Now it works.
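For anyone who wants to automate the check suggested in the answer, here is a rough Java sketch that reads a generated descriptor, lists the values of a configuration parameter such as scriptPaths, and flags entries that do not exist or contain whitespace (the cause reported in the follow-up). DescriptorPathCheck and its methods are hypothetical names, not part of UIMA Ruta. The sketch assumes the usual UIMA descriptor layout, with nameValuePair elements holding a name and one or more string values, and uses only the JDK's built-in XML parser.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical diagnostic helper; not part of UIMA Ruta or the Workbench. */
final class DescriptorPathCheck {

    /** Returns all string values of the named configuration parameter in the descriptor XML. */
    static List<String> parameterValues(String descriptorXml, String parameterName) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(descriptorXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse descriptor XML", e);
        }
        List<String> values = new ArrayList<>();
        NodeList pairs = doc.getElementsByTagName("nameValuePair");
        for (int i = 0; i < pairs.getLength(); i++) {
            Element pair = (Element) pairs.item(i);
            NodeList names = pair.getElementsByTagName("name");
            if (names.getLength() == 0
                    || !parameterName.equals(names.item(0).getTextContent().trim())) {
                continue; // a different parameter
            }
            NodeList strings = pair.getElementsByTagName("string");
            for (int j = 0; j < strings.getLength(); j++) {
                values.add(strings.item(j).getTextContent().trim());
            }
        }
        return values;
    }

    /** True if the path contains a space or any other whitespace character. */
    static boolean containsWhitespace(String path) {
        return path.chars().anyMatch(Character::isWhitespace);
    }

    public static void main(String[] args) throws Exception {
        // args[0]: path to a generated descriptor, e.g. the MainEngine.xml mentioned above.
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (String path : parameterValues(xml, "scriptPaths")) {
            System.out.println(path
                    + " | exists: " + Files.isDirectory(Paths.get(path))
                    + " | whitespace: " + containsWhitespace(path));
        }
    }
}
```

Run it with the descriptor file as the only argument; any line reporting a missing directory or whitespace points at a path worth fixing before rebuilding the project.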

Eclipse: script compiler as part of a project

This question is not limited to lex and yacc, but how can I add a custom script compiler as part of a project? For example, I have the following files in the project:
grammar.y
grammar.l
test.script
The binary 'script_compiler' will be generated from grammar.y and grammar.l, compiled with lex, yacc and g++. I then want to use that generated script_compiler to compile test.script into CompiledScript.java. This file should be compiled along with the rest of the Java files in the project. This setup is possible with Xcode or make, but is it also possible with Eclipse alone? If not, how about together with a Maven plugin?
(I might set up the script compiler as a separate project, but it would be nice if they could be put in the same project so that changes to the grammar files are applied immediately.)
Thanks in advance for your help!
You can add a custom "Builder" from the project properties dialog. This can be an Ant script (with an optional target) or any other script or executable.
There are also Maven plugins for Ant and other scripting languages.
If you just want to run an external program in Maven this is what you want: http://mojo.codehaus.org/exec-maven-plugin/ -- you can then run Maven targets from your IDE or command line and it should do the right thing either way.
To integrate with the normal compilation, bind the plugin to the "generate-sources" phase and add the location where the Java files are generated to the "sourceRoot" option of the exec plugin. That way the compiler will pick them up.
Ideally you generate the code into a folder "target/generated-sources/MY_SCRIPT_NAME". That is the standard location for generated sources in the Maven world and e.g. IntelliJ IDEA will pick up source files inside of that location. Note that this doesn't work if the files are directly in "target/generated-sources".
The other option is to write your own Maven plugin, which is actually quite easy as well. See e.g. https://github.com/peterbecker/maven-code-generator