UIMA Ruta: Check if feature is empty/undefined

Within a UIMA Ruta script, I would like to check whether a feature of an annotation has been set (defined, not null, whatever you call it). The feature is not of a primitive type; it is itself an "Annotation".
Is there a way to do this so that the check can be used to prevent new annotations from being generated when the feature is not set?

You should be able to simply write it down in the feature dot notation with a comparison against null:
MyAnnotation.complexFeature != null;
a:MyAnnotation{a.complexFeature != null};
(This requires a recent Ruta version; it should work fine in Ruta 2.6.1.)
DISCLAIMER: I am a developer of UIMA Ruta

Related

Reset a local ANNOTATION variable?

I'm using an ANNOTATION variable within a BLOCK statement. How do I initialize and/or reset an ANNOTATION variable to an "empty" value?
All variables in UIMA Ruta are global and thus are not automatically reset within a block.
As the comments of the question correctly mention, resetting the values of the variables is currently done by a separate rule at the beginning of the block using an implicit action or the ASSIGN action.
DISCLAIMER: I am a developer of UIMA Ruta

Are some extra settings in RUTA script needed to detect annotations with the same begin and end attributes?

I have a xmi output from Tika UIMA Annotator which is passed to a UIMA Ruta script for further processing. I was able to successfully import the corresponding type system and detect any MarkupAnnotations covering some fragment of text.
However, the input contains some MarkupAnnotations that have the same value for begin and end (so they do not cover any text). Those annotations are not recognized by the Ruta engine.
For example, the following rule is not fired:
MarkupAnnotation.name=="img" {->MARK(IMAGE)};
However, in the CAS Viewer I see a lot of MarkupAnnotations whose feature name is equal to "img", and all of them have equal begin and end attributes.
Should I make some extra specifications in the script to catch such annotations?
Matching on annotations of length 0 (begin == end) is not supported by UIMA Ruta (2.6.1).
There are various reasons for this. For example, sequential matching becomes problematic, since such an annotation can both precede and follow itself.
DISCLAIMER: I am a developer of UIMA Ruta

How to access Array in UIMA-RUTA

I have an Annotation class with a String Array as one of its fields.
I want to add string elements to that String Array and remove them from it within a Ruta script.
I searched for FSArray but didn't find anything.
Please help me solve this problem.
As of UIMA Ruta 2.5.0, operations on FSArrays and StringArrays are not yet supported and still an open issue: UIMA-4399
You need to either wrap the logic in an additional analysis engine or write a language extension for Ruta.
DISCLAIMER: I am a developer of UIMA Ruta

Using Apache UIMA Ruta from my own annotator

I have a series of UIMA Ruta rules that I wish to run from within my own UIMA annotator. This is described here, but I can't get it to work: http://uima.apache.org/d/ruta-current/tools.ruta.book.html#ugr.tools.ruta.integration
When I try to run the annotator (from within a JUnit test, which I have used with other UIMA annotators successfully in the past), I get an error telling me that one of the Ruta basic annotation types (org.apache.uima.ruta.type.TokenSeed) is used in the Java code but isn't defined in the XML.
I've added the absolute path to the Ruta type system (BasicTypeSystem.xml and InternalTypeSystem.xml) to the descriptorPaths parameter (as detailed here: http://uima.apache.org/d/ruta-current/tools.ruta.book.html#ugr.tools.ruta.ae.basic.parameter.descriptorPaths), but that doesn't seem to make a difference.
I've had a look through the Ruta source code and couldn't figure out where I was going wrong.
Has anyone successfully got a Ruta script to run from within a UIMA annotator? How did you manage to get it working?
The problem is caused by the fact that the type system used by your analysis engine does not contain the types UIMA Ruta needs. The error mentions the seeding types because the initial annotations are added at the beginning. Even without seeding, more errors will occur because of the missing types like RutaBasic.
Adding the BasicTypeSystem to the type system used in your analysis engine should solve the problem.
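A minimal sketch of what this could look like, assuming the analysis engine is configured through an XML descriptor and the ruta-core jar is on the classpath so that the import can be resolved by name. The import name below is an assumption about where the descriptor sits inside ruta-core and should be checked against the Ruta version in use:

```xml
<!-- Type system section of your own analysis engine descriptor.
     Assumption: BasicTypeSystem.xml is resolvable by this name from ruta-core. -->
<typeSystemDescription>
  <imports>
    <import name="org.apache.uima.ruta.engine.BasicTypeSystem"/>
  </imports>
</typeSystemDescription>
```

If the engine is assembled in Java code rather than from a descriptor, the same type system has to be merged into the one the CAS is created with.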

How can I add a custom condition to an existing RUTA project? Started, but am stuck

I want to add a custom UIMA RUTA rule condition. I have an existing UIMA Ruta project in Eclipse. So far I created a source file in the same project with a basic annotator stub:
package mynamespace.extensions;
[imports]
public class MyNewCondition extends AbstractRutaCondition {
    private final String para1;
    public MyNewCondition(String para1) {
        super();
        this.para1 = para1;
    }
    @Override
    public EvaluatedCondition eval(AnnotationFS annotation,
            RuleElement element, RutaStream stream, InferenceCrowd crowd) {
        // TODO Auto-generated method stub
        // Compare string contents with equals(); == only compares references.
        if ("hfoo".equals(para1))
            return new EvaluatedCondition(this, true);
        else
            return new EvaluatedCondition(this, false);
    }
    public String getPara() {
        return para1;
    }
}
The file compiles to the target/classes/... folder, but when I create a RUTA script:
DECLARE Test;
SW{MyNewCondition("foo") -> MARK(Test)};
... Eclipse tells me that "MyNewCondition" is not defined and when I run it I get: "Error in line 40, "(": found no viable alternative" on the console. I presume I need to do some further import, but do not know how. I tried to work from the Extension example project in the Github repository, but I do not know where to start there as the script file does not contain any further imports, but the associated xml descriptor files do. But as these get automatically generated I do not know whether this is what I should change or it is something else.
I also tried importing the same new condition type from a second project via Eclipse's build path options, but no luck there either.
Can someone help? Thanks.
You need at least three classes for adding a new condition that also is resolved in the UIMA Ruta Workbench:
An implementation of the condition as you did in your question
An implementation of IRutaConditionExtension, which provides the condition implementation to the engine
An implementation of IIDEConditionExtension, which provides the condition for the UIMA Ruta Workbench
The condition itself contains only the functionality that should be added to the language. The analysis engine, of course, knows nothing about external implementations, which results in a strange parse exception such as the one about "(" that you saw. That should be improved at some point.
The analysis engine provides a configuration parameter, additionalExtensions, that lists all known extensions to the language. If you are not using the UIMA Ruta Workbench, you need to add your implementation of IRutaConditionExtension to this parameter yourself.
The implementation of IIDEConditionExtension provides the functionality needed by the UIMA Ruta Workbench, that is, the syntax check, syntax highlighting and so on. It also enables the Workbench to generate correct descriptors by adding your implementation of IRutaConditionExtension to the respective parameter. To be available in the Workbench, this extension has to be implemented in an Eclipse plugin that is installed in your UIMA Ruta Workbench Eclipse instance. The plugin extends an extension point that refers to both your implementation of IRutaConditionExtension and your implementation of IIDEConditionExtension.
There is an example project that provides implementations for all kinds of language elements. This project contains the implementations for the analysis engine as well as those for the UIMA Ruta Workbench, and is therefore an Eclipse plugin (mind the pom file).
Concerning the ExampleCondition condition extension, there are four important spots/classes:
ExampleCondition.java provides the implementation of the new condition, which evaluates dates
ExampleConditionExtension.java provides the extension for the analysis engine. It knows the name of the condition, its implementation, can create new instances of that condition, and is able to verbalize the condition for the explanation components.
ExampleConditionIDEExtension provides the syntax check for the editor and the keyword for syntax coloring.
plugin.xml defines the extension for the Workbench:
<extension point="org.apache.uima.ruta.ide.conditionExtension">
  <condition
      class="org.apache.uima.ruta.example.extensions.ExampleConditionIDEExtension"
      engine="org.apache.uima.ruta.example.extensions.ExampleConditionExtension">
  </condition>
</extension>
If you do not use the UIMA Ruta Workbench or only want to apply the rules in UIMA pipelines, you only need ExampleCondition and ExampleConditionExtension, and you need to add org.apache.uima.ruta.example.extensions.ExampleConditionExtension to the additionalExtensions parameter of your UIMA Ruta analysis engine (descriptor).
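As a rough sketch, assuming the Ruta analysis engine is configured through an XML descriptor that already declares additionalExtensions as a multi-valued String parameter, the setting could look like the fragment below. Any other settings the descriptor already contains stay alongside it:

```xml
<!-- Inside the analysisEngineMetaData of the Ruta analysis engine descriptor.
     Assumption: additionalExtensions is declared there as a multi-valued String parameter. -->
<configurationParameterSettings>
  <nameValuePair>
    <name>additionalExtensions</name>
    <value>
      <array>
        <string>org.apache.uima.ruta.example.extensions.ExampleConditionExtension</string>
      </array>
    </value>
  </nameValuePair>
</configurationParameterSettings>
```

The extension class and the condition class both have to be on the classpath of the pipeline for the engine to load them.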
Adding new conditions through Java projects in the same workspace has not been tested yet. At the very least, Workbench support will be missing in that setup, because the Workbench picks up extensions through the extension point mechanism of Eclipse.