Removing markup from an annotation in UIMA Ruta - uima

I use the P tag (from the HTML Annotator) as PASSAGE. I want to exclude the markup from the annotation.
SCRIPT:
//-------------------------------------------------------------------
// SPECIAL SQUARE HYPHEN PARENTHESIS
//-------------------------------------------------------------------
DECLARE LParen, RParen;
SPECIAL{REGEXP("[(]") -> MARK(LParen)};
SPECIAL{REGEXP("[)]") -> MARK(RParen)};
DECLARE LSQParen, RSQParen;
SPECIAL{REGEXP("[\\[]") -> MARK(LSQParen)};
SPECIAL{REGEXP("[\\]]") -> MARK(RSQParen)};
DECLARE LANGLEBRACKET,RANGLEBRACKET;
SPECIAL{REGEXP("<")->MARK(LANGLEBRACKET)};
AMP{REGEXP("<")->MARK(LANGLEBRACKET)};
SPECIAL{REGEXP(">")->MARK(RANGLEBRACKET)};
AMP{REGEXP(">")->MARK(RANGLEBRACKET)};
DECLARE LBracket,RBracket;
(LParen|LSQParen|LANGLEBRACKET){->MARK(LBracket)};
(RParen|RSQParen|RANGLEBRACKET){->MARK(RBracket)};
DECLARE PASSAGE,TESTPASSAGE;
"<a name=\"para(.+?)\">(.*?)</a>"->2=PASSAGE;
RETAINTYPE(WS); // or RETAINTYPE(SPACE, BREAK,...);
PASSAGE{-> TRIM(WS)};
RETAINTYPE;
PASSAGE{->MARK(TESTPASSAGE)};
DECLARE TagContent,PassageFirstToken,InitialTag;
LBracket ANY+? RBracket{-PARTOF(TagContent)->MARK(TagContent,1,3)};
BLOCK(foreach)PASSAGE{}
{
Document{->MARKFIRST(PassageFirstToken)};
}
TagContent{CONTAINS(PassageFirstToken),-PARTOF(InitialTag)->MARK(InitialTag)};
BLOCK(foreach)PASSAGE{}
{
InitialTag ANY+{->SHIFT(PASSAGE,2,2)};
}
Sample Input:
<p class="Normal"><a name="para1"><h1><b>On the Insert tab, the galleries include items that are designed to coordinate with the overall look of your document. </b></a></p>
<p class="Normal"><a name="para2"><aus>On the Insert tab, the galleries include items that are designed to coordinate with the overall look of your document.</a></p>
<p class="Normal"><a name="para3">On the Insert tab, the galleries include items that are designed to coordinate with the overall look of your document.</a></p>
<p class="Normal"><a name="para4">On the Insert tab, the galleries include items that are designed to coordinate with the overall look of your document. </a></p>
<p class="Normal"><a name="para5">On the Insert tab, the <span>galleries</span> include items that are designed to coordinate with the overall look of your document.</a></p>
I get 5 PASSAGE annotations but only 2 TESTPASSAGE annotations. Why are there fewer TESTPASSAGE annotations? Also, InitialTag is not annotated at all.

When reproducing the given example, I get 5 PASSAGE annotations and 3 TESTPASSAGE annotations (the last three PASSAGE annotations). The other two PASSAGE annotations are not annotated with TESTPASSAGE because they start with a MARKUP annotation, which is not visible by default and makes the complete annotation invisible. To avoid this problem, you can make MARKUP visible or trim markup from the PASSAGE annotations (is this actually the main question?). Just extend your rules for the TRIM action:
RETAINTYPE(WS, MARKUP);
PASSAGE{-> TRIM(WS, MARKUP)};
RETAINTYPE;
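As a rough illustration of what trimming whitespace and markup does to a span's offsets, here is a minimal sketch in plain JavaScript. It is not Ruta and not how Ruta is implemented; the function name trimSpan and the simple tag pattern are assumptions made only for this example.

```javascript
// Illustration only: shrink a [begin, end) span until it neither starts nor
// ends with whitespace or a tag-like "<...>" chunk. Hypothetical helper, not Ruta API.
function trimSpan(text, begin, end) {
  const leading = /^(?:\s+|<[^>]*>)/;   // whitespace or a tag at the start
  const trailing = /(?:\s+|<[^>]*>)$/;  // whitespace or a tag at the end
  let changed = true;
  while (changed && begin < end) {
    changed = false;
    const slice = text.slice(begin, end);
    const lead = slice.match(leading);
    if (lead) {
      begin += lead[0].length;
      changed = true;
      continue;
    }
    const trail = slice.match(trailing);
    if (trail) {
      end -= trail[0].length;
      changed = true;
    }
  }
  return { begin: begin, end: end, covered: text.slice(begin, end) };
}

const passage = "<h1><b>On the Insert tab. </b>";
const trimmed = trimSpan(passage, 0, passage.length);
console.log(trimmed.covered);
```

After trimming, the span starts at a visible token, which is the condition the rules that follow rely on.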
There are no InitialTag annotations because there are no TagContent annotations because there are no LBracket annotations in the example.
Btw, you could rewrite some rules:
PASSAGE{->MARKFIRST(PassageFirstToken)};
(LBracket # RBracket){-PARTOF(TagContent)-> TagContent};
DISCLAIMER: I am a developer of UIMA Ruta

Related

using ruta, annotate a line containing annotation and extract required data

Annotate a line containing specific annotations in order to extract text. Annotate line for Borrower and Co-Borrower and get their respective SSNs
Borrower Name: Alice Johnson SSN: 123-456-7890
Co-Borrower Name: Bob Symonds SSN: 987-654-3210
code
PACKAGE uima.ruta.test;
TYPESYSTEM utils.PlainTextTypeSystem;
ENGINE utils.PlainTextAnnotator;
EXEC(PlainTextAnnotator, {Line});
DECLARE Borrower, Name;
DECLARE BorrowerName(String value, String label);
CW{REGEXP("\\bBorrower") -> Borrower} CW{REGEXP("Name") -> Name};
Borrower Name COLON n:CW[1,3]{-> CREATE(BorrowerName, "label"="Borrower Name", "value"=n.ct)};
DECLARE SSN;
DECLARE BorrowerSSN(String label, String value);
W{REGEXP("SSN") -> SSN};
SSN COLON n:NUM[3,3]{-> CREATE(BorrowerSSN, "label"="Borrower SSN", "value"=n.ct)};
DECLARE Co;
CW{REGEXP("Co") -> Co};
DECLARE CoBorrowerName(String label, String value);
Co Borrower Name COLON n:CW[1,3]{-> CREATE(CoBorrowerName, "label"="Co-Borrower Name", "value"=n.ct)};
DECLARE BorrowerLine;
Line{CONTAINS(Borrower),CONTAINS(Name)->MARK(BorrowerLine)};
Please suggest how to annotate a line containing annotation and get specific label value for required annotation.
To spare yourself from handling the separate strings, you could gather all the indicators into a wordlist (i.e., a text file containing one indicator per line) and place it in your project's resources folder. Then you could just mark all the indicators with the desired indicator type:
WORDLIST IndicatorList ='IndicatorList.txt';
DECLARE Indicator;
Document{->MARKFAST(Indicator, IndicatorList )};
This would output Indicator helper annotations like "Borrower Name".
Once you have that, you could now iterate over the lines and find the target annotations.
DECLARE Invisible;
SPECIAL{-PARTOF(Invisible), REGEXP("[-]")-> Invisible};
BLOCK(line) Line{CONTAINS(Indicator)}{
//Ex. pattern: Borrower Name: Alice Johnson SSN: 123-456-7890
Indicator COLON c:CW[1,3]{-> CREATE(BorrowerName, "label"="Borrower Name", "value"=c.ct)} Indicator;
FILTERTYPE(Invisible);
Indicator COLON n:NUM[3,3]{-> CREATE(BorrowerSSN, "label"="BorrowerSSN", "value"=n.ct)};
REMOVEFILTERTYPE(Invisible);
}
Hope this helps.
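For comparison, the line pattern from the example input (indicator, colon, name, then SSN) can be sketched outside Ruta with a plain regular expression. This is a JavaScript illustration under the assumption that every line looks like the two sample lines; LINE_PATTERN and parseBorrowerLine are hypothetical names, not part of any library.

```javascript
// Illustration only: assumes lines shaped like
// "Borrower Name: Alice Johnson SSN: 123-456-7890".
const LINE_PATTERN =
  /^(Co-Borrower|Borrower) Name:\s*(.+?)\s+SSN:\s*(\d{3}-\d{3}-\d{4})\s*$/;

// Returns the label, name, and SSN for a matching line, or null otherwise.
function parseBorrowerLine(line) {
  const m = LINE_PATTERN.exec(line.trim());
  if (!m) {
    return null;
  }
  return { label: m[1] + " Name", name: m[2], ssn: m[3] };
}

const lines = [
  "Borrower Name: Alice Johnson SSN: 123-456-7890",
  "Co-Borrower Name: Bob Symonds SSN: 987-654-3210"
];
const parsed = lines.map(parseBorrowerLine);
console.log(parsed);
```

The trim() call plays a similar role to trimming whitespace from Line annotations: leading or trailing whitespace would otherwise prevent a match.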
Addition to Viorel's answer:
The PlainTextAnnotator creates annotations of the type Line, and these annotations cover the complete line, which means that leading or trailing whitespace is also included. As a consequence, the resulting annotations are not visible to the following rules. To avoid this problem, you could, for example, trim the whitespace in these annotations:
EXEC(PlainTextAnnotator, {Line});
ADDRETAINTYPE(WS);
Line{-> TRIM(WS)};
REMOVERETAINTYPE(WS);

Wrong xpath for nav html tag

I'm trying to find an element using XPath for the tag
<nav id="nav">... </nav>
this works:
WebElement navigationPane = firefox.findElement(By.className("nav"));
but this does not:
WebElement navigationPane =
firefox.findElement(By.xpath("//nav[@id='nav')]"));
How do I make it valid expression in xpath?
In the XPath expression, you meant to use @class instead of @id. Also, there is an extra parenthesis inside. Here is the fixed version:
//nav[@class = 'nav']
After finding an HTML block in your question (made it visible with an edit), I realized that the element has an id attribute set, not class. In this case, you should use the following expression:
//nav[@id = 'nav']
Note that By.id would be an easier and faster way to find the element:
WebElement navigationPane = firefox.findElement(By.id("nav"));

Multiple method overrides in Mason2

Building a website using Mason2. Each page should call 3 filters (methods, called by content):
<% $.filter1(args...) {{ %>
<% $.filter2(args...) {{ %>
<% $.filter3(args...) {{ %>
I have 3 different implementations of those filters (methods). The default set of those 3 filters is defined in the top-level /Base.mc component.
Now, for different parts of the site (different routes), say for
/a/all/paths>/...>
/b/.....
I need to use the above default set of filters/methods, but for other routes,
/c/...
/d/...
/e/...
I need to use another set of filter1, filter2, filter3, and for
/x/....
/y/...
I want to use the third set.
The methods can easily be redefined in lower-level components, but doing it that way (as in this question) does not honor the DRY principle, e.g. in each of
/c/Base.mc
/d/Base.mc
/e/Base.mc
I need to repeat the same 3
<%override filter1>
...
<%override filter2>
...
<%override filter3>
The question is:
How can I write the 3 different implementations of the methods only once, and how can I use them all at once?
I tried to make a component, like /set2.mi or /set3.mi, where I tried to override those filter methods, and called it in the needed /{c,d,e}/Base.mc as
<& /set2.mi &>
But this doesn't work.
How can I write the 3 different method implementations and call them at once in the needed deeper Base.mc? Is it possible?
In standard Perl I would probably use roles, so in the needed packages I would use something like:
with 'Set1';
with 'Set2';
Where the packages Set1 and Set2 would contain the implementations of the needed methods, or for dynamic loading I would use require ... . Is something like this possible in Mason2, or must I repeat the %override?
Hope this makes sense... if not, please add a comment and I will try to update the question.
EDIT
Example (to keep the code short, using only one filter instead of 3).
The /Base.mc
<%augment wrap><% inner() %></%augment>
% #this is the "default" MyHead filter
<%filter MyHead($arg)>
<!-- in the real code it is more complicated, but for an illustration it is enough -->
<h1 class="<% $arg %>"><% $yield->() %></h1>
</%filter>
When used in /a/index.mc as
% $.MyHead('big') {{
some head text
% }}
will output like:
<h1 class="big">some head text</h1>
Now, I have another MyHead:
<%filter MyHead($arg)>
<!-- in the real code it is more complicated - basically want output different thing -->
<h2 id="<% $arg %>"><% $yield->() %></h2>
</%filter>
If I add the above code to my /b/Base.mc it will work, and calling the MyHead filter in the /b/index.mc
% $.MyHead('modal') {{
some other text
% }}
will call the redefined filter and output what I want:
<h2 id="modal">some other text</h2>
The problem is,
I don't want to repeat the above filter code in N other Base.mc components, like /c/Base.mc and /d/Base.mc and so on.
How can I write the filter once and "use" it in many other components to "redefine" the default one?
One solution (not a nice one, and it produces spaghetti-like code) could be playing with the inheritance chain, as follows:
Remove your "default" filter from /Base.mc, so it contains only the <%augment wrap> block.
Create a component called, for example, BaseSetDefault.mc, put your "default" filter into it, and explicitly set the inheritance chain to the top-level /Base.mc:
<%augment wrap><% inner() %></%augment>
<%filter MyHead($arg)><h1 class="<% $arg %>"><% $yield->() %></h1></%filter>
<%flags>
extends => '/Base.mc'
</%flags>
Similarly, create another component, say /BaseSet2.mc, and put your "Set2" filters there, like:
<%augment wrap><% inner() %></%augment>
<%filter MyHead($arg)><h2 id="<% $arg %>"><% $yield->() %></h2></%filter>
<%flags>
extends => '/Base.mc'
</%flags>
Now, in /a/Base.mc and /b/Base.mc (everywhere you want the "default" set), change the chain:
<%flags>
extends => '/BaseSetDefault.mc'
</%flags>
and in /c/Base.mc and /d/Base.mc (everywhere you want the "Set2" filters), use:
<%flags>
extends => '/BaseSet2.mc'
</%flags>
From now, the inheritance chain for /c/index.mc will be:
/c/index.mc -> /c/Base.mc -> /BaseSet2.mc -> /Base.mc
and the execution would be done as
/Base.mc -augment-> /BaseSet2.mc -augment-> /c/Base.mc -main-> /c/index.mc
                    ^^^^^^^^^^^^ - defines the Set2 filters
and for the /a/index.mc inheritance chain
/a/index.mc -> /a/Base.mc -> /BaseSetDefault.mc -> /Base.mc
the execution
/Base.mc -augment-> /BaseSetDefault.mc -augment-> /a/Base.mc -main-> /a/index.mc
                    ^^^^^^^^^^^^^^^^^^ - defines the "Default" filters
It is not a very nice solution, but it works.
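The shape of that inheritance chain can be sketched as an analogy in JavaScript, where classes stand in for components and a method stands in for the MyHead filter. This is only an analogy under that assumption; it is not Mason code, and the class names simply mirror the component paths used above.

```javascript
// Analogy only: classes stand in for Mason components, myHead for the MyHead filter.
class Base {                          // plays the role of /Base.mc (wrap only, no filter)
  wrap(inner) {
    return inner;
  }
}

class BaseSetDefault extends Base {   // /BaseSetDefault.mc: the "default" filter, written once
  myHead(arg, content) {
    return `<h1 class="${arg}">${content}</h1>`;
  }
}

class BaseSet2 extends Base {         // /BaseSet2.mc: the "Set2" filter, written once
  myHead(arg, content) {
    return `<h2 id="${arg}">${content}</h2>`;
  }
}

// The per-directory Base components only pick a parent; they repeat no filter code.
class ABase extends BaseSetDefault {} // /a/Base.mc
class CBase extends BaseSet2 {}       // /c/Base.mc
class DBase extends BaseSet2 {}       // /d/Base.mc

console.log(new ABase().myHead("big", "some head text"));
console.log(new CBase().myHead("modal", "some other text"));
```

Each filter set lives in exactly one place, and choosing a set is reduced to a single "extends" line per directory.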

a simple Ruta annotator

I just started with Ruta and I would like to write a rule that will work like this:
it will try to match a word e.g. XYZ and when it hits it, it will then assign the text that comes before to the Annotator CompanyDetails.
For example :
This is a paragraph that contains the phrase we are interested in, which follows the sentence. LL, Inc. a Delaware limited liability company (XYZ).
After running the script the annotator CompanyDetails will contain the string:
LL, Inc. a Delaware limited liability company
I assume that you mean annotation of the type 'CompanyDetails' when you talk about annotator 'CompanyDetails'.
There are many (really many) different ways to solve this task. Here's one example that applies some helper rules:
DECLARE Annotation CompanyDetails (STRING context);
DECLARE Sentence, XYZ;
// just to get a running example with simple sentences
PERIOD #{-> Sentence} PERIOD;
#{-> Sentence} PERIOD;
"XYZ" -> XYZ; // should be done in a dictionary
// the actual rule
STRING s;
Sentence{-> MATCHEDTEXT(s)}->{XYZ{-> CREATE(CompanyDetails, "context" = s)};};
This example stores the string of the complete sentence in the feature. The rule matches on all sentences and stores the covered text in the variable 's'. Then, the content of the sentence is investigated: an inlined rule tries to match on XYZ, creates an annotation of the type CompanyDetails, and assigns the value of the variable to the feature named context. I would rather store an annotation instead of a string, since you could still get the string with getCoveredText(). If you just need the tokens before XYZ in the sentence, then you could do something like this (with an annotation instead of a string this time):
DECLARE Annotation CompanyDetails (Annotation context);
DECLARE Sentence, XYZ, Context;
// just to get a running example with simple sentences
PERIOD #{-> Sentence} PERIOD;
#{-> Sentence} PERIOD;
"XYZ" -> XYZ;
// the actual rule
Sentence->{ #{-> Context} SPECIAL? @XYZ{-> GATHER(CompanyDetails, "context" = 1)};};

How to use ng-class in select with ng-options

I have an array of Person objects
var persons = [
{Name:'John',Eligible:true},
{Name:'Mark',Eligible:true},
{Name:'Sam',Eligible:false},
{Name:'Edward',Eligible:false},
{Name:'Michael',Eligible:true}
];
and I am using select with ng-options like this:
<select ng-model="Blah" ng-options="person.Name for person in persons"></select>
I want to show the records with Eligible:false in red.
So the problem is: how do I use ng-class in the select in order to achieve this? Since we are not using any option tag, it won't work if I simply add ng-class to the select element itself.
You could create a directive that processes the options after the ngOptions directive has run and updates them with the appropriate classes.
Update: The old code had a few bugs, and I've learned a bit since I answered this question. The Plunk was redone in 1.2.2 (but should work in 1.0.X as well).
Here is the updated code (Nov 30 '13 at 3:17):
app.directive('optionsClass', function ($parse) {
return {
require: 'select',
link: function(scope, elem, attrs, ngSelect) {
// get the source for the items array that populates the select.
var optionsSourceStr = attrs.ngOptions.split(' ').pop(),
// use $parse to get a function from the options-class attribute
// that you can use to evaluate later.
getOptionsClass = $parse(attrs.optionsClass);
scope.$watch(optionsSourceStr, function(items) {
// when the options source changes loop through its items.
angular.forEach(items, function(item, index) {
// evaluate against the item to get a mapping object for
// for your classes.
var classes = getOptionsClass(item),
// also get the option you're going to need. This can be found
// by looking for the option with the appropriate index in the
// value attribute.
option = elem.find('option[value=' + index + ']');
// now loop through the key/value pairs in the mapping object
// and apply the classes that evaluated to be truthy.
angular.forEach(classes, function(add, className) {
if(add) {
angular.element(option).addClass(className);
}
});
});
});
}
};
});
Here's how you'd use it in your markup:
<select ng-model="foo" ng-options="x.name for x in items"
options-class="{ 'is-eligible' : eligible, 'not-eligible': !eligible }">
</select>
It works like ng-class does, with the exception that it's on a per-item-in-the-collection basis.
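The per-item part can be sketched without Angular: the mapping object is evaluated against one item, and every key whose value is truthy becomes a class. The helper names classesToApply and optionClassesFor below are hypothetical and exist only for this illustration; they are not part of the directive or of AngularJS.

```javascript
// Illustration only: given a mapping of class name -> condition result,
// return the class names whose value is truthy (the ng-class-style rule).
function classesToApply(mapping) {
  return Object.keys(mapping).filter(function (className) {
    return Boolean(mapping[className]);
  });
}

// Mirrors options-class="{ 'is-eligible' : eligible, 'not-eligible': !eligible }"
// evaluated against a single item.
function optionClassesFor(item) {
  return classesToApply({
    "is-eligible": item.eligible,
    "not-eligible": !item.eligible
  });
}

console.log(optionClassesFor({ name: "foo", eligible: true }));
console.log(optionClassesFor({ name: "bar", eligible: false }));
```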
In this scenario you can apply ng-class only if you use ng-repeat with option tags:
<select ng-model="Blah">
<option ng-repeat="person in persons" ng-class="{red: person.Eligible}">
{{person.Name}}
</option>
</select>
This will give a custom class to your 'Eligible' persons, but the CSS won't work consistently across browsers.
I wanted to comment on the accepted answer, but because I don't have enough reputation points, I must add an answer.
I know that this is an old question, but comments were recently added to the accepted answer.
For AngularJS 1.4.x the proposed directive must be adapted to get it working again.
Because of the breaking change in ngOptions, the value of the option is no longer the index, so the line
option = elem.find('option[value=' + index + ']');
won't work anymore.
If you change the code in the plunker to
<select ng-model="foo" ng-options="x.id as x.name for x in items"
options-class="{ 'is-eligible' : eligible, 'not-eligible': !eligible }">
</select>
the value of the option tag will now be
value="number:x" (x is the id of the item object)
Change the directive to
option = elem.find('option[value=\'number:' + item.id + '\']');
to get it working again.
Of course this isn't a generic solution: what if your objects don't have an id?
Then you will find value="object:y" in your option tag, where y is a number generated by AngularJS, but you can't map this y back to your items.
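The value format described here can be captured in a small helper that builds the option selector from the model value. This is a sketch based on the "number:x" and "object:y" formats mentioned above; the "string:" prefix for string ids is my assumption about how AngularJS 1.4 renders them, and optionValueFor and optionSelectorFor are hypothetical names.

```javascript
// Illustration only: derive the option's value attribute from a primitive model value.
// "number:" is confirmed by the text above; "string:" is an assumption.
function optionValueFor(modelValue) {
  const type = typeof modelValue;
  if (type === "number" || type === "string") {
    return type + ":" + modelValue;
  }
  // Objects get a generated "object:N" value that cannot be derived from the item.
  return null;
}

// Builds a selector string usable with elem.find(...), or null if none can be derived.
function optionSelectorFor(modelValue) {
  const value = optionValueFor(modelValue);
  return value === null ? null : 'option[value="' + value + '"]';
}

console.log(optionSelectorFor(7));
console.log(optionSelectorFor({ name: "foo" }));
```

Returning null for objects makes the limitation explicit instead of producing a selector that silently matches nothing.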
Hope this helps some people get their code working again after updating AngularJS to 1.4.x.
I also tried to use track by in ng-options, but didn't get it to work.
Maybe people with more experience in AngularJS than me (this is my first AngularJS project) can?
The directive is one way, but I used a custom filter.
If you know how to select your element, you should be fine here. The challenge was to find the current option element inside the select. I could have used the "contains" selector but the text in the options may not be unique for items. To find the option by value, I injected the scope and the item itself.
<select ng-model="foo" ng-options="item.name|addClass:{eligible:item.eligible,className:'eligible',scope:this,item:item} for item in items"></select>
and in the js:
var app = angular.module('test', []);
app.filter('addClass', function() {
return function(text, opt) {
var i;
$.each(opt.scope.items,function(index,item) {
if (item.id === opt.item.id) {
i = index;
return false;
}
});
var elem = angular.element("select > option[value='" + i + "']");
var classTail = opt.className;
if (opt.eligible) {
elem.addClass('is-' + classTail);
elem.removeClass('not-' + classTail);
} else {
elem.addClass('not-' + classTail);
elem.removeClass('is-' + classTail);
}
return text;
}
})
app.controller('MainCtrl', function($scope) {
$scope.items = [
{ name: 'foo',id: 'x1',eligible: true},
{ name: 'bar',id: 'x2',eligible: false},
{ name: 'test',id: 'x3',eligible: true}
];
});
Here you can see it work.
The accepted answer did not work for me, so I found an alternative without a custom directive using track by :
<select ng-model="foo" ng-options="x.name for x in items track by x.eligible"></select>
Each option now gets the value x.eligible. In CSS you can style options with value = true (I think true has to be a string). CSS:
option[value="true"]{
color: red;
}
In case you not only want to show them in red color but prevent the user from selecting the options, you can use disable when:
<select
ng-model="Blah"
ng-options="person.Name disable when !person.Eligible for person in persons">
</select>
You can then use CSS to set the color of disabled options.
I can't write this as a comment, due to reputation, but I have updated the plunker for the accepted answer to work with Angular 1.4.8. Thanks to Ben Lesh for the original answer, it helped me a lot. The difference seems to be that newer Angular generates options like this:
<option class="is-eligible" label="foo" value="object:1">foo</option>
so the code
option = elem.find('option[value=' + index + ']');
wouldn't be able to find the option. My change parses ngOptions and determines what field of item was used for the label, and finds the option based on that instead of value. See:
http://plnkr.co/edit/MMZfuNZyouaNGulfJn41
I know I am a bit late to the party, but for people who want to solve this with pure CSS, without using a directive you can make a css class like this:
select.blueSelect option[value="false"]{
color:#01aac7;
}
This css rule says : Find all elements with value = false with tag name 'option' inside every 'select' that has a class "blueSelect" and make the text color #01aac7; (a shade of blue)
In your case your HTML will look like this:
<select class="form-control blueSelect" name="persons" id="persons1"
ng-options="person as person.name for person in $ctrl.persons track by person.Eligible"
ng-model="$ctrl.selectedPerson" required>
<option disabled selected value="">Default value</option>
</select>
The track by inside the ng-options determines what the options are tracked by, i.e. the "value" field of each option. Note that, depending on your project's needs, you might have to do some tweaking to make this work as per your requirements.
But that's not going to work right when there are multiple options with the same value for the Eligible field. So to make this work, we create a compound expression to track by; that way each option has a unique value to track by. In this case we combine both fields, Name and Eligible.
So now our HTML will look like this:
<select class="form-control blueSelect" name="persons" id="persons2"
ng-options="person as person.name for person in $ctrl.persons track by (person.name + person.Eligible)"
ng-model="$ctrl.selectedPerson" required>
<option disabled selected value="">Default value</option>
</select>
and our css :
select.blueSelect option[value*="False"]{
color:#01aac7;
}
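Notice the * next to value: *= is the CSS substring attribute selector (not a regular expression), which matches when the word "False" appears anywhere in the value attribute of the option element. The comparison is case-sensitive, so the string must match the value exactly as it is rendered.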
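To check which rendered values such a selector would hit, the substring test can be mimicked in plain JavaScript. This is a sketch assuming the option value is the track by expression converted to a string; matchesValueContains is a hypothetical helper, not a browser or AngularJS API.

```javascript
// Illustration only: [value*="needle"] matches when needle occurs anywhere in the
// attribute value; includes() is case-sensitive in the same way.
function matchesValueContains(value, needle) {
  return value.includes(needle);
}

// Assumed rendering of "track by (person.name + person.Eligible)" for one person:
// JavaScript converts the boolean false to the lowercase string "false".
const trackedValue = "Sam" + false;

console.log(trackedValue);
console.log(matchesValueContains(trackedValue, "false"));
console.log(matchesValueContains(trackedValue, "False"));
```

If Eligible is a real boolean rather than a string such as "False", the selector would need the lowercase spelling to match.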
Quick Edit
You can also choose to disable the options with Eligible = False using the "disable when" in the ng-options expression , for example:
label disable when disable for value in array track by trackexpr
I'll leave how to use that in your case for you to find out ;-)
This works for simple CSS modifications; for more complex stuff you might need a directive or other methods. Tested in Chrome.
I hope this helps someone out there. :-)
I've found another workaround that was easier than adding a directive or filter, which is to add a handler for the onfocus event that applies the style.
angular.element('select.styled').focus( function() {
angular.element(this).find('option').addClass('myStyle');
});