Matching between breaks in UIMA RUTA based on a condition

I have the following sample text:
zip 20193
New York
USA
What I would like to do is match only "New York", i.e., the line after the zip code.
I tried the following code, but it is not working:
DECLARE heading;
pin BREAK #{-> MARK(heading)} BREAK;
(I have declared pin before this).
Please let me know how to go about this.
Thanks!

The problem is probably the filtering setting. BREAK is not visible by default, so a rule that mentions it will never match: Ruta automatically skips the line breaks.
Try adding another rule in front of yours that changes the filtering setting:
RETAINTYPE(BREAK);
pin BREAK #{-> MARK(heading)} BREAK;
There could be another problem: BREAK represents a single \n or \r, so the rule would not work for Windows line endings (\r\n), which consist of two break characters. You would need something like:
pin BREAK[1,2] #{-> MARK(heading)} BREAK;
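Outside Ruta, the same idea (accept either line-ending style, then take the following line) can be illustrated with a small Python sketch. The "zip" plus digits pattern and the function name are assumptions made for this illustration; the original rule relies on a pin annotation that is declared elsewhere.

```python
import re

# Hypothetical stand-in for the "pin" annotation: a line of the form "zip 12345".
# The line break after it may be \r\n (Windows), \r, or \n (Unix).
ZIP_THEN_LINE = re.compile(r"^zip \d+(?:\r\n|\r|\n)([^\r\n]*)", re.MULTILINE)

def line_after_zip(text):
    """Return the line following the first 'zip NNNNN' line, or None."""
    match = ZIP_THEN_LINE.search(text)
    # strip() plays the role of trimming leading/trailing whitespace.
    return match.group(1).strip() if match else None

unix_text = "zip 20193\nNew York\nUSA"
windows_text = "zip 20193\r\nNew York\r\nUSA"

print(line_after_zip(unix_text))
print(line_after_zip(windows_text))
```

Both inputs yield the same line, which is the behaviour the BREAK[1,2] quantifier is meant to give the Ruta rule.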
Ruta also ships a utility analysis engine for annotating lines: PlainTextAnnotator.
If you include it, you can write something like:
pin Line{-> heading};
(You may need to trim the lines, e.g., with the TRIM action, if they start or end with whitespace.)
DISCLAIMER: I am a developer of UIMA Ruta


How to convert embedded CRLF codes to their REAL newlines in Vscode?

I searched everywhere for this; the problem is that my search terms are very similar to those of other questions.
The issue I have is that a file (a script, actually) is embedded in another file. When I open the parent file, I see the script as one massive string containing many \n and \r\n codes. I need a way to convert these codes into the real line breaks they stand for, so that the code is formatted correctly and I can read and work on it.
Quick snippet:
\n\n\n\n\nlocal scriptingFunctions\n\n\n\n\nlocal measuringCircles = {}\r\nlocal isCurrentlyCheckingCoherency
Should convert to:
local scriptingFunctions
local measuringCircles = {}
local isCurrentlyCheckingCoherency
Perform a regex find-and-replace:
Find: (\\r)?\\n
Replace: \n
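To see what this find/replace pair does, here is a minimal Python sketch that applies the same pattern to a shortened version of the snippet from the question. The function name is an assumption for illustration. Note that the pattern turns each escaped code into one real newline; it does not collapse runs of blank lines.

```python
import re

# A literal backslash-n, optionally preceded by a literal backslash-r.
ESCAPED_NEWLINE = re.compile(r"(\\r)?\\n")

def unescape_newlines(text):
    """Replace every literal \\n or \\r\\n code with one real newline."""
    return ESCAPED_NEWLINE.sub("\n", text)

# Raw string: the backslashes are literal characters, as in the embedded script.
snippet = r"\n\nlocal scriptingFunctions\n\nlocal measuringCircles = {}\r\nlocal isCurrentlyCheckingCoherency"

print(unescape_newlines(snippet))
```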
If you don't need to reconvert from newlines to \n after you're done working on the code, you can accomplish the trick by simply pressing ctrl-f and substituting every occurrence of \n with a new line (you can type enter in the replace box by pressing ctrl-enter or shift-enter).
If after you're done working on the code you need to reconvert to \n, you can add an invisible char to the replace string (typing it like ctrl-enter invisibleChar), and after you're done you can re-replace it with \n.
There are plenty of invisible characters, but I'd personally suggest U+200B (zero-width space); another good one is U+2800 (⠀, braille pattern blank), as it renders like a normal space and is therefore noticeable.
Note that recent versions of VS Code highlight invisible characters, but you can easily disable this by clicking Adjust settings and then selecting Exclude from being highlighted.
If you need to reenable highlighting in the future, you'll have to look for "editor.unicodeHighlight.allowedCharacters" in the settings.
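The round trip described above (mark each converted newline with an invisible character, edit, then convert back) can be sketched in Python. The function names and the sample script are assumptions for illustration; the marker is U+200B, placed after the newline as in the answer.

```python
MARKER = "\u200b"  # ZERO WIDTH SPACE, used to tag converted newlines

def expand_escaped_newlines(text):
    """Turn each literal \\n code into a real newline followed by the marker."""
    return text.replace("\\n", "\n" + MARKER)

def restore_escaped_newlines(text):
    """Turn each marked newline back into a literal \\n code."""
    return text.replace("\n" + MARKER, "\\n")

# Raw string: backslashes are literal, as in the embedded script.
original = r"local a = 1\nlocal b = 2\r\nlocal c = 3"

editable = expand_escaped_newlines(original)
# Simulate an edit made while the text is in its readable form.
edited = editable.replace("local b = 2", "local b = 42")

print(restore_escaped_newlines(edited))
```

In this sketch a literal \r code is left in place as text, so \r\n sequences survive the round trip unchanged. Newlines typed by hand during editing carry no marker and are therefore not converted back.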

How do you delete lines with certain keywords in VScode

I have this regular expression to find certain keywords on a line:
.*(word1|word2|word3).*
In the find-and-replace feature of the latest VS Code it works and finds the words, but it just blanks the lines, leaving big gaps in between.
I would like to delete the entire line, including the linefeed.
The find-and-replace feature doesn't seem to support regular expressions in the replace field.
If you want to delete the entire line, make your regex match the entire line and include the linefeed as well. Something like:
^.*(word1|word2|word3).*\n?
Then ALT-Enter will select all lines that match and Delete will eliminate them including the lines they occupied.
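The effect of including the optional \n in the match can be checked with a short Python sketch. The sample text and function name are made up for illustration; re.MULTILINE makes ^ match at the start of every line.

```python
import re

# Whole line containing any keyword, plus its trailing newline if present.
LINE_WITH_KEYWORD = re.compile(r"^.*(word1|word2|word3).*\n?", re.MULTILINE)

def drop_matching_lines(text):
    """Delete every line that contains one of the keywords."""
    return LINE_WITH_KEYWORD.sub("", text)

sample = "keep this\nhas word2 inside\nkeep that\nword3 at the end"

print(drop_matching_lines(sample))
```

Because the newline is part of the match, the removed lines leave no blank gaps behind.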

How can I use curly braces in Netbeans code templates for example for an slf4j template?

Creating a Netbeans code template for creating an slf logger is described here:
http://wiki.netbeans.org/SLF4JCodeTemplate
However creating code templates for log statements, e.g.
logger.debug("Something: {}", var);
is harder than expected because the template language doesn't balance curly braces, so it ends the capture at the first closing curly brace.
There are some examples, such as How to get current class name in Netbeans code template?, but they do not touch on the curly brace issue.
I have tried to escape them in every way I could think of so far, including:
${LOGGER default="logger" editable=false}.debug("${logMessage}${: '{}'}", ${EXP instanceof="<any>" default="exp"});
and
${LOGGER default="logger" editable=false}.debug("${logMessage}${: \{\}}", ${EXP instanceof="<any>" default="exp"});
but no luck. Also my google skills have been failing me so far.
Turns out there is a simple solution. I didn't find it anywhere near anything about netbeans code templates, but under a question about freemarker:
How to output ${expression} in Freemarker without it being interpreted?
Basically the answer is to use r"..." around the code, like this:
${LOGGER default="logger" editable=false}.debug("${logMessage}${:r"{}"}", ${EXP instanceof="<any>" default="exp"});
Now this can be assigned to sld, so I can type sld and expand it to:
logger.debug("logMessage: {}", <last variable>);
Here "logMessage" is selected (so I can overwrite it with something useful), one tab selects ": {}" so I can delete it if I want to log without parameters, and a last tab selects <last variable>, which is the last assigned value (in case I want to replace or remove it).

Sublime Text Macro Command Move to Specific Column Number

I'm trying to move to a specific column number in a Sublime Text macro command, so that I can delete everything after that point, but I can't get the move command to work.
I've compared the old and new versions of the unofficial documentation, and it looks like the move command used to have an "amount" parameter, but it hasn't been working for me. So the reason I'm writing is because I feel like there's something the docs are leaving out, but I don't know what it is, or how to even debug it in Sublime Text.
Here's an example of what I'm trying to do:
/*//////////////////////////////////////////////////////////
// Comment Block ///////////////////////////////////////////////
//////////////////////////////////////////////////////////*/
When the caret is at the end of the phrase "Comment Block", I need to run a command that advances the caret to a specific column number. After that, I can expand the selection to the end of the line and delete, trimming the line to be equal with its counterparts.

Removing 1000s of comments in eclipse?

I installed JD-GUI to retrieve my code from a jar file. Everything works fine, except JD-GUI automatically adds annoying comments like this:
Any way I can remove them? I don't understand regex.
Using Eclipse:
Go to Edit > Find/Replace...
Use this regular expression in the Find box: ^/\* [0-9 ]{3} \*/
^ match start of line.
/\* match start of comment
[0-9 ]{3} match exactly three digits/spaces
\*/ match end of comment
Make sure the Replace box is empty.
Make sure the Regular expressions checkbox is selected.
Click Replace All
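As a quick check of the pattern outside Eclipse, the following Python sketch applies it to a few lines. The sample input is an assumption about what the padded line-number comments look like, since the original screenshot is not available.

```python
import re

# Start of line, "/* ", exactly three digits or spaces, " */".
JD_GUI_PREFIX = re.compile(r"^/\* [0-9 ]{3} \*/", re.MULTILINE)

def strip_line_number_comments(source):
    """Remove the line-number comment at the start of each line."""
    return JD_GUI_PREFIX.sub("", source)

# Hypothetical decompiler output with space-padded line numbers.
decompiled = "/*  12 */ int x = 1;\n/* 123 */ return x;\n/*     */ }"

print(strip_line_number_comments(decompiled))
```

The space that follows each comment is not part of the pattern, so it stays in the output.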
Use CTRL+H. Within "File Search" > "Search string", check "Regular expression" and use one of the regex given by the other answers.
Then use "Replace..." to replace them all with nothing.
Use the utility sed to search for a regex and replace with an empty string. Here is a gist that should get you started with using it.
Since you don't understand regex, I'll help you out with it: /^\/\* \d+ \*\//gm will find every comment block that starts at the beginning of a line and contains a line number.
Here's how it works:
/ is the start of the regex
^ matches the beginning of the line
\/\* finds the opening /* of the comment
(space) finds the space before the line number
\d+ finds any number of digits
(space) finds the space after the line number
\*\/ finds the ending */ of the comment
/gm ends the regex and flags this as a global, multiline search
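The breakdown above uses JavaScript-style delimiters and flags. A rough Python equivalent, as a sketch: re.MULTILINE stands in for the m flag, re.sub replaces every match (the g behaviour), and the forward slashes need no escaping. The sample input is made up for illustration.

```python
import re

# Same pattern without the /.../ delimiters.
LINE_NUMBER_COMMENT = re.compile(r"^/\* \d+ \*/", re.MULTILINE)

# Hypothetical input: two leading line-number comments and one trailing comment.
source = "/* 7 */ a();\n/* 1024 */ b();\nc(); /* 3 */"

cleaned = LINE_NUMBER_COMMENT.sub("", source)
print(cleaned)
```

Only comments at the start of a line are removed; the trailing comment on the last line is left alone because of the ^ anchor.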