Incorrect partition detected by PartitionScanner in custom TextEditor for Eclipse

I have a PartitionScanner that extends RuleBasedPartitionScanner in my custom text editor plugin for Eclipse. I am having issues with the partition scanner detecting character sequences within larger strings, resulting in the document being partitioned incorrectly. For example, within the constructor of my partition scanner I have the following rule set up:
public MyPartitionScanner() {
    ...
    rules.add(new MultiLineRule("SET", "ENDSET", mytoken));
    ...
}
However, if the document happens to contain a token with the character sequence "SET" in it, it seems the partition scanner keeps searching for the end sequence ("ENDSET") and marks the rest of the document as a single partition of type "mytoken". For example:
var myRESULTSET34 = ...
Is there a way to make the partition scanner ignore "SET" when it is embedded in a larger word like the one above, and only recognize "SET" as a whole word?
Thank you.

Using the MultiLineRule as-is, you won't be able to differentiate. But you can create your own subclass that overrides sequenceDetected and, when the super implementation returns true, does a look-back/look-ahead to make sure the sequence is preceded/followed by EOF or whitespace. If it isn't, push the characters back onto the scanner and return false.
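For reference, here is a minimal, untested sketch of the look-ahead half of that idea (the class name WholeWordMultiLineRule is made up; the look-back on the character before the start sequence could be added along the same lines by unreading past it):

import org.eclipse.jface.text.rules.ICharacterScanner;
import org.eclipse.jface.text.rules.IToken;
import org.eclipse.jface.text.rules.MultiLineRule;

// Accepts "SET"/"ENDSET" only when they are not glued to a following word character.
public class WholeWordMultiLineRule extends MultiLineRule {

    public WholeWordMultiLineRule(String startSequence, String endSequence, IToken token) {
        super(startSequence, endSequence, token);
    }

    @Override
    protected boolean sequenceDetected(ICharacterScanner scanner, char[] sequence, boolean eofAllowed) {
        if (!super.sequenceDetected(scanner, sequence, eofAllowed)) {
            return false;
        }
        // Look ahead: the character right after the sequence must be EOF or a non-word character.
        int next = scanner.read();
        scanner.unread();
        if (next == ICharacterScanner.EOF || !isWordPart((char) next)) {
            return true;
        }
        // Embedded in a larger word (e.g. myRESULTSET34): push the characters the super
        // implementation read back onto the scanner and report "not detected".
        for (int i = 0; i < sequence.length - 1; i++) {
            scanner.unread();
        }
        return false;
    }

    private static boolean isWordPart(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }
}

You would then register rules.add(new WholeWordMultiLineRule("SET", "ENDSET", mytoken)) instead of the plain MultiLineRule.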

Related

How to create element from API with name from appropriate sequence?

I'd like to use EA to generate Requirement elements programmatically. I need to use the same sequence numbering (REQ00000xy) as the GUI uses when pressing the "Auto" button in the "Add Element ..." dialog, in order to keep consistent numbering for Requirement elements created either from the GUI or from the API.
Selecting the last used sequence number from already existing Requirement elements won't help, as it doesn't move the sequence counter up for the next Requirement created from the GUI.
Is there a way to get (and properly use) the sequence number via EA API or EA SQL?
The table you're looking for is t_trxtypes. This contains something like (EA's output)
Description;NumericWeight;Notes;TRX;TRX_ID;Style;
Autocount;1,00;prefix=bla;suffix=x;active=1;active_a=0;counter=126;;Class;1; ;
You're interested in the column Notes, which holds a semicolon-separated list like
prefix=bla;suffix=x;active=1;active_a=0;counter=126;
This is a test setting for a class which currently has the number 126. So the next created class would be named bla126x and the entry would change to
prefix=bla;suffix=x;active=1;active_a=0;counter=127;
Just keep the column t_trxtypes.Notes in sync with your creations.
Note EA does not (seem to) allow direct DB access. However, it has a proven back door:
Repository.Execute("UPDATE t_trxtypes SET Notes='prefix=bla;suffix=x;active=1;active_a=0;counter=127;' WHERE TRX_ID=<your id>")
will do the update (replace <your id> with the appropriate key). Though Execute is undocumented, it has worked ever since, and Sparx will surely not restrict it, as nowadays everyone relies on it.
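From code, the read and the write might look roughly like this. This is an untested sketch against EA's Java bindings (org.sparx); the file path and TRX_ID are placeholders, the query targets the Class row from the sample above (for Requirements you would target that row instead), and it assumes the undocumented Execute call is also reachable from the Java bindings:

import org.sparx.Repository;

public class BumpAutoCounter {
    public static void main(String[] args) {
        Repository repository = new Repository();
        repository.OpenFile("C:/models/myproject.eap");  // placeholder path

        // SQLQuery returns the result set as an XML string; the current counter
        // has to be parsed out of the Notes value (e.g. "...;counter=126;").
        String notesXml = repository.SQLQuery(
                "SELECT TRX_ID, Notes FROM t_trxtypes WHERE TRX='Class'");
        System.out.println(notesXml);

        // ...create your element, then write the bumped counter back through the back door:
        repository.Execute(
                "UPDATE t_trxtypes SET Notes='prefix=bla;suffix=x;active=1;active_a=0;counter=127;' "
                + "WHERE TRX_ID=42");  // 42 stands in for <your id>

        repository.CloseFile();
        repository.Exit();
    }
}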
As a side note: this counter is not safe. There are lots of ways (the easiest is a simple rename) to break it. You'd need some script or add-in to run regular checks that your numbering is still consistent. If you rely on requirement numbering, you'd better use an external system like, I dare to say, DOORS.
Finally, RTFM....
For elements where a sequence is defined, if you pass an empty name to the AddNew() function, EA generates the sequence number upon .Update(), not earlier. So if you plan to use the generated sequence and add some description, you need to create the element with an empty name first, then Update() it, and after that append your description to the content of the Name field.
As easy as this.
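In code, that order of operations might look roughly like this; again an untested sketch against the Java bindings (org.sparx), with the file path and package GUID as placeholders:

import org.sparx.Collection;
import org.sparx.Element;
import org.sparx.Package;
import org.sparx.Repository;

public class CreateNumberedRequirement {
    public static void main(String[] args) {
        Repository repository = new Repository();
        repository.OpenFile("C:/models/myproject.eap");  // placeholder path

        Package pkg = repository.GetPackageByGuid("{placeholder-guid}");
        Collection<Element> elements = pkg.GetElements();

        // 1. Create the Requirement with an EMPTY name...
        Element req = elements.AddNew("", "Requirement");
        // 2. ...and Update() it: this is the point where EA generates the sequence name (REQ00000xy).
        req.Update();

        // 3. Only now read the generated name and append your description to it.
        req.SetName(req.GetName() + " - my description");
        req.Update();

        repository.CloseFile();
        repository.Exit();
    }
}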

Mirth Connect Database Reader automatic column mapping

Please could somebody confirm the following..
I am using Mirth Connect 3.5.08232.
My Source Connector is a Database Reader.
Say I am using a query that returns multiple rows, and I return the result (via JavaScript) as the documentation suggests, so that Mirth treats each row as a separate message. I also use a couple of Mapper steps as source transformers and save the mapped fields in my channel map (which ends up containing only those fields that I define in the transformers).
In the destination, and specifically in the destination response transformer (or the destination body, if it is a JavaScript Writer), how do I access the source fields?
The only way I found, by trial and error, is:
var rawMsg = connectorMessage.getRawData();
var xmlMsg = new XML(rawMsg);
logger.info(xmlMsg.some_field); // ignore the root element of rawMsg
Is this the right way to do this? I thought that maybe the fields that were nicely automatically detected would be put in some kind of a map, like sourceMap - but that doesn't seem to be the case, right?
Thank you
If you are using Mapper steps in your transformer to extract the data and put it into a variable map (like the channel map), then you can use any of the following methods to retrieve it from a subsequent JavaScript context (including a JavaScript Writer, and your response transformer):
var value = channelMap.get('key');
var value = $c('key');
var value = $('key');
Look at the Variable Maps section of the User Guide for more information.
So to recap, say you're selecting a column "mycolumn" with a Database Reader. The XML sent to the channel will be something like this:
<result>
<mycolumn>value</mycolumn>
</result>
Then you can choose to extract pieces of that message into specific variables for later use. The transformer allows you to easily drag-and-drop pieces of the sample inbound message.
Finally, in your JavaScript Writer (or in any subsequent filter, transformer, or response transformer), just drag the value into the field you want, and the corresponding JavaScript code (for example $('mycolumn')) will be inserted automatically.
One last note, if you are selecting a lot of variables and don't want to make Mapper steps for each one individually, you can use a JavaScript Step to iterate through the message and extract each column into a separate map variable:
for each (child in msg.children()) {
    channelMap.put(child.localName(), child.toString());
}
Or, you can just reference the columns directly from within the JavaScript Writer:
var msg = new XML(connectorMessage.getEncodedData());
var column1 = msg.column1.toString();
var column2 = msg.column2.toString();
...

Spark: How to structure a series of side effect actions inside mapping transformation to avoid repetition?

I have a spark streaming application that needs to take these steps:
Take a string, apply some map transformations to it
Map again: If this string (now an array) has a specific value in it, immediately send an email (or do something OUTSIDE the spark environment)
collect() and save in a specific directory
apply some other transformation/enrichment
collect() and save in another directory.
As you can see, this implies two lazily activated computations, which would perform the OUTSIDE action twice. I am trying to avoid caching, as at some hundreds of lines per second this would kill my server.
I am also trying to maintain the order of operations, though this is not that important. Is there a solution I do not know of?
EDIT: my program as of now:
kafkaStream;
lines = take the value, discard the topic;
lines.foreachRDD {
    splittedRDD = arg.map { split the string };
    assRDD = splittedRDD.map { associate to a table };
    flaggedRDD = assRDD.map { add a boolean parameter under an if condition + send mail };
    externalClass.saveStaticMethod( flaggedRDD.collect() and save in file );
    enrichRDD = flaggedRDD.map { enrich with external data };
    externalClass.saveStaticMethod( enrichRDD.collect() and save in file );
}
I put the saving part after the email so that if something goes wrong with it at least the mail has been sent.
In the end, the methods I found were these:
In the DStream transformation before the side-effecting one, make a copy of the DStream: one copy goes on with the transformations, the other gets the .foreachRDD{ outside action }. There is no major downside to this, as it is just one more RDD on a worker node.
Extract the {outside action} from the transformation and keep track of the mails already sent: filter out elements whose mail has already been sent. This is an almost superfluous operation, as it will filter out all of the RDD elements.
Cache before going on (although I was trying to avoid it, there was not much else to do).
If you are trying not to cache, solution 1 is the way to go; a rough sketch of it follows below.
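Very roughly, option 1 might look like this with Spark's Java streaming API. This is only a sketch: the Kafka input is replaced by a socket stream to keep it self-contained, and the helper methods, field layout and output paths are placeholders rather than the real program's logic:

import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class SideEffectSketch {

    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("side-effect-sketch").setMaster("local[2]");
        JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(5));

        // Stand-in for the Kafka stream from the question.
        JavaDStream<String> lines = ssc.socketTextStream("localhost", 9999);

        // One shared DStream holding the flagged records.
        JavaDStream<String[]> flagged = lines
                .map(line -> line.split(";"))
                .map(SideEffectSketch::associateToTable)
                .map(SideEffectSketch::addFlag);
        // flagged.cache();  // option 3: be sure the maps above run only once per batch

        // "Copy" 1: only the outside action (the e-mail), isolated in its own output operation.
        flagged.foreachRDD(rdd ->
                rdd.filter(fields -> "true".equals(fields[fields.length - 1]))
                   .foreach(SideEffectSketch::sendMail));

        // "Copy" 2: the rest of the pipeline (save flagged, enrich, save enriched).
        flagged.foreachRDD(rdd -> {
            saveToFile(rdd.collect(), "/data/flagged");        // placeholder path
            JavaRDD<String[]> enriched = rdd.map(SideEffectSketch::enrich);
            saveToFile(enriched.collect(), "/data/enriched");  // placeholder path
        });

        ssc.start();
        ssc.awaitTermination();
    }

    // Placeholders for the question's own logic.
    private static String[] associateToTable(String[] fields) { return fields; }
    private static String[] addFlag(String[] fields) { return fields; }
    private static String[] enrich(String[] fields) { return fields; }
    private static void sendMail(String[] fields) { /* send the alert */ }
    private static void saveToFile(List<String[]> rows, String dir) { /* write out */ }
}

The commented-out cache() line corresponds to option 3; without it, each foreachRDD triggers its own job over the same lineage.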

Requesting member of node_element results in "undefined"

I'm using Opa for a school project in which there has to be some synchronization of a textfield between several users. The easy way to solve this, is to transmit the complete field whenever there is a change performed by one of the users. The better way is of course to only transmit the changes.
My idea was to use the caret position in the textfield. As a user types, one can get the last typed character based on the caret position (simply the character before the caret). A DOM element has an easy-to-use field for this called selectionStart. I have this small piece of JavaScript for it:
document.getElementById('content').selectionStart
which correctly returns 5 if the caret stands at the fifth character in the field. In Opa, I cannot use selectionStart on either a DOM or a dom_element so I thought I'd write a small plugin. The result is this:
##extern-type dom_element
##register jsGetCaretPosition: dom_element -> int
##args(node)
{
return node.selectionStart;
}
This compiles with the opp-builder without any problem and when I put this small line of code in my Opa script:
#pos = %%caret.jsGetCaretPosition%%(Dom.of_selection(Dom.select_id("content")));
that also compiles without problems. However, when I run the script, it always returns "undefined" and I have no idea what I'm doing wrong. I've looked in the API, and Dom.of_selection(Dom.select_id("content")) looked like the correct way to get the corresponding dom_element typed data to pass to the plugin. The fact that the plugin returns "undefined" seems to suggest that the selected element does not know the member "selectionStart", even though my test code in JavaScript suggests otherwise. Can anyone help?
In Opa, dom_element values are the result of a jQuery selection (i.e. an array of DOM nodes). So if I understood your program correctly, you should write something like node[0].selectionStart instead of node.selectionStart.
Moreover, you should take care of empty selections and selections that don't contain a textarea node (i.e. one without a selectionStart property). Perhaps the right code is something like: var tmp = node[0] ? node[0].selectionStart : undefined; return tmp == undefined ? -1 : tmp;

Make Lucene index a value and store another

I want Lucene.NET to store a value while indexing a modified, stripped-down version of the stored value. e.g. Consider the value:
this_example-has some/weird (chars) 100%
I want it stored exactly like that (so that I can retrieve it verbatim for showing in the results list), but I want Lucene to index it as:
this example has some weird chars 100
(you see, like a "sanitized" version of the original value) for a simplified search.
I figure this would be the job of an analyzer, but I don't want to mess with rolling my own. Ideally, the solution should remove everything that is not a letter, a number, or quotes, replacing the removed chars with whitespace before indexing.
Any suggestions on how to implement that?
This is because I am indexing products for an e-commerce search, and some have really creepy names. I think this would improve search accuracy.
Thanks in advance.
If you don't want a custom analyzer, try storing the value as a separate non-indexed field, and use a simple regex to generate the sanitized version.
// Collapse every run of non-word characters (and underscores) into a single space.
var input = "this_example-has some/weird (chars) 100%";
var output = Regex.Replace(input, @"[\W_]+", " ");
You mention that you need another analyzer for some searching functionality. Don't forget the PerFieldAnalyzerWrapper, which allows you to use different analyzers within the same document.
public static void Main() {
    var wrapper = new PerFieldAnalyzerWrapper(defaultAnalyzer: new StandardAnalyzer(Version.LUCENE_29));
    wrapper.AddAnalyzer(fieldName: "id", analyzer: new KeywordAnalyzer());

    IndexWriter writer = null; // TODO: Retrieve these.
    Document document = null;

    writer.AddDocument(document, analyzer: wrapper);
}
You are correct that this is the job of the analyzer. I'd start by using a tool like Luke to see what the standard analyzer does with your term before deciding what to use -- it tends to do a good job of stripping noise characters and words.