Will Ruta support all types of spaces?

I need to know whether Ruta supports all types of spaces. I tried to annotate Fig 1.1 and Fig 1.2.
Sample Code:
PACKAGE uima.ruta.example;
DECLARE FigNo;
(W{REGEXP("Fig",true)} NUM PERIOD NUM){->MARK(FigNo)};
Input:
Fig 1.1
Fig 1.2
Expected Output:
Fig 1.1
Fig 1.2
But only Fig 1.1 is annotated as FigNo. The only difference between the two inputs is the space character: Fig 1.1 uses a thin space (U+2009) and Fig 1.2 uses an em space (U+2003).

Yes, UIMA Ruta will support all types of spaces, if someone opens a ticket for it here and someone implements the changes, e.g., by uploading a patch.
If you are the first someone, I'll do the part of the second someone.
DISCLAIMER: I am a developer of UIMA Ruta
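Until such support exists, one possible stopgap is to normalize the input before it reaches the Ruta engine. The following is a minimal Python sketch, not part of Ruta: the helper name normalize_spaces is hypothetical, and it assumes that replacing every Unicode space separator (general category Zs, which includes the thin space U+2009 and the em space U+2003) with a plain ASCII space is acceptable for your documents.

```python
import unicodedata


def normalize_spaces(text):
    """Replace every Unicode space separator (category Zs) with an ASCII space.

    Hypothetical preprocessing helper; line breaks and tabs are not in
    category Zs, so they are left untouched.
    """
    return "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in text)


thin = "Fig\u20091.1"  # "Fig", thin space, "1.1"
em = "Fig\u20031.2"    # "Fig", em space, "1.2"

print(normalize_spaces(thin))  # Fig 1.1
print(normalize_spaces(em))    # Fig 1.2
```

Each replacement swaps one character for one character, so the string length is unchanged and annotation offsets computed on the normalized text still line up with the original.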

Related

Coldfusion/Lucee Encoding Issue When Using EncodeForHTML

Running into an issue when using EncodeForHTML for certain characters (Emojis in this case)
The text in this case is:
⌛️a😊b👍c😟 💥🍉🍔 💩 🤦🏼‍♀️🤦🏼‍♀️🤦🏼‍♀️ 😘
Now if I just do a straight output
<cfoutput>#txt#</cfoutput>
It displays correctly, no issues, but if I use EncodeForHTML first
<cfoutput>#EncodeForHTML(txt)#</cfoutput>
I get this
⌛️a��b��c�� ������ �� ����‍♀️����‍♀️����‍♀️ ��
I tested it with EncodeForXML & esapiEncode as well to be sure; all are giving me the same result.
I've verified that the encoding settings in Lucee are UTF-8, and the meta charset tag is also set to UTF-8. I can't find any documentation on EncodeForHTML saying whether it changes the character encoding, whether it requires a specific character encoding, or whether it has any known issues with emojis or certain code points.
I appreciate any help or clarification anyone can provide.
Edit: Thank you everyone. Wish I could accept multiple answers.
I was required to sanitize emojis in order to ensure that third-party content was cross-compatible with external services. Some of the content contained emojis, which caused export/import problems. I wrote a ColdFusion wrapper for the emoji-java library to identify, sanitize and convert emojis.
https://github.com/JamoCA/cf-emoji-java
For example, the parseToAliases() function "replaces all the emoji's unicodes found in a string by their aliases".
emojijava = new emojijava();
emojijava.parseToAliases('I like 🍕'); // I like :pizza:
To "encode" you could use either the parseToHtmlDecimal() or parseToHtmlHexadecimal() functions prior to using EncodeForHTML().
emojijava = new emojijava();
test = emojijava.parseToHtmlDecimal('I like 🍕'); // I like &#127829;
EncodeForHTML(test);
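To illustrate what an HTML decimal conversion produces, here is a minimal Python sketch. It is not the emoji-java implementation: the function name to_html_decimal is hypothetical, and unlike parseToHtmlDecimal() it converts every non-ASCII code point rather than only emoji.

```python
def to_html_decimal(text):
    """Replace each non-ASCII code point with a decimal numeric character reference.

    Python iterates over strings by code point, so an emoji outside the
    Basic Multilingual Plane becomes a single reference, not two.
    """
    return "".join(ch if ord(ch) < 128 else "&#{};".format(ord(ch)) for ch in text)


pizza = "I like \U0001F355"  # U+1F355 SLICE OF PIZZA
print(to_html_decimal(pizza))  # I like &#127829;

# The standard library's "xmlcharrefreplace" error handler gives the same result.
print(pizza.encode("ascii", "xmlcharrefreplace").decode("ascii"))  # I like &#127829;
```

Once the text is pure ASCII, a later encoder no longer has to deal with the emoji code points themselves.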
At the time of this writing, ColdFusion's latest version is 2018 update 9, which in turn uses ESAPI 2.1.1.
Recent release notes don't mention emoji:
https://github.com/ESAPI/esapi-java-legacy/tree/develop/documentation
But pull request 413, which dates from 2017, is described as "Fixing ESAPI's inability to handle non-BMP codepoints.":
https://github.com/ESAPI/esapi-java-legacy/pull/413
Based on all this information, I would recommend doing both of the following:
Try using ESAPI directly. This is how it was done before ESAPI was added to CF. The issue may or may not still exist in ESAPI.
Put in a ticket with Adobe to update this library.
Yes, ESAPI 2.2.0.0 addressed the issue of not correctly encoding non-BMP characters (see https://github.com/ESAPI/esapi-java-legacy/issues/300) as part of PR #413 that James mentioned above.
But I just uploaded release ESAPI 2.2.1.0-RC1 (release candidate 1) to Maven Central early this morning and hope to have an official 2.2.1.0 release out by next weekend. So if you are going to put in a ticket with Adobe to fix this with an updated version of ESAPI, I'd wait another week and then tell them to update to 2.2.1.0.
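For context on what "non-BMP" means here: code points above U+FFFF need two 16-bit units (a surrogate pair) in UTF-16, the encoding Java strings use internally. The Python sketch below only makes that visible. The helper name utf16_units is hypothetical, and the sketch does not reproduce ESAPI's actual code or claim to show the exact cause of its bug.

```python
def utf16_units(ch):
    """Return the UTF-16 code units of a string as uppercase hex strings."""
    data = ch.encode("utf-16-be")
    return [data[i:i + 2].hex().upper() for i in range(0, len(data), 2)]


print(utf16_units("a"))           # ['0061'], one unit (BMP)
print(utf16_units("\u231B"))      # ['231B'], one unit (hourglass, BMP)
print(utf16_units("\U0001F355"))  # ['D83C', 'DF55'], surrogate pair (non-BMP)
```

Code that processes text one 16-bit unit at a time sees the two halves of such a pair separately, which is consistent with each affected emoji in the question showing up as two replacement characters.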

Where are the high-level charts in Bokeh's latest version (1.0.3)?

I have a quick question:
I see that you have the high-level charts in version 0.11.0:
http://docs.bokeh.org/en/0.11.0/docs/user_guide/charts.html
But I can't find the same topic in the latest version (1.0.3). Did the Bokeh team remove it?
I have been working on adapting histograms to my project but can't find the histograms section in the latest version.
Any ideas?
The bokeh.charts API was deprecated several years ago and scheduled for removal in version 1.0. That schedule was later accelerated because of lack of interest and the availability of better alternatives. It has been completely gone from the project since late 2017.
For very high-level APIs on top of Bokeh, consider Chartify or HoloViews. Or just create histograms directly with the stable bokeh.plotting API, for example:
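from bokeh.plotting import figure, output_file, show
from bokeh.sampledata.autompg import autompg as df
from numpy import histogram
# plot_height is the Bokeh 1.x keyword; Bokeh 3.x renamed it to height
p = figure(plot_height=300)
# Bin the horsepower column into 20 bins, normalized to a probability density
hist, edges = histogram(df.hp, density=True, bins=20)
# One quad per bin: edges[:-1] are the left edges, edges[1:] the right edges
p.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:], line_color="white")
show(p)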
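To see how the edges array pairs up into bars, here is a small pure-Python sketch with hand-made values. The function name bin_counts and the sample data are hypothetical. It mirrors numpy.histogram's convention that every bin is half-open except the last, which also includes its right edge, but it is only an illustration and not a replacement for numpy.

```python
def bin_counts(values, edges):
    """Count values per bin, given len(edges) - 1 bins defined by sorted edges."""
    lefts, rights = edges[:-1], edges[1:]  # same slicing as in the quad() call
    counts = [0] * len(lefts)
    for v in values:
        for i, (lo, hi) in enumerate(zip(lefts, rights)):
            is_last = i == len(counts) - 1
            # Bins are [lo, hi), except the last bin, which is [lo, hi]
            if lo <= v < hi or (is_last and v == hi):
                counts[i] += 1
                break
    return counts


edges = [0, 10, 20, 30]                       # 4 edges define 3 bins
print(list(zip(edges[:-1], edges[1:])))       # [(0, 10), (10, 20), (20, 30)]
print(bin_counts([1, 10, 15, 30], edges))     # [1, 2, 1]
```

Each (left, right) pair together with its count as top and 0 as bottom is exactly what one quad glyph needs.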

How to add DocumentBlockExtension in UIMA Ruta

When I try to use DocumentBlock in a script, it is reported as not defined. How do I add it as an additional engine? Can anyone explain in detail how to use DocumentBlock?
It depends on how you call/execute the RutaEngine. You need to add org.apache.uima.ruta.block.DocumentBlockExtension to the value of the configuration parameter additionalExtensions.
Using uimaFIT, this would look like:
AnalysisEngineFactory.createEngineDescription(RutaEngine.class, ...,
    RutaEngine.PARAM_ADDITIONAL_EXTENSIONS, new String[] {
        DocumentBlockExtension.class.getName() });
In a simple Ruta project, if the value is not set automatically in the generated descriptor, it needs to be set in descriptor/BasicEngine.xml, which serves as the template for the generated descriptors used in the launch configuration.
DISCLAIMER: I am a developer of UIMA Ruta

Trouble running code for product market introduction

I'm new here, and I've run into a problem. I took existing code that models 2 new products entering the market, and I am trying to modify it to support 3 products. I have added all the extra variables and conditions needed, but it shows an error on plotting. Could it be the Excel or...? I could use some help with this. If any extra info is needed, I'd be happy to provide it. Thanks in advance for your time!
The error is:
http://prntscr.com/9gy84l
It appears that you added variables and conditions, but did not add an additional plot pen to the plot. You can add the pen by editing the plot itself.

How to use a Unicode font in Enterprise Architect

I am trying to draw a Data Flow Diagram (DFD) in Enterprise Architect (EA). I am using Vietnamese, but Enterprise Architect displays Unicode text correctly only the first time and displays it incorrectly afterwards. If you use EA, try some Unicode text such as "Sinh viên", "Quản trị hệ thống", "Kho đồ án", and work out how to enter Unicode text so that it displays and prints correctly.
Thank you!
(I have read this, but the answer doesn't give a real solution, although it is marked as accepted.)
I use EA 10.x, and your text can be stored in EA without problems.
However, before adding Unicode symbols to your model, you should do the following:
Switch on Jet4 support
Download the empty Jet4 project template from Sparx
Import the existing project into the new one
For details, follow these instructions: http://www.sparxsystems.com/enterprise_architect_user_guide/9.3/projects_and_teams/check_in_languages_other_than_.html