How to process CASes produced by a CAS Multiplier concurrently (UIMA)

I am implementing a UIMA pipeline with a CAS Multiplier and UIMA AS. I have a Segmenter Analysis Engine (a CAS Multiplier) and an Analysis Engine (Annotator A). I created an Aggregate Analysis Engine containing the Segmenter and Annotator A, and then a UIMA AS deployment descriptor, with the intention that the Segmenter produces CASes and Annotator A processes those CASes concurrently. The contents of the aggregate analysis engine descriptor and the deployment descriptor are as follows:
AAE descriptor file:
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
  <primitive>false</primitive>
  <delegateAnalysisEngineSpecifiers>
    <delegateAnalysisEngine key="Segmenter">
      <import location="../cas_multiplier/SimpleTextSegmenter.xml"/>
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="AnnotatorA">
      <import location="AnnotatorA.xml"/>
    </delegateAnalysisEngine>
  </delegateAnalysisEngineSpecifiers>
  <analysisEngineMetaData>
    <name>Segmenter and AnnotatorA</name>
    <description>Splits a document into pieces and runs Annotator on each
      piece independently. All segments are output.</description>
    <configurationParameters/>
    <configurationParameterSettings/>
    <flowConstraints>
      <fixedFlow>
        <node>Segmenter</node>
        <node>AnnotatorA</node>
      </fixedFlow>
    </flowConstraints>
    <capabilities>
      <capability>
        <inputs/>
        <outputs>
          <type allAnnotatorFeatures="true">com.trang.uima.types.Target</type>
          <type allAnnotatorFeatures="true">com.eg.uima.types.IntermediateResult</type>
        </outputs>
        <languagesSupported/>
      </capability>
    </capabilities>
    <operationalProperties>
      <modifiesCas>true</modifiesCas>
      <multipleDeploymentAllowed>true</multipleDeploymentAllowed>
      <outputsNewCASes>true</outputsNewCASes>
    </operationalProperties>
  </analysisEngineMetaData>
</analysisEngineDescription>
Deployment descriptor file:
<?xml version="1.0" encoding="UTF-8"?><analysisEngineDeploymentDescription xmlns="http://uima.apache.org/resourceSpecifier">
<name>SegmenterAndBackTranstion</name>
<description>Deploys Segmenter and BackTranskation with 3 instances of BackTransation</description>
<version/>
<vendor/>
<deployment protocol="jms" provider="activemq">
<casPool numberOfCASes="5" initialFsHeapSize="2000000"/>
<service>
<inputQueue endpoint="SegmentAnBackTranslationQueue" brokerURL="tcp://localhost:61616" prefetch="0"/>
<topDescriptor>
<import location="../../descriptors/langrid_uima/SegmenterAndBackTranslationAE.xml"/>
</topDescriptor>
<analysisEngine async="false">
<scaleout numberOfInstances="5"/>
<casMultiplier poolSize="8" initialFsHeapSize="2000000" processParentLast="false"/>
<asyncPrimitiveErrorConfiguration>
<processCasErrors thresholdCount="0" thresholdWindow="0" thresholdAction="terminate"/>
<collectionProcessCompleteErrors timeout="0" additionalErrorAction="terminate"/>
</asyncPrimitiveErrorConfiguration>
</analysisEngine>
</service>
</deployment>
</analysisEngineDeploymentDescription>
With this setup I ran the pipeline; however, the CASes seem to be processed synchronously, one at a time.
Could anyone tell me what I am doing wrong? Is there a way to process the CASes produced by the CAS Multiplier concurrently?
Thank you very much!

Related

How do I start a job, when the job name is not known at deployment time?

I'm trying to start a batch job which isn't known at deployment time (admin users can define their own jobs via a REST API).
I'm calling:
JobOperator jobOperator = BatchRuntime.getJobOperator();
which resolves to the class org.wildfly.extension.batch.jberet.deployment.JobOperatorService,
which doesn't allow starting unknown jobs.
Javadoc says:
* Note that for each method the job name, or derived job name, must exist for the deployment. The allowed job names and
* job XML descriptor are determined at deployment time.
How can I start jobs that are not determined at deployment time?
Thanks in advance
You can adopt a naming convention for your batch jobs, so that a job name is in a sense known at deployment time and the deployment-time validation is bypassed. For instance, you can package a placeholder job in your application:
<?xml version="1.0" encoding="UTF-8"?>
<job id="submitted-job" xmlns="http://xmlns.jcp.org/xml/ns/javaee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee http://xmlns.jcp.org/xml/ns/javaee/jobXML_1_0.xsd" version="1.0">
<!-- this job is defined and submitted dynamically by the client -->
</job>
At runtime, the admin can then dynamically fill in the job content.
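For instance, a minimal client-side sketch (assuming the placeholder above is packaged as META-INF/batch-jobs/submitted-job.xml; the jobDefinition parameter key is made up for illustration):

import java.util.Properties;

import javax.batch.operations.JobOperator;
import javax.batch.runtime.BatchRuntime;

public class DynamicJobClient {
    public static void main(String[] args) {
        // Start the placeholder by the name that was validated at deployment
        // time; the dynamic job content is conveyed via job parameters and
        // interpreted by your own batch artifacts at runtime.
        JobOperator jobOperator = BatchRuntime.getJobOperator();
        Properties params = new Properties();
        params.setProperty("jobDefinition", "admin-defined-job-42"); // hypothetical key
        long executionId = jobOperator.start("submitted-job", params);
        System.out.println("Started job execution " + executionId);
    }
}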

How to override the keys in a Windows service's .exe.config file through a VSTS release definition

I am working on a VSTS release task for deploying a Windows Services project. Unfortunately, we are not creating any build definition for producing the drop folder.
Instead, my client will provide the drop folder for this project; what I need is to override the keys of an existing .exe.config file at the release level.
For creating the Windows Services Deploy task, I followed this Windows Services Extension.
For example, my drop folder looks like this (screenshot omitted):
Many thanks for this reference article; it's very useful for changing values in a config file using PowerShell commands. I have a doubt about that reference link:
For example, if I had code like this:
<erecruit.tasks>
  <tasks>
    <task name="AA" taskName="AA">
      <parameters>
        <param key="connectionString">Server="XXXX"</param>
      </parameters>
    </task>
  </tasks>
</erecruit.tasks>
How would I change the above connectionString value?
You can use the Tokenizer task from the Release Management Utility tasks extension:
Install the Release Management Utility tasks extension.
Add the Tokenizer with XPath/Regular expressions task to your release definition (specify the source filename and the configuration JSON filename).
Config file sample:
<?xml version="1.0" encoding="utf-8" ?>
<configuration>
<appSettings>
<add key="TestKey1" value="__Token1__" />
<add key="TestKey2" value="__Token2__" />
<add key="TestKey3" value="__Token3__" />
<add key="TestKey4" value="__Token4__" />
</appSettings>
<startup>
<supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5.2" />
</startup>
</configuration>
Configuration JSON file ("Default Environment" is the environment name in the release definition):
{
  "Default Environment": {
    "CustomVariables": {
      "Token2": "value_from_custom2",
      "Token3": "value_from_custom3"
    },
    "ConfigChanges": [
      {
        "KeyName": "/configuration/appSettings/add[@key='TestKey1']",
        "Attribute": "value",
        "Value": "value_from_xpath"
      }
    ]
  }
}
Then the value of the TestKey1 key will be replaced with value_from_xpath, and the values of TestKey2 and TestKey3 will be replaced with value_from_custom2 and value_from_custom3.
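To make the KeyName XPath concrete, here is a rough Java sketch, not the Tokenizer task itself, of what such a replacement boils down to (the config file name is an assumption):

import java.io.File;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class XPathConfigUpdate {
    public static void main(String[] args) throws Exception {
        File config = new File("MyService.exe.config"); // assumed file name
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(config);
        // The same expression as "KeyName" in the JSON above.
        Element add = (Element) XPathFactory.newInstance().newXPath().evaluate(
                "/configuration/appSettings/add[@key='TestKey1']",
                doc, XPathConstants.NODE);
        add.setAttribute("value", "value_from_xpath"); // the "Attribute"/"Value" pair
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(config));
    }
}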
On the other hand, you can use release variables directly if you don't specify a configuration JSON filename.
For example, if there is __TokenVariable1__ in your config file and a TokenVariable1 release/environment variable in your release definition, then __TokenVariable1__ will be replaced by the Tokenizer task.
A related article: Using Tokenization (Token Replacement) for Builds/Releases in vNext/TFS 2015
Update:
You can also do it directly through PowerShell; see Update configuration files using PowerShell.

How to avoid creating a new log file each time my service starts when using log4j2

The following is my log4j2 configuration:
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="trace" name="MyApp" packages="com.swimap.base.launcher.log">
<Appenders>
<RollingFile name="RollingFile" fileName="logs/app-${date:MM-dd-yyyy-HH-mm-ss-SSS}.log"
filePattern="logs/$${date:yyyy-MM}/app-%d{MM-dd-yyyy}-%i.log.gz">
<PatternLayout>
<Pattern>%d %p %c{1.} [%t] %m%n</Pattern>
</PatternLayout>
<Policies>
<SizeBasedTriggeringPolicy size="1 KB"/>
</Policies>
<DefaultRolloverStrategy max="3"/>
</RollingFile>
</Appenders>
<Loggers>
<Root level="trace">
<AppenderRef ref="RollingFile"/>
</Root>
</Loggers>
</Configuration>
The issue is that each time my service starts, a new log file is created even though the old one has not reached the specified size. If the program restarts frequently, I end up with many files ending in '.log' that never get compressed.
The logs I get look like this:
/log4j2/logs
/log4j2/logs/2017-07
/log4j2/logs/2017-07/app-07-18-2017-1.log.gz
/log4j2/logs/2017-07/app-07-18-2017-2.log.gz
/log4j2/logs/2017-07/app-07-18-2017-3.log.gz
/log4j2/logs/app-07-18-2017-20-42-06-173.log
/log4j2/logs/app-07-18-2017-20-42-12-284.log
/log4j2/logs/app-07-18-2017-20-42-16-797.log
/log4j2/logs/app-07-18-2017-20-42-21-269.log
Can someone tell me how to append to the existing log file when I start my program? Many thanks for anything that brings me closer to the answer!
I suppose your problem is that you have fileName="logs/app-${date:MM-dd-yyyy-HH-mm-ss-SSS}.log" in your log4j2 configuration file.
This fileName template means that log4j2 will create a log file whose name contains the current date plus hours, minutes, seconds, and milliseconds.
You should probably remove the HH-mm-ss-SSS section; this gives you a daily rolling file and does not create a new file on every app restart.
You can play with the template and choose the format you need.
If you want only one log file forever, then use a constant fileName, like fileName="app.log".
It's not hard to implement this. There is an interface DirectFileRolloverStrategy; implement the method below:
public String getCurrentFileName(RollingFileManager manager)
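A rough, untested sketch of the idea (the path is an assumption, and depending on your log4j2 version the interface and the RolloverStrategy contract may require further methods):

import org.apache.logging.log4j.core.appender.rolling.DirectFileRolloverStrategy;
import org.apache.logging.log4j.core.appender.rolling.RollingFileManager;

// Always hand log4j2 the same file name, so a restarted process keeps
// appending to the existing file instead of opening a new timestamped one.
public class FixedNameStrategy implements DirectFileRolloverStrategy {
    @Override
    public String getCurrentFileName(RollingFileManager manager) {
        return "logs/app.log"; // constant name across restarts
    }
}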
Maybe someone else has met the same problem and this can help them.

Interface to work with the Task Scheduler

I'm developing a simple JScript script to be run by Windows Script Host.
This script needs to read some data from the Task Scheduler. I have no clue how to get started.
I've already implemented similar functionality in C++ using the Task Scheduler 2.0 interfaces.
Can I use those interfaces in JScript somehow?
No, you can't use the Task Scheduler 2.0 interfaces from JScript.
What you can do however, is read the XML files that the task scheduler creates. They contain all properties of all defined tasks.
They reside in %windir%\system32\tasks (you need Administrator permissions to read this directory and its contents).
Here is an example of such a file; it's very straightforward XML:
<Task version="1.1" xmlns="http://schemas.microsoft.com/windows/2004/02/mit/task">
<RegistrationInfo>
<Author>SYSTEM</Author>
<Description>Some text here...</Description>
</RegistrationInfo>
<Triggers>
<LogonTrigger>
<Enabled>true</Enabled>
</LogonTrigger>
<CalendarTrigger>
<Enabled>true</Enabled>
<StartBoundary>2015-07-16T05:32:00</StartBoundary>
<ScheduleByDay>
<DaysInterval>1</DaysInterval>
</ScheduleByDay>
</CalendarTrigger>
</Triggers>
<Settings>
<Enabled>true</Enabled>
<ExecutionTimeLimit>PT0S</ExecutionTimeLimit>
<Hidden>false</Hidden>
<WakeToRun>false</WakeToRun>
<DisallowStartIfOnBatteries>false</DisallowStartIfOnBatteries>
<StopIfGoingOnBatteries>false</StopIfGoingOnBatteries>
<RunOnlyIfIdle>false</RunOnlyIfIdle>
<Priority>5</Priority>
<IdleSettings>
<Duration>PT600S</Duration>
<WaitTimeout>PT3600S</WaitTimeout>
<StopOnIdleEnd>false</StopOnIdleEnd>
<RestartOnIdle>false</RestartOnIdle>
</IdleSettings>
</Settings>
<Principals>
<Principal id="Author">
<UserId>System</UserId>
<RunLevel>HighestAvailable</RunLevel>
<LogonType>InteractiveTokenOrPassword</LogonType>
</Principal>
</Principals>
<Actions Context="Author">
<Exec>
<Command>C:\path\to\executable.exe</Command>
<Arguments>/args</Arguments>
</Exec>
</Actions>
</Task>
List of things to find out:
How to run a script with elevated permissions.
How to navigate a directory structure using the FileSystemObject.
How to open XML files using the MSXML2 COM objects.
How to use XPath to navigate those XML documents.
How to deal with a default XML namespace (this is more important than it sounds - you won't get any results from XPath until you do this part correctly).
If necessary for your task, find out how ISO 8601 time period notation works so you can decode values like PT600S.
Luckily, for all of those things there are any number of examples available (on this site and elsewhere) to get you started.
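To tie several of those points together, here is a minimal JScript sketch (run elevated with cscript; the task path and the t: prefix are assumptions):

// Load one task definition and print its executable command(s).
var xml = new ActiveXObject("MSXML2.DOMDocument.6.0");
xml.async = false;
xml.setProperty("SelectionLanguage", "XPath");
// Bind a prefix to the default namespace; without this, XPath finds nothing.
xml.setProperty("SelectionNamespaces",
    "xmlns:t='http://schemas.microsoft.com/windows/2004/02/mit/task'");
if (xml.load("C:\\Windows\\System32\\Tasks\\MyTask")) {
    var commands = xml.selectNodes("/t:Task/t:Actions/t:Exec/t:Command");
    for (var i = 0; i < commands.length; i++) {
        WScript.Echo(commands.item(i).text);
    }
}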

Load jobs at startup to Spring Batch Admin

The Spring Batch Admin documentation mentions that jobs will be loaded if the job configuration files are located on the classpath under META-INF/spring/batch/jobs/*.xml:
Documentation
In the spring-batch-admin-sample that comes with STS, the jobs are loaded when the admin web application is deployed, from the file classpath:\META-INF\batch\module-context.xml, and it is bootstrapped at deployment. Not sure how that works...
While I can load the job configuration by uploading it through the user interface at http://localhost:8080/simple-batch-admin/configuration, some of my custom beans are not autowired for some reason. So the desired behavior would be to load all the jobs when Admin is deployed.
Thank you in advance.
After several rounds of digging, I was able to load the job file. I had to place my job file in the /META-INF/spring/batch/jobs/ folder, not /META-INF/batch/. Also, in order for my jobLauncher, jobRepository, dataSource, etc. to be discovered at load time, I had to put them under src/main/resources/META-INF/spring/batch/bootstrap/ (layout sketched below).
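For illustration, the resulting layout looks like this (file names are assumptions):

src/main/resources/
  META-INF/spring/batch/jobs/my-job.xml             <-- job definitions, loaded at deployment
  META-INF/spring/batch/bootstrap/env-context.xml   <-- jobLauncher, jobRepository, dataSource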
All of this is because of two files in spring-batch-admin-resources-1.2.0.RELEASE.jar, under org.springframework.batch.admin.web.resources:
servlet-config.xml
<import resource="classpath*:/META-INF/spring/batch/servlet/resources/*.xml" />
<import resource="classpath*:/META-INF/spring/batch/servlet/manager/*.xml" />
<import resource="classpath*:/META-INF/spring/batch/servlet/override/*.xml" />
which allows me to add menus and controllers under src/main/resources/META-INF/spring/batch/servlet/override/*.xml,
and
webapp-config.xml
<import resource="classpath*:/META-INF/spring/batch/bootstrap/**/*.xml" />
<import resource="classpath*:/META-INF/spring/batch/override/**/*.xml" />
where I put my launch context.