How to set ExitStatus of aborted job to "reason for abort" message? - spring-batch

I need to abort a running job and set its ExitStatus to "reason for abort".
It's quite easy to abort a running job:
try {
    if (jobOperator.stop(jobExecutionId)) {
        jobOperator.abandon(jobExecutionId);
        log.info("The job with JobId: " + jobExecutionId + " was canceled.");
    }
} catch (Exception e) {
    // stop() and abandon() throw checked exceptions (e.g. NoSuchJobExecutionException)
    log.error("Failed to stop job " + jobExecutionId, e);
}
And yes, I know, the only way to set the ExitStatus is to use afterJob in a JobExecutionListener (this is what ends up as EXIT_MESSAGE in BATCH_JOB_EXECUTION).
But how can I transfer the "reason for abort" message from the code that aborts the job to the JobExecutionListener's afterJob?
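For reference, a minimal sketch of such a listener (the hard-coded reason string is exactly the part I don't know how to feed in from the aborting code):
public class AbortExitStatusListener implements JobExecutionListener {
    @Override
    public void beforeJob(JobExecution jobExecution) {
        // no-op
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        if (jobExecution.getStatus() == BatchStatus.STOPPED) {
            // Whatever is set here lands in BATCH_JOB_EXECUTION.EXIT_MESSAGE
            jobExecution.setExitStatus(new ExitStatus("ABORTED", "reason for abort"));
        }
    }
}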

how can I transfer "reason for abort" message from the code that abort the job to JobExecutionListener's AfterJob?
There is no way to do that. The only information that you can pass from the "outside" (through the job operator) to the "inside" (the job execution and its listeners) is the stop signal.
Everything else can be done on the job execution itself after stopping or aborting it, something like:
try {
    if (jobOperator.stop(jobExecutionId)) {
        jobOperator.abandon(jobExecutionId);
        // jobExplorer and jobRepository are the standard Spring Batch beans,
        // injected alongside the jobOperator
        JobExecution jobExecution = jobExplorer.getJobExecution(jobExecutionId);
        jobExecution.setExitStatus(new ExitStatus("ABORTED", "reason for abort"));
        jobRepository.update(jobExecution);
        log.info("The job with JobId: " + jobExecutionId + " was canceled.");
    }
} catch (Exception e) {
    log.error("Failed to stop job " + jobExecutionId, e);
}

Related

Kafka transaction: Receiving CONCURRENT_TRANSACTIONS on AddPartitionsToTxnRequest

I am trying to publish a message to each of 16 Kafka partitions, within a single transaction, on a cluster of 7 brokers.
The flow is like this:
open transaction
write a message to 16 partitions
commit transaction
sleep 25 ms
repeat
Sometimes the transaction takes over 1 second to complete, with an average of 50 ms.
After enabling trace logging on the producer's side, I noticed the following error:
TRACE internals.TransactionManager [kafka-producer-network-thread | producer-1] - [Producer clientId=producer-1, transactionalId=cma-2]
Received transactional response AddPartitionsToTxnResponse(errors={modelapp-ecb-0=CONCURRENT_TRANSACTIONS, modelapp-ecb-9=CONCURRENT_TRANSACTIONS, modelapp-ecb-10=CONCURRENT_TRANSACTIONS, modelapp-ecb-11=CONCURRENT_TRANSACTIONS, modelapp-ecb-12=CONCURRENT_TRANSACTIONS, modelapp-ecb-13=CONCURRENT_TRANSACTIONS, modelapp-ecb-14=CONCURRENT_TRANSACTIONS, modelapp-ecb-15=CONCURRENT_TRANSACTIONS, modelapp-ecb-1=CONCURRENT_TRANSACTIONS, modelapp-ecb-2=CONCURRENT_TRANSACTIONS, modelapp-ecb-3=CONCURRENT_TRANSACTIONS, modelapp-ecb-4=CONCURRENT_TRANSACTIONS, modelapp-ecb-5=CONCURRENT_TRANSACTIONS, modelapp-ecb-6=CONCURRENT_TRANSACTIONS, modelapp-ecb-7=CONCURRENT_TRANSACTIONS, modelapp-ecb-8=CONCURRENT_TRANSACTIONS}, throttleTimeMs=0)
for request (type=AddPartitionsToTxnRequest, transactionalId=cma-2, producerId=59003, producerEpoch=0, partitions=[modelapp-ecb-0, modelapp-ecb-9, modelapp-ecb-10, modelapp-ecb-11, modelapp-ecb-12, modelapp-ecb-13, modelapp-ecb-14, modelapp-ecb-15, modelapp-ecb-1, modelapp-ecb-2, modelapp-ecb-3, modelapp-ecb-4, modelapp-ecb-5, modelapp-ecb-6, modelapp-ecb-7, modelapp-ecb-8])
The Kafka producer retries sending AddPartitionsToTxnRequest(s) several times until it succeeds, but this leads to delays.
The code looks like this:
Properties producerProperties = PropertiesUtil.readPropertyFile(_producerPropertiesFile);
_producer = new KafkaProducer<>(producerProperties);
_producer.initTransactions();
_producerService = Executors.newSingleThreadExecutor(new NamedThreadFactory(getClass().getSimpleName()));
_producerService.submit(() -> {
    while (!Thread.currentThread().isInterrupted()) {
        try {
            _producer.beginTransaction();
            for (int partition = 0; partition < _numberOfPartitions; partition++)
                _producer.send(new ProducerRecord<>(_producerTopic, partition, KafkaRecordKeyFormatter.formatControlMessageKey(_messageNumber, token), EMPTY_BYTE_ARRAY));
            _producer.commitTransaction();
            _messageNumber++;
            Thread.sleep(_timeBetweenProducedMessagesInMillis);
        } catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException | UnsupportedVersionException e) {
            // fatal errors: the producer must be closed
            closeProducer();
            break;
        } catch (KafkaException e) {
            // transient errors: abort and let the loop retry
            _producer.abortTransaction();
        } catch (InterruptedException e) {...}
    }
});
Looking at the broker's code, it seems there are two cases in which this error is thrown, but I cannot tell how I end up in either of them:
object TransactionCoordinator {
  ...
  def handleAddPartitionsToTransaction(...): Unit = {
    ...
    if (txnMetadata.pendingTransitionInProgress) {
      // return a retriable exception to let the client backoff and retry
      Left(Errors.CONCURRENT_TRANSACTIONS)
    } else if (txnMetadata.state == PrepareCommit || txnMetadata.state == PrepareAbort) {
      Left(Errors.CONCURRENT_TRANSACTIONS)
    }
    ...
  }
  ...
}
Thanks in advance for help!
Later edit:
Enabling trace logging on the broker, we were able to see that the broker sends the END_TXN response to the producer before the transaction reaches the CompleteCommit state. The producer is thus able to start a new transaction, which the broker rejects while the previous one is still in the PrepareCommit -> CompleteCommit transition.
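Since CONCURRENT_TRANSACTIONS is retriable, the visible latency comes from the retry loop's pacing. One knob worth experimenting with (a sketch, not a fix for the PrepareCommit -> CompleteCommit race itself) is the producer's retry backoff:
// Hypothetical tweak to the properties loaded above: retry.backoff.ms paces the
// AddPartitionsToTxnRequest retries seen in the trace log (the default is 100 ms)
producerProperties.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, "20");
_producer = new KafkaProducer<>(producerProperties);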

How to catch/capture "javax.net.ssl.SSLHandshakeException: Failed to create SSL connection" while sending a message over the Java Vert.x event bus

I am trying to use SSL over the event bus. To test the failure case, I tried sending a message to the event bus from another verticle in the same cluster, passing a different keystore.
I am getting the exception below on the console, but it does not fail the replyHandler, so my code is not able to detect the SSL exception.
My code:
eb.request("ping-address", "ping!", new DeliveryOptions(), reply -> {
    try {
        if (reply.succeeded()) {
            System.out.println("Received reply " + reply.result().body());
        } else {
            System.out.println("An exception " + reply.cause().getMessage());
        }
    } catch (Exception e) {
        System.out.println("An error occurred" + e.getCause());
    }
});
Exception on console:
javax.net.ssl.SSLHandshakeException: Failed to create SSL connection
at io.vertx.core.net.impl.ChannelProvider$1.userEventTriggered(ChannelProvider.java:109)
at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:341)
at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:327)
at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:319)
at io.netty.handler.ssl.SslHandler.handleUnwrapThrowable(SslHandler.java:1249)
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1230)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1271)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:505)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:444)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:283)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1422)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:931)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)
at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.lang.Thread.run(Thread.java:813)
Caused by: javax.net.ssl.SSLException: Received fatal alert: bad_certificate
at sun.security.ssl.Alerts.getSSLException(Alerts.java:208)
at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1647)
at sun.security.ssl.SSLEngineImpl.fatal(SSLEngineImpl.java:1615)
at sun.security.ssl.SSLEngineImpl.recvAlert(SSLEngineImpl.java:1781)
at sun.security.ssl.SSLEngineImpl.readRecord(SSLEngineImpl.java:1070)
at sun.security.ssl.SSLEngineImpl.readNetRecord(SSLEngineImpl.java:896)
at sun.security.ssl.SSLEngineImpl.unwrap(SSLEngineImpl.java:766)
at javax.net.ssl.SSLEngine.unwrap(SSLEngine.java:624)
at io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:282)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1329)
at io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1224)
... 20 more
The handler does fail, but only with a timeout after 30 seconds:
Timed out after waiting 30000(ms) for a reply. address: __vertx.reply.8419a431-d633-4ba8-a12e-c41fd5a4f37a, repliedAddress: ping-address
I want to capture the SSL exception immediately and handle it. Please guide me on how I can catch this exception.
I also tried the code below. It handles the failure case, but I never get a reply result from the called event bus; the value is always null.
MessageProducer<Object> ms = eb.sender("ping-address");
ms.write("ping!", reply -> {
    if (reply.succeeded()) {
        reply.map(value -> {
            System.out.println("Received reply " + value);
            return reply;
        });
    } else {
        System.out.println("No reply");
        System.out.println("An exception : " + reply.cause().getMessage());
    }
});
You can't catch this exception because the Vert.x clustered EventBus implementation buffers messages when the nodes are not connected together. The message could be sent later if the problem is only temporary.
If you want to be notified earlier, you could set a lower timeout in DeliveryOptions.
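For example (a minimal sketch; the 5-second value is an arbitrary illustration):
DeliveryOptions options = new DeliveryOptions().setSendTimeout(5000); // default is 30000 ms
eb.request("ping-address", "ping!", options, reply -> {
    if (reply.succeeded()) {
        System.out.println("Received reply " + reply.result().body());
    } else {
        // Fails faster, but the cause is still the timeout, not the underlying SSL error
        System.out.println("An exception: " + reply.cause().getMessage());
    }
});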

How to get console output of downstream job in upstream job?

I'm trying to find a workaround because my first question is still unanswered:
can't run Start-Job with credentials from Jenkins
I have job A. Job A starts a PowerShell script on a server and shows some output.
I also have a pipeline B that runs multiple copies of job A against different servers.
Here is the Groovy code:
stage 'Copy sources from Git'
build job: 'DeploymentJobs/1_CopySourcesFromGit'
stage 'Deploy to servers'
def servers = env.SERVERLIST.split('\n')
def steps = [:]
for (int i = 0; i < servers.size(); i++) {
    def server = servers[i]
    def stepName = "running ${server}"
    steps[stepName] = {->
        echo server
        build job: 'DeploymentJobs/2_DeployToServer', parameters:
            [booleanParam(name: 'REBOOTAFTER', value: Boolean.valueOf(REBOOTAFTER)),
             string(name: 'SERVERNAME', value: server)]
    }
}
parallel steps
In the output of pipeline B, I only see that N copies of job A were started, but not their output.
I want to see the PowerShell output from all instances of job A in the console output of pipeline B.
I have no idea how to do this. Is it possible?
A shorter alternative:
def result = build job: 'job_name', wait: true
println result.getRawBuild().getLog()
It will be necessary to whitelist both methods (under Manage Jenkins > In-process Script Approval).
Edit: since you don't want to wait for the builds to run, you could add this at the end of your job (or at some point after all triggered builds should have finished), where number_of_builds in your case would be servers.size():
def job = Jenkins.getInstance().getItemByFullName('job_A_name')
def last_build = job.getLastBuild().getNumber()
def first_build = last_build - number_of_builds
(first_build..last_build).each {
    println "Log of build $it"
    def build = job.getBuildByNumber(it)
    println build.log
}
If you really want to be sure the builds you're getting are the ones triggered by your job, you can get the build cause from the build object.
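For example, something along these lines inside the loop above (a sketch relying on the standard hudson.model.Cause API; these calls would also need whitelisting):
def checked = job.getBuildByNumber(it)
// Run.getCause returns the first cause of the given type, or null
def upstream = checked.getCause(hudson.model.Cause$UpstreamCause)
// Only keep builds that this pipeline actually triggered
if (upstream != null && upstream.upstreamProject == env.JOB_NAME) {
    println checked.log
}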
I solved my problem by adding this into the loop. I could get the logs of the downstream jobs into the upstream job, display them, and work with them:
def checkjob = build job: 'job name', parameters: [ any params here]
checklog = Jenkins.getInstance().getItemByFullName('job name').getBuildByNumber(checkjob.getNumber()).log
println checklog
Here's a silly example:
$myJobs = @()
$myJobs += Start-job -ScriptBlock { while (1) {Get-Item 'c:\*'; sleep 5}}
$myJobs += Start-job -ScriptBlock { while (1) {Get-Item 'c:\windows\*'; sleep 5}}
try {
    while (1)
    {
        # Pull any queued output from jobs that still have some
        $myJobs | Where-Object { $_.HasMoreData } | Receive-Job
    }
} finally {
    $myJobs | Stop-Job
    $myJobs | Remove-Job
}
Anything the job writes to its output pipeline is queued. The HasMoreData property indicates that the job has output available to read. The parent receives the output using Receive-Job. By default it displays in the console, but you can receive the output in the parent process and do further processing.
If this isn't what you're going for you'll have to be more specific in your question. Provide a little of the code you've tried so far.
def jobLog = Jenkins.getInstance().getItemByFullName('job-path/testjob').getLastSuccessfulBuild().log
println(jobLog)

Throw an exception in PowerShell, nesting the original error

I'm a C# developer who is trying to build something useful using PowerShell. That's why I keep trying to use well-known idioms from the .NET world in PowerShell.
I'm writing a script that has different layers of abstraction: database operations, file manipulation, etc. At some point I would like to catch an error and wrap it into something more meaningful for the end user. This is a common pattern in C#/Java/C++ code:
Function LowLevelFunction($arg)
{
    # Doing something very useful here!
    # But this operation could throw
    if (!$arg) { throw "Ooops! Can't do this!" }
}
Now, I would like to call this function and wrap an error:
Function HighLevelFunction
{
    Try
    {
        LowLevelFunction
    }
    Catch
    {
        throw "HighLevelFunction failed with an error!`nPlease check inner exception for more details!`n$_"
    }
}
This approach is almost what I need, but HighLevelFunction throws a new error, and the root cause of the original error is lost!
In C# I can always throw a new exception and provide the original exception as an inner exception. That way HighLevelFunction can communicate its error in a form more meaningful to its callers while still providing the inner details for diagnostic purposes.
The only way to print the original exception in PowerShell is to use the $Error variable, which stores all the errors. This is OK, but it requires the user of my script (myself, for now) to do more than I would like.
So the question is: is there any way to raise an exception in PowerShell and provide the original error as an inner error?
You can throw a new exception in your catch block and specify the original exception as the inner exception:
# Function that will throw a System.IO.FileNotFoundException
function Fail-Read {
    [System.IO.File]::ReadAllLines( 'C:\nonexistant' )
}

# Try to run the function
try {
    Fail-Read
} catch {
    # Throw a new exception, specifying the inner exception
    throw ( New-Object System.Exception( "New Exception", $_.Exception ) )
}

# Check the exception here, using GetBaseException()
$error[0].Exception.GetBaseException().GetType().ToString()
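The whole chain stays navigable from the resulting error record (a quick illustration, run after catching the rethrown exception):
# Walk the chain: the new message, the wrapped error, and the root cause
$error[0].Exception.Message                       # 'New Exception'
$error[0].Exception.InnerException.Message        # message of the error that was wrapped
$error[0].Exception.GetBaseException().GetType()  # type of the innermost (root) exception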
Unfortunately when throwing a new exception from the catch block as described by this answer, the script stack trace (ErrorRecord.ScriptStackTrace) will be reset to the location of the throw. This means the root origin of the inner exception will be lost, making debugging of complex code much harder.
There is an alternative solution that uses ErrorRecord.ErrorDetails to define a high-level message and $PSCmdlet.WriteError() to preserve the script stack trace. It requires that the code is written as an advanced function cmdlet. The solution doesn't use nested exceptions, but still fulfills the requirement "to catch an error and wrap it into something more meaningful for the end user".
#------ High-level function ----------------------------------------------
function Start-Foo {
    [CmdletBinding()] param()
    try {
        # Some internal code that throws an exception
        Get-ChildItem ~foo~ -ErrorAction Stop
    }
    catch {
        # Define a more user-friendly error message.
        # This sets ErrorRecord.ErrorDetails.Message
        $_.ErrorDetails = 'Could not start the Foo'
        # Rethrows (if $ErrorActionPreference is 'Stop') or reports the error normally,
        # preserving $_.ScriptStackTrace.
        $PSCmdlet.WriteError( $_ )
    }
}

#------ Code that uses the high-level function ---------------------------
$DebugPreference = 'Continue' # Enable the stack trace output
try {
    Start-Foo -ErrorAction Stop
}
catch {
    $ErrorView = 'NormalView' # to see the original exception info
    Write-Error -ErrorRecord $_
    ''
    Write-Debug "`n--- Script Stack Trace ---`n$($_.ScriptStackTrace)" -Debug
}
Output:
D:\my_temp\ErrorDetailDemo.ps1 : Could not start the Foo
+ CategoryInfo : ObjectNotFound: (D:\my_temp\~foo~:String) [Write-Error], ItemNotFoundException
+ FullyQualifiedErrorId : PathNotFound,ErrorDetailDemo.ps1
DEBUG:
--- Script Stack Trace ---
at Start-Foo, C:\test\ErrorDetailDemo.ps1: line 5
at , C:\test\ErrorDetailDemo.ps1: line 14
Our high-level error message 'Could not start the Foo' hides the error message of the underlying exception, but no information is lost (you could access the original error message through $_.Exception.Message from within the catch handler).
Note: There is also a field ErrorDetails.RecommendedAction, which you could set as you see fit. For simplicity I didn't use it in the sample code, but you could set it like this: $_.ErrorDetails.RecommendedAction = 'Install the Foo'.

Perl process crashes after handling signal

I'm trying to make a simple Perl daemon reread its config file on SIGHUP.
I tried:
use sigtrap qw/handler rereadconf HUP/;
but after the rereadconf handler executes, the process stops.
I also tried:
$SIG{HUP} = \&rereadconf;

sub rereadconf {
    # ... my code ...
    print "procedure executed\n";
}
but the result was the same: after the handler executes, the program stops.
So how can I make the process continue execution after handling the signal?
Your program exits because accept returns false when it gets interrupted by a signal. You want:
while (1) {
    my $client = $srv->accept();
    if (!$client) {
        next if $!{EINTR};  # accept was interrupted by a signal; just retry
        # basename comes from File::Basename; die already writes to STDERR
        die(sprintf("[%s] accept: %s\n", basename($0), $!));
    }
    print(STDERR "accepted new client\n");
    serve_client($client);
}
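Putting that together with the handler from the question, a minimal self-contained sketch (the port number and the config-reload body are placeholders):
use strict;
use warnings;
use File::Basename qw(basename);
use IO::Socket::INET;

my $srv = IO::Socket::INET->new(Listen => 5, LocalPort => 12345)
    or die "listen: $!";

$SIG{HUP} = sub {
    # reread the config file here
    print "procedure executed\n";
};

while (1) {
    my $client = $srv->accept();
    if (!$client) {
        next if $!{EINTR};  # interrupted by SIGHUP; keep serving
        die(sprintf("[%s] accept: %s\n", basename($0), $!));
    }
    print STDERR "accepted new client\n";
    # serve_client($client);
    close $client;
}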