wso2 wait loop doesn't work after restart - persistence

I've developed polling logic in a BPEL process on WSO2 BPS 3.0.0 connected to a PostgreSQL 9 DB.
It looks like this:
<bpel:repeatUntil name="RepeatUntilIncidentCompleted">
  <bpel:sequence name="CheckIncidentStatus">
    <bpel:wait name="Wait">
      <bpel:for expressionLanguage="urn:oasis:names:tc:wsbpel:2.0:sublang:xpath1.0"><![CDATA['PT1M']]></bpel:for>
    </bpel:wait>
    <!-- invoke a service, copy status to a vStatus variable -->
  </bpel:sequence>
  <bpel:condition expressionLanguage="urn:oasis:names:tc:wsbpel:2.0:sublang:xpath1.0"><![CDATA[$vStatus=36]]></bpel:condition>
</bpel:repeatUntil>
I created a process instance and this loop worked fine.
Later I restarted the WSO2 BPS server. At the moment of the restart the process instance was in the loop, but after the restart the loop was no longer running. The process is still marked as active in the Carbon console.
I've added the in-memory=false property to deploy.xml, but it didn't help.
I may have missed some configuration, but there could also be a persistence problem with such a loop (probably in Apache ODE).
Does anyone know a solution to this problem? Thanks in advance.

I've discovered that:
1. All sleep operations that you put in a WSO2 BPEL process are represented in the ode_job table. The ts attribute contains the wake-up time.
2. After a restart of the BPS server, delayed sleep operations are not continued (a sleep operation is delayed when wake-up time < current time - offset).
3. After a restart of the BPS server, all non-delayed sleep operations are continued properly.
Now let's say that:
- You have a bpel process instance that waits in a wait operation. The wake up time is X.
- You stop the bps server, and start it again after X.
Because of 2. the process instance won't continue after restart. This includes the loop I've described earlier.
My workaround to the problem:
Every time the WSO2 BPS server is restarted, I execute an SQL script on the database that updates the wake-up attribute of the sleep operations (the ts column in the ode_job table). The wake-up times are set to some point in the near future.
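The workaround can be sketched as a small helper that builds the UPDATE statement. This is only an illustration, not the script actually used: it assumes that ts holds the wake-up time as epoch milliseconds and that a DB-API driver with %s placeholders (psycopg2, for example) executes it. The function name and the one-minute delay are hypothetical.

```python
def build_reschedule_statement(now_ms, delay_ms=60_000):
    """Build an UPDATE that moves overdue ode_job rows into the near future.

    Assumption: ode_job.ts stores the wake-up time as epoch milliseconds.
    Returns the SQL text and its parameters as a tuple.
    """
    sql = "UPDATE ode_job SET ts = %s WHERE ts < %s"
    # First parameter: the new wake-up time; second: the cutoff for "overdue".
    return sql, (now_ms + delay_ms, now_ms)
```

With a DB-API cursor this would be run as cursor.execute(*build_reschedule_statement(int(time.time() * 1000))), followed by a commit.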
I don't know whether the behaviour in 2. and 3. can be changed by configuration. I couldn't find any documentation about it. Some code analysis is needed here. To make things worse, WSO2 uses its own Apache ODE branch, so you can't just update the Apache ODE library.
I suspect that there can be two reasons for the behaviour described in 2.:
- delayed sleep operations are dropped
- delayed sleep operations are executed right after the restart, but the process definitions aren't loaded yet.

REST API does not return answer back after more than 3600 seconds of processing

We have spent several weeks trying to fix an issue that occurs in the customer's production environment and does not occur in our test environment.
After several analyses, we have found that this error occurs only when one condition is met: processing times greater than 3600 seconds in the API.
The situation is the following:
SAP is connected to a server running Windows Server 2016 and IIS 10.0, where we host an API that is responsible for interacting with a DB used by an external system.
The process that we execute sends data from SAP to the API. The API combines that data with the data it reads from the external system's DB, processes it, and then updates the DB.
This process finishes without problems when the processing time in the API is less than 3600 seconds.
On the other hand, when the processing time is greater than 3600 seconds, the API generates the response correctly, and the server tries to return the response to SAP, but it is not possible.
Below I show an example of a server log entry when it tries to return a response after more than 3600 seconds of API processing. As you can see, a 995 error occurs: (I have censored some parts)
Any idea where the error could come from?
We have compared IIS configurations in Production and Test. We have also reviewed the parameters of the SAP system in Production and Test and we have not found anything either.
I remain at your disposal to provide any type of additional information that may be useful for solving the problem.
UPDATE 1 - 02/09/2022
After enabling FRT (Failed Request Tracing) on IIS for 200 response codes, looking at the event log of the request that is causing the error, we have seen this event at the end:
ErrorCode="The I/O operation has been aborted because of either a thread exit or an application request. (0x800703e3)"
Any information about what could be causing this error?
UPDATE 2 - 02/09/2022
Comparing configurations from customer's environment and our test environment:

There is a firewall between the SAP server and the IIS server, configured with the default TCP idle timeout (3600 seconds). The error does not occur in the test environment because there is no firewall there.
Establishing a firewall policy that specifies a custom idle timeout for this service (7200 seconds) will solve the problem.

sc-win32 status 995, the I/O operation has been aborted because of
either a thread exit or an application request.
Please check the setting of minBytesPerSecond configuration parameter in IIS. The default "minBytesPerSecond" is 240.
Specifies the minimum throughput rate, in bytes, that HTTP.sys
enforces when it sends a response to the client. The minBytesPerSecond
attribute prevents malicious or malfunctioning software clients from
using resources by holding a connection open with minimal data. If the
throughput rate is lower than the minBytesPerSecond setting, the
connection is terminated.

How to restart an exe when it exits in Windows 10?

I have a process in Windows that I run at startup. If that process somehow gets killed or stops, I need to restart it automatically on Windows 10.
Is there any way to do this? The process is an HTTP server, and if it stops for any reason I need to restart it. I have tried writing a PowerShell script that checks the tasklist status of the process and restarts it if it is not found, but that is not a good approach. Please suggest a better way.
I have a Golang exe; in a particular scenario my process gets killed or stops, and I need to start it up again automatically. This has to happen immediately after the exe is killed. What is the best way to achieve this?

I will give you a brief rundown. You can enable Audit Process Termination in local group policy of the machine as shown below. In your case, success audits would be enough. Please note that the pic is for Windows 7. It may change with OS.
Now every time a process gets terminated, a success event will be generated and written to the security eventlog.
This allows you to create a scheduled task that triggers when this event is generated and calls a script that runs the process again. Simple, right?
Well, you might have some trouble setting that task up, especially when you want to pass details about the generating event to the script. This should help you get through that.

You can use Task Scheduler for this purpose. It has a "restart on failure" option; when it is selected, the task is restarted whenever your process fails.
Reference :- https://social.technet.microsoft.com/Forums/windowsserver/en-US/4545361c-cc1f-4505-a0a1-c2dcc094109a/restarting-scheduled-task-that-has-failed?forum=winserverManagement
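For comparison with the scheduler-based suggestions above, here is a minimal sketch of a different technique: a small parent process that blocks on the child and relaunches it as soon as it exits, with no polling of the task list. It is not taken from either answer. The function name and parameters are hypothetical, and the supervisor itself still has to be started at boot and kept alive by some other means.

```python
import subprocess
import time

def supervise(command, max_restarts=None, backoff_seconds=1.0):
    """Run command, wait for it to exit, then start it again.

    max_restarts=None restarts forever; otherwise the command runs once
    plus at most max_restarts more times. Returns the exit codes seen.
    """
    exit_codes = []
    while max_restarts is None or len(exit_codes) <= max_restarts:
        child = subprocess.Popen(command)
        # wait() blocks until the child exits, so a kill is noticed at once.
        exit_codes.append(child.wait())
        # Short pause so a child that crashes instantly does not spin the CPU.
        time.sleep(backoff_seconds)
    return exit_codes
```

A call such as supervise([r"C:\path\to\server.exe"]) (placeholder path) would keep relaunching the HTTP server until the supervisor itself is stopped.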

ibm bpm - Halting of the Service timed out

When halting a service in the Process Admin console, using the Halt service button, I get the following message after a while:
Halting of the Service timed out. Most probably, the Process Server failed to halt the Service within short time period.
From what I can see, it happens only when the service has few steps (say, tens); when it has more steps (hundreds), it works well.
Can somebody see the cause of this and tell me what to do? Thanks.

If a hung or long-running (infinite loop) service cannot be halted within a short time, the Process Admin console does not allow you to halt it.
This is the default behavior of halt. The only way out is to restart the server to reinstate the BPM environment.

Wait for system to sync time before performing another task

I'm using a Raspberry Pi, and upon startup it's sending an e-mail with the time and an IP address. The problem is that the time is not correct, it's the time from last time the system was shut down. When I log in through ssh and do a date command, I get the correct time. In other words, the e-mail is sent before the system has updated its time.
I was thinking of automatically running ntpdate on boot, but after reading up on it it seems like a bad idea due to the many risks of error.
So, can I somehow wait until the time has been updated before continuing in a script?

There is a tool included in the ntp reference implementation for this very purpose. The utility has a rather cryptic name: ntp-wait. Five minutes with the man page and you will be all set.
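A minimal sketch of how a startup script could block on ntp-wait before sending the e-mail. It assumes ntp-wait is on the PATH and accepts -n (number of tries) and -s (seconds between tries), as described in its man page. The function names and the injectable run parameter are hypothetical and exist only to make the sketch testable.

```python
import subprocess

def ntp_wait_command(tries=100, sleep_seconds=6):
    """Build the ntp-wait command line: -n is the number of tries,
    -s the number of seconds to sleep between tries."""
    return ["ntp-wait", "-n", str(tries), "-s", str(sleep_seconds)]

def wait_for_time_sync(run=subprocess.run, **kwargs):
    """Return True if ntp-wait exits with status 0, which it does once
    the clock is synchronized; any other status is treated as failure."""
    result = run(ntp_wait_command(**kwargs))
    return result.returncode == 0
```

The boot script would then send the e-mail only after wait_for_time_sync() returns True, and fall back to whatever behaviour is acceptable if it returns False.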

Windows Service "Starting"

I have a critical windows service that I need for my web application.
Unfortunately, the windows service does not start properly, but remains in a status of "Starting" for about 7 minutes and 38 seconds, and then fails.
My web application works fine when the service is in the "Starting" mode.
I have a windows scheduled task that runs every minute to restart the service if necessary.
net start "my service"
Therefore there is a gap of about 22 seconds from when the service fails until it starts up again. In addition, it takes another 30 seconds or so for my application (which depends on this service) to start working.
I have intentionally not named the errant service. I did open a separate question https://stackoverflow.com/questions/8470975/oracle-oc4j-service-keeps-stopping whose aim was to actually solve the problem.
In this question, I am not trying to solve the problem, but rather find a workaround to try and keep this service in a status of "Starting" the whole time.
What is infuriating, is that until I restarted the server today, my workaround of restarting the service every 3 minutes actually worked, with no application downtime whatsoever.
Does anybody have any suggestions? I did try changing the registry key of ServicesPipeTimeout to 86400000 (24 hours!) in a bid to keep the service in the status of "Starting" for longer.

I have found a possible solution to my problem that I am very uneasy about...
I downloaded WinDbg from http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=8279
I opened WinDbg and did Attach to Process, and selected my service.
As long as WinDbg is open, it seems to "hold" the process and prevent it from stopping.
How long it will continue to do so, remains to be seen, but it has held for over half an hour now (whereas before the service stopped after 8 minutes)

If you have the timeout set to 24 hours and the service neither starts nor stays in 'starting' mode, then it must be either crashing or closing itself down.
If you want to try restarting your service as soon as it crashes, select the 'Recovery' tab in the properties of your service. You should be able to set the service to restart on first, second and subsequent failures, and to restart after 0 minutes.
Note that this will not work if Windows thinks that the service is closing down properly.
It should go without saying that this is a last resort only if you can't get whoever wrote the service to fix the problems.
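The Recovery tab settings described above can also be applied from a script. The sketch below builds the equivalent sc failure command line. It assumes the standard sc.exe syntax (reset= and actions= each followed by a separate value); the function name and the one-day reset period are hypothetical.

```python
def recovery_command(service_name, restart_delay_ms=0, reset_seconds=86400):
    """Build an `sc failure` command that restarts the service on the
    first, second and subsequent failures after restart_delay_ms each."""
    # Three action/delay pairs: first, second and subsequent failures.
    actions = "/".join(["restart", str(restart_delay_ms)] * 3)
    return ["sc", "failure", service_name,
            "reset=", str(reset_seconds),
            "actions=", actions]
```

On Windows, from an elevated prompt, subprocess.run(recovery_command("my service"), check=True) would apply the settings. As noted above, these actions only fire when the service ends abnormally.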

Try specifying 'Restart the Service' for all three sections on the Recovery tab, but that will only work if the service is ending abnormally.
Our company faced a similar problem and we developed Service Protector, a commercial application that can babysit a service and keep it running 24/7. It may work in your situation too.