Everything seems to be created fine, but once it finishes initializing, the application just stops.
@SpringBootApplication
@LocatorApplication
public class ServerApplication {

    public static void main(String[] args) {
        SpringApplication.run(ServerApplication.class, args);
    }
}
Log:
2020-08-03 10:59:18.250 INFO 7712 --- [ main] o.a.g.d.i.InternalLocator : Locator started on 10.25.209.139[8081]
2020-08-03 10:59:18.250 INFO 7712 --- [ main] o.a.g.d.i.InternalLocator : Starting server location for Distribution Locator on LB183054.dmn1.fmr.com[8081]
2020-08-03 10:59:18.383 INFO 7712 --- [ main] c.f.g.l.LocatorSpringApplication : Started LocatorSpringApplication in 8.496 seconds (JVM running for 9.318)
2020-08-03 10:59:18.385 INFO 7712 --- [m shutdown hook] o.a.g.d.i.InternalDistributedSystem : VM is exiting - shutting down distributed system
2020-08-03 10:59:18.395 INFO 7712 --- [m shutdown hook] o.a.g.i.c.GemFireCacheImpl : GemFireCache[id = 1329087972; isClosing = true; isShutDownAll = false; created = Mon Aug 03 10:59:15 EDT 2020; server = false; copyOnRead = false; lockLease = 120; lockTimeout = 60]: Now closing.
2020-08-03 10:59:18.416 INFO 7712 --- [m shutdown hook] o.a.g.d.i.ClusterDistributionManager : Shutting down DistributionManager 10.25.209.139(locator1:7712:locator)<ec><v0>:41000.
2020-08-03 10:59:18.517 INFO 7712 --- [m shutdown hook] o.a.g.d.i.ClusterDistributionManager : Now closing distribution for 10.25.209.139(locator1:7712:locator)<ec><v0>:41000
2020-08-03 10:59:18.518 INFO 7712 --- [m shutdown hook] o.a.g.d.i.m.g.Services : Stopping membership services
2020-08-03 10:59:18.518 INFO 7712 --- [ip View Creator] o.a.g.d.i.m.g.Services : View Creator thread is exiting
2020-08-03 10:59:18.520 INFO 7712 --- [Server thread 1] o.a.g.d.i.m.g.Services : GMSHealthMonitor server thread exiting
2020-08-03 10:59:18.536 INFO 7712 --- [m shutdown hook] o.a.g.d.i.ClusterDistributionManager : DistributionManager stopped in 120ms.
2020-08-03 10:59:18.537 INFO 7712 --- [m shutdown hook] o.a.g.d.i.ClusterDistributionManager : Marking DistributionManager 10.25.209.139(locator1:7712:locator)<ec><v0>:41000 as closed.
Yes, this is the expected behavior, OOTB.
Most Apache Geode processes (clients (i.e. ClientCache), Locators, Managers and "peer" Cache nodes/members of a cluster/distributed system) only create daemon Threads (i.e. non-blocking Threads, which do not prevent the JVM from exiting). Therefore, the Apache Geode JVM process will start up, initialize itself and then shut down immediately.
Only an Apache Geode CacheServer process (a "peer" Cache that has a CacheServer component to listen for client connections) starts and continues to run. That is because the ServerSocket used to listen for client Socket connections is created on a non-daemon Thread (i.e. a blocking Thread, one that does prevent the JVM from exiting), which keeps the JVM process from shutting down. Otherwise, a CacheServer would fall straight through as well.
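To illustrate the daemon Thread point with plain Java (a minimal sketch, not Geode-specific): the JVM exits as soon as the last non-daemon Thread finishes, regardless of what any daemon Threads are still doing.
public class DaemonThreadDemo {

    public static void main(String[] args) {

        Thread daemon = new Thread(() -> {
            while (true) {
                // pretend to do useful work forever
            }
        });

        daemon.setDaemon(true);
        daemon.start();

        // main() returns here; since only a daemon Thread remains,
        // the JVM shuts down even though that Thread is still "running".
    }
}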
You might be thinking, well, how does Gfsh prevent Locators (i.e. using the start locator command) and "servers" (i.e. using the start server command) from shutting down?
NOTE: By default, Gfsh creates a CacheServer instance when starting a GemFire/Geode server using the start server command. The CacheServer component of the "server" can be disabled by specifying the --disable-default-server option to the start server command. In this case, the "server" will not be able to serve clients. Still, the peer node/member will continue to run, but not without extra help. See here for more details on the start server Gfsh command.
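For example, the Gfsh command would look something like this (the member name is made up):
start server --name=Server1 --disable-default-server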
So, how does Gfsh prevent the processes from falling through?
Under-the-hood, Gfsh uses the LocatorLauncher and ServerLauncher classes to configure and fork the JVM processes to launch Locators and servers, respectively.
By way of example, here is Gfsh's start locator command using the LocatorLauncher class. Technically, it uses the configuration from the LocatorLauncher class instance to construct (and specifically, here) the java command-line used to fork and launch (and specifically, here) a separate JVM process.
However, the key here is the specific "command" passed to the LocatorLauncher class when starting the Locator, which is the START command (here).
In the LocatorLauncher class, we see what the START command does: from the main method, to the run method, it starts the Locator and then waits on it in waitOnLocator (with implementation).
Without the wait, the Locator would fall straight through as you are experiencing.
You can simulate the same effect (i.e. "falling straight through") using the following code, which uses the Apache Geode API to configure and launch a Locator (in-process).
import org.apache.geode.distributed.LocatorLauncher;

public class ApacheGeodeLocatorApplication {

    public static void main(String[] args) {

        LocatorLauncher locatorLauncher = new LocatorLauncher.Builder()
            .set("jmx-manager", "true")
            .set("jmx-manager-port", "0")
            .set("jmx-manager-start", "true")
            .setMemberName("ApacheGeodeBasedLocator")
            .setPort(0)
            .build();

        locatorLauncher.start();

        //locatorLauncher.waitOnLocator();
    }
}
This simple little program will fall straight through. However, if you uncomment locatorLauncher.waitOnLocator(), then the JVM process will block.
This is not unlike what SDG's LocatorFactoryBean class (see source) is actually doing. It, too, uses the LocatorLauncher class to configure and bootstrap a Locator in-process. The LocatorFactoryBean is the class used to configure and bootstrap a Locator when declaring the SDG @LocatorApplication annotation on your @SpringBootApplication class.
However, I do think there is room for improvement, here. Therefore, I have filed DATAGEODE-361.
In the meantime, and as a workaround, you can achieve the same effect of a blocking Locator by having a look at the Smoke Test for this in the Spring Boot for Apache Geode (SBDG) project. See here.
However, after DATAGEODE-361 is complete, the extra logic preventing the Locator JVM process from shutting down will no longer be necessary.
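For illustration only, here is a minimal sketch (not the SBDG Smoke Test itself, and the class name is made up) of the general idea: block the non-daemon main Thread after the Spring ApplicationContext is refreshed so the Locator JVM process stays alive. Imports are shown as I recall them from SDG.
import java.util.concurrent.CountDownLatch;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.data.gemfire.config.annotation.LocatorApplication;

@SpringBootApplication
@LocatorApplication
public class BlockingLocatorApplication {

    public static void main(String[] args) throws InterruptedException {

        SpringApplication.run(BlockingLocatorApplication.class, args);

        // Block the (non-daemon) main Thread indefinitely so the JVM does not exit;
        // this approximates what LocatorLauncher.waitOnLocator() achieves.
        new CountDownLatch(1).await();
    }
}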
Related
I'm using Spring Boot, Spring Session and JTA Narayana (Arjuna). I'm sending select and insert statements in a loop using two different threads.
The application runs correctly for some time but after some number of transactions, the Arjuna ConnectionManager fails to get a connection and generates the following exception:
2019-10-05 22:48:20.724 INFO 27032 --- [o-auto-1-exec-4] c.m.m.db.PrepareStatementExec : START select
2019-10-05 22:49:20.225 WARN 27032 --- [nsaction Reaper] com.arjuna.ats.arjuna : ARJUNA012117: TransactionReaper::check timeout for TX 0:ffffc0a82101:c116:5d989ef0:6e in state RUN
2019-10-05 22:49:20.228 WARN 27032 --- [Reaper Worker 0] com.arjuna.ats.arjuna : ARJUNA012095: Abort of action id 0:ffffc0a82101:c116:5d989ef0:6e invoked while multiple threads active within it.
2019-10-05 22:49:20.234 WARN 27032 --- [Reaper Worker 0] com.arjuna.ats.arjuna : ARJUNA012381: Action id 0:ffffc0a82101:c116:5d989ef0:6e completed with multiple threads - thread http-nio-auto-1-exec-10 was in progress with java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:502)
com.arjuna.ats.internal.jdbc.ConnectionManager.create(ConnectionManager.java:134)
com.arjuna.ats.jdbc.TransactionalDriver.connect(TransactionalDriver.java:89)
java.sql.DriverManager.getConnection(DriverManager.java:664)
java.sql.DriverManager.getConnection(DriverManager.java:208)
com.mono.multidatasourcetest.db.PrepareStatementExec.executeUpdate(PrepareStatementExec.java:51)
The source code is on GitHub: https://github.com/saavedrah/multidataset-test
I'm wondering if the connection should be closed or if I should change some settings in Arjuna to make the ConnectionManager work.
Although what you are showing is a stack trace printed by the Narayana BasicAction class (rather than an exception), the result for you is ultimately the same: you need to close your connections.
You should most likely add the close call near the same place you are doing the getConnection calls, i.e. within https://github.com/saavedrah/multidataset-test/blob/cf910c345db079a4e910a071ac0690af28bd3f81/src/main/java/com/mono/multidatasourcetest/db/PrepareStatementExec.java#L38
e.g.
//connection = getConnection
//do something with it
//connection.close()
But as Connection is AutoCloseable you could just do:
try (Connection connection = DriverManager.getConnection(url, user, password)) {
    connection.doSomething(); // placeholder for your real work
}
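Putting that together for an executeUpdate-style method (just a sketch; the URL, credentials and SQL are placeholders, not taken from your project):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class UpdateExample {

    static int executeUpdate(String url, String user, String password) throws SQLException {
        // try-with-resources closes the statement and then the connection,
        // even if the update throws, so no connections are leaked
        try (Connection connection = DriverManager.getConnection(url, user, password);
             PreparedStatement statement = connection.prepareStatement("update person set name = ? where id = ?")) {

            statement.setString(1, "newName");
            statement.setLong(2, 1L);

            return statement.executeUpdate();
        }
    }
}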
We are facing a major incident in our Camunda Orchestrator. When we hit 100 running process instances, Camunda Cockpit takes an eternity and never responds.
We have the same issue when calling /app/engine/.
A few messages are consumed from RabbitMQ, and then everything stops.
The application however is not down.
I suspect a process engine configuration issue, because I am unable to get the job executor log.
When I set JobExecutorActivate to false, all things go right for Cockpit and queue consumption, but processes stop at the end of the first subprocess.
We have this log loop non stop:
2018/11/17 14:47:33.258 DEBUG ENGINE-14012 Job acquisition thread woke up
2018/11/17 14:47:33.258 DEBUG ENGINE-14022 Acquired 0 jobs for process engine 'default': []
2018/11/17 14:47:33.258 DEBUG ENGINE-14023 Execute jobs for process engine 'default': [8338]
2018/11/17 14:47:33.258 DEBUG ENGINE-14023 Execute jobs for process engine 'default': [8217]
2018/11/17 14:47:33.258 DEBUG ENGINE-14023 Execute jobs for process engine 'default': [8256]
2018/11/17 14:47:33.258 DEBUG ENGINE-14011 Job acquisition thread sleeping for 100 millis
2018/11/17 14:47:33.359 DEBUG ENGINE-14012 Job acquisition thread woke up
And this log too (for queue consumption):
2018/11/17 15:04:19.582 DEBUG Waiting for message from consumer. {"null":null}
2018/11/17 15:04:19.582 DEBUG Retrieving delivery for Consumer@5d05f453: tags=[{amq.ctag-0ivcbc2QL7g-Duyu2Rcbow=queue_response}], channel=Cached Rabbit Channel: AMQChannel(amqp://guest@127.0.0.1:5672/,4), conn: Proxy@77a5983d Shared Rabbit Connection: SimpleConnection@17a1dd78 [delegate=amqp://guest@127.0.0.1:5672/, localPort= 49812], acknowledgeMode=AUTO local queue size=0 {"null":null}
Environment:
Spring Boot 2.0.3.RELEASE, Camunda v7.9.0 with PostgreSQL, RabbitMQ
Camunda BPM listens to and pushes to 165 RabbitMQ queues.
Configuration:
# Data source (PostgreSql)
com.campDo.fr.camunda.datasource.url=jdbc:postgresql://localhost:5432/campDo
com.campDo.fr.camunda.datasource.username=campDo
com.campDo.fr.camunda.datasource.password=password
com.campDo.fr.camunda.datasource.driver-class-name=org.postgresql.Driver
com.campDo.fr.camunda.bpm.database.jdbc-batch-processing=false
oms.camunda.retry.timer=1
oms.camunda.retry.nb-max=2
SpringProcessEngineConfiguration:
@Bean
public SpringProcessEngineConfiguration processEngineConfiguration() throws IOException {

    final SpringProcessEngineConfiguration config = new SpringProcessEngineConfiguration();

    config.setDataSource(camundaDataSource);
    config.setDatabaseSchemaUpdate("true");
    config.setTransactionManager(transactionManager());
    config.setHistory("audit");
    config.setJobExecutorActivate(true);
    config.setMetricsEnabled(false);

    final Resource[] resources = resourceLoader.getResources(CLASSPATH_ALL_URL_PREFIX + "/processes/*.bpmn");
    config.setDeploymentResources(resources);

    return config;
}
Pom dependencies:
<dependency>
    <groupId>org.camunda.bpm.springboot</groupId>
    <artifactId>camunda-bpm-spring-boot-starter</artifactId>
</dependency>
<dependency>
    <groupId>org.camunda.bpm.springboot</groupId>
    <artifactId>camunda-bpm-spring-boot-starter-webapp</artifactId>
</dependency>
<dependency>
    <groupId>org.camunda.bpm.springboot</groupId>
    <artifactId>camunda-bpm-spring-boot-starter-rest</artifactId>
</dependency>
I am quite sure that my job executor config is wrong.
Update :
I can start Cockpit and make Camunda consume messages by setting JobExecutorActivate to false, but processes still stop at the first step that requires the job executor:
config.setJobExecutorActivate(false);
Thanks for your help.
First: if your process contains async steps (Jobs), then it will pause. Activating the jobExecutor just says that Camunda should manage how these jobs are worked on. If you disable the executor, your processes will still stop, and since no one will execute them, they remain stopped.
Disabling job-execution is only sensible during testing or when you have multiple nodes and only some of them should do processing.
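If you are using the Camunda Spring Boot starter, you should be able to switch job execution off per node with a property rather than code (property name quoted from memory, so double-check it against the starter docs):
camunda.bpm.job-execution.enabled=false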
To your main issue: the job executor works with a thread pool. From what you describe, it is very likely that all threads in the pool block forever, so they never finish and never return, meaning your system is stuck.
This happened to us a while ago when working with an SMTP server: there was an infinite timeout on the connection, so the threads kept waiting even though the machine was not available.
Since job execution in Camunda is highly reliable and well tested per se, I would suggest that you double-check everything you do in your delegates. If you are lucky (and I am right), you will find the spot where you just wait forever ...
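As a sketch of what to look for (the class, the HTTP call and the variable names are made up, not taken from your code): every blocking call inside a JavaDelegate should carry an explicit timeout so a job-executor thread can never wait forever.
import java.net.HttpURLConnection;
import java.net.URL;

import org.camunda.bpm.engine.delegate.DelegateExecution;
import org.camunda.bpm.engine.delegate.JavaDelegate;

public class CallExternalSystemDelegate implements JavaDelegate {

    @Override
    public void execute(DelegateExecution execution) throws Exception {

        URL url = new URL("http://example.org/some-endpoint"); // placeholder endpoint

        HttpURLConnection connection = (HttpURLConnection) url.openConnection();

        // Without explicit timeouts, an unreachable remote system can block this
        // job-executor thread indefinitely, which matches the symptom described above.
        connection.setConnectTimeout(5_000);
        connection.setReadTimeout(10_000);

        try {
            int status = connection.getResponseCode();
            execution.setVariable("remoteStatus", status);
        } finally {
            connection.disconnect();
        }
    }
}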
I created a project fresh out of the Spring Initializr by choosing Kotlin, Gradle, M7 and Web-reactive.
I made a small project:
data class Person (val id: String)
@Component class PersonHandler(val template: ReactiveMongoTemplate)
{
    init
    {
        println("Initializing")
        val jim: Mono<Person> = template.save(Person("Jim"))
        val john: Mono<Person> = template.save(Person("John"))
        val jack: Mono<Person> = template.save(Person("Jack"))
        launch(jim)
        launch(john)
        launch(jack)
        println("Finished Initializing")
    }

    fun launch(mono: Mono<Person>)
    {
        mono.subscribe({println(it.id)}, {println("Error")}) // This works
        // mono.block() This just hangs
    }
}
I try to save three persons into the database. The save method returns just a Mono which needs to be executed. If I try to execute it by simply subscribing, everything works nice:
Initializing
Finished Initializing
2017-12-21 13:14:39.513 INFO 17278 --- [ Thread-13] org.mongodb.driver.connection : Opened connection [connectionId{localValue:3, serverValue:158}] to localhost:27017
2017-12-21 13:14:39.515 INFO 17278 --- [ Thread-12] org.mongodb.driver.connection : Opened connection [connectionId{localValue:4, serverValue:159}] to localhost:27017
2017-12-21 13:14:39.520 INFO 17278 --- [ Thread-14] org.mongodb.driver.connection : Opened connection [connectionId{localValue:5, serverValue:160}] to localhost:27017
Jim
Jack
John
However, when I use block instead of subscribe the application hangs:
Initializing
2017-12-21 13:16:47.200 INFO 17463 --- [ Thread-14] org.mongodb.driver.connection : Opened connection [connectionId{localValue:3, serverValue:163}] to localhost:27017
If I query the database manually, I see that Jim has been saved, but not Jack and John.
Is this a bug, or am I doing something wrong? I would like to have the guarantee that the users are in the database before the code goes any further so I would really like to use block.
I am not sure if it is relevant, but I get a compiler warning
Accessing nonfinal property template in constructor
There is a minimal working example. It contains two branches. One is a workaround for the issue.
https://github.com/martin-drozdik/spring-mongo-bug-example
I think this might be a Spring Framework bug / usability issue.
First, let me underline the difference between subscribe and block:
the subscribe method kicks off the work and returns immediately. So you get no guarantee that the operation is done when other parts of your application run.
block is a blocking operation: it triggers the operation and waits for its completion.
For initialisation work, composing operations and calling block once is probably the best choice:
val jim: Mono<Person> = template.save(Person("Jim"))
val john: Mono<Person> = template.save(Person("John"))
val jack: Mono<Person> = template.save(Person("Jack"))
jim.then(john).then(jack).block();
As you've stated, using block hangs the application. I suspect this might be a Spring context initialisation issue: if I remember correctly, this process might assume a single thread in some parts, and using a reactive pipeline there schedules work on many threads.
Could you create a minimal sample application (using just Java/Spring Boot/Spring Data Reactive Mongo) and report that on https://jira.spring.io?
I had a similar situation where calling reactiveMongoTemplate.save(model).block() made the application hang.
The issue was caused by the @PostConstruct in one of my classes, designed to create my system users after application initialization. I think it was somehow invoked before the Spring context was fully initialized.
@Configuration
public class InitialDataPostLoader {

    private Logger logger = LogManager.getLogger(this.getClass());

    @PostConstruct
    public void init() {
        logger.info(String.format(MSG_SERVICE_JOB, "System Metadata initialization"));
        createDefaultUsers();
    }
}
By replacing @PostConstruct with a ContextRefreshedEvent listener, the issue got resolved.
@Configuration
public class InitialDataPostLoader implements ApplicationListener<ContextRefreshedEvent> {

    private Logger logger = LogManager.getLogger(this.getClass());

    @Override
    public void onApplicationEvent(ContextRefreshedEvent arg0) {
        logger.info(String.format(MSG_SERVICE_JOB, "System Metadata initialization"));
        createDefaultUsers();
    }
}
I have a Camel application deployed on JBoss in a WAR file with a spring configuration for starting the Camel context.
It deploys and runs very nicely on a JBoss EAP 7.0.0.GA.
When I change values in a property file that my application depends on and then touch the WAR file, it normally redeploys the application. But in some cases it fails.
I get the following in the server.log:
2017-07-25 12:05:26.671 INFO class=org.apache.camel.impl.DefaultShutdownStrategy thread="ServerService Thread Pool -- 74" Starting to graceful shutdown 12 routes (timeout 300 seconds)
2017-07-25 12:05:26.725 INFO class=org.apache.camel.impl.DefaultShutdownStrategy thread="Camel (interfacedb) thread #2 - ShutdownTask" Waiting as there are still 4 inflight and pending exchanges to complete, timeout in 300 seconds. Inflights per route: [interfacePersistDirect = 1, route1 = 1, pullFromTransferEntityTable = 1, lastScheduledRun = 1]
...
2017-07-25 12:10:26.691 WARN class=org.apache.camel.impl.DefaultShutdownStrategy thread="ServerService Thread Pool -- 74" Timeout occurred during graceful shutdown. Forcing the routes to be shutdown now. Notice: some resources may still be running as graceful shutdown did not complete successfully.
2017-07-25 12:10:26.691 WARN class=org.apache.camel.impl.DefaultShutdownStrategy thread="Camel (interfacedb) thread #2 - ShutdownTask" Interrupted while waiting during graceful shutdown, will force shutdown now.
2017-07-25 12:10:26.694 INFO class=org.apache.camel.impl.DefaultShutdownStrategy thread="ServerService Thread Pool -- 74" Graceful shutdown of 12 routes completed in 300 seconds
After this the application will not start again. JBoss reports the following in the myApp.war.failed file in the deployments folder.
"WFLYDS0022: Did not receive a response to the deployment operation within the allowed timeout period [600 seconds]. Check the server configuration file and the server logs to find more about the status of the deployment."
The application normally deploys a lot quicker than 600 seconds. I can touch the war file or delete the .failed file, which normally triggers a redeployment, but JBoss keeps giving me the error above in the .failed file.
The application starts normally if I restart the JBoss VM, but I would like to avoid restarting the other applications running on the JBoss instance.
Any suggestions?
I have a web application running in JBoss AS 4.2.2.
I observed that the JBoss server automatically shuts down, and the following output appears in server.log:
14:20:38,048 INFO [Server] Runtime shutdown hook called, forceHalt: true
14:20:38,049 INFO [Server] JBoss SHUTDOWN: Undeploying all packages
I want to enable TRACE for org.jboss.system.server.Server in jboss-log4j.xml, to hopefully get some more info when the server shuts down.
Please let me know how to enable TRACE for org.jboss.system.server.Server in jboss-log4j.xml.
I was able to add TRACE for the server log, and I could see the following output when JBoss AS shuts down automatically:
2010-06-09 19:07:46,631 DEBUG [org.jboss.wsf.stack.jbws.RequestHandlerImpl] END handleRequest: jboss.ws:context=hpnp_lqs,endpoint=APIWebService
2010-06-09 19:07:46,631 DEBUG [org.jboss.ws.core.soap.MessageContextAssociation] popMessageContext: org.jboss.ws.core.jaxws.handler.SOAPMessageContextJAXWS@3290a11e (Thread http-0.0.0.0-8080-1)
2010-06-09 19:07:55,895 INFO [org.jboss.system.server.Server] Runtime shutdown hook called, forceHalt: true
2010-06-09 19:07:55,895 TRACE [org.jboss.system.server.Server] Shutdown caller:
java.lang.Throwable: Here
at org.jboss.system.server.ServerImpl$ShutdownHook.shutdown(ServerImpl.java:1017)
at org.jboss.system.server.ServerImpl$ShutdownHook.run(ServerImpl.java:996)
2010-06-09 19:07:55,895 INFO [org.jboss.system.server.Server] JBoss SHUTDOWN: Undeploying all packages
If anybody has any clue on what might be the cause of the automatic shutdown, please help me.
Thanks!
There's a JBoss wiki page listing log output for various shutdown causes. It looks like yours was caused by a Ctrl-C. I assume you would have known if you hit Ctrl-C, though.
On Unix-type servers, Ctrl-C generates a TERM signal, which could also come from someone or some script, running as your jboss user or as root, executing "kill <jboss pid>". If you're on Linux, I'd take a look at this question about the OOM killer.
One possible cause for this behaviour is console logout. We have observed this with our own server.
In brief, by default the Sun JVM listens to the event of the console user logging out, and shuts itself down automatically when that happens. To disable this, start the JVM with the -Xrs parameter.
See here for more details (look for Mysterious shutdowns).
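For JBoss AS 4.x, one place to add the flag (assuming you start the server with the standard run scripts) is JBOSS_HOME/bin/run.conf:
# Ask the JVM to reduce its use of OS signals, so a console logoff no longer shuts it down
JAVA_OPTS="$JAVA_OPTS -Xrs"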
One possible cause for a forced shutdown is the virtual machine running out of memory.
I had this problem several years ago when a colleague implemented some very nasty bulk loading of objects from a database, which caused JBoss to shut down on certain requests.
Try searching for "memory" or similar keywords in the log file and/or monitor the memory usage of the server.
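For example (plain sketches; adjust the log path and PID to your installation), you could grep the log for OutOfMemoryError and watch heap usage with jstat:
grep -i "OutOfMemoryError" $JBOSS_HOME/server/default/log/server.log
jstat -gcutil <jboss-pid> 5s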