I have a route with a size set to 5000 like so:
<route id="rdfProcessContent">
    <from uri="vm:rdfProcessContent?concurrentConsumers=2&amp;size=5000&amp;blockWhenFull=true"/>
    .........
</route>
One question: is a size of 5000 too much, or can I go much higher? Or is there something else I could do? Also, I'm using a try/catch (below), but is that a good way of dealing with a full queue?
If it matters, the route gets accessed in the following way:
def endPoint = camelContext.getEndpoint("vm:rdfProcessContent?size=5000");
def producer = endPoint.createProducer();
....
try {
    while (gotNextPage) {
        ...
        contentList.each {
            ...
            def exchange = endPoint.createExchange(org.apache.camel.ExchangePattern.InOnly);
            exchange.getIn().setBody(it);
            exchange.getIn().setHeader("isBulkLoad", "true");
            producer.process(exchange);
        }
    }
} catch(){...}
Here is the error I'm getting (first bit of the error):
java.lang.IllegalStateException: Queue full
at java.util.AbstractQueue.add(AbstractQueue.java:71)
at org.apache.camel.component.seda.SedaProducer.addToQueue(SedaProducer.java:233)
at org.apache.camel.component.seda.SedaProducer.process(SedaProducer.java:170)
at org.apache.camel.processor.SendProcessor.process(SendProcessor.java:113)
at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:72)
at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:398)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:191)
at org.apache.camel.processor.FilterProcessor.process(FilterProcessor.java:58)
at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:72)
at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:398)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:191)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:191)
at org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:105)
Thank you for taking the time.
After digging around, I found that the issue was NOT with the route "rdfProcessContent" but with a route further along in the execution: I was using the Java API to post graphs. That turned out to be much slower than posting SPARQL queries, which was a big surprise.
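One detail in the stack trace is worth noting: the exception is raised by AbstractQueue.add, which fails immediately on a full queue instead of waiting. As far as I understand the seda/vm components, blockWhenFull is read on the producer side, and the producer endpoint in the question is looked up as "vm:rdfProcessContent?size=5000" without it, so that is a likely reason the producer does not block (an assumption, not verified against this Camel version). The sketch below uses plain java.util.concurrent rather than Camel to show the three behaviours a producer can choose from on a bounded queue; the class and method names are made up for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; it is not part of Camel.
public class BoundedQueueDemo {

    // add() fails fast: returns the exception message if the queue is full, null on success.
    public static String tryAdd(BlockingQueue<String> queue, String item) {
        try {
            queue.add(item);
            return null;
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    // offer() with a timeout waits a bounded time, then reports failure instead of throwing.
    public static boolean offerWithTimeout(BlockingQueue<String> queue, String item, long millis) {
        try {
            return queue.offer(item, millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(2);
        queue.put("a");
        queue.put("b");

        System.out.println(tryAdd(queue, "c"));               // prints "Queue full"
        System.out.println(offerWithTimeout(queue, "c", 50)); // nobody is consuming, so this fails

        // put() blocks until a consumer frees a slot (back-pressure on the producer).
        Thread consumer = new Thread(() -> {
            try {
                Thread.sleep(100);
                queue.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        queue.put("c");
        consumer.join();
        System.out.println(queue.size());
    }
}
```

With a blocking producer the try/catch is no longer needed for the full-queue case, and the queue size mostly determines how much memory is held in flight rather than whether messages are lost.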
I have one Vert.x standard verticle. Basically, it parses the HttpRequest and prepares a JsonObject, which I then send over the event bus. In another worker verticle that event gets consumed and kicks off an execution (which includes a call to the Pentaho Data Integration Java API); it is a blocking API. It takes around 30 minutes to complete the execution of a ".kjb" file, but Vert.x continuously warns about blocked worker threads. So my question is: what would be the best practice in Vert.x to tackle this scenario?
Any help would be highly appreciated.
According to the Vert.x docs, blocking operations should be run through executeBlocking:
vertx.executeBlocking(future -> {
    // Call some blocking API that takes a significant amount of time to return
    String result = someAPI.blockingMethod("hello");
    future.complete(result);
}, res -> {
    System.out.println("The result is: " + res.result());
});
So that is the recommended practice for all blocking tasks.
You could also deploy your verticle as a worker.
This way:
vertx.deployVerticle(yourVerticleInstance, new DeploymentOptions().setWorker(true));
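Whichever of the two options is used, a 30-minute job still occupies a thread for 30 minutes, so the general idea is to keep such jobs on a small pool of their own and hand the caller a future. The following is a plain JDK sketch of that idea, not Vert.x API; LongJobRunner and its methods are hypothetical names, and the lambda stands in for the blocking Pentaho call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical helper; it only illustrates isolating long jobs on their own threads.
public class LongJobRunner implements AutoCloseable {

    // Dedicated pool so long jobs never occupy threads meant for short tasks.
    private final ExecutorService longJobs;

    public LongJobRunner(int threads) {
        this.longJobs = Executors.newFixedThreadPool(threads);
    }

    // Submits the blocking job and returns immediately; the future completes when the job does.
    public <T> CompletableFuture<T> submit(Supplier<T> blockingJob) {
        return CompletableFuture.supplyAsync(blockingJob, longJobs);
    }

    // Stops accepting new jobs; jobs already submitted are allowed to finish.
    @Override
    public void close() {
        longJobs.shutdown();
    }

    public static void main(String[] args) {
        try (LongJobRunner runner = new LongJobRunner(2)) {
            CompletableFuture<String> result = runner.submit(() -> {
                // Stand-in for the blocking call that runs the ".kjb" file.
                return "job finished";
            });
            System.out.println("The result is: " + result.join());
        }
    }
}
```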
This is the situation: let's say I have an endpoint that receives a request to retrieve data within a time range (or whatever), and the result of that request is a big list that I get from a database, say a list of "Person" objects. For each of these Person objects I then have to call another method that may be a little slow, and it would delay the response a lot if I had to wait until it has been executed for every element of this big list.
What I would like to accomplish is to stream the response through a REST endpoint, so that my front end does not have to wait until the whole list is processed before it starts displaying it on the screen.
So here is my confusion: I know that an asynchronous method using Spring @Async lets the consumer give a response even if the task is not finished yet, but as far as I understand, this is helpful for sending emails, or for any other task or series of tasks whose result you are not going to display on the screen.
But in the case of a response that is meant to be displayed on the screen, I guess I should stream a chunk of data as soon as I have a whole "person" object ready.
What is the right way to accomplish this? Is the @Async method of any help in this situation, or should I only find a way to detect when a person object is fully formed and stream it then? Or am I terribly wrong and not understanding the concepts of async and streaming?
A little example would help.
Thanks.
I have been trying to understand the same concept for the last 3 days, and here is my understanding, which may help you.
Asynchronous REST endpoint:
If your REST endpoint does some complex business logic or calls some external service and may take some time to respond, it's better to return from the API as soon as possible and move the time-consuming logic to the background (a separate thread). This is where asynchronous processing helps.
Chunked output:
If your endpoint is expected to send a large amount of data and, to improve the user experience, you want the UI to start rendering the output as soon as it becomes available, chunked output from the REST endpoint is the better approach.
Using Jersey we can achieve both asynchronous processing and chunked output, as in the sample below.
public ChunkedOutput<String> getChunkedResponse() {
    final ChunkedOutput<String> output = new ChunkedOutput<String>(String.class);
    new Thread() {
        public void run() {
            try {
                String chunk;
                int index = 0;
                while ((chunk = getWordAtIndex(index)) != null) {
                    output.write(chunk);
                    index++;
                }
            } catch (IOException e) {
                //Add code to handle the IO Exception during this operation
            } finally {
                try {
                    output.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }.start();
    return output; // This output object may be returned way before output is created
}
I have tried out a sample to test this out with jersey and spring-boot combination. You can check it out in my git repository here.
Hope it helps.
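For the Spring side of the question, the same idea applies: write each item to the response as soon as it is ready and flush, instead of building the whole list first. In Spring MVC I believe the hook for this is returning a StreamingResponseBody, whose writeTo(OutputStream) callback receives the response stream; treat that as a pointer to check rather than a tested recipe. The sketch below is plain JDK code showing only the write-and-flush loop that would run inside such a callback; IncrementalWriter, writeEach and slowStep are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper; the OutputStream stands in for the HTTP response stream.
public class IncrementalWriter {

    // Processes items one by one and writes each result as its own line,
    // flushing after every item so the reader can consume it right away.
    // Returns the number of items written.
    public static <T> int writeEach(List<T> items, Function<T, String> slowStep, OutputStream out) {
        int written = 0;
        try {
            for (T item : items) {
                String line = slowStep.apply(item) + "\n"; // the slow per-item call
                out.write(line.getBytes(StandardCharsets.UTF_8));
                out.flush();                               // push this chunk now
                written++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<String> people = Arrays.asList("ann", "bob");
        writeEach(people, name -> name.toUpperCase(), out);
        System.out.print(new String(out.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Note that @Async on its own does not give you this: it moves the work to another thread, but the response body is still produced in one piece unless something writes it incrementally.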
Recently I started building a small web processing service using Akka Streams. It's quite simple: I pull URLs from Redis, download them (they are images), process the images, and then push them to S3 and some JSON to Redis.
I'm downloading a lot of different kinds of images from multiple sites, and I'm getting a whole bunch of errors such as 404, Unexpected disconnect, Response Content-Length 17951202 exceeds the configured limit of 8388608, EntityStreamException: Entity stream truncation, and redirects. For redirects I invoke requestWithRedirects with the address found in the Location header of the response.
The part responsible for downloading looks pretty much like this:
override lazy val http: HttpExt = Http()
def requestWithRedirects(request: HttpRequest, retries: Int = 10)(implicit akkaSystem: ActorSystem, materializer: FlowMaterializer): Future[HttpResponse] = {
  TimeoutFuture(timeout, msg = "Download timed out!") {
    http.singleRequest(request)
  }.flatMap {
    response => handleResponse(request, response, retries)
  }.recoverWith {
    case e: Exception if retries > 0 =>
      requestWithRedirects(request, retries = retries - 1)
  }
}
TimeoutFuture is quite simple: it takes a future and a timeout. If the future takes longer than the timeout, it returns another future that fails with a timeout exception.
The problem I'm having is that after some time I get this error:
Message: RuntimeException: Exceeded configured max-open-requests value of [128]
akka.http.impl.engine.client.PoolInterfaceActor$$anonfun$receive$1.applyOrElse in PoolInterfaceActor.scala::109
akka.actor.Actor$class.aroundReceive in Actor.scala::467
akka.http.impl.engine.client.PoolInterfaceActor.akka$stream$actor$ActorSubscriber$$super$aroundReceive in PoolInterfaceActor.scala::46
akka.stream.actor.ActorSubscriber$class.aroundReceive in ActorSubscriber.scala::208
akka.http.impl.engine.client.PoolInterfaceActor.akka$stream$actor$ActorPublisher$$super$aroundReceive in PoolInterfaceActor.scala::46
akka.stream.actor.ActorPublisher$class.aroundReceive in ActorPublisher.scala::317
akka.http.impl.engine.client.PoolInterfaceActor.aroundReceive in PoolInterfaceActor.scala::46
akka.actor.ActorCell.receiveMessage in ActorCell.scala::516
akka.actor.ActorCell.invoke in ActorCell.scala::487
akka.dispatch.Mailbox.processMailbox in Mailbox.scala::238
akka.dispatch.Mailbox.run in Mailbox.scala::220
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec in AbstractDispatcher.scala::397
scala.concurrent.forkjoin.ForkJoinTask.doExec in ForkJoinTask.java::260
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask in ForkJoinPool.java::1339
scala.concurrent.forkjoin.ForkJoinPool.runWorker in ForkJoinPool.java::1979
scala.concurrent.forkjoin.ForkJoinWorkerThread.run in ForkJoinWorkerThread.java::107
I'm not sure what the problem could be, but I think some downloads are not finishing properly and stay in some global pool of connections, which after a while causes the error above. Any ideas what could be causing the problem, or how to find its root cause? I have already tested 404 responses and the "Response Content-Length exceeds..." errors, and they don't seem to be my troublemakers.
EDIT:
Most likely the problem is with my TimeoutFuture. I'm failing it with an error as described here https://stackoverflow.com/a/29330010/2963977 but in my opinion the future that is actually downloading an image never completes and keeps holding my connection pool resources.
I wonder why these settings don't have any impact in my case:
akka.http.client.connecting-timeout = 1 s
akka.http.client.idle-timeout = 1 s
akka.http.host-connection-pool.idle-timeout = 1 s
EDIT2:
Apparently timeouts are not supported yet. Here is my bug report:
https://github.com/akka/akka/issues/17732#issuecomment-112315953
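The effect described in the first edit, where the caller's timeout fires but the underlying request keeps its slot, can be reproduced outside Akka. The sketch below is a JDK analogy and makes no claim about Akka HTTP internals: a Semaphore stands in for the connection pool, a latch simulates a download that never responds, and all names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo; the Semaphore plays the role of the pool of open requests.
public class TimeoutLeakDemo {

    // Returns how many pool slots are free right after the caller's timeout fires.
    public static int freeSlotsAfterTimeout(int slots, long timeoutMillis) {
        Semaphore pool = new Semaphore(slots);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch neverResponds = new CountDownLatch(1);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<String> download = CompletableFuture.supplyAsync(() -> {
                pool.acquireUninterruptibly();      // take a slot from the pool
                started.countDown();
                try {
                    neverResponds.await();          // simulate a stalled download
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    pool.release();                 // the slot is freed only when the work ends
                }
                return "body";
            }, executor);

            started.await();                        // make sure the slot has been taken
            try {
                download.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The caller stops waiting here, but the task above is still running.
            }
            return pool.availablePermits();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            neverResponds.countDown();              // let the simulated download finish
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(freeSlotsAfterTimeout(128, 50)); // prints 127
    }
}
```

If every retry after a timeout starts a new request while the old one still holds its slot, the free count only goes down, which would match the pool eventually reporting that max-open-requests was exceeded.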
A Gatling scenario with an exec chain. After a request, returned data is saved. Later it's processed and depending on the processing result, it should either fail or pass the test.
This seems like the simplest possible scenario, yet I can't find any reliable info on how to fail a test from within an exec block. An assert breaks the scenario and seemingly Gatling itself (as in: the thrown exception doesn't just fail the test).
Example:
// The scenario consists of a single test with two exec creating the execChain
val scn = scenario("MyAwesomeScenario").exec(reportableTest(
  // Send the request
  exec(http("127.0.0.1/Request").get(requestUrl).check(status.is(200)).check(bodyString.saveAs("MyData")))
  // Process the data
  .exec(session => {
    assert(processData(session.attributes("MyData")) == true, "Invalid data");
  })
))
With the scenario above, somewhere along the line I get "guardian failed, shutting down system".
This seems like a useful, often-used thing to do, so I'm possibly missing something simple. How do I do it?
You have to abide by Gatling APIs.
With checks, you don't "fail" the test, but the request. If you're looking for failing the whole test, you should have a look at the Assertions API and the Jenkins plugin.
You can only perform a Check at the request site, not later. One of the very good reasons is that if you store the bodyString in the Session like you're doing, you'll end up using a lot of memory and maybe crashing (it is still referenced, so not garbage collectable). You have to perform your processData in the check, typically in the optional transform step.
Were you looking for something like this?
.exec(http("getRequest")
    .get("/request/123")
    .headers(headers)
    .check(status.is(200))
    .check(jsonPath("$.request_id").is("123")))
Since the edit queue is already full, I'm adding this as a separate answer.
This is already resolved in the new version of Gatling, release 3.4.0.
They added:
exitHereIf
exitHereIf("${myBoolean}")
exitHereIf(session => true)
Make the user exit the scenario from this point if the condition holds. Condition parameter is an Expression[Boolean].
I implemented something using exitHereIfFailed that sounds like exactly what you were trying to accomplish. I normally use this after a virtual user attempts to sign in.
exitHereIfFailed is used this way
val scn = scenario("MyAwesomeScenario")
  .exec(http("Get data from endpoint 1")
    .get(request1Url)
    .check(status.is(200))
    .check(bodyString.saveAs("MyData"))
    .check(processData(session.attributes("MyData")).is(true)))
  .exitHereIfFailed // If we weren't able to get the data, don't continue
  .exec(http("Send the data to endpoint 2")
    .post(request2Url)
    .body(StringBody("${MyData}")))
This scenario will abort gracefully at exitHereIfFailed if any of the checks prior to exitHereIfFailed have failed.
I ran into the following problem with JPA, but it's maybe more of a conceptual question about Camel.
I need a cron based Quartz consumer. But if it's triggered, I'd like to make a selection as a 1st step with JPA component.
<from uri="quartz://myQuartz?cron=myCronExpression"/>
<to uri="jpa://home.myEntity?consumer.query=select o from home.myEntity o"/>
But if I call the JPA component with "to", it is used as a Producer and not as a Consumer. Can I somehow use the JPA component to handle this, or do I have to follow the Service Activator (bean-based) logic and leave the JPA component behind?
Thanks in advance,
Gergely
This is pretty much the Content Enricher pattern. You can use
<pollEnrich uri="jpa://home.myEntity?consumer.query=select o from home.myEntity o"/>
instead to use a consumer mid-route. Keep in mind that you cannot use runtime data from the route (headers or the like) but need to keep the route URI static in this case. Seems your URI is static so that should be no issue.
Very good point, Petter. I had a similar issue: I wanted to create a simple route that retrieves data from the database when called. The solution is simple.
from("direct:test")
    .pollEnrich("jpa://" + User.class.getName() + "?consumer.query=select u from test.User u&consumeDelete=false");
Also check this: "Camel - content enricher: enrich() vs pollEnrich()".