Camel REST: Async request processing for long-running HTTP services

I tried to find an up-to-date example for this kind of problem, but unfortunately didn't find one.
I am trying to implement a web service with Camel that should behave like the following:
Camel receives input at a REST endpoint, either via GET or POST (api/startsearch)
a bean processes the input and generates a ticket-id
the same bean responds to the client with HTTP 202 or a redirect status code, including the redirect URL (api/result?ticket-id=jf3298u23)
the bean also passes the input to the ActiveMQ start queue, where the Camel route does all its long-running processing
when the route is finished, the result should be available at the redirect URL (/result?ticket-id=jf3298u23); if processing is not finished yet, it should respond with a custom status code such as HTTP 299 (processing)
So my route looks like this:
rest().path(apiPath).produces("application/json")
    .get(searchEndpoint)
        .to("bean:requestHandler?method=processNewSearch") // generate ticket-id and reply with 202 or 3xx
        .route().inOnly("activemq:queue:start").endRest() // put the incoming message into the start queue where the processing starts
    .get(resultEndpoint)
        .to("bean:requestHandler?method=returnResult"); // return 299 when processing is not done, or 200 + the result
from("activemq:queue:start")
.setHeader("recipients").method(new ExtractRecipients(), "extractRecipients")
.to("activemq:queue:recipientlist");
... etc, etc... until:
from("activemq:queue:output")
.to("bean:requestHandler?method=saveFinishedSearch");
The bean itself has three methods:
public void processNewSearch(Exchange exchange) {
    // generate ticket and stuff and finally set header and body of the response
    exchange.getOut().setHeader(Exchange.HTTP_RESPONSE_CODE, 202);
    exchange.getOut().setBody(redirectUrl);
}

public void returnResult(Exchange exchange) {
    // handle ticket-related stuff; if valid, fetch the result and create the HTTP response
    exchange.getOut().setHeader(Exchange.HTTP_RESPONSE_CODE, 200);
    exchange.getOut().setBody(searchResult);
}

public void saveFinishedSearch(Exchange exchange) {
    // get the search result from the final message that was processed asynchronously
    // in the background and save it
    String ticket = exchange.getIn().getHeader("ticket-id", String.class);
    Object body = exchange.getIn().getBody();
    finishedSearches.put(ticket, body);
}
I am sure this is not the proper way to reply with manually set response codes and messages, but I did not find another way to do it.
The problem right now is that Camel waits until the whole message has been processed, so the response generated by .to("bean:requestHandler?method=processNewSearch") achieves nothing: the message is simply put into the start queue.
How do I immediately return a custom response with Camel and let the route process the request asynchronously?

First and foremost, you should stick to the HTTP protocol and only trigger background tasks through POST operations. You probably don't want a crawler to trigger a long-running background process via GET requests, do you?
As such, you should also make use of the Location HTTP header to return the URI of the resource from which further information on the current state of the process can be retrieved. I would also use a common URI rather than some redirection.
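For illustration, the create operation would then answer along these lines (all values invented):
HTTP/1.1 202 Accepted
Location: http://example.org/api/archives/0f8fad5b-d9cb-469f-a165-70867728950e
Content-Type: application/json; charset="utf-8"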
In your route setup, I usually keep all route-dependent things inside the .route() block as well. We maintain a catalogue assembly process and an EDI message archive system that assembles messages sent and/or received in a certain timeframe, as German law forces clients to back up the EDI messages they exchange.
We separate triggering a new archiving or assembly request from retrieving the current state of a request.
rest("/archives")
.post()
.bindingMode(RestBindingMode.json)
.type(ArchiveRequestSettings.class)
.consumes(MediaType.APPLICATION_JSON)
.produces(MediaType.APPLICATION_JSON)
.description("Invokes the generation of a new message archive for
"messages matching a criteria contained in the payload")
.route().routeId("create-archives")
// Extract the IP address of the user who invokes the service
.bean(ExtractClientIP.class)
// Basic Authentication
.bean(SpringSecurityContextLoader.class).policy(authorizationPolicy)
// check the amount of requests received within a certain time-period
.bean(receivedRequestFilter)
// extract specified settings
.bean(ExtractArchiveRequestSettings.class)
// forward the task to the archive generation queue
.to(SomeEndpoints.ARCHIVE_GENERATION_QUEUE)
// return 202 Accepted response
.bean(ReturnArchiveRequestCreatedStatus.class)
.endRest()
.get("/{archiveId}")
.bindingMode(RestBindingMode.json)
.outType(ArchiveRequestEntity.class)
.produces(MediaType.APPLICATION_JSON)
.description("Returns the status of the message archive generation process."
+ " If the process has finished this operation will return the"
+ " link to the download location of the generated archive")
.route().routeId("archive-status")
// Extract the IP address of the user who invokes the service
.bean(ExtractClientIP.class)
// Basic Authentication
.bean(SpringSecurityContextLoader.class).policy(authorizationPolicy)
// check the amount of requests received within a certain time-period
.bean(receivedRequestFilter)
// return the current state of the task to the client. If the job is done,
// the response will also include a download link as wel as an MD5 hash to
// verify the correctness of the downloaded archive
.bean(ReturnArchiveRequestStatus.class)
.endRest();
The ExtractArchiveRequestSettings class just performs sanity checks on the received payload and sets default values for missing fields. Afterwards the request is stored in the database and its unique identifier is put into a header.
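A minimal sketch of what such a bean could look like (the repository and header constant are borrowed from the snippets below; the default handling is an invented example, not the actual implementation):
public class ExtractArchiveRequestSettings {

    @Resource
    private ArchiveRequestRepository repository;

    @Handler
    public void extractSettings(Exchange exchange) {
        ArchiveRequestSettings settings = exchange.getIn().getBody(ArchiveRequestSettings.class);
        // hypothetical sanity check / default value for a missing field
        if (settings.getDirection() == null) {
            settings.setDirection("DELIVERED");
        }
        // persist the request and remember its identifier for the follow-up steps
        ArchiveRequestEntity entity = ArchiveRequestEntity.builder()
                .from(settings.getFrom())
                .till(settings.getTill())
                .build();
        entity = repository.save(entity);
        exchange.getIn().setHeader(HeaderConstants.ARCHIVES_REQUEST_ID, entity.getId());
    }
}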
The ArchiveRequestSettings class looks like the sample below (slightly simplified):
@Getter
@Setter
@JsonIgnoreProperties(ignoreUnknown = true)
@JsonInclude(JsonInclude.Include.NON_NULL)
public class ArchiveRequestSettings {

    /** Specifies if sent or received messages should be included in the artifact. Setting this field
     * to 'DELIVERED' will include only delivered documents where the companyUuid of the requesting
     * user matches the document's sender identifier. Specifying this field as 'RECEIVED' will include
     * only documents whose receiver identifier matches the companyUuid of the requesting user. **/
    private String direction;

    /** The naming schema of entries within the archive **/
    private String entryPattern;

    /** The upper timestamp bound for including messages. Entries older than this value will be omitted **/
    @JsonSerialize(using = Iso8601DateSerializer.class)
    @JsonDeserialize(using = Iso8601DateDeserializer.class)
    private Date from;

    /** The lower timestamp bound for including messages. Entries younger than this value will be
     * omitted. If left empty this will include even the most recent messages. **/
    @JsonSerialize(using = Iso8601DateSerializer.class)
    @JsonDeserialize(using = Iso8601DateDeserializer.class)
    private Date till;
}
The ReturnArchiveRequestCreatedStatus class looks up the stored request entity and returns it with a 202 Accepted response.
@Handler
public void returnStatus(Exchange exchange) {
    String archiveId = exchange.getIn().getHeader(HeaderConstants.ARCHIVES_REQUEST_ID, String.class);
    ArchiveRequestEntity archive = repository.findOne(archiveId);

    Message msg = new DefaultMessage(exchange.getContext());
    msg.setHeader(Exchange.HTTP_RESPONSE_CODE, 202); // Accepted
    msg.setHeader(Exchange.CONTENT_TYPE, "application/json; charset=\"utf-8\"");
    msg.setHeader("Location", archiveLocationUrl + "/" + archiveId);
    msg.setBody(archive);
    exchange.setOut(msg);
}
Returning the current state of the stored request ensures that the client can check which settings actually got applied, and can update them if some of the defaults are inconvenient or further changes need to be applied.
The actual backing process is started by sending the exchange to a Redis queue, which is consumed on a different machine. The output of this process is an archive containing the requested files, which is uploaded to a publicly accessible location; only the link is stored in the request entity. Note that we have a custom Camel component that mimics a seda endpoint for Redis queues. Plain seda, though, should be enough to start processing the task on a different thread.
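For illustration, with plain seda the hand-off could look roughly like this (queue and bean names are made up):
// the REST route hands the work off and returns without waiting for a reply
.to("seda:archive-generation?waitForTaskToComplete=Never")

// the long-running work happens on seda's own thread pool
from("seda:archive-generation?concurrentConsumers=2")
    .bean(GenerateArchive.class); // hypothetical worker bean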
Depending on the current status of the backing process, the stored request entity is updated by that process. On receiving a status request (via GET), the datastore is queried for the current status, which is mapped to certain responses:
public class ReturnArchiveRequestStatus {

    @Resource
    private ArchiveRequestRepository repository;

    @Handler
    public void returnArchiveStatus(Exchange exchange) throws JSONException {
        String archiveId = exchange.getIn().getHeader("archiveId", String.class);
        if (StringUtils.isBlank(archiveId)) {
            badRequest(exchange);
            return;
        }
        ArchiveRequestEntity archive = repository.findOne(archiveId);
        if (null == archive) {
            notFound(archiveId, exchange);
            return;
        }
        ok(archive, exchange);
    }

    private void badRequest(Exchange exchange) throws JSONException {
        Message msg = new DefaultMessage(exchange.getContext());
        msg.setHeader(Exchange.HTTP_RESPONSE_CODE, 400);
        msg.setHeader(Exchange.CONTENT_TYPE, "application/json; charset=\"utf-8\"");
        msg.setFault(false);

        JSONObject json = new JSONObject();
        json.put("status", "ERROR");
        json.put("message", "No archive identifier found");
        msg.setBody(json.toString());

        exchange.setOut(msg);
    }

    private void notFound(String archiveId, Exchange exchange) throws JSONException {
        Message msg = new DefaultMessage(exchange.getContext());
        msg.setHeader(Exchange.HTTP_RESPONSE_CODE, 404);
        msg.setHeader(Exchange.CONTENT_TYPE, "application/json; charset=\"utf-8\"");
        msg.setFault(false);

        JSONObject json = new JSONObject();
        json.put("status", "ERROR");
        json.put("message", "Could not find pending archive process with ID " + archiveId);
        msg.setBody(json.toString());

        exchange.setOut(msg);
    }

    private void ok(ArchiveRequestEntity archive, Exchange exchange) throws JSONException {
        Message msg = new DefaultMessage(exchange.getContext());
        msg.setHeader(Exchange.HTTP_RESPONSE_CODE, 200);
        msg.setHeader(Exchange.CONTENT_TYPE, "application/json; charset=\"utf-8\"");
        msg.setFault(false);
        msg.setBody(archive);
        exchange.setOut(msg);
    }
}
The actual entity that is stored and updated throughout the whole process looks something along these lines (simplified):
@Getter
@Setter
@Builder
@ToString
@Document(collection = "archive")
@JsonInclude(JsonInclude.Include.NON_EMPTY)
public class ArchiveRequestEntity {

    /**
     * The current state of the archiving process
     */
    public enum State {
        /** The request to create an archive was queued but not yet processed **/
        QUEUED,
        /** The archive is currently under construction **/
        RUNNING,
        /** The archive was generated successfully. {@link #downloadUrl} should contain the link
         * where the archive can be found **/
        FINISHED,
        /** Indicates that the archive generation failed. {@link #error} should indicate the actual
         * reason why the request failed **/
        FAILED
    }

    @Id
    @JsonIgnore
    private String id;

    /** Timestamp the process was triggered **/
    @JsonIgnore
    @Indexed(expireAfterSeconds = DEFAULT_EXPIRE_TIME)
    private Date timestamp = new Date();

    /** The identifier of the company to create the archive for **/
    private String companyUuid;

    /** The state this archive is currently in **/
    private State state = State.QUEUED;

    ...

    /** Marks the upper limit for including entries in the archive. Entries older than this field
     * will not be included, while entries equal to or younger than this timestamp will be
     * included unless they are younger than the {@link #till} timestamp **/
    @JsonFormat(pattern = "yyyy-MM-dd'T'HH:mm:ssXX")
    private Date from;

    /** Marks the lower limit for including entries in the archive. Entries younger than this field
     * will not be included in the archive **/
    @JsonFormat(pattern = "yyyy-MM-dd'T'HH:mm:ssXX")
    private Date till;

    /** Information on why the archive creation failed **/
    private String error;

    /** The URL of the final archive to download **/
    private String downloadUrl;

    /** The MD5 hash of the final artifact in order to guarantee clients an unmodified version of
     * the archive **/
    private String md5Hash;

    ...
}
Note that regardless of the processing status, a 200 OK is returned with the current JSON representation of the process's status. The client will either see a FINISHED state with the downloadUrl and md5Hash properties set, or a different state with correspondingly different properties available.
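For illustration, the JSON representation of a finished request might look like this (all values invented):
{
  "companyUuid": "0f8fad5b-d9cb-469f-a165-70867728950e",
  "state": "FINISHED",
  "from": "2017-01-01T00:00:00Z",
  "till": "2017-06-30T23:59:59Z",
  "downloadUrl": "https://archives.example.com/jf3298u23.zip",
  "md5Hash": "9e107d9d372bb6826bd81d3542a419d6"
}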
The backing process, of course, needs to update the request status appropriately, as otherwise the client would not retrieve correct information on the current state of the request.
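A rough sketch of those updates inside the backing process (the helper methods are invented placeholders):
// mark the request as running before starting the heavy lifting
archive.setState(ArchiveRequestEntity.State.RUNNING);
repository.save(archive);
try {
    File zip = assembleArchive(archive);               // hypothetical long-running work
    archive.setDownloadUrl(uploadToPublicStore(zip));  // hypothetical upload helper
    archive.setMd5Hash(md5Of(zip));                    // hypothetical hash helper
    archive.setState(ArchiveRequestEntity.State.FINISHED);
} catch (Exception e) {
    archive.setState(ArchiveRequestEntity.State.FAILED);
    archive.setError(e.getMessage());
}
repository.save(archive);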
This approach should be applicable to almost any long-running process, though the internals of which information you pass along will probably differ from our scenario. Hope this helps.

Related

Design of a pipeline that invokes a maximum number of requests per second

My goal is to create a pipeline that invokes a back-end (Cloud hosted) service a maximum number of times per second ... how can I achieve that?
Back story: Imagine a back-end service that is invoked with a single input and returns a single output. This service has quotas associated with it that permit a maximum number of requests per second (let's say 10 requests per second). Now imagine an unbounded source PCollection where I wish to transform the elements in the input by passing them through my back-end service. I can envisage a ParDo invoking the back-end service once for each element in the input PCollection. However, this doesn't perform any kind of flow control against the back-end.
I could imagine my DoFn logic testing the response from the back-end response and retrying till it succeeds but this doesn't feel right. If I have 100 workers, then I seem to be burning a lot of resources and putting a load on the back-end. What I think I want to do is throttle the calls to the back-end from the pipeline.
Good Day, kolban. In addition to Bruno Volpato's helpful RampupThrottlingFn example, I've seen a combination of the following used. Please do not hesitate to let me know how I can update the example with more clarity.
PeriodicImpulse - emits an Instant at a fixed specified interval.
Fix the number of workers with the maxNumWorkers and numWorkers options (see Dataflow Pipeline Options), if using the Dataflow runner.
Use the Beam Metrics API to monitor the actual resource request count over time and to set alerts. When using Dataflow, the Beam Metrics API automatically connects to Cloud Monitoring as custom metrics.
The following shows abbreviated code, starting from the whole pipeline and followed by details as needed for clarity. It assumes a target of 10 workers, using Dataflow with the arguments --maxNumWorkers=10 and --numWorkers=10, and a goal of limiting the resource requests among all workers to 10 requests per second. This translates to 1 request per second per worker.
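As a side note, the pipeline options would typically be parsed from those arguments roughly like this (a sketch):
// parses --maxNumWorkers, --numWorkers, etc. from the command line
PipelineOptions options = PipelineOptionsFactory.fromArgs(args).withValidation().create();
Pipeline pipeline = Pipeline.create(options);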
PeriodicImpulse limits the Request creation to 1 per second
public class MyPipeline {

    public static void main(String[] args) {
        Pipeline pipeline = Pipeline.create(/* usually with options */);

        PCollection<Response> responses = pipeline
            .apply("PeriodicImpulse",
                PeriodicImpulse.create().withInterval(Duration.standardSeconds(1L)))
            .apply("Build Requests", ParDo.of(new RequestFn()))
            .apply(ResourceTransform.create());

        pipeline.run();
    }
}
The RequestFn DoFn emits a Request for each Instant emitted by PeriodicImpulse
class RequestFn extends DoFn<Instant, Request> {
    @ProcessElement
    public void process(@Element Instant instant, OutputReceiver<Request> receiver) {
        receiver.output(Request.builder().build());
    }
}
ResourceTransform transforms Requests to Responses, incrementing a Counter
class ResourceTransform extends PTransform<PCollection<Request>, PCollection<Response>> {

    static ResourceTransform create() {
        return new ResourceTransform();
    }

    @Override
    public PCollection<Response> expand(PCollection<Request> input) {
        return input.apply("Consume Resource", ParDo.of(new ResourceFn()));
    }
}

class ResourceFn extends DoFn<Request, Response> {

    private Counter counter = Metrics.counter(ResourceFn.class, "some:resource");

    private transient ResourceClient client = null;

    @Setup
    public void setup() {
        client = new ResourceClient();
    }

    @ProcessElement
    public void process(@Element Request request, OutputReceiver<Response> receiver) {
        counter.inc(); // Increment the counter.
        // not showing error handling
        Response response = client.execute(request);
        receiver.output(response);
    }
}
Request and Response classes
(Aside: consider creating a Schema for the request input and response output classes. Example below uses AutoValue and AutoValueSchema)
@DefaultSchema(AutoValueSchema.class)
@AutoValue
abstract class Request {
    /* abstract getters */
    abstract String getId();

    static Builder builder() {
        return new AutoValue_Request.Builder();
    }

    @AutoValue.Builder
    abstract static class Builder {
        /* abstract setters */
        abstract Builder setId(String value);
        abstract Request build();
    }
}

@DefaultSchema(AutoValueSchema.class)
@AutoValue
abstract class Response {
    /* abstract getters */
    abstract String getId();

    static Builder builder() {
        return new AutoValue_Response.Builder();
    }

    @AutoValue.Builder
    abstract static class Builder {
        /* abstract setters */
        abstract Builder setId(String value);
        abstract Response build();
    }
}

Apache Camel: GET service call using toD() results in infinite loop

I want to read file(s) from a location, extract the fileName and make a REST call (GET) with the fileName as a request parameter. The file name needs to be passed dynamically, as each file is unique. I used toD() after going through the tutorials. The high-level pseudo-code is provided below (I am just interested in the status code from this call; there are further operations required after this).
The issue I am facing now with toD() is that it runs into an infinite loop after making the GET service call.
How can this issue be handled? Appreciate your suggestions!
from("file:C:/inbound?delete=true&noop=true")
.process(new Processor() {
public void process(Exchange exchange) throws Exception {
String fileName = exchange.getIn().getHeader("CamelFileName").toString();
exchange.getIn().setHeader("fileName", fileName);
}
})
.setHeader(Exchange.HTTP_METHOD, simple("GET"))
.toD("http://localhost:8090/fileWatcher?fileName=${header.fileName}")
Here's a simple GET endpoint mock-up running on port 8090:
@RequestMapping(value = "/fileWatcher", method = RequestMethod.GET)
public ResponseEntity<FileDetails> firstService(@RequestParam String fileName) {
    return new ResponseEntity<>(HttpStatus.OK);
}

Error using "condition paramter header" #StreamListener of new release Chelsea.RC1

I am trying to use the event filter to reduce the number of topics the application uses, via the new feature available in the new Spring Cloud Stream release (Chelsea.RC1). The message is being created with the correct header; however, inspecting the contents of the message in the queue, the message does not contain the header, only the body with the payload.
public void sendEnroll(EnrollCommand data) {
    // MessageChannel
    outputEnroll.send(MessageBuilder
        .withPayload(data)
        .setHeader("brand", "MASTERCARD")
        .setHeader("operation", Operation.ENROLL)
        .build());
}
Consumer
@Service
@EnableBinding(Channel.class)
public class EnrollConsumer {

    @Autowired
    private EnrollService service;

    @StreamListener(target = Channel.INPUT_ENROLL, condition = "headers['brand']=='MASTERCARD'")
    public void enrollConsumer(@Payload String command) {
        System.out.println(command);
        //service.enrollment(command);
    }
}
The consumer service logs the following warning:
WARN -kafka-listener-1 o.s.c.s.b.DispatchingStreamListenerMessageHandler:62 - Cannot find a #StreamListener matching for message with id: 7baae934-7484-a7fd-91b0-ba906558bb13
You have to map your custom headers:
spring.cloud.stream.kafka.binder.headers = brand,operation
That information is present in the documentation.

JAX-RS AsyncResponse.resume() with Location header

UPDATE
Some more digging showed that thrown exceptions were being dropped, and the actual problem is that an injected UriInfo could not be resolved in the AsyncResponse's thread!
Accessing @Context UriInfo uriInfo during AsyncResponse.resume() gives the following LoggableFailure message:
Unable to find contextual data of type: javax.ws.rs.core.UriInfo
ORIGINAL
According to RFC 7231 (HTTP/1.1 Semantics and Content), a POST should return 201 Created and supply the new resource's location in the header:
the origin server
SHOULD send a 201 (Created) response containing a Location header
field that provides an identifier for the primary resource created
(Section 7.1.2) and a representation that describes the status of the
request while referring to the new resource(s).
When writing a synchronous REST server, javax.ws.rs.core.Response offers the Response.created() shorthand, which does exactly that.
I would save the new entity, build a URI and return:
return Response.created(createURL(created)).build();
However, when I switch to an asynchronous approach utilizing a
@Suspended javax.ws.rs.container.AsyncResponse
the HTTP request on the client hangs indefinitely:
@POST
public void createUser(@Valid User user, @Suspended AsyncResponse asyncResponse) {
    executorService.submit(() -> {
        User created = userService.create(user);
        asyncResponse.resume(
            Response.created(createURL(created)).build()
        );
    });
}
Through trial and error I found out that the modified Location header is responsible.
If I return my entity and set 201 Created without touching the header, the request eventually resolves:
@POST
public void createUser(@Valid User user, @Suspended AsyncResponse asyncResponse) {
    executorService.submit(() -> {
        User created = userService.create(user);
        asyncResponse.resume(
            Response.status(Status.CREATED).entity(created).build() // this works
            //Response.created(createURL(created)).build()
        );
    });
}
So what's the problem? Am I misunderstanding the concepts?
I am running RESTEasy on GlassFish 4.1.
If you need more information, please comment!
Edit: As soon as I change any link or the header, the request hangs.
In case anyone ever has the same problem:
The problem was that I created the Location header through an injected @Context UriInfo uriInfo using its .getAbsolutePathBuilder().
The approach worked in a synchronous server because the thread which accessed the UriInfo was still within the same request context.
However, when I switched to an async approach, the underlying Runnable which eventually had to access uriInfo.getAbsolutePathBuilder() was NOT within any context, thus throwing an exception which halted further execution.
The workaround:
In any async method which should return a Location header, I call .getAbsolutePathBuilder() while still within the context. The UriBuilder implementation can then be used within the async run:
@POST
public void createUser(@Valid User user, @Suspended AsyncResponse asyncResponse) {
    UriBuilder ub = uriInfo.getAbsolutePathBuilder();
    executorService.submit(() -> {
        User created = userService.create(user);
        asyncResponse.resume(
            Response.created(createURL(ub, created)).build()
        );
    });
}

private URI createURL(UriBuilder builder, ApiRepresentation entity) {
    return builder.path(entity.getId().toString()).build();
}

Extracting the complete envelope XML from MessageContext

I have an interceptor like this:
public class WebServiceInterceptor extends EndpointInterceptorAdapter {

    @Inject
    private Jaxb2Marshaller myJaxb2Marshaller;

    @Inject
    private WebServiceHistoryDao webServiceHistoryDao;

    @Override
    public boolean handleRequest(MessageContext messageContext, Object endpoint)
            throws Exception {
        Source payloadSource = messageContext.getRequest().getPayloadSource();
        Object unmarshaled = myJaxb2Marshaller.unmarshal(payloadSource);
        //EXTRACT XML HERE
        //is there a better way than this:
        StringWriter writer = new StringWriter();
        myJaxb2Marshaller.marshal(unmarshaled, new StreamResult(writer));
        String extractedXml = writer.toString();
        return true;
    }
}
How can I extract the whole XML of the envelope (for logging purposes, to write it to the DB)?
You don't need to write one; there's an existing one in the API - SoapEnvelopeLoggingInterceptor. See the javadoc:
SOAP-specific EndpointInterceptor that logs the complete request and response envelope of SoapMessage messages. By default, request, response and fault messages are logged, but this behaviour can be changed using the logRequest, logResponse, logFault properties.
If you only need to see the payload, rather than the entire SOAP envelope, then there's PayloadLoggingInterceptor.
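A minimal sketch of registering the interceptor with Java config (assuming Spring WS's WsConfigurerAdapter; the logFault property comes from the javadoc quoted above):
@EnableWs
@Configuration
public class WebServiceConfig extends WsConfigurerAdapter {

    @Override
    public void addInterceptors(List<EndpointInterceptor> interceptors) {
        SoapEnvelopeLoggingInterceptor logging = new SoapEnvelopeLoggingInterceptor();
        logging.setLogFault(true); // requests and responses are logged by default
        interceptors.add(logging);
    }
}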