List files under a folder within a Camel REST service

I am creating a REST service using the Camel Rest DSL. Within the service I need to list all files under a folder and do some processing on them. Please find the code below:
from("direct:postDocument")
.to("file:/home/s469457/service/content-util/composite?noop=true")
.setBody(constant(null))
.log("Scanning file ${file:name.noext}.${file:name.ext}...");
Please advise.
~ Arunava

I would suggest writing a processor or a bean to list the files in the directory. I think that would be more efficient and much simpler; with Camel's file component you would have to deal with intricacies you might not expect. (A sketch of that approach follows the pseudocode below.)
If you want to stay with the file component regardless, you will need to use pollEnrich and afterwards aggregate the whole result. I also think you would run into trouble reading the files multiple times; to solve that you might need to configure an idempotent repository, and when reading files there can be concurrency/file-locking issues...
Here's some pseudocode to get you started if you want to go that way:
from("direct:listFiles")
.pollEnrich("file:"+getFullPath()+"?noop=true")
.aggregate(new AggregationStrategy {
public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
String filename = newExchange.getIn().getHeader("CamelFileName", String.class)
if (oldExchange == null) {
newExchange.getIn().setBody(new ArrayList<String>(Arrays.asList(filename)));
return newExchange;
} else {
...
}
})
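For the processor approach mentioned above, here is a minimal sketch using plain java.io instead of the file component. The folder path is taken from the question; everything else is illustrative, not a definitive implementation:
from("direct:listFiles")
    .process(new Processor() {
        @Override
        public void process(Exchange exchange) throws Exception {
            // plain java.io listing, no file component involved
            File dir = new File("/home/s469457/service/content-util/composite");
            File[] files = dir.listFiles();
            List<String> names = new ArrayList<>();
            if (files != null) {
                for (File f : files) {
                    names.add(f.getName());
                }
            }
            exchange.getIn().setBody(String.join("\n", names));
        }
    });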

// Camel REST API to list files
rest("/my-api/")
    .get()
    .produces("text/plain")
    .to("direct:listFiles");

// Camel route to list files
// Note: fileList is shared mutable state on the RouteBuilder, so this
// route is not safe for concurrent requests.
List<String> fileList = new ArrayList<String>();

from("direct:listFiles")
    .loopDoWhile(body().isNotNull())
        .pollEnrich("file:/home/s469457/service/content-util/composite?noop=true&recursive=true&idempotent=false&include=.*.csv")
        .choice()
            .when(body().isNotNull())
                .process(new Processor() {
                    @Override
                    public void process(Exchange exchange) throws Exception {
                        File file = exchange.getIn().getBody(File.class);
                        fileList.add(file.getName());
                    }
                })
            .otherwise()
                .process(new Processor() {
                    @Override
                    public void process(Exchange exchange) throws Exception {
                        if (fileList.size() != 0)
                            exchange.getOut().setBody(String.join("\n", fileList));
                        fileList.clear();
                    }
                })
    .end();
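With the route above, a GET request to the /my-api/ endpoint returns the names of the matching .csv files, one per line, or an empty body when the folder has no matching files.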

Related

How to generate output files for each input in Apache Flink

I'm using Flink to process my streaming data.
The stream is coming from some other middleware, such as Kafka, Pravega, etc.
Say Pravega is sending some word stream, for example hello world my name is....
What I need is a three-step process:
Map each word to my custom class object MyJson.
Map the object MyJson to String.
Write Strings to files: one String is written to one file.
For example, for the stream hello world my name is, I should get five files.
Here is my code:
// init Pravega connector
PravegaDeserializationSchema<String> adapter = new PravegaDeserializationSchema<>(String.class, new JavaSerializer<>());
FlinkPravegaReader<String> source = FlinkPravegaReader.<String>builder()
.withPravegaConfig(pravegaConfig)
.forStream(stream)
.withDeserializationSchema(adapter)
.build();
// map stream to MyJson
DataStream<MyJson> jsonStream = env.addSource(source).name("Pravega Stream")
.map(new MapFunction<String, MyJson>() {
@Override
public MyJson map(String s) throws Exception {
MyJson myJson = JSON.parseObject(s, MyJson.class);
return myJson;
}
});
// map MyJson to String
DataStream<String> valueInJson = jsonStream
.map(new MapFunction<MyJson, String>() {
@Override
public String map(MyJson myJson) throws Exception {
return myJson.toString();
}
});
// output
valueInJson.print();
This code will output all of the results to the Flink log files.
My question is how to write one word to one output file?
I think the easiest way to do this would be with a custom sink.
stream.addSink(new WordFileSink());
public static class WordFileSink implements SinkFunction<String> {
@Override
public void invoke(String value, Context context) {
// generate a unique name for the new file and open it
// write the word to the file
// close the file
}
}
Note that this implementation won't necessarily provide exactly-once behavior. You might want to take care that the file naming scheme is both unique and deterministic (rather than depending on processing time), and be prepared for the case where the file may already exist.
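A minimal filled-in sketch of that idea, assuming a fixed output directory and deriving the file name from the word itself (both are assumptions of this sketch, not part of the original answer):
public static class WordFileSink implements SinkFunction<String> {

    private static final String OUTPUT_DIR = "/tmp/words"; // assumed location

    @Override
    public void invoke(String value, Context context) throws Exception {
        // Deterministic name: the word plus its hash, so re-processing the same
        // word overwrites the same file instead of creating duplicates.
        String fileName = value.replaceAll("\\W+", "_")
                + "-" + Integer.toHexString(value.hashCode()) + ".txt";
        java.nio.file.Path path = java.nio.file.Paths.get(OUTPUT_DIR, fileName);
        java.nio.file.Files.createDirectories(path.getParent());
        java.nio.file.Files.write(path, value.getBytes(java.nio.charset.StandardCharsets.UTF_8));
    }
}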

Use exchange message inside the .to() method in apache camel

I'm new to Camel and would like to change my route dynamically according to some logic performed beforehand.
camelContext.addRoutes(new RouteBuilder() {
public void configure() {
PropertiesComponent pc = getContext().getComponent("properties", PropertiesComponent.class);
pc.setLocation("classpath:application.properties");
log.info("About to start route: Kafka Server -> Log ");
from("kafka:{{consumer.topic}}?brokers={{kafka.host}}:{{kafka.port}}"
+ "&maxPollRecords={{consumer.maxPollRecords}}"
+ "&consumersCount={{consumer.consumersCount}}"
+ "&seekTo={{consumer.seekTo}}"
+ "&groupId={{consumer.group}}"
+ "&valueDeserializer=" + BytesDeserializer.class.getName())
.routeId("FromKafka")
.process(new Processor() {
@Override
public void process(Exchange exchange) throws Exception {
System.out.println(" message: " + exchange.getIn().getBody());
Bytes body = exchange.getIn().getBody(Bytes.class);
HashMap data = (HashMap)SerializationUtils.deserialize(body.get());
// do some work on data;
Map messageBusDetails = new HashMap();
messageBusDetails.put("topicName", "someTopic");
messageBusDetails.put("producerOption", "bla");
exchange.getOut().setHeader("kafka", messageBusDetails);
exchange.getOut().setBody(SerializationUtils.serialize(data));
}
}).choice()
.when(header("kafka"))
.to("kafka:"+ **getHeader("kafka").get("topicName")** )
.log("${body}");
}
});
getHeader("kafka").get("topicName")
This is what I'm trying to achieve.
But I don't know how to access the header's value (which is a map, because a Kafka producer might have more configuration) inside the .to().
I understand I might be using it totally wrong... but that's what I've managed to understand so far...
The main goal is to have multiple message buses in the .from(),
and multiple message bus options in the .to(), decided via an external source (like a config file), so that the same route will apply to many logic scenarios.
I thought the choice() method was the best answer.
Thanks!
Instead of to(), you can use toD(), which is the "Dynamic To".
See the Camel documentation on Dynamic To for details.
For the syntax used to pull in headers etc., see the Simple expression language page.
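For example, a sketch based on the route in the question (the "kafka" header and its "topicName" key come from the question's processor; the map lookup uses Simple's OGNL-style key access, which is an assumption worth verifying for your Camel version):
.choice()
    .when(header("kafka"))
        // toD evaluates the Simple expression at send time, so the target
        // topic is read from the map stored in the "kafka" header
        .toD("kafka:${header.kafka[topicName]}")
        .log("${body}");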

How to do Async Http Call with Apache Beam (Java)?

The input PCollection is HTTP requests, which is a bounded dataset. I want to make an async HTTP call (Java) in a ParDo, parse the response, and put the results into an output PCollection. My code is below. I'm getting the following exception.
I couldn't figure out the reason and need some guidance....
java.util.concurrent.CompletionException: java.lang.IllegalStateException: Can't add element ValueInGlobalWindow{value=streaming.mapserver.backfill.EnrichedPoint#2c59e, pane=PaneInfo.NO_FIRING} to committed bundle in PCollection Call Map Server With Rate Throttle/ParMultiDo(ProcessRequests).output [PCollection]
Code:
public class ProcessRequestsFn extends DoFn<PreparedRequest,EnrichedPoint> {
private static AsyncHttpClient _HttpClientAsync;
private static ExecutorService _ExecutorService;
static{
AsyncHttpClientConfig cg = config()
.setKeepAlive(true)
.setDisableHttpsEndpointIdentificationAlgorithm(true)
.setUseInsecureTrustManager(true)
.addRequestFilter(new RateLimitedThrottleRequestFilter(100,1000))
.build();
_HttpClientAsync = asyncHttpClient(cg);
_ExecutorService = Executors.newCachedThreadPool();
}
@DoFn.ProcessElement
public void processElement(ProcessContext c) {
PreparedRequest request = c.element();
if(request == null)
return;
_HttpClientAsync.prepareGet((request.getRequest()))
.execute()
.toCompletableFuture()
.thenApply(response -> { if(response.getStatusCode() == HttpStatusCodes.STATUS_CODE_OK){
return response.getResponseBody();
} return null; } )
.thenApply(responseBody -> {
    List<EnrichedPoint> resList = new ArrayList<>();
    /*some process logic here*/
    System.out.printf("%d enriched points back\n", resList.size());
    return resList;
})
.thenAccept(resList -> {
for (EnrichedPoint enrichedPoint : resList) {
c.output(enrichedPoint);
}
})
.exceptionally(ex->{
System.out.println(ex);
return null;
});
}
}
The Scio library implements a DoFn which deals with asynchronous operations. The BaseAsyncDoFn might provide the handling you need. Since you're dealing with CompletableFuture, also take a look at the JavaAsyncDoFn.
Please note that you don't necessarily need to use the Scio library; you can take the main idea of BaseAsyncDoFn, since it's independent of the rest of Scio.
The issue that you're hitting is that you're outputting outside the context of a processElement or finishBundle call.
You'll want to gather all your outputs in memory and output them eagerly during future processElement calls, and at the end within finishBundle by blocking until all your calls finish.
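A rough sketch of that pattern (assuming the input PCollection is in the global window, and with callMapServerAsync standing in for the AsyncHttpClient chain from the question):
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.windowing.GlobalWindow;
import org.joda.time.Instant;

public class ProcessRequestsFn extends DoFn<PreparedRequest, EnrichedPoint> {

    // Futures started during the current bundle.
    private transient List<CompletableFuture<List<EnrichedPoint>>> pending;
    // Results that completed while no processElement call was active.
    private transient ConcurrentLinkedQueue<EnrichedPoint> completed;

    @StartBundle
    public void startBundle() {
        pending = new ArrayList<>();
        completed = new ConcurrentLinkedQueue<>();
    }

    @ProcessElement
    public void processElement(ProcessContext c) {
        // Safe point to output: flush anything that finished earlier.
        EnrichedPoint done;
        while ((done = completed.poll()) != null) {
            c.output(done);
        }

        pending.add(
            callMapServerAsync(c.element()) // hypothetical wrapper for the async HTTP call
                .whenComplete((points, err) -> {
                    if (err == null && points != null) {
                        completed.addAll(points);
                    }
                }));
    }

    @FinishBundle
    public void finishBundle(FinishBundleContext c) {
        // Block until every in-flight request has finished...
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        // ...then emit what is left. Assumes the global window; adapt the
        // timestamp/window if the input PCollection is windowed differently.
        EnrichedPoint done;
        while ((done = completed.poll()) != null) {
            c.output(done, Instant.now(), GlobalWindow.INSTANCE);
        }
    }

    private CompletableFuture<List<EnrichedPoint>> callMapServerAsync(PreparedRequest request) {
        // Wire in the _HttpClientAsync.prepareGet(...).execute().toCompletableFuture()
        // chain from the question here.
        throw new UnsupportedOperationException("not implemented in this sketch");
    }
}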

Pax Exam how to start multiple containers

For a project I'm working on, we need to write Pax Exam integration tests which run over multiple Karaf containers.
The idea would be to find a way to extend/configure Pax Exam to start up a Karaf container (or more), deploy a bunch of bundles there, and then start the test Karaf container which will then test the functionality.
We need this for performance tests and other verifications.
Does anyone know anything about that? Is it actually possible in Pax Exam?
I'll answer this myself, after having found this interesting article.
In particular, have a look at the sections Using the Karaf Shell and Distributed integration tests in Karaf:
http://planet.jboss.org/post/advanced_integration_testing_with_pax_exam_karaf
This is basically what the article says:
First of all, you have to change the test probe header to allow dynamic package imports:
@ProbeBuilder
public TestProbeBuilder probeConfiguration(TestProbeBuilder probe) {
probe.setHeader(Constants.DYNAMICIMPORT_PACKAGE, "*;status=provisional");
return probe;
}
After that, the article suggests the following code, which is able to execute commands in the Karaf shell:
@Inject
CommandProcessor commandProcessor;

// 'executor' (an ExecutorService) and COMMAND_TIMEOUT are assumed to be
// fields of the test class (not shown here).
protected String executeCommands(final String... commands) {
    String response;
    final ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
    final PrintStream printStream = new PrintStream(byteArrayOutputStream);
    final CommandSession commandSession = commandProcessor.createSession(System.in, printStream, System.err);
    FutureTask<String> commandFuture = new FutureTask<String>(
        new Callable<String>() {
            public String call() {
                try {
                    for (String command : commands) {
                        System.err.println(command);
                        commandSession.execute(command);
                    }
                } catch (Exception e) {
                    e.printStackTrace(System.err);
                }
                return byteArrayOutputStream.toString();
            }
        });
    try {
        executor.submit(commandFuture);
        response = commandFuture.get(COMMAND_TIMEOUT, TimeUnit.MILLISECONDS);
    } catch (Exception e) {
        e.printStackTrace(System.err);
        response = "SHELL COMMAND TIMED OUT: ";
    }
    return response;
}
Then the rest is fairly trivial: you will have to implement a layer able to start up child instances of Karaf.
public void createInstances() {
//Install broker feature that is provided by FuseESB
executeCommands("admin:create --feature broker brokerChildInstance");
//Install producer feature provided by an imaginary feature repo.
executeCommands("admin:create --featureURL mvn:imaginary/repo/1.0/xml/features --feature producer producerChildInstance");
//Install consumer feature provided by an imaginary feature repo.
executeCommands("admin:create --featureURL mvn:imaginary/repo/1.0/xml/features --feature consumer consumerChildInstance");
//start child instances
executeCommands("admin:start brokerChildInstance");
executeCommands("admin:start producerChildInstance");
executeCommands("admin:start consumerChildInstance");
//You will need to destroy the child instances once you are done.
//Using @After seems the right place to do that.
}

Xtend Code Generator How to Copy Files

I am implementing my own DSL and using Xtend to generate code. I need some static resources to be copied alongside my generated code. I was trying to use commons-io, but I couldn't get anywhere with that! What is the best way to do this? I am trying to avoid reading each file and writing it to the corresponding file in the output path...
This should do it (taken from this web site, slightly modified, not tested):
def static void copyFileUsingChannel(File source, File dest) throws IOException {
    var FileChannel sourceChannel = null
    var FileChannel destChannel = null
    try {
        sourceChannel = new FileInputStream(source).getChannel()
        destChannel = new FileOutputStream(dest).getChannel()
        destChannel.transferFrom(sourceChannel, 0, sourceChannel.size())
    } finally {
        // guard against a channel that was never opened
        if (sourceChannel !== null) sourceChannel.close()
        if (destChannel !== null) destChannel.close()
    }
}
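If you are on Java 7 or newer, java.nio.file.Files.copy is even shorter and avoids the channel handling entirely. A sketch (plain Java, callable from Xtend as well; not from the original answer):
public static void copyFile(File source, File dest) throws IOException {
    // Replaces the target if it already exists; adjust the options as needed.
    java.nio.file.Files.copy(source.toPath(), dest.toPath(),
        java.nio.file.StandardCopyOption.REPLACE_EXISTING);
}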