java.io.BufferedReader().map Cannot infer type argument(s) for <T> fromStream(Stream<? extends T>) - mongodb

Scenario: a Spring WebFlux application triggering CommandLineRunner.run in order to load data into MongoDB for testing purposes.
Goal: when the microservice starts locally, it should read a JSON file and load its documents into MongoDB.
Personal knowledge: "bufferedReader.lines().filter(l -> !l.trim().isEmpty())" reads each JSON node and returns it as a stream. Then I can map it to "l" and access the get methods. I guess I don't have to create a list and then stream it, since I have already loaded it as a stream via "new InputStreamReader(getClass().getClassLoader().getResourceAsStream())", and I assume I can use lines() since each node will result in a string line. Am I on the right track, or am I mixing up some idea?
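In other words, I understand that each element produced by lines() is still a plain String, so something roughly like the sketch below would be needed to turn a line into an Extrato before its getters can be called. Jackson's ObjectMapper is used here purely for illustration, and the sketch assumes one complete JSON object per line and that Extrato is deserializable by Jackson:
// Rough sketch only: each line is a String, so it has to be parsed before
// Extrato getters exist. Requires com.fasterxml.jackson.databind.ObjectMapper,
// java.io.IOException and java.io.UncheckedIOException.
ObjectMapper mapper = new ObjectMapper();
Flux.fromStream(bufferedReader.lines()
        .filter(l -> !l.trim().isEmpty())
        .map(l -> {
            try {
                return mapper.readValue(l, Extrato.class); // String -> Extrato
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }))
        .flatMap(extratoRepository::save)
        .subscribe(m -> log.info("Carga Teste: {}", m));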
This is a sample of the JSON file:
{
  "Extrato": {
    "description": "credit",
    "value": "R$1.000,00",
    "status": 11
  },
  "Extrato": {
    "description": "debit",
    "value": "R$2.000,00",
    "status": 99
  }
}
Model
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;
@Document
public class Extrato {
@Id
private String id;
private String description;
private String value;
private Integer status;
public Extrato(String id, String description, String value, Integer status) {
super();
this.id = id;
this.description = description;
this.value = value;
this.status = status;
}
... getters and setters accordingly
}
Repository
import org.springframework.data.mongodb.repository.Query;
import org.springframework.data.repository.reactive.ReactiveCrudRepository;
import com.noblockingcase.demo.model.Extrato;
import reactor.core.publisher.Flux;
import org.springframework.data.domain.Pageable;
public interface ExtratoRepository extends ReactiveCrudRepository<Extrato, String> {
@Query("{ id: { $exists: true }}")
Flux<Extrato> retrieveAllExtratosPaged(final Pageable page);
}
Command for loading from the above JSON file
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.function.LongSupplier;
import org.springframework.boot.CommandLineRunner;
import org.springframework.stereotype.Component;
import com.noblockingcase.demo.model.Extrato;
import com.noblockingcase.demo.repository.ExtratoRepository;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import reactor.core.publisher.Flux;
@Component
public class TestDataLoader implements CommandLineRunner {
private static final Logger log = LoggerFactory.getLogger(TestDataLoader.class);
private ExtratoRepository extratoRepository;
TestDataLoader(final ExtratoRepository extratoRepository) {
this.extratoRepository = extratoRepository;
}
@Override
public void run(final String... args) throws Exception {
if (extratoRepository.count().block() == 0L) {
final LongSupplier longSupplier = new LongSupplier() {
Long l = 0L;
@Override
public long getAsLong() {
return l++;
}
};
BufferedReader bufferedReader = new BufferedReader(
new InputStreamReader(getClass().getClassLoader().getResourceAsStream("carga-teste.txt")));
//*** THE ISSUE IS NEXT LINE
Flux.fromStream(bufferedReader.lines().filter(l -> !l.trim().isEmpty())
.map(l -> extratoRepository.save(new Extrato(String.valueOf(longSupplier.getAsLong()),
l.getDescription(), l.getValue(), l.getStatus()))))
.subscribe(m -> log.info("Carga Teste: {}", m.block()));
}
}
}
Here is the MongoDB config, although I don't think it is relevant:
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import com.mongodb.MongoClientOptions;
@Configuration
public class MongoDbSettings {
@Bean
public MongoClientOptions mongoOptions() {
return MongoClientOptions.builder().socketTimeout(2000).build();
}
}
If I take my original code and adjust it to read a plain text file, I can successfully read the text file instead of JSON. Obviously that doesn't fit my need, since I want to read a JSON file, but it may clarify a bit more where I am blocked.
carga-teste.txt (available at https://github.com/jimisdrpc/webflux-worth-scenarious/blob/master/demo/src/main/resources/carga-teste.txt)
crédito de R$1.000,00
débito de R$100,00
Snippet working with a simple text file:
BufferedReader bufferedReader = new BufferedReader(
new InputStreamReader(getClass().getClassLoader().getResourceAsStream("carga-teste.txt")));
Flux.fromStream(bufferedReader.lines().filter(l -> !l.trim().isEmpty())
.map(l -> extratoRepository
.save(new Extrato(String.valueOf(longSupplier.getAsLong()), "Qualquer descrição", l))))
.subscribe(m -> log.info("Carga Teste: {}", m.block()));
Whole project working successfully, reading from the text file: https://github.com/jimisdrpc/webflux-worth-scenarious/tree/master/demo
Docker Compose for booting MongoDB: https://github.com/jimisdrpc/webflux-worth-scenarious/blob/master/docker-compose.yml
To summarize, my issue is: I haven't figured out how to read a JSON file and insert its data into MongoDB during CommandLineRunner.run().

I found an example with Flux::using and Flux::fromStream to be helpful for this purpose. It reads your file into a Flux that you can then subscribe to and process with .flatMap or similar. From the Javadoc:
using(Callable<? extends D> resourceSupplier, Function<? super D, ? extends Publisher<? extends T>> sourceSupplier, Consumer<? super D> resourceCleanup)
Uses a resource, generated by a supplier for each individual Subscriber, while streaming the values from a Publisher derived from the same resource and makes sure the resource is released if the sequence terminates or the Subscriber cancels.
and the code that I put together:
private static Flux<Account> fluxAccounts() {
return Flux.using(() ->
new BufferedReader(new InputStreamReader(new ClassPathResource("data/ExportCSV.csv").getInputStream()))
.lines()
.map(s->{
String[] sa = s.split(" ");
return Account.builder()
.firstname(sa[0])
.lastname(sa[1])
.build();
}),
Flux::fromStream,
BaseStream::close
);
}
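The resulting Flux can then be fed into a reactive repository without blocking. A short usage sketch follows; accountRepository and log are assumed names (for example a ReactiveCrudRepository<Account, String> and an SLF4J logger), not part of the code above:
// Usage sketch: flatMap keeps the save calls non-blocking instead of calling block().
fluxAccounts()
    .flatMap(accountRepository::save)
    .subscribe(saved -> log.info("Loaded account: {}", saved));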

Please note that your JSON is invalid. Text data is not the same as JSON; JSON needs special handling, so it is always better to use a library.
carga-teste.json
[
{"description": "credit", "value": "R$1.000,00", "status": 11},
{"description": "debit","value": "R$2.000,00", "status": 99}
]
Credit goes to the article here: https://www.nurkiewicz.com/2017/09/streaming-large-json-file-with-jackson.html.
I've adapted it to use Flux.
@Override
public void run(final String... args) throws Exception {
BufferedReader bufferedReader = new BufferedReader(
new InputStreamReader(getClass().getClassLoader().getResourceAsStream("carga-teste.json")));
ObjectMapper mapper = new ObjectMapper();
Flux<Extrato> flux = Flux.generate(
() -> parser(bufferedReader, mapper),
this::pullOrComplete,
jsonParser -> {
try {
jsonParser.close();
} catch (IOException e) {}
});
flux.map(l -> extratoRepository.save(l)).subscribe(m -> log.info("Carga Teste: {}", m.block()));
}
private JsonParser parser(Reader reader, ObjectMapper mapper) {
JsonParser parser = null;
try {
parser = mapper.getFactory().createParser(reader);
parser.nextToken();
} catch (IOException e) {}
return parser;
}
private JsonParser pullOrComplete(JsonParser parser, SynchronousSink<Extrato> emitter) {
try {
if (parser.nextToken() != JsonToken.END_ARRAY) {
Extrato extrato = parser.readValueAs(Extrato.class);
emitter.next(extrato);
} else {
emitter.complete();
}
} catch (IOException e) {
emitter.error(e);
}
return parser;
}
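If the file is small, as a test fixture usually is, a simpler non-streaming alternative is to let Jackson read the whole array at once and wrap it in a Flux. This is only a sketch; it assumes Extrato can be deserialized by Jackson (a no-args constructor or annotations) and that it runs inside run(), which already declares throws Exception:
// Alternative sketch for small files: read the whole JSON array into memory,
// then emit the items reactively. Needs com.fasterxml.jackson.databind.ObjectMapper,
// com.fasterxml.jackson.core.type.TypeReference and java.util.List.
ObjectMapper mapper = new ObjectMapper();
List<Extrato> extratos = mapper.readValue(
        getClass().getClassLoader().getResourceAsStream("carga-teste.json"),
        new TypeReference<List<Extrato>>() {});
Flux.fromIterable(extratos)
    .flatMap(extratoRepository::save)
    .subscribe(m -> log.info("Carga Teste: {}", m));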

Related

Why is Flink CEP constantly waiting for a new entry to catch patterns?

I'm building a Flink CEP application that reads data from Kafka. When I try to match the patterns, the sink operation does not fire when there is no data after the match. For example, I expect A -> B -> C as a pattern, and A, B, C arrive from Kafka. However, for the sink operation I added to the PatternProcessFunction to work, the data coming from Kafka must be A, B, C, X. How do I fix this problem? Please help.
READ KAFKA
DataStream<String> dataStream = env.addSource(KAFKA).assignTimestampsAndWatermarks(WatermarkStrategy
.forBoundedOutOfOrderness(Duration.ofSeconds(0)));
dataStream.print("DS:"); //to see every incoming data
PATTERN
Pattern<Event, ?> pattern = Pattern.<Event>begin("start").where(
new SimpleCondition<Event>() {
@Override
public boolean filter(Event event) {
return event.actionId.equals("2.24");
}
}
).next("middle").where(
new SimpleCondition<Event>() {
@Override
public boolean filter(Event event) {
return event.actionId.equals("2.24");
}
}
).within(Time.seconds(5));
CEP and Sink
PatternStream<Event> patternStream = CEP.pattern(eventStringKeyedStream, pattern);
patternStream.process(new PatternProcessFunction<Event, Event>() {
@Override
public void processMatch(Map<String, List<Event>> map, Context context, Collector<Event> collector) throws Exception {
collector.collect(map.get("start").get(0));
}
}).print();//or sink function
My Program RESULT
DS::2> {"ActionID":"2.24"}
DS::2> {"ActionID":"2.24"}
DS::2> {"ActionID":"2.25"}
4> {ActionID='2.24'}
I was expecting
DS::2> {"ActionID":"2.24"}
DS::2> {"ActionID":"2.24"}
4> {ActionID='2.24'}
So why does it produce results only when one more record arrives after the conditions are met, rather than at the moment the pattern's conditions are met?
Please help me.
EDIT
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.functions.PatternProcessFunction;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;
import java.time.Duration;
import java.util.List;
import java.util.Map;
public class EventTimePattern {
public static void main(String[] args) throws Exception {
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<String> input = env.socketTextStream("localhost",9999)
.map(new MapFunction<String, Tuple2<String, Long>>() {
@Override
public Tuple2<String, Long> map (String value) throws Exception {
String[] fields = value.split(",");
if (fields.length == 2) {
return new Tuple2<String, Long>(
fields[0] ,
Long.parseLong(fields[1]));
}
return null;
}
})
/* env.fromElements(
Tuple2.of("A", 5L),
Tuple2.of("A", 10L)
)*/
.assignTimestampsAndWatermarks(
WatermarkStrategy
.<Tuple2<String, Long>>forBoundedOutOfOrderness(Duration.ofMillis(0))
.withTimestampAssigner((event, timestamp) -> event.f1))
.map(event -> event.f0);
Pattern<String, ?> pattern =
Pattern.<String>begin("start")
.where(
new SimpleCondition<String>() {
@Override
public boolean filter(String value) throws Exception {
return value.equals("A");
}
})
.next("end")
.where(
new SimpleCondition<String>() {
@Override
public boolean filter(String value) throws Exception {
return value.equals("A");
}
})
.within(Time.seconds(5));
input.print("I");
DataStream<String> result =
CEP.pattern(input, pattern)
.process(new PatternProcessFunction<String, String>() {
@Override
public void processMatch(
Map<String, List<String>> map,
Context context,
Collector<String> out) throws Exception {
StringBuilder builder = new StringBuilder();
builder.append(map.get("start").get(0))
.append(",")
.append(map.get("end").get(0));
out.collect(builder.toString());
}
});
result.print();
env.execute();
}
}
I failed to reproduce your problem. Here's a similar example that works fine (I used Flink 1.12.2):
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.functions.PatternProcessFunction;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;
import java.time.Duration;
import java.util.List;
import java.util.Map;
public class EventTimePattern {
public static void main(String[] args) throws Exception {
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<String> input =
env.fromElements(
Tuple2.of("A", 5L),
Tuple2.of("A", 10L)
)
.assignTimestampsAndWatermarks(
WatermarkStrategy
.<Tuple2<String, Long>>forBoundedOutOfOrderness(Duration.ofMillis(0))
.withTimestampAssigner((event, timestamp) -> event.f1))
.map(event -> event.f0);
Pattern<String, ?> pattern =
Pattern.<String>begin("start")
.where(
new SimpleCondition<String>() {
@Override
public boolean filter(String value) throws Exception {
return value.equals("A");
}
})
.next("end")
.where(
new SimpleCondition<String>() {
@Override
public boolean filter(String value) throws Exception {
return value.equals("A");
}
})
.within(Time.seconds(5));
DataStream<String> result =
CEP.pattern(input, pattern)
.process(new PatternProcessFunction<String, String>() {
@Override
public void processMatch(
Map<String, List<String>> map,
Context context,
Collector<String> out) throws Exception {
StringBuilder builder = new StringBuilder();
builder.append(map.get("start").get(0))
.append(",")
.append(map.get("end").get(0));
out.collect(builder.toString());
}
});
result.print();
env.execute();
}
}
Please share a simple, complete, reproducible example that illustrates the problem you're having.

Spring Batch FlatFileItemReader problem with & in line

I created an application using Spring Batch. The batch reads a CSV file and does some processing afterwards. Everything works fine except when a line in the file contains the character &.
For example:
"BB1222";"Myexample & blabla";"tayoo"
I don't understand why, or how to fix it, but the batch fails and cannot convert the line into my object. It throws java.lang.IndexOutOfBoundsException: start ....
I defined my reader like this:
@Bean
public FlatFileItemReader<Bank> bankReader() {
FlatFileItemReader<Bank> reader = new FlatFileItemReader<Bank>();
reader.setLinesToSkip(1);
reader.setStrict(false);
reader.setEncoding("UTF-8");
reader.setLineMapper(new DefaultLineMapper<Bank>() {
{
setLineTokenizer(new DelimitedLineTokenizer() {
{
setNames(new String[]{
...
});
setDelimiter(";");
}
});
setFieldSetMapper(new BeanWrapperFieldSetMapper<Bank>() {
{
setTargetType(Bank.class);
}
});
}
});
return reader;
}
Can you help me?
Thanks in advance!
Your issue does not seem to be related to the presence of &. Here is a passing test with your example:
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.transform.DelimitedLineTokenizer;
import org.springframework.core.io.ByteArrayResource;
class DemoApplicationTests {
@Test
void testFlatFileItemReaderWithAmpersandInInput() throws Exception {
// given
String line = "\"BB1222\";\"Myexample & blabla\";\"tayoo\"";
Charset encoding = StandardCharsets.UTF_8;
FlatFileItemReader<Bank> reader = new FlatFileItemReader<>();
reader.setStrict(false);
reader.setEncoding(encoding.name());
reader.setLineMapper(new DefaultLineMapper<Bank>() {
{
setLineTokenizer(new DelimitedLineTokenizer() {
{
setNames("code", "name", "address");
setDelimiter(";");
}
});
setFieldSetMapper(new BeanWrapperFieldSetMapper<Bank>() {
{
setTargetType(Bank.class);
}
});
}
});
reader.setResource(new ByteArrayResource(line.getBytes(encoding)));
// when
reader.open(new ExecutionContext());
Bank item1 = reader.read();
Bank item2 = reader.read();
reader.close();
// then
Assertions.assertNotNull(item1);
Assertions.assertEquals("BB1222", item1.getCode());
Assertions.assertEquals("Myexample & blabla", item1.getName());
Assertions.assertEquals("tayoo", item1.getAddress());
Assertions.assertNull(item2);
}
public static class Bank {
String code;
String name;
String address;
public String getCode() {
return code;
}
public void setCode(String code) {
this.code = code;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getAddress() {
return address;
}
public void setAddress(String address) {
this.address = address;
}
}
}
You didn't share the full stacktrace, but you probably have a line with fewer or more tokens than expected. Moreover, since you configured the reader with UTF-8, you need to make sure the input file is encoded in UTF-8.
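If some lines in the real file legitimately have a different number of tokens, one option is to make the step fault-tolerant and skip unparseable lines instead of failing the job. This is only a sketch; stepBuilderFactory, bankWriter() and the chunk size are assumed names and values, not taken from your configuration:
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.file.FlatFileParseException;
import org.springframework.context.annotation.Bean;

// Sketch: lines that cannot be tokenized/mapped are skipped (up to skipLimit)
// instead of failing the whole step.
@Bean
public Step bankStep(StepBuilderFactory stepBuilderFactory) {
    return stepBuilderFactory.get("bankStep")
            .<Bank, Bank>chunk(100)
            .reader(bankReader())
            .writer(bankWriter())
            .faultTolerant()
            .skip(FlatFileParseException.class)
            .skipLimit(10)
            .build();
}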

Run MyBatis migrations' 'up' command on startup of application

I have MyBatis set up for my account; this is done using the migrate command on the command line (in Jenkins). Now I want to integrate this with the application itself (Spring Boot). Currently I have different SQL files with @UNDO and up SQL code.
So when I start the Spring Boot application, I want to run the migrate up command without changing the SQL files that I already have. Is this possible with MyBatis and Spring?
This is about MyBatis-Migrations, right?
Spring Boot does not provide out-of-the-box support; however, it seems to be possible to write a custom DatabasePopulator.
Here is a simple implementation.
It uses Migrations' Runtime Migration feature.
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import java.io.Reader;
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.Collection;
import java.util.List;
import java.util.Properties;
import java.util.TreeSet;
import java.util.stream.Collectors;
import javax.sql.DataSource;
import org.apache.ibatis.migration.Change;
import org.apache.ibatis.migration.DataSourceConnectionProvider;
import org.apache.ibatis.migration.MigrationException;
import org.apache.ibatis.migration.MigrationLoader;
import org.apache.ibatis.migration.MigrationReader;
import org.apache.ibatis.migration.operations.UpOperation;
import org.apache.ibatis.migration.options.DatabaseOperationOption;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;
import org.springframework.core.io.support.PathMatchingResourcePatternResolver;
import org.springframework.core.io.support.ResourcePatternResolver;
import org.springframework.jdbc.datasource.init.DataSourceInitializer;
import org.springframework.jdbc.datasource.init.DatabasePopulator;
import org.springframework.jdbc.datasource.init.ScriptException;
import org.springframework.jdbc.datasource.init.UncategorizedScriptException;
@Configuration
public class MyBatisMigrationsConfig {
private static final String scriptsDir = "scripts";
private static final String changelogTable = "changelog";
@Bean
public DataSourceInitializer dataSourceInitializer(DataSource dataSource) {
Properties properties = new Properties();
properties.setProperty("changelog", changelogTable);
DatabaseOperationOption options = new DatabaseOperationOption();
options.setChangelogTable(changelogTable);
MyBatisMigrationsPopulator populator = new MyBatisMigrationsPopulator(dataSource, scriptsDir, properties, options,
new PathMatchingResourcePatternResolver());
DataSourceInitializer dataSourceInitializer = new DataSourceInitializer();
dataSourceInitializer.setDataSource(dataSource);
dataSourceInitializer.setDatabasePopulator(populator);
return dataSourceInitializer;
}
private static class MyBatisMigrationsPopulator implements DatabasePopulator {
private final DataSource dataSource;
private final String scriptsDir;
private final Properties properties;
private final DatabaseOperationOption options;
private final ResourcePatternResolver resourcePatternResolver;
public MyBatisMigrationsPopulator(DataSource dataSource, String scriptsDir,
Properties properties, DatabaseOperationOption options, ResourcePatternResolver resourcePatternResolver) {
super();
this.dataSource = dataSource;
this.scriptsDir = scriptsDir;
this.properties = properties;
this.options = options;
this.resourcePatternResolver = resourcePatternResolver;
}
public void populate(Connection connection) throws SQLException, ScriptException {
try {
new UpOperation().operate(new DataSourceConnectionProvider(dataSource),
createMigrationsLoader(), options, System.out);
} catch (MigrationException e) {
throw new UncategorizedScriptException("Migration failed.", e.getCause());
}
}
protected MigrationLoader createMigrationsLoader() {
return new SpringMigrationLoader(resourcePatternResolver, scriptsDir, "utf-8", properties);
}
}
private static class SpringMigrationLoader implements MigrationLoader {
protected static final String BOOTSTRAP_SQL = "bootstrap.sql";
protected static final String ONABORT_SQL = "onabort.sql";
private ResourcePatternResolver resourcePatternResolver;
private String path;
private String charset;
private Properties properties;
public SpringMigrationLoader(
ResourcePatternResolver resourcePatternResolver,
String path,
String charset,
Properties properties) {
this.resourcePatternResolver = resourcePatternResolver;
this.path = path;
this.charset = charset;
this.properties = properties;
}
@Override
public List<Change> getMigrations() {
Collection<String> filenames = new TreeSet<>();
for (Resource res : getResources("/*.sql")) {
filenames.add(res.getFilename());
}
filenames.remove(BOOTSTRAP_SQL);
filenames.remove(ONABORT_SQL);
return filenames.stream()
.map(this::parseChangeFromFilename)
.collect(Collectors.toList());
}
@Override
public Reader getScriptReader(Change change, boolean undo) {
try {
return getReader(change.getFilename(), undo);
} catch (IOException e) {
throw new MigrationException("Failed to read bootstrap script.", e);
}
}
@Override
public Reader getBootstrapReader() {
try {
return getReader(BOOTSTRAP_SQL, false);
} catch (FileNotFoundException e) {
// ignore
} catch (IOException e) {
throw new MigrationException("Failed to read bootstrap script.", e);
}
return null;
}
@Override
public Reader getOnAbortReader() {
try {
return getReader(ONABORT_SQL, false);
} catch (FileNotFoundException e) {
// ignore
} catch (IOException e) {
throw new MigrationException("Failed to read onabort script.", e);
}
return null;
}
protected Resource getResource(String pattern) {
return this.resourcePatternResolver.getResource(this.path + "/" + pattern);
}
protected Resource[] getResources(String pattern) {
try {
return this.resourcePatternResolver.getResources(this.path + pattern);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
protected Change parseChangeFromFilename(String filename) {
try {
String name = filename.substring(0, filename.lastIndexOf("."));
int separator = name.indexOf("_");
BigDecimal id = new BigDecimal(name.substring(0, separator));
String description = name.substring(separator + 1).replace('_', ' ');
Change change = new Change(id);
change.setFilename(filename);
change.setDescription(description);
return change;
} catch (Exception e) {
throw new MigrationException("Error parsing change from file. Cause: " + e, e);
}
}
protected Reader getReader(String fileName, boolean undo) throws IOException {
InputStream inputStream = getResource(fileName).getURL().openStream();
return new MigrationReader(inputStream, charset, undo, properties);
}
}
}
Here is an executable demo project.
You may need to modify the datasource settings in application.properties.
Hope this helps!
For Spring:
import java.io.File;
import java.net.URL;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.Properties;
import javax.sql.DataSource;
import org.apache.ibatis.migration.ConnectionProvider;
import org.apache.ibatis.migration.FileMigrationLoader;
import org.apache.ibatis.migration.operations.UpOperation;
import org.apache.ibatis.migration.options.DatabaseOperationOption;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.datasource.init.DataSourceInitializer;
import org.springframework.jdbc.datasource.init.DatabasePopulator;
import org.springframework.jdbc.datasource.init.ScriptException;
@Configuration
public class MyBatisMigrationRuntimeConfiguration {
private static final String CHANGELOG_TABLE = "changelog";
private static final String MIGRATION_SCRIPTS = "migration/scripts";
@Bean
public DataSourceInitializer dataSourceInitializer(DataSource dataSource) {
DataSourceInitializer dataSourceInitializer = new DataSourceInitializer();
dataSourceInitializer.setDataSource(dataSource);
dataSourceInitializer.setDatabasePopulator(new Populator());
return dataSourceInitializer;
}
private DatabaseOperationOption getOption() {
DatabaseOperationOption options = new DatabaseOperationOption();
options.setChangelogTable(CHANGELOG_TABLE);
return options;
}
private Properties getProperties() {
Properties properties = new Properties();
properties.setProperty("changelog", CHANGELOG_TABLE);
return properties;
}
private File getScriptDir() {
URL url = getClass().getClassLoader().getResource(MIGRATION_SCRIPTS);
if (url == null) {
throw new IllegalArgumentException("file is not found!");
} else {
return new File(url.getFile());
}
}
private class Populator implements DatabasePopulator {
@Override
public void populate(Connection connection) throws SQLException, ScriptException {
new UpOperation().operate(
new SimplyConnectionProvider(connection),
new FileMigrationLoader(getScriptDir(), "utf-8", getProperties()),
getOption(),
System.out
);
}
}
private static class SimplyConnectionProvider implements ConnectionProvider {
private final Connection connection;
public SimplyConnectionProvider(Connection connection) {
this.connection = connection;
}
public Connection getConnection() {
return connection;
}
}
}

Can Flink receive HTTP requests as a data source?

Flink can read a socket stream; can it read HTTP requests? How?
// socket example
DataStream<XXX> socketStream = env
.socketTextStream("localhost", 9999)
.map(...);
There's an open JIRA ticket for creating an HTTP sink connector for Flink, but I've seen no discussion about creating a source connector.
Moreover, it's not clear this is a good idea. Flink's approach to fault tolerance requires sources that can be rewound and replayed, so it works best with input sources that behave like message queues. I would suggest buffering the incoming http requests in a distributed log.
For an example, look at how DriveTribe uses Flink to power their website on the data Artisans blog and on YouTube.
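A rough sketch of that approach: a small HTTP endpoint outside of Flink appends each incoming request to a Kafka topic, and the Flink job reads from that topic so the input can be rewound and replayed. The topic name, group id and bootstrap servers below are assumptions:
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

// Sketch: the job consumes buffered HTTP requests from Kafka instead of
// receiving them directly; env is the StreamExecutionEnvironment.
Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");
props.setProperty("group.id", "http-ingest");
DataStream<String> requests = env.addSource(
        new FlinkKafkaConsumer<>("http-requests", new SimpleStringSchema(), props));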
I wrote a custom HTTP source; please refer to OneHourHttpTextStreamFunction below. You need to create a fat jar that includes the Apache HTTP server classes if you want to run my code.
package org.apache.flink.streaming.examples.http;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.utils.ParameterTool;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.examples.socket.SocketWindowWordCount.WordWithCount;
import org.apache.flink.util.Collector;
import org.apache.http.HttpException;
import org.apache.http.HttpRequest;
import org.apache.http.HttpResponse;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.bootstrap.HttpServer;
import org.apache.http.impl.bootstrap.ServerBootstrap;
import org.apache.http.protocol.HttpContext;
import org.apache.http.protocol.HttpRequestHandler;
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import static org.apache.flink.util.Preconditions.checkArgument;
import static org.apache.flink.util.Preconditions.checkNotNull;
public class HttpRequestCount {
public static void main(String[] args) throws Exception {
// the host and the port to connect to
final String path;
final int port;
try {
final ParameterTool params = ParameterTool.fromArgs(args);
path = params.has("path") ? params.get("path") : "*";
port = params.getInt("port");
} catch (Exception e) {
System.err.println("No port specified. Please run 'SocketWindowWordCount "
+ "--path <hostname> --port <port>', where path (* by default) "
+ "and port is the address of the text server");
System.err.println("To start a simple text server, run 'netcat -l <port>' and "
+ "type the input text into the command line");
return;
}
// get the execution environment
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// get input data by connecting to the socket
DataStream<String> text = env.addSource(new OneHourHttpTextStreamFunction(path, port));
// parse the data, group it, window it, and aggregate the counts
DataStream<WordWithCount> windowCounts = text
.flatMap(new FlatMapFunction<String, WordWithCount>() {
@Override
public void flatMap(String value, Collector<WordWithCount> out) {
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
for (String word : value.split("\\s")) {
out.collect(new WordWithCount(word, 1L));
}
}
})
.keyBy("word").timeWindow(Time.seconds(5))
.reduce(new ReduceFunction<WordWithCount>() {
@Override
public WordWithCount reduce(WordWithCount a, WordWithCount b) {
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return new WordWithCount(a.word, a.count + b.count);
}
});
// print the results with a single thread, rather than in parallel
windowCounts.print().setParallelism(1);
env.execute("Http Request Count");
}
}
class OneHourHttpTextStreamFunction implements SourceFunction<String> {
private static final long serialVersionUID = 1L;
private final String path;
private final int port;
private transient HttpServer server;
public OneHourHttpTextStreamFunction(String path, int port) {
checkArgument(port > 0 && port < 65536, "port is out of range");
this.path = checkNotNull(path, "path must not be null");
this.port = port;
}
@Override
public void run(SourceContext<String> ctx) throws Exception {
server = ServerBootstrap.bootstrap().setListenerPort(port).registerHandler(path, new HttpRequestHandler(){
@Override
public void handle(HttpRequest req, HttpResponse rep, HttpContext context) throws HttpException, IOException {
ctx.collect(req.getRequestLine().getUri());
rep.setStatusCode(200);
rep.setEntity(new StringEntity("OK"));
}
}).create();
server.start();
server.awaitTermination(1, TimeUnit.HOURS);
}
@Override
public void cancel() {
server.stop();
}
}
Leave a comment if you want the demo jar.

Intersystems Cache using XEP

I am trying to extract data from the Samples namespace that comes with the Intersystems Cache install. Specifically, I am trying to retrieve Sample.Company global data using XEP. In order to achieve this, I created a Sample.Company class like this:
package Sample;
public class Company {
public Long id;
public String mission;
public String name;
public Long revenue;
public String taxId;
public Company(Long id, String mission, String name, Long revenue,
String taxId) {
this.id = id;
this.mission = mission;
this.name = name;
this.revenue = revenue;
this.taxId = taxId;
}
public Company() {
}
}
The XEP-related code looks like this:
import java.util.ArrayList;
import java.util.List;
import org.springframework.stereotype.Service;
import Sample.Company;
import com.intersys.xep.Event;
import com.intersys.xep.EventPersister;
import com.intersys.xep.EventQuery;
import com.intersys.xep.EventQueryIterator;
import com.intersys.xep.PersisterFactory;
import com.intersys.xep.XEPException;
@Service
public class CompanyService {
public List<Company> fetch() {
EventPersister myPersister = PersisterFactory.createPersister();
myPersister.connect("SAMPLES", "user", "pwd");
try { // delete any existing SingleStringSample events, then import
// new ones
Event.isEvent("Sample.Company");
myPersister.deleteExtent("Sample.Company");
String[] generatedClasses = myPersister.importSchema("Sample.Company");
for (int i = 0; i < generatedClasses.length; i++) {
System.out.println("Event class " + generatedClasses[i]
+ " successfully imported.");
}
} catch (XEPException e) {
System.out.println("import failed:\n" + e);
throw new RuntimeException(e);
}
EventQuery<Company> myQuery = null;
List<Company> list = new ArrayList<Company>();
try {
Event newEvent = myPersister.getEvent("Sample.Company");
String sql = "Select * from Sample.Company";
myQuery = newEvent.createQuery(sql);
newEvent.close();
myQuery.execute();
EventQueryIterator<Company> iterator = myQuery.getIterator();
while (iterator.hasNext()) {
Company c = iterator.next();
System.out.println(c);
list.add(c);
}
myQuery.close();
myPersister.close();
return list;
} catch (XEPException e) {
System.out.println("createQuery failed:\n" + e);
throw new RuntimeException(e);
}
}
}
When I try executing the fetch() method of the above class, I am seeing the following exception -
com.intersys.xep.XEPException: Cannot import - extent for Sample.Company not empty.
at com.intersys.xep.internal.Generator.generate(Generator.java:52)
at com.intersys.xep.EventPersister.importSchema(EventPersister.java:954)
at com.intersys.xep.EventPersister.importSchema(EventPersister.java:363)
I got the simple string example working. Does this mean we cannot read existing data using XEP? If we can, could someone please help me resolve the above issue? Thanks in advance.
You are trying to create a new class named Sample.Company in your instance:
String[] generatedClasses = myPersister.importSchema("Sample.Company");
But you still have data and an existing class there.
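If the goal is only to read the existing Sample.Company data, a minimal adjustment, using just the calls already shown above, is to skip the delete/import step unless you really want to regenerate the schema. Whether this is sufficient depends on whether the proxy class has already been generated once; treat it as a sketch:
// Sketch: only regenerate the schema when explicitly requested; otherwise keep
// the existing Sample.Company data and go straight to getEvent()/createQuery().
boolean regenerateSchema = false; // assumption: existing data should be kept
if (regenerateSchema) {
    myPersister.deleteExtent("Sample.Company");
    myPersister.importSchema("Sample.Company");
}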