Why do I get a socket timeout the first time I hit the database after a recompile? - mongodb

I am using Play Framework 2.0, and after every recompile I get a socket timeout the first time my app tries to hit the database. I am using the Mongo Java driver directly. Here is a typical stack trace:
play.core.ActionInvoker$$anonfun$receive$1$$anon$1: Execution exception [[Network: can't call something : ds031907.mongolab.com/107.21.99.26:31907/heroku_app4620908]]
at play.core.ActionInvoker$$anonfun$receive$1.apply(Invoker.scala:82) [play_2.9.1.jar:2.0]
at play.core.ActionInvoker$$anonfun$receive$1.apply(Invoker.scala:63) [play_2.9.1.jar:2.0]
at akka.actor.Actor$class.apply(Actor.scala:290) [akka-actor.jar:2.0]
at play.core.ActionInvoker.apply(Invoker.scala:61) [play_2.9.1.jar:2.0]
at akka.actor.ActorCell.invoke(ActorCell.scala:617) [akka-actor.jar:2.0]
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:179) [akka-actor.jar:2.0]
Caused by: com.mongodb.MongoException$Network: can't call something : ds031907.mongolab.com/107.21.99.26:31907/heroku_app4620908
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:227) ~[mongo-java-driver-2.7.3.jar:na]
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:305) ~[mongo-java-driver-2.7.3.jar:na]
at com.mongodb.DBCollection.findOne(DBCollection.java:647) ~[mongo-java-driver-2.7.3.jar:na]
at com.mongodb.DBCollection.findOne(DBCollection.java:626) ~[mongo-java-driver-2.7.3.jar:na]
at models.daos.ModuleDAO.findPublishedModuleById(ModuleDAO.java:445) ~[classes/:na]
at controllers.LearnController.viewModule(LearnController.java:31) ~[classes/:2.0]
Caused by: java.net.SocketException: Operation timed out
at java.net.SocketInputStream.socketRead0(Native Method) ~[na:1.6.0_31]
at java.net.SocketInputStream.read(SocketInputStream.java:129) ~[na:1.6.0_31]
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) ~[na:1.6.0_31]
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258) ~[na:1.6.0_31]
at java.io.BufferedInputStream.read(BufferedInputStream.java:317) ~[na:1.6.0_31]
at org.bson.io.Bits.readFully(Bits.java:35) ~[mongo-java-driver-2.7.3.jar:na]
And here is my initialization code:
public static DB getDB(){
    ensureMongo();
    DB db = mongo.getDB(MOJULO_DB);
    if(!db.isAuthenticated()){
        db.authenticate(MONGO_USERNAME, MONGO_PASSWORD);
        if(db.isAuthenticated())
            System.out.println("authentication success on db:" + db.getName());
        else
            System.out.println("db authentication failure");
    }
    return db;
}

private static synchronized void ensureMongo(){
    if(mongo == null){
        try{
            MongoURI mongoURI = new MongoURI(MONGO_URI);
            mongo = new Mongo(mongoURI);
            DB db = mongo.getDB(MOJULO_DB);
            db.command("ping");
        }catch(UnknownHostException ex){
            mongo = null;
            System.out.println("failed to connect to mongo");
            ex.printStackTrace();
        }
    }
}

public static void disconnect(){
    System.out.println("disconnecting from mongo");
    if(mongo != null){
        mongo.close();
        mongo = null;
    }
}
I use the getDB method from outside the class to get the db. The method is meant to create the mongo singleton if it does not exist. I always get the authentication success println, but then on the first hit to the database, I get the socket timeout exception.
In my Global class, I close the connection to the database when the application is closed.
@Override
public void onStop(Application app) {
    System.out.println("stop");
    Logger.info("Application shutdown...");
    DBManager.disconnect();
}
Any ideas?

I am not an expert on MongoDB, but I can see the similarity with other database connectivity issues.
How is your connection configured?
It looks (to me) like it may be attempting to load all mappings, database and table definitions, and everything else the first time you actually use the connection (the find call).
It may be better to run a simple DB query in your ensureMongo() method to let the system re-initialise everything it needs (you may have to set a longer timeout for this).
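For illustration, here is a minimal sketch of what that could look like with the 2.x Java driver, reusing the MONGO_URI and MOJULO_DB constants from the question. The specific timeout values, and whether mutating MongoOptions after construction affects already-open sockets, are assumptions rather than a confirmed fix:
private static synchronized void ensureMongo() {
    if (mongo == null) {
        try {
            MongoURI mongoURI = new MongoURI(MONGO_URI);
            mongo = new Mongo(mongoURI);

            // MongoOptions fields are public in the 2.x driver; raising the
            // timeouts gives the first call after a recompile more time to
            // re-establish a connection (values here are illustrative).
            mongo.getMongoOptions().connectTimeout = 30000; // ms
            mongo.getMongoOptions().socketTimeout = 60000;  // ms
            mongo.getMongoOptions().autoConnectRetry = true;

            // Warm the connection up with a cheap round trip before real traffic.
            DB db = mongo.getDB(MOJULO_DB);
            db.command("ping");
        } catch (UnknownHostException ex) {
            mongo = null;
            System.out.println("failed to connect to mongo");
            ex.printStackTrace();
        }
    }
}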

Related

How to handle failures when publishing to pubsub using pubsub write in apache beam

I'm developing an Apache Beam pipeline to publish unbounded data to a Pub/Sub topic. Publishing is done using the Pub/Sub IO connector PubsubIO.writeMessages().
If the Pub/Sub connection fails while the pipeline is processing, I need to capture the connection failure and identify the data that was being processed at the time. But I couldn't find a straightforward failure-handling mechanism in the Apache Beam Pub/Sub write.
When I test this with a bad Pub/Sub connection, the pipeline keeps trying to connect, throwing the following exception for a while, and if the connection remains unsuccessful the pipeline execution fails.
com.google.api.gax.rpc.UnavailableException: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:69)
at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)
at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60)
at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:68)
at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1041)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1215)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:983)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:771)
at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:563)
at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:533)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:553)
at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:68)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:739)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:718)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: io exception
at io.grpc.Status.asRuntimeException(Status.java:535)
... 10 more
Caused by: io.grpc.netty.shaded.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: no further information: /127.0.0.1:58843
Caused by: java.net.ConnectException: Connection refused: no further information
at java.base/sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at java.base/sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:779)
at io.grpc.netty.shaded.io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:330)
at io.grpc.netty.shaded.io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:334)
at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:702)
at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650)
at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576)
at io.grpc.netty.shaded.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)
at io.grpc.netty.shaded.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
at io.grpc.netty.shaded.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.grpc.netty.shaded.io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:834)
I tried to catch this exception from the Pub/Sub write transform, but that does not work either.
So my question is: is there any way to capture the above exception and keep the pipeline running until the connection is successful? My Pub/Sub write code snippet is as follows:
public class PubSubWrite extends PTransform<PCollection<String>, PDone> {

    private final String outputTopic;

    public PubSubWrite(String outputTopic) {
        this.outputTopic = outputTopic;
    }

    @Override
    public PDone expand(PCollection<String> input) {
        return input
            .apply(
                "convertMessagesToPubsubMessages",
                MapElements.into(TypeDescriptor.of(PubsubMessage.class))
                    .via(
                        (String json) ->
                            new PubsubMessage(json.getBytes(Charsets.UTF_8), ImmutableMap.of("SOURCE", "TEST"))))
            .apply(
                "writePubsubMessagesToPubSub", PubsubIO.writeMessages().to(outputTopic));
    }
}
There is no native API for error handling in the PubsubIO transforms, as you can see in the documentation.
I recommend opening a feature request on the issue tracker asking for an error-handling implementation in the Java library's PubsubIO connector.
Meanwhile, you could return an empty error collection, or implement the transform to catch the exception yourself.
Example for the empty error collection (a usage sketch follows the code):
// Note: the enclosing PTransform needs to be declared as
// PTransform<PCollection<String>, WithFailures.Result<PDone, PubsubMessage>>
// for this signature to compile.
@Override
public WithFailures.Result<PDone, PubsubMessage> expand(PCollection<String> input) {
    PDone done = input
        .apply(
            "convertMessagesToPubsubMessages",
            MapElements.into(TypeDescriptor.of(PubsubMessage.class))
                .via(
                    (String json) ->
                        new PubsubMessage(json.getBytes(Charsets.UTF_8), ImmutableMap.of("SOURCE", "TEST"))))
        .apply(
            "writePubsubMessagesToPubSub", PubsubIO.writeMessages().to(outputTopic));
    return WithFailures.Result.of(done, EmptyErrors.in(input.getPipeline()));
}
private static class EmptyErrors extends PTransform<PBegin, PCollection<PubsubMessage>> {

    /** Creates an empty error collection in the given pipeline. */
    public static PCollection<PubsubMessage> in(Pipeline pipeline) {
        return pipeline.apply(new EmptyErrors());
    }

    @Override
    public PCollection<PubsubMessage> expand(PBegin input) {
        return input.apply(Create.empty(PubsubMessageWithAttributesCoder.of()));
    }
}
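For completeness, a hypothetical usage sketch of the transform above. The `messages` collection, the redeclared transform type, and the dead-letter handling below are assumptions for illustration, not part of the original answer:
// Hypothetical wiring, assuming PubSubWrite is redeclared as
// PTransform<PCollection<String>, WithFailures.Result<PDone, PubsubMessage>>.
WithFailures.Result<PDone, PubsubMessage> result =
    messages.apply("writeToPubSub", new PubSubWrite(outputTopic));

// The failure collection is empty in the sketch above; once failed publishes
// are actually routed into it, they can be inspected or sent to a dead letter.
result.failures()
    .apply("logFailedPublishes", ParDo.of(new DoFn<PubsubMessage, Void>() {
        @ProcessElement
        public void processElement(@Element PubsubMessage failed) {
            // e.g. log the attributes or write the payload to a dead-letter topic
        }
    }));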
Usually such failures are retried by the runner. For example, the Dataflow runner will retry failures indefinitely for streaming jobs. Note that this is in addition to any local (VM-level) retries for errors that produce retriable HTTP error codes (for example 5xx). So the pipeline should continue once you fix the underlying issue, but note that your backlog might grow significantly if the pipeline is unable to process data for some time, so you might see a delay.

Big number of values IN Query with ItemReader

The following ItemReader gets a list of thousands of accounts (acc).
The database that the ItemReader connects to in order to retrieve the data is Hive. I don't have permission to create any table; I only have read access.
@Bean
@StepScope
public ItemReader<OmsDto> omsItemReader(@Value("#{stepExecutionContext[acc]}") List<String> accountList) {
    String inParams = String.join(",",
            accountList.stream().map(id -> "'" + id + "'").collect(Collectors.toList()));
    String query = String.format("SELECT ..... account IN (%s)", inParams);

    BeanPropertyRowMapper<OmsDto> rowMapper = new BeanPropertyRowMapper<>(OmsDto.class);
    rowMapper.setPrimitivesDefaultedForNullValue(true);

    JdbcCursorItemReader<OmsDto> reader = new JdbcCursorItemReader<OmsDto>();
    reader.setVerifyCursorPosition(false);
    reader.setDataSource(hiveDataSource());
    reader.setRowMapper(rowMapper);
    reader.setSql(query);
    reader.open(new ExecutionContext());
    return reader;
}
This is the error message that I get when using ItemReader:
Caused by: org.springframework.batch.item.ItemStreamException: Failed to initialize the reader
at org.springframework.batch.item.support.AbstractItemCountingItemStreamItemReader.open(AbstractItemCountingItemStreamItemReader.java:153) ~[spring-batch-infrastructure-4.2.4.RELEASE.jar:4.2.4.RELEASE]
Caused by: java.sql.SQLException: Error executing query
at com.facebook.presto.jdbc.PrestoStatement.internalExecute(PrestoStatement.java:279) ~[presto-jdbc-0.243.2.jar:0.243.2-128118e]
at com.facebook.presto.jdbc.PrestoStatement.execute(PrestoStatement.java:228) ~[presto-jdbc-0.243.2.jar:0.243.2-128118e]
at com.facebook.presto.jdbc.PrestoPreparedStatement.<init>(PrestoPreparedStatement.java:84) ~[presto-jdbc-0.243.2.jar:0.243.2-128118e]
at com.facebook.presto.jdbc.PrestoConnection.prepareStatement(PrestoConnection.java:130) ~[presto-jdbc-0.243.2.jar:0.243.2-128118e]
at com.facebook.presto.jdbc.PrestoConnection.prepareStatement(PrestoConnection.java:300) ~[presto-jdbc-0.243.2.jar:0.243.2-128118e]
at org.springframework.batch.item.database.JdbcCursorItemReader.openCursor(JdbcCursorItemReader.java:121) ~[spring-batch-infrastructure-4.2.4.RELEASE.jar:4.2.4.RELEASE]
... 63 common frames omitted
Caused by: java.lang.RuntimeException: Error fetching next at https://prestoanalytics-ch2-p.sys.comcast.net:6443/v1/statement/executing/20201118_131314_11079_v3w47/yf55745951e0beccc234c98f36005723457073854/0 returned an invalid response: JsonResponse{statusCode=502, statusMessage=Bad Gateway, headers={cache-control=[no-cache], content-length=[107], content-type=[text/html]}, hasValue=false} [Error: <html><body><h1>502 Bad Gateway</h1>
The server returned an invalid or incomplete response.
</body></html>
]
I was sure that the root cause was the driver, but I have tested the driver with the same SQL, this time using DriverManager, and it runs perfectly.
@Component
public class OmsItemReader implements ItemReader<OmsDto>, StepExecutionListener {

    private ItemReader<OmsDto> delegate;

    public SikOmsItemReader() {
        Properties properties = new Properties();
        properties.setProperty("user", "....");
        properties.setProperty("password", "...");
        properties.setProperty("SSL", "true");
        Connection connection = null;
        try {
            connection = DriverManager.getConnection("jdbc:presto://.....", properties);
            Statement statement = connection.createStatement();
            ResultSet resultSet = statement.executeQuery(
I am not sure what the difference is. Is it the driver or Spring Batch?
I am looking for a workaround: how can I retrieve thousands of accounts via IN clauses with Spring Batch?
Thank you
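For illustration, here is a sketch of one possible workaround, assuming the 502 from the Presto gateway is caused by the size of the single generated statement: split the account list into chunks and drain one JdbcCursorItemReader per chunk. hiveDataSource(), OmsDto and the elided SELECT are taken from the question; the chunk size, the readerForChunk helper, and the delegating reader are hypothetical and untested:
@Bean
@StepScope
public ItemReader<OmsDto> omsItemReader(@Value("#{stepExecutionContext[acc]}") List<String> accountList) {
    final int chunkSize = 500; // illustrative; tune to what the gateway tolerates

    return new ItemReader<OmsDto>() {
        private int nextChunkStart = 0;
        private JdbcCursorItemReader<OmsDto> current;

        @Override
        public OmsDto read() throws Exception {
            while (true) {
                if (current == null) {
                    if (nextChunkStart >= accountList.size()) {
                        return null; // all chunks exhausted
                    }
                    int end = Math.min(nextChunkStart + chunkSize, accountList.size());
                    current = readerForChunk(accountList.subList(nextChunkStart, end));
                    nextChunkStart = end;
                }
                OmsDto item = current.read();
                if (item != null) {
                    return item;
                }
                current.close(); // chunk finished, move on to the next one
                current = null;
            }
        }
    };
}

// Builds and opens a cursor reader for a single chunk of accounts.
private JdbcCursorItemReader<OmsDto> readerForChunk(List<String> chunk) {
    String inParams = chunk.stream().map(id -> "'" + id + "'").collect(Collectors.joining(","));
    BeanPropertyRowMapper<OmsDto> rowMapper = new BeanPropertyRowMapper<>(OmsDto.class);
    rowMapper.setPrimitivesDefaultedForNullValue(true);
    JdbcCursorItemReader<OmsDto> reader = new JdbcCursorItemReader<>();
    reader.setVerifyCursorPosition(false);
    reader.setDataSource(hiveDataSource());
    reader.setRowMapper(rowMapper);
    reader.setSql(String.format("SELECT ..... account IN (%s)", inParams));
    reader.open(new ExecutionContext());
    return reader;
}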

SCDF server dual database connection error

In my Spring Batch task I have two data sources configured: one for Oracle and another for H2.
I'm using H2 for the batch and task execution tables, and Oracle for the real data used in batch processing. I'm able to run the task successfully from the IDE, but when I run it from the SCDF server I get the following error:
Caused by: java.sql.SQLException: Unable to start the Universal Connection Pool: oracle.ucp.UniversalConnectionPoolException: Cannot get Connection from Datasource: org.h2.jdbc.JdbcSQLInvalidAuthorizationSpecException: Wrong user name or password [28000-200]
The question is: why is it connecting to H2 for the Oracle DB connection as well?
The following is my DB configuration:
@Bean(name = "OracleUniversalConnectionPool")
public DataSource secondaryDataSource() {
    PoolDataSource pds = null;
    try {
        pds = PoolDataSourceFactory.getPoolDataSource();
        pds.setConnectionFactoryClassName(driverClassName);
        pds.setURL(url);
        pds.setUser(username);
        pds.setPassword(password);
        pds.setMinPoolSize(Integer.valueOf(minPoolSize));
        pds.setInitialPoolSize(10);
        pds.setMaxPoolSize(Integer.valueOf(maxPoolSize));
    } catch (SQLException ea) {
        log.error("Error connecting to the database: ", ea.getMessage());
    }
    return pds;
}

@Bean(name = "batchDataSource")
@Primary
public DataSource dataSource() throws SQLException {
    final SimpleDriverDataSource dataSource = new SimpleDriverDataSource();
    dataSource.setDriver(new org.h2.Driver());
    dataSource.setUrl("jdbc:h2:tcp://localhost:19092/mem:dataflow");
    dataSource.setUsername("sa");
    dataSource.setPassword("");
    return dataSource;
}
I got it resolved. The problem was that I was using the following for setting the Oracle DB connection values:
spring.datasource.url: ******
spring.datasource.username: ******
It worked fine from the IDE, but when I ran it on the SCDF server these were overridden by the default properties used for the SCDF server's own database connection. So I just updated the connection properties to:
spring.datasource.oracle.url: ******
spring.datasource.oracle.username: ******
And now it's working as expected.
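For illustration, one way the renamed keys could be wired into the Oracle pool. This is a sketch: the @Value bindings and the connection factory class name are assumptions, while the UCP calls mirror the configuration shown in the question:
import java.sql.SQLException;
import javax.sql.DataSource;

import oracle.ucp.jdbc.PoolDataSource;
import oracle.ucp.jdbc.PoolDataSourceFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class OracleDataSourceConfig {

    // These keys are no longer shadowed by the SCDF server's own
    // spring.datasource.* settings.
    @Value("${spring.datasource.oracle.url}")
    private String url;

    @Value("${spring.datasource.oracle.username}")
    private String username;

    @Value("${spring.datasource.oracle.password}")
    private String password;

    @Bean(name = "OracleUniversalConnectionPool")
    public DataSource oracleDataSource() throws SQLException {
        PoolDataSource pds = PoolDataSourceFactory.getPoolDataSource();
        // The connection factory class name here is illustrative.
        pds.setConnectionFactoryClassName("oracle.jdbc.pool.OracleDataSource");
        pds.setURL(url);
        pds.setUser(username);
        pds.setPassword(password);
        return pds;
    }
}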

Error while connecting to druid using SQL interface

I am trying to connect to a Druid database using the Avatica JAR.
The following is the code:
String url = "jdbc:avatica:remote:url=http://localhost:8082/druid/v2/sql/avatica";
Properties connectionProperties = new Properties();
try (Connection connection = DriverManager.getConnection(url, connectionProperties)) {
    try (
        final Statement statement = connection.createStatement();
        final ResultSet resultSet = statement.executeQuery("SELECT COUNT(*) as rowcount FROM wikiticker")) {
        while (resultSet.next()) {
            int count = resultSet.getInt("rowcount");
            System.out.println("Total records:" + count);
        }
        resultSet.close();
    }
} catch (SQLException e) {
    e.printStackTrace();
}
I get the following exception. Can someone please let me know what's going wrong? I have set the runtime property to enable SQL.
Exception in thread "main" java.lang.RuntimeException: Failed to execute HTTP Request, got HTTP/404
at org.apache.calcite.avatica.remote.AvaticaCommonsHttpClientImpl.send(AvaticaCommonsHttpClientImpl.java:138)
at org.apache.calcite.avatica.remote.RemoteService.apply(RemoteService.java:34)
at org.apache.calcite.avatica.remote.JsonService.apply(JsonService.java:172)
at org.apache.calcite.avatica.remote.Driver.connect(Driver.java:175)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at com.test.druid.sql.Main.main(Main.java:17)
It looks like your broker instance is missing the -Ddruid.sql.enabled=true flag on startup. You can refer to http://druid.io/docs/latest/querying/sql.html for further details.

When and why does Curator throw ConnectionLossException?

I use Curator 1.2.4 and I keep getting ConnectionLossException when I try to monitor one znode for changes to its children.
I implemented a watcher like this:
public class CuratorChildWatcherImpl implements CuratorWatcher {

    private CuratorFramework client;

    public CuratorChildWatcherImpl(CuratorFramework client) {
        this.client = client;
    }

    @Override
    public void process(WatchedEvent event) throws Exception {
        List<String> children = client.getChildren().usingWatcher(this).forPath(event.getPath());
        // Do other stuff with the children znodes.
    }
}
The code throws ConnectionLossException every 11 seconds if connectionTimeout is set to 10 seconds. It seems the exception appears after connectionTimeout plus 1 second. Why?
I checked the source code and found that GetChildrenBuilderImpl calls CuratorZookeeperClient's blockUntilConnectedOrTimeout method, which checks the connection state every 1 second.
2013-04-17 17:22:08 [ERROR]-[com.netflix.curator.ConnectionState.getZooKeeper(ConnectionState.java:97)] Connection timed out for connection string (...) and timeout (10000) / elapsed (10317913)
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at com.netflix.curator.ConnectionState.getZooKeeper(ConnectionState.java:94)
at com.netflix.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:107)
at com.netflix.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:413)
at com.netflix.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:213)
at com.netflix.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:202)
at com.netflix.curator.RetryLoop.callWithRetry(RetryLoop.java:106)
at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:198)
at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:190)
at com.netflix.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:37)
at com.netflix.curator.framework.imps.NamespaceWatcher.process(NamespaceWatcher.java:56)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521)
This was a known bug in the Curator/ZooKeeper interaction, tracked as CURATOR-24 ("The current method of managing hung ZK handles needs improvement"). It was fixed in version 2.0.1-incubating.
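If upgrading is an option, a minimal sketch of creating a client on Apache Curator 2.x follows; note that the package moved from com.netflix.curator to org.apache.curator. The connect string, timeout values and retry settings below are placeholders:
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class CuratorClientFactory {

    public static CuratorFramework newClient() {
        // Placeholder connect string and timeouts; retry up to 3 times with
        // exponential backoff starting at 1 second.
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk-host:2181",
                60000,  // session timeout in ms
                15000,  // connection timeout in ms
                new ExponentialBackoffRetry(1000, 3));
        client.start();
        return client;
    }
}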