Too many empty chk-* directories with Flink checkpointing using RocksDB as state backend - apache-kafka

Too many empty chk-* directories exist in the location where I have set up RocksDB as the state backend.
I am using FlinkKafkaConsumer to get data from a Kafka topic, and I am using RocksDB as the state backend. I am just printing the messages received from Kafka.
The following are the properties I use to set up the state backend:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(100);                                  // checkpoint interval, in milliseconds
env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
env.getCheckpointConfig().setMinPauseBetweenCheckpoints(50);   // minimum pause between checkpoints, in milliseconds
env.getCheckpointConfig().setCheckpointTimeout(60);            // checkpoint timeout, in milliseconds
env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);
env.getCheckpointConfig().enableExternalizedCheckpoints(ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
StateBackend rdb = new RocksDBStateBackend("file:///Users/user/Documents/telemetry/flinkbackends10", true);
env.setStateBackend(rdb);
env.execute("Flink kafka");
In flink-conf.yaml I have also set this property:
state.checkpoints.num-retained: 3
I am using a simple single-node Flink cluster (started with ./start-cluster.sh). I started the job, kept it running for 1 hour, and I see too many chk-* directories created under the /Users/user/Documents/telemetry/flinkbackends10 location:
chk-10 chk-12667 chk-18263 chk-20998 chk-25790 chk-26348 chk-26408 chk-3 chk-3333 chk-38650 chk-4588 chk-8 chk-96
chk-10397 chk-13 chk-18472 chk-21754 chk-25861 chk-26351 chk-26409 chk-30592 chk-34872 chk-39405 chk-5 chk-8127 chk-97
chk-10649 chk-13172 chk-18479 chk-22259 chk-26216 chk-26357 chk-26411 chk-31097 chk-35123 chk-39656 chk-5093 chk-8379 chk-98
chk-1087 chk-14183 chk-18548 chk-22512 chk-26307 chk-26360 chk-27055 chk-31601 chk-35627 chk-4 chk-5348 chk-8883 chk-9892
chk-10902 chk-15444 chk-18576 chk-22764 chk-26315 chk-26377 chk-28064 chk-31853 chk-36382 chk-40412 chk-5687 chk-9 chk-99
chk-11153 chk-15696 chk-18978 chk-23016 chk-26317 chk-26380 chk-28491 chk-32356 chk-36885 chk-41168 chk-6 chk-9135 shared
chk-11658 chk-16201 chk-19736 chk-23521 chk-26320 chk-26396 chk-28571 chk-32607 chk-37389 chk-41666 chk-6611 chk-9388 taskowned
chk-11910 chk-17210 chk-2 chk-24277 chk-26325 chk-26405 chk-29076 chk-32859 chk-37642 chk-41667 chk-7 chk-94
chk-12162 chk-17462 chk-20746 chk-25538 chk-26337 chk-26407 chk-29581 chk-33111 chk-38398 chk-41668 chk-7116 chk-95
Out of these, only chk-41668, chk-41667, and chk-41666 have data.
The rest of the directories are empty.
Is this expected behavior? How can I delete those empty directories? Is there some configuration for deleting empty directories?

Answering my own question here:
In the Flink UI I was seeing a 'Checkpoint expired before completing' error in the checkpointing section, and I found out that to resolve the error the checkpoint timeout needs to be increased.
I increased the timeout from 60 to 500 (milliseconds), and Flink started deleting the empty chk-* directories.
env.getCheckpointConfig().setCheckpointTimeout(500);
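For reference, a minimal sketch of the adjusted configuration (values are in milliseconds and purely illustrative; production jobs typically use a much longer interval and timeout, e.g. checkpointing every few seconds with Flink's default 10-minute timeout):
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(100);                        // trigger a checkpoint every 100 ms
env.getCheckpointConfig().setCheckpointTimeout(500); // give each checkpoint 500 ms to complete
// A checkpoint that does not finish within the timeout is expired; with the timeout set
// below the time a checkpoint actually needs, the expired attempts kept leaving empty
// chk-* directories behind, which is what was observed above.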

Related

Kafka Connect 5.5.0 - Unable to reset max.request.size

In confluent-5.5.0, I am unable to change max.request.size, which always defaults to max.request.size = 1048576 in the ProducerConfig.
The following are the parameters I have already tried, with no luck:
confluent-5.5.0/etc/kafka/producer.properties
max.request.size=15728640
producer.max.request.size=15728640
confluent-5.5.0/etc/kafka/server.properties
message.max.bytes=15728640
replica.fetch.max.bytes=15728640
max.request.size=15728640
fetch.message.max.bytes=15728640
/data/confluent-5.5.0/etc/kafka/consumer.properties
max.partition.fetch.bytes=15728640
confluent-5.5.0/etc/kafka-rest/kafka-rest.properties
max.request.size=15728640
NOTE: none of these values is getting picked up in connect.log.
I have stopped/started confluent-5.5.0, and even destroyed the previous images and restarted.
Am I missing something?
Following the information from a comment, I have also tried the following:
/data/confluent-5.5.0/etc/kafka/connect-standalone.properties
producer.override.max.request.size=15728640
consumer.override.max.partition.fetch.bytes=15728640
/data/confluent-5.5.0/etc/kafka/connect-distributed.properties
producer.override.max.request.size=15728640
consumer.override.max.partition.fetch.bytes=15728640
Still, max.request.size has not changed.
(Solved) Based on the inputs:
I added the above configuration in the connector configuration, and also changed the override policy from None to All, which applied the configuration changes properly.
Those files are not used by Connect.
server.properties is for the Apache Kafka broker only.
consumer.properties and producer.properties are for the kafka-console-* utilities.
kafka-rest.properties is for the Confluent REST Proxy only.
You need to use connect-distributed.properties or connect-standalone.properties, and note that you additionally need to set the property correctly using the producer. / consumer. prefixes.
The solution is to set the configuration in the Kafka Connect properties file:
add the following to the distributed or standalone Connect properties file
producer.max.request.size=157286400
consumer.max.request.size=157286400
max.request.size=157286400
and it will work !!!
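Putting the answers together, a minimal sketch of the worker-level and connector-level settings (using the 15728640 value from the question; the paths and connector layout are illustrative):
# connect-distributed.properties (or connect-standalone.properties) - worker config
producer.max.request.size=15728640
# allow per-connector overrides such as producer.override.* (the default policy is None)
connector.client.config.override.policy=All
# per-connector configuration - only honored when the override policy permits it
producer.override.max.request.size=15728640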

Starting Druid to consume data from kafka

I took the latest version of Druid, 0.16.0-incubating.
I have 2 questions.
1) As mentioned in the quick start, micro-quickstart does not work; it complains that there are no jvm.config and main.config files under /conf/druid/single-server/micro-quickstart/coordinator-overlord.
2) As micro-quickstart failed, I started trying single-server-small.
I was trying to import data from Kafka in single-server-small but was unable to do so, as it says the extension is not loaded, even though I can see it getting loaded in the logs.
But I think my main problem is that whenever I land on the 'Load data' section of the Druid web page on localhost:8888, it keeps giving me the error below:
"Failed to get overlord modules : Unable to determine destination for [/proxy/overlord/status]; is your coordinator/overlord running ?"
I can see my coordinator-overlord process is up.
Any suggestions ?
Thanks
Delete your Kafka logs and Druid logs (the locations set up in common.runtime.properties) and then try again.

mosquitto.db file does not get created

In the process of testing mosquitto persistence, I removed mosquitto.db from the persistence location to enable a fresh start. But, to my chagrin, the file does not get created even after I restart the broker.
Did I get it wrong that the broker creates the .db file as per the config? Any pointers on how to get a fresh mosquitto.db file would be appreciated.
# Place your local configuration in /etc/mosquitto/conf.d/
#
# A full description of the configuration file is at
# /usr/share/doc/mosquitto/examples/mosquitto.conf.example
pid_file /var/run/mosquitto.pid
max_inflight_messages 1
persistence true
persistence_file mosquitto.db
persistence_location /var/lib/mosquitto/
log_dest file /var/log/mosquitto/mosquitto.log
include_dir /etc/mosquitto/conf.d
password_file /etc/mosquitto/passwd
allow_anonymous false
max_queued_messages 1000000
autosave_interval 30
# autosave_on_changes false
If you delete the file while the broker is running, it is likely not to get recreated because the broker already holds an open file handle.
Deleting a file while it is open by a process does not actually remove the file, just its entry in the directory; the process will continue to read/write the file until the handle is closed.
If you restart mosquitto after deleting the file, it won't write the file until it actually has some data to write to it, e.g.:
have a subscribed client (at QOS 1 or 2)
send some messages
disconnect the subscriber
send more messages
shutdown mosquitto
The file should now be written containing the messages that were published while the client was disconnected.
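A quick way to walk through those steps from the command line might look like this (topic, client id, and credentials are made up to match the password_file setup above; -c keeps a persistent, non-clean session so messages are queued for the subscriber while it is offline):
# subscribe at QoS 1 with a persistent session, then Ctrl-C to disconnect
mosquitto_sub -h localhost -t test/persist -q 1 -i sub1 -c -u user -P secret
# publish some QoS 1 messages while the subscriber is offline
mosquitto_pub -h localhost -t test/persist -q 1 -m "queued while offline" -u user -P secret
# stop the broker; mosquitto.db should now be written with the queued messages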

Handling connection failures in apache-camel

I am writing an apache-camel RabbitMQ consumer. I would like to react somehow to connection problems (i.e. try to reconnect). Is it possible to configure apache-camel to automatically reconnect?
If not, how can I find out that a connection to the queue was interrupted? I've done the following test:
start the queue (and some producer)
start my consumer (it was getting messages as expected)
stop the queue (the messages stopped arriving, as expected, but no exception was thrown)
start the queue (no new messages were received)
I am using Camel in Scala (via akka-camel), but a Java solution would probably also be OK.
You can pass the flag automaticRecoveryEnabled=true in the endpoint URI, and Camel will reconnect if the connection is lost.
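A minimal sketch of what that could look like as a Java route (host, exchange, and queue names are placeholders, and the exact URI format depends on the camel-rabbitmq version in use):
import org.apache.camel.builder.RouteBuilder;

public class RabbitRecoveryRoute extends RouteBuilder {
    @Override
    public void configure() {
        // automaticRecoveryEnabled=true asks the underlying RabbitMQ Java client
        // to re-establish the connection automatically after a failure
        from("rabbitmq://localhost:5672/my.exchange?queue=my.queue&automaticRecoveryEnabled=true")
            .log("Received: ${body}");
    }
}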
For automatic RabbitMQ resource recovery (Connections/Channels/Consumers/Queues/Exchanges/Bindings) when failures occur, check out Lyra (which I authored). Example usage:
Config config = new Config()
.withRecoveryPolicy(new RecoveryPolicy()
.withMaxAttempts(20)
.withInterval(Duration.seconds(1))
.withMaxDuration(Duration.minutes(5)));
ConnectionOptions options = new ConnectionOptions().withHost("localhost");
Connection connection = Connections.create(options, config);
The rest of the API is just the amqp-client API, except your resources are automatically recovered when failures occur.
I'm not sure about camel-rabbitmq specifically, but hopefully there's a way you can swap in your own resource creation via Lyra.
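For example, a consumer created from that Lyra-managed connection is written with the plain amqp-client classes (the queue name and handler below are made up):
// Channel, DefaultConsumer, Envelope and AMQP come from the com.rabbitmq.client package
Channel channel = connection.createChannel();
channel.queueDeclare("my.queue", true, false, false, null);
channel.basicConsume("my.queue", true, new DefaultConsumer(channel) {
    @Override
    public void handleDelivery(String consumerTag, Envelope envelope,
                               AMQP.BasicProperties properties, byte[] body) {
        System.out.println("Received: " + new String(body));
    }
});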
The current camel-rabbitmq just creates a connection and channel when the consumer or producer is started, so it doesn't have a chance to catch the connection exception :(.

Embedded ActiveMQ: jdbcPersistenceAdapter using kahaDB?

I've the following Spring config of the ActiveMQ Broker:
<broker:broker id="activemqbroker" useJmx="false" persistent="true" brokerName="activemqbroker">
    <broker:transportConnectors>
        <broker:transportConnector name="vm" uri="vm://activemqbroker"/>
    </broker:transportConnectors>
    <broker:persistenceAdapter>
        <broker:jdbcPersistenceAdapter dataSource="#oracle-ds" transactionIsolation="2">
            <broker:statements>
                <broker:statements tablePrefix="IAG_PROC_"/>
            </broker:statements>
        </broker:jdbcPersistenceAdapter>
    </broker:persistenceAdapter>
</broker:broker>
And the problem is that the active-mq directory with KahaDB is still being created and used. I don't understand why, because I'm not using journaledJDBC but jdbcPersistenceAdapter. How can I set this up to use only JDBC?
The scheduler feature in ActiveMQ uses its own KahaDB persistence store; try disabling it on the broker element via schedulerSupport="false".
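Applied to the Spring config above, that would look something like the following (only the schedulerSupport attribute is new):
<broker:broker id="activemqbroker" useJmx="false" persistent="true"
               brokerName="activemqbroker" schedulerSupport="false">
    <!-- transportConnectors and jdbcPersistenceAdapter as configured above -->
</broker:broker>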