I am new to Storm. This is my Storm configuration:
val builder = new TopologyBuilder()
builder.setSpout("kafka-spout", new AlgoKafkaSpout().buildKafkaSpout(), 4)
builder.setBolt("wlref-bolt", new WlrefBolt, 4).shuffleGrouping("kafka-spout")
builder.setBolt("params-bolt", new ParamsBolt, 4).shuffleGrouping("wlref-bolt")
builder.setBolt("gender-bolt", new GenderBolt, 4).shuffleGrouping("params-bolt")
builder.setBolt("age-bolt", new AgeBolt, 4).shuffleGrouping("gender-bolt")
builder.setBolt("preference-bolt", new PreferenceBolt, 4).shuffleGrouping("age-bolt")
builder.setBolt("geo-bolt", new GeoBolt, 4).shuffleGrouping("preference-bolt")
builder.setBolt("device-bolt", new DeviceBolt, 4).shuffleGrouping("geo-bolt")
builder.setBolt("druid-bolt", new AlgoBeamBolt[java.util.Map[String, AnyRef]](new MyBeamFactory), 4)
.shuffleGrouping("device-bolt")
builder.setBolt("redis-bolt",new RedisBolt,4).shuffleGrouping("druid-bolt")
val conf = new Config()
conf.setDebug(false)
conf.setMessageTimeoutSecs(120)
conf.setNumWorkers(1)
StormSubmitter.submitTopology(args(0), conf, builder.createTopology)
And this is my KafkaSpout:
val hosts = new ZkHosts(s"${Config.zkHost}:${Config.zkPort}")
val topic = Config.kafkaTopic
val zkRoot = s"/$topic"+"7"
val groupId = Config.kafkaGroup
val kafkaConfig = new SpoutConfig(hosts, topic, zkRoot, UUID.randomUUID().toString())
kafkaConfig.scheme = new SchemeAsMultiScheme(new StringScheme())
kafkaConfig.startOffsetTime = kafka.api.OffsetRequest.LatestTime
new KafkaSpout(kafkaConfig)
We produce approximately 400 to 1000 messages per second, and the Kafka topic has 4 partitions.
For a few hours messages are consumed properly, but after some time the KafkaSpout stops consuming messages from some partitions.
This is the message-consumption report:
Please let us know if any other information is required.
EDIT
Kafka Version: kafka_2.11-0.10.1.1
Storm Version: apache-storm-0.10.2
Sbt Dependencies
"org.apache.storm" % "storm-core" % "0.10.2" % "provided",
"org.apache.storm" % "storm-kafka" % "0.10.2" /*% "provided"*/,
"org.apache.kafka" %% "kafka" % "0.10.0.1"
Kafka Client : 0.11.0.0-cp1
Kafka Broker :
During a Kafka broker rolling restart, our application lost some messages while sending to the broker. I believe there should not be any message loss during a rolling restart. These are the producer settings we are using (the producer uses asynchronous send() without a callback/future):
val acksConfig: String = "all",
val retriesConfig: Int = Int.MAX_VALUE,
val retriesBackOffConfig: Int = 1000,
val batchSize: Int = 32768,
val lingerTime: Int = 1,
val maxBlockTime: Int = Int.MAX_VALUE,
val requestTimeOut: Int = 420000,
val bufferMemory: Int = 33_554_432,
val compressionType: String = "gzip",
val keySerializer: Class<StringSerializer> = StringSerializer::class.java,
val valueSerializer: Class<ByteArraySerializer> = ByteArraySerializer::class.java
I am seeing these exceptions in the logs
2019-03-19 17:30:59,224 [org.apache.kafka.clients.producer.internals.Sender] [kafka-producer-network-thread | producer-1] (Sender.java:511) WARN org.apache.kafka.clients.producer.internals.Sender - Got error produce response with correlation id 1105790 on topic-partition catapult_on_entitlement_updates_prod-67, retrying (2147483643 attempts left). Error: NOT_LEADER_FOR_PARTITION
But the log says retry attempts are left, so I am curious why it did not retry. Does anyone have any idea?
Two things to note:
What is the replication factor of the topic you are producing to, and what is its required min.insync.replicas?
What do you mean by "the producer lost some messages"? If the producer cannot successfully produce to min.insync.replicas brokers, it throws an exception and fails (for synchronous production). It is up to the producer/client to retry in case of failure, whether production is synchronous or asynchronous; with an asynchronous send() and no callback, such failures go unnoticed. A sketch of detecting them with a callback follows.
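For example, with an asynchronous send() the only way to notice a failed write is to pass a Callback (or inspect the returned Future). A minimal sketch in Scala, assuming a hypothetical topic my-topic and broker address broker:9092:

import java.util.Properties
import org.apache.kafka.clients.producer.{Callback, KafkaProducer, ProducerConfig, ProducerRecord, RecordMetadata}
import org.apache.kafka.common.serialization.StringSerializer

object ProducerCallbackSketch extends App {
  val props = new Properties()
  props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092") // hypothetical broker address
  props.put(ProducerConfig.ACKS_CONFIG, "all")
  props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
  props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)

  val producer = new KafkaProducer[String, String](props)

  // The callback fires once the broker acknowledges (or rejects) the record,
  // so a failure during a rolling restart is surfaced instead of silently dropped.
  producer.send(new ProducerRecord[String, String]("my-topic", "key", "value"), new Callback {
    override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit =
      if (exception != null) {
        // Decide here whether to re-send, log, or fail the application.
        println(s"Send failed: ${exception.getMessage}")
      }
  })

  producer.flush()
  producer.close()
}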
I am using Scala 2.12 and have the following dependencies in my build.sbt:
libraryDependencies += "org.apache.kafka" % "kafka-clients" % "0.10.1.0"
libraryDependencies += "io.confluent" % "kafka-avro-serializer" % "3.1.1"
libraryDependencies += "io.confluent" % "common-config" % "3.1.1"
libraryDependencies += "io.confluent" % "common-utils" % "3.1.1"
libraryDependencies += "io.confluent" % "kafka-schema-registry-client" % "3.1.1"
Thanks to this community, I am able to convert my raw data to the required Avro format.
We need to use the Confluent libraries to serialize the data and send it to the Kafka topics.
I am using the following properties and Avro record.
properties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer")
properties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "io.confluent.kafka.serializers.KafkaAvroSerializer")
properties.put("schema.registry.url", "http://myschemahost:8081")
Only the relevant snippet of code is shown, for brevity.
val producer = new KafkaProducer[String, GenericData.Record](properties)
val schema = new Schema.Parser().parse(new File(schemaFileName))
var avroRecord = new GenericData.Record(schema)
// code to populate record
// check output below to see the data
logger.info(s"${avroRecord.toString}\n")
producer.send(new ProducerRecord[String, GenericData.Record](topic, avroRecord), new ProducerCallback)
producer.flush()
producer.close()
Schema and data, as per the output:
{"name": "person","type": "record","fields": [{"name": "address","type": {"type" : "record","name" : "AddressUSRecord","fields" : [{"name": "streetaddress", "type": "string"},{"name": "city", "type":"string"}]}}]}
I am getting the following error while publishing to Kafka.
Error registering Avro schema:
org.apache.kafka.common.errors.SerializationException:
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Unexpected character ('<' (code 60)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at [Source: (sun.net.www.protocol.http.HttpURLConnection$HttpInputStream); line: 1, column: 2]; error code: 50005
at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:170)
at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:187)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:238)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:230)
at io.confluent.kafka.schemaregistry.client.rest.RestService.registerSchema(RestService.java:225)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.registerAndGetId(CachedSchemaRegistryClient.java:59)
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.register(CachedSchemaRegistryClient.java:91)
at io.confluent.kafka.serializers.AbstractKafkaAvroSerializer.serializeImpl(AbstractKafkaAvroSerializer.java:72)
at io.confluent.kafka.serializers.KafkaAvroSerializer.serialize(KafkaAvroSerializer.java:54)
at org.apache.kafka.common.serialization.Serializer.serialize(Serializer.java:60)
at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:877)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:839)
Based on the schema and data, is anything missing? Is my record correct?
Also, I want to know how I should populate an Avro NULL from Scala? Scala's None doesn't work.
Any help will be appreciated. I am really stuck here.
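For reference, a minimal sketch of populating an Avro null from Scala, assuming a field declared as a nullable union (the schema above has no such field, so the nickname field here is purely illustrative); the usual approach is to put a Java null rather than Scala's None:

import org.apache.avro.Schema
import org.apache.avro.generic.GenericData

object AvroNullSketch extends App {
  // Hypothetical schema with one nullable field, for illustration only.
  val schemaJson =
    """{"type":"record","name":"person","fields":[
      |  {"name":"city","type":"string"},
      |  {"name":"nickname","type":["null","string"],"default":null}
      |]}""".stripMargin
  val schema = new Schema.Parser().parse(schemaJson)

  val record = new GenericData.Record(schema)
  record.put("city", "Pune") // made-up value for illustration
  // For a ["null","string"] union, put a Java null (not Scala's None) to encode an Avro null.
  record.put("nickname", null)

  println(record) // prints the populated record
}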
UPDATE:
Thanks @cricket_007 for pointing out the issue. I do get the following error:
2019-03-20 13:26:09.660 [application-akka.actor.default-dispatcher-5] INFO i.c.k.s.KafkaAvroSerializerConfig.logAll(169) - KafkaAvroSerializerConfig values:
schema.registry.url = [http://myhost:8081]
max.schemas.per.subject = 1000
Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Unexpected character ('<' (code 60)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
at [Source: (sun.net.www.protocol.http.HttpURLConnection$HttpInputStream); line: 1, column: 2]; error code: 50005
However, when I use the same URL (http://myhost:8081) in my browser it works well. I can see the subjects and other information.
But as soon as I use the client (the Scala program above), it fails with the above error.
I just checked with a sample piece of code like the one below, and it gives the same issue.
val client = new OkHttpClient
val request = new Request.Builder().url("http://myhost:8081/subjects").build()
val output = client.newCall(request).execute().body().string()
logger.info(s"Subjects: ${output}\n")
I am getting connection refused for the schema registry URL.
Subjects: <HEAD><TITLE>Connection refused</TITLE></HEAD>
<BODY BGCOLOR="white" FGCOLOR="black"><H1>Connection refused</H1><HR>
<FONT FACE="Helvetica,Arial"><B>
Description: Connection refused</B></FONT>
<HR>
<!-- default "Connection refused" response (502) -->
</BODY>
So I wanted to check whether I am missing anything. The same thing works when I run it in the browser, but simple code like the above fails.
That's an HTTP response parsing error. It seems your Schema Registry is not returning a JSON response, but rather some HTML starting with a < open tag.
You should check whether the registry is really running at http://myschemahost:8081, and you can manually post your schema to it using the REST API to perform the same action the serializer does, as in the sketch below.
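For example, a minimal Scala sketch of registering a schema through the registry's REST API (the subject name person-value and the toy schema are assumptions; adjust them to your setup). If the reply is not JSON, something other than the Schema Registry, such as a proxy, is answering on that host and port:

import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets

object RegisterSchemaSketch extends App {
  // POST /subjects/<subject>/versions with a JSON-escaped schema string.
  val url = new URL("http://myschemahost:8081/subjects/person-value/versions")
  val body = """{"schema": "{\"type\":\"record\",\"name\":\"person\",\"fields\":[{\"name\":\"city\",\"type\":\"string\"}]}"}"""

  val conn = url.openConnection().asInstanceOf[HttpURLConnection]
  conn.setRequestMethod("POST")
  conn.setRequestProperty("Content-Type", "application/vnd.schemaregistry.v1+json")
  conn.setDoOutput(true)
  conn.getOutputStream.write(body.getBytes(StandardCharsets.UTF_8))

  val code = conn.getResponseCode
  val stream = if (code < 400) conn.getInputStream else conn.getErrorStream
  // A JSON reply such as {"id":1} means the registry handled the request;
  // an HTML reply (like the "Connection refused" page above) points at a proxy or network problem.
  println(s"HTTP $code: " + scala.io.Source.fromInputStream(stream).mkString)
}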
I am writing a Kafka client producer as follows:
public class BasicProducerExample {
public static void main(String[] args){
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.RETRIES_CONFIG, 0);
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
//props.put(ProducerConfig.
props.put("batch.size","16384");// maximum size of message
Producer<String, String> producer = new KafkaProducer<String, String>(props);
TestCallback callback = new TestCallback();
Random rnd = new Random();
for (long i = 0; i < 2 ; i++) {
//ProducerRecord<String, String> data = new ProducerRecord<String, String>("dke", "key-" + i, "message-"+i );
//Topci and Message
ProducerRecord<String, String> data = new ProducerRecord<String, String>("dke", ""+i);
producer.send(data, callback);
}
producer.close();
}
private static class TestCallback implements Callback {
@Override
public void onCompletion(RecordMetadata recordMetadata, Exception e) {
if (e != null) {
System.out.println("Error while producing message to topic :" + recordMetadata);
e.printStackTrace();
} else {
String message = String.format("sent message to topic:%s partition:%s offset:%s", recordMetadata.topic(), recordMetadata.partition(), recordMetadata.offset());
System.out.println(message);
}
}
}
}
OUTPUT:
Error while producing message to topic :null
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
NOTE:
Broker port: localhost:6667 is working.
In your property for BOOTSTRAP_SERVERS_CONFIG, try changing the port number to 6667.
I use Apache Kafka on a Hortonworks (HDP 2.X release) installation. The error message encountered means that the Kafka producer was not able to push the data to the segment log file. From a command-line console, that would mean one of two things:
You are using incorrect port for the brokers
Your listener config in server.properties are not working
If you encounter the error message while writing via the Scala API, additionally check the connection to the Kafka cluster using telnet <cluster-host> <broker-port>.
NOTE: If you are using the Scala API to create a topic, it takes some time for the brokers to learn about the newly created topic. So, immediately after topic creation, producers might fail with the error Failed to update metadata after 60000 ms; one way to wait for the metadata is sketched below.
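A minimal sketch of that wait, using KafkaProducer.partitionsFor, which blocks up to max.block.ms for the topic's metadata (the retry count and sleep here are arbitrary choices, not anything the producer requires):

import org.apache.kafka.clients.producer.KafkaProducer
import org.apache.kafka.common.errors.TimeoutException

object MetadataWait {
  // Returns true once the brokers report partitions for the topic, retrying a few times.
  def waitForTopicMetadata(producer: KafkaProducer[String, String], topic: String, maxAttempts: Int = 10): Boolean = {
    var attempt = 0
    while (attempt < maxAttempts) {
      try {
        // partitionsFor blocks (up to max.block.ms) until metadata for the topic is available.
        if (!producer.partitionsFor(topic).isEmpty) return true
      } catch {
        case _: TimeoutException => // the brokers do not know the topic yet, try again
      }
      attempt += 1
      Thread.sleep(1000)
    }
    false
  }
}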
I did the following checks in order to resolve this issue:
The first difference, which I checked via Ambari, is that Kafka brokers listen on port 6667 on HDP 2.x (vanilla Apache Kafka defaults to 9092):
listeners=PLAINTEXT://localhost:6667
Next, use the IP address instead of localhost.
I executed netstat -na | grep 6667
tcp 0 0 192.30.1.5:6667 0.0.0.0:* LISTEN
tcp 1 0 192.30.1.5:52242 192.30.1.5:6667 CLOSE_WAIT
tcp 0 0 192.30.1.5:54454 192.30.1.5:6667 TIME_WAIT
So I modified the producer call to use the IP and not localhost:
./kafka-console-producer.sh --broker-list 192.30.1.5:6667 --topic rdl_test_2
To check whether new records are being written, monitor the /kafka-logs folder:
cd /kafka-logs/<topic name>/
ls -lart
-rw-r--r--. 1 kafka hadoop 0 Feb 10 07:24 00000000000000000000.log
-rw-r--r--. 1 kafka hadoop 10485756 Feb 10 07:24 00000000000000000000.timeindex
-rw-r--r--. 1 kafka hadoop 10485760 Feb 10 07:24 00000000000000000000.index
Once the producer successfully writes, the segment log file 00000000000000000000.log will grow in size.
See the size below:
-rw-r--r--. 1 kafka hadoop 10485760 Feb 10 07:24 00000000000000000000.index
-rw-r--r--. 1 kafka hadoop       45 Feb 10 09:16 00000000000000000000.log
-rw-r--r--. 1 kafka hadoop 10485756 Feb 10 07:24 00000000000000000000.timeindex
At this point, you can run the consumer-console.sh:
./kafka-console-consumer.sh --bootstrap-server 192.30.1.5:6667 --topic rdl_test_2 --from-beginning
The response is hello world.
After this step, if you want to produce messages via the Scala API, change the listeners value (from localhost to a public IP) and restart the Kafka brokers via Ambari:
listeners=PLAINTEXT://192.30.1.5:6667
A sample producer is as follows:
package com.scalakafka.sample
import java.util.Properties
import java.util.concurrent.TimeUnit
import org.apache.kafka.clients.producer.{ProducerRecord, KafkaProducer}
import org.apache.kafka.common.serialization.{StringSerializer, StringDeserializer}
class SampleKafkaProducer {
case class KafkaProducerConfigs(brokerList: String = "192.30.1.5:6667") {
val properties = new Properties()
val batchsize :java.lang.Integer = 1
properties.put("bootstrap.servers", brokerList)
properties.put("key.serializer", classOf[StringSerializer])
properties.put("value.serializer", classOf[StringSerializer])
// properties.put("serializer.class", classOf[StringDeserializer])
properties.put("batch.size", batchsize)
// properties.put("linger.ms", 1)
// properties.put("buffer.memory", 33554432)
}
val producer = new KafkaProducer[String, String](KafkaProducerConfigs().properties)
def produce(topic: String, messages: Iterable[String]): Unit = {
messages.foreach { m =>
println(s"Sending $topic and message is $m")
val result = producer.send(new ProducerRecord(topic, m)).get()
println(s"the write status is ${result}")
}
producer.flush()
producer.close(10L, TimeUnit.MILLISECONDS)
}
}
Hope this helps someone.
I am trying the new 0.9 Consumer API and Producer API, but I just can't seem to get it working. I have a producer that produces 100 messages per partition of a topic with two partitions. My code reads like this:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("acks", "1");
props.put("retries", 0);
props.put("batch.size", 2);
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
Producer<String, String> producer = new KafkaProducer<>(props);
System.out.println("Created producer");
for (int i = 0; i < 100; i++) {
for (int j = 0; j < 2; j++) {
producer.send(new ProducerRecord<>("burrow_test_2", j, "M_"+ i + "_" + j + "_msg",
"M_" + i + "_" + j + "_msg"));
System.out.println("Sending msg " + i + " into partition " + j);
}
System.out.println("Sent 200 msgs");
}
System.out.println("Closing producer");
producer.close();
System.out.println("Closed producer");
Now, producer.close takes a really long time to close, after which I assume any buffers are flushed and the messages are written to the tail of the log.
Now for my consumer: I would like to read a specified number of messages before quitting. I chose to manually commit the offset as I read. The code reads like this:
int noOfMessagesToRead = Integer.parseInt(args[0]);
String groupName = args[1];
System.out.println("Reading " + noOfMessagesToRead + " for group " + groupName);
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("group.id", groupName);
//props.put("enable.auto.commit", "true");
//props.put("auto.commit.interval.ms", "1000");
props.put("session.timeout.ms", "30000");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
consumer.subscribe(Arrays.asList("burrow_test_2"));
//consumer.seekToEnd(new TopicPartition("burrow_test_2", 0), new TopicPartition("burrow_test_2", 1));
int noOfMessagesRead = 0;
while (noOfMessagesRead != noOfMessagesToRead) {
System.out.println("Polling....");
ConsumerRecords<String, String> records = consumer.poll(0);
for (ConsumerRecord<String, String> record : records) {
System.out.printf("offset = %d, key = %s, value = %s", record.offset(), record.key(),
record.value());
noOfMessagesRead++;
}
consumer.commitSync();
}
}
But my consumer is always stuck on the poll call (it never returns, even though I have specified the timeout as 0).
Now, just to confirm, I tried consuming from the command-line console consumer provided by Kafka: bin/kafka-console-consumer.sh --from-beginning --new-consumer --topic burrow_test_2 --bootstrap-server localhost:9092.
This seems to consume only the messages produced before I started the Java producer.
So my questions are:
What is the problem with the Java producer above?
Why is the consumer not polling anything? (I have older messages that are read just fine by the console consumer.)
UPDATE: My bad. I had port forwarding enabled from localhost to a remote machine running Kafka and ZooKeeper. When that was the case there seemed to be some issue. Running on localhost produced the messages.
My only remaining question is that, with the consumer API, I am unable to seek to an offset. I tried both the seekToEnd and seekToBeginning methods; both threw an exception saying no record found. So what is the way for a consumer to seek to an offset? Is the auto.offset.reset consumer property the only option?
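For what it's worth, a minimal sketch of the seeking pattern that usually works: seek only applies to partitions the consumer currently owns, so either assign() them explicitly (as below) or wait for the group assignment after subscribe() before seeking. The 0.9 client's seekToBeginning/seekToEnd take varargs of TopicPartition; newer clients take a Collection instead.

import java.util.{Collections, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition
import org.apache.kafka.common.serialization.StringDeserializer

object SeekSketch extends App {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("group.id", "seek-sketch")
  props.put("key.deserializer", classOf[StringDeserializer].getName)
  props.put("value.deserializer", classOf[StringDeserializer].getName)

  val consumer = new KafkaConsumer[String, String](props)
  val partition = new TopicPartition("burrow_test_2", 0)

  // With assign() the consumer owns the partition immediately, so seeking is legal.
  consumer.assign(Collections.singletonList(partition))
  consumer.seekToBeginning(partition) // 0.9 API (varargs); newer clients: seekToBeginning(Collections.singletonList(partition))

  val records = consumer.poll(1000)
  println(s"Fetched ${records.count()} records starting from the beginning")
  consumer.close()
}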
I have been playing around with Akka Persistence and have written the following program to test my understanding. The problem is that I get different results each time I run this program. The correct answer is 49995000 but I don't always get that. I have cleaned out the journal directory between each run but it does not make any difference. Can anyone see what's going wrong? The program simply sums all the numbers from 1 to n (where n is 9999 in the code below).
The correct answer is (n * (n+1)) / 2; for n = 9999 that's 49995000.
EDIT: It seems to work more consistently with JDK 8 than with JDK 7. Should I be using JDK 8 only?
package io.github.ourkid.akka.aggregator.guaranteed
import akka.actor.Actor
import akka.actor.ActorPath
import akka.actor.ActorSystem
import akka.actor.Props
import akka.actor.actorRef2Scala
import akka.persistence.AtLeastOnceDelivery
import akka.persistence.PersistentActor
import scala.concurrent.duration._ // needed for the "0 seconds" / "2 seconds" syntax used with the scheduler below
case class ExternalRequest(updateAmount : Int)
case class CountCommand(deliveryId : Long, updateAmount : Int)
case class Confirm(deliveryId : Long)
sealed trait Evt
case class CountEvent(updateAmount : Int) extends Evt
case class ConfirmEvent(deliveryId : Long) extends Evt
class TestGuaranteedDeliveryActor(counter : ActorPath) extends PersistentActor with AtLeastOnceDelivery {
override def persistenceId = "persistent-actor-ref-1"
override def receiveCommand : Receive = {
case ExternalRequest(updateAmount) => persist(CountEvent(updateAmount))(updateState)
case Confirm(deliveryId) => persist(ConfirmEvent(deliveryId)) (updateState)
}
override def receiveRecover : Receive = {
case evt : Evt => updateState(evt)
}
def updateState(evt:Evt) = evt match {
case CountEvent(updateAmount) => deliver(counter, id => CountCommand(id, updateAmount))
case ConfirmEvent(deliveryId) => confirmDelivery(deliveryId)
}
}
class FactorialActor extends Actor {
var count = 0
def receive = {
case CountCommand(deliveryId : Long, updateAmount:Int) => {
count = count + updateAmount
sender() ! Confirm(deliveryId)
}
case "print" => println(count)
}
}
object GuaranteedDeliveryTest extends App {
val system = ActorSystem()
val factorial = system.actorOf(Props[FactorialActor])
val delActor = system.actorOf(Props(classOf[TestGuaranteedDeliveryActor], factorial.path))
import system.dispatcher
system.scheduler.schedule(0 seconds, 2 seconds) { factorial ! "print" }
for (i <- 1 to 9999)
delActor ! ExternalRequest(i)
}
SBT file
name := "akka_aggregator"
organization := "io.github.ourkid"
version := "0.0.1-SNAPSHOT"
scalaVersion := "2.11.4"
scalacOptions ++= Seq("-unchecked", "-deprecation")
resolvers ++= Seq(
"Typesafe Repository" at "http://repo.typesafe.com/typesafe/releases/"
)
val Akka = "2.3.7"
val Spray = "1.3.2"
libraryDependencies ++= Seq(
// Core Akka
"com.typesafe.akka" %% "akka-actor" % Akka,
"com.typesafe.akka" %% "akka-cluster" % Akka,
"com.typesafe.akka" %% "akka-persistence-experimental" % Akka,
"org.iq80.leveldb" % "leveldb" % "0.7",
"org.fusesource.leveldbjni" % "leveldbjni-all" % "1.8",
// For future REST API
"io.spray" %% "spray-httpx" % Spray,
"io.spray" %% "spray-can" % Spray,
"io.spray" %% "spray-routing" % Spray,
"org.typelevel" %% "scodec-core" % "1.3.0",
// CSV reader
"net.sf.opencsv" % "opencsv" % "2.3",
// Logging
"com.typesafe.akka" %% "akka-slf4j" % Akka,
"ch.qos.logback" % "logback-classic" % "1.0.13",
// Testing
"org.scalatest" %% "scalatest" % "2.2.1" % "test",
"com.typesafe.akka" %% "akka-testkit" % Akka % "test",
"io.spray" %% "spray-testkit" % Spray % "test",
"org.scalacheck" %% "scalacheck" % "1.11.6" % "test"
)
fork := true
mainClass in assembly := Some("io.github.ourkid.akka.aggregator.TestGuaranteedDeliveryActor")
application.conf file
##########################################
# Akka Persistence Reference Config File #
##########################################
akka {
# Loggers to register at boot time (akka.event.Logging$DefaultLogger logs
# to STDOUT)
loggers = ["akka.event.slf4j.Slf4jLogger"]
# Log level used by the configured loggers (see "loggers") as soon
# as they have been started; before that, see "stdout-loglevel"
# Options: OFF, ERROR, WARNING, INFO, DEBUG
loglevel = "DEBUG"
# Log level for the very basic logger activated during ActorSystem startup.
# This logger prints the log messages to stdout (System.out).
# Options: OFF, ERROR, WARNING, INFO, DEBUG
stdout-loglevel = "INFO"
# Filter of log events that is used by the LoggingAdapter before
# publishing log events to the eventStream.
logging-filter = "akka.event.slf4j.Slf4jLoggingFilter"
# Protobuf serialization for persistent messages
actor {
serializers {
akka-persistence-snapshot = "akka.persistence.serialization.SnapshotSerializer"
akka-persistence-message = "akka.persistence.serialization.MessageSerializer"
}
serialization-bindings {
"akka.persistence.serialization.Snapshot" = akka-persistence-snapshot
"akka.persistence.serialization.Message" = akka-persistence-message
}
}
persistence {
journal {
# Maximum size of a persistent message batch written to the journal.
max-message-batch-size = 200
# Maximum size of a deletion batch written to the journal.
max-deletion-batch-size = 10000
# Path to the journal plugin to be used
plugin = "akka.persistence.journal.leveldb"
# In-memory journal plugin.
inmem {
# Class name of the plugin.
class = "akka.persistence.journal.inmem.InmemJournal"
# Dispatcher for the plugin actor.
plugin-dispatcher = "akka.actor.default-dispatcher"
}
# LevelDB journal plugin.
leveldb {
# Class name of the plugin.
class = "akka.persistence.journal.leveldb.LeveldbJournal"
# Dispatcher for the plugin actor.
plugin-dispatcher = "akka.persistence.dispatchers.default-plugin-dispatcher"
# Dispatcher for message replay.
replay-dispatcher = "akka.persistence.dispatchers.default-replay-dispatcher"
# Storage location of LevelDB files.
dir = "journal"
# Use fsync on write
fsync = on
# Verify checksum on read.
checksum = off
# Native LevelDB (via JNI) or LevelDB Java port
native = on
# native = off
}
# Shared LevelDB journal plugin (for testing only).
leveldb-shared {
# Class name of the plugin.
class = "akka.persistence.journal.leveldb.SharedLeveldbJournal"
# Dispatcher for the plugin actor.
plugin-dispatcher = "akka.actor.default-dispatcher"
# timeout for async journal operations
timeout = 10s
store {
# Dispatcher for shared store actor.
store-dispatcher = "akka.persistence.dispatchers.default-plugin-dispatcher"
# Dispatcher for message replay.
replay-dispatcher = "akka.persistence.dispatchers.default-plugin-dispatcher"
# Storage location of LevelDB files.
dir = "journal"
# Use fsync on write
fsync = on
# Verify checksum on read.
checksum = off
# Native LevelDB (via JNI) or LevelDB Java port
native = on
}
}
}
snapshot-store {
# Path to the snapshot store plugin to be used
plugin = "akka.persistence.snapshot-store.local"
# Local filesystem snapshot store plugin.
local {
# Class name of the plugin.
class = "akka.persistence.snapshot.local.LocalSnapshotStore"
# Dispatcher for the plugin actor.
plugin-dispatcher = "akka.persistence.dispatchers.default-plugin-dispatcher"
# Dispatcher for streaming snapshot IO.
stream-dispatcher = "akka.persistence.dispatchers.default-stream-dispatcher"
# Storage location of snapshot files.
dir = "snapshots"
}
}
view {
# Automated incremental view update.
auto-update = on
# Interval between incremental updates
auto-update-interval = 5s
# Maximum number of messages to replay per incremental view update. Set to
# -1 for no upper limit.
auto-update-replay-max = -1
}
at-least-once-delivery {
# Interval between redelivery attempts
redeliver-interval = 5s
# Maximum number of unconfirmed messages that will be sent in one redelivery burst
redelivery-burst-limit = 10000
# After this number of delivery attempts a `ReliableRedelivery.UnconfirmedWarning`
# message will be sent to the actor.
warn-after-number-of-unconfirmed-attempts = 5
# Maximum number of unconfirmed messages that an actor with AtLeastOnceDelivery is
# allowed to hold in memory.
max-unconfirmed-messages = 100000
}
dispatchers {
default-plugin-dispatcher {
type = PinnedDispatcher
executor = "thread-pool-executor"
}
default-replay-dispatcher {
type = Dispatcher
executor = "fork-join-executor"
fork-join-executor {
parallelism-min = 2
parallelism-max = 8
}
}
default-stream-dispatcher {
type = Dispatcher
executor = "fork-join-executor"
fork-join-executor {
parallelism-min = 2
parallelism-max = 8
}
}
}
}
}
Correct output:
18:02:36.684 [default-akka.actor.default-dispatcher-3] INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
18:02:36.684 [default-akka.actor.default-dispatcher-3] DEBUG akka.event.EventStream - logger log1-Slf4jLogger started
18:02:36.684 [default-akka.actor.default-dispatcher-3] DEBUG akka.event.EventStream - Default Loggers started
0
18:02:36.951 [default-akka.actor.default-dispatcher-14] DEBUG a.s.Serialization(akka://default) - Using serializer[akka.persistence.serialization.MessageSerializer] for message [akka.persistence.PersistentImpl]
18:02:36.966 [default-akka.actor.default-dispatcher-3] DEBUG a.s.Serialization(akka://default) - Using serializer[akka.serialization.JavaSerializer] for message [io.github.ourkid.akka.aggregator.guaranteed.CountEvent]
3974790
24064453
18:02:42.313 [default-akka.actor.default-dispatcher-11] DEBUG a.s.Serialization(akka://default) - Using serializer[akka.serialization.JavaSerializer] for message [io.github.ourkid.akka.aggregator.guaranteed.ConfirmEvent]
49995000
49995000
49995000
49995000
Incorrect run:
17:56:22.493 [default-akka.actor.default-dispatcher-4] INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
17:56:22.508 [default-akka.actor.default-dispatcher-4] DEBUG akka.event.EventStream - logger log1-Slf4jLogger started
17:56:22.508 [default-akka.actor.default-dispatcher-4] DEBUG akka.event.EventStream - Default Loggers started
0
17:56:22.750 [default-akka.actor.default-dispatcher-2] DEBUG a.s.Serialization(akka://default) - Using serializer[akka.persistence.serialization.MessageSerializer] for message [akka.persistence.PersistentImpl]
17:56:22.765 [default-akka.actor.default-dispatcher-7] DEBUG a.s.Serialization(akka://default) - Using serializer[akka.serialization.JavaSerializer] for message [io.github.ourkid.akka.aggregator.guaranteed.CountEvent]
3727815
22167811
17:56:28.391 [default-akka.actor.default-dispatcher-3] DEBUG a.s.Serialization(akka://default) - Using serializer[akka.serialization.JavaSerializer] for message [io.github.ourkid.akka.aggregator.guaranteed.ConfirmEvent]
49995000
51084018
51084018
52316760
52316760
52316760
52316760
52316760
Another incorrect run:
17:59:12.122 [default-akka.actor.default-dispatcher-3] INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
17:59:12.137 [default-akka.actor.default-dispatcher-3] DEBUG akka.event.EventStream - logger log1-Slf4jLogger started
17:59:12.137 [default-akka.actor.default-dispatcher-3] DEBUG akka.event.EventStream - Default Loggers started
0
17:59:12.387 [default-akka.actor.default-dispatcher-7] DEBUG a.s.Serialization(akka://default) - Using serializer[akka.persistence.serialization.MessageSerializer] for message [akka.persistence.PersistentImpl]
17:59:12.402 [default-akka.actor.default-dispatcher-13] DEBUG a.s.Serialization(akka://default) - Using serializer[akka.serialization.JavaSerializer] for message [io.github.ourkid.akka.aggregator.guaranteed.CountEvent]
2982903
17710176
49347145
17:59:18.204 [default-akka.actor.default-dispatcher-13] DEBUG a.s.Serialization(akka://default) - Using serializer[akka.serialization.JavaSerializer] for message [io.github.ourkid.akka.aggregator.guaranteed.ConfirmEvent]
51704199
51704199
55107844
55107844
55107844
55107844
You're using AtLeastOnceDelivery semantics. As the documentation says:
Note: At-least-once delivery implies that original message send order is not always preserved and the destination may receive duplicate messages. That means that the semantics do not match those of a normal ActorRef send operation:
- it is not at-most-once delivery
- message order for the same sender–receiver pair is not preserved due to possible resends
- after a crash and restart of the destination, messages are still delivered—to the new actor incarnation
These semantics are similar to what an ActorPath represents (see Actor Lifecycle), therefore you need to supply a path and not a reference when delivering messages. The messages are sent to the path with an actor selection.
So some numbers may be delivered more than once. You can either ignore duplicate numbers inside FactorialActor or not use these semantics; a sketch of the deduplication approach follows.
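A minimal sketch of the deduplication variant, reusing the CountCommand and Confirm messages defined in the question; the in-memory Set of processed deliveryIds is an assumption (in a real system you would persist that state as well, otherwise it is lost on restart):

import akka.actor.Actor

// Same as FactorialActor, but duplicates redelivered by AtLeastOnceDelivery are ignored.
class DedupFactorialActor extends Actor {
  var count = 0
  var processed = Set.empty[Long] // deliveryIds that have already been added to the sum

  def receive = {
    case CountCommand(deliveryId, updateAmount) =>
      if (!processed.contains(deliveryId)) {
        processed += deliveryId
        count += updateAmount
      }
      // Confirm even for duplicates, so the sender stops redelivering this deliveryId.
      sender() ! Confirm(deliveryId)
    case "print" => println(count)
  }
}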