Loading 20 million rows into a partitioned stateful service - azure-service-fabric

I'm trying to load 20 million rows into a partitioned stateful service's ReliableDictionary. I partitioned the stateful service into 10 partitions. Based on the MSDN documentation, I understood that I need to use a hashing algorithm to find the correct partition and send the data to it to load into the IReliableDictionary. So I used Hydra to get the partition number based on the value. All I'm storing is a List<long> in the IReliableDictionary.
So I created a stateless service as a wrapper, which will:
fetch the rows from SQL Server (20 million),
get the partition number using Hydra for each row,
group them by partition number, and
call the stateful service for each partition using service remoting.
However, I get a "fabric message too large" exception if I send 1 million rows per request, so I chunked it into 100,000 rows per request.
This takes 74 minutes to complete, which is too long. Below is the code for uploading; please advise.
foreach (var itemKvp in ItemsDictionary)
{
    var dataServiceUri = new Uri("fabric:/TestApp/dataservice");
    // Insert into the correct shard based on the hash algorithm
    var dataService = _serviceProxyFactory.CreateServiceProxy<IDataService>(
        dataServiceUri,
        new ServicePartitionKey(itemKvp.Key),
        TargetReplicaSelector.PrimaryReplica,
        "dataServiceRemotingListener");
    var itemsShard = itemKvp.Value;
    // If the total record count is greater than 100,000, send it in chunks
    if (itemsShard.Count > 100_000)
    {
        //var tasks = new List<Task>();
        var totalCount = itemsShard.Count;
        var pageSize = 100_000;
        var skip = 0;
        while (skip < totalCount)
        {
            await dataService.InsertData(itemsShard.Skip(skip).Take(pageSize).ToList());
            skip += pageSize;
        }
    }
    else
    {
        // Otherwise send it all in one request
        await dataService.InsertData(itemsShard);
    }
}

You can likely save some time here by uploading to all partitions in parallel.
So create 10 service proxies (one for each partition) and use them simultaneously.

Related

UaSerializationException: request exceeds remote max message size: 2434140 > 2097152

I am a rookie. I tried to use the following code for a bulk subscription, but something went wrong. How can I solve this problem?
OpcUaSubscriptionManager subscriptionManager = opcUaClient.getSubscriptionManager();
UaSubscription subscription = subscriptionManager.createSubscription(publishInterval).get();

List<MonitoredItemCreateRequest> itemsToCreate = new ArrayList<>();
for (Tag tag : tagList) {
    NodeId nodeId = new NodeId(nameSpace, tag.getPath());
    ReadValueId readValueId = new ReadValueId(nodeId, AttributeId.Value.uid(), null, null);
    MonitoringParameters parameters = new MonitoringParameters(
            subscription.nextClientHandle(),
            publishInterval,
            null,                        // filter, null means use default
            UInteger.valueOf(queueSize), // queue size
            true                         // discard oldest
    );
    MonitoredItemCreateRequest request = new MonitoredItemCreateRequest(
            readValueId, MonitoringMode.Reporting, parameters);
    itemsToCreate.add(request);
}

BiConsumer<UaMonitoredItem, Integer> consumer = (item, id) ->
        item.setValueConsumer(this::onSubscriptionValue);

List<UaMonitoredItem> items = subscription.createMonitoredItems(
        TimestampsToReturn.Both,
        itemsToCreate,
        consumer
).get();

for (UaMonitoredItem item : items) {
    if (!item.getStatusCode().isGood()) {
        log.error("failed to create item for nodeId={} (status={})",
                item.getReadValueId().getNodeId(), item.getStatusCode());
    }
}
How many items are you trying to create?
It seems that the resulting message exceeds the limits set by the server you are connecting to. You may need to break your list up and create the items in smaller chunks.
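For example, a rough sketch of creating the items in smaller batches, reusing the subscription, itemsToCreate, and consumer from the code above (the batch size of 500 is an arbitrary illustrative value; pick one that keeps each request under the server's limit):
int batchSize = 500;
List<UaMonitoredItem> createdItems = new ArrayList<>();
for (int from = 0; from < itemsToCreate.size(); from += batchSize) {
    int to = Math.min(from + batchSize, itemsToCreate.size());
    // Send only a slice of the requests per call so no single message exceeds the server's maximum
    List<UaMonitoredItem> batch = subscription.createMonitoredItems(
            TimestampsToReturn.Both,
            itemsToCreate.subList(from, to),
            consumer
    ).get();
    createdItems.addAll(batch);
}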
I do not know the library that you use, but one of the first steps for an OPC UA client connecting to a server is to negotiate the maximum buffer sizes, the maximum total message size, and the maximum number of chunks a message can be split into; the OPC UA documentation calls this process the "Handshake".
If your request is too long, it should be split and sent in several chunks according to the limits previously negotiated with the server.
The server will probably also reply in several chunks; all of this has to be considered in the programming of an OPC UA client.

How to load records from MongoDB with a limit using Spring Data

I want to load only 100,000 records that are in NOT_STARTED status in MongoDB, process those records, and update their status to STARTED. I want to repeat this process until all the records in NOT_STARTED status have been processed.
Currently I am using PageRequest as shown in the code below, and it seems to work. But is there a way I can do this without PageRequest, with my repository extending Spring's MongoRepository? PageRequest seems meant for pagination, but I am not doing any pagination, only loading 100,000 records each time and processing them.
Sort sort = new Sort(Sort.Direction.ASC, "_id");
int count = (int) PaymentReportRepository.count();
for (int i = 0; i < count; i += reportProperties.getPageSize()) {
    List<PaymentReport> paymentReportList =
            MongoTraceability.capture(() ->
                    PaymentReportRepository.findByStatusAndDateLessThan(
                            "NOT_STARTED",
                            LocalDateTime.now().minusSeconds(reportProperties.getTimeInterval()),
                            PageRequest.of(0, reportProperties.getPageSize(), sort)));
    if (paymentReportList != null && !paymentReportList.isEmpty()) {
        for (PaymentReport paymentReport : paymentReportList) {
            messageService.processMessage(paymentReport);
        }
    }
}
It appears that you're processing each record synchronously. Do you have any desire/ability to process asynchronously?
Will this solution be run off a single JVM?
From your question I'm assuming synchronous processing and a single JVM.
I would use Spring's MongoTemplate class. Tutorials and examples here: https://www.baeldung.com/queries-in-spring-data-mongodb
MongoTemplate will allow you to write your query along the lines of query("NOT_STARTED").limit(100000) to return the results you want. Assuming your messageService.processMessage(paymentReport) does an update() on the document once it is done processing and updates its status, your next query will retrieve the next 100000 messages with the desired status.
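A minimal sketch of such a query with MongoTemplate (the field names status and date and the injected mongoTemplate are assumptions based on the repository method shown in the question):
import org.springframework.data.domain.Sort;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;

// Oldest NOT_STARTED documents first, capped at 100,000 results
Query query = new Query(Criteria.where("status").is("NOT_STARTED")
        .and("date").lt(LocalDateTime.now().minusSeconds(reportProperties.getTimeInterval())))
        .with(new Sort(Sort.Direction.ASC, "_id"))
        .limit(100000);
List<PaymentReport> batch = mongoTemplate.find(query, PaymentReport.class);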
You can try renaming findByStatusAndDateLessThan to findFirst100000ByStatusAndDateLessThan.
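A rough sketch of what that derived query could look like on the repository (the PaymentReport ID type and parameter types are assumptions; Spring Data derives the result limit from the First100000 keyword, so the PageRequest can be dropped and only a Sort passed in):
public interface PaymentReportRepository extends MongoRepository<PaymentReport, String> {
    // Result size is capped at 100,000 by the method name; the query itself is derived from the name
    List<PaymentReport> findFirst100000ByStatusAndDateLessThan(String status, LocalDateTime date, Sort sort);
}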

Streaming application with state stores takes up to 1 hour to restart

We are using spring cloud stream with Kafka 2.0.1 and utilizing the InteractiveQueryService to fetch data from the stores. There are 4 stores that persist data on disk after aggregating data. The code for the topology looks like this:
@Slf4j
@EnableBinding(SensorMeasurementBinding.class)
public class Consumer {

    public static final String RETENTION_MS = "retention.ms";
    public static final String CLEANUP_POLICY = "cleanup.policy";

    @Value("${windowstore.retention.ms}")
    private String retention;

    /**
     * Process the data flowing in from a Kafka topic. Aggregate the data to:
     * - 2 minutes
     * - 15 minutes
     * - one hour
     * - 12 hours
     *
     * @param stream
     */
    @StreamListener(SensorMeasurementBinding.ERROR_SCORE_IN)
    public void process(KStream<String, SensorMeasurement> stream) {
        Map<String, String> topicConfig = new HashMap<>();
        topicConfig.put(RETENTION_MS, retention);
        topicConfig.put(CLEANUP_POLICY, "delete");
        log.info("Changelog and local window store retention.ms: {} and cleanup.policy: {}",
                topicConfig.get(RETENTION_MS),
                topicConfig.get(CLEANUP_POLICY));
        createWindowStore(LocalStore.TWO_MINUTES_STORE, topicConfig, stream);
        createWindowStore(LocalStore.FIFTEEN_MINUTES_STORE, topicConfig, stream);
        createWindowStore(LocalStore.ONE_HOUR_STORE, topicConfig, stream);
        createWindowStore(LocalStore.TWELVE_HOURS_STORE, topicConfig, stream);
    }

    private void createWindowStore(
            LocalStore localStore,
            Map<String, String> topicConfig,
            KStream<String, SensorMeasurement> stream) {
        // Configure how the state store should be materialized using the provided store name
        Materialized<String, ErrorScore, WindowStore<Bytes, byte[]>> materialized = Materialized
                .as(localStore.getStoreName());
        // Set retention of the changelog topic
        materialized.withLoggingEnabled(topicConfig);
        // Configure what the windows look like and how long data will be retained in local stores
        TimeWindows configuredTimeWindows = getConfiguredTimeWindows(
                localStore.getTimeUnit(), Long.parseLong(topicConfig.get(RETENTION_MS)));
        // Processing description:
        // The input data are 'samples' with key <installationId>:<assetId>:<modelInstanceId>:<algorithmName>
        // 1. With map we add the tag to the key and extract the error score from the data
        // 2. With groupByKey we group the data on the new key
        // 3. With windowedBy we split up the data in time intervals depending on the provided LocalStore enum
        // 4. With reduce we determine the maximum value in the time window
        // 5. Materialized will store the result in a table
        stream
                .map(getInstallationAssetModelAlgorithmTagKeyMapper())
                .groupByKey()
                .windowedBy(configuredTimeWindows)
                .reduce((aggValue, newValue) -> getMaxErrorScore(aggValue, newValue), materialized);
    }

    private TimeWindows getConfiguredTimeWindows(long windowSizeMs, long retentionMs) {
        TimeWindows timeWindows = TimeWindows.of(windowSizeMs);
        timeWindows.until(retentionMs);
        return timeWindows;
    }

    /**
     * Determine the max error score to keep by looking at the aggregated error signal and
     * the freshly consumed error signal
     *
     * @param aggValue
     * @param newValue
     * @return
     */
    private ErrorScore getMaxErrorScore(ErrorScore aggValue, ErrorScore newValue) {
        if (aggValue.getErrorSignal() > newValue.getErrorSignal()) {
            return aggValue;
        }
        return newValue;
    }

    private KeyValueMapper<String, SensorMeasurement,
            KeyValue<? extends String, ? extends ErrorScore>> getInstallationAssetModelAlgorithmTagKeyMapper() {
        return (s, sensorMeasurement) -> new KeyValue<>(s + "::" + sensorMeasurement.getT(),
                new ErrorScore(sensorMeasurement.getTs(), sensorMeasurement.getE(), sensorMeasurement.getO()));
    }
}
So we are materializing aggregated data to four different stores after determining the max value within a specific window for a specific key.
Please note that retention is set to two months of data and the cleanup policy is delete; we don't compact data.
The size of the individual state stores on disk is between 14 and 20 GB.
We are making use of Interactive Queries: https://docs.confluent.io/current/streams/developer-guide/interactive-queries.html#interactive-queries
In our setup we have 4 instances of our streaming app acting as one consumer group, so every instance stores a specific part of all the data in its store.
This all seems to work nicely, until we restart one or more instances and wait for them to become available again. I would expect the restart of the app not to take that long, but unfortunately it takes up to 1 hour. I guess the issue is caused by the amount of data in combination with restoring the state stores, but I'm not sure. I would have expected that, since we persist the state store data on persistent volumes outside of the container that runs on Kubernetes, the app would receive the last offset from the broker and only have to continue from that point, as the previously consumed data is already in the state store. Unfortunately I don't have a clue how to resolve this.
Restarting our app triggers a restore task:
-StreamThread-2] Restoring task 4_3's state store twelve-hours-error-score from beginning of the changelog anomaly-timeline-twelve-hours-error-score-changelog-3.
This process takes quite a while. Why is it restoring from the beginning, and why does it take so long? I do have auto.offset.reset set to "earliest", but that is only used when the offset is unknown, isn't it?
Here are my stream settings. Note that cache.max.bytes.buffering is set to 0; I changed this, but that didn't make a difference. I also read about a bug with num.stream.threads where a value > 1 causes issues, but setting it to 1 doesn't improve restart speed either.
2019-03-05 13:44:53,360 INFO main org.apache.kafka.common.config.AbstractConfig StreamsConfig values:
application.id = anomaly-timeline
application.server = localhost:5000
bootstrap.servers = [localhost:9095]
buffered.records.per.partition = 1000
cache.max.bytes.buffering = 0
client.id =
commit.interval.ms = 500
connections.max.idle.ms = 540000
default.deserialization.exception.handler = class org.apache.kafka.streams.errors.LogAndFailExceptionHandler
default.key.serde = class org.apache.kafka.common.serialization.Serdes$StringSerde
default.production.exception.handler = class org.apache.kafka.streams.errors.DefaultProductionExceptionHandler
default.timestamp.extractor = class errorscore.raw.boundary.ErrorScoreTimestampExtractor
default.value.serde = class errorscore.raw.boundary.ErrorScoreSerde
metadata.max.age.ms = 300000
metric.reporters = []
metrics.num.samples = 2
metrics.recording.level = INFO
metrics.sample.window.ms = 30000
num.standby.replicas = 1
num.stream.threads = 2
partition.grouper = class org.apache.kafka.streams.processor.DefaultPartitionGrouper
poll.ms = 100
processing.guarantee = at_least_once
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
replication.factor = 1
request.timeout.ms = 40000
retries = 0
retry.backoff.ms = 100
rocksdb.config.setter = null
security.protocol = PLAINTEXT
send.buffer.bytes = 131072
state.cleanup.delay.ms = 600000
state.dir = ./state-store
topology.optimization = none
upgrade.from = null
windowstore.changelog.additional.retention.ms = 86400000
It also logs these messages after a while:
CleanupThread] Deleting obsolete state directory 1_1 for task 1_1 as 1188421ms has elapsed (cleanup delay is 600000ms).
Also something to note: I added the following code to override the default cleanup on start and stop, where the stores are deleted by default:
#Bean
public CleanupConfig cleanupConfig() {
return new CleanupConfig(false, false);
}
Any help would be appreciated!
We think we solved the issue. The different instances each got their own persistent volume. When restarting the instances, it seems that some, or sometimes all, instances got linked to other persistent volumes instead of the ones they were previously using. This caused the state stores to become obsolete and the restoration process to kick in. We solved this by using NFS to share the persistent volumes so that all instances point to the same state store directory structure. This seems to have solved the issue.

Flink Multiple Windows on same data

My Flink application does the following:
source: read data in the form of records from Kafka
split: based on certain criteria
window: a time window of 10 seconds to aggregate into one bulk record
sink: dump these bulk records to Elasticsearch
I am facing an issue where the Flink consumer is not able to hold data for 10 seconds and throws the following exception:
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: Size of the state is larger than the maximum permitted memory-backed state. Size=18340663 , maxSize=5242880
I cannot apply a countWindow, because if the frequency of records is too low, the Elasticsearch sink might be deferred for a long time.
My question:
Is it possible to apply an OR of a TimeWindow and a CountWindow, along the lines of:
> if (recordCount is 500 OR 10 seconds have elapsed)
> then dump data to the sink
Not directly, but you can use a GlobalWindow with custom triggering logic. Take a look at the source for the count trigger here.
Your triggering logic will look something like this.
private final ReducingStateDescriptor<Long> stateDesc =
        new ReducingStateDescriptor<>("count", new Sum(), LongSerializer.INSTANCE);

private long triggerTimestamp = 0;

@Override
public TriggerResult onElement(String element, long l, GlobalWindow globalWindow, TriggerContext triggerContext) throws Exception {
    ReducingState<Long> count = triggerContext.getPartitionedState(stateDesc);
    // Increment the window counter by one when an element is received
    count.add(1L);
    // Start the timer when the first packet is received
    if (count.get() == 1) {
        // Trigger at 10 seconds from reception of the first event
        triggerTimestamp = triggerContext.getCurrentProcessingTime() + 10000;
        // Override the onProcessingTime method to fire the window at this time
        triggerContext.registerProcessingTimeTimer(triggerTimestamp);
    }
    // Or trigger the window when the number of packets in the window reaches 500
    if (count.get() >= 500) {
        // Delete the timer, clear the count and fire the window
        triggerContext.deleteProcessingTimeTimer(triggerTimestamp);
        count.clear();
        return TriggerResult.FIRE;
    }
    return TriggerResult.CONTINUE;
}
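The snippet above registers a processing-time timer but omits the remaining Trigger callbacks; a rough sketch of them, reusing the stateDesc and triggerTimestamp fields above (signatures follow Flink's Trigger interface), might look like this:
@Override
public TriggerResult onProcessingTime(long time, GlobalWindow window, TriggerContext triggerContext) throws Exception {
    // The 10-second timer fired: reset the counter and emit the window contents
    triggerContext.getPartitionedState(stateDesc).clear();
    return TriggerResult.FIRE;
}

@Override
public TriggerResult onEventTime(long time, GlobalWindow window, TriggerContext triggerContext) {
    // This trigger is purely processing-time based, so event-time timers are ignored
    return TriggerResult.CONTINUE;
}

@Override
public void clear(GlobalWindow window, TriggerContext triggerContext) throws Exception {
    // Clean up the counter and any pending timer when the window is purged
    triggerContext.deleteProcessingTimeTimer(triggerTimestamp);
    triggerContext.getPartitionedState(stateDesc).clear();
}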
You could also use the RocksDB state backend, but a custom Trigger will perform better.
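For reference, a minimal sketch of switching to the RocksDB state backend, which keeps window state on disk and so avoids the roughly 5 MB cap of the default memory-backed state (assuming the flink-statebackend-rocksdb dependency is on the classpath; the class name, checkpoint path, and job name are illustrative):
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class StreamingJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Spill operator and window state to RocksDB instead of holding it in memory
        env.setStateBackend(new RocksDBStateBackend("file:///tmp/flink-checkpoints"));
        // ... build the Kafka source, windowing, and Elasticsearch sink on env ...
        env.execute("bulk-record-job");
    }
}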

How to enumerate all partitions and aggregate results

I have a multiple partitioned stateful service. How can I enumerate all its partitions and aggregate results, using service remoting for communication between client and service?
You can enumerate the partitions using FabricClient:
var serviceName = new Uri("fabric:/MyApp/MyService");
using (var client = new FabricClient())
{
    var partitions = await client.QueryManager.GetPartitionListAsync(serviceName);
    foreach (var partition in partitions)
    {
        Debug.Assert(partition.PartitionInformation.Kind == ServicePartitionKind.Int64Range);
        var partitionInformation = (Int64RangePartitionInformation)partition.PartitionInformation;
        var proxy = ServiceProxy.Create<IMyService>(serviceName, new ServicePartitionKey(partitionInformation.LowKey));
        // TODO: call service
    }
}
Note that you should probably cache the results of GetPartitionListAsync since service partitions cannot be changed without recreating the service (you can just keep a list of the LowKey values).
In addition, FabricClient should also be shared as much as possible (see the documentation).