xmpp pubsub understanding

The subscriber will only receive content published from the moment it subscribes to a node; anything the publisher published before that will not be delivered. Is this correct? What do I need to do so that a subscriber receives all previously published content?

You can configure your nodes to be persistent or transient. According to the specification (XEP-0060):
Whether the node is persistent or transient is determined by the "pubsub#persist_items" configuration field.
However, your pubsub service (or server) might be configured to ignore persistence of events. (If you're using Openfire, I think there is a configurable limit for the maximum total size of stored items)
As I know you are using smackx-pubsub, here's some code:
// create a new node with the desired configuration
pubSubManager.createNode(nodeId, newConfigureForm(persistent, includePayload, accessModel));
// change the configuration of an existing node
node.sendConfigurationForm(newConfigureForm(persistent, includePayload, accessModel));

private ConfigureForm newConfigureForm(final boolean persistent, final boolean includePayload, final AccessModel accessModel) {
    final ConfigureForm form = new ConfigureForm(FormType.submit);
    form.setPersistentItems(persistent);    // maps to pubsub#persist_items
    form.setDeliverPayloads(includePayload);
    form.setAccessModel(accessModel);
    return form;
}
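With persistence enabled, a newly subscribed client can then explicitly fetch the items that were published before it subscribed. A minimal sketch with the same smackx-pubsub API (assuming nodeId refers to the persistent node configured above; newer Smack versions use getLeafNode instead of getNode):
// retrieve the node and explicitly fetch its persisted items
LeafNode node = pubSubManager.getNode(nodeId);
List<Item> previousItems = node.getItems();
for (Item item : previousItems) {
    System.out.println("old item: " + item.toXML());
}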
PS: Can you tell me why I get the feeling that we're doing a kind of pair programming here? ;)

Related

Unable to retrieve metadata for a state store key in Kafka Streams

I'm trying to use Kafka Streams with a state store distributed among two instances. Here's how the store and the associated KTable are defined:
KTable<String, Double> userBalancesTable = kStreamBuilder.table(
        "balances-table",
        Consumed.with(String(), Double()),
        Materialized.<String, Double, KeyValueStore<Bytes, byte[]>>as(BALANCES_STORE)
                .withKeySerde(String())
                .withValueSerde(Double()));
Next, I have some stream processing logic which aggregates data into this balances-table KTable:
transactionsStream
        .leftJoin(...)
        ...
        .aggregate(...)
        .to("balances-table", Produced.with(String(), Double()));
And at some point I am, from a REST handler, querying the state store.
ReadOnlyKeyValueStore<String, Double> balances = streams.store(BALANCES_STORE, QueryableStoreTypes.<String, Double>keyValueStore());
return Optional.ofNullable(balances.get(userId)).orElse(0.0);
Which works perfectly - as long as I have a single stream processing instance.
Now, I'm adding a second instance (note: my topics all have 2 partitions). As explained in the docs, the state store BALANCES_STORE is distributed among the instances based on the key of each record (in my case, the key is a user ID). Therefore, an instance must:
Make a call to KafkaStreams#metadataForKey to discover which instance is handling the part of the state store containing the key it wants to retrieve
Make an RPC (e.g. REST) call to that instance to retrieve the value, as sketched below
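In code, the lookup pattern I'm following is roughly this (a sketch; isThisInstance, localBalanceLookup and remoteBalanceLookup are placeholder helpers of mine, not Kafka Streams API):
StreamsMetadata metadata = streams.metadataForKey(
        BALANCES_STORE, userId, Serdes.String().serializer());
if (metadata == null || StreamsMetadata.NOT_AVAILABLE.equals(metadata)) {
    // this is the case I keep hitting: no metadata for the key
    throw new IllegalStateException("No metadata for key " + userId);
}
if (isThisInstance(metadata.hostInfo())) {
    // the key lives in the local shard of BALANCES_STORE
    return localBalanceLookup(userId);
} else {
    // forward the query to the instance that owns the key
    return remoteBalanceLookup(metadata.hostInfo().host(), metadata.hostInfo().port(), userId);
}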
My problem is that the call to KafkaStreams#metadataForKey always returns a null metadata object, while KafkaStreams#allMetadataForStore() returns a metadata object containing both instances. So it behaves as if it doesn't know about the key I'm querying, even though looking the key up directly in the state store works.
Additional note: my application.server property is correctly set.
Thank you!

Querying a list of Actors in Azure Service Fabric

I currently have a ReliableActor for every user in the system. This actor is appropriately named User, and for the sake of this question has a Location property. What would be the recommended approach for querying Users by Location?
My current thought is to create a ReliableService that contains a ReliableDictionary. The data in the dictionary would be a projection of the User data. If I did that, then I would need to:
Query the dictionary. After GA, this seems like the recommended approach.
Keep the dictionary in sync. Perhaps through Pub/Sub or IActorEvents.
Another alternative would be to have a persistent store outside Service Fabric, such as a database. This feels wrong, as it goes against some of the ideals of using the Service Fabric. If I did, I would assume something similar to the above but using a Stateless service?
Thank you very much.
I'm personally exploring the use of Actors as the main datastore (i.e. the source of truth) for my entities. As Actors are added, updated, or deleted, I use MassTransit to publish events. I then have Reliable Stateful Services subscribed to these events; they receive the events and update their internal IReliableDictionary instances. The services can then be queried to find the entities required by the client. Each service keeps only the entity data it needs to perform its queries.
I'm also exploring the use of EventStore to publish the events as well. That way, if in the future I decide I need to query the entities in a new way, I could create a new service and replay all the events to it.
These Pub/Sub methods do mean the query services are only eventually consistent, but in a distributed system, this seems to be the norm.
While the standard recommendation is definitely Vaclav's response, if querying is the exception then Actors could still be appropriate. For me, whether they're suitable is defined by the normal way of accessing them: if it's by key (for a user record it presumably would be), then Actors work well.
It is possible to iterate over Actors, but it's quite a heavy task, so as I say it is only appropriate in the exceptional case. The following code builds up a collection of Actor references; you can then iterate over this collection to fetch the actors and use LINQ or similar on it.
var actors = new List<ActorInformation>();
ContinuationToken continuationToken = null;
var actorServiceProxy = ActorServiceProxy.Create(new Uri("fabric:/MyActorApp/MyActorService"), partitionKey);
do
{
    // page through the actors in this partition
    var queryResult = actorServiceProxy.GetActorsAsync(continuationToken, cancellationToken).GetAwaiter().GetResult();
    actors.AddRange(queryResult.Items);
    continuationToken = queryResult.ContinuationToken;
} while (continuationToken != null);
TL;DR: It's not always advisable to query over actors, but it can be achieved if required. The code above will get you started.
If you find yourself needing to query across a data set by some data property, like User.Location, then Reliable Collections are the right answer. Reliable Actors are not meant to be queried over this way.
In your case, a user could simply be a row in a Reliable Dictionary.

Camel-JPA: mark as entity as consumed (without updating table record)

I am stuck (again ;-) with a JPA-related problem and hope that someone here can help (Camel in Action couldn't...):
I consume from a JPA endpoint using a namedQuery.
I cannot delete consumed entries, thus I am using the "consumeDelete=false" option.
But how can I prevent reading the same entry multiple times?
I am aware of the @Consumed annotation, but since I am not allowed to modify/update the original database entries, I haven't figured out how I can mark an entry as "consumed"...
Any ideas?
thanks,
M
If you cannot change the data in the database in any way to reflect that you have already consumed a record, then you need to "store" this information elsewhere.
You can use the Idempotent Consumer EIP pattern:
http://camel.apache.org/idempotent-consumer.html
You would then use memory, a file, or another database/table to store the ids of already-consumed messages, and plug that store into the idempotent consumer pattern; see the sketch below.
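A minimal sketch of such a route (the entity class, query name, and the ${body.id} key expression are illustrative assumptions):
import java.io.File;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.processor.idempotent.FileIdempotentRepository;

public class ConsumeOnceRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("jpa://com.example.MyEntity?namedQuery=myNamedQuery&consumeDelete=false")
            // remember the ids of already-consumed entities in a local file,
            // so the source table itself is never modified
            .idempotentConsumer(simple("${body.id}"),
                FileIdempotentRepository.fileIdempotentRepository(new File("consumed-ids.dat")))
            .to("direct:process");
    }
}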
You can use @Consumed on a method with any name in your entity class and change the value of a field to your desired value, for example:
@Consumed
public void updateConsumedStatus() {
    this.status = false;
}
Please check here too
http://camel.apache.org/jpa.html

Is it a bad practice to return an object in a POST via Web Api?

I'm using Web Api and have a scenario where clients are sending a heartbeat notification every n seconds. There is a heartbeat object which is sent in a POST rather than a PUT, because as I see it they are creating a new heartbeat rather than updating an existing heartbeat.
Additionally, the clients have a requirement that calls for them to retrieve all of the other currently online clients and the number of unread messages that individual client has. It seems to me that I have two options:
Perform the POST followed by a GET, which to me seems cleaner from a pure REST standpoint. I am doing a creation and a retrieval and I think the SOLID principles would prefer to split them accordingly. However, this approach means two round trips.
Have the POST return an object which contains the same information that the GET would otherwise have done. This consolidates everything into a single request, but I'm concerned that this approach would be considered ill-advised. It's not a pure POST.
Option #2 stubbed out looks like this:
public HeartbeatEcho Post(Heartbeat heartbeat)
{
}
HeartbeatEcho is a class which contains properties for the other online clients and the number of unread messages.
Web Api certainly supports option #2, but just because I can do something doesn't mean I should. Is option #2 an abomination, premature optimization, or pragmatism?
Option 2 is not an abomination at all. A POST request creates a new resource, but it's quite common for the resource itself to be returned to the caller. For example, if your resources are items in a database (e.g. a Person), the POST request would send the members required for the INSERT operation (e.g. name, age, address), and the response would contain a Person object which, in addition to the parameters passed as input, would also have an identifier (the DB primary key) that can be used to uniquely identify the object.
Notice that it's also perfectly valid for the POST request to return only the id of the newly created resource, but that's a choice you have, depending on the requirements of the client.
public HttpResponseMessage Post(Person p)
{
    var id = InsertPersonInDBAndReturnId(p);
    p.Id = id;
    var result = this.Request.CreateResponse(HttpStatusCode.Created, p);
    result.Headers.Location = new Uri("the location for the newly created resource");
    return result;
}
Whichever way solves your business problem will work. You're correct: POST for a new record vs. PUT for an update to an existing record.
SUGGESTION:
One thing you may want to consider is adding Redis to your stack. The apps can POST very fast, and then you could use the Pub/Sub functionality for the echo part, or BLPOP (a blocking list pop that waits until an element arrives). It's super fast, may help you scale, and is well suited to what you are trying to do.
See: http://redis.io/topics/pubsub/
See: http://redis.io/commands/blpop
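For illustration, the Pub/Sub echo could look roughly like this with the Jedis Java client (host, port, channel name, and payload are made up):
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPubSub;

public class HeartbeatPubSub {

    // publisher side: called after the heartbeat POST has been handled
    static void publishEcho(String clientId) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.publish("heartbeats", clientId + " is online");
        }
    }

    // subscriber side: runs in its own thread or process, since
    // subscribe() blocks and dispatches incoming messages to onMessage()
    static void listenForEchoes() {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            jedis.subscribe(new JedisPubSub() {
                @Override
                public void onMessage(String channel, String message) {
                    System.out.println(channel + ": " + message);
                }
            }, "heartbeats");
        }
    }
}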
I've used Redis for similar scenarios, but also RabbitMQ; with RabbitMQ we added a socket.io connection to "stream" the heartbeat in real time without the need for long polling.

Remove read data for authenticated user?

My requirement in DDS is this: I have many subscribers but a single publisher. Each subscriber reads data from DDS and checks whether the message is intended for that particular subscriber. Only if that check succeeds does it take the data and remove it from DDS. A message must remain in DDS until the authenticated subscriber takes its data. How can I achieve this with DDS (in a Java environment)?
First of all, you should be aware that with DDS, a Subscriber is never able to remove data from the global data space. Every Subscriber has its own cached copy of the distributed data and can only act on that copy. If one Subscriber takes data, then other Subscribers for the same Topic will not be influenced by that in any way. Only Publishers can remove data globally for every Subscriber. From your question, it is not clear whether you know this.
Independent of that, it seems like the use of a ContentFilteredTopic (CFT) is suitable here. According to the description, the Subscriber knows the file name that it is looking for. With a CFT, the Subscriber can indicate that it is only interested in samples that have a particular value for the file_name attribute. The infrastructure will take care of the filtering process and will ensure that the Subscriber will not receive any data with a different value for the attribute file_name. As a consequence, any take() action done on the DataReader will contain relevant information and there is no need to check the data first and then take it.
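For illustration, creating and using a CFT in Java looks roughly like this (the exact API differs slightly between DDS vendors; the topic, type, and file_name values here are assumptions):
// parameter list for the filter expression
StringSeq params = new StringSeq();
params.add("'expected-file.txt'");

// create a filtered view of the existing Topic: only samples whose
// file_name field matches the parameter reach DataReaders on this view
ContentFilteredTopic filteredTopic = participant.create_contentfilteredtopic(
        "MyTopic_Filtered", topic, "file_name = %0", params);

// a DataReader created on the filtered topic never sees non-matching
// samples, so take() only ever returns relevant data
DataReader reader = subscriber.create_datareader(
        filteredTopic,
        Subscriber.DATAREADER_QOS_DEFAULT,
        null,  // no listener
        StatusKind.STATUS_MASK_NONE);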
The API documentation should contain more detailed information about how to use a ContentFilteredTopic.