Why does the ejabberd MUC online room table show only one node when the cluster has 2 nodes? - xmpp

I have a cluster with 2 instances of ejabberd.
I created a few rooms through ejabberdctl with persistence set to true, and MySQL shows the data copied below.
mysql> select * from muc_online_room;
+---------+----------------------------+----------------------+-----------+
| name    | host                       | node                 | pid       |
+---------+----------------------------+----------------------+-----------+
| Group 1 | conference.xmpp.myapps.org | ejabberd1@instance-1 | <0.158.0> |
| Group 2 | conference.xmpp.myapps.org | ejabberd1@instance-1 | <0.125.0> |
+---------+----------------------------+----------------------+-----------+
What is the relevance of the node column in the muc_online_room table?
I have some issues with receiving and sending messages even though the user was able to join the room. I am wondering if it is because the user with issues is connected to the other node (ejabberd2@instance-2) in the cluster.
The cluster sits behind HAProxy.

Each ejabberd node has its own MUC service (implemented in mod_muc.erl), and each individual MUC room (implemented in mod_muc_room.erl) is handled by one Erlang process that lives on a single ejabberd node. That single Erlang process handling a MUC room is reachable by the MUC services of all the ejabberd nodes in the cluster, thanks to the routing information provided in the muc_online_room table.
I built a cluster of two ejabberd nodes using the internal Mnesia database. I logged in to an account on the first node and created two rooms. Then I logged in to the same account, but on the second node, and joined another room.
The muc_online_room as seen in the first node is now:
ets:tab2list(muc_online_room).
[{muc_online_room, {<<"sala1">>,<<"conf.localhost">>}, <0.564.0>},
{muc_online_room,{<<"sala2b">>,<<"conf.localhost">>}, <29502.1430.0>},
{muc_online_room,{<<"sala3c">>,<<"conf.localhost">>}, <0.977.0>}]
As we can see, the rooms sala1 and sala3c are right now alive in the first node (the pid starts with 0.), and sala2b is alive in the other node (the pid starts with something else).
The table, as seen in the second node:
ets:tab2list(muc_online_room).
[{muc_online_room,{<<"sala1">>,<<"conf.localhost">>}, <29512.564.0>},
{muc_online_room,{<<"sala2b">>,<<"conf.localhost">>}, <0.1430.0>},
{muc_online_room,{<<"sala3c">>,<<"conf.localhost">>}, <29512.977.0>}]
In this second node lives the sala2b room (its pid starts with 0.), and the other two rooms are alive in the other node.
The rooms live in the node where the MUC service exists when the user joins that room.
Now I stop both ejabberd nodes, and start only the first node. All the rooms are started in that node, and the table shows:
ets:tab2list(muc_online_room).
[{muc_online_room,{<<"sala1">>,<<"conf.localhost">>}, <0.472.0>},
{muc_online_room,{<<"sala2b">>,<<"conf.localhost">>}, <0.470.0>},
{muc_online_room,{<<"sala3c">>,<<"conf.localhost">>}, <0.471.0>}]
You are using SQL storage; in that case there is only one database and only one muc_online_room table, which is why it has a column named node. This indicates on which ejabberd node the MUC room's Erlang process is alive, together with its pid.
If you join a new room in the MUC service of the second ejabberd node, you will see that the new room is alive in the second node. If you then stop both nodes and start only one of them, all the rooms will be started in that node.
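For example, sticking with the SQL schema shown in the question, you can check directly from MySQL how the room processes are spread over the cluster:

-- How many online room processes live on each ejabberd node
SELECT node, COUNT(*) AS rooms
FROM muc_online_room
GROUP BY node;

-- Which node hosts one particular room
SELECT node, pid
FROM muc_online_room
WHERE name = 'Group 1' AND host = 'conference.xmpp.myapps.org';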

Related

Vespa.ai storage 0 down

I recently started using Vespa. I deployed a cluster on Kubernetes and indexed some data, but today one of the storage nodes shows as down in "vespa-get-cluster-state":
[vespa#vespa-0 /]$ vespa-get-cluster-state
Cluster feature:
feature/storage/0: down
feature/storage/1: up
feature/storage/2: up
feature/distributor/0: up
feature/distributor/1: up
feature/distributor/2: up
I don't know what this storage is... this cluster has 2 content nodes, 2 container nodes and 1 master.
How do I see the logs and diagnose why this node is down?
Just a tip: this question would work better on the Vespa Slack, or as a GitHub issue.
According to the output you shared, you have 3 content nodes (each has a "storage" service, responsible for storing the data, and a "distributor" service, responsible for managing a subset of the data space). The reason the node is down is not included in this output, but you can find it by running vespa-logfmt -l warning,error on your admin node.

Issues with failback for logical replication using postgres

I am using PostgreSQL 13 to set up logical replication. When the original active node (A) becomes the secondary, the prior secondary (B), which is now the active node, needs to sync its data back to node A.
To summarize the issue:
Node A is active and fails at some point. Node B takes over, is now running as the active node, and is accepting I/O from the application. When Node A recovers from the failure and is ready to become active again, it needs to obtain the data that may have been added while it was down. To get this data, Node A creates a subscription to Node B, which now acts as the publisher. The issue is that this subscription on Node A fails, because Node A already had some data before it went down, and this data results in conflicts.
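For concreteness, here is a minimal sketch of the failback subscription described above (PostgreSQL 13 syntax; all object names and connection details are placeholders):

-- On Node B (currently active), publish the tables Node A needs to catch up on:
CREATE PUBLICATION failback_pub FOR ALL TABLES;

-- On Node A (recovering), subscribe to Node B:
CREATE SUBSCRIPTION failback_sub
    CONNECTION 'host=node_b dbname=appdb user=repl_user password=secret'
    PUBLICATION failback_pub
    WITH (copy_data = true);
-- copy_data defaults to true, so the initial sync re-copies whole tables into Node A,
-- which is typically where the conflicts with Node A's pre-existing rows come from;
-- copy_data = false skips that initial copy, but then the rows written on B while
-- A was down have to be reconciled some other way.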
So what are my options here?

How to stop the entire Akka Cluster (Sharding)

How to stop the ENTIRE cluster with sharding (spanning multiple machines - nodes) from one actor?
I know I can stop the actor system on 'this' node with context.system.terminate()
I know I can stop the local Sharding Region.
I found .prepareForFullClusterShutdown() but it doesn't actually stop the nodes.
I suppose there is no single command to do that, but there must be some way to do this.
There's no out-of-the-box way to do this that I'm aware of: the overall expectation is that there's an external control plane (e.g. kubernetes) which manages this.
However, one could have an actor on every node of the cluster that listens for membership events and also subscribes to a pubsub topic. This actor would track the current cluster membership and, when told to begin a cluster shutdown, it publishes a (e.g.) ShutdownCluster message to the topic and tracks which nodes leave. After some length of time (since distributed pubsub is at-most-once) if there are nodes besides this one that haven't left, it sends it again. Eventually, after all other nodes in the cluster have left, this actor then shuts down its node. When other nodes see a ShutdownCluster message, they immediately shut themselves down.
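A rough sketch of that idea (Akka 2.6 classic actors, using the akka-cluster-tools DistributedPubSub module; the message names, topic name and timings below are made up for illustration and are not part of Akka):

import akka.actor.{Actor, Address, Props, Timers}
import akka.cluster.Cluster
import akka.cluster.ClusterEvent._
import akka.cluster.pubsub.DistributedPubSub
import akka.cluster.pubsub.DistributedPubSubMediator.{Publish, Subscribe}
import scala.concurrent.duration._

object ClusterShutdownCoordinator {
  case object BeginClusterShutdown // tell one node's coordinator to shut the whole cluster down
  case object ShutdownCluster      // published to every node via distributed pub-sub
  private case object Retry        // internal re-publish timer tick
  def props: Props = Props(new ClusterShutdownCoordinator)
}

// Run one instance of this actor on every node.
class ClusterShutdownCoordinator extends Actor with Timers {
  import ClusterShutdownCoordinator._

  private val cluster  = Cluster(context.system)
  private val mediator = DistributedPubSub(context.system).mediator
  private var others   = Set.empty[Address] // cluster members other than this node
  private var draining = false              // true on the node coordinating the shutdown

  override def preStart(): Unit = {
    cluster.subscribe(self, InitialStateAsEvents, classOf[MemberEvent])
    mediator ! Subscribe("cluster-shutdown", self)
  }

  def receive: Receive = {
    case MemberUp(m) if m.address != cluster.selfAddress =>
      others += m.address
    case MemberRemoved(m, _) =>
      others -= m.address
      // Once every other node has left, the coordinating node leaves too.
      if (draining && others.isEmpty) cluster.leave(cluster.selfAddress)
    case BeginClusterShutdown =>
      draining = true
      mediator ! Publish("cluster-shutdown", ShutdownCluster)
      // Distributed pub-sub is at-most-once, so keep re-publishing until everyone is gone.
      timers.startTimerWithFixedDelay("retry", Retry, 5.seconds)
    case Retry if draining && others.nonEmpty =>
      mediator ! Publish("cluster-shutdown", ShutdownCluster)
    case ShutdownCluster if !draining =>
      cluster.leave(cluster.selfAddress) // non-coordinating nodes leave immediately
    case _ => // ignore SubscribeAck, other membership events, stale timer ticks
  }
}

Using cluster.leave (rather than calling context.system.terminate() directly) lets CoordinatedShutdown run, which hands off shards and then, with the default settings, terminates the actor system on that node once the member has left.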
Of course, this sort of scheme will probably not play nicely with any form of external orchestration (whether it's a container scheduler like kubernetes, mesos, or nomad; or even something simple like monit which notices that the service isn't running and restarts it).

Fail back from slave to master in Artemis colocated HA config is not working

I have a 4-node Artemis 2.10 cluster on Linux configured for an async IO journal, replication and colocated HA servers. I have been testing failover and failback, but it's not working. When I shut down one server (A) in an HA pair, the colocated backup on the second server (B) activates and correctly processes messages intended for the original server A. I modified the ColocatedFailoverExample from the Artemis examples to check this, and it works. The problem is that when I bring the original server A back up it starts, becomes live, registers acceptors and addresses, and joins the cluster, but a number of things are wrong:
Looking at the Artemis console for server A, there is no longer a colocated_backup_1 listed to show that it is providing a colocated backup to server B.
Server A coming back up causes the server that was failed over to, server B, to go totally offline and only function as a backup. The master it was providing stops and no longer displays addresses or acceptors in the UI.
Although it says it's running as a backup, server B doesn't have the colocated_backup_1 object shown in its console either.
Server B still seems to be part of the cluster, but in the UI there is no green master node shown for it anymore - just a red slave node circle. Client connections to server B fail, most likely because the colocated master that was running on it was shut down.
In the Artemis UI for server B, under node-name > cluster-connections > cluster-name, the attributes for the cluster show no nodes in the node array and the node id is wrong. The node id is now the same as the id of the master broker on server A. It's almost as if the information for the colocated_backup_01 broker on server B that was running before failover has replaced the server B live server, and there's now only one broker on server B - the colocated backup.
This all happens immediately when I bring up server A. The UI for server B immediately refreshes at that time and the colocated_backup_01 entry disappears and the acceptors and addresses links under what was the master broker name for server B just disappear. The cluster attributes page will refresh and the 3 nodes that were listed there in the "nodes" attribute disappear and the "nodes" attribute is empty.
Now, if I take down server B instead and bring it back up, the roles between the server pair are swapped: server B becomes live again and is shown as a master node in the topology (but still with no colocated_backup_01 in the UI), while the server A master broker goes offline and server A reconfigures itself as a backup/slave node. Whichever of server A or B is in this "offline", backup-only broker state, the value of the Node property in the cluster attributes shown in the UI is the same for both. Prior to the failover test they had different node ids, which makes sense, although the colocated_backup_01 backup on each did share the node id of the node it was backing up.
To summarize what I think is happening: the master that comes back up after failover seems to trigger its partner backup node to come up as a backup, but also to stop being a master node itself. From that point on, the pair colocation stops and there is only ever one live master between the two servers instead of one on each. The fail-back feature seems to be not only failing the original master back but also shutting down the colocated master on that backup. It is almost as if the topology between the two was configured to be colocated, but it is being treated as the standard two-node HA config where one server is the master and one is the slave.
The only way to fix the issue with the pair is to stop both servers and remove everything under the broker "data" directory on both boxes before starting them again. Just removing the colocated backup files on each machine isn't enough - everything under "data" has to go. After doing this they come up correctly and both are live masters and they pair up as HA colocated backups for each other again.
Here is the ha-policy section of the broker.xml file which is the same for all 4 servers:
<ha-policy>
  <replication>
    <colocated>
      <backup-request-retries>5</backup-request-retries>
      <backup-request-retry-interval>5000</backup-request-retry-interval>
      <max-backups>1</max-backups>
      <request-backup>true</request-backup>
      <backup-port-offset>20</backup-port-offset>
      <master>
        <check-for-live-server>true</check-for-live-server>
        <vote-on-replication-failure>true</vote-on-replication-failure>
      </master>
      <slave>
        <max-saved-replicated-journals-size>-1</max-saved-replicated-journals-size>
        <allow-failback>true</allow-failback>
        <vote-on-replication-failure>true</vote-on-replication-failure>
        <restart-backup>false</restart-backup>
      </slave>
    </colocated>
  </replication>
</ha-policy>

Configuring replica set in a multi-data center

We have the following multi-data-center scenario:
Node1 --- Node3
  |         |
  |         |
  |         |
Node2     Node4
Node1 and Node3 form a (sort of) replica set (for high availability).
Node2 and Node4 are priority 0 members (they should never become primaries; they are solely for read purposes).
Caveat: what is the best way to design such a setup, given that Node2 and Node4 are not accessible to one another because of the way we configured our VPN/firewalls, essentially ruling out any heartbeat between Node2 and Node4?
Thanks Much
Here's what I have in mind:
Don't keep an even number of voting members in a set. So you either need an arbiter, or you should set one of node2/node4 to be a non-voting member.
I'm using the C# driver, so I'm not sure whether you are using the same technology to build your application. Anyway, it turns out the C# driver obtains a complete list of available servers from the seeds (the servers you provide in the connection string) and tries to load-balance requests across all of them. In your situation, I guess you would have application servers running in all 3 data centers. However, you probably don't want, for example, node 1 to accept connections from a different data center; that would significantly slow down the application. So you need some further settings:
Set node 3/4 to hidden nodes.
For applications running in the same data center as node 3/4, don't configure the replicaSet parameter in the connection string, but do set readPreference=secondary. If you need to write, you'll have to use another connection string that points to the primary node.
If you make the votes of 2 and 4 also 0, then in a failover it should act as though only 1 and 3 are eligible. If you set them to hidden you have to connect to them explicitly; MongoDB drivers will intentionally avoid them otherwise.
Other than that, nodes 2 and 4 have direct access to whatever node ends up being the primary, so I see no other problem.
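For illustration, here is a rough mongo shell sketch (hostnames are placeholders) of the member configuration discussed in these answers: Node2 and Node4 get priority 0 so they can never become primary, plus votes 0 (as the last answer suggests) so that elections only involve Node1 and Node3:

// MongoDB requires priority 0 on any member that has votes 0.
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "node1.dc1.example.com:27017" },                        // electable
    { _id: 1, host: "node2.dc1.example.com:27017", priority: 0, votes: 0 }, // read-only
    { _id: 2, host: "node3.dc2.example.com:27017" },                        // electable
    { _id: 3, host: "node4.dc2.example.com:27017", priority: 0, votes: 0 }  // read-only
  ]
})

Note that this leaves only two voting members, which the first answer warns against; adding an arbiter reachable from both data centers would restore an odd number of voters.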