I wonder whether atomic types stay consistent when nodes are shut down. For example, I have an Ignite cluster of three nodes with atomics configured this way:
<property name="atomicConfiguration">
<bean class="org.apache.ignite.configuration.AtomicConfiguration">
<property name="backups" value="2"/>
<property name="atomicSequenceReserveSize" value="5000"/>
</bean>
</property>
Would I get the correct value from the AtomicLong if two of the three nodes were shut down?
Yes, you'll get the correct value because you've configured 2 backups. Internally, atomics use a PARTITIONED cache template (which can be reconfigured, by the way), meaning each partition has one primary copy and the configured number of backup copies, call it backupNumber. In general, an atomic survives the loss of up to backupNumber nodes.
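As an illustration of that reconfiguration point, here is a minimal sketch (the REPLICATED mode is only an assumption for the example, not something you need; with it every node keeps a full copy of the backing cache, so the backups setting becomes irrelevant):

<property name="atomicConfiguration">
  <bean class="org.apache.ignite.configuration.AtomicConfiguration">
    <property name="backups" value="2"/>
    <property name="atomicSequenceReserveSize" value="5000"/>
    <!-- assumption for illustration: switch the backing cache from the default PARTITIONED mode to REPLICATED -->
    <property name="cacheMode" value="REPLICATED"/>
  </bean>
</property>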
I have a question about migrating an ActiveMQ Artemis cluster to the pluggable quorum configuration.
Currently we have a cluster in the test environment with 6 servers (3 pairs of master and slave using classic replication), and I plan to migrate to a cluster with pluggable quorum voting. The Artemis version is 2.23.1.
I have configured another (pretest) cluster with 3 zookeeper nodes and 2 Artemis nodes (primary/backup). It seems to work well, but it is a pretest environment where we perform some experiments, and there are no clients or workload. So I have decided to reconfigure the test cluster to use pluggable quorum voting.
At first I thought we could simply change the role of each server from master to primary and from slave to backup.
Previous configuration was - master:
<ha-policy>
  <replication>
    <master>
      <check-for-live-server>true</check-for-live-server>
      <vote-on-replication-failure>true</vote-on-replication-failure>
      <quorum-size>2</quorum-size>
      <group-name>group-for-each-pair</group-name>
    </master>
  </replication>
</ha-policy>
Slave:
<ha-policy>
  <replication>
    <slave>
      <allow-failback>true</allow-failback>
      <group-name>group-for-each-pair</group-name>
    </slave>
  </replication>
</ha-policy>
The group name is used by the slave to determine which master it has to connect to.
Unfortunately, this setting does not work in the primary and backup sections. I tried to configure it and got an XSD validation error for broker.xml.
The documentation mentions some settings that are no longer needed in the pluggable quorum configuration:
There are some no longer needed classic replication configurations: vote-on-replication-failure, quorum-vote-wait, vote-retries, vote-retries-wait, check-for-live-server
But there is nothing about <group-name>. Maybe it is a documentation issue.
New configuration is - primary:
<ha-policy>
  <replication>
    <primary>
      <manager>
        <class-name>org.apache.activemq.artemis.quorum.zookeeper.CuratorDistributedPrimitiveManager</class-name>
        <properties>
          <property key="connect-string" value="zookeeper-amq1:2181,zookeeper-amq2:2181,zookeeper-amq3:2181"/>
        </properties>
      </manager>
    </primary>
  </replication>
</ha-policy>
Backup:
<ha-policy>
  <replication>
    <backup>
      <manager>
        <class-name>org.apache.activemq.artemis.quorum.zookeeper.CuratorDistributedPrimitiveManager</class-name>
        <properties>
          <property key="connect-string" value="zookeeper-amq1:2181,zookeeper-amq2:2181,zookeeper-amq3:2181"/>
        </properties>
      </manager>
      <allow-failback>true</allow-failback>
    </backup>
  </replication>
</ha-policy>
When I tried to start the cluster with these settings, I found that the backup servers try to connect to any primary server, and some of them cannot start. So I reverted to the old configuration.
I read the documentation and found some settings that could help:
<coordination-id>. Used in multi-primary configurations, so it probably will not work in this section.
namespace in the Apache Curator settings. Maybe it can help to split the servers into pairs, where each backup will connect to its primary in the same namespace. But it may be designed for another purpose (having one zookeeper for several separate clusters), and there could be other problems.
Another option is to remove the 4 unnecessary ActiveMQ Artemis servers and use only 1 pair of servers. That would require client reconfiguration, but clients will continue to work with only the 2 remaining servers even if all 6 servers remain in the connection string.
Is there a preferred way to migrate from classic replication to the pluggable quorum voting without changing cluster topology (6 servers)?
Any changes in this test environment (if successful) will be applied to the UAT and production clusters, which have the same topology. So we would prefer a smooth migration if possible.
I recommend just using group-name as you were before. For example on the primary:
<ha-policy>
  <replication>
    <primary>
      <manager>
        <class-name>org.apache.activemq.artemis.quorum.zookeeper.CuratorDistributedPrimitiveManager</class-name>
        <properties>
          <property key="connect-string" value="zookeeper-amq1:2181,zookeeper-amq2:2181,zookeeper-amq3:2181"/>
        </properties>
      </manager>
      <group-name>group-for-each-pair</group-name>
    </primary>
  </replication>
</ha-policy>
And on the backup:
<ha-policy>
  <replication>
    <backup>
      <manager>
        <class-name>org.apache.activemq.artemis.quorum.zookeeper.CuratorDistributedPrimitiveManager</class-name>
        <properties>
          <property key="connect-string" value="zookeeper-amq1:2181,zookeeper-amq2:2181,zookeeper-amq3:2181"/>
        </properties>
      </manager>
      <group-name>group-for-each-pair</group-name>
      <allow-failback>true</allow-failback>
    </backup>
  </replication>
</ha-policy>
That said, I strongly encourage you to execute performance tests with a single HA pair of brokers. A single broker can potentially handle millions of messages per second so it's likely that you don't need a cluster of 3 primary brokers. Also, if your applications are connected to the cluster nodes such that messages are produced on one node and consumed from another then having a cluster may actually reduce overall message throughput due to the extra "hops" a message has to take. Obviously this wouldn't be an issue for a single HA pair.
Finally, dropping from 6 brokers down to 2 would significantly reduce configuration and operational complexity, and it's likely to reduce infrastructure costs substantially as well. This is one of the main reasons we implemented pluggable quorum voting in the first place.
I have created an ActiveMQ Artemis cluster with two active brokers. I have created a custom load balancer to be able to initially distribute my queues in a static way according to my requirements and workload.
<connectors>
  <connector name="broker1-connector">tcp://myhost1:61616</connector>
  <connector name="broker2-connector">tcp://myhost2:62616</connector>
</connectors>
<cluster-connections>
  <cluster-connection name="myhost1-cluster">
    <connector-ref>broker1-connector</connector-ref>
    <retry-interval>500</retry-interval>
    <use-duplicate-detection>true</use-duplicate-detection>
    <message-load-balancing>ON_DEMAND</message-load-balancing>
    <max-hops>1</max-hops>
    <static-connectors>
      <connector-ref>broker2-connector</connector-ref>
    </static-connectors>
  </cluster-connection>
</cluster-connections>
My issue is that when broker1 is down, based on this topology I can recreate its queues on broker2 to avoid losing messages (by using the connection string tcp://myhost1:61616,tcp://myhost2:62616 on the producer).
But then, when broker1 becomes available again, my producer is unaware of that and still uses the connection to broker2 (if it matters, broker2's redistribution-delay is set to 0 and no consumers are registered). Is there a way or some configuration to make my producer resume writing only to broker1?
This also affects my consumers, which are initially connected to broker1. I am not sure if there is some way/configuration to make them transparently bounce between these brokers, or do I need to create two consumers (effectively one of them will be idle), each one targeting the corresponding broker?
There is no way for the broker to tell a client that it should connect to another node joining the cluster.
My recommendation would be to use HA with failback, so that when one node fails, all the clients connected to that node fail over to the backup, and when the original node comes back, all the clients fail back to the original node.
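As a rough sketch only (assuming classic shared-nothing replication, that each of your two live brokers gets its own dedicated backup broker, and that the group name below is a placeholder), broker1's ha-policy could look like:

<ha-policy>
  <replication>
    <master>
      <!-- lets broker1 reclaim the live role after it comes back -->
      <check-for-live-server>true</check-for-live-server>
      <group-name>broker1-pair</group-name>
    </master>
  </replication>
</ha-policy>

and its dedicated backup:

<ha-policy>
  <replication>
    <slave>
      <!-- backup hands activity back to broker1 once it returns -->
      <allow-failback>true</allow-failback>
      <group-name>broker1-pair</group-name>
    </slave>
  </replication>
</ha-policy>

With something like this in place, clients using your existing connection string would fail over to the backup while broker1 is down and fail back automatically once it returns.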
You may also find that you don't actually need a cluster of 2 brokers. Many users never perform the performance testing necessary to confirm that clustering is even necessary in the first place. They simply assume that a cluster is necessary. Such an assumption can needlessly complicate a platform's architecture and waste valuable resources. The performance of ActiveMQ Artemis is quite good. A single node can handle millions of messages per second in certain use-cases.
A question about Filtering in ActiveMQ Artemis.
I have a queue named MyQueue.IN and a filter only accepting a certain JMS header value, let's say ORDER.
In broker.xml I have the following:
<core>
  <configuration-file-refresh-period>5000</configuration-file-refresh-period>
  <queues>
    <queue name="MyQueue.IN">
      <address>MyQueue.IN</address>
      <filter string="TOSTATUS='ORDER'"/>
      <durable>true</durable>
    </queue>
  </queues>
</core>
As I read the manual, the broker should now reload the configuration from broker.xml every 5 seconds.
But when I change the filter to
<filter string="TOSTATUS='ORDERPICKUP'"/>
The config is not changed in ActiveMQ Artemis.
Not even if I restart the node.
It is in a cluster but I have changed Broker.xml on both sides.
Any ideas on how to change a filter on a queue? Preferably by changing the Broker.xml
/Zeddy
You are seeing the expected behavior. Although this behavior may not be intuitive or particularly user friendly, it is meant to protect data integrity. Queues are immutable, so once they are created they can't be changed. Therefore, to "change" a queue it has to be deleted and re-created. Of course, deleting a queue means losing all the messages in it, which is potentially catastrophic. In general, there are 2 ways to delete the queue and have it re-created:
Set <config-delete-queues>FORCE</config-delete-queues> in a matching <address-setting> (a minimal sketch follows below). However, there is currently a problem with this approach which will be resolved via ARTEMIS-2076.
Delete the queue via management while the broker is running. This can be done via JMX (e.g. using JConsole), the web console, the Artemis CLI, etc. Then stop the broker, update the XML, and restart the broker.
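For the first approach, a minimal sketch of such an address-setting is shown below (the match value mirrors the queue's address above, and it relies on the <configuration-file-refresh-period> you already have configured):

<address-settings>
  <address-setting match="MyQueue.IN">
    <!-- lets the broker delete and re-create the queue when its definition changes in broker.xml -->
    <config-delete-queues>FORCE</config-delete-queues>
  </address-setting>
</address-settings>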
We are trying to do a POC where we export data from a VoltDB table to Kafka. Below are the steps I followed:
Step 1: Prepared the deployment.xml to enable export to Kafka
<?xml version="1.0"?>
<deployment>
<cluster hostcount="1" kfactor="0" schema="ddl" />
<httpd enabled="true">
<jsonapi enabled="true" />
</httpd>
<export enabled="true" target="kafka">
<configuration>
<property name="metadata.broker.list">localhost:9092</property>
<property name="batch.mode">false</property>
</configuration>
</export>
</deployment>
Step 2: Then started the VoltDB server
./voltdb create -d deployment-noschema.xml --zookeeper=2289
Step 3: Created an export-only table and inserted some data into it
create table test(x int);
export table test;
insert into test values(1);
insert into test values(2);
After this, I tried to verify whether any topic had been created in Kafka, but there was none.
./kafka-topics.sh --list --zookeeper=localhost:2289
Also, I can see all the data being logged in the exportoverflow directory. Could anyone please let me know what's missing here?
Prabhat,
In your specific case, a possible explanation of the behavior you observe is that you started Kafka without the auto-create topics option set to true. The export process requires Kafka to have this enabled so it can create topics on the fly. If not, you will have to manually create the topic and then export from VoltDB.
As a side note, while you can use the zookeeper that starts with VoltDB to run your Kafka, it is not the recommended approach, since when you bring down the VoltDB server, your Kafka is left with no zookeeper. It is best to use Kafka's own zookeeper to manage your Kafka instance.
Let me know if this helped - Thx.
Some questions and possible answers:
Are you using the Enterprise version?
Can you call @Quiesce from sqlcmd and see if your data pushes to Kafka?
Which version are you using?
VoltDB embeds a zookeeper; are you using a standalone zookeeper or VoltDB's? We don't test with the embedded one, as it is not exactly the same as the one Kafka supports.
Let us know or email support at voltdb.com.
Looking forward.
We would like to use the Publish / Subscribe abilities of NServiceBus with an MSMQ cluster. Let me explain in detail:
We have an SQL Server cluster that also hosts the MSMQ cluster. Besides SQL Server and MSMQ we cannot host any other application on this cluster. This means our subscriber is not allowed to run on the cluster.
We have multiple application servers hosting different types of applications (going from ASP.NET MVC to SharePoint Server 2010). The goal is to do a pub/sub between all these applications.
All messages going through the pub/sub are critical and have an important value to the business. That's why we don't want local queues on the application server, but we want to use MSMQ on the cluster (in case we lose one of the application servers, we don't risk losing the messages since they are safe on the cluster).
Now I was assuming I could simply do the following at the subscriber side:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<configSections>
<section name="MsmqTransportConfig" type="NServiceBus.Config.MsmqTransportConfig, NServiceBus.Core" />
....
</configSections>
<MsmqTransportConfig InputQueue="myqueue#server" ErrorQueue="myerrorqueue"
NumberOfWorkerThreads="1" MaxRetries="5" />
....
</configuration>
I'm assuming this used to be supported, judging by the documentation: http://docs.particular.net/nservicebus/messaging/publish-subscribe/
But this actually throws an exception:
Exception when starting endpoint, error has been logged. Reason:
'InputQueue' entry in 'MsmqTransportConfig' section is obsolete. By
default the queue name is taken from the class namespace where the
configuration is declared. To override it, use .DefineEndpointName()
with either a string parameter as queue name or Func parameter
that returns queue name. In this instance, 'myqueue#server' is defined
as queue name.
Now, the exception clearly states I should use the DefineEndpointName method:
Configure.With()
.DefaultBuilder()
.DefineEndpointName("myqueue#server")
But this throws another exception, which is documented (input queues should be on the same machine):
Exception when starting endpoint, error has been logged. Reason: Input
queue must be on the same machine as this process.
How can I make sure that my messages are safe if I can't use MSMQ on my cluster?
Dispatcher!
Now I've also been looking into the dispatcher for a bit, and this doesn't seem to solve my issue either. I'm also assuming the dispatcher wouldn't be able to get messages from a remote input queue? And besides that, if the dispatcher dispatches messages to the workers, and the workers go down, my messages are lost (even though they were not processed)?
Questions?
To summarize, these are the things I'm wondering with my scenario in NServiceBus:
I want my messages to be safe on the MSMQ cluster and use a remote input queue. Is this something I should or shouldn't do? Is it possible with NServiceBus?
Should I use a dispatcher in this case? Can it read from a remote input queue? (I cannot run the dispatcher on the cluster.)
What if the dispatcher dispatches messages to the workers and one of the workers goes down? Do I lose the message(s) that were being processed?
Phill's comment is correct.
The thing is that you would get the type of fault tolerance you require practically by default if you set up a virtualized environment. In that case, the C drive backing the local queue of your processes is actually sitting on the VM image on your SAN.
You will need a MessageEndpointMappings section that you will use to point to the Publisher's input queue. This queue is used by your Subscriber to drop off subscription messages. This will need to be QueueName#ClusterServerName. Be sure to use the cluster name and not a node name. The Subscriber's input queue will be used to receive messages from the Publisher and that will be local, so you don't need the #servername.
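As an illustrative sketch only (the message assembly, queue, and cluster names below are placeholders, and the address simply mirrors the QueueName#ClusterServerName form described above), the subscriber's config could contain something like:

<configSections>
  <section name="UnicastBusConfig" type="NServiceBus.Config.UnicastBusConfig, NServiceBus.Core" />
</configSections>
<UnicastBusConfig>
  <MessageEndpointMappings>
    <!-- subscription messages for MyMessages are dropped off on the Publisher's input queue on the cluster -->
    <add Messages="MyMessages" Endpoint="PublisherInputQueue#ClusterServerName" />
  </MessageEndpointMappings>
</UnicastBusConfig>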
There are 2 levels of failure: one is that the transport is down (say MSMQ) and the other is that the endpoint is down (the Windows service). In the event that the endpoint is down, the transport will handle persisting the messages to disk. A redundant network storage device may be in order.
In the event that the transport is down, assuming it is MSMQ, the messages will back up on the Publisher side of things. Therefore you have to account for the size and number of messages to calculate how long you want messages to back up for. Since the Publisher is clustered, you can be assured that the messages will arrive eventually, assuming you planned your disk appropriately.