How to configure Akka Pub/Sub to run on the same machine? - scala

I am following the Distributed Publish Subscribe in Cluster example in Akka. However, I would like to run all the actors (publisher and subscribers) on the same node (my laptop). I am not sure I understand how to configure that; could somebody help me? Is it possible to use runOn, or should it be declared in a configuration file? Currently,
I run into this error:
Caused by: akka.ConfigurationException: ActorSystem [akka://mySystem]
needs to have a 'ClusterActorRefProvider' enabled in the
configuration, currently uses [akka.actor.LocalActorRefProvider]

Your error is telling you what the problem is: in your application.conf you should set akka.actor.provider = "akka.cluster.ClusterActorRefProvider". If you want to run a one-node cluster on your laptop, you should also set akka.cluster.min-nr-of-members = 1.
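For reference, a minimal Scala sketch of a single-node setup, assuming the akka-cluster and akka-cluster-tools dependencies and classic remoting; the host, port, and object name are illustrative, and the same settings could equally live in application.conf:

import akka.actor.ActorSystem
import akka.cluster.pubsub.DistributedPubSub
import com.typesafe.config.ConfigFactory

object SingleNodePubSub extends App {
  // Single-node cluster: the only seed node is this node itself.
  val config = ConfigFactory.parseString("""
    akka.actor.provider = "akka.cluster.ClusterActorRefProvider"
    akka.remote.netty.tcp.hostname = "127.0.0.1"
    akka.remote.netty.tcp.port = 2551
    akka.cluster.seed-nodes = ["akka.tcp://mySystem@127.0.0.1:2551"]
    akka.cluster.min-nr-of-members = 1
  """).withFallback(ConfigFactory.load())

  val system = ActorSystem("mySystem", config)

  // Publishers and subscribers on this node all talk to this mediator.
  val mediator = DistributedPubSub(system).mediator
}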

Related

Debezium IO with Pulsar

I want to understand how Pulsar uses the Debezium IO connector for CDC.
While creating the source using pulsar-admin source create, how can I pass the broker URL and client authentication params, similar to what we do when using localrun?
The command I run:
bin/pulsar-admin source localrun --sourceConfigFile debezium-mysql-source-config.yaml --client-auth-plugin --client-auth-params --broker-service-url
Now I want to replace this with a connector that runs in cluster mode.
Localrun is a special mode that simplifies debugging; it runs outside of the normal cluster, so it needs extra parameters to create the client for the local runtime.
In cluster mode the connector gets its client from the Pulsar connectors runtime, through the function worker configuration. All you need to do is use "bin/pulsar-admin source create ...".

Programmatically create Artemis cluster on remote server

Is it possible to programmatically create/update a cluster on a remote Artemis server?
I will have lots of docker instances and would rather configure on the fly than have to set in XML files if possible.
Ideally on app launch I'd like to check if a cluster has been set up and if not create one.
This would probably involve getting the current server configuration and updating it with the cluster details.
I see it's possible to create a Configuration.
However, I'm not sure how to get the remote server configuration, if it's at all possible.
Configuration config = new ConfigurationImpl();
ClusterConnectionConfiguration ccc = new ClusterConnectionConfiguration();
ccc.setAddress("231.7.7.7");
config.addClusterConfiguration(ccc);
// need a way to get and update the current server configuration
ActiveMQServer.getConfiguration();
Any advice would be appreciated.
If it is possible, is this a good approach to take to configure on the fly?
Thanks
The org.apache.activemq.artemis.core.config.impl.ConfigurationImpl object can be used to programmatically configure the broker. The broker test-suite uses this object to configure broker instances. However, this object is not available in any remote sense.
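For what it's worth, here is a rough Scala sketch of that programmatic, same-JVM route using an embedded broker; the acceptor URL and cluster settings are placeholders taken from or modeled on the question, not a working cluster definition:

import org.apache.activemq.artemis.core.config.ClusterConnectionConfiguration
import org.apache.activemq.artemis.core.config.impl.ConfigurationImpl
import org.apache.activemq.artemis.core.server.embedded.EmbeddedActiveMQ

object EmbeddedBrokerSketch extends App {
  // This only works when your code and the broker share a JVM;
  // there is no remote equivalent of this API.
  val config = new ConfigurationImpl()
    .setPersistenceEnabled(false)
    .setSecurityEnabled(false)
  config.addAcceptorConfiguration("netty", "tcp://0.0.0.0:61616") // placeholder acceptor

  val ccc = new ClusterConnectionConfiguration()
    .setName("my-cluster")
    .setAddress("231.7.7.7") // placeholder, as in the question
  config.addClusterConfiguration(ccc)

  val broker = new EmbeddedActiveMQ()
  broker.setConfiguration(config)
  broker.start()
}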
Once the broker is started there is a rich management API you can use to add things like security settings, address settings, diverts, bridges, addresses, queues, etc. However, the changes made by most (although not all) of these operations are volatile, which means many of them would need to be performed every time the broker started. Furthermore, there are no management methods to add cluster connections.
You might consider using a tool like Ansible to manage the configuration or even roll your own solution with a templating engine like FreeMarker to customize the XML and then distribute it to your Docker instances using some other technology.

Elixir Kafka client Elsa

I am trying to create topics dynamically in Kafka, but unfortunately an error occurs. Here is my code:
# @endpoints is a module attribute holding the broker endpoints, e.g. [localhost: 9092]
def hello_from_elsa do
  topic = "producer-manager-test"
  connection = :conn

  Elsa.Supervisor.start_link(endpoints: @endpoints,
                             connection: connection)
  Elsa.create_topic(@endpoints, topic)
end
As far as I understand, I can connect to the broker itself, but when the create-topic line is executed I get this error:
(MatchError) no match of right hand side value: false
(kafka_protocol) src/kpro_brokers.erl:240: anonymous fn/1 in :kpro_brokers.discover_controller/2
(kafka_protocol) src/kpro_lib.erl:376: :kpro_lib.do_ok_pipe/2
(kafka_protocol) src/kpro_lib.erl:281: anonymous fn/3 in :kpro_lib.with_timeout/2
I am not sure whether I am missing some additional step before creating the topic. But it should be fine, I guess, since I start the supervisor and it is running :/
Hard to say, since the error is coming from the underlying Kafka protocol library and not Elsa directly, but it looks like no Kafka cluster controller can be found.
Topic management has to be done through a controller node, so the with_connection function that create_topic wraps explicitly passes the atom :controller to establish the connection; for whatever reason, likely something specific to your cluster, it isn't able to find a controller.
What type of cluster are you testing against? If you use the divo and divo_kafka libraries you can stand up a single-node Kafka cluster with Docker on your local host to test against, and it should work as expected.

How to connect to a kerberized HDFS from Spark on Kubernetes?

I'm trying to connect to HDFS, which is kerberized, and it fails with the error:
org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
What additional parameters do I need to add while creating the Spark setup, apart from the standard things needed to spawn Spark worker containers?
Check the hadoop.security.authentication property in your hdfs-site.xml file.
In your case it should have the value kerberos or token.
Or you can configure it from code by specifying the property explicitly:
Configuration conf = new Configuration();
conf.set("hadoop.security.authentication", "kerberos");
You can find more information about secure connections to HDFS here.
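If you also want to perform the Kerberos login from code, here is a rough Scala sketch using Hadoop's UserGroupInformation API; the principal, keytab path, and namenode URL are placeholders:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.UserGroupInformation

object KerberizedHdfsCheck extends App {
  val conf = new Configuration()
  conf.set("hadoop.security.authentication", "kerberos")
  conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020") // placeholder namenode

  // Tell the Hadoop client libraries to use Kerberos, then log in from a keytab.
  UserGroupInformation.setConfiguration(conf)
  UserGroupInformation.loginUserFromKeytab("user@EXAMPLE.COM", "/path/to/user.keytab")

  val fs = FileSystem.get(conf)
  println(s"Can list /: ${fs.exists(new Path("/"))}")
}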
I have also asked a very similar question here.
Firstly, please verify whether this error is occurring on your driver pod or the executor pods. You can do this by looking at the logs of the driver and the executors as they start running. While I don't have any errors when my Spark job runs only on the master, I do face this error when I summon executors. The solution is to use a sidecar image. You can see an implementation of this in ifilonenko's project, which he referred to in his demo.
The premise of this approach is to store the delegation token (obtained by running a kinit) in a shared persistent volume. This volume can then be mounted to your driver and executor pods, giving them access to the delegation token and therefore to the kerberized HDFS. I believe you're getting this error because your executors currently do not have the delegation token necessary for access to HDFS.
P.S. I'm assuming you've already had a look at Spark's kubernetes documentation.

Questions Concerning Using Celery with Multiple Load-Balanced Django Application Servers

I'm interested in using Celery for an app I'm working on. It all seems pretty straightforward, but I'm a little confused about what I need to do if I have multiple load-balanced application servers. All of the documentation assumes that the broker will be on the same server as the application. Currently, all of my application servers sit behind an Amazon ELB and tasks need to be able to come from any one of them.
This is what I assume I need to do:
Run a broker server on a separate instance
Configure each application instance to connect to that broker server
Each application instance will also be a celery worker (running celeryd)?
My only beef with that is: what happens if my broker instance dies? Can I run two broker instances somehow so I'm safe if one goes under?
Any tips or information on what to do in a setup like mine would be greatly appreciated. I'm sure I'm missing something or not understanding something.
For future reference, for those who do prefer to stick with RabbitMQ...
You can create a RabbitMQ cluster from 2 or more instances. Add those instances to your ELB and point your celeryd workers at the ELB. Just make sure you connect the right ports and you should be all set. Don't forget to allow your RabbitMQ machines to talk among themselves to run the cluster. This works very well for me in production.
One exception here: if you need to schedule tasks, you need a celerybeat process. For some reason, I wasn't able to connect celerybeat to the ELB and had to connect it to one of the instances directly. I opened an issue about it and it is supposed to be resolved (I haven't tested it yet). Keep in mind that only one celerybeat instance can run at a time, so that's already a single point of failure.
You are correct on all points.
How to make the broker reliable: set up a clustered RabbitMQ installation, as described here:
http://www.rabbitmq.com/clustering.html
Celery beat also doesn't have to be a single point of failure if you run it on every worker node with:
https://github.com/ybrs/single-beat