How to configure a load balancer in a Fuse cluster with three different machines (machine1, machine2, machine3)? - jbossfuse

Following are the steps I follow to set up a cluster on 3 different machines.
1. Unzip JBoss Fuse into three different folders, so that you have the following layout:
- machine1/jboss-fuse-6.3.0.redhat-187
- machine2/jboss-fuse-6.3.0.redhat-187
- machine3/jboss-fuse-6.3.0.redhat-187
2. Edit etc/org.apache.karaf.management.cfg and change rmiRegistryPort and rmiServerPort, assigning unique ports:
#machine1
rmiRegistryPort = 1099
rmiServerPort = 44444
#machine2
rmiRegistryPort = 1100
rmiServerPort = 44445
#machine3
rmiRegistryPort = 1101
rmiServerPort = 44446
3. Edit etc/org.apache.karaf.shell.cfg and change sshPort, assigning a unique port:
#machine1
sshPort = 8101
#machine2
sshPort = 8102
#machine3
sshPort = 8103
4. Edit etc/system.properties and change karaf.name, org.osgi.service.http.port, and activemq.port, assigning unique values:
#machine1
karaf.name = root1
org.osgi.service.http.port=8181
activemq.port = 61616
#machine2
karaf.name = root2
org.osgi.service.http.port=8182
activemq.port = 61617
#machine3
karaf.name = root3
org.osgi.service.http.port=8183
activemq.port = 61618
5. Start the root1 container:
./fuse
6. Create the Fabric:
JBossFuse:karaf@root1> fabric:create --new-user administrator --new-user-password password --new-user-role Administrator --zookeeper-password ZooPass1 --resolver manualip --manual-ip 192.168.1.9 --wait-for-provisioning
192.168.1.9 above is the IP address of machine1 (root1).
7. Now, start the root2 container and join the Fabric:
./fuse
JBossFuse:karaf@root2> fabric:join 192.168.1.9:2181
Ensemble password: ZooPass1
8. Now, start the root3 container and join the Fabric:
./fuse
JBossFuse:karaf@root3> fabric:join 192.168.1.9:2181
Ensemble password: ZooPass1
9. Run the following command to add the other containers to the ensemble:
JBossFuse:karaf@root1> fabric:ensemble-add root2 root3
This will change the zookeeper connection string.
Are you sure you want to proceed (yes/no): yes
JBossFuse:karaf@root1> fabric:ensemble-list
[id]
root1
root2
root3
Then I deployed the REST service on all 3 nodes, created the profile, and added the required profile with the HTTP gateway for load balancing and HA, but requests do not go through machine 2 and machine 3. I am also not able to access the hawtio consoles of machine 2 and machine 3 at the URLs given below.
192.168.1.10:8182/hawtio/login
192.168.1.11:8183/hawtio/login
Can anybody help me achieve load balancing in a cluster environment with 3 different machines?

I would suggest -- don't do any of this :) If you're using Fabric8, install one instance of Fuse, do fabric:create, then use container-create-ssh --host localhost to set up other containers on the same machine. That will automatically take care of all the port conflicts that I suspect are at the root of your problem. Fabric8 uses many, many ports, and trying to fix them all up manually is a ghastly job.
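For illustration, a minimal sketch of that suggestion (the credentials and container names here are hypothetical; for a real multi-machine cluster you would point --host at each remote machine instead of localhost):
JBossFuse:karaf@root> fabric:create --new-user administrator --new-user-password password --wait-for-provisioning
JBossFuse:karaf@root> fabric:container-create-ssh --host localhost --user fuse --password fuse node2
JBossFuse:karaf@root> fabric:container-create-ssh --host localhost --user fuse --password fuse node3
JBossFuse:karaf@root> fabric:container-list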

Related

Need help setting BROKER_URL in Airflow's config and Celery Executor

Summary
I'm using Apache Airflow for the first time. I've gotten the webserver, SequentialExecutor and LocalExecutor to work, but I'm running into issues when using the CeleryExecutor with rabbitmq-server. I currently have two AWS EC2 instances.
Error
To summarize: My worker cannot connect to the rabbitmq-server on my scheduler node. Whenever I run airflow worker on the worker instance, it gives:
- ** ---------- [config]
- ** ---------- .> app: airflow.executors.celery_executor:0x7f53a8dce400
- ** ---------- .> transport: amqp://guest:**@localhost:5672//
- ** ---------- .> results: disabled://
- *** --- * --- .> concurrency: 16 (prefork)
-- ******* ----
--- ***** ----- [queues]
-------------- .> default exchange=default(direct) key=default
[2019-02-15 02:26:23,742: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@127.0.0.1:5672//: [Errno 111] Connection refused.
Configuration
I followed all of the directions I could find online. Both instances have the same airflow.cfg file, with
[core]
executor = CeleryExecutor
[celery]
broker_url = pyamqp://username:password@hostname:port/virtual_host
and result_backend pointing at the same MySQL database on RDS that airflow is working off of.
From what I could tell, no matter what, the worker node always tried connecting to a local rabbitmq-server and completely ignored the broker_url in my airflow.cfg file.
What I've Tried
I went spelunking in the source code and noticed in celery/app/base.py that, if I error-log the configuration returned by _get_config() when it goes to create a connection, there are actually TWO values in the dictionary:
BROKER_URL = None
broker_url = pyamqp://username:password@hostname:port/virtual_host
and all of the connection logic seems to point at the BROKER_URL key.
I tried setting BROKER_URL and CELERY_BROKER_URL in airflow.cfg, but it seems to be case insensitive, and ignores the latter. Just to see if it would work, I modified the _get_config() method and hacked in:
s['BROKER_URL'] = s['broker_url']  # mirror the lowercase key into the uppercase key the connection logic reads
return s
And, like I expected, everything started working.
Am I doing something wrong? I'd really rather not use this hack, but I can't understand why it's ignoring the configuration values.
Thanks!
From the error message, it seems like the hostname being passed in the URI is wrong:
If rabbitmq-server and the worker are on different machines: instead of localhost/127.0.0.1, the hostname should be the IP address of the rabbitmq machine
If rabbitmq-server and the worker are on the same machine as part of a Docker Compose application (e.g. if you took inspiration from here): the hostname should be the service name associated with the RabbitMQ image in docker-compose.yml, e.g. amqp://guest:guest@rabbitmq:5672/
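As a concrete sketch, the [celery] section on the worker node would then point at the scheduler machine rather than localhost (the IP, credentials, and vhost below are hypothetical placeholders):
[celery]
broker_url = pyamqp://airflow_user:airflow_pass@172.31.5.10:5672/airflow_vhost
result_backend = db+mysql://airflow:airflow@my-rds-endpoint:3306/airflow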

Received AliveMessage from a peer with the same PKI-ID as myself

I am attempting to port the Hyperledger Fabric Getting Started guide to Kubernetes, but am struggling to get the peer1s to deploy. If I enable CORE_PEER_GOSSIP_BOOTSTRAP, I receive the error "Received AliveMessage from a peer with the same PKI-ID as myself".
How can I debug a peer reportedly having the same PKI-ID as another?
Using this as a starting point:
https://hyperledger-fabric.readthedocs.io/en/latest/getting_started.html
I am able to create:
- orderer and cli pods in the default namespace
- peer0s, one in each of the org1 and org2 namespaces
- peer1s, but only if I disable (comment out) CORE_PEER_GOSSIP_BOOTSTRAP
If I enable CORE_PEER_GOSSIP_BOOTSTRAP for the peer1's, I receive the following warning and error:
[gossip/gossip#10.0.0.10:7051] NewGossipService -> WARN 01c External endpoint is empty, peer will not be accessible outside of its organization
...
[gossip/discovery#10.0.0.10:7051] handleAliveMessage -> ERRO 02a Bad configuration detected: Received AliveMessage from a peer with the same PKI-ID as myself: tag:EMPTY alive_msg:<membership:<pki_id:"[[REDACTED]]" > timestamp:<inc_number:1495468533769417608 seq_num:416 > >
In order to better map the Orderer and Peers to DNS names, I'm using Kubernetes namespaces and this configuration:
OrdererOrgs:
  - Name: Orderer
    Domain: default.svc.cluster.local
    Specs:
      - Hostname: orderer
PeerOrgs:
  - Name: Org1
    Domain: org1.svc.cluster.local
    Template:
      Count: 2
    Users:
      Count: 2
  - Name: Org2
    Domain: org2.svc.cluster.local
    Template:
      Count: 2
    Users:
      Count: 2
In order to expose the peer0s to the other peers in the org and to expose the orderer, I have ClusterIP services for the peer0s (selecting only the peer0s) and the orderer. It's inelegant, but I'm trying to get it to work before I get it working more beautifully.
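For context, each of those ClusterIP services is roughly of this shape (a hypothetical sketch for peer0 of org1; the selector depends on your actual pod labels):
apiVersion: v1
kind: Service
metadata:
  name: peer0
  namespace: org1
spec:
  type: ClusterIP
  selector:
    app: peer0
  ports:
    - name: grpc
      port: 7051
    - name: events
      port: 7053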
I am able to resolve orderer.default.svc.cluster.local, peer0.org1.svc.cluster.local, and peer0.org2.svc.cluster.local using nslookup from within a pod deployed to default on the cluster.
Absent a curl-like tool for gRPC, I am able to open sockets against these endpoints on 7051 and 7053.
First, make sure you are using the right certificates.
Second, verify that your environment/configuration for gossip is set correctly:
environment:
  - CORE_PEER_GOSSIP_EXTERNALENDPOINT=peer1.org1.example.com:8051
  - CORE_PEER_GOSSIP_BOOTSTRAP=peer0.org1.example.com:7051
  - CORE_PEER_GOSSIP_ENDPOINT=peer0.org1.example.com:7051
Or, in core.yaml:
peer:
  gossip:
    bootstrap: peer0.org1.example.com:7051
    externalEndpoint: peer1.org1.example.com:8051
    endpoint: peer0.org1.example.com:7051
Edited: Also make sure that you have properly set up your CA.
Hope this helps; it worked for me, and I was successfully able to connect the peers.
If the peers are started from the same node, it's possible that you are mounting the same crypto material (the path to the mspconfig directory) for both peers. If that is the case, separate the directory structures for the two peers, keep their respective certificates in them, update the respective msp paths in the docker-compose file, and try to run again.
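A minimal sketch of that separation in a docker-compose file, assuming the usual cryptogen output layout (adjust the paths to your actual tree):
peer0.org1.example.com:
  volumes:
    - ./crypto-config/peerOrganizations/org1.example.com/peers/peer0.org1.example.com/msp:/etc/hyperledger/fabric/msp
peer1.org1.example.com:
  volumes:
    - ./crypto-config/peerOrganizations/org1.example.com/peers/peer1.org1.example.com/msp:/etc/hyperledger/fabric/msp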

Icinga2 check_load plugin thresholds

I've recently installed Icinga2 on a bunch of Ubuntu LXC containers. I have a master node where you can log into icingaweb to check status.
However, the load thresholds seem low and I cannot see how or even where to adjust the parameters. May I ask someone to point me in the right direction? Is this done on the master or on the remote nodes? What is the file and where does it sit in the file structure?
I installed Icinga2 on an Ubuntu 16.04 server from the Icinga2 PPA.
Create a service definition for load on the master:
apply Service "load" {
  import "generic-service"
  check_command = "load"
  vars.load_wload1 = 5
  vars.load_wload5 = 4
  vars.load_wload15 = 3
  vars.load_cload1 = 10
  vars.load_cload5 = 6
  vars.load_cload15 = 4
  command_endpoint = host.address
  assign where host.name == "monitored client"
}
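With those custom variables, the ITL's load check command effectively invokes the plugin on the endpoint like this (a sketch; the plugin path varies by distribution):
/usr/lib/nagios/plugins/check_load -w 5,4,3 -c 10,6,4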

Run two instances of JBoss Fuse on the same box

Which configuration files/values need to change in order to run a second instance of JBoss Fuse on the same box?
Second instance properties after configuration:
Installation home: c:\jboss-fuse-6.2.1.redhat-084-2 (/usr/app/jboss-fuse-6.2.1.redhat-084-2)
Remote debug port: 5006
Jetty/CXF port: 8282
RMI registry port: 2099
RMI server port: 54444
SSH port: 8202
ActiveMQ port: 62616
HawtIO console: http://localhost:8282/hawtio/login
Installation home:
$JBOSS_FUSE_HOME/bin/setenv
----
KARAF_HOME=/usr/app/jboss-fuse-6.2.1.redhat-084-2
KARAF_DATA=/usr/app/jboss-fuse-6.2.1.redhat-084-2/data
KARAF_ETC=/usr/app/jboss-fuse-6.2.1.redhat-084-2/etc
export KARAF_HOME
export KARAF_DATA
export KARAF_ETC
%JBOSS_FUSE_HOME%\bin\setenv.bat
----
SET KARAF_HOME=c:\jboss-fuse-6.2.1.redhat-084-2
SET KARAF_DATA=c:\jboss-fuse-6.2.1.redhat-084-2\data
SET KARAF_ETC=c:\jboss-fuse-6.2.1.redhat-084-2\etc
Remote debug port
$JBOSS_FUSE_HOME/bin/admin
$JBOSS_FUSE_HOME/bin/karaf
$JBOSS_FUSE_HOME/bin/patch
----
DEFAULT_JAVA_DEBUG_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5006"
%JBOSS_FUSE_HOME%\bin\admin.bat
%JBOSS_FUSE_HOME%\bin\karaf.bat
%JBOSS_FUSE_HOME%\bin\patch.bat
----
set DEFAULT_JAVA_DEBUG_OPTS=-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5006
Jetty/CXF port
JBOSS_FUSE_HOME\etc\jetty.xml
---
<Property name="jetty.port" default="8282"/>
JBOSS_FUSE_HOME\etc\org.ops4j.pax.web.cfg
---
org.osgi.service.http.port=8282
JBOSS_FUSE_HOME\etc\system.properties
---
org.osgi.service.http.port=8282
RMI registry port/RMI server port
JBOSS_FUSE_HOME\etc\org.apache.karaf.management.cfg
---
rmiRegistryPort = 2099
rmiServerPort = 54444
SSH port
JBOSS_FUSE_HOME\etc\org.apache.karaf.shell.cfg
---
sshPort = 8202
ActiveMQ port
JBOSS_FUSE_HOME\etc\system.properties
---
activemq.port = 62616
activemq.host = localhost
This depends on the applications installed, so let's stick with vanilla JBoss Fuse 6.2+.
There are 3 components that need a change in configuration:
ActiveMQ broker
Hawtio web interface
sshd
Conflicts happen while binding on TCP/IP ports. Use two sets of ports and you're done.
Configuration files are located in the $KARAF_ETC folder, usually etc/ inside the JBoss Fuse installation folder.
ActiveMQ
Change property activemq.port inside etc/system.properties.
Default value is 61616.
Hawtio / OSGi HTTP
Change property org.osgi.service.http.port inside etc/system.properties. Default is 8181.
This is also defined in etc/org.ops4j.pax.web.cfg.
SSH
Change property sshPort inside etc/org.apache.karaf.shell.cfg. Default is 8101.
Another way: create a Fabric with two child containers. Each child container is just like a regular instance. The infrastructure is just a bit more complex than the standalone one.
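A rough sketch of the child-container route (the container name here is hypothetical; Fabric assigns each child non-conflicting ports automatically):
JBossFuse:karaf@root> fabric:create --wait-for-provisioning
JBossFuse:karaf@root> fabric:container-create-child root instance2
JBossFuse:karaf@root> fabric:container-list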

mesos-master crash with zookeeper cluster

I am deploying a ZooKeeper cluster that has 3 nodes. I use it to keep my Mesos masters highly available. I download the zookeeper-3.4.6.tar.gz tarball, uncompress it to /opt, rename it to /opt/zookeeper, enter the directory, edit conf/zoo.cfg (pasted below), create a myid file in dataDir (which is set to /var/lib/zookeeper in zoo.cfg), and start ZooKeeper using ./bin/zkServer.sh start, and it goes well. I start all 3 nodes one by one and they all seem fine. I use ./bin/zkCli.sh to connect to the server; no problem.
But when I start Mesos (3 masters and 3 slaves; each node runs a master and a slave), the masters soon crash, one by one, and on the web page http://mesos_master:5050 no slaves are displayed on the slaves tab. But when I run only one ZooKeeper, these are all fine. So I think it's the ZooKeeper cluster's problem.
I have 3 PV hosts on my Ubuntu server; they are all running Ubuntu 14.04 LTS:
node-01, node-02, node-03
I have this /etc/hosts on all three nodes:
172.16.2.70 node-01
172.16.2.81 node-02
172.16.2.80 node-03
I installed ZooKeeper and Mesos on all three nodes. The ZooKeeper configuration file is like this (on all three nodes):
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
server.1=node-01:2888:3888
server.2=node-02:2888:3888
server.3=node-03:2888:3888
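For reference, the myid file in dataDir has to match the server.N entry for each host; a sketch:
# on node-01
echo 1 > /var/lib/zookeeper/myid
# on node-02
echo 2 > /var/lib/zookeeper/myid
# on node-03
echo 3 > /var/lib/zookeeper/myid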
The ZooKeeper servers can be started normally and run well. Then I start the mesos-master service, using the command line ./bin/mesos-master.sh --zk=zk://172.16.2.70:2181,172.16.2.81:2181,172.16.2.80:2181/mesos --work_dir=/var/lib/mesos --quorum=2, and after a few seconds it gives me errors like this:
F0817 15:09:19.995256 2250 master.cpp:1253] Recovery failed: Failed to recover registrar: Failed to perform fetch within 1mins
*** Check failure stack trace: ***
# 0x7fa2b8be71a2 google::LogMessage::Fail()
# 0x7fa2b8be70ee google::LogMessage::SendToLog()
# 0x7fa2b8be6af0 google::LogMessage::Flush()
# 0x7fa2b8be9a04 google::LogMessageFatal::~LogMessageFatal()
# 0x7fa2b81a899a mesos::internal::master::fail()
# 0x7fa2b8262f8f _ZNSt5_BindIFPFvRKSsS1_EPKcSt12_PlaceholderILi1EEEE6__callIvJS1_EJLm0ELm1EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE
# 0x7fa2b823fba7 _ZNSt5_BindIFPFvRKSsS1_EPKcSt12_PlaceholderILi1EEEEclIJS1_EvEET0_DpOT_
# 0x7fa2b820f9f3 _ZZNK7process6FutureI7NothingE8onFailedISt5_BindIFPFvRKSsS6_EPKcSt12_PlaceholderILi1EEEEvEERKS2_OT_NS2_6PreferEENUlS6_E_clES6_
# 0x7fa2b826305c _ZNSt17_Function_handlerIFvRKSsEZNK7process6FutureI7NothingE8onFailedISt5_BindIFPFvS1_S1_EPKcSt12_PlaceholderILi1EEEEvEERKS6_OT_NS6_6PreferEEUlS1_E_E9_M_invokeERKSt9_Any_dataS1_
# 0x4a44e7 std::function<>::operator()()
# 0x49f3a7 _ZN7process8internal3runISt8functionIFvRKSsEEJS4_EEEvRKSt6vectorIT_SaIS8_EEDpOT0_
# 0x499480 process::Future<>::fail()
# 0x7fa2b806b4b4 process::Promise<>::fail()
# 0x7fa2b826011b process::internal::thenf<>()
# 0x7fa2b82a0757 _ZNSt5_BindIFPFvRKSt8functionIFN7process6FutureI7NothingEERKN5mesos8internal8RegistryEEERKSt10shared_ptrINS1_7PromiseIS3_EEERKNS2_IS7_EEESB_SH_St12_PlaceholderILi1EEEE6__callIvISM_EILm0ELm1ELm2EEEET_OSt5tupleIIDpT0_EESt12_Index_tupleIIXspT1_EEE
# 0x7fa2b82962d9 std::_Bind<>::operator()<>()
# 0x7fa2b827ee89 std::_Function_handler<>::_M_invoke()
I0817 15:09:20.098639 2248 http.cpp:283] HTTP GET for /master/state.json from 172.16.2.84:54542 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36'
# 0x7fa2b8296507 std::function<>::operator()()
# 0x7fa2b827efaf _ZZNK7process6FutureIN5mesos8internal8RegistryEE5onAnyIRSt8functionIFvRKS4_EEvEES8_OT_NS4_6PreferEENUlS8_E_clES8_
# 0x7fa2b82a07fe _ZNSt17_Function_handlerIFvRKN7process6FutureIN5mesos8internal8RegistryEEEEZNKS5_5onAnyIRSt8functionIS8_EvEES7_OT_NS5_6PreferEEUlS7_E_E9_M_invokeERKSt9_Any_dataS7_
# 0x7fa2b8296507 std::function<>::operator()()
# 0x7fa2b82e4419 process::internal::run<>()
# 0x7fa2b82da22a process::Future<>::fail()
# 0x7fa2b83136b5 std::_Mem_fn<>::operator()<>()
# 0x7fa2b830efdf _ZNSt5_BindIFSt7_Mem_fnIMN7process6FutureIN5mesos8internal8RegistryEEEFbRKSsEES6_St12_PlaceholderILi1EEEE6__callIbIS8_EILm0ELm1EEEET_OSt5tupleIIDpT0_EESt12_Index_tupleIIXspT1_EEE
# 0x7fa2b8307d7f _ZNSt5_BindIFSt7_Mem_fnIMN7process6FutureIN5mesos8internal8RegistryEEEFbRKSsEES6_St12_PlaceholderILi1EEEEclIJS8_EbEET0_DpOT_
# 0x7fa2b82fe431 _ZZNK7process6FutureIN5mesos8internal8RegistryEE8onFailedISt5_BindIFSt7_Mem_fnIMS4_FbRKSsEES4_St12_PlaceholderILi1EEEEbEERKS4_OT_NS4_6PreferEENUlS9_E_clES9_
# 0x7fa2b830f065 _ZNSt17_Function_handlerIFvRKSsEZNK7process6FutureIN5mesos8internal8RegistryEE8onFailedISt5_BindIFSt7_Mem_fnIMS8_FbS1_EES8_St12_PlaceholderILi1EEEEbEERKS8_OT_NS8_6PreferEEUlS1_E_E9_M_invokeERKSt9_Any_dataS1_
# 0x4a44e7 std::function<>::operator()()
# 0x49f3a7 _ZN7process8internal3runISt8functionIFvRKSsEEJS4_EEEvRKSt6vectorIT_SaIS8_EEDpOT0_
# 0x7fa2b82da202 process::Future<>::fail()
# 0x7fa2b82d2d82 process::Promise<>::fail()
Aborted
Sometimes the warning is like this, and then it crashes with the same output as above:
W0817 15:09:49.745750 2104 recover.cpp:111] Unable to finish the recover protocol in 10secs, retrying
I want to know whether ZooKeeper is deployed and running well in my case, and how I can locate where the problem is. Any answers and suggestions are welcome. Thanks.
Actually, in my case, it was because I didn't open firewall port 5050 to allow the three servers to communicate with each other. After updating the firewall rule, it started to work as expected.
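On Ubuntu 14.04 that could look like the following, a sketch using ufw and the subnet from the question (use the iptables equivalent if ufw isn't in play):
sudo ufw allow proto tcp from 172.16.2.0/24 to any port 5050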
I fell into the same issue. I tried different ways and different options, and finally the --ip option worked for me. Initially I had used the --hostname option:
mesos-master --ip=192.168.0.13 --quorum=2 --zk=zk://m1:2181,m2:2181,m3:2181/mesos --work_dir=/opt/mm1 --log_dir=/opt/mm1/logs
You need to check that all mesos/zookeeper master nodes can communicate correctly. For that, you need:
Zookeeper ports open: TCP 2181, 2888, 3888
Mesos port open: TCP 5050
ping available (ICMP message types 0 and 8)
If you use FQDN instead of IP in your config, check that the DNS resolution is working correctly as well.
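A quick way to verify reachability is to probe those ports from each node against its peers (a sketch, assuming the netcat shipped with Ubuntu):
nc -zv node-02 2181
nc -zv node-02 2888
nc -zv node-02 3888
nc -zv node-02 5050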
Give each Mesos master its own work_dir; do not use a shared work_dir for all the masters, because of ZooKeeper.