cloudfoundry multi node push

Can you create application instances on multiple nodes with a single push command? If so, what is the process for that? Would you create multiple DEA instances?
So, for a configuration like this, would "vmc push appname --instances 4" create:
             REST
    ___________|___________
    |      |       |      |
   DEA    DEA     DEA    DEA
   App1   App2    App3   App4
Or do you have to push instances manually to each DEA node?

If you run the command as you've listed it, yes, it will basically cause your application to be started "at once" on 4 DEAs. You don't need to run vmc push for each instance - in fact, you can't use vmc to individually address a DEA.
Once a DEA is running your app, you can have the app report, for example, the IP address and port the DEA is running on by interrogating the VCAP environment.
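For illustration, a minimal sketch of that kind of check from inside a Python app; the exact variables (the VCAP_APPLICATION keys and the VCAP_APP_HOST/VCAP_APP_PORT variables) depend on the Cloud Foundry release, so treat this as an assumption:

import json
import os

# VCAP_APPLICATION is a JSON blob the DEA injects into each app instance;
# older DEAs also set VCAP_APP_HOST and VCAP_APP_PORT directly.
vcap = json.loads(os.environ.get("VCAP_APPLICATION", "{}"))

print("instance index:", vcap.get("instance_index"))
print("host:", vcap.get("host", os.environ.get("VCAP_APP_HOST")))
print("port:", vcap.get("port", os.environ.get("VCAP_APP_PORT")))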

ECS Service Discovery is updated too late after task is stopped

Hi,
I'm running 2 AWS ECS services (A and B) within the same cluster, using the Fargate launch type.
Service A should be able to connect to service B. This is possible using Service Discovery.
I created a service discovery entry backend.local with a TTL of 15 seconds. The tasks in service B are added to a target group which has a deregistration delay of 30 seconds.
+--------------+       +-------------+       +--------------+
| Application  +-----> |   ECS: A    +-----> |   ECS: B     |
| Load         |       +-------------+       +--------------+
| Balancer     |       |   Task 1    |       |   Task 1     |
+--------------+       |   Task 2    |       |   Task .     |
                       +-------------+       |   Task n     |
                                             +--------------+
This works perfectly: from service A I can make requests to http://backend.local, which are routed to one of the tasks in service B.
However, after a rolling deploy of service B, the service discovery DNS records aren't updated in time, so nslookup backend.local still returns IP addresses of old tasks that are not available anymore.
The lifecycle of tasks during a deployment is:
New task: Pending -> Activating -> Running
Old task: Running -> Deactivating -> Stopped
I would expect new tasks to be discoverable only AFTER they are 'Running', and to no longer be discoverable once the target group's deregistration delay kicks in.
How can I make sure that the Service Discovery doesn't make old tasks discoverable?
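(For reference, a minimal boto3 sketch of roughly how a setup like the one above can be wired together; all names, IDs and ARNs below are placeholders, not values from the question.)

import boto3

sd = boto3.client("servicediscovery")
ecs = boto3.client("ecs")

# Cloud Map service behind backend.local: A records with a 15-second TTL.
# The namespace ID is a placeholder for the private DNS namespace "local".
discovery = sd.create_service(
    Name="backend",
    NamespaceId="ns-xxxxxxxxxxxxxxxx",
    DnsConfig={"DnsRecords": [{"Type": "A", "TTL": 15}],
               "RoutingPolicy": "MULTIVALUE"},
)

# ECS service B (Fargate) registers its tasks with that Cloud Map service.
# The 30-second deregistration delay lives on the target group itself and is
# not part of this call.
ecs.create_service(
    cluster="my-cluster",
    serviceName="service-b",
    taskDefinition="service-b:1",
    desiredCount=2,
    launchType="FARGATE",
    serviceRegistries=[{"registryArn": discovery["Service"]["Arn"]}],
    networkConfiguration={"awsvpcConfiguration": {
        "subnets": ["subnet-xxxxxxxx"],
        "assignPublicIp": "DISABLED",
    }},
)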

airflow tries to access celery workers using the worker ID instead of URL

I have Airflow running with CeleryExecutor and 2 workers. When my DAG runs, the tasks generate a log on the filesystem of the worker that ran them. But when I go to the Web UI and click on the task logs, I get:
*** Log file does not exist: /usr/local/airflow/logs/test_dag/task2/2019-11-01T18:12:16.309655+00:00/1.log
*** Fetching from: http://70953abf1c10:8793/log/test_dag/task2/2019-11-01T18:12:16.309655+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='70953abf1c10', port=8793): Max retries exceeded with url: /log/test_dag/task2/2019-11-01T18:12:16.309655+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f329c3a2650>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
http://70953abf1c10:8793/ is obviously not an address the webserver can reach the worker at. However, celery@70953abf1c10 is the name of this worker in Celery. It seems like Airflow is trying to learn the worker's URL from Celery, but Celery is giving the worker's name instead. How can I solve this?
DejanLekic's solution put me on the right track, but it wasn't entirely obvious, so I'm adding this answer to clarify.
In my case I was running Airflow in Docker containers. By default, Docker containers use a bridge network called bridge. This is a special network that does not provide automatic DNS resolution between containers. I created a new bridge network in Docker called airflow-net and had all my Airflow containers join it (leaving the default bridge network was not necessary). Then everything just worked.
By default, Docker sets a container's hostname to its hex container ID. In my case the container ID began with 70953abf1c10 and the hostname was also 70953abf1c10. There is a Docker parameter for specifying the hostname, but it turned out not to be necessary. After I connected the containers to a new bridge network, 70953abf1c10 began to resolve to that container.
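As an illustration, a small docker-py sketch of that fix (the container names are placeholders); running docker network create airflow-net and docker network connect by hand does the same thing:

import docker

client = docker.from_env()

# A user-defined bridge network provides DNS resolution between containers
# by name, which the default "bridge" network does not.
net = client.networks.create("airflow-net", driver="bridge")

# Placeholder names for the Airflow webserver/scheduler/worker containers;
# once connected, each container can resolve the others by name or hostname.
for name in ["airflow-webserver", "airflow-scheduler", "airflow-worker-1"]:
    net.connect(name)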
The simplest solution is either to use the default node name, which will include the hostname, or to explicitly set a node name that has a valid hostname in it (for example: celery1@hostname.domain.tld).
If you use the default settings, then the machine running the Airflow worker has its hostname incorrectly set to 70953abf1c10. You should fix this by running something like: hostname -B hostname.domain.tld
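As a quick sanity check (a small sketch, assuming the diagnosis above), you can verify that the worker's own hostname resolves, since the webserver fetches logs from http://<worker hostname>:8793/:

import socket

# The hostname the worker reports must be resolvable from the machine that
# runs the Airflow webserver, otherwise log fetching fails as shown above.
hostname = socket.gethostname()
try:
    print(hostname, "resolves to", socket.gethostbyname(hostname))
except socket.gaierror:
    print(hostname, "does not resolve; set a resolvable hostname or Celery node name")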

Where is a task running with multiple nodes having python Celery installed?

If I have multiple workers running on different nodes, how can I know which worker a task is assigned to?
For example, there are two workers, 10.0.3.101 and 10.0.3.102, and a Redis backend runs on 10.0.3.100; when a task is sent to the task queue on the Redis backend, a worker picks it up and executes it. Is that worker 10.0.3.101 or 10.0.3.102?
In addition, if a worker, say 10.0.3.101, is running a task and suddenly halts, how can I learn of the failure? That is, is there any built-in failover mechanism inside Celery?
Thanks.
I solved the problem by searching on Google; the knowledge mainly comes from the Celery documentation.
We can get the hostname of the worker executing a task from the task context, or use a shell command to get the worker machine's IP. The task is defined as:
import time
import subprocess

from celery import Celery, current_task

# Celery app for node/tasks.py; the broker/backend URLs here are an
# assumption, pointing at the Redis instance from the question (10.0.3.100).
app = Celery("tasks",
             broker="redis://10.0.3.100:6379/0",
             backend="redis://10.0.3.100:6379/0")

@app.task
def report():
    task_id = current_task.request.id
    # Extract this worker machine's eth0 IPv4 address from `ip addr`.
    ip = subprocess.check_output(
        "ip addr | grep eth0 | grep inet |" +
        " cut -d t -f 2 | cut -d / -f 1", shell=True).decode()
    ip = ip.split('\n')[0].split(' ')[-1]
    # The worker's node name (as set with --hostname) is in the task context.
    hostname = current_task.request.hostname
    # Store an intermediate result so the caller can read it while the task
    # is still running.
    current_task.backend.store_result(
        task_id, result={"ip": ip, "hostname": hostname}, status="READY")
    time.sleep(100)
    return {
        "ip": ip,
        "hostname": hostname
    }
If you start a worker on a machine or in a container:
celery worker -A node.tasks --hostname="visible_hostname_in_request.hostname"
Then we can use the following lines to get the worker's IP or hostname:
# python
>>> from node.tasks import report
>>> r = report.delay()
>>> r.result
As far as I know, there is no built-in failover mechanism in Celery, so we need to implement it ourselves; alternatively, we can use third-party libraries like dispy.
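As a complement, Celery's inspect API can also report which worker is currently executing a given task; a small sketch, assuming the Celery application object is importable from node.tasks as app:

from node.tasks import app, report

r = report.delay()

# active() maps each worker's node name (e.g. 'celery@10.0.3.101') to the
# list of tasks it is currently executing.
active = app.control.inspect().active() or {}
for worker, tasks in active.items():
    for t in tasks:
        if t["id"] == r.id:
            print("task", r.id, "is running on worker", worker)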

How do I model a PostgreSQL failover cluster with Docker/Kubernetes?

I'm still wrapping my head around Kubernetes and how that's supposed to work. Currently, I'm struggling to understand how to model something like a PostgreSQL cluster with streaming replication, scaling out and automatic failover/failback (pgpool-II, repmgr, pick your poison).
My main problem with the approach is the dual nature of a PostgreSQL instance, configuration-wise -- it's either a master or a cold/warm/hot standby. If I increase the number of replicas, I'd expect them all to come up as standbys, so I'd imagine creating a postgresql-standby replication controller separately from a postgresql-master pod. However, I'd also expect one of those standbys to become a master in case the current master is down, so it's a common postgresql replication controller after all.
The only idea I've had so far is to put the replication configuration on an external volume and manage the state and state changes outside the containers.
(in case of PostgreSQL the configuration would probably already be on a volume inside its data directory, which itself is obviously something I'd want on a volume, but that's beside the point)
Is that the correct approach, or is there any other cleaner way?
There's an example in OpenShift: https://github.com/openshift/postgresql/tree/master/examples/replica. The principle is the same in pure Kubernetes (it's not using anything truly OpenShift-specific, and you can use the images with plain Docker).
You can give PostDock a try, either with docker-compose or Kubernetes. Currently I have tried it in our project with docker-compose, with the schema as shown below:
pgmaster (primary node1)  --|
|- pgslave1 (node2)       --|
|  |- pgslave2 (node3)    --|----pgpool (master_slave_mode stream)----client
|- pgslave3 (node4)       --|
|- pgslave4 (node5)       --|
I have tested the following scenarios, and they all work very well:
Replication: changes made at the primary (i.e., master) node will be replicated to all standby (i.e., slave) nodes
Failover: stop the primary node, and a standby node (e.g., node4) will automatically take over the primary role.
Prevention of two primary nodes: resurrect the previous primary node (node1); node4 will continue as the primary node, while node1 comes back in sync but as a standby node.
As for the client application, these changes are all transparent. The client just points to the pgpool node, and keeps working fine in all the aforementioned scenarios.
Note: in case you have problems getting PostDock up and running, you could try my forked version of PostDock.
Pgpool-II with Watchdog
A problem with the aforementioned architecture is that pgpool is the single point of failure. So I have also tried enabling Watchdog for pgpool-II with a delegated virtual IP, so as to avoid the single point of failure.
master (primary node1)  --\
|- slave1 (node2)       ---\     / pgpool1 (active)  \
|  |- slave2 (node3)    ----|---|                     |----client
|- slave3 (node4)       ---/     \ pgpool2 (standby) /
|- slave4 (node5)       --/
I have tested the following scenarios, and they all work very well:
Normal scenario: both pgpools start up, with the virtual IP automatically applied to one of them, in my case, pgpool1
Failover: shut down pgpool1. The virtual IP will automatically move to pgpool2, which then becomes active.
Starting the failed pgpool: start pgpool1 again. The virtual IP stays with pgpool2, and pgpool1 now works as a standby.
As for the client application, these changes are all transparent. The client just points to the virtual IP, and keeps working fine in all the aforementioned scenarios.
You can find this project at my GitHub repository on the watchdog branch.
Kubernetes's StatefulSet is a good base for setting up a stateful service. You will still need some work to configure the correct membership among the PostgreSQL replicas.
Kubernetes has an example of this: http://blog.kubernetes.io/2017/02/postgresql-clusters-kubernetes-statefulsets.html
You can look at one of the open-source PostgreSQL tools below:
1. Crunchy Data PostgreSQL
2. Patroni

Using celeryd as a daemon with multiple django apps?

I'm just starting to use django-celery and I'd like to set up celeryd to run as a daemon. The instructions, however, appear to suggest that it can be configured for only one site/project at a time. Can celeryd handle more than one project, or only one? And if so, is there a clean way to set up celeryd to be automatically started for each configuration, without requiring me to create a separate init script for each one?
Like all interesting questions, the answer is it depends. :)
It is definitely possible to come up with a scenario in which celeryd can be used by two independent sites. If multiple sites are submitting tasks to the same exchange, and the tasks do not require access to any specific database -- say, they operate on email addresses, or credit card numbers, or something other than a database record -- then one celeryd may be sufficient. Just make sure that the task code is in a shared module that is loaded by all sites and the celery server.
Usually, though, you'll find that celery needs access to the database -- either it loads objects based on the ID that was passed as a task parameter, or it has to write some changes to the database, or, most often, both. And multiple sites / projects usually don't share a database, even if they share the same apps, so you'll need to keep the task queues separate.
In that case, what will usually happen is that you set up a single message broker (RabbitMQ, for example) with multiple exchanges. Each exchange receives messages from a single site. Then you run one or more celeryd processes somewhere for each exchange (in the celery config settings, you have to specify the exchange. I don't believe celeryd can listen to multiple exchanges). Each celeryd server knows its exchange, the apps it should load, and the database that it should connect to.
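For illustration, a minimal sketch of that kind of per-site setup in one project's settings (the queue, exchange and broker names are made up, and the old-style Celery 3.x / django-celery setting names are assumed):

# settings.py for site 1
from kombu import Exchange, Queue

BROKER_URL = "amqp://guest:guest@rabbitmq-host//"

CELERY_DEFAULT_QUEUE = "site1"
CELERY_QUEUES = (
    Queue("site1", Exchange("site1"), routing_key="site1"),
)

# Site 1's worker then only consumes from its own queue, e.g.:
#   python manage.py celeryd -Q site1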
To manage these, I would suggest looking into cyme -- it's by @asksol, and manages multiple celeryd instances, on multiple servers if necessary. I haven't tried it, but it looks like it should handle different configurations for different instances.
I did not try this, but with Celery 3.1.x, which does not need django-celery, according to the documentation you can instantiate a Celery app like this:
from celery import Celery
from django.conf import settings

app1 = Celery('app1')
app1.config_from_object('django.conf:settings')
app1.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

@app1.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
But you can use celery multi to launch several workers, each with its own configuration; you can see examples here. So you can launch several workers with different --app appX parameters, so each will use different tasks and settings:
# 3 workers: Two with 3 processes, and one with 10 processes.
$ celery multi start 3 -c 3 -c:1 10
celery worker -n celery1@myhost -c 10 --config celery1.py --app app1
celery worker -n celery2@myhost -c 3 --config celery2.py --app app2
celery worker -n celery3@myhost -c 3 --config celery3.py --app app3