I have a dotcloud service which I want to configure such that it shuts down and destroys itself after it has completed its task.
This should mean I am not charged for more server time than I actually require.
The obvious way to do this would be from the dotcloud CLI, but this is not installed on dotcloud instances. Also, the dotcloud user does not have the privileges to run the shutdown command.
Is there a simple way to do this, or would I need to deploy a custom service which installs the dotcloud CLI and from that can then destroy itself?
There is no official way for a service to destroy itself, but you could, as you've described, install the CLI and your credentials (I'd suggest using your API key in an environment variable set via dotcloud env set) and then script the dotcloud destroy -A <app name> <servicename> call.
A more efficient approach would be to have a permanent worker and keep feeding it jobs to do. The dotCloud platform is best suited for applications which have relatively consistent RAM needs, since we don't offer autoscaling.
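As a rough illustration of the scripted destroy call mentioned above, here is a minimal Python sketch. It assumes the dotcloud CLI and your credentials are already installed on the service. The DOTCLOUD_APP and DOTCLOUD_SERVICE variable names are hypothetical, and the sketch does not handle any confirmation prompt the CLI might show.

```python
import os
import subprocess


def build_destroy_command(app_name, service_name):
    """Return the CLI call quoted in the answer: dotcloud destroy -A <app> <service>."""
    return ["dotcloud", "destroy", "-A", app_name, service_name]


def destroy_self(run=subprocess.run, environ=os.environ):
    # DOTCLOUD_APP and DOTCLOUD_SERVICE are hypothetical variable names,
    # assumed to have been set beforehand (for example with `dotcloud env set`).
    command = build_destroy_command(environ["DOTCLOUD_APP"], environ["DOTCLOUD_SERVICE"])
    # check=True raises CalledProcessError if the CLI exits with a non-zero status.
    return run(command, check=True)
```

You would call destroy_self() as the very last step of the task, since nothing after it can be relied on to run.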
What is the best way to trigger a PHP script on all running Fargate tasks for an ECS service?
I need to trigger this from another PHP script on one of the ECS tasks.
The reason I need to do this is that I have NGINX FastCGI Cache on these individual ECS tasks and need to purge the cache for all tasks when an admin makes an update in the CMS.
My NGINX configuration has a /purge/[path] endpoint that will purge the cache using fastcgi_cache_purge, and I do currently have a somewhat working hacky solution that looks something like this:
Admin saves in the CMS
PHP snippet runs that:
a) Fetches the number of running ECS tasks using AWS PHP SDK
b) Runs a for loop for the number of ECS tasks running
c) Calls the /purge/[path] URL using cURL (tries multiple times as it sometimes hits the same ECS task)
The above works but is not optimal.
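One possible way to avoid the retry loop in step (c) is to call each task directly instead of going through the load balancer. The sketch below is in Python purely for illustration (the same fields appear in the DescribeTasks response of the PHP SDK). It assumes the tasks use awsvpc networking, that the purging task can reach the others on their private IPs, and that NGINX listens on port 80. The input is the response of a DescribeTasks call.

```python
def task_private_ips(describe_tasks_response):
    """Collect the private IPv4 addresses found in an ECS DescribeTasks response."""
    ips = []
    for task in describe_tasks_response.get("tasks", []):
        for attachment in task.get("attachments", []):
            # Tasks with awsvpc networking expose their network interface here.
            if attachment.get("type") != "ElasticNetworkInterface":
                continue
            for detail in attachment.get("details", []):
                if detail.get("name") == "privateIPv4Address":
                    ips.append(detail["value"])
    return ips


def purge_urls(ips, path, port=80):
    # The /purge/<path> layout mirrors the endpoint described in the question;
    # the port is an assumption.
    return ["http://{}:{}/purge/{}".format(ip, port, path.lstrip("/")) for ip in ips]
```

Each URL returned by purge_urls can then be requested once, so every task is purged exactly once regardless of how the load balancer routes traffic.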
Here are some other solutions that come to mind, but I can't find much information online on how to implement:
Would it be possible perhaps to change the fastcgi_cache_path to a shared file system like AWS EFS, or does that hurt performance not being tmpfs? I see that there's tmpfs support for ECS, but once mounted, would it be shared across the multiple ECS tasks or individually per ECS task?
Using AWS SNS/SQS (write one PHP script that subscribes and another one that publishes an event)
Use Redis pub/sub similar to above (also not sure exactly how to implement and how to start a long-running subscriber when ECS tasks start)
Use ECS Exec and AWS PHP SDK (almost have a working solution that fetches all ECS tasks and loops them through, and executes an "ECS Exec" command, but "--non-interactive" is not available yet, so it doesn't work. Only "--interactive" mode works currently)
Is there an easier/better solution for this? If using any of the above implementations, can someone put me in the right direction on how to implement this using PHP?
Thanks!
This honestly seems like a duplicate of your previous question, but I'll answer:
Would it be possible perhaps to change the fastcgi_cache_path to a shared file system like AWS EFS, or does that hurt performance not
being tmpfs? I see that there's tmpfs support for ECS, but once
mounted, would it be shared across the multiple ECS tasks or
individually per ECS task?
EFS wouldn't be a good option for this because EFS is SLOW. It would kill any performance benefit of having a cache.
A tmpfs on Fargate will not be shared among instances.
Using AWS SNS/SQS (write one PHP script that subscribes and another one that publishes an event)
SNS could work, as discussed in your previous question. SQS would not work at all, since an SQS message is only delivered to one consumer, and you want it to be delivered to all instances.
Use Redis pub/sub similar to above (also not sure exactly how to implement and how to start a long-running subscriber when ECS tasks
start)
Yes this would work, as discussed in your previous question. But of course would require a lot of custom coding.
Use ECS Exec and AWS PHP SDK (almost have a working solution that fetches all ECS tasks and loops them through, and executes an "ECS
Exec" command, but "--non-interactive" is not available yet, so it
doesn't work. Only "--interactive" mode works currently)
This should work. Why do you state that --non-interactive is not available yet? If you just leave off the --interactive parameter, then the command is executed non-interactively.
I have a Node.js web server which, as part of a CD process, I want to deploy to a staging server using Azure Release Pipeline. The problem is, that if I just run a Powershell script:
# Run-Server.ps1
node my-server.js
The pipeline will hang, since the node process blocks the PowerShell session.
What I want is to be able to launch the service, and then in the next deployment just kill the node process and run it again with the new code.
So I figured I'll use Start-Process. If I run it locally:
> Start-Process node -ArgumentList ./server.js
I can now exit the Powershell session and the server will continue running. So I thought I can implement it the same way in my Release Pipeline.
But it turns out that once the Release Pipeline ends running, the server is no longer available - the node process is gone.
Can you help me figure out why that is? Is there another way of achieving this? I suppose it's a pretty common use case, so there must be best practices out there regarding how this should be done.
Another way to achieve this is to use a full-blown web server to host and manage the node process. For example, on Windows you could use IIS with the iisnode module. This is more reliable and gives you a few other benefits:
process management (automatic start, restart on failure, etc.)
security - you can configure the user that node process will run as
scalability on multi-core CPUs
Then the process of app deployment would be just copying files to the right directory - the web server should pick up the change automatically.
By default, a pipeline job cleans up all of the child processes it spins up when it exits. This is killing your node server.
Set the Process.Clean variable to false to override the default behavior.
I have a python app that builds a dataset for a machine learning task on GCP.
Currently I have to start an instance of a VM that we have, and then SSH in, and run the app, which will complete in 2-24 hours depending on the size of the dataset requested.
Once the dataset is complete the VM needs to be shutdown so we don't incur additional charges.
I am looking to streamline this process as much as possible, so that we have a "1 click" or "1 command" solution, but I'm not sure the best way to go about it.
From what I've read about so far it seems like containers might be a good way to go, but I'm inexperienced with docker.
Can I setup a container that will pip install the latest app from our private GitHub and execute the dataset build before shutting down? How would I pass information to the container such as where to get the config file etc? It's conceivable that we will have multiple datasets being generated at the same time based on different config files.
Is there a better gcloud feature that suits our purpose more effectively than containers?
I'm struggling to get information regarding these basic questions, it seems like container tutorials are dominated by web apps.
It would be useful to have a batch-like container service that runs a container until its process completes. I'm unsure whether such a service exists. I'm most familiar with Google Cloud Platform and this provides a wealth of compute and container services. However -- to your point -- these predominantly scale by (HTTP) requests.
One possibility may be Cloud Run and to trigger jobs using Cloud Pub/Sub. I see there's async capabilities too and this may be interesting (I've not explored).
Another runtime for you to consider is Kubernetes itself. While Kubernetes requires some overhead in having Google, AWS or Azure manage a cluster for you (I strongly recommend you don't run Kubernetes yourself) and some inertia in the capacity of the cluster's nodes vs. the needs of your jobs, as you scale the number of jobs, you will smooth these needs. A big advantage with Kubernetes is that it will scale (nodes|pods) as you need them. You tell Kubernetes to run X container jobs, it does it (and cleans-up) without much additional management on your part.
I'm biased and approach the container vs image question mostly from a perspective of defaulting to container-first. In this case, you'd receive several benefits from containerizing your solution:
reproducible: the same image is more likely to produce the same results
deployability: container run vs. manage OS, app stack, test for consistency etc.
maintainable: smaller image representing your app, less work to maintain it
One (beneficial!?) workflow change if you choose to use containers is that you will need to build your images before using them. Something like Knative combines these steps but, I'd stick with doing-this-yourself initially. A common solution is to trigger builds (Docker, GitHub Actions, Cloud Build) from your source code repo. Commonly you would run tests against the images that are built but you may also run your machine-learning tasks this way too.
Your containers would contain only your code. When you build your container images, you would pip install, perhaps pip install --requirement requirements.txt, to pull the appropriate packages. Your data (models?) are better kept separate from your code when this makes sense. When your runtime platform runs containers for you, you provide configuration information (environment variables and|or flags) to the container.
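To make the "environment variables and|or flags" idea concrete, here is a minimal Python sketch of how the app inside the container might pick up its config location. The --config flag and the DATASET_CONFIG variable are hypothetical names chosen for this example, not anything your app already defines.

```python
import argparse
import os


def resolve_config_path(argv=None, environ=os.environ):
    """Return the config location from --config, falling back to DATASET_CONFIG."""
    parser = argparse.ArgumentParser(description="Build a dataset from a config file")
    # DATASET_CONFIG and --config are hypothetical names chosen for this sketch.
    parser.add_argument("--config", default=environ.get("DATASET_CONFIG"))
    args = parser.parse_args(argv)
    if not args.config:
        # Exits with a usage message when neither source provides a value.
        parser.error("pass --config or set DATASET_CONFIG")
    return args.config
```

With this in place, each dataset build could be started with a different value, for example by passing the variable with docker run -e or by appending --config to the container's command, which also covers running several builds at once from different config files.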
The use of a startup script seems to better fit the bill compared to containers. The instance always executes startup scripts as root, thus you can do anything you like, as the command will be executed as root.
A startup script will perform automated tasks every time your instance boots up. Startup scripts can perform many actions, such as installing software, performing updates, turning on services, and any other tasks defined in the script.
Keep in mind that a startup script cannot stop an instance but you can stop an instance through the guest operating system.
This would be the ideal solution for the question you posed. It would require a small change in your Python app so that it shuts down the operating system when the dataset is complete.
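A minimal sketch of that change, assuming a Linux guest where the process is allowed to run shutdown (startup scripts run as root, so sudo is harmless there). The build argument stands in for whatever function currently builds your dataset.

```python
import subprocess


def run_then_shutdown(build, run=subprocess.run):
    """Run the dataset build, then power off the VM from inside the guest OS."""
    try:
        build()
    finally:
        # Runs whether the build succeeded or failed, so a crashed job does not
        # leave the VM running. Assumes the process may call shutdown via sudo.
        run(["sudo", "shutdown", "-h", "now"], check=False)
```

Putting the shutdown in a finally block is a design choice: it trades the ability to inspect a failed VM for the guarantee that the instance stops either way, so you may want to upload logs before that point.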
Q1) Can I setup a container that will pip install the latest app from our private GitHub and execute the dataset build before shutting down?
A1) Medium has a great article on installing a package from a private git repo inside a container. You can execute the dataset build before shutting down.
Q2) How would I pass information to the container such as where to get the config file etc?
A2) You can use ENV to set an environment variable. These will be available within the container.
You may want to look into the Docker documentation for more information about containers.
I'm starting to play around with NixOS deployments. To that end, I have a repo with some packages defined, and a configuration.nix for the server.
It seems like I should then be able to test this configuration locally (I'm also running NixOS). I imagine it's a bad idea to change my global configuration.nix to point to the deployment server's configuration.nix (who knows what that will break); but is there a safe and convenient way to "try out" the server locally - i.e. build it and either boot into it or, better, start it as a separate process?
I can see docker being one way, of course; maybe there's nothing else. But I have this vague sense Nix could be capable of doing it alone.
There is a fairly standard way of doing this that is built into the default system.
Namely nixos-rebuild build-vm. This will take your current configuration file (by default /etc/nixos/configuration.nix), build it, and create a script that lets you boot the configuration in a virtual machine.
Once the command has finished, it will leave a symlink named result in the current directory. You can then boot by running ./result/bin/run-$HOSTNAME-vm, which will start your virtual machine for you to play around with.
TLDR;
nixos-rebuild build-vm
./result/bin/run-$HOSTNAME-vm
nixos-rebuild build-vm is the easiest way to do this; however, you could also import the configuration into a NixOS container (see Chapter 47. Container Management in the NixOS manual and the nixos-container command).
This would be done with something like:
containers.mydeploy = {
  privateNetwork = true;
  config = import ../mydeploy-configuration.nix;
};
Note that you would not want to specify the network configuration in mydeploy-configuration.nix if it's static as that could cause conflicts with the network subnet created for the container.
As you may already know, system configurations can coexist without any problems in the Nix store. The problem here is running more than one system at once. For this, you need an isolation or virtualization tool like Docker, VirtualBox, etc.
NixOS Containers
NixOS provides an efficient implementation of the container concept, backed by systemd-nspawn instead of an image-based container runtime.
These can be specified declaratively in configuration.nix or imperatively with the nixos-container command if you need more flexibility.
Docker
Docker was not designed to run an entire operating system inside a container, so it may not be the best fit for testing NixOS-based deployments, which expect and provide systemd and some services inside their units of deployment. While you won't get a good NixOS experience with Docker, Nix and Docker are a good fit.
UPDATE: Both 'raw' Nix packages and NixOS run in Docker. For example, Arion supports images from plain Nix, NixOS modules and 'normal' Docker images.
NixOps
To deploy NixOS inside NixOS it is best to use a technology that is designed to run a full Linux system inside.
It helps to have a program that manages the integration for you. In the Nix ecosystem, NixOps is the first candidate for this. You can use NixOps with its multiple backends, such as QEMU/KVM, VirtualBox, the (currently experimental) NixOS container backend, or you can use the none backend to deploy to machines that you have created using another tool.
Here's a complete example of using NixOps with QEMU/KVM.
Tests
If your goal is to run automated integration tests, you can make use of the NixOS VM testing framework. This uses Linux KVM virtualization (expose /dev/kvm in sandbox) to run integration tests on networks of virtual machines, and it runs them as a derivation. It is quite efficient because it mounts the Nix store in the VM instead of creating virtual machine images. These tests are "built" like any other derivation, making them easy to run.
Nix store optimization
A unique feature of Nix is that you can often reuse the host Nix store, so being able to mount a host filesystem in the container/VM is a nice feature to have in your solution. If you are creating your own solution, depending on your needs, you may want to postpone this optimization, because it becomes a bit more involved if you want the container/VM to be able to modify the store. NixOS tests solve this with an overlay file system in the VM. Another approach may be to bind mount the Nix store and forward the Nix daemon socket.
I'm using Amazon Web Services to create an autoscaling group of application instances behind an Elastic Load Balancer. I'm using a CloudFormation template to create the autoscaling group + load balancer and have been using Ansible to configure other instances.
I'm having trouble wrapping my head around how to design things such that when new autoscaling instances come up, they can automatically be provisioned by Ansible (that is, without me needing to find out the new instance's hostname and run Ansible for it). I've looked into Ansible's ansible-pull feature but I'm not quite sure I understand how to use it. It requires a central git repository which it pulls from, but how do you deal with sensitive information which you wouldn't want to commit?
Also, the current way I'm using Ansible with AWS is to create the stack using a CloudFormation template, then I get the hostnames as output from the stack, and then generate a hosts file for Ansible to use. This doesn't feel quite right – is there "best practice" for this?
Yes, another way is just to simply run your playbooks locally once the instance starts. For example you can create an EC2 AMI for your deployment that in the rc.local file (Linux) calls ansible-playbook -i <inventory-only-with-localhost-file> <your-playbook>.yml. rc.local is almost the last script run at startup.
You could just store that sensitive information in your EC2 AMI, but this is a very wide topic and really depends on what kind of sensitive information it is. (You can also use private git repositories to store sensitive data).
If, for example, your playbooks get updated regularly, you can create a cron entry in your AMI that runs your playbook every so often to make sure your instance configuration is always up to date. This avoids having to "push" from a remote workstation.
This is just one approach. There could be many others, and it depends on what kind of service you are running, what kind of data you are using, etc.
I don't think you should use Ansible to configure new auto-scaled instances. Instead use Ansible to configure a new image, of which you will create an AMI (Amazon Machine Image), and order AWS autoscaling to launch from that instead.
On top of this, you should also use Ansible to easily update your existing running instances whenever you change your playbook.
Alternatives
There are a few ways to do this. First, I wanted to cover some alternative ways.
One option is to use Ansible Tower. This creates a dependency though: your Ansible Tower server needs to be up and running at the time autoscaling or similar happens.
The other option is to use something like packer.io and build fully-functioning server AMIs. You can install all your code into these using Ansible. This doesn't have any non-AWS dependencies, and has the advantage that it means servers start up fast. Generally speaking building AMIs is the recommended approach for autoscaling.
Ansible Config in S3 Buckets
The alternative route is a bit more complex, but has worked well for us when running a large site (millions of users). It's "serverless" and only depends on AWS services. It also supports multiple Availability Zones well, and doesn't depend on running any central server.
I've put together a GitHub repo that contains a fully-working example with Cloudformation. I also put together a presentation for the London Ansible meetup.
Overall, it works as follows:
Create S3 buckets for storing the pieces that you're going to need to bootstrap your servers.
Save your Ansible playbook and roles etc in one of those S3 buckets.
Have your Autoscaling process run a small shell script. This script fetches things from your S3 buckets and uses it to "bootstrap" Ansible.
Ansible then does everything else.
All secret values such as Database passwords are stored in CloudFormation Parameter values. The 'bootstrap' shell script copies these into an Ansible fact file.
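To illustrate the fact-file step, here is a minimal sketch. The original bootstrap is a shell script, so this is a Python stand-in for the idea, not the code from the repo. It relies on Ansible's local facts mechanism, where JSON files ending in .fact under /etc/ansible/facts.d become available as ansible_local.<name>. The file name bootstrap and the keys you pass in are hypothetical.

```python
import json
import os


def write_fact_file(values, name="bootstrap", facts_dir="/etc/ansible/facts.d"):
    """Write values as JSON to <facts_dir>/<name>.fact and return the path."""
    os.makedirs(facts_dir, exist_ok=True)
    path = os.path.join(facts_dir, name + ".fact")
    # A newly created file gets mode 0600 because it may hold secrets
    # such as database passwords.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        json.dump(values, handle)
    return path
```

A playbook that gathers facts could then refer to a value written this way as ansible_local.bootstrap.db_password, assuming db_password was one of the keys.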
So that you're not dependent on external services being up you also need to save any build dependencies (eg: any .deb files, package install files or similar) in an S3 bucket. You want this because you don't want to require ansible.com or similar to be up and running for your Autoscale bootstrap script to be able to run. Generally speaking I've tried to only depend on Amazon services like S3.
In our case, we then also use AWS CodeDeploy to actually install the Rails application itself.
The key bits of the config relating to the above are:
S3 Bucket Creation
Script that copies things to S3
Script to copy Bootstrap Ansible. This is the core of the process. This also writes the Ansible fact files based on the CloudFormation parameters.
Use the Facts in the template.