I want to create a custom layer with custom Chef recipes, but I don't want to include the built-in Chef recipes that install MySQL etc. Is it possible to do that?
Yes.
On a Chef 11.10 stack: unless you are using a MySQL layer in the same stack, it won't install MySQL (ref: the OpsWorks recipes).
On older stacks:
You need to alter this behaviour by modifying the specific MySQL client recipe. That's just not a great idea, since you may miss out on MySQL improvements that the AWS team might make.
There is no way to prevent the AWS built-in recipes from running. OpsWorks runs these recipes in a separate Chef run; custom cookbooks are loaded only after that Chef run concludes.
You can create a Custom Layer, which will minimize the built-in recipes that are included. Prevent OpsWorks from using its built-in mysql recipe by NOT associating any RDS or other database resource with the stack.
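A custom layer's recipes are just ordinary Chef code assigned to the layer's lifecycle events. As a minimal sketch, where the cookbook name, package, and template are hypothetical placeholders:

```ruby
# cookbooks/myapp/recipes/default.rb
# A minimal custom-layer recipe. 'myapp', the nginx package,
# and the template are placeholder examples.
package 'nginx'

service 'nginx' do
  action [:enable, :start]
end

template '/etc/nginx/conf.d/myapp.conf' do
  source 'myapp.conf.erb'
  notifies :restart, 'service[nginx]'
end
```

You would then assign myapp::default to the custom layer's Setup event in the OpsWorks console.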
The best you can do is to use a Custom Layer with a cookbook I created called tabula-rasa. This cookbook allows you to run any recipes in an isolated environment, preventing the built-in OpsWorks cookbooks from clashing with community cookbooks of the same name. https://github.com/shlomoswidler/tabula_rasa This doesn't prevent the OpsWorks built-in recipes from running, but it's the closest we can get in OpsWorks today.
I have set up a Kubernetes cluster using Kubernetes Engine on GCP to work on some data preprocessing and modelling using Dask. I installed Dask using Helm following these instructions.
Right now, I see that there are two folders, work and examples.
I was able to execute the contents of the notebooks in the example folder confirming that everything is working as expected.
My questions now are as follows:
What is the suggested workflow to follow when working on a cluster? Should I just create a new notebook under work and begin prototyping my data preprocessing scripts?
How can I ensure that my work doesn't get erased whenever I upgrade my Helm deployment? Would you just manually move the notebooks to a bucket every time you upgrade (which seems tedious)? Or would you create a simple VM instance, prototype there, then move everything to the cluster when running on the full dataset?
I'm new to working with data in a distributed environment in the cloud so any suggestions are welcome.
What is the suggested workflow to follow when working on a cluster?
There are many workflows that work well for different groups. There is no single blessed workflow.
Should I just create a new notebook under work and begin prototyping my data preprocessing scripts?
Sure, that would be fine.
How can I ensure that my work doesn't get erased whenever I upgrade my Helm deployment?
You might save your data to some more permanent store, like cloud storage, or a git repository hosted elsewhere.
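For example, here is a minimal sketch of persisting results to Google Cloud Storage from a notebook; it assumes the gcsfs package is installed and the bucket name is a placeholder:

```python
import dask.dataframe as dd

# Read from the pod's ephemeral disk (lost on upgrade)
df = dd.read_csv("data/*.csv")
cleaned = df.dropna()

# Write results to GCS so they survive a `helm upgrade`;
# "my-dask-work" is a hypothetical bucket and requires gcsfs.
cleaned.to_parquet("gcs://my-dask-work/cleaned/")
```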
Would you just manually move the notebooks to a bucket every time you upgrade (which seems tedious)?
Yes, that would work (and yes, it is tedious).
Or would you create a simple VM instance, prototype there, then move everything to the cluster when running on the full dataset?
Yes, that would also work.
In Summary
The Helm chart includes a Jupyter notebook server for convenience and easy testing, but it is no substitute for a full-fledged, long-term, persistent productivity suite. For that you might consider a project like JupyterHub (which handles the problems you list above) or one of the many enterprise-targeted variants on the market today. It would be easy to use Dask alongside any of those.
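Whatever notebook environment you settle on, connecting it to the Helm-deployed Dask cluster is just a matter of pointing a client at the scheduler. A sketch, assuming the scheduler service is reachable as dask-scheduler (check `kubectl get services` for the actual name in your release):

```python
from dask.distributed import Client

# "dask-scheduler" and port 8786 are the usual defaults, not guarantees.
client = Client("tcp://dask-scheduler:8786")
print(client)  # reports the connected workers and cores
```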
We have a Cloud Foundry app that is bound to a PostgreSQL service. Right now we have to manually connect to the PostgreSQL database with pgAdmin, and then manually run the queries to create our tables.
Attempted solution:
Do a Cloud Foundry run-task in which I would:
1) Install psql and connect to the remote database
2) Create the tables
The problem I ran into was that cf run-task runs with limited permissions and cannot install packages.
What is the best way to automate database table creation for a cloud-foundry application?
Your application will run as a non-root user, so it will not have the ability to install packages, at least in the traditional way. If you want to install a package, you can use the Apt Buildpack to install it. This will install the package, but into a location that does not require root access. It then adjusts your environment variables so that binaries & libraries can be found properly.
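With the Apt Buildpack you declare the packages in an apt.yml at the root of your app. A sketch; the package choice is just an example for getting psql:

```yaml
---
packages:
  - postgresql-client
```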
Also keep in mind that tasks are associated with an application (they both use the same droplet), so to make this work you'd need to do one of two things:
1.) Use multi-buildpacks to run the Apt Buildpack plus your standard buildpack. This will produce a droplet that has both your required packages and your app bits. Then you can start your app and kick off tasks to set up the DB (see the manifest sketch after this list).
2.) Use two separate apps. One for your actual app and one for your code that seeds the database.
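For option 1, a minimal manifest.yml sketch; the app name and the second buildpack are assumptions, so use whatever buildpack your app actually needs:

```yaml
---
applications:
- name: my-app
  buildpacks:
    - https://github.com/cloudfoundry/apt-buildpack
    - python_buildpack
```

After `cf push`, you could seed the database with something like `cf run-task my-app --command "psql $DATABASE_URL -f seed.sql"` (recent cf CLI syntax; `DATABASE_URL` is an assumption, since on Cloud Foundry you'd normally pull the credentials out of `VCAP_SERVICES`).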
Either one should work; both are valid ways to seed your database. The other option, which is what I typically do, is to use some sort of tool for this. Some frameworks, like Rails, have this built in. If your framework does not, you could bring your own tool, like Flyway. These tools often also help with the evolution of your DB schema, which can be useful too.
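To give a flavour of the Flyway approach: migrations are plain SQL files whose version-prefixed names Flyway tracks in a schema history table. A sketch, where the table itself is just an example:

```sql
-- sql/V1__create_tables.sql
-- Flyway applies this once and records it in its schema history table.
CREATE TABLE users (
    id   SERIAL PRIMARY KEY,
    name TEXT   NOT NULL
);
```

Running `flyway -url=jdbc:postgresql://<host>:5432/<db> -user=<user> -password=<password> migrate` then applies any migrations that have not yet been recorded.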
I am working on an application that has source code stored in GitHub, build and test is done by CodeShip, and hosting is done in Amazon Elastic Beanstalk.
I'm at a point where seed data is needed on the development database (PostgreSQL in Amazon RDS) and it is changing regularly in development.
I'd like to execute several SQL statements that are stored in GitHub when a deployment takes place. I haven't found a way to do this with the tools we're using, so I'm wondering if there are some alternatives.
If these are the same SQL statements every time, then you can simply create an .ebextensions config (see the documentation) that will execute them after each deploy.
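A sketch of such a config, assuming a seed.sql file is bundled with your app source; the file name is an example, the package name may vary by platform version, and the RDS_* variables are the ones Elastic Beanstalk injects when an RDS instance is attached to the environment:

```yaml
# .ebextensions/seed.config
packages:
  yum:
    postgresql: []  # provides the psql client on Amazon Linux
container_commands:
  01_seed_database:
    command: "PGPASSWORD=$RDS_PASSWORD psql -h $RDS_HOSTNAME -p $RDS_PORT -U $RDS_USERNAME -d $RDS_DB_NAME -f seed.sql"
    leader_only: true  # run on only one instance per deployment
```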
If the SQL statements are dynamic per deploy, then I'd recommend a database migrations management tool. I'm familiar with Rails, which has this by default, but there's also a standalone migrations tool for non-Rails projects. Google can suggest many other options.
I'm using Fedora and I deploy Symfony projects on my local machine using virtual hosts. How can I deploy my projects to a public server so that others can view them from their machines?
Thanks!
You have several ways to deploy your symfony project. I will leave aside FTP, svn up on prod, etc. So, here are two good ways.
The built-in deploy task
Symfony comes with a built-in deploy task that was commonly used when symfony 1.4 was released. I think it's less and less used now (because there are better tools).
The simplest way to deploy your website is to use the built-in project:deploy task. It uses SSH and rsync to connect and transfer the files from one computer to another.
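Roughly, you describe the target server in config/properties.ini and then run the task; the connection details below are placeholders:

```ini
[production]
  host=www.example.com
  port=22
  user=deployer
  dir=/var/www/myproject/
```

Running `php symfony project:deploy production` performs a dry run; add the `--go` option to actually transfer the files.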
Using capifony, which uses Capistrano
Capistrano is an open source tool for running scripts on multiple servers. Its primary use is for easily deploying applications.
capifony is a deployment recipes collection that works with both symfony and Symfony2 applications.
This way is far better than the previous one because you can automate many scripts when deploying (like testing your code, starting from a freshly built lib, upgrading the database, sharing config files). But the most important one (from my POV) is that you can easily roll back a bad deployment. It's damn easy.
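A sketch of the basic capifony workflow; paths and stage names are whatever you configure in config/deploy.rb:

```bash
gem install capifony
cd myproject
capifony .            # generates the Capfile and config/deploy.rb
cap deploy:setup      # prepares the directory structure on the server
cap deploy            # deploys the current code as a new release
cap deploy:rollback   # points the symlink back at the previous release
```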
I'm currently looking into Orchard CMS to use for my new projects. With other CMS systems that I use, information related to new functionality is sometimes stored in the database (data, configuration, language items). Deploying this functionality to a production site (already running with its own database etc.) is done using packages which "install" the data in the production database.
How is this done using Orchard? Or is all functionality file-based and can it be easily deployed using XCopy when a site is already running in production?
There is an import/export feature that you can use to transfer data and settings between Orchard instances.