I'm trying to launch several instances of Moodle on a Kubernetes-like container platform to improve performance and make my installation reliable. I came across the following requirement:
$CFG->dataroot This MUST be a shared directory where each cluster node
is accessing the files directly. It must be very reliable,
administrators cannot manipulate files directly.
Which tool can be used to transparently sync this directory across several containers? What is the best way to meet this requirement?
I successfully resolved the issue by using the ObjectFS plugin for S3 storage and moving sessions to the database instead of the file system.
I have set up a Kubernetes cluster using Kubernetes Engine on GCP to work on some data preprocessing and modelling using Dask. I installed Dask using Helm following these instructions.
Right now, I see that there are two folders, work and examples.
I was able to execute the contents of the notebooks in the examples folder, confirming that everything is working as expected.
My questions now are as follows:
What is the suggested workflow to follow when working on a cluster? Should I just create a new notebook under work and begin prototyping my data preprocessing scripts?
How can I ensure that my work doesn't get erased whenever I upgrade my Helm deployment? Would you just manually move them to a bucket every time you upgrade (which seems tedious)? Or would you create a simple VM instance, prototype there, then move everything to the cluster when running on the full dataset?
I'm new to working with data in a distributed environment in the cloud, so any suggestions are welcome.
What is the suggested workflow to follow when working on a cluster?
There are many workflows that work well for different groups. There is no single blessed workflow.
Should I just create a new notebook under work and begin prototyping my data preprocessing scripts?
Sure, that would be fine.
How can I ensure that my work doesn't get erased whenever I upgrade my Helm deployment?
You might save your data to some more permanent store, like cloud storage, or a git repository hosted elsewhere.
Would you just manually move them to a bucket every time you upgrade (which seems tedious)?
Yes, that would work (and yes, it is tedious).
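If you do go the bucket route, a small helper script makes it less tedious. Here is a minimal sketch, assuming the google-cloud-storage client and default credentials are available on the notebook server; the bucket name is hypothetical:
from pathlib import Path
from google.cloud import storage  # pip install google-cloud-storage

def backup_work(bucket_name="my-dask-work", work_dir="work"):
    # Upload every file under the work directory, keeping the relative path as the object name.
    bucket = storage.Client().bucket(bucket_name)
    for path in Path(work_dir).rglob("*"):
        if path.is_file():
            bucket.blob(path.as_posix()).upload_from_filename(str(path))

backup_work()  # run this before helm upgrade so nothing under work/ is lost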
Or would you create a simple VM instance, prototype there, then move everything to the cluster when running on the full dataset?
Yes, that would also work.
In Summary
The Helm chart includes a Jupyter notebook server for convenience and easy testing, but it is no substitute for a full-fledged, long-term persistent productivity suite. For that you might consider a project like JupyterHub (which handles the problems you list above) or one of the many enterprise-targeted variants on the market today. It would be easy to use Dask alongside any of those.
We are running out of space on our D:\Temporary Storage drive on our Service Fabric cluster VMs (5 nodes). I have conversed back and forth with MS support about what is safe to delete from this drive, and the answers I'm getting are ambiguous at best.
I've noted that we have many older versions of our applications and services on the VMs that we don't need anymore. Getting rid of these will definitely help free up space. I've asked MS support if it's safe to delete the old versions of the applications and they said yes, but then directed me to these links:
https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-deploy-remove-applications#remove-an-application-package-from-the-image-store
https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-deploy-remove-applications#remove-an-application
https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-deploy-remove-applications#unregister-an-application-type
So the three sections we have are:
Remove an application package from the image store
Remove an application
Unregister an application type
These all deal with PowerShell scripts that need to be run, and I am a novice with PowerShell. I have direct RDP access to the VMs and the ability to simply delete the files via Windows File Explorer. Is it OK to do it this way, or do I need to go the PowerShell route for deleting and unregistering the applications? At least for #1, removing the application package from the image store, there shouldn't be any issue with me just deleting that from Windows File Explorer on the VMs, correct?
EDIT: this is not a duplicate of Run out of storage on Service Fabric scale set:
I am asking about manually clearing space on the Service Fabric cluster VMs; the above thread is about setting up your application deployment to auto-delete old versions of applications. These are not duplicates.
You shouldn't delete files manually from within the VM; Service Fabric should handle it, and deleting them yourself may cause issues.
The right way to remove them is to follow the documentation and use PowerShell, for example:
Connect-ServiceFabricCluster   # connect to the cluster first (run from a node or a machine with cluster access)
Remove-ServiceFabricApplicationPackage -ApplicationPackagePathInImageStore MyApplicationV1
You can also remove packages manually via Service Fabric Explorer, which offers two delete options: one will try to delete all application package versions registered in the cluster (if none are in use), and the other will delete a specific version (if it is not in use).
Keep in mind that to remove a package version you must first remove any running application that uses that same version.
The other option is to delete the old version when you deploy a new one; see this other SO question: Run out of storage on Service Fabric scale set
I am looking for some documents/guides on how to migrate from other PaaS systems or legacy on-premises installations to Bluemix: best practices, requirements, etc.
Anything at all would help, thanks, Jason.
Your question is quite generic; however, here are some links covering migration from the main technologies:
From JEE: http://www.slideshare.net/davidcurrie/aai-2698migratingtobluemix
From LAMP: http://www.ibm.com/developerworks/library/wa-migrate-lamp-app-to-bluemix-trs/index.html
From other PaaS: http://www.slideshare.net/kelapure/2259-migrate-herokuopenshift-applicationstobluemixpublic
Finally, please note that moving from an on-premises solution to a Cloud Foundry-based one requires some considerations regarding the local file system:
Local file system storage is short-lived. When an application instance crashes or stops, the resources assigned to that instance are reclaimed by the platform including any local disk changes made since the app started. When the instance is restarted, the application will start with a new disk image. Although your application can write local files while it is running, the files will disappear after the application restarts.
Instances of the same application do not share a local file system. Each application instance runs in its own isolated container. Thus if your application needs the data in the files to persist across application restarts, or the data needs to be shared across all running instances of the application, the local file system should not be used.
For these reasons, the local file system should not be used for data that needs to persist or be shared.
If you want more information on this topic, please take a look at Considerations for Designing and Running an Application in the Cloud.
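In practice that means writing anything that must survive a restart to an external service rather than to local disk. Here is a minimal sketch, assuming an S3-compatible object storage service; the endpoint, bucket name, and credentials below are placeholders:
import boto3  # works with any S3-compatible object store once endpoint_url is set

s3 = boto3.client(
    "s3",
    endpoint_url="https://objectstore.example.com",  # placeholder endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# The file may be generated on the ephemeral local disk, but it is persisted externally.
s3.upload_file("/tmp/report.csv", "my-app-bucket", "reports/report.csv")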
If you're talking about a Java app, see the post below:
Can I run my Tomcat app on Bluemix?
If you're moving an existing Websphere app, then this will help:
How do I move my existing WebSphere application to Liberty on Bluemix?
Jason - start here:
https://www.ng.bluemix.net/docs/
Then you can watch the YouTube videos:
https://www.youtube.com/channel/UCwYdW8mfXZwJQvB65789_vQ
After that, take a peek at developerWorks:
http://www.ibm.com/developerworks/devops/plan.html
Let me know if that helps.
Migrate an app from Heroku to Bluemix:
http://www.ibm.com/developerworks/cloud/library/cl-bluemix-heroku-migrate-app/
I'm developing an application with a PostgreSQL database. It also stores files on the file system and keeps their paths in the database. I want an open source solution for backing up the app state, including the database and the file storage.
Mandatory requirements:
supports backing up a PostgreSQL database while it is running
supports backing up a folder
supports compression
Optional requirements:
can view, create, and restore backups in a web console (important)
supports plugins or custom backup/restore tasks
supports other data stores like MySQL
supports retention policies
I've seen projects like Barman or Amanda, but it seems each one solves only part of the problem.
Should I develop the solution myself?
The application is developed in Java, if it matters.
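To illustrate what a custom backup task would involve, here is a rough sketch covering the three mandatory requirements (in Python for brevity, even though the app is Java; the database name, folder, and output paths are hypothetical, and pg_dump is assumed to be on the PATH):
import shutil
import subprocess
from datetime import datetime

def backup(db_name="appdb", files_dir="/srv/app/files", out_dir="/backups"):
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    # pg_dump can run against a live database; -Fc writes a compressed custom-format dump
    subprocess.run(["pg_dump", "-Fc", "-f", f"{out_dir}/{db_name}-{stamp}.dump", db_name], check=True)
    # archive and gzip the file-storage folder alongside the database dump
    shutil.make_archive(f"{out_dir}/files-{stamp}", "gztar", files_dir)

backup()
A script like this covers the mandatory items, but the optional ones (web console, retention, restores) are exactly where ready-made tools help.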
I have realized that I have to create a new image (AMI) from the EBS volume every time I change my code, and then update the Auto Scaling configuration every time as well (this is really bad).
I have heard that some people load their newest code from GitHub or something similar, so that the server gets the newest code automatically without making a new image every single time.
I already have a private GitHub repository.
Is this the only way to handle code management with Auto Scaling?
If so, how can I configure this to work?
Use user-data scripts, which work on a lot of public images including Amazon's. You could have the script download Puppet manifests/templates/files and apply them directly. Search for masterless Puppet.
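As a rough sketch of how that can be wired together (boto3 with a hypothetical repository, AMI ID, and start script; an older launch configuration would work similarly), the git checkout can go into the instance user data so every instance the Auto Scaling group launches fetches the latest code at boot:
import base64
import boto3

# First-boot script: pull the newest code instead of baking it into the AMI.
# Assumes a deploy key for the private GitHub repository is already present on the image.
user_data = """#!/bin/bash
git clone git@github.com:myorg/myapp.git /opt/app
/opt/app/bin/start.sh
"""

ec2 = boto3.client("ec2")
ec2.create_launch_template(
    LaunchTemplateName="myapp-latest-code",
    LaunchTemplateData={
        "ImageId": "ami-0123456789abcdef0",  # base image with the runtime only, no application code
        "InstanceType": "t3.small",
        "UserData": base64.b64encode(user_data.encode()).decode(),  # launch templates expect base64
    },
)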
Yes, you can configure your AMI so that the instance loads the latest software and configuration on first boot before it is put into service in the auto scaling group.
How to set up a startup script may depend on the specific OS and version you are running.