Can JupyterHub Docker Spawner be configured to give different resource allocations to different users?

I am using JupyterHub's Docker Spawner to manage a Jupyter Notebook server for a set of users. The Docker Spawner allows setting resource allocation limits such as cpu_limit and mem_limit, but this configuration applies to all containers for all users. Is there any way to provide different resource allocations to different users when using this spawner?

Yes, you can: use the pre_spawn_hook (see the docs).
It lets you define a hook function that is executed between the user's login and the spawn of the Docker container. In this function you can modify the spawner's settings depending on which user logged in.
Here is an example:
async def custom_pre_spawn_hook(spawner):
    username = spawner.user.name
    if username == 'tier1user':
        spawner.mem_limit = "4G"
        spawner.cpu_limit = 2
    else:
        spawner.mem_limit = "1G"
        spawner.cpu_limit = 1

c.Authenticator.enable_auth_state = True
c.DockerSpawner.pre_spawn_hook = custom_pre_spawn_hook
(Add this at the end of your JupyterHub Python config file, typically jupyterhub_config.py.)
You can find more general information about Jupyter and JupyterHub at discourse.jupyter.org, and more specific to this question in this nice post.
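If you have more than two tiers, a lookup table keeps the hook readable. A minimal sketch along the same lines as the example above; the usernames and limits here are made up, so adjust them to your deployment:

```python
# Hypothetical usernames and limits; adjust to your deployment.
RESOURCE_TIERS = {
    'tier1user': {'mem_limit': '4G', 'cpu_limit': 2},
    'tier2user': {'mem_limit': '2G', 'cpu_limit': 1},
}
DEFAULT_TIER = {'mem_limit': '1G', 'cpu_limit': 1}

async def custom_pre_spawn_hook(spawner):
    # Fall back to the default tier for any user not listed explicitly.
    tier = RESOURCE_TIERS.get(spawner.user.name, DEFAULT_TIER)
    spawner.mem_limit = tier['mem_limit']
    spawner.cpu_limit = tier['cpu_limit']
```

Register it exactly as before, with c.DockerSpawner.pre_spawn_hook = custom_pre_spawn_hook in jupyterhub_config.py.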

Related

How can I redeploy a docker-compose stack with terraform?

I use terraform to configure a GCE instance which runs a docker-compose stack. The docker-compose stack references an image with a tag and I would like to be able to rerun docker-compose up when the tag changes, so that a new version of the service can be run.
Currently, I do the following in my terraform files:
provisioner "file" {
  source      = "training-server/docker-compose.yml"
  destination = "/home/curry/docker-compose.yml"

  connection {
    type        = "ssh"
    user        = "curry"
    host        = google_compute_address.training-address.address
    private_key = file(var.private_key_file)
  }
}

provisioner "remote-exec" {
  inline = [
    "IMAGE_ID=${var.image_id} docker-compose -f /home/curry/docker-compose.yml up -d"
  ]

  connection {
    type        = "ssh"
    user        = "root"
    host        = google_compute_address.training-address.address
    private_key = file(var.private_key_file)
  }
}
but this is wrong for various reasons:
Provisioners are somewhat frowned upon, according to the Terraform documentation
If the image_id changes, Terraform won't consider it a change in configuration, so it won't re-run the provisioners
What I want is to treat my application stack as a resource, so that when one of its attributes changes, e.g. the image_id, the resource is recreated but the VM instance itself is not.
How can I do that with terraform? Or is there another better approach?
Terraform has a Docker provider, and if you wanted to use Terraform to manage your container stack, that's probably the right tool. But, using it requires essentially translating your Compose file into Terraform syntax.
I'm a little more used to a split where you use Terraform to manage infrastructure – set up EC2 instances and their network setup, for example – but use another tool like Ansible, Chef, or Salt Stack to actually run software on them. Then to update the software (Docker containers) you'd update your configuration management tool's settings to say which version (Docker image tag) you want, and then re-run that.
One trick that may help is to use the null resource which will let you "reprovision the resource" whenever the image ID changes:
resource "null_resource" "docker_compose" {
  triggers = {
    image_id = "${var.image_id}"
  }

  provisioner "remote-exec" {
    ...
  }
}
If you wanted to go down the all-Terraform route, in theory you could write a Terraform configuration like
provider "docker" {
  host = "ssh://root@${google_compute_address.training-address.address}"
  # (where do its credentials come from?)
}

resource "docker_image" "myapp" {
  name = "myapp:${var.image_id}"
}

resource "docker_container" "myapp" {
  name  = "myapp"
  image = "${docker_image.myapp.latest}"
}
but you'd have to translate your entire Docker Compose configuration to this syntax, and set it up so that there's an option for developers to run it locally, and replicate Compose features like the default network, and so on. I don't feel like this is generally done in practice.

Terraform with Google Container Engine (Kubernetes): Error executing access token command "...\gcloud.cmd"

I'm trying to deploy a module (Docker image) to Google Container Engine. Here is what I have in my Terraform config file:
terraform.tf
# Google Cloud provider
provider "google" {
  credentials = "${file("google_credentials.json")}"
  project     = "${var.google_project_id}"
  region      = "${var.google_region}"
}

# Google Container Engine (Kubernetes) cluster resource
resource "google_container_cluster" "secureskye" {
  name               = "secureskye"
  zone               = "${var.google_kubernetes_zone}"
  additional_zones   = "${var.google_kubernetes_additional_zones}"
  initial_node_count = 2
}

# Kubernetes provider
provider "kubernetes" {
  host     = "${google_container_cluster.secureskye.endpoint}"
  username = "${var.google_kubernetes_username}"
  password = "${var.google_kubernetes_password}"

  client_certificate     = "${base64decode(google_container_cluster.secureskye.master_auth.0.client_certificate)}"
  client_key             = "${base64decode(google_container_cluster.secureskye.master_auth.0.client_key)}"
  cluster_ca_certificate = "${base64decode(google_container_cluster.secureskye.master_auth.0.cluster_ca_certificate)}"
}

# Module UI
module "ui" {
  source = "./modules/ui"
}
My problem is: google_container_cluster was created successfully, but it fails on module ui creation (which contains two resources, kubernetes_service and kubernetes_pod) with the error
* kubernetes_pod.ui: Post https://<ip>/api/v1/namespaces/default/pods: error executing access token command "<user_path>\\AppData\\Local\\Google\\Cloud SDK\\google-cloud-sdk\\bin\\gcloud.cmd config config-helper --format=json": err=exec: "<user_path>\\AppData\\Local\\Google\\Cloud SDK\\google-cloud-sdk\\bin\\gcloud.cmd": file does not exist output=
So, questions:
1. Do I need gcloud and kubectl installed? The google_container_cluster was created successfully even before I installed gcloud or kubectl.
2. I want to use independent, separated credentials info, project, region from the one in gcloud, kubectl CLI. Am I doing this right?
I have been able to reproduce your scenario by running the Terraform config file you provided (except the Module UI part) on a Linux machine, so your issue should be related to that last part of the code.
Regarding your questions:
I am not sure, because I tried from Google Cloud Shell, where both gcloud and kubectl come preinstalled, but I would recommend installing them just to make sure that is not the issue here.
For the credentials part, I added two new variables to the variables.tf Terraform configuration file, as in this example (those credentials do not need to be the same as in gcloud or kubectl). Use your preferred credentials in this case.
variable "google_kubernetes_username" {
  default = "<YOUR_USERNAME>"
}

variable "google_kubernetes_password" {
  default = "<YOUR_PASSWORD>"
}
Maybe you could share more information about what is in your Module UI, in order to understand which file does not exist. I guess you are deploying from a Windows machine, judging by the notation in your file paths, but that should not be an important issue.

How do you access a MongoDB database from two Openshift apps?

I want to be able to access my MongoDB database from two Openshift apps: one is an interactive database maintenance app used via the browser, the other is the principal web application which runs on mobile devices via an Openshift app. As I see it, in Openshift MongoDB gets set up within a particular app's folder space, not independent of that space.
What would be the method to accomplish this multiple app access to the database ?
Is my only choice, though not ideal, to merge the functionality of both Openshift apps into one? That tastes like a bad plate of spaghetti.
2018 update: this applies to Openshift 2. Version 3 is very different, and while the general rules of Linux and scaling still apply, the details are obsolete.
Although @MartinB's answer was timely and correct, it's just a link, so let me put the essentials here.
Assuming that setting up a non-shared DB is already done, you need to find its host and port. You can ssh to your app (the one with the DB) or use rhc:
rhc ssh -a appwithdb
env | grep MONGODB
env brings all the environment variables, and grep filters them to show only Mongo-related ones. You should see something like:
OPENSHIFT_MONGODB_DB_HOST=xxxxx-yyyyy.apps.osecloud.com
OPENSHIFT_MONGODB_DB_PORT=zzzzz
xxxxx is the ID of the gear that Mongo sits on
yyyyy is your domain/namespace
zzzzz is MongoDB port
Now you can use these to create a connection to the DB from anywhere in your Openshift environment: another application simply connects to xxxxx-yyyyy:zzzzz. You can store them in custom variables to make maintenance easier.
$ rhc env-set \
MYOWN_DB_HOST=xxxxx-yyyyy \
MYOWN_DB_PORT=zzzzz \
MYOWN_DB_PASSWORD=****** \
MYOWN_DB_USERNAME=admin..... \
MYOWN_DB_NAME=dbname...
And then use the environment variables instead of the standard ones. Just remember they don't get updated automatically when the DB moves away.
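An app consuming those custom variables might then build its connection URI like this. A Python sketch: the MYOWN_* names match the rhc env-set call above, and the resulting URI is a standard MongoDB connection string you would hand to your driver (e.g. pymongo.MongoClient):

```python
import os

def mongo_uri_from_env(env=os.environ):
    """Build a MongoDB connection URI from the custom variables set via `rhc env-set`."""
    return "mongodb://{user}:{pw}@{host}:{port}/{db}".format(
        user=env["MYOWN_DB_USERNAME"],
        pw=env["MYOWN_DB_PASSWORD"],
        host=env["MYOWN_DB_HOST"],
        port=env["MYOWN_DB_PORT"],
        db=env["MYOWN_DB_NAME"],
    )
```

Because both apps read the same variables, updating them in one place (after a DB move, say) keeps every consumer in sync.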
Please read the following article from the OpenShift blog: https://blog.openshift.com/sharing-database-across-applications/

How to modify configuration properties in Mean.js?

I am currently working in a cloud environment (cloud9) and have installed the Mean.js (http://meanjs.org/) package.
Following the tutorial at IBM (http://www.ibm.com/developerworks/library/wa-mean1/index.html) the final step involves running the application using grunt.
Now, in order to run the default application, I need to change a couple of properties, as I am using a cloud database (MongoLab).
My question is: how can I change properties such as config.db in Mean.js? Their website describes configuration here: http://meanjs.org/docs.html#configuration
However, there is no clear explanation of which file to edit or how to do it.
When inspecting the code, you will probably see several environment definition files (production, development, etc.). Those contain the mappings for the config variables, for example:
db: process.env.MONGOHQ_URL || process.env.MONGOLAB_URI || 'mongodb://' + (process.env.DB_1_PORT_27017_TCP_ADDR || 'localhost') + '/mean',
so parameters are expected to be defined as environment variables.
For setting up MongoDB you can specify MONGOHQ_URL, MONGOLAB_URI or DB_1_PORT_27017_TCP_ADDR; for the Facebook App ID it looks for clientID: process.env.FACEBOOK_ID || 'APP_ID', and so on.
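The || chains above mean "first defined value wins". For illustration, here is the db fallback logic sketched in Python: the environment variable names are the real ones from the config file, while the function itself is just a model of the Node.js expression, not part of Mean.js:

```python
import os

def mean_db_uri(env=os.environ):
    # Mirror the Node.js fallback chain:
    # MONGOHQ_URL || MONGOLAB_URI || 'mongodb://' + (DB_1_PORT_27017_TCP_ADDR || 'localhost') + '/mean'
    return (env.get("MONGOHQ_URL")
            or env.get("MONGOLAB_URI")
            or "mongodb://" + env.get("DB_1_PORT_27017_TCP_ADDR", "localhost") + "/mean")
```

So to point the default app at MongoLab, it is enough to export MONGOLAB_URI in your environment before running grunt; no source edit is needed.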

Torque pbs_python submit job error (15025 queue already exists)

I am trying to execute this example script (https://oss.trac.surfsara.nl/pbs_python/wiki/TorqueUsage/Scripts/Submit):
#!/usr/bin/env python
import sys
sys.path.append('/usr/local/build_pbs/lib/python2.7/site-packages/pbs/')
import pbs
server_name = pbs.pbs_default()
c = pbs.pbs_connect(server_name)
attropl = pbs.new_attropl(4)
# Set the name of the job
#
attropl[0].name = pbs.ATTR_N
attropl[0].value = "test"
# Job is Rerunable
#
attropl[1].name = pbs.ATTR_r
attropl[1].value = 'y'
# Walltime
#
attropl[2].name = pbs.ATTR_l
attropl[2].resource = 'walltime'
attropl[2].value = '400'
# Nodes
#
attropl[3].name = pbs.ATTR_l
attropl[3].resource = 'nodes'
attropl[3].value = '1:ppn=4'
# A1.tsk is the job script filename
#
job_id = pbs.pbs_submit(c, attropl, "A1.tsk", 'batch', 'NULL')
e, e_txt = pbs.error()
if e:
    print e, e_txt
print job_id
But the shell shows the error "15025 Queue already exists". With qsub the job submits normally. I have one queue, 'batch', on my server. Torque version: 4.2.7. pbs_python version: 4.4.0.
What should I do to start a new job?
There are two things going on here. First there is an error in pbs_python that maps the 15025 error code to "Queue already exists". Looking at the source of torque we see that 15025 actually maps to the error "Bad UID for job execution", this means that on the torque server, the daemon cannot determine if the user you are submitting as is allowed to run jobs. This could be because of several things:
The user you are submitting as doesn't exist on the machine running pbs_server
The host you are submitting from is not in the "submit_hosts" parameter of the pbs_server.
Solution For 1
The remedy for this depends on how you authenticate users across systems. You could use /etc/hosts.equiv to specify the users/hosts allowed to submit; this file would need to be distributed to all the torque nodes as well as the torque server machine. Using hosts.equiv is pretty insecure, though, and I haven't actually used it for this. We use a central LDAP server to authenticate all users on the network and do not have this problem. You could also manually add the user to all the torque nodes and the torque server, taking care to make sure the UID is the same on all systems.
Solution For 2
If #1 is not your problem (which I doubt it is), you probably need to add the hostname of the machine you're submitting from to the "submit_hosts" parameter on the torque server. This can be accomplished with qmgr:
[root@torque_server ]# qmgr -c "set server submit_hosts += hostname.example.com"
The pbs_python library that you are using was written for torque 2.4.x.
The internal APIs of torque were largely rewritten in torque 4.0.x, so the library will most likely need to be rewritten for the new API.
Currently the developers of torque do not test any external libraries, so it is possible that they could break at any time.