Kill Dataproc job from Yarn UI no longer works -- only from Dataproc UI - google-cloud-dataproc

I used to be able to kill Spark jobs running on Dataproc via the Yarn UI KILL command, rather than from the GCP Dataproc UI command (which is much slower). However I am no longer able to do that -- only the GCP UI works.
Did something change or am I doing something wrong now?
I am using the Dataproc version 1.2 (where this has worked in the past).

To avoid YARN security vulnerabilities, non-get APIs are disabled by default now, but users can change it (with caution) when creating the cluster, or update the config then restart Hadoop services for running clusters. Also, as mentioned in the title of this question, users can kill jobs from Dataproc UI, which is recommended.
The yarn-site.xml yarn.resourcemanager.webapp.methods-allowed property now defaults to "GET,HEAD". This change restricts the HTTP methods that can be called on the YARN Resource Manager web UI (default port 8088) and REST APIs to GET and HEAD only, and disables job submission and modifications via the YARN REST API. You can override the default values and enable specific HTTP methods on port 8088 by setting the yarn.resourcemanager.webapp.methods-allowed property to one or more comma-separated HTTP method names when you create a cluster. An ALL value will allow all HTTP methods on the port.
Example: gcloud dataproc clusters create --properties='yarn:yarn.resourcemanager.webapp.methods-allowed=GET,POST,DELETE'
Recommendation: If you set this property to allow non-default HTTP methods, make sure to configure firewall rules and other security settings to restrict access to port 8088 (see Cluster Web Interfaces→Avoid Security Vulnerabilities).
See more details in this release notes.

Related

Flink native kubernetes deployment

I have some limitations with the rights required by Flink native deployment.
The prerequisites say
KubeConfig, which has access to list, create, delete pods and **services**, configurable
Specifically, my issue is I cannot have a service account with the rights to create/remove services. create/remove pods is not an issue. but services by policy only can be created within an internal tool.
could it be any workaround for this?
Flink creates two service in native Kubernetes integration.
Internal service, which is used for internal communication between JobManager and TaskManager. It is only created when the HA is not enabled. Since the HA service will be used for the leader retrieval when HA enabled.
Rest service, which is used for accessing the webUI or rest endpoint. If you have other ways to expose the rest endpoint, or you are using the application mode, then it is also optional. However, it is always be created currently. I think you need to change some codes to work around.

Where/How to configure Cassandra.yaml when deployed by Google Kubernetes Engine

I can't find the answer to a pretty easy question: Where can I configure Cassandra (normally using Cassandra.yaml) when its deployed on a cluster with kubernetes using the Google Kubernetes Engine?
So I'm completely new to distributed databases, Kubernetes etc. and I'm setting up a cassandra cluster (4VMs, 1 pod each) using the GKE for a university course right now.
I used the official example on how to deploy Cassandra on Kubernetes that can be found on the Kubernetes homepage (https://kubernetes.io/docs/tutorials/stateful-application/cassandra/) with a StatefulSet, persistent volume claims, central load balancer etc. Everything seems to work fine and I can connect to the DB via my java application (using the datastax java/cassandra driver) and via Google CloudShell + CQLSH on one of the pods directly. I created a keyspace, some tables and started filling them with data (~100million of entries planned), but as soon as the DB reaches some size, expensive queries result in a timeout exception (via datastax and via cql), just as expected. Speed isn't necessary for these queries right now, its just for testing.
Normally I would start with trying to increase the timeouts in the cassandra.yaml, but I'm unable to locate it on the VMs and have no clue where to configure Cassandra at all. Can someone tell me if these configuration files even exist on the VMs when deploying with GKE and where to find them? Or do I have to configure those Cassandra details via Kubectl/CQL/StatefulSet or somewhere else?
I think the faster way to configure cassandra in Kubernetes Engine, you could use the next deployment of Cassandra from marketplace, there you could configure your cluster and you could follow this guide that is also marked there to configure it correctly.
======
The timeout config seems to be a configuration that require to be modified inside the container (Cassandra configuration itself).
you could use the command: kubectl exec -it POD_NAME -- bash in order to open a Cassandra container shell, that will allow you to get into the container configurations and you could look up for the configuration and change it for what you require.
after you have the configuration that you require you will need to automate it in order to avoid manual intervention every time that one of your pods get recreated (as configuration will not remain after a container recreation). Next options are only suggestions:
Create you own Cassandra image from am own Docker file, changing the value of the configuration you require from there, because the image that you are using right now is a public image and the container will always be started with the config that the pulling image has.
Editing the yaml of your Satefulset where Cassandra is running you could add an initContainer, which will allow to change configurations of your running container (Cassandra) this will make change the config automatically with a script ever time that your pods run.
choose the option that better fits for you.

How to enable Kubernetes api server instances to connect to external networks via a proxy

The goal is to enable Kubernetes api server to connect to resources on the internet when it is on a private network on which internet resources can only be accessed through a proxy.
Background:
A kubernetes cluster is spun up using kubespray containing two apiserver instances that run on two VMs and are controlled via a manifest file. The Azure AD is being used as the identity provider for authentication. In order for this to work the API server needs to initialize its OIDC component by connecting to Microsoft and downloading some keys that are used to verify tokens issued by Azure AD.
Since the Kubernetes cluster is on a private network and needs to go through a proxy before reaching the internet, one approach was to set https_proxy and no_proxy in the kubernetes API server container environment by adding this to the manifest file. The problem with this approach is that when using Istio to manage access to APIs, no_proxy needs to be updated whenever a new service is added to the cluster. One solution could have been to add a suffix to every service name and set *.suffix in no proxy. However, it appears that using wildcards in the no_proxy configuration is not supported.
Is there any alternate way for the Kubernetes API server to reach Microsoft without interfering with other functionality?
Please let me know if any additional information or clarifications are needed.
I'm not sure how you would have Istio manage the egress traffic for your Kubernetes masters where your kube-apiservers run, so I wouldn't recommend it. As far as I understand, Istio is generally used to manage (ingress/egress/lb/metrics/etc) actual workloads in your cluster and these workloads generally run on your nodes, not masters. I mean the kube-apiserver actually manages the CRDs that Istio uses.
Most people use Docker on their masters, you can use the proxy environment variables for your containers like you mentioned.
We tried a couple of solutions to avoid having to set http(s)_proxy and no_proxy env variables in the kube-apiserver and constantly whitelist new services in the cluster...
Introduce a self managed proxy server which would determine what traffic goes is forwarded to an internet connected proxy and what traffic is not proxied:
squid proxy seemed to do the trick by defining some ACLs. One issue we had was that node names were not resolved by kube-dns so we had to add manual entries into the hosts files of containers (not sure how these were handled by default).
we also tried writing a proxy using node but it had trouble with https in some scenarios.
Introduce a self managed identity provider between azure and our k8s cluster which was configured to use the internet connected proxy this avoiding having to configure the proxy in the kube-apiserver
We landed up going with option 2 as it gave us more flexibility in the long term.

New approach to configure kube-proxy's proxymode in GKE/Kubernetes?

I see a recent pull request was merged to remove the net.experimental.kubernetes.io/proxy-mode and net.beta.kubernetes.io/proxy-mode annotations.
My application's reverse proxy servers currently work much better when the proxymode is set to userspace, and I would like to keep using this setting after upgrading.
Since Google Container Engine hosts the Kubernetes master, and I cannot directly access that VM, how can I configure the kube-proxy proxymode without using annotations? Ideally I could change a cluster-wide setting so that new nodes (after autoscaling) also use the userspace proxy mode.

Is it possible to turn on ABAC mode (authorization) in Google Container Engine?

I would like to enable the ABAC mode for the Kubernetes Cluster I'm using in Google's Container Engine. (more specifically, I would like to restrict access to the API service for the default service account which is automatically assigned to all pods). However, since --authorization-mode=ABAC is a command line argument for kube-apiserver and since the API server is managed in Google Container Engine, I didn't find a way to enable authorization for my cluster.
Is there a way to enable ABAC mode on GCE?
I'm currently running Kubernetes v1.1.7 on server and nodes.
There is not a way to enable ABAC mode on Google Container Engine. If you need fine-grained control over the parameters passed to any of the master components you have to run Kubernetes on GCE instead.
In the meantime Google has added the possibility to use Role Based Access Control (RBAC) for a Kubernetes Cluster. It is enabled by default for all new Clusters running Kubernetes 1.6 or later: https://cloud.google.com/container-engine/docs/role-based-access-control