I have created a Pub/Sub topic to which I will publish a message every time a new object is uploaded to the bucket. Now I want to create a subscription that pushes a notification to an endpoint every time a new object is uploaded to that bucket. Following the documentation, I tried something like this:
gcloud alpha pubsub subscriptions create orderComplete \
--topic projects/PROJECT-ID/topics/TOPIC \
--push-endpoint http://localhost:5000/ENDPOINT/ \
--ack-deadline=60
However, my app is running on Kubernetes and it seems that Pub/Sub cannot reach my endpoint. Any suggestions?
As stated in the documentation:
In general, the push endpoint must be a publicly accessible HTTPS
server, presenting a valid SSL certificate signed by a certificate
authority and routable by DNS.
So you must expose your service via HTTPS using an Ingress, as described here:
https://cloud.google.com/kubernetes-engine/docs/concepts/ingress
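For example, a minimal sketch (deployment, service, domain, and TLS secret names are all placeholders; assuming the app listens on port 5000 inside the pod):
# Expose the pods behind a Service, then put an Ingress with TLS in front of it
kubectl expose deployment order-app --name=order-app-svc --port=80 --target-port=5000
kubectl create ingress order-app-ingress \
  --rule="your-verified-domain.com/ENDPOINT=order-app-svc:80,tls=order-app-tls"
This gives you a public HTTPS URL (https://your-verified-domain.com/ENDPOINT) to use as the push endpoint instead of localhost.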
In order for Cloud Pub/Sub to push messages to your application, you need to provide a publicly accessible endpoint. In Kubernetes, this most likely means exposing a Service. With this, you should have a non-local (i.e. no “localhost”) URL to reach the pods running your binaries.
Before creating the Cloud Pub/Sub subscription, you should also verify your domain with the Cloud Console.
Finally, you can set your subscription to push messages by changing its configuration:
gcloud pubsub subscriptions modify-push-config mySubscription \
--push-endpoint="https://publicly-available-domain.com/push-endpoint"
Yeah, so as #jakub-bujny points out, you need an SSL endpoint. So one solution, on GKE, is to use Google's managed certificates with an Ingress resource (the link shows you how).
Related
I have some limitations with the rights required by the Flink native Kubernetes deployment.
The prerequisites say
KubeConfig, which has access to list, create, delete pods and **services**, configurable
Specifically, my issue is that I cannot have a service account with the rights to create/delete services. Creating/deleting pods is not an issue, but by policy, services can only be created through an internal tool.
Is there any workaround for this?
Flink creates two services in the native Kubernetes integration.
Internal service, which is used for internal communication between the JobManager and the TaskManagers. It is only created when HA is not enabled, since the HA service is used for leader retrieval when HA is enabled.
Rest service, which is used for accessing the web UI or the REST endpoint. If you have other ways to expose the REST endpoint, or you are using application mode, then it is also optional. However, it is currently always created, so I think you would need to change some code to work around that.
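As a rough sketch of the closest you can get without code changes (cluster id, image, bucket, and jar path are placeholders): run in application mode with Kubernetes HA enabled, so the internal service is not needed, and keep the REST service cluster-internal. The REST service is still created by Flink itself, so this does not remove the need for service permissions:
./bin/flink run-application \
  --target kubernetes-application \
  -Dkubernetes.cluster-id=my-flink-cluster \
  -Dkubernetes.container.image=my-registry/my-flink-job:latest \
  -Dkubernetes.rest-service.exposed.type=ClusterIP \
  -Dhigh-availability=org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory \
  -Dhigh-availability.storageDir=s3://my-bucket/flink-ha \
  local:///opt/flink/usrlib/my-job.jar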
In an unmanaged cluster, in order to export the K8s audit log we can use the AuditSink object and redirect the logs to any webhook we would like. To do so, we have to change the API server configuration.
In a managed cluster the API server is not accessible - is there any way to send the data to a webhook as well?
If you can add an example it would be great, since I saw the Pub/Sub option of GCP, for example, and it seems that I can't use my webhook.
Within a managed GKE cluster, the audit logs are sent to Stackdriver Logging. At this time, there is no way to send the logs directly from GKE to a webhook; however, there is a workaround.
You can export the GKE Audit logs from Stackdriver Logging to Pub/Sub using a log sink. You will need to define which GKE Audit logs you would like to export to Pub/Sub.
Once the logs are exported to Pub/Sub, you will then be able to push them from Pub/Sub using your webhook. Cloud Pub/Sub is highly programmable and you can control the data you exchange. Please take a look at this link for an example about webhooks in Cloud Pub/Sub.
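As a rough sketch of the whole chain (project, topic, sink, and endpoint names are placeholders; the log filter is an assumption, and the sink's writer identity also needs permission to publish to the topic):
# Topic that will receive the exported audit log entries
gcloud pubsub topics create gke-audit-logs
# Log sink that exports GKE audit log entries from Stackdriver Logging to that topic
gcloud logging sinks create gke-audit-sink \
  pubsub.googleapis.com/projects/PROJECT-ID/topics/gke-audit-logs \
  --log-filter='logName:"cloudaudit.googleapis.com" AND resource.type="k8s_cluster"'
# Push subscription that forwards each exported entry to your webhook
gcloud pubsub subscriptions create gke-audit-webhook \
  --topic gke-audit-logs \
  --push-endpoint=https://your-domain.com/webhook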
I've recently been making use of the GKE Workload Identity feature. I'd be interested to know in more detail how the gke-metadata-server component works.
GCP client code (gcloud or other language SDKs) falls through to the GCE metadata method
Request made to http://metadata.google.internal/path
(guess) Setting GKE_METADATA_SERVER on my node pool configures this to resolve to the gke-metadata-server pod on that node.
(guess) the gke-metadata-server pod with --privileged and host networking has a means of determining the source (pod IP?) then looking up the pod and its service account to check for the iam.gke.io/gcp-service-account annotation.
(guess) the proxy calls the metadata server with the pod's 'pseudo' identity set (e.g. [PROJECT_ID].svc.id.goog[[K8S_NAMESPACE]/[KSA_NAME]]) to get a token for the service account annotated on its Kubernetes service account.
If this account has token creator / Workload Identity User rights on the service account, presumably the response from GCP is a success and contains a token, which is then packaged and sent back to the calling pod for authenticated calls to other Google APIs.
I guess the main puzzle for me right now is the verification of the calling pod's identity. Originally I thought this would use the TokenReview API, but now I'm not sure how the Google client tools would know to use the service account token mounted into the pod...
Edit: follow-up questions:
Q1: In between step 2 and 3, is the request to metadata.google.internal routed to the GKE metadata proxy by the setting GKE_METADATA_SERVER on the node pool?
Q2: Why does the metadata server pod need host networking?
Q3: In the video here: https://youtu.be/s4NYEJDFc0M?t=2243 it's taken as a given that the pod makes a GCP call. How does the GKE metadata server identify the pod making the call to start the process?
Before going into details, please familiarize yourself with these components:
OIDC provider: Runs on Google’s infrastructure, provides cluster specific metadata and signs authorized JWTs.
GKE metadata server: runs as a DaemonSet, meaning one instance on every node; exposes a pod-specific metadata server (it provides backwards compatibility with old client libraries) and emulates the existing node metadata server.
Google IAM: issues access token, validates bindings, validates OIDC signatures.
Google cloud: accepts access tokens, does pretty much anything.
JWT: JSON Web token
mTLS: Mutual Transport Layer Security
The steps below explain how GKE metadata server components work:
Step 1: An authorized user binds the cluster to the namespace.
Step 2: Workload tries to access Google Cloud service using client libraries.
Step 3: The GKE metadata server requests an OIDC signed JWT from the control plane. That connection is authenticated using a mutual TLS (mTLS) connection with the node credential.
Step 4: Then the GKE metadata server uses that OIDC signed JWT to request an access token for the [identity namespace]/[Kubernetes service account] from IAM. IAM validates that the appropriate bindings exist on the identity namespace and in the OIDC provider.
Step 5: And then IAM validates that it was signed by the cluster’s correct OIDC provider. It will then return an access token for the [identity namespace]/[kubernetes service account].
Step 6: Then the metadata server sends the access token it just received back to IAM. IAM will then exchange it for a short lived GCP service account token after validating the appropriate bindings.
Step 7: Then GKE metadata server returns the GCP service account token to the workload.
Step 8: The workload can then use that token to make calls to any Google Cloud Service.
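For reference, the binding that Step 1 refers to is typically set up like this (a sketch; project, namespace, and account names are placeholders):
# Allow the Kubernetes service account to impersonate the GCP service account
gcloud iam service-accounts add-iam-policy-binding \
  GSA_NAME@PROJECT_ID.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:PROJECT_ID.svc.id.goog[K8S_NAMESPACE/KSA_NAME]"
# Annotation that the GKE metadata server looks up on the Kubernetes service account
kubectl annotate serviceaccount KSA_NAME \
  --namespace K8S_NAMESPACE \
  iam.gke.io/gcp-service-account=GSA_NAME@PROJECT_ID.iam.gserviceaccount.com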
I also found a video regarding Workload Identity which you will find useful.
EDIT Follow-up questions' answers:
Below are answers to your follow-up questions:
Q1: In between step 2 and 3, is the request to metadata.google.internal routed to the gke metadata proxy by the setting GKE_METADATA_SERVER on the node pool?
You are right, GKE_METADATA_SERVER is set on the node pool. This exposes a metadata API to the workloads that is compatible with the V1 Compute Metadata APIs. Once a workload tries to access a Google Cloud service, the GKE metadata server performs a lookup (it checks whether a pod exists in its list whose IP matches the incoming IP of the request) before it goes on to request the OIDC token from the control plane.
Keep in mind that the GKE_METADATA_SERVER enumeration feature can only be enabled if Workload Identity is enabled at the cluster level.
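As a sketch (cluster and node pool names are placeholders; older gcloud releases spelled the node-pool flag --workload-metadata-from-node=GKE_METADATA_SERVER):
# Enable Workload Identity at the cluster level
gcloud container clusters update CLUSTER_NAME \
  --workload-pool=PROJECT_ID.svc.id.goog
# Switch the node pool to the GKE metadata server
gcloud container node-pools update POOL_NAME \
  --cluster CLUSTER_NAME \
  --workload-metadata=GKE_METADATA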
Q2: Why does the metadata server pod need host networking?
The gke-metadata-server intercepts all GCE metadata server requests from pods; however, pods using the host network are not intercepted.
Q3: How does the GKE metadata server identify the pod making the call to start the process?
The pods are identified using iptables rules.
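If you want to see this for yourself, a hedged sketch of what to look for on a Workload Identity node (requires SSH access to the node; exact chain names can differ between GKE versions):
# List the NAT rules that redirect pod traffic for the metadata IP to gke-metadata-server
sudo iptables -t nat -L -n -v | grep 169.254.169.254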
IBM internally monitors the services it offers on cloud, but I need to get the status of middleware services such as Kafka, API Connect, etc. myself. It will help me automate things if some service is stopped/not accessible.
To monitor your provisioned instances of these services, you could exercise them. For example, on API Connect, create an API called /health and curl it to verify it is working. For Kafka, create a topic to check the health.
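As a rough sketch of both probes (hosts, ports, and credential files are placeholders):
# API Connect: call the health API you created and fail loudly if it is down
curl -fsS https://YOUR-GATEWAY-HOST/health || echo "API Connect health check failed"
# Kafka / Event Streams: create a probe topic with the standard Kafka CLI
kafka-topics.sh --bootstrap-server BROKER_HOST:9093 \
  --command-config client.properties \
  --create --topic healthcheck --partitions 1 --replication-factor 3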
I have a K8s cluster created with kubeadm v1.8.4 running on a virtual machine. Now I want to access this K8s cluster using the REST API from my laptop/workstation.
One way I am able to do this is by running the command "kubectl proxy --address= --accept-hosts '.*'". But I have to run this command manually in order to access my cluster from my laptop, and I don't want that.
While going through the docs I found that there is another proxy available, the apiserver proxy. I am trying to use it by following this link (https://kubernetes.io/docs/tasks/access-application-cluster/access-cluster/#manually-constructing-apiserver-proxy-urls), but I get the error "could not get any response" in Postman.
So I was wondering whether I am using this apiserver proxy properly or not. Or is there any other way I can send REST requests from my laptop to my cluster on the VMs without manually running the "kubectl proxy" command?
What kubectl proxy does for you is essentially two things.
First, obviously, it proxies your traffic from a localhost port to the Kubernetes API.
Second, it also authenticates you against the cluster so that all the proxied calls do not need authentication information.
To hit the API directly, it's as simple as pointing your client to the right IP:PORT of your VM, but... you need to either ignore TLS issues (not advised) or trust the cluster CA cert. Also, you still need to authenticate, so you need to use appropriate client credentials (i.e. a bearer token).
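For example, a rough sketch of calling the API directly with a bearer token (the API server address, CA path, and secret name are placeholders):
APISERVER=https://VM_IP:6443
# Read a token, e.g. from a service account token secret
TOKEN=$(kubectl get secret my-sa-token -o jsonpath='{.data.token}' | base64 --decode)
curl --cacert /path/to/cluster-ca.crt \
  -H "Authorization: Bearer ${TOKEN}" \
  "${APISERVER}/api/v1/namespaces/default/pods"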
Manually constructing apiserver proxy URLs refers to a different kind of beast, which allows you to proxy traffic to services deployed in your Kubernetes cluster by accessing a particular path on the kube API server. So to use that, you need to have access to the API already.
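Once you are authenticated as above, a proxy URL to an in-cluster service looks roughly like this (service name and port are placeholders):
curl --cacert /path/to/cluster-ca.crt \
  -H "Authorization: Bearer ${TOKEN}" \
  "${APISERVER}/api/v1/namespaces/default/services/my-service:80/proxy/"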