Let's say I'm using a GCE ingress to handle traffic from outside the cluster and terminate TLS (https://example.com/api/items). From there the request gets routed to one of two services that are only available inside the cluster. So far so good.
What if I have to call service B from service A? Should I go all the way out and use the cluster's external IP/domain over HTTPS (https://example.com/api/user/1), or could I use the service's internal address over HTTP (http://serviceb/api/user/1)? Do I have to encrypt the data, or is it "safe" as long as it doesn't leave the private k8s network?
What if I want to have "internal" endpoints that should only be accessible from within the cluster? If I always use the external HTTPS URL, those endpoints would be reachable by everyone. Calling the service directly, I could just do http://serviceb/internal/info/abc.

If you need the features that your API gateway offers (authentication, caching, high availability, load balancing), then YES; otherwise DON'T. The external-facing API should contain only endpoints that are used by external clients (from outside the cluster).
"Safe" is a very relative word, and I believe there are no 100% safe networks. You should weigh the probability of "somebody" or "something" sniffing data from the network against the impact on your business if that happens.
If this helps: on every project I've worked on (or heard about from somebody I know), the private network between containers/services was more than sufficient.
Exactly what I was saying at the top of the answer. Keeping those endpoints inside the cluster makes them inaccessible from outside by design.
One last thing, managing a lot of SSL certificates for a lot of internal services is a pain that one should avoid if not necessary.


NLB or HAProxy - Better way to perform SSL termination?

My architecture looks like this:
Here, HTTPS requests first go to Route 53 for DNS resolution. Route 53 resolves the domain to the Network Load Balancer (NLB), which forwards the traffic to HAProxy pods running inside a Kubernetes cluster.
The HAProxy servers are required to read a specific request header and, based on its value, route the traffic to a backend. To keep things simple, I have shown a single Kubernetes backend cluster, but assume that more than one such backend cluster is running.
Considering this architecture:
What is the best place to perform TLS termination? Should we do it at NLB (green box) or implement it at HAProxy (Orange box)?
What are the advantages and disadvantages of each scenario?
Since you are using an NLB, you can also achieve end-to-end HTTPS; however, that forces the backend service to use HTTPS as well.
You can terminate at the LB level. If you have multiple load balancers backed by clusters, using AWS Certificate Manager with the LB is an easy way to manage certificates across the multiple setups.
There is no guarantee that someone who gets into your network won't be able to exploit a bug that lets them intercept traffic between services. The software-defined network (SDN) in your VPC is secure and protects against spoofing, but that is not a guarantee.
So there is an advantage to using TLS/SSL inside the VPC as well.
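The header-based routing that the question asks of HAProxy can be sketched in a few lines. This is only an illustration of the decision logic, not HAProxy configuration: the header name `X-Backend-Cluster` and the backend addresses are hypothetical. It also shows why the termination point matters: whichever component makes this decision has to see the decrypted HTTP headers, so TLS must be terminated at or before that hop.

```python
# Hypothetical header name and backend addresses, for illustration only.
ROUTING_HEADER = "X-Backend-Cluster"
BACKENDS = {
    "cluster-a": "https://cluster-a.internal.example",
    "cluster-b": "https://cluster-b.internal.example",
}
DEFAULT_BACKEND = BACKENDS["cluster-a"]


def pick_backend(headers):
    """Return the backend address selected by the routing header."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in headers.items()}
    value = normalised.get(ROUTING_HEADER.lower(), "").strip().lower()
    # Unknown or missing values fall back to the default backend.
    return BACKENDS.get(value, DEFAULT_BACKEND)
```

For example, `pick_backend({"x-backend-cluster": "cluster-b"})` selects the second backend, while a request without the header goes to the default.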

Best way to go between private on-premises network and kubernetes

I have set up an on-premises Kubernetes cluster, and I want to ensure that my services that are not in Kubernetes, but exist on a separate class B network, are able to consume the services that have migrated to Kubernetes. There are a number of ways of doing this by all accounts, and I'm looking for the simplest one.
Ingress + controller seems to be the one favoured - and it's interesting because of the virtual hosts and HAProxy implementation. But where I'm getting confused is how to set up the Kubernetes service:
We don't have a great deal of choice: ClusterIP won't be sufficient to expose it to the outside, which leaves NodePort. LoadBalancer seems to be a simpler, cut-down way of switching between network zones, and although there are on-prem solutions (MetalLB), it seems to be geared far more towards cloud solutions.
But if I stick with NodePort, then my entry into the network is going to be on a non-standard port number, and I would prefer it to be over a standard port, particularly if I'm running a percentage of traffic for that service outside Kubernetes and the rest over Kubernetes (for testing purposes, I'd like to monitor the traffic over a period of time before I bite the bullet and move 100% of traffic for the given microservice to Kubernetes). In that case it would be better if those services were available on the same port (almost always 80, because they're standard REST microservices). More than that, if I have to re-create the service for whatever reason, I'm pretty sure the port will change, and then no traffic will be able to enter the Kubernetes cluster, which is a frightening proposition.
What are the suggested ways of handling communication between existing on-prem and Kubernetes cluster (also on prem, different IP/subnet)?
Is there any way to get traffic coming in without changing the network parameters (the class B's the respective networks are on), and without being forced to use NodePort?
The NodePort service type may be fine for staging or dev environments, but I recommend going with a LoadBalancer-type service (the NGINX ingress controller is one option). The advantages of this over other service types are:
You can use a standard port (rather than a random NodePort generated by Kubernetes).
Your service is load balanced. (Load balancing will be taken care of by the ingress controller.)
Fixed port (it will not change unless you modify something in the Ingress object).

How to access a service in a Kubernetes cluster using the service name

I am pretty new to Kubernetes and I have successfully set up a cluster on Google Container Engine.
In my cluster I have a backend api developed with dropwizard, front end developed with node js and a mysql database.
All have been deployed to the cluster and are working. However, my challenge is this: after setting up an external IP for my node containers and backend, I can access them remotely, but I can't access my backend from my front end using the service name. E.g. my backend is called backendapi within the cluster, but I can't use http://backendapi:8080 to call my REST services when deployed to the cluster.
The catch for me is that when I deploy to the cluster I don't want my front end to hit my back end using the external IP; I want them to connect within the cluster without going via the external IP address. When I connect to a pod and ping backendapi it returns a result, but when I deploy my front end and use the label name it doesn't work. What could I be doing wrong?
As long as kube-dns is running (which I believe is "always unless you disable it"), all Service objects have an in cluster DNS name of service_name +"."+ service_namespace + ".svc.cluster.local" so all other things would address your backendapi in the default namespace as (to use your port numbered example) http://backendapi.default.svc.cluster.local:8080. That fact is the very reason Kubernetes forces all identifiers to be a "dns compatible" name (no underscores or other goofy characters).
Even if you are not running kube-dns, all Service names and ports are also injected into the environment of Pods just like docker would do, so the environment variables ${BACKENDAPI_SERVICE_HOST}:${BACKENDAPI_SERVICE_PORT} would contain the Service's in-cluster IP (even though the env-var is named "host") and the "default" Service port (8080 in your example) if there is only one.
Whether you choose to use the DNS name or the environment-variable-ip is a matter of whether you like having the "readable" names for things in log output or error messages, versus whether you prefer to skip the DNS lookup and use the Service IP address for speed but less legibility. They behave the same.
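As a minimal sketch of the two options just described, the helpers below build the in-cluster URL either from the DNS naming scheme or from the injected environment variables. The function names are made up for illustration, `cluster.local` is assumed to be the cluster domain (the usual default, but configurable), and the IP in the usage note is a placeholder. Keep in mind that the environment variables are only injected for Services that already existed when the Pod started.

```python
import os


def service_dns_name(service, namespace="default", cluster_domain="cluster.local"):
    """Build service_name.namespace.svc.<cluster domain>."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def service_url_from_dns(service, port, namespace="default"):
    """URL that relies on the cluster DNS add-on to resolve the Service name."""
    return f"http://{service_dns_name(service, namespace)}:{port}"


def service_url_from_env(service, environ=os.environ):
    """URL built from the variables Kubernetes injects into the Pod environment."""
    # The Service name is upper-cased and dashes become underscores.
    prefix = service.upper().replace("-", "_")
    host = environ[f"{prefix}_SERVICE_HOST"]  # the Service's cluster IP
    port = environ[f"{prefix}_SERVICE_PORT"]
    return f"http://{host}:{port}"
```

With the names from the question, `service_url_from_dns("backendapi", 8080)` gives `http://backendapi.default.svc.cluster.local:8080`, and `service_url_from_env("backendapi", {"BACKENDAPI_SERVICE_HOST": "10.0.0.11", "BACKENDAPI_SERVICE_PORT": "8080"})` gives `http://10.0.0.11:8080`. Either URL only works for code running inside the cluster.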
The whole story lives in the services-networking concept documentation.
But the problem still persists when I change to this: backendapi.default.svc.cluster.local:8080. I even tried using the other port that it is mapped to internally, and my front end web page keeps saying backendapi.default.svc.cluster.local:32208/api/v1/auth/login net::ERR_NAME_NOT_RESOLVED. The funny thing is that when I curl from my front end pod it works. But when I'm accessing it using my web browser it doesn't.
Because it is resolvable only within the cluster (only the K8s cluster with the kube-dns add-on can translate the domain name backendapi.default.svc.cluster.local to its corresponding IP address).
Could this be because I exposed an external IP for the service as well? The external IP works though.
No, it is because the domain name backendapi.default.svc.cluster.local is resolvable only within the cluster, not from a random browser on a random machine.
What you did is one of the solutions: exposing an external IP for the service. If you don't want the IP to be used, you can create an Ingress (and use an ingress controller in your cluster) and expose your microservice through it. Since you are on GCP, you can make use of their API gateway rather than exposing a cryptic IP address.
Note: Remember to add authentication/authorization to lock down your microservice, as it's getting exposed to the user.
Another Solution
Proxy all the backend calls through the server which serves your web app (nginx/nodejs etc)
The advantage of this approach is that you avoid all the Same Origin Policy/CORS headaches, and your microservice (express) authentication details are abstracted away from the user's browser. (This is not necessarily an advantage.)
The disadvantage of this approach is that your backend microservice is tightly coupled to the front end (or vice versa, depending on how you look at it), which makes the scaling of the backend dependent on the front end. Your backend is also not exposed, so if you have another consumer (let's just say an Android app), it will not be able to access your service.
Hybrid Solution
Proxy all the backend calls through the server which serves your web app (nginx/nodejs etc) so that your APIs inherit your web app's domain, and still expose the backend service (as and when required) so that other consumers (if any, in future) can make use of it.
Kind of similar question: https://stackoverflow.com/a/47043871/6785908

Low Level Protocol for Microservice Orchestration

Recently I started working with microservices. I wrote a library for service discovery that uses Redis to store every service's URL and port number, along with a TTL value for the entry. It turned out to be an expensive approach, since every cross-service call required one call to Redis. Caching didn't seem to be a good idea, since the services won't be up all the time; there can be downtimes as well.
So I wanted to write a separate microservice to take care of the orchestration part. For this I need to figure out a really low-level network protocol to handle the exchange of heartbeats (which would help me figure out when a service instance becomes unavailable). How do applications like zookeeperClient and redisClient take care of heartbeats?
Moreover what is the industry's preferred protocol for cross service calls?
I have been calling REST APIs over HTTP and have eliminated every possibility of joins across different collections.
Is there a better way to do this?
I think the term "orchestration" is not a good fit for what you are asking. From what I've encountered so far in the microservices world, "orchestration" is used when a complex business process is involved, not for service discovery. What you need is a service registry combined with a load balancer. You can find here all the information you need. Here are some relevant extracts from that great article:
There are two main service discovery patterns: client‑side discovery and server‑side discovery. Let’s first look at client‑side discovery.
The Client‑Side Discovery Pattern
When using client‑side discovery, the client is responsible for determining the network locations of available service instances and load balancing requests across them. The client queries a service registry, which is a database of available service instances. The client then uses a load‑balancing algorithm to select one of the available service instances and makes a request.
The network location of a service instance is registered with the service registry when it starts up. It is removed from the service registry when the instance terminates. The service instance’s registration is typically refreshed periodically using a heartbeat mechanism.
Netflix OSS provides a great example of the client‑side discovery pattern. Netflix Eureka is a service registry. It provides a REST API for managing service‑instance registration and for querying available instances. Netflix Ribbon is an IPC client that works with Eureka to load balance requests across the available service instances. We will discuss Eureka in more depth later in this article.
The client‑side discovery pattern has a variety of benefits and drawbacks. This pattern is relatively straightforward and, except for the service registry, there are no other moving parts. Also, since the client knows about the available services instances, it can make intelligent, application‑specific load‑balancing decisions such as using hashing consistently. One significant drawback of this pattern is that it couples the client with the service registry. You must implement client‑side service discovery logic for each programming language and framework used by your service clients.
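To make the client-side pattern concrete, here is a minimal in-memory sketch; it is not taken from the article and is not how Eureka or Ribbon are implemented. The class and method names are hypothetical. A heartbeat simply pushes an instance's expiry forward, an instance that stops sending heartbeats drops out once its TTL passes, and the client load balances round-robin across whatever is currently live.

```python
import itertools
import time


class ServiceRegistry:
    """In-memory registry; entries expire unless refreshed by a heartbeat."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._expiry = {}    # (service, address) -> expiry timestamp

    def heartbeat(self, service, address):
        # Registering and refreshing are the same operation: extend the expiry.
        self._expiry[(service, address)] = self._clock() + self._ttl

    def deregister(self, service, address):
        self._expiry.pop((service, address), None)

    def instances(self, service):
        """Return the addresses whose TTL has not yet expired, in sorted order."""
        now = self._clock()
        return sorted(
            addr
            for (name, addr), expires in self._expiry.items()
            if name == service and expires > now
        )


class RoundRobinClient:
    """Client-side load balancing: query the registry, then pick an instance."""

    def __init__(self, registry):
        self._registry = registry
        self._counter = itertools.count()

    def choose(self, service):
        live = self._registry.instances(service)
        if not live:
            raise LookupError(f"no live instances of {service!r}")
        return live[next(self._counter) % len(live)]
```

In a real deployment the registry would be a separate, highly available service, and a client would typically cache the instance list for a short time rather than query the registry on every call.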
The Server‑Side Discovery Pattern
The client makes a request to a service via a load balancer. The load balancer queries the service registry and routes each request to an available service instance. As with client‑side discovery, service instances are registered and deregistered with the service registry.
The AWS Elastic Load Balancer (ELB) is an example of a server-side discovery router. An ELB is commonly used to load balance external traffic from the Internet. However, you can also use an ELB to load balance traffic that is internal to a virtual private cloud (VPC). A client makes requests (HTTP or TCP) via the ELB using its DNS name. The ELB load balances the traffic among a set of registered Elastic Compute Cloud (EC2) instances or EC2 Container Service (ECS) containers. There isn’t a separate service registry. Instead, EC2 instances and ECS containers are registered with the ELB itself.
HTTP servers and load balancers such as NGINX Plus and NGINX can also be used as a server-side discovery load balancer. For example, this blog post describes using Consul Template to dynamically reconfigure NGINX reverse proxying. Consul Template is a tool that periodically regenerates arbitrary configuration files from configuration data stored in the Consul service registry. It runs an arbitrary shell command whenever the files change. In the example described by the blog post, Consul Template generates an nginx.conf file, which configures the reverse proxying, and then runs a command that tells NGINX to reload the configuration. A more sophisticated implementation could dynamically reconfigure NGINX Plus using either its HTTP API or DNS.
Some deployment environments such as Kubernetes and Marathon run a proxy on each host in the cluster. The proxy plays the role of a server‑side discovery load balancer. In order to make a request to a service, a client routes the request via the proxy using the host’s IP address and the service’s assigned port. The proxy then transparently forwards the request to an available service instance running somewhere in the cluster.
The server‑side discovery pattern has several benefits and drawbacks. One great benefit of this pattern is that details of discovery are abstracted away from the client. Clients simply make requests to the load balancer. This eliminates the need to implement discovery logic for each programming language and framework used by your service clients. Also, as mentioned above, some deployment environments provide this functionality for free. This pattern also has some drawbacks, however. Unless the load balancer is provided by the deployment environment, it is yet another highly available system component that you need to set up and manage.

Can I make calls directly to pods from outside Kubernetes?

I'm attempting to transition existing applications to Kubernetes that work as follows:
An outside service calls our application through a load balancer with a new session.
Our application returns the ip of the server that processed the request.
All subsequent calls from the outside service for that session are made directly to the same server (bypassing the load balancer)
Is there any way to do this in Kubernetes? I understand that pod IPs are not exposed externally; is there some way to expose them directly?
Also, I don't think I can use sessionAffinity="ClientIP" because the requests will all come in from the same place. Is there a way to write a custom sessionAffinity type?
It kind of depends on how your network is set up and what you mean by an "outside service", but the answer is most likely "no".
If you're running using one of the default cluster creation scripts in a cloud environment, pod IP addresses are not routable from the Internet, so any service not in the same private network as your cluster won't be able to talk directly to pods.
However, depending on what cloud provider you're on, you'll likely get the behavior that you want anyways by just continuing to make all calls through to the external IP of a service of type LoadBalancer. For instance, on the Google Cloud Platform, the cloud load balancer that gets created for such services by default maintains connection affinity by 5-tuple (src ip and port, dst ip and port, L4 protocol), which sounds like it's what you want, since you want balancing per session rather than per IP.
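The idea behind 5-tuple affinity can be sketched as a deterministic hash over the connection's identifying fields. This is only an illustration of the principle, not the algorithm any cloud load balancer actually uses; the function name and the sample addresses in the usage note are made up.

```python
import hashlib


def backend_for_connection(src_ip, src_port, dst_ip, dst_port, protocol, backends):
    """Map a connection's 5-tuple to one backend, deterministically."""
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer index into the backend list.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

Calling `backend_for_connection("203.0.113.7", 40001, "198.51.100.1", 80, "TCP", ["pod-a", "pod-b", "pod-c"])` repeatedly returns the same backend every time, so all packets of one connection stay together. A new connection from the same client usually has a different source port and may therefore hash to a different backend, so this kind of affinity holds per connection rather than per application-level session.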
As for creating a new sessionAffinity type, that's not an easy thing to extend, since it requires changing Kubernetes source code. If that's really a path you want to take, it's likely that you'd want to run your own load balancer within your cluster rather than relying on the built-in load balancing.