Health check route organisation in microservice(ish) setup behind AWS ALB

How to name health check routes among several services behind ALB?
I'm moving my API and database to AWS. Before moving I split up my monolith REST API into four services:
public API (to which apps and websites connect)
admin API (for admin web site)
messaging API (web socket server for realtime communication with apps)
workers (queue based task processors)
I'm now trying to figure out a good organisation of the routes. At first I created two subdomains, api.mydomain.com and www.mydomain.com.
I directed the api subdomain to my ALB which routed traffic based on the path only, like this:
"/sockets" -> messaging-api
"/admin" -> admin-api
"/" -> public-api
Now I'm trying to implement the health check routes. I'd like to name them "/health". But the health checks need to be directed to each target group. Since the ALB only routes based on the path, I cannot have /health on more than one server.
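To make the collision concrete, here is a minimal sketch of path-only routing for requests that arrive through the listener. The target names come from the list above; the first-match prefix logic is a simplifying assumption and not ALB's actual rule engine.

```python
# Simplified model of path-only listener rules: the first matching prefix wins.
# Target names mirror the routing list in the question; the matching logic is
# an assumption made for illustration.
RULES = [
    ("/sockets", "messaging-api"),
    ("/admin", "admin-api"),
    ("/", "public-api"),  # catch-all, must come last
]


def route(path: str) -> str:
    """Return the target for a request path using first-match prefix rules."""
    for prefix, target in RULES:
        # Match the prefix itself or anything below it ("/admin", "/admin/...").
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            return target
    raise LookupError(f"no rule matches {path!r}")
```

In this model a bare /health can only ever land on public-api, while /admin/health and /sockets/health reach their own services.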
Possible solutions:
1. Separate the services via subdomains
I could create a subdomain for each service like:
- api.mydomain.com
- sockets.mydomain.com
- admin.mydomain.com
With this setup I could have a /health in each service without collisions.
2. Separate the health check routes via naming
I could name the health check route differently for each service like:
api.mydomain.com/health-public-api
api.mydomain.com/health-messaging-api
api.mydomain.com/health-admin-api
Suggestions?
Both of the above solutions seem viable, but I'd like to know if one of them will bite me later, for example when more services are added, or when I add a GraphQL API later on.
edit:
I just bumped into one drawback with solution #1. My local
dev environment is set up with a Docker image for each service and
nginx for routing the requests. On top of this I use ngrok to be able
to reach the dev environment from the Internet.
I think it would be hard to solve the service separation based on
subdomains there, but I really don't need the /health routes in the dev
environment, so I guess I could just pretend they are not there.

Answering my own question as documentation and possibly some input to others.
tl;dr:
I went with a third option, separating all services via the first level in the path. The main difference from my previous structure is that my main API (aka public-api) has moved from the root to a subpath called /app. I also renamed it to app-api.
api.mydomain.com/app/..
api.mydomain.com/admin/..
api.mydomain.com/sockets/..
api.mydomain.com/auth/..
www.mydomain.com/..
This solution gives me several pros and no cons (I think).
Pros:
Easy to route requests both in the ALB and in the local dev environment via nginx, without the extra work needed for SNI
subdomains separate API vs web sites very clearly
/health routes get unique names by default since they live under separate paths.
the apps (web and smartphones) can use a common API URL (api.mydomain.com/) and still reach all services, i.e. they don't need to store several differently initialized Axios connections. No biggie, but still..
I also opted for making /health a little more future proof and standardized on the following structure in each service.
api.mydomain.com/servicename/health
api.mydomain.com/servicename/health/is-up
api.mydomain.com/servicename/health/is-ready
up = responding to requests, ready = all dependencies are connected (e.g. databases)
/health returns status 200 along with a JSON object describing the readiness.
/health/is-up responds with 200 or nothing (i.e. not reachable at all)
/health/is-ready responds with 200 if all dependencies are ready, otherwise 500.
The target groups in AWS will use /is-ready for health checks, but for now it's the same thing as /is-up since I haven't implemented the readiness tests yet.
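A minimal sketch of the three routes as a framework-independent function. It assumes the service sees paths relative to its own prefix (so /servicename/health arrives as /health); the `checks` mapping and the JSON field names are hypothetical.

```python
import json
from typing import Callable, Dict, Optional, Tuple

# Hypothetical dependency probes: name -> callable returning True when usable.
Checks = Dict[str, Callable[[], bool]]


def health_response(path: str, checks: Checks) -> Optional[Tuple[int, str]]:
    """Map a health path to (status, body); return None for non-health paths."""
    results = {name: bool(probe()) for name, probe in checks.items()}
    ready = all(results.values())  # True when no probes are registered yet
    if path == "/health":
        # Always 200; the body describes the readiness.
        body = {"up": True, "ready": ready, "dependencies": results}
        return 200, json.dumps(body)
    if path == "/health/is-up":
        # Answering at all means the service is up.
        return 200, ""
    if path == "/health/is-ready":
        return (200 if ready else 500), ""
    return None
```

With an empty `checks` mapping, /health/is-ready behaves exactly like /health/is-up, which matches the current state described above.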

Related

Allow API requests from a specific URL in azure kubernetes

I am using Azure Kubernetes for backend deployment. I have two URLs: one is the API URL (api.project.com) and the other is the BFF URL (bff.project.com).
From the web application, instead of calling the API URL (api.project.com) directly, they use the BFF URL (bff.project.com), which internally calls the API URL and sends back the response.
I now want to restrict direct usage of the API URL (api.project.com), even from REST API clients (like Postman, Insomnia, ...); it should only work when triggered from the BFF URL (bff.project.com).
We have used nginx-ingress for subdomain creation, and both URLs (BFF and API) are in the same cluster.
Is there any firewall or built-in Azure service to resolve the above-mentioned problem?
Thanks in Advance :)
You want to keep your api private, only accessible from another K8S service, so don't expose it using your ingress controller and it simply won't be accessible outside K8S to any client.
This means that you lose the api.project.com address (although you can get that back if you really want to, it seems unnecessary). The BFF would then access the API via the URL: http://<service-name>.<namespace>.svc.cluster.local:<service-port>, which in your case might be:
http://api.api-ns.svc.cluster.local
Assuming you haven't used TLS (http rather than https), the service is called api, it's running on port 80 (which it should be) and the namespace is called api-ns.
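As a small illustration of the URL pattern above, the BFF could build the internal address like this. This is a sketch that assumes the default cluster.local cluster domain; the function name and the namespace in the usage line are hypothetical.

```python
def cluster_url(service: str, namespace: str, port: int = 80, scheme: str = "http") -> str:
    """Build the cluster-internal URL of a Service (default cluster domain assumed)."""
    host = f"{service}.{namespace}.svc.cluster.local"
    # Leave the port out when it is the default for the scheme.
    default_port = {"http": 80, "https": 443}.get(scheme)
    if port == default_port:
        return f"{scheme}://{host}"
    return f"{scheme}://{host}:{port}"


# Hypothetical usage from the BFF: a service "api" in a namespace "backend".
API_BASE_URL = cluster_url("api", "backend")
```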
Should you need to provide temporary access to the API for developers to use, say, postman, then they can use port-forwarding to provide that in a dev environment without allowing external access all the time.
However, this won't restrict access to BFF alone. Any service running in K8S could access the API. If you need/want to restrict things further, then you have a lot of options.

How do I prevent anonymous requests to a REST API / NGINX server while allowing authenticated requests to endpoints?

Initial disclosure:
I’m new to nginx and reverse proxy configuration in general.
Background
I have a Swagger-derived, FOSS, https-accessible REST API [written by another party] running on a certain port of an EC2 CentOS 7 instance behind an nginx 1.16.1 reverse proxy (to path https://foo_domain/bar_api/); for my purposes, this API needs to be reachable from a broad variety of services not all of which publish their IP ranges, i.e., the API must be exposed to traffic from any IP.
Access to the API’s data endpoints (e.g., https://foo_domain/bar_api/resource_id) is controlled by a login function located at
https://foo_domain/bar_api/foobar/login
supported by token auth, which is working fine.
Problem
However, the problem is that an anonymous user is able to GET
https://foo_domain/bar_api
without logging in, which results in potentially sensitive data about the API server configuration being returned, such as the API’s true port, server version, some of the available endpoints and parameters, etc. This is not acceptable for the purpose, from a security standpoint.
Question
How do I prevent anonymous GET requests to the /bar_api/ endpoint, while allowing login and authenticated data requests to endpoints beyond /bar_api/ to proceed unhindered? Or, otherwise, how do I prevent any data from being returned upon such requests?

Kubernetes: What's the idiomatic way in K8s to setup a custom proxy between ingress and its services?

At present we have a lot of ASP.net WebAPI service applications hosted on premises. We are planning to move these to Azure AKS. We've identified a lot of common code across these applications which is mostly implemented as ASP.Net reusable middleware components so that the logic is not duplicated in code.
In a K8s environment it makes sense to offload this common functionality to one or more proxy applications which intercepts the requests being forwarded from the ingress to the services (assuming this is the correct approach). Some of the request inspection / manipulation logic is based on the service host and path to be defined in the ingress and even on the headers in the incoming requests.
For e.g. I considered using OAuth2_proxy but found that even though authentication is quite easy to implement, Azure AD group based authorization is impossible to do out of the box with that. So what's the idiomatic way one goes about setting up such a custom proxy application? (I'm familiar with using libraries such as ProxyKit middleware in ASP.Net to develop http proxies.)
One approach that comes to mind is to deploy such proxies as sidecar containers in each service application pod but that would mean there'd be unnecessary resource usage by all such duplicate container instances in each pod. I don't see the benefit over the use of middleware components as mentioned previously. :(
The ideal setup would be ingress --> custom proxy 1 --> custom proxy 2 --> custom proxy n --> service where custom proxies would be separately deployable and scalable.
So after a lot of reading and googling I found that the solution was to use API gateways that are available as libraries (preferably based on .NET):
Ocelot placed behind the nginx ingress fits the bill perfectly
Ocelot is a .NET API Gateway. This project is aimed at people using .NET running a micro services / service oriented architecture that need a unified point of entry into their system. However it will work with anything that speaks HTTP and run on any platform that ASP.NET Core supports.
Ocelot is currently used by Microsoft and Tencent.
The custom middleware and header/query/claims transformation solves my problem. Here are some worthy links
Microsoft Docs: Implement API Gateways with Ocelot
Ocelot on Github
Ocelot Documentation
Features
A quick list of Ocelot's capabilities; for more information see the documentation.
Routing
Request Aggregation
Service Discovery with Consul & Eureka
Service Fabric
Kubernetes
WebSockets
Authentication
Authorisation
Rate Limiting
Caching
Retry policies / QoS
Load Balancing
Logging / Tracing / Correlation
Headers / Query String / Claims Transformation
Custom Middleware / Delegating Handlers
Configuration / Administration REST API
Platform / Cloud Agnostic

API gateway/proxy pattern for microservices deployed using Azure Service Fabric

After watching the BUILD conference videos for Azure Service Fabric, I'm left imagining how this might be a good fit for our current microservice-based architecture. There is one thing I'm not entirely sure how I would go about solving, however - the API gateway/proxy.
Consider a less-than-trivial microservice architecture where you have N number of services running within the Azure Service Fabric exposing REST endpoints. In many situations, you want to package these fragmented API endpoints up into a single-entry API for consumers to use, to avoid having them connecting to the service fabric-instances directly. The Azure Service Fabric solution seems so complete in every way that I'm sort of wondering if I missed something obvious when I don't see a way to trivially solve this within the capabilities mentioned during the BUILD talks.
Services like Vulcan aim to solve this problem by having the services register the paths they want routed to them in etcd. I'm guessing one way of solving this may be to create a separate stateful web service that other services can register themselves with, providing service name and the paths they need routed to them. The stateful web service can then route traffic to the correct instance based on its state. This doesn't seem entirely ideal, though, with stuff like removing routes when applications are removed and generally keeping the state in sync with the services deployed within the cluster. Has anybody given this any thought, or have any ideas how one might go about solving this within Azure Service Fabric?
The service registration/discoverability you need to do this is actually already there. There's a stateful system service called the Naming Service, which is basically a registrar of service instances and the endpoints they're listening on. So when you start up a service - either stateless or stateful - and open some listener on it, the address gets registered with the Naming Service.
Now the part you'd need to fill in is the "gateway" that users interact with. This doesn't have to be stateful because the Naming Service manages the stateful part. But you'd have to come up with an addressing scheme that works for you, and then it would just forward requests along to the right place. Basically something like this:
Receive request.
Use NS to find the service that can take the request.
Forward the request to it and the response back to the user.
If the service doesn't exist anymore, 404.
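The steps above could look roughly like this. It is a sketch only: a plain dict stands in for the Naming Service, and the addressing scheme (first path segment = service name) is an assumption, not something Service Fabric prescribes.

```python
from typing import Callable, Dict, Tuple

# Stand-in for the Naming Service: service name -> callable that forwards a
# path to the registered endpoint and returns (status, body).
Registry = Dict[str, Callable[[str], Tuple[int, str]]]


def handle(path: str, registry: Registry) -> Tuple[int, str]:
    """Resolve the first path segment to a service and forward the remainder."""
    name, _, rest = path.lstrip("/").partition("/")
    backend = registry.get(name)
    if backend is None:
        # The service is not (or no longer) registered.
        return 404, "service not found"
    # Forward the request and hand the response back to the caller.
    return backend("/" + rest)
```

The gateway itself keeps no state here; everything it needs comes from the lookup, which is the point made above about the Naming Service managing the stateful part.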
In general we don't like to dictate anything about how your services talk to each other, but we are thinking of ways to solve this problem for HTTP as a complete built-in solution.
We implemented an HTTP gateway service for this purpose as well. To make sure we can have one HTTP gateway for any internal protocol, we implemented the gateway for HTTP-based internal services (like ASP.NET WebAPIs) using an ASP.NET 5 middleware. It routes requests from e.g. /service to an internal Service Fabric address like fabric:/myapp/myservice by using the ServicePartitionClient and some retry logic from CommunicationClientFactoryBase.
We open-sourced this middleware and you can find it here:
https://github.com/c3-ls/ServiceFabric-HttpServiceGateway
There's also some more documentation in the wiki of the project.
This feature is built in for HTTP endpoints, starting with release 5.0 of Service Fabric. The documentation is available at https://azure.microsoft.com/en-us/documentation/articles/service-fabric-reverseproxy/
We have used an open source project called Traefik with amazing success. There is an Azure Service Fabric wrapper around it - it's essentially a GoLang exe that is deployed onto the cluster as Managed Executable.
It supports circuit breakers, weighted round robin LB, path & header version routing (this is awesome for hosting multiple API versions), the list goes on. And it's got a handy portal to view the config and health stats.
The real power in it lies in how you configure it. It's done via the service itself in the ServiceManifest.xml. This allows you to deploy new services and have them immediately able to be routed to - no need to update a routing table etc.
Example
<StatelessServiceType ServiceTypeName="WebServiceType">
  <Extensions>
    <Extension Name="Traefik">
      <Labels xmlns="http://schemas.microsoft.com/2015/03/fabact-no-schema">
        <Label Key="traefik.frontend.rule.example">PathPrefixStrip: /a/path/to/service</Label>
        <Label Key="traefik.enable">true</Label>
        <Label Key="traefik.frontend.passHostHeader">true</Label>
      </Labels>
    </Extension>
  </Extensions>
</StatelessServiceType>
Highly recommended!
Azure Service Fabric makes it easy to implement the standard architecture for this scenario: a gateway service as a frontend for the clients to connect to and all the N backend services communicating with the front end gateway. There are a few communication API stacks available as part of Service Fabric that make it easy to communicate from clients to services and within services themselves. The communication API stacks provided by Service Fabric hide the details of discovering, connecting and retrying connections so that you can focus on the actual exchange of information.
When using the Service Fabric communication APIs, the services do not have to implement any mechanism for registering their names and endpoints with a specific routing service, beyond the usual steps that are part of creating the service itself. The communication APIs take in the service URI and partition key and automatically resolve and connect to the right service instance.
This article provides a good starting point to help decide which communication APIs will be best suited for your particular case, depending on whether you are using Reliable Actors or Reliable Services, on protocols such as HTTP or WCF, and on the programming language the services are written in. At the end of the article you will find links to more detailed articles and tutorials for different communication APIs. For a tutorial on communication in Web API services see this.
We are using SF with a gateway pattern and about 13 services behind the gateway. We use the built-in DNS service that SF provides (see: https://learn.microsoft.com/en-us/azure/service-fabric/service-fabric-dnsservice). This allows internal service-to-service calls with known (internal to SF) DNS names, including calls from the gateway service to internal services. There are some well-known ASP.NET Core gateways (Ocelot, ProxyKit) to use, but we rolled our own. We have an external load balancer to route to multiple gateway instances in SF.
When a service is started, it registers its endpoint with the fabric naming service. Using the Fabric client APIs you can then ask fabric for the registered endpoints associated with the registered service name.
So yes, just as you described your case, you would have a gateway that would accept an incoming URI for connection, and then use that path information as the service name lookup, to then create a proxy connection between the incoming request and the actual internal endpoint location.
Looks like the team has posted one of the samples that shows how to do this: https://github.com/Azure/servicefabric-samples/tree/master/samples/Services/VS2015/WordCount

How to use S3 as static web page and EC2 as REST API for it together? (AWS)

With AWS services we have the web application running from an S3 bucket and accessing the data through a REST API behind a Load Balancer (which fronts a set of Node.js applications running on an EC2 instance).
Currently we have specified the URLs as follows:
API Load Balancer: api.somedomain.com
Static Web App on S3: somedomain.com
But having this setup brought us a set of problems, since requests are cross-origin (CORS) with this setup. We could work around CORS with special headers, but that doesn't work with all browsers.
What we want to achieve is running API on the same domain but with different path:
API Load Balancer: somedomain.com/api
Static Web App on S3: somedomain.com
One of the ideas was to attach the API Load Balancer to the CDN and forward all requests to the Load Balancer if the request comes in on the "/api/*" path. But that doesn't work since our API uses not only HEAD and GET requests, but also POST, PUT, DELETE.
Another idea is using a second EC2 instance instead of the S3 bucket to host the website (using some web server like nginx or Apache). But that gives too much overhead when everything is already in place (S3 static content hosting). Also, in this scenario we wouldn't get all the benefits of Amazon CloudFront performance.
So, could you recommend how to combine the Load Balancer and S3, so they would run on the same domain, but with different paths? (API on somedomain.com/api and Web App on somedomain.com)
Thank you!
You can't have an EC2 instance and an S3 bucket with the same host name. Consider what happens when a web browser makes a request to that host name. DNS resolves it to an IP address (or addresses) and the packets of the request are delivered to that address. The address either terminates at the EC2 instance or the S3 bucket, not both.
As I understand your situation, you have static web pages hosted on S3 that include JavaScript code that makes various HTTP requests to the EC2 instance. If the S3 web pages are on a different host than the EC2 instance then the same origin policy will prevent the browser from even attempting some of the requests.
The only solutions I can see are:
Make all requests to the EC2 instance, with it fetching the S3 contents and delivering it to the browser whenever a web page is asked for.
Have your JavaScript use iframes and change the document.domain in the web pages to a common parent origin. For example, if your web pages are at www.example.com and your EC2 instance is at api.example.com, the JavaScript would change document.domain to just example.com and the browser would permit iframes from www.example.com to communicate with api.example.com.
Bite the bullet and use CORS. It's really not hard, and it's supported in all remotely recent browsers (IE 8 and 9 do it, but not in a standard way).
The first method is no good, because you almost might as well not use S3 at all in that case.
The second case should be okay for you. It should work in any browser, because it's not really CORS. So no CORS headers are needed. But it's tricky.
The third, CORS, approach should be just fine. Your EC2 instance just has to return the proper headers telling web pages from the S3 bucket that it's safe for them to talk to the EC2 instance.
Just wanted to add an additional bit to the answer: if we go with the CORS approach and the preflight requests add overhead to the server and network bandwidth, we may consider adding the "Access-Control-Max-Age" header to the CORS response.
Access-Control-Max-Age
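A minimal sketch of what the API could return for an OPTIONS preflight, including Access-Control-Max-Age so the browser can cache the result. The function name, the allowed methods and headers, and the 600 second default are assumptions for illustration; browsers also apply their own upper limit to the max-age value.

```python
from typing import Dict, Iterable


def preflight_headers(
    origin: str,
    allowed_origins: Iterable[str],
    max_age_seconds: int = 600,
) -> Dict[str, str]:
    """Return CORS headers for a preflight request, or {} if the origin is not allowed."""
    if origin not in set(allowed_origins):
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, HEAD, POST, PUT, DELETE",
        "Access-Control-Allow-Headers": "Content-Type, Authorization",
        # Lets the browser reuse this preflight result instead of repeating it.
        "Access-Control-Max-Age": str(max_age_seconds),
        # The response depends on the Origin header, so caches must key on it.
        "Vary": "Origin",
    }
```

For the setup in the question, the static site's origin (for example https://somedomain.com) would be the only entry in `allowed_origins`.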