Querying Remote State Stores in Kubernetes (Interactive Queries) - kubernetes

Are there any recommendations on querying remote state stores between application instances that are deployed in Kubernetes? Our application instances are deployed with 2 or more replicas.
Based on the documentation:
https://kafka.apache.org/10/documentation/streams/developer-guide/interactive-queries.html#id7
streams.allMetadataForStore("word-count")
    .stream()
    .map(streamsMetadata -> {
        // Construct the (fictitious) full endpoint URL to query the current remote application instance
        String url = "http://" + streamsMetadata.host() + ":" + streamsMetadata.port() + "/word-count/alice";
        // Read and return the count for 'alice', if any.
        return http.getLong(url);
    })
    .filter(s -> s != null)
    .findFirst();
Will streamsMetadata.host() return the pod IP? If it does, will a call from this pod to another pod be allowed? Is this the correct approach?

streamsMetadata.host()
This method returns whatever you configured via the application.server configuration parameter. That is, each application instance (in your case, each pod) must set this config to tell the other instances how it can be reached (e.g., its IP and port). Kafka Streams distributes this information to all application instances for you.
You also need to configure your pods to allow sending and receiving query requests on the specified port. This part is additional code you need to write yourself: some kind of "query routing layer". Kafka Streams only has built-in support for querying local state and for distributing the metadata about which state is hosted where; there is no built-in remote query support.
An example implementation (WordCountInteractiveQueries) of a query routing layer can be found on GitHub: https://github.com/confluentinc/kafka-streams-examples
I would also recommend checking out the docs and blog post:
https://docs.confluent.io/current/streams/developer-guide/interactive-queries.html
https://www.confluent.io/blog/unifying-stream-processing-and-interactive-queries-in-apache-kafka/
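As a minimal sketch of the application.server part: assuming each pod exposes its own IP in an environment variable (POD_IP is a hypothetical name here, for example one populated from status.podIP through the Kubernetes Downward API) and that the query layer listens on a port of your choosing (8080 is arbitrary), the advertised endpoint could be assembled like this before the properties are handed to Kafka Streams. The class and method names are illustrative only, and no Kafka classes are used so the snippet runs on its own.

```java
import java.util.Map;
import java.util.Properties;

final class AppServerConfig {
    // "application.server" is the Kafka Streams config key discussed above.
    static final String APPLICATION_SERVER = "application.server";

    // Builds the host:port string that other instances later see via
    // streamsMetadata.host() and streamsMetadata.port().
    // POD_IP is an assumed variable name; nothing sets it by default.
    static String endpoint(Map<String, String> env, int port) {
        String host = env.get("POD_IP");
        if (host == null || host.isBlank()) {
            throw new IllegalStateException("POD_IP is not set; cannot advertise this instance");
        }
        return host + ":" + port;
    }

    static Properties streamsProperties(Map<String, String> env, int port) {
        Properties props = new Properties();
        props.setProperty(APPLICATION_SERVER, endpoint(env, port));
        // application.id, bootstrap.servers, etc. would be set here as well.
        return props;
    }

    public static void main(String[] args) {
        Properties props = streamsProperties(Map.of("POD_IP", "10.1.2.3"), 8080);
        System.out.println(props.getProperty(APPLICATION_SERVER)); // prints 10.1.2.3:8080
    }
}
```

In a real deployment the map would be System.getenv(), and the same port would have to be exposed by the pod so that other instances can reach the query endpoint.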

Related

How are dead nodes handled in AWS OpenSearch?

I am trying to understand the right approach to connecting to AWS OpenSearch (single cluster, multiple data nodes).
To my understanding, as long as the data nodes are behind the load balancer (according to this and other AWS docs: https://aws.amazon.com/blogs/database/set-access-control-for-amazon-elasticsearch-service/), we cannot use:
var pool = new StaticConnectionPool(nodes);
and we probably should not use CloudConnectionPool, since it was originally dedicated to Elasticsearch cloud and was left in the OpenSearch client by mistake?
Hence we use SingleNodeConnectionPool and it works, but I've noticed several exceptions indicating that the node had DeadUntil set to a date one hour in the future. Is that expected behavior, given that from the client's perspective this is the only node it knows about?
What is the correct way to connect to AWS OpenSearch that has multiple nodes, and should I be concerned about the DeadUntil property?

How do I seed default data to MongoDB (or any database) in a microservice architecture?

I have a use case with multiple microservices, one of which deals with roles and resources (let's call this microservice A). Resources are just endpoints.
A maintains a collection (let's call it X) that stores all the resources from the different microservices. For each microservice other than A, I would like to store all of its resources (endpoints) in X the first time that microservice boots up.
I am thinking of keeping a JSON file with all the resources in each microservice and calling A's endpoint to add the resources whenever a microservice boots up.
Is there an idiomatic way to do this?
Consider using Viper, which lets you set default data from multiple sources such as YAML, JSON, remote config stores like etcd, and live-watched files, among others. You can also configure the call to an endpoint with its remote configuration feature.
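Whichever source supplies the list, the registration call described in the question runs on every boot, so A's side probably needs to tolerate repeats. A minimal sketch of that idea in plain Java, assuming an in-memory map stands in for collection X; ResourceRegistry and register are hypothetical names, and a real service would back this with the database (for example, with a unique index on service plus endpoint).

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class ResourceRegistry {
    // service name -> set of endpoints; stands in for collection X
    private final Map<String, Set<String>> resourcesByService = new ConcurrentHashMap<>();

    // Adds any endpoints that are not stored yet and returns how many were new,
    // so calling it again on a restart does not create duplicates.
    int register(String service, List<String> endpoints) {
        Set<String> known = resourcesByService.computeIfAbsent(service, s -> ConcurrentHashMap.newKeySet());
        int added = 0;
        for (String endpoint : endpoints) {
            if (known.add(endpoint)) {
                added++;
            }
        }
        return added;
    }

    public static void main(String[] args) {
        ResourceRegistry registry = new ResourceRegistry();
        List<String> endpoints = List.of("GET /orders", "POST /orders");
        System.out.println(registry.register("orders", endpoints)); // first boot: prints 2
        System.out.println(registry.register("orders", endpoints)); // restart: prints 0
    }
}
```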

Dedicate a node to a stream - Security rules

Can anyone let me know how to show a stream only on a specific node?
I have a 2-node cluster, and I would like to dedicate RIM01 to Stream1 and RIM02 to Stream2, meaning any request to those streams, or to apps in those streams, should go to their respective nodes.
So if I go to RIM01, Stream2 should be hidden, etc.
Central node
RIM02 -- Repository + Engine
RIM03 -- Repository + Engine + Scheduler
I tried a lot of security rules, like
Filter : ServerNodeConfiguration_,Stream_
(node.#NodeUse="dev") and (node.#NodeType=stream.#StreamType and !resource.stream.Empty())
or
Filter : ServerNodeConfiguration_,Stream_
((resource.resourcetype = "Nodes" and resource.name="RIM01")) and ((resource.name="test"))
but none of them work.
Thanks
At present, load balancing in Qlik Sense applies to apps, not streams. Load balancing routes apps to servers, whereas security rules govern stream visibility. Unfortunately, there is no clean mechanism for using node metadata in security rules. All in all, there isn't a solution for hiding a stream on a given server.
I have the same issue. You can designate apps as readable only on a single node, so depending on how your users' stream rights are configured, some users may see an empty stream on the node where the app cannot be accessed.
There's some interesting stuff happening with the multi-cloud capability, where the concept of streams becomes collections, which gives a lot more flexibility around this type of thing. Alas, the QEFE capability has only just arrived with June 2018, and access is limited to certain use cases / customers.

Dynamically Configuring a Zuul Proxy during Runtime?

I have a url path that looks like this:
/{identifier}/rest/of/resource/path
If the identifier is A then the request should go to service_I. If the identifier is B then the request should also go to service_I. If the identifier is C, then the request should go to service_II, and so on.
Later on, new identifiers M and N are added to the system, and their requests should be routed to service_IV.
Is it possible to dynamically configure a Spring cloud zuul proxy to perform the tasks described above?
Edit
This question offers a different way to look at the problem.
In it Zuul has the following configuration:
zuul:
  routes:
    <service_id>:
      path: /path/**
Zuul will collaborate with Eureka to find the service-id and return the host parameters so that the service can be accessed. What if instead of /path we have /{userID} and the userID instances are distributed across several service_id hosts?
Can Zuul / the DiscoveryClient query Eureka for both the service_id and the userID to figure out which host is hosting the particular userID?
You would need to write a custom ZuulFilter to accomplish this. Take a look at the PreDecorationFilter for some hints as this is the filter responsible for handling /path where the path is a service-id (among other things).
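The lookup such a filter would perform can be sketched independently of Zuul. The following is plain Java with no Spring or Zuul types; IdentifierRouteTable and its methods are hypothetical names, and wiring the result into a ZuulFilter is left out. It only illustrates that the identifier-to-service mapping can live in a structure that is updated while requests are being served.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

final class IdentifierRouteTable {
    // identifier -> service id; a concurrent map so routes can change at runtime
    private final Map<String, String> routes = new ConcurrentHashMap<>();

    void assign(String identifier, String serviceId) {
        routes.put(identifier, serviceId);
    }

    // Extracts {identifier} from /{identifier}/rest/of/resource/path
    // and returns the service it is currently assigned to, if any.
    Optional<String> resolve(String requestPath) {
        String trimmed = requestPath.startsWith("/") ? requestPath.substring(1) : requestPath;
        int slash = trimmed.indexOf('/');
        String identifier = slash < 0 ? trimmed : trimmed.substring(0, slash);
        return Optional.ofNullable(routes.get(identifier));
    }

    public static void main(String[] args) {
        IdentifierRouteTable table = new IdentifierRouteTable();
        table.assign("A", "service_I");
        table.assign("B", "service_I");
        table.assign("C", "service_II");
        table.assign("M", "service_IV"); // added later, without a restart
        System.out.println(table.resolve("/M/rest/of/resource/path").orElse("no route")); // prints service_IV
    }
}
```

A pre-routing filter could call resolve on the incoming request path and set the target service from the result; how the table itself gets updated (an admin endpoint, a config refresh, a registry lookup) is a separate design choice.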

Embedding custom metadata with Service Fabric application/service

The objective that I have is to run multiple applications with some metadata embedded into applications/services so that I could query applications/services using the metadata. Is this possible?
I was looking at the following post, and the answer hints at this possibility but gives no specific details on how to achieve the result.
The primary piece of "metadata" you get is the service/application instance name. That's what I talked about in my other post. It works by creating each service/application instance with a name that contains some information clients can use when resolving it. Clients can then query Service Fabric for named application/service instances and connect to a specific one. A service/application instance name is a URI, so you can use a path hierarchy to categorize information.
Continuing with the audio/video example: Let's extend that example so we have an application that can perform specific tasks for specific media formats for audio or video. Each combination of task + media format is a unique named service instance, resulting in a deployment that looks something like this:
Application:
fabric:/avapp
Services:
fabric:/avapp/video/encoding/mp4
fabric:/avapp/video/encoding/h264
fabric:/avapp/video/captioning/english
fabric:/avapp/video/captioning/czech
fabric:/avapp/audio/encoding/aac
fabric:/avapp/audio/encoding/mp3
etc.
Now clients can query Service Fabric to discover what services are available:
FabricClient fabricClient = new FabricClient();
System.Fabric.Query.ServiceList services = await fabricClient.QueryManager.GetServiceListAsync(new Uri("fabric:/avapp"));
Then you can simply query the list of services with LINQ. For example, if I want to see all services that do video encoding:
services.Where(x => x.ServiceName.AbsolutePath.Contains("video/encoding"));
And then you can resolve an address for a specific service to connect to it:
ServicePartitionResolver resolver = ServicePartitionResolver.GetDefault();
ResolvedServicePartition servicePartition = await resolver.ResolveAsync(new Uri("fabric:/avapp/video/encoding/h264"), new ServicePartitionKey(1), cancellationToken);
ResolvedServiceEndpoint endpoint = servicePartition.GetEndpoint();
There's a bit more to the address resolution part (see here), but that's the general idea.
Application instances also allow you to set custom application parameters (key-value pairs) that can be set per instance at creation time. They don't show up in the application name, but you get that information back when you ask Service Fabric for a list of running application instances. That can potentially also be used as metadata by clients when they need to decide what application to connect to.
Update: More info on application instance parameters:
When you create a new application instance you can supply a set of key-value pairs in the application description. Then when you query Service Fabric for application instances you get back a list of Application result objects that have said parameters. This also shows up in Visual Studio, in your application project, where you have environment-specific application parameter files. Visual Studio extracts those key-value pairs from the XML files and uses them in the application description when it creates an instance of your application.
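To illustrate how a client might use such parameters as metadata, here is a small sketch in plain Java. It does not call the Service Fabric API: the map of instance names to key-value pairs stands in for whatever the application query returns, and the application names and the "Region" key are made up for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class AppParameterFilter {
    // instances: application instance name -> its application parameters.
    // Returns the names of the instances whose parameters contain key=value, sorted by name.
    static List<String> withParameter(Map<String, Map<String, String>> instances, String key, String value) {
        return instances.entrySet().stream()
                .filter(e -> value.equals(e.getValue().get(key)))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical instances and parameters, standing in for a real query result.
        Map<String, Map<String, String>> instances = Map.of(
                "fabric:/avapp-eu", Map.of("Region", "eu"),
                "fabric:/avapp-us", Map.of("Region", "us"));
        System.out.println(withParameter(instances, "Region", "eu")); // prints [fabric:/avapp-eu]
    }
}
```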