Is there a way to make all quarkus logs follow a schema - keycloak

I am looking for a solution to format Quarkus logs in JSON with respect to a schema, i have come across Quarkus Logging Json but what i dont understand is how to implement an extension so that all the logs generated by the application are then formatted with respect to my implementation, can some one help ?
The application is running in a docker container and is keycloak 18, i would like to get the logs and format them based on my implementation.

Related

How To Design a Distributed Logging System in Kubernetes?

I'm designing a distributed application, comprised of several Spring microservices that will be deployed with Kubernetes. It is a batch processing app, and a typical request could take several minutes of processing, with the processing getting distributed across the services, using Kafka as a message broker.
A requirement of the project is that each request will generate a log file, which will need to be stored on the application file store for retrieval. The current design is, all the processing services write log messages (with the associated unique request ID) to Kafka, and there is a dedicated logging microservice that reads these messages down, does some formatting and should persist them to the log file associated with the given request ID.
I'm very unfamiliar with how files should be stored in web applications. Should I be storing these log files to the local file system? If so, wouldn't that mean this "logging service" couldn't be scaled? For example, if I scaled the log service to 2 instances, then each instance would only have access to half of the log files in theory. And if a user makes a request to retrieve a log file, there is no guarantee that the requested log file will be at whatever log service instance the Kubernetes load balancer routed them too.
What is the currently accepted "best practice" for having a file system in a distributed application? Or should I just accept that the logging service can never be scaled up?
A possible solution I can think of would just store the text log files in our MySQL database as TEXT rows, making the logging service effectively stateless. If someone could point out any potential issues with this that would be much appreciated?
deployed with Kubernetes
each request will generate a log file, which will need to be stored on the application file store
Don't do this. Use a Fluentd / Filebeat / promtail / Splunk forwarder side car that gathers stdout from the container processes.
Or have your services write to a kafka logs topic rather than create files.
With either option, use a collector like Elasticsearch, Grafana Loki, or Splunk
https://kubernetes.io/docs/concepts/cluster-administration/logging/#sidecar-container-with-a-logging-agent
wouldn't that mean this "logging service" couldn't be scaled?
No, each of these services are designed to be scaled
possible solution I can think of would just store the text log files in our MySQL database as TEXT rows,
Sure, but Elasticsearch or Solr are purpose-built for gathering and searching plaintext, not MySQL.
Don't treat logs as something application specific. In other words, your solution shouldn't be unique to Spring

Apache Druid Datasources creation

I have a Druid platform deployed without a router (and therefore no UI) on Kubernetes. I noticed that some datasources have disappeared (most probably erased manually). Is there a way to re-created them manually without re-deploying the full platform (for example, by restarting the ingestion server, through an API call, other)?
Thanks - Christian
It really depends on how the data was deleted and whether the segment files are still being pointed to by the Metadata Database.
Your best starting point is the full Druid API documentation.
For example, they may be in Deep Storage and still in Metadata DB but just marked as "unused", in which case you can use an API:
https://druid.apache.org/docs/latest/operations/api-reference.html#post-1
Or they could be "used" in the MDDB but just not being loaded, in which case check the Load Rules:
https://druid.apache.org/docs/latest/operations/api-reference.html#retention-rules

Kubernetes Log Splitting (Stdout/Stderr)

When I call kubectl logs pod_name, I get both the stdout/err combined. Is it possible to specify that I only want stdout or stderr? Likewise I am wondering if it is possible to do so through the k8s rest interface. I've searched for several hours and read through the repository but could not find anything.
Thanks!
No, this is not possible. To my knowlegde, the moment of writing this, kubernetes supports only one logs api endpoint that returns all logs (stdout and stderr combined).
If you want to access them separately you should consider using different logging driver or query logs directly from docker.

Safely give secret/token to Kafka Connector?

We are using Kafka Connectors (JDBC and others), and configuring them using the REST API (using curl in shell scripts). Right now, when testing/developing, we are including secrets (for the JDBC connect - database user/pw) directly in the request. This is obviously bad, as those are then readily available for everybody to see when reading them out using the GET request.
Is there a good way to give secrets to the connectors? We can bring them in safely using environment variables or config files (injected fom OpenShift) - but is there a syntax available when starting a connector via the REST API for that?
EDIT: This is for the distributed mode of connectors; i.e., configuration by REST API, not connector config files...
A pluggable interface for this was implemented in Apache Kafka 2.0 through KIP-297. You can see more details in the documented example here.

Logging Kubernetes with an external ELK stack

Is there any documentation out there on sending logs from containers in K8s to an external ELK cluster running on EC2 instances?
We're in the process of trying to Kubernetes set up and I'm trying to figure out how to get the logging to work correctly. We already have an ELK stack setup on EC2 for current versions of the application but most of the documentation out there seems to be referring to ELK as it's deployed to the K8s cluster.
I am also working on the same cause.
First you should know what driver is being used by your docker containers to manage the logs (json driver/ journald etc - read here).
After that you should use some log collector in your architecture to send the logs to the Logstash endpoint. You can use filebeat/fluent bit. They are light weight alternatives to logstash/fluentd respectively. You must use one of them and not directly send your logs to logstash via syslog since these log shippers have a special functionality of enriching your logs with kubernetes metadata of the respective containers.
There might be lot of challenges after that. Parsing log data (multiline logs for example) etc. For an efficient pipeline, it’s better to do most of the work (i.e. extracting the date object from the logs etc) at the log sender side, than using the common logstash for this purpose that might be a bottle-neck.
Note that in case the container logs are not sent to stdout/stderr but written else-where, you might need to run filebeat/fluent-bit as side-car with your containers.
As for the links for documentation are concerned, I myself didn’t find anything documented in a single place on this, but the keywords that I mentioned over, reading about them I got to know many things.
Hope this helps.