I am exploring Jaeger for tracing and saw that we can send spans directly from the client to the collector over HTTP (port 14268). If so, what is the advantage of using the Jaeger agent?
When should we go with the Jaeger agent approach, and when with the direct HTTP approach? What are the disadvantages of going directly to the collector?
From the official FAQ (https://www.jaegertracing.io/docs/latest/faq/#do-i-need-to-run-jaeger-agent):
jaeger-agent is not always necessary. Jaeger client libraries can be configured to export trace data directly to jaeger-collector. However, the following are the reasons why running jaeger-agent is recommended:
If we want SDK libraries to send trace data directly to collectors, we must provide them with a URL of the HTTP endpoint. It means that our applications require additional configuration containing this parameter, especially if we are running multiple Jaeger installations (e.g. in different availability zones or regions) and want the data sent to a nearby installation. In contrast, when using the agent, the libraries require no additional configuration because the agent is always accessible via localhost. It acts as a sidecar and proxies the requests to the appropriate collectors.
The agent can be configured to enrich the tracing data with infrastructure-specific metadata by adding extra tags to the spans, such as the current zone, region, etc. If the agent is running as a host daemon, it will be shared by all applications running on the same host. If the agent is running as a true sidecar, i.e. one per application, it can provide additional functionality such as strong authentication, multi-tenancy (see this blog post), pod name, etc.
Agents allow implementing traffic control to the collectors. If we have thousands of hosts in the data center, each running many applications, and each application sending data directly to the collectors, there may be too many open connections for each collector to handle. The agents can load balance this traffic with fewer connections.
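For illustration, here is a minimal sketch of the two sender setups using the Jaeger Java client (io.jaegertracing). The collector host name, the service name, and the sampler settings are placeholders, not values from the question; port 14268 is the collector HTTP port mentioned above, and 6831 is the agent's default UDP port.

```java
import io.jaegertracing.Configuration;
import io.opentracing.Tracer;

public class TracerFactory {

    // Direct-to-collector: the application must be configured with the collector URL.
    static Tracer directToCollector() {
        Configuration.SenderConfiguration sender = new Configuration.SenderConfiguration()
                .withEndpoint("http://jaeger-collector.example.com:14268/api/traces");
        return tracer(sender);
    }

    // Via the agent: no URL needed, the agent sidecar/daemon listens on localhost.
    static Tracer viaAgent() {
        Configuration.SenderConfiguration sender = new Configuration.SenderConfiguration()
                .withAgentHost("localhost")
                .withAgentPort(6831); // default UDP port for the compact thrift protocol
        return tracer(sender);
    }

    private static Tracer tracer(Configuration.SenderConfiguration sender) {
        return new Configuration("my-service")
                .withSampler(new Configuration.SamplerConfiguration().withType("const").withParam(1))
                .withReporter(new Configuration.ReporterConfiguration().withSender(sender))
                .getTracer();
    }
}
```

The same choice can usually be made without code changes via environment variables (JAEGER_ENDPOINT for the collector, or JAEGER_AGENT_HOST/JAEGER_AGENT_PORT for the agent), which is where the per-environment configuration concern from the first FAQ point typically shows up.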
I am trying to run some installation instructions for a software development environment built on top of K3S.
I am getting the error "no nodes available to schedule pods", which, when Googled, leads me to the question no nodes available to schedule pods - Running Kubernetes Locally with No VM.
The answer to that question tells me to run kubectl get nodes.
And when I do that, it shows me, perhaps not surprisingly, that I don't have any nodes running.
Without having to learn how Kubernetes actually works, how can I start some nodes and get past this error?
This is a local environment running on a single VM (just like the linked question).
It would depend on how your K8s was installed. Kubernetes is a complex system requiring multiple nodes, all configured correctly, in order to function.
If no nodes are found for scheduling, my first thought would be that you only have a single node and it's a master node (which runs the control-plane services but not workloads), and that you have not attached any worker nodes. You would need to add another node to the cluster, running as a worker, for it to schedule workloads.
If you want to get up and running without understanding it all, there are distributions such as minikube or k3s which set everything up out of the box and are designed to run on a single machine.
So far, in our legacy deployments of web services to VM clusters, we have effectively been using Log4j2-based multi-file logging onto a persistent volume, where the log files are rolled over each day. We need to retain logs for about 3 months before they can be purged.
We are migrating to a Kubernetes infrastructure and have been struggling with what the best logging strategy would be for Kubernetes clusters. We don't much like the strategies that involve sending all logging to STDOUT/STDERR and using some centralized tool like Datadog to manage the logs.
Our design requirements for the Kubernetes logging solution are:
Use Log4j2 with multiple file appenders.
We want to maintain the multi-file log appender structure.
We want to preserve the rolled-over logs in archives for about 3 months.
We need a way to have easy access to the logs for searching, filtering, etc.
The kubectl setup for viewing logs may be a bit too cumbersome for our needs.
Ideally we would like to use the Datadog dashboard approach, BUT with multi-file appenders.
The serious limitation of Datadog we run into is the need to have everything pumped to STDOUT.
Starting to use container platforms, or building containers, means that as a first step we must change our mindset. Creating log files inside your containers is not a best practice, for two reasons:
Your containers should be stateless, so they should not store anything inside themselves; when a container is deleted and recreated, those files disappear.
When you send your output using passive logging (STDOUT/STDERR), Kubernetes creates the log files for you, and those files can be picked up by platforms like Fluentd or Logstash, which collect the logs and ship them to a log aggregation tool.
I recommend passive logging, which is the approach recommended by Kubernetes and the standard for cloud-native applications. If in the future you need to run your app on a cloud service, those services also rely on passive logging to surface application errors.
The following links give some references on why Kubernetes recommends passive logging:
k8s checklist best practices
Twelve Factor Applications Logging
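If you do move to the STDOUT approach, a rough sketch of what it can look like with Log4j2's programmatic ConfigurationBuilder API is below; the appender name, pattern, and log level are arbitrary choices, and the same configuration is more commonly expressed in log4j2.xml rather than in code.

```java
import org.apache.logging.log4j.Level;
import org.apache.logging.log4j.core.appender.ConsoleAppender;
import org.apache.logging.log4j.core.config.Configurator;
import org.apache.logging.log4j.core.config.builder.api.AppenderComponentBuilder;
import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilder;
import org.apache.logging.log4j.core.config.builder.api.ConfigurationBuilderFactory;
import org.apache.logging.log4j.core.config.builder.impl.BuiltConfiguration;

public class StdoutLoggingConfig {

    public static void configure() {
        ConfigurationBuilder<BuiltConfiguration> builder =
                ConfigurationBuilderFactory.newConfigurationBuilder();

        // One console appender replacing the rolling-file appenders;
        // Kubernetes (and agents like Fluentd or Datadog) pick the stream up from there.
        AppenderComponentBuilder console = builder.newAppender("Stdout", "CONSOLE")
                .addAttribute("target", ConsoleAppender.Target.SYSTEM_OUT);
        console.add(builder.newLayout("PatternLayout")
                .addAttribute("pattern", "%d{ISO8601} %-5p [%t] %c{1.} - %m%n"));
        builder.add(console);

        builder.add(builder.newRootLogger(Level.INFO)
                .add(builder.newAppenderRef("Stdout")));

        Configurator.initialize(builder.build());
    }
}
```

The retention and search requirements then move out of the application: the aggregation layer (Fluentd/Logstash/Datadog and whatever store sits behind them) handles the 3-month archive and the filtering, instead of rolled files on a persistent volume.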
In terms of getting the most value for the investment, how does AWS MSK compare to Confluent when it comes to hosting an end-to-end Kafka event-sourcing setup?
The main criteria for comparison are:
Monitoring
Ease of deployment and configuring
Security
I have used open source, on-prem Cloudera, and MSK. Comparing them, they have all had their quirks.
If you judge purely by the speed of provisioning a secure Kafka cluster, I think MSK would win hands down. Someone with Kafka, AWS Certificate Manager, and Terraform experience can get it all done very quickly. There are a few issues around Terraform, TLS, and the AWS CLI, but there are workarounds.
If you are planning to use Kafka Connect, then Confluent makes a lot of sense.
If you have Kafka developers experienced in writing Kafka Connect sinks and sources, then you may not need a subscription-based model from Confluent, though you may not save a lot of money either way: you either spend on development or on subscription costs.
If you like serverless, MSK is quite good. However, there is no SSH access to the Kafka cluster, and you cannot tune the JVM.
Monitoring comes out of the box with MSK via open monitoring (JMX metrics and Prometheus). You also have CloudWatch. Open monitoring gives you pretty much all the metrics you need. In open source you can deploy the same monitoring yourself; MSK is essentially doing that for you.
MSK provides security using either TLS or IAM. There are some issues around enabling IAM-based security for MSK using Terraform, but two-way TLS client authentication is quite easy to set up.
MSK also provides auto scaling, but again, if you are planning to use Terraform, there may be some interoperability issues.
I am sure folks here can add a lot more on Confluent.
We are starting our first Cloud Run project. In the past we used Postgres in combination with Spring Boot, and we ran our migrations via Flyway (similar to Liquibase) when the app started.
Now, with Cloud Run, this approach may hit its limits because of the following (corner) cases:
Multiple incoming requests (HTTP, messages) routed to parallel instances could run the same migration in parallel while bootstrapping the container, resulting in exceptions and retries of failed messages or HTTP errors.
The Flyway check on bootstrap would additionally slow down cold-start times every time a container starts, which could add up if we do not have constant traffic keeping instances "warm".
What would be a good approach with Spring Boot/Flyway and Postgres as the backing database shared across the instances? A similar problem arises, I guess, when you replace Postgres with a NoSQL datastore and want/need to migrate to new structures.
Right now I can think of:
Migrating the Postgres schema as part of the deployment pipeline, before the Cloud Run revision gets replaced, which would introduce new challenges (rollbacks, etc.).
Please share your ideas; looking forward to your answers.
Marcel
For migrations that introduce breaking changes, either on commit or on rollback, it is mandatory to have a full stop and, of course, a rollback planned accordingly.
Also note that a commit/push should not immediately trigger the new migrations. Often these are not part of the regular CI/CD pipeline that goes to production.
After you deploy a service, you can create a new revision and assign a tag that allows you to access the revision at a specific URL without serving traffic.
A common use case for this, is to run and control the first visit to this container. You can then use that tag to gradually migrate traffic to the tagged revision, and to rollback a tagged revision.
To deploy a new revision of an existing service to production:
gcloud beta run deploy myservice --image IMAGE_URL --no-traffic --tag TAG_NAME
The tag allows you to directly test the new revision at a specific URL without serving traffic (and, via that very first request, to run the migration). The URL starts with the tag name you provided: for example, if you used the tag name green on the service myservice, you would test the tagged revision at the URL https://green---myservice-abcdef.a.run.app
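One way to wire that up on the Spring Boot side is sketched below, assuming automatic migration on startup is switched off (e.g. spring.flyway.enabled=false) and that a hypothetical /internal/migrate endpoint is what you call once on the tagged, no-traffic revision before shifting traffic to it:

```java
import javax.sql.DataSource;
import org.flywaydb.core.Flyway;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class MigrationController {

    private final Flyway flyway;

    // Build Flyway manually because the Spring Boot auto-run on startup is disabled.
    public MigrationController(DataSource dataSource) {
        this.flyway = Flyway.configure().dataSource(dataSource).load();
    }

    // Called once against the tagged revision's URL; in practice this should be
    // secured or reachable only through that no-traffic tagged URL.
    @PostMapping("/internal/migrate")
    public String migrate() {
        flyway.migrate(); // applies pending migrations; a no-op if none are pending
        return "migrations applied";
    }
}
```

This keeps the migration out of the cold-start path (addressing the second corner case in the question) and ensures it runs exactly once per deployment rather than once per instance.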
There is an emerging trend of ripping global state out of traditional "static" config management tools like Chef/Puppet/Ansible, and instead storing configurations in some centralized/distributed tool, of which the main players appear to be:
ZooKeeper (Apache)
Consul (Hashicorp)
Eureka (Netflix)
Each of these tools works differently, but the principle is the same:
Store your env vars and other dynamic configurations (that is, stuff that is subject to change) in these tools as key/value pairs
Connect to these tools/services via clients at startup and pull down your config KV pairs. This typically requires the client to supply a service name ("MY_APP"), and an environment ("DEV", "PROD", etc.).
There is an excellent Consul Java client which explains all of this beautifully and provides ample code examples.
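As a rough sketch (not taken from that client), pulling a single value at startup over Consul's plain HTTP KV API might look like this; the agent address and the key path config/MY_APP/DEV/db.url are made up for illustration, and ?raw simply returns the stored value without the usual JSON/base64 envelope.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConsulStartupConfig {
    public static void main(String[] args) throws Exception {
        // Key path encodes service name and environment, as described above.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8500/v1/kv/config/MY_APP/DEV/db.url?raw"))
                .GET()
                .build();
        String dbUrl = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
        System.out.println("db.url = " + dbUrl); // value pulled once at startup
    }
}
```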
My understanding of these tools is that they are built on top of consensus algorithms (such as Zab and Paxos) and gossip protocols that allow config updates to spread almost virally, with eventual consistency, throughout your nodes. So the idea is that if you have a myapp app with 20 nodes, say myapp01 through myapp20, and you make a config change on one of them, that change will naturally "spread" through the 20 nodes over a period of seconds/minutes.
My problem is: how do these updates actually deploy to each node? In none of the client APIs (the one I linked to above, the ZooKeeper API, or the Eureka API) do I see some kind of callback functionality that can be set up and used to notify the client when the centralized service (e.g. the Consul cluster) wants to push and reload config updates.
So I ask: how is this supposed to work (dynamic config deployment and reload on clients)? I'm interested in any viable answer for any of those 3 tools, though Consul's API seems to be the most advanced IMHO.
You could use cfg4j for that. It's a Java configuration library for distributed services. It supports Consul as one of the configuration sources.
That's a nice question. I can tell you how the Consul HTTP client works.
I also initially thought it worked via a push mechanism, but while recently exploring Consul I found that all Consul clients poll the server for the changes they want to watch. It is a slightly different polling mechanism, though: Consul supports blocking queries. These are HTTP requests with a maximum timeout of 10 minutes. Such a query waits until there is some change on the watched key/folder and then returns with the latest index. If the index has changed, the client reloads the configuration. For more info: Consul Blocking Query
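A bare-bones version of that polling loop, using only Consul's documented index/wait query parameters and the X-Consul-Index response header, could look roughly like this; the agent address and the key config/MY_APP are placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class ConsulConfigWatcher {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String key = "http://localhost:8500/v1/kv/config/MY_APP"; // placeholder key
        long index = 0;

        while (true) {
            // Blocking query: the server holds the request open until the key
            // changes or the wait time elapses, then responds with the current index.
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(key + "?index=" + index + "&wait=10m"))
                    .timeout(Duration.ofMinutes(11)) // client timeout slightly above the server-side wait
                    .GET()
                    .build();
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());

            long newIndex = response.headers()
                    .firstValue("X-Consul-Index")
                    .map(Long::parseLong)
                    .orElse(index);
            if (newIndex != index) {
                index = newIndex;
                System.out.println("Config changed, reloading: " + response.body());
                // a real client would re-parse the (base64-encoded) KV payload and apply it here
            }
        }
    }
}
```

This is the "callback" the question was looking for: the client effectively long-polls, and the reload logic runs whenever the returned index moves.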