We are starting our first Cloud Run project. In the past we used Postgres in combination with Spring Boot, and we had our migrations run via Flyway (similar to Liquibase) when the app started.
Now with Cloud Run, this approach may hit its limits because of the following (corner) cases:
Multiple incoming requests (HTTP, messages) routed to parallel instances could trigger the same migration in parallel while bootstrapping the container. That would result in exceptions and retries of failed messages, or in HTTP errors.
The Flyway check on bootstrap would additionally slow down cold start times every time a container gets started, which could happen a lot if we do not have constant traffic keeping instances "warm".
What would be a good approach with Spring Boot/Flyway and Postgres as the backing database shared across the instances? A similar problem arises if you replace Postgres with a NoSQL datastore, I guess, when you want or need to migrate to new structures.
Right now I can think of:
Running the Postgres schema migration as part of the deployment pipeline, before the new Cloud Run revision replaces the old one, which would introduce new challenges (rollbacks etc.).
Please share your ideas; looking forward to your answers.
Marcel
For migrations that introduce breaking changes, either on commit or on rollback, a full stop of the service is mandatory, and of course the rollback has to be planned accordingly.
Also note that a commit/push should not immediately trigger the new migrations; these are often not part of the regular CI/CD pipeline that goes to production.
After you deploy a service, you can create a new revision and assign a tag that allows you to access the revision at a specific URL without serving traffic.
A common use case for this is to run and control the first visit to this container. You can then use that tag to gradually migrate traffic to the tagged revision, or to roll back a tagged revision.
To deploy a new revision of an existing service to production:
gcloud beta run deploy myservice --image IMAGE_URL --no-traffic --tag TAG_NAME
The tag allows you to directly test the new revision (or, via the very first request to it, run the migration) at a specific URL, without serving traffic. The URL starts with the tag name you provided: for example, if you used the tag name green on the service myservice, you would test the tagged revision at https://green---myservice-abcdef.a.run.app
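Once the migration has run and been verified through the tagged URL, you can shift traffic gradually and roll back if needed. A minimal sketch with the gcloud CLI, reusing the service and tag names from the example above (the percentages and the PREVIOUS_REVISION placeholder are illustrative, not prescriptive):

# send 10% of live traffic to the tagged revision
gcloud run services update-traffic myservice --to-tags green=10

# promote it fully once it looks healthy
gcloud run services update-traffic myservice --to-tags green=100

# roll back by pointing all traffic at a previously serving revision
gcloud run services update-traffic myservice --to-revisions PREVIOUS_REVISION=100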
So far, in our legacy deployments of web services to VM clusters, we have effectively been using Log4j2-based multi-file logging onto a persistent volume, where the log files are rolled over each day. We need to retain logs for about 3 months before they can be purged.
We are migrating to a Kubernetes infrastructure and have been struggling with what the best logging strategy to adopt with Kubernetes clusters would be. We don't quite like the strategies that involve spitting out all logging to STDOUT/STDERR and using some centralized tool like Datadog to manage the logs.
Our design requirements for the Kubernetes logging solution are:
Using Log4j2 with multiple file appenders.
We want to maintain the multi-file log appender structure.
We want to preserve the rolling logs in archives for about 3 months.
We need a way to have easy access to the logs for searching, filtering etc.
The kubectl setup for viewing logs may be a bit too cumbersome for our needs.
Ideally we would like to use the Datadog dashboard approach BUT with multi-file appenders.
The serious limitation of Datadog that we run into is the need to have everything pumped to STDOUT.
Starting to use container platforms, or building containers, means that as a first step we must change our mindset. Creating log files inside your containers is not a best practice, for two reasons:
Your containers should be stateless, so they should not store anything inside themselves, because when a container is deleted and recreated your files will have disappeared.
When you send your output using passive logging (STDOUT/STDERR), Kubernetes creates the log files for you; these files can then be used by platforms like fluentd or Logstash to collect the logs and ship them to a log aggregation tool.
I recommend using passive logging, which is the approach recommended by Kubernetes and the standard for cloud-native applications. You may also need to run your app on a cloud service in the future, and those likewise rely on passive logging to surface application errors.
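To make this concrete, once everything goes to STDOUT/STDERR the cluster itself already gives you a first way to inspect logs, before any aggregation tool is involved. A small sketch with kubectl (pod, deployment and label names are hypothetical):

# tail the last hour of logs that Kubernetes captured from STDOUT/STDERR
kubectl logs my-app-pod-abc123 --since=1h

# follow the logs of all pods carrying a given label
kubectl logs -l app=my-app -f

# inspect the previous container instance, e.g. after a crash/restart
kubectl logs my-app-pod-abc123 --previous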
In the following links you will find some references on why Kubernetes recommends passive logging:
k8s checklist best practices
Twelve Factor Applications Logging
Build servers are generally detached from the VPC running the instance, be it Cloud Build on GCP or one of the many CI tools out there (CircleCI, Codeship etc.), so running DB schema updates is particularly challenging.
So it makes me wonder: where is the best place to run database schema migrations?
From my perspective, there are four opportunities to automatically run schema migrations or seeds within a CD pipeline:
Within the build phase
On instance startup
Via a warm-up script (synchronously or asynchronously)
Via an endpoint, either automatically or manually called post deployment
The primary issue with option 1 is security. With Google Cloud SQL and Google Cloud Build, it has been possible for me to run (with much struggle) schema migrations/seeds via a build step and a SQL proxy. To be honest, it was a total ball-ache to set up... but it works.
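For anyone trying the same route, the build step ends up looking roughly like the sketch below: start the Cloud SQL proxy inside the step, then run the migrations over localhost. The instance string, database name, credentials and the choice of the Flyway CLI are placeholders here, not a prescription:

# start the Cloud SQL proxy in the background inside the build step
./cloud_sql_proxy -instances=MY_PROJECT:europe-west1:my-instance=tcp:5432 &
sleep 5

# run any pending migrations against the proxied connection on localhost
flyway -url=jdbc:postgresql://127.0.0.1:5432/mydb -user=migrator -password="$DB_PASSWORD" migrate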
My latest project uses MongoDB, for which I've wired in migrate-mongo in case I ever need to move some data around or seed some data. Unfortunately there is no equivalent of the SQL proxy to securely connect MongoDB (Atlas) to Cloud Build (or any other CI tool), as the build doesn't run in the instance's VPC. Thus, it's a dead end in my eyes.
I'm therefore warming (no pun intended) to the warm-up script concept.
With App Engine, the warm-up script is called prior to traffic being served, and on a host which already has access via the VPC. The warm-up script is meant to be used for opening up database connections to speed up connectivity, but assuming there are no outstanding migrations, it would be doing exactly that: a very lightweight select statement.
Can anyone think of any issues with this approach?
Option 4 is also suitable (it's essentially the same thing). There may be a bit more protection required on these endpoints though - especially if a "down" migration script exists(!)
It's hard to answer because it's an opinion-based question!
Here are my thoughts on your propositions.
Option 1 (within the build phase): this is the best solution for me. Of course, you have to take care to only add fields and not to delete or remove existing schema fields. That way you can update your schema during the build phase, then deploy. The new deployment will use the new schema, and the obsolete fields will simply no longer be used. On the next schema update you will be able to delete these obsolete fields and clean up your schema, as sketched below.
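To give a concrete, hypothetical flavour of this additive-only pattern in relational terms (the DATABASE_URL variable, table and column names are made up):

# purely additive change: old revisions simply ignore the new column
psql "$DATABASE_URL" -c "ALTER TABLE users ADD COLUMN IF NOT EXISTS email_verified boolean DEFAULT false;"

# a later release, once no revision reads the obsolete column any more, can contract the schema
psql "$DATABASE_URL" -c "ALTER TABLE users DROP COLUMN IF EXISTS legacy_flag;"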
Option 2 (on instance startup): this will hurt your cold start performance. It's not a suitable solution.
Option 3 (warm-up script): same remark as before, and in addition you become tied to the App Engine infrastructure and way of working.
Option 4 (endpoint called post deployment): no real advantage compared to solution 1.
Regarding security, Cloud Build will soon be able to work with worker pools; it's not there yet, but I expect an alpha release of it in the coming months.
So I am exploring Jaeger for tracing, and I saw that we can send spans directly from the client to the collector over HTTP (port 14268). If so, what is the advantage of using the Jaeger agent?
When should we go with the Jaeger agent approach and when with the direct HTTP approach? What are the disadvantages of going directly to the collector?
From the official FAQ (https://www.jaegertracing.io/docs/latest/faq/#do-i-need-to-run-jaeger-agent):
jaeger-agent is not always necessary. Jaeger client libraries can be configured to export trace data directly to jaeger-collector. However, the following are the reasons why running jaeger-agent is recommended:
If we want SDK libraries to send trace data directly to collectors, we must provide them with a URL of the HTTP endpoint. It means that our applications require additional configuration containing this parameter, especially if we are running multiple Jaeger installations (e.g. in different availability zones or regions) and want the data sent to a nearby installation. In contrast, when using the agent, the libraries require no additional configuration because the agent is always accessible via localhost. It acts as a sidecar and proxies the requests to the appropriate collectors.
The agent can be configured to enrich the tracing data with infrastructure-specific metadata by adding extra tags to the spans, such as the current zone, region, etc. If the agent is running as a host daemon, it will be shared by all applications running on the same host. If the agent is running as a true sidecar, i.e. one per application, it can provide additional functionality such as strong authentication, multi-tenancy (see this blog post), pod name, etc.
Agents allow implementing traffic control to the collectors. If we have thousands of hosts in the data center, each running many applications, and each application sending data directly to the collectors, there may be too many open connections for each collector to handle. The agents can load balance this traffic with fewer connections.
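To illustrate the first point, here is roughly what the two configuration styles look like when using the standard Jaeger client environment variables (host names are placeholders):

# direct to collector: every application must know the collector's HTTP endpoint
export JAEGER_ENDPOINT=http://jaeger-collector.observability:14268/api/traces

# via the agent: the application only talks to localhost, and the agent
# (host daemon or sidecar) forwards the spans to the right collectors
export JAEGER_AGENT_HOST=localhost
export JAEGER_AGENT_PORT=6831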
Does anyone have a working example of using Snakemake with Azure Kubernetes Service (AKS)? If it is supported, which flags and setup are needed to use the Snakemake Kubernetes executor with AKS? The material out there is mostly about AWS with S3 buckets for storage.
I have never tried it, but you can basically take this as a blueprint and replace the Google storage part with a storage backend that works on Azure. As far as I know, Azure has its own storage API, but there are workarounds to expose an S3 interface (google for "Azure S3"). So the strategy would be to set up an S3 API and then use the S3 remote provider for Snakemake. In the future, Snakemake will also support Azure directly as a remote provider.
You are aware of this already, but for the benefit of others:
Support for AKS has now been built into Snakemake. This works even without a shared filesystem. Most parts of the original blog post describing the implementation have made it into the official Snakemake executor documentation.
In a nutshell: upload your data to Blob storage and deploy an AKS cluster. Then run Snakemake with these flags: --default-remote-prefix, --default-remote-provider AzBlob and --envvars AZ_BLOB_ACCOUNT_URL AZ_BLOB_CREDENTIAL, where AZ_BLOB_CREDENTIAL is optional if you use a SAS in the account URL. You can use your Snakefile as is.
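Put together, an invocation along the lines of the sketch below is what the documentation describes. The container name, SAS-style account URL and job count are placeholders; with a SAS token in the URL, only the account URL needs to be exported, as mentioned above:

export AZ_BLOB_ACCOUNT_URL="https://<account>.blob.core.windows.net/?<sas-token>"

# run the workflow on AKS, reading and writing via Blob storage
snakemake --kubernetes \
    --default-remote-provider AzBlob \
    --default-remote-prefix mycontainer \
    --envvars AZ_BLOB_ACCOUNT_URL \
    --jobs 4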
We are evaluating the Spring Batch framework to replace our home-grown batch framework in our organization, and we should be able to deploy the batch jobs to Pivotal Cloud Foundry (PCF). In this regard, can you share your thoughts on the questions below:
Let us say we use the remote partitioning strategy to process a large volume of records. Can the batch job auto-scale slave nodes in the cloud based on the amount of data the batch job processes? Or do we have to scale out the appropriate number of slave nodes and keep them in place before the batch job kicks off?
How does the "grid size" parameter factor into the configuration in the scenario above?
You have a few questions here. Before getting into them, though, let me take a minute to walk through where batch processing on PCF stands right now.
Current state of CF
As of PCF 1.6, Diego (the dynamic runtime within CF) provided a new primitive called Tasks. Traditionally, all applications running on CF were expected to be long running processes. Because of this, in order to run a batch job on CF, you'd need to package it up as a long running process (web app usually) and then deploy that. If you wanted to use remote partitioning, you'd need to deploy and scale slaves as you saw fit, but it was all external to CF. With Tasks, Diego now supports short lived processes...aka processes that won't be restarted when they complete. This means that you can run a batch job as a Spring Boot über jar and once it completes, CF won't try to restart it (that's a good thing). The issue with 1.6 is that an API exposing Tasks was not available so it was only an internal construct.
With PCF 1.7, a new API is being released to expose Tasks for general use. As part of the v3 API, you'll be able to deploy your own apps as Tasks. This allows you to launch a batch job as a task knowing it will execute, then be cleaned up by PCF. With that in mind...
Can the batch job auto-scale slave nodes in the cloud based on the amount of data the batch job processes?
When using Spring Batch's partitioning capabilities, there are two key components. The Partitioner and the PartitionHandler. The Partitioner is responsible for understanding the data and how it can be divided up. The PartitionHandler is responsible for understanding the fabric in which to distribute the partitions to the slaves.
For Spring Cloud Data Flow, we plan on creating a PartitionHandler implementation that will allow users to execute slave partitions as Tasks on CF. Essentially, what we'd expect is that the PartitionHandler would launch the slaves as tasks and once they are complete, they would be cleaned up.
This approach allows the number of slaves to be dynamically launched based on the number of partitions (configurable to a max).
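For a rough sense of what launching one of those worker partitions as a task could look like from the cf CLI once the v3 API is available (the app name and start command below are hypothetical, and the exact CLI support will arrive alongside that API):

# launch a one-off task against an already pushed worker application;
# the start command depends on your buildpack and packaging
cf run-task partition-worker "java -jar worker.jar --spring.profiles.active=worker"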
We plan on doing this work for Spring Cloud Data Flow but the PartitionHandler should be available for users outside of that workflow as well.
How does the "grid size" parameter factor into the configuration in the scenario above?
The grid size parameter is really used by the Partitioner and not the PartitionHandler and is intended to be a hint on how many workers there may be. In this case, it could be used to configure how many partitions you want to create, but that really is up to the Partitioner implementation.
Conclusion
This is a description of what a batch workflow on CF will look like. It's important to note that CF 1.7 is not out as of the writing of this answer. It is scheduled for Q1 of 2016, and this functionality will follow shortly after.