Does anyone have a working example of using Snakemake with Azure Kubernetes Service (AKS)? If it is supported, which flags and setup are needed to use the Snakemake Kubernetes executor with AKS? Most of the material out there covers AWS with S3 buckets for storage.
I have never tried it, but you can basically take this as a blueprint and replace the Google storage part with a storage backend that works in Azure. As far as I know, Azure has its own storage API, but there are workarounds to expose an S3 interface (search for "Azure S3"). So the strategy would be to set up an S3 API and then use the S3 remote provider for Snakemake. In the future, Snakemake will also support Azure directly as a remote provider.
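To make the workaround concrete, here is a minimal Snakefile sketch, assuming an S3-compatible gateway has already been put in front of the Azure storage account; the endpoint URL, credentials, bucket, and file names are placeholders, and forwarding endpoint_url to the provider is an assumption about how to point it at a non-AWS endpoint.

```python
# Minimal Snakefile sketch for the S3-workaround approach (Snakemake's classic remote API).
# Endpoint URL, credentials, bucket, and file names are placeholders.
from snakemake.remote.S3 import RemoteProvider as S3RemoteProvider

S3 = S3RemoteProvider(
    access_key_id="MY_ACCESS_KEY",
    secret_access_key="MY_SECRET_KEY",
    endpoint_url="https://s3-gateway.example.com",  # assumed: extra kwargs reach the underlying boto3 client
)

# A toy rule that reads and writes via the S3-compatible remote.
rule count_lines:
    input:
        S3.remote("my-bucket/data/reads.txt")
    output:
        S3.remote("my-bucket/results/line_count.txt")
    shell:
        "wc -l {input} > {output}"
```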
You may be aware of this already, but for the benefit of others:
Support for AKS has now been built into Snakemake. This works even without a shared filesystem. Most parts of the original blog post describing the implementation have made it into the official Snakemake executor documentation.
In a nutshell: upload your data to Blob storage and deploy an AKS cluster. Then run Snakemake with these flags: --default-remote-prefix, --default-remote-provider AzBlob, and --envvars AZ_BLOB_ACCOUNT_URL AZ_BLOB_CREDENTIAL, where AZ_BLOB_CREDENTIAL is optional if you use a SAS in the account URL. You can use your Snakefile as is.
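For concreteness, a minimal launcher sketch is below. The blob container name, storage account URL, and job count are placeholders, and it assumes your current kubectl context already points at the AKS cluster so the standard Kubernetes executor (--kubernetes) can be used.

```python
# Hypothetical launcher: runs Snakemake on AKS with Azure Blob storage as the remote.
# All names below are placeholders; the SAS token is embedded in the account URL,
# so AZ_BLOB_CREDENTIAL is not needed here.
import os
import subprocess

os.environ["AZ_BLOB_ACCOUNT_URL"] = "https://<storage-account>.blob.core.windows.net/?<sas-token>"

subprocess.run(
    [
        "snakemake",
        "--kubernetes",                               # Kubernetes executor; uses the current kubectl context (the AKS cluster)
        "--default-remote-provider", "AzBlob",        # read/write workflow files via Azure Blob storage
        "--default-remote-prefix", "snakemake-data",  # placeholder: blob container holding inputs/outputs
        "--envvars", "AZ_BLOB_ACCOUNT_URL",           # forward the credential env var to the jobs
        "--jobs", "3",
    ],
    check=True,
)
```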
I currently have an application deployed on AKS that produces to a Kafka topic, with Kafka deployed on HDInsight. I want to implement SASL/OAUTHBEARER as the security mechanism.
However, I'd also like the secrets to be stored in Azure Key Vault (AKV).
Is it possible to sync the secrets stored in AKV with Kafka on HDInsight?
I have not tried it yet, as I couldn't find any documentation online indicating whether it is feasible, hence I'm looking for guidance on this.
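For reference, the producer side of SASL/OAUTHBEARER can be sketched roughly as below with the confluent-kafka Python client; the broker endpoint, topic, and token retrieval are placeholders, and the client secret used to request the token is the piece that would be fetched from Key Vault (the AKV-to-HDInsight sync itself is the open question).

```python
# Rough sketch of a SASL/OAUTHBEARER producer using confluent-kafka.
# Broker endpoint, topic, and token retrieval are placeholders.
import time
from confluent_kafka import Producer

def fetch_oauth_token(oauthbearer_config):
    # Placeholder: request a bearer token from your identity provider here,
    # e.g. using a client secret read from Azure Key Vault at startup.
    # Must return (token_string, expiry_as_seconds_since_epoch).
    return "eyJ-placeholder-token", time.time() + 3600

producer = Producer({
    "bootstrap.servers": "my-hdinsight-broker:9093",  # placeholder broker endpoint
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "OAUTHBEARER",
    "oauth_cb": fetch_oauth_token,                    # invoked whenever a fresh token is needed
})

producer.produce("my-topic", value=b"hello")
producer.flush()
```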
So far, in our legacy deployments of web services to VM clusters, we have effectively been using Log4j2-based multi-file logging onto a persistent volume, where the log files are rolled over each day. We need to retain logs for about 3 months before they can be purged.
We are migrating to a Kubernetes infrastructure and have been struggling to decide on the best logging strategy to adopt with Kubernetes clusters. We don't quite like the strategies that involve sending all logging to STDOUT/STDERR and using centralized tools like Datadog to manage the logs.
Our design requirements for the Kubernetes logging solution are:
Using Log4j2 with multiple file appenders.
We want to maintain the multi-file log appender structure.
We want to preserve the rolling logs in archives for about 3 months.
We need a way to have easy access to the logs for searching, filtering, etc.
The kubectl setup for viewing logs may be a bit too cumbersome for our needs.
Ideally, we would like to use the Datadog dashboard approach, but with multi-file appenders.
The serious limitation of Datadog we run into is the need to have everything pumped to STDOUT.
Starting to use container platforms, or building containers, means that as a first step we must change our mindset. Creating log files inside your containers is not a best practice, for two reasons:
Your containers should be stateless, so they should not save anything inside themselves, because when a container is deleted and created again your files will have disappeared.
When you send your output using passive logging (STDOUT/STDERR), Kubernetes creates the log files for you; these files can then be used by platforms like Fluentd or Logstash, which collect those logs and send them to a log aggregation tool.
I recommend using passive logging, which is the approach recommended by Kubernetes and the standard for cloud-native applications. You may also need to run your app on a cloud service in the future, and those services likewise rely on passive logging to surface application errors.
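As a trivial illustration of passive logging (sketched with Python's standard logging module rather than Log4j2), the application only writes to STDOUT and leaves file creation, rotation, and shipping to the platform; the logger name is a placeholder.

```python
# Passive-logging sketch: log to STDOUT, let the kubelet capture it and let a
# collector (e.g. Fluentd/Logstash) ship it to the aggregation backend.
import logging
import sys

logging.basicConfig(
    stream=sys.stdout,   # no log files inside the container
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)

log = logging.getLogger("payment-service")  # placeholder service name
log.info("order processed")                 # ends up in `kubectl logs` and downstream tooling
```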
The following links give some references on why Kubernetes recommends passive logging:
k8s checklist best practices
Twelve Factor Applications Logging
In terms of investing for the most value, how does AWS MSK compare to Confluent when it comes to hosting end-to-end Kafka event sourcing?
The main criteria to be used for comparison are:
Monitoring
Ease of deployment and configuration
Security
I have used open-source Kafka, on-prem Cloudera, and MSK. Comparing them, they have all had their quirks.
If you judge purely on the speed of provisioning a secure Kafka cluster, I think MSK would win hands down. Someone with Kafka, AWS Certificate Manager, and Terraform experience can get it all done very quickly. There are a few issues around Terraform, TLS, and the AWS CLI, but there are workarounds.
If you are planning to use Kafka Connect, then Confluent makes a lot of sense.
If you have Kafka developers with experience writing Kafka Connect sinks and sources, then you may not need a subscription-based model from Confluent. You may not save a lot of money either way, though: you either spend on development or spend on subscription costs.
If you like serverless, MSK is quite good. However, there is no SSH access to the Kafka cluster, and you cannot tune the JVM.
Monitoring comes out of the box for MSK via open monitoring with JMX metrics and Prometheus. You have CloudWatch as well, but open monitoring gives you pretty much all the metrics you need. With open-source Kafka you can easily deploy monitoring yourself; MSK is essentially doing the same for you.
MSK provides security using either TLS or IAM, though there are some issues around enabling IAM-based security for MSK using Terraform. Two-way TLS client authentication is quite easy to set up.
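For example, a mutual-TLS client against MSK can be configured roughly as follows with the confluent-kafka Python client; the broker endpoint, topic, and certificate paths are placeholders (in a typical MSK setup the client certificate would be issued by an ACM Private CA).

```python
# Sketch of two-way (mutual) TLS client authentication against MSK.
# Broker endpoint, topic, and certificate paths are placeholders.
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "b-1.mycluster.abc123.kafka.us-east-1.amazonaws.com:9094",  # MSK TLS listener (placeholder)
    "security.protocol": "SSL",
    "ssl.ca.location": "/etc/kafka/ca.pem",               # CA bundle used to verify the brokers
    "ssl.certificate.location": "/etc/kafka/client.pem",  # client certificate (authenticates the client)
    "ssl.key.location": "/etc/kafka/client.key",
})

producer.produce("orders", value=b"test-message")
producer.flush()
```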
MSK also provides auto-scaling, but again, if you are planning to use Terraform there may be some interoperability issues.
I am sure folks here can add a lot more on Confluent.
In researching the many AWS offerings and plans, I'm overwhelmed by the terminology and pricing around Docker, RDS, EC2, Beanstalk, and am trying to wrap my head around it all. In the end, all we'd like is the cheapest way to host internal Angular 7+ apps that have a corresponding Spring Boot REST API which pulls from a PostgreSQL database. Of course, each app/REST/DB stack should have dev, test, and prod environments as well. Utilizing AWS, what is a good and cost-effective way to achieve these requirements?
Angular - Use S3 and CloudFront (static content)
Spring Boot REST APIs - Use EC2, Beanstalk, or Lambda (for serverless)
PostgreSQL - Use RDS or install it on an EC2 instance.
For Angular and the Spring Boot REST APIs, you can host both of them on the same EC2 machine.
For the database, you can host the PostgreSQL servers on EC2 machines for the dev and test environments, and for production you can choose RDS.
I am looking to use MongoDB for my project but don't want the administrative overhead of managing Mongo services.
As my project currently hosts most of its components on AWS, I am looking for a managed MongoDB service (if any) provided by AWS.
AWS provides DynamoDB as a managed service, and it's well documented, but how to access a managed MongoDB service on AWS is not very clear to me.
I have read about the managed MongoDB service 'Atlas', but I'm not sure whether I can access it as a service from my existing AWS instances.
Please provide your input on the best practice for this scenario.
There is no managed MongoDB service provided by AWS.
However, there are managed MongoDB services that provide hosting on AWS (in addition to Azure, GCP, etc.); MongoDB Atlas is an example.
MongoDB Atlas provides a managed MongoDB service with the option to host on AWS, and you may opt to use that. You can choose the region of your preference and then use the VPC peering feature to let the application servers in your existing VPC/account communicate with the MongoDB Atlas setup.
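Once the network path is in place, connecting from an application server in your VPC is an ordinary driver connection. Below is a minimal pymongo sketch; the connection string, credentials, and database name are placeholders (Atlas shows the real connection string for your cluster).

```python
# Minimal sketch: connect to a MongoDB Atlas cluster from an app server (e.g. EC2).
# Connection string, credentials, and names are placeholders.
from pymongo import MongoClient

client = MongoClient(
    "mongodb+srv://appuser:<password>@cluster0.abc123.mongodb.net/"
    "?retryWrites=true&w=majority"
)

db = client["mydatabase"]                  # placeholder database name
db.items.insert_one({"status": "ok"})      # simple round-trip to verify connectivity
print(db.items.count_documents({}))
```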
You can read more about all these at https://www.mongodb.com/cloud/atlas