Migrating Cassandra to GCP - kubernetes

I am migrating Cassandra to Google Cloud and have looked at a few options, such as deploying Cassandra inside Kubernetes, using DataStax Enterprise on GCP, and Portworx, but I am not sure which one to use. Can someone suggest options that have worked well for deploying Cassandra in the cloud?

As Carlos Monroy mentioned in his comment, this question is wide-ranging: the answer depends heavily on the use case, the number of users, and the SLA. I've found these links useful; they describe how to deploy Cassandra on GCE and how to run Cassandra on GKE with StatefulSets. This documentation covers the DataStax Distribution of Apache Cassandra on GCP Marketplace. You should also compare the cost of running these products; you can estimate the charges with the GCP pricing calculator.

Related

What benefits does Cloud Composer provide over a Helm chart and GKE?

As I dive into the world of Cloud Composer, Airflow, Google Kubernetes Engine, and Kubernetes I've not yet found a good answer to what exactly makes Cloud Composer better than Helm and GKE.
Here are some things I've found that could be unique to Composer but mostly seem like they could be handled by GKE.
On their homepage:
End-to-end integration with Google Cloud products including BigQuery, Dataflow, Dataproc, Datastore, Cloud Storage, Pub/Sub, and AI Platform gives users the freedom to fully orchestrate their pipeline.
On the features page:
Identity-Aware Proxy protects the interface
Cloud Composer associates a Cloud Storage bucket with the environment. The associated bucket stores the DAGs, logs, custom plugins, and data for the environment.
The downsides of Composer I've seen include:
It takes many hours to spin up a new instance
It doesn't support Kubernetes Executor
It is risky to change the underlying GKE config because it could be changed back by a composer update
Errors often occur during auto-scaling, although they are documented as known issues
Upgrading environments is still beta
To be clear, I'm not saying Cloud Composer is bad. I'm just having trouble seeing why people like it. When I've asked folks why it is better than Helm + GKE, they haven't had any compelling answers, even though they can tell many stories of Composer being unpredictable and having lots of issues.
Are you comparing the same things?
On one side, with GKE, you have a container orchestrator. Declare what you want, and it will deploy it and keep the cluster in line with the declared configuration. This configuration can be packaged with Helm to make it easier to write. Because you deploy containers, you can use any language you want in your services.
On the other side, you have a workflow manager, with a scheduler, retry policies, parallel tasks, and context forwarding. You write DAGs in Python (only!) and you have operators to interact with external products and services. It's mainly designed for data processing and is used a lot by data science and data engineering teams.
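To make the workflow-manager side a bit more concrete, here is a toy sketch of what "tasks in dependency order, with retries and a forwarded context" means. It uses only the Python standard library and is not Airflow's API; the function name run_workflow and the three task names are made up for illustration, and it runs tasks one after another rather than in parallel.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks, dependencies, max_retries=2):
    """Run tasks in dependency order, retrying failures and sharing a context.

    tasks: task name -> callable(context) returning a value
    dependencies: task name -> set of task names that must finish first
    """
    context = {}   # results forwarded to downstream tasks
    attempts = {}  # number of tries each task needed
    # Make sure every task is in the graph, even those without dependencies.
    graph = {name: dependencies.get(name, set()) for name in tasks}
    for name in TopologicalSorter(graph).static_order():
        for attempt in range(1, max_retries + 2):
            attempts[name] = attempt
            try:
                context[name] = tasks[name](context)
                break
            except Exception:
                if attempt == max_retries + 1:
                    raise  # retries exhausted, fail the whole workflow


    return context, attempts


# Hypothetical three-step pipeline; "transform" fails once to exercise the retry.
_calls = {"transform": 0}


def extract(context):
    return [1, 2, 3]


def transform(context):
    _calls["transform"] += 1
    if _calls["transform"] == 1:
        raise RuntimeError("transient failure")
    return [x * 10 for x in context["extract"]]


def load(context):
    return sum(context["transform"])


results, attempts = run_workflow(
    {"extract": extract, "transform": transform, "load": load},
    {"transform": {"extract"}, "load": {"transform"}},
)
print(results["load"], attempts["transform"])
```

Airflow gives you this kind of behaviour (plus scheduling, parallelism and operators) out of the box, which is the part a plain Helm chart on GKE does not provide by itself.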
Note: Cloud Composer is deployed on top of GKE (scheduler and workers), Redis, App Engine and Cloud SQL.
You are comparing 2 different worlds: the Ops world (GKE/Helm) and the App/Data world (Composer/Airflow). Have a look at this new video.
Update 1:
My bad, I didn't understand! Anyway, personally I don't want to manage things myself: the cluster, K8s upgrades, VM patching, replicas, snapshots, backup/restore, ...
If someone can do this for me, I prefer that, and managed services are perfect for me!
Do you ask yourself the same question about Cloud SQL versus a database you manage yourself on a Compute Engine instance? If not (because Cloud SQL solves a lot of boring issues), my opinion is the same for Composer.
But it's only an opinion; I haven't tested both or compared their performance, cost and ease of use.

How to deploy the postgresql server in Google cloud

I am trying to deploy my PostgreSQL server to Google Cloud, the way we deploy on Heroku, but I can't find any tutorial or proper docs to get started.
Can anyone please help me with this? Thanks!
You can easily migrate a postgres database to Google Cloud SQL.
Basically, it involves creating a Cloud SQL instance and setting up replication through a Compute Engine VM, then seeding and migrating your data.
The official documentation for this from Google is here:
Migrate an on-premises PostgreSQL cluster to Google Cloud
This is a very good post that gives a detailed step-by-step guide to the entire process:
How to migrate PostgreSQL databases to Google Cloud SQL

How to get data in Anthos Metrics for Kubernetes clusters

We have one project with two clusters inside. We would like to monitor and set alert policies for plenty of parameters like kube_pod_status_phase, kube_pod_container_status_restarts_total, etc. We are able to see all these parameters in Metrics Explorer (with prefix kubernetes.io/anthos/..) but it doesn't show any data. Can anyone please tell us whether some configuration is missing for using Anthos Metrics? Or can anyone provide a guide or steps to use Anthos Metrics?
Note: We have Istio configured in both clusters and we are using the Workload Identity feature as well.
Any help would be highly appreciated.
Thank you.
I don't think you want to use these metrics.
Anthos, Anthos GKE and GKE are 3 different Google products.
GKE:
is an enterprise-grade platform for containerized applications, including stateful and stateless, AI and ML, Linux and Windows, complex and simple web apps, API, and backend services. Leverage industry-first features like four-way auto-scaling and no-stress management. Optimize GPU and TPU provisioning, use integrated developer tools, and get multi-cluster support from SREs.
Anthos
is an open hybrid and multi-cloud application platform that enables you to modernize your existing applications, build new ones, and run them anywhere in a secure manner. Built on open source technologies pioneered by Google—including Kubernetes, Istio, and Knative—Anthos enables consistency between on-premises and cloud environments and helps accelerate application development.
Anthos GKE
is part of Anthos, lets you take advantage of Kubernetes and cloud technology in your data center and in the cloud. You get Google Kubernetes Engine (GKE) experience with quick, managed, and simple installs as well as upgrades validated by Google. And Google Cloud Console gives you a single pane of glass view for managing your clusters across on-premises and cloud environments.
If you check the Anthos GKE pricing information, you will read that:
Anthos is available as a monthly, term-based subscription service. Anthos subscription is required to use Anthos GKE. For pricing please contact sales.
So to get Anthos metrics, you would need to use Anthos GKE, which requires an Anthos subscription. That can add cost; for details you would probably need to contact sales.
For monitoring purposes, check the possibilities described here and choose the one that fits you best.
However, the most common approaches are Prometheus on GKE and Stackdriver.
In addition, you can find many how-to guides on the web about monitoring on GKE, like this tutorial.
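Whichever backend ends up holding the data, an alert policy in Cloud Monitoring selects a metric by its full type string. Below is a minimal sketch of building that filter for the metric names from the question, assuming they really are ingested under the kubernetes.io/anthos/ prefix quoted there. The helper name metric_filter is made up for illustration; only the `metric.type = "..."` filter form comes from Cloud Monitoring.

```python
ANTHOS_PREFIX = "kubernetes.io/anthos/"  # prefix as quoted in the question


def metric_filter(metric_name, prefix=ANTHOS_PREFIX):
    """Return a Cloud Monitoring filter string selecting one metric type."""
    return f'metric.type = "{prefix}{metric_name}"'


# Metric names taken from the question.
filters = [
    metric_filter(name)
    for name in (
        "kube_pod_status_phase",
        "kube_pod_container_status_restarts_total",
    )
]
for f in filters:
    print(f)
```

If a filter like this matches a metric in Metrics Explorer but shows no time series, that usually means the metric descriptor exists but nothing in the cluster is writing data to it.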

Kubeflow pipeline on AWS/On-prem currently feasible?

I’m testing Kubeflow Pipelines and would like to use it on AWS/on-prem, but I saw the comment below in the documentation. Should I wait before using it on AWS/on-prem?
Due to kubeflow/pipelines#345 and kubeflow/pipelines#337, Kubeflow Pipelines depends on Google Cloud Platform (GCP) services and some of the functionality is currently not supported by non-GKE clusters.
I'd suggest waiting. See the status updates on those issues and the activity on kubeflow/pipelines#1131 to enable support on AWS. Similar work is in progress to support on-prem as well.
On-prem is definitely possible.
KFP is not tied to GCP and can work even on Windows tablets.
AWS support is mostly provided by the community, but most of the issues have been fixed. There are samples that demonstrate KFP on AWS, and there are multiple AWS and SageMaker components created by people from AWS. https://github.com/kubeflow/pipelines/blob/091316b8bf3790e14e2418843ff67a3072cfadc0/samples/contrib/aws-samples/titanic-survival-prediction/titanic-survival-prediction.py

Selecting the best way to deploy MongoDB on Cloud Platform?

I am using Google Cloud Platform for my project and am planning to use a MongoDB cloud service.
I am not sure how to choose among the MongoDB options on Cloud Platform.
I have read this link, which describes 3 ways to deploy MongoDB.
Please help me select the best option.
Recommendation
I have used all three options for deploying MongoDB on the Cloud Platform and believe the Cloud Launcher for MongoDB is by far your best choice.
Justification
I would like to address each of the three deployment options and explain my reasoning.
Cloud Launcher for MongoDB
The Cloud Launcher for MongoDB is what I would recommend. It's much simpler than creating a MongoDB database in any other way, since there are presets and you click through a nice UI. This was how I created my first MongoDB database, and I felt pretty confident throughout the setup process.
MongoDB Cloud Manager
The MongoDB Cloud Manager is a more advanced version of the Cloud Launcher for MongoDB. It supports "more complex deployments... ...such as complex replica sets or sharded clusters." You will be able to work your way towards these complex deployments with the Cloud Launcher for MongoDB, without being overwhelmed immediately.
Google Cloud Deployment Manager
The Google Cloud Deployment Manager "lets you automate the setup of [the] MongoDB Cloud Manager." The MongoDB Cloud Manager is already more complicated than the Cloud Launcher for MongoDB, so there is no reason for you to automate deployments at this point.
Documentation Quoted in this Answer