Can I use n2 or n2d machine types with Cloud Dataproc?

I want to use Cloud Dataproc with n2 and other machine types that are not n1s. When I look at the Dataproc pricing and the Google Cloud Console it looks like I can only use n1 machine types.
Is there any way to use n2 and other machine types, like n2d? These machine types may save me money or be more appropriate for my workloads.

Cloud Dataproc does support machine types other than n1 when using the gcloud command-line tool. If you specify an n2 (or another machine type), Dataproc should use that machine type appropriately. For example:
--master-machine-type=n2-standard-4 --worker-machine-type=n2-standard-4
Keep in mind, the machine type you want to use must be available in the zone and region you specify. Support for machine types other than n1 is coming soon in the Cloud Console.
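For instance, a full cluster-creation command might look roughly like the following; the cluster name, region, zone, and worker count are placeholders for your own values:
gcloud dataproc clusters create my-cluster \
    --region=us-central1 \
    --zone=us-central1-a \
    --master-machine-type=n2-standard-4 \
    --worker-machine-type=n2-standard-4 \
    --num-workers=2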
Disclaimer - I am a PM for Dataproc and this is a common question as of Feb 2020.

Related

Specifying data for specific machines in cloud-init

Is it possible, using metadata and cloud init, to set parameters for an already deployed specific machine or a specific group of machines?

What's the value proposition of running Cloud Run versus a normal service in GKE?

Is there any advantage if I use Cloud Run instead of deploying a normal service/container in GKE?
I will try to add my perspective.
This answer does not cover running containers with Cloud Run on GKE. The reason is that we wanted an almost zero-cost solution for a legacy PHP website. Cloud Run fit perfectly, and we had an easy time both porting the code and learning Cloud Run.
We needed to do something with a legacy PHP website. This website was running on Windows Server 2012, IIS and PHP 7.0x. The cost was over $100.00 per month - mostly for Windows licensing fees for a VM in the cloud. The site was not accessed very much but was needed for various business reasons.
A decision was made Thursday (4/18/2019) that we needed to learn Google Cloud Run, so we decided to port this site to a container and try to run the container in Google Cloud. Nothing like a real-world example to learn the details.
Friday, we ported the PHP code to Apache. Very easy process. We did not worry about SSL, as we intended to use Cloud Run SSL.
Saturday, we started to learn Cloud Run. Within an hour we had the Hello World PHP example running.
Within two hours we had the containerized website running in Cloud Run. Again, very simple.
Then we learned how to configure Cloud Run SSL with our DNS server.
End result:
Almost zero cost for a PHP website running in Cloud Run.
Approximately 1.5 days of effort to port the legacy code and learn Cloud Run.
Savings of about $100.00 per month (no Windows IIS server).
We do not have to worry about SSL certificates from now on for this site.
For small, mostly static websites, Cloud Run is a killer product. The learning curve is very small, even if you do not know Google Cloud. You just need to configure gcloud for container builds and deployment. This means developers can work without needing to master GCP.
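As a rough sketch, the build-and-deploy loop comes down to two gcloud commands; the project ID, image name, service name, and region below are placeholders, and the exact flags may differ for your setup:
# Build the container image with Cloud Build and push it to the registry
gcloud builds submit --tag gcr.io/my-project/legacy-php-site
# Deploy the image to fully managed Cloud Run and allow public access
gcloud run deploy legacy-php-site \
    --image=gcr.io/my-project/legacy-php-site \
    --platform=managed \
    --region=us-central1 \
    --allow-unauthenticated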
There are many distinctions between using Cloud Run to expose a service and running it natively in GKE. The primary one is that Cloud Run provides more of a serverless infrastructure. Basically, you declare that you want to expose a service and then let GCP do the rest. Contrast this with creating a Kubernetes cluster and then defining your service in pods. With a manually created GKE cluster, the nodes and environment are always on, which means you are billed for them regardless of utilization. With Cloud Run, your service is merely available, and you are billed only for actual consumption. If your service is not being called, your costs are zero. Another advantage is that you don't have to predict your utilization needs and allocate sufficient nodes; scaling happens automatically for you.
See also these presentations from Google Next 19:
Migrating from a Monolith to Microservices (Cloud Next '19)
What's New in Serverless Compute? (Cloud Next '19)
Run Containers on GCP's Serverless Infrastructure (Cloud Next '19)
Run Cloud Functions Everywhere (Cloud Next '19)
Container Once, Serverless Anywhere (Cloud Next '19)

How to execute Amazon Lambda functions on dedicated EC2 server?

I am currently developing the backend for my app based on Amazon Web Services. I intended to use DynamoDB to store the user's data, but finally opted for MongoDB, which I have already installed on my EC2 instance.
I have some code written in Python to update/query... the DB, so that when a Cognito event triggers my Lambda function, this code is executed directly on my instance and I can access my DB. Any ideas on how I can accomplish this?
As Gustavo Tavares mentioned, "the whole point of lambda is to run code without the need to deploy EC2 instances". And you do not have to put the EC2 instance hosting your database in a "public" subnet for Lambda to access it. Actually, you should never do that.
When creating/editing the Lambda configuration you can choose to run it in any of your VPCs (Configuration -> Advanced Settings -> VPC). Then select the subnet(s) to run your Lambda in. This will create ENIs (Elastic Network Interfaces) for the virtual machines your Lambdas will run on.
Your subnets must have routing/ACLs configured to reach the subnets where the database resides. At least one of the security groups associated with the Lambda must also allow outbound traffic to the database subnet on the appropriate port (27017 for MongoDB).
Since you mentioned that your Lambdas are "back-end", you should probably put them in the same "private" subnets as your MongoDB and avoid any access/routing headache.
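If you prefer to script it, the VPC attachment can also be done with the AWS CLI, roughly as sketched below; the function name, subnet IDs, and security group ID are placeholders:
aws lambda update-function-configuration \
    --function-name my-cognito-trigger \
    --vpc-config SubnetIds=subnet-0abc1234,subnet-0def5678,SecurityGroupIds=sg-0123abcd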
One way to accomplish this is to give the Lambda a SAM template, then use sam local invoke inside the EC2 instance to execute it locally.
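A minimal sketch of that approach, assuming the SAM CLI and Docker are installed on the instance and the template defines a function named MyFunction (both names are illustrative):
# On the EC2 instance, from the directory containing template.yaml
sam build
# sam local invoke runs the function in a local Docker container
sam local invoke MyFunction --event events/event.json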
OK BUT WHY OH WHY WOULD ANYONE DO THIS?
If your Lambda requires access to both a VPC and the Internet, and doesn't use a lot of memory and doesn't really require scalability, and you already wrote the code (*), it's actually 10x cheaper(**) and higher-performing to launch a t3.nano EC2 Spot Instance on a public subnet than to add a NAT Gateway to the Lambda function.
(*) if you have not written the code yet, don't even bother to make it a Lambda.
(**) 10x cheaper as in $3 vs $30, so this really only applies to hobbyist projects on a shoestring budget. Don't do this at work, because the cost of engineers' time to manage and maintain an EC2 instance will far exceed $30/month over the long term.
If you want Lambda to execute code on your EC2 instances, you'll need to use the SDK for the language you're writing your Lambda in. Then you can use the AWS API to run commands on your EC2 instance.
See: http://docs.aws.amazon.com/systems-manager/latest/userguide/run-command.html
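For example, a caller with the right IAM permissions could use SSM Run Command to execute a script on the instance; the instance ID and script path below are placeholders, and the instance needs the SSM agent and an appropriate instance profile:
aws ssm send-command \
    --instance-ids i-0123456789abcdef0 \
    --document-name "AWS-RunShellScript" \
    --parameters commands="python3 /opt/app/update_db.py"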
I think you misunderstood the idea of AWS Lambda.
The whole point of Lambda is to run code without the need to deploy EC2 instances. You upload the code and the infrastructure is provisioned on the fly. If your application does not need the infrastructure anymore (after a brief period), it vanishes and you will not be charged for the idle time. If you need it again, new infrastructure is provisioned.
If you have a service, like your MongoDB, running on EC2 instances, your Lambda functions can access it like any other code. You just need to configure your Lambda code to connect to the EC2 instance, as you would if your database were installed on any other internet-facing server.
For example: you can put your MongoDB server in a public subnet of your VPC and assign an Elastic IP to your server. In your Python Lambda code you configure your driver to connect to this Elastic IP and update the database.
It will work as if every service were deployed on a different server across the internet: Cognito connects to the Lambda function over the internet, and the Python code deployed in Lambda connects to your MongoDB over the internet.
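The Elastic IP part of that setup can be sketched with the AWS CLI; the instance ID and allocation ID below are placeholders (the allocation ID is returned by the first command):
# Allocate an Elastic IP in the VPC
aws ec2 allocate-address --domain vpc
# Associate it with the MongoDB instance
aws ec2 associate-address \
    --instance-id i-0123456789abcdef0 \
    --allocation-id eipalloc-0abc1234def567890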
If I can give you a piece of advice, try DynamoDB a little more. With DynamoDB it will be even simpler to make all this work, because you will not need to configure a public subnet or request an Elastic IP. And the DynamoDB API is not very different from the MongoDB API.
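As a small taste of the DynamoDB API, here is a rough sketch using a hypothetical Users table keyed on userId:
# Create an on-demand table
aws dynamodb create-table \
    --table-name Users \
    --attribute-definitions AttributeName=userId,AttributeType=S \
    --key-schema AttributeName=userId,KeyType=HASH \
    --billing-mode PAY_PER_REQUEST
# Write and read an item
aws dynamodb put-item --table-name Users \
    --item '{"userId": {"S": "u-123"}, "name": {"S": "Alice"}}'
aws dynamodb get-item --table-name Users \
    --key '{"userId": {"S": "u-123"}}'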

How to change the machine type of a Google Cloud SQL failover replica

I recently changed the machine type for a Google Cloud SQL instance, but it did not automatically change the machine type for the failover replica. When I edit the replica, the button to change the machine type is greyed out. Does anyone know how I can change the machine type for the replica?
At the moment, it's not possible via the UI.
Until that's fixed, you can use gcloud to perform the change:
gcloud sql instances patch my-instance --project=my-project --tier=new-tier

Setting up backup strategy for backing up postgresql database on cloud foundry

We have set up a community PostgreSQL service on Cloud Foundry (IBM Bluemix). This is a free service, and no automated backup and recovery is supported out of the box.
Is there a way to set up a standby server or a regular backup in case there is any data corruption/failure?
IBM Compose and ElephantSQL can provide this service at a cost, but we are not ready for it yet.
PostgreSQL is an experimental service, and it does not have a dashboard or other advanced features (daily backup, for example) that you can find in the other services you mentioned. If you want a backup, you could write an ad-hoc script that saves/exports all tables as you want and run it every day (a minimal sketch follows below).
If you need a managed PostgreSQL, you can create a PostgreSQL by Compose service ($17.50/mo for the first GB and $12 per extra GB).
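A minimal sketch of such an ad-hoc backup script, assuming pg_dump is available where it runs and the connection details are provided via environment variables (the variable names and paths are placeholders):
#!/bin/sh
# Dump the whole database to a timestamped, compressed file
PGPASSWORD="$DB_PASSWORD" pg_dump \
    -h "$DB_HOST" -p "$DB_PORT" -U "$DB_USER" -d "$DB_NAME" \
    | gzip > "backup-$(date +%Y%m%d).sql.gz"
# Schedule it daily, e.g. from cron:
# 0 2 * * * /path/to/backup.sh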
We used PostgreSQL Studio and deployed it on IBM Bluemix. The database service was connected to the pgstudio interface (this restricts access to only the connected databases). We also had to make minor changes to pgstudio so that we could use pg_dump from the interface.
The result: We could manually dump the data. This solution works well as we could take regular dumps (though manually).
You are right in saying that you can't get backups in the free tier. Those features are available only in the Compose for PostgreSQL service, but that's a paid service.