I understand that a local SSD disk does not incur any network charges, since it is physically attached to the Compute Engine instance. Assuming the disk is a standard persistent disk, are ingress and egress between the Compute Engine instance and the disk billable?
No, there are no network charges associated with Persistent Disk. You are charged for the amount of provisioned space per disk (per GB, per month).
There are additional charges related to snapshots (for total snapshot size) and for network egress if you choose to restore a snapshot across regions.
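As a rough budgeting sketch, the per-GB prices below are assumptions based on published list prices and vary by region; check the current pricing page before relying on them:

```python
# Rough persistent-disk cost estimate. Prices are assumptions and vary by
# region; check the GCP pricing page for current values.
STANDARD_PD_PER_GB_MONTH = 0.04   # assumed $/GB-month for standard PD
SNAPSHOT_PER_GB_MONTH = 0.026     # assumed $/GB-month for snapshot storage

def monthly_disk_cost(provisioned_gb: float, snapshot_gb: float = 0.0) -> float:
    """Monthly cost: provisioned capacity plus snapshot storage.
    Network traffic between the VM and its persistent disk is not billed."""
    return (provisioned_gb * STANDARD_PD_PER_GB_MONTH
            + snapshot_gb * SNAPSHOT_PER_GB_MONTH)

print(monthly_disk_cost(provisioned_gb=500, snapshot_gb=200))  # ~$25.20
```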
It depends, as stated on the Google Cloud Platform website.
Egress within the same zone is free of charge, but any egress out of a zone is billed, with the rate depending on the source and destination. This includes any data movement from persistent disks or buckets.
Even if you have a subnet spanning two zones in the same region, you will still be charged for the egress.
I have a t4g.small RDS instance running Postgres. I seem to be constantly battling a low EBS Byte Balance (percent), while Burst Balance and EBS IO Balance are always near 100%. The documentation seems to say this is tied to throughput credits, but I don't understand why EBS Byte Balance is the only thing affected.
When EBS Byte Balance drops to 0, my website slows to a crawl.
I'm running a load-balanced Django/Python web app with 5,000-7,000 page views a day. It has endpoints that receive uploads from users in the 3-4 MB range; I get about 20-30 of those uploads a day.
(Screenshots: monitoring metrics, storage settings, full metrics.)
I've tried increasing the allocated storage twice, but it had no effect. The only post I've found with a similar problem says they increased to 1000 GiB of storage but it didn't help. Is that what I should try?
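For anyone tracking the same metric, here is a minimal boto3 sketch for pulling EBSByteBalance% from CloudWatch; the region and instance identifier are placeholders:

```python
# Minimal boto3 sketch: fetch the EBSByteBalance% metric for an RDS instance.
# "my-db-instance" and the region are placeholders; requires AWS credentials
# with CloudWatch read access.
from datetime import datetime, timedelta, timezone
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
now = datetime.now(timezone.utc)

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/RDS",
    MetricName="EBSByteBalance%",
    Dimensions=[{"Name": "DBInstanceIdentifier", "Value": "my-db-instance"}],
    StartTime=now - timedelta(hours=24),
    EndTime=now,
    Period=300,                 # 5-minute datapoints
    Statistics=["Minimum"],     # the dips are what matter here
)
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Minimum"])
```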
I need a breakdown of my usage inside a single project, categorized by Pods, Services, or Deployments, but the billing section in the console doesn't seem to provide such granular information. Is it possible to get this data somehow? I want to know the network + compute cost per Deployment or per Pod.
Or, failing that, is it at least possible to get it at the cluster level? Is this breakdown available in BigQuery?
GKE recently released a new feature that collects usage metrics inside a cluster; these metrics can be combined with the exported billing data to separate costs per project/environment, making it possible to break costs down per namespace, deployment, label, and other criteria.
https://cloud.google.com/blog/products/containers-kubernetes/gke-usage-metering-whose-line-item-is-it-anyway
It's not possible at the moment to break down billing at the Pod, Service, or Deployment level. Kubernetes Engine uses Compute Engine instances as the nodes in the cluster, and you are billed for each of those instances according to Compute Engine's pricing until the nodes are deleted. Compute Engine resources are billed on a per-second basis with a one-minute minimum usage cost.
You can export your billing data to BigQuery, which enables your daily usage and cost estimates to be exported automatically throughout the day to a BigQuery dataset you specify. You can then access your billing data from BigQuery and run queries on the exported billing data to do some of the breakdown.
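As an illustration, here is a minimal sketch of querying the export with the BigQuery Python client; the table name is a placeholder for your own export table, and the columns used (project, service, labels, cost, usage_start_time) follow the standard billing export schema:

```python
# Sketch: break down exported billing data by project, service, and label.
# The table name is a placeholder; point it at your own billing export table.
# Requires google-cloud-bigquery and application default credentials.
from google.cloud import bigquery

client = bigquery.Client()

query = """
SELECT
  project.id           AS project_id,
  service.description  AS service,
  label.key            AS label_key,
  label.value          AS label_value,
  ROUND(SUM(cost), 2)  AS total_cost
FROM `my-project.billing.gcp_billing_export_v1_XXXXXX`
LEFT JOIN UNNEST(labels) AS label
WHERE usage_start_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
GROUP BY project_id, service, label_key, label_value
ORDER BY total_cost DESC
"""

for row in client.query(query).result():
    print(row.project_id, row.service, row.label_key, row.label_value, row.total_cost)
```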
You can also view your usage reports and estimate your Kubernetes charges using the GCP Pricing Calculator. If you want to go further, you can file a feature request in the Public Issue Tracker.
You can get this visibility with your GKE Usage Metering dataset and your BigQuery cost exports.
Cost per namespace, per deployment, and per node can be obtained by writing queries that combine these tables. If you have labels set, you can drill down by label too. This shows you the spend on CPU, RAM, and egress.
Check out economize.cloud - it integrates with your datasets and lets you slice and dice the views. For example, cost per customer or cost per service can be derived from such granular cost data.
https://www.economize.cloud/blog/gke-usage-cost-monitoring-kubernetes-clusters/
A newer GCP offering, GKE Cost Allocation, lets you easily and natively view and manage the cost of a GKE cluster by cluster, namespace, Pod labels, and more, right from the Billing page, or export detailed usage cost data to BigQuery:
https://cloud.google.com/kubernetes-engine/docs/how-to/cost-allocations
GKE Cost Allocation is more accurate and robust compared to GKE Usage Metering.
Kubecost provides Kubernetes cost allocation by any concept, e.g. pod, service, controller, etc. It's open source and is available for GKE, AWS/EKS, and other major providers. https://github.com/kubecost/cost-model
The Google Cloud Storage documentation (https://cloud.google.com/storage/docs/storage-classes) says there is a "minimum storage duration" for Nearline (30 days) and Coldline (90 days) storage, but there is no such description for Regional and Multi-Regional storage.
Does that mean there is absolutely no minimum storage duration? Is the billing unit a microsecond, a second, a minute, an hour, or a day?
For example, (unrealistically) suppose that I created a Google Cloud Storage bucket, copied 10 petabytes of data to the bucket, and removed each piece of data 1 minute later (or moved it to another bucket I don't have to pay for). Would the cost of Regional storage then be
$0.02 (per GB per month) × 10,000,000 (GB) × 1/30/24/60 (months) ≈ $4.63?
If the "unit" of GCS bucket usage time is an hour rather than a minute, then the cost will be $278. If it is 12 hours, the cost will be $3333, so there are huge differences in such an extreme case.
I want to create a "temporary bucket" that holds petabyte-scale data in a short period of time, and just wanted to know what the budget should be. The previous question (Minimum storage duration Google Cloud Storage buckets) did not help in answering to my question.
There is no minimum storage duration for regional or multi-regional buckets, but keep in mind that you will still have to pay operations costs to upload. At the time of this writing, that would be $0.05 per 10,000 files uploaded.
I believe the billing granularity is seconds, so your initial calculation (about $4.63) is correct. But note that, depending on the average size of your files, uploading a petabyte is likely to be more expensive than that; if you have a petabyte of files that are 100 MB in size, the operations cost for the uploads would be ~$50.
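To make the arithmetic concrete, here is a small sketch using the prices quoted above as assumptions ($0.02 per GB-month for Regional storage, $0.05 per 10,000 uploads); check the current price list before budgeting:

```python
# Worked example for a short-lived "temporary bucket", using the prices quoted
# above as assumptions: $0.02/GB-month Regional storage, $0.05 per 10,000 uploads.
STORAGE_PER_GB_MONTH = 0.02
UPLOAD_COST_PER_10K = 0.05
SECONDS_PER_MONTH = 30 * 24 * 3600   # storage is prorated by the second

def temp_bucket_cost(total_gb: float, avg_file_mb: float, held_seconds: float) -> dict:
    storage = total_gb * STORAGE_PER_GB_MONTH * held_seconds / SECONDS_PER_MONTH
    n_files = total_gb * 1000 / avg_file_mb
    operations = n_files / 10_000 * UPLOAD_COST_PER_10K
    return {"storage": round(storage, 2), "operations": round(operations, 2)}

# 10 PB of 100 MB objects held for one minute:
print(temp_bucket_cost(total_gb=10_000_000, avg_file_mb=100, held_seconds=60))
# -> storage ≈ $4.63, operations ≈ $500 (the ~$50 figure above is for 1 PB)
```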
My app runs a daily job that collects data and feeds it to a MongoDB database. This data is processed and then exposed via a REST API.
I need to set up a MongoDB cluster in AWS. The requirements:
Data will grow by about the same amount each day (about 50M records), so write throughput doesn't need to scale. Writes are triggered by a cron job at a certain hour. Objects are immutable (they won't grow).
Read throughput will depend on the number of users / traffic, so it should be scalable. Traffic won't be heavy in the beginning.
Data is mostly simple JSON; I need a couple of indexes on some of the fields for fast querying / filtering.
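For context, the indexing and read pattern would look roughly like the sketch below; the field names are hypothetical placeholders for whatever fields end up being filtered on:

```python
# Illustrative only: the field names ("source", "created_at", "category") are
# hypothetical placeholders for the fields being filtered on. Requires pymongo.
from pymongo import MongoClient, ASCENDING, DESCENDING

client = MongoClient("mongodb://localhost:27017")  # placeholder URI
records = client["mydb"]["records"]

# Secondary indexes for the fast-filtering use case described above.
records.create_index([("created_at", DESCENDING)])
records.create_index([("source", ASCENDING), ("category", ASCENDING)])

# Typical read path behind the REST API: filter + sort on indexed fields.
cursor = records.find({"source": "feed-a", "category": "news"}).sort("created_at", -1).limit(100)
```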
What kind of architecture should I use in terms of replica sets, shards, etc.?
What kind of storage volumes should I use for this architecture (EBS, NVMe)?
Is it preferable to use more instances or RAID setups?
I'm looking to spend roughly $500 a month.
Thanks in advance
To set up a MongoDB cluster in AWS, I would recommend referring to the latest AWS Quick Start for MongoDB, which covers the architectural aspects and also provides CloudFormation templates.
For the storage volumes, use EC2 instance types that support EBS rather than relying on NVMe instance storage, since NVMe instance store is ephemeral: if you stop and start the EC2 instance, the data on it is lost.
For storage volume throughput, start with General Purpose SSD volumes at a reasonable size, and only consider Provisioned IOPS if you run into their limitations.
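For illustration, here is a sketch of provisioning such a data volume with boto3, assuming a gp3 General Purpose volume; the size, IOPS, throughput, and zone values are placeholders:

```python
# Sketch: provision a gp3 EBS volume for the MongoDB data directory with boto3.
# Size/IOPS/throughput/zone are placeholder values; move to io1/io2 (Provisioned
# IOPS) only if the workload outgrows gp3 limits.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=500,                # GiB, placeholder
    VolumeType="gp3",
    Iops=3000,               # gp3 baseline; can be raised independently of size
    Throughput=250,          # MiB/s, also independent of size on gp3
    Encrypted=True,
    TagSpecifications=[{
        "ResourceType": "volume",
        "Tags": [{"Key": "Name", "Value": "mongodb-data"}],
    }],
)
print(volume["VolumeId"])
```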
For high availability and fault tolerance, the CloudFormation template will create multiple instances (nodes) in the MongoDB cluster.
According to this MongoDB tutorial which explains how to manually deploy MongoDB on EC2, one of the steps states that you should have:
"Individual PIOPS EBS volumes for data (1000 IOPS), journal (250 IOPS), and log (100 IOPS)."
Why do I need individual EBS volumes for journal, log, and data?
Can I just combine these into one EBS volume?
The MongoDB team may have found that the IOPS needs are highest for data, lowest for logs, and somewhere in the middle for the journal. Although I am less familiar with MongoDB, I suspect some of the reasons they suggest separate EBS volumes include:
cost saving: provisioning the right amount of IOPS for each need saves money. If it were all on a single volume, you'd have to provision the maximum 1000 IOPS for everything and end up paying more
snapshots: you could snapshot the data volume on a different (more frequent?) schedule
contention: data, journal, and log I/O will not contend with each other if they are on different volumes
scaling: you could scale the data volume separately from the journal and log volumes
risk reduction: if the data volume has trouble, you could restore from backup and replay the journal (I assume you can), and analyze the logs too
The reason for separating your deployment storage across 3 volumes is that database journal files and log files are sequential in nature, and as such, have different access patterns compared to data files. Separating data files from journal and/or log files, particularly with a write intensive workload, will provide an increase in performance by reducing I/O contention. Depending on your workload, and if you are experiencing high I/O wait times, you may be able to benefit from separate disks for your data files, journal, and log files.
The answer was taken from https://www.mongodb.com/blog/post/maximizing-mongodb-performance-on-aws