I have the following questions with regard to GCS:
Is there a limit on the number of blobs a bucket can contain?
I am using the Java client to mass-create blobs in a multi-threaded application. Is there a limit on the number of blobs created per time unit?
Does the Google Cloud Storage API flag requests as a potential DDoS attack when a particular threshold is reached?
The Quotas and Limits page has some documentation about this. Specifically:
No, there is no limit on the number of objects in a bucket.
From the docs page:
There is no limit to writes across multiple objects. Buckets initially support roughly 1000 writes per second and then scale as needed.
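Since the quota ramps up rather than being a hard cap, a bursty multi-threaded writer is generally expected to back off and retry when GCS pushes back with a rate-limit error. Here is a minimal sketch with the Java client mentioned in the question; the bucket and object names are hypothetical, and the retry policy is just one reasonable choice:

```java
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageException;
import com.google.cloud.storage.StorageOptions;
import java.nio.charset.StandardCharsets;

public class BlobUploader {

    private final Storage storage = StorageOptions.getDefaultInstance().getService();

    /** Uploads one blob, retrying with exponential backoff when GCS reports rate limiting. */
    void uploadWithBackoff(String bucket, String objectName, byte[] data) throws InterruptedException {
        long delayMs = 200; // initial backoff
        for (int attempt = 0; attempt < 6; attempt++) {
            try {
                BlobInfo info = BlobInfo.newBuilder(BlobId.of(bucket, objectName)).build();
                storage.create(info, data); // a single object write
                return;
            } catch (StorageException e) {
                // 429 = rate limited, 503 = backend pressure; anything else is a real error.
                if (e.getCode() != 429 && e.getCode() != 503) {
                    throw e;
                }
                Thread.sleep(delayMs); // back off while the bucket's write capacity scales up
                delayMs *= 2;
            }
        }
        throw new IllegalStateException("Gave up after repeated rate-limit responses for " + objectName);
    }

    public static void main(String[] args) throws Exception {
        new BlobUploader().uploadWithBackoff(
                "my-bucket",                                           // hypothetical bucket
                "user-123/insight.json",                               // hypothetical object name
                "{\"likes\": 42}".getBytes(StandardCharsets.UTF_8));
    }
}
```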
I have an issue with designing a database in MongoDB.
In general, the system will continuously gather user insight data (e.g. likes, retweets, views) from different social websites' APIs (Twitter API, Instagram API, Facebook API), with a different rate for each channel, while also saving each insight every hour as historical data. The current real-time insights should be viewable by users on the website.
Should I save the current insight data in a cache and the historical insight data in documents?
What are the expected write and query rates?
At what rate will the dataset grow? These are the key questions that determine the size and topology of your MongoDB cluster. If your write rate does not exceed the write capacity of a single node, you should be able to host your data on a single replica set. However, this assumes your data set is not large (> 1 TB); at that size, recovery from a single-node failure can be time-consuming (it will not cause an outage, but the longer a single node is down, the higher the risk of a second node failing).
In both cases (write capacity exceeds a single node or the dataset is larger than 1 TB), the rough guidance is that this is the time to consider a sharded cluster. The design of a sharded cluster is beyond the scope of a single StackOverflow answer.
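To make the expected write rate concrete: with one hourly snapshot per user per channel, the write load is roughly one small insert per user per channel per hour, which you can multiply out against your user count and compare with a single node's write capacity. A minimal sketch with the MongoDB Java driver, where the connection string, database, collection, and field names are all hypothetical:

```java
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;
import java.util.Date;

public class InsightSnapshotWriter {
    public static void main(String[] args) {
        // Hypothetical connection string; in production this points at the replica set.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> history = client
                    .getDatabase("insights")             // hypothetical database name
                    .getCollection("hourly_snapshots");  // hypothetical collection name

            // One small document per user, per channel, per hour.
            history.insertOne(new Document("userId", "user-123")
                    .append("channel", "twitter")
                    .append("likes", 42)
                    .append("retweets", 7)
                    .append("views", 1900)
                    .append("capturedAt", new Date()));
        }
    }
}
```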
I am using a Node.js application running on Google Compute Engine to create a GCS bucket for each user. Bucket creation is a one-time activity for each user, but when I try to run the program to create unique buckets for 20 users in parallel, I get the error below.
"Error code":429 and "Error message":"The project exceeded the rate limit for creating and deleting buckets."
Is there any way I can increase this limit?
No.
There is a per-project rate limit to bucket creation and deletion of approximately 1 operation every 2 seconds, so plan on fewer buckets and more objects in most cases. If you're designing a system that adds many users per second, then design for many users in one bucket (with appropriate ACLs) so that the bucket creation rate limit doesn't become a bottleneck.
See https://cloud.google.com/storage/docs/quotas-limits.
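The question uses Node.js, but to keep one language across these examples, here is a hedged sketch of the shared-bucket approach with the Java client: one pre-created bucket, a per-user object prefix, and a per-object ACL. The bucket name, user id, and email are hypothetical.

```java
import com.google.cloud.storage.Acl;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import java.nio.charset.StandardCharsets;
import java.util.Collections;

public class PerUserObjects {
    public static void main(String[] args) {
        Storage storage = StorageOptions.getDefaultInstance().getService();

        String sharedBucket = "all-users-data";        // hypothetical bucket, created once up front
        String userId = "user-123";                    // hypothetical user id
        String objectName = userId + "/profile.json";  // per-user "folder" via an object name prefix

        // Grant this user read access to their own object instead of giving them a whole bucket.
        Acl readerAcl = Acl.of(new Acl.User(userId + "@example.com"), Acl.Role.READER);

        BlobInfo info = BlobInfo.newBuilder(BlobId.of(sharedBucket, objectName))
                .setAcl(Collections.singletonList(readerAcl))
                .build();
        storage.create(info, "{}".getBytes(StandardCharsets.UTF_8));

        // Object writes are not subject to the ~1 bucket operation per 2 seconds limit,
        // so onboarding 20 users in parallel no longer trips the 429.
    }
}
```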
This page, http://docs.couchbase.com/admin/admin/Misc/limits.html, mentions 'Max Buckets per Cluster - Default is 10 (can be adjusted by users)'. It is not clear whether that is a combined limit for both Couchbase and memcached buckets or whether it applies only to Couchbase buckets.
I am interested in knowing whether there are any limits on the number of memcached buckets.
We don't formally test the limits for memcached buckets, but the overhead of each memcached bucket is relatively low in the data layer (the management and cluster layer consumes more resources and will most likely be your limiting factor). I ran a quick test on my laptop this morning and could easily create 50+ buckets; the data storage layer consumed < 1% CPU in an idle state (while the cluster and management layer consumed 36%). The overhead during normal client operation is relatively small and occurs when the client connects and needs to look up the credentials and the internal bucket in a pool of n items instead of at most 10 items.
People typically deploy Couchbase buckets over memcached buckets due to replication, indexes, persistence, N1QL, etc., and all of that functionality consumes resources.
It is also very hard to define such limits because the hardware of course matters just as much :-)
Also, I don't know how the cluster UI behaves with a high number of buckets, given that all our formal testing is within the 10-bucket limit.
I have multiple servers/workers going through a task queue making API requests (Django with Memcached, and Celery for the queue). The API requests are limited to 10 requests per second. How can I rate-limit so that the total number of requests across all servers doesn't exceed the limit?
I've looked through some of the related rate-limit questions; I'm guessing they focus on a more linear, non-concurrent scenario. What sort of approach should I take?
Have you looked at RateLimiter from the Guava project? They introduced this class in one of the latest releases, and it seems to partially satisfy your needs.
Admittedly it won't calculate the rate limit across multiple nodes in a distributed environment, but what you could do is configure the rate limit dynamically based on the number of nodes that are running (e.g. for 5 nodes you'd have a rate limit of 2 API requests per second).
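For illustration, a minimal sketch of that per-node split with Guava's RateLimiter; the 10 requests/second global budget comes from the question, while the node count is assumed to arrive from configuration or service discovery:

```java
import com.google.common.util.concurrent.RateLimiter;

public class ThrottledApiWorker {

    private static final double GLOBAL_LIMIT_PER_SECOND = 10.0; // shared budget from the question
    private static final int NODE_COUNT = 5;                    // assumed to come from configuration

    // Each node gets an even share of the global budget: 2 permits per second here.
    private final RateLimiter limiter = RateLimiter.create(GLOBAL_LIMIT_PER_SECOND / NODE_COUNT);

    void callApi(String payload) {
        limiter.acquire(); // blocks until this node's share of the budget allows another request
        // ... perform the actual HTTP request to the third-party API here ...
        System.out.println("calling API with " + payload);
    }

    public static void main(String[] args) {
        ThrottledApiWorker worker = new ThrottledApiWorker();
        for (int i = 0; i < 20; i++) {
            worker.callApi("task-" + i); // settles at roughly 2 requests/second on this node
        }
    }
}
```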
I have been working on an open-source project to solve this exact problem, called Limitd. Although I don't have clients for technologies other than Node yet, the protocol and the idea are simple.
Your feedback is very welcome.
I solved that problem, though unfortunately not for your technology: bandwidth-throttle/token-bucket.
If you want to implement it yourself, here's the idea:
It's a token bucket algorithm which converts the tokens it contains into a timestamp recording when it was last completely empty. Every consumption updates this timestamp (under a lock) so that every process shares the same state.
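As a sketch of that state encoding (in Java rather than the language of bandwidth-throttle/token-bucket, and with the shared, locked storage reduced to a synchronized in-process field), the bucket's only state is the timestamp at which it was last empty:

```java
/**
 * Token bucket whose only persistent state is the (microsecond) timestamp at which
 * the bucket was last completely empty. The available tokens are derived from how
 * much time has passed since then, and consuming tokens pushes the timestamp forward.
 */
public class TokenBucket {

    private final long capacity;          // maximum tokens the bucket can hold (burst size)
    private final double tokensPerMicro;  // refill rate
    private long emptyAtMicros;           // when the bucket was last completely empty

    public TokenBucket(long capacity, double tokensPerSecond) {
        this.capacity = capacity;
        this.tokensPerMicro = tokensPerSecond / 1_000_000.0;
        // Start with a full bucket: pretend it was last empty exactly `capacity` tokens ago.
        this.emptyAtMicros = nowMicros() - (long) (capacity / tokensPerMicro);
    }

    /** Tries to consume {@code tokens}; returns false if not enough have accumulated yet. */
    public synchronized boolean tryConsume(long tokens) {
        long now = nowMicros();

        // Never let more than `capacity` tokens accumulate: clamp the empty-timestamp.
        long earliestEmptyAt = now - (long) (capacity / tokensPerMicro);
        if (emptyAtMicros < earliestEmptyAt) {
            emptyAtMicros = earliestEmptyAt;
        }

        double available = (now - emptyAtMicros) * tokensPerMicro;
        if (available < tokens) {
            return false;
        }

        // Consuming tokens moves the "last empty" timestamp closer to now.
        emptyAtMicros += (long) (tokens / tokensPerMicro);
        return true;
    }

    private static long nowMicros() {
        return System.nanoTime() / 1_000;
    }

    public static void main(String[] args) {
        TokenBucket bucket = new TokenBucket(10, 10.0); // 10 requests/second with a burst of 10
        System.out.println(bucket.tryConsume(1));       // true: consumes one of the burst tokens
    }
}
```

In the real library, that timestamp lives in shared storage and is updated under a lock, which is what lets every worker process observe the same state.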
I have written a script to automate provisioning a series of Azure VMs configured in their own virtual network and AD domain for testing.
Midway through testing the script, the following error occurs:
SubscriptionRequestsThrottled: Number of read requests for subscription 'xxx-...' exceeded the limit of '15000' for time interval '01:00:00'
This error appears intermittently at different steps in the script, so I'm assuming it is indeed a throttling issue. My question is whether there is a way to configure the limit and time-interval values so that this script can execute swiftly without these errors occurring.
The total request rate limit for a storage account is 20,000 IOPS. If a virtual machine utilizes the maximum IOPS per disk, then to avoid possible throttling, ensure that the total IOPS across all of the virtual machines' VHDs does not exceed the storage account limit (20,000 IOPS).
You can roughly calculate the number of highly utilized disks supported by a single storage account based on the transaction limit. For example, for a Basic Tier VM, the maximum number of highly utilized disks is about 66 (20,000/300 IOPS per disk), and for a Standard Tier VM, it is about 40 (20,000/500 IOPS per disk). However, note that the storage account can support a larger number of disks if they are not all highly utilized at the same time.
See https://azure.microsoft.com/en-us/documentation/articles/azure-subscription-service-limits/ for more information.