We are switching from the Retrieve and Rank service to the new Discovery service on Bluemix. This requires quite a bit of code rewriting, and I'm getting a strange error message while transferring our answer units to Discovery:
Error: The service is busy processing 20 of your documents. Please wait for some of the documents to complete, then try again.
Is there some kind of rate limit on inserting documents into Discovery? If so, how do we increase or better, eliminate it?
Yes, there is a rate limit for inserting documents into Watson Discovery. Larger environment sizes have higher rate limits, but every size does have a limit.
Background
The big difference between the two services is that sending a document into Retrieve and Rank is synchronous, while sending a document into Discovery is asynchronous.
Retrieve and Rank will not respond to an API call to index a document until processing is complete. Discovery responds as soon as the document has been accepted into the pipeline for processing, and processing is likely to take several seconds to complete.
Discovery limits the number of documents in the processing pipeline per service instance and returns a 429 status when the limit is exceeded.
Retrieve and Rank has what amounts to the same kind of restriction: limiting the number of concurrent requests for a single service instance.
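Since the 429 is transient (it clears as queued documents finish processing), the usual client-side pattern is to retry with exponential backoff. A minimal sketch, where `send_document` is a hypothetical callable that performs one upload attempt and returns its HTTP status code:

```python
import random
import time

def with_backoff(send_document, max_retries=5, base_delay=1.0):
    """Retry while the Discovery pipeline signals 429 (busy),
    backing off exponentially with a little jitter."""
    for attempt in range(max_retries):
        status = send_document()
        if status != 429:
            return status
        # Pipeline is full: wait so in-flight documents can drain.
        delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
        time.sleep(delay)
    raise RuntimeError("Discovery pipeline still busy after retries")
```

The same pattern applies to any service that signals back-pressure with a 429 status.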
Related
I am computing a lot of stats in the backend. When one record changes, at least 7-10 queries run, depending on what type of record changed. All of those queries aggregate data, and some of them span the whole collection.
So instead of computing the stats inline and holding up the request's response, I was thinking I should send the response immediately and schedule a job that computes those stats in background workers.
What do you think of this solution? Is it a good approach or bad? I am looking for an optimal solution for this.
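The approach described above, returning the response immediately and recomputing the stats in a background worker, can be sketched with a simple in-process job queue. All names here (`recompute_stats`, `handle_update`) are hypothetical placeholders; a production system would more likely use a dedicated job queue (Celery, Sidekiq, RabbitMQ, etc.):

```python
import queue
import threading

stats_jobs = queue.Queue()
computed = []

def recompute_stats(record_id):
    # Placeholder for the 7-10 aggregation queries.
    computed.append(record_id)

def worker():
    while True:
        record_id = stats_jobs.get()
        if record_id is None:        # shutdown sentinel
            break
        recompute_stats(record_id)   # heavy work happens off the request path
        stats_jobs.task_done()

def handle_update(record_id):
    """Return to the client immediately; stats are recomputed asynchronously."""
    stats_jobs.put(record_id)
    return {"status": "accepted"}

threading.Thread(target=worker, daemon=True).start()
```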
Is throughput the maximum number of requests a server instance can handle, or the number of requests the server instance is currently handling?
Edit: By "currently handling" I mean the number of requests the server is receiving in a given time interval in the recent past. For example: the server is currently handling 400 requests every minute.
For example, I might have a server instance with a lot of hardware that is capable of high throughput, but which is receiving only a small amount of traffic. What does throughput measure in such a situation? Also, what about the inverse case, i.e. my instance can only handle x requests per minute but is receiving y >> x requests per minute?
If throughput is the maximum number of requests a server can handle, how is it measured? Do we run a load/stress test, where we keep increasing the requests per minute on the server until it can no longer handle them?
No. Throughput is an aggregate that depends on execution time: if you send 1,000 requests in the same second, your server may not handle them, but if you send 1,000 requests over an hour, your server will handle them normally.
Throughput is calculated as requests/unit of time. The time is calculated from the start of the first sample to the end of the last sample. This includes any intervals between samples, as it is supposed to represent the load on the server.
The formula is: Throughput = (number of requests) / (total time).
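Applying the formula to a handful of timed samples looks like this (a sketch; tools like JMeter do this bookkeeping for you):

```python
def throughput(samples):
    """samples: list of (start_time, end_time) pairs in seconds.
    Throughput = number of requests / (end of last sample - start of first),
    including any idle intervals between samples."""
    starts = [s for s, _ in samples]
    ends = [e for _, e in samples]
    total_time = max(ends) - min(starts)
    return len(samples) / total_time

# 4 requests spread over 2 seconds -> 2.0 requests/second
print(throughput([(0.0, 0.2), (0.5, 0.7), (1.0, 1.4), (1.8, 2.0)]))
```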
To find the number of concurrent users your server can handle, increase the number of JMeter threads until the server reaches its maximum.
Throughput is the number of samplers that JMeter executes within the duration of your test. If you want to see the actual number of requests being sent, consider using e.g. the Server Hits Per Second listener (it can be installed using the JMeter Plugins Manager).
If you see that your server's resource consumption doesn't increase as you increase the load in JMeter, the likely reasons are:
Your application middleware configuration is not suitable for high load (e.g. there is a limit on concurrent sessions, and requests queue up instead of being processed). Check out Web Performance Optimization: Top 3 Server and Client-Side Performance Tips for an overview of what to look at.
Your application code doesn't use the underlying OS resources efficiently; consider using a profiler tool to see what's going on under the hood.
JMeter may fail to send requests fast enough; make sure to follow JMeter Best Practices, and if JMeter's machine is overloaded, consider going for Distributed Testing.
The Cloudant Standard plan is described as "100 reads / sec, 50 writes / sec, 5 global queries / sec".
Are these I/O operations counted per end-to-end request, or based on the query execution plan?
Let me give some examples.
Q1. Let's say I use a bulk operation to create 3 new documents in Cloudant (Bluemix Standard plan).
Is that 1 write operation or 3 write operations?
Q2. A query that aggregates (joins) 1,000 indexed documents with "name, age range, join time" and returns them as one result.
Is that 1 read, or 1000 + 1 reads?
Q3. On the Standard plan (limit 100 reads / sec), suppose 100 users execute the query from Q2 at the same time.
How is the I/O counted? 1 * 100 reads? (1000 + 1) * 100 reads?
Do some users' queries fail because of the I/O limit?
I can't find this properly documented in the Cloudant pricing information.
Can anyone please point me in the right direction?
I want to know exactly how the Standard plan metering is calculated.
It would be even better if you could include a worked calculation example in your answer!
Also answered here: https://developer.ibm.com/answers/questions/525190/how-do-calculate-io-operations-in-cloudant-account/
Bulk operations currently count as 1 W (write), regardless of the number of documents they contain.
A query is a request to a URL containing one of _design, _find or _search, again unrelated to the number of documents actually involved. Note that some of these API endpoints (search) are paged, so it would be 1 query per requested page of results.
I assume that by "100 users" you mean 100 concurrent connections using the same credentials, as Cloudant's rate limiting is applied per account. If so, the sum total of requests is counted towards the limit. When that bucket is full, any further requests are cut off and fail with a 429: Too Many Requests HTTP status code.
As an example, let's say you have a Standard account where you've set the rate limit to allow 100 queries per second. You have 100 concurrent connections hitting _find repeatedly, each query returning 1000 documents. Cloudant will allow 100 queries per second, so on average each of your connections will get 1 query per second fulfilled, and any attempts to push harder than that will result in 429 HTTP errors. With 10 concurrent connections, on average each will get 10 qps, etc.
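The arithmetic in that example can be made explicit (a trivial sketch, not part of any Cloudant SDK; note the per-query document count does not enter into it, since metering is per request):

```python
def per_connection_qps(rate_limit_qps, connections):
    """Account-level rate limiting means concurrent connections share one
    bucket: each gets roughly rate_limit / connections queries per second."""
    return rate_limit_qps / connections

print(per_connection_qps(100, 100))  # 100 connections -> 1.0 qps each
print(per_connection_qps(100, 10))   # 10 connections -> 10.0 qps each
```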
Cloudant rate-limits at the HTTP level. No splitting of bulk operations into their constituent parts takes place, at least not yet.
Documentation for how this all hangs together can be found here: https://cloud.ibm.com/docs/services/Cloudant?topic=cloudant-pricing#pricing
The Cloudant offering in the IBM Cloud Catalog has a link to the documentation. In the docs is a description of the plans with additional examples. The docs also have sections that explain how read and write operations are calculated.
The HTTP status code 429 is returned by Cloudant to indicate too many requests. The documentation discusses this, and there are code samples showing how to handle it.
As per the Google Firebase documentation, Firestore supports up to 1,000,000 simultaneous connections and 10,000 writes per second per database. We ran a scalability test to check whether Firestore is viable for our solution and encountered errors like "Connect to firebase.googleapis.com:443: Connection timed out" during write operations for 40K samples, and "firestore.googleapis.com:443: failed to respond" during GET operations while the 40K samples were being written. We would like to understand Firestore's limits and its scalability.
We are running a JMeter script to write data to Firebase (Firestore Blaze plan, purchased for testing) on multiple VMs and PCs connected to a wired network to check scalability. The JMeter script writes data to Firebase using the REST API PATCH method, and each script on a PC/VM writes 5K records over a period of 5 minutes. There are 8 PCs/VMs in total, writing 40K records to Firebase. During this, we also GET 500 records in 5 minutes, 2 times a day. During this test we hit the Firebase failures.
The errors are "Connect to firebase.googleapis.com:443: Connection timed out" during write operations for the 40K samples, "firestore.googleapis.com:443: failed to respond", and connection resets.
Are you taking into account the other limits that apply to your specific operations?
According to https://firebase.google.com/docs/firestore/quotas
For example, for the write operation:
Maximum writes per second per database = 10,000 (up to 10 MiB per second)
Maximum write rate to a document = 1 per second
Maximum write rate to a collection in which documents contain sequential values in an indexed field = 500 per second
Maximum number of writes that can be passed to a Commit operation or performed in a transaction = 500
During GET operations, the following limits apply. Maximum number of exists(), get(), and getAfter() calls per request:
10 for single-document requests and query requests.
20 for multi-document reads, transactions, and batched writes. The previous limit of 10 also applies to each operation.
For example, imagine you create a batched write request with 3 write operations and that your security rules use 2 document access calls to validate each write. In this case, each write uses 2 of its 10 access calls and the batched write request uses 6 of its 20 access calls.
Exceeding either limit results in a permission denied error.
Some document access calls may be cached, and cached calls do not count towards the limits.
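The counting rule from the batched-write example above can be sketched as a quick check (a hypothetical helper, not a Firestore API):

```python
def batched_write_access_calls(num_writes, calls_per_write):
    """Access calls consumed by a batched write under the rules limits:
    each write may use up to 10 calls, and the whole batch up to 20.
    Returns (total calls used, whether both limits are respected)."""
    total = num_writes * calls_per_write
    within_limits = calls_per_write <= 10 and total <= 20
    return total, within_limits

# The documentation example: 3 writes, 2 access calls each -> 6 of 20 used
print(batched_write_access_calls(3, 2))   # (6, True)
# 8 writes with 3 calls each would exceed the 20-call batch limit
print(batched_write_access_calls(8, 3))   # (24, False)
```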
I think one of these limits could be causing these connections to be aborted.
I have an issue with designing a database in MongoDB.
In general, the system will continuously gather user insight data (e.g. likes, retweets, views) from different social websites' APIs (Twitter API, Instagram API, FB API), each channel at a different rate, while also saving each insight every hour as historical data. The current real-time insights should be viewable by users on the website.
Should I save the insight data in a cache and the historical insight data in documents?
What is the expected write rate and query rate?
What rate will the dataset grow at? These are the key questions that determine the size and topology of your MongoDB cluster. If your write rate does not exceed the write capacity of a single node, you should be able to host your data on a single replica set. However, this assumes that your data set is not large (>1TB). At that size, recovery from a single node failure can be time-consuming (it will not cause an outage, but the longer a single node is down, the higher the risk of a second node failing).
In both cases (write capacity exceeds a single node, or the dataset is larger than 1TB), the rough guidance is that this is the time to consider a sharded cluster. Design of a sharded cluster is beyond the scope of a single Stack Overflow answer.
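As a rough sketch of the split the question proposes (real-time values in a cache, one historical document per channel per hour), the snapshot document might look like this. The schema and names are hypothetical assumptions, not MongoDB APIs; with pymongo you would upsert such a document into a history collection keyed on (channel, hour):

```python
from datetime import datetime, timezone

def hourly_snapshot(channel, metrics, now=None):
    """Build one historical insight document per channel per hour,
    so queries can range efficiently over the `hour` field."""
    now = now or datetime.now(timezone.utc)
    return {
        "channel": channel,   # e.g. "twitter"
        "hour": now.replace(minute=0, second=0, microsecond=0),
        "metrics": metrics,   # e.g. {"likes": 120, "views": 900}
    }

# With pymongo (hypothetical collection name), an hourly job might do:
#   db.insights_history.update_one(
#       {"channel": doc["channel"], "hour": doc["hour"]},
#       {"$set": doc}, upsert=True)
# while the real-time values continue to live in the cache.
```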