I am trying to collect information from SoundCloud, which involves collecting lists beyond the limit=200 and offset=8000 constraints. For example, I would like to collect all the followers of a user, and some users have hundreds of thousands of followers... Is there some way to get around the limit of 8000?
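The usual workaround, as far as I know, is SoundCloud's linked_partitioning parameter, which pages through results by following next_href cursors instead of raw offsets, so the offset cap never applies. A minimal Swift sketch (the client_id value and user id are placeholders, and newer API versions may require OAuth instead of a client_id):

import Foundation

// Follow next_href links until the API stops returning one.
func fetchAllFollowers(userID: Int, clientID: String) async throws -> [[String: Any]] {
    var followers: [[String: Any]] = []
    var next: URL? = URL(string:
        "https://api.soundcloud.com/users/\(userID)/followers" +
        "?client_id=\(clientID)&linked_partitioning=1&limit=200")

    while let url = next {
        let (data, _) = try await URLSession.shared.data(from: url)
        let page = try JSONSerialization.jsonObject(with: data) as? [String: Any]
        followers += page?["collection"] as? [[String: Any]] ?? []
        next = (page?["next_href"] as? String).flatMap(URL.init(string:))
    }
    return followers
}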
Related
I have a collection with around 30 thousand documents. My security rules are configured so that only authenticated users can read documents from this collection. I cannot add more constraints on read operations for this specific collection.
When using my app, around 50 documents are returned on average, depending on the query.
Is there any way to prevent a malicious user from downloading the entire collection in Firestore?
To limit the number of documents a user can read at once, you can include a limit in your security rules as shown in the documentation on securely querying data:
allow list: if request.query.limit <= 50;
Keep in mind that rules are not filters, so the application code will also need to include this (or a lower) limit in its queries.
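On the client that means passing the limit explicitly. A minimal Swift sketch with the Firebase iOS SDK (the collection name "documents" is a placeholder):

import FirebaseFirestore

let db = Firestore.firestore()
db.collection("documents")
    .limit(to: 50) // must match (or stay under) the limit enforced by the rules
    .getDocuments { snapshot, error in
        if let error = error {
            // an unlimited query against this collection would be rejected
            print("query failed: \(error)")
            return
        }
        print("fetched \(snapshot?.documents.count ?? 0) documents")
    }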
I have a question about Firebase database pricing. I have about 400,000 rows in the leaderboard of my database, but in my app I only want to load the last 500 rows. My question: will I get charged for the 500 rows loaded when I run the query, or for all 400,000 rows?
The Realtime Database charges $5 per GB stored and $1 per GB downloaded. I did the calculation against Firestore and found that the Realtime Database would be way cheaper if I get charged for the 500 rows and not the 400,000 rows.
I searched the documentation and found nothing about how queries are billed:
https://firebase.google.com/pricing
https://firebase.google.com/docs/database/usage/billing
Can someone tell me whether I get charged for just the 500 rows or for all the data, and whether there is a way to only get charged for the 500 rows, maybe with security rules?
Here is my query code:
let queryRef = ref.child("Leaderboard").queryOrdered(byChild: "totalStars").queryLimited(toLast: 500)
Here is how the database looks. (It will have about 500,000 children like these and be loaded 200,000 times per day, but I just want to be priced on the top 500 that I load, not the whole 500,000, each time a user loads the leaderboard. Is that possible?)
You will only be charged for the number of Firestore documents corresponding to the result of your query (not the number of docs in the collection).
So in your case a maximum of 500 reads, since you would limit the query to 500 documents.
On the other hand, note that Realtime Database queries are not shallow (while Firestore ones are), so if you query a JSON node you will download the entire tree under that node.
Renaud's answer is the correct one but let me add some additional information and restate that:
With the Firebase Realtime Database, downloads are charged for what is downloaded, not for how many nodes you query.
So the key is to reduce the amount of data you're downloading. Your nodes are already pretty shallow; however, there's a big saving to be made because in your current structure the node key (the user's uid) is duplicated within the node as a child node, and that's not needed.
You can always get the node key with snapshot.key and remove that child node. So it would look like:

uid
  fullName: "Logan Paul"
  stars: 40
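A minimal Swift sketch of reading the top entries that way, using the field names from the restructured node above (and assuming ref points at the database root, as in the query from the question):

let queryRef = ref.child("Leaderboard")
    .queryOrdered(byChild: "stars")
    .queryLimited(toLast: 500)

queryRef.observeSingleEvent(of: .value) { snapshot in
    for case let child as DataSnapshot in snapshot.children {
        let uid = child.key // the node key doubles as the user id
        let fullName = child.childSnapshot(forPath: "fullName").value as? String ?? ""
        let stars = child.childSnapshot(forPath: "stars").value as? Int ?? 0
        print(uid, fullName, stars)
    }
}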
Also, I think your calculations are off a bit. It looks like each node would be about 400 bytes of data (Firebase strings are UTF-8 encoded), so if you download 500 nodes per user per day and you have 200,000 users, that's about 38 GB per day (as binary):
roughly 400 bytes * 500 nodes * 200,000 users * 0.000000000931322574615479 GB per byte ≈ 38 GB,
so about $38 a day at $1 per GB downloaded, if I did my math correctly.
We know that the RESTHeart API for MongoDB provides a facility to get data by pages, and the maximum pagesize is 1000.
Is there a way, in RESTHeart or outside it, to get the data of all pages in a single call?
I am just trying to avoid making a separate REST call with RESTHeart for every page.
It is not possible, just as it is not possible to retrieve an entire collection with a single MongoDB driver call.
The limit of 1000 is imposed to bound the HTTP response. With documents of up to 10 megabytes or more of JSON each, a single call could otherwise result in a huge payload!
You can however make concurrent requests for different pages to speed up the data retrieval...
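For example, here is a minimal Swift sketch of fetching several pages in parallel (the host, database, and collection names are placeholders; page and pagesize are RESTHeart's paging parameters):

import Foundation

func fetchPages(_ pages: Range<Int>) async throws -> [Data] {
    let base = "https://myhost/mydb/mycollection" // hypothetical endpoint
    return try await withThrowingTaskGroup(of: (Int, Data).self) { group in
        for page in pages {
            group.addTask {
                let url = URL(string: "\(base)?page=\(page)&pagesize=1000")!
                let (data, _) = try await URLSession.shared.data(from: url)
                return (page, data)
            }
        }
        var results: [(Int, Data)] = []
        for try await result in group { results.append(result) }
        // tasks finish in arbitrary order, so restore page order before returning
        return results.sorted { $0.0 < $1.0 }.map { $0.1 }
    }
}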
In our application we have to display the most popular articles, but the same strategy could be required for trending, hot, relevant, etc. data.
We have 2 collections [Articles] and [Comments] where articles have multiple comments.
We want to display most popular [Articles].
Popularity is computed by some complex formula, but let's assume that the formula sums an article's total view count and its total [Comments] count. We assume that when the formula computes the popularity of one article, it takes all [Articles] into account, so it also gives the article a rank among the others.
As you can see users are constantly increasing views and adding more comments. Every day different articles can be among the most popular ones.
The problem is as follows: how to display up to date data without spamming database with queries?
I was thinking about a scheduled cron job (in our backend app) that would update [Article] popularity, e.g. every hour, and then save it in the article itself. This way, when users visit the page, nothing would have to be counted and we could just work on saved data.
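A rough sketch of what that job could do, with a hypothetical Article type and the simplified formula from above:

struct Article {
    let id: String
    var views: Int
    var commentCount: Int
    var popularity = 0
}

// The simplified formula: total views plus total comment count.
func popularity(of article: Article) -> Int {
    article.views + article.commentCount
}

// Run on a schedule (e.g. hourly): score and rank every article, then persist
// the results so page views only read precomputed data.
func recomputePopularity(_ articles: inout [Article], save: (Article, Int) -> Void) {
    for i in articles.indices {
        articles[i].popularity = popularity(of: articles[i])
    }
    let ranked = articles.sorted { $0.popularity > $1.popularity }
    for (rank, article) in ranked.enumerated() {
        save(article, rank + 1) // ranks are 1-based
    }
}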
It might also be possible to build a query that is fast enough and counts popularity on demand, but I don't know whether that's feasible.
What would be the best strategy? Counting the data in the background and keeping it up to date, building an advanced query, or something different?
Just for testing purposes I would like to get 100, 500, 1000, 5000, 10000, 20000, ... records from a collection. At the moment the largest pagesize is 1000. How can I increase it to whatever I like, just for testing?
RESTHeart has a pagesize limit of 1000 documents per request, and that's hardcoded in the class org.restheart.handlers.injectors.RequestContextInjectorHandler.
If you, for any reason, want to increase that limit then you have to change the source code and build your own jar.
However, RESTHeart speeds up the execution of GET requests to collection resources via its db cursor pre-allocation engine. This applies when several documents need to be read from a big collection, and it moderates the effects of MongoDB's cursor.skip() method, which slows down linearly. So it already optimizes the navigation of large MongoDB collections, if that is what you are looking for.
Please have a look at the Speedup Requests with Cursor Pools and Performances pages in the official documentation for more information.