I am in the middle of developing an app which harvests tweets, Facebook statuses and Facebook photos for a user. Currently the user sets out exactly when (from and to) they want this harvest to occur, and a spider pulls the data during this period. The from/to times are stored in a MySQL db, and my plan was to store all the tweets, status and photo meta-data in MongoDB (with the actual images on S3).
I was thinking I would just create one collection for each of the periods the user wants to harvest for and then store all the tweets etc from that period in that particular collection.
Does this seem like a reasonable approach?
Does this seem like a reasonable approach?
What is the #1 user query? Is it "find activity by period"? If users only ever want to "find by period", then this makes sense.
However, if users want an accumulated view, now you have to gather history for a user and merge it for display.
If you want both a "by this period" and an "accumulated", then I suggest simply stuffing all data into a single user object. It's easy to tag the individual actions with a "harvest run" and a "timestamp".
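For instance, here's a minimal sketch of that shape using the Node.js mongodb driver (the db, collection and field names like harvestRun are hypothetical, and the $filter stage assumes a reasonably modern MongoDB):

// One document per user; every harvested action is embedded and tagged.
const { MongoClient } = require('mongodb');

async function example() {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const users = client.db('harvester').collection('users');

  // Append a harvested tweet to the user's single document:
  await users.updateOne(
    { _id: 'user42' },
    { $push: { actions: {
        type: 'tweet',                 // 'tweet' | 'status' | 'photo'
        harvestRun: 7,                 // which scheduled period produced it
        timestamp: new Date(),
        text: 'Hello world',
    } } },
    { upsert: true }
  );

  // "By this period" view: filter the embedded array server-side.
  const byRun = await users.aggregate([
    { $match: { _id: 'user42' } },
    { $project: { actions: { $filter: {
        input: '$actions', as: 'a', cond: { $eq: ['$$a.harvestRun', 7] }
    } } } }
  ]).toArray();

  // "Accumulated" view: just fetch the whole document.
  const all = await users.findOne({ _id: 'user42' });

  await client.close();
}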
Mongo Details: MongoDB historically capped individual documents at about 4MB; recent versions raise this to 16MB. If you're only using this space for text, please realize that this is a lot of text: a copy of War & Peace is just over 3MB, so you're talking about hundreds of pages of text in 4MB. At 16MB, you can probably store years of status updates & tweets for most people.
Note that MongoDB has GridFS for storing binary data (like image files), so you'll typically store just pointers to these in the User document.
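If you do go the GridFS route instead of S3, a rough sketch with the Node.js driver's GridFSBucket (bucket, collection and field names are made up here):

const { MongoClient, GridFSBucket } = require('mongodb');
const fs = require('fs');

async function storePhoto(userId, path) {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const db = client.db('harvester');

  // Stream the binary into GridFS...
  const upload = new GridFSBucket(db).openUploadStream(path);
  fs.createReadStream(path).pipe(upload);
  await new Promise((resolve, reject) =>
    upload.on('finish', resolve).on('error', reject));

  // ...and keep only the pointer in the user document.
  await db.collection('users').updateOne(
    { _id: userId },
    { $push: { photos: { gridFsId: upload.id, uploadedAt: new Date() } } }
  );
  await client.close();
}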
Related
As Firestore charges by the read/write, it would be super helpful to keep the changes in memory during the session and only commit them when the user exits either the entire app or a specific section. Is there a way to do that in a Flutter web application?
I think one problem with this approach is that the user might just close the tab containing your app. In that case, you have no time to send your data to Firestore.
That aside, you could use a package like Hive to store your documents offline and later run a function that adds the data to Firestore.
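To illustrate the buffer-then-commit idea (sketched here with the Firestore JavaScript SDK's batched writes rather than Dart/Hive; the collection and field names are invented):

const pending = []; // changes kept in memory during the session

function stageChange(docId, data) {
  pending.push({ docId, data }); // no Firestore write happens yet
}

async function commitSession(db) {
  const batch = db.batch(); // up to 500 operations per batch
  for (const change of pending) {
    batch.set(db.collection('notes').doc(change.docId), change.data, { merge: true });
  }
  await batch.commit();
  pending.length = 0;
}

Note that a batch is still billed one write per document it touches; what you save is the stream of per-keystroke writes during the session, not the final commit.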
You also get 50k reads and 20k writes per day for free with Firebase, which is sufficient for smaller apps. If you exceed that limit, your app is probably big enough to earn money with anyway.
I saw the best reply after watching this video (https://www.youtube.com/watch?v=poqTHxtDXwU&feature=emb_title), which had a comment like this:
"So there is a read charge even clients app cached the same document data" (currently 24 thumbs up)
And in reply to another comment, Todd Kerpelman wrote this:
"Great question! The answer is that yes, you really will fetch those first 20 documents all the time. Note that this is different than when you have a realtime listener set up and a document changes in a query you're currently listening to - in that case, only the changed doc will be sent up. But if you're making a series of separate get calls that just happen to overlap, the database will send up the entire data set each time."
I am confused now. My question is: when you load the next list with startAfter, do you load the lists that have already been loaded again? Will you be charged for them?
when you load the next list with startAfter, do you load the lists that have already been loaded again?
No, for pagination, each query is completely different than the last. It does not re-fetch all the prior documents again, and you will not be charged for those prior documents. The query uses the document specified in startAfter() to determine exactly where the query should start reading results, and you will be charged for only the documents that are returned by the query.
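A sketch of what that looks like with the JavaScript SDK (the cities/population collection and field follow the official pagination example):

const first = db.collection('cities').orderBy('population').limit(20);

first.get().then((snap) => {
  // You are charged for these 20 reads.
  const last = snap.docs[snap.docs.length - 1];

  // The next query starts strictly after the last document of this page,
  // so none of the first 20 documents are re-read or re-billed:
  return db.collection('cities')
    .orderBy('population')
    .startAfter(last)
    .limit(20)
    .get();
});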
I have a document in Cloud Firestore to which I listen for updates. It has 2 fields: a description and a picture. The picture is approximately 0.2 MB and the description is a few words. I wanted to know what happens if I make changes to the description in the document: does addSnapshotListener actually download a fresh new copy of the document, or just the field that has been changed?
By looking at how much data is being downloaded in Xcode, I can indeed see that a fresh new copy of the document is downloaded.
This is not efficient at all, since the picture field rarely changes; only the description might change in my application.
Is there a way to optimize this?
Is there a way to optimize this?
Yes! Don't do that.
Firestore (and the realtime database) is not intended to store images or large datasets per field.
You should explore Storage and keep a reference (URL) to the stored item in your Firestore document.
Cloud Storage is built for app developers who need to store and serve user-generated content, such as photos or videos.
By leveraging Storage, when you need to update or change a field in Firestore you're only working with a small amount of data instead of an entire image's worth.
To answer the question: if you read a document from Firestore, it does read the document and its child data.
Here's a link to the Storage Docs which shows how to capture a reference to the item uploaded to storage.
https://firebase.google.com/docs/storage/ios/upload-files
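The question is about iOS, but the pattern looks the same in every SDK; here's a rough JavaScript sketch (paths and field names are placeholders):

async function saveProfile(userId, imageFile, description) {
  // Upload the binary to Cloud Storage...
  const snapshot = await firebase.storage()
    .ref('profile_pics/' + userId + '.jpg')
    .put(imageFile);
  const url = await snapshot.ref.getDownloadURL();

  // ...so the Firestore document stays tiny; editing 'description'
  // re-sends a few hundred bytes instead of ~0.2 MB of image data.
  await firebase.firestore().collection('users').doc(userId).set({
    description: description,
    photoURL: url,
  }, { merge: true });
}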
If you want to automatically sync the images to all clients and have them available offline, just put them in a separate document.
// Store your small, frequently changing text here:
db.collection('users').doc(userId).set({email: 'vince@example.com'});

// Store your image here (mind Firestore's 1 MiB per-document limit):
db.collection('user_profile_pic').doc(userId).set({data: imageData});
In my web app, an authenticated user can pick songs from his Spotify playlist to play at a party. I want guests (non-authenticated users) to be able to view the picked songs on a dynamically created React route and vote on their favorite songs on their own device (probably a phone).
I am using a Mongo, Express, React/Redux, Node stack.
Since the guests don't have access to my app's Redux store, the only way they can view the authenticated user's picked songs is through a GET request to my app's database. My initial plan was to store just playlist documents, and have users GET those playlists to make a request to the Spotify API. However, guests are unauthorized and would need an access token. This means that my database has to store every single song that the authenticated user picked.
My question has to do with design. I don't think it's a good idea for one document to hold every song, because some people might want to pick thousands of songs and one document won't be able to hold them all. On the other hand, creating a separate document for each song seems a little excessive.
Can anyone help me figure out which option is better, or if there is a different option I haven't thought of that can avoid this problem altogether? Thank you
If you store each song in a separate document, the main disadvantage of this strategy is space: you'll need more storage to hold all the documents.
But if you keep all the song documents in the same collection, you get some advantages: for example, queries and sort operations will be more flexible and faster. That saves you both processing and development time. Similar logic is shown here.
Using just one document to store all the songs makes your database operations more complex, which requires more development time and code to organize the retrieved data properly. Another disadvantage is that it isn't a scalable long-term strategy, mainly because a BSON document is limited to 16MB.
In my view, the design with a separate document for each song is more appropriate, for these reasons:
Space is monetarily cheap.
Reducing time complexity should be a priority at every point of software development. Database queries are usually among the slowest operations in a piece of software, so reducing their cost is a good objective to pursue. Storing the songs as separate documents in one collection lets you retrieve the data already organized, with no need to re-process it in code.
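As a quick sketch of the per-song design (Node.js mongodb driver; all collection and field names here are illustrative):

const { MongoClient } = require('mongodb');

async function demo() {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const songs = client.db('party').collection('songs');

  // The authenticated user picks a song:
  await songs.insertOne({
    partyId: 'party123',
    spotifyTrackId: '4uLU6hMCjMI75M1A2tKUQC',
    title: 'Some Song',
    votes: 0,
  });

  // A guest votes: an atomic increment, no document rewrite.
  await songs.updateOne(
    { partyId: 'party123', spotifyTrackId: '4uLU6hMCjMI75M1A2tKUQC' },
    { $inc: { votes: 1 } }
  );

  // The guests' GET endpoint returns songs already ranked:
  const ranked = await songs.find({ partyId: 'party123' })
    .sort({ votes: -1 })
    .toArray();

  await client.close();
}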
Is there a limit to how much persistent storage a single iPhone app may consume?
What does save set the error argument to if the iPhone hits a per-app limit? What if it hits the hardware limit?
Is it possible to limit the number of objects stored for certain entities? If so, what's a good approach to doing this?
acani, an iPhone app I'm working on, downloads the nearest 20 users from the server and saves them to Core Data. After using the app for a while, the users SQLite table could become rather large. How could I limit it? What should I limit it to? Once this table has reached capacity, how could I make it so that newly downloaded users replace the oldest downloaded users?
Thanks!
Matt
I don't know the answer to the limits questions, but I would think you would want to cap the amount of data well ahead of that. There are some iPhone apps (games) which take up a large amount of storage (I think Myst is something like 1.5GB). But if you allowed your database to grow to those sorts of sizes, you might start to impact the storage the user has for their other applications.
I'd be inclined to suggest that your application needs some sort of database housekeeping, either automatically triggered or manually triggered by the user; you will have to write this yourself. For example, you might want to offer a settings option where the user can specify how many "old" users to preserve. If users are being added automatically based on location, what sort of algorithm would a user most likely want to cull the list with?
There is a 2GB limit for apps from the App Store, but as far as user data goes, you should be able to basically fill the disk. When that happens, your saves will start to fail, I believe with 'NSFileWriteOutOfSpaceError' bubbled up from the persistent store coordinator.
As far as limiting entity space, there's no Core Data support for this; you'd have to handle it programmatically. You could extend the validation system to check for certain conditions (free space, number of entities) and fail an insert or update if they didn't match your criteria.
If you want to delete old users, just sort the results and delete the first/last one.