Firestore parallel queries duration increases as the number of queries increases (and it shouldn't) - swift

Context
I have a list of items and a list of stores. I need to ask each store which of the items they have.
I know that, according to Frank van Puffelen, parallel queries shouldn't take long. In fact, a batch of them should take about as long as a single regular query, for which "the time it takes to run a query is proportional to the number of results you get back, not the number of docs you're searching through."
Problem
However, the bigger the number of items in the list (hence, the bigger the number of queries executed), the longer it takes to get results back. I'm getting the following durations for each quantity of items:
How I'm doing it
To get all the results at the same time, I'm using a DispatchGroup. I iterate over the stores, and for each store I iterate over the list, querying that store for each item.
func fetchStoresProducts() {
    let productSearchGroup = DispatchGroup()
    // Iterating over indices is safe even when comparisonResults is empty.
    for i in comparisonResults.indices {
        productSearchGroup.enter()
        DataService.instance.getPrices(ofList: state.shoppingList.items, inStore: comparisonResults[i].store) { (productsFound) in
            self.comparisonResults[i].products = productsFound
            productSearchGroup.leave()
        }
    }
    productSearchGroup.notify(queue: .main) {
        // continue with next steps
    }
}
And in the getPrices function, I query for each item considering its conditions:
func getPrices(ofList list: [Item], inStore store: Store, completion: @escaping (_ productsFound: [Product]) -> ()) {
    var products = [Product]()
    let listGroup = DispatchGroup()
    list.forEach { (item) in
        listGroup.enter()
        // Build one query per item, narrowing by the optional fields that are set.
        var query = historyRef.whereField("genericName", isEqualTo: item.genericName)
        query = query.whereField("storeID", isEqualTo: store.uid)
        if let type = item.genericType {
            query = query.whereField("genericType", isEqualTo: type)
        }
        if item.brandSearchOption == nil {
            if let brand = item.brand {
                query = query.whereField("brand", isEqualTo: brand)
            }
        }
        // Only the cheapest match is needed.
        query = query.order(by: "price").limit(to: 1)
        query.getDocuments { (snapshot, error) in
            if let snapshot = snapshot {
                if let document = snapshot.documents.first {
                    if var product = Product(data: document.data(), documentID: document.documentID) {
                        product.setQuantity(to: item.quantity)
                        products.append(product)
                    }
                }
            } else {
                print(error?.localizedDescription ?? "Unknown error")
            }
            listGroup.leave()
        }
    }
    listGroup.notify(queue: .main) {
        completion(products)
    }
}
Notice that each of these queries has a limit of 1, which should give the fastest possible query duration. And they are also being executed together, almost in parallel - I don't wait for a result to ask for the next one.
Question
What am I doing wrong? Why is the querying time increasing proportionally to the number of items in the list?
Update
Interesting fact: if I swap historyRef for a smaller collection, storesRef (with 100 or so documents), keeping all other variables the same (number of stores and number of items), the queries are lightning fast, averaging 1.3s for 16 items in the list.
Could this issue be related to the collection, then? Its size, or how many composite indexes it has? historyRef has 15k+ documents with 10 composite indexes, storesRef has 100+ documents with 1 composite index.

Related

How to pull all product data from index in Algolia?

After reading the docs on how to search and browse an index with Algolia's Swift Client, it's not clear how I need to pull all product data from an index. In the documentation, it is stated that:
The search query only allows for the retrieval of up to 1000 hits. If
you need to retrieve more than 1000 hits (e.g. for SEO), you can
either leverage the Browse index method or increase the
paginationLimitedTo parameter
So I wrote the following:
let client = SearchClient(appID: "...", apiKey: "...")
var index: Index
index = client.index(withName: "products")
var productFeed:[Product] = []
let settings = Settings()
.set(\.paginationLimitedTo, to: 4500)
index.setSettings(settings) { result in
if case .success(let response) = result {
.....
}
}
Then to Browse:
index.browse(query: Query("")) { result in
if case .success(let response) = result {
do {
let products:[Product] = try response.extractHits()
DispatchQueue.main.async {
self.productFeed = products
}
} catch let error {
print("Hits decoding error :\(error)")
}
}
}
It would seem as though the two blocks of code would work together, but my productFeed array just returns 1000 records. Can someone explain what I am doing wrong here?
To retrieve all records from your index, use the browseObjects method.
It performs multiple consecutive browse calls, extracting all records from the index page by page.
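The page-by-page idea can be illustrated without the Algolia client at all. The sketch below is a minimal simulation, assuming each page comes back with a cursor that is nil once the index is exhausted; `Page`, `fetchPage` and `fetchAll` are hypothetical stand-ins and not part of Algolia's API.

```swift
import Foundation

// Hypothetical page type: a batch of records plus a cursor for the next batch
// (nil when nothing is left). This is not the Algolia client's response type.
struct Page {
    let hits: [Int]
    let cursor: Int?
}

// Hypothetical stand-in for one browse call against an index of `total` records.
func fetchPage(after cursor: Int?, total: Int, pageSize: Int) -> Page {
    let start = cursor ?? 0
    let end = min(start + pageSize, total)
    return Page(hits: Array(start..<end), cursor: end < total ? end : nil)
}

// Keeps requesting pages until no cursor comes back, accumulating every record.
func fetchAll(total: Int, pageSize: Int) -> [Int] {
    var all: [Int] = []
    var cursor: Int? = nil
    repeat {
        let page = fetchPage(after: cursor, total: total, pageSize: pageSize)
        all.append(contentsOf: page.hits)
        cursor = page.cursor
    } while cursor != nil
    return all
}

let allRecords = fetchAll(total: 4500, pageSize: 1000)
print(allRecords.count) // 4500
```

A single call in this simulation returns at most one page, which mirrors why one browse call in the question stopped at 1000 records.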

Swift Firebase get batches of documents in order

For context, I have a bunch of documents that hold fields similar to a social media post (photo URL, like count, date uploaded, person who uploaded it, etc.), and I am showing this data in a gallery (LazyVGrid). I do not want to get all of the documents at once, so I fetch 20 documents at a time based on how far the user has scrolled down the gallery view. I am sorting my get request with:
self.eventsDataCollection.document(currentEventID).collection("eventMedias").order(by: "savesCount", descending: true).limit(to: 20).getDocuments
I have no problem getting the first 20 using this code. How can I get the next 20 and the 20 after that, and so on?
With query cursors in Cloud Firestore, you can split data returned by a query into batches according to the parameters you define in your query.
Query cursors define the start and end points for a query, allowing you to:
Return a subset of the data.
Paginate query results.
Use the startAt() or startAfter() methods to define the start point for a query. Use the endAt() or endBefore() methods to define an endpoint for your query results.
As Dharmaraj mentioned, pagination with Firestore is the best fit for your case.
Paginate queries by combining query cursors with the limit() method, which caps the number of documents fetched for the gallery at a time. Since you don't want a fixed total, and the user should be able to keep scrolling as long as there are documents, I would suggest using the last document of each batch as the cursor for the next one, as in the code sample below.
To get the last document:
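let first = db.collection("collectionname")
    .order(by: "fieldname")
first.addSnapshotListener { (snapshot, error) in
    guard let snapshot = snapshot else {
        print("Error retrieving documents: \(error.debugDescription)")
        return
    }
    guard let lastSnapshot = snapshot.documents.last else {
        // The collection is empty.
        return
    }
    // Build the query for the next batch, starting after the last document.
    let next = db.collection("collectionname")
        .order(by: "fieldname")
        .start(afterDocument: lastSnapshot)
    // Use `next` to fetch the following batch.
}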
I ended up referencing Dharmaraj's link in his comment.
@Published var isFetchingMoreDocs: Bool = false
private var lastDocQuery: DocumentSnapshot!
public func getUpdatedEventMedias(currentEventID: String, eventMedias: [EventMedia], completion: @escaping (_ eventMedias: [EventMedia]) -> Void) {
    self.isFetchingMoreDocs = true
    var docQuery: Query!
    if eventMedias.isEmpty {
        docQuery = self.eventsDataCollection.document(currentEventID).collection("eventMedias").order(by: "savesCount", descending: true).limit(to: 20)
    } else if let lastDocQuery = self.lastDocQuery {
        docQuery = self.eventsDataCollection.document(currentEventID).collection("eventMedias").order(by: "savesCount", descending: true).limit(to: 20).start(afterDocument: lastDocQuery)
    }
    if let docQuery = docQuery {
        print("GET DOCS")
        docQuery.getDocuments { (document, error) in
            if let documents = document?.documents {
                var newEventMedias: [EventMedia] = []
                for doc in documents {
                    if let media = try? doc.data(as: EventMedia.self) {
                        newEventMedias.append(media)
                    }
                }
                self.lastDocQuery = document?.documents.last
                self.isFetchingMoreDocs = false
                completion(newEventMedias)
            } else if let error = error {
                print("Error getting updated event media: \(error)")
                self.isFetchingMoreDocs = false
                completion([])
            }
        }
    } else {
        self.isFetchingMoreDocs = false
        completion([])
    }
}
As seen in my code, by utilizing:
.order(by: "savesCount", descending: true).limit(to: 20).start(afterDocument: lastDocQuery)
I am able to start exactly where I left off. Note that I only call this function if !isFetchingMoreDocs; otherwise it would be called dozens of times within seconds while scrolling. The most important part of this code is the nil check on lastDocQuery: once the user has scrolled all the way to the bottom, lastDocQuery is no longer valid, and using it would cause a fatal error. I am also using a custom scroll view that tracks the scroll offset to decide when to fetch more media from Firebase.
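The end-of-data handling can be made explicit with a flag rather than relying on the cursor becoming nil. The following is a minimal in-memory sketch of that idea, assuming a collection that is already sorted; `Pager`, `nextPage` and `reachedEnd` are hypothetical names, and no Firestore call is involved.

```swift
import Foundation

// In-memory stand-in for a collection that is already sorted by the ordering field.
final class Pager {
    private let items: [Int]
    private let pageSize: Int
    private var cursor: Int? = nil          // index of the last item handed out
    private(set) var reachedEnd = false

    init(items: [Int], pageSize: Int) {
        self.items = items
        self.pageSize = pageSize
    }

    // Returns the next page, or an empty array once the data is exhausted.
    func nextPage() -> [Int] {
        guard !reachedEnd else { return [] }
        let start = cursor.map { $0 + 1 } ?? 0
        let end = min(start + pageSize, items.count)
        let page = Array(items[start..<end])
        // Only move the cursor when something was actually returned.
        if !page.isEmpty { cursor = end - 1 }
        // A short (or empty) page means there is nothing left to fetch.
        if page.count < pageSize { reachedEnd = true }
        return page
    }
}

let pager = Pager(items: Array(1...45), pageSize: 20)
let pageSizes = [pager.nextPage().count, pager.nextPage().count,
                 pager.nextPage().count, pager.nextPage().count]
print(pageSizes) // [20, 20, 5, 0]
```

With a flag like this, a scroll handler can skip the request entirely once the last page has been seen, instead of issuing a query that is known to return nothing.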

Array Function of MPMediaItem Very Slow

I'm trying to edit the queue of my music player using the applicationQueuePlayer and the perform method (details here). However, whenever I apply any array function (map, filter etc.), it takes many seconds to complete, leading to (I think) data races and crashes when the user, for example, removes two tracks immediately after each other.
var musicPlayerController = MPMusicPlayerController.applicationQueuePlayer
self.musicPlayerController.perform { (currentQueue) in
    let items = currentQueue.items
    let itemsToRemove = items.filter { $0.artist == "Some artist" } // this takes multiple seconds
    if let item = itemsToRemove.first {
        currentQueue.remove(item)
    }
} completionHandler: { (newQueue, error) in
    if let e = error {
        print(e)
    } else {
        // `items` from the first closure is out of scope here, so read the new queue instead.
        tracks = newQueue.items.map { Track(item: $0) } // this takes multiple seconds
    }
}
The issue is arising as I'm going through an MPMediaItem array. I don't think this is an issue with the MPMediaItem class though, as I'm able to complete a map of [MPMediaItem] in other places in the app e.g. when getting items from a playlist (a similar sized array to the queue items).
The issue happens solely when the MPMediaItems are taken from the MPMusicPlayerControllerMutableQueue and MPMusicPlayerControllerQueue.
Is this just a bug with the MusicKit API?

swift for loop order of data is not right

I want to fetch data from Firebase and put it in an array. The first part of the function always returns the right order; I can see it when I print("DEBUG: \(files)"). But after the for loop, the order of the documents gets mixed up and I always get a random order. Shouldn't I always get the same order?
func getUnreadMessages() {
    guard let uid = AuthViewModel.shared.userSession?.uid else { return }
    Firestore.firestore().collection("users").document(uid).collection("chats").order(by: "created", descending: true).getDocuments { (snapshot, _) in
        guard let files = snapshot?.documents.compactMap({ $0.documentID }) else { return }
        print("DEBUG: \(files)")
        for file in files {
            Firestore.firestore().collection("users").document(uid).collection("chats").document(file).collection("messages").whereField("read", isEqualTo: false).getDocuments { (snapshot, _) in
                guard let documents = snapshot?.documents.compactMap({ $0.documentID }) else { return }
                print("DEBUG: \(documents)")
                self.count.append(documents.count)
                print("DEBUG: \(self.count)")
            }
        }
    }
}
You get a different order of results because while you call the database in the correct order, there is no guarantee that the database will return your call in that same order, because some calls take longer than other calls. I think the simplest solution is to record the original order, attach it to the data in your second call (where you determine document count), and sort the collection (the array) by that original order.
The easiest way to attach this index value to the document count is a custom model:
struct MessageCount {
let count: Int // this is the message count you're after
let n: Int // this is the index of the original order
init(count: Int, n: Int) {
self.count = count
self.n = n
}
}
Then just use a dispatch group to coordinate the async tasks and in the completion of the dispatch group, sort the array by index and you will have an array of message counts in the intended order:
func getUnreadMessages() {
    guard let uid = AuthViewModel.shared.userSession?.uid else {
        return
    }
    let db = Firestore.firestore() // instantiate it once since it could be created hundreds or thousands of times in this function
    db.collection("users").document(uid).collection("chats").order(by: "created", descending: true).getDocuments { (snapshot, error) in
        guard let snapshot = snapshot,
              !snapshot.isEmpty else {
            if let error = error {
                print(error) // you oddly omitted the error in your code, never do that
            }
            return
        }
        let dispatch = DispatchGroup() // set up the dispatch group outside the loop
        var messageCounts = [MessageCount]() // this temp array will carry the data with the index
        // to record the original order of the loop, just enumerate it and access `n` (the index)
        for (n, doc) in snapshot.documents.enumerated() {
            dispatch.enter() // enter dispatch on each iteration
            db.collection("users").document(uid).collection("chats").document(doc.documentID).collection("messages").whereField("read", isEqualTo: false).getDocuments { (snapshot, error) in
                if let snapshot = snapshot {
                    let c = snapshot.count // get the message count
                    let count = MessageCount(count: c, n: n) // add it to the model along with n which is captured by the parent closure
                    messageCounts.append(count) // append to our temp array
                } else if let error = error {
                    print(error)
                }
                dispatch.leave() // leave dispatch no matter the outcome
            }
        }
        // this is the completion handler of the dispatch group
        dispatch.notify(queue: .main) {
            // sort the array by index and then map it to just get the message counts
            let counts = messageCounts.sorted(by: { $0.n < $1.n }).map({ $0.count })
        }
    }
}
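A variant of the same idea avoids the final sort by writing each result straight into the slot for its original index. The sketch below is self-contained and simulates the queries with delays; `fakeQuery`, `Slots`, `ResultBox` and `fetchInOrder` are hypothetical names. The lock is there because this stand-in calls back on background queues, whereas Firestore, as far as I know, delivers callbacks on the main queue by default.

```swift
import Foundation

// Hypothetical stand-in for an async query: reports `value` after `delay` seconds.
func fakeQuery(value: Int, delay: TimeInterval, completion: @escaping (Int) -> Void) {
    DispatchQueue.global().asyncAfter(deadline: .now() + delay) {
        completion(value)
    }
}

// Fixed-size storage with one slot per request, guarded by a lock.
final class Slots {
    private var values: [Int?]
    private let lock = NSLock()
    init(count: Int) { values = [Int?](repeating: nil, count: count) }
    func set(_ value: Int, at index: Int) {
        lock.lock(); values[index] = value; lock.unlock()
    }
    func filled() -> [Int] {
        lock.lock(); defer { lock.unlock() }
        return values.compactMap { $0 }
    }
}

// Starts every query at once and reports the results in input order,
// regardless of the order in which the callbacks fire.
func fetchInOrder(_ inputs: [(value: Int, delay: TimeInterval)],
                  completion: @escaping ([Int]) -> Void) {
    let slots = Slots(count: inputs.count)
    let group = DispatchGroup()
    for (n, input) in inputs.enumerated() {
        group.enter()
        fakeQuery(value: input.value, delay: input.delay) { result in
            slots.set(result, at: n)   // the index n pins the original position
            group.leave()
        }
    }
    group.notify(queue: DispatchQueue.global()) {
        completion(slots.filled())
    }
}

// Usage: the first query is the slowest, yet it stays first in the output.
final class ResultBox { var values: [Int] = [] }
let resultBox = ResultBox()
let finished = DispatchSemaphore(value: 0)
fetchInOrder([(value: 10, delay: 0.3), (value: 20, delay: 0.1), (value: 30, delay: 0.2)]) { results in
    resultBox.values = results
    finished.signal()
}
finished.wait()   // blocking here is only for the demo; an app would stay in the callback
print(resultBox.values) // [10, 20, 30]
```

Failed requests simply leave their slot empty, so the remaining results keep their relative order.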
The order of returned results is determined by an order(by:) clause. Without one, the results may seem somewhat random.
In this case the first Firebase call specifies an order, so those documents will always be returned in the correct order.
collection("chats").order(by: "created")
But the next Firebase call does not specify an order, so the returned documents may be somewhat inconsistently ordered.
.collection("messages").whereField
We need some way to guarantee that order.
Suppose the structure is this:
chats (collection)
    user ids (documents)
        chats (collection)
            chat ids (documents)
                messages (collection)
                    message ids (documents that you want ordered)
The message documents would need a field to order them by; call that field ordering.
Here's the code that prints the number of messages in each chat and then prints the messages in order:
func getUnreadMessages() {
    let uid = "uid_0"
    self.db.collection("users_chats").document(uid).collection("chats").getDocuments(completion: { snapshot, error in
        if let err = error {
            print(err.localizedDescription)
            return
        }
        guard let docs = snapshot?.documents else { return }
        for doc in docs {
            let ref = doc.reference.collection("messages")
            ref.order(by: "ordering").getDocuments(completion: { messagesSnapshot, error in
                if let err = error {
                    print(err.localizedDescription)
                    return
                }
                guard let messages = messagesSnapshot?.documents else { return }
                print("the chat document: \(doc.documentID) has \(messages.count) messages")
                for msg in messages {
                    let order = msg.get("ordering")
                    let isRead = msg.get("read") // renamed so it no longer shadows the loop variable
                    print("order: \(order!)", "isRead: \(isRead!)")
                }
            })
        }
    })
}
If there were three messages in chat 0, the output would look like this:
the chat document: chat_0 has 3 messages
the chat document: chat_1 has 0 messages
the chat document: chat_2 has 0 messages
order: 0 isRead: 0
order: 1 isRead: 1
order: 2 isRead: 0

Get sorted Firestore documents when denormalizing data

I am using Firestore for my app on which users can publish Posts, stored in the posts collection:
posts
    {postID}
        content = ...
        attachementUrl = ...
        authorID = {userID}
Each User will also have a timeline in which the posts of the people they follow will appear. For that I am also keeping a user_timelines collection that gets populated via a Cloud Function:
user_timelines
    {userID}
        posts
            {documentID}
                postID = {postID}
                addedDate = ...
Because the data is denormalized, if I want to iterate through a user's timeline I need to perform an additional (inner) query to get the complete Post object via its {postID}, like so:
db.collection("user_timelines").document(userID).collection("posts")
    .order(by: "addedDate", descending: true).limit(to: 100).getDocuments { (querySnap, _) in
        guard let querySnap = querySnap else { return }
        for queryDoc in querySnap.documents {
            let postID = queryDoc.data()["postID"] as! String
            // Look up the full Post by the ID stored in the timeline entry.
            db.collection("posts").document(postID).getDocument { (snap, _) in
                if let postDoc = snap {
                    let post = Post(document: postDoc)
                    posts.append(post)
                }
            }
        }
    }
The problem is that by doing so I am losing the order of my collection, because there is no guarantee that the inner queries will complete in the same order. I need to keep the order of my collection, as it matches the order of the timeline.
If I had the complete Post objects in the timeline collections there would be no issue, and the ordering and limit on the timeline query would keep the Posts sorted. But if I denormalize, I can't seem to find a correct solution.
How can I iterate through a user's timeline and make sure to get all the Post objects sorted by addedDate even when denormalizing data?
I was thinking of creating a postID/addedDate mapping dictionary when reading the postIDs, and then sorting the Posts at the end using this dictionary, but I suspect there must be a better solution.
I was expecting this to be a common issue when denormalizing data, but unfortunately I could not find any results. Maybe there's something I am missing here.
Thank you for your help!
What you can do is enumerate the loop where you perform the inner query, which simply numbers each iteration. From there, you could expand the Post model to include this value n and then sort the array by n when you're done.
db.collection("user_timelines").document(userID).collection("posts").order(by: "addedDate", descending: true).limit(to: 100).getDocuments { (querySnap, _) in
    guard let querySnap = querySnap else { return }
    for (n, queryDoc) in querySnap.documents.enumerated() {
        let postID = queryDoc.data()["postID"] as! String
        db.collection("posts").document(postID).getDocument { (snap, _) in
            if let postDoc = snap {
                let post = Post(document: postDoc, n: n)
                posts.append(post)
            }
        }
    }
    posts.sort(by: { $0.n < $1.n })
}
The example above won't actually work, because the inner getDocument calls are asynchronous: the array is sorted before all of the downloads have completed. To fix that, consider using a DispatchGroup to coordinate the task.
db.collection("user_timelines").document(userID).collection("posts").order(by: "addedDate", descending: true).limit(to: 100).getDocuments { (querySnap, _) in
    guard let querySnap = querySnap else { return }
    let dispatch = DispatchGroup()
    for (n, queryDoc) in querySnap.documents.enumerated() {
        dispatch.enter() // enter on each iteration
        let postID = queryDoc.data()["postID"] as! String
        db.collection("posts").document(postID).getDocument { (snap, _) in
            if let postDoc = snap {
                let post = Post(document: postDoc, n: n)
                posts.append(post)
            }
            dispatch.leave() // leave no matter success or failure
        }
    }
    dispatch.notify(queue: .main) { // completion
        posts.sort(by: { $0.n < $1.n })
    }
}
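The mapping-dictionary idea from the question also works and leaves the Post model untouched. Here is a minimal sketch, assuming post IDs are unique within a timeline; `TimelinePost` and the sample IDs are hypothetical, and the arrays stand in for data that would come from Firestore.

```swift
import Foundation

// Hypothetical minimal model; the real Post would be built from a document.
struct TimelinePost: Equatable {
    let id: String
}

// Timeline order as read from user_timelines (already sorted by addedDate).
let timelinePostIDs = ["p3", "p1", "p2"]

// Posts in the arbitrary order in which the inner fetches happened to finish.
let fetchedPosts = [TimelinePost(id: "p1"), TimelinePost(id: "p2"), TimelinePost(id: "p3")]

// Map each postID to its position in the timeline (assumes IDs are unique).
let rank = Dictionary(uniqueKeysWithValues:
    timelinePostIDs.enumerated().map { ($0.element, $0.offset) })

// Sort by timeline position; IDs missing from the map sink to the end.
let sortedPosts = fetchedPosts.sorted {
    (rank[$0.id] ?? Int.max) < (rank[$1.id] ?? Int.max)
}

print(sortedPosts.map { $0.id }) // ["p3", "p1", "p2"]
```

The sort would still need to run only after every inner fetch has finished, for example inside the dispatch group's notify block.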