cancel filter and sorting of big data array

cancel filter and sorting of big data array - swift

I'm building vocabulary app using realm. I have several objects of Vocabulary, which contains list of words. One vocabulary contains 45000 words
UI is build such way, that user can search by "BEGINSWITH", "CONTAINS" or "ENDSWITH" through word's title, if corresponding tab is selected.
As, there are several vocabularies, there are some words, that appear in several vocabularies, and I need to remove "duplicates" from UI.
When I do this filtering duplicates on resulted objects + sorting them alphabetically the UI of app freezes, till process completes.
My question is:
1) How can I cancel previous filter and realm filtering request, if tab changed (for example from Contains to Ends"?
2) How can I do all these filter/sorting requests in background, so UI will not freeze?
My code:
let vocabularyPredicate = NSPredicate(format: "enabled == 1 AND lang_from CONTAINS[c] %#", self.language.value)
self.vocabularies = Array(realm.objects(Vocabulary.self).filter(vocabularyPredicate).sorted(byKeyPath: "display_order"))
let result = List<Word>()
for object in self.vocabularies {
let predicate = NSPredicate(format: "title \(selectedFilter.value)[c] %#", self.query.value.lowercased())
result.append(objectsIn: object.words.filter(predicate))
}
self.words = Array(result).unique{$0.title}.sorted {
(s1, s2) -> Bool in return s1.title.localizedStandardCompare(s2.title) == .orderedAscending
}
selectedFilter.value is selected tab value: "BEGINSWITH", "CONTAINS" or "ENDSWITH"
self.query.value.lowercased() - search query.
unique{$0.title} is extension method for array
extension Array {
func unique<T:Hashable>(map: ((Element) -> (T))) -> [Element] {
var set = Set<T>() //the unique list kept in a Set for fast retrieval
var arrayOrdered = [Element]() //keeping the unique list of elements but ordered
for value in self {
if !set.contains(map(value)) {
set.insert(map(value))
arrayOrdered.append(value)
}
}
return arrayOrdered
}
}
Actually, realm search is pretty fast, but because of looping through vocabularies and filtering duplicates + sorting alphabetically operations through array of objects - request is freezing for 1-2 seconds.
UPDATE, based on EpicPandaForce and Manuel advices:
I have lurked one more time, and it appeared, that .distinct(by: [keypath]) is already presented in Results in new version of RealmSwift.
I have changed filter/sorting request to
realm.objects(Word.self).filter(vocabularyPredicate).distinct(by: ["title"]).sorted(byKeyPath: "title", ascending: true)
works better know, but I want to ensure, UI will not freeze anyway, by passing objects bettween background thread and UI thread. I have updated adviced construction to:
DispatchQueue.global(qos: .background).async {
let realm = try! Realm()
let cachedWords = CashedWords()
let predicate = NSPredicate(format: "enabled == 1")
let results = realm.objects(Word.self).filter(predicate).distinct(by: ["title"]).sorted(byKeyPath: "title", ascending: true)
cachedWords.words.append(objectsIn: results)
try! realm.write {
realm.add(cachedWords)
}
let wordsRef = ThreadSafeReference(to: cachedWords)
DispatchQueue.main.async {
let realm = try! Realm()
guard let wordsResult = realm.resolve(wordsRef) else {
return
}
self.words = Array(wordsResult.words)
if ((self.view.window) != nil) {
self.tableView.reloadData()
}
}
print("data reload finalized")
}

1) How can I cancel previous filter and realm filtering request, if tab changed (for example from Contains to Ends"?
You could create an NSOperation to perform the task and check if it's been cancelled between each of the steps (fetch, check isCancelled, filter, check isCancelled, sort). You won't get to cancel it immediately, but it could improve your performance. It also depends on which of those three steps (fetch, filter, sort) is taking longer...
2) How can I do all these filter/sorting requests in background, so UI will not freeze?
You could run that operation inside a new NSOperationQueue.
Or just use GCD, dispatch a block to a background queue, create a Realm instance in the block and run your code there, then dispatch the results back to the main queue to update the UI.
Something like this:
DispatchQueue.global(qos: .userInitiated).async {
guard let realm = try? Realm() else {
return // maybe pass an empty array back to the main queue?
}
// ...
// your code here
// ...
let words = Array(result).unique{$0.title}.sorted {
(s1, s2) -> Bool in return s1.title.localizedStandardCompare(s2.title) == .orderedAscending
}
// Can't pass Realm objects directly across threads
let wordReferences = words.map { ThreadSafeReference(to: $0) }
DispatchQueue.main.async {
// Resolve references on main thread
let realm = try! Realm()
let mainThreadWords = wordReferences.flatMap { realm.resolve($0) }
// Do something with words
self.words = mainThreadWords
}
}
Additionally, you should try to optimize your query:
let predicate = NSPredicate(format: "vocabulary.enabled == 1 AND vocabulary.lang_from CONTAINS[c] %# AND title \(selectedFilter.value)[c] %#", self.language.value, self.query.value.lowercased())
let words = realm.objects(Word.self).filter(predicate).sorted(byKeyPath: "title")
let wordsReference = ThreadSafeReference(words)
// resolve this wordsReference in the main thread

Related

How to use Combine to assign the number of elements returned from a Core Data fetch request?

I want my app to periodically fetch new records and stores them in Core Data. I have a label on my UI that should display the number of elements for a particular record and I want that number to be updated as more records are added into the database. As an exercise, I want to use Combine to accomplish it.
I'm able to display the number of elements in the database when the app launches, but the number doesn't get updated when new data enters into the database (I verified that new data was being added by implementing a button that would manual refresh the UI).
Here's the code that displays the correct number of elements on launch but doesn't update when new records are added:
let replayRecordFetchRequest: NSFetchRequest<ReplayRecord> = ReplayRecord.fetchRequest()
_ = try? persistentContainer.viewContext.fetch(replayRecordFetchRequest).publisher.count().map { String(format: Constants.Strings.playsText, $0) }.assign(to: \.text, on: self.playsLabel)
Here's a code snippet from the WWDC 2019 Session 230 talk that I adapted but this doesn't work at all (the subscriber is never fired):
let replayRecordFetchRequest: NSFetchRequest<ReplayRecord> = ReplayRecord.fetchRequest()
if let replayRecords = try? replayRecordFetchRequest.execute() {
_ = replayRecords.publisher.count().map { String(format: Constants.Strings.playsText, $0) }.assign(to: \.text, on: self.playsLabel)
}

So, I didn't know this until now, but not all publishers are infinitely alive.
And the problem was that the NSFetchRequest.publisher is not a long-living publisher. It simply provides a way to iterate through the sequence of elements in the fetch request. As a result, the subscriber will cancel after the elements are iterated. In my case, I was counting the elements published until cancellation then assigning that value onto the UI.
Instead, I should be subscribing to changes to the managed object context and assigning that pipeline to my UI. Here's some example code:
extension NotificationCenter.Publisher {
func context<T>(fetchRequest: NSFetchRequest<T>) -> Publishers.CompactMap<NotificationCenter.Publisher, [T]> {
return compactMap { notification -> [T]? in
let context = notification.object as! NSManagedObjectContext
var results: [T]?
context.performAndWait {
results = try? context.fetch(fetchRequest)
}
return results
}
}
}
let playFetchRequest: NSFetchRequest<ReplayRecord> = ReplayRecord.fetchRequest()
let replayVideoFetchRequest: NSFetchRequest<ReplayVideo> = ReplayVideo.fetchRequest()
let playsPublisher = contextDidSavePublisher.context(fetchRequest: playFetchRequest).map(\.count)
let replayVideoPublisher = contextDidSavePublisher.context(fetchRequest: replayVideoFetchRequest).map(\.count)
playsSubscription = playsPublisher.zip(replayVideoPublisher).map {
String(format: Constants.Strings.playsText, $0, $1)
}.receive(on: RunLoop.main).assign(to: \.text, on: self.playsLabel)

Closures for waiting data from CloudKit

I have a CloudKit database with some data. By pressing a button my app should check for existence of some data in the Database. The problem is that all processes end before my app get the results of its search. I found this useful Answer, where it is said to use Closures.
I tried to follow the same structure but Swift asks me for parameters and I get lost very quick here.
Does someone can please help me? Thanks for any help
func reloadTable() {
self.timePickerView.reloadAllComponents()
}
func getDataFromCloud(completionHandler: #escaping (_ records: [CKRecord]) -> Void) {
print("I begin asking process")
var listOfDates: [CKRecord] = []
let predicate = NSPredicate(value: true)
let query = CKQuery(recordType: "Riservazioni", predicate: predicate)
let queryOperation = CKQueryOperation(query: query)
queryOperation.resultsLimit = 20
queryOperation.recordFetchedBlock = { record in
listOfDates.append(record)
}
queryOperation.queryCompletionBlock = { cursor, error in
if error != nil {
print("error")
print(error!.localizedDescription)
} else {
print("NO error")
self.Array = listOfDates
completionHandler(listOfDates)
}
}
}
var Array = [CKRecord]()
func generateHourArray() {
print("generate array")
for hour in disponibleHours {
let instance = CKRecord(recordType: orderNumber+hour)
if Array.contains(instance) {
disponibleHours.remove(at: disponibleHours.index(of: hour)!)
}
}
}
func loadData() {
timePickerView.reloadAllComponents()
timePickerView.isHidden = false
}
#IBAction func checkDisponibility(_ sender: Any) {
if self.timePickerView.isHidden == true {
getDataFromCloud{ (records) in
print("gotData")
self.generateHourArray()
self.loadData()
}
print(Array)
}
}

Im struggling to understand your code and where the CloudKit elements fit in to it, so Im going to try and give a generic answer which will hopefully still help you.
Lets start with the function we are going to call to get our CloudKit data, lets say we are fetching a list of people.
func getPeople() {
}
This is simple enough so far, so now lets add the CloudKit code.
func getPeople() {
var listOfPeople: [CKRecord] = [] // A place to store the items as we get them
let query = CKQuery(recordType: "Person", predicate: NSPredicate(value: true))
let queryOperation = CKQueryOperation(query: query)
queryOperation.resultsLimit = 20
// As we get each record, lets store them in the array
queryOperation.recordFetchedBlock = { record in
listOfPeople.append(record)
}
// Have another closure for when the download is complete
queryOperation.queryCompletionBlock = { cursor, error in
if error != nil {
print(error!.localizedDescription)
} else {
// We are done, we will come back to this
}
}
}
Now we have our list of people, but we want to return this once CloudKit is done. As you rightly said, we want to use a closure for this. Lets add one to the function definition.
func getPeople(completionHandler: #escaping (_ records: [CKRecord]) -> Void) {
...
}
This above adds a completion hander closure. The parameters that we are going to pass to the caller are the records, so we add that into the definition. We dont expect anyone to respond to our completion handler, so we expect a return value of Void. You may want a boolean value here as a success message, but this is entirely project dependent.
Now lets tie the whole thing together. On the line I said we would come back to, you can now replace the comment with:
completionHandler(listOfPeople)
This will then send the list of people to the caller as soon as CloudKit is finished. Ive shown an example below of someone calling this function.
getPeople { (records) in
// This code wont run until cloudkit is finished fetching the data!
}
Something to bare in mind, is which thread the CloudKit API runs on. If it runs on a background thread, then the callback will also be on the background thread - so make sure you don't do any UI changes in the completion handler (or move it to the main thread).
There are lots of improvements you could make to this code, and adapt it to your own project, but it should give you a start. Right off the bat, Id image you will want to change the completion handler parameters to a Bool to show whether the data is present or not.
Let me know if you notice any mistakes, or need a little more help.

Fetch of Core-Data is resulting in duplicate items

In my iOS Xcode8 project using Swift, I'm performing a fetch of my Core-Data:
func searchFoods() {
let context: NSManagedObjectContext = appDel.managedObjectContext
let request = NSFetchRequest<NSFetchRequestResult>(entityName: "Foods")
print("Searching Database for \(searchVariable)...")
var subPredicates : [NSPredicate] = []
let codeSearch = NSPredicate(format: "codeText contains[c] %#", "\(searchVariable)")
subPredicates.append(codeSearch)
request.predicate = NSCompoundPredicate(orPredicateWithSubpredicates: subPredicates)
request.returnsObjectsAsFaults = false
do {
let results = try context.fetch(request)
if results.count > 0 {
for result in results as! [NSManagedObject] {
if let item = result.value(forKey: "title") as? String {
// Maybe put a loop of some kind to only append the found item count??
searchArray.append(item)
myTableView.reloadData()
}
}
}
} catch {
print("Fetch failed...")
}
}
However my searchArray that is a [String] of the search results created many duplicates I know are not there; it's listing them 2 or 3 times. Can't figure out how to limit the appending to just the result count amount. If I search fruit, it might return an array like bananas, strawberries, peaches, oranges, bananas, strawberries, peaches, oranges etc, repeating. Can someone please help?

There's nothing in the code that would cause duplicates in the fetch results. If they're not actually present in the persistent store, the likely cause is that you're doing this:
searchArray.append(item)
But there's no sign of ever clearing out past results. Your sample results are consistent with this-- if there are four results, you add them to the array once, then later add them again.
It's also possible that there's a problem with your table view data source methods, but you're probably driving that directly from the contents of searchArray.

How to improve performance for large datasets with Realm?

My database has 500,000 records. The tables don't have a primary key because Realm doesn't support compound primary keys. I fetch data in background thread, then I want to display it in the UI on the main thread. But since Realm objects cannot be shared across threads I cannot use the record I fetched in the background. Instead I need to refetch the record on main thread? If I fetch a record out of the 500,000 records it will block the main thread. I don't know how to deal with it. I use Realm because it said it's enough quick. If I need refetch the record many times, is it really faster than SQLite? I don't want to create another property that combine other columns as primary key because the Realm database is already bigger than a SQLite file.
#objc class CKPhraseModel: CKBaseHMMModel{
dynamic var pinyin :String!
dynamic var phrase :String = ""
class func fetchObjects(apinyin :String) -> Results<CKPhraseModel> {
let realm = Realm.createDefaultRealm()
let fetchString = generateQueryString(apinyin)
let phrases = realm.objects(self).filter(fetchString).sorted("frequency", ascending: false)
return phrases
}
func save(needTransition :Bool = true) {
if let realm = realm {
try! realm.write(needTransition) {[unowned self] in
self.frequency += 1
}
}
else {
let realm = Realm.createDefaultRealm()
if let model = self.dynamicType.fetchObjects(pinyin).filter("phrase == %#", phrase).first {
try! realm.write(needTransition) {[unowned self] in
model.frequency += self.frequency
}
}
else {
try! realm.write(needTransition) {[unowned self] in
realm.add(self)
}
}
}
}
}
then I store fetched records in Array
let userInput = "input somthing"
let phraseList = CKPhraseModel().fetchObjects(userInput)
for (_,phraseModel) in phraseList.enumerate() {
candidates.append(phraseModel)
}
Then I want to display candidates information in UI when the user clicks one of these. I will call CKPhraseModel's save function to save changes. This step is on main thread.

Realm is fast if you use its lazy loading capability, which means that you create a filter that would return your candidates directly from the Realm, because then you'd need to only retrieve only the elements you index in the results.
In your case, you copy ALL elements out. That's kinda slow, which is why you end up freezing.

The method does not enter for loop Parse Swift

I use parse to query current user's friend list and the friend request user and when user press each cell of the friend request, The app will add that friend back and delete the selected friend request so I query friend list and friend request and use "addedArray" as friend requests and "duplicate" as array of current user's friend list and use for loop to find the duplicate of friend list and friend request and delete that friend from addedArray so the current user will se the latest friend requests
Here's my code in swift
func queryAdded(){
let query = PFQuery(className: "Request")
let user = PFUser.currentUser()?.relationForKey("Friends")
let query2 = user?.query()
query.whereKey("To", equalTo: PFUser.currentUser()!)
query.findObjectsInBackgroundWithBlock {
(objects, error) -> Void in
if error == nil{
for object in objects! {
print("query")
let username = object.valueForKey("FromUsername") as! String
self.userCellAdded = username
self.addedArray.append(username)
print(username)
print(self.addedArray.count)
}
print("READY")
print(self.addedArray.count)
self.tableView.reloadData()
}
else{
/* dispatch_async(dispatch_get_main_queue()){
//reload the table view
query.cachePolicy = PFCachePolicy.NetworkElseCache
}*/
print("errorrrr")
}
}
query2!.findObjectsInBackgroundWithBlock{(objects,error) -> Void in
if error == nil {
for object in (objects)!{
if let username = object["username"] as? String {
self.duplicate.append(username)
print("duplicate")
print(username)
print("size")
print(self.duplicate.count)
}
}
}
}
for self.iIndex = 0 ; self.iIndex < self.addedArray.count ; ++self.iIndex {
for self.jIndex = 0 ; self.jIndex < self.duplicate.count ; ++self.jIndex {
print("in for loop")
if self.addedArray[self.iIndex] == self.duplicate[self.jIndex] {
self.addedArray.removeAtIndex(self.iIndex)
self.tableView.reloadData()
print("find")
}
}
}
}
The problem is The method queryAdded() does not run for loop for me and I don't understand why
The duplicate array and the addedArray have value and size but still it didn't go inside the for loop

Your problem is that your for loop is depending on the results of two asynchronous operations. What happens is that your app starts these two background queries and then immediately starts the for loop. Since there is no data yet from the queries, the for loop has no data to work on.
You can either solve this by creating a "pyramid hell" by nesting your operations (bad), or you can use a framework to achieve the same as Promises would provide for JavaScript (good).
Since you're using Parse, you have such a framework already; namely the Bolts Framework. You could then perform these operations sequentially using tasks (BFTask).
Example from the Bolts readme:
var query = PFQuery(className:"Student")
query.orderByDescending("gpa")
findAsync(query).continueWithSuccessBlock {
(task: BFTask!) -> BFTask in
let students = task.result() as NSArray
var valedictorian = students.objectAtIndex(0) as PFObject
valedictorian["valedictorian"] = true
return self.saveAsync(valedictorian)
}.continueWithSuccessBlock {
(task: BFTask!) -> BFTask in
var valedictorian = task.result() as PFObject
return self.findAsync(query)
}.continueWithSuccessBlock {
(task: BFTask!) -> BFTask in
let students = task.result() as NSArray
var salutatorian = students.objectAtIndex(1) as PFObject
salutatorian["salutatorian"] = true
return self.saveAsync(salutatorian)
}.continueWithSuccessBlock {
(task: BFTask!) -> AnyObject! in
// Everything is done!
return nil
}
You could then first prepare both your queries and then start the chain of tasks:
query1.findObjectsInBackground().continueWithSuccessBlock {
(task: BFTask!) -> BFTask in
var objects = task.result() as NSArray
for object in objects {
//collect your usernames
}
return query2.findObjectsInBackground()
}.continueWithSuccessBlock {
(task: BFTask!) -> AnyObject! in
var objects = task.result() as NSArray
for object in objects {
// collect your usernames from relation
}
// Call a function containing the for loop that is currently not running
return nil
}

The for loop is run
duplicate array and the addedArray have value and size - No they don't
findObjectsInBackgroundWithBlock runs the query in ... the background.
Therefore your program does the following:
start the first query
start the second query
run the for loop
the queries finish at some arbitrary point in time.
In particular when the program reaches point 3 the arrays do not contain anything, they are empty arrays, therefore the for-loop executes perfectly fine as it is supposed to be: it does nothing since there is nothing to loop over.
Solution:
Move the for loop into a function that you call after the first query and the second query finish.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

cancel filter and sorting of big data array - swift

Related

How to use Combine to assign the number of elements returned from a Core Data fetch request?

Closures for waiting data from CloudKit

Fetch of Core-Data is resulting in duplicate items

How to improve performance for large datasets with Realm?

The method does not enter for loop Parse Swift

Categories

Resources