How to improve performance for large datasets with Realm? - swift

My database has 500,000 records. The tables don't have a primary key because Realm doesn't support compound primary keys. I fetch data on a background thread and then want to display it in the UI on the main thread. But since Realm objects cannot be shared across threads, I cannot use the record I fetched in the background, so it seems I need to refetch the record on the main thread. Fetching one record out of 500,000 blocks the main thread, and I don't know how to deal with that. I use Realm because it is said to be fast enough. If I need to refetch the record many times, is it really faster than SQLite? I don't want to create another property that combines other columns into a primary key, because the Realm database is already bigger than the SQLite file.
@objc class CKPhraseModel: CKBaseHMMModel{
dynamic var pinyin :String!
dynamic var phrase :String = ""
class func fetchObjects(apinyin :String) -> Results<CKPhraseModel> {
let realm = Realm.createDefaultRealm()
let fetchString = generateQueryString(apinyin)
let phrases = realm.objects(self).filter(fetchString).sorted("frequency", ascending: false)
return phrases
}
func save(needTransition :Bool = true) {
if let realm = realm {
try! realm.write(needTransition) {[unowned self] in
self.frequency += 1
}
}
else {
let realm = Realm.createDefaultRealm()
if let model = self.dynamicType.fetchObjects(pinyin).filter("phrase == %@", phrase).first {
try! realm.write(needTransition) {[unowned self] in
model.frequency += self.frequency
}
}
else {
try! realm.write(needTransition) {[unowned self] in
realm.add(self)
}
}
}
}
}
Then I store the fetched records in an array:
let userInput = "input something"
let phraseList = CKPhraseModel.fetchObjects(userInput)
for (_,phraseModel) in phraseList.enumerate() {
candidates.append(phraseModel)
}
Then I want to display the candidates' information in the UI. When the user clicks one of them, I call CKPhraseModel's save function to save the changes. This step runs on the main thread.

Realm is fast if you use its lazy-loading capability: build a query that returns your candidates directly from the Realm, and only the elements you actually index in the results are retrieved.
In your case, you copy ALL the elements out into an array. That is slow, which is why the UI ends up freezing.
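
As a rough sketch of the lazy approach described above, the UI can hold on to the Results collection and index into it on demand instead of copying it into an array. The Phrase model, the CandidateList wrapper and the BEGINSWITH predicate are assumptions made for illustration (the question's generateQueryString is not shown), and the model uses the newer @Persisted syntax:

```swift
import RealmSwift

// Hypothetical model, loosely based on the question's CKPhraseModel.
final class Phrase: Object {
    @Persisted var pinyin: String = ""
    @Persisted var phrase: String = ""
    @Persisted var frequency: Int = 0
}

// Hypothetical wrapper that keeps the lazy Results instead of an Array.
final class CandidateList {
    private let results: Results<Phrase>

    init(realm: Realm, pinyin: String) {
        // Building the query does not load the matching objects.
        results = realm.objects(Phrase.self)
            .filter("pinyin BEGINSWITH %@", pinyin) // assumed predicate
            .sorted(byKeyPath: "frequency", ascending: false)
    }

    var count: Int { results.count }

    // Only the object at the requested index is materialized.
    func candidate(at index: Int) -> Phrase { results[index] }
}
```

In a table view, count would back numberOfRowsInSection and candidate(at:) would back cellForRowAt, so only the rows that are actually shown get read.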

Related

SwiftUI + CoreData: Insert more than 1000 entities with relationships

I have a SwiftUI project in which I'm using CoreData to save data fetched from an API onto the device. I was inserting the entities in batches, which worked fine until I realized that relationships are left untouched when inserting in batches:
An entity Job, which has a one-to-many relationship with the entity Tag.
A one-to-one relationship with Category.
A one-to-one relationship with Type.
What I'm doing now is inserting the entities manually in a background task:
container.performBackgroundTask { context in
for job in jobs {
let jobToInsert = Job(context: context)
let type = JobType(context: context)
let category = Category(context: context)
jobToInsert.id = Int32(job.id)
....
do {
print("Inserting jobs")
try context.save()
} catch {
// log any errors
}
}
Is there any way to improve the performance by perhaps doing this in a way that I don't know? Because for the user, when they start the app, inserting the jobs one by one isn't a very nice experience because first, it takes a long time (more than 2 minutes) and second because my UI isn't automatically updated as I'm inserting the entities.
Thanks a lot in advance!
EDIT: I also see the memory increasing; after taking the screenshot and before I stopped the process, memory usage had reached 1.09 GB.
EDIT 2: This is the code I used when trying to insert the jobs in batch:
private func newBatchInsertRequest(with jobs: [JobCodable]) -> NSBatchInsertRequest {
var index = 0
let total = jobs.count
let batchInsert = NSBatchInsertRequest(
entity: Job.entity()) { (managedObject: NSManagedObject) -> Bool in
guard index < total else { return true }
if let job = managedObject as? Job {
let type = JobType(context: self.container.viewContext)
let category = Category(context: self.container.viewContext)
let data: JobCodable = jobs[index]
job.id = Int32(data.id)
...
return batchInsert
Unfortunately, the relationship can't be built, probably because of the context, since I'm getting Thread 13: EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0) on the line let category = Category(context: self.container.viewContext)
Are you calling save() on the context for every job that you insert? If so, I would start there — if you want to load a bunch of jobs all at once, you can insert all your Job instances, set up the relationships, and then only save() once at the very end (outside the loop).
This will not only save you time, but also make it appear as if all the objects show up at the same time to your UI. If that's not what you want, you can experiment with saving in batches within the loop — only call save() every thousand Job objects, for example.
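
The suggestion above could be sketched roughly as follows. The insertInBatches helper and its makeObject closure are hypothetical names, not Core Data API. The closure is where the caller would create the Job and wire up its relationships in the same context:

```swift
import CoreData

/// Hypothetical helper: inserts `items` and saves once per `batchSize`
/// objects instead of once per object. `makeObject` is supplied by the
/// caller and creates/configures the managed objects for a single item.
func insertInBatches<Item>(_ items: [Item],
                           into context: NSManagedObjectContext,
                           batchSize: Int = 1000,
                           makeObject: (Item, NSManagedObjectContext) -> Void) throws {
    precondition(batchSize > 0, "batchSize must be positive")
    for (index, item) in items.enumerated() {
        makeObject(item, context)
        // Save only when a full batch has been inserted.
        if (index + 1) % batchSize == 0 {
            try context.save()
        }
    }
    // Save whatever is left over from the last partial batch.
    if context.hasChanges {
        try context.save()
    }
}
```

This would be called from inside container.performBackgroundTask so that all work happens on the background context's queue. Passing a batchSize at least as large as items.count gives the single save at the very end.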

How to use Combine to assign the number of elements returned from a Core Data fetch request?

I want my app to periodically fetch new records and store them in Core Data. I have a label in my UI that should display the number of elements for a particular record, and I want that number to update as more records are added to the database. As an exercise, I want to use Combine to accomplish this.
I'm able to display the number of elements in the database when the app launches, but the number doesn't get updated when new data enters the database (I verified that new data was being added by implementing a button that manually refreshes the UI).
Here's the code that displays the correct number of elements on launch but doesn't update when new records are added:
let replayRecordFetchRequest: NSFetchRequest<ReplayRecord> = ReplayRecord.fetchRequest()
_ = try? persistentContainer.viewContext.fetch(replayRecordFetchRequest).publisher.count().map { String(format: Constants.Strings.playsText, $0) }.assign(to: \.text, on: self.playsLabel)
Here's a code snippet from the WWDC 2019 Session 230 talk that I adapted but this doesn't work at all (the subscriber is never fired):
let replayRecordFetchRequest: NSFetchRequest<ReplayRecord> = ReplayRecord.fetchRequest()
if let replayRecords = try? replayRecordFetchRequest.execute() {
_ = replayRecords.publisher.count().map { String(format: Constants.Strings.playsText, $0) }.assign(to: \.text, on: self.playsLabel)
}
So, I didn't know this until now, but not all publishers are infinitely alive.
And the problem was that the publisher I got from the fetched results is not a long-living publisher. It is a sequence publisher that simply emits each element of the fetched array and then completes. As a result, the subscription ends once the elements have been iterated. In my case, count() emitted the number of elements at completion, and that single value was assigned to the UI.
Instead, I should be subscribing to changes to the managed object context and assigning that pipeline to my UI. Here's some example code:
extension NotificationCenter.Publisher {
func context<T>(fetchRequest: NSFetchRequest<T>) -> Publishers.CompactMap<NotificationCenter.Publisher, [T]> {
return compactMap { notification -> [T]? in
let context = notification.object as! NSManagedObjectContext
var results: [T]?
context.performAndWait {
results = try? context.fetch(fetchRequest)
}
return results
}
}
}
let playFetchRequest: NSFetchRequest<ReplayRecord> = ReplayRecord.fetchRequest()
let replayVideoFetchRequest: NSFetchRequest<ReplayVideo> = ReplayVideo.fetchRequest()
let playsPublisher = contextDidSavePublisher.context(fetchRequest: playFetchRequest).map(\.count)
let replayVideoPublisher = contextDidSavePublisher.context(fetchRequest: replayVideoFetchRequest).map(\.count)
playsSubscription = playsPublisher.zip(replayVideoPublisher).map {
String(format: Constants.Strings.playsText, $0, $1)
}.receive(on: RunLoop.main).assign(to: \.text, on: self.playsLabel)
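
The difference between a publisher that completes and one that stays alive can be seen without Core Data. The following is a small sketch using plain integers. The CurrentValueSubject stands in for the context-did-save notifications and is an assumption made for illustration:

```swift
import Combine

var cancellables = Set<AnyCancellable>()

// 1. A sequence publisher emits its elements and then completes,
//    so count() delivers one value and the pipeline is finished.
var finiteCounts: [Int] = []
var finiteCompleted = false
[10, 20, 30].publisher
    .count()
    .sink(receiveCompletion: { _ in finiteCompleted = true },
          receiveValue: { finiteCounts.append($0) })
    .store(in: &cancellables)

// 2. A subject stays alive, so every later change reaches the subscriber.
//    It plays the role of the "context did save" notifications here.
let records = CurrentValueSubject<[Int], Never>([10, 20, 30])
var liveCounts: [Int] = []
records
    .map(\.count)
    .sink { liveCounts.append($0) }
    .store(in: &cancellables)

// A new record arrives after subscribing.
records.send(records.value + [40])
```

After this runs, the first pipeline has delivered a single count and completed, while the second has delivered a count for the initial array and another for the updated one.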

Inserting child records is slow in coredata

I have close to 7K items stored in a relation called Verse. I have another relation called Translation that needs to load 7K related items with a single call from a JSON file.
Here is my code:
let container = getContainer()
container.performBackgroundTask() { (context) in
autoreleasepool {
for row in translations{
let t = Translation(context: context)
t.text = (row["text"]! as? String)!
t.lang = (row["lang"]! as? String)!
t.contentType = "Verse"
t.verse = VerseDao.findById(row["verse_id"] as! Int16, context: context)
// this needs to make a call to the database to retrieve the appropriate Verse instance.
}
}
do {
try context.save()
} catch {
fatalError("Failure to save context: \(error)")
}
context.reset()
}
Code for the findById method.
static func findById(_ id: Int16, context: NSManagedObjectContext) -> Verse{
let fetchRequest: NSFetchRequest<Verse>
fetchRequest = Verse.fetchRequest()
fetchRequest.predicate = NSPredicate(format: "verseId == %d", id)
fetchRequest.includesPropertyValues = false
fetchRequest.fetchLimit = 1
do {
let results =
try context.fetch(fetchRequest)
return results[0]
} catch let error as NSError {
print("Could not fetch \(error), \(error.userInfo)")
return Verse()
}
}
This works fine until I add the VerseDao.findById, which makes the whole process really slow because it has to make a request for each object to the Coredata database.
I did everything I could by limiting the number of fetched properties and using NSFetchedResultsController for data fetching but no luck.
I wonder if there's any way to insert child records in a more efficient way? Thanks.
Assuming your persistent store type is SQLite (NSSQLiteStoreType):
The first thing you should check is whether you have a Core Data fetch index on the Verse object's verseId property. See this Stack Overflow answer for some introductory links on fetch indexes.
Without that, the fetch in your VerseDao.findById function may be scanning the whole database table every time.
To see if your index is working properly you may inspect the SQL queries generated by adding -com.apple.CoreData.SQLDebug 1 to the launch arguments in your Xcode scheme.
Other improvements:
Use NSManagedObjectContext.fetch or NSFetchRequest.execute (equivalent) instead of NSFetchedResultsController. The NSFetchedResultsController is typically used to bind results to a UI. In this case using it just adds overhead.
Don't set fetchRequest.propertiesToFetch, instead set fetchRequest.includesPropertyValues = false. This will avoid fetching the Verse object property values which you don't need to establish the relation to the Translation object.
Don't specify a sortDescriptor on the fetch request, this just complicates the query

cancel filter and sorting of big data array

I'm building a vocabulary app using Realm. I have several Vocabulary objects, each of which contains a list of words. One vocabulary contains 45,000 words.
The UI is built so that the user can search word titles by "BEGINSWITH", "CONTAINS" or "ENDSWITH", depending on which tab is selected.
Since there are several vocabularies, some words appear in more than one of them, and I need to remove these "duplicates" from the UI.
When I filter duplicates out of the resulting objects and sort them alphabetically, the app's UI freezes until the process completes.
My question is:
1) How can I cancel the previous filter and Realm filtering request if the tab changes (for example, from "Contains" to "Ends")?
2) How can I do all these filter/sorting requests in background, so UI will not freeze?
My code:
let vocabularyPredicate = NSPredicate(format: "enabled == 1 AND lang_from CONTAINS[c] %@", self.language.value)
self.vocabularies = Array(realm.objects(Vocabulary.self).filter(vocabularyPredicate).sorted(byKeyPath: "display_order"))
let result = List<Word>()
for object in self.vocabularies {
let predicate = NSPredicate(format: "title \(selectedFilter.value)[c] %@", self.query.value.lowercased())
result.append(objectsIn: object.words.filter(predicate))
}
self.words = Array(result).unique{$0.title}.sorted {
(s1, s2) -> Bool in return s1.title.localizedStandardCompare(s2.title) == .orderedAscending
}
selectedFilter.value is selected tab value: "BEGINSWITH", "CONTAINS" or "ENDSWITH"
self.query.value.lowercased() - search query.
unique{$0.title} is extension method for array
extension Array {
func unique<T:Hashable>(map: ((Element) -> (T))) -> [Element] {
var set = Set<T>() //the unique list kept in a Set for fast retrieval
var arrayOrdered = [Element]() //keeping the unique list of elements but ordered
for value in self {
if !set.contains(map(value)) {
set.insert(map(value))
arrayOrdered.append(value)
}
}
return arrayOrdered
}
}
Actually, the Realm search itself is pretty fast, but looping through the vocabularies, filtering duplicates and sorting the array of objects alphabetically freezes the UI for 1-2 seconds.
UPDATE, based on advice from EpicPandaForce and Manuel:
I looked again, and it turns out that .distinct(by: [keypath]) is already available on Results in newer versions of RealmSwift.
I have changed the filter/sorting request to
realm.objects(Word.self).filter(vocabularyPredicate).distinct(by: ["title"]).sorted(byKeyPath: "title", ascending: true)
This works better now, but I want to ensure the UI will not freeze in any case by passing objects between the background thread and the UI thread. I have updated the advised construction to:
DispatchQueue.global(qos: .background).async {
let realm = try! Realm()
let cachedWords = CashedWords()
let predicate = NSPredicate(format: "enabled == 1")
let results = realm.objects(Word.self).filter(predicate).distinct(by: ["title"]).sorted(byKeyPath: "title", ascending: true)
cachedWords.words.append(objectsIn: results)
try! realm.write {
realm.add(cachedWords)
}
let wordsRef = ThreadSafeReference(to: cachedWords)
DispatchQueue.main.async {
let realm = try! Realm()
guard let wordsResult = realm.resolve(wordsRef) else {
return
}
self.words = Array(wordsResult.words)
if ((self.view.window) != nil) {
self.tableView.reloadData()
}
}
print("data reload finalized")
}
1) How can I cancel the previous filter and Realm filtering request if the tab changes (for example, from "Contains" to "Ends")?
You could create an NSOperation to perform the task and check if it's been cancelled between each of the steps (fetch, check isCancelled, filter, check isCancelled, sort). You won't get to cancel it immediately, but it could improve your performance. It also depends on which of those three steps (fetch, filter, sort) is taking longer...
2) How can I do all these filter/sorting requests in background, so UI will not freeze?
You could run that operation inside a new NSOperationQueue.
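
A minimal sketch of such a cancellable operation is shown below. It works on plain strings rather than Realm objects so that it stays self-contained; SearchOperation and its properties are hypothetical names, and Operation is the current Swift name for NSOperation:

```swift
import Foundation

// Hypothetical operation that filters, de-duplicates and sorts titles,
// checking isCancelled between the steps so a stale search can stop early.
final class SearchOperation: Operation {
    private let words: [String]
    private let query: String
    private(set) var result: [String] = []

    init(words: [String], query: String) {
        self.words = words
        self.query = query.lowercased()
        super.init()
    }

    override func main() {
        if isCancelled { return }
        // Step 1: filter (case-insensitive "contains").
        let filtered = words.filter { $0.lowercased().contains(query) }

        if isCancelled { return }
        // Step 2: drop duplicates while keeping the first occurrence.
        var seen = Set<String>()
        let unique = filtered.filter { seen.insert($0).inserted }

        if isCancelled { return }
        // Step 3: sort alphabetically, as in the question.
        result = unique.sorted {
            $0.localizedStandardCompare($1) == .orderedAscending
        }
    }
}
```

When the tab changes, calling cancelAllOperations() on the OperationQueue before adding a new SearchOperation marks the old one as cancelled. It then returns at its next isCancelled check and leaves result empty.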
Or just use GCD, dispatch a block to a background queue, create a Realm instance in the block and run your code there, then dispatch the results back to the main queue to update the UI.
Something like this:
DispatchQueue.global(qos: .userInitiated).async {
guard let realm = try? Realm() else {
return // maybe pass an empty array back to the main queue?
}
// ...
// your code here
// ...
let words = Array(result).unique{$0.title}.sorted {
(s1, s2) -> Bool in return s1.title.localizedStandardCompare(s2.title) == .orderedAscending
}
// Can't pass Realm objects directly across threads
let wordReferences = words.map { ThreadSafeReference(to: $0) }
DispatchQueue.main.async {
// Resolve references on main thread
let realm = try! Realm()
let mainThreadWords = wordReferences.compactMap { realm.resolve($0) }
// Do something with words
self.words = mainThreadWords
}
}
Additionally, you should try to optimize your query:
let predicate = NSPredicate(format: "vocabulary.enabled == 1 AND vocabulary.lang_from CONTAINS[c] %@ AND title \(selectedFilter.value)[c] %@", self.language.value, self.query.value.lowercased())
let words = realm.objects(Word.self).filter(predicate).sorted(byKeyPath: "title")
let wordsReference = ThreadSafeReference(to: words)
// resolve this wordsReference in the main thread

Realm 1.0 How can I use thread

I would like to use Realm in my project, but I have a very complex filter and sort. I have to order the list by name, but the name is in another class.
class CustomObject: Object
{
dynamic var objectId = 0
let objectLangs = List<ObjectLang>()
}
class ObjectLang: Object
{
dynamic var objectId = 0
dynamic var name = ""
}
When I have more than 130 rows, it is very slow on the main thread and it blocks the UI. I tried doing it on a background thread, but when I wanted to update the UI, Realm crashed. So what is the right solution? How should I use it? Could you give me an example or tutorial? I have read the guidelines.
If the program crashes when you update the UI from a background thread, you should update the UI on the main thread once the Realm task has finished.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0), {
let realm = try! Realm()
//do What you need
dispatch_async(dispatch_get_main_queue(), {
//updateUI()
})
})
You can't use Realm accessors across threads. You will need to retrieve the objects on the thread on which you want to use them. To make that happen, I'd recommend for each of your object classes which need to be passed between threads to designate a property as primary key. This property might be objectId in your case.
class CustomObject: Object {
dynamic var objectId = 0
let objectLangs = List<ObjectLang>()
override class func primaryKey() -> String? {
return "objectId"
}
}
This primary key can then be used to identify your objects and pass them over to the main thread to retrieve them there again.
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0)) {
let realm = try! Realm()
var objects = realm.objects(CustomObject)
// objects = objects.filter(…)
let sortedObjects: [CustomObject] = objects.sort { /* … */ }
let ids = sortedObjects.map { $0.objectId }
dispatch_async(dispatch_get_main_queue()) {
let realm = try! Realm()
let objects = ids.map {
realm.objectForPrimaryKey(CustomObject.self, key: $0)
}
updateUIWithObjects(objects)
}
}
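
For readers on current Swift and RealmSwift, the same idea (pass primary keys across threads, then look the objects up again) might look roughly like this. It is a sketch under a few assumptions: the model is simplified so that name lives directly on CustomObject, and loadSortedObjects and its configuration parameter are hypothetical:

```swift
import Foundation
import RealmSwift

// Simplified, hypothetical model: the original keeps `name` in ObjectLang.
final class CustomObject: Object {
    @Persisted(primaryKey: true) var objectId: Int = 0
    @Persisted var name: String = ""
}

// Sorts on a background queue, then hands plain Int keys to the main queue,
// where the objects are fetched again by primary key.
func loadSortedObjects(configuration: Realm.Configuration = .defaultConfiguration,
                       completion: @escaping ([CustomObject]) -> Void) {
    DispatchQueue.global(qos: .userInitiated).async {
        autoreleasepool {
            guard let realm = try? Realm(configuration: configuration) else {
                DispatchQueue.main.async { completion([]) }
                return
            }
            // Primary keys are plain values, so they can cross threads.
            let ids: [Int] = realm.objects(CustomObject.self)
                .sorted(byKeyPath: "name")
                .map { $0.objectId }

            DispatchQueue.main.async {
                guard let mainRealm = try? Realm(configuration: configuration) else {
                    completion([])
                    return
                }
                // Lookups by primary key are cheap compared to a filtered query.
                let objects = ids.compactMap {
                    mainRealm.object(ofType: CustomObject.self, forPrimaryKey: $0)
                }
                completion(objects)
            }
        }
    }
}
```

The completion closure runs on the main queue, so it can update the UI directly with the returned objects.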