What is the most efficient way to get the overall max speed across all Customers/Cars?
// 157087 Entries (already filtered)
class Customer: Object {
    //...
    let cars = LinkingObjects(fromType: Car.self, property: "customer")
}
// 2537950 Entries for the 157087 filtered Customers
class Car: Object {
    //...
    @objc dynamic var customer: Customer?
}
// results here are already filtered (for simplicity I've left the additional filters out)
var results: Results<Customer> = realm.objects(Customer.self).filter("age > 18")
// takes several seconds
let maxSpeed = results.map { $0.cars.max(ofProperty: "speed") as Double? }.max() ?? 0
Is there a better way to do this? Just for benchmarking, I tried it the other way around; it takes just as long:
// need an `IN` clause because of the pre-filtered results (see above)
let cars: Results<Car> = realm.objects(Car.self).filter("customer IN %@", results)
let maxSpeed = cars.max(ofProperty: "speed") as Double?
Super question; clear, concise and properly formatted.
I don't have a 2-million-row dataset, but here are a few things that may or may not add up to a complete answer:
1)
This could be bad
results.map { $0.cars.max(ofProperty: "speed") as Double? }.max()
Realm objects are lazily loaded, so as long as you work with them in a Realm-ish way there won't be a significant memory impact. However, as soon as Swift's filter, map, reduce, etc. are used, ALL of the objects are loaded into memory, which could overwhelm the device. Considering the size of your dataset, I would avoid this option.
2)
This is a good strategy, as shown in your question: because it treats everything as Realm objects, memory won't be affected, and it is probably your best solution.
let max: Double = realm.objects(Car.self).max(ofProperty: "speed") ?? 0.0 //safely handle optional
print(max)
3)
One option we've used in some use cases is ordering the objects and then grabbing the last one. Again, it's safer, and we generally do this on a background thread so it doesn't affect the UI (a sketch follows the code below).
if let lastOne = realm.objects(Car.self).sorted(byKeyPath: "speed").last {
    print(lastOne.speed)
}
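A minimal sketch of running option 3 off the main thread, assuming the default Realm configuration (Realm objects can't cross threads, so only the resulting Double is handed back to the main queue):
import Foundation
import RealmSwift

DispatchQueue.global(qos: .userInitiated).async {
    // open a fresh Realm on the background queue
    let backgroundRealm = try! Realm()
    let topSpeed = backgroundRealm.objects(Car.self)
        .sorted(byKeyPath: "speed")
        .last?.speed ?? 0.0
    DispatchQueue.main.async {
        // hand only the plain Double back to the UI
        print("max speed: \(topSpeed)")
    }
}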
4)
One other thought: add a speed_range property to the object and, when the object is initially written, update that property as well.
class Car: Object {
    @objc dynamic var speed = 0.0
    @objc dynamic var speed_range = 0
}
where speed_range is:
0: 0-49
1: 50-99
2: 100-149
3: 150-199
4: 200-249
etc
That would let you quickly narrow the results, which would dramatically improve the filter speed. Here we want the fastest car in speed_range 4 (200-249):
let max: Double = realm.objects(Car.self).filter("speed_range == 4").max(ofProperty: "speed") ?? 0.0
print(max)
You could add a computed property, backed by a persisted Realm property, to the Car object that automatically sets speed_range to the correct bucket when the speed is initially set. A rough sketch follows below.
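One possible shape for that idea (the storedSpeed name is an assumption for this sketch; Realm ignores the computed property, and the two persisted properties stay in sync through the setter):
class Car: Object {
    // persisted storage backing the public speed property (name assumed)
    @objc dynamic var storedSpeed = 0.0
    @objc dynamic var speed_range = 0

    // non-persisted computed property that keeps speed_range in step
    var speed: Double {
        get { return storedSpeed }
        set {
            storedSpeed = newValue
            speed_range = Int(newValue / 50.0) // 0: 0-49, 1: 50-99, 2: 100-149, ...
        }
    }
}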
Related
In the code below, I like the simplicity of calling results.setValue, rather than iterating over an array and calling it a bunch of times.
It seems likely that there is some way to do this without iterating over an array, but I guess really my concern is that it appears to be significantly slower to iterate. In my testing, for 500 results, it took just under 3 milliseconds to update them in bulk, vs. 537 milliseconds to iterate.
Seems like there has got to be a built in way to set the value for a subset of the results. Limiting the results by a count doesn't appear to be supported, due to the lazy nature, but I don't see any simple way to update them in bulk. I could order them by a unique field, and get the 500th and then filter I suppose to get a new result set, but seems like there should be a better way to do it.
var results = realm.objects(CloudUpdate.self).filter("status = %@", "queued")
let limitedResults = results[0..<500]
try! realm.write {
    // this works, except it sets all the results to posted
    // results.setValue("posted", forKey: "status")
    // I'd like to be able to do
    // limitedResults.setValue("posted", forKey: "status")
    // or something rather than iterate as below
    // -- note that limitedResults gets smaller as we set them
    // because of the filter on the results.
    while limitedResults.count > 0 {
        limitedResults[0].setValue("posted", forKey: "status")
    }
}
Realm can do that update using key paths of the object, and the performance over large datasets is very good.
The use case isn't completely clear, but if you have a dataset and want to update the first X objects, this will do it.
Given a person class with a name property
class PersonClass: Object {
    @Persisted var name = ""
}
and you want to update the first three names to... "Jay"
let peopleResults = realm.objects(PersonClass.self)
let peopleList = RealmSwift.List<PersonClass>()
peopleList.append(objectsIn: peopleResults[0...2]) //see note
try! realm.write {
    peopleList.setValue("Jay", forKey: "name")
}
Keep in mind though, as soon as realm objects are manipulated by high level Swift functions, the performance will degrade and more importantly, those objects are no longer lazily loaded - they are all loaded into memory and could potentially overwhelm the device.
One other thing to note is that Realm has no pre-defined ordering, so make sure the results are .sorted(byKeyPath:) if you want to update objects in a specific order; a sketch follows below.
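For example, a sketch that combines both points for the original CloudUpdate case (the "createdAt" key path is an assumption; any stable sort key will do):
let ordered = realm.objects(CloudUpdate.self)
    .filter("status == %@", "queued")
    .sorted(byKeyPath: "createdAt")        // pin down an explicit order first

let batch = RealmSwift.List<CloudUpdate>()
batch.append(objectsIn: ordered[0..<Swift.min(500, ordered.count)])

try! realm.write {
    batch.setValue("posted", forKey: "status")
}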
// definition: Company { members: [Member { activities: [Activity { times: Int }] }] }
struct Company {
    var bossId: Int = 0
    var members: [Int: Member] = [:]
    var boss: Member? {
        get { return members[bossId] }
        set { members[bossId] = newValue }
    }
}

struct Member {
    var name: String
    var activities: [Activity]
}

struct Activity {
    var type: String
    var times: Int
}
// init
var company = Company()
company.members[0] = Member(name: "John", activities: [Activity(type: "Walk", times: 0)])
company.members[1] = Member(name: "Sean", activities: [Activity(type: "Run", times: 0)])
I have a computed property in my top structure; boss returns one of the members by id. When I want to update the properties of some leaf members, the compiler complains it needs a setter, so I added a members[bossId] = newValue setter for it. It seems that when the value is mutated, a new updated Member is assigned to replace the original Member, because it is a struct instead of a class.
Does that cause a whole copy of the boss: Member struct even though I just need to update one Int property of the member?
Should I worry about this redundant copy, or is the compiler smart enough to minimize the impact?
// use
company.boss?.activities[0].times = 1
company.boss?.activities[0].times = 2
company.boss?.activities[0].times = 3 // <- Does it cause 3 times whole `Member` copy?
Keep in mind that Swift structs are extremely cheap to copy - it's just a matter of writing values to a memory location. It's similar to assigning Int values, you don't get a performance penalty if you pass around the value 5.
This of course assumes the struct is not huge, like having hundreds of fields, in which case the actual copy can pose some performance issues, though still less than operations that involve heap allocations.
Getting back to your case, your three structs are not big at all:
print("Company:", MemoryLayout<Company>.size) // 16
print("Member:", MemoryLayout<Member>.size) // 24
print("Activity:", MemoryLayout<Activity>.size) // 24
print("Array<Activity>:", MemoryLayout<[Activity]>.size) // 8
So, changing properties on one of those structs is a matter of rewriting 8 to 24 bytes, which happens really fast.
So, with the above in mind, let's break down what happens when you write something like
company.boss?.activities[0].times = 2
This is, roughly, what happens under the hood:
1. activities[0].times = 2 results in Array updating the times value for the first item in the array
2. the company.boss?.activities property is replaced by a new instance of Array<Activity>, which means rewriting 8 bytes (the size of the Array struct)
3. the company.boss property gets replaced with the values after the update from #2, which means a write of 24 bytes
4. finally, the company itself is rewritten, 16 more bytes
So, in total, the rewrites amount to less than 50 bytes, and it could be far less depending on how much the compiler optimizes the memory writes. Even if this happens 3 times, like in your example, it is still incredibly fast.
P.S. activities[0].times = 2 could result in another heap allocation, which would have a bigger performance impact, if the copy-on-write mechanism comes into play. However, this is bound to happen regardless of how you structure your structs; a small illustration follows below.
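A tiny illustration of that copy-on-write point, using the Activity type from the question (the values are made up):
let a = [Activity(type: "Walk", times: 0)]
var b = a                       // no copy yet; both arrays share the same buffer
b[0].times = 1                  // the shared buffer is copied here, then mutated
print(a[0].times, b[0].times)   // prints "0 1" -- a is untouched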
I'm learning Swift and I'm facing a problem in one of my model classes.
What I'm trying to do is have a lazy-loaded property that can be "invalidated" when the data it is based on changes (something like this: https://stackoverflow.com/a/25954243/2382892).
What I have now is something like this:
class DataSet {
    var entries: [Entry]
    var average: Double? {
        return self.entries.average
    }
}
Array<Entry>.average computes the average of a property of Entry, but returns nil if the array is empty.
Since this average could potentially be expensive to compute, I'd like to do that lazily and store a cached value of it and recalculate it only if necessary (when DataSet.entries is modified).
Now, following this answer, I should do something like this:
class DataSet {
    var entries: [Entry]
    var _average: Double?
    var average: Double? {
        if _average == nil {
            _average = self.entries.average
        }
        return _average
    }
}
However, I have the problem of handling both the case where the average needs to be recomputed and the case where the array is empty and there's no meaningful average to return.
Since I know the average value will always be positive, I could use a default value (such as -1.0) to indicate the cache is no longer valid and nil to mean there are no entries, or vice versa (nil means the average has to be computed again; -1.0 when there are no entries).
This however doesn't seem elegant, or the "Swift way" to achieve this behavior (or is it indeed?).
What should I do?
You definitely shouldn't use magic values like -1 to indicate some state. But I agree that you should not use nil to indicate both that "the cached value is invalidated and must be recalculated" and that "the cached average has been calculated and is nil because there were zero Entry objects". The problem with the other solutions that suggest just setting the cached value to nil is that they can't distinguish between the two states of "invalidated" and "calculated but nil", and they may call entries.average even though you may have already done so. Admittedly, this is likely not computationally expensive, but it is conflating two very different states.
One Swift solution would be an enum:
class DataSet {
    private enum CachedValue {
        case Calculated(Double?)
        case Invalidated
    }
    var entries = [Entry]() {
        didSet {
            cachedAverage = .Invalidated
        }
    }
    private var cachedAverage: CachedValue = .Invalidated
    var average: Double? {
        switch cachedAverage {
        case .Calculated(let result):
            return result
        case .Invalidated:
            let result = entries.average
            cachedAverage = .Calculated(result)
            return result
        }
    }
}
This captures a distinct state between .Invalidated and .Calculated and recalculates lazily as needed.
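For illustration, usage might look like this (it assumes Entry can be created with a value and that the Array<Entry>.average extension from the question exists):
let dataSet = DataSet()
print(dataSet.average as Any)               // computed once (nil for an empty array), then cached
dataSet.entries.append(Entry(value: 4.0))   // didSet marks the cache .Invalidated
print(dataSet.average as Any)               // recomputed on the next access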
You can make use of didSet:
class DataSet {
    var entries: [Entry] = [] {
        didSet {
            /// just mark the average as outdated; it will be
            /// recomputed when someone asks for it again
            averageOutOfDate = true
        }
    }
    /// tells us if we should recompute; true by default
    var averageOutOfDate = true
    /// cached value that avoids the expensive computation
    var cachedAverage: Double? = nil
    var average: Double? {
        if averageOutOfDate {
            cachedAverage = self.entries.average
            averageOutOfDate = false
        }
        return cachedAverage
    }
}
Basically whenever the entries property value changes, you mark the cached value as outdated, and you use this flag to know when to recompute it.
appzYourLife's answer is a nice concrete solution. However, I suggest a different approach altogether. Here's what I would do:
I would make a protocol that defines all the important bits that need to be accessed externally.
I would then make 2 structs/classes conform to this protocol.
The first struct/class will be a caching layer.
The second struct/class will be the actual implementation, which will only ever be accessed by the caching layer.
The caching layer will have a private instance of the actual implementing struct/class, and will have a variable such as "isCacheValid", which will be set to false by all mutating operations that invalidate the underlying data (and by extension, computed values such as the average).
This design makes it so that the actual implementing struct/class is fairly simple, and completely agnostic to the caching.
The caching layer does all the caching duties, completely agnostic to the details of how the cached values are computed (their computation is just delegated to the implementing class). A rough sketch of this layering follows below.
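A rough sketch of that layering, with all names invented for illustration; it uses classes for simplicity and assumes the Entry type and the entries.average extension from the question:
protocol AveragingDataSet {
    var average: Double? { get }
    func add(_ entry: Entry)
}

// The actual implementation: simple, and completely unaware of caching.
final class PlainDataSet: AveragingDataSet {
    private var entries: [Entry] = []
    var average: Double? { return entries.average }   // potentially expensive on every call
    func add(_ entry: Entry) { entries.append(entry) }
}

// The caching layer: owns a private PlainDataSet and invalidates its cache
// on every mutating operation.
final class CachingDataSet: AveragingDataSet {
    private let wrapped = PlainDataSet()
    private var cachedAverage: Double?
    private var isCacheValid = false

    var average: Double? {
        if !isCacheValid {
            cachedAverage = wrapped.average
            isCacheValid = true
        }
        return cachedAverage
    }

    func add(_ entry: Entry) {
        wrapped.add(entry)
        isCacheValid = false
    }
}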
Use nil
First of all, never use a specific value from your type's domain to indicate the absence of a value. In other words, don't use a negative number to indicate no value; nil is the answer here.
So average should be declared as Double?.
Clearing the cache when entries changes
Next, you need to clear your cache each time entries is mutated. You can use didSet for this.
class DataSet {
    private var entries: [Entry] = [] {
        didSet { cachedAverage = nil }
    }
    private var cachedAverage: Double?
    var average: Double? {
        if cachedAverage == nil {
            cachedAverage = self.entries.average
        }
        return cachedAverage
    }
}
When entries is empty
Finally, if you believe that the average of an empty array should be nil, why don't you change the average computed property you added to SequenceType accordingly? A sketch of what that might look like follows below.
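For reference, a sketch of what that extension could look like; it assumes Entry exposes a Double value property and is written against the newer Collection protocol rather than SequenceType:
extension Collection where Element == Entry {
    /// nil for an empty collection, otherwise the mean of the entries' values.
    var average: Double? {
        guard !isEmpty else { return nil }
        let total = reduce(0.0) { $0 + $1.value }
        return total / Double(count)
    }
}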
I am really struggling with the fact that someData[start...stop] returns a MutableRandomAccessSlice. My someData was a let to begin with, so why would I want a Mutable thing? Why don't I just get a RandomAccessSlice? What's really frustrating, though, is that it returns a thing that is pretty API-incompatible with the original source. With a Data, I can use .withUnsafeBytes, but not so with this offspring. And how you turn the Slice back into a Data isn't clear either; there is no init that takes one of those.
I could use the subdata(in:) method instead of subscripting, but then what's the point of the subscript if I only ever want a sub-collection representation that behaves like the original collection? Furthermore, the subdata method can only take half-open ranges, while the subscript can take both closed and half-open ranges. Is this just something they haven't quite finished up for Swift 3 final yet?
Remember that the MutableRandomAccessSlice you get back is a value type, not a reference type. It just means you can modify it if you like, but it has nothing to do with the thing you sliced it out of:
let x = Data(bytes: [1,2,3]) // <010203>
var y = x[0...1]
y[0] = 2
x // <010203>
If you look in the code, you'll note that the intent is to return a custom slice type:
public subscript(bounds: Range<Index>) -> MutableRandomAccessSlice<Data> {
    get {
        return MutableRandomAccessSlice(base: self, bounds: bounds)
    }
    set {
        // Ideally this would be:
        //   replaceBytes(in: bounds, with: newValue._base)
        // but we do not have access to _base due to 'internal' protection
        // TODO: Use a custom Slice type so we have access to the underlying data
        let arrayOfBytes = newValue.map { $0 }
        arrayOfBytes.withUnsafeBufferPointer {
            let otherData = Data(buffer: $0)
            replaceBytes(in: bounds, with: otherData)
        }
    }
}
That said, a custom slice will still not be acceptable to a function that takes a Data. That is consistent with other types, though, like Array, which slices to an ArraySlice which cannot be passed where an Array is expected. This is by design (and likely is for Data as well for the same reasons). The concern is that a slice "pins" all of the memory that backs it. So if you took a 3 byte slice out of a megabyte Data and stored it in an ivar, the entire megabyte has to hang around. The theory (according to Swift devs I spoke with) is that Arrays could be massive, so you need to be careful with slicing them, while Strings are usually much smaller, so it's ok for a String to slice to a String.
In my experience so far, you generally want subdata(in:). My experimentation with it is that it's very similar in speed to slicing, so I believe it's still copy on write (but it doesn't seem to pin the memory either in my initial tests). I've only tested on Mac so far, though. It's possible that there are more significant performance differences on iOS devices.
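For example, a sketch of the subdata(in:) route (Swift 3-era Data API, as used in the question; the buffer contents are arbitrary):
let big = Data(bytes: Array(repeating: 0xFF, count: 1_000_000))
let small = big.subdata(in: 0..<3)           // an independent 3-byte Data; big is not pinned
small.withUnsafeBytes { (p: UnsafePointer<UInt8>) in
    print(p[0], p[1], p[2])                  // works, because small is a real Data
}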
Based on Rob's comments, I just added the following pythonesque subscript extension:
extension Data {
    subscript(start: Int?, stop: Int?) -> Data {
        // Negative indices count from the end, Python-style;
        // nil means "from the beginning" / "to the end".
        var front = 0
        if let start = start {
            front = start < 0 ? Swift.max(self.count + start, 0) : Swift.min(start, self.count)
        }
        var back = self.count
        if let stop = stop {
            back = stop < 0 ? Swift.max(self.count + stop, 0) : Swift.min(stop, self.count)
        }
        if front >= back {
            return Data()
        }
        // copy the bytes out as an independent Data
        let range = Range(front..<back)
        return self.subdata(in: range)
    }
}
}
That way I can just do
let input = Data(bytes: [0x60, 0x0D, 0xF0, 0x0D])
input[nil, nil] // <600df00d>
input[1, 3] // <0df0>
input[-2, nil] // <f00d>
input[nil, -2] // <600d>
I have multiple scenes that are stored in Global/BaseScene
Each SceneType is stored as an enum set to an Integer, similar to the present scene.
The first objective was to get the score to populate in each scene; there are seven of them. DONE, thanks to Stack Overflow and experimentation.
I had to create defaults keys for each individual scene to track the high score, so I have seven unique "highScore" keys, one per scene.
In the GameScene:
var highScoreA1: Int = 0
var score: Int = 0 {
    didSet {
        a1Score.text = "Score: \(score)"
    }
}
//above called before override func didMoveToView(view: SKView) {
//Called in GameOver Method
let scoreDefault = NSUserDefaults.standardUserDefaults()
scoreDefault.setInteger(score, forKey: "score")
if score > highScoreA1 {
    let highScoreA1Default = NSUserDefaults.standardUserDefaults()
    highScoreA1Default.setInteger(score, forKey: "highScoreA1")
    //highScoreA1Default.synchronize()
}
There are six more keys similar to this. My objective is to populate a "totalScoreKey" in two different scenes: a HUD scene and another scene (possibly the game over scene).
I was thinking of a function that adds these keys' values together to populate the total score.
Take into consideration that all these scenes are subclasses of the global BaseScene, and each scene has subclasses (for the node operations; probably not relevant, but I thought it might be useful).
I have tried moving all the score data into a class using NSCoding/NSObject; the required init and optional binding became a serious pain, and honestly I am trying to keep things simple for version one.
import Foundation

class GameState {
    var score: Int = 0
    var highScore: Int = 0
    var totalScore: Int = 0

    class var sharedInstance: GameState {
        struct Singleton {
            static let instance = GameState()
        }
        return Singleton.instance
    }

    init() {
        // Init
        score = 0
        highScore = 0
        totalScore = 0
        // Load game state
        let defaults = NSUserDefaults.standardUserDefaults()
        highScore = defaults.integerForKey("highScore")
    }

    func saveState() {
        // Update highScore if the current score is greater
        highScore = max(score, highScore)
        // Store in user defaults
        let defaults = NSUserDefaults.standardUserDefaults()
        defaults.setInteger(highScore, forKey: "highScore")
        defaults.setInteger(totalScore, forKey: "totalScore")
        NSUserDefaults.standardUserDefaults().synchronize()
    }
}
I've tried various functions that have not worked; they all default to zero, at least until I figure out how to retrieve the data properly.
let totalScore = "totalScoreKey"
similar to this post (exactly like this post, actually), except I had to use a different configuration because of my own personal setup.
This is the totalScore example I tried to implement outside of the class, referring to it in the scene where I needed to populate that data. No go; it defaults to zero.
How do I simply add the values of those keys together, so I can display the total similar to the other scenes I already have implemented?
Later on down the road I may want to assign a keychain value; right now I am just trying to get it to show up for posting to Game Center (which also has the key "GameCenterHighScore").
Setting them all to the same key "highScore" does not work; just to be clear, I tried multiple times. Thanks in advance.
EDIT
If I try to add all the defaults together to get the total, it throws the following error:
Expression was too complex to be solved in reasonable time; consider
breaking up the expression into distinct sub-expressions
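For what it's worth, one way around that error is to do exactly what the compiler suggests: read each key in its own small expression and accumulate step by step instead of adding everything in one line. A sketch using the Swift 2-era NSUserDefaults API (the key names beyond "highScoreA1" are assumptions):
let defaults = NSUserDefaults.standardUserDefaults()
let sceneKeys = ["highScoreA1", "highScoreA2", "highScoreA3", "highScoreA4",
                 "highScoreA5", "highScoreA6", "highScoreA7"]

var totalScore = 0
for key in sceneKeys {
    totalScore += defaults.integerForKey(key)   // each read is its own small sub-expression
}

defaults.setInteger(totalScore, forKey: "totalScoreKey")
defaults.synchronize()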