GWT-RPC requires that transfer objects to be serialized have a default (zero-argument) constructor. Similarly, final fields will not be serialized (see issue 1054).
On the other hand, I know I am supposed to "minimize mutability". My tendency is to want my TOs to be immutable, with final fields, no default constructor, and no mutators.
How can I use GWT-RPC while respecting the immutable paradigm as much as possible? Do I have to convert to a mutable object to marshal, and then back to an immutable one? Is this even worthwhile?
Item 13 in Effective Java (item 15 in second edition) gives strategies on how to minimize mutability or to favor immutability.
Suppose we remove mutators but retain non-final fields and a default constructor. The effect will be a theoretically mutable object, but a practically immutable one. Yes, one could mutate the object via reflection with a bit of effort, but by simply closing off the exposed methods we can at least discourage mutating it in cases like this where it's impractical to make the object truly immutable.
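As a minimal sketch of that compromise: the `PersonTO` class and its fields below are hypothetical, and the private no-argument constructor assumes a GWT version that accepts a non-public default constructor (worth verifying for the version in use).

```java
import java.io.Serializable;

// Hypothetical transfer object: theoretically mutable, practically immutable.
final class PersonTO implements Serializable {
    // Non-final so that the RPC serializer can populate them.
    private String name;
    private int age;

    // Zero-argument constructor for the serializer only (assumption: a
    // non-public constructor is accepted); application code cannot call it.
    @SuppressWarnings("unused")
    private PersonTO() {
    }

    PersonTO(String name, int age) {
        this.name = name;
        this.age = age;
    }

    // Accessors only; no mutators are exposed.
    String getName() { return name; }
    int getAge() { return age; }
}

class PersonTODemo {
    public static void main(String[] args) {
        PersonTO p = new PersonTO("Ada", 36);
        System.out.println(p.getName() + " " + p.getAge()); // Ada 36
    }
}
```

Callers can only go through the two-argument constructor and the getters, so the object behaves as immutable everywhere except inside the serialization machinery.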
C# 9 introduces record reference types. A record provides some synthesized methods such as a copy constructor, a clone operation, hash code calculation and comparison/equality operations. It seems convenient to me to use records instead of classes in general. Are there reasons not to do so?
It seems to me that currently Visual Studio as an editor does not support records as well as classes but this will probably change in the future.
Firstly, be aware that if it's possible for a class to contain circular references (which is true for most mutable classes) then many of the auto-generated record members can cause a stack overflow. So that's a pretty good reason to not use records for everything.
So when should you use a record?
Use a record when an instance of a class is entirely defined by the public data it contains, and has no unique identity of its own.
This means that the record is basically just an immutable bag of data. I don't really care about that particular instance of the record at all, other than that it provides a convenient way of grouping related bits of data together.
Why?
Consider the members a record generates:
Value Equality
Two instances of a record are considered equal if they have the same data (by default: if all fields are the same).
This is appropriate for classes with no behavior, which are just used as immutable bags of data. However this is rarely the case for classes which are mutable, or have behavior.
For example if a class is mutable, then two instances which happen to contain the same data shouldn't be considered equal, as that would imply that updating one would update the other, which is obviously false. Instead you should use reference equality for such objects.
Meanwhile if a class is an abstraction providing a service you have to think more carefully about what equality means, or if it's even relevant to your class. For example imagine a Crawler class which can crawl websites and return a list of pages. What would equality mean for such a class? You'd rarely have two instances of a Crawler, and if you did, why would you compare them?
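The distinction can be sketched in Java rather than C#; this is only an illustration with hypothetical types. `Point` hand-writes the kind of value equality a record synthesizes, while the mutable `Counter` keeps the default reference equality.

```java
import java.util.Objects;

// Hypothetical immutable bag of data: equality is defined by its contents.
final class Point {
    private final int x;
    private final int y;

    Point(int x, int y) { this.x = x; this.y = y; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return x == other.x && y == other.y;
    }

    @Override
    public int hashCode() { return Objects.hash(x, y); }
}

// Hypothetical mutable type: keeps the default reference equality.
final class Counter {
    private int count;
    void increment() { count++; }
    int getCount() { return count; }
}

class EqualityDemo {
    public static void main(String[] args) {
        System.out.println(new Point(1, 2).equals(new Point(1, 2))); // true

        Counter a = new Counter();
        Counter b = new Counter();
        System.out.println(a.equals(b)); // false: same data, different identity
        a.increment();
        System.out.println(a.getCount() + " " + b.getCount()); // 1 0
    }
}
```

Updating `a` leaves `b` untouched, which is why treating the two counters as equal while they happened to hold the same data would have been misleading.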
with blocks
with blocks provide a convenient way to copy an object and update specific fields. This is always safe if the object has no identity, as copying it doesn't lose any information. Copying a mutable class, however, loses the identity of the original object, as updating the copy won't update the original. As such you have to consider whether this really makes sense for your class.
ToString
The generated ToString prints out the values of all public properties. If your class is entirely defined by the properties it contains, then this makes a lot of sense. However if your class is not, then that's not necessarily the information you are interested in. A Crawler for example may have no public fields at all, but the private fields are likely to be highly relevant to its behavior. You'll probably want to define ToString yourself for such classes.
All properties of a record are public by default
All properties of a record are immutable by default
By default, I mean when using the simple record definition syntax.
Also, records can only derive from records and you cannot derive a regular class from a record.
This question is about optimizing lazy collections. I will first explain the problem and then give some thoughts for a possible solution. Questions are in bold.
Problem
Swift expects operations on Collections to be O(1). Some operations, especially prefix and suffix-like types, deviate and are on the order of O(n) or higher.
Lazy collections can't iterate through the base collection during initialization since computation should be deferred for as long as possible until the value is actually needed.
So, how can we optimize lazy collections? And of course this raises the question, what constitutes an optimized lazy collection?
Thoughts
The most obvious solution is caching. This means that the first call to a collection's method has an unfavourable time complexity, but subsequent calls to the same or other methods can possibly be computed in O(1). We trade some space complexity to the order of O(n) for faster computation.
Attempting to optimize lazy collections on structs by using caching is impossible since subscript(_ position:) and all other methods that you'd need to implement to conform to LazyCollectionProtocol are non-mutating and structs are immutable by default. This means that we have to recompute all operations for every call to a property or method.
This leaves us with classes. Classes are mutable, meaning that all computed properties and methods can internally mutate state. When we use classes to optimize a lazy collection we have two options. First, if the properties of the lazy type are variables then we're bringing ourselves into a world of hurt. If we change a property it could potentially invalidate previously cached results. I can imagine that managing the code paths needed to keep properties mutable would be headache-inducing. Second, if we use lets we're good; the state set during initialization can't be changed so a cached result doesn't need to be updated. Note that we're only talking about lazy collections with pure methods without side effects here.
But classes are reference types. What are the downsides of using reference types for lazy collections? The Swift standard library doesn't use them for starters.
Any thoughts, or ideas on different approaches?
I completely agree with Alexander here. If you're storing lazy collections, you're generally doing something wrong, and the cost of repeated accesses is going to constantly surprise you.
These collections already blow up their complexity requirements, it's true:
Note: The performance of accessing startIndex, first, or any methods that depend on startIndex depends on how many elements satisfy the predicate at the start of the collection, and may not offer the usual performance given by the Collection protocol. Be aware, therefore, that general operations on LazyDropWhileCollection instances may not have the documented complexity.
But caching won't fix that. They'll still be O(n) on the first access, so a loop like
for i in 0..<xs.count { print(xs[i]) }
is still O(n^2). Also remember that O(1) and "fast" are not the same thing. It feels like you're trying to get to "fast" but that doesn't fix the complexity promise (that said, lazy structures are already breaking their complexity promises in Swift).
Caching is a net-negative because it makes the normal (and expected) use of lazy data structures slower. The normal way to use lazy data structures is to consume them either zero or one times. If you were going to consume them more than one time, you should use a strict data structure. Caching something that you never use is a waste of time and space.
There are certainly conceivable use cases where you have a large data structure that will be sparsely accessed multiple times, and so caching would be useful, but this isn't the use case lazy was built to handle.
Attempting to optimize lazy collections on structs by using caching is impossible since subscript(_ position:) and all other methods that you'd need to implement to conform to LazyCollectionProtocol are non-mutating and structs are immutable by default. This means that we have to recompute all operations for every call to a property or method.
This isn't true. A struct can internally store a reference type to hold its cache and this is common. Strings do exactly this. They include a StringBuffer which is a reference type (for reasons related to a Swift compiler bug, StringBuffer is actually implemented as a struct that wraps a class, but conceptually it is a reference type). Lots of value types in Swift store internal buffer classes this way, which allows them to be internally mutable while presenting an immutable interface. (It's also important for CoW and lots of other performance and memory related reasons.)
Note that adding caching today would also break existing use cases of lazy:
struct Massive {
let id: Int
// Lots of data, but rarely needed.
}
// We have lots of items that we look at occasionally
let ids = 0..<10_000_000
// `massives` is lazy. When we ask for something it creates it, but when we're
// done with it, it's thrown away. If `lazy` forced caching, then everything
// we accessed would be kept forever. Also, if the values in `Massive` change over
// time, I certainly may want it to be rebuilt at this point and not cached.
let massives = ids.lazy.map(Massive.init)
let aMassive = massives[10]
This isn't to say a caching data structure wouldn't be useful in some cases, but it certainly isn't always a win. It imposes a lot of costs and breaks some uses while helping others. So if you want those other use cases, you should build a data structure that provides them. But it's reasonable that lazy is not that tool.
Swift's lazy collections are intended to provide one-off access to elements. Subsequent accesses cause redundant computation (e.g. a lazy map sequence would recompute the transform closure).
In the case where you want repeated access to elements, it's best to just slice the portion of the lazy sequence/collection you care about, and create a proper Collection (e.g. an Array) out of it.
The bookkeeping overhead of lazily evaluating and caching each element would probably be greater than the benefits.
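The recompute-on-every-access behaviour, and the advice to copy the needed slice into a strict collection, can be illustrated with a rough sketch. It is written in Java rather than Swift, and `LazyMappedList` and its `calls` counter are hypothetical names invented for the illustration, not part of any library.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical lazy view: applies `transform` on every access, caches nothing.
final class LazyMappedList<T, R> extends AbstractList<R> {
    private final List<T> base;
    private final Function<? super T, ? extends R> transform;
    int calls = 0; // counts transform invocations, for demonstration only

    LazyMappedList(List<T> base, Function<? super T, ? extends R> transform) {
        this.base = base;
        this.transform = transform;
    }

    @Override
    public R get(int index) {
        calls++;
        return transform.apply(base.get(index));
    }

    @Override
    public int size() { return base.size(); }
}

class LazyDemo {
    public static void main(String[] args) {
        LazyMappedList<Integer, Integer> squares =
                new LazyMappedList<>(Arrays.asList(1, 2, 3, 4), n -> n * n);

        squares.get(2);
        squares.get(2);
        System.out.println(squares.calls); // 2: the transform ran on both accesses

        // Materialize only the slice that will be read repeatedly.
        List<Integer> strict = new ArrayList<>(squares.subList(1, 3));
        int callsAfterCopy = squares.calls;
        strict.get(0);
        strict.get(0);
        System.out.println(squares.calls == callsAfterCopy); // true: no recomputation
    }
}
```

The copy pays for the transform once per element in the slice; after that, repeated reads hit the strict list and never touch the transform again.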
In a Scala project, should entity fields be mutable or immutable?
Mutable field:
It is very easy to change a field in a nested entity, and when logic is pushed into the entity, that logic is very easy to implement.
Immutable field:
It guarantees consistency while a single system is running, but data can still become inconsistent if more than one system is running. Also, if entity fields are immutable, updating nested fields takes a lot of boilerplate, which means that a concept like lenses has to be introduced.
Which should I choose when starting a Scala project?
Always favor immutability. Definitely in Scala, and probably in every other language too.
It's hard to give a more specific answer without a more specific question. But immutability is almost always a safe answer.
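To make the nested-update boilerplate from the question concrete, here is a small sketch in Java rather than Scala. `Customer`, `Address` and the `with...` methods are hypothetical; in Scala, the `copy` method of a case class plays the same role.

```java
// Hypothetical nested immutable entities.
final class Address {
    final String city;
    final String street;

    Address(String city, String street) {
        this.city = city;
        this.street = street;
    }

    // Returns a modified copy instead of mutating in place.
    Address withCity(String newCity) { return new Address(newCity, street); }
}

final class Customer {
    final String name;
    final Address address;

    Customer(String name, Address address) {
        this.name = name;
        this.address = address;
    }

    Customer withAddress(Address newAddress) { return new Customer(name, newAddress); }
}

class NestedUpdateDemo {
    public static void main(String[] args) {
        Customer before = new Customer("Ann", new Address("Oslo", "Main St"));
        // Each level of nesting needs its own copy step.
        Customer after = before.withAddress(before.address.withCity("Bergen"));
        System.out.println(before.address.city + " -> " + after.address.city); // Oslo -> Bergen
    }
}
```

The cost is one copy step per nesting level, which is the boilerplate that lenses are meant to hide; the benefit is that `before` stays valid and unchanged for anyone still holding it.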
I read in this answer, A generic list of anonymous class, how to load a list with anonymous class objects. My question is why and when it is advisable to use this approach instead of a struct, considering performance and good practices.
An exposed-field structure is essentially a group of variables bound together with duct tape. It won't behave as an "object", and may thus be seen as evil by those who think everything should behave like an object; nonetheless, in cases where one doesn't really want an object, but rather a group of variables bound together with duct tape, an exposed-field structure may be a perfect fit.
Anonymous classes have only a few advantages over exposed-field structures:
The syntax to declare them is at least slightly smaller; depending upon coding standards, it may be a lot smaller. If coding standards will allow one to write internal struct WeightAndVolume { public double weight, volume;} and say that the struct is "self-explanatory" [it contains two public fields of type double, named weight and volume, each of which will hold whatever was last written to it by outside code], anonymous classes won't save much, but if coding standards would require that every named data type have many pages of associated documentation, including an analysis of required unit-test procedures, anonymous classes could avoid such hassle.
Copying class references is slightly cheaper than copying structures larger than 8 bytes, though unless a reference would be copied many times, the cost of creating the object will outweigh any savings in copying.
Casting an anonymous class to Object is much cheaper than casting a struct. The first time an anonymous class instance gets cast to Object will make up for the extra costs of creating it. Every additional time will represent a savings of that amount.
Passing a structure to a generic method will require the JITter to produce a specialized version of the code for that type; by contrast, the JITter would only have to produce one piece of code to handle all anonymous classes.
In general, structures will work better than anonymous classes. On the other hand, there are a few scenarios (mostly related to the third point above) where classes can end up being much better.
I wouldn't say it is ever recommended to use anonymous classes, in the sense that it's never wrong to not use them. But they typically get used when
it's a one-shot job, for which creating a proper named type would be cumbersome, and
the consumer of the objects is either compiler-generated code (you don't have access to the types backing those anonymous classes, but the compiler does) or uses reflection (in which case you don't need access to the types at compile time)
The most common scenario where this occurs is in LINQ queries.
I'm writing an app that basically uses 5 business entities: A, B, C, D and E
A has some properties and holds a list of B's
B has some other properties and a list of C's and a list of D's
C has some other properties and a list of D's and a list of E's
D has only a few properties
E has only a few properties
There is no inheritance between any of them.
There's no real business logic involved, the objects are created, populated, and then accessed read-only, no further manipulations.
My natural coding style would be to go object oriented and write classes for each of those entities, use NSArrays for the lists, and have the mentioned properties synthesized.
It would make the code readable.
But another approach seems obvious too: only use NSDictionaries and NSArrays, and work with keys/values instead of properties. This seems more efficient, and somehow "closer" to iPhone-style programming to me... but obviously leads to less readable code. Another advantage is that no additional custom encoding/decoding is required for serialization (persisting state to disk, using JSON, ...)
So on paper, everything speaks for the latter approach; on the other hand, it still feels somehow awkward NOT to use custom objects...
Is this really just a matter of taste? Or are there maybe other arguments in favour of/against one of the approaches? Is only using dictionaries better memory/performance-wise? Is it the preferred "Apple Coding Style"? (I'm coming from Java/C#).
I don't see much difference between Java/C# and Cocoa in this area. Your question is equivalently applicable to those platforms as well (the same also applies to key-value stores and relational stores).
In an object-oriented environment, you have to make a trade-off between the flexibility of the key-value approach for storing data and the structured, object-oriented style. I'd go with the key-value approach only when I need the flexibility (e.g. the structure is dynamic and might be changed by the user, or is not known at compile time). Otherwise, taking that route might take you completely away from OOP conventions and benefits. (By the way, this is the important point: is the hassle of sticking to object-oriented principles worth it in that specific circumstance? I think your question reduces to this one, and to answer it you should analyze your specific situation.)
It largely depends on whether your objects are just collections of data (key/value pairs) or implement their own functionality.
If they're data I'd say go with NSDictionary, it's a lot less code and as you point out you won't have to write serialization routines for each class.
Use a hybrid approach. Store the dictionaries the objects are based on, but expose the most-used values as properties that are either filled when the object is initialized from a dictionary, or have the accessors look into the dictionary for values (less efficient).
Also provide a property to get at the dictionary. This way if you need to propagate a new value quickly to a specific area of the code from the dictionary (presumably a new value added by the server) you have that flexibility. Then if callers are making heavy use of a value you can migrate it to be a true property and get the completion and type checking of a property.
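A rough sketch of this hybrid, written in Java rather than Objective-C (a `Map` standing in for `NSDictionary`). The `Product` class and the `"name"`/`"price"` keys are hypothetical and chosen only for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity backed by the dictionary it was built from.
final class Product {
    private final Map<String, Object> values;
    private final String name; // heavily used value, promoted to a typed property

    Product(Map<String, Object> source) {
        // Defensive, read-only copy of the backing dictionary.
        this.values = Collections.unmodifiableMap(new HashMap<>(source));
        this.name = (String) values.get("name");
    }

    // Filled once at construction: cheap and type-checked.
    String getName() { return name; }

    // Less-used value: looked up in the dictionary on each call.
    Object getPrice() { return values.get("price"); }

    // Escape hatch for keys that have no dedicated property yet.
    Map<String, Object> getValues() { return values; }
}

class ProductDemo {
    public static void main(String[] args) {
        Map<String, Object> json = new HashMap<>();
        json.put("name", "Widget");
        json.put("price", 9.5);
        json.put("newServerField", "beta");

        Product p = new Product(json);
        System.out.println(p.getName());                           // Widget
        System.out.println(p.getValues().get("newServerField"));   // beta
    }
}
```

A key that callers start using heavily can later be promoted from `getValues()` lookups to a typed field like `name`, without changing how the object is constructed.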