Should entity fields be mutable or immutable? - Scala

In a Scala project, should entity fields be mutable or immutable?
Mutable fields:
It is very easy to change a field in a nested entity, and when logic is pushed into the entity, that logic is very easy to implement.
Immutable fields:
They guarantee consistency while a single system is running, but the data may still become inconsistent if more than one system is running. Also, if entity fields are immutable, updating nested fields takes a lot of boilerplate, which means a concept like lenses has to be introduced.
Which should I choose when starting a Scala project?

Always favor immutability. Definitely in Scala, and probably in every other language too.
It's hard to give a more specific answer without a more specific question. But immutability is almost always a safe answer.
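To make the boilerplate concern from the question concrete, here is a minimal sketch of updating a nested immutable entity with copy, and of a small hand-rolled lens that hides the nesting. The types Order, Customer and Address, and the Lens class itself, are hypothetical and only for illustration; a real project would more likely use a lens library than this hand-written one.

```scala
// Hypothetical domain types, used only for illustration.
final case class Address(street: String, city: String)
final case class Customer(name: String, address: Address)
final case class Order(id: Long, customer: Customer)

// A minimal hand-rolled lens: a getter plus a copying setter.
final case class Lens[S, A](get: S => A, set: (S, A) => S) {
  // Apply a function to the focused value and return an updated copy.
  def modify(s: S)(f: A => A): S = set(s, f(get(s)))

  // Compose with a lens that focuses further into A.
  def andThen[B](next: Lens[A, B]): Lens[S, B] =
    Lens[S, B](
      s => next.get(get(s)),
      (s, b) => set(s, next.set(get(s), b))
    )
}

object Lenses {
  val orderCustomer: Lens[Order, Customer] =
    Lens[Order, Customer](_.customer, (o, c) => o.copy(customer = c))
  val customerAddress: Lens[Customer, Address] =
    Lens[Customer, Address](_.address, (c, a) => c.copy(address = a))
  val addressCity: Lens[Address, String] =
    Lens[Address, String](_.city, (a, city) => a.copy(city = city))

  // One composed lens replaces the nested copy calls at every call site.
  val orderCity: Lens[Order, String] =
    orderCustomer.andThen(customerAddress).andThen(addressCity)
}
```

Without the lens, the same update reads order.copy(customer = order.customer.copy(address = order.customer.address.copy(city = "Bergen"))). With it, the call is Lenses.orderCity.set(order, "Bergen"), and the original order is left untouched in both cases.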

Related

What are the disadvantages of using records instead of classes?

C# 9 introduces record reference types. A record provides some synthesized methods such as a copy constructor, a clone operation, hash code calculation, and comparison/equality operations. It seems convenient to me to use records instead of classes in general. Are there reasons not to do so?
It seems to me that currently Visual Studio as an editor does not support records as well as classes but this will probably change in the future.
Firstly, be aware that if it's possible for a class to contain circular references (which is true for most mutable classes), then many of the auto-generated record members can cause a stack overflow. So that's a pretty good reason not to use records for everything.
So when should you use a record?
Use a record when an instance of a class is entirely defined by the public data it contains, and has no unique identity of its own.
This means that the record is basically just an immutable bag of data. I don't really care about that particular instance of the record at all, other than that it provides a convenient way of grouping related bits of data together.
Why?
Consider the members a record generates:
Value Equality
Two instances of a record are considered equal if they have the same data (by default: if all fields are the same).
This is appropriate for classes with no behavior, which are just used as immutable bags of data. However this is rarely the case for classes which are mutable, or have behavior.
For example if a class is mutable, then two instances which happen to contain the same data shouldn't be considered equal, as that would imply that updating one would update the other, which is obviously false. Instead you should use reference equality for such objects.
Meanwhile if a class is an abstraction providing a service you have to think more carefully about what equality means, or if it's even relevant to your class. For example imagine a Crawler class which can crawl websites and return a list of pages. What would equality mean for such a class? You'd rarely have two instances of a Crawler, and if you did, why would you compare them?
with blocks
with blocks provide a convenient way to copy an object and update specific fields. However, this is only guaranteed to be safe if the object has no identity, as copying it then doesn't lose any information. Copying a mutable class loses the identity of the original object, as updating the copy won't update the original. As such, you have to consider whether this really makes sense for your class.
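The same distinction can be sketched in Scala, where a case class plays roughly the role of a C# record (generated value equality and a copy method) and a plain class keeps reference equality. This is an analogy under that assumption, not a description of C# itself; Point and Counter are hypothetical names.

```scala
// Immutable bag of data: equals, hashCode and copy are generated.
final case class Point(x: Int, y: Int)

// Mutable object with its own identity: a plain class keeps reference equality.
final class Counter(var count: Int)

object EqualityDemo {
  val p1 = Point(1, 2)
  val p2 = Point(1, 2)
  // Same data means equal, which is fine because neither can change.
  val pointsEqual: Boolean = p1 == p2

  val c1 = new Counter(0)
  val c2 = new Counter(0)
  // Same data, but distinct objects: updating c1 would not update c2.
  val countersEqual: Boolean = c1 == c2

  // copy is the rough counterpart of a with block: a new value, original untouched.
  val moved: Point = p1.copy(x = 5)
}
```

Here pointsEqual is true and countersEqual is false, which matches the argument above: value equality suits data with no identity, reference equality suits mutable objects.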
ToString
The generated ToString prints out the values of all public properties. If your class is entirely defined by the properties it contains, then this makes a lot of sense. However if your class is not, then that's not necessarily the information you are interested in. A Crawler for example may have no public fields at all, but the private fields are likely to be highly relevant to its behavior. You'll probably want to define ToString yourself for such classes.
All properties of a record are public by default
All properties of a record are immutable by default
By "by default", I mean when using the simple (positional) record definition syntax.
Also, records can only derive from records and you cannot derive a regular class from a record.

Problem with boundaries for different aggregates

I have a problem with the boundaries of aggregates. I was trying to read about aggregates, aggregate roots, and boundaries, looking for some code examples but I still struggle with it.
The app that I'm working on is an app to manage architecture projects.
Among the screens in the app there will be a screen with all details for the selected project, and one with all jobs for the selected constructor.
I have one AggregateRoot - ArchitectureProject. It has an Architect, Stages, etc., and it has a list of ConstructorJobs (as they have to appear on the screen with the project details). A ConstructorJob has a name, some value, and a Constructor. A Constructor can have some ConstructorType. To me, Constructor is another AggregateRoot. I have a problem with ConstructorJob. Where should I place it? What should be responsible for managing it?
I was trying to think about what cannot exist without what: a ConstructorJob cannot exist without a Project, but on the other hand it has to have a Constructor as well...
I can't imagine that Constructor would belong to the Project aggregate, as ConstructorType would be a 4th-level child of it, so searching for all constructors of that type would be painful, wouldn't it?
I would appreciate any explanation, how to handle such cases.
I think you are missing an important rule which usually makes your life a lot easier:
Rule: Reference Other Aggregates by Identity
See also Vaughn Vernon's Book Implementing Domain-Driven Design, chapter 10 - Aggregates.
It is important to note that Aggregates in the sense of domain-driven design are not so much focused on if the existence of one aggregate makes sense without the other. It is more about transactional boundaries. So an aggregate should create a boundary around elements that should only change together within the same transaction - to adhere to consistency.
So I guess that you will change your Project in different use cases than those in which you would change the Constructor, which I guess can be referenced by different projects.
This means you should reference other aggregates within an aggregate only by id, which avoids modelling huge aggregates with deep hierarchies. It also means that if your aggregates tend to grow bigger over time, you might have missed some new aggregate which you initially modelled as an entity and which should be an aggregate of its own.
As for me, Constructor is another AggregateRoot. I have a problem with ConstructorJob. Where should I place it? What should be responsible for managing it?
In your case I would model it the following way:
The ConstructorJob is a Value Object which holds some data (name, etc.) and also a reference to a Constructor aggregate. But this reference is not a reference in terms of object reference like you would do it with a child entity of an aggregate root. The constructor aggregate is referenced by an identifier (UUID, integer or whatever you are using as id type) in the ConstructorJob.
The ConstructorJob value object would be part of the Project aggregate. The project aggregate could of course directly hold the id of the constructor aggregate but I guess in your case the value object might fit quite well.
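A minimal sketch of that model, assuming UUID identifiers and a handful of hypothetical fields (the question does not show its actual classes): ConstructorJob is a value object inside the project aggregate and points at the Constructor aggregate only through its id.

```scala
import java.util.UUID

// Typed identifiers, so a project id cannot be passed where a constructor id is expected.
final case class ProjectId(value: UUID)
final case class ConstructorId(value: UUID)

// Separate aggregate root; it is loaded and changed in its own transactions.
// constructorType is a plain String here purely for brevity.
final case class Constructor(id: ConstructorId, name: String, constructorType: String)

// Value object owned by the project aggregate.
// It holds the id of the Constructor, never the Constructor object itself.
final case class ConstructorJob(name: String, value: BigDecimal, constructorId: ConstructorId)

// Aggregate root: jobs only change together with the project they belong to.
final case class ArchitectureProject(id: ProjectId, name: String, jobs: Vector[ConstructorJob]) {
  def addJob(job: ConstructorJob): ArchitectureProject =
    copy(jobs = jobs :+ job)

  def jobsFor(constructorId: ConstructorId): Vector[ConstructorJob] =
    jobs.filter(_.constructorId == constructorId)
}
```

With this shape, searching for all constructors of a given type is a query against the Constructor aggregate alone. The screen listing all jobs for one constructor would typically be served by a query or read model across projects rather than by loading every project aggregate.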

Scala, Morphia and Enumeration

I need to store a Scala class in Morphia. With annotations it works well, unless I try to store a collection of _ <: Enumeration.
Morphia complains that it does not have serializers for that type, and I am wondering how to provide one. For now I have changed the type of the collection to Seq[String], and fill it by invoking toString on every item in the collection.
That works well, but I'm not sure whether it is the right way.
This problem is common to several of the available abstraction layers on top of MongoDB. It all comes back to one basic reason: there is no enum equivalent in JSON/BSON. Salat, for example, has the same problem.
In fact, the MongoDB Java driver does not support enums, as you can read in the discussion going on here: https://jira.mongodb.org/browse/JAVA-268 where you can see the problem is still open. Most of the frameworks I have seen for using MongoDB with Java do not implement low-level functionality such as this. I think this choice makes a lot of sense, because they leave it up to you how to deal with data structures not handled by the low-level driver, instead of dictating how to do it.
In general I feel that the absence of support comes not from a technical limitation but rather from a design choice. For enums, there are multiple ways to map them, each with its pros and cons, while for other data types the mapping is probably simpler. I don't know the MongoDB Java driver in detail, but I guess supporting multiple "modes" would have required some refactoring (maybe that's why they are talking about a new version of serialization?)
These are two strategies I am thinking about:
If you want to index on an enum and minimize the space it occupies, map the enum to an integer (not the ordinal; assign each enum value an explicit number instead, which Java enums allow through a field).
If your concern is queryability in the mongo shell, because your data will be accessed by data scientists, you would rather store the enum using its string value.
To conclude, there is nothing wrong with adding an intermediate data structure between your native object and MongoDB. Salat supports it through CustomTransformers; with Morphia you may need to do the conversion explicitly. Go for it.
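Both strategies can be sketched with plain scala.Enumeration and no Morphia or Salat calls at all. Color, Status and the codec objects are hypothetical names; the persisted class would hold the Seq[String] or Seq[Int] and convert at its boundary.

```scala
// Enumeration stored by name (easy to query in the shell).
object Color extends Enumeration {
  val Red, Green, Blue = Value
}

object ColorCodec {
  def toStrings(colors: Seq[Color.Value]): Seq[String] =
    colors.map(_.toString)

  // withName throws NoSuchElementException for an unknown name.
  def fromStrings(names: Seq[String]): Seq[Color.Value] =
    names.map(Color.withName)
}

// Enumeration stored by an explicit integer id (compact, and stable under reordering).
object Status extends Enumeration {
  val Active    = Value(10, "Active")
  val Suspended = Value(20, "Suspended")
}

object StatusCodec {
  def toInts(statuses: Seq[Status.Value]): Seq[Int] =
    statuses.map(_.id)

  // Enumeration.apply looks a value up by id and throws NoSuchElementException if absent.
  def fromInts(ids: Seq[Int]): Seq[Status.Value] =
    ids.map(Status(_))
}
```

Because the ids are assigned explicitly, adding or reordering values later does not change what is already stored, which is the reason for avoiding the ordinal.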

how to access complex data structures in Scala while preserving immutability?

Calling expert Scala developers! Let's say you have a large object representing a writable data store. Are you comfortable with this common Java-like approach:
val complexModel = new ComplexModel()
complexModel.modify()
complexModel.access(...)
Or do you prefer:
val newComplexModel = complexModel.withADifference
newComplexModel.access(...)
If you prefer that, and you have a client accessing the model, how is the client going to know to point to newComplexModel rather than complexModel? From the user's perspective you have a mutable data store. How do you reconcile that perspective with Scala's emphasis on immutability?
How about this:
var complexModel = new ComplexModel()
complexModel = complexModel.withADifference
complexModel.access(...)
This seems a bit like the first approach, except that it seems the code inside withADifference is going to have to do more work than the code inside modify(), because it has to create a whole new complex data object rather than modifying the existing one. (Do you run into this problem of having to do more work in trying to preserve immutability?) Also, you now have a var with a large scope.
How would you decide on the best strategy? Are there exceptions to the strategy you would choose?
I think the functional way is to actually have a Stream containing all the different versions of your data structure, with the consumer just pulling the next element from that stream.
But I think in Scala it is an absolutely valid approach to keep a mutable reference in one central place and change that, while your whole data structure stays immutable.
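A minimal sketch of that second approach, assuming a thread-safe holder is wanted: the model is immutable and only the single reference to the current version changes. The fields of ComplexModel and the ModelStore class are hypothetical; the question does not show them.

```scala
import java.util.concurrent.atomic.AtomicReference

// Immutable model: every "change" returns a new version.
final case class ComplexModel(entries: Map[String, Int]) {
  def withEntry(key: String, value: Int): ComplexModel =
    copy(entries = entries.updated(key, value))

  def access(key: String): Option[Int] = entries.get(key)
}

// The one central mutable reference. Clients hold the store, not a model,
// so they always see the latest version through `current`.
final class ModelStore(initial: ComplexModel) {
  private val ref = new AtomicReference[ComplexModel](initial)

  def current: ComplexModel = ref.get()

  // Atomically replace the current version with f(current).
  // f may be retried under contention, so it should be side-effect free.
  def modify(f: ComplexModel => ComplexModel): ComplexModel =
    ref.updateAndGet(m => f(m))
}
```

A client that calls store.current gets a snapshot that can never change underneath it, while store.modify(_.withEntry("a", 1)) publishes a new version for later readers. The updated map shares most of its structure with the old one, so a new version is usually much cheaper than a full copy.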
When the data structure becomes more complex, you might be interested in this question: "Cleaner way to update nested structures", which asks (and gets answers on) how to create changed versions of a non-trivial immutable data structure.
From a method name like modify alone, it's easy to identify your ComplexModel as a mutator object, which means that it changes some state. That alone implies that this kind of object has nothing to do with functional programming, and trying to make it immutable just because someone with questionable knowledge told you that everything in Scala should be immutable would simply be a mistake.
Now, you could modify your API so that this ComplexModel operates on immutable data, and by the way I think you should, but you definitely must not try to make this ComplexModel itself immutable.
The canonical answer to your question is to use a Zipper; there is an SO question about it.
The only implementation for Scala I know of is in Scalaz.
Immutability is merely a useful tool, not dogma. Situations will arise where the cost and inconvenience of immutability outweigh its usefulness.
The size of a ComplexModel may make it so that creating a modified copy is sufficiently expensive in terms of memory and/or CPU that a mutable model is more practical.

NSDictionaries vs. custom objects with properties, what's your take?

I'm writing an app that basically uses 5 business entities: A, B, C, D and E
A has some properties and holds a list of B's
B has some other properties and a list of C's and a list of D's
C has some other properties and a list of D's and a list of E's
D has only a few properties
E has only a few properties
There is no inheritance between any of them.
There's no real business logic involved, the objects are created, populated, and then accessed read-only, no further manipulations.
My natural coding style would be to go object oriented and write classes for each of those entities, use NSArrays for the lists, and have the mentioned properties synthesized.
It would make the code readable.
But another approach seems obvious too: only use NSDictionaries and NSArrays, and working with keys/values instead of properties. This seems more efficient, and somehow "closer" to iPhone-style programming to me... but obviously leads to less readable code. Another advantage is there's no additional custom encoding/decoding for serialization required (persisting state to disk, using JSON, ...)
So on paper, everything speaks for the latter approach; on the other hand, it still feels somehow awkward NOT to use custom objects...
Is this really just a matter of taste? Or are there other arguments for or against one of the approaches? Is using only dictionaries better memory- or performance-wise? Is it the preferred "Apple coding style"? (I'm coming from Java/C#.)
I don't see much difference between Java/C# and Cocoa in this area. Your question is equivalently applicable to those platforms as well (the same also applies to key-value stores and relational stores).
In an object-oriented environment, you have to make a trade-off between the flexibility of the key-value approach to storing data and the structured, object-oriented style. I'd go with the key-value approach only when I need the flexibility (e.g. the structure is dynamic and might be changed by the user, or is not known at compile time). Otherwise, taking that route might take you completely away from OOP conventions and benefits. (By the way, this is the important point: is the hassle of sticking to object-oriented principles worth it in that specific circumstance? I think your question reduces to this one, and to answer it you should analyze your specific situation.)
It largely depends on whether your objects are just collections of data (key/value pairs) or implement their own functionality.
If they're data I'd say go with NSDictionary, it's a lot less code and as you point out you won't have to write serialization routines for each class.
Use a hybrid approach. Store the dictionaries the objects are based on, but expose the most-used values as properties that are either filled when the object is initialized from a dictionary, or have the accessors look into the dictionary for values (less efficient).
Also provide a property to get at the dictionary. This way if you need to propagate a new value quickly to a specific area of the code from the dictionary (presumably a new value added by the server) you have that flexibility. Then if callers are making heavy use of a value you can migrate it to be a true property and get the completion and type checking of a property.
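The hybrid shape can be sketched as follows. It is written in Scala rather than Objective-C, with Map[String, Any] standing in for NSDictionary; the Item class and the keys "name" and "price" are hypothetical.

```scala
// Wraps the raw dictionary and exposes the most-used values as typed properties.
final class Item(val raw: Map[String, Any]) {
  // Filled once at construction: cheap to read, checked by the compiler at call sites.
  val name: String =
    raw.get("name").collect { case s: String => s }.getOrElse("")

  // Looked up on each access: less efficient, but needs no stored field.
  def price: Option[Double] =
    raw.get("price").collect { case d: Double => d }

  // Escape hatch for keys that have not been promoted to properties yet.
  def value(key: String): Option[Any] = raw.get(key)
}
```

A key that callers end up using heavily through value can later be promoted to a property like name, without changing how the object is built from the dictionary.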