How to properly express references in an immutable model in Scala?

Let's say I want an immutable model, a world. How should one model references?
case class World(people: Set[Person])
case class Person(name: String, loves: Option[Person])
val alice = Person("Alice", None)
val peter = Person("Peter", Some(alice))
val myWorld = World(Set(alice, peter))
println(myWorld)
Outputs:
World(Set(Person(Alice,None), Person(Peter,Some(Person(Alice,None)))))
But now we have two separate persons named Alice (in the people set and in the peter person).
What are the best practices for approaching this kind of referencing in an immutable model in Scala?
I thought about referencing strictly through ids, but it doesn't feel right. Is there a better way? (Also, the current implementation doesn't support recursion/cycles such as A loves B and B loves A.)

I think you have to distinguish between pure values, and things that have a notion of identity that survives state changes.
A person might be something in the latter category, depending on the requirements of your model. E.g. if the age of a person changes, it is still the same person.
For identifying entities over state changes, I don't think there is anything wrong with using some kind of unique identifier. Depending on your model, it might be a good idea to have a map from person id to person state at top level in your model, and then express relationships between persons either in the person state, or in a separate data structure.
Something like this:
case class Person(name: String, age: Int, loves: Set[PersonRef])
case class PersonRef(id: Long) // typesafe identifier for a person
case class World(persons: Map[PersonRef, Person])
Note that a person state does not contain the ID, since two persons with different IDs could have the same state.
A problem with this approach is that the world could be in an inconsistent state. E.g. somebody could love a person that does not exist in the world. But I don't really see a way around this.
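One way to at least detect such inconsistencies is to check the references when a world is built. The following is only a minimal sketch under that assumption: `danglingRefs`, `lovesOf`, `WorldChecks.validated` and `WorldDemo` are hypothetical names, not part of the model above. It also shows that a cycle (A loves B and B loves A) is unproblematic once references are ids.

```scala
final case class PersonRef(id: Long) // typesafe identifier for a person
final case class Person(name: String, age: Int, loves: Set[PersonRef])

final case class World(persons: Map[PersonRef, Person]) {
  // Hypothetical helper: every reference that points at a person missing from this world.
  def danglingRefs: Set[PersonRef] =
    persons.values.flatMap(_.loves).toSet.diff(persons.keySet)

  // Hypothetical helper: resolve the people that `ref` loves; unknown ids resolve to nothing.
  def lovesOf(ref: PersonRef): Set[Person] =
    persons.get(ref) match {
      case Some(p) => p.loves.flatMap(r => persons.get(r))
      case None    => Set.empty
    }
}

object WorldChecks {
  // Returns the world only if every reference resolves; otherwise the offending ids.
  def validated(persons: Map[PersonRef, Person]): Either[Set[PersonRef], World] = {
    val world    = World(persons)
    val dangling = world.danglingRefs
    if (dangling.isEmpty) Right(world) else Left(dangling)
  }
}

object WorldDemo {
  val alice = PersonRef(1)
  val peter = PersonRef(2)

  // A cycle: Alice loves Peter and Peter loves Alice.
  val consistent = WorldChecks.validated(Map(
    alice -> Person("Alice", 30, Set(peter)),
    peter -> Person("Peter", 32, Set(alice))
  ))

  // Peter loves somebody who is not in the world.
  val broken = WorldChecks.validated(Map(
    peter -> Person("Peter", 32, Set(alice))
  ))
}
```

This does not make inconsistent worlds unrepresentable; it only moves the check to a single place where worlds are constructed.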
I think it might be worth looking at scala libraries that are confronted with a similar problem. E.g.
diode has the concept of a reference to a value elsewhere in the model
scala graph allows to define custom node and edge types.

Although alice is printed two times, it only exists as one and the same value in your example. In general, you would introduce perhaps an id field that carries a unique identifier if you want to trace the mutation of an otherwise immutable object. But here, clearly, you only have one value.
For recursive references, see this question. For example, you could use a by-name parameter.
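A minimal sketch of the by-name idea, assuming a plain class instead of the question's case class (case class parameters cannot be by-name). The parameter name `lovesByName` and the `CycleDemo` object are made up for this illustration.

```scala
// The loved person is passed by name and only evaluated on first access,
// which lets two immutable values refer to each other.
class Person(val name: String, lovesByName: => Option[Person]) {
  lazy val loves: Option[Person] = lovesByName
  override def toString: String = s"Person($name)"
}

object CycleDemo {
  // lazy vals allow the forward reference from alice to peter.
  lazy val alice: Person = new Person("Alice", Some(peter))
  lazy val peter: Person = new Person("Peter", Some(alice))
}
```

Note that `toString` deliberately prints only the name: printing `loves` recursively would never terminate for a cycle.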

When modelling some application domain using immutable data structures, you should not rely on object identity for anything. Just think about how you would update an immutable model: you would generate a modified copy which would have a different identity, even if it represents the same "thing". How would you ensure that you've set all references in your model to the new, modified copy?
Thus, you have to make identity explicit: ask yourself what the identity of something is, e.g. a unique id or a set of unique attributes (for a person: name, date and place of birth, and so on). (Be careful though: while the real date of birth of a person never changes, the one stored in your data model might, because of an error in your data set.)
Then use this information to point to an object from everywhere else.
Making identity explicit requires you to think about it when you build your data model. This might feel like an extra burden, but in fact it avoids a lot of trouble later on.
For example, serialization will be easy, distribution will be easy, adding some kind of versioning support will be easy and so on.
As a rule you should only use references to the same information, if this information is just the same by coincidence. If you are pointing to the same "identity" and the information at both places has to be the same, use some explicit id.
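The stale-copy problem described above can be illustrated with a small sketch. All names here (`EmbeddedPerson`, `PersonId`, `StaleReferenceDemo`) are invented for the example and the ages are arbitrary.

```scala
object StaleReferenceDemo {
  // Variant 1: the reference is an embedded value.
  final case class EmbeddedPerson(name: String, age: Int, loves: Option[EmbeddedPerson])

  val alice      = EmbeddedPerson("Alice", 30, None)
  val peter      = EmbeddedPerson("Peter", 32, Some(alice))
  val olderAlice = alice.copy(age = 31)
  // peter.loves still holds the old Alice; nothing updated it.
  val ageSeenByPeter: Option[Int] = peter.loves.map(_.age)

  // Variant 2: identity is explicit and references are ids.
  final case class PersonId(value: Long)
  final case class Person(name: String, age: Int, loves: Option[PersonId])

  val world: Map[PersonId, Person] = Map(
    PersonId(1) -> Person("Alice", 30, None),
    PersonId(2) -> Person("Peter", 32, Some(PersonId(1)))
  )
  // Updating Alice replaces one map entry; Peter's reference is untouched
  // and resolves to the new state on the next lookup.
  val updatedWorld: Map[PersonId, Person] =
    world.updated(PersonId(1), world(PersonId(1)).copy(age = 31))
  val ageResolvedById: Option[Int] =
    updatedWorld(PersonId(2)).loves.flatMap(updatedWorld.get).map(_.age)
}
```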

Related

using List<ValueObject> inside an entity DDD

Can we use List<ValueObject> inside entity? or we should use them as List<entity>?
When I use 'List<Discount>' inside my 'Product' entity class, entity framework creates a 'Discount' table with a generated column id.
Is it OK that I have defined Discount as a List of value objects?
When a value object is used in a list, is it better to model it as an entity with its own identity?
The second question is about updating 'List<Discount>' inside the entity. How can I update this value object list inside its entity (add or remove a discount)?
thanks
As mentioned in the comments by @Ivan Stoev, your domain model is not your database model. You need to model your domain in an object-oriented way with, ideally, no regard for the database. BTW, having some identifier in a Value Object does not make it an Entity. An Entity would, however, always require a unique identifier within that entity set, but an identifier isn't the defining factor that makes a class an entity (that would be whether it has its own lifecycle).
In the real world one would need to be pragmatic about the general guidance. For instance, if you need an identifier of sorts for your value object then that is fine. However, it may be that there is already something there that may be used. In an OrderItem one would use the ProductId since there should only be a single item for each product. In your Discount scenario perhaps there is a DiscountType where only unique discount types are permitted. On the point of mutable value objects, the reason a value object is usually not mutable is that it represents a particular value. You would never change 10 since that would be another value. It would seem that one would be changing 10 to, say, 15 when you need 15 but, in fact, that 15 is another value object. Again, one would need to be pragmatic and in many circumstances we end up using a Value Object that isn't as primitive as a single value so it may make sense to alter something on the value object. An order item is certainly not an entity but one would need to change the Quantity on the item every-so-often. Well, that may be another discussion around Quote/Cart vs Order but the concepts are still applicable.
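As a rough illustration of the DiscountType idea and of replacing value objects instead of mutating them, here is a sketch in Scala rather than C#. It assumes a hypothetical model in which a product holds at most one discount per type; every name in it is made up and it says nothing about how Entity Framework would map such a model.

```scala
// Hypothetical discount types; the natural key of a discount within a product.
sealed trait DiscountType
case object Seasonal extends DiscountType
case object Loyalty  extends DiscountType

// Value object: no identity of its own, never modified in place.
final case class Discount(kind: DiscountType, percent: Int)

// The entity is identified by `id`; its discounts are keyed by type.
final case class Product(id: Long, name: String, discounts: Map[DiscountType, Discount] = Map.empty) {
  // Adding a discount of an existing type replaces the old value object.
  def withDiscount(d: Discount): Product =
    copy(discounts = discounts.updated(d.kind, d))

  def withoutDiscount(kind: DiscountType): Product =
    copy(discounts = discounts - kind)
}
```

Changing a discount from 10 to 15 percent is then expressed as `product.withDiscount(Discount(Seasonal, 15))`: the old value is dropped and a new one takes its place.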
On another note, I tend to nowadays define any "value object" that exists only within an aggregate as a nested class within the aggregate. I would not have Order and OrderItem classes but instead an Item class within the Order class... Order.Item. That is a design choice though but I thought I'd mention it.

What are the disadvantages of using records instead of classes?

C# 9 introduces record reference types. A record provides some synthesized methods such as a copy constructor, a clone operation, hash code calculation and comparison/equality operations. It seems convenient to me to use records instead of classes in general. Are there reasons not to do so?
It seems to me that currently Visual Studio as an editor does not support records as well as classes but this will probably change in the future.
Firstly, be aware that if it's possible for a class to contain circular references (which is true for most mutable classes), then many of the auto-generated record members can overflow the stack. So that's a pretty good reason not to use records for everything.
So when should you use a record?
Use a record when an instance of a class is entirely defined by the public data it contains, and has no unique identity of its own.
This means that the record is basically just an immutable bag of data. I don't really care about that particular instance of the record at all, other than that it provides a convenient way of grouping related bits of data together.
Why?
Consider the members a record generates:
Value Equality
Two instances of a record are considered equal if they have the same data (by default: if all fields are the same).
This is appropriate for classes with no behavior, which are just used as immutable bags of data. However this is rarely the case for classes which are mutable, or have behavior.
For example if a class is mutable, then two instances which happen to contain the same data shouldn't be considered equal, as that would imply that updating one would update the other, which is obviously false. Instead you should use reference equality for such objects.
Meanwhile if a class is an abstraction providing a service you have to think more carefully about what equality means, or if it's even relevant to your class. For example imagine a Crawler class which can crawl websites and return a list of pages. What would equality mean for such a class? You'd rarely have two instances of a Crawler, and if you did, why would you compare them?
with blocks
with blocks provide a convenient way to copy an object and update specific fields. However, this is only safe if the object has no identity, as copying it then doesn't lose any information. Copying a mutable class loses the identity of the original object, as updating the copy won't update the original. As such you have to consider whether this really makes sense for your class.
ToString
The generated ToString prints out the values of all public properties. If your class is entirely defined by the properties it contains, then this makes a lot of sense. However if your class is not, then that's not necessarily the information you are interested in. A Crawler for example may have no public fields at all, but the private fields are likely to be highly relevant to its behavior. You'll probably want to define ToString yourself for such classes.
All properties of a record are per default public
All properties of a record are per default immutable
By default, I mean when using the simple record definition syntax.
Also, records can only derive from records and you cannot derive a regular class from a record.
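The value-equality and copy-with-changes behavior discussed above has a close analog in Scala's case classes, which may help readers coming from the Scala question at the top of this page. The sketch below uses Scala, not C#; `Point` and `CaseClassDemo` are made-up names.

```scala
// An immutable bag of data with no identity of its own.
final case class Point(x: Int, y: Int)

object CaseClassDemo {
  val a = Point(1, 2)
  val b = Point(1, 2)

  val sameValue    = a == b  // true: equality compares the contained data
  val sameInstance = a eq b  // false: two distinct objects on the heap

  // Analogous to a C# `with` expression: copy and change selected fields.
  val moved = a.copy(x = 10)
}
```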

Deciding on class responsibility

I know this is an opinionated question. However it comes up often at work.
When creating methods it's often a struggle to know which class should be responsible.
e.g.
bool result = ProductService.CategoryHasSoldOutOfProducts(int categoryId)
vs
bool result = CategoryService.CategoryHasSoldOutOfProducts(int categoryId)
In my opinion, the CategoryService should be responsible, as the method is taking a categoryId and is specific to the Category.
Others at my work say the ProductService should be responsible as the method is dealing with if Products have sold out.
Just trying to develop a better understanding of service architecture and good process. I'm interested in other people's explanations for why they would choose one over the other.
Thanks
Disclaimer - this is a purely IMHO answer. I am answering this in the spirit of having a design brainstorm.
Based on the OP, it seems the relationship between Category and Product is an optional one to many : Category (0..1) <--------> (*) Product.
Implementation wise, this means that the Category entity probably has a Container of Products, and the Product entity has a reference to a Category which may be NULL.
In this case, I agree with the decision to place CategoryHasSoldOutOfProducts under the responsibility of the Category entity. The method name clearly implies that the Category entity should be responsible for informing its API user on the status of its products.
There is another option, however: An association class/entity. The motivation behind this entity is to describe the relationship between two other entities.
In this case, you can have a functional association entity which we will call ProductContainment for the sake of this example.
ProductContainment will have no internal state, and will hold functions which are provided with Category and/or Product entities as parameters.
It is then the responsibility of the association entity to provide the implementation of functions which relate to how Category and Product relate to one another.
If you end up using ProductContainment - then CategoryHasSoldOutOfProducts should be one of its functions.
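A minimal sketch of such a stateless association, written in Scala under assumed entity shapes: the fields `categoryId` and `unitsInStock` and the helper `productsIn` are hypothetical, since the question does not show the real entities.

```scala
final case class Category(id: Int, name: String)
// categoryId is optional, mirroring the (0..1) side of the relationship.
final case class Product(id: Int, categoryId: Option[Int], unitsInStock: Int)

// Holds no state; only functions about how Category and Product relate.
object ProductContainment {
  def productsIn(category: Category, products: Seq[Product]): Seq[Product] =
    products.filter(_.categoryId.contains(category.id))

  // Note: a category without any products counts as sold out here,
  // which is a design decision the real model would have to make explicitly.
  def categoryHasSoldOutOfProducts(category: Category, products: Seq[Product]): Boolean =
    productsIn(category, products).forall(_.unitsInStock <= 0)
}
```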
Since you're asking for opinions, here is mine:
(Disclaimer: That's probably something you cannot easily implement in the business world)
As you are using the term "class", I assume you want to have something object-oriented. The problem is, a service is nothing a valid object could be created from. Instead, it's just a namespace for functions.
Additionally it's very general. It's like calling a class "Manager". You can put possibly everything inside of it and this class has the potential to grow to have hundreds of functions.
My advice: Create small entities. Small enough to be created without the use of any setters, just by calling the constructor. If you notice your object needs more functionalities, create a decorator that is a little bit smarter and can do the work for you.
I would need a few more details about your environment to be more precise, but I guess in your case, you would have something like a Category class that contains products and knows when it's sold out. Just imagine you have a team of persons and everyone knows something. Ask the right guys to do the stuff and stay away from managers or services.

Refactoring domain model with mutability and cyclical dependencies to work for Scala with good FP practices?

I come from an OO background(C#, javascript) and Scala is my first foray into FP.
Because of my background I am having trouble realizing a domain model that fits my domain problem well and also complies with good practices for FP such as minimal mutability in code.
First, a short description of my domain problem as it is now.
Main domain objects are: Event, Tournament, User, and Team
Teams are made up of Users
Both Teams and Users can attend Tournaments which take place at an Event
Events consist of Users and Tournaments
Scores, stats, and rankings for Teams and Users who compete across Tournaments and Events will be a major feature.
Given this description of the problem, my initial idea for the domain is to create objects where bidirectional, cyclic relationships are the norm -- something akin to a graph. My line of thinking is that being able to access all associated objects for any given object will offer me the easiest path for programming views for my data, as well as manipulating it.
case class User(
  email: String,
  teams: List[TeamUser],
  events: List[EventUser],
  tournaments: List[TournamentUser])

case class TournamentUser(
  tournament: Tournament,
  user: User,
  isPresent: Boolean)

case class Tournament(
  game: Game,
  event: Event,
  users: List[TournamentUser],
  teams: List[TournamentTeam])
However as I have dived further into FP best practices I have found that my thought process is incompatible with FP principles. Circular references are frowned upon and seem to be almost an impossibility with immutable objects.
Given this, I am now struggling with how to refactor my domain to meet the requirements for good FP while still maintaining a common sense organization of the "real world objects" in the domain.
Some options I've considered
Use lazy val and by-name references -- My qualm with this is that seems to become unmanageable once the domain becomes non-trivial
Use uni-directional relationships instead -- With this method though I am forced to relegate some domain objects as second class objects which can only be accessed through other objects. How would I choose? They all seem equally important to me. Plus this would require building queries "against the grain" just to get a simple list of the second class objects.
Use indirection and store a list of identifiers for relationships -- This removes cyclical dependencies but then creates more complexity because I would have to write extra business logic to emulate relationship updates and make extra trips to the DB to get any relationship.
So I'm struggling with how to alter either my implementation or my original model to achieve the coupling I think I need but in "the right way" for Scala. How do I approach this problem?
TL;DR -- How do I model a domain using good FP practices when the domain seems to call for bidirectional access and mutability at its core?
Assuming that your domain model is backed by a database, in the case you highlighted above, I would make the "teams," "events," and "tournaments" properties of your User class defs that retrieve the appropriate objects from the database (you could implement a caching strategy if you're concerned about excessive db calls). It might look something like:
case class User(email: String) {
  // Derived on demand from the authoritative side of the relationship.
  def teams = TeamService.getAllTeams.filter(t => t.users.contains(this))
  // similar for events and tournaments
}
Another way of saying this might be that your cyclic dependencies have a single "authoritative" direction, while references in the other direction are calculated from this. This way, for example, when you add a user to a tournament, your function only has to return a new tournament object (with the added user), rather than a new tournament object and a new user object. Also, rather than explicitly modeling the TournamentUser linking table, Tournament could simply contain a list of User/Boolean tuples.
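The single authoritative direction can be sketched without any database at all. In this hypothetical, simplified version a tournament only has a name and its participants; `addUser`, `Queries` and `tournamentsOf` are illustrative names.

```scala
final case class User(email: String)

// The tournament is the authoritative side: it owns the (user, isPresent) pairs.
final case class Tournament(name: String, users: List[(User, Boolean)]) {
  // Adding a user yields a new Tournament only; no User has to be rebuilt.
  def addUser(user: User, isPresent: Boolean = false): Tournament =
    copy(users = (user, isPresent) :: users)
}

object Queries {
  // The reverse direction is computed from the authoritative one.
  def tournamentsOf(user: User, all: Seq[Tournament]): Seq[Tournament] =
    all.filter(_.users.exists { case (u, _) => u == user })
}
```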
Another option might be to use Lenses to modify your domain model, but I haven't implemented them in a situation like this. Maybe someone with more experience in FP could speak to their applicability here.
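For readers unfamiliar with the idea, here is a hand-rolled sketch of what a lens is. It is not the API of any particular lens library, and the simplified `Event` and `Tournament` shapes (with `game` as a plain String) are assumptions made for the example.

```scala
// A lens pairs a getter with a copying setter for one part of a structure.
final case class Lens[S, A](get: S => A, set: (S, A) => S) {
  def modify(s: S)(f: A => A): S = set(s, f(get(s)))

  // Compose with a lens that looks further into the focused part.
  def andThen[B](inner: Lens[A, B]): Lens[S, B] =
    Lens[S, B](
      s => inner.get(get(s)),
      (s, b) => set(s, inner.set(get(s), b))
    )
}

final case class Event(name: String)
final case class Tournament(game: String, event: Event)

object Lenses {
  val event: Lens[Tournament, Event] =
    Lens[Tournament, Event](_.event, (t, e) => t.copy(event = e))
  val eventName: Lens[Event, String] =
    Lens[Event, String](_.name, (e, n) => e.copy(name = n))

  // Focuses on the name of a tournament's event in one step.
  val tournamentEventName: Lens[Tournament, String] = event.andThen(eventName)
}
```

A call such as `Lenses.tournamentEventName.set(tournament, "Autumn Open")` returns an updated copy and leaves the original untouched. Lenses shorten nested updates but do not by themselves solve cyclic references.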

Class naming: should I name a class and give it an attribute, or put that attribute into the class's name?

I'm trying to understand the considerations people use to name classes. What are the ways in which you decide between the following.
student = Student.new(:smart)
vs simply using
student = SmartStudent.new
Edit:
I guess there is really no right or wrong answer; it's just a matter of deciding what I am modeling.
In general I like to name classes after the most general form of the noun that they represent. So in your example the noun is Student. Smart is an adjective describing that noun; it represents a Student's intelligence, so my attribute would be intelligence.
That way I could have:
Bill = new Student
Bill.intelligence = smart
Bob = new Student
Bob.intelligence = stupid
You should ask yourself: do the student's attributes make one student really different from another?
If not, and if it is possible, I'd suggest you use the first solution you proposed. It is generally simpler than building a hierarchy of student types (see composition vs inheritance), because as your student types grow, your student classes may proliferate, making it difficult to handle all of them properly.
On the other hand, if a Smart student class has really custom behavior, and you want to use method overriding and the like to perform operations on those classes instead of checking the student type each time, then inheritance could be an option. I would stay away from that, though.
You should give us more details for a correct answer. Can't say simple yes or no to your question, because it depends on your design.
EDIT:
Consider also this point: can a student have more than one attribute? If that's the case you should definitely use the first approach; otherwise it could be impossible (or at least tricky) to model a student type that has more than one attribute (example: smart and fat).
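The attribute-based approach with several attributes per student could look roughly like the sketch below. It is written in Scala rather than the question's Ruby, and `Attribute`, `Smart`, `Tall` and `is` are names invented for the example.

```scala
// Hypothetical attributes; new ones are added here, not as new Student subclasses.
sealed trait Attribute
case object Smart extends Attribute
case object Tall  extends Attribute

final case class Student(name: String, attributes: Set[Attribute] = Set.empty) {
  def is(attribute: Attribute): Boolean = attributes.contains(attribute)
}
```

With a class-per-attribute design, a student who is both smart and tall would need yet another class such as SmartTallStudent; here it is just `Student("Bill", Set(Smart, Tall))`.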