In which scenarios do we need a ReadOnlyCollection? - .net-4.5

.NET 4.5 has introduced ReadOnlyCollection. My question is: what is its practical usage? In what scenarios might we need this kind of data structure?

You need read-only collections when your API returns collection objects to your callers, copying is too expensive, and you would prefer to stay away from returning IEnumerable<T>. This is commonly desirable in situations where random access is required over the returned collection.

When you want to return a collection that the caller should not be able to modify, but you still want the guarantees that an IList gives over an IEnumerable, e.g. a free .Count property, an indexer, and the ability to safely iterate over it multiple times, none of which is guaranteed on an IEnumerable.

This class is useful in a multithreaded application. In a multithreaded environment it can be a real problem to have a collection of objects that might be changed by some other thread. Handing out a read-only view assures callers that they, at least, cannot modify the shared collection, which helps with thread safety and lessens the complexity of the code.
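For illustration, here is a minimal sketch of the pattern the answers above describe (the Order class and its members are invented for this example): keep the mutable List<T> private and expose only a read-only view.

using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class Order
{
    // The mutable list stays private to the class.
    private readonly List<string> items = new List<string>();

    // Callers receive a read-only wrapper: it exposes Count and an indexer,
    // but no public mutators (casting it to IList<string> and calling Add
    // would throw NotSupportedException).
    public ReadOnlyCollection<string> Items => items.AsReadOnly();

    public void AddItem(string item) => items.Add(item);
}

public static class Program
{
    public static void Main()
    {
        var order = new Order();
        order.AddItem("book");

        Console.WriteLine(order.Items.Count); // free Count
        Console.WriteLine(order.Items[0]);    // random access via the indexer
    }
}

Note that AsReadOnly() returns a live wrapper around the private list rather than a copy, so there is no copying cost; it prevents callers from mutating the collection, but it does not freeze the underlying data.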

Related

DDD: can an event handler construct a value object for an aggregate?

Can I construct a value object in the event handler or should I pass the parameters to the aggregate to construct the value object itself? Seller is the aggregate and offer is the value object. Will it be better for the aggregate to pass the value object in the event?
public async Task HandleAsync(OfferCreatedEvent domainEvent)
{
    var seller = await this.sellerRepository.GetByIdAsync(domainEvent.SellerId);
    var offer = new Offer(domainEvent.BuyerId, domainEvent.ProductId, seller.Id);
    seller.AddOffer(offer);
}
should I pass the parameters to the aggregate to construct the value object itself?
You should probably default to passing the assembled value object to the domain entity / root entity.
The supporting argument is that we want to avoid polluting our domain logic with plumbing concerns. Expressed another way, new is not a domain concept, so we'd like that expression to live "somewhere else".
Note that by passing the value to the domain logic, you protect that logic from changes to the construction of the values; for instance, how much code has to change if you later discover that there should be a fourth constructor argument?
That said, I'd consider this to be a guideline - in cases where you discover that violating the guideline offers significant benefits, you should violate the guideline without guilt.
Will it be better for the aggregate to pass the value object in the event?
Maybe? Let's try a little bit of refactoring....
// WARNING: untested code ahead
public async Task HandleAsync(OfferCreatedEvent domainEvent)
{
    var seller = await this.sellerRepository.GetByIdAsync(domainEvent.SellerId);
    Handle(domainEvent, seller);
}

static void Handle(OfferCreatedEvent domainEvent, Seller seller)
{
    var offer = new Offer(domainEvent.BuyerId, domainEvent.ProductId, seller.Id);
    seller.AddOffer(offer);
}
Note the shift - where HandleAsync needs to be aware of async/await constructs, Handle is just a single threaded procedure that manipulates two local memory references. What that procedure does is copy information from the OfferCreatedEvent to the Seller entity.
The fact that Handle here can be static, and has no dependencies on the async shell, suggests that it could be moved to another place; another hint being that the implementation of Handle requires a dependency (Offer) that is absent from HandleAsync.
Now, within Handle, what we are "really" doing is copying information from OfferCreatedEvent to Seller. We might reasonably choose:
seller.AddOffer(domainEvent);
seller.AddOffer(domainEvent.offer());
seller.AddOffer(new Offer(domainEvent));
seller.AddOffer(new Offer(domainEvent.BuyerId, domainEvent.ProductId, seller.Id));
seller.AddOffer(domainEvent.BuyerId, domainEvent.ProductId, seller.Id);
These are all "fine" in the sense that we can get the machine to do the right thing using any of them. The tradeoffs are largely related to where we want to work with the information in detail, and where we prefer to work with the information as an abstraction.
In the common case, I would expect that we'd use abstractions for our domain logic (therefore: Seller.AddOffer(Offer)) and keep the details of how the information is copied "somewhere else".
The OfferCreatedEvent -> Offer function can sensibly live in a number of different places, depending on which parts of the design we think are most stable, how much generality we can justify, and so on.
Sometimes, you have to do a bit of war gaming: which design is going to be easiest to adapt if the most likely requirements change happens?
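A rough sketch of that shape, keeping the mapping outside the aggregate (the record definitions, the Guid ids and the OfferMapper helper are assumptions made for this illustration, not part of the original code):

using System;
using System.Collections.Generic;

// Minimal stand-ins for the types referenced in the question; the real shapes may differ.
public sealed record Offer(Guid BuyerId, Guid ProductId, Guid SellerId);
public sealed record OfferCreatedEvent(Guid SellerId, Guid BuyerId, Guid ProductId);

public class Seller
{
    private readonly List<Offer> offers = new List<Offer>();

    public Seller(Guid id) => Id = id;

    public Guid Id { get; }

    // The aggregate works with the Offer abstraction; it neither knows
    // nor cares how the value object was assembled.
    public void AddOffer(Offer offer)
    {
        // Domain invariants about offers would be enforced here.
        offers.Add(offer);
    }
}

// The "copy information from the event" step lives outside the aggregate,
// for example in a small mapping helper next to the event handler.
public static class OfferMapper
{
    public static Offer ToOffer(OfferCreatedEvent domainEvent, Guid sellerId) =>
        new Offer(domainEvent.BuyerId, domainEvent.ProductId, sellerId);
}

With something like this in place, Handle collapses to seller.AddOffer(OfferMapper.ToOffer(domainEvent, seller.Id)), and the aggregate stays insulated from how the value object is constructed.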
I would also advocate for passing an already assembled value object to the aggregate in this situation. In addition to the reasons already mentioned by @VoiceOfUnreason, this also fits more naturally with the domain language. Also, when reading code and method APIs you can then focus on domain concepts (like an offer) without being distracted by details until you really need to know them.
This becomes even more important if you need to pass in more than one value object (or entity). Rather than passing in all the values required for construction as individual parameters, passing the assembled value object not only makes the API more resilient to refactoring but also spares the reader the extra details.
The seller is receiving an offer.
Assuming this is what is meant here, it fits better than something like the following:
The seller receives some buyer id, product id, etc.
This most probably would not be found in conversations using the ubiquitous language. In my opinion, code should be as readable as possible and express the behaviour and business logic as closely to human language as possible. You compile code for machines to execute, but you write it for humans to easily understand.
Note: I would even consider using factory methods on value objects in certain cases to unburden the client code of knowing what else might be needed to assemble a valid value object, for instance, if there are different valid combinations and ways of constructing the same value object, where some values need reasonable defaults or are chosen by the value object itself. In more complex situations a separate factory might even make sense.
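As a sketch of that last idea, a factory method on the value object can own the defaulting logic so client code does not need to know about it (the ValidUntil field and the 30-day default below are invented purely for illustration):

using System;

public sealed record Offer(Guid BuyerId, Guid ProductId, Guid SellerId, DateTime ValidUntil)
{
    // Clients call this instead of the constructor; the value object itself
    // chooses a reasonable default for ValidUntil.
    public static Offer Create(Guid buyerId, Guid productId, Guid sellerId) =>
        new Offer(buyerId, productId, sellerId, DateTime.UtcNow.AddDays(30));
}

The event handler would then call Offer.Create(domainEvent.BuyerId, domainEvent.ProductId, seller.Id) without ever having to know about ValidUntil.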

Scala immutable collections cannot be shared without synchronization?

From the «Learning concurrent programming in Scala» book:
In current versions of Scala (2.11.1), however, certain collections that are
deemed immutable, such as List and Vector, cannot be shared without
synchronization. Although their external API does not allow you to
modify them, they contain non-final fields.
Could anyone demonstrate this with a small example? And does this still apply to 2.11.7?
The behavior of changes made in one thread when viewed from another is governed by the Java Memory Model. In particular, these rules are extremely weak when it comes to something like building a collection and then passing the built-and-now-immutable collection to another thread. The JMM does not guarantee that the other thread won't see an earlier view where the collection was not fully built!
Since synchronized blocks enforce an ordering, they can be used to get a consistent view if they're used on every single operation.
In practice, though, this is rarely actually necessary. On the CPU side, there is typically a memory barrier operation that can be used to enforce memory consistency (i.e. if you write the tail of your list and then pass a memory barrier, no other thread can see the tail un-set). And in practice, JVMs usually have to implement synchronized by using memory barriers. So one could hope that you could just pass the created list within a synchronized block, trusting that a memory barrier would be issued, and everything thereafter would be fine.
Unfortunately, the JMM doesn't require that it be implemented in this way (and you can't assume that the memory-barrier-like behavior of object creation will actually be a full memory barrier that applies to everything in that thread as opposed to simply the final fields of that object), which is both why the recommendation is what it is, and why it's not fixed (yet, anyway) in the library.
For what it's worth, on x86 architectures, I've never observed a problem if you hand off the immutable object within a synchronized block. I have observed problems if you try to do it with CAS (e.g. by using the java.util.concurrent.atomic classes).
As an addition to the excellent answer from Rex Kerr:
it should be noted that most common use cases of immutable collections in a multithreading context are not affected by this problem. The only situation where this might affect you is when you do something that you probably should not do in the first place.
E.g. you have a variable var x: Vector[Int], which you write from one thread A and read from another thread B.
If you mark x with @volatile, there will be no problem, since the volatile write introduces a memory barrier. So you will never be able to observe the Vector in an inconsistent state. The same is true when using a synchronized { } block when writing and reading, or when using java.util.concurrent.atomic.AtomicReference.
If you don't mark x with @volatile, you might observe the vector in an inconsistent state (not just wrong elements, but internally inconsistent!). But in that case your code is arguably broken to begin with. It is completely undefined when you will see the changes from A in B.
You might see them
immediately
after there is a memory barrier somewhere else in your program
not at all
depending on the architecture you're running on, the phase of the moon, whatever. So as Viktor Klang put it: "Unsafe publication is unsafe..."
Note that if you use a higher-level concurrency framework such as Akka actors, it is also guaranteed that receivers of messages cannot see immutable collections in an inconsistent state.

Mongo DB Collection Versioning

Are there any best practices or approaches we can use to version a collection, or the objects in a collection, in MongoDB?
The requirement to version a collection arises because objects in the collection may gain new attributes going forward, while the previously added (i.e. old) objects will not have these attributes or values for them. So on retrieval, we need to make sure the code does not break when deserializing different versions of the same object from the collection.
I can think of adding a version attribute to the objects explicitly, but are there any better built-in alternatives in MongoDB for handling this versioning of objects and/or collections?
Thanks,
Bathiya
I guess the best approach would be to update all objects in a batch process when you start using the new software on the server, since otherwise you'll never know when an object will be updated and you'll need to keep the old versions around forever.
Another thing I've been doing so far, and it has worked (so far), is to have a policy of only allowing new properties to be added to an object. This way, in the worst case, the DB won't have all the data, but that is fine with all the JSON serializers I know. It does mean you aren't allowed to delete or rename properties, or modify their type (from scalar value to object or array; from object to scalar or array; ...).
Since usually I want to store additional information instead of less, this seems like a good solution without any real limitation for me.
If you need to change a type because a scalar value isn't enough, you can still write some code that transforms every object that has the old value into the new form. If a bulk update on your side can perform the change, I'd still do it, but sometimes it needs user input.
For instance, if you used to save passwords only as an MD5 hash, that was a scalar value. But if someone tells you they should be stored as SHA-512 together with a salt, you now need an object field for the password; you could call it password_sha512 and store the salt and the hashed password in it.
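To make the combination of an explicit version attribute and additive-only changes concrete, here is a small C# sketch (the UserDocument class, the SchemaVersion property and the field names are all invented for this example; the same idea applies with whichever driver or serializer you use):

public class UserDocument
{
    // Explicit version attribute stored alongside the data, as suggested in the question.
    public int SchemaVersion { get; set; } = 1;

    // Version 1: password stored as a plain MD5 hash (a scalar value).
    public string PasswordMd5 { get; set; }

    // Version 2 addition: old documents simply lack this field, and typical
    // (de)serializers leave it at its default (null here), so reads don't break.
    public SaltedHash PasswordSha512 { get; set; }
}

public class SaltedHash
{
    public string Salt { get; set; }
    public string Hash { get; set; }
}

public static class PasswordReader
{
    // Reading code branches on what the document actually contains.
    public static bool HasModernPassword(UserDocument doc) =>
        doc.SchemaVersion >= 2 && doc.PasswordSha512 != null;
}

A batch job (or, since the SHA-512 hash cannot be derived from the MD5 hash, the next successful login) can then fill in PasswordSha512 and bump SchemaVersion.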

Why do Eclipse APIs use Arrays instead of Collections?

In the Eclipse APIs, the return and argument types are mostly arrays instead of collections. An example is the members method on IContainer, which returns IResource[].
I am interested in why this is the case. Maybe it is one of the following:
The APIs were designed before generics were available, so IResource[] was better than just Collection or List
Memory concerns, e.g. ArrayList internally holds an array which has more space than is needed (to offer an efficient implementation of add), whereas an array is always constructed for just the needed target size
It's not possible to add/remove elements on an array, so it is safe for iterating (but defensive copying is still necessary, because one can still change elements, e.g. set them to null)
Does anyone have any insights or other ideas why the API was developed that way?
Posting this as an answer, so it can be accepted.
Eclipse predates generics and they are really serious about API stability. Also, at the low level of SWT, passing arrays seems to be used to reflect the operating system APIs that are being wrapped. Once you have a bunch of tooling using arrays, I guess it makes sense to keep things consistent. Also note that arrays aren't subject to all of the type erasure issues that arise when using reflection.
Yeah, I hear you as far as the collections API being generally much easier to work with for dynamic lists of items.

Is there a caching mechanism for Class::DBI?

I have a set of rather complex ORM modules that inherit from Class::DBI. Since the data changes quite infrequently, I am considering using a Caching/Memoization layer on top of this to speed things up. I found a module: Class::DBI::Cacheable but no rating or any reviews on RT. I would appreciate hearing from people who have used this or any other Class::DBI caching scheme.
Thanks a ton.
I too have rolled my own ORM plenty of times, I hate to say! Caching/memoization is pretty easy if all your fetches happen through a single API (or subclasses thereof).
For any fetch based on a unique key you can just cache based on a concatenation of the keys. A naive approach might be:
my %_cache;

sub get_object_from_db {
    my ($self, $table, %table_lookup_key) = @_;

    # concatenate a unique key for this object
    my $cache_key = join('|', map { "$_|$table_lookup_key{$_}" }
                              sort keys %table_lookup_key);

    return $_cache{$cache_key}
        if exists $_cache{$cache_key};

    # otherwise get the object from the db and cache it in the hash
    # before returning
}
Instead of a hash, you can use the Cache:: suite of modules on CPAN to implement time and memory limits in your cache.
If you're going to cache for some time you might want to think about a way to expire objects in the cache. If for instance all your updates also go through the ORM you can clear (or update) the cache entry in your update() ORM method.
A final point to think carefully about - you're returning the same object each time, which has implications. If, e.g., one piece of code retrieves an object and updates a value but doesn't commit that change to the db, all other code retrieving that object will see that change. This can be very useful if you're stringing together a series of operations - they can all update the object and then you can commit it at the end - but it may not be what you intend. I usually set a flag on the object when it is fresh from the database and then invalidate that flag in the setter methods when the object is updated - that way you can always check the flag if you really want a fresh object.
On a few occasions we've rolled our own, but we limited it to special cases where profiling indicated we needed a boost (for example large joins). Since our applications typically use a custom abstraction layer (akin to a home-grown ORM) on top of the DB access, that's where we implemented the caching. We achieved good results that we were satisfied with and it didn't take a whole lot of effort. Of course, since we weren't using a CPAN ORM, we didn't really have any choice about using a CPAN caching module, either.
It was strictly case-by-case and opt-in. Whether you end up using a CPAN solution or rolling your own, it's probably a good idea to restrict it to cases where profiling indicates you need help, and make sure that it's opt-in so your caching doesn't undermine your application in subtle ways by being active when you didn't expect it.
I have used memcached before to cache objects, but not with Class::DBI (ORM makes me feel dirty).