Casbah MongoDB: a more typesafe way to access object parameters

In Casbah, MongoDBObject has two methods, .getAs and .getAsOrElse, which return the relevant field's value as the type given in the type parameter.
val dbo: MongoDBObject = ...
dbo.getAs[String](param)
This must be using a type cast, because we can get a Long as a String by giving String as the type parameter, which can cause a ClassCastException at runtime. Is there any other typesafe way to retrieve the value in its original type?
This should be possible, because the element's type information is available to getAs.
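To illustrate the failure mode (the field name and value here are made up): because of type erasure, the cast inside getAs is unchecked, so a mismatched type parameter compiles and only blows up when the wrapped value is actually used.
val dbo = MongoDBObject("count" -> 42L)
val s: Option[String] = dbo.getAs[String]("count") // compiles fine
// s.map(_.length) // ClassCastException is thrown here, not at getAs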

Check out this excellent presentation on Salat by its author. What you're looking for is the Salat grater, which can convert to and from DBObject.
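A minimal sketch of the grater round-trip, assuming the default Salat context (the Person class is illustrative):
import com.novus.salat._
import com.novus.salat.global._ // default Salat context
import com.mongodb.casbah.Imports._

case class Person(name: String, age: Int)

val dbo: DBObject = grater[Person].asDBObject(Person("Ada", 36)) // case class -> DBObject
val p: Person = grater[Person].asObject(dbo) // DBObject -> case class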

Disclaimer: I am biased, as I'm the author of Subset.
I built this small library, Subset, exactly for this reason: to work effectively with a DBObject's fields (both scalar and sub-document) in a type-safe manner. Look through the Examples and see if it fits your needs.

The problem is that MongoDB can store values of multiple types in a single field, so I'm not sure what you mean by making this typesafe. There's no way to enforce it on the database side, so were you hoping there is a way to enforce it on the Casbah side? To be safest, you could just call get("fieldName") and receive an Object, but that's hardly an improvement, in my opinion.
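If you do take that raw route, pattern matching on the value is a safer alternative to casting. A hedged sketch, assuming dbo is a MongoDBObject (whose get returns an Option) and an invented field name:
dbo.get("count") match {
  case Some(n: java.lang.Long) => println("long: " + n)
  case Some(s: String) => println("string: " + s)
  case Some(other) => println("unexpected type: " + other.getClass)
  case None => println("field absent")
}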
I've been happy using Salat + Casbah, and when my database record doesn't match my Salat case class, I get a runtime exception. I just know that I have to run migration scripts when I change the types in my model, or create a new model for the new types (multiple models can be stored in the same collection). At least the Salat grater/DAO methods make it less of a hassle (you don't have to specify types every time you access a variable).

Related

Understanding Firestore's Underlying Serialisation Mechanism

I would like to get some information on how the serialisation/deserialisation mechanism works in Firestore. The issue I am having is that I am passing a Scala (a JVM language) object into the Firestore create method, but it blows up at the point of serialising that data. After some investigation, it appears that Firestore requires values created from classes that have an empty public constructor; why is this a constraint? This is something that Scala classes do not have. Is there any way to side-step Firestore's serialisation and provide my own?
Firestore requires values created from classes which have an empty public constructor, why is this a constraint?
When you load data from Firestore, you have the option to read a Java object from the document by calling DocumentSnapshot.toObject(Class<T> valueType). In order to create the object that it returns, the Firestore SDK must be able to call a constructor on the class that you pass in, and the only constructor it can reasonably call is one without any arguments, as it has no way to determine what argument values to pass otherwise.
Note that calling toObject is not the only option to create an object out of a DocumentSnapshot. You can also construct the Java object yourself, and extract the individual values from the snapshot by calling its get methods.
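A hedged sketch of that second option, reading individual values and constructing the Scala object yourself (the Person fields are illustrative):
import com.google.cloud.firestore.DocumentSnapshot

case class Person(name: String, age: Long)

def fromSnapshot(snap: DocumentSnapshot): Person =
  Person(
    name = snap.getString("name"),
    age = snap.getLong("age") // java.lang.Long; null (hence an NPE on unboxing) if absent
  )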
A quick search seems to hint that it is possible to add a no-argument constructor in Scala too, so I also recommend checking out:
Scala case classes with new firebase-server-sdk
A quick overview of using Firebase in a Scala application

integration testing, comparing JPA entities

Say you are doing some integration testing: you store some bigger entity into the db, then read it back and would like to compare it. Obviously it has some associations as well, but that's just the cherry on top of a very unpleasant cake. How do you compare those entities? I have seen a lot of incorrect ideas, and I feel this has to be written manually. How do you do that?
Issues:
you cannot use equals/hashCode: those are for the natural id.
you cannot use a subclass with a fixed equals, as that would test a different class and can give wrong results when persisting data, since the data are handled differently in the persistence context.
many fields: you don't want to type all the comparisons by hand. You want reflection.
@Temporal annotations: you cannot use trivial "reflection equals" approaches, because with @Temporal(TIMESTAMP) a field declared as java.util.Date comes back as a java.sql subtype, and java.util.Date <> java.sql.Date.
associations: a typical entity you would like to have properly tested has several associations, so the tool/approach should ideally support deep comparison. Also, cycles in the object graph can ruin the fun.
The best solution I have found:
don't use transmogrifying data types (like Date) in JPA entities.
all associations should be initialized in the entity, because null <> empty list.
compute a toString externally, via say ReflectionToStringBuilder, and compare those (see the sketch below). The reason is to let the entity keep its own toString; tests should not depend on nobody ever changing it. Theoretically a toString can be deep, but the commons recursive ToStringStyle includes the object identifier, which ruins it.
I thought I could use a JSON string instead, but commons supports that only for shallow toString, and Jackson (without further instructions on the entity) fails on cycles over associations.
An alternative solution would be to actually declare subclasses with a generated id (say via Lombok) and use some automatic mapping tool (say the remondis mapper), with an option to overcome the differences in Dates/collections.
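The externally computed toString comparison, as a minimal sketch (assumes commons-lang3 on the classpath; the style choice is illustrative):
import org.apache.commons.lang3.builder.{ReflectionToStringBuilder, ToStringStyle}

// Computed outside the entity, so its own toString stays untouched.
def reflectiveString(entity: AnyRef): String =
  ReflectionToStringBuilder.toString(entity, ToStringStyle.SHORT_PREFIX_STYLE)

// In a test:
// assert(reflectiveString(loaded) == reflectiveString(expected))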
But I'm listening. Does anyone possess a better solution?

Are Scala reflection API Names or Symbols adequate for use inside transfer objects?

Introduction
I am working on an API written in Scala. I use data transfer objects (DTOs) as parameters passed to the API's functions. The DTOs will be instantiated by the API's user.
As the API is pretty abstract/generic, I want to specify the attributes of an object that the API should operate on. Example:
case class Person(name: String, birthdate: Date)
When an instance "P" of Person is passed to the API, the API needs to know which attributes of "P" it should operate on: just name, just birthdate, or both of them.
So I need to design a DTO that contains the instance "P" itself, some kind of declaration of the attributes, and maybe additional information on the type of "P".
String based approach
One way would be to use Strings to specify the attributes of "P", and maybe its type. This would be relatively simple, as Strings are lightweight and well known. And since there is a formal notation for packages, types and members as Strings, the declarations would be structured to a certain degree.
On the other hand, the String declarations must be validated, because a user might pass invalid Strings. I could imagine representing the attributes with dedicated types instead of Strings, which may have the benefit of more structure; maybe those types could even be designed so that only valid instances can exist.
Reflection API approach
Of course the reflection API came to mind, and I am experimenting with declaring the attributes using types from the reflection API. Unfortunately, the Scala 2.10.x reflection API is a bit unintuitive. There are names, symbols, mirrors, types and typetags, which can cause a bit of confusion.
Basically I see two alternatives to attribute declaration with Strings:
Attribute declaration with reflection API's "Names"
Attribute declaration with reflection API's "Symbols" (especially TermSymbol)
If I go this way, as far as I can see, the API's user, who constructs the DTOs, will have to deal with the reflection API and its Names/Symbols. The API's implementation will also have to use the reflection API. So there are two places with reflective code, and the user must have at least a little knowledge of the reflection API.
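For concreteness, a hedged sketch of the two alternatives against the 2.10 runtime universe (using the Person class from above):
import scala.reflect.runtime.{universe => ru}

// Alternative 1: a Name is little more than a string and is cheap to create.
val attr: ru.TermName = ru.newTermName("birthdate")

// Alternative 2: a Symbol has to be resolved, e.g. via a member lookup.
val sym: ru.TermSymbol = ru.typeOf[Person].member(attr).asTerm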
Questions
However I don't know how heavyweight these approaches are:
Are Names or Symbols expensive to construct?
Does the reflection API do any caching of expensive operation results or should I take care about that?
Are Names and Symbols transferable to another JVM via network?
Are they serializable?
Main question: Are Scala reflection API Names or Symbols adequate for use inside transfer objects?
It seems complicated to do this with the reflection API. Any hints are welcome. And any hints on other alternatives, too.
P.S.: I did not include my own code yet, because my API is complex and the reflection part is in a pretty experimental state. Maybe I can deliver something useful later.
1a) Names are easy to construct and are lightweight, as they are just a bit more than strings.
1b) Symbols can't be constructed by the user, but are created internally when one resolves names using APIs like staticClass or member. First calls to such APIs usually involve unpacking the type signatures of the symbol's owners from ScalaSignature annotations, so they might be costly. Subsequent calls use the already loaded signatures, but still pay the cost of a by-name lookup in a sort of hashtable (1). declaration costs less than member, because declaration doesn't look into base classes.
2) Type signatures (e.g. lists of members of classes, params + return type of methods, etc) are loaded lazily and therefore are cached. Mappings between Java and Scala reflection artifacts are cached as well (2). To the best of my knowledge, the rest (e.g. subtyping checks) is generally uncached with a few minor exceptions.
3-4) Reflection artifacts depend on their universe and at the moment can't be serialized (3).
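To make the declaration vs. member distinction concrete, a small sketch (reusing the Person case class from the question):
import scala.reflect.runtime.{universe => ru}

val t = ru.typeOf[Person]
t.declaration(ru.newTermName("name")) // searches only Person's own declarations
t.member(ru.newTermName("equals")) // also searches base classes, so it costs more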

Scala, Morphia and Enumeration

I need to store a Scala class in Morphia. With annotations it works well, unless I try to store a collection of _ <: Enumeration.
Morphia complains that it does not have serializers for that type, and I am wondering how to provide one. For now I changed the type of the collection to Seq[String], and I fill it by invoking toString on every item in the collection.
That works well; however, I'm not sure if that is the right way.
This problem is common to several of the available layers of abstraction on top of MongoDB. It all comes back to one underlying reason: there is no enum equivalent in JSON/BSON. Salat, for example, has the same problem.
In fact, the MongoDB Java driver does not support enums, as you can read in the discussion going on here: https://jira.mongodb.org/browse/JAVA-268 where you can see the problem is still open. Most of the frameworks I have seen for using MongoDB from Java do not implement low-level functionality such as this. I think this choice makes a lot of sense, because they leave you the choice of how to deal with data structures the low-level driver does not handle, instead of imposing one on you.
In general, I feel that the absence of support comes not from a technical limitation but rather from a design choice. For enums there are multiple ways to map them, each with its pros and cons, while for other data types it is probably simpler. I don't know the MongoDB Java driver in detail, but I guess supporting multiple "modes" would have required some refactoring (maybe that's why they are talking about a new version of the serialization?)
These are two strategies I am thinking about:
If you want to index on an enum and minimize space occupation, map the enum to an integer (not using the ordinal, please; assign explicit values, as in "can set enum start value in java").
If your concern is queryability in the mongo shell, because your data will be accessed by data scientists, you would rather store the enum using its string value.
To conclude, there is nothing wrong with adding an intermediate data structure between your native object and MongoDB. Salat supports this through CustomTransformers; with Morphia you may need to do the conversion explicitly. Go for it.
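A minimal sketch of the explicit string conversion the question already uses (the Color enumeration is illustrative):
object Color extends Enumeration {
  val Red, Green, Blue = Value
}

// Convert before persisting, parse back after loading.
def toStrings(cs: Seq[Color.Value]): Seq[String] = cs.map(_.toString)
def fromStrings(ss: Seq[String]): Seq[Color.Value] = ss.map(Color.withName)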

How do I implement a collection in Scala 2.8?

In trying to write an API I'm struggling with Scala's collections in 2.8(.0-beta1).
Basically what I need is to write something that:
adds functionality to immutable sets of a certain type
where all methods like filter and map return a collection of the same type without having to override everything (which is why I went for 2.8 in the first place)
where all collections you gain through those methods are constructed with the same parameters the original collection had (similar to how SortedSet hands through an ordering via implicits)
which is still a trait in itself, independent of any set implementations.
Additionally I want to define a default implementation, for example based on a HashSet. The companion object of the trait might use this default implementation. I'm not sure yet if I need the full power of builder factories to map my collection type to other collection types.
I read the paper on the redesign of the collections API, but it seems things have changed a bit since then, and I'm missing some details in there. I've also dug through the collections source code, but I'm not sure it's very consistent yet.
Ideally, what I'd like to see is either a hands-on tutorial that tells me step by step just the bits that I need, or an extensive description of all the details so I can judge for myself which bits I need. I liked the chapter on object equality in "Programming in Scala". :-)
But I appreciate any pointers to documentation or examples that help me understand the new collections design better.
I'd have a look at the implementation of collection.immutable.BitSet. It's a bit spread out, reusing things from collection.BitSetLike and collection.generic.BitSetFactory. But it does exactly what you specified: it implements an immutable set of a certain element type and adds new functionality.
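To give a feel for the pattern, here is a hedged sketch of a minimal custom immutable set in the 2.8 style (MySet is illustrative; a companion object providing an implicit CanBuildFrom, omitted here, is what additionally makes map return MySet):
import scala.collection.SetLike

// filter, +, - and friends return MySet[A], because SetLike's Repr is MySet[A].
class MySet[A](underlying: Set[A]) extends Set[A] with SetLike[A, MySet[A]] {
  def contains(a: A) = underlying.contains(a)
  def iterator = underlying.iterator
  def +(a: A): MySet[A] = new MySet(underlying + a)
  def -(a: A): MySet[A] = new MySet(underlying - a)
  override def empty = new MySet[A](Set.empty)
}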