Scala, Morphia and Enumeration

I need to store a Scala class with Morphia. With annotations it works well, unless I try to store a collection of _ <: Enumeration.
Morphia complains that it has no serializer for that type, and I am wondering how to provide one. For now I have changed the collection's type to Seq[String] and fill it by invoking toString on every item in the collection.
That works well, but I'm not sure it is the right way.
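A minimal sketch of that workaround, using a hypothetical Color enumeration:

object EnumAsStrings {
  object Color extends Enumeration {
    val Red, Green, Blue = Value
  }

  // Convert to plain strings before handing the collection to Morphia...
  val asStrings: Seq[String] = Seq(Color.Red, Color.Blue).map(_.toString)

  // ...and map the names back to enumeration values after reading.
  val asEnums: Seq[Color.Value] = asStrings.map(Color.withName)
}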

This problem is common to several of the available layers of abstraction on top of MongoDB. It all comes back to one underlying reason: there is no enum equivalent in JSON/BSON. Salat, for example, has the same problem.
In fact, the MongoDB Java driver does not support enums, as you can read in the discussion at https://jira.mongodb.org/browse/JAVA-268, where you can see the issue is still open. Most of the frameworks I have seen for using MongoDB from Java do not implement low-level functionality such as this. I think that choice makes a lot of sense, because they leave it up to you how to deal with data structures the low-level driver does not handle, instead of imposing a way to do it.
In general I feel that the absence of support comes not from a technical limitation but from a design choice. For enums there are multiple ways to map them, each with its pros and cons, while for other data types the mapping is probably simpler. I don't know the MongoDB Java driver in detail, but I guess supporting multiple "modes" would have required some refactoring (maybe that's why they are talking about a new version of serialization?).
These are two strategies I am thinking about:
If you want to index on an enum and minimize storage, map the enum to an integer (not the ordinal, please; assign explicit values, as in "Can I set an enum start value in Java?").
If your concern is queryability from the mongo shell, because your data will be accessed by data scientists, you would rather store the enum as its string value.
To conclude, there is nothing wrong with adding an intermediate data structure between your native object and MongoDB. Salat supports this through CustomTransformers; in Morphia you may need to do the conversion explicitly. Go for it.
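As an illustration of the two strategies (Priority is a hypothetical enumeration, and the explicit ids are my own assumption, not something Morphia or Salat require):

object EnumMappings {
  object Priority extends Enumeration {
    // Explicit ids, so the stored integer does not depend on declaration order.
    val Low    = Value(10, "Low")
    val Medium = Value(20, "Medium")
    val High   = Value(30, "High")
  }

  // Strategy 1: store the integer id (compact, index-friendly).
  val asInt: Int = Priority.High.id
  val fromInt: Priority.Value = Priority(asInt)

  // Strategy 2: store the name (readable in the mongo shell).
  val asName: String = Priority.High.toString
  val fromName: Priority.Value = Priority.withName(asName)
}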

Related

integration testing, comparing JPA entities

Suppose you are doing some integration testing: you store a bigger entity into the DB, then read it back and would like to compare it. Obviously it has some associations as well, but that's just the cherry on top of a very unpleasant cake. How do you compare those entities? I have seen a lot of incorrect ideas and feel that this has to be written manually. How do you do it?
Issues:
you cannot use equals/hashCode: those belong to the natural ID.
you cannot use a subclass with a fixed equals, as that would test a different class and can give wrong results when persisting data, since data are handled differently in the persistence context.
lots of fields: you don't want to type all the comparisons by hand; you want reflection.
@Temporal annotations: you cannot use trivial "reflection equals" approaches, because with @Temporal(TIMESTAMP), java.util.Date <> java.sql.Date.
associations: a typical entity you would like to test properly has several associations, so the tool/approach should ideally support deep comparison. Cycles in the object graph can also ruin the fun.
The best solution I have found:
don't use transmogrifying data types (like Date) in JPA entities.
all associations should be initialized in the entity, because null <> empty list.
calculate a toString externally, via say ReflectionToStringBuilder, and compare those. The reason is to let the entity keep its own toString; the tests should not depend on nobody ever changing it. Theoretically, toString could be deep, but the Commons recursive ToStringStyle includes the object identifier, which ruins it (a minimal sketch follows below).
I thought I could use a JSON-formatted toString, but Commons supports that only for a shallow toString, and Jackson (without further instructions on the entity) fails on cycles over associations.
An alternative solution would be to actually declare subclasses with a generated id (say via Lombok) and use some automatic mapping tool (say the remondis mapper), with an option to overcome differences in Dates/collections.
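A minimal sketch of the shallow reflective comparison mentioned above, assuming Apache Commons Lang 3 on the classpath (sameByReflection is just an illustrative helper name):

import org.apache.commons.lang3.builder.{ReflectionToStringBuilder, ToStringStyle}

object EntityComparison {
  // Build the string form externally, so the entity keeps its own toString.
  // SHORT_PREFIX_STYLE leaves out the identity hash that would make two
  // distinct instances always compare unequal.
  def sameByReflection(a: AnyRef, b: AnyRef): Boolean =
    ReflectionToStringBuilder.toString(a, ToStringStyle.SHORT_PREFIX_STYLE) ==
      ReflectionToStringBuilder.toString(b, ToStringStyle.SHORT_PREFIX_STYLE)
}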
But I'm listening. Does anyone have a better solution?

Best ID datatype for GWT + JPA

I know that similar questions are around.
But for my case, I use GWT 2.4 + JPA 2.0 + MySQL:
What is the best data type for my table IDs?
I want to avoid any type conversions in my GWT project.
My priority is ease of use, not performance.
Do you advise me to use wrapper classes, i.e. Long vs. long?
A simple and straightforward choice is Long. Prefer the wrapper class so you can set the id to null before the object is inserted into the DB (see also "Always use primitive object wrappers for JPA @Id instead of primitive type?").
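A minimal sketch of an entity with a wrapper-typed id (shown in Scala, and Customer with its fields is hypothetical; the general point is only that the wrapper can hold null):

import javax.persistence.{Entity, GeneratedValue, GenerationType, Id}

@Entity
class Customer {
  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  var id: java.lang.Long = _ // stays null until the row is inserted

  var name: String = _
}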
If performance is not a high priority, you may consider using UUIDs instead: this makes it a lot easier to put objects into sets and maps before they are stored on the server side. For simplicity, you could use Strings to store the UUIDs (GWT doesn't support the UUID datatype), though a UUID-specific datatype would be a lot more efficient in the database.
Certainly the wrapper class is better than the primitive; using an object has many advantages. But in my opinion the best choice of type in this situation is String. long is undesirable in GWT development because JavaScript doesn't have the concept of a long, so it is recommended to avoid long where possible: it is emulated in GWT, which affects performance.

casbah mongodb more typesafe way to access object parameters

In Casbah, MongoDBObject has two methods, .getAs and .getAsOrElse, which return the relevant field's value as the type given in the type parameter.
val dbo: MongoDBObject = ...
dbo.getAs[String](param)
This must be using type casting, because we can ask for a Long field as a String by giving that as the type parameter, which might cause a class cast exception at runtime. Is there any more typesafe way to retrieve the value in its original type?
This should be possible, because the type information of the element should be available in getAs's output.
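For concreteness, this is the kind of usage I mean (field names are made up):

import com.mongodb.casbah.Imports._

object GetAsExample {
  val dbo: MongoDBObject = MongoDBObject("name" -> "alice", "age" -> 30L)

  // getAs wraps the lookup in an Option, so a missing field comes back as
  // None; the type parameter itself is effectively a cast, as noted above.
  val age: Option[Long] = dbo.getAs[Long]("age")
  val nickname: Option[String] = dbo.getAs[String]("nickname") // None
}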
Check out this excellent presentation on Salat by its author. What you're looking for is Salat's grater, which can convert to and from DBObject.
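A rough sketch of the grater round trip, following Salat's documented usage (Account is a made-up case class):

import com.novus.salat._
import com.novus.salat.global._

case class Account(name: String, balance: Long)

object GraterExample {
  // Case class -> DBObject and back, without spelling out field types by hand.
  val dbo = grater[Account].asDBObject(Account("alice", 42L))
  val back: Account = grater[Account].asObject(dbo)
}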
Disclaimer: I am biased, as I'm the author of Subset.
I built this small library, "Subset", exactly to be able to work effectively with a DBObject's fields (both scalar and sub-documents) in a type-safe manner. Look through the examples and see if it fits your needs.
The problem is that MongoDB can store multiple types in a single field, so I'm not sure what you mean by making this typesafe. There's no way to enforce it on the database side, so were you hoping there is a way to enforce it on the Casbah side? To be safest you could just call get("fieldName") and get back an Object, but that's hardly an improvement, in my opinion.
I've been happy using Salat + Casbah, and when my database record doesn't match my Salat case class, I get a runtime exception. I just know that I have to run migration scripts when I change the types in my model, or create a new model for the new types (multiple models can be stored in the same collection). At least the Salat grater/DAO methods make it less of a hassle (you don't have to specify types every time you access a variable).

Database-independent queries to the extreme in Java... or in general

Let's say I have an app that should ideally be able to use a relational database, object database, XML files, or whatever to persist its data. In the spirit of coding to interfaces instead of implementations, I have a generic DataStore interface that specifies a contract for all I/O involving the data store. This interface can be implemented by concrete classes such as RDBMSDataStore, OODBMSDataStore, XMLFileDataStore, and so on.
This works well as long as I keep the contents of the DataStore interface simple - i.e. getThis(), getThose(), saveThat(), updateThis(), etc. But as soon as I require more complicated queries, it breaks down. The XMLFileDataStore class obviously doesn't understand SQL, and the RDBMSDataStore class obviously doesn't understand XPath/XQuery. And OODBMSDataStore understands something entirely different depending on the OODBMS in use.
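A rough Scala sketch of the kind of simple contract I mean (the method names are illustrative):

trait DataStore[T, Id] {
  def getById(id: Id): Option[T]
  def getAll(): Seq[T]
  def save(entity: T): T
  def delete(id: Id): Unit
}

// RDBMSDataStore, OODBMSDataStore, XMLFileDataStore, ... each implement this
// with their own native query mechanism; the trouble starts as soon as callers
// need richer queries than the contract can express.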
I could adopt a language-independent object query language, write all my queries in that and then have the concrete classes translate them into their native language, but that's a huge task, if I want to be complete.
Are there standards or best practices for handling this kind of situation in Java? Unfortunately it seems like 99% of the world interprets "database independence" to mean "relational database independence" and ignores the object databases, XML databases, document databases, etc. entirely.
From the way I read the question, this sounds a lot like the semantics that Hibernate brings to the table for Java. It even has a mode for dealing with XML as the backing content store (using dom4j). The Hibernate API has a number of extension points that could allow the addition of an OODBMS model. Even if Hibernate turns out not to be the best solution for you (implementation-wise), I think it provides a good example of the kinds of patterns that can be used to solve the problems you describe.

How do I implement a collection in Scala 2.8?

In trying to write an API I'm struggling with Scala's collections in 2.8(.0-beta1).
Basically what I need is to write something that:
adds functionality to immutable sets of a certain type
where all methods like filter and map return a collection of the same type without having to override everything (which is why I went for 2.8 in the first place)
where all collections you gain through those methods are constructed with the same parameters the original collection had (similar to how SortedSet hands through an ordering via implicits)
which is still a trait in itself, independent of any set implementations.
Additionally I want to define a default implementation, for example based on a HashSet. The companion object of the trait might use this default implementation. I'm not sure yet if I need the full power of builder factories to map my collection type to other collection types.
I read the paper on the redesign of the collections API, but it seems things have changed a bit since then, and I'm missing some details in there. I've also dug through the collections source code, but I'm not sure it's very consistent yet.
Ideally what I'd like to see is either a hands-on tutorial that tells me step-by-step just the bits that I need or an extensive description of all the details so I can judge myself which bits I need. I liked the chapter on object equality in "Programming in Scala". :-)
But I appreciate any pointers to documentation or examples that help me understand the new collections design better.
I'd have a look at the implementation of collection.immutable.BitSet. It's a bit spread out, reusing things from collection.BitSetLike and collection.generic.BitSetFactory. But it does exactly what you specified: implement an immutable set of a certain element type that adds new functionality.
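A rough sketch of that pattern, using a hypothetical TagSet fixed to String elements (the real BitSet splits this across a Like trait and a factory, as mentioned above):

import scala.collection.SetLike
import scala.collection.generic.CanBuildFrom
import scala.collection.mutable.Builder

// An immutable set that keeps its own type through filter, ++, etc. via the
// inherited builder (driven by empty and +), plus a CanBuildFrom so map can
// rebuild a TagSet when the element type stays String.
class TagSet private (elems: Set[String])
    extends Set[String] with SetLike[String, TagSet] {

  override def empty: TagSet = new TagSet(Set.empty)
  def contains(elem: String): Boolean = elems.contains(elem)
  def iterator: Iterator[String] = elems.iterator
  def +(elem: String): TagSet = new TagSet(elems + elem)
  def -(elem: String): TagSet = new TagSet(elems - elem)
}

object TagSet {
  def apply(elems: String*): TagSet = new TagSet(elems.toSet)

  def newBuilder: Builder[String, TagSet] =
    Set.newBuilder[String].mapResult(new TagSet(_))

  implicit def canBuildFrom: CanBuildFrom[TagSet, String, TagSet] =
    new CanBuildFrom[TagSet, String, TagSet] {
      def apply(from: TagSet): Builder[String, TagSet] = newBuilder
      def apply(): Builder[String, TagSet] = newBuilder
    }
}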