MongoDB: Use string instead of BsonObjectId

I have a REST service for my game and decided to give MongoDB a try - seems pretty straightforward except for the "_id" field requirement.
The problem here is that my client app is using the REST services and has no knowledge of "mongodb", its drivers or dependencies - nor should it. To decouple the client from the server side (REST service provider) I need to get around the fact that MongoDB appears to require a "_id" field of type BsonObjectId.
Currently I'm using a lightweight DAO layer, so instead of having:
using System;
using MongoDB.Bson;
public class Item {
    private BsonObjectId _id;
    private string name;
}
I am using a DAO to translate this to something "mongodb agnostic":
using System;
public class ItemDAO {
    private string _id;
    private string name;
}
Ideally it would be nice to be rid of BsonObjectId entirely - is there some annotation/custom serialization handler that can be used or some way that I'm able to use a string instead of BsonObjectId?
Failing that, any way to get objects wrapped by MongoDB so they are decorated with the _id which I can inject back into the row as a string?
The ideal result would be to have no DAO class at all just "Item" and have Item using a string for _id (or something that does not require mongodb dependencies to bleed into client implementation).

Your documents must have an _id field, but it doesn't have to be an ObjectId. The only requirement is that it is unique within the collection.
MongoDB will generate an ObjectId for you if you don't supply an _id field when saving a new document, but that is just a helper function.

If you don't want to "pollute" your model classes, you could register an appropriate Id generator in your data access code.
BsonSerializer.RegisterIdGenerator(typeof(string), new StringObjectIdGenerator());
This way you will have a string field in your model, but underneath it will be an ObjectId, which is kind of nice, i.e. you can see (approximately) when the records were added.
If you decide however that in your REST service you will accept Ids from clients (via PUT) then ObjectId is obviously not the way to go.
Have a look at this article since it describes how to setup serialization options etc.
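To illustrate the remark about seeing roughly when a record was added: the first 4 bytes of an ObjectId are a timestamp in seconds since the Unix epoch, so the creation time can be read from the 24-character hex string alone. The following is only a sketch in plain Java with no MongoDB dependency; the class and method names are made up for illustration.

```java
import java.time.Instant;

// Hypothetical helper (not part of any MongoDB driver): reads the creation
// time embedded in the hex form of an ObjectId.
final class ObjectIdTimestamp {
    private ObjectIdTimestamp() {}

    static Instant createdAt(String hexId) {
        if (hexId == null || !hexId.matches("[0-9a-fA-F]{24}")) {
            throw new IllegalArgumentException("expected 24 hex characters");
        }
        // The first 8 hex characters are the 4-byte big-endian timestamp (seconds).
        long epochSeconds = Long.parseLong(hexId.substring(0, 8), 16);
        return Instant.ofEpochSecond(epochSeconds);
    }
}
```

The value has one-second resolution and reflects the clock of whichever machine generated the id, so it is best treated as approximate.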

How do you query for the objects? If you don't want / need the _id field in the client, you can use a projection to exclude the _id field from the result.
Also, be aware that generating your own string-based _ids can have a severe impact on database size. The ObjectId seems to be a pretty efficient structure. I have experimented with using strings for _id, to avoid having an index on a field I would never use - but in my case the cost in database size made it infeasible. It depends on your database size and the rest of the contents, of course.

It's commonplace to represent BSON ObjectIds as strings; by default, Mongo drivers will generate 96-bit (12-byte) IDs, which you can then represent as 24 hex characters. Most client libraries have facilities for creating ObjectIds out of strings and casting them to strings.
You would have your external interfaces treat _id as a string, and then when your Mongo-aware DB layer receives an _id as a string, it would just internally convert it with ObjectId.from_string(_id) or whatnot. When writing results, you would just cast the OID to a string.
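A rough sketch of that boundary conversion, written in plain Java so it runs without a driver: the 12 raw bytes stand in for the driver's ObjectId, and IdCodec, fromHex and toHex are hypothetical names. In real code you would call whatever parse and to-string methods your driver provides instead.

```java
// Hypothetical boundary helper: the external interface only sees 24-character
// hex strings, while the DB layer works with the 12 raw bytes of the id.
final class IdCodec {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    private IdCodec() {}

    // String from the client -> 12-byte id used internally.
    static byte[] fromHex(String id) {
        if (id == null || !id.matches("[0-9a-fA-F]{24}")) {
            throw new IllegalArgumentException("expected 24 hex characters");
        }
        byte[] raw = new byte[12];
        for (int i = 0; i < raw.length; i++) {
            raw[i] = (byte) Integer.parseInt(id.substring(2 * i, 2 * i + 2), 16);
        }
        return raw;
    }

    // 12-byte id from the database -> lowercase hex string for the client.
    static String toHex(byte[] raw) {
        if (raw == null || raw.length != 12) {
            throw new IllegalArgumentException("expected 12 bytes");
        }
        StringBuilder sb = new StringBuilder(24);
        for (byte b : raw) {
            sb.append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
        }
        return sb.toString();
    }
}
```

Validating the string at the boundary also means a malformed id from a client fails fast instead of reaching the database layer.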

Using ObjectIds as data type for your primary keys makes a lot of sense for various reasons. Generating good, non-sequential, monotonic IDs with low collision probability isn't trivial, and re-inventing the wheel or essentially rewriting the feature isn't worth the trouble.
Data mapping should be done in controllers, and they should interact with the outside using DTOs. In other words, your REST endpoints (Controllers/Modules) should know DTOs while your database uses your database models.
The DTOs will look very similar to your models, but they might have a few less fields (neat if there's internal data you don't want exposed via the API) and they use strings where the models use ObjectIds.
To avoid stupid copying code, use something like AutoMapper.
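As a hand-written illustration of the model/DTO split described above (the kind of code a mapping library would otherwise generate), here is a small Java sketch. Every type in it is hypothetical: InternalId is only a stand-in for the driver's ObjectId so the example compiles without MongoDB on the classpath.

```java
// Stand-in for the driver's id type (assumption, not a real driver class).
final class InternalId {
    private final String hex;
    InternalId(String hex) { this.hex = hex; }
    String toHexString() { return hex; }
}

// Database model: uses the database id type and carries internal-only data.
final class ItemModel {
    final InternalId id;
    final String name;
    final String internalNote; // not meant to be exposed through the API

    ItemModel(InternalId id, String name, String internalNote) {
        this.id = id;
        this.name = name;
        this.internalNote = internalNote;
    }
}

// DTO seen by REST clients: plain strings, fewer fields.
final class ItemDto {
    final String id;
    final String name;

    ItemDto(String id, String name) {
        this.id = id;
        this.name = name;
    }
}

// The controller layer would call this when writing a response.
final class ItemMapper {
    private ItemMapper() {}

    static ItemDto toDto(ItemModel model) {
        return new ItemDto(model.id.toHexString(), model.name);
    }
}
```

The client only ever sees ItemDto, so nothing database-specific leaks into the client's dependencies.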

Related

Can we and should we create MongoID for nested objects inside document?

We have a collection of documents, each document has an array of objects
{
"_id":_MONGO_ID_,
"property":"value",
"list":[{...}, {...}, ...]
}
But each object of the list also needs a unique id for the needs of our app.
{"id":213456789, "somestuff":"somevlue" ...}
We do not wish to create a collection for these objects because they are small and would rather store them straight into the document.
Now the question. Right now we generate a unique id based on time which looks like the MongoID. We need an id to make it easier to target each object. Would it be a good idea to generate a MongoID for each object of the list instead? Any pros and cons?
In general, it is wise to separate DB-specific resources from business/data domain resources. You always want to be able to manipulate the data completely independent of the host database and the drivers associated therewith. ObjectId() is relatively lightweight and in fact a BSON type, separate from the MongoDB core objects, but for true arms-length separation and an easier physical implementation, I would recommend a simple string instead. If you don't have extreme space/scale issues, UUIDv4 is a good way to get a unique string.
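For what the UUIDv4 suggestion might look like in practice, here is a minimal Java sketch using only the standard library. ListItemIds and newId are illustrative names; the returned string would be stored as the "id" of each object in the list.

```java
import java.util.UUID;

// Hypothetical helper: produces database-independent ids for embedded objects.
final class ListItemIds {
    private ListItemIds() {}

    // UUID.randomUUID() returns a random (version 4) UUID; its string form
    // is 36 characters, e.g. xxxxxxxx-xxxx-4xxx-xxxx-xxxxxxxxxxxx.
    static String newId() {
        return UUID.randomUUID().toString();
    }
}
```

Compared with the 24 hex characters of an ObjectId, the 36-character string costs a little more space per element, which is the space/scale trade-off mentioned above.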

Projecting multiple fields to a POJO

Is there a way in hibernate-search 6 to project multiple fields and map them directly to a POJO, or should I handle it myself? I'm not sure that I understand the composite method described in the documentation. For example I can do something like this:
SearchResult<List<?>> result = searchSession.search(indicies)
    .select(f -> f.composite(f.field("field1"), f.field("field2"), f.field("field3"), f.field("field4")))
    .where(SearchPredicateFactory::matchAll)
    .fetch(20);
And then I can manually map the returned List of fields to a POJO. But is there a more fancy way to do that without the need to manually loop through the list of fields and set them to the POJO instance?
At the moment projecting to a POJO is only possible for fairly simple POJOs, with up to three fields, using the syntax shown in the documentation. For more fields than that, you have to go through a List.
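Going through a List could look roughly like the sketch below. It is plain Java and does not touch the Hibernate Search API: FourFieldHit and HitMapper are hypothetical names, and it assumes all four projected fields are strings returned in the order they were requested.

```java
import java.util.List;

// Hypothetical POJO for the four projected fields (all assumed to be strings).
final class FourFieldHit {
    final String field1;
    final String field2;
    final String field3;
    final String field4;

    FourFieldHit(String field1, String field2, String field3, String field4) {
        this.field1 = field1;
        this.field2 = field2;
        this.field3 = field3;
        this.field4 = field4;
    }
}

final class HitMapper {
    private HitMapper() {}

    // Converts one projected hit (a List with one value per field) into the POJO.
    static FourFieldHit fromProjection(List<?> values) {
        if (values == null || values.size() != 4) {
            throw new IllegalArgumentException("expected exactly 4 projected values");
        }
        return new FourFieldHit(
                (String) values.get(0),
                (String) values.get(1),
                (String) values.get(2),
                (String) values.get(3));
    }
}
```

Each List among the returned hits would be passed through fromProjection once, which keeps the positional casts in a single place.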
If you're using the Elasticsearch backend, you can theoretically retrieve the document as a JsonObject and then use Gson to map it to a POJO.
There are plans to offer more fancy solutions, but we're not there yet.

dynamic value class - schema not known until runtime

All the examples for storing multi-field data require specifying a value class. However, I do not know the fields or their types until run-time. I would like to be able to create a region with a dynamic set of field values. For example,
put --key=101 --value=('firstname':'James','lastname':'Gosling')
--region=/region1 --value-class=data.Person
However, the data.Person class does not exist.
Furthermore, I would like to be able to query on the firstname field (or any other field of the value).
How can I do this with Geode?
You don't need a domain class to store data in Geode. You can store JSON natively in Geode. OQL queries make no distinction between PDX serialized objects and JSON values. In fact, when you store a JSON value in Geode, beneath the covers it is converted into a PdxInstance. You can read more about PDX Serialization in the documentation.
You can use PdxInstance.
Example using Java:
region.put(101, cache.createPdxInstanceFactory("data.Person")
    .writeString("firstname", "James")
    .writeString("lastname", "Gosling")
    .create());

Intercepting JPA query to calculate the key fields

I am quite new to JPA. I have a particular repository that uses the keys that have parts that are set by the caller and some values that are automatically calculated using these values. There is a need for this :)
Since the keys and entities are simple Java classes, it appears to me that the code that modifies the key (or substitutes it with an internal one with additional values) needs to go in the repository implementation. However, I do not think that copying the code from SimpleJpaRepository to my custom repositories is a good idea... I think that something should be possible with the entity manager. Basically what I need is a proxy that gets called every time something like find() or delete() is called, takes the entity, updates its key, and passes the call over to the real repository implementation.
Could someone point me to the right direction or an example that does something similar?
Thanks!
In JPA, you have a bunch of events for this, just choose the one that suits you best. It looks like you are looking for @PrePersist.
http://www.objectdb.com/api/java/jpa/annotations/callback
That said, if these fields are calculated only from the data in the other fields, storing them goes against database normalization. A more sensible approach would be to make the calculated field @Transient and provide only getters that calculate the values from the persistent fields.
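The getter-only approach might look like the following sketch. It is plain Java so that it compiles without a JPA provider; PersonName and its fields are invented for illustration, and in a real entity the derived accessor is where @Transient would go.

```java
// Hypothetical class: firstName and lastName are the stored (persistent)
// fields, displayName is derived from them and never stored.
final class PersonName {
    private final String firstName;
    private final String lastName;

    PersonName(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    String getFirstName() { return firstName; }

    String getLastName() { return lastName; }

    // Derived value: recomputed on every call, so it cannot go stale.
    // With JPA, this accessor would be marked @Transient and have no setter.
    String getDisplayName() {
        return firstName + " " + lastName;
    }
}
```

Because there is no stored copy, the derived value can never disagree with the fields it is computed from.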

Preventing duplicates in MongoDB using Spring Data (Spring Roo)

I have been trying to get my head wrapped around MongoDB, as it's used by Spring, so I decided to start a little project in Spring Roo.
In my project, I am storing my User Login data in MongoDB. The trouble is that the registration process, which creates a new User object and stores it in MongoDB, has a tendency to create duplicates despite the fact I have @Unique on the loginId field.
Now, I know part of the problem is that I am thinking about things from a JPA/RDBMS perspective, and MongoDB is not a relational DB and thus has a different set of parameters in which to operate, but I am having trouble finding guidance in anything more than VERY simple sample code.
First, what Spring/other annotations are available, and more importantly, commonly used when dealing with MongoDB from a Spring world? Second, when dealing with documents that need to be "uniqued", how does one typically do this? Do you first search on the unique field to ensure it's not already there, then do the insert? Third, in JPA-land, I could use the annotations @PrePersist and @PreUpdate to do last-minute data manipulation, like MD5-hashing passwords that have been updated or adding/updating a "Last Modified" date just prior to storing. I know these are JPA-isms, but can I still use those, and if not, is there an alternative for use with Spring Data/MongoDB?
I ended up using the @Id annotation on my entities, which indicates which field is used as the id field. As long as the field is unique, writing subsequent updates will properly replace the existing entity instead of adding a new one.
I ended up creating an additional method that checks whether a record already exists with the same value as the one being entered.
If it exists, I return a failure stating that a duplicate value exists. Otherwise it saves the newly entered value.