Mongodb: list all values for an indexed field quickly - mongodb

In MongoDB I want to be able to rapidly list all values of an indexed field. For instance, let's say I have a large number of Foo documents:
public class Foo {
    @Id
    private ObjectId id;
    @Indexed
    private List<String> bars;
    @Indexed
    private List<String> bazs;
    ...
}
There may be repeats in bars and bazs, so iterating through every Foo and looking at its bars list would be inefficient: I would spend most of my time looking at repeats.
If I want to quickly list all 'bars' values without having to look at each Foo object, can I do that? Since the fields are indexed, there must be a table somewhere with all the index values listed in an easily iterated manner. However, I can't seem to find a MongoDB command to do this, or better yet a Morphia command, since I'm using Java to interface with Mongo.

You are looking for distinct, which should work for lists / arrays as well. MongoDB will use an index if one is available.
Unfortunately this feature isn't yet implemented in Morphia, but you can do the following with the Java driver:
DBCollection c = collection;
List bars = c.distinct("bars");
For a more complex example see the unit test for this feature.
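As a rough illustration of what distinct returns for an array field such as bars: each array element counts as its own value, and repeats collapse into a single entry. The sketch below only models that behaviour in plain Java, with no database involved. The class name DistinctSketch and the method distinctValues are hypothetical, and the first-seen ordering is an artifact of the sketch, not something the real command promises.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model of distinct("bars"); this is not driver code.
class DistinctSketch {

    // Each inner list stands for the "bars" array of one Foo document.
    static List<String> distinctValues(List<List<String>> barsPerDocument) {
        // A LinkedHashSet drops repeats and keeps first-seen order.
        Set<String> seen = new LinkedHashSet<>();
        for (List<String> bars : barsPerDocument) {
            seen.addAll(bars); // every array element is a candidate value
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
                List.of("a", "b"),
                List.of("b", "c"),
                List.of("a"));
        System.out.println(distinctValues(docs)); // prints [a, b, c]
    }
}
```

Unlike this loop, the real distinct command runs on the server, so the client never has to fetch and walk every Foo.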

Related

MongoDB with Spring Boot mapping order to List

I am currently using "spring-boot-starter-data-mongodb" for persisting documents to a collection in mongodb.
The document contains a List with nested objects like:
{
    foo: bar,
    foos: [
        {
            foo1: bar1,
            foo2: bar2
        },
        {
            foo1: bar4,
            foo2: bar3
        }
    ]
}
These documents are mapped as follows:
private String foo;
private List<Foo> foos;
Foo:
private String foo1;
private String foo2;
The business logic depends heavily on the order of the foos (the List elements).
The real questions are:
Does inserting a document preserve the order of the elements, so that the first item in the list will be the first in the JSON and so on?
Does querying preserve the order of the elements, so that if an element is the N-th member of the array in the DB, it will be the N-th element in the mapped object as well?
Currently it seems to be true but I need to make sure it is guaranteed.
Yes, AFAIK the order is guaranteed because MongoDB stores the document as it is. The array is stored as written, and there is no reason for its values to be rearranged when it is read back.
Also, there are a few related questions here on Stack Overflow, such as:
Do arrays stored in MongoDB keep their order?
Is the order of elements inside a MongoDB array guaranteed to be consistent?
Are arrays in MongoDB documents always kept in order?
So, given that MongoDB preserves the order of elements within an array, the documents returned when querying the collection through Spring Data MongoDB will be mapped to the corresponding objects in that same order.
The order of elements in the List of your object will therefore be the same as the order of the elements in the array in the MongoDB document.

Spring Data Mongo: fetch only a specific element of a list

This question has been asked a few times but not exactly how I intend it (or years have passed and I hope things have changed).
Using Spring Data Mongo's MongoRepository is it possible to fetch from the DB only a given nested element that matches some criteria?
For example, let's suppose I have the following Customer document:
public class Customer {
    @Id
    private String id;
    private String name;
    private String surname;
    private List<CreditCard> creditCards;
}
Is there a way to say "fetch ONLY the CreditCard with number XXXX?"
At the moment in my code, I find myself having to do this sort of "filtering" in memory because the query returns all the elements of the list, not only the one that matches my criteria.
I have found several posts saying that, to cover this need, CreditCard must be declared as a Document itself and referenced with @DBRef, but I've read in Mongo's documentation that this may significantly reduce query performance.
Also, the CreditCards are indeed assigned to a user so I do want them to be nested within that object (because then using Mongo Compass it is way easier to understand which CreditCards belong to which Customer)
Criteria c = Criteria.where("creditCards").elemMatch(Criteria.where("number").is("XXXX"));
mongoTemplate.find(new Query(c), Customer.class);
The find will fetch all the customers that have some CreditCard with number = XXXX.
I'm not sure this is what you want, since you wrote ONLY: each returned Customer still carries its full creditCards list.
Note that CreditCard has no @Document annotation here.

Indexing parallel arrays in Mongodb

I am starting to use MongoDB with C#, but have run into a slight issue.
I have a document with 2 embedded collections (of distinct types). I want to search on fields of both of these collections; however, I have discovered that if I try to index the searchable fields on the 2 collections I get "cannot index parallel arrays". Reading the MongoDB documentation on multikey indexes, I discovered that this is indeed a limitation.
My question is: what is the normal workaround for this issue? I can't really combine these collections, since they are pretty distinct. What pattern should I follow?
public class Capture
{
    [BsonId]
    public Guid Id { get; set; }
    ...Some other fields
    public IList<CustomerInformation> CustomerInformations { get; set; }
    public IList<VehicleLicenseDisk> VehicleLicenseDisks { get; set; }
}
Before talking about possible workarounds, I just want to highlight why MongoDB has chosen to enforce this restriction on indexing parallel arrays. When you index an array in MongoDB, it creates a multikey index with one key per array element. Therefore, if you create a compound index on two arrays, one with M distinct values and one with N distinct values, the index essentially needs M × N keys for that document. This is very bad: the key count grows multiplicatively, not linearly, in the number of distinct array elements. Consider the amount of work it takes to maintain an index like this when you add or remove array elements.
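To make the M × N growth concrete, here is a minimal sketch that enumerates the keys a compound index over two arrays would need for a single document. It is plain Java with no MongoDB involved; the class ParallelArrayKeys, the method compoundKeys, and the "c|d" key format are hypothetical and only stand in for real index entries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the keys a compound index over two arrays would need.
class ParallelArrayKeys {

    // One key per (customer, disk) pair: the cross product of both arrays.
    static List<String> compoundKeys(List<String> customers, List<String> disks) {
        List<String> keys = new ArrayList<>();
        for (String c : customers) {
            for (String d : disks) {
                keys.add(c + "|" + d);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        List<String> customers = List.of("c1", "c2", "c3");   // M = 3
        List<String> disks = List.of("d1", "d2", "d3", "d4"); // N = 4
        System.out.println(compoundKeys(customers, disks).size()); // prints 12
    }
}
```

Adding a fourth customer here would add four new keys, one per disk, which is the maintenance cost described above.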
OK, justification aside, to work around this restriction it will be helpful to use the current MongoDB version (2.6), which supports index intersection. You can create one index on CustomerInformations and another on VehicleLicenseDisks, and MongoDB can then use both indexes and intersect them to serve queries that have restrictions on both.
If you are, for whatever reason, stuck with MongoDB < 2.6, then your options are either to consider redesigning the schema or to depend on indexes that use at most one of the array fields.
It will help if you think about MongoDB schema concerns in terms of MongoDB, not in terms of programming language objects. Do the arrays really need to be arrays? Can they be replaced with concrete field names? In your case, why is CustomerInformation an array? What does a Capture object really represent? You might have to split out, as an example, CustomerInformation into a separate collection where each record contains a link/reference back to the Capture document it belongs to. I don't know the details of what you are trying to model, but whatever it is, forget about object-oriented programming and put on a MongoDB hat. Objects will come later.

Is it possible to store multiple types of Objects into 1 mongodb collection?

I am using the document-oriented database MongoDB and the Object Document Mapper (ODM) Morphia.
Let's say we have 3 different classes: Object, Category and Action.
These objects are stored in the collections objects, categories and actions.
Category and Action are referenced from Object.
@Entity("objects")
public class Object {
    @Id
    @Property("id")
    private ObjectId id;
    @Reference
    private Category category;
    private Action action;
    ...
}
@Entity("categories")
public class Category {
    @Id
    public String categoryTLA;
    public String categoryName;
    ...
}
@Entity("actions")
public class Action implements BaseEntity {
    @Id
    public String action;
    public int accesLevel;
    ...
}
With the current implementation, the documents are stored like this:
Mongo (server/location)
  store (database)
    objects (collection)
      object (document)
      object
      object
    categories
      category
      category
      category
    actions
      action
      action
      action
Is it possible to store 2 different Objects, in this case Category and Action, in one collection, as shown in the next example? Both with their own identification!
Mongo
  store
    objects
      object
      object
      object
    settings
      category
      category
      category
      action
      action
      action
Absolutely, it's possible to store multiple types of documents in a single collection. In fact, that's one of the strengths of a document-oriented database like Mongo. However, you may not want to combine them without considering some issues (positive and negative):
You can't do cross-collection or document SQL-like JOINs. So, having documents in one or more collection won't change that behavior.
You can use aggregation only in a single collection, so you may be able to perform some aggregation style queries more conveniently if a collection has multiple document types rather than split across collections (both the aggregation framework and Map-Reduce operate only on a single collection at a time).
In order to deserialize a document into an object in Morphia, you'll need to know what type a given document represents. You may need to add a field to the document indicating the type, unless there are other ways that can safely represent the type of document so that the deserialization process works correctly. An Action can't be a Category for example. If you did the equivalent of FindAll and there were multiple document types, unless the deserializer can evaluate the document structure before deserialization starts, your code may not work as desired.
It's likely that you'll need to index various properties of your documents/Objects. If you index based on document types (say Action has an index that is distinct from Category's), all documents inserted into the collection containing both will run through the indexer, for all indexes defined on the collection. That can impact performance depending on the nature of the indexes. It means that every document in the collection will be indexed regardless of whether the index makes sense for it. This is often a compelling reason not to combine multiple document types that do not share common indexing traits.
Unless you need to do specific types of queries that require all documents to be in a common collection, I'd likely leave them in separate collections, especially if you plan on using custom indexes for various document types/schemas.
Yes, but you will probably need to add a field such as "documentType" to every document so that the different document types can be told apart.
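The "documentType" idea from the answers above can be sketched roughly as follows. This is plain Java over Map-based documents, not Morphia code: the class MixedCollectionSketch, the decode method, and the type values "category" and "action" are all hypothetical, and the field names are simply taken from the question's classes.

```java
import java.util.Map;

// Hypothetical sketch: one "settings" collection holding two document shapes.
class MixedCollectionSketch {

    static final class Category {
        final String categoryTLA;
        final String categoryName;
        Category(String categoryTLA, String categoryName) {
            this.categoryTLA = categoryTLA;
            this.categoryName = categoryName;
        }
    }

    static final class Action {
        final String action;
        final int accesLevel; // spelling kept from the question's class
        Action(String action, int accesLevel) {
            this.action = action;
            this.accesLevel = accesLevel;
        }
    }

    // Pick the target class from the discriminator before reading other fields.
    static Object decode(Map<String, Object> doc) {
        String type = (String) doc.get("documentType");
        if ("category".equals(type)) {
            return new Category((String) doc.get("_id"), (String) doc.get("categoryName"));
        }
        if ("action".equals(type)) {
            return new Action((String) doc.get("_id"), (Integer) doc.get("accesLevel"));
        }
        // An Action can't be a Category: fail loudly on anything unrecognised.
        throw new IllegalArgumentException("Unknown documentType: " + type);
    }

    public static void main(String[] args) {
        Object decoded = decode(Map.of(
                "documentType", "action", "_id", "delete", "accesLevel", 2));
        System.out.println(decoded instanceof Action); // prints true
    }
}
```

Failing on an unknown type, instead of guessing from the document's structure, keeps a "find all" over the mixed collection from silently producing the wrong objects.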

How to retrieve all objects in a Mongodb collection including the ids?

I'm using Casbah and Salat to create my own MongoDB DAO and am implementing a getAll method like this:
val dao: SalatDAO[T, ObjectId]
def getAll(): List[T] = dao.find(ref = MongoDBObject()).toList
What I want to know is:
Is there a better way to retrieve all objects?
When I iterate through the objects, I can't find the object's _id. Is it excluded? How do I include it in the list?
1. The ModelCompanion trait provides a def findAll(): SalatMongoCursor[ObjectType] = dao.find(MongoDBObject.empty) method. You will have to make a dedicated request for every collection your database has.
If you iterate over the objects returned, it may be better to iterate directly over the SalatMongoCursor[T] returned by dao.find rather than doing two iterations (one for the toList from the Iterator trait, then another over your List[T]).
2. Salat maps the _id key to your class's id field. If you define a class with an id: ObjectId field, that field is mapped to the Mongo _id key.
You can change this behaviour using the @Key annotation, as pointed out in the Salat documentation.
I implemented something like:
MyDAO.ids(MongoDBObject("_id" -> MongoDBObject("$exists" -> true)))
This fetches all the ids, but given the wide range of what you might be doing, it is probably not the best solution for all situations. Right now, I'm building a small system with 5 records of data, and using this to help understand how MongoDB works.
If this were a production database with 1,000,000 entries, then this (or any getAll query) would be a poor choice. Instead, consider writing a targeted query that goes after the real results you seek.