Morphia/Mongo querying on a nested tree structure

I have a Java/mongo object Node that can contain another Node, etc.
So my structure in Mongo looks like:
Document->Node->Node->...
A Node has a name attribute, and I want to find all Documents that have a node (including any nested Nodes) that contains a certain name.
I was using dot notation to do something like:
query.field("document.node.name").equal(name)
but that only works if the parent node has a matching name. What I need is some kind of wildcard to search for any name (document.node.node....name etc.) that is in a Node object.
Thanks for the help!

There is no wildcard search in MongoDB.
You'd need to store the Nodes somehow flattened to perform that query. You could for example store the hierarchy (the parent chain) in each Node so that you could recreate the hierarchy with your client application code.
The most commonly used structures are very well documented here.
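As a rough illustration of the flattening idea, here is a minimal Python sketch that walks a nested document and records every node name together with its parent chain. The `node` and `name` fields follow the question; the `nodeNames` and `ancestors` fields and both helper functions are assumptions made up for this example, not part of Morphia or MongoDB.

```python
def flatten_nodes(document):
    """Return one entry per node: its name plus the chain of parent names."""
    flat = []
    ancestors = []
    node = document.get("node")
    while node is not None:
        flat.append({"name": node["name"], "ancestors": list(ancestors)})
        ancestors.append(node["name"])
        node = node.get("node")
    return flat


def with_node_names(document):
    """Copy the document and add a flat `nodeNames` array (hypothetical field)."""
    enriched = dict(document)
    enriched["nodeNames"] = [entry["name"] for entry in flatten_nodes(document)]
    return enriched


doc = {"node": {"name": "a", "node": {"name": "b", "node": {"name": "c"}}}}
flat = flatten_nodes(doc)
stored = with_node_names(doc)

# Filter document for the flat array: an equality match on an array field
# matches when any element equals the value, whatever the original depth.
name_filter = {"nodeNames": "b"}
```

If documents were stored with such a flat array, the query from the question could presumably target `nodeNames` instead of a fixed dot path.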

How to model "index" list vs "show" details in GraphQL?

My data model has two resources: Folder and Messages. Each message belongs to a folder. Sometimes I'll want to get a list of folders (including some fields for each folder). Sometimes I'll want to get the details of a particular folder (including some fields and messages for that folder).
In a Rails/RESTful system, this would correspond to the index and show actions on the Folder resource; the latter would receive the id parameter specifying the desired folder. What would this schema look like in "idiomatic" GraphQL?
One approach might be to have one field for each action:
type Query {
folders: [Folder]
folder(id: String!): Folder
}
There's some duplication here, which seems messy and makes it harder for a client to introspect and understand the schema.
Perhaps the duplication can be removed with a nullable argument:
type Query {
folder(id: String): [Folder]
}
If an id is passed, just the details of that Folder will be returned (as a one-item array). If id is nil, then it'll get the details for all folders. This overloading seems to add some hidden complexity.
Which approach is "better practice"? Is there a better way to model this situation?
TLDR: Fields are cheap, use them.
I'd suggest the first approach. In the same way a REST API can get hopelessly muddled with flags to trigger different behaviors, so can a field with various arguments. By making two different fields, you can also let the type system give stronger client guarantees:
type Query {
folders: [Folder!]!
folder(id: String!): Folder
}
In this case, you'll always get some sort of list back for folders, and it won't contain any nulls. Emptiness is just the empty list. The API documents itself, which may not be the case if you try to mash an increasing number of optional arguments into one field.
Also, if you need to paginate folders, you'll need pagination-specific arguments for that endpoint, and perhaps the intermediate structure of the connection pattern.
I use the second approach. That way, you can add more arguments to the same endpoint later on. Maybe a folder has a path, or you'd like to filter on creation date.

Is it possible to list objects in Google Cloud Storage by metadata?

I have read this article, https://cloud.google.com/storage/docs/json_api/v1/objects/list, and I didn't find a way to do it.
I have a bucket with thousands of objects (files). I want to list only the objects that have specific metadata.
Do you know how to achieve that?
There's no API support for server-side filtering of listing results by metadata values. You would need to list all the objects and then filter on the client side. Another option, if it's possible to rename your objects, would be to construct your object names so that the metadata values you want to filter on are built into the beginning of the object names. You could then use a prefix filter on the listing request.
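The two options can be sketched in plain Python. This is only an illustration under stated assumptions: `ListedObject` is a hypothetical stand-in for one entry of a listing result, and the object names and metadata values are invented. No real Cloud Storage call is made here.

```python
from dataclasses import dataclass, field


@dataclass
class ListedObject:
    """Stand-in for one entry of a listing result (hypothetical type)."""
    name: str
    metadata: dict = field(default_factory=dict)


def filter_by_metadata(objects, key, value):
    """Client-side filter: keep objects whose metadata has key == value."""
    return [o for o in objects if (o.metadata or {}).get(key) == value]


def filter_by_prefix(objects, prefix):
    """Mimics what a prefix filter on the listing request would return."""
    return [o for o in objects if o.name.startswith(prefix)]


# Invented sample listing; the metadata value is also encoded in the name.
listing = [
    ListedObject("invoice/2020/a.pdf", {"type": "invoice"}),
    ListedObject("report/2020/b.pdf", {"type": "report"}),
    ListedObject("invoice/2021/c.pdf", {"type": "invoice"}),
]

invoices_by_metadata = [o.name for o in filter_by_metadata(listing, "type", "invoice")]
invoices_by_prefix = [o.name for o in filter_by_prefix(listing, "invoice/")]
```

With the naming scheme in place both approaches select the same objects, but only the prefix variant would let the server do the narrowing.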

Extracting information from 2 documents using mongodb driver

I have three different collections.
The first collection is User (userId, name, address, etc.).
The second collection is service (serviceId, name, title).
The third collection is service2User (serviceId and recipientUserId).
(I know I could use an array inside service instead of service2User;
it is done this way because serviceRegister2User contains many more than 2 fields and can be very big.)
I need to find the users who do not have a given service (e.g. serviceId=10).
(The solution can use either LINQ or the C# Mongo driver directly.)
To the best of my understanding, this is a two-step process.
First: I need to search the serviceRegister2User collection and find every recipientUserId that already has serviceId=10.
Second: I need to find all users that are not among the users found in the first query.
Those are the users who did not register for serviceId=10.
The collection found after the second step is the wanted result.
Can someone tell me how to do it both ways?
- LINQ or directly through the C# Mongo driver
If it is done with the LINQ driver, it needs to return a MongoCollection.
Thank you.
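The two steps described above can be sketched in Python over in-memory rows, purely to show the logic. The field names come from the question; the function names and sample data are assumptions, and this is not the C# driver or LINQ code the question asks for. The second function builds a filter document using MongoDB's `$nin` operator, which is the shape a driver query for step two could take.

```python
def registered_user_ids(service2user_rows, service_id):
    """Step 1: collect every recipientUserId already registered to the service."""
    return {
        row["recipientUserId"]
        for row in service2user_rows
        if row["serviceId"] == service_id
    }


def not_registered_filter(service2user_rows, service_id):
    """Step 2 as a MongoDB filter document: users whose userId is not in the set."""
    ids = sorted(registered_user_ids(service2user_rows, service_id))
    return {"userId": {"$nin": ids}}


def users_without_service(users, service2user_rows, service_id):
    """Both steps applied in memory, for illustration only."""
    registered = registered_user_ids(service2user_rows, service_id)
    return [u for u in users if u["userId"] not in registered]


# Invented sample data.
users = [{"userId": 1}, {"userId": 2}, {"userId": 3}]
links = [
    {"serviceId": 10, "recipientUserId": 1},
    {"serviceId": 11, "recipientUserId": 2},
    {"serviceId": 10, "recipientUserId": 3},
]

result = users_without_service(users, links, 10)
query = not_registered_filter(links, 10)
```

Note that the list passed to `$nin` grows with the number of registered users, so this approach may become unwieldy for very large link collections.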

Is it a bad practice to use the "." symbol in Mongodb collection name?

I have quite a big web application. The application is split into multiple modules. Each module can create multiple collections in the MongoDB database.
Since each module can create collections, there is a possibility of collisions between them, so I'm currently trying to "namespace" my collections in an elegant way.
Here is an example of what I would like to do:
Module1 creates these collections:
module1.items
module1.employees
Module2 creates these collections:
module2.items // Avoid collision with module1
module2.animals
Here is an example of what I would like to avoid:
Module1 creates these collections:
module1items
module1employees
Module2 creates these collections:
module2items // Avoid collision with module1
module2animals
Now I wonder if it is bad practice to use "." in the collection name. Usually, the "." is used to separate the database name from the collection name, as in db.mydatabase.mycollection, so I am concerned about possible bugs I might encounter if I use "." to namespace my collection names.
[EDIT]
Here is a quote I found on the mongodb website:
For an example acme.users namespace, acme is the database name and users is the collection name. Period characters can occur in collection names, so that acme.user.history is a valid namespace, with acme as the database name, and user.history as the collection name.
Reference
Adding that to Stephane Godbillon's answer (mentioning gridFS), I now feel quite safe about using that naming convention. Now I just hope that the ODM I use (mongoose) will not cause any problem :).
Yes you can do this without any problem. In fact even GridFS works with collections named this way (fs.files and fs.chunks).
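To make the namespace rule from the quoted documentation concrete, here is a small Python sketch. Both helper functions are hypothetical names invented for this example. The key point is that only the first period separates the database name from the collection name, so any further periods belong to the collection name.

```python
def collection_name(module, name):
    """Build a module-scoped collection name such as 'module1.items'."""
    return f"{module}.{name}"


def split_namespace(namespace):
    """Split 'db.collection' on the first period only."""
    database, _, collection = namespace.partition(".")
    return database, collection


items_1 = collection_name("module1", "items")
items_2 = collection_name("module2", "items")
parts = split_namespace("acme.user.history")
```

Here `items_1` and `items_2` differ, so the two modules no longer collide, and `parts` keeps `user.history` intact as the collection name.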

What are naming conventions for MongoDB?

Is there a set of preferred naming conventions for MongoDB entities such as databases, collections, field names?
I was thinking along these lines:
Databases: consist of the purpose (word in singular) and end with “db” – all lower case: imagedb, resumedb, memberdb, etc.
Collections: plural in lower case: images, resumes, etc.
Document fields: lowerCamelCase, e.g. memberFirstName, fileName, etc
Keep'em short: Optimizing Storage of Small Objects, SERVER-863. Silly but true.
I guess pretty much the same rules that apply to relational databases should apply here. And after so many decades there is still no agreement whether RDBMS tables should be named singular or plural...
MongoDB speaks JavaScript, so utilize JS naming conventions of camelCase.
The official MongoDB documentation mentions that you may use underscores, and the built-in identifier is named _id (though this may be to indicate that _id is intended to be private, internal, and never displayed or edited).
DATABASE
camelCase
append DB to the end of the name
make singular (collections are plural)
MongoDB states a nice example:
To select a database to use, in the mongo shell, issue the use <db> statement, as in the following example:
use myDB
use myNewDB
Content from: https://docs.mongodb.com/manual/core/databases-and-collections/#databases
COLLECTIONS
Lowercase names: avoids case sensitivity issues, MongoDB collection names are case sensitive.
Plural: more obvious to label a collection of something as the plural, e.g. "files" rather than "file"
No word separators: Avoids issues where different people (incorrectly) separate words (username <-> user_name, first_name <-> firstname). This one is up for debate according to a few people around here, but provided the argument is isolated to collection names I don't think it should be ;) If you find yourself improving the readability of your collection name by adding underscores or camelCasing, your collection name is probably too long or should use periods as appropriate, which is the standard for collection categorization.
Dot notation for higher detail collections: Gives some indication of how collections are related. For example, you can be reasonably sure you could delete "users.pagevisits" if you deleted "users", provided the people that designed the schema did a good job.
Content from: https://web.archive.org/web/20190313012313/http://www.tutespace.com/2016/03/schema-design-and-naming-conventions-in.html
For collections I'm following these suggested patterns until I find official MongoDB documentation.
Even if no convention is specified about this, manual references are consistently named after the referenced collection in the Mongo documentation, for one-to-one relations. The name always follows the structure <document>_id.
For example, in a dogs collection, a document would have manual references to external documents named like this:
{
  name: 'fido',
  owner_id: '5358e4249611f4a65e3068ab',
  race_id: '5358ee549611f4a65e3068ac',
  colour: 'yellow',
  ...
}
This follows the Mongo convention of naming _id the identifier for every document.
Naming convention for collection
When naming a collection, a few precautions should be taken:
An empty string (“”) is not a valid collection name.
A collection name should not contain the null character, because this marks the end of a collection name.
A collection name should not start with the prefix “system.”, as this is reserved for internal collections.
It is best to avoid the character “$” in a collection name, as various database drivers do not support “$” in collection names.
Things to keep in mind when creating a database name are:
An empty string (“”) is not a valid database name.
A database name cannot be more than 64 bytes.
Database names are case-sensitive, even on non-case-sensitive file systems. Thus it is good to keep names in lower case.
A database name cannot contain any of these characters: /, \, ., ", *, <, >, :, |, ?, $. It also cannot contain a single space or the null character.
For more information, please check the link below: http://www.tutorial-points.com/2016/03/schema-design-and-naming-conventions-in.html
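The collection and database name rules listed above could be checked with a small Python sketch like the following. The function names and messages are invented for illustration, and nothing here is a MongoDB API. Treating the backslash as forbidden is an assumption based on MongoDB's commonly cited restrictions, and the lower-case check is only the recommendation from the list, not a hard rule.

```python
# Characters the list above rules out for database names, plus space and null.
# The backslash is an assumption (see the note above this block).
FORBIDDEN_DB_CHARS = set('/\\."*<>:|?$ \0')


def collection_name_problems(name):
    """Return a list of rule violations for a collection name (empty if none)."""
    problems = []
    if name == "":
        problems.append("empty name")
    if "\0" in name:
        problems.append("contains null character")
    if name.startswith("system."):
        problems.append("uses reserved 'system.' prefix")
    if "$" in name:
        problems.append("contains '$'")
    return problems


def database_name_problems(name):
    """Return a list of rule violations for a database name (empty if none)."""
    problems = []
    if name == "":
        problems.append("empty name")
    if len(name.encode("utf-8")) > 64:
        problems.append("longer than 64 bytes")
    if any(ch in FORBIDDEN_DB_CHARS for ch in name):
        problems.append("contains a forbidden character")
    if name != name.lower():
        problems.append("not lower case (recommended, not required)")
    return problems
```

For example, `collection_name_problems("module1.items")` returns an empty list, since periods are allowed in collection names, while `database_name_problems("my.db")` reports a forbidden character.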
I think it's all personal preference. My preferences come from using NHibernate, in .NET, with SQL Server, so they probably differ from what others use.
Databases: The application that's being used.. ex: Stackoverflow
Collections: Singular in name, what it's going to be a collection of, ex: Question
Document fields, ex: MemberFirstName
Honestly, it doesn't matter too much, as long as it's consistent for the project. Just get to work and don't sweat the details :P
Until we get SERVER-863, keeping the field names as short as possible is advisable, especially where you have a lot of records.
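A back-of-the-envelope Python sketch shows why field name length adds up. It assumes the BSON layout in which every document stores each field name as a null-terminated string, and it ignores any storage-engine compression, so real on-disk savings may be smaller. The field names and document count are made-up examples.

```python
def field_name_bytes(field_names):
    """Bytes spent on field names in one document:
    UTF-8 length plus one terminating null byte per name."""
    return sum(len(name.encode("utf-8")) + 1 for name in field_names)


def estimated_savings(long_names, short_names, document_count):
    """Uncompressed bytes saved across a collection by using shorter names."""
    per_document = field_name_bytes(long_names) - field_name_bytes(short_names)
    return per_document * document_count


# Invented example: three descriptive names versus three abbreviated ones.
long_names = ["memberFirstName", "memberLastName", "fileName"]
short_names = ["fn", "ln", "f"]

saved = estimated_savings(long_names, short_names, 1_000_000)
```

Under these assumptions the descriptive names cost 40 bytes per document against 8 for the short ones, so one million documents differ by about 32 MB before compression.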
Depending on your use case, field names can have a huge impact on storage. I can't understand why this is not a higher priority for MongoDB, as it would have a positive impact on all users. If nothing else, we could start being more descriptive with our field names without thinking twice about bandwidth and storage costs.
Please do vote.