Internal reference in protobuf? - scala

I'm new to scala and I'm thinking of using protobuf to pass around some data. However, in the data, there are some common sets of values across different items. The data in JSON might look like this:
[
{
"id" : "1",
"value" : {
"field1" : "f1value.1",
"field2" : "f2value.1",
"field3": commonobject
}
},
{
"id" : "2",
"value" : {
"field1" : "f1value.2",
"field2" : "f2value.2",
"field3": commonobject
}
}
]
I am hoping to find a way to avoid duplicating commonobject. Is there an internal reference mechanism in protobuf, like $ref in JSON Schema?
Thanks for the help!

Protobuf messages cannot store references. Instead, you can store an object ID in each item and use it to look up the common object.
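A minimal sketch of the object-ID idea, written as plain JavaScript objects rather than an actual protobuf schema. The names commons, field3Id and resolveField3 are hypothetical and only illustrate the layout: the shared object is stored once, and each item carries its ID.

```javascript
// Shared objects are stored once, keyed by a hypothetical ID.
const payload = {
  commons: {
    c1: { description: "shared value" }, // placeholder standing in for commonobject
  },
  items: [
    { id: "1", value: { field1: "f1value.1", field2: "f2value.1", field3Id: "c1" } },
    { id: "2", value: { field1: "f1value.2", field2: "f2value.2", field3Id: "c1" } },
  ],
};

// Resolve the ID back to the shared object after decoding.
function resolveField3(data, item) {
  const common = data.commons[item.value.field3Id];
  if (common === undefined) {
    throw new Error(`unknown common object id: ${item.value.field3Id}`);
  }
  return common;
}

// Both items resolve to the very same object instance.
console.log(resolveField3(payload, payload.items[0]) === resolveField3(payload, payload.items[1]));
```

The lookup has to be done by the application after decoding; the message format itself only carries the ID.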

Related

What is the best way of writing a collection schema to map another collection?

In MongoDB I have many collections like the ones below:
boys_fashion
girls_fashion
gents_fashion
ladies_fashion
girls_accessories
gents_accessories
ladies_accessories
Based on some fields, I need to use a different collection. For that I thought of creating a collection that maps those fields to a specific collection. Below is the collection I have created for that.
{
"type" : "Fashion",
"gender" : "Gents",
"kids" : "true",
"collection" : "boys_fashion"
}
{
"type" : "Accessories",
"gender" : "Ladies",
"kids" : "false",
"collection" : "ladies_accessories"
}
{
"type" : "Fashion",
"gender" : "Gents",
"kids" : "false",
"collection" : "gents_fashion"
}
{
"type" : "Accessories",
"gender" : "Ladies",
"kids" : "true",
"collection" : "girls_accessories"
}
{
"type" : "Accessories",
"gender" : "Gents",
"kids" : "true",
"collection" : "gents_accessories"
}
{
"type" : "Accessories",
"gender" : "Gents",
"kids" : "false",
"collection" : "gents_accessories"
}
Is this the right way to do this? If not, please suggest some other ways.
The option above is similar to an RDBMS. Since this is Mongo, I could instead store the mapping as below. If I store it like that, how can I write a query to fetch the collection?
{
"fashion" : {
"gents" : {
"true" : "boys_fashion",
"false" : "gents_fashion"
}
},
"accessories" : {
"ladies" : {
"true" : "girls_accessories",
"false" : "ladies_accessories"
}
}
}
Assumptions:
There was one collection before, and you split it into multiple collections because it was getting large and you want to solve that without sharding.
I would not even create a collection for the following data. This data is static reference data and will act as a router. On startup, the application loads this data and creates a router.
{
"type" : "Fashion",
"gender" : "Gents",
"kids" : "true",
"collection" : "boys_fashion"
}
{
"type" : "Accessories",
"gender" : "Ladies",
"kids" : "false",
"collection" : "ladies_accessories"
}
...
What do I mean by creating a kind of router from that static configuration file? Let's say you receive a query for fashionable items for baby girls. The router will tell you that you need to search the girls_accessories collection. You send the query to the girls_accessories collection and return the result.
Let's take another example: you receive a query for fashionable items for females. The router will tell you that you need to search ladies_accessories and girls_accessories. You send the query to both collections, combine the results, and send them back.
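The router described above could be sketched roughly as follows. This is only an illustration: the route function is hypothetical, and the reference data is copied from the question and kept in memory instead of being loaded from a configuration file.

```javascript
// Static reference data, loaded once at start-up (copied from the question).
const ROUTES = [
  { type: "Fashion", gender: "Gents", kids: "true", collection: "boys_fashion" },
  { type: "Accessories", gender: "Ladies", kids: "false", collection: "ladies_accessories" },
  { type: "Fashion", gender: "Gents", kids: "false", collection: "gents_fashion" },
  { type: "Accessories", gender: "Ladies", kids: "true", collection: "girls_accessories" },
  { type: "Accessories", gender: "Gents", kids: "true", collection: "gents_accessories" },
  { type: "Accessories", gender: "Gents", kids: "false", collection: "gents_accessories" },
];

// Return every collection whose route matches all the criteria that were given.
// Criteria that are omitted match anything, so a broad query fans out to several collections.
function route(criteria) {
  const names = ROUTES
    .filter((r) => Object.entries(criteria).every(([key, value]) => r[key] === value))
    .map((r) => r.collection);
  return [...new Set(names)]; // de-duplicate, keeping first-seen order
}

console.log(route({ gender: "Ladies" }));
console.log(route({ type: "Fashion", gender: "Gents", kids: "true" }));
```

The caller then sends the actual query to each returned collection and merges the results.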
Conclusion
If my assumptions are correct, you don't need a collection to store the reference data. What you have done is manual sharding by splitting the data across different collections. The reference data acts as a router that tells you where to search and which results to combine.
Update based on comments
Query does not involve multiple collections
Administrator can add new collection and application should query it without modifying code.
Solution 1
You create a collection for the reference data too
The downside is that every query involves two calls to the database: the first fetches the static data and the second queries the data collection.
Solution 2
You create a collection for the reference data too
You also build a DAO on top of that which uses @Cacheable on the method that returns this data.
You also add a method to the DAO that clears the cache with @CacheEvict, and expose a REST endpoint like /refresh-static-data that calls this method.
The downside is that whenever the administrator adds a new collection, you need to call the endpoint.
Solution 3
Same as solution 2 but instead of having an endpoint to clear the cache, you combine it with scheduler
@Scheduled(fixedRate = ONE_DAY)
@CacheEvict(value = { CACHE_NAME })
public void clearCache() {
}
Downside to this solution is that you have to come up with a period for fixedRate which is acceptable to your business
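The caching idea behind solutions 2 and 3 can be sketched independently of Spring. The following JavaScript is an assumption-laden illustration, not the annotation-based code above: createRouteCache and loadRoutes are hypothetical names, and the loader is synchronous for brevity, whereas a real database read would be asynchronous.

```javascript
// Cache the reference data and reload it once it is older than ttlMs.
// `loadRoutes` is a hypothetical function that reads the reference collection.
// `now` can be replaced with a fake clock for testing.
function createRouteCache(loadRoutes, ttlMs, now = Date.now) {
  let cached = null;
  let loadedAt = 0;
  return {
    get() {
      if (cached === null || now() - loadedAt >= ttlMs) {
        cached = loadRoutes();
        loadedAt = now();
      }
      return cached;
    },
    // Manual eviction, comparable to calling a refresh endpoint (solution 2).
    clear() {
      cached = null;
    },
  };
}

// Usage with a stubbed loader and a one-day refresh period (solution 3).
const ONE_DAY = 24 * 60 * 60 * 1000;
let loads = 0;
const cache = createRouteCache(() => {
  loads += 1;
  return [{ type: "Fashion", gender: "Gents", kids: "true", collection: "boys_fashion" }];
}, ONE_DAY);
cache.get();
cache.get(); // served from the cache, the loader is not called again
console.log(loads);
```

The trade-off is the same as with fixedRate: a shorter period picks up new collections sooner but hits the database more often.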
I added the collection like below.
/* 1 */
{
"type" : "fashion",
"category" : [
{
"value" : "gents",
"true" : "boys_fashion",
"false" : "gents_fasion"
}
]
}
/* 2 */
{
"type" : "accessories",
"category" : [
{
"value" : "ladies",
"true" : "girls_accessories",
"false" : "ladies_accessories"
}
]
}
and will fetch the data using the below query
findOne({"type":type,"category.value":cvalue},{"_id": 0, "category": {"$elemMatch": {"value": cvalue}}})
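To make the lookup concrete, here is an in-memory stand-in for that query, assuming the two mapping documents above and a kids flag that selects the "true" or "false" key. The function lookupCollection is hypothetical and does not call MongoDB.

```javascript
// Mapping documents in the shape shown above.
const mapping = [
  { type: "fashion", category: [{ value: "gents", "true": "boys_fashion", "false": "gents_fashion" }] },
  { type: "accessories", category: [{ value: "ladies", "true": "girls_accessories", "false": "ladies_accessories" }] },
];

// Match on type and category value, then pick the collection name by the kids flag.
function lookupCollection(docs, type, cvalue, kids) {
  const doc = docs.find((d) => d.type === type && d.category.some((c) => c.value === cvalue));
  if (!doc) return null;
  const entry = doc.category.find((c) => c.value === cvalue);
  return entry[String(kids)] ?? null;
}

console.log(lookupCollection(mapping, "fashion", "gents", true));
console.log(lookupCollection(mapping, "accessories", "ladies", false));
```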
Try to make subdocuments in MongoDB, instead of nested objects
https://mongoosejs.com/docs/subdocs.html

Updating Mongo documents based on a map key and removing that key

I am currently working in Go and have a Mongo database (connected via gopkg.in/mgo.v2). Right now I have a data structure similar to:
{
"_id" : "some_id_bson",
"field1" : "value1",
"field2" : {
"key1" : "v1",
"key2" : "v2",
"key3" : "v3",
"key4" : "v4"
}
}
So, basically, what I need to do (as an example) is update all the records in the database that contain key1 and remove that key from the JSON, so the result would be something like:
{
"_id" : "some_id_bson",
"field1" : "value1",
"field2" : {
"key2" : "v2",
"key3" : "v3",
"key4" : "v4"
}
}
What can I use to achieve this? I have been searching and cannot find something oriented to maps (field2 is a map). Thanks in advance
It seems like you're asking how to remove a property from a nested object in a particular document, which appears to be answered here: How to remove property of nested object from MongoDB document?.
From the main answer there:
Use $unset as below:
db.collectionName.update({},{"$unset":{"values.727920":""}})
EDIT: For updating multiple documents, use update options like:
db.collectionName.update({},{"$unset":{"values.727920":""}},{"multi":true})
Try using $exists and $unset:
query := bson.M{"field2.key1": bson.M{"$exists": true}}
replace := bson.M{"$unset": bson.M{"field2.key1": ""}}
// UpdateAll returns (*mgo.ChangeInfo, error); handle err as appropriate.
_, err := collection.UpdateAll(query, replace)
This should find all documents containing field2.key1, and remove that.
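For reference, the shape of the two documents involved can be sketched in JavaScript. The helpers buildUnset and applyUnset are hypothetical; applyUnset only imitates the effect on one in-memory document and does not talk to MongoDB.

```javascript
// Build the filter and update documents for removing one key from a nested map.
function buildUnset(mapField, key) {
  const path = `${mapField}.${key}`;
  return {
    filter: { [path]: { $exists: true } },
    update: { $unset: { [path]: "" } },
  };
}

// In-memory illustration of the effect on a single document (not a MongoDB call).
function applyUnset(doc, mapField, key) {
  if (!doc[mapField] || !(key in doc[mapField])) return doc; // the filter would not match
  const { [key]: _removed, ...rest } = doc[mapField];
  return { ...doc, [mapField]: rest };
}

const { filter, update } = buildUnset("field2", "key1");
console.log(JSON.stringify(filter), JSON.stringify(update));

const sample = { _id: "some_id_bson", field1: "value1", field2: { key1: "v1", key2: "v2" } };
console.log(applyUnset(sample, "field2", "key1"));
```

Note that the $exists condition sits under the dotted field path, not the other way around.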

Aggregating filter for Key

If I have a document as follows:
{
"_id" : ObjectId("54986d5531a011bb5fb8e0ee"),
"owner" : "54948a5d85f7a9527a002917",
"type" : "group",
"deleted" : false,
"participants" : {
"54948a5d85f7a9527a002917" : {
"last_message_id" : null
},
"5491234568f7a9527a002917" : {
"last_message_id" : null
},
"1234567aaaa7a9527a002917" : {
"last_message_id" : null
}
},
}
How do I do a simple filter for all documents that have the participant "54948a5d85f7a9527a002917"?
Thanks
Trying to query structures like this does not work well. There are a whole host of problems with modelling like this, but the clearest problem is using "data" as the names for "keys".
Try to think a little RDBMS like, at least in the concepts of the limitations to what a database cannot or should not do. You wouldn't design a "table" in a schema that had something like "54948a5d85f7a9527a002917" as the "column" name now would you? But this is essentially what you are doing here.
MongoDB can query this, but not in an efficient way:
db.collection.find({
"participants.54948a5d85f7a9527a002917": { "$exists": true }
})
Naturally this looks for the "presence" of a key in the data. While the query form is available, it does not make efficient use of such things as indexes where available as indexes apply to "data" and not the "key" names.
A better structure and approach is this:
{
"_id" : ObjectId("54986d5531a011bb5fb8e0ee"),
"owner" : "54948a5d85f7a9527a002917",
"type" : "group",
"deleted" : false,
"participants" : [
{ "_id": "54948a5d85f7a9527a002917" },
{ "_id": "5491234568f7a9527a002918" },
{ "_id": "1234567aaaa7a9527a002917" }
]
}
Now the "data" you are looking for is actual "data" associated with a "key" ( possibly ) and inside an array for binding to the parent object. This is much more efficient to query:
db.collection.find({
"participants._id": "54948a5d85f7a9527a002917"
})
It's much better to model that way than what you are presently doing and it makes sense to the consumption of objects.
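A one-off migration from the keyed form to the array form could look roughly like this. It is a sketch on an in-memory document: participantsToArray is a hypothetical helper, and unlike the example structure above it keeps last_message_id on each array element.

```javascript
// Convert the keyed "participants" object into an array of subdocuments.
function participantsToArray(doc) {
  const participants = Object.entries(doc.participants).map(([id, data]) => ({
    _id: id,
    ...data,
  }));
  return { ...doc, participants };
}

const before = {
  owner: "54948a5d85f7a9527a002917",
  type: "group",
  deleted: false,
  participants: {
    "54948a5d85f7a9527a002917": { last_message_id: null },
    "5491234568f7a9527a002917": { last_message_id: null },
    "1234567aaaa7a9527a002917": { last_message_id: null },
  },
};

const after = participantsToArray(before);
console.log(after.participants.map((p) => p._id));
```

With the array form, a multikey index on "participants._id" can serve the query.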
BTW. It's probably just cut and paste in your question but you cannot possibly duplicate keys such as "54948a5d85f7a9527a002917" as you have. That is a basic hash rule that is being broken there.

MongoDB - How can I use MapReduce to merge a value from one collection into another collection on multiple keys of a second collection?

I have two MongoDB collections: The first is a collection that includes frequency information for different IDs and is shown (truncated form) below:
[
{
"_id" : "A1",
"value" : 19
},
{
"_id" : "A2",
"value" : 6
},
{
"_id" : "A3",
"value" : 12
},
{
"_id" : "A4",
"value" : 8
},
{
"_id" : "A5",
"value" : 4
},
...
]
The second collection is more complex and contains information for each _id listed in the first collection (it's called frequency_collection_id in the second collection), but frequency_collection_id may be inside two lists (info.details_one, and info.details_two) for each record:
[
{
"_id" : ObjectId("53cfc1d086763c43723abb07"),
"info" : {
"status" : "pass",
"details_one" : [
{
"frequency_collection_id" : "A1",
"name" : "A1_object_name",
"class" : "known"
},
{
"frequency_collection_id" : "A2",
"name" : "A2_object_name",
"class" : "unknown"
}
],
"details_two" : [
{
"frequency_collection_id" : "A1",
"name" : "A1_object_name",
"class" : "known"
},
{
"frequency_collection_id" : "A2",
"name" : "A2_object_name",
"class" : "unknown"
}
],
}
}
...
]
What I'm looking to do, is merge the frequency information (from the first collection) into the second collection, in effect creating a collection that looks like:
[
{
"_id" : ObjectId("53cfc1d086763c43723abb07"),
"info" : {
"status" : "pass",
"details_one" : [
{
"frequency_collection_id" : "A1",
"name" : "A1_object_name",
"class" : "known",
**"value" : 19**
},
{
"frequency_collection_id" : "A2",
"name" : "A2_object_name",
"class" : "unknown",
**"value" : 6**
}
],
"details_two" : [
{
"frequency_collection_id" : "A1",
"name" : "A1_object_name",
"class" : "known",
**"value" : 19**
},
{
"frequency_collection_id" : "A2",
"name" : "A2_object_name",
"class" : "unknown",
**"value" : 6**
}
],
}
}
...
]
I know that this should be possible with MongoDB's MapReduce functions, but all the examples I've seen are either too minimal for my collection structure, or are answering different questions than I'm looking for.
Does anyone have any pointers? How can I merge my frequency information (from my first collection) into the records (inside my two lists in each record of the second collection)?
I know this is more or less a JOIN, which MongoDB does not support, but from my reading, it looks like this is a prime example of MapReduce.
I'm learning Mongo as best I can, so please forgive me if my question is too naive.
Just like all MongoDB operations, a MapReduce always operates on a single collection and cannot obtain info from another one. So your first step needs to be to dump both collections into one. Your documents have different _ids, so it should not be a problem for them to coexist in the same collection.
Then you do a MapReduce where the map function emits both kinds of documents under their common key, which is their frequency ID.
Your reduce function will then receive an array of documents for each key, normally the two documents you emitted for it. You then just have to merge these two documents into one. Keep in mind that the reduce function can receive these two documents in any order. It can also be called for a partial result (only one of the two documents) or for an already completed result. You need to handle these cases gracefully! A good implementation could be to create a new object and then iterate over the input documents, copying all existing relevant fields with their values to the new object, so that the resulting object is an amalgamation of the input documents.
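A reduce function along those lines might look like the sketch below. It assumes the map step emits flat objects (one carrying value, one carrying name and class) under the frequency ID; the name reduceById and the sample values are illustrative only.

```javascript
// Merge all documents emitted for one frequency ID into a single object.
// Works for one input, two inputs in either order, or an already merged result.
function reduceById(key, values) {
  const merged = {};
  for (const doc of values) {
    for (const [field, value] of Object.entries(doc)) {
      if (value !== undefined && value !== null) {
        merged[field] = value; // copy only fields that are actually present
      }
    }
  }
  return merged;
}

const frequency = { value: 19 };
const detail = { name: "A1_object_name", "class": "known" };

console.log(reduceById("A1", [frequency, detail]));
// Re-reducing a partial result together with the remaining document gives the same fields.
console.log(reduceById("A1", [reduceById("A1", [detail]), frequency]));
```

Because the function only copies fields into a fresh object, calling it again on its own output leaves the result unchanged.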

Query MongoDB N-Tier Nested on every Level

I've got this structure of document, saved in MongoDB 2.6.1, running on Linux:
{
"_id" : "1",
"name" : "Level0",
"childs" : [
{
"id" : "2",
"name" : "Level1",
"childs" : [
{
"id" : "3",
"name" : "Level2",
"childs" : [
{
"id" : "4",
"name" : "Level3-1",
},
{
"id" : "5",
"name" : "Level3-2",
}
]
}
...
Every element has its children, and every child element has its own children, down to the last level.
Now I want to query my MongoDB with something like:
db.categories.find({'childs':{$elemMatch:{$all:['Level19-23']}}})
This query doesn't work, by the way.
What is a good query to get my elements?
I don't know anything about the children or parents. I only have the name of the element, and I need the element with all its children.
Does anyone have any advice for me, a MongoDB newbie? :)
Thanks in advance!