I'm writing this question while working with .NET's LiteDB, but I think it applies to NoSQL in general.
One of the collections in my db contains documents that don't have a fixed structure. I want to let the user add his own values, with whatever name and value he wants.
So, for example, a document would at first have the following structure:
{
"_id": 1,
"creatorId": 10
}
But the user would be able to specify a new value and choose whether it will be an int or a boolean.
{
"_id": 1,
"creatorId": 10,
"customValue": false
}
The next time the user opens my app, he may want to use the same kinds of values he used before, so I need to show him a form with inputs named after his previous activity. So if he previously added a value named "customValue", I want to show him a TextView named "customValue" the next time he opens the page with the form.
Is there a way of retrieving the structure of such documents based on every record in the collection? Or do I need to somehow track the names of added values and save them in a separate collection?
In LiteDB you can use the BsonDocument class to read collection documents. BsonDocument is a generic way to represent a document in BSON format (with all BSON data types available).
If you use:
var col = db.GetCollection("mycol"); // untyped collection of BsonDocument
var doc = col.FindById(1);
foreach (var key in doc.Keys)
{
    var value = doc[key];
    var valueDataType = value.Type; // returns the BsonType enum value for this field
}
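If you need the combined structure across every record rather than a single document, a minimal sketch along these lines should work (the collection name "mycol" and the database file name are just assumptions carried over from the snippet above):

using System;
using System.Collections.Generic;
using LiteDB;

// Sketch: collect every distinct field name (and the BSON type of its last seen value)
// across all documents in the collection, to drive the dynamic form.
using (var db = new LiteDatabase(@"MyData.db"))
{
    var col = db.GetCollection("mycol"); // untyped collection of BsonDocument

    var fieldTypes = new Dictionary<string, BsonType>();
    foreach (var doc in col.FindAll())
    {
        foreach (var key in doc.Keys)
        {
            fieldTypes[key] = doc[key].Type;
        }
    }

    // fieldTypes now describes the "schema" you can use to build the form
    foreach (var pair in fieldTypes)
        Console.WriteLine($"{pair.Key}: {pair.Value}");
}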
I'm learning about CouchDB, and I don't get it: when I create a view, does the view copy the data that's inside the emit function, or does it only create a new index based on it?
For example, let's suppose I have a database with documents like this one:
{
"name": "Bob",
"age": 30
}
My view would be something like:
function (doc) {
    emit(doc.name, doc.age);
}
Will CouchDB create a copy of every document (the emitted fields) and the view index when the view is executed for the first time? Or will it create only an index?
With that map function, CouchDB will create a B-tree index keyed on doc.name with doc.age as the value, and it will also store doc._id alongside it. So it won't duplicate the whole doc, but it will store those fields (and perhaps some bookkeeping) in the index data.
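For example, querying the view (the database and design document names here are placeholders) returns only those stored id/key/value rows; you would have to add include_docs=true to pull back the full documents:

GET /mydb/_design/people/_view/by_name

{"total_rows":1,"offset":0,"rows":[
{"id":"<generated doc id>","key":"Bob","value":30}
]}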
My requirement is to collect JSON pairs from an MQTT subscriber at different times under a single _id in Cloudant, but I'm getting an error when I try to insert a new JSON pair into an existing _id: it simply replaces the old one. I need at least 10 JSON pairs under one _id, injected at different times.
First, you should question the architectural decision to update a particular document multiple times. In general this is discouraged, though it depends on your application. Instead, you could insert each new piece of information as a separate document and then use a map-reduce view to reflect the state of your application.
For example (I'm going to assume that you have multiple "devices", each with some kind of unique identifier, that need to add data to a Cloudant DB):
PUT
{
"info_a":"data a",
"device_id":123
}
{
"info_b":"data b",
"device_id":123
}
{
"info_a":"message a"
"device_id":1234
}
Then you'll need a map function like
_design/device/_view/state

function (doc) {
    emit(doc.device_id, 1);
}
Then you can GET the results of that view to see all of the "info_X" data that is associated with the particular device.
GET account.cloudant.com/databasename/_design/device/_view/state
{"total_rows":3,"offset":0,"rows":[
{"id":"28324b34907981ba972937f53113ac3f","key":123,"value":1},
{"id":"d50553d206d722b960fb176f11841974","key":123,"value":1},
{"id":"eaa710a5fa1ff4ba6156c997ddf6099b","key":1234,"value":1}
]}
Then you can use the query parameters to control the output, for example
GET account.cloudant.com/databasename/_design/device/_view/state?key=123&include_docs=true
{"total_rows":3,"offset":0,"rows":[
{"id":"28324b34907981ba972937f53113ac3f","key":123,"value":1,"doc":
{"_id":"28324b34907981ba972937f53113ac3f",
"_rev":"1-bac5dd92a502cb984ea4db65eb41feec",
"info_b":"data b",
"device_id":123}
},
{"id":"d50553d206d722b960fb176f11841974","key":123,"value":1,"doc":
{"_id":"d50553d206d722b960fb176f11841974",
"_rev":"1-a2a6fea8704dfc0a0d26c3a7500ccc10",
"info_a":"data a",
"device_id":123}}
]}
And now you have the complete state for device_id:123.
Timing
Another issue is the rate at which you're updating your documents.
The bottom-line recommendation is that if you are only updating the document once per ~minute or less frequently, then it could be reasonable for your application to update a single document. That is, you'd add new key-value pairs to the same document with the same _id value. In order to do that, however, you'll need to GET the full doc, add the new key-value pair, and then PUT that document back to the database. You must make sure that you are providing the most recent _rev of that document, and you should also check for conflicts that could occur if the document is being updated by multiple devices.
If you are acquiring new data for a particular device at a high rate, you'll likely run into conflicts very frequently, because Cloudant is a distributed document store. In this case, you should follow something like the example I gave above.
Example flow for the second approach outlined by @gadamcox, for use cases where document updates are not required very frequently:
[...] you'd add new key-value pairs to the same document with the same _id value. In order to do that, however, you'll need to GET the full doc, add the new key-value pair, and then PUT that document back to the database.
Your application first fetches the existing document by id (https://docs.cloudant.com/document.html#read):
GET /$DATABASE/100
{
"_id": "100",
"_rev": "1-2902191555...",
"No": ["1"]
}
Then your application updates the document in memory
{
"_id": "100",
"_rev": "1-2902191555...",
"No": ["1","2"]
}
and saves it in the database by specifying the _id and _rev (https://docs.cloudant.com/document.html#update)
PUT /$DATABASE/100
{
"_id": "100",
"_rev": "1-2902191555...",
"No":["1","2"]
}
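Putting those steps together, here is a minimal Node.js sketch of the same flow that also retries when Cloudant reports a 409 conflict. The base URL, credentials, document id and the "No" field are assumptions carried over from the example above:

// Sketch: append a value to the "No" array of document "100",
// retrying when the PUT fails with 409 (our _rev was stale).
// CLOUDANT_URL is assumed to look like https://user:pass@account.cloudant.com/databasename
const BASE = process.env.CLOUDANT_URL;

async function appendValue(docId, value, retries = 3) {
  for (let attempt = 0; attempt < retries; attempt++) {
    const doc = await (await fetch(`${BASE}/${docId}`)).json(); // GET the current doc (includes _rev)
    doc.No = (doc.No || []).concat(value);                      // update it in memory
    const res = await fetch(`${BASE}/${docId}`, {               // PUT it back with the same _id/_rev
      method: "PUT",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(doc)
    });
    if (res.ok) return res.json();          // success: the response carries the new _rev
    if (res.status !== 409) throw new Error("update failed: " + res.status);
    // 409 means another writer updated the doc first; loop to fetch the latest _rev and retry
  }
  throw new Error("too many conflicts");
}

appendValue("100", "3").then(console.log).catch(console.error);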
I asked this question a couple of days ago but deleted it, and I am adding more clarification here about what I'm looking at.
So what I have is a process where a user uploads a CSV; the CSV is then parsed by PapaParse, sent server side, and ultimately inserted into MongoDB.
My problem is that none of these uploads are linked to the specific user, so anyone will have access to every upload the way things look now.
What I tried to do is loop through the upload data, which looks like this:
var document = [{object}, {object}, {object}, {object}, {object}... ];
I used a for loop to go through each of the objects and add an _id field containing the user's id via var currentUser = this.userId;
Meteor.methods({
    insert: function(document){
        var currentUser = this.userId;
        var newDocument = document;
        for(var i = 0; i < newDocument.length; i++){
            newDocument[i]._id = currentUser;
        }
        Bank.insert(newDocument);
    }
});
The problem is that memory allocation is an issue for larger uploads, and Meteor simply crashes trying to loop through all the objects and individually add the _id key to each object in the array.
When the document is inserted into MongoDB, it looks like this:
I know in my previous post someone mentioned that MongoDB's insert method doesn't take an array as input, but somehow in my case it does, because the above screenshot is exactly how the document looks before being inserted into MongoDB. So basically, each object is a new document inside MongoDB. I'm trying to find a way to bind the user's userID to each document in the database.
Is there another way to associate the upload with the unique current user other than looping through the entire data set, which could be in the tens of thousands of rows for some users?
Why not just do this?
Meteor.methods({
    insert: function(document){
        var currentUser = Meteor.userId();
        var newDocument = document;
        Bank.insert({userId: currentUser, data: newDocument});
    }
});
Now each document in your collection will have two keys: userId and data. The latter will be your array.
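To later show a user only their own uploads, you can then filter on that key. A minimal sketch (the publication name is just an example):

// Server: publish only the current user's uploads
Meteor.publish('myUploads', function () {
  return Bank.find({ userId: this.userId });
});

// Client: subscribe and read them back
Meteor.subscribe('myUploads');
var myUploads = Bank.find({ userId: Meteor.userId() }).fetch();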
I'm trying to find a way to achieve the following pseudo function on the server. The fields likesRecieved, likesShown and likesMatch exist in a document within the Posts collection.
I need this function to run for every document in the Posts collection by default. This is because I'd like the function to do this:
1) find the value(s) that exist in both the likesRecieved and likesShown fields.
2) insert these value(s) into the likesMatch field.
3) remove the values found in step 1 from likesRecieved and likesShown.
This is what I am essentially trying to do on the server...
likesRecieved: idA, idB, idE, idF, idL
likesShown: idE, idC, idF
..perform a function to result in the following...
likesRecieved: idA, idB, idL
likesShown: idC,
likesMatch: idE, idF
This is my code to find the ids in both arrays for one document only. The likeMatch helper returns the userIds that exist in both the 'likesRecieved' and 'likesShown' fields within a selected document in the Posts collection. The resulting value(s) are then inserted into the likesMatch field.
likeMatch: function() {
    var selectedPostId = Session.get('postId'); // _id of the selected document in the Posts collection
    var arrayOfLikeRecieved = Posts.find({_id: selectedPostId}, {fields: {LikesRecieved: 1}}).fetch();
    var sumArrayRecieved = _.chain(arrayOfLikeRecieved).pluck('LikesRecieved').flatten().value();
    var arrayOfLikeShown = Posts.find({_id: selectedPostId}, {fields: {LikesShown: 1}}).fetch();
    var sumArrayShown = _.chain(arrayOfLikeShown).pluck('LikesShown').flatten().value();
    var duplicates = _.intersection(sumArrayRecieved, sumArrayShown);
    Meteor.call('insertDuplicateIntoMatchField', duplicates);
},
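For reference, the server method being called there might look something like the sketch below. This is only an assumption about what 'insertDuplicateIntoMatchField' does; it also assumes the post _id is passed along with the duplicates (e.g. Meteor.call('insertDuplicateIntoMatchField', selectedPostId, duplicates)) and that the field names match the helper code above.

Meteor.methods({
  // Hypothetical implementation of steps 2 and 3 from the question
  insertDuplicateIntoMatchField: function (postId, duplicates) {
    check(postId, String);
    check(duplicates, [String]);

    Posts.update(postId, {
      $addToSet: { LikesMatch: { $each: duplicates } },               // 2) add matches to likesMatch
      $pullAll: { LikesRecieved: duplicates, LikesShown: duplicates } // 3) remove them from both arrays
    });
  }
});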
MongoDB doesn't have hooks like some other databases do, so there's no way to automatically have a function called when a document is inserted.
You have a couple of options, though. One way would be to have a hook in your application that runs your function just before inserting the document. In Meteor this could be achieved using a Collection.deny function.
If you would prefer to have the function executed in MongoDB, then you'll have to call it manually. The problem is just how to know when the document was inserted or updated. Luckily, Meteor allows you to observe changes to a cursor. You could use that to make a call out to the database and run a stored function whenever a document gets updated.
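A minimal server-side sketch of the observe route (matchLikes here is a hypothetical helper that would contain the matching/updating logic from the question):

Meteor.startup(function () {
  // React whenever a post appears or changes, and reconcile its like arrays
  Posts.find({}).observe({
    added: function (doc) {
      matchLikes(doc);
    },
    changed: function (newDoc, oldDoc) {
      matchLikes(newDoc);
    }
  });
});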
My question may not be very well formulated because I haven't worked with MongoDB yet, so I'd like to know one thing.
I have an object (record/document/whatever you call it) in my database, in global scope.
And I have a really huge array of other objects inside this object.
So, what about the speed of searching in global scope vs searching "inside" that object? Is it possible to index all the "inner" records?
Thanks in advance.
So, something like this:
users: {
..
user_maria:
{
age: "18",
best_comments :
{
goodnight:"23rr",
sleeptired:"dsf3"
..
}
}
user_ben:
{
age: "18",
best_comments :
{
one:"23rr",
two:"dsf3"
..
}
}
So, how can I make it fast to find user_maria->best_comments->goodnight (i.e. index the contents of "best_comments")?
First of all, your example schema is very questionable. If you want to embed comments (which is a big if), you'd want to store them in an array for appropriate indexing. Also, post your schema in JSON format so we don't have to parse the whole name/value thing:
db.users {
name:"maria",
age: 18,
best_comments: [
{
title: "goodnight",
comment: "23rr"
},
{
title: "sleeptired",
comment: "dsf3"
}
]
}
With that schema in mind you can put an index on name and best_comments.title, for example like so:
db.users.ensureIndex({name: 1, 'best_comments.title': 1})
Then, when you want the query you mentioned, simply do
db.users.find({name:"maria", 'best_comments.title':"first"})
And the database will hit the index and will return this document very fast.
Now, all that said, your schema is very questionable. You mention you want to query specific comments, but that requires either comments being in a separate collection or you filtering the comments array app-side. Additionally, having huge, ever-growing embedded arrays in documents can become a problem. Documents have a 16mb limit, and if documents increase in size all the time Mongo will have to continuously move them on disk.
My advice:
- Put comments in a separate collection (see the sketch below)
- Either do one document per comment or make comment bucket documents (say, 100 comments per document)
- Read up on Mongo/NoSQL schema design. You always query for root documents, so if you end up needing a small part of a large embedded structure you need to reexamine your schema or you'll be pumping huge documents over the connection and require app-side filtering.
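For example, a minimal sketch of the separate-collection approach (collection and field names are just illustrations):

// One document per comment, linked back to its user
db.comments.insert({user: "maria", title: "goodnight", comment: "23rr"})
db.comments.insert({user: "maria", title: "sleeptired", comment: "dsf3"})

// Index the lookup you actually run
db.comments.ensureIndex({user: 1, title: 1})

// Fetch a single comment without pulling a huge embedded array over the wire
db.comments.find({user: "maria", title: "goodnight"})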
I'm not sure I understand your question but it sounds like you have one record with many attributes.
record = {'attr1':1, 'attr2':2, etc.}
You can create an index on any single attribute or any combination of attributes. Also, you can create any number of indices on a single collection (MongoDB collection == MySQL table), whether or not each record in the collection has the attributes being indexed on.
edit: I don't know what you mean by 'global scope' within MongoDB. To insert any data, you must define a database and collection to insert that data into.
Database 'Example':
Collection 'table1':
records: {a:1,b:1,c:1}
{a:1,b:2,d:1}
{a:1,c:1,d:1}
indices:
ensureIndex({a:ascending, d:ascending}) <- this will index on a, then by d; the fact that record 1 doesn't have an attribute 'd' doesn't matter, and this will increase query performance
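In actual shell syntax (collection name taken from the outline above), that index and a query that can use it would look like:

db.table1.ensureIndex({a: 1, d: 1})  // 1 = ascending
db.table1.find({a: 1, d: 1})         // can use the {a, d} index; matches records 2 and 3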
edit 2:
Well, first of all, in your table here you are assigning multiple values to the attributes "name" and "value". MongoDB will ignore/overwrite the original instantiations of them, so only the final ones will be included in the collection.
I think you need to reconsider your schema here. You're trying to use it as a series of key-value pairs, and it is not specifically suited for this (if you really want key-value pairs, check out Redis).
Check out: http://www.jonathanhui.com/mongodb-query