MongoDB collection not listed in _schema - mongodb

I am using MongoDB 2.6 standard. I have successfully created multiple collections, inserted data, queried data, etc. Works fine inside of the Mongo shell, as well as from NodeJS apps using MongoSkin. So far, so good. Now I need to use a third-party tool (D2RQ) to access the data. It appears that D2RQ uses the _schema collection to obtain collection names, column names, data types, and so on. D2RQ works for three of the collections because the collections are in _schema in MongoDB. A fourth collection is not in _schema and seems to be invisible. However, the fourth collection is present in MongoDB. The collection has data. I can query the collection in the Mongo shell, and from NodeJS using Mongoskin. Any idea why the collection is not appearing in _schema? Is this a MongoDB bug?

This is not a MongoDB bug. The root cause is that D2RQ uses the UnityJDBC driver to access MongoDB. The JDBC connection string has a parameter that controls whether _schema is rebuilt, and D2RQ does not pass that parameter correctly when it makes the JDBC connection, so the _schema collection is out of date on every call after the first. The solution had two parts:
Part one was to write a tiny NodeJS application that does nothing but force the _schema rebuild on connection. That solved the immediate issue.
Part two was to extend that tiny NodeJS application into a full-featured export process that generates an RDF file from MongoDB, which allowed me to remove both D2RQ and the UnityJDBC driver from the solution stack (a rough sketch of that kind of export follows below).
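For reference, here is a rough sketch of what that kind of export might look like with the plain Node.js mongodb driver. The database and collection names, the predicate URIs, and the output file are all invented for illustration; a real export would map fields onto a proper vocabulary.

const fs = require('fs');
const { MongoClient } = require('mongodb');

async function exportToRdf() {
  // Read every document and emit one N-Triples line per scalar field.
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const docs = await client.db('mydb').collection('people').find({}).toArray();
  const lines = [];
  for (const doc of docs) {
    const subject = `<http://example.org/people/${doc._id}>`;
    for (const [key, value] of Object.entries(doc)) {
      if (key === '_id') continue; // nested objects would need flattening first
      lines.push(`${subject} <http://example.org/schema/${key}> "${String(value)}" .`);
    }
  }
  fs.writeFileSync('export.nt', lines.join('\n') + '\n');
  await client.close();
}

exportToRdf().catch(console.error);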
"The most reliable component in the architecture is the one that is not present"

Related

MongoDB is not scaling

I am building out some tools that require sub-second (<1s) response times, and I don't think Mongo (or SQL, for that matter) alone will get me there; I'm pretty sure I'll need either Redis or some in-memory data structure. Right now my Mongo aggregate takes almost 5 seconds to load 30 rows. All the collections are indexed, but there are lookups within lookups, which I think is where the issue lies (i.e. the document references a user, and with multiple users I need to fetch each one's personal info). I also have pagination enabled and am on Mongo 5.0 (which uses indexes for sub-lookups). A few questions:
1. Should I just switch to a SQL database?
2. Should I add change streams to the collection I need to query, update a Redis instance from them, and read from there (see the sketch after this list)?
3. Should I just embed the information I'm extracting from the lookups into the parent document?
4. Should I do both 2 and 3?
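For option 2, a minimal sketch of what a change-stream-fed Redis cache could look like, assuming the Node.js mongodb and redis packages; the database, collection, and key names are invented for illustration.

const { MongoClient } = require('mongodb');
const { createClient } = require('redis');

async function main() {
  const mongo = await MongoClient.connect('mongodb://localhost:27017');
  const redis = createClient();
  await redis.connect();
  const orders = mongo.db('app').collection('orders');
  // Watch inserts/updates (change streams require a replica set) and push the
  // full document into Redis so reads can be served from memory instead of
  // re-running the slow aggregation.
  const stream = orders.watch([], { fullDocument: 'updateLookup' });
  stream.on('change', async (change) => {
    if (change.fullDocument) {
      await redis.set(`order:${change.documentKey._id}`, JSON.stringify(change.fullDocument));
    }
  });
}

main().catch(console.error);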

Are there side effects to adding new content to a Mongo DB

I want to add a new object to an existing MongoDB document which I don't control, and I don't want to break the vendor's application. I've tried this in test code and it works fine, but I wanted some confirmation.
I'm using a REST API to drive a commercial product, and under the hood the application uses MongoDB for persistence. I can add new, arbitrary fields/objects to the JSON messages and they are persisted into Mongo as expected. Am I right that, as long as my naming differs from existing (and future) vendor fields, the vendor's application should just keep working and ignore my new data?
Bonus points if there's an article covering this that I can reference.
MongoDB does not enforce a fixed schema: documents in the same collection do not all have to carry the same fields. With the WiredTiger storage engine there is even document-level concurrency control. So adding your own fields to documents in an existing collection should generally not matter. However, if you later query on one of those new fields and it is not indexed, those reads will be slow.
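As an illustrative sketch (not specific to any vendor's product), adding a namespaced field with $set touches nothing else in the document; the collection name, the filter, and the "myco_" prefix are made up for this example.

const { MongoClient } = require('mongodb');

async function tagDocument() {
  const client = await MongoClient.connect('mongodb://localhost:27017');
  const widgets = client.db('vendorApp').collection('widgets');
  // $set writes only the named field; the vendor's existing fields are left
  // untouched, and code that never references "myco_notes" never sees it.
  await widgets.updateOne(
    { sku: 'ABC-123' },
    { $set: { myco_notes: { addedBy: 'integration', at: new Date() } } }
  );
  await client.close();
}

tagDocument().catch(console.error);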

Copy data field from one mongo collection to another, on db server

I have two mongo collections. One we can call template and the second instance. Every time a new instance is created, a rather large data field is copied from the template to the instance. Currently the field is retrieved from the template collection by the application and then sent back to the database as part of the instance insert.
Would it be possible to perform this copy on insert directly in MongoDB, to avoid sending several megabytes over the network back and forth?
Kadira is reporting a 3-second lag because of this, and the documents are only going to get bigger.
I am using Meteor, but I gather that should not influence the answer much.
I have done some searching and I can't really find an elegant solution for you. The two ways I can think of doing it are:
1.) Fork a process to run a mongo command to copy your template as your new instance via db.collection.copyTo().
http://eureka.ykyuen.info/2015/02/26/meteor-run-shell-command-at-server-side/
https://docs.mongodb.org/manual/reference/method/db.collection.copyTo/
Or
2.) Attempt to access the raw mongo collection, rather than the minimongo collection Meteor gives you, so you can use the db.collection.copyTo() functionality supplied by Mongo.
var rawCollection = Collection.rawCollection();
// Note: copyTo() is a mongo shell helper, so it may not exist on the raw
// Node driver collection -- see the caveat below.
rawCollection.copyTo(newCollection);
I haven't tried accessing the rawCollection to see if copyTo is available, and I also don't know if it will bring it into meteor before writing out the new collection. I'm just throwing this out here as an idea for you; hopefully someone else has a better one.
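For completeness: on newer MongoDB versions (4.2+, which postdate this question) the copy can be done entirely server-side with an aggregation that ends in $merge, so the large field never travels to the application. The collection names, the bigField name, and the templateId/newInstanceId placeholders below are assumptions.

// Run on the server (mongo shell or rawCollection().aggregate()):
db.templates.aggregate([
  { $match: { _id: templateId } },                    // the template to copy from
  { $project: { _id: newInstanceId, bigField: 1 } },  // keep the large field, re-key as the new instance
  { $merge: { into: "instances", whenMatched: "merge", whenNotMatched: "insert" } }
]);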

Can MongoDB be a consistent event store?

When storing events in an event store, the order in which the events are stored is very important, especially when the events are later projected to restore an entity's current state.
MongoDB seems to be a good choice for persisting the event store, given its speed and flexible schema (and it is often recommended as such), but there is no such thing as a transaction in MongoDB, meaning the correct event order cannot be guaranteed.
Given that, should you avoid MongoDB if you are looking for a consistent event store and stick with a conventional RDBMS instead, or is there a way around this problem?
I'm not familiar with the term "event store" as you are using it, but I can address some of the issues in your question. I believe it is probably reasonable to use MongoDB for what you want, with a little bit of care.
In MongoDB, each document has an _id field, which by default is an ObjectId: a timestamp, followed by a machine/process identifier and a sequence counter. So you can sort on that field and get your documents back in creation order, provided the ObjectIds are all generated by the same machine.
Most MongoDB client drivers create the _id field locally before sending the insert command to the database. So if you have multiple clients connecting to the database, sorting by _id won't reliably give you insertion order: the timestamp comes from each client's own clock, and within the same second the order is decided by machine identifier and counter rather than by when the insert actually happened.
But if you can convince your MongoDB client driver not to include the _id in the insert command, the server will generate the ObjectId for each document and they will have the property you want. How to do this depends on the language you're working in, since each language has its own client driver. Read the driver docs carefully or dive into their source code -- they're all open source. Most drivers also include a way to send a raw command to the server, so if you construct an insert command by hand you can certainly do what you want.
This will break down if your system is so massive that a single database server can't handle all of your write traffic. The MongoDB solution to needing to write thousands of records per second is to set up a sharded database. In this case the ObjectIds will again be created by different machines and won't have the nice sorting property you want. If you're concerned about outgrowing a single server for writes, you should look to another technology that provides distributed sequence numbers.
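Going back to the single-server case, here is a concrete sketch of "convincing the driver": the Node.js driver exposes a forceServerObjectId option that leaves _id generation to the server (worth verifying against your driver version). The database, collection, and aggregateId field names are assumptions.

const { MongoClient } = require('mongodb');

async function appendAndReplay(event) {
  const client = await MongoClient.connect('mongodb://localhost:27017', {
    forceServerObjectId: true  // do not generate _id on the client
  });
  const events = client.db('eventstore').collection('events');
  await events.insertOne(event);  // the server assigns the ObjectId
  // Replay: sorting on _id approximates insertion order on a single server.
  const history = await events.find({ aggregateId: event.aggregateId })
    .sort({ _id: 1 })
    .toArray();
  await client.close();
  return history;
}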

SQL view in mongodb

I am currently evaluating mongodb for a project I have started but I can't find any information on what the equivalent of an SQL view in mongodb would be. What I need, that an SQL view provides, is to lump together data from different tables (collections) into a single collection.
I want nothing more than to clump some documents together and label them as a single document. Here's an example:
I have the following documents:
cc_address
us_address
billing_address
shipping_address
But in my application, I'd like to see all of my addresses and be able to manage them in a single document.
In other cases, I may just want a couple of fields from collections:
I have the following documents:
fb_contact
twitter_contact
google_contact
reddit_contact
Each of these documents has fields that align, like firstname, lastname, and email, but they also have fields that don't. I'd like to be able to compile them into a single document that only contains the fields that align.
This is what a view accomplishes in SQL, correct? Can I get that kind of functionality in MongoDB?
The question is quite old already. However, since MongoDB v3.2 you can use $lookup to join data from different collections, as long as the collection you join in (the from collection) is unsharded.
Since MongoDB v3.4 you can also create read-only views.
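A small sketch combining those two features for the address example; the customers collection and the customer_id foreign key are assumptions, since the question doesn't show the actual document shapes.

// Requires MongoDB 3.4+ for createView, 3.2+ for $lookup.
db.createView("customers_with_addresses", "customers", [
  { $lookup: { from: "billing_address", localField: "_id",
               foreignField: "customer_id", as: "billing" } },
  { $lookup: { from: "shipping_address", localField: "_id",
               foreignField: "customer_id", as: "shipping" } }
]);
// The view is then queried like an ordinary, read-only collection:
db.customers_with_addresses.find({ "billing.country": "US" });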
There are no "joins" in MongoDB. As said by JonnyHK, you can either enormalize your data or you use embedded documents or you perform multiple queries
However, you could also use Map-Reduce.
or if you're prepared to use the development branch, you could test the new aggregation framework though maybe it's too much? This new framework will be in the soon-to-be-released 2.2, which is production-ready unlike 2.1.x.
Here's the SQL-Mongo chart also, which may be of some help in your learning.
Update: Based on your re-edit, you don't need Map-Reduce or the Aggregation Framework because you're just querying.
You're essentially doing joins, querying multiple documents and merging the results. The place to do this is within your application on the client-side.
MongoDB queries never span more than a single collection as there is no support for joins. So if you have related data you need available in the results of a query you must either add that related data to the collection you're querying (i.e. denormalize your data), or make a separate query for it from another collection.
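A sketch of that client-side approach for the contacts example, using the Node.js driver; the uniform firstname/lastname/email projection is an assumption based on the question.

async function allContacts(db) {
  const sources = ['fb_contact', 'twitter_contact', 'google_contact', 'reddit_contact'];
  const projection = { firstname: 1, lastname: 1, email: 1, _id: 0 };
  // One query per collection, run in parallel.
  const results = await Promise.all(
    sources.map((name) => db.collection(name).find({}, { projection }).toArray())
  );
  // Merge into one array, tagging each record with the collection it came from.
  return results.flatMap((docs, i) => docs.map((doc) => ({ ...doc, source: sources[i] })));
}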
"I am currently evaluating mongodb for a project I have started but I can't find any information on what the equivalent of an SQL view in mongodb would be"
In addition to the answers above, MongoDB now has on-demand materialized views. In a nutshell, this feature lets you use aggregate with $merge (new in 4.2) to create or update a "quick view" collection that you can query faster. The strategy is to refresh that quick-view collection whenever the main collection changes. Unlike a SQL view, this has the side effect of increasing your data storage size, but the benefits can be huge depending on your querying needs.
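A sketch of what such a materialized view might look like; the orders collection, its fields, and the daily_totals target collection are invented for illustration.

// Re-run this on a schedule (or after bulk changes) to refresh the view.
function refreshDailyTotals(db) {
  return db.collection('orders').aggregate([
    { $group: {
        _id: { $dateToString: { format: '%Y-%m-%d', date: '$createdAt' } },
        total: { $sum: '$amount' },
        count: { $sum: 1 } } },
    // Upsert the results into the precomputed collection.
    { $merge: { into: 'daily_totals', whenMatched: 'replace', whenNotMatched: 'insert' } }
  ]).toArray();  // draining the cursor is what makes the $merge run
}
// Reads then hit the small collection: db.collection('daily_totals').find().sort({ _id: -1 })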