I recently installed Couchbase Sync Gateway, and I'm having the following issues from step one:
1) When doing an initial replication from Couchbase to PouchDB through Sync Gateway, excess documents come through.
E.g. the Couchbase bucket has only 3 documents, but Sync Gateway returns more than 3.
2) Inconsistent data when using filters
There's no proper documentation or log explaining why this happens. Does anyone know why, and how to resolve these issues?
For each issue:
1. This is expected behavior from Sync Gateway. Sync Gateway keeps its own metadata on every document inserted through it, plus some _sync:* documents for its own identification/use. You can ignore these documents; a sketch of filtering them out follows below.
2. What do you mean by insertion of data by filter?
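For point 1, if those _sync:* documents show up in the client's changes feed, one way to skip them is a client-side replication filter. A minimal sketch (the database names and gateway URL are assumptions, not from the original post):

```javascript
// Minimal sketch: skip Sync Gateway's internal documents, whose IDs start with "_sync:".
const PouchDB = require('pouchdb');

const local = new PouchDB('local-copy');                            // assumed local DB name
const remote = new PouchDB('http://sync-gateway:4984/my-bucket');   // assumed gateway URL

local.replicate.from(remote, {
  live: false,
  // Ignore Sync Gateway metadata documents during the initial pull.
  filter: (doc) => !doc._id.startsWith('_sync:')
});
```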
Related
We have an active collection with 3000-5000 updates per second and an average payload size of 500 KB. We run a change stream on it, but currently we only have one connector listening to all events in the collection. This works, but we expect even more updates and are exploring ways to scale this approach horizontally. The current plan is to hash the _id key and distribute the events equally across three or more connectors.
The challenge is that, since we use fullDocument=updateLookup, we want to check whether Mongo will internally query to read each document 3 times instead of the existing one time.
We tried testing this approach by using mongocat and setting the log level to 2, but we are unable to see any query logs on the collection while updating documents, even with an active change stream with fullDocument enabled. Any ideas on how we can test this would really help.
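For reference, the planned fan-out would look roughly like the sketch below with the Node driver. It assumes ObjectId _ids and invented connection/collection names, and it slices on the last hex character of _id rather than a real hash; it is only meant to illustrate the setup in question.

```javascript
// Rough sketch only: each of the three connectors opens its own change stream
// whose $match keeps its slice of the _id space (assumes ObjectId _ids).
const { MongoClient } = require('mongodb');

const MY_SLICE = ['0', '1', '2', '3', '4', '5']; // this connector's share of the 16 hex chars

async function run() {
  const client = await MongoClient.connect('mongodb://localhost:27017'); // assumed URI
  const coll = client.db('mydb').collection('events');                   // assumed names

  const pipeline = [
    { $match: {
        operationType: { $in: ['insert', 'update', 'replace'] },
        $expr: {
          $in: [
            { $substrBytes: [{ $toString: '$documentKey._id' }, 23, 1] }, // last hex char of the ObjectId
            MY_SLICE
          ]
        }
    } }
  ];

  const stream = coll.watch(pipeline, { fullDocument: 'updateLookup' });
  stream.on('change', (event) => {
    // event.fullDocument is the looked-up document whose read cost is in question
  });
}

run().catch(console.error);
```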
I am building out some tools that require <1s response times, and using Mongo (or SQL for that matter) won't resolve that as it stands. I am pretty sure that I'll need to either use Redis or some in-memory data structure. Right now my Mongo aggregate takes almost 5 seconds to load 30 rows. All the collections are indexed, but there are lookups within lookups, which I think is where the issue lies (i.e., having a reference to the user and, with multiple users, needing to get their personal info). I also have pagination enabled and am on Mongo 5.0 (which uses indexes for sub-lookups). A few questions:
1. Should I just switch to a SQL database?
2. Should I add a change stream to the collection I need to query, update a Redis instance from it, and read from there?
3. Should I just embed the information I'm extracting from the lookups into the parent document?
4. Do both 2 and 3?
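For context, the "lookups within lookups" shape described above presumably looks something like the sketch below; every collection and field name is invented for illustration. Option 3 would amount to replacing the nested $lookup with data embedded in the parent document at write time.

```javascript
// Hypothetical reconstruction of the slow pipeline; all names are made up.
db.orders.aggregate([
  { $match: { status: 'open' } },      // indexed match
  { $skip: 0 },
  { $limit: 30 },                      // pagination: 30 rows per page
  { $lookup: {
      from: 'accounts',
      localField: 'accountId',
      foreignField: '_id',
      as: 'account',
      pipeline: [
        // nested lookup: pull the personal info of each referenced user
        { $lookup: {
            from: 'users',
            localField: 'memberIds',
            foreignField: '_id',
            as: 'members'
        } }
      ]
  } }
]);
```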
Issue
I have at least 10 text files (CSV), each about 5 GB in size. There is no issue when I import the first text file, but when I start importing the second text file it shows the maximum size limit (16 MB) error.
My primary purpose for using the database is to search for customers using a customer_id index.
Given below are the details of one CSV file.
| Collection Name | Documents | Avg. Document Size | Total Document Size | Num. Indexes | Total Index Size | Properties |
|---|---|---|---|---|---|---|
| Customers | 8,874,412 | 1.8 KB | 15.7 GB | 3 | 262.0 MB | |
To overcome this, the MongoDB community recommends GridFS, but the problem with GridFS is that the data is stored as bytes and it's not possible to query for a specific index in the text file.
I don't know if it's possible to query for a specific index in a text file when using GridFS. If someone knows, any help is appreciated.
The other solution I thought about was creating multiple instances of MongoDB running on different ports. Is this method feasible?
But most of the tutorials on multiple instances show how to create a replica set, thereby storing the same data in the PRIMARY and the SECONDARY.
The SECONDARY instances don't allow writes and only allow reads.
Is it possible to create multiple instances of MongoDB without creating a replica set, with both read and write operations on each? If yes, how? Can this method overcome the 16 MB limit?
The second solution I thought about was sharding the collection. Can this method overcome the 16 MB limit? If yes, any help regarding this is appreciated.
Of the two solutions, which is more efficient for searching data (in terms of speed)? As I mentioned earlier, I just want to search for customers in this database.
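For reference, the access pattern described above ("searching for customers using a customer_id index") would ordinarily be a plain indexed find on a single collection. A mongo shell sketch (the collection name and sample value are assumptions):

```javascript
// Assumed collection name "Customers"; the lookup value is made up.
db.Customers.createIndex({ customer_id: 1 });
db.Customers.find({ customer_id: "CUST-0012345" });
```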
The error message shows exactly where the problem is: entry #8437: line 13530, column 627
Have a look at that spot in the file and correct it there.
The error extraneous " in field ... is quite clear: in your CSV file you have an opening quote " that is never closed, i.e. the rest of the entire file is considered as one single field.
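If the file is too large to open comfortably in an editor, a throwaway script can print the spot the error message points at. A rough sketch (the file name is an assumption):

```javascript
// Print some context around line 13530, column 627 from the error message.
const fs = require('fs');
const readline = require('readline');

const BAD_LINE = 13530;  // from the mongoimport error
const BAD_COL = 627;

let n = 0;
readline
  .createInterface({ input: fs.createReadStream('customers.csv') }) // assumed file name
  .on('line', (line) => {
    n += 1;
    if (n === BAD_LINE) {
      console.log(line.slice(Math.max(0, BAD_COL - 60), BAD_COL + 60));
    }
  });
```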
I'm currently experimenting with a test collection on a LAN-accessible MongoDB server and data in a Meteor (v1.6) application. My view layer of choice is React, and right now I'm using createContainer to bind the subscriptions to props.
The data that gets put in the MongoDB storage is updated on a daily basis and consists of a big set of data from several SQL databases, netting up to about 60000 lines of JSON per day. The data has been ever-so-slightly reshaped to be turned into a usable format whilst remaining as RAW as I'd like it to be.
The working solution right now is fetching all this data and doing further manipulations client-side to prepare the data for visualization. The issue should seem obvious: each client is fetching a set of documents that grows every day and repeats a lot of work on earlier entries before being ready to display. I want to do this manipulation on the server, through MongoDB's Aggregation Framework.
My initial idea is to do the aggregations on the server and to create new Collections containing smaller, more specific datasets without compromising the RAWness of the original Collection. That would mean the "reduced" Collections can still be reactive, as I've been able to confirm through testing in a Remote Desktop, subscribing to an aggregated Collection which I can update through Robo3T.
I don't know if this would be ideal. As far as storage goes, there's plenty of room for the extra Collections. But I have no idea how to set up an automated aggregation script on said server. And regarding Meteor, I've tried using meteorhacks:aggregate and jcbernack:reactive-aggregate but couldn't figure out how to deal with either one of them. If anyone is dealing with, or has dealt with, something similar, I'd love to hear ideas/suggestions.
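As one possible shape for the "automated aggregation on the server" part, without either of those packages, the raw driver collection can run the pipeline and write the result into the reduced Collection with $out. A sketch with placeholder collection names and a placeholder pipeline:

```javascript
// Server-side sketch: periodically re-run an aggregation over the raw data and
// write the result into a smaller collection that clients subscribe to.
// All collection names and pipeline stages here are placeholders.
import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';

const RawData = new Mongo.Collection('raw_data');           // the big daily import
const DailyTotals = new Mongo.Collection('daily_totals');   // the reduced, reactive collection

async function rebuildDailyTotals() {
  await RawData.rawCollection().aggregate([
    { $group: { _id: '$day', total: { $sum: '$value' } } },  // placeholder reduction
    { $out: 'daily_totals' }                                 // overwrite the reduced collection
  ]).toArray();                                              // $out only runs once the cursor is consumed
}

Meteor.startup(() => {
  rebuildDailyTotals();                      // or schedule it, e.g. with a cron-style package
  Meteor.publish('dailyTotals', () => DailyTotals.find());
});
```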
I created my database with MongoDB, then I created a model in Django, and now I want order_by('?') to order randomly, but the order does not change.
I am using Django 1.4.1.
Thanks.
The MongoDB server (as at 2.2) does not support returning query results in random order.
One possible workaround, using a Random Attribute, is described in the MongoDB Cookbook.
Another, less performant, option would be to use a combination of count, skip, and limit to find a random document.
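Both workarounds look roughly like this in the mongo shell (the collection name is an assumption):

```javascript
// mongo shell sketch of the two workarounds above; "items" is an assumed collection.

// 1) Random attribute: store a random number on each document at insert time,
//    then query relative to a freshly generated one.
db.items.insert({ name: 'example', random: Math.random() });
var r = Math.random();
var doc = db.items.findOne({ random: { $gte: r } }) ||
          db.items.findOne({ random: { $lt: r } });   // wrap around if nothing is >= r

// 2) count + skip + limit: jump to a random offset (slower on large collections).
var offset = Math.floor(Math.random() * db.items.count());
var doc2 = db.items.find().skip(offset).limit(1).next();
```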
You can vote or watch SERVER-533 in the MongoDB issue tracker, which is a feature request for getting random items from a collection. There is some further discussion on the Jira issue as well.