how can I reset broken _ids - mongodb

I have imported my database entries from a CSV file into MongoDB, and in the process the imported entries' _ids didn't end up in the typical randomised format. For example, instead of looking like
"_id": "R6AnePqKecNNe7dkr"
they look like
"_id": 28
For some reason this means that when I try to refer to the database entry with an _id of 28, it can't be found. I thought the issue might be that the _id is not a string, since the values aren't in quotes, but I've tried converting them to strings and that doesn't make a difference. I've also tried writing a function that randomly generates a new _id, but it can't find the old _id to replace with the new one.
Is there any way to do a bulk "generate new _ids for everything in this collection"? I'm having trouble finding anything equivalent.
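One possible direction, sketched under assumptions: an imported _id of 28 is a number, so a query for the string "28" will not match it, and because MongoDB does not allow the _id of an existing document to be changed, a common workaround is to insert a copy under a new _id and then remove the original. The sketch below shows only the in-memory step. The names randomId, withNewId, ALPHABET and legacyId are hypothetical, and the 17-character length simply mirrors the sample id in the question.

```javascript
const crypto = require('crypto');

// Assumption: an alphabet of unambiguous letters and digits, similar in
// appearance to the sample id "R6AnePqKecNNe7dkr".
const ALPHABET = '23456789ABCDEFGHJKLMNPQRSTWXYZabcdefghijkmnopqrstuvwxyz';

// Build a random string id of the given length.
function randomId(length = 17) {
  let id = '';
  for (let i = 0; i < length; i++) {
    id += ALPHABET[crypto.randomInt(ALPHABET.length)];
  }
  return id;
}

// Return a copy of the document under a fresh string _id, keeping the old
// value in a hypothetical legacyId field so it can still be traced.
function withNewId(doc) {
  return { ...doc, _id: randomId(), legacyId: doc._id };
}

const imported = { _id: 28, name: 'example' };
const copy = withNewId(imported);
console.log(copy);
```

Against a real collection, each copy would be inserted and the original removed using its numeric _id (28, not "28").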

Related

Converting Parse objectId to Mongo ObjectId?

I'm trying to migrate data from Parse to a new project that uses Mongo as its database (without Parse/Parse Server). Since the schemas are different between the two projects, I'm manually writing a migration script to achieve this.
As I understand it, Parse appears to use 10-character-long IDs for their objects (combinations of digits, lower-case letters, and upper-case letters), while Mongo uses 24-character-long IDs (12 bytes represented as hex).
Right now, when migrating data for a document from the old project to the new one, I'm using a function that converts the Parse ID to a unique Mongo ObjectId (it converts each character to a 2-digit hex value, then pads the 20-character string with 4 zeroes).
Is this a good approach? I'm avoiding using Mongo's automatic ObjectId generation in case I ever need to re-migrate any of the old Parse documents and find the matching document in the new database. I know automatically generated ObjectIds in Mongo also embed some other information like creation dates, but I don't think this would be important and I can just use my custom ObjectId generator? However, I'm not sure about the implications for performance/if I'm just going about this migration the wrong way.
The approach I recommend is letting Mongo auto-generate the ids and then storing Parse's ids in a new field called parseId for future reference if needed.
For example:
PARSE DATA:
"_id": ObjectId(1234567890),
"title": "Mongo Migrate",
"description": "Migrating from Parse to Mongo"
MONGO DATA:
"_id": ObjectId(1ad83e4k2ab8e0daa8ebde7), //mongo generated
"parseId":ObjectId(1234567890),
"title": "Mongo Migrate",
"description": "Migrating from Parse to Mongo"
Then if you need to match a document between the two databases later, you can write a script that goes along the lines of Parse.find({"_id": Mongo.parseId}).....
MongoDB uses _id as the primary key by default, and _id has to be unique to avoid collisions. The way you are generating a unique ObjectId for each _id is fine. As long as the values are unique, you could even reduce the 20-character pad to save space.
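For reference, the conversion described in the question (each character to a 2-digit hex value, then 4 zeroes of padding) could be sketched as follows. The function names are hypothetical, and placing the padding at the end is an assumption, since the question doesn't say where the zeroes go.

```javascript
// Convert a 10-character Parse id into a 24-character hex string
// suitable for constructing an ObjectId.
function parseIdToObjectIdHex(parseId) {
  if (!/^[0-9A-Za-z]{10}$/.test(parseId)) {
    throw new Error('expected a 10-character alphanumeric Parse id');
  }
  const hex = Array.from(parseId, (ch) =>
    ch.charCodeAt(0).toString(16).padStart(2, '0')
  ).join('');
  // 20 hex characters so far; assumption: pad with trailing zeroes to 24.
  return hex.padEnd(24, '0');
}

// Reverse the mapping, so a migrated document can be traced back to Parse.
function objectIdHexToParseId(hex) {
  let parseId = '';
  for (let i = 0; i < 20; i += 2) {
    parseId += String.fromCharCode(parseInt(hex.slice(i, i + 2), 16));
  }
  return parseId;
}

const hexId = parseIdToObjectIdHex('1234567890');
console.log(hexId);                       // 313233343536373839300000
console.log(objectIdHexToParseId(hexId)); // 1234567890
```

Because the mapping is reversible, re-migrating a Parse document always yields the same _id, which is the property the question is after.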

Flow Router doesn't work with ObjectID. Any fix?

I'm trying to build routes in my Meteor app. Routing works perfectly fine but getting information from db with route path just doesn't work. I create my page specific routes with this:
FlowRouter.route('/level/:id'...
This route takes me to related template without a problem. Then I want to get some data from database that belong to that page. In my template helpers I get my page's id with this:
var id = FlowRouter.getParam('id');
This gets the ObjectID() but in string format. So I try to find that ObjectID() document in the collection with this:
Levels.findOne({_id: id});
But of course the documents don't have ObjectIDs in string format (otherwise we wouldn't call it "object"id). Hence, it gives an undefined error. I don't want to deal with creating my own _ids, so is there anything I can do about this?
PS: Mongo used to create _ids as plain text, something like what I would get with _id._str now, but all of a sudden it generates ObjectID(). I don't know why; any ideas?
MongoDB uses ObjectIds as _ids by default, while Meteor explicitly sets GUID strings by default.
Perhaps you inserted using a meteor shell session in the past and now used a mongo shell/GUI or a meteor mongo prompt to do so, which resulted in ObjectIds being created.
If this happens in a development environment, you could generate the data again.
Otherwise, you could try to generate new _ids for your data using Meteor.uuid().
If you want to use ObjectId as the default for a certain collection, you can specify the idGeneration option to its constructor as 'MONGO'.
If you have the string content of an ObjectId and want to convert it, you can issue
let _id = new Mongo.ObjectID(my24HexCharString);
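Since a route parameter can be any string, it may be worth checking that it really is a 24-character hex string before handing it to Mongo.ObjectID. A minimal sketch, where isObjectIdHex is a hypothetical helper name:

```javascript
// True only for strings made of exactly 24 hexadecimal characters.
function isObjectIdHex(value) {
  return typeof value === 'string' && /^[0-9a-fA-F]{24}$/.test(value);
}

console.log(isObjectIdHex('5210a64f846cb004b5000001')); // true
console.log(isObjectIdHex('R6AnePqKecNNe7dkr'));        // false

// In the template helper from the question this could be used roughly as:
//   const id = FlowRouter.getParam('id');
//   const level = isObjectIdHex(id)
//     ? Levels.findOne({ _id: new Mongo.ObjectID(id) })
//     : Levels.findOne({ _id: id });
```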

How do I make a mongo query for something that is not in a subdocument array of heterodox size?

I have a MongoDB collection of 65k+ documents, each one with a property named site_histories. Its value is an array that might be empty, or might not be. If it is not empty, it will have one or more objects similar to this:
"site_histories" : "[{\"site_id\":\"129373\",\"accepted\":\"1\",\"rejected\":\"0\",\"pending\":\"0\",\"user_id\":\"12743\"}]"
I need to make a query that will look for every instance in the collection of a document that does not have a given user_id.
I'm pretty new to Mongo, so I was trying to make a query that would find every instance that does have the given user_id, which I was then planning on adding a "$ne" to, but even that didn't work. This is the query I was using that didn't work:
db.test.find({site_histories: { $elemMatch: {user_id: '12743\' }}})
So can anyone tell me why this query didn't work? And can anyone help me format a query that will do what I need the final query to do?
If your site_histories really is an array, it should be as simple as doing:
db.test.find({"site_histories.user_id": "12743"})
That looks in all the elements of the array.
However, I'm a bit scared of all those backslashes. If site_histories is actually a string, that won't work. It would mean the schema is poorly designed, and you might have to try $regex instead.
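If the field does turn out to be a JSON-encoded string, as the backslashes suggest, one way to see what is going on is to decode it on the client and test the entries there. This is only a sketch under that assumption; hasUser is a hypothetical helper, and the sample value is the one from the question with the escaping removed.

```javascript
// Accept either a real array or a JSON-encoded string holding an array.
function hasUser(siteHistories, userId) {
  const entries =
    typeof siteHistories === 'string' ? JSON.parse(siteHistories) : siteHistories;
  return Array.isArray(entries) && entries.some((e) => e.user_id === userId);
}

const raw =
  '[{"site_id":"129373","accepted":"1","rejected":"0","pending":"0","user_id":"12743"}]';

console.log(hasUser(raw, '12743'));  // true
console.log(!hasUser(raw, '99999')); // true: this document lacks user 99999
console.log(hasUser([], '12743'));   // false: an empty array has no users
```

Decoding the strings once and storing real arrays would let the dotted query above work directly.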

Generation of _id vs. ObjectId autogeneration in MongoDB

I'm developing an application that creates permalinks. I'm not sure how to save the documents in MongoDB. There are two strategies:
ObjectId autogeneration
MongoDB autogenerates the _id. I need to create an index on the permalink field because I look up the information by permalink. I can also access the creation time of the ObjectId using the getTimestamp() method, so the datetime field seems redundant; but if I delete this field, I need two calls to MongoDB: one to fetch the information and another to fetch the timestamp.
{
"_id": ObjectId("5210a64f846cb004b5000001"),
"permalink": "ca8W7mc0ZUx43bxTuSGN",
"data": "a lot of stuff",
"datetime": ISODate("2013-08-18T11:47:43.460+-100")
}
Generate _id
I generate the _id with the permalink.
{
"_id": "ca8W7mc0ZUx43bxTuSGN",
"data": "a lot of stuff",
"datetime": ISODate("2013-08-18T11:47:43.460+-100")
}
I don't see any advantage in using ObjectIds. Am I missing something?
ObjectIds are there for situations where you don't have a unique key for every document in a collection. They're unique, so you don't have to worry about conflicts, and they shard reasonably well in large deployments without too much worry (they have their pros and cons).
The ObjectId also contains the timestamp of the client where the ObjectId was generated (unless the DB server is configured to generate all keys). With that, as you noticed, you can use the time stamp to perform some date operations. However, if you plan on using the Aggregation Framework, you'll find that you can't use an ObjectId in any date operations currently (issue). If you want to use the AF, you'll need a second field that contains the date, unfortunately doubly storing it with the ObjectId's internal value.
If you can be assured that the _id you're generating is unique, then there's not much reason to use an ObjectId in your data structure.
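To illustrate the embedded timestamp: the first 4 bytes (8 hex characters) of an ObjectId hold the creation time in seconds since the Unix epoch, so it can be decoded from the hex string without another call to the database. A minimal sketch, with objectIdTimestamp as a hypothetical helper name, using the _id from the question:

```javascript
// Decode the creation time stored in the leading 4 bytes of an ObjectId.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error('expected a 24-character hex string');
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdTimestamp('5210a64f846cb004b5000001');
console.log(created.toISOString()); // 2013-08-18T10:47:43.000Z
```

The result has only second-level resolution, so the milliseconds in the separate datetime field are not recoverable from the ObjectId.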

MongoDB - forcing stored value to uppercase and searching

In the SQL world I could do something to the effect of:
SELECT name FROM table WHERE UPPER(name) = UPPER('Smith');
and this would match a search for "Smith", "SMITH", "SmiTH", etc... because it forces the query and the value to be the same case.
However, MongoDB doesn't seem to have this capability without using a RegEx, which won't use indexes and would be slow for a large amount of data.
Is there a way to convert a stored value to a particular case before doing a search against it in MongoDB?
I've come across the $toUpper aggregate, but I can't figure out how that would be used in this particular case.
If there's no way to convert stored values before searching, is it possible to have MongoDB convert a value when it's created in Mongo? So when I add a document to the collection, it would force the "name" attribute to a particular case? Something like a callback in the Rails world.
It looks like there's the ability to create stored JS for MongoDB as well, similar to a Stored Procedure. Would that be a feasible solution as well?
Mostly looking for a push in the right direction; I can figure out the particular code once I know what I'm looking for, but so far I'm not even sure if my desired functionality is doable.
You have to normalize your data before storing them. There is no support for performing normalization as part of a query at runtime.
The simplest thing to do is probably to save both a case-normalized (i.e. all-uppercase) and display version of the field you want to search by. Suppose you are storing users and want to do a case-insensitive search on last name. You might store:
{
_id: ObjectId(...),
first_name: "Dan",
last_name: "Crosta",
last_name_upper: "CROSTA"
}
You can then create an index on last_name_upper, and query like:
> db.users.find({last_name_upper: "CROSTA"})
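The normalization itself would live in application code, applied both when saving a document and when building the query. A minimal sketch, where withNormalizedLastName and lastNameQuery are hypothetical helper names:

```javascript
// Add the case-normalized copy of last_name before the document is saved.
function withNormalizedLastName(user) {
  return { ...user, last_name_upper: user.last_name.toUpperCase() };
}

// Normalize the search term the same way, so any casing matches.
function lastNameQuery(term) {
  return { last_name_upper: term.toUpperCase() };
}

const user = withNormalizedLastName({ first_name: 'Dan', last_name: 'Crosta' });
console.log(user.last_name_upper); // CROSTA
console.log(lastNameQuery('cRoStA')); // { last_name_upper: 'CROSTA' }
```

The object returned by lastNameQuery can be passed straight to find(), and the stored last_name keeps its original casing for display.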