Are indexes deleted after deleting file in GridFS?

I am using MongoDB GridFS via the Java driver 2.13.
I inserted a file:
File file = new File("/home/dev/abc.pdf");
GridFSInputFile inputFile = gfs.createFile(file);
inputFile.save();
I removed it using its _id, which is p_id in this case:
DBObject query = new BasicDBObject("_id", p_id);
gfs.remove(query);
I came to know that GridFS maintains a compound index on the _id of the file's metadata document (files_id) and the chunk number (n) in the chunks collection.
Are these indexes deleted after deleting the file in GridFS?

Index changes happen synchronously with data changes. When you delete a file, the indexes are updated as part of the same operation.
Are these indexes deleted after deleting the file in GridFS?
You're likely just confused about terminology, so I'll clarify. "Deleting an index" means un-indexing a collection (removing the index data for all documents and not maintaining it in the future). What happens here is an "index update": the relevant index entries are updated when you insert, change, or remove the corresponding documents.
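To make this concrete, here is a minimal sketch with the 2.x Java driver from the question (host, port and database name are assumptions): removing a file deletes its documents from fs.files and fs.chunks, and the index entries for those documents are removed with them, while the indexes themselves stay defined.

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.MongoClient;
import com.mongodb.gridfs.GridFS;
import org.bson.types.ObjectId;

public class GridFsRemoveCheck {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("localhost", 27017);
        DB db = client.getDB("test");              // database name is an assumption
        GridFS gfs = new GridFS(db);               // default bucket: fs.files / fs.chunks

        ObjectId p_id = new ObjectId(args[0]);     // the file's _id (p_id in the question)

        // Deletes the file's metadata document and all of its chunk documents;
        // their index entries are removed as part of the same operation.
        gfs.remove(new BasicDBObject("_id", p_id));

        // The compound index { files_id: 1, n: 1 } on fs.chunks is still defined afterwards;
        // only the entries that pointed at the deleted chunks are gone.
        System.out.println(db.getCollection("fs.chunks").getIndexInfo());

        client.close();
    }
}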

Related

Unique multi key hashed index in MongoDB

I have a collection with several billion documents and need to create a unique multi-key index for every attribute of my documents.
The problem is, I get an error if I try to do that because the generated keys would be too large.
pymongo.errors.OperationFailure: WiredTigerIndex::insert: key too large to index, failing
I found out MongoDB lets you create hashed indexes, which would resolve this problem; however, they cannot be used as multi-key indexes.
How can I resolve this?
My first idea was to create another attribute for each of my documents containing a hash of all of its attribute values, and then to create an index on that new field.
However, this would mean recalculating the hash every time I add a new attribute, on top of the excessive amount of time needed to create both the hashes and the indexes.
This behavior was added in MongoDB 2.6 to prevent the total size of an index entry from exceeding 1024 bytes (also known as the Index Key Length Limit).
In MongoDB 2.6, if you attempt to insert or update a document so that the value of an indexed field is longer than the Index Key Length Limit, the operation will fail and return an error to the client. In previous versions of MongoDB, these operations would successfully insert or modify a document but the index or indexes would not include references to the document.
For migrations and other temporary scenarios you can fall back to the 2.4 handling of this case, where the error is not raised, by setting the following MongoDB server parameter:
db.getSiblingDB('admin').runCommand( { setParameter: 1, failIndexKeyTooLong: false } )
This, however, is not recommended.
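If you need to set that parameter from application code rather than the shell, a rough equivalent with the 2.x Java driver looks like this (connection details are assumptions, and the same caveat applies: not recommended outside of a migration):

import com.mongodb.BasicDBObject;
import com.mongodb.MongoClient;

public class DisableIndexKeyLimitCheck {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("localhost", 27017);
        // Same effect as the shell command above: revert to the 2.4 behaviour
        // of silently skipping index entries that are too long.
        client.getDB("admin").command(
                new BasicDBObject("setParameter", 1).append("failIndexKeyTooLong", false));
        client.close();
    }
}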
Also consider that creating indexes for every attribute of your documents may not be the optimal solution at all.
Have you examined how you query your documents and which fields you key on? Have you used explain to view the query plan? It would be an exception to the rule if you told us that you query on all fields all the time.
Here are the recommended MongoDB indexing strategies.
Excessive indexing has a price as well and should be avoided.
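If, after reviewing your query patterns, you still want the hash-field workaround described in the question, a minimal sketch could look like the following (the field names bigAttr and bigAttrHash are hypothetical, and a digest field like this only supports equality lookups, not range queries):

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.DBObject;
import java.security.MessageDigest;

public class HashedFieldSketch {
    // Fixed-length hex digest, so the indexed value never approaches the 1024-byte limit.
    static String sha256Hex(String value) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest(value.getBytes("UTF-8"))) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    static void insertWithDigest(DBCollection coll, DBObject doc) throws Exception {
        // Store a digest of the large attribute value alongside the document.
        doc.put("bigAttrHash", sha256Hex(String.valueOf(doc.get("bigAttr"))));
        coll.insert(doc);
    }

    static void createDigestIndex(DBCollection coll) {
        // Index the short digest instead of the large attribute value itself.
        coll.createIndex(new BasicDBObject("bigAttrHash", 1));
    }
}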

Get latest document inserted in MongoDb

How do I get the latest document inserted into a (standalone, no replica set) MongoDB instance, across existing collections?
And how do I get all documents inserted after this document?
This can only be done with a replica set. Please follow the tutorial to convert a standalone instance to a replica set.
You can get a reference to the last inserted document from the oplog (the oplog.rs collection in the local database):
db.oplog.rs.find({op:"i"}).sort({$natural: -1}).limit(1);
The ns field contains the database and collection name, and o._id contains the inserted document's identifier.
To get references to documents that were inserted after it, you can later use the ts field of the document you retrieved in the previous query:
db.oplog.rs.find({op:"i", ts: {$gt: last.ts}});
Note that this command can cause MongoDB to load a lot of data into memory; if oplog.rs is very big, memory usage will be high:
db.oplog.rs.find({op:"i"}).sort({$natural: -1}).limit(1);
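For completeness, the same two oplog queries through the 2.x Java driver might look like this (host and port are assumptions; this still requires a replica set, since oplog.rs lives in the local database):

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.DBCursor;
import com.mongodb.DBObject;
import com.mongodb.MongoClient;

public class OplogLatestInsert {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("localhost", 27017);
        DBCollection oplog = client.getDB("local").getCollection("oplog.rs");

        // Most recent insert entry (op: "i"), scanning the oplog in reverse natural order.
        DBObject last = oplog.find(new BasicDBObject("op", "i"))
                .sort(new BasicDBObject("$natural", -1))
                .limit(1)
                .next();
        System.out.println(last.get("ns") + " " + ((DBObject) last.get("o")).get("_id"));

        // Every insert recorded after that entry, filtered by its ts field.
        DBCursor newer = oplog.find(new BasicDBObject("op", "i")
                .append("ts", new BasicDBObject("$gt", last.get("ts"))));
        while (newer.hasNext()) {
            System.out.println(newer.next());
        }
        client.close();
    }
}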

mongodb datatypes when importing

I want to clone a collection to a new collection, remove all the documents, and then import new documents from a csv file. When I do the copy using copyTo, everything works fine. The datatypes are copied over from the source collection to the new collection. However, after I remove all the documents from the new collection and import from the csv, the datatypes are lost. The datatypes in my source csv are already set up to match what is in the source collection I copied from.
Is there a way to preserve the datatypes after removing all documents from a collection?
How can I copy the datatypes from my csv when importing? For example my date columns show as string.
A new collection doesn't have a fixed schema so documents added don't have to be similar unless you've created the collection using the validator option. You can also add validation to an existing collection. See Document Validation in the MongoDB manual.
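If you want MongoDB to at least reject wrongly typed documents when you re-import, here is a rough sketch of creating a collection with a validator through the Java driver (requires MongoDB 3.2 or newer; the database, collection and orderDate field names are hypothetical). A validator rejects documents whose fields have the wrong BSON type; it does not convert CSV strings into dates for you.

import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.MongoClient;

public class CreateValidatedCollection {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("localhost", 27017);
        DB db = client.getDB("test");
        // Create the collection with a validator requiring orderDate to be a BSON date.
        db.command(new BasicDBObject("create", "orders")
                .append("validator", new BasicDBObject("orderDate",
                        new BasicDBObject("$type", 9))));  // 9 = BSON date type
        client.close();
    }
}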

Easiest way to replace existing mongo collection with new one in Meteor

I have a csv file that I've imported into a Meteor project and I've updated the csv file (added a couple of columns with data) and I'd like to re-import the csv file. If I just import it again, will it over-write the first one? Or would I have two collections with the same name? What's the best way to do this?
If you re-import the file again, it will insert into the collection, not update it.
So if your collection has a unique index on a field (like _id, which is indexed and unique by default) and that field is a column in the csv file, then when you import again MongoDB will throw an error saying you have violated a unique constraint and stop; your old data is untouched.
If not, i.e. your collection doesn't have any other unique index and _id is not a column in the csv file, then after re-importing your collection will contain duplicate records: the old data plus the new data you just imported.
Either way, the result is not what you wanted.
You can't have 2 collections with the same name in the same database.
The easiest way: if your old data is not important, you could just drop the collection and import again.
Otherwise you will have to update the documents in MongoDB yourself (using the mongo console or a script).
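For the drop-and-import option above, mongoimport can do both in one step; a sketch (database, collection and file names are assumptions, and for a Meteor development app the local MongoDB usually listens on port 3001 with the database name meteor):

mongoimport --host localhost:3001 --db meteor --collection mycollection \
    --drop --type csv --headerline --file data.csv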

Replace into (mysql) equivalent in mongodb

I want to do a batch insert in MongoDB, but if a record already exists it should be replaced with the new one. There is the update command, but it's not possible to do it in batch. Any idea whether this is possible? I am using the Java API.
Thanks
Edit:
As my collection is not very huge, I am renaming the collection with the dropTarget option set to true and creating a new collection with this data. As I can't risk deleting and recreating the collection, this is better, but it would be awesome if there were a REPLACE INTO equivalent.
If your documents include their primary key, then the existing record will be replaced automatically. Make sure your documents have an _id key.
Look at the MongoDB documentation:
Shorthand for insert/update is save - if _id value set, the record is updated if it exists or inserted if it does not; if the _id value is not set, then the record is inserted as a new one.
in http://mongodb.github.io/node-mongodb-native/markdown-docs/insert.html
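Since the question uses the Java API, here is a minimal sketch of the same idea with the 2.x driver (collection and field names are hypothetical): save() replaces the whole document when its _id already exists and inserts it otherwise, which is the closest built-in analogue of MySQL's REPLACE INTO.

import com.mongodb.BasicDBObject;
import com.mongodb.DBCollection;
import com.mongodb.MongoClient;

public class SaveAsReplaceInto {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("localhost", 27017);
        DBCollection coll = client.getDB("test").getCollection("items");

        BasicDBObject[] batch = {
                new BasicDBObject("_id", 1).append("name", "first"),
                new BasicDBObject("_id", 2).append("name", "second")
        };
        for (BasicDBObject doc : batch) {
            coll.save(doc);   // insert-or-replace by _id, like MySQL's REPLACE INTO
        }
        client.close();
    }
}

Note that this is one round trip per document; with driver 2.12+ you can also batch replace-by-_id upserts through initializeUnorderedBulkOperation() and find(query).upsert().replaceOne(doc).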