Will MongoDB GridFS MyFiles.files get deleted automatically?

I have some files uploaded to GridFS, which creates Files.files and Files.chunks. Now I see that the Files.files data got deleted, but the Files.chunks data is still there. I can't access the files without the Files.files metadata (ObjectId). Do these files get deleted automatically after some time, or once some storage limit is reached?
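GridFS itself never removes fs.files or fs.chunks documents on a timer; if the metadata documents are gone, something (an application, a script, or a manual delete) removed them, and the remaining chunks are orphaned. Below is a minimal pymongo sketch, assuming the default fs.* collection names and a placeholder database name mydb, that finds chunk groups whose parent metadata document no longer exists:

    # Sketch: find fs.chunks groups whose parent fs.files entry is gone.
    # Assumes pymongo; adjust "fs" to your bucket prefix (e.g. "Files")
    # and "mydb" to your database name.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    db = client["mydb"]

    files = db["fs.files"]
    chunks = db["fs.chunks"]

    # The file ids that fs.chunks still references.
    referenced_ids = chunks.distinct("files_id")

    # Any referenced id with no matching fs.files document is an orphaned chunk group.
    existing_ids = {f["_id"] for f in files.find({"_id": {"$in": referenced_ids}}, {"_id": 1})}
    orphaned_ids = [i for i in referenced_ids if i not in existing_ids]

    print(f"{len(orphaned_ids)} orphaned chunk groups")

    # Orphaned chunks are unreadable without their fs.files metadata, so they can
    # usually be removed to reclaim space (verify before deleting!):
    # chunks.delete_many({"files_id": {"$in": orphaned_ids}})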

Related

MongoDB Collection Data Storing in File process

If I want to take a backup of the MongoDB data files and transfer it to a different server, how can I do that? In the data path I can see a lot of files with the prefixes collection and index, ending in *.wt.
I tried copying all of the files, but the service would not stay up afterwards.
I'm trying to take the data from version 3.2 and restore it in version 5.
Using mongoimport and mongoexport works, but the challenge is that it cannot be done on production data, as the data size is 8 TB+.
So I'm looking for a solution where we can copy just the data files from the data path and move them into the version 5 data path.
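For reference, here is a hedged sketch of the usual binary alternative to mongoimport/mongoexport: mongodump and mongorestore driven from a small script. The host names, port, and dump directory are placeholders, and MongoDB's documented upgrade path for in-place data files moves one major version at a time, so a direct 3.2-to-5.0 restore should be verified on a test system first.

    # Hedged sketch: dump from the old deployment and restore into the new one.
    # Hosts, ports, and DUMP_DIR are placeholders; --gzip keeps the dump smaller.
    import subprocess

    DUMP_DIR = "/backups/dump-3.2"   # placeholder path

    # Dump everything from the 3.2 deployment.
    subprocess.run(
        ["mongodump", "--host", "source-host", "--port", "27017",
         "--gzip", "--out", DUMP_DIR],
        check=True,
    )

    # Restore into the version 5 deployment.
    subprocess.run(
        ["mongorestore", "--host", "target-host", "--port", "27017",
         "--gzip", DUMP_DIR],
        check=True,
    )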

Maintaining Generation Data when Backing Up Versioned Google Cloud Storage Buckets

My use cases:
1. I would like to store multiple versions of text files in Cloud Storage and save the createdAt timestamp at which each version was created.
2. I'd like to be able to retrieve a list of all versions and createdAt times without opening and reading the files.
3. I'd also like to create nightly backups of the bucket with all the versions intact, and for each file to keep a record of its original createdAt time.
My solutions:
1. I have a Cloud Storage bucket with versioning enabled. Every time I save a file test, I get a new version test#generation_number.
2. I can list all versions and fetch an older version with test#generation_number.
3. I can back up all versions of test in the bucket using gsutil cp -A gs://my-original-bucket/test gs://my-backup-bucket.
The issue is with point #3: the #generation_number of each backed-up version changes to the time at which that backup copy was created, not the time the original version was created. I understand this is working as intended, and the order of the versions is still intact.
But where should I stash these original createdAt values? I tried to store them in metadata, but metadata does not seem to be versioned; it appears to be global for the file object as a whole.
What is the best way to achieve my use case? Is there any way to do so directly with Google Cloud Storage? Or should I maintain a separate database with this information, instead?
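As one possible direction, here is a sketch using the google-cloud-storage Python client, with placeholder bucket and object names: list every generation of the object with versions=True, read each version's generation and time_created, and copy those createdAt values into a separate manifest or database record so they survive the backup copy, where generations are re-stamped.

    # Sketch: list all generations of one object and record their creation times.
    # Bucket name and object name are placeholders; requires google-cloud-storage.
    from google.cloud import storage

    client = storage.Client()
    BUCKET = "my-original-bucket"   # placeholder
    NAME = "test"                   # placeholder object name

    manifest = []
    # versions=True makes list_blobs return noncurrent generations as well.
    for blob in client.list_blobs(BUCKET, prefix=NAME, versions=True):
        if blob.name == NAME:
            # generation and time_created belong to this specific version.
            manifest.append({
                "name": blob.name,
                "generation": blob.generation,
                "createdAt": blob.time_created.isoformat(),
            })

    # The manifest could then be written to a small object in the backup bucket
    # or to a database row, so the original createdAt values are preserved even
    # though the backup copies receive fresh generation numbers.
    print(manifest)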

What are fs.chunks and fs.files in mLab cloud

I am saving data to mLab storage but I ran out of space, so I went into my account and realized that I had 490 MB worth of data in fs.chunks. Can I delete the fs.chunks and fs.files collections, or will something very bad happen? I'm very confused by these two collections, so more clarification would be much appreciated. Do I need fs.chunks and fs.files?
These collections are created when an application is storing data in a database using GridFS.
In general, it's much more efficient to store files in a file storage service such as AWS S3 and store references to the files in the database, as opposed to storing the files directly in the database.
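A small sketch of that pattern, assuming boto3 and pymongo with placeholder bucket and collection names: the bytes go to S3, and MongoDB keeps only a lightweight reference document.

    # Sketch: file bytes in object storage, reference in MongoDB.
    # Bucket, database, and collection names are placeholders.
    import boto3
    from pymongo import MongoClient

    s3 = boto3.client("s3")
    db = MongoClient("mongodb://localhost:27017")["mydb"]

    def save_upload(local_path: str, key: str) -> None:
        # The file itself goes to S3...
        s3.upload_file(local_path, "my-upload-bucket", key)
        # ...and the database only stores a small reference document.
        db["uploads"].insert_one({"s3_bucket": "my-upload-bucket", "s3_key": key})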

mongodb creates database files automatically after a certain period

My MongoDB database generates data files automatically after a certain period, as follows:
Doc.0
Doc.1
Doc.2
Doc.3
Doc.4
but the Doc.ns file is never regenerated like the files above.
I'm not sure exactly what, if anything, you are specifying as a problem. This is expected behavior. MongoDB allocates new data files as the data grows. The .ns file, which stores namespace information, does not grow like data files, and shouldn't need to.
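A quick way to see how much space those preallocated data files take is the dbStats command; here is a small pymongo sketch, with the database name as a placeholder (the fileSize field is reported by the old MMAPv1 engine, which preallocates the Doc.0, Doc.1, ... files).

    # Sketch: inspect allocated vs. used storage for one database.
    from pymongo import MongoClient

    db = MongoClient("mongodb://localhost:27017")["Doc"]   # placeholder db name
    stats = db.command("dbStats")

    # fileSize is only present on MMAPv1; storageSize/dataSize exist on all engines.
    print(stats.get("fileSize"), stats.get("storageSize"), stats.get("dataSize"))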

CarrierWave: save image to GridFS and upload to S3 in background

Is there any way to save an image to Mongo's GridFS and then asynchronously upload it to S3 in the background?
Maybe it is possible to chain uploaders?
The problem is this: multiple servers are used, so the server where the image was saved to disk and the server running the background process can be different machines.
Also:
1. it should be removed from GridFS once it has been uploaded to S3
2. it should be automatically removed from S3 when the corresponding entity is destroyed
Thanks.
What does your deployment architecture look like? I'm a little confused when you say "multiple servers": do you mean multiple mongod instances? Your requirements are also a bit confusing. According to requirement 1, once a file is uploaded to S3 the GridFS copy should be removed, so a file should not exist in both S3 and GridFS at the same time; requirement 2 then seems to contradict the first, i.e., the file shouldn't still be in GridFS when the entity is destroyed. Are you preserving some files in both GridFS and S3?
If you are running a replica set or sharded cluster, you could open a tailable cursor on the oplog and watch for operations on your GridFS collections (you can also do this on a single node, although it's not recommended). When you see an insert operation (it will look like 'op':'i'), you could execute a script or do something in your application to grab the file from GridFS and push it to S3. Similarly, when you see a delete operation ('op':'d'), you could summarily delete the file from S3.
The beauty of a tailable cursor is that it allows for asynchronous operations: you can have another process monitoring the oplog on a different server and performing the appropriate actions.
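A rough pymongo sketch of that idea, assuming a replica set (so local.oplog.rs exists), a placeholder database name, and hypothetical stubbed-out S3 helpers; on recent MongoDB versions a change stream would serve the same purpose with less plumbing.

    # Sketch: tail the oplog and react to GridFS inserts/deletes.
    # Assumes a replica set and that files live in mydb's default "fs" bucket.
    import time
    from pymongo import MongoClient, CursorType

    client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
    oplog = client["local"]["oplog.rs"]

    def push_to_s3(file_id):        # hypothetical helper: copy the GridFS file to S3
        print("would upload", file_id)

    def delete_from_s3(file_id):    # hypothetical helper: remove the S3 copy
        print("would delete", file_id)

    # Only watch operations on the GridFS metadata collection.
    query = {"ns": "mydb.fs.files"}
    cursor = oplog.find(query, cursor_type=CursorType.TAILABLE_AWAIT)

    while cursor.alive:
        for entry in cursor:
            if entry["op"] == "i":          # insert: a new GridFS file appeared
                push_to_s3(entry["o"]["_id"])
            elif entry["op"] == "d":        # delete: the GridFS file was removed
                delete_from_s3(entry["o"]["_id"])
        time.sleep(1)                       # brief pause before re-polling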
I used a temporary variable to store the file in GridFS and made a Worker (see this) to perform the async upload from GridFS to S3.
Hope this helps somebody, thanks.