How to export MongoDB data from Google Cloud Compute Engine > Storage > Disk?

I had a GKE cluster, with several web apps and a MongoDB database, that was deleted because of some billing problems. The GKE cluster just disappeared, but I still have the source code in another repo, so redeploying it is no problem.
If I go to the "Compute Engine > Storage > Disks" section, I can still see the disk of my MongoDB instance, but I cannot figure out how to export the data in order to populate a new Mongo database in another cloud. I have tried to create an image, but I'm not sure how to work with that image to extract the Mongo data.
I have not found any guide or tutorial about this. Any help, please?
Thanks very much

As @guillaume blaquiere mentioned:
Create a VM with a boot disk.
Add an additional disk with your MongoDB data.
Choose the Cloud Storage location to export your data to by clicking Browse.
Once you have chosen a Cloud Storage location, choose a filename for the exported data. You can use the default filename or pick your own.
After choosing a Cloud Storage location and entering a filename for the data, click Select.
On the Export image page, click Export. After you click Export, the Cloud Console displays the image export history, where you can follow the progress of the export.
Go to the Storage page to access your exported data.
Check out the link on Exporting an image for more information.
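If what you actually need is the Mongo data rather than a raw disk image, you can instead attach that disk to a VM (the first two steps above), mount it, point a mongod at its data directory, and then dump and upload the data. A rough Python sketch of the dump-and-upload step, assuming mongod is already running on localhost against the recovered data directory and that mongodump and the google-cloud-storage package are installed; the archive path and bucket name are placeholders:

    # Sketch: dump the recovered database and push the archive to Cloud Storage.
    import subprocess
    from google.cloud import storage

    DUMP_FILE = "/tmp/mongo-backup.archive.gz"   # placeholder path
    BUCKET = "my-mongo-exports"                  # placeholder bucket name

    # Dump all databases of the locally running mongod into one gzipped archive.
    subprocess.run(
        ["mongodump", "--host", "localhost:27017", "--gzip", f"--archive={DUMP_FILE}"],
        check=True,
    )

    # Upload the archive so it can be restored elsewhere.
    storage.Client().bucket(BUCKET).blob("mongo-backup.archive.gz").upload_from_filename(DUMP_FILE)

On the target side, mongorestore --gzip --archive=mongo-backup.archive.gz would then load the data into the new database.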

Related

Firestore: Copy/Import data from local emulator to Cloud DB

I need to copy Firestore DB data from my local Firebase emulator to the cloud instance. I can move data from the Cloud DB to the local DB fine, using the EXPORT functionality in the Firebase admin console. We have been working on the local database instance for 3-4 months and now we need to move it back to the Cloud. I have tried to move the local "--export-on-exit" files back to my storage bucket and then IMPORT from there into the Cloud DB, but it fails every time.
I have seen a comment by Doug at https://stackoverflow.com/a/71819566/20390759 saying that this is not possible and that the best option is to write a program that copies from local to cloud. I've started working on that, but I can't find a way to have both projects/databases open at the same time, local and cloud, since they share the same project ID, app key, etc.
Attempted IMPORT: I copied the files created by "--export-on-exit" in the emulator to my Cloud Storage bucket. Then I selected IMPORT and chose the file I had copied up to the bucket. I get this error message:
Google Cloud Storage file does not exist: /xxxxxx.appspot.com/2022-12-05 from local/2022-12-05 from local.overall_export_metadata
So I renamed the metadata file to match the directory name from the local system; the IMPORT then claims to start successfully, but fails with no error message.
I've been using Firestore for several years, but we only started using the emulator this year. Overall I like it, but if I can't easily move data back to the Cloud, I can't use it. Thanks for any ideas/help.
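One way to write such a copy program is to create two Firestore clients in the same process: one constructed while FIRESTORE_EMULATOR_HOST is set (so it talks to the emulator) and one constructed after the variable is removed (so it talks to the cloud). A rough Python sketch using the google-cloud-firestore client; the project ID and emulator address are placeholders, and only top-level collections are copied, so subcollections would need a recursive version of the inner loop:

    import os
    from google.cloud import firestore

    PROJECT_ID = "your-project-id"  # placeholder

    # Client for the emulator: the library reads FIRESTORE_EMULATOR_HOST when
    # the client is constructed.
    os.environ["FIRESTORE_EMULATOR_HOST"] = "localhost:8080"
    local_db = firestore.Client(project=PROJECT_ID)

    # Client for the real Cloud Firestore: construct it after removing the
    # emulator variable so it uses normal credentials.
    del os.environ["FIRESTORE_EMULATOR_HOST"]
    cloud_db = firestore.Client(project=PROJECT_ID)

    for col in local_db.collections():
        for snap in col.stream():
            cloud_db.collection(col.id).document(snap.id).set(snap.to_dict())
            print(f"copied {col.id}/{snap.id}")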

MLflow stores artifacts in Google Cloud Storage but does not display them in the MLflow UI

I am working in a Docker environment (docker-compose) with a Jupyter notebook image and a Postgres image for running ML models, and I'm using Google Cloud Storage to store the model artifacts. Storing the models in Cloud Storage works fine, but I can't get them to show up in the MLflow UI. I have seen similar problems, but none of the solutions used Google Cloud Storage as the storage location for artifacts. The error message says the following: "Unable to list artifacts stored under <gs-location> for the current run. Please contact your tracking server administrator to notify them of this error, which can happen when the tracking server lacks permission to list artifacts under the current run's root artifact directory." What could possibly be causing this problem?
I had exactly the same issue: docker-compose, Google Cloud Storage, storing to GCS succeeds, but listing the artifacts in the UI fails.
In my case, it turned out that if the docker-compose file assigns the env vars by reading from a .env file (e.g. GOOGLE_APPLICATION_CREDENTIALS), the server might start before the assignment happens. The quick fix is to assign the env var directly under the environment: key instead of using the env_file: key, as in the fragment below.
For sensitive data that you still need to keep in a .env file, you can add a wait time for the server, and add depends_on: in the docker-compose file to make sure the database container starts before the MLflow server if you are using a database-backed store.
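A hypothetical docker-compose fragment showing that arrangement; the service and image names and the credentials path are placeholders, not my actual setup:

    services:
      mlflow:
        image: my-mlflow-image                   # placeholder image
        environment:
          # set directly so it is available as soon as the server starts
          - GOOGLE_APPLICATION_CREDENTIALS=/secrets/gcs-service-account.json
        volumes:
          - ./secrets:/secrets:ro
        depends_on:
          - postgres                             # backend store starts first
      postgres:
        image: postgres:14
        env_file:
          - .env                                 # less critical settings can stay here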
I faced the same issue when running MLflow locally. It was resolved after adding GOOGLE_APPLICATION_CREDENTIALS to the environment variables.
https://googleapis.dev/python/google-api-core/latest/auth.html

Use Cygnus to store historical data from Orion ContextBroker in a local Hadoop database

We are currently working on a project where we use Orion Context Broker to store information from different sensors and Wirecloud to show it on a web page.
We want to store historical data from these sensors in order to show it in a graph. I have looked through the FIWARE documentation, and it recommends storing the data in a Cosmos instance on FI-LAB, through Cygnus.
The thing is that we would like to store that historical data on a local Hadoop-based server we have in our company, not in Cosmos, because we are running this project on a local network with no internet access, and we also want that information stored on our own server.
Is it possible to configure Cygnus to redirect the output data to my own filesystem? If so, which files must be configured in order to achieve this?
Thank you
The answer is yes. Cygnus is meant to persist context data in any HDFS-based filesystem (such as the one used by Cosmos), so nothing special has to be done when configuring Cygnus for a local Hadoop cluster.
If you download the latest version (0.7.0 at the moment of writing), you will need to configure:
A cygnus_instance_default.conf file, created from cygnus_instance.conf.template. This is the instance configuration. From 0.7.1 onwards it is possible to have multiple instance configurations that run in parallel; they all have to be called cygnus_instance_<whatever>.conf.
An agent.conf file, created from agent.conf.template. This is the Flume-specific configuration described in the README.md; a rough skeleton is shown below.
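For orientation only, agent.conf follows the standard Apache Flume layout, roughly like this skeleton; take the actual source and sink properties (the HDFS sink class, host, port, user, httpfs vs. webhdfs, etc.) from the agent.conf.template of your Cygnus version and point them at your local Hadoop cluster instead of Cosmos. The agent name must match the one declared in cygnus_instance_default.conf:

    # Skeleton only -- standard Flume wiring.
    cygnusagent.sources = http-source
    cygnusagent.sinks = hdfs-sink
    cygnusagent.channels = hdfs-channel

    cygnusagent.sources.http-source.channels = hdfs-channel
    cygnusagent.sinks.hdfs-sink.channel = hdfs-channel

    # cygnusagent.sources.http-source.* = ...   (copy from agent.conf.template)
    # cygnusagent.sinks.hdfs-sink.type = ...    (the HDFS sink class from the template)
    # cygnusagent.sinks.hdfs-sink.* = ...       (your local HDFS host, port, user)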

Best way to stage a file from cloud storage to a Windows machine

I want to store a data file for QuickBooks in the cloud. I understand that the data file is more of a database-in-a-file, so I know that I don't want to simply keep the data file itself in a cloud directory.
When I say 'cloud', I mean something like Google Drive or Box.com.
What I see working is that I want to write a script (a bat file, or do they have something new and improved for Windows XP, like some .NET nonsense or something?).
The script would:
1) Download the latest copy of the data file from cloud storage and put it in a directory on the local machine
2) Launch QuickBooks with that data file
3) When the user exits QuickBooks, copy the data file back up into the cloud storage.
4) Rejoice.
So, my question(s): Is there something that already does this? Is there an easily scriptable interface to work with these cloud storage options? In my ideal world, I'd be able to say 'scp google-drive://blah/blah.dat localdir' and have it copy the file down, then do the opposite after running QB. I'm guessing I'm not going to get that.
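For what it's worth, the four steps above can be scripted in a few lines if the cloud provider's desktop sync client (Google Drive or Box) already mirrors the cloud folder to the local disk and a scripting runtime is available. A rough Python sketch; every path, including the QuickBooks executable, is a placeholder, and whether launching the executable blocks until QuickBooks exits may depend on the QuickBooks version:

    import shutil
    import subprocess

    CLOUD_COPY = r"C:\Users\me\Google Drive\qb\company.qbw"        # synced cloud copy (placeholder)
    LOCAL_COPY = r"C:\qb-work\company.qbw"                         # local working copy (placeholder)
    QUICKBOOKS = r"C:\Program Files\Intuit\QuickBooks\QBW32.EXE"   # placeholder path

    shutil.copy2(CLOUD_COPY, LOCAL_COPY)                  # 1) pull the latest copy down
    subprocess.run([QUICKBOOKS, LOCAL_COPY], check=True)  # 2) launch QuickBooks, 3) wait for exit
    shutil.copy2(LOCAL_COPY, CLOUD_COPY)                  # 3) copy the data file back up
    print("4) Rejoice.")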
Intuit already provides a product that does this. It is called Intuit Data Protect, and it backs up your QuickBooks company file to the cloud for you.
http://appcenter.intuit.com/intuitdataprotect
regards,
Jarred

Restore full external ESENT backup

I've written code that creates full backups of my ESENT database, using the JetBeginExternalBackup API.
Following the MSDN guidelines, I backed up every file returned by JetGetAttachInfo and JetGetLogInfo.
I made the backup, erased the old database, and copied the backup data into the database folder.
The DB engine was unable to start; the JetInit error code is JET_errMissingLogFile.
I've checked the backup: it only contains the database file and the "<inst>XXXXX.log" log files. It lacks the current log file (I'm using circular logging, BTW).
Is there any way to restore such backup?
I don't want to use the JetExternalRestore API because it's too complex: I don't need to restore to another location, I don't understand why there are three input folders rather than two, and I don't know what values to supply in the genLow and genHigh arguments.
I do need external backups: the ESENT database is used by ASP.NET on a remote server, and I'm backing it up over the Internet.
Or, maybe there's a way to retrieve the name of the current log file, and I should just add it to the backup?
Thanks in advance!
P.S. I've got no permissions to spawn processes on my web server, so using eseutil.exe is not an option.
Unpack all the backed-up files into a single folder.
Take the name of your main database file, replace the extension with .pat, and create a zero-length file with that name, e.g. database.pat.
After this simple step, call the JetRestoreInstance API; it will restore the backup from that folder.
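An untested sketch of that sequence, using Python ctypes against the Win32 esent.dll just to show the calls (your real code presumably uses the same API from .NET); the folder and database name are placeholders, and passing NULL as the destination is assumed to restore to the original location:

    import ctypes
    from pathlib import Path

    restore_dir = Path(r"C:\esent-restore")        # all backed-up files unpacked here (placeholder)
    (restore_dir / "database.pat").touch()         # zero-length .pat named after the main database file

    esent = ctypes.WinDLL("esent.dll")
    instance = ctypes.c_void_p()                   # JET_INSTANCE is pointer-sized

    err = esent.JetCreateInstanceA(ctypes.byref(instance), b"restore")
    assert err == 0, f"JetCreateInstanceA failed with {err}"

    # Replays the backed-up logs; NULL destination = restore to the original location.
    err = esent.JetRestoreInstanceA(instance, str(restore_dir).encode(), None, None)
    assert err == 0, f"JetRestoreInstanceA failed with {err}"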