Transfer MongoDB dump on external hard drive to google cloud platform - mongodb

As a part of my thesis project, I have been given a MongoDB dump of size 240GB which is on my external hard drive. I'll have to use this data to run my python scripts for a short duration. However, since my dataset is huge and I cannot mongoimport on my local mongodb server (since I don't have enough internal memory), my professor gave me a $100 google cloud platform coupon so I can use the google cloud computing resources.
So far I have researched that I can do it this way:
Create a compute engine in GCP and install mongodb on remote engine. Transfer the MongoDB dump to remote instance and run the scripts to get the output.
This method works well but I'm looking for a method to create a remote database server in GCP so I that I can run my scripts locally, which is something like one of the following.
Creating a remote mongodb server on GCP so that I can establish a remote mongo connection to run my scripts locally.
Transferring the mongodb dump to google's datastore so then I can use the datastore API to remotely connect and run my scripts locally.
I have given a thought of using MongoDB atlas but because of the size of the data, I will be billed hugely and I cannot use my GCP coupon.
Any help or suggestions on how of either of the two methods can be implemented is appreciated.

There is 2 parts to your question
First, you can create a compute engine VM with MongoDB installed and load your backup on it. Then, open the right firewall rules for allowing the connexion from your local environment to the Google Compute Engine VM. The connexion will be performed with a simple login/password.
You can use a static IP on your VM. By the way, in case of reboot on the VM you will keep the same IP (and it will be easier for your local connexion).
Second, BE CAREFUL to datastore. It's a good product, serverless NoSQL database, document oriented, but it's absolutely not the MongoDB equivalent. You can't perform aggregate, you are limited in search capabilities,... It's designed for specific use case (I don't know yours, but don't think that is the MongoDB equivalent!).
Anyway, if you use Datastore, you will have to use a service account or to install Google Cloud SDK on your local environment to be authenticated and to be able to request Datastore API. No login/password in this case.

Related

How to take backup of Tableau Server Repository(PostgreSQL)

we are using 2018.3 version of Tableau Server. The server stats like user login, and other stats are getting logged into PostgreSQL DB. and the same being cleared regularly after 1 week.
Is there any API available in Tableau to connect the DB and take backup of data somewhere like HDFS or any place in Linux server.
Kindly let me know if there are any other way other than API as well.
Thanks.
You can enable access to the underlying PostgreSQL repository database with the tsm command. Here is a link to the documentation for your (older) version of Tableau
https://help.tableau.com/v2018.3/server/en-us/cli_data-access.htm#repository-access-enable
It would be good security practice to limit access to only the machines (whitelisted) that need it, create or use an existing read-only account to access the repository, and ideally to disable access when your admin programs are complete (i.e.. enable access, do your query, disable access)
This way you can have any SQL client code you wish query the repository, create a mirror, create reports, run auditing procedures - whatever you like.
Personally, before writing significant custom code, I’d first see if the info you want is already available another way, in one of the built in admin views, via the REST API, or using the public domain LogShark or TabMon systems or with the Addon (for more recent versions of Tableau) the Server Management Add-on, or possibly the new Data Catalog.
I know at least one server admin who somehow clones the whole Postgres repository database periodically so he can analyze stats offline. Not sure what approach he uses to clone. So you have several options.

How to replicate a postgresql database from local to web server

I am new in the form and also new in postgresql.
Normally I use MySQL for my project but I’ve decided to start migrating towards postgresql for some valid reasons which I found in this database.
Expanding on the problem:
I need to analyze data via some mathematical formulas but in order to do this I need to get the data from the software via the API.
The software, the API and Postgresql v. 11.4 which I installed on a desktop are running on windows. So far I’ve managed to take the data via the API and import it into Postgreql.
My problem is how to transfer this data from
the local Postgresql (on the PC ) to a web Postgresql (installed in a Web server ) which is running Linux.
For example if I take the data every five minutes from software via API and put it in local db postgresql, how can I transfer this data (automatically if possible) to the db in the web server running Linux? I rejected a data dump because importing the whole db every time is not viable.
What I would like is to import only the five-minute data which gradually adds to the previous data.
I also rejected the idea of making a master - slave architecture
because not knowing the total amount of data, on the web server I have almost 2 Tb of hard disk while on the local pc I have only one hard disk that serves only to take the data and then to send it to the web server for the analysis.
Could someone please help by giving some good advice regarding how to achieve this objective?
Thanks to all for any answers.

How can we access Bluemix hosted "Compose for MongoDB" service from "outside"?

Situation:
Have created today a new Compose for MongoDB Service instance in Bluemix
Need:
I have to access this MongoDB DIRECTLY with tools (eg. Mongo Managemant Studio Pro, mongo.exe, etc.) for bulkloading, testing, ad-hoc data fix, etc.
Problem:
I have not found any docs, samples nor a CLEAR statement that
a) gives me some confirmation that THIS is possible
b) gives me COMPLETE information (not just some technical fragments that might have worked year ago) how to do it.
Maybe I am looking to the wrong places or do not know the right people. However I am stuck on this, and before quitting Bluemix MongoDB maybe somebody has a copy/past solution or handson step by step manual.
Any help welcome. Thanks!
Connecting to MongoDB service in Bluemix from an application is possible. For this answer I have used the application "Robo3T" and here are the steps:
Access your MongoDB Service on you Bluemix account. Usually under
"Cloud Foundry Services"
Open section "Manage", from "Connection Settings" copy from "HTTPS" the connection address and port. In this example "sl-eu-lon-2-portal.5.dblayer.com" and "20651"
In Robo3T create a new connection with the connection address from previous step
In tab Authentication configure database name, username and password
. The credentials are found as in step 1
From "Connection Settings" copy the SSL Certificate into a text file and save locally.
In Robo3T Add the certificate to the connection in the "SSL" tab
Test the connection and save the settings
Answer
YES, Bluemix hosted Compose for MongoDB instances can be connected from the mongo Shell and some updated DB Managment tools.
However, you have to make sure, that in case you are running the newest DB versions, that your tools (shell and DB management GUIs) comply with the newest DB features such as encryption etc.
Origin of the Problem
My problem was due to older and therefore incompatible versions of the mongo shell and DB-managment tools running against the newest MongoDB versions with their specialities on encription and multiple servers to be handled in the URI.
At least two DB managment tools are not compatible with the newest DB version and will take their time to get fixed. The problem is, that both will not tell you about this. They just do not not connect. No logs on either side. Period.
So my advise here: look for tool providers who express dedicated compliance with the specific version of your DB.
Advise to the Bluemix Team
It might not take much time to provide some sample connection strings for the most common tools like the mongo shell, MongoBooster, etc. to take the hassle and guesswork out of interpreting the Environment variables and figuring out what is needed for specific connection strings and what is not.
For instance MongoDB Atlas hosting provides for every cluster readymade connection strings for many tools you can just copy/past and done!
Connecting to Atlas took me 5 Minutes. For Bluemix I have lost hours! Not because it is complex, but because the documentation and the generated Info is somehow incomplete and messy - at least for the ones who do not connection strings for their living!

Not able to migrate the data from Parse to local machine

as some of you might aware about the shutting down of parse service in about a year, i am following the migration process as per their tutorials. However, i am not able to migrate these data from parse to local database(i.e. mongodb).
I've started the mongodb instanse locally on 27017, and also created an admin user as part of migration based on these tutorials. Reference-1 & Reference-2.
But when i try to migrate the data from parse developer console, i get No Reachable Servers or Network Error & i don't understand why. I have doubt in the Connection string that i use for this but i am not sure, please find the following image.
I am new to mongodb so don't have much idea about this, your little help would be greatly appreciated.
Since the migration tool runs at parse.com, the tool needs to be able to access your MongoDB instance over the Internet.
Since you're using a local IP (192.168.1.101), parse.com cannot connect to your IP and the transfer will time out.
Either you need to make your MongoDB reachable from the Internet, or you can - as they do in their guide - use an existing MongoDB service.

mongodb - user connection string, secure password

I've been following a tutorial with express, node and mongo.
I have in a config file on the server side:
production:{
db:'mongodb://MYUSERNAME:MYPASSWORD#ds033307.mongolab.com:33307/dbname',
rootPath:rootPath,
port:process.env.PORT||80
}
so, i have my username and password in clear text in a server side javascript file. should i be worried about this? if yes, where else can I put it?
Thanks.
Edit: I went back and had a look at mongolab and heroku (where my site is hosted) docs.
Where I found: "The MongoLab add-on contributes one config variable to your Heroku environment: MONGOLAB_URI", and so I was able to put the MONGOLAB_URI env var into my config and move the password out of the source code.
With regards to the same datacenter, am I right to assume heroku would not be hosting my mongolab database in their datacenter, but would instead be calling out to a cloud service mongo database? Not much I can do then, is there, if I want to stick with mongolab and heroku?
I know this question is old but according to Heroku's docs they currently use 2 datacenters (https://devcenter.heroku.com/articles/regions#data-center-locations).
Their US server is 'amazon-web-services::us-east-1' and their EU alternative is 'amazon-web-services::eu-west-1'.
Both of these data centers are available when launching mongo instances on Mongolab so you can choose for both your app and your db to be on the same datacenter giving much improved security.
I think you should always be concerned about storing passwords in source code files. Generally you would be much better off keeping it in a configuration file that is managed separately. This gives you the flexibility to use the same code with a different configuration file to point to development or qa databases.
Of bigger concern perhaps - are you hosting your application in the same datacenter that MongoLab is hosting your database? If not, that user name and password, along with your data, will traverse the internet in the clear.
MongoLab does not currently support SSL (other than for their RestAPI) so even they recommend being in the same data center:
Do you support SSL?
Not yet but it is on our roadmap to be available in Summer 2014. In
the meantime, we highly recommend that you run your application and
database in the same datacenter. If you have a Dedicated plan, we also
highly recommend that you configure custom firewall rules for your
database(s).
Rest API:
Each MongoLab account comes with a REST API that can be used to access
the databases, collections and documents belonging to that account.
The API exposes most the operations you would find in the MongoDB
driver, but offers them as a RESTful interface over HTTPS.
I would definitely read MongoLab's security page fairly closely:
https://docs.mongodb.com/manual/security/