We would like to test connecting Cloud SQL (MySQL) to BigQuery using Cloud Data Fusion. What is the proper way to connect to Cloud SQL, given that it does not appear to be built in at this point in time? Which driver is recommended, and are there any instructions available?
Here are instructions for using Cloud SQL MySQL in Data Fusion. Note that Wrangler currently cannot use Cloud SQL instances with Private IP. However, such instances can still be used when running Data Fusion pipelines.
Using Cloud SQL (MySQL) in Wrangler (Public IP only)
Obtain the JDBC Driver JAR file by building it using the instructions at https://github.com/GoogleCloudPlatform/cloud-sql-jdbc-socket-factory
Go to Wrangler
If this is the first time you are configuring CloudSQL for MySQL, click on the Add Connection button from the Wrangler screen and choose Database.
Click “Google Cloud SQL for MySQL.”
Upload the previously built JAR as illustrated, and click the Next button.
Click the Finish button to complete the upload.
Once the driver has been uploaded you will see a green check mark indicating that your driver has been installed.
Click the Google Cloud SQL for MySQL to create a new connection. Once the connection modal opens click on the Advanced link if present.
Enter the connection string as:
jdbc:mysql://google/<database>?cloudSqlInstance=<instance-name>&socketFactory=com.google.cloud.sql.mysql.SocketFactory&useSSL=false
where <database> represents the database you created in the prerequisites section, and <instance-name> refers to your instance connection name as displayed in the Overview tab of the instance details page.
Example:
jdbc:mysql://google/mysql?cloudSqlInstance=cloud-data-fusion-demos:us-west1:mysql&socketFactory=com.google.cloud.sql.mysql.SocketFactory&useSSL=false
Enter the username and the password you configured for this CloudSQL instance
Click Test Connection to verify that the connection can successfully be established with the database.
Click Add Connection to complete the task.
Once you’ve completed all the steps you will be able to click on the newly defined database connection and see the list of tables for that database.
Using Cloud SQL (MySQL) in Pipelines (Public and Private IP)
Perform steps 1-6 in the Wrangler section above
Open the pipeline Studio
From the plugin palette on the left, drop the Cloud SQL source plugin to the canvas, and open it by clicking Properties.
Specify the plugin name as cloudsql-mysql (Presumes that you have perform.
Specify the connection string as below:
jdbc:mysql://google/<database>?cloudSqlInstance=<instance-name>&socketFactory=com.google.cloud.sql.mysql.SocketFactory&useSSL=false
where <database> represents the database you created in the prerequisites section, and <instance-name> refers to your instance connection name as displayed in the Overview tab of the instance details page, e.g.:
jdbc:mysql://google/mysql?cloudSqlInstance=cloud-data-fusion-demos:us-west1:mysql&socketFactory=com.google.cloud.sql.mysql.SocketFactory&useSSL=false
Enter the query whose results you would like to import as the Import Query.
Enter the username and password to use for the database. You can also use a secure macro for the password.
Click Get Schema to populate the schema of the plugin.
Configure the rest of the pipeline, and deploy.
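The connection string used in both sections above follows a fixed pattern, so it can be assembled programmatically before pasting it into the connection dialog or the plugin properties. Below is a minimal sketch; the function name build_cloudsql_mysql_url and the sanity check on the instance connection name are illustrative assumptions, not part of Data Fusion or the JDBC socket factory.

```python
SOCKET_FACTORY = "com.google.cloud.sql.mysql.SocketFactory"


def build_cloudsql_mysql_url(database: str, instance_connection_name: str) -> str:
    """Return a JDBC URL in the form shown in the steps above.

    instance_connection_name is the value shown on the instance details
    page, e.g. "cloud-data-fusion-demos:us-west1:mysql".
    (Hypothetical helper; the check below is only a rough sanity test.)
    """
    # A connection name contains at least project, region and instance,
    # separated by colons.
    if instance_connection_name.count(":") < 2:
        raise ValueError("expected <project>:<region>:<instance>")
    return (
        f"jdbc:mysql://google/{database}"
        f"?cloudSqlInstance={instance_connection_name}"
        f"&socketFactory={SOCKET_FACTORY}"
        "&useSSL=false"
    )


print(build_cloudsql_mysql_url("mysql", "cloud-data-fusion-demos:us-west1:mysql"))
```

Called with the example values, this prints the same string as the example connection string shown above.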
Reading through the documentation of Tableau Server I was not able to determine if the following works:
I have set up Tableau Server 2020.4.0 along with the PostgreSQL driver
I added a connection to an internal, i.e. non-public, PostgreSQL DB via Tableau Server
I can access the PostgreSQL via logging in to Tableau Server just fine
I am also able to connect to the Tableau Server through Tableau Desktop BUT I cannot connect to the PostgreSQL as it is not directly accessible from the client machine running Tableau Desktop.
Is there a way to access this non-public PostgreSQL database connected to Tableau Server from Tableau Desktop through Tableau Server?
If the server is accessible via SSH then you can set up a port forwarding tunnel.
ssh -L 127.0.0.1:5432:postgres.example.com:5432 tableau.example.com
Then in the datasource within Tableau Desktop change the host to 127.0.0.1 from postgres.example.com. If there are SSL errors you may want to add an entry to your /etc/hosts file and not change the hostname.
echo '127.0.0.1 postgres.example.com' | sudo tee -a /etc/hosts
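Before pointing Tableau Desktop at the tunnel, it can help to confirm that something is actually listening on the local end. The following is a rough sketch using only the Python standard library; the helper name port_is_open is made up for illustration, and 127.0.0.1:5432 is simply the local address from the ssh command above. A successful connection only shows that the local listener exists, not that PostgreSQL is reachable behind it.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Local end of the tunnel opened by the ssh -L command above.
    print("tunnel reachable:", port_is_open("127.0.0.1", 5432))
```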
Answering my own question and following #matt_black's comment it is indeed possible to access and use published Datasources from Tableau Desktop which are not directly accessible.
For that you need to log in to the Tableau Server UI (not TSM via 8850) and create a Workbook. Click on "Datasource" (bottom left-hand corner), add one or more connections, and then head back to any "Sheet" tab (also in the bottom left-hand corner).
At this point it is recommended to save the Workbook as "Template", i.e. "my_published_datasoure_template" – explanation follows.
After saving the Workbook you need to hover over the Datasource-Icon in the "Data" Tab and click on the appearing dropdown-arrow to publish the Datasource.
Note that once a Datasource has been published this way, Tableau asks you to update the workbook right afterwards. You must decline in order to be able to edit the workbook's Datasources later.
If you need to edit the Datasource at a later point be sure to delete the previously published Datasource then edit and re-publish it.
I have an AWS RDS (PostgreSQL) instance that is inside a private network and is only accessible via a VPN and a bastion host.
I am able to establish a connection from PBI Desktop to the PostgreSQL RDS instance by creating an SSH tunnel from my laptop (localhost) to the bastion host and using the ODBC driver. With this approach all the data is imported into PBI Desktop (import mode).
But our requirement is to establish the connection through direct query, so that data is refreshed in real time and reports are generated dynamically, and I have not been able to do that.
I entered the database credentials into Power BI Desktop, but it does not work correctly; I get a timeout error.
I must use direct query, I can't use import.
Any help is appreciated.
An exact error that you are getting would help get to the root cause of the issue. However, a few basic troubleshooting steps that I'd suggest are:
Ensure that you have a compatible version of the software installed on your machine, such as Npgsql 4.0.9. At times the latest version of the software causes issues.
Ensure that you remove the semicolon at the end of the query.
Once you get the query running successfully in the desktop version and publish it to the web version, the visuals will not be able to connect to the database unless an on-premises data gateway is set up. More details on setting up a data gateway to automatically refresh the dataset for the Power BI web version are here:
Refresh AWS RDS database from Power BI Web you are successfully able to query directly
I'm using Power BI version 2.84 to connect to a PostgreSQL server. In PBI Desktop everything works fine: I can connect to the server, import, and refresh data smoothly.
However, when I publish it to the PBI server, I can't refresh it anymore due to an 'encrypted connection' error. I have checked all of my connection settings and made sure they are not encrypted at all, but the problem is still there.
Please let me know if you have any solution for this.
Cheers
I assume you are using direct query?
If you want to use direct query you will need to set up On-Premises data gateway.
on premise gateways
Then you should add the gateway cluster in the Power BI web version:
Data gateway
I think everything is quite straightforward here.
But do you need direct query? If you are OK with refreshing your data a few times a day, you could set up an ODBC connection (when importing data, choose the ODBC option rather than PostgreSQL).
You would need to set up the ODBC driver (Control Panel -> Administrative Tools -> Data Sources) and create a new data source (download the PostgreSQL ODBC driver if you do not have one).
Then you also need to create On-Premises data gateway and set up refresh intervals.
When I am trying to process dimensions after creating a data source view I get the error:
The project could not be deployed to the server because of the following connectivity problems : A connection cannot be made. Ensure that the server is running. To verify or update the name of the target server, right-click on the project in Solution Explorer, select Project Properties,click on the Deployment tab, and then enter the name of the server.
I have checked in task manager & SQLBrowser is running. Why am I getting this error?
I was able to get the SQL Server instances on my computer this way:
Start Menu
Microsoft SQL Server 2008 or your version
Server Installation Center
Admin login
Select "Tools" from left menu
Select "Installed SQL Server features discovery report"
You then get a nice HTML web page. You want to look for Database Services. You should see an instance name. Mine is called "SQLEXPRESS." So the combination of server name and instance would be MYCOMPUTER\SQLEXPRESS given that my computer name is MYCOMPUTER.
BTW, the default instance name is MSSQLSERVER.
Alternatively, you can get it from your registry. Just run regedit and look for this key: Computer\HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\Instance Names\SQL
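To make the naming rule above concrete, here is a small sketch of how the server and instance names combine. The function name sql_server_target is hypothetical; the only facts it encodes are that a named instance is addressed as COMPUTER\INSTANCE and that the default instance (MSSQLSERVER) is addressed by the computer name alone.

```python
def sql_server_target(computer_name: str, instance_name: str = "MSSQLSERVER") -> str:
    """Return the name to enter as the target server.

    The default instance (MSSQLSERVER) is addressed by the computer name
    alone; a named instance is addressed as COMPUTER\\INSTANCE.
    """
    if instance_name.upper() == "MSSQLSERVER":
        return computer_name
    return f"{computer_name}\\{instance_name}"


print(sql_server_target("MYCOMPUTER", "SQLEXPRESS"))  # MYCOMPUTER\SQLEXPRESS
```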
Check that the SQL Server service itself (or MSSQLSERVER) is running. Also check the connection string.
If you want to deploy the project to a named instance of Analysis Services on the local computer, or to an instance on a remote server, change the Server property to the appropriate instance name, such as <InstanceName>.
https://learn.microsoft.com/en-us/analysis-services/multidimensional-tutorial/lesson-2-5-deploying-an-analysis-services-project?view=asallproducts-allversions
You need the server name and the instance name.
Opening ports 2382 and 2383 to the server solved this issue for me.
Both of my servers use SQL Server 2008 R2.
I have my local SQL server and also an Amazon machine running an instance of SQL-Server there.
I'm able to connect from my local machine to that Amazon SQL using the standard 10.10.10.10, 1433 connection from my local Management Studio.
What I need to do now is run a query that tells me which records I have locally that are not on the Amazon server right now.
Something like:
SELECT *
FROM [LOCAL].dbo.Table1
WHERE Field1 NOT IN
(SELECT Field1 FROM [AMAZON].Database1.dbo.Table1)
================================
Question:
I don't know how to write the "AMAZON" location on the Query window itself, since it's running on a different server.
Any help is truly appreciated !!!
You have to configure the Amazon server as a linked server on your local machine. If you name it "AMAZON", your query will work exactly as you wrote it.
In SSMS, \Server Objects\Linked Servers. Right click, 'new linked server'. Name your server, and choose 'SQL server' radio button. Because I was authorized user on both machines with windows credentials, I selected 'Be made using the login's current security context' radio button under the security tab, and did not even have to fool with the local/remote user mappings.
In order to run queries across multiple servers, a link (linked server) must be established between the two servers. To create a linked server:
Navigate to the Linked Server Sub-folder under the Server Object folders
Right Click on the Linked Server Folder
Click on New Linked Server
Supply the Connection Strings for the Server
Name your Linked Server.
You can now use the full object qualification (LinkedServer.Database.tableOwner.Table) to access the objects.
Good Luck !
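The shape of such a cross-server query can be tried out without a second SQL Server. The sketch below is only an analogy: it uses Python's built-in sqlite3 module, where ATTACH DATABASE plays the role of the linked server and the schema prefix "amazon" stands in for the [AMAZON].Database1.dbo qualification. The table contents are invented for illustration. NOT EXISTS is used in place of NOT IN because NOT IN returns no rows if the subquery yields a NULL.

```python
import sqlite3

# In-memory stand-ins: "main" plays the local server, "amazon" the linked one.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS amazon")

conn.execute("CREATE TABLE main.Table1 (Field1 INTEGER)")
conn.execute("CREATE TABLE amazon.Table1 (Field1 INTEGER)")
conn.executemany("INSERT INTO main.Table1 VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO amazon.Table1 VALUES (?)", [(2,), (3,)])

# Rows that exist locally but not on the "remote" side.
missing_remotely = conn.execute(
    """
    SELECT Field1
    FROM main.Table1 AS l
    WHERE NOT EXISTS (
        SELECT 1 FROM amazon.Table1 AS r WHERE r.Field1 = l.Field1
    )
    ORDER BY Field1
    """
).fetchall()

print(missing_remotely)  # [(1,)]
```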
You should open your Registered Servers window and create a group for your servers. Then right-click the group name and select New Query (or select several servers in that group). If you execute the query, it will run against the selected servers.