Cloud SQL access from AI Platform job

Google has nice ways to connect to Cloud SQL from other Google services, but I cannot see how to connect from AI Platform jobs. As part of our training job, we need to update our Cloud SQL database with metrics, but the only way I could get it to work was by whitelisting all IPs (don't want that!) on the Cloud SQL instance and connecting via the public IP. I don't see an option to add the Cloud SQL proxy to the trainer instance, and since the IP of the trainer instance is dynamic, we cannot reliably add a specific IP address to the whitelist. Are there any other ways to handle this?

It looks like AI Platform supports VPC peering, so you should be able to connect to Cloud SQL using private IP.
Since Cloud SQL also uses VPC peering, you'll likely need to do the following to get the resources to connect:
Create a VPC to share (or use the "default" VPC)
Follow the steps here to set up VPC peering for AI Platform in your VPC.
Follow the steps here to set up a private IP for your Cloud SQL instance in your VPC.
Since the resources are technically in different networks, you may need to export custom routes (Step #2) to allow AI Platform access to your Cloud SQL instance.
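Once peering is in place, the training job can reach the instance's private IP like any other host. Below is a minimal sketch in Python, assuming a MySQL instance, the PyMySQL driver available in the training container, and placeholder host, credentials, and table names:

import pymysql

# Connect straight to the Cloud SQL instance's private IP over the peered VPC.
# The IP, credentials, database, and table names below are placeholders.
conn = pymysql.connect(
    host="10.0.0.5",       # private IP of the Cloud SQL instance
    user="trainer",
    password="change-me",
    database="metrics_db",
)
with conn.cursor() as cur:
    cur.execute(
        "INSERT INTO training_metrics (job_id, step, loss) VALUES (%s, %s, %s)",
        ("job-123", 1000, 0.042),
    )
conn.commit()
conn.close()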
As an alternative to using private IP, you could keep using public IP with an IP allowlist, coupled with authorizing with SSL/TLS certificates. This still isn't as secure as using the proxy or private IP (since users are technically able to connect to your instance), but they'll be unable to interact with the database engine without the correct certificates.
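If you keep the public IP route, the certificates downloaded from the Cloud SQL instance can be passed to the database driver. A hedged sketch with PyMySQL's SSL options, where the IP, credentials, and certificate paths are placeholders:

import pymysql

# Public-IP connection secured with the instance's server CA plus a client
# certificate/key pair created in the Cloud SQL console. Paths are placeholders.
conn = pymysql.connect(
    host="203.0.113.10",   # public IP of the Cloud SQL instance
    user="trainer",
    password="change-me",
    database="metrics_db",
    ssl={
        "ca": "/secrets/server-ca.pem",
        "cert": "/secrets/client-cert.pem",
        "key": "/secrets/client-key.pem",
    },
)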

Can you publish a Pub/Sub message from within your training job and have it trigger a Cloud Function that connects to the database? AI Platform training seems to have IAM restrictions that I, too, am curious how to control.
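For illustration, publishing the metrics from the training job could look like the sketch below; the project, topic, and payload are made-up names, and it assumes the google-cloud-pubsub client library plus a service account with publish permission on the topic:

import json
from google.cloud import pubsub_v1

# Publish a metrics payload; a Cloud Function subscribed to this topic would
# perform the actual database write. Project and topic names are placeholders.
publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "training-metrics")

payload = json.dumps({"job_id": "job-123", "step": 1000, "loss": 0.042})
future = publisher.publish(topic_path, data=payload.encode("utf-8"))
future.result()  # block until Pub/Sub acknowledges the message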

Related

How to use private IP based backends with google cloud API gateway?

So I am trying to make Google Cloud's API Gateway serve requests from a private-IP-based backend. Currently, the backend is a Kubernetes-based service. However, I couldn't find it explicitly mentioned whether this is possible or not.
Has anyone else encountered such an issue, given that it's a pretty common use case? It seems possible only when the API Gateway infrastructure has a link to the VPC network (route table) or an explicit private connection.
After looking for a while, I think the best way to do what you are asking is to use Private Service Connect. It allows private consumption of services across VPC networks that belong to different groups, teams, projects, or organizations, and also lets you connect to service producers using endpoints with internal IP addresses in your VPC network.
Here is a guide on how to use Private Service Connect to access Google APIs.
Google API Gateway exists only for serverless products and is intended to be used only against serverless backends. It is possible to configure it against public IPs, since they leverage the same x-google-backend configuration key-value pairs in the openapi.yaml for API Gateway, but more niche features, like authorization on behalf of backend services or limiting access to backend services hosted on non-serverless platforms like GKE, are currently not supported. A possible workaround could be to set up Cloud Endpoints directly with your GKE cluster; this documentation could help you: first, second, third.
Best regards.

Download from cloud storage bucket without internet

I have a requirement to download some files stored in a Google Cloud Storage bucket. The challenge is to download them without internet access. Is it possible to interact with a bucket without internet access? Any suggestions?
Thanks,
Prasanth
No, it wouldn't be possible. You need an internet connection to access resources hosted in the Cloud.
You would need to store the files locally or on a physical data storage device in order to access them without a connection.
The only option that avoids the public "internet" is Dedicated Interconnect, where you essentially have a cable from your on-premises network to Google's network.
EDIT:
As I understand from the comment you edited, your actual goal is to connect to your GCS bucket from a private VM instance hosted on GCE.
For that you might want to use VPC Service Controls to define a security perimeter around your services and constrain data within a VPC. One of this product's advantages is that VPC Service Controls provides an additional layer of security by denying access from unauthorized networks, even if the data is exposed by misconfigured Cloud IAM policies.
Here you can find the GCP documentation on configuring VPC Service Controls.
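Once the private VM can reach the Google APIs (for example via Private Google Access), downloading works the same as from any other machine. A minimal sketch with the google-cloud-storage Python client, using placeholder bucket and object names:

from google.cloud import storage

# Uses the VM's attached service account; no public internet egress is needed
# as long as a private path to the storage API is configured.
client = storage.Client()
bucket = client.bucket("my-private-bucket")   # placeholder bucket name
blob = bucket.blob("exports/data.csv")        # placeholder object name
blob.download_to_filename("/tmp/data.csv")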

GKE private cluster and cloud sql proxy connection

I have 2 GKE clusters, one private and one public, and I am using the Cloud SQL proxy as a sidecar container so the GKE app can access the Cloud SQL instance.
public cluster setup for development/testing
Cloud SQL is enabled with both private and public IP.
The GKE app is using the Cloud SQL proxy with the default IP types option (public,private), as shown in the deployment snippet below.
Cloud SQL doesn't have any authorized network.
In this case, my app is able to connect to Cloud SQL and works smoothly. As far as I understand, the connection to Cloud SQL here should be happening over private IP, because there is no authorized network configured.
private cluster setup for production
Cloud SQL is enabled with both private and public IP.
The GKE app is using the Cloud SQL proxy with the default IP types option (public,private).
cloudsql-proxy setting in deployment file
- name: cloudsql-proxy
  image: gcr.io/cloudsql-docker/gce-proxy:1.11
  command: ["/cloud_sql_proxy"]
  args: ["-instances=$(REAL_DB_HOST)=tcp:$(REAL_DB_PORT)",
         "-credential_file=/secrets/cloudsql/credentials.json"]
case 1
Cloud SQL doesn't have any authorized network.
Result: Application is not able to connect to Cloud SQL
case 2
Cloud SQL has the private GKE cluster's NAT gateway IP as an authorized network
Result: Application is not able to connect to Cloud SQL
Maybe removing the Cloud SQL proxy from the application will work (I am yet to test that), but it discourages using the proxy in the dev environment, as it would require changes to the deployment file for the production deployment.
I am not able to understand what is causing the connection failure with the Cloud SQL proxy in the GKE private cluster. Should we not use the proxy in a private cluster?
Update
The reason the Cloud SQL proxy was not able to connect to Cloud SQL was that the Cloud SQL Admin API was disabled. I have updated my answer in the answer section.
It looks like the question here is "Should we use the Cloud SQL proxy in a private cluster?" and that answer is "it depends". It's not required to connect, but it allows for more security because you can restrict unnecessary access to your Cloud SQL server.
The Cloud SQL proxy doesn't provide connectivity for your application - it only provides authentication. It has to be able to connect via the existing path, but then uses the Service Account's IAM roles to authenticate the connection. This also means that it doesn't have to come from a whitelisted network, because it's been authenticated by different means.
If you want to use the proxy to connect via private IP (instead of defaulting to public), use the -ip_address_types=PRIVATE flag - this will tell the proxy to connect with the instance's private IP instead. (Please note that if the proxy lacks a network path (e.g., isn't on the VPC), the proxy will still be unable to connect.)
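For context, the proxy sidecar exposes a local listener, so the application in the same pod connects to 127.0.0.1 rather than to the instance's IP. A hedged Python sketch, assuming the sidecar from the question is listening on TCP port 3306 and using placeholder credentials:

import pymysql

# The cloudsql-proxy sidecar handles authentication and tunnelling; the app
# only ever talks to the proxy's local listener inside the pod.
conn = pymysql.connect(
    host="127.0.0.1",   # the sidecar's listener, not the Cloud SQL IP
    port=3306,          # must match the port in the proxy's -instances flag
    user="app-user",    # database credentials are still required
    password="change-me",
    database="app_db",
)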
@kurtisvg has provided an informative answer to this.
However, the real issue was the Cloud SQL Admin API, and enabling it fixed the problem. After looking into the logs, I found the below entry.
Error 403: Access Not Configured. Cloud SQL Admin API has not been used in project XXXXXX before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/sqladmin.googleapis.com/overview?
The issue for me was enabling Private cluster on the GKE cluster :(
Because of the private GKE cluster, it didn't have access to external IP addresses, and the fix was to create a NAT gateway with Cloud Router as per https://cloud.google.com/nat/docs/gke-example.
Hint: if this is the issue, you won't be able to ping google.com etc. from the container after logging into it.

Accessing Google Cloud SQL from Google Compute Engine using private network

Is it possible to access Google Cloud SQL from Google Compute Engine using the private network?
It appears that Google Cloud SQL sees the public network IP for the Google Compute Engine instance.
And, the web console doesn't allow entering the instance private address.
No, it is not possible to access Google Cloud SQL instances via a private IP address.
This page confirms it; it says "Note: You must use the external (public) IP address of the GCE instance..." when configuring authorized IP addresses for your Cloud SQL instance from your GCE instance.
This is now available via private services access and VPC Network Peering.
The announcement:
https://cloud.google.com/blog/products/databases/introducing-private-networking-connection-for-cloud-sql
Details:
https://cloud.google.com/sql/docs/postgres/private-ip
You can't access Cloud SQL from a private IP address, but you can whitelist a NAT instance's public IP in order to access Cloud SQL from a private server.

Fort rabbit and Google Cloud SQL

I'd like to use fortrabbit with Google Cloud SQL. Google's Cloud SQL requires whitelisting any IPs that want to access the db, and it seems that fortrabbit doesn't guarantee the outbound IP. How can I access my Cloud SQL data from fortrabbit?
[edit] Cloud SQL does not support whitelisting all IPs like 0.0.0.0. Having said that, it's enough if you can provide a subnet that catches all the IPs from which your connections can possibly originate. If you provide a broad IP range for the authorized subnet, please make sure your database is protected with strong usernames and passwords to guard against unauthorized access.