Is `gcloud storage` billing us for transfers of public data?

We're switching our scripts from gsutil to the reportedly faster gcloud storage. However, we access a significant amount of public data, for example from gs://gcp-public-data--broad-references.
We do NOT want to pay to download this public data. However, it appears that gcloud storage automatically sets the X-Goog-User-Project header for public transfers, while gsutil does not.
Is my understanding of the various documentation correct that gcloud storage is instructing GCS to bill us, and not the public bucket, for transfers?
Run gcloud version
On my machine this outputs Google Cloud SDK 407.0.0 and gsutil 5.15
Run gcloud init
Log in
Select a google project
Run gcloud config list
Verify the project you selected before has been configured
Run gsutil -d ls gs://gcp-public-data--broad-references
Verify that the request Headers: do NOT contain X-Goog-User-Project
Run gcloud --log-http storage ls gs://gcp-public-data--broad-references
Verify that under == headers start == your default project has been included as the X-Goog-User-Project
According to all the documentation I've been able to find one should not set that header by default.
Via https://cloud.google.com/storage/docs/requester-pays:
Important: Buckets that have Requester Pays disabled still accept requests that include a billing project, and charges are applied to the billing project supplied in the request. Consider any billing implications prior to including a billing project in all of your requests.
Via https://cloud.google.com/storage/docs/xml-api/reference-headers#xgooguserproject:
The project specified in the header is billed for charges associated with the request. This header is used, for example, when making requests to buckets that have Requester Pays enabled.
Bonus:
Run gsutil ls gs://gnomad-public-requester-pays
You should receive an error BadRequestException: 400 Bucket is a requester pays bucket but no user project provided.
Run gcloud storage ls gs://gnomad-public-requester-pays
The bucket contents should be listed
The latter above doesn't seem correct to me as I never intentionally told gcloud storage which project to bill for the request.

Heard back from a support member after this was reposted to the Google Cloud Community Forums.
ErnestoC said:
The default behavior of the Cloud CLI gcloud is to use the current project for all quota and billing operations. This is why you automatically see your project ID passed in X-Goog-User-Project. This behavior can be overridden though by adding the global --billing-project flag to any command.
If you set this flag to an empty string, no project is passed in the request. I tested this with gcloud storage and confirmed that requester pays buckets return the expected error message (“400: Bucket is a requester pays bucket but no user project provided.”). Non-requester pays buckets allow operations as well.
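The override described in the reply can be wrapped in a small shell function so that scripts do not forget the flag. This is only a sketch: the function name gcloud_storage_nobill is hypothetical, and it relies on the behavior reported above (an empty --billing-project sends no X-Goog-User-Project header), which is worth re-checking with --log-http on your own SDK version.

```shell
# Hypothetical helper: run any `gcloud storage` subcommand with an empty
# billing project, so that (per the reply above) no X-Goog-User-Project
# header is attached to the request.
gcloud_storage_nobill() {
    gcloud --billing-project="" storage "$@"
}
```

For example, gcloud_storage_nobill ls gs://gcp-public-data--broad-references should list the public bucket, while the same call against a requester pays bucket should fail with the 400 error quoted above.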

Related

Recovering access after initially provisioning wrong scopes for an instance

I recently created a VM, but mistakenly gave the default service account Storage: Read Only permissions instead of the intended Read Write under "Identity & API access", so GCS write operations from the VM are now failing.
I realized my mistake, so following the advice in this answer, I stopped the VM, changed the scope to Read Write and started the VM. However, when I SSH in, I'm still getting 403 errors when trying to create buckets.
$ gsutil mb gs://some-random-bucket
Creating gs://some-random-bucket/...
AccessDeniedException: 403 Insufficient OAuth2 scope to perform this operation.
Acceptable scopes: https://www.googleapis.com/auth/cloud-platform
How can I fix this? I'm using the default service account, and don't have the IAM permissions to be able to create new ones.
$ gcloud auth list
Credentialed Accounts
ACTIVE ACCOUNT
* (projectnum)-compute@developer.gserviceaccount.com
I suggest adding the "cloud-platform" scope to the instance by running the gcloud command below:
gcloud alpha compute instances set-scopes INSTANCE_NAME [--zone=ZONE]
[--scopes=[SCOPE,…]] [--service-account=SERVICE_ACCOUNT]
For the scope, use "https://www.googleapis.com/auth/cloud-platform", since it gives full access to all Google Cloud Platform resources.
See the gcloud documentation for this command.
Try creating the Google Cloud Storage bucket with your user account.
Run gcloud auth login and open the link you are given. Once there, copy the code and paste it into the command line.
Then do gsutil mb gs://bucket-name.
The security model has two things at play: API scopes and IAM permissions. Access is determined by the AND of the two, so you need both an acceptable scope and sufficient IAM privileges to perform any given action.
API Scopes are bound to the credentials. They are represented by a URL like, https://www.googleapis.com/auth/cloud-platform.
IAM permissions are bound to the identity. These are set up in the Cloud Console's IAM & admin > IAM section.
This means you can have 2 VMs with the default service account but both have different levels of access.
For simplicity you generally want to just set the IAM permissions and use the cloud-platform API auth scope.
To check whether you have this set up, go to the VM in the Cloud Console and you'll see something like:
Cloud API access scopes
Allow full access to all Cloud APIs
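The same check can also be made from inside the VM by asking the metadata server which scopes the instance was granted. The sketch below assumes curl is available on the VM; the helper names vm_scopes and has_cloud_platform_scope are hypothetical.

```shell
# Metadata server endpoint that lists the OAuth scopes granted to the VM's
# default service account, one scope URL per line.
SCOPES_URL="http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/scopes"

# Print the granted scopes; this only works from inside a Compute Engine VM.
vm_scopes() {
    curl -s -H "Metadata-Flavor: Google" "$SCOPES_URL"
}

# Hypothetical helper: succeed if the scope list on stdin contains the
# cloud-platform scope as an exact line.
has_cloud_platform_scope() {
    grep -qx "https://www.googleapis.com/auth/cloud-platform"
}
```

Running vm_scopes | has_cloud_platform_scope && echo "scope OK" covers only the scope half of the AND. If it succeeds and requests still return 403, the IAM permissions of the service account are the next thing to inspect.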
When you SSH into the VM, gcloud is logged in by default as the service account on the VM. I'd discourage logging in as yourself; otherwise you more or less break gcloud's configuration for reading the default service account.
Once you have this set up, you should be able to use gsutil properly.

Google Cloud Storage 500 Internal Server Error 'Google::Cloud::Storage::SignedUrlUnavailable'

Trying to get Google Cloud Storage working on my app. I successfully saved an image to a bucket, but when trying to retrieve the image, I receive this error:
GCS Storage (615.3ms) Generated URL for file at key: 9A95rZATRKNpGbMNDbu7RqJx ()
Completed 500 Internal Server Error in 618ms (ActiveRecord: 0.2ms)
Google::Cloud::Storage::SignedUrlUnavailable (Google::Cloud::Storage::SignedUrlUnavailable):
Any idea of what's going on? I can't find an explanation for this error in their documentation.
To provide some explanation here...
Google App Engine (as well as Google Compute Engine, Kubernetes Engine, and Cloud Run) provides "ambient" credentials associated with the VM or instance being run, but only in the form of OAuth tokens. For most API calls, this is sufficient and convenient.
However, there are a small number of exceptions, and Google Cloud Storage is one of them. Recent Storage clients (including the google-cloud-storage gem) may require a full service account key to support certain calls that involve signed URLs. This full key is not provided automatically by App Engine (or other hosting environments). You need to provide one yourself. So as a previous answer indicated, if you're using Cloud Storage, you may not be able to depend on the "ambient" credentials. Instead, you should create a service account, download a service account key, and make it available to your app (for example, via the ActiveStorage configs, or by setting the GOOGLE_APPLICATION_CREDENTIALS environment variable).
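As a rough illustration of the environment-variable route mentioned above, the helper below exports GOOGLE_APPLICATION_CREDENTIALS only if the key file actually exists. The function name use_service_account_key is hypothetical, and how the variable reaches your app depends on the hosting environment.

```shell
# Hypothetical helper: point Google client libraries at a downloaded service
# account key via GOOGLE_APPLICATION_CREDENTIALS, refusing a missing file.
use_service_account_key() {
    if [ ! -f "$1" ]; then
        echo "key file not found: $1" >&2
        return 1
    fi
    export GOOGLE_APPLICATION_CREDENTIALS="$1"
}
```

Calling use_service_account_key path/to/keyfile.json in the shell that starts the app makes the key visible to that process and its children.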
I was able to figure this out. I had been following the Rails guide on Active Storage with Google Cloud Storage, and was unclear on how to generate my credentials file.
google:
  service: GCS
  credentials: <%= Rails.root.join("path/to/keyfile.json") %>
  project: ""
  bucket: ""
Initially, I thought I didn't need a keyfile due to this sentence in Google's Cloud Storage authentication documentation:
If you're running your application on Google App Engine or Google
Compute Engine, the environment already provides a service account's
authentication information, so no further setup is required.
(I am using Google App Engine)
So I commented out the credentials line and started testing. Strangely, I was able to write to Google Cloud Storage without issue. However, when retrieving the image I would receive the 500 server error Google::Cloud::Storage::SignedUrlUnavailable.
I fixed this by generating my private key and adding it to my rails app.
Another possible solution as of google-cloud-storage gem version 1.27 in August 2020 is documented here. My Google::Auth.get_application_default as in the documentation returned an empty object, but using Google::Cloud::Storage::Credentials.default.client instead worked.
If you get a Google::Apis::ClientError: badRequest: Request contains an invalid argument response when signing, check two things. First, the signing URL must have a dash in place of the project name (i.e. projects/-/serviceAccounts; an explicit project name in the path is deprecated and no longer valid). Second, the "issuer" string must be the full email address of the service account, not just the service account name.
If you get Google::Apis::ClientError: forbidden: The caller does not have permission, verify the roles your service account has:
gcloud projects get-iam-policy <project-name> \
  --filter="bindings.members:<sa_name>" \
  --flatten="bindings[].members" --format='table(bindings.role)'
=> ROLE
roles/iam.serviceAccountTokenCreator
roles/storage.admin
serviceAccountTokenCreator is required to call the signBlob service, and you need storage.admin to have ownership of the thing you need to sign. I think these are project-wide rights; unfortunately, I couldn't get it to work with more fine-grained permissions (i.e. one app being admin for a certain Storage bucket).

Service Account Authentication fails with gsutil for DCM CS bucket (Google-owned API Console Project)

I've done extensive research but I can't find a solution.
How can I enable service account authentication for a project that is linked to Google's privately owned bucket for Double Click Manager data? (More info on the current setup of this project here: https://support.google.com/dcm/partner/answer/2941575?hl=en&ref_topic=6107456&rd=1.)
Separate user authentication works with gsutil (navigate to the browser, get a token, paste it back into the command line, issue commands), but when it comes to configuring a service account I keep getting
AccessDeniedException: 403 Forbidden
What am I missing? Since the Google documentation says that this specific bucket can't be listed under Cloud Storage for that project, then the project and the service account should be linked to that bucket by default so I can't see the issue here.
During set-up you should have created a Google Group to control access to your bucket. You should add the service account email address to that group, and it will then be able to access the bucket.

gcloud installed on gce instance with service level accounts permission issues

I launched an instance with service level accounts enabled. For example, it has storage-rw set. I verified that the instance has those scopes. Now whenever I run gsutil ls gs://my_bucket from within the instance, I get the error: Failure: unauthorized_client.
gcloud auth list returns
Credentialed accounts:
- xxxx@developer.gserviceaccount.com (active)
I need to use the gcloud SDK from an instance because I need more components than just gcutil and gsutil.
So my question is: how can I authorize gcloud to use the xxxx@developer.gserviceaccount.com account, and thus only the permissions specified on the instance, rather than my personal user account, which has full permissions to everything?
The gcloud CLI definitely handles Google Compute Engine service accounts. If you see it as "(active)" when you do $ gcloud auth list, that should be sufficient.
Two things can be going wrong here:
You are using the wrong gsutil.
When you install the Google Cloud SDK, it will create google-cloud-sdk/bin/gsutil, and THAT is the one you want to run. Do $ which gsutil to double check. If you're running google-cloud-sdk/platform/gsutil/gsutil, that's the wrong one, and it won't know about anything that gcloud can tell it.
The account doesn't have permissions to access the bucket you're trying to inspect. You'll have to ask the owner of the bucket to add it to the project that owns that bucket.
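The first check can be scripted. The sketch below classifies a gsutil path using only the two locations named in the answer; the function name classify_gsutil and its output labels are hypothetical, and installs made through a system package manager may put gsutil somewhere else entirely.

```shell
# Hypothetical helper: classify a gsutil path using the two locations named
# in the answer above.
classify_gsutil() {
    case "$1" in
        */google-cloud-sdk/bin/gsutil) echo "sdk-wrapper" ;;
        */platform/gsutil/gsutil)      echo "standalone" ;;
        *)                             echo "unknown" ;;
    esac
}
```

Running classify_gsutil "$(command -v gsutil)" prints sdk-wrapper for the copy that knows about gcloud's credentials and standalone for the one that does not.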
Source: Engineer for the Google Cloud SDK
See "Authenticating to Google Compute Engine" section in this doc: https://developers.google.com/compute/docs/gcutil/

The gsutil tool is not working to register a channel in object change notification

When executing the following command:
gsutil notifyconfig watchbucket -i myapp-channel -t myapp-token https://myapp.appspot.com/gcsnotify gs://mybucket
I receive the following response, although I used the same command before on other buckets and it worked:
Watching bucket gs://mybucket/ with application URL https://myapp.appspot.com/gcsnotify...
Failure: <HttpError 401 when requesting https://www.googleapis.com/storage/v1beta2/b/mybucket/o/watch?alt=json returned "Unauthorized WebHook callback channel: https://myapp.appspot.com/gcsnotify">.
I used gsutil config to set permissions and tried with gsutil config -e also.
I already tried setting the permissions and made myself owner of the project, but it is still not working. Any help?
I was getting the same error. You must configure gsutil to use a service account before you can watch a bucket.
An additional security requirement was recently added for Object Change Notification. You must add your endpoint domain as a trusted domain on your cloud project. To do that, the domain first has to be whitelisted with the Google Webmaster Tools.
See instructions here:
https://developers.google.com/storage/docs/object-change-notification#_Authorization
I also determined that I needed to:
Whitelist my appspot domain
Create a service account before I can watch a bucket.
At first I was using the google cloud shell and I figured it should just be authenticated. gsutil ls listed the objects in my bucket so I assumed I was authenticated. However that is not the case.
You need to install gsutil or the Google Cloud SDK, log in, get the .p12 file for the service account, and authenticate with it as Wind Up Toy described. After that it will work.
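The description referred to above is not part of this thread, so the following is only one possible way to do the service account step with the Cloud SDK, not necessarily what was originally described. It assumes gsutil is the copy bundled with the SDK (so that it picks up gcloud's credentials); the function name activate_service_account is hypothetical.

```shell
# Hypothetical helper: authorize the SDK (and its bundled gsutil) as a
# service account. $1 is the service account email, $2 the key file path.
activate_service_account() {
    gcloud auth activate-service-account "$1" --key-file="$2"
}
```

After it succeeds, gcloud auth list should show the service account as the active account, and the watchbucket command can be retried.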