How to use Service Accounts with gsutil, for downloading from CS - DCM Google private owned bucket - google-cloud-storage

A project, a Google Group have been set up for controlling data access following the DCM guide: https://support.google.com/dcm/partner/answer/3370481?hl=en-GB&ref_topic=6107456
The project does not contain the bucket I want to access(under Storage->Cloud Storage), since it's Google owned bucket, for which I only have read only access. I can see the bucket in my browser since I am allowed to with my Google account(since I am a member of the ACL).
I used the gsutil tool to configure the service account of the project that was linked with the private bucket using
gsutil config -e
but when I try to access that private bucket with
gsutil ls gs://<bucket_name>
I always get 403 errors, and I don't know why is that. Did anyone tried that before or any ideas are welcome.

Since the bucket is private and in project A, service accounts in your project (project B) will not have access. The service account for your project (project B) would need to be added to the ACL for that bucket.
Note that since you can access this bucket with read access as a user, you can run gsutil config to grant your user credentials to gsutil and use that to read the bucket.

Related

ML-Engine unable to access job_dir directory in bucket

I am attempting to submit a job for training in ML-Engine using gcloud but am running into an error with service account permissions that I can't figure out. The model code exists on a Compute Engine instance from which I am running gcloud ml-engine jobs submit as part of a bash script. I have created a service account (ai-platform-developer#....iam.gserviceaccount.com) for gcloud authentication on the VM instance and have created a bucket for the job and model data. The service account has been granted Storage Object Viewer and Storage Object Creator roles for the bucket and the VM and bucket all belong to the same project.
When I try to submit a job per this tutorial, the following are executed:
time_stamp=`date +"%Y%m%d_%H%M"`
job_name='ObjectDetection_'${time_stamp}
gsutil cp object_detection/samples/configs/faster_rcnn_resnet50.config
gs://[bucket-name]/training_configs/faster-rcnn-resnet50.${job_name}.config
gcloud ml-engine jobs submit training ${job_name} \
--project [project-name] \
--runtime-version 1.12 \
--job-dir=gs://[bucket-name]/jobs/${job_name} \
--packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
--module-name object_detection.model_main \
--region us-central1 \
--config object_detection/training-config.yml \
-- \
--model_dir=gs://[bucket-name]/output/${job_name}} \
--pipeline_config_path=gs://[bucket-name]/training_configs/faster-rcnn-resnet50.${job_name}.config
where [bucket-name] and [project-name] are placeholders for the bucket created above and the project it and the VM are contained in.
The config file is successfully uploaded to the bucket, I can confirm it exists in the cloud console. However, the job fails to submit with the following error:
ERROR: (gcloud.ml-engine.jobs.submit.training) User [ai-platform-developer#....iam.gserviceaccount.com] does not have permission to access project [project-name] (or it may not exist): Field: job_dir Error: You don't have the permission to access the provided directory 'gs://[bucket-name]/jobs/ObjectDetection_20190709_2001'
- '#type': type.googleapis.com/google.rpc.BadRequest
fieldViolations:
- description: You don't have the permission to access the provided directory 'gs://[bucket-name]/jobs/ObjectDetection_20190709_2001'
field: job_dir
If I look in the cloud console, the files specified by --packages exist in that location, and I've ensured the service account ai-platform-developer#....iam.gserviceaccount.com has been given Storage Object Viewer and Storage Object Creator roles for the bucket, which has bucket level permissions set. After ensuring the service account is activated and the default, I can also run
gsutil ls gs://[bucket-name]/jobs/ObjectDetection_20190709_2001
which successfully returns the contents of the folder without a permission error. In the project, there exists a managed service account service-[project-number]#cloud-ml.google.com.iam.gserviceaccount.com and I have also granted this account Storage Object Viewer and Storage Object Creator roles on the bucket.
To confirm this VM is able to submit a job, I am able to switch the gcloud user to my personal account and the script runs and submits a job without any error. However, since this exists in a shared VM, I would like to rely on service account authorization instead of my own user account.
I had a similar problem with exactly the same error.
I found that the easiest way to troubleshoot those errors is to go to "Logging" and search for "PERMISSION DENIED" text.
In my case service account was missing permission "storage.buckets.get". Then you would need to find a role that have this permission. You could do that from IAM->Roles. In that view you could filter roles by permission name. It turned out that only following roles have the needed permission:
Storage Admin
Storage Legacy Bucket Owner
Storage Legacy Bucket Reader
Storage Legacy Bucket Writer
I added "Storage Legacy Bucket Writer" role to the service account in the bucket and then was able to submit a job.
Have you tried to look in the Compute Engine scope?
Shutdown instance, Edit and change Cloud API access scopes to:
Allow full access to all Cloud APIs

Right way of using Google Storage on a GCE VM

I want to know the right/best way of having one machine copying data to Google Storage.
I need one machine to be able to write to a bucket, but not be able to create or delete other buckets.
While researching, I found out that you should create a account service so this account can log in to GC and then use the storage.
But the problem is, when the machine is from GCE, there are scopes. When setting up the scope "Default" it can Read from Google Storage, but can not write to it. Even after authenticated with a service account.
When the scope is Devstorage.read_write now the machine can create and remove buckets from that storage without login. I find that to risk.
Does anyone have any recommendations?
Thanks
The core problem here is that the "write" scope covers both write and delete, and that the GCE service account is likely a member of project-editors, which can create and delete buckets. It sounds like what you want to do is restrict a service account to only being able to affect a single bucket. You should be able to do this with these steps:
Create a service account in your project (and save the private key file).
In the permissions page for the project, make sure that service account is not a project editor for your project.
Using an account that does have full permissions to your project, create the bucket, then grant the service account write access to the bucket. Example gsutil commands to do this:
gsutil mb gs://yourbucket
gsutil acl ch -u your-service-account-name#gserviceaccount.com:W gs://yourbucket
Create a VM that does not have a GCE service account enabled.
Push the service account's private key file to that VM.
On the VM, gcloud auth activate-service-account --key-file=your-key-file.json
Now gsutil commands run on the VM should be able to write to (and delete) objects in that bucket, but not any other buckets in your project.

Service Account Authentication fails with gsutil for DCM CS bucket(Google-owned API Console Project)

I've done an extensive research but I can't find a solution.
How can I enable Service Account Authentication for a project that is linked with Google's private owned Bucket for Double Click Manager data? (more info on the current setup of this project here https://support.google.com/dcm/partner/answer/2941575?hl=en&ref_topic=6107456&rd=1).
Separate user authentication works with gsutil(navigating to browser->get token->paste back in your cmd->issue commands) but when it comes to configuring a service account I keep getting
AccessDeniedException: 403 Forbidden
What am I missing? Since the Google documentation says that this specific bucket can't be listed under Cloud Storage for that project, then the project and the service account should be linked to that bucket by default so I can't see the issue here.
During set-up you should have created a Google Group to control access to your bucket. You should add the service account email address to that group, and it will then be able to access the bucket.

gsutil copy returning "AccessDeniedException: 403 Insufficient Permission" from GCE

I am logged in to a GCE instance via SSH. From there I would like to access the Storage with the help of a Service Account:
GCE> gcloud auth list
Credentialed accounts:
- 1234567890-compute#developer.gserviceaccount.com (active)
I first made sure that this Service account is flagged "Can edit" in the permissions of the project I am working in. I also made sure to give him the Write ACL on the bucket I would like him to copy a file:
local> gsutil acl ch -u 1234567890-compute#developer.gserviceaccount.com:W gs://mybucket
But then the following command fails:
GCE> gsutil cp test.txt gs://mybucket/logs
(I also made sure that "logs" is created under "mybucket").
The error message I get is:
Copying file://test.txt [Content-Type=text/plain]...
AccessDeniedException: 403 Insufficient Permission 0 B
What am I missing?
One other thing to look for is to make sure you set up the appropriate scopes when creating the GCE VM. Even if a VM has a service account attached, it must be assigned devstorage scopes in order to access GCS.
For example, if you had created your VM with devstorage.read_only scope, trying to write to a bucket would fail, even if your service account has permission to write to the bucket. You would need devstorage.full_control or devstorage.read_write.
See the section on Preparing an instance to use service accounts for details.
Note: the default compute service account has very limited scopes (including having read-only to GCS). This is done because the default service account has Project Editor IAM permissions. If you use any user service account this is not typically a problem since user created service accounts get all scope access by default.
After adding necessary scopes to the VM, gsutil may still be using cached credentials which don't have the new scopes. Delete ~/.gsutil before trying the gsutil commands again. (Thanks to #mndrix for pointing this out in the comments.)
You have to log in with an account that has the permissions you need for that project:
gcloud auth login
gsutil config -b
Then surf to the URL it provides,
[ CLICK Allow ]
Then copy the verification code and paste to terminal.
Stop VM
goto --> VM instance details.
in "Cloud API access scopes" select "Allow full access to all Cloud APIs" then
Click "save".
restart VM and Delete ~/.gsutil .
I have written an answer to this question since I can not post comments:
This error can also occur if you're running the gsutil command with a sudo prefix in some cases.
After you have created the bucket, go to the permissions tab and add your email and set Storage Admin permission.
Access VM instance via SSH >> run command: gcloud auth login and follow the steps.
Ref: https://groups.google.com/d/msg/gce-discussion/0L6sLRjX8kg/kP47FklzBgAJ
So I tried a bunch of things trying to copy from GCS bucket to my VM.
Hope this post helps someone.
Via SSHed connection:
and following this script:
sudo gsutil cp gs://[BUCKET_NAME]/[OBJECT_NAME] [OBJECT_DESTINATION_IN_LOCAL]
Got this error:
AccessDeniedException: 403 Access Not Configured. Please go to the Google Cloud Platform Console (https://cloud.google.com/console#/project) for your project, select APIs and Auth and enable the Google Cloud Storage JSON API.
What fixed this was following "Activating the API" section mentioned in this link -
https://cloud.google.com/storage/docs/json_api/
Once I activated the API then I authenticated myself in SSHed window via
gcloud auth login
Following authentication procedure I was finally able to download from Google Storage Bucket to my VM.
PS
I did make sure to:
Make sure that gsutils are installed on my VM instance.
Go to my bucket, go to the permissions tab and add desired service accounts and set Storage Admin permission / role.
3.Make sure my VM had proper Cloud API access scopes:
From the docs:
https://cloud.google.com/compute/docs/access/create-enable-service-accounts-for-instances#changeserviceaccountandscopes
You need to first stop the instance -> go to edit page -> go to "Cloud API access scopes" and choose "storage full access or read/write or whatever you need it for"
Changing the service account and access scopes for an instance If you
want to run the VM as a different identity, or you determine that the
instance needs a different set of scopes to call the required APIs,
you can change the service account and the access scopes of an
existing instance. For example, you can change access scopes to grant
access to a new API, or change an instance so that it runs as a
service account that you created, instead of the Compute Engine
Default Service Account.
To change an instance's service account and access scopes, the
instance must be temporarily stopped. To stop your instance, read the
documentation for Stopping an instance. After changing the service
account or access scopes, remember to restart the instance. Use one of
the following methods to the change service account or access scopes
of the stopped instance.
Change the permissions of bucket.
Add a user for "All User" and give "Storage Admin" access.

how to grant read permission on google cloud storage to another service account

our team create some data on google cloud storage so other team can copy/download/read it from there, but when they tried, they always got 403 forbidden message. I tried to edit the permission on that bucket and added new permission as 'Project', 'viewers-(other team's project id)', and 'Reader', but still they got the same error when they ran this command:
gsutil cp -R gs://our-bucket gs://their-bucket
i also tried with their client id and email account, still the same.
I'm not sure one can define another group's collection of users with a give access right (readers, in this case), and apply it to an object in a different project.
An alternative to this would be to control bucket access via Google Groups: simply set up a group for readers, adding the users you wish to grant this right to. Then you can use said Group to control access to the bucket and/or contents. Further information, and use case scenario, here https://cloud.google.com/storage/docs/collaboration#group
try:
gsutil acl ch -u serviceaccount#google.com:R gs://your-bucket
This ch:changes the permission on 'your-bucket' for u:user serviceaccount#google.com to R:Reader.