Data transfer between Google Cloud Storage buckets under different service accounts - google-cloud-storage

I have two Google service account credentials and a bucket on each account. I have to transfer files from one bucket to the other. How can I do this programmatically?
Can I achieve this with two Storage client objects, or by using the Cloud Storage Transfer Service?

Yes, with Storage Transfer Service you can create a transfer job and send the data to a destination bucket (in another project). Keep in mind that it is documented that:
To access the data source and the data sink, this service account must
have source permissions and sink permissions.
Meaning that you can't use two different service accounts; you will need to grant access to only one of the two service accounts you have.
If you want to transfer files from one bucket to another programmatically, you must first grant permission to the service account associated with the Storage Transfer Service so it can access the data sink (destination bucket); please follow these steps.
Please note that if you are not creating the transfer job in the same project where the source bucket is located, then you must grant permissions to access it.
With Storage Transfer Service you can create a transfer job programmatically in Java or Python; the examples include creating the transfer job and checking the transfer operation status. Full code examples are available for Java and Python.
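As a rough illustration only, here is a minimal sketch of creating such a transfer job with the google-cloud-storage-transfer Python client; the project ID and bucket names are placeholders, and the Storage Transfer Service's managed service account must already have the source and sink permissions described above.
# Minimal sketch: create and run a Storage Transfer Service job that copies
# objects from a source bucket to a destination bucket.
# "my-project", "source-bucket" and "destination-bucket" are placeholders.
from google.cloud import storage_transfer

client = storage_transfer.StorageTransferServiceClient()

transfer_job = {
    "project_id": "my-project",
    "status": storage_transfer.TransferJob.Status.ENABLED,
    "transfer_spec": {
        "gcs_data_source": {"bucket_name": "source-bucket"},
        "gcs_data_sink": {"bucket_name": "destination-bucket"},
    },
}

job = client.create_transfer_job({"transfer_job": transfer_job})
print(f"Created transfer job: {job.name}")

# Trigger a run immediately instead of waiting for a schedule.
client.run_transfer_job({"job_name": job.name, "project_id": "my-project"})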

Related

How to migrate GCS bucket from one project to another in different account

How to transfer GCS bucket from one account to another account without downloading data
Is Transfer Service for Cloud Data chargeable?
You don't transfer a GCS bucket from one account to another. A GCS bucket belongs to a project.
You can grant a new user access on the project, or only on the bucket, to allow them access. You can also create another bucket, in another project, with another name (project IDs and bucket names are global resources; no two can share the same name anywhere in the world) and use Transfer Service to duplicate the data. The service is free of charge if the data stays in the same region (if not, egress costs will apply).

Access specific folder in GCS bucket according to user, using Workload Identity Federation

I have an external identity provider that supports OpenID Connect (OIDC) and want to access Google Cloud Storage (GCS) directly, using a short-lived access token. So I'm using Workload Identity Federation in order to provide a credential from my external identity provider and get a federated token in exchange.
I have created the workload identity pool and provider and connected a service account to it, which has write access to a certain bucket in GCS.
How can I differentiate access to specific folders in the bucket according to the token provided by my external identity provider? For example, userA should have access only to folderA in the bucket. Can I do this using one service account?
Any help would be highly appreciated.
Folders don't exist in Cloud Storage; it is blob storage, and all objects are stored at the bucket level. For human readability and representation, the / character is treated as the folder separator, by convention.
Therefore, because directories don't exist, you can't grant any permission on them. The finest granularity is the bucket.
In your use case, you can't grant write access at the folder level, but you can create one bucket per user and grant the impersonated service account access on that bucket.
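As a rough sketch of the bucket-per-user approach, the binding could be added with the google-cloud-storage Python client; the bucket name, role, and service account email below are hypothetical placeholders.
from google.cloud import storage

# Grant the impersonated service account object access on the per-user bucket.
# "user-a-bucket" and the service account email are placeholders.
client = storage.Client()
bucket = client.bucket("user-a-bucket")

policy = bucket.get_iam_policy(requested_policy_version=3)
policy.bindings.append(
    {
        "role": "roles/storage.objectAdmin",
        "members": {"serviceAccount:user-a-sa@my-project.iam.gserviceaccount.com"},
    }
)
bucket.set_iam_policy(policy)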

gcsfuse on two separate instances

So I have two separate accounts sharing a Cloud Storage bucket between them. At first I had problems getting the credentials right, but eventually I just added all the email-looking accounts under IAM on the second account to the storage bucket's permissions. I gave those accounts all the roles, as I want to be able to read and write from both accounts' VM instances. At this point I can mount using gcsfuse, but I can't read or write to it. I can see the filesystem, but any time I try to copy from or to it I get an input/output error.

Use only a domain and disable https://storage.googleapis.com URL access

I am a newbie at cloud servers and I've opened a Google Cloud Storage bucket to host image files. I've verified my domain and configured it to view images via my domain. The problem is, the same file is accessible both via my domain, example.com/images/tiny.png, and also via storage.googleapis.com/example.com/images/tiny.png. Is there any solution to disable access via storage.googleapis.com and use only my domain?
Google Cloud Platform Support Version:
NOTE: This is the reply from Google Cloud Platform Support when contacted via email...
I understand that you have set up a domain name for one of your Cloud Storage buckets and you want to make sure only URLs starting with your domain name have access to this bucket.
I am afraid that this is not possible because of how Cloud Storage permissions work.
Making a Cloud Storage bucket publicly readable also gives each of its files a public link. And currently this public link can’t be disabled.
A workaround would be to implement a proxy program and run it on a Compute Engine virtual machine. This VM will need a static external IP so that you can map your domain to it. The proxy program will be in charge of returning the requested file from a predefined Cloud Storage bucket while the bucket itself remains inaccessible to the public (a rough sketch of such a proxy follows the reference list below).
You may find these documents helpful if you are interested in this workaround:
1. Quick start to set up a Linux VM (1).
2. Python API for accessing Cloud Storage files (2).
3. How to download service account keys to grant a program access to a set of services (3).
4. Pricing calculator for getting a picture on how much a VM may cost (4).
(1) https://cloud.google.com/compute/docs/quickstart-linux
(2) https://pypi.org/project/google-cloud-storage/
(3) https://cloud.google.com/iam/docs/creating-managing-service-account-keys
(4) https://cloud.google.com/products/calculator/
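For illustration only, here is a minimal sketch of such a proxy using Flask and the google-cloud-storage Python client; the bucket name, URL path, and port are assumptions, and in practice you would add caching, authentication, and proper error handling.
from flask import Flask, Response, abort
from google.cloud import storage

app = Flask(__name__)
client = storage.Client()
BUCKET_NAME = "example.com"  # placeholder: the private bucket behind the proxy

@app.route("/images/<path:object_name>")
def serve_image(object_name):
    # Fetch the object's metadata; get_blob returns None if it doesn't exist.
    blob = client.bucket(BUCKET_NAME).get_blob(f"images/{object_name}")
    if blob is None:
        abort(404)
    data = blob.download_as_bytes()
    return Response(data, mimetype=blob.content_type or "application/octet-stream")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)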
My Version:
It seems the solution to this question is really simple: just mount the Google Cloud Storage bucket on the VM instance with Cloud Storage FUSE (gcsfuse).
After mounting, private files from GCS can be accessed through the VM's IP address. It makes the Google Cloud Storage bucket act like a directory.
The detailed documentation about how to set up FUSE in Google Cloud is here.
There is, but it requires you to do more work.
Your current solution works because you've made access to the GCS bucket (example.com) public and you're then DNS-aliasing from your domain.
An alternative approach would be for you to limit access to the GCS bucket to one (or possibly several) accounts and then run a web server that uses one of these accounts to access your image files. You could then either permit access to your web server to anyone or also limit access to it.
More work for you (and possibly cost) but more control.

Right way of using Google Storage on a GCE VM

I want to know the right/best way of having one machine copy data to Google Cloud Storage.
I need one machine to be able to write to a bucket, but not be able to create or delete other buckets.
While researching, I found out that you should create a service account so that this account can log in to Google Cloud and then use the storage.
But the problem is, when the machine is a GCE instance, there are scopes. When setting the scope to "Default", it can read from Google Storage, but cannot write to it, even after being authenticated with a service account.
When the scope is devstorage.read_write, the machine can create and remove buckets from that storage without logging in. I find that too risky.
Does anyone have any recommendations?
Thanks
The core problem here is that the "write" scope covers both write and delete, and that the GCE service account is likely a member of project-editors, which can create and delete buckets. It sounds like what you want to do is restrict a service account to only being able to affect a single bucket. You should be able to do this with these steps:
Create a service account in your project (and save the private key file).
In the permissions page for the project, make sure that service account is not a project editor for your project.
Using an account that does have full permissions to your project, create the bucket, then grant the service account write access to the bucket. Example gsutil commands to do this:
gsutil mb gs://yourbucket
gsutil acl ch -u your-service-account-name@gserviceaccount.com:W gs://yourbucket
Create a VM that does not have a GCE service account enabled.
Push the service account's private key file to that VM.
On the VM, gcloud auth activate-service-account --key-file=your-key-file.json
Now gsutil commands run on the VM should be able to write to (and delete) objects in that bucket, but not any other buckets in your project.
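If you prefer to do the same from code rather than gsutil, here is a short sketch, assuming the google-cloud-storage Python client and the key file pushed to the VM in the steps above; the bucket, object, and file names are placeholders.
from google.cloud import storage

# Authenticate with the service account key pushed to the VM.
client = storage.Client.from_service_account_json("your-key-file.json")
bucket = client.bucket("yourbucket")

# Writing objects works, because the service account has write access
# on this specific bucket.
bucket.blob("backups/data.csv").upload_from_filename("data.csv")

# Creating a new bucket should be denied, since the service account is not
# a project editor.
try:
    client.create_bucket("some-other-bucket-name")
except Exception as exc:
    print(f"Bucket creation denied as expected: {exc}")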