Cloud Storage Transfer job fails with "Object: or" - google-cloud-storage

I'm having an issue transferring files from an S3 bucket to Google Cloud Storage.
The job is configured with an access key ID and secret key for a user that has the AmazonS3FullAccess policy attached. I verified access using the AWS CLI.
The path to the bucket is entered as root-bucket/folder/. No additional transfer options are set on the job.
The job runs for a while in the calculating state, and then fails with the error message
UNKNOWN: (showing 1 of 1 failures) >
Object: or
Is it possible to get a more verbose log of what failed with the job?

As Rybosome mentions, only the bucket name can be entered in the Amazon S3 bucket field. To include sub-folders, expand the more options section and add them under "Transfer files with these prefixes" (see the sketch below).
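The same setup can be expressed programmatically. Here is a minimal sketch, assuming the google-cloud-storage-transfer Python client; the project ID, bucket names and credentials are placeholders, and the include prefix plays the same role as the "Transfer files with these prefixes" field in the console.

# Minimal sketch: S3-to-GCS transfer job with an include prefix.
# Bucket names, project ID and keys below are placeholders.
from google.cloud import storage_transfer

def create_s3_to_gcs_transfer(project_id: str):
    client = storage_transfer.StorageTransferServiceClient()

    transfer_job = {
        "project_id": project_id,
        "status": storage_transfer.TransferJob.Status.ENABLED,
        "transfer_spec": {
            # Only the bucket name goes here -- no folder path.
            "aws_s3_data_source": {
                "bucket_name": "root-bucket",
                "aws_access_key": {
                    "access_key_id": "AKIA...",        # placeholder
                    "secret_access_key": "SECRET...",  # placeholder
                },
            },
            "gcs_data_sink": {"bucket_name": "my-destination-bucket"},
            # Sub-folders are expressed as include prefixes instead.
            "object_conditions": {"include_prefixes": ["folder/"]},
        },
    }

    response = client.create_transfer_job(request={"transfer_job": transfer_job})
    print(f"Created transfer job: {response.name}")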

Related

How to copy blob file to SAS URL in a Synapse pipeline

I have a blob zip file in my storage account, I have a linked service and binary dataset to get the file as the source in a copy activity. There is an outside service I call in a web activity that returns a writable SAS URL to a different storage account in this format.
https://foo.blob.core.windows.net/dmf/43de9fb6-3b96-4f47-b730-eb8de040859dblah.zip?sv=2014-02-14&sr=b&sig=0mgvh25htg45b5u4ty5E%2Bf0ahMwFkHVy3iTC2nh%2FIKw%3D&st=2022-08-13T02%3A19%3A33Z&se=2022-08-13T02%3A54%3A33Z&sp=rw
I tried adding a SAS Azure Blob linked service with a parameter for the URI, then added a binary dataset bound to that linked service (also with a URI parameter), and I pass the SAS URI dynamically all the way down to the linked service. The copy fails each time with "The remote server returned an error: (403)". I must be doing something wrong, but I'm not sure what. I'd appreciate any input, thanks.
I tried to reproduce the same in my environment and got the same error.
To resolve the 403 error, enable access from all networks on the destination storage account, and check whether the Storage Blob Data Contributor role has been assigned. If not, go to the Azure Storage Account -> Access control (IAM) -> + Add role assignment and add Storage Blob Data Contributor.
Now it's working.
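If you want to rule out the SAS URL itself, here is a minimal sketch, assuming the azure-storage-blob Python package; the local file name is a placeholder and the SAS URL is whatever the outside service returned.

# Minimal sketch: upload directly to the writable SAS URL outside the pipeline.
from azure.storage.blob import BlobClient

sas_url = "https://foo.blob.core.windows.net/dmf/<name>.zip?sv=...&sp=rw"  # placeholder

# The SAS URL already carries the credential, so no extra auth is needed.
blob_client = BlobClient.from_blob_url(sas_url)

with open("local-copy.zip", "rb") as data:
    # A 403 here (rather than in the pipeline) points at the SAS token or the
    # storage account's network rules, not at the copy activity itself.
    blob_client.upload_blob(data, overwrite=True)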

error browsing directory under ADLS Gen2 container for Azure Data Factory

I am creating a dataset in Azure Data Factory. This dataset will be a Parquet file within a directory under a certain container in an ADLS Gen2 account. The container name is 'raw', and the directory that I want to place the file into is source/system1/FullLoad. When I click on Browse next to File path, I am able to access the container, but I cannot access the directory. When I click the 'source' folder, I get the error shown below.
How can I drill down to the desired directory? As the error message indicates, I suspect it has something to do with permissions to access the data (the Parquet file doesn't exist yet, as it will be used as a sink in a copy activity that hasn't been run yet), but I don't know how to resolve it.
Thanks for confirming. Posting the resolution here for anyone else who faces this issue.
The user or managed identity your data factory uses should have the Storage Blob Data Contributor role on the storage account. You can check this from the Azure portal: go to your storage account, navigate to the container and then the directory, click Access Control (IAM) on the left panel, and check the role assignments. If it is missing, add a Storage Blob Data Contributor role assignment for your managed identity.
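To verify the permission outside the Data Factory UI, here is a minimal sketch, assuming the azure-identity and azure-storage-file-datalake Python packages; the storage account name is a placeholder, while the container and directory match the question.

# Minimal sketch: list the directory with the same kind of identity-based auth.
from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

account_url = "https://<storage-account>.dfs.core.windows.net"  # placeholder
service = DataLakeServiceClient(account_url, credential=DefaultAzureCredential())

file_system = service.get_file_system_client("raw")

# If the identity lacks Storage Blob Data Contributor/Reader, this listing
# fails with an authorization error -- the same root cause as the browse error.
for path in file_system.get_paths(path="source/system1/FullLoad"):
    print(path.name)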

Unable to transfer GCS bucket from one account to another

I am trying to create a transfer job in Data Transfer, to copy all files in a bucket belonging to one account to an existing bucket belonging to another account.
I have access to both source and destination buckets and get a "green light" in the wizard, but when I try to run the transfer job I get the following error message:
To complete this transfer, you need the 'storage.buckets.setIamPolicy'
permission for the source bucket. Ask the bucket's administrator to
grant you the required permission and try again.
I have tried applying various roles to the user running the transfer job, but I can't figure out how to overcome this problem.
Can anyone help me on this?
The storage.buckets.setIamPolicy permission can be granted with either the roles/storage.legacyBucketOwner or the roles/iam.securityAdmin role. It may be needed to keep the permissions applied to the source objects (see the sketch after the references below).
Permissions for copying an object:
storage.objects.create (for the destination bucket)
storage.objects.delete (for the destination bucket)
storage.objects.get (for the source object)
storage.objects.getIamPolicy (for the source object)
storage.objects.setIamPolicy (for the destination bucket)
Please see:
Cloud IAM > Documentation > Understanding roles
Cloud Storage > Documentation > Reference > Cloud IAM roles
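For reference, here is a minimal sketch of granting that role programmatically, assuming the google-cloud-storage Python client; the bucket name and member email are placeholders.

# Minimal sketch: grant roles/storage.legacyBucketOwner (which includes
# storage.buckets.setIamPolicy) on the source bucket to the transfer user.
from google.cloud import storage

def grant_legacy_bucket_owner(bucket_name: str, member: str):
    client = storage.Client()
    bucket = client.bucket(bucket_name)

    policy = bucket.get_iam_policy(requested_policy_version=3)
    policy.bindings.append(
        {"role": "roles/storage.legacyBucketOwner", "members": {member}}
    )
    bucket.set_iam_policy(policy)

# Example: the account that runs the transfer job from the other project.
grant_legacy_bucket_owner("source-bucket", "user:transfer-admin@example.com")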

Data Transfer between Google Storage different Service Accounts

I have credentials for two Google service accounts and a bucket in each account. I have to transfer files from one bucket to the other. How can I do this programmatically?
Can I achieve this with two Storage client objects, or by using the Cloud Storage Transfer Service?
Yes, with Storage Transfer Service you can create a transfer job and send the data to a destination bucket in another project. Keep in mind that it is documented that:
To access the data source and the data sink, this service account must
have source permissions and sink permissions.
This means you can't use two different service accounts; you will need to grant access to only one of the two service accounts you have.
If you want to transfer files from one bucket to another programmatically, you must first grant the service account associated with the Storage Transfer Service permission to access the data sink (destination bucket); please follow these steps.
Please note that if you are not creating the transfer job in the same project where the source bucket is located, then you must also grant permissions to access it.
With Storage Transfer Service you can create a transfer job programmatically in Java or Python; the examples include creating the transfer job and checking the transfer operation status. Full code examples can be found for Java and Python. A minimal Python sketch follows.
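Here is a minimal sketch, assuming the google-cloud-storage-transfer Python client; the project ID and bucket names are placeholders, and the single Transfer Service account must already have access to both buckets as described above.

# Minimal sketch: bucket-to-bucket transfer job, started immediately.
from google.cloud import storage_transfer

def create_gcs_to_gcs_transfer(project_id: str, source: str, sink: str):
    client = storage_transfer.StorageTransferServiceClient()

    transfer_job = {
        "project_id": project_id,
        "status": storage_transfer.TransferJob.Status.ENABLED,
        "transfer_spec": {
            "gcs_data_source": {"bucket_name": source},
            "gcs_data_sink": {"bucket_name": sink},
        },
    }

    job = client.create_transfer_job(request={"transfer_job": transfer_job})
    # Kick the job off once instead of waiting for a schedule.
    client.run_transfer_job({"job_name": job.name, "project_id": project_id})
    print(f"Created and started transfer job: {job.name}")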

Talend :tS3Put gives access denied error

I am trying to upload a list of files from a folder into an Amazon S3 folder.
I am able to upload files to the folder manually, but when I run the job that does the same thing, the Talend job gives an "Access Denied" error.
I have the required keys for the S3 bucket.
If you are getting the Access Denied error, it means you do not have access to that bucket; check the access policy again.
You can also manually copy the files to S3 by downloading a tool called "CloudBerry Explorer for Amazon S3".
Just download it, provide the access key, and see whether you have access to the bucket or not.
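You can also verify the keys outside Talend. Here is a minimal sketch, assuming the boto3 Python package; the bucket name and key are placeholders, and the credentials should be the same ones configured in the tS3Put component.

# Minimal sketch: test a put with the same credentials the Talend job uses.
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client(
    "s3",
    aws_access_key_id="AKIA...",        # same key ID as in the Talend job
    aws_secret_access_key="SECRET...",  # same secret as in the Talend job
)

try:
    # If this succeeds but the Talend job still fails, the problem is in the
    # component configuration (bucket/key path, region) rather than the keys.
    s3.put_object(Bucket="my-bucket", Key="folder/test.txt", Body=b"hello")
    print("Upload succeeded - the keys have write access.")
except ClientError as err:
    print(f"Upload failed: {err.response['Error']['Code']}")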