Automating gsutil commands - google-cloud-storage

I'm trying to automate some gsutil commands, but I'm struggling to see where the authentication files are kept and how to re-use them (if that's what happens).
I've gone through the gcloud init process in bash...
curl https://sdk.cloud.google.com | bash
gcloud init
All works well when I run:
gsutil ls
Now I'm trying to automate the process, so this would work on a new server adding into a crontab on it (rather than creating a new config each time).
I saw a mention of setting the env variable GOOGLE_APPLICATION_CREDENTIALS, so I copied my credentials from the web login to a file and tried it, e.g. trying as a different user to test:
export GOOGLE_APPLICATION_CREDENTIALS=/home/user/.gsutil/mycreds
and then ran gsutil ls, but it fails.
So I assume I've got the whole credentials thing a bit wrong. I'm assuming there is a file somewhere that was originally created by gcloud which I could use, but I can't see it anywhere?
I've looked at the answer here, but it doesn't seem up to date now, as per the last comment.
Edit: I have followed Zachary's steps:
gcloud auth activate-service-account --key-file=myfilelocation
However, with 'gsutil ls' I now get:
You are attempting to perform an operation that requires a project id, with none configured. Please re-run gsutil config and make sure to follow the instructions for finding and entering your default project id.
So my next question would be: where is it looking for the project id? If I run gsutil config, it seems to create a new set of auth credentials, which then causes another error, so I have removed that.

You should be able to do this without diving too deep into the implementation of authentication for gsutil.
If you're using standalone gsutil (if you installed via this method), the instructions in the linked question are still valid (as Travis points out).
If you'd like to continue using the gsutil supplied via the Cloud SDK, you should use service accounts. Service accounts are the preferred method of authenticating on headless machines or in non-interactive contexts.
Your flow would look something like the following (a consolidated sketch follows these steps):
Create a service account via the Google Cloud Developers Console.
On the remote machine, install the Cloud SDK and gsutil. If you're not installing interactively, it's better to skip the curl ... | bash method. Instead, download this install archive, extract it, and run the install.sh script. This script has options (visible with --help); if you specify choices for all of these options, it won't prompt you.
Copy the service account key file to the remote machine. Run gcloud auth activate-service-account --key-file=/path/to/service-account.json.
Run gsutil. You should be appropriately authenticated.
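Put together, the whole flow on a fresh machine might look like the sketch below. The archive URL, the install.sh flag names, the key-file path, and the project id are assumptions on my part; verify the exact options with ./google-cloud-sdk/install.sh --help.
# download and extract the Cloud SDK install archive
curl -sSL -o google-cloud-sdk.tar.gz https://dl.google.com/dl/cloudsdk/channels/rapid/google-cloud-sdk.tar.gz
tar -xzf google-cloud-sdk.tar.gz
# answer every installer option up front so it never prompts
./google-cloud-sdk/install.sh --usage-reporting false --path-update true --command-completion false
# authenticate with the service account key and set a default project
./google-cloud-sdk/bin/gcloud auth activate-service-account --key-file=/path/to/service-account.json
./google-cloud-sdk/bin/gcloud config set project your-project-id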

You have to set the default project and user in gsutil. Run the following command:
gcloud init
Choose option 1. It shows you the different users; select the user, and then select the project.
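If you'd rather avoid the interactive prompts, the same defaults can, as far as I know, be set directly; the account and project values here are placeholders:
gcloud config set account user@example.com
gcloud config set project my-project-id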

I was trying to create a bucket with the project id as its name:
$ gsutil mb -l eu gs://PROJECT-ID
Creating gs://root****/...
Error: You are attempting to perform an operation that requires a project id, with none configured. Please re-run gsutil config and make sure to follow the instructions for finding and entering your default project id.
Steps that resolved it for me:
gcloud auth login
gcloud config set project <PROJECT-ID>
gsutil mb -l eu gs://<PROJECT-ID>
Creating gs://root***/...
The error has gone away and it works as expected.
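If you're on standalone gsutil rather than the Cloud SDK, the equivalent (as far as I know) is to set the default project in the [GSUtil] section of your ~/.boto config file; the project id below is a placeholder:
[GSUtil]
default_project_id = your-project-id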

Related

Tab completion for non-file-system paths or URIs (e.g. gs://...)

I use cloud storage (e.g. GCS, S3) in addition to the local file system for data analysis.
My question is: are there tools that enable tab completion (in a shell environment) for file paths or URIs that aren't on local, mounted file systems? E.g. for a file URI like gs://path/to/file.txt, I'd like to have tab completion when typing part of the path.
Note: I don't want to use FUSE (or mount a file system or volume in some way). I'm wondering if there are bash or zsh extensions that enable this functionality for non-file-system URIs, presumably with API calls in the background or something.
The gcloud interactive shell has auto-completion and auto-prompting for any command that has a manual page, including the gcloud, bq, gsutil, and kubectl command-line tools.
To enable the gcloud interactive shell:
Install the gcloud beta components first by running the command gcloud components install beta.
Run gcloud beta interactive to enter gcloud interactive mode.
You can now use the Tab key to complete a file path or resource argument.
Check the official documentation of gcloud auto-complete for more details.
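For example (the bucket and path below are made up):
gcloud components install beta
gcloud beta interactive
# inside the interactive shell, type a partial object path and press Tab:
gsutil ls gs://my-bucket/da<Tab>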

How can I avoid typing in the profile for aws each time?

I want to avoid typing --profile dev-platform every time I want to access the AWS cli commands.
How do I add a postfix or suffix like --profile dev-platform every time I want to run an AWS command?
I just want to type aws s3 ls without manually putting in the profile each time, like the above two instances.
aws *wild card* --profile dev-platform
I tried some things like
alias aws= aws ** | --profile dev-platform
but to no avail.
Assuming you have multiple profiles in your ~/.aws/config and want to set dev-platform as the profile to use for your terminal session, you can use:
export AWS_DEFAULT_PROFILE=dev-platform
This will set the default AWS profile to the selected profile for your terminal session.
I switch between multiple profiles regularly and create aliases for my different profiles, so I can use a short command like "use-dev" to switch to my dev profile, or "use-prod" to switch to my production profile, quickly and easily without having to type the full command.
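A minimal sketch of such aliases for ~/.bashrc (the profile names are assumptions; newer CLI versions also honor AWS_PROFILE):
alias use-dev='export AWS_DEFAULT_PROFILE=dev-platform'
alias use-prod='export AWS_DEFAULT_PROFILE=prod'
After running use-dev, a plain aws s3 ls will use the dev-platform profile.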

Kubernetes error code 403: user anonymous cannot get path

I've been trying to install Kubernetes for the first time. After the initial setup, I was finally able to execute the kube-up.bash script:
kubernetes/cluster/kube-up.bash
Everything goes well and I even see the list of cluster services installed.
Then the book I am checking says:
"Go to: https//your_master_ip/ui/"
But when I try, all I see is the 403 error from the title: user anonymous cannot get path.
I assumed I did not perform a proper setup during the auth process, so I did:
gcloud auth list
But my active account is there with my Google email, so I am not sure what I am doing wrong.
I am able to access the Google Cloud Platform and see the project I created for this; I can see the traffic as well.
Also, the kubectl commands are not working on my system; bash throws an error saying the command could not be found.
Can someone please assist me?
Regards

Using Google Cloud Storage with rsync

I am new to Google Cloud. We have historically used AWS for online backups -- essentially, our local servers ran rsync to an EC2 instance at AWS and it all worked fine. I'm now trying to migrate from AWS to Google, and of course the setup is pretty different. With gsutil rsync it looked to me as though I wouldn't need to spin up a Compute Engine instance at all; I could just push stuff straight into the gs://aws_mnt bucket.
Having installed the SDK on our AWS instance, I was able to push all our backups to the gs://aws_mnt bucket very easily using gsutil cp -n.
But going forward I want to run a cron job on the local server which uses rsync rather than cp for obvious reasons.
I have two issues:
Despite reading the appropriate documentation (here) I can't figure out how to permanently authorise the local server, so that I don't have to do gcloud auth login and get a code from a browser each session; for a cron job that's not really going to work.
When I try to use gsutil rsync from the local server to the gs://aws_mnt bucket that was pre-populated from AWS, I get an error:
gsutil rsync /mnt/archive/backups gs://aws_mnt/kahless
Building synchronization state...
Skipping cloud sub-directory placeholder object gs://aws_mnt/kahless/
Starting synchronization
There is some discussion of this error on GitHub, and I've produced detailed output from:
gsutil -D -m rsync /mnt/archive/backups gs://aws_mnt/kahless
But since this is a brand-new install of the SDK, I can't imagine the issue in that thread hasn't already been dealt with, so I must be doing something wrong?
Rus
In response to your questions:
Once you have configured credentials using gcloud auth login, they remain selected until you log in with a different credential, and that state persists; it will not require you to go through the browser session again unless/until you revoke those credentials. Note: if you're thinking of running commands from an unattended script (e.g., via cron), please consider using service account credentials. For more details please see https://developers.google.com/cloud/sdk/gcloud/#gcloud.auth
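A minimal sketch of that unattended setup (the key-file path and schedule are assumptions; the rsync command mirrors the one from the question):
# one-time: activate the service account credentials
gcloud auth activate-service-account --key-file=/path/to/service-account.json
# crontab entry: sync nightly at 02:00
0 2 * * * /path/to/google-cloud-sdk/bin/gsutil rsync /mnt/archive/backups gs://aws_mnt/kahless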
That "skipping..." message is not an error - it's just informing you that gsutil is skipping trying to download the placeholder object, because such objects aren't needed in (and would interfere with) directories in the local file system. I'll update the message in the next version of gsutil to make this more clear. So, what you saw was that the second run of gsutil rsync found nothing to do after comparing the source and destination, and completed normally.

Google Cloud Storage - GSUtil Update fails because file used by another process

We use an ETL process to pull data from Google Cloud Storage, but annoyingly it hangs every time Google releases updates to gsutil, because it sits at a prompt asking if you want to update the library. Fine if you are doing this manually, but not cool when it's being run in an automated SSIS package, as jobs don't finish for days and you keep wasting time on the same stupid cause.
I thought I was going to be clever and added "python gsutil update -n" to the top of the bash script whose build and execution I'm automating in my SSIS package, in the hope of curbing this problem. But when I run this command from the prompt in either Windows Server 2008 R2 or Windows 7, I get the following:
C:\gsutil>python gsutil update -f -n
Copying gs://pub/gsutil.tar.gz...
OSError: The process cannot access the file because it is being used by another process.
Any help?
P.S. - Also, Google engineers... can you PLEASE remove these prompts for all of us using these tools in automated processes? I have other things to work on, instead of constantly coming back to things like this every few days/weeks.
What version of gsutil are you running?
Also, to be clear: are you talking about the fact that gsutil periodically checks for available software updates and, if it finds them, prompts you whether you want to update? Or are you talking about the fact that the gsutil update command asks if you want to perform the update?
If the former, gsutil shouldn't be performing this check/prompting if you are running gsutil from a script not connected to a TTY. If that's not working correctly, we'd like to know.
And also, if that's the problem you're having, you can completely disable automated software update checks by setting software_update_check_period=0 in the [GSUtil] section of your .boto config file.
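For reference, the relevant snippet of the .boto file would look like this:
[GSUtil]
software_update_check_period = 0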