New buckets are not displayed in console manager - google-cloud-storage

Starting last night I've started to experience the following issue:
Existing buckets I created previously are displayed in the console manager and with gsutil, this is the functionality I expect, however...
creating a new bucket via console manager will be displayed, but as soon as the page refreshes or I navigate elsewhere and come back the bucket isn't listed anymore
-- using gsutil to list the project buckets gives me the older ones but not the new one
-- trying to create a new bucket with the same name (via console mngr or gsutil) gives me an error that the bucket already exists
-- using gsutil rm -R successfully removes the bucket, so it was in fact there
Is there an issue or 'feature' of gs clould storage I'm now aware of?

Related

How can I move an app from one space to another?

I have created a new space in the organisation and I have to move several apps from one space to the newly created one. How can I do this? (I don't want to rename the old space since only a subset of its apps need to be moved.
I'm afraid there is no such function to move apps from one space to another. There are two possibilities to achieve it, though:
Compile & Push
If you have access to the source-code of the app and use a common CI/CD setup, simply adjust your scripts, have them point to the new space and trigger a complete CI/CD cycle.
Download & Push
If you don't have access to the source-code, the binary-repository where you store your binaries or have to ensure it is 100% the same droplet in the new space, you can download that droplet from Cloud Foundry and push it to the new space:
cf app source-app --guid to get your source app guid
cf curl /v2/apps/:source-guid/droplet/download --output /tmp/droplet.tgz to download the source app's droplet to your local machine (note that if your Cloud Controller is configured to use a remote blobstore this command will return a redirect to the actual location of the droplet)
cf target -s destination-space to target the space you want to push the copied droplet to
cf push --droplet /tmp/droplet.tgz destination-app-name
Credits: Github
This will work for buildpack-based apps but not for Docker-based apps, as you can't simply download docker-binaries.
Pitfall: If you want to use the same route again, don't forget to delete (not just unbind) it from your old space.

Automating gsutil commands

I'm trying to automate some gsutils commands, but struggling to see where the authentication files are kept and how to re-use (if thats what happens).
I've gone through the gcloud init process in bash...
curl https://sdk.cloud.google.com | bash
gcloud init
All works well when I run
'gsutil ls'
Now I'm trying to automate the process, so this would work on a new server adding into a crontab on it (rather than creating a new config each time).
I saw a mention of setting env variable GOOGLE_APPLICATION_CREDENTIALS, so I copied my credentials from web login to a file and tried it, eg trying as a different user to test
export GOOGLE_APPLICATION_CREDENTIALS=/home/user/.gsutil/mycreds
and then gsutil ls, but fails.
So I assume I've got the whole credentials thing a bit wrong. I'm assuming there is a file somewhere that was originally created by gcloud which I could use, but I can't see it anywhere ?
I've looked at the answer here but doesn't seem up to date now, as per last comment.
Edit: I have followed Zacharys steps, gcloud auth activate-service-account --key-file=myfilelocation
However, with 'gsutil ls' I now get..
You are attempting to perform an operation that requires a project id, with none configured. Please re-run gsutil config and make sure to follow the instructions for finding and entering your default project id.
So my next question would be, where is it looking for the project id ? If I run gsutil config, it seems to create a new set of auth which then creates another error, so have removed that.
You should be able to do this without diving in too deep to the implementation of authentication for gsutil.
If you're using standalone gsutil (if you installed via this method), the instructions in the linked question are still valid (as Travis points out).
If you'd like to continue using the gsutil supplied via the Cloud SDK, you should use service accounts. Service accounts are the preferred method of authenticating on headless machines or in non-interactive contexts.
Your flow would look something like the following:
Create a service account via the Google Cloud Developers Console.
On the remote machine, install the Cloud SDK and gsutil. If you're not installing interactively, it's better to skip the curl ... | bash method. Instead, download this install archive, extract it, and run the install.sh script. This script has options (visible with --help); if you specify choices to all of these options, it won't prompt you.
Copy the service account to the remote machine. Run gcloud auth activate-service-account --key-file=/path/to/service-account.json.
Run gsutil. You should be appropriately authenticated.
You have to set default project and user in gsutil. Run the following command:
gcloud init
Choose 1. It shows you different users; select the user and then select the project.
I was trying to create a bucket with project id as name:
$ gsutil mb -l eu gs://PROJECT-ID
Creating gs://root****/...
Error: You are attempting to perform an operation that requires a project id, with none configured. Please re-run gsutil config and make sure to follow the instructions for finding and entering your default project id.
Steps that resolved for me:
gcloud auth login
gcloud config set project <PROJECT-ID>
gsutil mb -l eu gs://<PROJECT-ID>
Creating gs://root***/...
The error is gone out of the way and it works as expected.

Using Google Cloud Storage with rsync

I am new to Google Cloud. We have historically used AWS for online backups -- essentially, our local servers ran rsync to an EC2 instance at AWS and it all worked fine. I'm now trying to migrate from AWS to Google and of course the setup is pretty different. With gsutil rsync it looked to me as though I wouldn't need to spin up a Compute Engine at all, I could just push stuff straight into gs://aws_mnt bucket
Having installed the SDK on our AWS instance I was able to push all our backups to the gs://aws_mnt bucket very easily using gsutil cp -n
But going forward I want to run a cron job on the local server which uses rsync rather than cp for obvious reasons.
I have two issues:
Despite reading the appropriate documentation (here) I am so stupid I can't figure out how to permanently authorise the local server so I don't have to do gcloud auth login and get a code from a browser each session, as for a cron job that's not really going to work.
When I try to use gsutil rsync from the local server to the gs://aws_mnt bucket that was pre-populated from AWS, I get an error:
gsutil rsync /mnt/archive/backups gs://aws_mnt/kahless
Building synchronization state...
Skipping cloud sub-directory placeholder object gs://aws_mnt/kahless/
Starting synchronization
There is some discussion of this error on github and I've produced detailed output from
gsutil -D -m rsync /mnt/archive/backups gs://aws_mnt/kahless
But since this is a brand-new install of the SDK I can't imagine the thread hasn't already been dealt with so I must be doing something wrong?
Rus
In response to your questions:
Once you have configured credentials using gcloud auth, the 'gcloud auth login' command will cause them to be selected until you login to a different credential... and that state will persist and not require you to go through the browser session again unless/until you revoke those credentials. Note: If you're thinking of running commands from an unattended script (e.g., via cron) please consider using service account credentials. For more details please see https://developers.google.com/cloud/sdk/gcloud/#gcloud.auth
That "skipping..." message is not an error - it's just informing you that gsutil is skipping trying to download the placeholder object, because such objects aren't needed in (and would interfere with) directories in the local file system. I'll update the message in the next version of gsutil to make this more clear. So, what you saw was that the second run of gsutil rsync found nothing to do after comparing the source and destination, and completed normally.

Google Cloud Storage - GSUtil Update fails because file used by another process

We use an ETL process to pull data from Google Cloud Storage, but annoyingly it hangs everytime Google releases udpates to GSUtil, because it sits at a prompt asking if you want to update the library. Fine if you are doing this manually, but not cool when it's being run in an automated SSIS package, as jobs don't finish for days and you keep wasting time with the same stupid cause.
I thought I was going to be cleaver, and add "python gsutil update -n" to the top of the bash script I'm automating the building/execution of in my SSIS Package in the hope to curb this problem, but when I run this command from the prompt in either Windows Server 2008r2 or Windows 7 I get the following:
C:\gsutil>python gsutil update -f -n
Copying gs://pub/gsutil.tar.gz...
OSError: The process cannot access the file because it is being used by another process.
Any help?
P.S. - Also, Google engineers... can you PLEASE remove these prompts? for all of us using these tools in automated processes? I have other things to work on, instead of constantly going back to things like this every few days/weeks.
What version of gsutil are you running?
Also, to be clear: Are you talking about the fact that gsutil checks for available software updates periodically, and if it finds them it then prompts you whether you want to update? Or are you talking about the fact that the gsutil update command asks if you want to perform the update?
If the former, gsutil shouldn't be performing this check/prompting if you are running gsutil from a script not connected to at TTY. If that's not working correctly we'd like to know.
And also, if that's the problem you're having, you can completely disable automated software update checks by setting software_update_check_period=0 in the [GSUtil] section of your .boto config file.

On resume gsutil seems to re-upload files

I'm trying to upload data to Google Cloud Storage from a disk with ~3000 files totalling 1TB. I'm using gsutil cp -R <disk-top-directory> <bucket>. My understanding is that, if gsutil is resumed/restarted, it uses checksums to determine when a file has already been uploaded and skips over it.
It doesn't appear to be doing this: it appears to be resuming the upload from the top and replacing the files all over again. When I run successive gsutil ls -Rl <bucket/disk-top-directory> ten minutes apart and compare them with diff, I see what appears to be the same files with the same sizes but a changed (newer) date. (i.e. consistent with the same file being re-uploaded.)
For example:
< 404104811 2014-04-08T14:13:44Z gs://my-bucket/disk-top-directory/dir1/dir2/dir3/dir4/dir5/file-20.tsv.bz2
---
> 404104811 2014-04-08T14:43:48Z gs://my-bucket/disk-top-directory/dir1/dir2/dir3/dir4/dir5/file-20.tsv.bz2
The machine I'm using to read the disk and transfer files is running Ubuntu 13.10. I installed gsutil using the pip instructions for Debian and Ubuntu.
Am I misunderstanding how gsutil's resumable transfers is supposed to work? If not, any diagnosis and fix to get the correct resume behavior? Thanks in advance!
You need to use the -n (No-clobber) switch to prevent the re-uploading of objects that already exist at the destination.
gsutil cp -Rn <disk-top-directory> <bucket>
From the help (gsutil help cp)
-n No-clobber. When specified, existing files or objects at the
destination will not be overwritten. Any items that are skipped
by this option will be reported as being skipped. This option
will perform an additional HEAD request to check if an item
exists before attempting to upload the data. This will save
retransmitting data, but the additional HTTP requests may make
small object transfers slower and more expensive.
Also according to this, when transfering files over 2MB, gsutil automatically uses a resumable transfer mode.
If you're open to working with the (still beta) gsutil v4, that version of gsutil has an rsync command. You can get this by running:
gsutil update gs://prerelease/gsutil_4.0beta2pre_minus_m_sugg.tar.gz
Please be sure to read the release notes before switching to this major new release, especially if you're using gsutil v3 in scripts.