Is there a way to tag or version Cloud Storage buckets? - google-cloud-storage

I have a shell script that refreshes my emulator's data with the latest data from prod.
Part of the script removes the existing bucket and then re-exports it, to avoid the Path already exists error.
I know that I can manually add versioned buckets like /firestore_data/v1, but that would require me to look up the last version in the console and then update the shell script each time I need to refresh the emulator's data.
Ideally I would like to be able to run gsutil -m cp -r gs://my-app.appspot.com/firestore_data#latest
Is there any way to version storage buckets, or to leave tags that can be used when adding and copying down?
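One way the manual-versioning idea from the question could be scripted, as a rough sketch: write each export under a fixed-width UTC timestamp prefix, then resolve the newest prefix at copy time instead of hard-coding v1, v2, and so on. The base path is the one from the question. The timestamp layout and the function names new_export_prefix and latest_prefix are assumptions made for illustration, not a built-in Cloud Storage feature, and the sketch assumes the folder holds nothing but such timestamped prefixes.

```shell
#!/usr/bin/env bash
# Sketch only: emulate a "latest" tag with sortable timestamp prefixes.
# BASE is the path from the question; the function names are hypothetical.
BASE="gs://my-app.appspot.com/firestore_data"

# Print a new export prefix such as gs://.../firestore_data/20240131T120000Z
new_export_prefix() {
  printf '%s/%s\n' "$BASE" "$(date -u +%Y%m%dT%H%M%SZ)"
}

# Read one prefix per line on stdin and print the lexicographically last one,
# with any trailing slash removed. With fixed-width UTC timestamps, the last
# prefix in sort order is also the newest.
latest_prefix() {
  LC_ALL=C sort | tail -n 1 | sed 's:/*$::'
}

# Example use (needs gcloud and gsutil, so it is left commented out):
#   gcloud firestore export "$(new_export_prefix)"
#   latest="$(gsutil ls "$BASE/" | latest_prefix)"
#   gsutil -m cp -r "$latest" ./firestore_data
```

Because every export gets a fresh prefix, the script no longer has to delete the previous export to avoid the Path already exists error.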

Related

Google Cloud Firestore: How to copy Firestore collection to Cloud Storage

Is writing code the only option for copying Firestore collections to Cloud Storage, or is there some kind of magic feature I can use?
I know about the newly announced feature for importing Firestore collections into BigQuery, from the Firestore talk at the Next conference. Is there something similar for Cloud Storage?
https://cloud.google.com/firestore/docs/manage-data/export-import. Not so sure whether this is a new feature but I am going to try this out.
Yes, finally, Firebase enabled this feature.
Create a Cloud Storage bucket.
Install gcloud if it is not already installed: in a terminal, run curl https://sdk.cloud.google.com | bash
At the prompt Modify profile to update your $PATH and enable bash completion? (Y/n), type y and press Enter.
Next, run source .bash_profile
Afterwards, run gcloud beta firestore export gs://[BUCKET-NAME]
If you want to save the folder locally, run gsutil cp -r gs://[BUCKET-NAME] /path/to/folder
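Put together, the export and download steps might look like the following sketch. BUCKET and DEST are placeholders taken from the answer, while the DRY_RUN switch and the run and export_and_download functions are hypothetical additions that let you print the commands before running them for real.

```shell
#!/usr/bin/env bash
# Sketch of the export-then-download steps. BUCKET and DEST are placeholders.
BUCKET="${BUCKET:-[BUCKET-NAME]}"
DEST="${DEST:-/path/to/folder}"

# Hypothetical helper: print the command when DRY_RUN=1, otherwise run it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

export_and_download() {
  # Export the Firestore data into the bucket; stop if the export fails.
  run gcloud beta firestore export "gs://$BUCKET" || return 1
  # Copy the exported files to a local folder.
  run gsutil cp -r "gs://$BUCKET" "$DEST"
}

# Example: DRY_RUN=1 BUCKET=my-bucket DEST=./backup export_and_download
```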

Google cloud storage does not let me remove bucket with huge data

I have around 200 GB of data in a Google Cloud Coldline bucket. When I try to remove it, it keeps preparing forever.
Is there any way to remove the bucket?
Try the gsutil tool if you have been trying with the Console and it did not work. To do so, you can just open Google Cloud Shell (the leftmost button in the top right corner of the Console) and type a command like:
gsutil -m rm -r gs://[BUCKET_NAME]
It may take a while, but with the -r flag you first delete the contents of the bucket recursively and then delete the bucket itself. The -m flag performs the removals in parallel to speed up the process.

How to load fish configuration from a remote repository?

I have a zillion machines in different places (home network, cloud, ...) and I use fish on each of them. The problem is that I have to synchronize their configuration every time I change something in there.
Is there a way to load the configuration from a remote repository? (= a place where it would be stored, not necessarily git but ideally I would manage them in GitHub). In such a case I would just have a one liner everywhere.
I do not care too much about startup time, loading the config each time would be acceptable
I cannot push the configuration to the machines (via Ansible, for instance) because not all of them are directly reachable from everywhere, but all of them can reach the Internet
There are two parts to your question. Part one is not specific to fish. For systems I use on a regular basis I use Dropbox. I put my ~/.config/fish directory in a Dropbox directory and symlink to it. For machines I use infrequently, such as VMs I use for investigating problems unique to a distro, I use rsync to copy from my main desktop machine. For example,
rsync --verbose --archive --delete -L --exclude 'fishd.*' krader@macpro:.config .
Note the exclusion of the fishd.* pattern. That's part two of your question and is unique to fish. Files in your ~/.config/fish directory named with that pattern are the universal variable storage and are currently unique to each machine. We want to change that; see https://github.com/fish-shell/fish-shell/issues/1912. The problem is that this file contains the color theme variables, so copying your color theme requires exporting those variables on one machine:
set -U | grep fish_color_
Then run set -U on the new machine for each line of output from the preceding command. Obviously, if you have other universal variables you want synced, you should just run set -U and import all of them.
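The export and re-import steps can be sketched as a small filter that turns each line of set -U output into a set -U command for the new machine. It is written in POSIX shell so it can run on either side. The function name to_set_commands and the file name colors.fish are made up for the example, and it assumes each output line has the form name value...

```shell
#!/bin/sh
# Sketch: prefix every "name value..." line with "set -U " so the result
# can be sourced by fish on another machine. to_set_commands is hypothetical.
to_set_commands() {
  # Keep only the color variables, then turn each line into a fish command.
  grep '^fish_color_' | sed 's/^/set -U /'
}

# Example use (requires fish, so it is left commented out):
#   fish -c 'set -U' | to_set_commands > colors.fish   # on the old machine
#   fish -c 'source colors.fish'                       # on the new machine
```

To sync every universal variable rather than just the colors, drop the grep stage.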
Disclaimer: I wouldn't choose this solution myself. Using a cloud storage client as Kurtis Rader suggested or a periodic cron job to pull changes from a git repository (+ symlinks) seems a lot easier and fail-proof.
On those systems where you can't or don't want to sync with your cloud storage, you can download the configuration file specifically, using curl for example. Some precious I/O time can be saved by using HTTP cache control mechanisms. With or without cache control, you will still need to open a connection to a remote server each time (or every X runs, or every Y minutes), and that alone takes noticeable time.
Following is a suggestion for such a fish script, to get you started:
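#!/usr/bin/fish
# Download the shared config only if it changed since the last run (HTTP ETag).
set -l TMP_CONFIG /tmp/shared_config.fish
curl -s -o $TMP_CONFIG -D $TMP_CONFIG.headers \
    -H "If-None-Match: \"$SHARED_CONFIG_ETAG\"" \
    https://raw.githubusercontent.com/woj/dotfiles/master/fish/config.fish
# A 304 Not Modified response has no body, so the file is empty or missing.
if test -s $TMP_CONFIG
    mv $TMP_CONFIG ~/.config/fish/conf.d/shared_config.fish
    # Remember the new ETag; the header name may be lowercase and the line ends in CR.
    set -U SHARED_CONFIG_ETAG (sed -En 's/^[Ee][Tt][Aa][Gg]: *(W\/)?"([^"]+)".*/\2/p' $TMP_CONFIG.headers)
end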
Notes:
Warning: Not tested nearly enough
Assumes fish v2.3 or higher.
sed behavior varies from platform to platform.
Replace woj/dotfiles/master/fish/config.fish with the repository, branch and path that apply to your case.
You can run this from a cron job, but if you insist on updating the configuration file on every init, change the script to place the configuration in a path that's not already automatically loaded by fish, e.g.:
mv $TMP_CONFIG ~/.config/fish/shared_config.fish
and in your config.fish run this whole script file, followed by a
source ~/.config/fish/shared_config.fish

How to run a mongo script from Heroku scheduler?

I have written a JavaScript script for my mongo database. This script is called getMetrics.js and I am able to execute it by running mongo getMetrics.js from my computer.
Now I want to execute that script automatically once a day. To do so, I have created a Heroku app and added the scheduler add-on to it (https://devcenter.heroku.com/articles/scheduler).
My main problem is that, in order to run, my task will execute the command "mongo getMetrics.js", and it will fail because I don't have the mongo command installed in my Heroku app.
How can I run this script from Heroku?
Thanks a lot for your help.
I did the following in a similar case:
Download MongoDB for Linux from https://www.mongodb.com/download-center#community
The bin folder contains the mongo binary.
Make this binary available in your Heroku instance (e.g. if your Heroku app is deployed from your git repo, check in this binary alongside your script).
Make sure the folder where you keep this binary is in the PATH; a safe choice is a path inside /bin.
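As a rough sketch of those steps, the scheduler could call a small wrapper instead of mongo directly. The wrapper below assumes the mongo binary was checked in under a bin folder at the app root, next to getMetrics.js; that layout and the function name run_metrics are assumptions for the example, not something Heroku requires.

```shell
#!/usr/bin/env bash
# Sketch of a wrapper for the scheduler task. Assumes the layout
#   <app root>/bin/mongo  and  <app root>/getMetrics.js
# run_metrics is a hypothetical name.
run_metrics() {
  app_dir="${1:-$PWD}"
  (
    # Put the checked-in binary first on PATH, only inside this subshell.
    export PATH="$app_dir/bin:$PATH"
    mongo "$app_dir/getMetrics.js"
  )
}

# Example: run_metrics          # uses the current directory as the app root
```

If the script does not open its own database connection, the connection string would also need to be passed to mongo.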

make server backup, and keep owner with rsync

I recently configured a small server to test some services. Now, before upgrading or installing new software, I want to make an exact copy of my files, preserving owners, groups, permissions, and symlinks.
I tried rsync to keep the owner and group, but they are lost on the machine that receives the copy.
rsync -azp -H /directorySource/ myUser@192.168.0.30:/home/myUser/myBackupDirectory
My intention is to do this with the / folder, to keep all my configuration just in case. I have 3 services that have their own users and may modify folders outside their home directories.
In the destination folder, the files appear owned by my destination user. Whether I run the copy from the server or from the destination, it doesn't keep the users and groups! I created the same user, tried with sudo, and a friend even tried with a 777 folder :)
cp theoretically does the same job but doesn't work over ssh; I tried it on the server anyway but got many errors. As I remember, tar also keeps permissions and owners, but it produces errors because the server is running, and restoring from it isn't very fast. I also remember the magic dd command, but I made a big partition. rsync looked like the best option for this, and for keeping the backup synchronized. I read that newer versions of rsync work well with owners, but I already have the package upgraded.
Does anybody have an idea how to do this, or what the normal process is for keeping my own server well backed up, so that I can restore just by recreating the partition?
The services are taiga (a project management platform), a git repository, a code reviewer, and so on; all are working well with nginx on Ubuntu Server. I haven't looked at other backup methods because I thought rsync with a cron job would do the work.
Your command would be fine, but you need to connect as the root user on the remote end (only root has permission to set file owners):
rsync -az -H /directorySource/ root@192.168.0.30:/home/myUser/myBackupDirectory
You also need to ensure that you use rsync's -o option to preserve owners, and -g to preserve groups, but as these are implied by -a your command is OK. I removed -p because that's also implied by -a.
You'll also need root access, on the local end, to do the reverse transfer (if you want to restore your files).
If that doesn't work for you (no root access), then you might consider doing this using tar. A proper archive is probably the correct tool for the job, and will contain all the correct user data. Again, root access will be needed to write that back to the file-system.
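As a rough illustration of the tar route, the sketch below archives a directory together with its permission and ownership metadata and restores it elsewhere. The function names make_backup and restore_backup and the example paths are hypothetical. Ownership is only recreated when the restore runs as root.

```shell
#!/usr/bin/env bash
# Sketch: archive a tree with tar, then restore it.
# make_backup and restore_backup are hypothetical names.

# Create a gzip-compressed archive of everything under $1 at path $2.
# tar records owner, group, mode, and symlinks for each entry.
make_backup() {
  tar -C "$1" -czpf "$2" .
}

# Unpack archive $1 into directory $2, preserving permissions (-p).
restore_backup() {
  mkdir -p "$2" && tar -C "$2" -xzpf "$1"
}

# Example use over ssh (commented out; host and paths are placeholders):
#   sudo tar -C /directorySource -czpf - . | ssh myUser@192.168.0.30 'cat > backup.tar.gz'
```

Writing the archive to the remote machine as a single file means the remote user does not need root there; root is only needed later, on whichever machine unpacks it.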