gsutil rsync not preserving uid/gid ownership - google-cloud-storage

When using gsutil -m rsync -p -d -r, the ownership of the files became root.
Any idea how to run gsutil rsync just like rsync -a?
Thanks,
Peter

gsutil rsync doesn't currently support preserving POSIX file attributes in the cloud.
It's not guaranteed that the uid/gid on the system that uploaded a file is even valid on the system that downloads it, so (at least for now) you'll need to manage your file ownership and permissions manually.
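For example, a rough sketch of doing that by hand after a download (the bucket name, local path, and owner below are placeholders, not from the question):
gsutil -m rsync -r gs://my-bucket /srv/data    # pull the bucket down as usual
chown -R peter:peter /srv/data                 # re-apply the uid/gid you expect
chmod -R u=rwX,go=rX /srv/data                 # and the permission bits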

Related

Option to exclude files in pg_basebackup command Postgres

When cloning a standby, how can I prevent pg_basebackup from copying postgresql.conf and pg_hba.conf from the master to the /var/lib/pgsql/9.6/data directory?
Currently I am using this command
[root@xyz..]# pg_basebackup -h {master ipAddr} -D /var/lib/pgsql/9.6/data -U postgres -v -P
According to the docs:
The backup will include all files in the data directory and tablespaces, including the configuration files and any additional files placed in the directory by third parties. But only regular files and directories are copied. Symbolic links (other than those used for tablespaces) and special device files are skipped.
So there is no such option. If you still want to force it, move the config files out of the data directory (and optionally symlink them back into the data directory); since pg_basebackup skips symbolic links, they won't be copied.
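A rough sketch of that workaround (the /etc/pgsql-conf location is just an example):
mkdir -p /etc/pgsql-conf
mv /var/lib/pgsql/9.6/data/postgresql.conf /etc/pgsql-conf/postgresql.conf
mv /var/lib/pgsql/9.6/data/pg_hba.conf /etc/pgsql-conf/pg_hba.conf
ln -s /etc/pgsql-conf/postgresql.conf /var/lib/pgsql/9.6/data/postgresql.conf   # optional: link them back
ln -s /etc/pgsql-conf/pg_hba.conf /var/lib/pgsql/9.6/data/pg_hba.conf
Because the links are symbolic, pg_basebackup will skip them on the next clone.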
This answer is for Postgres 14. pg_basebackup backs up the entire data directory. https://www.postgresql.org/docs/14/app-pgbasebackup.html states that the backup utility will skip any directory or file that is a symbolic link, so that can be a workaround to get only the desired content into the tarball.
I faced a similar situation where I wanted to exclude the contents of several directories such as pg_replslot, pg_dynshmem, pg_notify, etc. I made the tarball the usual way: pg_basebackup -D /backup/ -F t -P -v. After the tarball was made, and before restoring it on another server, I edited the tar manually to exclude the contents of all the required directories.
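A rough sketch of that manual trimming step (the directory names are the ones mentioned above; the staging path is illustrative):
cd /backup
mkdir staging && tar -xf base.tar -C staging                               # unpack the base backup
rm -rf staging/pg_replslot/* staging/pg_dynshmem/* staging/pg_notify/*     # empty the unwanted dirs, keep the dirs themselves
tar -cf base-trimmed.tar -C staging .                                      # repack the trimmed backup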

How do I copy/move all files and subfolders from the current directory to a Google Cloud Storage bucket with gsutil

I'm using gsutil and I need to copy a large number of files/subdirectories from a directory on a Windows server to a Google Cloud Storage bucket.
I have checked the documentation, but somehow I can't seem to get the syntax right. I'm trying something along these lines:
c:\test>gsutil -m cp -r . gs://mytestbucket
But I keep getting the message:
CommandException: No URLs matched: .
What am I doing wrong here?
Regards
Morten Hjorth Nielsen
Try gsutil -m cp -r * gs://mytestbucket
Or gsutil -m cp -r *.* gs://mytestbucket
Or, if your local directory is called test, go one directory up and type: gsutil -m cp -r test gs://mytestbucket
Not sure which syntax you need on Windows, but probably the first.

How to download multiple files in Google Cloud Storage

Scenario: there are multiple folders and many files stored in a storage bucket that is accessible by project team members. Instead of downloading individual files one at a time (which is very slow and time consuming), is there a way to download entire folders? Or at least multiple files at once? Is this possible without having to use one of the command consoles? Some of the team members are not tech savvy and need to access these files as simply as possible. Thank you for any help!
I would suggest downloading the files with gsutil. If you have a large number of files to transfer, you might want to use the gsutil -m option to perform a parallel (multi-threaded/multi-processing) copy:
gsutil -m cp -R gs://your-bucket .
The time reduction for downloading the files can be quite significant. See this Cloud Storage documentation for complete information on the GCS cp command.
If you want to copy into a particular directory, note that the directory must exist first, as gsutil won't create it automatically (e.g. mkdir my-bucket-local-copy && gsutil -m cp -r gs://your-bucket my-bucket-local-copy).
I recommend they use gsutil. GCS's API deals with only one object at a time, but its command-line utility, gsutil, is more than happy to download a bunch of objects in parallel. Downloading an entire GCS "folder" with gsutil is pretty simple:
$> gsutil cp -r gs://my-bucket/remoteDirectory localDirectory
To download files to a local machine you need to:
install gsutil on the local machine
run the Google Cloud SDK Shell
run a command like this (example for the Windows platform):
gsutil -m cp -r gs://source_folder_path "%userprofile%/Downloads"
gsutil rsync -d -r gs://bucketName .
works for me
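Note that the -d flag deletes anything in the destination that is not in the bucket, so a safer sketch is to sync into a dedicated, empty directory (the directory name here is just an example):
mkdir bucket-copy && cd bucket-copy
gsutil -m rsync -d -r gs://bucketName .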

gsutil rsync with gzip compression

I'm hosting publicly available static resources in a Google Storage bucket, and I want to use the gsutil rsync command to sync our local version to the bucket, saving bandwidth and time. Part of our build process is to pre-gzip these resources, but gsutil rsync has no way to set the Content-Encoding header. This means we must run gsutil rsync, then immediately run gsutil setmeta to set headers on all of the gzipped file types. This leaves the bucket in a BAD state until that header is set. Another option is to use gsutil cp, passing the -z option, but this requires us to re-upload the entire directory structure every time, and that includes a LOT of image files and other non-gzipped resources, which wastes time and bandwidth.
Is there an atomic way to accomplish the rsync and set proper Content-Encoding headers?
Assuming you're starting with gzipped source files in source-dir you can do:
gsutil -h content-encoding:gzip rsync -r source-dir gs://your-bucket
Note: If you do this and then run rsync in the reverse direction it will decompress and copy all the objects back down:
gsutil rsync -r gs://your-bucket source-dir
which may not be what you want to happen. Basically, the safest way to use rsync is to simply synchronize objects as-is between source and destination, and not try to set content encodings on the objects.
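If you do use the -h content-encoding:gzip approach, you can spot-check that the header was applied with gsutil stat (the object name here is just an example):
gsutil stat gs://your-bucket/app.js    # the output should include Content-Encoding: gzip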
I'm not completely answering the question, but I came here as I was wondering the same thing while trying to achieve the following:
how to efficiently deploy a static website to Google Cloud Storage
I was able to find an optimized way to deploy my static website from a local folder to a gs bucket:
Split my local folder into 2 folders with the same hierarchy, one containing the content to be gzipped (html, css, js, ...), the other containing everything else
Gzip each file in my gzip folder (in place)
Call gsutil rsync for each folder to the same gs destination (the full sequence is sketched below)
Of course, it is only a one-way synchronization and deleted local files are not deleted remotely
For the gzip folder the command is
gsutil -m -h Content-Encoding:gzip rsync -c -r src/gzip gs://dst
forcing the content encoding to be gzipped
For the other folder the command is
gsutil -m rsync -c -r src/none gs://dst
The -m option is used for parallel optimization. The -c option is needed to force checksum validation (see "Why is gsutil rsync re-downloading all our files?"), since my build process touches every local file. The -r option is used for recursion.
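Putting the steps together, a rough sketch of the whole deploy (the src/gzip and src/none folder names follow the layout above; the find/gzip line is my own illustration of the in-place gzip step, not from the original answer):
find src/gzip -type f -exec sh -c 'gzip -9 "$1" && mv "$1.gz" "$1"' _ {} \;    # gzip in place, keeping the original names
gsutil -m -h Content-Encoding:gzip rsync -c -r src/gzip gs://dst               # upload the gzipped half with the header
gsutil -m rsync -c -r src/none gs://dst                                        # upload everything else as-is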
I even wrote a script for it (in dart): http://tekhoow.blogspot.fr/2016/10/deploying-static-website-efficiently-on.html

Rsync files in local directories and chmod issues

When I do rsync this is my command:
rsync -a source dest
I am using dest as my web root /var/www/, and some folders which were set to chmod 777 no longer have 777 permissions.
Does rsync change folder permissions as well?
What is the best way to sync two local folders on the same server? Will rsync delete any changes made in the destination and use the source files?
The manual page for rsync says this:
-a, --archive archive mode; equals -rlptgoD (no -H,-A,-X)
Among those options is -p, about which it says:
-p, --perms preserve permissions
So, yes, rsync is making the permissions on dest match those on source in this case. If that is not what you want, read the manual page and decide which options suit your needs better than rsync -a, and use those instead. In the simplest case, add the --no-perms flag after -a to disable permission preservation.
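For example (a sketch; whether you also want to stop copying owner and group depends on your setup):
rsync -a --no-perms source dest                        # archive mode, but don't force source permissions onto dest
rsync -a --no-perms --no-owner --no-group source dest  # additionally skip copying owner and group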