How to skip existing files in gsutil rsync - google-cloud-storage

I want to copy files between a directory on my local computer disk and my Google Cloud Storage bucket with the below conditions:
1) Copy all new files and folders.
2) Skip all existing files and folders irrespective of whether they have been modified or not.
I have tried to implement this using the Google ACL policy, but it doesn't seem to be working.
I am using a Google Cloud Storage admin service account to copy my files to the bucket.

As @A.Queue commented, you can skip existing files by using the gsutil cp command with the -n option. -n means no-clobber: files and directories already present in the Cloud Storage bucket are not overwritten, and only new files and directories are added to the bucket.
If you run the following command:
gsutil cp -n -r . gs://[YOUR_BUCKET]
This copies every file and directory under the current directory (the whole tree, including all subdirectories) that is not yet present in the Cloud Storage bucket, and skips everything that is already there.
You can find more information about this command in the gsutil cp documentation.
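If you want to preview the command before it touches the bucket, a small wrapper along these lines may help. This is only a sketch: copy_new_only and the DRY_RUN variable are names made up for illustration and are not part of gsutil, and the bucket name in the usage line is a placeholder.

```shell
# Sketch only: copy_new_only and DRY_RUN are hypothetical names, not gsutil features.
# Copies a local directory tree to a bucket without overwriting existing objects.
# With DRY_RUN=1 (the default here) the command is printed instead of executed.
copy_new_only() {
  src=$1
  dest=$2
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # -n: no-clobber, -r: recurse into subdirectories
    echo "gsutil cp -n -r $src $dest"
  else
    gsutil cp -n -r "$src" "$dest"
  fi
}

# Usage (placeholder bucket name):
#   copy_new_only . gs://example-bucket              # prints the command
#   DRY_RUN=0 copy_new_only . gs://example-bucket    # actually runs it
```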

Related

Export firestore data by overwriting existing data gcloud firestore

I am trying to overwrite existing export data in gcloud using:
gcloud firestore export gs://<PROJECT>/dir --collection-ids='tokens'
But I get this error:
(gcloud.firestore.export) INVALID_ARGUMENT: Path already exists: /fcm-test-firebase.appspot.com/dir/dir.overall_export_metadata
Is there any way to either delete the path or export with replace?
You can easily list the available flags for any gcloud command, for example by appending --help.
Here are variants of the command and you can see that there's no overwrite option:
gcloud firestore export
gcloud alpha firestore export
gcloud beta firestore export
Because the export is to a Google Cloud Storage (GCS) bucket, you can simply delete the path before attempting the export.
BE VERY CAREFUL with this command as it recursively deletes objects
gsutil rm -r gs://<PROJECT>/dir
If you would like Google to consider adding an overwrite feature, consider filing a feature request on its public issue tracker.
I suspect that an overwrite option doesn't exist for various reasons:
GCS storage is cheap
Having many backup copies is far better than having no backup copies
It's easy to delete copies using gsutil
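In the same spirit, one way to avoid the "Path already exists" error without deleting anything is to export to a fresh path each time. The following is a minimal sketch under my own assumptions: export_prefix is a hypothetical helper, and the exports/ prefix and timestamp format are arbitrary choices, not something gcloud requires.

```shell
# Sketch only: export_prefix is a hypothetical helper, not part of gcloud.
# Prints a path of the form gs://BUCKET/exports/YYYYMMDDTHHMMSSZ (UTC),
# so that each export lands in its own, previously unused location.
export_prefix() {
  printf 'gs://%s/exports/%s\n' "$1" "$(date -u +%Y%m%dT%H%M%SZ)"
}

# Usage (not run here; my-bucket is a placeholder):
#   gcloud firestore export "$(export_prefix my-bucket)" --collection-ids='tokens'
```

Old exports can then be removed later with gsutil rm -r once they are no longer needed.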

Azure data factory - SftpPermissionDeniedException

Using a copy data activity I want to upload files to an SFTP service, but receive the following error message:
I can upload files to the target folder with a simple Linux sftp client as the same user, and I am also able to create files and folders within the target folder (but not in its parent folder, which is the root folder).
"Upload with temp file" option is set to false.
Any idea?
To confirm which user your build runs as, you can run the whoami command as part of your build process.
Solution:
Store things inside of a folder that the user running the build has permissions to.
Change the ownership of the directory with the chown command before trying to write to it.
Refer - https://support.circleci.com/hc/en-us/articles/360003649774-Permission-Denied-When-Creating-Directory-or-Writing-a-File
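As a rough illustration of the check described above, the sketch below reports whether the current user can write to a given local directory. can_write and report_write_access are hypothetical helper names, and this only covers a local filesystem, not permissions on a remote SFTP server.

```shell
# Sketch only: can_write and report_write_access are hypothetical helpers.
# can_write DIR: succeed only if DIR exists and the current user may
# create files in it (write and search permission on the directory).
can_write() {
  [ -d "$1" ] && [ -w "$1" ] && [ -x "$1" ]
}

# report_write_access DIR: print who we are and whether DIR is writable.
report_write_access() {
  if can_write "$1"; then
    echo "$(whoami) can write to $1"
  else
    echo "$(whoami) cannot write to $1"
  fi
}

# Usage: report_write_access /path/to/target
```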

Moving postgres data folder on Ubuntu

I have a web application querying a PostgreSQL database (successfully) and I'm looking to move the data folder from location /var/lib/postgres/9.3/main to a customisable location.
Right now I'm prevented from even copying the folder due to permission errors, but I can't assign myself the permissions because that breaks the postgres server.
(I broke the server by running sudo chown <username> -R /var/lib/postgres/9.3/main - which worked as a command but stopped the postgres server from working)
I would simply create a new folder and change the location there, but I'll lose the current instance of my database if that was done.
How can I move the current folder to a new location, so that I can point to it in the .conf file? I need to explicitly move the folder, I can't create a new DB.
You can simply copy or move the directory, including all subdirectories and files; cp -rp or mv should be enough for this.
Postgres must not be running while you are working with the files.
The base of the data directory (PGDATA) must be owned by postgres and have file mode 0700; otherwise Postgres will refuse to start.
The rest of the files must at least be readable and writable by postgres.
The new location must also be known to the startup process: in /etc/init.d/ and possibly in the postgresql.conf file within the data directory (for the log file location).
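The ownership and mode requirements above can be checked before starting the server again. This is a minimal sketch, assuming GNU stat (the -c option, as shipped with Ubuntu); check_pgdata is a hypothetical helper name and /new/location is a placeholder.

```shell
# Sketch only: check_pgdata is a hypothetical helper, not a Postgres tool.
# check_pgdata DIR [OWNER]: verify that DIR is owned by OWNER (default:
# postgres) and has mode 700. Assumes GNU stat, which provides -c.
check_pgdata() {
  dir=$1
  owner=${2:-postgres}
  if [ "$(stat -c %U "$dir")" != "$owner" ]; then
    echo "$dir is not owned by $owner" >&2
    return 1
  fi
  if [ "$(stat -c %a "$dir")" != "700" ]; then
    echo "$dir does not have mode 0700" >&2
    return 1
  fi
}

# Usage (with the server stopped, after copying the directory):
#   check_pgdata /new/location
```

If the check fails, chown -R postgres:postgres and chmod 700 on the new directory should bring it back in line; unlike the chown to your own user described in the question, this keeps postgres as the owner.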

Executed PHP Script Cannot Access GCS Mounted Drive on GCE

I was able to mount my Google Cloud Storage using the command line below:
gcsfuse -o allow_other -file-mode=660 -dir-mode=770 --uid=<uid> --gid=<gid> testbucket /path/to/domain/folder
The group includes the user apache. Apache is able to write to the mounted drive like so:
sudo -u apache echo 'Some Test Text' > /path/to/domain/folder/hello.txt
hello.txt appears in the bucket as expected. However when I execute the below php script I get an error:
<?php file_put_contents('/path/to/domain/folder/hello.txt', 'Some Test Text');
PHP Error: failed to open stream: Permission denied
echo exec('whoami'); Returns apache
I assumed this is a common use for mounting with gcsfuse or something similar, but I seem to be the only one on the internet with this issue. I do not know if it's an issue with the way I mounted it or with the service security of httpd.
I came across a similar issue.
Use the flag --implicit-dirs while mounting the Google Storage bucket using gcsfuse. More on this here.
Mounting the bucket as a folder makes the OS treat it like a regular folder which may contain files and folders. But a Google Cloud Storage bucket doesn't have a directory structure. For example, when you create a file named hello.txt in a folder named files inside a Google Storage bucket, you are not actually creating a folder and putting the file in it. The object is created in the bucket with the name files/hello.txt. More on this here and here.
To make the OS treat the GCS bucket like a hierarchical structure, you have to pass the --implicit-dirs flag to gcsfuse.
Note:
I wouldn't recommend using gcsfuse in production systems, as it is beta-quality software.
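To make the flat namespace a bit more concrete, here is a small sketch that takes object names and lists the directory prefixes they imply, which is roughly the information a tool has to infer when no directory objects exist. implied_dirs is a hypothetical helper written for illustration; it is not part of gcsfuse and does not reflect how gcsfuse is implemented.

```shell
# Sketch only: implied_dirs is a hypothetical helper, not part of gcsfuse.
# Reads object names on stdin (one per line) and prints every directory
# prefix implied by a "/" in a name, each prefix once, sorted.
implied_dirs() {
  while IFS= read -r name; do
    dir="${name%/*}"
    # A name without a slash sits at the bucket root and implies no directory.
    [ "$dir" = "$name" ] && continue
    # Walk upwards: a/b/c.txt implies both a/b/ and a/.
    while :; do
      printf '%s/\n' "$dir"
      case "$dir" in
        */*) dir="${dir%/*}" ;;
        *) break ;;
      esac
    done
  done | LC_ALL=C sort -u
}

# Usage:
#   printf 'files/hello.txt\nroot.txt\n' | implied_dirs
```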

How to keep timestamp when gsutil cp

I am new to Google Cloud Storage Nearline and am testing it. I intend to use Google Cloud Storage Nearline for backup.
I wonder how to keep a file's timestamp when I do 'gsutil cp' between local storage and Nearline.
gsutil cp localfile gs://mybucket
After that, the uploaded file's timestamp is set to the upload time. I want to keep the original file timestamp.
Sorry, you cannot specify the creation time of an object in GCS. The creation time is always the moment that the object is created in GCS.
You can, however, set extra user metadata on objects that you upload. If you'd like, you can record the original creation time of an object there:
$> gsutil -h "x-goog-meta-local-creation-time:Some Creation Time" cp localfile gs://mybucket
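The metadata value can also be filled in from the file itself. Below is a minimal sketch, assuming GNU stat (the -c option); mtime_header is a hypothetical helper and local-mtime is just an illustrative key name, not a reserved one.

```shell
# Sketch only: mtime_header is a hypothetical helper; "local-mtime" is an
# arbitrary custom-metadata key chosen for illustration.
# mtime_header FILE: print a header carrying FILE's modification time
# as seconds since the epoch. Assumes GNU stat, which provides -c.
mtime_header() {
  printf 'x-goog-meta-local-mtime:%s\n' "$(stat -c %Y "$1")"
}

# Usage (not run here):
#   gsutil -h "$(mtime_header localfile)" cp localfile gs://mybucket
```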
When I performed the copy with the following command (the -P option preserves POSIX attributes), the timestamp (Linux "mtime") of the local files was automatically preserved as "goog-reserved-file-mtime" in the metadata on Google Cloud Storage.
gsutil cp -r -P $LOCAL_DIR/* gs://$TARGET_BUCKET &