How to perform logging with gsutil rsync - google-cloud-storage

What's the proper way to log any errors or warnings when performing a quiet rsync?
This is what I currently run from my crontab:
gsutil -m -q rsync -r -C /mount1/share/folder gs://my-bucket-1/folder/ > /mount2/share/folder/gsutil.log
Since the log file is always completely empty and I'm uploading terabytes of data, I'm starting to think that even errors and warnings are being suppressed.

After realizing that this comes down to how you redirect stdout and/or stderr to files in general, the answer really lies in this existing thread: How to redirect both stdout and stderr to a file
So a simple solution to log as much as possible into one single log file could be something like:
gsutil -d rsync [src] [dst] &> [logfile]
...where -d enables debug output. I found this to be the only way to show files which were affected by an error such as CommandException: 3 files/objects could not be copied. Please note that -d exposes authentication credentials.
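If full debug output is too noisy (and to keep credentials out of the log), an alternative is to keep -q and simply redirect both streams, so that only errors and warnings end up in the log file. A minimal sketch based on the cron line from the question (&> is a bash-ism, so the portable 2>&1 form is safer in a crontab, which usually runs /bin/sh):
gsutil -m -q rsync -r -C /mount1/share/folder gs://my-bucket-1/folder/ > /mount2/share/folder/gsutil.log 2>&1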

Related

Accidentally backed Docker Postgres up with -v flag. How to restore?

I had a backup script for a small server running Postgresql in a Docker Container.
The backup line was the following:
docker exec -t crm-database-1 pg_dump --dbname=postgresql://postgres:**************@localhost:5432/********** -F c -b -Z 9 -E UTF8 -v > $path/db.bak
Today the unthinkable happened and the Postgres data directory got corrupted. No chance of recovering.
I think I screwed myself because of the -v flag.
And of course I never tried to recover my backups before.
Now I'm trying to restore one of the 20 dumps I have, but I only get errors like "unexpected EOF" or "file is not an archive".
Is there any chance on recovering or did I screw up and the data is gone?
I tried every possible combination of piping the backup to pg_restore or pg_ctl.
Also tried to edit the backup file by removing all the "pg_dump: reading table..." lines with a regex.
None of it worked.
EDIT:
After a long search I found out that after "cat-ing" something out of Docker to the host, the file that gets created on the host has a few extra bytes.
With HxD I found that there are extra 0x0D (carriage return) characters.
After removing them I was able to restore the schema, but when trying to import the data I get the error message "out of memory".
I'm going to dig a little bit more into that.
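For what it's worth, stray 0x0D bytes are exactly what docker exec -t tends to produce: the pseudo-TTY it allocates translates LF to CRLF on output, and it also merges stderr into the same stream, which is presumably why the "pg_dump: reading table..." messages ended up inside the dump. Assuming that is the cause here, future backups should stay clean if the dump is written without a TTY, for example (same command as above, only -t dropped):
# same backup line as above, but without -t, so no pseudo-TTY rewrites the output
docker exec crm-database-1 pg_dump --dbname=postgresql://postgres:**************@localhost:5432/********** -F c -b -Z 9 -E UTF8 -v > $path/db.bak
With no TTY in the way, the custom-format archive on the host is byte-for-byte what pg_dump wrote, and the -v messages stay on stderr instead of polluting the backup.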

How to get response from SCPG3 command

I'm using scpg3 command to copy file from local server to a remote server. My command is as below:
scpg3 <filename> user@remotehost:/tmp
My question is: how do I get the result of this command? I want to move the file to a backup folder after it has been copied successfully. Thanks
There is a verbose option. You can use that:
scpg3 -v, --verbose
In your case:
scpg3 -v <filename> user@remotehost:/tmp
-v uses verbose mode, which is equivalent to -D 2. -D only applies on Unix; on Windows, instead of this command-line tool, use the Connection Broker debugging options -D and -l.
-D sets the debug level.
Hope it helps.
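Verbose output alone won't tell a script whether the copy succeeded, though. Since the goal is to move the file to a backup folder after a successful copy, one common approach is to check the command's exit status. A sketch, assuming the file name is in $file and /path/to/backup is a placeholder:
# move the file to the backup folder only if scpg3 exits with status 0
if scpg3 "$file" user@remotehost:/tmp; then
    mv "$file" /path/to/backup/
else
    echo "scpg3 failed, leaving $file in place" >&2
fi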

gsutil rsync with gzip compression

I'm hosting publicly available static resources in a Google Storage bucket, and I want to use the gsutil rsync command to sync our local version to the bucket, saving bandwidth and time. Part of our build process is to pre-gzip these resources, but gsutil rsync has no way to set the Content-Encoding header. This means we must run gsutil rsync, then immediately run gsutil setmeta to set headers on all of the gzipped file types. This leaves the bucket in a BAD state until that header is set. Another option is to use gsutil cp, passing the -z option, but this requires us to re-upload the entire directory structure every time, and this includes a LOT of image files and other non-gzipped resources, which wastes time and bandwidth.
Is there an atomic way to accomplish the rsync and set proper Content-Encoding headers?
Assuming you're starting with gzipped source files in source-dir you can do:
gsutil -h content-encoding:gzip rsync -r source-dir gs://your-bucket
Note: If you do this and then run rsync in the reverse direction it will decompress and copy all the objects back down:
gsutil rsync -r gs://your-bucket source-dir
which may not be what you want to happen. Basically, the safest way to use rsync is to simply synchronize objects as-is between source and destination, and not try to set content encodings on the objects.
I'm not completely answering the question, but I came here as I was wondering the same thing while trying to achieve the following:
how to efficiently deploy a static website to Google Cloud Storage
I was able to find an optimized way to deploy my static website from a local folder to a gs bucket:
1. Split my local folder into two folders with the same hierarchy: one containing the content to be gzipped (html, css, js...), the other containing the remaining files
2. Gzip each file in my gzip folder in place (see the sketch after this list)
3. Call gsutil rsync for each folder to the same gs destination
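For step 2, gzip normally renames each file to file.gz, so compressing "in place" while keeping the original names needs a small extra step. A possible sketch (not the author's exact script; src/gzip matches the commands below):
# gzip every file under src/gzip, then rename it back to its original name
find src/gzip -type f -exec sh -c 'gzip -9 "$0" && mv "$0.gz" "$0"' {} \;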
Of course, this is only a one-way synchronization and deleted local files are not deleted remotely.
For the gzip folder the command is
gsutil -m -h Content-Encoding:gzip rsync -c -r src/gzip gs://dst
forcing the content encoding to be gzipped
For the other folder the command is
gsutil -m rsync -c -r src/none gs://dst
The -m option is used for parallel optimization. The -c option is needed to force checksum validation (see "Why is gsutil rsync re-downloading all our files?") because my build process touches every local file. The -r option makes the sync recursive.
I even wrote a script for it (in dart): http://tekhoow.blogspot.fr/2016/10/deploying-static-website-efficiently-on.html

How to send data to command line after calling .sh file?

I want to install Anaconda through EasyBuild. EasyBuild is a tool for managing software installations on clusters. Anaconda can be installed with sh Anaconda.sh.
However, after running it I have to accept the license agreement and give the installation location on the command line by entering <Enter>, yes <Enter>, path/where/to/install/ <Enter>.
Because this has to be installed automatically, I want to accept the terms and give the install location in one line. I tried to do it like this:
sh Anaconda.sh < <(echo) >/dev/null < <(echo yes) >/dev/null \
< <(echo /apps/software/Anaconda/1.8.0-Linux-x86_64/) > test.txt
From test.txt I can see that the first echo works as <Enter>, but I can't figure out how to accept the license agreement; the installer acts as if yes was never sent:
Do you approve the license terms? [yes|no]
[no] >>> The license agreement wasn't approved, aborting installation.
How can I send the yes correctly to the script input?
Edit: Sorry, I missed the part about having to enter more than one thing. You can take a look at writing expect scripts: thegeekstuff.com/2010/10/expect-examples. You may need to install it, however.
You could try piping with the following command: yes yes | sh Anaconda.sh. Read the man pages for more information man yes.
Expect is a great way to go and probably the most error-proof way. If you know all the questions, I think you could do this by just writing a file with the answers in the correct order, one per line, and piping it in.
That install script is huge so as long as you can verify you know all the questions you could give this a try.
In my simple tests it works.
I have a test script that looks like this:
#!/bin/sh
echo -n "Do you accept "
read ANS
echo $ANS
echo -n "Install path: "
read ANS
echo $ANS
and an answers file that looks like this:
Y
/usr
Running it like so works... perhaps it will work for your monster install file as well.
cat answers | ./test.sh
Do you accept Y
Install path: /usr
If that doesn't work then the script is likely flushing and you will have to use expect or pexpect.
Good luck!
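If a separate answers file is awkward in your EasyBuild setup, the same idea can be written inline with a here-document (a sketch, assuming the installer really reads its answers from stdin; the path is the one from the question, and the blank first line stands in for the initial <Enter>):
sh Anaconda.sh > test.txt <<'EOF'

yes
/apps/software/Anaconda/1.8.0-Linux-x86_64/
EOF
If the installer reads from /dev/tty rather than stdin, neither the answers file nor the here-document will work, and expect is the way to go.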
Actually, I downloaded and looked at the anaconda install script. Looks like it takes command line arguments.
/bin/bash Anaconda-2.2.0-Linux-x86_64.sh -h
usage: Anaconda-2.2.0-Linux-x86_64.sh [options]
Installs Anaconda 2.2.0
-b run install in batch mode (without manual intervention),
it is expected the license terms are agreed upon
-f no error if install prefix already exists
-h print this help message and exit
-p PREFIX install prefix, defaults to /home/cody.stevens/anaconda
Use the -b and -p options...
so use it like so:
/bin/bash Anaconda-2.2.0-Linux-x86_64.sh -b -p /usr
Also of note: that script explicitly says not to run it with '.' or 'sh' but with 'bash', so they must depend on some bash feature.

Can't resume "wget --mirror" with --no-clobber (-c -F -B unhelpful)

I started a wget mirror with "wget --mirror [sitename]", and it was
working fine, but accidentally interrupted the process.
I now want to resume the mirror with the following caveats:
If wget has already downloaded a file, I don't want it downloaded
again. I don't even want wget to check the timestamp: I know the
version I have is "recent enough".
I do want wget to read the files it's already downloaded and
follow links inside those files.
I can use "-nc" for the first point above, but I can't seem to coerce
wget to read through files it's already downloaded.
Things I've tried:
The obvious "wget -c -m" doesn't work, because it wants
to compare timestamps, which requires making at least a HEAD request
to the remote server.
"wget -nc -m" doesn't work, since -m implies -N, and -nc is
incompatible with -N.
"wget -F -nc -r -l inf" is the best I could come up with, but it
still fails. I was hoping "-F" would coerce wget into reading local,
already-downloaded files as HTML, and thus follow links, but this
doesn't appear to happen.
I tried a few other options (like "-c" and "-B [sitename]"), but
nothing works.
How do I get wget to resume this mirror?
Apparently this works:
Solved: Wget error “Can’t timestamp and not clobber old files at the same time.”
While trying to resume a site-mirror operation I was running through Wget, I ran into the error “Can’t timestamp and not clobber old files at the same time”. It turns out that running Wget with the -N and -nc flags set at the same time can’t happen, so if you want to resume a recursive download with noclobber you have to disable -N. The -m attribute (for mirroring) intrinsically sets the -N attribute, so you’ll have to switch from -m to -r in order to use noclobber as well.
From: http://www.marathon-studios.com/blog/solved-wget-error-cant-timestamp-and-not-clobber-old-files-at-the-same-time/
-m, according to the wget manual, is equivalent to this longer series of settings: -r -N -l inf --no-remove-listing. Just use those settings instead of -m, without -N (timestamping).
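Combined with -nc from the question, that would look something like this ([sitename] being the placeholder used above):
wget -nc -r -l inf --no-remove-listing [sitename]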
Now I'm not sure if there is a way to get wget to follow the URLs in existing HTML files. There probably is a solution; I know it can take HTML files as input and scrape all the links in them. Perhaps you could use a bash command to concatenate all the HTML files together into one big file.
I solved this problem by just deleting all the HTML files, because I didn't mind re-downloading only those. But this might not work for everyone's use case.
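A sketch of that concatenation idea (untested; [sitename] and the temp path are placeholders): feed the pages already on disk back to wget with --force-html -i so it follows their links, while -nc keeps it from re-fetching files that already exist locally.
# collect the already-downloaded pages into one input file
find [sitename] -name '*.html' -exec cat {} + > /tmp/all-pages.html
# crawl outward from those links without clobbering existing files
wget -nc -r -l inf --force-html --base=http://[sitename]/ -i /tmp/all-pages.html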