AWS S3, deleting files from local directory after upload - PowerShell

I have backup files in different directories on one drive. Files in those directories can be quite big, up to 800 GB or so. So I have a batch file with a set of scripts which uploads/syncs the files to S3.
See example below:
aws s3 sync R:\DB_Backups3\System s3://usa-daily/System/ --exclude "*" --include "*/*/Diff/*"
The upload time can vary, but so far so good.
My question is: how do I edit the script, or create a new one, so that it checks in the S3 bucket that the files have been uploaded, and ONLY if they have been uploaded deletes them from the local drive, and otherwise leaves them on the drive?
(Ideally it would check each file.)
I'm not familiar with aws s3 or an AWS CLI command that can do that. Please let me know if I made myself clear or if you need more details.
Any help will be much appreciated.

The best option would be to use mv with the --recursive parameter for multiple files.
When passed with the parameter --recursive, the following mv command recursively moves all files under a specified directory to a specified bucket and prefix while excluding some files by using an --exclude parameter. In this example, the directory myDir has the files test1.txt and test2.jpg:
aws s3 mv myDir s3://mybucket/ --recursive --exclude "*.jpg"
Output:
move: myDir/test1.txt to s3://mybucket/test1.txt
Hope this helps.
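Applied to the sync command from the question, a hedged sketch (assuming the same --exclude/--include filters carry over to mv unchanged) would be:
aws s3 mv R:\DB_Backups3\System s3://usa-daily/System/ --recursive --exclude "*" --include "*/*/Diff/*"
Since mv only removes a local file after it has been copied to the bucket, this should cover the "delete only if uploaded" requirement.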

As the answer by @ketan shows, the Amazon AWS client cannot do a batch move.
You can use the WinSCP put -delete command instead:
winscp.com /log=S3.log /ini=nul /command ^
"open s3://S3KEY:S3SECRET#s3.amazonaws.com/" ^
"put -delete C:\local\path\* /bucket/" ^
"exit"
You need to URL-encode special characters in the credentials. WinSCP GUI can generate an S3 script template, like the one above, for you.
Alternatively, since WinSCP 5.19, you can use -username and -password switches, which do not need any encoding:
"open s3://s3.amazonaws.com/ -username=S3KEY -password=S3SECRET" ^
(I'm the author of WinSCP)

Related

How to eliminate a CommandException issue in gsutil

I have this command:
gsutil rsync -r -x '".*.jpg$"' File Share\data\Home Drive gs://sdefs01/Home Drive
This is to exclude any .jpg files from being copied to my Google bucket.
However, it returns an error:
CommandException: The rsync command accepts at most 2 arguments.
The command example that I refer to is from the Google Cloud support page.
Please help.
You need to put the source directory path inside double quotes as it contains spaces.
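For example (a sketch using the paths from the question; note that the destination prefix also contains a space, so it likely needs quoting too, and the exclusion regex is shown with plain double quotes):
gsutil rsync -r -x ".*\.jpg$" "File Share\data\Home Drive" "gs://sdefs01/Home Drive"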

Compress folder to .WAR file using PowerShell

I have a folder on my D drive (D:\MyFolder) which I want to compress into a .WAR file (D:\MyFolder.war).
I am trying to automate a deployment process using PowerShell, so I am looking for a PowerShell (or MS command line) command to do this.
I've tried to google and scour StackOverflow, but haven't been able to find anything yet. This is my first 'PowerShell Adventure', so I'm not entirely sure if/how I can do this.
Many thanks for your help.
What about something simple (if you do not have JAVA_HOME set, which you can check with $env:JAVA_HOME):
cd D:\MyFolder
& "<path_to_your_java>\bin\jar.exe" -cvf my_folder.war *
jar options:
-c create new archive
-v generate verbose output on standard output
-f specify archive file name
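If JAVA_HOME happens to be set, a minimal sketch of the same call that resolves jar.exe from it (output path adapted from the question; adjust as needed):
cd D:\MyFolder
& "$env:JAVA_HOME\bin\jar.exe" -cvf D:\MyFolder.war *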

Keep original documents' dates with PSFTP

I have downloaded some files with PSFTP from a SQL Server. The problem is that PSFTP changes the creation/update and last-modified dates of the files when downloading them into a local folder. For me it is important to keep the original dates. Is there any command to set/change this? Thanks
This is the script of the batch file:
psftp.exe user@host -i xxx.ppk -b abc.scr
This is the script of the SCR file:
cd /path remote folder
lcd path local folder
mget *.csv
exit
I'm not familiar with PSFTP, and after looking at the docs I don't see any option to do this. However, you can use the -p flag of pscp to preserve dates and times.
See the docs here.
(Note it's a lower-case p; the upper-case -P is for specifying the port.)
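A hedged sketch of the equivalent download with pscp, adapted from the question's batch file (the key file, host, and paths are placeholders; depending on the server, the -unsafe switch may also be needed for wildcard downloads):
pscp.exe -p -i xxx.ppk user@host:/remote/path/*.csv C:\local\path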

Folders not showing up in Bucket storage

So my problem is that I have a few files not showing up in gcsfuse when mounted. I see them in the online console and if I 'ls' with gsutil.
Also, if I manually create the folder in the bucket, I can then see the files inside it, but I need to create it first. Any suggestions?
gs://mybucket/
    dir1/
        ok.txt
        dir2
            lafu.txt
If I mount mybucket with gcsfuse and do 'ls', it only returns dir1/ok.txt.
Then I create the folder dir2 inside dir1 at the root of the mount point, and suddenly 'lafu.txt' shows up.
By default, gcsfuse won't show a directory that is only "implicitly" defined by a file with a slash in its name. For example, if your bucket contains an object named dir/foo.txt, you won't be able to find it unless there is also an object named dir/.
You can work around this by setting the --implicit-dirs flag, but there are good reasons why this is not the default. See the documentation for more information.
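For example, a mount command using the flag (the mount point path is a placeholder):
gcsfuse --implicit-dirs mybucket /path/to/mount/point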
Google Cloud Storage doesn't have folders. The various interfaces use different tricks to pretend that folders exist, but ultimately there's just an object whose name contains a bunch of slashes. For example, "pictures/january/0001.jpg" is the full name of a single object.
If you need to be sure that a "folder" exists, put an object inside it.
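For example, streaming a small placeholder object under the prefix is enough to make the "folder" appear (a sketch; the .keep name and the path are placeholders):
echo "" | gsutil cp - gs://mybucket/path/to/folder/.keep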
@Brandon Yarbrough suggests creating the needed directory entries in the GCS bucket. This avoids the performance penalty described by @jacobsa.
Here is a bash script for doing so:
# 1. Mount $BUCKET_NAME at $MOUNT_PT
# 2. Run this script
MOUNT_PT=${1:-$HOME/mnt}
BUCKET_NAME=$2
DEL_OUTFILE=${3:-y}  # Set to y or n
echo "Reading objects in $BUCKET_NAME"
OUTFILE=dir_names.txt
gsutil ls -r gs://$BUCKET_NAME/** | while read BUCKET_OBJ
do
    dirname "$BUCKET_OBJ"
done | sort -u > $OUTFILE
echo "Processing directories found"
cat $OUTFILE | while read DIR_NAME
do
    LOCAL_DIR=`echo "$DIR_NAME" | sed "s=gs://$BUCKET_NAME/==" | sed "s=gs://$BUCKET_NAME=="`
    #echo $LOCAL_DIR
    TARG_DIR="$MOUNT_PT/$LOCAL_DIR"
    if ! [ -d "$TARG_DIR" ]
    then
        echo "Creating $TARG_DIR"
        mkdir -p "$TARG_DIR"
    fi
done
if [ "$DEL_OUTFILE" = "y" ]
then
    rm $OUTFILE
fi
echo "Process complete"
I wrote this script, and have shared it at https://github.com/mherzog01/util/blob/main/sh/mk_bucket_dirs.sh.
This script assumes that you have mounted a GCS bucket locally on a Linux (or similar) system. The script first specifies the GCS bucket and location where the bucket is mounted. It then identifies all "directories" in the GCS bucket which are not visible locally, and creates them.
This (for me) fixed the issue with folders (and associated objects) not showing up in the mounted folder structure.
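For example, assuming the bucket is mounted at ~/mnt and the script is saved as mk_bucket_dirs.sh (the bucket name is a placeholder):
bash mk_bucket_dirs.sh ~/mnt my-bucket y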

GCS - multiple credentials in a single boto file

New to GCS (just got started with it today). Looks very promising.
Is there any way to use multiple S3 accounts (or GCS) in a single boto file? I only see the option to assign keys to one S3 and one GCS account in a single file. I'd like to use multiple credentials.
We'd like to copy from S3 to S3, or GCS to GCS, with each of those buckets using different keys.
You should be able to set up multiple profiles within your .boto file.
You could add something like:
[profile prod]
gs_access_key_id=....
gs_secret_access_key=....
[profile dev]
gs_access_key_id=....
gs_secret_access_key=....
And then from your code you can add a profile_name= parameter to the connection call:
con = boto.connect_gs(profile_name="dev")
You can definitely use multiple boto files; just make sure that the credentials in each of them are valid. Every time you need to switch between them, run the following command with the right path.
$ BOTO_CONFIG=/path/to_boto gsutil cp SOME_FILE gs://bucket
Example:
BOTO_CONFIG=/etc/boto.cfg gsutil -m cp text.txt gs://bucket
Additionally, you can have aliases for your different profiles. Just create an alias for each command and you are set!
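For example (a sketch; the alias names and config file paths are placeholders):
alias gsutil-prod='BOTO_CONFIG=/path/to/prod.boto gsutil'
alias gsutil-dev='BOTO_CONFIG=/path/to/dev.boto gsutil'
gsutil-dev cp text.txt gs://bucket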