I have this command
gsutil rsync -r -x '".*.jpg$"' File Share\data\Home Drive gs://sdefs01/Home Drive
this is to exclude any .jpg file to be copied to my google bucket.
however, it returns an error:
commandexceptions: the rsync command accept at most 2 arguments.
the command example that I refer to is from google cloud support page.
please help.
You need to put the source directory path inside double quotes as it contains spaces.
Related
I got
$ gsutil ls gs://ml_models_c/ref7/test/model/2/
gs://ml_models_c/ref7/test/model/2/ <= why this?
gs://ml_models_c/ref7/test/model/2/saved_model.pb
gs://ml_models_c/ref7/test/model/2/variables/
$ gsutil ls gs://seldon-models/tfserving/mnist-model/1/
gs://seldon-models/tfserving/mnist-model/1/saved_model.pb
gs://seldon-models/tfserving/mnist-model/1/variables/
Why there is gs://ml_models_c/ref7/test/model/2/ in the first command output?
Why the second command does not return itself?
It seems that I can rm it.
Thanks
At the API level, Cloud Storage doesn't have the concept of folders, everything is stored as long file names that might have slashes in them.
In this case, you likely have an object named gs://ml_models_c/ref7/test/model/2/, but no object named gs://seldon-models/tfserving/mnist-model/1/
If you don't need the gs://ml_models_c/ref7/test/model/2/ object, you can delete it and it will no longer show in the results for gsutil ls
The following command: certutil.exe -L -d “C:\Users\Home\AppData\Roaming\Mozilla\Firefox\Profiles\1bku2z91.default-1633392324717\”
returns this error message: certutil.exe: function failed: SEC_ERROR_LEGACY_DATABASE: The certificate/key database is in an old, unsupported format.
I tried with and without quotes, changed backslash to forward slash. I found comments that the destination folder has to include secmod.db, which my folder does not but I think this applied to cert8.db, not cert9.db. I am passing the right folder as per about:support lookup. My Firefox version is 66.0.3
you need to add "sql:" before the location of the folder to specify that is a sqlite db that you are trying to read so it would be:
certutil.exe -L -d sql:“C:\Users\Home\AppData\Roaming\Mozilla\Firefox\Profiles\1bku2z91.default-1633392324717\”
I have backup files in different directories in one drive. Files in those directories can be quite big up to 800GB or so. So I have a batch file with a set of scripts which upload/syncs files to S3.
See example below:
aws s3 sync R:\DB_Backups3\System s3://usa-daily/System/ --exclude "*" --include "*/*/Diff/*"
The upload time can vary but so far so good.
My question is, how do I edit the script or create a new one which checks in the s3 bucket that the files have been uploaded and ONLY if they have been uploaded then deleted them from the local drive, if not leave them on the drive?
(Ideally it would check each file)
I'm not familiar with aws s3, or aws cli command that can do that? Please let me know if I made myself clear or if you need more details.
Any help will be very appreciated.
Best would be to use mv with --recursive parameter for multiple files
When passed with the parameter --recursive, the following mv command recursively moves all files under a specified directory to a specified bucket and prefix while excluding some files by using an --exclude parameter. In this example, the directory myDir has the files test1.txt and test2.jpg:
aws s3 mv myDir s3://mybucket/ --recursive --exclude "*.jpg"
Output:
move: myDir/test1.txt to s3://mybucket2/test1.txt
Hope this helps.
As the answer by #ketan shows, Amazon aws client cannot do batch move.
You can use WinSCP put -delete command instead:
winscp.com /log=S3.log /ini=nul /command ^
"open s3://S3KEY:S3SECRET#s3.amazonaws.com/" ^
"put -delete C:\local\path\* /bucket/" ^
"exit"
You need to URL-encode special characters in the credentials. WinSCP GUI can generate an S3 script template, like the one above, for you.
Alternatively, since WinSCP 5.19, you can use -username and -password switches, which do not need any encoding:
"open s3://s3.amazonaws.com/ -username=S3KEY -password=S3SECRET" ^
(I'm the author of WinSCP)
So I run the following:
gsutil -m cp -R file.png gs://bucket/file.png
And I get the following error message:
Copying file://file.png [Content-Type=application/pdf]...
Uploading file.png: 42.59 KiB/42.59 KiB
AccessDeniedException: 401 Login Required
CommandException: 1 files/objects could not be transferred.
I'm not sure what the problem is since I ran config and I can see all my buckets. Does anyone know what I need to do?
Note: I do not have gcloud, I just installed gsutil and ran the config.
Login to Google Cloud is needed for accessing any Cloud service. You need to use below command which will guide you through login steps like typing verification code you generate by opening browser link given in console.
gcloud auth login
I was getting a similar response, and was able to solve this problem by looking at the read permissions on the .boto file. In my case, I was using a service account and the .boto file that was created by
gsutil config -e
only had read permissions set for user. Since it was being read by a service running as a different user, it wasn't able to read the file and yielding a 401 Login Required error. I fixed it by adding read permissions for the service's group.
In the least sophisticated case, you could fix it by giving any user read permission with
chmod a+r .boto
A more detailed explanation for troubleshooting
To get more information, run the same command with a -D flag, like:
gsutil -m -D cp ....
In the debug output, look at:
Command being run: /path/to/gsutil
config_file_list: /path/to/boto/config
Create your login credentials using the executable at /path/to/gsutil, not gcloud auth or any other gsutil executable on the machine, using:
/path/to/gsutil config
For a service account, use:
/path/to/gsutil config -e
These should create a .boto config file in your home directory, $HOME/.boto. If you are running the gsutil command this file should be referenced in the config_file_list variable in the debug output. If not, see below to change it.
Running gsutil under a service account or as another user
If you are running as another user, and need to reference a newly-created config file, set the environment variable BOTO_CONFIG (don't forget to export it):
BOTO_CONFIG=/path/to/$HOME/.boto
export BOTO_CONFIG
By setting this variable, when you execute gsutil, it will reference the config file you have placed in BOTO_CONFIG. You can confirm that you are referencing the correct config file by looking at the config_file_list variable in the gsutil -D command's output.
make sure the referenced .boto file is readable by the user who is executing the gsutil command
Running the /path/to/gsutil with the BOTO_CONFIG variable set allowed me to execute gsutil as another user, referencing an arbitrary BOTO_CONFIG file that was set up with a service-account's credentials.
To set up the service account:
https://console.cloud.google.com/permissions/serviceaccounts
The key file from the service account set-up process needs to be downloaded, and the path to it is requested during the gsutil config -e step.
This may be an issue with how gsutil/boto handles the OS path separators on Windows, as referenced here. This should eventually get merged into the sdk tools, but until then the following should work:
Go to
google-cloud-sdk\platform\gsutil\third_party\boto\boto\pyami\config.py
and replace the line:
for path in os.environ['BOTO_PATH'].split(':'):
with:
for path in os.environ['BOTO_PATH'].split(os.path.pathsep):
Next, go to
google-cloud-sdk\bin\bootstrapping\gsutil.py
replace the lines that use ':'
if boto_config:
boto_path = ':'.join([boto_config, gsutil_path])
elif boto_path:
# this is ':' for windows as well, hardcoded into the boto source.
boto_path = ':'.join([boto_path, gsutil_path])
else:
path_parts = ['/etc/boto.cfg',
os.path.expanduser(os.path.join('~', '.boto')),
gsutil_path]
boto_path = ':'.join(path_parts)
with
if boto_config:
boto_path = os.path.pathsep.join([boto_config, gsutil_path])
elif boto_path:
# this is ':' for windows as well, hardcoded into the boto source.
boto_path = os.path.pathsep.join([boto_path, gsutil_path])
else:
path_parts = ['/etc/boto.cfg',
os.path.expanduser(os.path.join('~', '.boto')),
gsutil_path]
boto_path = os.path.pathsep.join(path_parts)
Restart cmd and now the error should go away.
Is there any way to remove file extensions when copying files with gsutil?
From local 0001:
0001/a/1.jpg
0001/b/2.png
To bucket 0002:
gs://0002/a/1
gs://0002/b/2
(I can remove the extensions locally but I will be losing the Content-Type when copying to GS)
gsutil doesn't have any mechanism for rewriting the file name in this way. You could write a shell loop that iterates over the files and removes the extensions in the file names being copied.
To preserve the Content-Type here are a couple of suggestions:
Set it explicitly on the command line, e.g.,
gsutil -h Content-Type:image/jpeg cp 0001/a/1.jpg gs://0001/a/1
Use the use_magicfile configuration (in the .boto config file), to cause the Content-Type to be detected by the "file" command. This only works if you're running on Unix or MacOS. In this case you'd still use the shell script to remove the filename extensions, but you wouldn't have to specify the -h Content-Type arg:
gsutil cp 0001/a/1.jpg gs://0001/a/1
Mike