I want to compress / zip multiple files in a Google Cloud Storage bucket into a single zip file without downloading them.
Is there a gsutil CLI method that takes multiple paths as input and copies a zip/compressed archive of all those input files?
Thank you in advance.
Nope, there's no functionality in GCS that supports this. And if the API doesn't support it, no tools or client libraries can, as they're simply making API calls under the hood.
There is a tool for this, though it isn't native to GCS; you can host it on your own machine, or on Google Cloud for better performance:
https://www.npmjs.com/package/zip-bucket
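If you'd rather roll your own along the same lines, a small script with the google-cloud-storage Python client can do it; run it on a Compute Engine VM or Cloud Function so the bytes stay inside Google's network instead of coming down to your machine. A minimal sketch (the bucket and object names are placeholders):

    # Zip several GCS objects into a new GCS object. The objects are still
    # read through the API (GCS has no server-side zip), but running this
    # inside Google Cloud avoids downloading them to your own machine.
    import io
    import zipfile

    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket("my-bucket")  # placeholder bucket name

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in ["reports/a.csv", "reports/b.csv"]:  # placeholder paths
            zf.writestr(name, bucket.blob(name).download_as_bytes())

    buf.seek(0)
    bucket.blob("archives/all.zip").upload_from_file(
        buf, content_type="application/zip")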
Our application reads data from several HDFS data folders. The folders are updated weekly, daily, or monthly, so based on the update period we need to find the latest path and then read the data.
We would like to do this programmatically in Scala; are there libraries available for this?
We could only find the one below, and are wondering whether there are better libraries available:
https://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/package-summary.html
The linked library is the recommended way to use the HDFS API programmatically without going through the hadoop fs CLI scripts. Any other library you find would be built on the same package.
I have files on a virtual machine created and hosted in Google Cloud, and I want to be able to access them in Google Colab to run Selenium.
Should I send the files to Google Cloud Storage? It seems that I would then be able to access them; I found a tutorial that shows how to access Google Cloud Storage files in Colab notebooks.
You can use several methods to upload the files from your Compute Engine machine, for example uploading them to Google Drive or Google Cloud Storage. This document explains it well. Note that your Compute Engine machine isn't strictly "local" to you, but it acts the same way in this context.
With Cloud Storage, you can use gsutil to upload the files, with this command:
gsutil cp $file_to_upload gs://$bucket_name
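Once the files are in the bucket, reading them back inside Colab takes only a few lines. A sketch, assuming placeholder project, bucket, and object names:

    # In a Colab cell: authenticate, then pull the file down from GCS.
    from google.colab import auth
    auth.authenticate_user()

    from google.cloud import storage

    client = storage.Client(project="my-project")  # placeholder project ID
    blob = client.bucket("my-bucket").blob("data/file.csv")  # placeholders
    blob.download_to_filename("file.csv")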
I am using Text to Speech in Watson Studio. The output file is '.wav'. Does anyone have any idea where the file is stored? I want to download it from IBM Cloud to my PC. How should I do this? I have searched the entire cloud storage but couldn't find the speech file.
When running the TTS API from within Watson Studio on Cloud notebooks, the files you write go to the underlying python runtime container's filesystem, which is not persistent.
So, you would have to explicitly copy that file to Cloud Object Storage.
An easy way to do that in WSC is to use the project_lib API (see https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/project-lib-python.html), which will let you create a Data Asset in your project.
You could also use the COS Client API https://ibm.github.io/ibm-cos-sdk-python/reference/services/s3.html#S3.Client.upload_file to copy that file to an arbitrary bucket that you have access to.
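For the COS route, a minimal sketch with ibm-cos-sdk-python; the API key, service instance CRN, endpoint, and bucket name below are placeholders you would take from your COS service credentials:

    # Copy the generated .wav from the notebook filesystem to a COS bucket.
    import ibm_boto3
    from ibm_botocore.client import Config

    cos = ibm_boto3.client(
        "s3",
        ibm_api_key_id="YOUR_API_KEY",                # placeholder
        ibm_service_instance_id="YOUR_INSTANCE_CRN",  # placeholder
        config=Config(signature_version="oauth"),
        endpoint_url="https://s3.us-south.cloud-object-storage.appdomain.cloud",
    )
    cos.upload_file("speech.wav", "my-bucket", "speech.wav")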
Regards,
Philippe Gregoire - IBM Ecosystem Advocacy Group - Data&AI Technical enablement
Is there a way to directly load / edit / save files to a given bucket in Google Cloud Storage without having to download the file, edit it, and then upload it again?
We have a GCS bucket with about 20 config files that we edit for various reasons. We would really just like to open the bucket in VS Code and move between editing the files and saving the changes.
I have tried the vscode-bucket-explorer extension for VS Code, but it only seems to provide viewing capability, with no editing/saving. Unless I am missing something?
Is there a way to mount a bucket as a drive on a Mac? With read/write ability?
Is there a way to directly load / edit / save files to a given bucket in Google Cloud Storage without having to download the file, edit it, and then upload it again?
No, objects in Google Cloud Storage cannot be edited in place.
As with buckets, existing objects cannot be directly renamed. Instead, you can copy an object, give the copied version the desired name, and delete the original version of the object. See Renaming an object for a step-by-step guide, including instructions for tools like gsutil and the Google Cloud Console, which handle the renaming process automatically.
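For what it's worth, that copy-then-delete "rename" is only a few lines with the google-cloud-storage Python client (bucket and object names are placeholders):

    # "Rename" an object by copying it under the new name, then
    # deleting the original; there is no in-place rename or edit.
    from google.cloud import storage

    bucket = storage.Client().bucket("my-bucket")  # placeholder bucket
    src = bucket.blob("configs/old-name.yaml")     # placeholder names
    bucket.copy_blob(src, bucket, new_name="configs/new-name.yaml")
    src.delete()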
Is there a way to mount a bucket as a drive on a Mac? With read/write ability?
You can use Cloud Storage FUSE where the mounted bucket will behave similarly to a persistent disk.
Cloud Storage FUSE is an open source FUSE adapter that allows you to mount Cloud Storage buckets as file systems on Linux or macOS systems. It also provides a way for applications to upload and download Cloud Storage objects using standard file system semantics. Cloud Storage FUSE can be run anywhere with connectivity to Cloud Storage, including Google Compute Engine VMs or on-premises systems.
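For example, once gcsfuse is installed, mounting is a single command (bucket name and mount point are placeholders):
gcsfuse my-bucket ~/gcs-mount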
I download a lot of CSV files via FTP from different sources on a daily basis, and I then upload these files into Google Cloud Storage.
Are there any programs/APIs/tools to automate this?
I'm looking for the best way, if possible, to load these files directly into Google Cloud Storage without having to download them locally first: something I can deploy on Google Compute Engine, so I don't need to run local programs like FileZilla/CrossFTP. The program/tool would keep checking the remote location on a regular basis and load new files into Google Cloud Storage, ensuring a checksum match.
I apologize in advance if this is too vague/generic a question.
Sorry, no. Automatically importing objects from a remote FTP server is not currently a feature of GCS.
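That said, you can approximate it yourself with a small script on a Compute Engine VM, triggered by cron on whatever schedule you need. A rough sketch using Python's ftplib and the google-cloud-storage client; the host, credentials, directory, and bucket names are placeholders:

    # Poll an FTP server and copy any new files into a GCS bucket.
    import io
    from ftplib import FTP

    from google.cloud import storage

    bucket = storage.Client().bucket("my-bucket")  # placeholder bucket

    ftp = FTP("ftp.example.com")                   # placeholder host
    ftp.login("user", "password")                  # placeholder credentials
    ftp.cwd("/exports")                            # placeholder directory

    for name in ftp.nlst():
        blob = bucket.blob(f"incoming/{name}")
        if blob.exists():  # skip files copied on an earlier run
            continue
        buf = io.BytesIO()
        ftp.retrbinary(f"RETR {name}", buf.write)
        buf.seek(0)
        # checksum="md5" has the client verify the upload end to end
        blob.upload_from_file(buf, checksum="md5")

    ftp.quit()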