Using Filepicker.io, uploading files into a folder in an S3 Bucket - filepicker.io

Can I upload files into a specific folder in an S3 bucket rather than just uploading into the base folder of the bucket?

Yes, use the "path" parameter on the filepicker.store call:
filepicker.store(fpfile, {location: 'S3', path: 'myfolder/file.png'},
  function (stored_fpfile) {
    console.log(stored_fpfile);
  });
Documentation at https://developers.filepicker.io/docs/web/#store

Related

Google Storage Python ACL Update not Working

I have uploaded an image file to my Google Storage bucket.
#Block 1
#Storing the local file inside the bucket
blob_response = bucket.blob(cloud_path)
blob_response.upload_from_filename(local_path, content_type='image/png')
File gets uploaded fine. I verify the file in bucket.
After uploading the file, in the same method, I try to update the ACL for the file to make it publicly accessible:
#Block 2
blob_file = storage.Blob(bucket=bucket20, name=path_in_bucket)
acl = blob_file.acl
acl.all().grant_read()
acl.save()
This does not make the file public.
The strange thing is that, after I run the above upload method, if I run the #Block 2 code separately in a Jupyter notebook, it works fine and the file becomes publicly available.
I have tried:
Checking the existence of the blob in the bucket after the upload code.
Introducing a 5-second delay after the upload.
Any help is appreciated.
If you are making the file uploaded via upload_from_filename() public, you can reuse the blob from your upload. Also, reload the ACL before changing the permission. This was all done in one block in a Jupyter notebook using GCP AI Platform.
# Block 1
bucket_name = "your-bucket"
destination_blob_name = "test.txt"
source_file_name = "/home/jupyter/test.txt"
storage_client = storage.Client()
bucket = storage_client.bucket(bucket_name)
blob = bucket.blob(destination_blob_name)
blob.upload_from_filename(source_file_name)
print(blob)  # prints the bucket and the file uploaded
blob.acl.reload() # reload the ACL of the blob
acl = blob.acl
acl.all().grant_read()
acl.save()
for entry in acl:
    print("{}: {}".format(entry["role"], entry["entity"]))

AzCopy - how to specify metadata when copying a file to a blob storage

I'm trying to upload a file to an Azure Blob storage using AzCopy, but I want to include metadata.
According to the documentation, "azcopy copy" has a --metadata parameter where I have to provide key/value pairs as a string.
How does this string have to be formatted? I can't get it to work and can't find any examples.
AzCopy.exe copy .\testfile2.txt "https://storageaccount.blob.core.windows.net/upload/testfile4.txt?sastoken" --metadata ?what_here?
Thanks!
Documentation:
https://learn.microsoft.com/en-us/azure/storage/common/storage-ref-azcopy-copy#options
The string should be in this format: --metadata "name=ivan".
If you want to set multiple metadata pairs, use this format: --metadata "name=ivan;city=tokyo"
This is the command I'm using (azcopy version 10.3.4):
azcopy copy "file_path" "https://xxx.blob.core.windows.net/test1/aaa1.txt?sasToken" --metadata "name=ivan"
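As a sanity check, the --metadata string is just semicolon-separated key=value pairs. A small hypothetical Python helper (not part of AzCopy, just an illustration) can build it from a dict:

```python
def azcopy_metadata(pairs):
    """Join key/value pairs into the semicolon-separated string
    that AzCopy's --metadata option expects, e.g. "name=ivan;city=tokyo"."""
    return ";".join(f"{key}={value}" for key, value in pairs.items())

print(azcopy_metadata({"name": "ivan", "city": "tokyo"}))
# → name=ivan;city=tokyo
```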

GCloud Functions: load error: File ./dist/index.js that is expected to define function doesn't exist

When trying to deploy to google cloud functions I get the response
load error: File ./dist/index.js that is expected to define function doesn't exist
Why is that?
It didn't upload my dist folder, because that folder was gitignored. In .gcloudignore I removed the #!include:.gitignore line, so gitignored files are no longer excluded from the upload.
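For reference, a minimal .gcloudignore might look like the sketch below (your entries will differ). With the #!include:.gitignore line removed, gitignored build output such as dist/ is uploaded on deploy:

```
# .gcloudignore (sketch)
.gcloudignore
.git
.gitignore
node_modules
# Removed the following line so gitignored files like dist/ get deployed:
# #!include:.gitignore
```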

Deleting all blobs inside a path prefix using google cloud storage API

I am using the Google Cloud Storage Python API. I came across a situation where I need to delete a folder that might have hundreds of files using the API. Is there an efficient way to do it without making recursive and multiple delete calls?
One solution that I have is to list all blob objects in the bucket with given path prefix and delete them one by one.
The other solution is to use gsutil:
$ gsutil rm -R gs://bucket/path
Try something like this:
bucket = storage.Client().bucket(bucket_name)
for blob in bucket.list_blobs():
    # Blob names have no leading slash, e.g. 'path/file.png'
    if blob.name.startswith('path/'):
        blob.delete()
And if you want to delete the contents of a bucket instead of a folder within a bucket you can do it in a single method call as such:
bucket = storage.Client().bucket(bucket_name)
bucket.delete_blobs(bucket.list_blobs())
from google.cloud import storage

def deleteStorageFolder(bucketName, folder):
    """
    This function deletes from GCP Storage
    :param bucketName: The bucket name in which the file is to be placed
    :param folder: Folder name to be deleted
    :return: returns nothing
    """
    cloudStorageClient = storage.Client()
    bucket = cloudStorageClient.bucket(bucketName)
    try:
        bucket.delete_blobs(blobs=list(bucket.list_blobs(prefix=folder)))
    except Exception as e:
        print(str(e))
In this case folder = "path"

mount S3 to databricks

I'm trying to understand how mount works. I have an S3 bucket named myB, and a folder in it called test. I did a mount using
var AwsBucketName = "myB"
val MountName = "myB"
My question is: does it create a link between S3 myB and Databricks, and would Databricks access all the files, including the files under the test folder? (Or, if I do a mount using var AwsBucketName = "myB/test", does it link Databricks only to that test folder and not to any other files outside of it?)
If so, how do I, say, list the files in the test folder, read a file, or count() a CSV file in Scala? I did a display(dbutils.fs.ls("/mnt/myB")) and it only shows the test folder, not the files in it. I'm quite new here. Many thanks for your help!
From the Databricks documentation:
// Replace with your values
val AccessKey = "YOUR_ACCESS_KEY"
// Encode the Secret Key as that can contain "/"
val SecretKey = "YOUR_SECRET_KEY".replace("/", "%2F")
val AwsBucketName = "MY_BUCKET"
val MountName = "MOUNT_NAME"
dbutils.fs.mount(s"s3a://$AccessKey:$SecretKey@$AwsBucketName", s"/mnt/$MountName")
display(dbutils.fs.ls(s"/mnt/$MountName"))
If you are unable to see files in your mounted directory, it is possible that you have created a directory under /mnt that is not a link to the S3 bucket. If that is the case, try deleting the directory (dbutils.fs.rm) and remounting using the code sample above. Note that you will need your AWS credentials (AccessKey and SecretKey above). If you don't know them, you will need to ask your AWS account admin for them.
It only lists the folders and files directly under the bucket.
In S3
<bucket-name>/<Files & Folders>
In Databricks
/mnt/<MOUNT-NAME>/<Bucket-Data-List>
Just like below (output of dbutils.fs.ls(s"/mnt/$MountName")):
dbfs:/mnt/<MOUNT-NAME>/Folder/
dbfs:/mnt/<MOUNT-NAME>/file1.csv
dbfs:/mnt/<MOUNT-NAME>/file2.csv
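The path mapping described above can be sketched as a tiny hypothetical helper (the mount name and object keys are illustrative, not part of any Databricks API):

```python
def dbfs_path(mount_name, s3_key):
    """Map an S3 object key to the corresponding DBFS path under a mount,
    mirroring the layout shown above: /mnt/<MOUNT-NAME>/<object key>."""
    return f"dbfs:/mnt/{mount_name}/{s3_key}"

print(dbfs_path("myB", "test/file1.csv"))
# → dbfs:/mnt/myB/test/file1.csv
```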