How do you use the storage service in Bluemix? - ibm-cloud

I'm trying to store some data in Bluemix. I searched many wiki pages but couldn't work out how to proceed. Can anyone tell me how to store images and files in Bluemix storage through code in any language (Java, Node.js)?

You have several options at your disposal for storing files for your app. None of them involve the app container's file system, because that space is ephemeral and is recreated from the droplet each time a new instance of your app is created.
You can use services like MongoLab, Cloudant, Object Storage, and Redis to store all kinds of blob data.
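For example, a minimal sketch of storing an image as an attachment with the python-cloudant client (the database name, file name, and credential placeholders below are made up; copy real credentials from the service's entry in the Bluemix dashboard):
from cloudant.client import Cloudant

# Sketch only: all credential values are placeholders.
client = Cloudant('<username>', '<password>', url='<cloudant_url>', connect=True)
db = client['images']   # assumes an 'images' database already exists
doc = db.create_document({'_id': 'logo', 'type': 'image'})
with open('logo.png', 'rb') as f:
    # store the binary data as an attachment on the document
    doc.put_attachment('logo.png', 'image/png', f.read())
client.disconnect()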

Assuming that you're using Bluemix to write a Cloud Foundry application, another option is sshfs. At your app's startup time, you can use sshfs to create a connection to a remote server that is mounted as a local directory. For example, you could create a ./data directory that points to a remote SSH server and provides a persistent storage location for your app.
Here is a blog post explaining how this strategy works and a source repo showing it used to host a Wordpress blog in a Cloud Foundry app.
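A minimal sketch of that startup step, assuming the sshfs binary is available in the container and using a hypothetical remote host and path:
import os
import subprocess

# Sketch only: the remote location and mount point are placeholders.
MOUNT_POINT = './data'
REMOTE = 'user@example.com:/srv/appdata'
os.makedirs(MOUNT_POINT, exist_ok=True)
# -o reconnect keeps the mount alive across transient network drops
subprocess.run(['sshfs', '-o', 'reconnect', REMOTE, MOUNT_POINT], check=True)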

Note that as others have suggested, there are a number of services for storing object data. Go to the Bluemix Catalog [1] and select "Data Management" in the left hand margin. Each of those services should have sufficient documentation to get you started, including many sample applications and tutorials. Just click on a service tile, and then click on the "View Docs" button to find the relevant documentation.
[1] https://console.ng.bluemix.net/?ace_base=true/#/store/cloudOEPaneId=store

Check out https://www.ng.bluemix.net/docs/#services/ObjectStorageV2/index.html#gettingstarted. The storage service in Bluemix is OpenStack Swift running on SoftLayer. Check out this page (http://docs.openstack.org/developer/swift/) for docs on Swift.
Here is a page that lists some clients for Swift.
https://wiki.openstack.org/wiki/SDKs
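As a rough sketch with the python-swiftclient library (the auth URL, credentials, region, and container name are all placeholders taken from the Object Storage service credentials in Bluemix):
import swiftclient

# Sketch only: every value below is a placeholder from the service credentials.
conn = swiftclient.Connection(
    authurl='<auth_url>/v3', user='<user_id>', key='<password>', auth_version='3',
    os_options={'project_id': '<project_id>', 'region_name': '<region>'})
conn.put_container('images')
with open('photo.jpg', 'rb') as f:
    conn.put_object('images', 'photo.jpg', contents=f, content_type='image/jpeg')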

When I searched earlier, there was a service named Object Storage, also created by IBM, but at the moment I can't see it in the Bluemix catalog. I guess they withdrew it and will publish a new service in the future.

Be aware that the object store in Bluemix is now S3 compatible, so you can, for instance, use Boto or boto3 (for the Python folks); it is 100% API compatible.
See some examples here: https://ibm-public-cos.github.io/crs-docs/crs-python.html
This script lists all objects in all buckets recursively:
import boto3

# Credentials are picked up from the usual boto3 configuration
# (environment variables, ~/.aws/credentials, ...).
endpoint = 'https://s3-api.us-geo.objectstorage.softlayer.net'
s3 = boto3.resource('s3', endpoint_url=endpoint)
for bucket in s3.buckets.all():
    print(bucket.name)
    for obj in bucket.objects.all():
        print(" - %s" % obj.key)
If you want to specify your credentials explicitly, this would be:
import boto3

endpoint = 'https://s3-api.us-geo.objectstorage.softlayer.net'
s3 = boto3.resource('s3', endpoint_url=endpoint, use_ssl=True,
                    aws_access_key_id='<access key generated on your Bluemix dashboard>',
                    aws_secret_access_key='<the secret key that comes with your access key>')
for bucket in s3.buckets.all():
    print(bucket.name)
    for obj in bucket.objects.all():
        print(" - %s" % obj.key)
If you want to create a "hello.txt" file in a new bucket:
import boto3

endpoint = 'https://s3-api.us-geo.objectstorage.softlayer.net'
s3 = boto3.resource('s3', endpoint_url=endpoint, use_ssl=True,
                    aws_access_key_id='<access key generated on your Bluemix dashboard>',
                    aws_secret_access_key='<the secret key that comes with your access key>')
my_bucket = s3.create_bucket(Bucket='my-new-bucket')
s3.Object(my_bucket.name, 'hello.txt').put(Body=b"I'm a test file")
If you want to upload a file to a new bucket:
import time
import boto3

endpoint = 'https://s3-api.us-geo.objectstorage.softlayer.net'
s3 = boto3.resource('s3', endpoint_url=endpoint, use_ssl=True,
                    aws_access_key_id='<access key generated on your Bluemix dashboard>',
                    aws_secret_access_key='<the secret key that comes with your access key>')
my_bucket = s3.create_bucket(Bucket='my-new-bucket')
timestampstr = str(int(time.time()))  # the original snippet assumed a `timestamp` value was already defined
s3.Bucket(my_bucket.name).upload_file(
    '<location of your file>', '<your file name>',
    ExtraArgs={"ACL": "public-read",
               "Metadata": {"METADATA1": "resultat", "METADATA2": "1000",
                            "gid": "blabala000", "timestamp": timestampstr}})
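To read the object back or hand out a temporary link, the same boto3 resource can be reused (a sketch; whether presigned URLs are honored depends on the S3-compatible endpoint):
# Sketch: download the object again and create a time-limited URL for it.
s3.Bucket('my-new-bucket').download_file('hello.txt', '/tmp/hello.txt')
url = s3.meta.client.generate_presigned_url(
    'get_object', Params={'Bucket': 'my-new-bucket', 'Key': 'hello.txt'}, ExpiresIn=3600)
print(url)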

Related

Producing a CSV of Cloud Bucket files

What's the best way to create a CSV file listing images in a Google Cloud bucket to be imported into AutoML Vision?
If you want to list the files that are saved in a bucket, you can use a Google Cloud Function that listens for new files and creates the CSV file in another bucket.
For example, you can use this Python code as a starting point; it logs the details of a newly uploaded file:
def hello_gcs_generic(data, context):
    """Background Cloud Function to be triggered by Cloud Storage.
    This generic function logs relevant data when a file is changed.

    Args:
        data (dict): The Cloud Functions event payload.
        context (google.cloud.functions.Context): Metadata of triggering event.
    Returns:
        None; the output is written to Stackdriver Logging
    """
    print('Event ID: {}'.format(context.event_id))
    print('Event type: {}'.format(context.event_type))
    print('Bucket: {}'.format(data['bucket']))
    print('File: {}'.format(data['name']))
    print('Metageneration: {}'.format(data['metageneration']))
    print('Created: {}'.format(data['timeCreated']))
    print('Updated: {}'.format(data['updated']))
Basically, the function listens for the storage event "google.storage.object.finalize" (this happens when a file is uploaded).
To deploy this function to the cloud you can use this command:
gcloud functions deploy hello_gcs_generic --runtime python37 --trigger-resource [your bucket name] --trigger-event google.storage.object.finalize
Or you can deploy it from the GCP console (web UI):
select "Cloud Storage" in the trigger field,
select "Finalize/create" as the event type,
and specify your bucket.
You can even process the files directly with AutoML from within a Cloud Function, as mentioned in this example.
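For the CSV itself, a possible sketch (bucket and object names below are made up) is to list the source bucket inside the function and write the result to a second bucket with the google-cloud-storage client, so the write doesn't re-trigger the function:
from google.cloud import storage

OUTPUT_BUCKET = 'my-listing-bucket'   # placeholder: a bucket different from the source

def write_bucket_csv(data, context):
    """Triggered on each upload: rewrite a CSV listing of the source bucket."""
    client = storage.Client()
    rows = ['name,size,updated']
    for blob in client.list_blobs(data['bucket']):
        rows.append('{},{},{}'.format(blob.name, blob.size, blob.updated))
    out_blob = client.bucket(OUTPUT_BUCKET).blob('listing.csv')
    out_blob.upload_from_string('\n'.join(rows), content_type='text/csv')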

Load model from Google Cloud Storage without downloading

Is there a way to serve a model from Google Cloud Storage without actually downloading a copy of the model, like streaming the data directly?
I'm trying to load a fastText model that is hosted on Google Cloud Storage. Every time I run the program, it has to fetch and download a copy of that model from the bucket.
language_model_filename = 'lid.176.bin'  # filename in GCS
language_model_local = 'lid.176.bin'     # local file name when downloaded
bucket = storage_client.get_bucket(CLOUD_STORAGE_BUCKET)
blob = bucket.blob(language_model_filename)
blob.download_to_filename(language_model_local)
language_model = FastText.load_model(language_model_local)
You can use Streaming Transfers for that purpose. As explained in the documentation, you can use the third-party boto client library plugin for Cloud Storage.
A streaming download example would look like this:
import sys
import boto

downloaded_file = 'saved_data_file'
MY_BUCKET = 'my_app_bucket'
object_name = 'data_file'
src_uri = boto.storage_uri(MY_BUCKET + '/' + object_name, 'gs')
src_uri.get_key().get_file(sys.stdout)
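If the goal is only to avoid writing to local disk, another option is to stream the object into an in-memory buffer with the google-cloud-storage client (a sketch; note that whether the model loader can consume a file-like object depends on the library, fasttext's load_model for example expects a file path):
import io
from google.cloud import storage

# Sketch: stream a GCS object into memory instead of onto the local disk.
client = storage.Client()
blob = client.get_bucket('my_app_bucket').blob('lid.176.bin')   # placeholder names
buffer = io.BytesIO()
blob.download_to_file(buffer)   # streams the object into the buffer
buffer.seek(0)
# buffer.read() now returns the model bytes without a local temporary file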

How to share information across notebooks in a DSX project

Is it possible to share information (such as credentials) across multiple notebooks in a DSX project, e.g. with environment variables?
For example a Cloud Foundry application in Bluemix has a control setting where environment variables can be defined, is there a similar concept for a DSX project (I couldn't see anything in the various project level settings).
Separate notebooks have separate runtimes in the background, and at the moment it is not possible to share credentials among notebooks by defining environment variables. But there are helper methods for the most common credential requirements in a project, known as the "Insert to code" method.
For example: if you have an object store associated with your project.
Select the "Data" tab in the top bar.
Add some file to the object store by browsing or simple drag-n-drop.
Insert credentials of that object store container in your notebook by selecting the "Insert credentials" option, right besides your file in the right hand side panel.
You can then directly insert those credential (Step 3) in any other notebook in that project.
Besides "Insert to code" there are other helper functions like "Insert SparkR dataframe", "Pandas dataframe" etc. to speed up the analytics process of data scientists. Hope that was a bit helpful.
FYI - I've added a feature request on UserVoice to allow Bluemix services to be bound to a project so that the credentials can be accessed the same way a Bluemix application accesses credentials. Please vote if you think this would be useful.
Currently, one pattern I use quite a lot is to create a notebook in my project that is used to save credentials to a file on DSX:
! echo '{ "username": "xxxx", "password": "xxxx", ... }' > cloudant_creds.json
That file is now available to all of your notebooks in the project. NOTE: the file is saved on the Spark service file system; if you use the same Spark service in other DSX projects, they will also be able to access the file.
The credentials for Cloudant normally include other fields such as host; I haven't shown those fields here to keep the example simple, and have indicated there are more fields with the "...". I normally copy this JSON from the Bluemix service credentials field.
In your other notebooks, you would read the credentials something like this:
import json

with open('cloudant_creds.json') as data_file:
    sourceDB = json.load(data_file)
You can then refer to the credentials like this (json.load returns a dict, so use key access):
dfReader = sqlContext.read.format("com.cloudant.spark")
dfReader.option("cloudant.host", sourceDB['host'])
if sourceDB.get('username'):
    dfReader.option("cloudant.username", sourceDB['username'])
if sourceDB.get('password'):
    dfReader.option("cloudant.password", sourceDB['password'])
df = dfReader.load(sourceDB['database']).cache()

Node-RED on Bluemix: Where are the flows?

Normally, Node-RED flows are stored somewhere in the filesystem, in a file named flows_XXX.json.
When running Node-RED on Bluemix where are they stored?
This could be important if your Node-RED instance doesn't start anymore.
A Node-RED instance on Bluemix, when created from the Node-RED boilerplate, always comes with a connected Cloudant database service.
Open the Cloudant dashboard
Open the database nodered
Open the document <app_name>/flow (Use the edit icon to open it)
You can now copy all the flows from this Node-RED instance.
Simply remove this part from the beginning:
{
  "_id": "HUe-IoT-RED/flow",
  "_rev": "6-3813d11089aa3e3adb9e704d4251bcdd",
  "flow":
and the trailing }
Everything between the [ ] is the flows array; it can be imported into another Node-RED instance.
More info is on the Node-RED website and in the Node-RED GitHub repo.
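If you prefer to script the extraction, a small sketch (the Cloudant URL, credentials, and app name are placeholders; the document id follows the <app_name>/flow pattern, with the slash URL-encoded):
import json
import requests

# Sketch only: fetch the flow document from the bound Cloudant database
# and save just the flows, ready to import into another Node-RED instance.
CLOUDANT_URL = 'https://<account>.cloudant.com'
DOC_ID = '<app_name>%2Fflow'   # '/' must be URL-encoded as %2F
resp = requests.get('{}/nodered/{}'.format(CLOUDANT_URL, DOC_ID),
                    auth=('<username>', '<password>'))
resp.raise_for_status()
with open('flows_export.json', 'w') as f:
    json.dump(resp.json()['flow'], f, indent=2)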
For the boilerplate install, all data, including flows, is persisted to the bound Cloudant database.
Details can be found in the node-red-bluemix repo - https://github.com/node-red/node-red-bluemix
As described by Harald in a previous answer, once you create an instance of the Node-RED boilerplate it is bound to a Cloudant NoSQL instance for data instead of the classic JSON file: this is because a file on the filesystem would be reset as soon as your application restarts, while a database service persists.
So if you wish to retrieve your application flows once the app is no longer able to start, you have to access the Cloudant NoSQL dashboard and extract the data locally.
Generally, when the Node-RED instance doesn't start anymore (after something changed, etc.), you can re-push the starter code onto your old, broken application. The app is then reset as if new, but you don't lose the flows, because they are stored in the Cloudant DB.

Publicly Shared files on Google Cloud Storage Bucket

I am using Google App Engine PHP SDK.
Google Cloud Storage allows users to check a "publicly shared?" field in the storage manager that lets you share a URL to the data directly.
I'm using Google App Engine and sending data to the storage, but I would like to have it publicly shared by default.
This is the code I am using to upload files:
require_once 'google/appengine/api/cloud_storage/CloudStorageTools.php';
use google\appengine\api\cloud_storage\CloudStorageTools;

$options = ['gs_bucket_name' => 'my_bucket'];
$upload_url = CloudStorageTools::createUploadUrl('/test.php', $options);
$gs_name = $_FILES['sample']['tmp_name'];
// the destination needs an object name, not just the bucket
move_uploaded_file($gs_name, 'gs://test_sample/' . $_FILES['sample']['name']);
How can I do this? Their docs do not seem to mention anything about this, except doing it manually.
You can define the permissions that your newly uploaded files will have by default with the command:
gsutil defacl ch -u allUsers:R gs://<bucket>
After you upload your files with your code, they should then be publicly shared.
Visit the following links for more information about the command:
https://cloud.google.com/storage/docs/gsutil/commands/defacl
https://cloud.google.com/storage/docs/gsutil/commands/acl
Hope it helps.
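If you ever need to flip a single object to public from code rather than through the default ACL, here is a sketch with the google-cloud-storage Python client (bucket and object names are placeholders, and the bucket must still use fine-grained, ACL-based access):
from google.cloud import storage

# Sketch: mark one uploaded object as publicly readable.
client = storage.Client()
blob = client.bucket('my_bucket').blob('test.png')
blob.make_public()        # grants allUsers read access on this object
print(blob.public_url)    # shareable URL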