I am looking for a solution similar to Amazon S3 or Azure Blob Storage that can be hosted internally instead of remotely. I don't necessarily need to scale out, but I'd like to create a central location where my growing stable of apps can take advantage of file storage. I would also like to formalize file access. Does anybody know of anything like the two services I mentioned above?
I could write this myself, but if something exists then I'd rather not reinvent the wheel, unless that wheel has corners :)
The only real alternative to services like S3 and Azure blobs I've seen is Swift, though if you don't plan to scale out this may be overkill for your specific scenario.
The OpenStack Object Store project, known as Swift, offers cloud storage software so that you can store and retrieve lots of data in virtual containers. It's based on the Cloud Files offering from Rackspace.
The OpenStack Object Storage API is implemented as a set of RESTful (Representational State Transfer) web services. All authentication and container/object operations can be performed with standard HTTP calls.
http://docs.openstack.org/developer/swift/
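To give a feel for what "standard HTTP calls" means here, this is a minimal Python sketch that builds (but does not send) the PUT request that stores one object in a Swift container. The host name, account, container, object name and token are made-up placeholders, and build_swift_put is a hypothetical helper, not part of any Swift client library. In a real deployment the storage URL and token come back from the authentication service.

```python
import urllib.parse
import urllib.request


def build_swift_put(storage_url: str, token: str, container: str,
                    object_name: str, data: bytes) -> urllib.request.Request:
    """Build (but do not send) a PUT request that stores one object."""
    # Object URLs have the form <storage_url>/<container>/<object>.
    path = "/".join([
        urllib.parse.quote(container, safe=""),
        urllib.parse.quote(object_name),  # keeps "/" so pseudo-folders work
    ])
    url = f"{storage_url.rstrip('/')}/{path}"
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={
            "X-Auth-Token": token,  # token obtained from the auth service
            "Content-Type": "application/octet-stream",
        },
    )


# Placeholder endpoint and token, for illustration only.
req = build_swift_put(
    "http://swift.example.internal:8080/v1/AUTH_demo",
    "example-token",
    "app-files",
    "reports/q1.pdf",
    b"example file contents",
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would actually perform the upload. Retrieval is a GET on the same URL with the same token header.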
I have a PHP app running on several instances of Google Compute Engine (GCE). The app allows users to upload images of various sizes, resizes the images, and then stores the resized images (and their thumbnails) on the storage disk and their metadata in the database.
What I've been trying to find is a method for storing the images in Google Cloud Storage (GCS) from the PHP app running on the GCE instances. A similar question was asked here but no clear answer was given there. Any hints or guidance on the best way to achieve this would be highly appreciated.
You have several options, all with pros and cons.
Your first decision is how users upload data to your service. You might choose to have customers upload their initial data to Google Cloud Storage, where your app would then fetch it and transform it, or you could choose to have them upload it directly to your service. Let's assume you choose the second option, and you want users to stream data directly to your service.
Your service then transforms the data into a different size. Great. You now have a new file. If this was video, you might care about streaming the data to Google Cloud Storage as you encode it, but for images, let's assume you want to process the whole thing locally and then store it in GCS afterwards.
Now we have to get a file into GCS. It's a PHP app, and so as you have identified, your main three options are:
Invoke the GCS JSON API through the Google API PHP client.
Invoke either the GCS XML or JSON API via custom code.
Use gsutil.
Using gsutil will be the easiest solution here. On GCE, it automatically picks up appropriate credentials for your service account, and it's got several useful performance optimizations and tuning that a raw use of the API might not do without extra work (for example, multithreaded uploads). Plus it's already installed on your GCE instances.
The upside of the PHP API is that it's in-process and offers more fine-grained, programmatic control. As your logic gets more complicated, you may eventually prefer this approach. Getting it to perform as well as gsutil may take some extra work, though.
This choice is comparable to copying files via SCP with the "scp" command line application or by using the libssh2 library.
tl;dr: Using gsutil is a good idea unless you need to handle interactions with GCS more directly.
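As a rough illustration of the gsutil route, here is a small sketch of shelling out to "gsutil cp" once the resized image is on local disk. It is written in Python to keep it short; the same command could be run from PHP with exec(). The bucket name, paths and the helper names gsutil_cp_command and upload_with_gsutil are placeholders I made up for the example.

```python
import subprocess
from typing import List


def gsutil_cp_command(local_path: str, bucket: str, object_name: str) -> List[str]:
    """Return the argument list for copying one local file into a bucket."""
    return ["gsutil", "cp", local_path, f"gs://{bucket}/{object_name}"]


def upload_with_gsutil(local_path: str, bucket: str, object_name: str) -> None:
    """Run gsutil and raise CalledProcessError if the copy fails."""
    # Passing a list (no shell) avoids quoting problems with odd file names.
    subprocess.run(gsutil_cp_command(local_path, bucket, object_name), check=True)


# Placeholder names; this only prints the command rather than running it.
print(" ".join(gsutil_cp_command("/tmp/resized/cat.jpg", "my-image-bucket", "thumbs/cat.jpg")))
```

When copying many files at once, the top-level -m flag ("gsutil -m cp ...") enables the parallel transfers mentioned above.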
Ok first of all I love Azure and table storage.
We're starting a new greenfield project which will be hosted as a SaaS model in the cloud. Azure Table storage is ideal for what we need but one thing stopping us from taking this route is the possibility of someone having to have the application deployed to their local web server rather than a cloud deployment.
This is something I'd rather avoid personally, but unfortunately some people insist that their local setup is more secure than any data centre out there.
What I'd really like to know is whether someone has created a local implementation of Azure Table Storage. I know Microsoft has the emulator, which in theory could be used (it stores the data in SQL, which may be slow).
Anyone used the emulator for an internal deployment?
I'm happy to look at creating a wrapper for Azure Table Storage using their REST APIs but didn't want to do something that's already been done.
Alternatively, can anyone recommend an alternative? I know there's RavenDB and MongoDB, which also look good, but I've had no exposure to how well they hold up under load or when to scale them out.
The emulator is designed to simplify testing - it is definitely not intended to be used as part of a production deployment.
Is it possible to embrace both a cloud only (Azure Web role and Storage) and a hybrid design whereby your application can be hosted within your web server yet still use Azure Storage?
Jason
I'm a Cocoa programmer, but right now I've run into a situation where I can't go any further without smarter people :)
I've always used small databases in my applications. I programmed a PHP backend on my own server and it worked well.
Right now I have to switch to something much bigger, and I decided to try Google App Engine because it is relatively cheap and has great scalability.
I'm so confused by the documentation and I really don't know where to start.
My new app will store data (images, videos) as well as a database (MySQL) in Google Cloud.
I concluded that for an app like that I should use:
Google Cloud Storage for images / videos etc.
Google Cloud SQL for CRUD operations for users (inserting and fetching personal data)
I would prefer to use the JSON API. Then I don't have to write any Java, Python or Go code, right? Only REST requests for Google Cloud SQL...
My question is : Am I thinking correctly? Should I use these two services?
Google App Engine has a feature called "Cloud Endpoints" (Java | Python) that automatically generates a JSON API similar to the APIs that Google provides for its own services. It also generates client libraries in JavaScript, Obj-C, and Java to invoke those APIs. This saves you the trouble of writing the REST API yourself and manually serializing/deserializing requests, so you can focus on the business logic that performs the storage and retrieval operations. So, what I would suggest is that you write the code that reads/writes data into the datastore (and Cloud Storage), but then use Cloud Endpoints to automatically generate your JSON API and client libraries, rather than manually writing that code.
Your plan seems fine so far. Google Cloud Storage is a great choice for storing a large number of images and movies, and Google Cloud SQL is a great choice for handling smaller, more relational data.
If you're using PHP from App Engine, there's built-in support for Google Cloud Storage. See https://developers.google.com/appengine/docs/php/googlestorage/
If you're using PHP from your app that lives somewhere else, you could write to the Google Cloud Storage JSON or XML APIs directly, but there's also a PHP library for the Google APIs that might be easier for you to use: https://code.google.com/p/google-api-php-client/
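For a sense of what writing to the JSON API directly involves, here is a minimal sketch (in Python for brevity) that builds, without sending, a simple media upload request. It assumes the current endpoint form under storage.googleapis.com and an OAuth 2.0 access token you have already obtained; the bucket, object name and token below are placeholders, and build_gcs_upload is a hypothetical helper rather than part of any Google library.

```python
import urllib.parse
import urllib.request


def build_gcs_upload(bucket: str, object_name: str, data: bytes,
                     access_token: str, content_type: str) -> urllib.request.Request:
    """Build (but do not send) a JSON API simple upload request."""
    # The object name travels in the query string and must be URL-encoded.
    query = urllib.parse.urlencode(
        {"uploadType": "media", "name": object_name},
        quote_via=urllib.parse.quote,
    )
    url = ("https://storage.googleapis.com/upload/storage/v1/b/"
           f"{urllib.parse.quote(bucket, safe='')}/o?{query}")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",  # assumed pre-obtained token
            "Content-Type": content_type,
        },
    )


# Placeholder values, for illustration only.
gcs_req = build_gcs_upload("my-bucket", "photos/cat.jpg", b"example bytes",
                           "example-access-token", "image/jpeg")
print(gcs_req.get_method(), gcs_req.full_url)
```

The client library mentioned above wraps this kind of call and also deals with obtaining and refreshing the token, which is most of the real work.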
I'm trying to create a back-end through which many users can communicate with each other in an iPhone app I'm creating. I've tried working with Core Data, Google App Engine, Google Cloud Storage, and Amazon Web Services (RDS & Elastic Beanstalk). Unfortunately, after weeks of trying to get any of this working, none of it will!
I've been trying to get in touch with someone who would know how startups (when they were little) like Instagram, Path, and Pinterest have managed to do this. But everyone out there seems to despise this stuff as much as I'm growing to...
I would love for someone to simply map out EXACTLY how I need to create a back-end database that I can save and query data to and from that many users can see. That means that just SQLite, Core Data, or Parse by itself isn't going to work here!
A tutorial of some kind would be incredible.
First off, technologies like CoreData and sqlite are typically local device storage. Local device storage is not going to get you shared cloud storage.
Parse.com is a fast way for devices to access cloud storage and get going fast. Especially useful for games and other mobile apps to access cloud data via an app id and app key. It's simple storage to avoid creating your own backend if it fills all your needs and requirements.
When you get to a multi-tenant cloud backend where you roll your own services and have multiple devices accessing your cloud application, you need to look into exposing your own web API. Exposing a RESTful API over HTTP is great for devices and web clients. Exposing the data as JSON is especially convenient for the web and easily consumed by devices.
Those web service endpoints in the cloud access some sort of backend storage that is optimized for concurrent access by multiple clients. This is typically a SQL backend like MySQL, SQL Server, etc., or a NoSQL solution like MongoDB, CouchDB, etc.
Some front end web api technologies to look into:
ASP.net web api
Ruby on Rails
Node.js
etc...
Some back end storage technologies to look into:
SQL: MySQL, SQLServer/Azure SQL, Oracle
NoSQL: MongoDB, CouchDB, Amazon S3 simple storage, etc...
If the data is used by many, many multi-tenant clients, the backends can be scaled up (larger and larger machines) or sharded. Sharding is where the data for multiple users is split across many databases or datastores, with some sort of lookup algorithm for requests to find where that user's data is stored. The front-end web API servers abstract the backend storage.
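One simple form that lookup can take is hashing the user ID and taking it modulo the number of shards. This is only a sketch; the shard names and the shard_for_user helper are invented for the example, and real systems often use a lookup table or consistent hashing instead.

```python
import hashlib
from typing import Sequence


def shard_for_user(user_id: str, shards: Sequence[str]) -> str:
    """Map a user ID to one shard, the same one on every call."""
    # A stable hash is required: Python's built-in hash() varies between runs.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Hypothetical database names.
SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]

for uid in ("alice", "bob", "carol"):
    print(uid, "->", shard_for_user(uid, SHARDS))
```

The catch with plain modulo hashing is that changing the number of shards moves most users to a different shard, which is why a lookup table or consistent hashing is usually preferred once you expect to add shards.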
Finally, you'll end up needing some sort of caching/fast lookup technology (if you're successful :):
Redis: fast in-memory storage over sockets
memcached: used by Facebook; simple key-value in-memory caching across many front-end servers.
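The usual way these caches get used is the "cache-aside" pattern: check the cache first, fall back to the database on a miss, then store the result for next time. Below is a minimal sketch that uses an in-process dictionary as a stand-in for Redis or memcached; TTLCache, fetch_profile and the fake database function are all made up for illustration.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-process stand-in for Redis/memcached: key -> (expiry time, value)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._items[key] = (self._clock() + self._ttl, value)


def fetch_profile(user_id: str, cache: TTLCache,
                  load_from_db: Callable[[str], Any]) -> Any:
    """Cache-aside read: try the cache, fall back to the database on a miss."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    profile = load_from_db(user_id)  # the slow path
    cache.set(user_id, profile)
    return profile


def fake_db_lookup(user_id: str) -> Dict[str, str]:
    """Hypothetical database call, here just building a record."""
    return {"id": user_id, "name": user_id.title()}


cache = TTLCache(ttl_seconds=60)
print(fetch_profile("alice", cache, fake_db_lookup))  # miss: loads from the "database"
print(fetch_profile("alice", cache, fake_db_lookup))  # hit: served from the cache
```

With a real Redis or memcached server the structure stays the same; only the get and set calls go over the network, so every front-end server shares one cache.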
Your question is an open-ended, broad question, so start by googling many of these terms and technologies.
Each of these links will have resources and tutorials. Get a cloud VM, play with each and decide which fits your needs best. There is no one size fits all solution.
I am looking for an API that performs functionality roughly analogous to Rackspace Cloud Files / OpenStack Swift, Microsoft Azure Blob Storage, or Amazon S3 that can be run on a Windows Server.
I am not speaking of all the add-ons including replication, etc, but an API that enables a similar RESTful API for the storage/serving (including Anonymous). Some examples of functionality I like, and would be missing if I rolled my own right now, are:
Rackspace's Large Files support.
Amazon S3's Root Document support.
Microsoft Azure BLOB storage Page Blobs and Authentication.
Options like MongoDB's GridFS are getting close, but wouldn't quite cut it. RavenDB's "Attachments" functionality is pretty close, but I understand it only supports up to 2 GB via the ESENT storage engine.
Just to clarify, I'm not exactly sure what form this would take. I'm not looking for a pre-built product (which I don't see exists), but perhaps a stub of a project, an open source project planning to provide this functionality, people who might have developed their own similar solution in C#, etc.
We have RavenFS that handles that scenario, I think.
It is a commercial offering, though.