How to store and organize uploaded images on a web server? - server

I am writing a server that allows users to upload images. It appears that most people store those files directly on the filesystem.
My question is whether that really is the right way to do it. I'm not familiar with the capacities of a server, but I'm curious, for example, how to make sure that the server does not run out of disk space.
I would also like to know how to organize those files for many different users. Is it enough to store them as war/images/<user-database-id>/<uuid-for-image>.(jpeg|png), using the user ID from the database, or are there more things to consider when storing images?

I think your best bet would be to use a cloud storage system such as Amazon S3, Google Cloud Storage, Rackspace, or MS Azure.
A path like the one you suggested should work, but you could also omit the user-database-id if the database already records which objects each user owns.
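As a rough illustration of the path scheme from the question, here is a minimal Python sketch that builds a key of the form images/<user-id>/<uuid>.<ext>. The function name, the `images` root, and the extension whitelist are assumptions made for the example, not part of any particular storage service.

```python
import uuid
from pathlib import PurePosixPath

# Hypothetical whitelist; the question only mentions JPEG and PNG.
ALLOWED_EXTENSIONS = {"jpeg", "jpg", "png"}


def build_image_key(user_id: int, original_filename: str, root: str = "images") -> str:
    """Return a storage key like images/<user-id>/<uuid>.<ext>."""
    # Keep only the extension of the uploaded name; never reuse the name itself.
    extension = PurePosixPath(original_filename).suffix.lower().lstrip(".")
    if extension not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported image type: {extension!r}")
    # A random UUID avoids collisions and discards the user-supplied filename.
    return f"{root}/{user_id}/{uuid.uuid4().hex}.{extension}"
```

The same key works as a relative filesystem path or as an object name in a cloud bucket, so the scheme does not tie you to either option.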

Related

How can I have software access files in the cloud?

So I have a small company with plenty of documents and I want to set up an archiving system. I have several employees with different levels of permission to access the files on the server. This will serve as both an archive and a management system: employees can read and write files for a given project (depending on their permissions), and the admin can block access to certain directories (i.e. projects).
So after some research I think the best idea is to have a cloud-based NAS that a user can mount locally by providing the correct username and password. Software would then access these files (which are now local) and display some data (e.g. project progress, minutes of meetings), or the user could access the files directly.
Does any of this make sense? I mean, is that what a NAS can actually do, and can it be done in the cloud? Can users access the file system (with restrictions) given a username and password, much as if it were on a network? Is there a better alternative for my purposes?
To the best of my knowledge, I could instead write software that accesses the cloud directly, but how would users write files and have them stored in the cloud? Won't that be more complicated to implement? Can I use an RDBMS for it? I've used one before, but never for files.
If I understand your use case correctly, all you really want is access to different files for different roles within your company. Is this correct?
To the best of my knowledge, Google provides corporate accounts that are quite affordable and should have access control schemes suiting your needs (after all, storing files on scalable storage, with various access controls, in an offsite location and with redundancy is partly what the cloud is for).
If not, or if this solution isn't appealing and you would prefer to use your NAS, the best way to do this would be to use Google's Backup and Sync application (you can download it by clicking the cog icon in Drive and selecting it). If you install and run it on an admin computer that is always on and always has the NAS mounted, you can set a root folder on the NAS as your Drive sync folder. Any files added to this folder will be uploaded to Drive, and any files added to Drive will be downloaded automatically. After this, you can configure access control on the NAS using user accounts and roles, and have each employee mount the share using their own credentials, revealing only the files they have access to.

Which kind of Google Cloud Platform mobile backend client is appropriate?

THE PROBLEM
I'm writing a mobile app which will allow a user to log in, save some preferences that must be stored in a database, and display congressional bills to the user.
I've only written simple RESTful services with PHP and MySQL in the past. I'd like to take advantage of newer technologies, and am a little lost on general direction.
The bill data (formatted as JSON) can be gathered by running the scrapers found here. Using docker, I managed to set a working directory and download the files on my local machine.
I've designed a MySQL database for holding the relevant bill and user data.
I started to mess around in Google Cloud Platform, and read the doc that describes the different models. I'm thinking of a few different ideas, but I'm not familiar with GCP or what I can actually accomplish with it.
QUESTIONS
1) What are App Engine, Compute Engine, and Container Engine each for? I get the gist that Container Engine holds different instances of stuff you load up with docker, and that Compute Engine sets up a VM, but I don't really understand the relationships. How should I think of them?
2) When I run those scrapers from the shell, where are the files being stored, and how can I check on them? On my computer, I set a working directory, but how do directories work in GCP? Is it just a directory in the currently selected VM, or is this what Buckets are for?
IDEAS
1) Since my bill data already comes as JSON, should I skip the entire process of building a database for the bills and insert them into Firebase somehow? Is this even possible? If so, am I stuck using Firebase's NoSQL, or can I still set up a relational database?
2) I could schedule the scrapers to run periodically, detect new files, and run a script to parse the JSON and insert new bill data into a database (PostgreSQL? MySQL?). Then I would write an API.
3) Download the JSON files to a bucket, and write an API that reads from them. Not sure how the performance would compare to using a DB.
I'm open to other suggestions as well.
For your use case (a stateless web application), App Engine is probably your best choice. The Google documentation has several comparisons of your computing options.
You can use App Engine with PHP and cloud-hosted MySQL if you want, which could be a good way to get your toes wet without going in over your head.
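If you go with idea 2 from the question (parse the scraped JSON and insert new bills into a relational database), the loading step might look roughly like the sketch below. It is written in Python with the standard-library `sqlite3` module as a stand-in for MySQL, and the `bill_id` and `title` fields are hypothetical, since the actual shape of the scraper output isn't shown here.

```python
import json
import sqlite3
from typing import Iterable


def load_bills(conn: sqlite3.Connection, json_documents: Iterable[str]) -> int:
    """Insert bills that are not stored yet and return how many were added."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bills ("
        "bill_id TEXT PRIMARY KEY, title TEXT NOT NULL)"
    )
    added = 0
    for raw in json_documents:
        bill = json.loads(raw)
        # INSERT OR IGNORE skips rows whose bill_id already exists,
        # so re-running the job on old files does not create duplicates.
        cursor = conn.execute(
            "INSERT OR IGNORE INTO bills (bill_id, title) VALUES (?, ?)",
            (bill["bill_id"], bill["title"]),
        )
        added += cursor.rowcount
    conn.commit()
    return added
```

On MySQL the closest equivalent of `INSERT OR IGNORE` is `INSERT IGNORE`. Either way, a primary key on the bill identifier is what makes a periodic job safe to repeat.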

Suggestions about file storage in Amazon AWS

I'm developing an ASP.NET MVC project that will be hosted on Amazon AWS, but I have some questions about storing clients' files. Amazon's documentation is not clear to me, so I'm looking for some direction and experience here.
1 - Each client has a few files with low disk space requirements and low update frequency but very high access frequency (such as brand images, and even sensitive files such as certificates). Is it appropriate to store these files in the app_data folder on the web server?
2 - Most critical to me are the sensitive documents (from hundreds to tens of thousands per client, mostly signed XML files). These files have a medium read frequency but a very high creation rate. One solution I found is MongoDB, which gives me some freedom to manage the storage policy and makes external backups easy, but I'm not sure about it. Other options are to use Amazon storage and keep all these files and gigabytes there in a lot of folders, or to use a regular database and save the files as XML or binary.
My concerns are the amount of data, security, and reliability in case of disaster, as most of these documents have legal value.
You could, but storing them locally violates the shared-nothing architecture and would limit your scaling options. Amazon S3 is a good option here. You can make some files public and serve them directly from S3 (or through CloudFront), and keep others private and provide access via signed URLs.
Again, you can put the files on S3 and make them private. You will probably still store references to the files in your database. Generally it's not a great idea to store large blobs in a database, since databases are often not well optimized for accessing them.
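To show the idea behind signed URLs, here is a generic Python sketch using an HMAC over the path and an expiry time. This is not the signature scheme S3 itself uses; with S3 you would normally let the AWS SDK generate the presigned URL for you. The secret, the function names, and the query parameter names are all assumptions made for the illustration.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical secret; in practice load it from configuration, not source code.
SECRET_KEY = b"replace-with-a-real-secret"


def _signature(path: str, expires: int) -> str:
    # Sign both the path and the expiry so neither can be altered.
    message = f"{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(path: str, ttl_seconds: int, now=None) -> str:
    """Return the path with an expiry timestamp and signature appended."""
    expires = int(now if now is not None else time.time()) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"{path}?{query}"


def verify_url(url: str, now=None) -> bool:
    """Accept the URL only if it has not expired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    current = now if now is not None else time.time()
    if current > expires:
        return False
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(signature, _signature(parsed.path, expires))
```

The file itself stays private. Your application hands out a short-lived link only after it has checked that the requesting user is allowed to read that document.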

uploading images to php app on GCE and storing them onto GCS

I have a PHP app running on several instances of Google Compute Engine (GCE). The app allows users to upload images of various sizes, resizes the images, and then stores the resized images (and their thumbnails) on the storage disk and their metadata in the database.
What I've been trying to find is a method for storing the images in Google Cloud Storage (GCS) from the PHP app running on the GCE instances. A similar question was asked here, but no clear answer was given. Any hints or guidance on the best way to achieve this would be highly appreciated.
You have several options, all with pros and cons.
Your first decision is how users upload data to your service. You might choose to have customers upload their initial data to Google Cloud Storage, where your app would then fetch it and transform it, or you could choose to have them upload it directly to your service. Let's assume you choose the second option, and you want users to stream data directly to your service.
Your service then transforms the data into a different size. Great. You now have a new file. If this was video, you might care about streaming the data to Google Cloud Storage as you encode it, but for images, let's assume you want to process the whole thing locally and then store it in GCS afterwards.
Now we have to get a file into GCS. It's a PHP app, and so, as you have identified, your three main options are:
1) Invoke the GCS JSON API through the Google API PHP client.
2) Invoke either the GCS XML or JSON API via custom code.
3) Use gsutil.
Using gsutil will be the easiest solution here. On GCE, it automatically picks up appropriate credentials for your service account, and it's got several useful performance optimizations and tuning that a raw use of the API might not do without extra work (for example, multithreaded uploads). Plus it's already installed on your GCE instances.
The upside of the PHP API is that it's in-process and offers more fine-grained, programmatic control. As your logic gets more complicated, you may eventually prefer this approach. Getting it to perform as well as gsutil may take some extra work, though.
This choice is comparable to copying files via SCP with the "scp" command line application or by using the libssh2 library.
tl;dr: Using gsutil is a good idea unless you need to handle interactions with GCS more directly.
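For a sense of what shelling out to gsutil looks like, here is a minimal sketch. It is written in Python for illustration only; a PHP app would launch the same command through its own process functions. The bucket name, the object name, and both helper names are hypothetical.

```python
import subprocess
from typing import List


def build_gsutil_copy(local_path: str, bucket: str, object_name: str) -> List[str]:
    """Build the argument list for `gsutil cp <local file> gs://<bucket>/<object>`."""
    # Passing arguments as a list (no shell) avoids quoting and injection issues.
    return ["gsutil", "cp", local_path, f"gs://{bucket}/{object_name}"]


def upload_with_gsutil(local_path: str, bucket: str, object_name: str) -> None:
    """Run gsutil; check=True raises CalledProcessError if the upload fails."""
    subprocess.run(build_gsutil_copy(local_path, bucket, object_name), check=True)
```

Calling `upload_with_gsutil("/tmp/resized/cat.png", "my-app-images", "resized/cat.png")` after the resize step would copy the file into the bucket, assuming gsutil is on the PATH and the instance's service account can write to that bucket.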

Packaged App: syncFileSystem / fileSystem API - For *large* files

I am looking to develop a Chrome Packaged App that will (at a very simple level) provide a dynamic form filling UI - but allow users to attach large attachments to the forms (could be upwards of 10 files of 10MB each). I would like to have the ability to save and share the form data and the attachment via Google Drive. The forms will be completed collaboratively by multiple team members who also need to all see the attachments. Imagine a form front-end/metadata that sits on top of a shared Google Drive folder...
I have read the documentation, and learnt that the syncFileSystem API is not intended for use for general and/or large files to be stored in Google Drive, but rather for small configuration data.
I then looked at the fileSystem API, hoping that I could include the app's sandboxed folder among the folders that the Google Drive client app syncs (so that the files get synced automatically), but it doesn't look like the sandbox is meant to be accessed externally.
My current thinking is to recreate a Windows Explorer-style UI in the packaged app (it can use drag and drop), then store the files in the sandbox using the fileSystem API. I can reuse the code from the Google Drive sample packaged app to implement cloud syncing. Good idea?
Two questions stem from this:
How persistent is the fileSystem API? The documentation mentions that the user can purge all stored files. Is this done through 'clearing all browser history'? In that case they could very easily wipe, by accident, many hundreds of MB of useful information that I am storing in the packaged app.
I have read that you can use a third-party authentication service (which I want to do). If I use a non-Google account to authenticate my users, how would the Google Drive authentication work? Would I be able to use a different Google account for the cloud storage (i.e. one unrelated to the actual end user, who may or may not already have a Google account, and who may already be signed in to it)?
It seems like waiting for this https://code.google.com/p/chromium/issues/detail?id=148486 (getting read access to non-sandbox directories) would be the easiest way forward.
I don't think clearing browser history deletes temporary sandbox filesystem files; they're supposed to be garbage collected more or less automatically when space is required. It would make sense if that were another checkbox in the "Clear browsing data" section of Chrome's options. Perhaps that would make the answer to your first question clearer :-)
As for the second point, I am not sure how to do this, but it looks like you have already figured something out? At least that's what this page https://groups.google.com/a/chromium.org/forum/#!topic/chromium-apps/hOYu75Cv0AE seems to indicate.