We have a new third-party, file-based integration coming for one of our projects. A different company drops a file, and we need to pick it up and continue processing. The file needs to be transferred securely. The first option that comes to mind is SFTP, but we are thinking of researching SFTP vs. Google Cloud Storage. Can Google Cloud Storage be used instead of SFTP? What are the pros and cons of each? Thanks!
These are two different sorts of things. Google Cloud Storage is a service that stores files. SFTP is a protocol for transferring files between two computers.
If your only goal is to transfer a file from computer A to computer B, and both can speak to each other via SFTP, then that's a perfectly good solution.
That said, services like GCS are commonly used as a drop box for large files as part of a distributed workflow. For instance, one service might record video and upload that video to GCS, and then another service might later transcode that video or take some other action on it. That's also perfectly reasonable.
So, I guess the answer is that it depends on what you want to do.
Related
THE PROBLEM
I'm writing a mobile app which will allow a user to log in, save some preferences that must be stored in a database, and display congressional bills to the user.
I've only written simple RESTful services with PHP and MySQL in the past. I'd like to take advantage of newer technologies, and am a little lost on general direction.
The bill data (formatted as JSON) can be gathered by running the scrapers found here. Using docker, I managed to set a working directory and download the files on my local machine.
I've designed a MySQL database for holding the relevant bill and user data.
I started to mess around in Google Cloud Platform and read the doc that describes the different models. I'm thinking of a few different ideas, but I'm not familiar with GCP or what I can actually accomplish.
QUESTIONS
1) What are App Engine, Compute Engine, and Container Engine each for? I get the gist that Container Engine holds different instances of stuff you load up with docker, and that Compute Engine sets up a VM, but I don't really understand the relationships. How should I think of them?
2) When I run those scrapers from the shell, where are the files being stored, and how can I check on them? On my computer, I set a working directory, but how do directories work in GCP? Is it just a directory in the currently selected VM, or is this what Buckets are for?
IDEAS
1) Since my bill data already comes as JSON, should I skip the entire process of building a database for the bills and insert them into Firebase somehow? Is this even possible? If so, am I stuck using Firebase's NoSQL, or can I still set up a relational database?
2) I could schedule the scrapers to run periodically, detect new files, and run a script to parse the JSON and insert new bill data into a database (PostgreSQL? MySQL?). Then I would write an API.
3) Download the JSON files to a bucket, and write an API that reads from them. Not sure how the performance would compare to using a DB.
I'm open to other suggestions as well.
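Idea 2 above could be sketched roughly as follows. This is only an illustration under several assumptions: sqlite3 stands in for MySQL or PostgreSQL so the example is self-contained, the field names bill_id and title are hypothetical (the real scraper output may be shaped differently), and the loader simply walks a directory for .json files. Using the bill ID as the primary key makes repeated runs skip bills that are already stored.

```python
import json
import sqlite3
from pathlib import Path


def load_new_bills(json_dir, conn):
    """Insert bills found under json_dir that are not yet in the database.

    Assumes each .json file holds one bill object with a unique "bill_id"
    field (hypothetical name). Returns the number of newly inserted rows.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bills ("
        "bill_id TEXT PRIMARY KEY, title TEXT, raw_json TEXT)"
    )
    inserted = 0
    for path in sorted(Path(json_dir).rglob("*.json")):
        with open(path, encoding="utf-8") as f:
            bill = json.load(f)
        # INSERT OR IGNORE leaves rows with an existing bill_id untouched,
        # so running the loader again only adds bills it has not seen.
        cur = conn.execute(
            "INSERT OR IGNORE INTO bills (bill_id, title, raw_json) "
            "VALUES (?, ?, ?)",
            (bill["bill_id"], bill.get("title"), json.dumps(bill)),
        )
        inserted += cur.rowcount
    conn.commit()
    return inserted
```

A scheduled job could call load_new_bills after each scraper run, and the API would then query the bills table instead of reading the JSON files directly.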
For your use case (stateless web application), App Engine is probably your best choice. The Google documentation has several comparisons of your computing options.
You can use App Engine with PHP and cloud-hosted MySQL if you want, which could be a good way to get your toes wet without going in over your head.
I am writing a server that allows user to upload images. It appears that most people tend to store those files on the filesystem directly.
My question is whether that really is the way to do it. I'm not familiar with the capacities of a server, but what I'm curious about is, e.g., how to make sure that the server does not run out of (hard drive) storage.
I would also like to know how one would organize those files for many different users. Is it enough to just store it like war/images/<user-database-id>/<uuid-for-image>.(jpeg|png) by just using the user ID inside the database or are there a lot more things to consider when it comes to storing images?
I think your best bet would be to use a cloud storage system such as Amazon S3, Google Cloud Storage, Rackspace, or MS Azure.
Using a path like the one you suggested ought to be possible but you could also omit the user-database-id if that database already gives you a list of objects owned by that user.
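As a rough sketch of the naming scheme discussed above, the helper below builds an object key of the form images/<user-id>/<uuid>.<ext>, or images/<uuid>.<ext> when the user ID is omitted. The function name, the "images" prefix, and the set of accepted extensions are assumptions made for illustration, not part of any storage provider's API.

```python
import uuid
from pathlib import PurePosixPath

# Assumed whitelist; adjust to whatever image types the app accepts.
ALLOWED_EXTENSIONS = {".jpeg", ".jpg", ".png"}


def image_object_key(original_filename, user_id=None, prefix="images"):
    """Return a storage key such as images/<user-id>/<uuid>.png.

    The user-supplied file name is used only for its extension; the stored
    name is a random UUID, so uploads never collide or leak original names.
    """
    ext = PurePosixPath(original_filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported image type: {ext!r}")
    parts = [prefix]
    if user_id is not None:
        # Optional: the database can already map users to their objects.
        parts.append(str(user_id))
    parts.append(f"{uuid.uuid4().hex}{ext}")
    return "/".join(parts)
```

The same key works as a path on a local disk or as an object name in a bucket, so the scheme does not tie the app to one storage backend.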
I'm planning to make a web application which allows users to upload music/audio files and host them, etc. I'm wondering what the best method would be to go about this. I have used Cloudinary in previous projects for image hosting, but nothing for audio.
What do companies like SoundCloud use, if not their own service, which I am assuming is the case?
What would you recommend? It will be vital when it comes to building a scalable and reliable service so I don't want to go into this project uneducated.
P.S. I will be using Meteor and MongoDB to build the application.
I'd recommend getting started with edgee:slingshot in your app. It's much lighter on your Meteor server since uploads and downloads go straight to the storage system. There you have several choices including S3, Google Cloud Storage, and Rackspace Cloud. You could also use CollectionFS but slingshot seems architecturally better suited to this class of problem.
I have a PHP app running on several instances of Google Compute Engine (GCE). The app allows users to upload images of various sizes, resizes the images, and then stores the resized images (and their thumbnails) on the storage disk and their metadata in the database.
What I've been trying to find is a method for storing the images in Google Cloud Storage (GCS) through the PHP app running on GCE instances. A similar question was asked here, but no clear answer was given there. Any hints or guidance on the best way of achieving this would be highly appreciated.
You have several options, all with pros and cons.
Your first decision is how users upload data to your service. You might choose to have customers upload their initial data to Google Cloud Storage, where your app would then fetch it and transform it, or you could choose to have them upload it directly to your service. Let's assume you choose the second option, and you want users to stream data directly to your service.
Your service then transforms the data into a different size. Great. You now have a new file. If this was video, you might care about streaming the data to Google Cloud Storage as you encode it, but for images, let's assume you want to process the whole thing locally and then store it in GCS afterwards.
Now we have to get a file into GCS. It's a PHP app, and so as you have identified, your main three options are:
Invoke the GCS JSON API through the Google API PHP client.
Invoke either the GCS XML or JSON API via custom code.
Use gsutil.
Using gsutil will be the easiest solution here. On GCE, it automatically picks up appropriate credentials for your service account, and it's got several useful performance optimizations and tuning that a raw use of the API might not do without extra work (for example, multithreaded uploads). Plus it's already installed on your GCE instances.
The upside of the PHP API is that it's in-process and offers more fine-grained, programmatic control. As your logic gets more complicated, you may eventually prefer this approach. Getting it to perform as well as gsutil may take some extra work, though.
This choice is comparable to copying files via SCP with the "scp" command line application or by using the libssh2 library.
tl;dr: Using gsutil is a good idea unless you have a need to handle interactions with GCS more directly.
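The gsutil route might look something like the sketch below, which shells out to "gsutil cp" once the resized image exists on local disk. It is written in Python purely for illustration (a PHP app would use its own process-execution functions), the function names and the bucket/object values are hypothetical, and it assumes gsutil is on the PATH with working credentials, as described above for GCE instances. The top-level -m option asks gsutil to run operations in parallel, which mainly helps when several files are copied in one invocation.

```python
import subprocess


def gsutil_cp_command(local_path, bucket, object_name, parallel=False):
    """Build the argument list for copying one local file into a bucket."""
    cmd = ["gsutil"]
    if parallel:
        # -m is a top-level gsutil option and must come before "cp".
        cmd.append("-m")
    cmd += ["cp", local_path, f"gs://{bucket}/{object_name}"]
    return cmd


def upload_with_gsutil(local_path, bucket, object_name):
    """Run gsutil and raise CalledProcessError if the copy fails."""
    # Passing a list (no shell) avoids quoting problems with odd file names.
    subprocess.run(gsutil_cp_command(local_path, bucket, object_name), check=True)
```

For example, upload_with_gsutil("/tmp/resized.png", "my-bucket", "images/resized.png") would copy the file to gs://my-bucket/images/resized.png, where "my-bucket" is a placeholder name.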
I am looking to develop a Chrome Packaged App that will (at a very simple level) provide a dynamic form filling UI - but allow users to attach large attachments to the forms (could be upwards of 10 files of 10MB each). I would like to have the ability to save and share the form data and the attachment via Google Drive. The forms will be completed collaboratively by multiple team members who also need to all see the attachments. Imagine a form front-end/metadata that sits on top of a shared Google Drive folder...
I have read the documentation, and learnt that the syncFileSystem API is not intended for use for general and/or large files to be stored in Google Drive, but rather for small configuration data.
I then looked at the fileSystem API, hoping that I could include the app's sandboxed folder in the folders that the Google Drive client app syncs (so that the files get synced automatically), but it doesn't look like the sandbox is meant to be accessed externally.
My current thinking is to recreate a windows explorer type UI in the packaged app (can use drag and drop) - then store the files in the sandbox using the fileSystem API. I can reuse the code from the Google Drive sample packaged app to implement cloud syncing. Good idea?
Two questions stem from this:
How persistent is the fileSystem API? The documentation mentions that the user can purge all stored files. Is this done through 'clearing all browser history'? In that case, they could very easily accidentally wipe many hundreds of MB of useful information that I am storing in the packaged app.
I have read that you can use a 3rd-party authentication service (which I want to do). If I use a non-Google account to authenticate my users, how would the Google Drive authentication work? Would I be able to use a different Google account to perform the cloud storage (i.e. unrelated to the actual end user, who may or may not have a Google account already, which may already be signed in)?
It seems like waiting for this https://code.google.com/p/chromium/issues/detail?id=148486 (getting read access to non-sandbox directories) would be the easiest way forward.
I don't think clearing browser history deletes temporary sandbox filesystem files; they're supposed to be garbage collected more or less automatically when space is required. It would make sense if that were another checkbox in the "Clear browsing data" section of Chrome's options. Perhaps that would make the answer to your first question more clear :-)
As for the second point, I am not sure how to do this, but it looks like you have already figured out something? At least that's what this page https://groups.google.com/a/chromium.org/forum/#!topic/chromium-apps/hOYu75Cv0AE seems to indicate.