If a file is placed into Google Cloud Storage and made public, but no URL to the file exists on any other webpage, does Google index it in its search results anyway? Anyone know?
Google's search index is independent of its cloud storage. Making a file public in Cloud Storage does not automatically get it indexed in Google's search results.
However, asking this question leads me to believe that you're really wondering whether you can make a file public and still be reasonably sure that nobody will find it. If the files are of any sensitive nature to you or your users, relying on obscurity is not the right solution.
If you're using GCS on a website and want to serve files securely, you might want to try the Signed URLs option. If it's just about not being indexed, you could add a robots.txt file to the root of your bucket that tells crawlers to skip the file.
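For the Signed URLs option, here's a minimal sketch using the google-cloud-storage Python client. The bucket and object names are placeholders, and it assumes your environment already has suitable application credentials configured:

```python
from datetime import timedelta

from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-bucket")        # hypothetical bucket name
blob = bucket.blob("private/report.pdf")   # hypothetical object name

# Generate a V4 signed URL granting read access for 15 minutes.
url = blob.generate_signed_url(
    version="v4",
    expiration=timedelta(minutes=15),
    method="GET",
)
print(url)
```

Anyone holding the printed URL can fetch the object until it expires; after that the object is private again. For the robots.txt approach, a file at the bucket root containing `User-agent: *` and `Disallow: /` asks crawlers to skip everything under it.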
If you have a public file in Google's cloud storage, the URL to that file has to be indexed somewhere; otherwise the servers would have no way of finding it.
The same thing happens with Google Drive. A public URL is still indexed, and is still available within the server's lookup tables, even if no direct link to it exists anywhere on the web.
Related
Can anyone give me an overview of the Google Drive API, please?
What are its abilities?
What are its drawbacks?
How is access obtained?
I searched about it, but I don't understand anything.
Using the Google Drive API, like any API, you can send requests to authenticate, perform actions, or get data.
In the case of Google Drive, you can upload, download, rename, and share files, among other things.
To find all the possibilities, have a look at this link (check the left pane):
https://developers.google.com/drive/api/v3/about-sdk
It helps you save, read, and sync files stored in your Google Drive account directly from an app. For example, if you want to save a file from an application without storing it on your mobile device, you can store it in Google Drive. Likewise if you want to upload a file to an app, or just view a file from Drive in your app.
Here are the steps to enable the Drive API for your project: https://developers.google.com/drive/api/v3/enable-sdk
The Google Drive API is a REST API that comes with client libraries, language-specific examples, and documentation to help you develop apps that integrate with Drive.
The core functionality of Drive apps is downloading and uploading files in Google Drive. Think of the Drive API as simply a file store. The only information available is information about the files themselves: name, size, type, and so on. The Drive API also contains information about sharing and about who last accessed a file, and it has a limited ability to convert files from one type to another. The Drive API does NOT give you the ability to edit the contents of the files. The Google Sheets API does give you the ability to edit a Google Sheet, but that is a different API.
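As a concrete illustration of the upload side, here's a minimal sketch using the Python client library. The token file, scope, and file name are assumptions; obtaining the token is covered under Access below:

```python
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build
from googleapiclient.http import MediaFileUpload

# Assumes an earlier OAuth2 flow already produced token.json (see Access below).
creds = Credentials.from_authorized_user_file(
    "token.json", ["https://www.googleapis.com/auth/drive.file"]
)
service = build("drive", "v3", credentials=creds)

# Upload a local file; Drive stores it but cannot edit its contents.
media = MediaFileUpload("report.pdf", mimetype="application/pdf")
created = service.files().create(
    body={"name": "report.pdf"}, media_body=media, fields="id"
).execute()
print("Uploaded file id:", created["id"])
```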
Access
The information available through the Google Drive API is private user data. That means that in order to access that data you must have the permission of the user who owns it. Gaining this permission is most often done through OAuth2, whereby the application in question requests the user's consent to access the data.
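A minimal sketch of that consent step, using the google-auth-oauthlib helper for an installed app. Here credentials.json is the OAuth client file downloaded from the Google Cloud console, and the scope is just an example:

```python
from google_auth_oauthlib.flow import InstalledAppFlow

SCOPES = ["https://www.googleapis.com/auth/drive.metadata.readonly"]

# Opens a browser so the user can grant consent, then returns credentials.
flow = InstalledAppFlow.from_client_secrets_file("credentials.json", SCOPES)
creds = flow.run_local_server(port=0)

# Persist the credentials so later runs can skip the consent screen.
with open("token.json", "w") as token:
    token.write(creds.to_json())
```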
Libraries
If you are considering developing an application that uses the Google Drive API, I recommend that you look for a client library in your chosen language. The client libraries are designed to help you develop your application quickly, and there is normally a lot of documentation for each of them, including a number of quickstarts.
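For instance, the Python quickstart boils down to something like this sketch (assuming token.json was produced by a consent flow like the one above):

```python
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

creds = Credentials.from_authorized_user_file(
    "token.json", ["https://www.googleapis.com/auth/drive.metadata.readonly"]
)
service = build("drive", "v3", credentials=creds)

# List ten of the files the user can see, with a few metadata fields.
results = service.files().list(
    pageSize=10, fields="files(id, name, mimeType)"
).execute()
for f in results.get("files", []):
    print(f["name"], f["mimeType"], f["id"])
```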
I am writing a server that allows users to upload images. It appears that most people tend to store those files directly on the filesystem.
My question is whether that is really the way to do it. I'm not familiar with server capacities, but what I'm curious about is, for example, how to make sure that the server does not run out of disk space.
I would also like to know how one would organize those files for many different users. Is it enough to store them like war/images/<user-database-id>/<uuid-for-image>.(jpeg|png), just using the user's ID from the database, or are there a lot more things to consider when it comes to storing images?
I think your best bet would be to use a cloud storage system such as Amazon S3, Google Cloud Storage, Rackspace, or MS Azure.
Using a path like the one you suggested ought to be possible but you could also omit the user-database-id if that database already gives you a list of objects owned by that user.
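If you do go with cloud storage, the layout from the question translates naturally into object keys. A minimal sketch with the google-cloud-storage Python client; the bucket name and helper function are hypothetical:

```python
import uuid

from google.cloud import storage

def store_user_image(user_id: int, local_path: str, content_type: str) -> str:
    """Upload an image under images/<user-id>/<uuid>.<ext> and return its key."""
    ext = "jpeg" if content_type == "image/jpeg" else "png"
    key = f"images/{user_id}/{uuid.uuid4()}.{ext}"
    client = storage.Client()
    bucket = client.bucket("my-app-uploads")  # hypothetical bucket name
    bucket.blob(key).upload_from_filename(local_path, content_type=content_type)
    return key  # store this key in the user's database row
```

One nice property of this layout is that disk space stops being your problem: the bucket grows as needed, and you only keep the returned key in your database.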
I'm developing an ASP.NET MVC project that will be hosted on Amazon AWS, but I have some questions about storing clients' files. The documentation from Amazon is not clear to me and I'm looking for some direction and experience here.
1 - Each client has a few files with low disk-space requirements and low update frequency, but very high access frequency (like brand images, and even sensitive files like certificates). Is it appropriate to store these files in the app_data folder on the web server?
2 - The most critical for me are the sensitive documents (from hundreds to dozens of thousands per client, mostly signed XML files). These files have a medium read-access frequency but a very high rate of creation. One solution I found is MongoDB, which gives me some freedom to manage the storage policy and makes external backups easy, but I'm not sure about it. Other options are to use Amazon's storage and handle all these files and GBs there with a lot of folders, or maybe to use a regular database and save the files as XML or binary.
My concerns are the amount of data, security, and reliability in case of disaster, as most of these documents have legal value.
You could, but storing them locally violates the shared-nothing architecture and would limit your scaling options. Amazon S3 is a good option here. You can make some files public and serve them directly from S3 (or through CloudFront), and keep others private, providing access via signed URLs.
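A minimal sketch of the signed-URL part with boto3; the bucket and key are placeholders:

```python
import boto3

s3 = boto3.client("s3")

# Grant temporary read access to a private object; the link expires in 15 minutes.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "my-private-bucket", "Key": "clients/42/certificate.pem"},
    ExpiresIn=900,
)
print(url)
```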
Again, you can put the files on S3 and make them private. You will probably still store references to the files in your database. Generally it's not a great idea to store large blobs in a database, since databases are often not well optimized for accessing them.
I have a PHP app running on several Google Compute Engine (GCE) instances. The app allows users to upload images of various sizes, resizes the images, and then stores the resized images (and their thumbnails) on the storage disk, with their metadata in the database.
What I've been trying to find is a method for storing the images in Google Cloud Storage (GCS) through the PHP app running on the GCE instances. A similar question was asked here but no clear answer was given. Any hints or guidance on the best way to achieve this would be highly appreciated.
You have several options, all with pros and cons.
Your first decision is how users upload data to your service. You might choose to have customers upload their initial data to Google Cloud Storage, where your app would then fetch it and transform it, or you could choose to have them upload it directly to your service. Let's assume you choose the second option, and you want users to stream data directly to your service.
Your service then transforms the data into a different size. Great. You now have a new file. If this was video, you might care about streaming the data to Google Cloud Storage as you encode it, but for images, let's assume you want to process the whole thing locally and then store it in GCS afterwards.
Now we have to get a file into GCS. It's a PHP app, and so as you have identified, your main three options are:
Invoke the GCS JSON API through the Google API PHP client.
Invoke either the GCS XML or JSON API via custom code.
Use gsutil.
Using gsutil will be the easiest solution here. On GCE, it automatically picks up appropriate credentials for your service account, and it's got several useful performance optimizations and tuning that a raw use of the API might not do without extra work (for example, multithreaded uploads). Plus it's already installed on your GCE instances.
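For example, copying a batch of resized images might look like this; the bucket name and paths are placeholders, and -m turns on parallel (multithreaded) transfers:

```
gsutil -m cp /tmp/resized/*.jpg gs://my-image-bucket/resized/
```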
The upside of the PHP API is that it's in-process and offers more fine-grained, programmatic control. As your logic gets more complicated, you may eventually prefer this approach. Getting it to perform as well as gsutil may take some extra work, though.
This choice is comparable to copying files either with the "scp" command-line application or by using the libssh2 library directly.
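To give a feel for the in-process approach, here is a minimal sketch shown in Python with the google-cloud-storage client; the PHP client library follows the same pattern, and the bucket name and image bytes are placeholders:

```python
from google.cloud import storage

# On a GCE instance the client picks up the instance's service-account
# credentials automatically, the same credentials gsutil uses.
client = storage.Client()
bucket = client.bucket("my-image-bucket")  # hypothetical bucket name

# The resized image is already in memory, so upload it directly.
resized_bytes = b"..."  # placeholder: output of your image-resizing step
bucket.blob("resized/photo-123.jpg").upload_from_string(
    resized_bytes, content_type="image/jpeg"
)
```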
tl;dr: Using gsutil is a good idea unless you need to handle interactions with GCS more directly.
I am looking to develop a Chrome Packaged App that will (at a very simple level) provide a dynamic form filling UI - but allow users to attach large attachments to the forms (could be upwards of 10 files of 10MB each). I would like to have the ability to save and share the form data and the attachment via Google Drive. The forms will be completed collaboratively by multiple team members who also need to all see the attachments. Imagine a form front-end/metadata that sits on top of a shared Google Drive folder...
I have read the documentation and learned that the syncFileSystem API is not intended for general and/or large files to be stored in Google Drive, but rather for small configuration data.
I then looked at the fileSystem API, hoping that I could include the app's sandboxed folder among the folders that the Google Drive client app syncs (so that the files get synced automatically), but it doesn't look like the sandbox is meant to be accessed externally.
My current thinking is to recreate a windows explorer type UI in the packaged app (can use drag and drop) - then store the files in the sandbox using the fileSystem API. I can reuse the code from the Google Drive sample packaged app to implement cloud syncing. Good idea?
Two questions stem from this:
How persistent is the fileSystem API? The documentation mentions that the user can purge all stored files. Is this done through 'clearing all browser history'? In that case they could very easily and accidentally wipe many hundreds of MB of useful information that I am storing in the packaged app.
I have read that you can use third-party authentication services (which I want to do). If I use a non-Google account to authenticate my users, how would the Google Drive authentication work? Would I be able to use a different Google account to handle the cloud storage (i.e. one unrelated to the actual end user, who may or may not already have a Google account, which may already be signed in)?
It seems like waiting for this https://code.google.com/p/chromium/issues/detail?id=148486 (getting read access to non-sandbox directories) would be the easiest way forward.
I don't think clearing browser history deletes temporary sandbox filesystem files; they're supposed to be automatically garbage-collected when space is required. It would make sense if that were another checkbox in the "Clear browsing data" section of Chrome's options. Perhaps that would make the answer to your first question more clear :-)
On the second point, I am not sure how to do this, but it looks like you have already figured something out? At least that's what this page https://groups.google.com/a/chromium.org/forum/#!topic/chromium-apps/hOYu75Cv0AE seems to indicate.