I want to upload 120 files, each around 1.2 GB so about 150 GB in total, from an HTTPS website into my Google Cloud Storage bucket.
I really, really don't want to have to download them all locally, and then upload them individually.
Is there any way around this? Surely I can just give Google Cloud Storage a URL to pull from? I don't control the HTTPS server.
It seems to be possible to transfer from S3 to Google Cloud Storage, but S3 suffers from the same problem: I would still have to get the files into S3 first.
If the website allows public access, you can use the GCS Storage Transfer Service to do it: https://cloud.google.com/storage/transfer/
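As a sketch of what that can look like with the Storage Transfer Service's Java client (google-cloud-storage-transfer), assuming you can publish a TSV list of the source URLs somewhere public; the project ID, bucket, and URL-list location below are made up, and the exact builder calls should be double-checked against the library version you use:

import com.google.storagetransfer.v1.proto.StorageTransferServiceClient;
import com.google.storagetransfer.v1.proto.TransferProto.CreateTransferJobRequest;
import com.google.storagetransfer.v1.proto.TransferTypes.GcsData;
import com.google.storagetransfer.v1.proto.TransferTypes.HttpData;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferJob;
import com.google.storagetransfer.v1.proto.TransferTypes.TransferSpec;

public class HttpToGcsTransfer {

    public static void main(String[] args) throws Exception {
        // The URL list is a publicly reachable TSV file of the form:
        //   TsvHttpData-1.0
        //   https://example.com/files/part-001.bin
        //   https://example.com/files/part-002.bin
        String urlListLocation = "https://example.com/url-list.tsv"; // assumption
        String projectId = "my-project";                             // assumption
        String sinkBucket = "my-destination-bucket";                 // assumption

        TransferJob job = TransferJob.newBuilder()
                .setProjectId(projectId)
                .setStatus(TransferJob.Status.ENABLED)
                .setTransferSpec(TransferSpec.newBuilder()
                        .setHttpDataSource(HttpData.newBuilder()
                                .setListUrl(urlListLocation))
                        .setGcsDataSink(GcsData.newBuilder()
                                .setBucketName(sinkBucket)))
                .build();

        try (StorageTransferServiceClient client = StorageTransferServiceClient.create()) {
            // Without a schedule, the job can then be started from the console
            // or via a runTransferJob call.
            TransferJob created = client.createTransferJob(
                    CreateTransferJobRequest.newBuilder().setTransferJob(job).build());
            System.out.println("Created transfer job: " + created.getName());
        }
    }
}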
Related
I need to sync (not mirror) files between a local disk and a cloud bucket. I'm thinking of something like the Google Drive app, which also works offline (and when the local PC goes back online it automatically syncs the data). This would be useful for the app I'm going to develop, because it needs to work offline.
I dug through the documentation a lot but didn't find any useful resource.
I could use gsutil rsync in combination with a Cloud Function to listen to cloud bucket events,
and a custom local trigger for events on the local hard disk (let's assume I'm developing a local Node.js app).
But then I'd have to handle edge cases like going offline, concurrent operations, very long transfers, permissions, and so on.
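Just to make the idea concrete, a rough sketch of the local-trigger half could look like the following (written in Java purely for illustration, even though the real app would be Node.js; the bucket name and watched folder are assumptions):

import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

import java.nio.file.*;

public class LocalFolderWatcher {

    public static void main(String[] args) throws Exception {
        Storage storage = StorageOptions.getDefaultInstance().getService();
        String bucket = "my-sync-bucket";               // assumption
        Path watched = Path.of("/home/me/sync-folder"); // assumption

        WatchService watchService = FileSystems.getDefault().newWatchService();
        watched.register(watchService,
                StandardWatchEventKinds.ENTRY_CREATE,
                StandardWatchEventKinds.ENTRY_MODIFY);

        // Upload every created/modified file; a real implementation still has to
        // handle offline queues, conflicts, deletes, large files, permissions, ...
        while (true) {
            WatchKey key = watchService.take();
            for (WatchEvent<?> event : key.pollEvents()) {
                Path changed = watched.resolve((Path) event.context());
                if (Files.isRegularFile(changed)) {
                    storage.create(
                            BlobInfo.newBuilder(bucket, changed.getFileName().toString()).build(),
                            Files.readAllBytes(changed));
                }
            }
            key.reset();
        }
    }
}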
I don't want to re-invent the wheel and I think this is a common pattern, like the previously mentioned GDrive app.
Also, Firestore in Native Mode implements something really close to this, although it's related to documents rather than files.
Do Google Cloud Platform and/or Firebase allow easy synchronization of a local folder with a cloud bucket?
What do you think about my approach?
As you mentioned, this functionality is implemented in Google Drive / Google One, and for those products it is the main intention, "to be a cloud drive" (basically in sync with your local devices all the time).
Google Cloud Storage, on the other hand, is a service with a different approach: it is object-based storage and was designed as part of the cloud architecture to interact with cloud services (always-online services). At this time Google does not offer a client similar to the Google Drive app for syncing local and cloud folders.
I found this third-party software (not supported by Google) that allows syncing between local folders and Cloud Storage.
I also reviewed the pricing for Google One and Cloud Storage, and Google One is significantly cheaper. For example:
2 TB / month, Google One: $10 USD
2 TB / month, Cloud Storage: $40 USD
Based on this, you should also add the price of additional services, for example:
the Pub/Sub service
the Cloud Functions service
outgoing network traffic
Your approach sounds good (it takes a lot of effort, but it's okay), but unfortunately you would be using the service in a scenario it was not designed for.
If you want to save code in the cloud, you can use Google Cloud Source Repositories, which basically works like GitHub but has the advantage of being easily integrated with CI/CD services like Google Cloud Build.
My Java backend server has to upload files to Google Cloud Storage (GCS).
Right now I just run
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import org.springframework.web.multipart.MultipartFile;

import java.io.IOException;
import java.util.Objects;

public void store(MultipartFile multipartFile) throws IOException {
    // Uses Application Default Credentials, i.e. GOOGLE_APPLICATION_CREDENTIALS.
    Storage storage = StorageOptions.getDefaultInstance().getService();
    storage.create(
            BlobInfo.newBuilder(
                    BUCKET_NAME,
                    Objects.requireNonNull(multipartFile.getOriginalFilename()))
                    .build(),
            multipartFile.getBytes()
    );
}
This works because I have set GOOGLE_APPLICATION_CREDENTIALS=$PROJECT_DIR$/project-1234-abcdefg.json in my environment.
However, this makes things complicated for my deployment setup. I don't know how I would go about making this file available to my service.
Is there another way to get access to GCS for my service account?
Background
I am deploying my server to Heroku as a compiled jar file and I don't know how to make the credentials available to my server during deployment.
You need a Google account to access GCS, either personal or technical; a technical account is a service account.
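As a sketch of how the service-account piece can be wired in without relying on a key file on disk (the GCP_SA_KEY variable name is just an assumption; on Heroku it would be a config var holding the key JSON):

import com.google.auth.oauth2.GoogleCredentials;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class StorageFactory {

    // Builds a Storage client from a service account key stored in an
    // environment variable instead of a file referenced by
    // GOOGLE_APPLICATION_CREDENTIALS.
    public static Storage fromEnv() throws IOException {
        String keyJson = System.getenv("GCP_SA_KEY"); // hypothetical variable name
        GoogleCredentials credentials = GoogleCredentials.fromStream(
                new ByteArrayInputStream(keyJson.getBytes(StandardCharsets.UTF_8)));
        return StorageOptions.newBuilder()
                .setCredentials(credentials)
                .build()
                .getService();
    }
}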
However, there is another solution, though it's not really easy to implement. I wrote an article about securing a serverless product with Cloud Endpoints and an API key; here, your serverless product would be Cloud Storage. But that implies calling GCS with the REST API rather than with the Java library, which is not very fun. It also implies additional cost for the hosting and processing time of Cloud Endpoints.
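To give a rough idea of what calling the JSON API directly looks like, here is a sketch of a simple (non-resumable) media upload with java.net.http; the bucket, object name, and token are placeholders, and behind Cloud Endpoints the host and authentication would of course differ:

import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

public class RestUpload {

    // Simple media upload via the Cloud Storage JSON API.
    // objectName should be URL-encoded if it contains special characters.
    public static void upload(String bucket, String objectName, Path file, String accessToken)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://storage.googleapis.com/upload/storage/v1/b/"
                        + bucket + "/o?uploadType=media&name=" + objectName))
                .header("Authorization", "Bearer " + accessToken)
                .header("Content-Type", "application/octet-stream")
                .POST(HttpRequest.BodyPublishers.ofFile(file))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}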
Note: you can upgrade the authorization from an API key to Firebase Auth or something else if you prefer. Check the Cloud Endpoints authentication capabilities.
Note 2: Google is working on another authentication mechanism, but I don't know what stage development is at or whether it's planned for 2020. In any case, your constraint is known and is being addressed by Google.
I am a newbie at cloud servers and I've opened a Google Cloud Storage bucket to host image files. I've verified my domain and configured the bucket so the images can be viewed via my domain. The problem is that the same file is accessible both via my domain, example.com/images/tiny.png, and via storage.googleapis.com/example.com/images/tiny.png. Is there any way to disable access via storage.googleapis.com and use only my domain?
Google Cloud Platform Support Version:
NOTE: This is the reply from Google Cloud Platform Support when contacted via email...
I understand that you have set up a domain name for one of your Cloud Storage buckets and you want to make sure only URLs starting with your domain name have access to this bucket.
I am afraid that this is not possible because of how Cloud Storage permissions work.
Making a Cloud Storage bucket publicly readable also gives each of its files a public link, and currently this public link can't be disabled.
A workaround would be to implement a proxy program and run it on a Compute Engine virtual machine. This VM will need a static external IP so that you can map your domain to it. The proxy program will be in charge of returning the requested file from a predefined Cloud Storage bucket while the bucket itself remains inaccessible to the public.
You may find these documents helpful if you are interested in this workaround:
1. Quick start to set up a Linux VM (1).
2. Python API for accessing Cloud Storage files (2).
3. How to download service account keys to grant a program access to a set of services (3).
4. Pricing calculator for getting a picture on how much a VM may cost (4).
(1) https://cloud.google.com/compute/docs/quickstart-linux
(2) https://pypi.org/project/google-cloud-storage/
(3) https://cloud.google.com/iam/docs/creating-managing-service-account-keys
(4) https://cloud.google.com/products/calculator/
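As a very rough sketch of the proxy program described above (in Java rather than Python, purely for illustration; the bucket name and port are assumptions, and a real proxy would also need content-type handling, caching, and error handling):

import com.google.cloud.storage.Blob;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import com.sun.net.httpserver.HttpServer;

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;

public class GcsProxy {

    public static void main(String[] args) throws IOException {
        Storage storage = StorageOptions.getDefaultInstance().getService();
        String bucket = "example.com"; // hypothetical bucket name

        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/", exchange -> {
            // Map the request path (e.g. /images/tiny.png) to an object name.
            String objectName = exchange.getRequestURI().getPath().substring(1);
            Blob blob = storage.get(bucket, objectName);
            if (blob == null) {
                exchange.sendResponseHeaders(404, -1);
                exchange.close();
                return;
            }
            byte[] content = blob.getContent();
            exchange.sendResponseHeaders(200, content.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(content);
            }
        });
        server.start();
    }
}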
My Version:
It seems the solution to this question is really simple: just FUSE-mount Google Cloud Storage on a VM instance.
After the FUSE mount, private files from GCS can be accessed through the VM's IP address. It makes the Google Cloud Storage bucket act like a directory.
The detailed documentation about how to set up FUSE in Google Cloud is here.
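To illustrate the "acts like a directory" point: once a bucket is mounted with Cloud Storage FUSE (gcsfuse), ordinary file I/O works against the mount point; the mount path below is just an assumption:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class ListMountedBucket {

    public static void main(String[] args) throws IOException {
        // Assumes the bucket was mounted with something like:
        //   gcsfuse my-bucket /mnt/gcs
        Path mountPoint = Path.of("/mnt/gcs");
        try (Stream<Path> objects = Files.list(mountPoint)) {
            objects.forEach(p -> System.out.println(p.getFileName()));
        }
    }
}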
There is, but it requires you to do more work.
Your current solution works because you've made access to the GCS bucket (example.com) public and you're then DNS-aliasing it from your domain.
An alternative approach would be to limit access to the GCS bucket to one (or possibly several) accounts and then run a web server that uses one of those accounts to access your image files. You could then either permit anyone to access your web server or limit access to it as well.
More work for you (and possibly cost) but more control.
I have migrated files from Parse.com to my hosted Parse Server using the https://github.com/parse-server-modules/parse-files-utils tool, applying "Option-2".
Now my problem is that when I click on an image in my hosted Parse Server dashboard, it shows the message "File not found.", and my URL looks like:
http://ip of my server:1337/parse/files/OE9gP1wrd2OT9avp3RBmt8zysmM25wRTMtDOxsfe/tfss-6ca44378-72fb-4ddf-aef2-11af0485b11b-profile-pic
If I upload a new image from the mobile app, it works fine.
I have installed MongoDB and migrated the Parse.com data to a newly created database in MongoDB.
I am not using any FileAdapter in my newly created Parse Server.
Thanks in advance; please look into this issue and help me figure out how I can display the migrated images in our hosted Parse Server.
What HTTP error are you seeing (a 404)? Are you sure the error is "file not found"? Maybe your server's folder permission is not set to public, so you can't publicly access the files (that should be a 403 error).
You usually store the image files in a storage container that can be accessed via APIs, such as Amazon (AWS) or Microsoft Azure. It's usually more efficient to keep your local server's file storage small and to have fast access speeds to your images.
You can find out how to set up an Amazon S3 bucket or Google Cloud Storage here.
You can find out how to set up Azure Storage here and connect it to your Parse Server using this adapter.
I'm not sure about AWS, but Google and Azure give you free credits when you sign up, and (at least for Azure) the storage isn't too expensive, so the free credits can last you a while...
Two-part question:
1) Does Google have a solution that allows third-party clients to FTP into a Cloud Storage bucket and get objects?
2) If this is not possible, what is the best way to sync items in a Cloud Storage bucket with an FTP server?