I am trying to build my app and use Google Drive to store its data. I would like to have a complete history of file changes. Is that possible with Google Drive through revisions?
UPD: If it matters, I plan to store compact JSON data, very unlikely to be more than 1 MB.
Based on the documentation, it is possible to track all the file changes; but in order to keep all the revisions, you need to use the keepRevisionForever parameter. See Manage Revisions.
Google Drive automatically purges (or "prunes") revisions in order to optimize disk usage. To prevent this from happening, you can set the boolean flag keepRevisionForever to true to mark revisions that you don't want Drive to purge.
You can also check the Revisions resource, which has list, get, delete and update methods.
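To make this concrete, here is a minimal sketch (Python, google-api-python-client, Drive API v3) of pinning each new revision and reading the history back. The authorized service object and the file ID are assumed to exist already; the helper names are just for illustration.

```python
# Rough sketch, assuming an already-authorized Drive v3 "service" object built
# with google-api-python-client; file_id is a placeholder.
import io
import json
from googleapiclient.http import MediaIoBaseUpload

def save_json(service, file_id, payload):
    """Overwrite the file's content and pin the resulting revision."""
    media = MediaIoBaseUpload(
        io.BytesIO(json.dumps(payload).encode("utf-8")),
        mimetype="application/json",
    )
    return service.files().update(
        fileId=file_id,
        media_body=media,
        keepRevisionForever=True,  # ask Drive not to prune this revision
    ).execute()

def list_revisions(service, file_id):
    """Return the stored revision history (id, time, keepForever flag)."""
    resp = service.revisions().list(
        fileId=file_id,
        fields="revisions(id,modifiedTime,keepForever)",
    ).execute()
    return resp.get("revisions", [])
```

One thing to keep in mind: Drive caps the number of revisions that can be kept forever per file, so check the current limits if you expect a very long history.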
I'm storing backups in Cloud Storage. A desirable property of such a backup is to ensure the device being backed up cannot erase the backups, to protect against ransomware or similar threats. At the same time, it is desirable to allow the backup client to delete files so that old backups can be pruned. (Because the backups are encrypted, it isn't possible to use lifecycle management to do this.)
The solution that immediately comes to mind is to enable object versioning and use lifecycle rules to retain object versions (deleted files) for a certain amount of time. However, I cannot see a way to allow the backup client to delete the current version, but not historical versions. I thought it might be possible to do this with an IAM condition, but the conditional logic doesn't seem flexible enough to parse out the object version. Is there another way I've missed?
The only other solution that comes to mind is to create a second bucket, inaccessible to the backup client, and use a Cloud Function to replicate the first bucket. The downside of that approach is the duplicate storage cost.
To answer this:
However, I cannot see a way to allow the backup client to delete the current version, but not historical versions
When you delete a live object, object versioning retains it as a noncurrent version. To delete a noncurrent object version, you have to specify the object name along with its generation number.
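For illustration, here is a small sketch with the google-cloud-storage Python client showing the difference between the two deletes; the bucket name, object name and generation number are placeholders, and object versioning is assumed to be enabled on the bucket.

```python
# Sketch only: assumes object versioning is already enabled on the bucket.
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-backup-bucket")          # placeholder bucket

# Deleting the live object: versioning keeps it around as a noncurrent version.
bucket.delete_blob("backups/2024-01-01.tar.enc")

# Listing with versions=True exposes every generation, including noncurrent ones.
for blob in client.list_blobs(bucket, versions=True):
    print(blob.name, blob.generation, blob.time_deleted)

# Deleting a noncurrent version requires the object name *and* its generation.
bucket.delete_blob("backups/2024-01-01.tar.enc", generation=1234567890123456)
```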
Just to add, you may want to consider using a transfer job to replicate your data to a separate bucket.
Either way, both approaches (object versioning or replicating buckets) will incur additional storage costs.
I use gcloud node v0.24 for interacting with Google Cloud Storage. I've encountered an issue where an immediate list after an upload doesn't return all the files that were uploaded.
So the question is
does Bucket#getFiles always list files right after Bucket#upload?
or
is there any delay between the upload's callback and when a file becomes available (e.g. can be listed, downloaded)?
Note: the answer below is no longer up to date -- GCS object listing is now strongly consistent.
Google Cloud Storage provides strong global consistency for all read-after-write, read-after-update, and read-after-delete operations, including both data and metadata. As soon as you get a success response from an upload message, you may immediately read the object.
However, object and bucket listing is only eventually consistent. Objects will show up in a list call after you upload them, but not necessarily immediately.
In other words, if you know the name of an object that you have just uploaded, you can immediately download it, but you cannot necessarily discover that object by listing the objects in a bucket immediately.
For more, see https://cloud.google.com/storage/docs/consistency.
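To illustrate the semantics, here is a small sketch in Python with the google-cloud-storage client (rather than the node library from the question); the bucket and object names are placeholders.

```python
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-bucket")                    # placeholder bucket

blob = bucket.blob("reports/latest.json")
blob.upload_from_string('{"ok": true}', content_type="application/json")

# Read-after-write is strongly consistent: this download succeeds immediately.
data = bucket.blob("reports/latest.json").download_as_bytes()

# Listing is strongly consistent today as well; under the old guidance above,
# the freshly uploaded object might not have appeared here right away.
names = [b.name for b in client.list_blobs(bucket, prefix="reports/")]
```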
I would like to synchronize uploads from our own server to our clients' Dropboxes, to which we have full access. Syncing changes on Dropbox is easy because I can use the delta call, but I need a more efficient way to identify changes made locally and upload them to Dropbox.
The Sync API would be amazing for this, but I'm not making a mobile app, so the languages the API ships with are not easily accessible (AFAIK). Is there an equivalent to the Sync API for Python running on a Linux server?
Possible solution:
So far I have been thinking of using anydbm to store string-to-string dictionaries that hold folder names as keys and the hash returned by the metadata call as values. Then I could walk the Dropbox and, every time I reach a folder, compare it against the hash stored in anydbm:
if there is a difference, compare the file dates/sizes in the folder and, if there are any subfolders, recurse into them;
if it is the same, skip the folder.
This should save a substantial amount of time compared to the current verification of each and every file, but if there are better solutions, please do let me know.
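A rough sketch of that idea in Python (dbm is the Python 3 successor of anydbm; get_folder_metadata and sync_files are placeholders for your metadata call and your per-file comparison, not part of any particular SDK):

```python
import dbm

def sync_folder(path, cache, get_folder_metadata, sync_files):
    """Skip unchanged folders by comparing the server-side folder hash
    against the hash remembered from the previous run."""
    meta = get_folder_metadata(path)   # expected shape: {"hash": str, "contents": [...]}
    key = path.encode("utf-8")

    if key in cache and cache[key].decode("utf-8") == meta["hash"]:
        return                         # folder unchanged since last run: skip it

    # Hash differs (or folder is new): compare file dates/sizes, then recurse.
    sync_files(path, meta["contents"])
    for entry in meta["contents"]:
        if entry.get("is_dir"):
            sync_folder(entry["path"], cache, get_folder_metadata, sync_files)

    cache[key] = meta["hash"].encode("utf-8")   # remember the new folder hash

# Usage, with your own implementations of the two callbacks:
# with dbm.open("dropbox_hashes.db", "c") as cache:
#     sync_folder("/", cache, my_metadata_call, my_file_sync)
```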
I have a "Projects" folder which contains dozens of Visual Studio projects. I want to create a backup for them. First I thought I should copy them all to my SkyDrive or Dropbox folder and let them be synced to the cloud whenever there is a change.
The other strategy would be using source control, but I don't want the backup to take place whenever a change is made, and it should be optimized. By that I mean only the changed files, and only the changed parts, should be uploaded to the server to save my bandwidth. I don't have a very good connection (512 Kbps).
Also, my code is very valuable to me, so security is very important.
Is there a way to achieve the automatic backup to the cloud (ideally free) and take advantage of the source control options (such as revisions, etc.)?
I'm sure a lot of people have solutions for this and a lot of people have the same problem so please let the question be answered instead of just clicking "close"!
Use GitHub or BitBucket. You have all the benefits of version control and a cloud storage for your repositories.
You can commit changes as often as you like, and you only need traffic when you push changes to or pull them from the server. The version control systems are smart enough to send only the modified files.
You could even have a team working on a local network, without the need for a cloud solution, and only push to the cloud server periodically, just for backup. To do that, you can create a script that pulls from your local repository and pushes to the server. That script can be run in a scheduler.
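As a sketch of such a script (Python wrapping git via subprocess, so it can be scheduled with cron or Task Scheduler; the clone path and remote names are placeholders):

```python
import subprocess

BACKUP_CLONE = r"C:\Backup\MyApp"    # a clone that has both remotes configured

def run(*args):
    subprocess.run(["git", "-C", BACKUP_CLONE, *args], check=True)

def backup():
    run("pull", "--ff-only", "lan", "master")   # pull latest work from the LAN repository
    run("push", "cloud", "master")              # push to the cloud remote as the backup

if __name__ == "__main__":
    backup()
```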
Apart from the service used to back up your files, I think you should use version control anyway. As a programmer, I don't think you can live without it.
This might be of interest to you.
The idea is that you create just the Source Control repository in Dropbox, and check out an actual copy onto your machine.
You could then commit only the files you've modified (which would trigger the sync), and that would also preserve all of your history for those projects.
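A sketch of that setup, with git driven from Python (the paths are placeholders; plain git on the command line works just as well):

```python
import subprocess

DROPBOX_REPO = r"C:\Users\me\Dropbox\repos\MyApp.git"   # bare repo inside Dropbox
WORKING_COPY = r"C:\Projects\MyApp"                     # normal clone outside Dropbox

# One-time setup: a bare repository that Dropbox will sync, plus a working clone.
subprocess.run(["git", "init", "--bare", DROPBOX_REPO], check=True)
subprocess.run(["git", "clone", DROPBOX_REPO, WORKING_COPY], check=True)

# From then on, every "git push" from the working copy updates the bare repo,
# and that update is what triggers the Dropbox sync mentioned above.
```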
In our project we follow agile practices (sprints), so a nightly build is done every day. We are able to ensure the correctness of the build up to the day before the formal build, but unfortunately, most of the time, people make major check-ins on the final day.
We want to lock some of the highly sensitive elements that would cause the most trouble.
We do not want to lock the integration stream itself; we just want to lock some files and folders automatically. Is there any way to do this using cleartool (or cleartool commands in PowerShell)?
I would not recommend locking the vob or the files:
both options would lock everything (i.e. any modification in any branch) for all (or most) users;
you need (per the cleartool lock man page) to be the type owner, VOB owner, or root to be able to lock the files or a VOB: if one of those sensitive files wasn't created by you, the lock will fail (and the VOB itself has likely been created by an admin);
the maintenance is too cumbersome for files (you need to maintain the list of files you want to lock).
Locking the stream or at least the branch is still your best option.
It is one simple, atomic operation that targets the right environment to lock.
Combined with the -nusers option, you can still authorize some users to do what they need (checkouts/checkins).
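For instance, something along these lines (here driven via Python's subprocess, since the question mentioned scripting from PowerShell; the stream selector and user names are placeholders):

```python
import subprocess

STREAM = r"stream:myproject_integration@\myPVOB"   # placeholder stream selector
ALLOWED = "integrator1,integrator2"                # users still allowed to deliver

# Lock the integration stream for everyone except the listed users.
subprocess.run(["cleartool", "lock", "-nusers", ALLOWED, STREAM], check=True)

# After the formal build, release the lock:
# subprocess.run(["cleartool", "unlock", STREAM], check=True)
```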
The OP comments:
Actually I want to prevent all the users from delivering those sensitive files.
If I lock the stream for a particular user, it will not serve the purpose. It will stop them from delivering other files too.
The -nusers option locks the stream for all users except a few.
The idea behind the integration stream is that it is not the user who makes the deliver, but the stream integration owner who, at his/her own pace, makes the deliver. If that stream is locked for everyone but the integrator, he/she can control the delivers.
However, that puts the control of those sensitive files on the integrator (again, locking just those files would be a bad idea and would make sure that any deliver fails because of those locks).
If you still want them to deliver, while being able to control that the build only uses a certain version of those files, then I would rather recommend:
not locking the stream
putting a baseline before final day
tweaking your build script in order for it to:
use whatever version is found on final day,
except for those "sensitive files", for which the script would fetch their baselined version (and not the LATEST version found on final day, because said LATEST version might have been changed by some final deliver); a sketch of that step follows the links below.
See for instance "Clearcase command to export an element" or
"In ClearCase, how can I view old version of a file in a static view, from the command line?".