saving "mocked" mongodb collections in version control - mongodb

I am developing on two machines (switching between PC and laptop during the week) and am trying to find the best way to synchronize a limited amount of dummy data between the MongoDB instances running on each machine.
My first quick & dirty idea was to have Git track the relevant files containing the collection data, which live in a db folder inside my project folder.
I saw, for example, files named collection-0-1858325258041703863.wt which seem to contain what I am after.
However, I am not sure about the rest, e.g. the index-1-1858325258041703863.wt files, sizeStorer.wt, WiredTiger.wt etc.
Which files do I have to track, and which ones can I ignore?
I am aware that I could create a mock server with only JSON files, and that saving binaries in version control is bad, but I am still curious whether this approach is possible (mainly because working with MongoDB Compass is much easier than editing JSON files).
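If tracking the raw WiredTiger files proves fragile (the metadata files such as WiredTiger.wt, _mdb_catalog.wt and sizeStorer.wt reference the collection and index files, so they really only make sense as a complete, consistent set copied while mongod is stopped), one alternative is to dump the data and track the dump instead. A minimal sketch of that approach, assuming a database called dummydb and a tracked db-dump folder (both names are made up):

    # on machine A: export the dummy database and commit the dump
    mongodump --db dummydb --out db-dump
    git add db-dump && git commit -m "update dummy data"

    # on machine B: pull and restore, replacing the local copy
    git pull
    mongorestore --drop --db dummydb db-dump/dummydb

The dump is still binary (one .bson file plus a .metadata.json per collection), but it can be edited through Compass after restoring.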

Related

Simple yet fast deployment tool or solution for FTP-based server

I have a simple task for which some simple solution should exist, yet I cannot come across one.
I have a huge file tree on computer A (development). I have multiple copies of the same file tree on computer B (let's call it production). Computer B runs FTP and PHP, not much else.
I need to move the changed files from the tree on A to the tree on B as efficiently as possible, i.e. if just one file changes, only that one file gets transferred. It would be enough to "compare" the local and remote trees using last modification dates; nothing else is needed.
I tried to use good old Ant for it, but that really does not work, as its FTP task is quite poor (it does not preserve modification dates on PUT, and so on). What other options are there if I do not want to write the code for such a task myself? I'd expect there is some tool out there that would make a remote dir listing, download it to the local computer, select only the changed files and transfer them to the destination. Do you know how I could do it? Some sort of FTP- or PHP-based distributed robocopy?
EDIT: I should have added that I mean doing this on a Windows 10 computer, syncing to some FTP/PHP server with an automated command-line script, not a GUI.
Actually, I solved the issue using WinSCP. I managed to integrate it into Ant by calling it through an exec task and using WinSCP's synchronize command. For my current folder size it is fast enough; let's see later. The FTP task in Ant was not useful since it does not preserve modification dates.
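For reference, a WinSCP script along those lines can be as small as the following (host, credentials and paths are placeholders, not the ones from my setup); Ant then just runs winscp.com /script=sync.txt through exec:

    # sync.txt - WinSCP script
    open ftp://user:password@example.com/
    synchronize remote C:\dev\site /www/site
    exit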

symlink or an alternative solution on GCS

I am currently using Google Cloud Storage to store some files. From the PHP side, I am able to use these files just fine. Now I want to extend this functionality to store 4 known-good versions of these files, so that I can change the file path through a symlink (or some alternative if that's not an option) on the PHP side in case the latest set of files gets corrupted. I am wondering how to go about this.
I appreciate any suggestions that you might have.
Cloud Storage offers object versioning as a feature that you need to enable. Versioning allows you to save a file under the same name while the system archives the previous version and serves the new one. In this case, if there were a corruption, you would have to go into Cloud Shell and retrieve the previous copy.
If you do not wish to go that route, I can suggest saving 4 copies with distinct names (i.e. fileName[number]). This way, you would take the newest file, retrieve the substring containing the number, and create your new file based on that substring.
With both methods, you are able to roll back to a previous version.
Cloud Storage does not allow for symlinks.
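For what it's worth, the versioning route looks roughly like this with gsutil (the bucket and object names are made up, and the #generation suffix comes from the ls -a output):

    # enable object versioning on the bucket
    gsutil versioning set on gs://my-bucket
    # list every generation of an object, including archived ones
    gsutil ls -a gs://my-bucket/data/config.json
    # roll back by copying an older generation over the live object
    gsutil cp gs://my-bucket/data/config.json#1556835022396578 gs://my-bucket/data/config.json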

Simple and easy to use tool for managing different versions of files

I want to manage different sets of file versions locally on a machine without using complex version control tools like TFS/Git/SVN, etc. Here is my use case:
I have a Windows virtual machine that contains many XML, XSLT, XSL, TXT, etc. files; the virtual machine gets updated with every release of my product.
Often I need to analyze errors in this virtual machine, so I change many files, run the product and start analyzing; let us call these file changes FileChangeSet1.
Based on the results above, I need to change other files, and maybe some of the files in FileChangeSet1, and do another test.
Again, based on the results, I need to change more files; eventually I end up with FileChangeSet1, FileChangeSet2, ..., FileChangeSet(n).
I want to:
be able to switch between these file change sets easily and quickly, e.g. have a GUI that shows my tree of FileChangeSets, so that I can click one of them and all files of that change set are used.
create file change sets from other file change sets, e.g. copy FileChangeSet1 into FileChangeSet2 and change only one file in set 2.
I don't want to configure and install a complex version/source control system like TFS/Git/SVN where I have to create a database of all my files first.
Making snapshots of the virtual machine is not an option because it is extremely slow.
I think you would not gain much from version control tools anyway, because they are made to version text files. For binary files, I think you would end up managing several different copies of the files in any case (at least with older tools such as CVS and SVN).
If you are running on Linux, you may want to use the cmp/diff tools. Take a look at incremental diffs and diff tools such as patchutils.
Consider also creating checksums of huge files to avoid comparing them for nothing.
PS: also take a look at this - http://jojodiff.sourceforge.net/ - I haven't tried it, but it seems simple to use and promising.
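As a rough illustration of the diff/patch route (directory names are made up):

    # capture FileChangeSet1 as a patch against the pristine tree
    diff -ruN baseline/ changed/ > FileChangeSet1.patch
    # re-apply that change set onto a fresh copy of the baseline
    cp -r baseline/ test-run/
    patch -d test-run/ -p1 < FileChangeSet1.patch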
Mercurial is the right tool for me. With it I can solve my business case easily as follows:
Install Mercurial on Windows; it integrates into the Windows file explorer.
Create a local Mercurial version control database by right-clicking my root folder.
Now I can open all the files under my root folder in different text editors, e.g. Notepad++, and modify them.
When I want to save/remember a specific state, I simply commit the files to Mercurial by right-clicking the root folder; I can provide a commit note.
Later I can change my files in a different way and test how my system reacts to them; again, I can commit these files locally.
Over time I have a history of change sets in Mercurial; I can go back to any change set, branch it, merge it, etc.
I have a huge and complex system that contains thousands of files; my root folder is actually the C:\ drive, and I can easily and quickly turn C:\ into a version control database using Mercurial.
All with a simple and intuitive GUI, no command line learning needed.
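For completeness, the same workflow roughly maps to these commands for anyone who does prefer the command line (revision numbers and branch names are only examples):

    hg init                            # turn the root folder into a repository
    hg commit -A -m "FileChangeSet1"   # record the current state of all files
    hg update -r 0                     # switch the working copy back to an earlier change set
    hg branch experiment-2             # start a new line of changes from here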

Hybrid version control & sync system?

Is anyone aware of a hybrid version control and synchronising system?
I'm currently a happy mercurial user, but my projects usually contain a mixture of files.
Most of these (code, documentation, ...) I want to be version-controlled. This is why I use mercurial.
However, on rare occasions I have files that I would like to synchronise between my working copies, but not version control.
For example, I version control the code I write to do image processing. This code can produce a whole bunch of output images which I'd like to have synchronised so I don't have to remember to shuffle them around my various computers, but there's no point having these version controlled.
To clarify - I am aware of extensions to Mercurial such as bfiles and bigfiles, which are handy for my image example, but I was just wondering if anyone out there knows of alternative ways to handle this. I just want one system that I can tell: "version control all files except those ones, which should be synced but have no history".
cheers!
EDIT: I could do something like adding an hg marksync <filename> command that adds <filename> to a list of files to be synced, and then adding a hook to hg push/hg pull that would (say) run rsync (or whichever sync tool) in the background, but I wondered if there was a less hacky solution (I think bfiles/bigfiles do something along these lines anyway).
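For what it's worth, the hook half of that idea could be wired up in the repository's .hg/hgrc roughly like this; the directory and host names are made up, and post-push/post-pull are generic hooks that run after the corresponding commands succeed:

    [hooks]
    # mirror the untracked output directory after every successful push/pull
    post-push = rsync -av --delete output-images/ otherhost:project/output-images/
    post-pull = rsync -av otherhost:project/output-images/ output-images/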
A version control system (any of them) doesn't care about synchronizing non-versioned data outside its default paths. If you want to sync arbitrary files, use tools specially designed for this task, e.g. rsync.
"This code can produce a whole bunch of output images which I'd like to have synchronised" - is this DATA or part of your CODE?
If data: keep it out of your versioning system, just don't go there. If it is part of your code (like layout images), check it in. Those are the only generally accepted ways.
A nice solution for the data would be syncing OR generating it. So you might add a step after deployment to a server: GenerateImages().
Edit: in addition to the comment made by the thread starter:
If the images are data and you need to process them on a different system, don't think about the version control for your code; it is unrelated. The steps that would make sense to me, in order of processing:
Start by updating your image code and checking it into version control. Then deploy (yes, this is deployment) the updated code to the cruncher computer. Now the code is done.
Then you have tasks which the number cruncher should handle, like processing the images. So start that processing either from the cruncher itself (probably some queue is involved there) or from a central dispatcher.
Then you have the results locally at the cruncher. Now something has to happen with that data, so that is also part of your software. Decide whether you want the cruncher to send the results to some central storage, your workstation or another location, and let the software handle that. This is the hardest part as I read your question. Many solutions are possible, from plain FTP/network transfers to dedicated storage solutions. Willing to help, but I need more info about the real issues, amounts, sizes etc. for these parts.
If a new version of the image processor makes the old generated images obsolete, implement that in your code as well, for example by attaching an attribute to the generated files, using a separate folder or some other indication. That way you could ask the cruncher, after an update, to re-generate any obsolete files.
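As a sketch of that last point (everything here is hypothetical: the .generated-by marker file, the regenerate_images helper and using the Mercurial revision as the version stamp):

    # on the cruncher: regenerate output folders whose marker doesn't match the deployed code
    CODE_VERSION=$(hg id -i)                       # revision of the deployed image code
    for dir in output/*/; do
        if [ "$(cat "$dir/.generated-by" 2>/dev/null)" != "$CODE_VERSION" ]; then
            regenerate_images "$dir"               # hypothetical regeneration step
            echo "$CODE_VERSION" > "$dir/.generated-by"
        fi
    done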

What to do with XML Files generated by my Application

I am making an application that persists several different user settings. The way I have done it is just to serialize my collections (with the settings in them) to XML files.
As they are changed I update the saved file so that when the user runs again, the settings are saved.
As I get going with this style of persistence, I am finding that I have a lot of XML files.
Is this normal? Is it OK to litter my install directory with configuration XML files?
Is there a way to hide these files? Maybe a trick to save them as a resource under one file name?
This is not a really urgent issue. It does not really bug me to have the XML files there, but I thought I would ask.
I am using C# and VS 2008.
Can you not at least put them in their own folder? "/App_Data", for example? Beyond that... if you are getting a lot of files, what are the chances of being able to switch to a database? (SQLite or something along those lines.)