Should I keep my site media in my website's repository?

I have a simple blog application written in Python, using Django. I use Git to version-control this website. The main content of the site is a blog. The blog entries are stored in a SQLite database (which is not version-controlled, but is backed up regularly); some entries contain images and other media (like PDFs).
I currently store this "blog media" in the repository, alongside other media (such as external JavaScript code, and images used for layout purposes -- all nicely organized, of course). It occurred to me, however, that this isn't really a good strategy, for a few reasons:
Whenever I post a new blog entry that contains an image or a link to a PDF, I have to add the image to the repo and then copy a new version to the production server -- which seems like a lot of work just to add an image. It'd be easier just to upload the image to the server (and make a local backup, of course).
Since this media is content rather than code, it doesn't seem necessary to store it alongside the code (and related style media) itself.
The repo contains a lot of binary files, which increase the overall size of the repo; and more importantly,
I never really edit these images, so why keep them under version-control?
So I'm considering removing these files from the repo, and just copying them to a directory on the server outside of the directory containing the Python code, templates, style sheets, etc., for the website.
However, I wondered: Is there a "best practice" for dealing with content images and other media in a website's repo, as opposed to images, etc., that are actually used as part of the site's layout and functionality?
Edit
To elaborate: I see a difference between keeping the code for the website in the repo and keeping the content of the site there as well. I feel that the content should perhaps be stored separately from the code that actually provides the site's functionality, especially since the content may change more frequently, and I don't see a need to create new commits for "stuff" that isn't necessary for the functioning of the site itself.

Keep them in version control. If they never change, you don't pay a penalty for it. If they do change, well, then it turns out you needed the version control after all.

Initially, I would say don't put them in the repo because they'll never change, but then consider the situation of moving your website to a different server or hosting provider. You'd need an easy way to deploy it, and if it's not under version control, that's a lot of copy/paste that could go wrong. At least it's all in one place if/when something happens.
This isn't really an answer as much as it's something to consider.

Version them. Why not? I version the PSDs and everything. But if that makes you wince, I can understand. You should version the JavaScript and stylesheets, though; that stuff is code (of sorts).
Now, if by content you mean "the image I uploaded for a blog post" or "a PDF file I'm using in a comment", then I'd say no: don't version it. That kind of content is accounted for in the database or somewhere else. But the logo image, the sprites, and the stuff that makes up the look and feel of the site should absolutely be versioned.
I'll give you one more touchy-feely reason if you aren't convinced. Some day you'll wish you could go into your history and see what your site looked like 5 years ago. If you versioned your look & feel stuff, you'll be able to do it.

You are completely correct on two points.
You are using Version Control for your code.
You are backing up your live content database.
You have come to the correct conclusion that the "content images" are just that and have no business in your code's Version Control.
Back up your content images along with your database. You do not want to blur the lines between the two unless you want your "code" to be just your own blog site.
What if you wanted to start a completely different blog, or your friends all wanted one? You wouldn't give them a copy of your database with all your content, nor would a copy of all your content images be of any use to them.
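A nightly job can cover both in one pass. A minimal sketch, assuming a SQLite database and rsync; the paths and backup host are made up:

    #!/bin/sh
    # Back up the content database and the content media together.
    DAY=$(date +%F)
    sqlite3 /srv/blog/db.sqlite3 ".backup '/srv/backups/blog-${DAY}.sqlite3'"
    rsync -a /srv/blog/media/ backup@backuphost:/srv/backups/blog-media/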

Most version control systems don't work well with binary files. That being said, if the files aren't changing, it makes little difference.
You just have to decide which is easier: keeping them in the repository along with the multi-step process to add an image/PDF/whatever, or maintaining a separate set of actions for them (including backup). Personally, I'd keep them under version control. If you're not changing them, it's not harming anything. Why worry about something that isn't causing harm?

I think you need to ask yourself why you are using version control and why you are making backups. Probably because you want to safeguard yourself against loss or damage to your files, so that if something terrible happens you can fall back on your backups.
If you use version control and a separate backup system, you get into a distribution problem, because the latest version of your site lives in different places. If something does go wrong, how much effort will it take to restore things? To me, a distributed setup with separate version control and backups means a lot of manual work that isn't easily scriptable. Worse, when something does go wrong you're probably already stressed out anyway; making the restoration process harder will not help you.
The way I see it, putting your static files in version control doesn't do any harm. You have to put them somewhere anyway, be it in a version control repository or a normal file system. Since your static files never change, they're not taking up more space over time, so what's the problem? I recommend you place all of it under version control and make it easy on yourself. Personally, I would back up my database at regular intervals and commit that backup to version control as well. This way you have everything in one place, and in case of disaster you can easily do a new checkout/export to restore your site.
I've built a website this way: it has over a gigabyte of PDF files, and everything is stored under version control. If the server dies, all I have to do is a clean export and re-import the database, and the site is up and running again.
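That routine could look something like the following, sketched with Git and SQLite; the paths are illustrative, and the same idea works with SVN:

    #!/bin/sh
    # Dump the database into the repo and commit it, e.g. from a daily cron job.
    cd /srv/site || exit 1
    sqlite3 db.sqlite3 ".backup 'backups/db-latest.sqlite3'"
    git add backups/db-latest.sqlite3
    git commit -m "database backup $(date -u +%F)" || true  # no-op if unchanged
    git push origin master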

If you are working on a web project, I would recommend creating a virtual directory for your media. For example, we set up virtual directories in IIS in our local working copies for /images/, /assets/, etc., which point to the development/staging server that the customer has access to.
This speeds up source control (especially with something clunky like Visual SourceSafe), and if the customer changes something during testing, it is automatically reflected in our local working copies.


Version control personally and simply?

Requirements:
Keep a history of web text/code source files.
I am the only user, i.e. this is for personal use.
Automatically save a history of each updated file (not necessarily immediately, but at least once per week).
It must be simple to start with and to work with.
I work from 3 places, so I need a way to sync files between them.
(Not a must, but hopefully useful for a future working environment:) a non-engineer should also be able to find the history files and view them easily.
Current way:
I make a history folder for the day, download files into it for editing, and copy files whenever I edit or create a new one.
Advantage of the current way:
Very quick and simple; no additional task is needed to create the history.
Disadvantage of the current way:
Messy. Each day I work, I create a new history folder to hold the downloaded files, which clutters up Finder (or Windows Explorer).
Also, I don't have a reliable way to sync files between my other work places.
I tried Git before. I had thought Git would automatically save the files I edit and save in an editor, but that was not the case. Also, Git is too complicated to start using. If you recommend Git, please show me ways to deal with the problems I had, for instance a simple Git GUI with limited options (no merging/projects/branches, etc.), since this is personal use for maintaining just one website.
Do you know any way to do version control personally and simply?
Thanks.
Suppose you entered <form ...> in your HTML, without the closing tag, and saved the file; do you really think the commit created by our imaginary VCS picking up that file's update event would make any sense?
What I mean is that, as with writing programs¹, the history of source code changes is there for humans to read, and a good history graph should really read like prose: each commit should be atomic, in the sense that it comprises one small but internally integral feature or fixes one bug, and it should be properly annotated so that the intent of the change captured by that commit is clear.
What you want instead is just some dumb stream of changes purely for backup purposes. Well, if you're fully aware of the repercussions (the most glaring one is that the generated history is completely useless for doing development on the project and can only be used for rollbacks in case of "oopsies"), there are two ways to go:
Some IDEs (namely, Eclipse) save a backup copy of each file they manage on every save, thus providing you with such rollback functionality without using any VCS.
Script around any VCS you like: say, on Linux, you could start something like inotifywait, telling it to watch your project's root directory, recursively, for write events on files; read whatever the tool prints to its stdout when those events happen; and, for each event, call your VCS of choice to record a new commit with these changes, roughly as sketched below.
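A minimal sketch of that second approach, assuming inotify-tools is installed and the project is already a Git repository (run git init once beforehand):

    #!/bin/sh
    # Watch the project tree and record a commit for every file written.
    cd /path/to/project || exit 1
    inotifywait --monitor --recursive --event close_write \
        --exclude '/\.git/' --format '%w%f' . |
    while read -r changed_file; do
        git add --all
        git commit --quiet \
            --message "auto-save: ${changed_file} at $(date -u +%FT%TZ)" || true
    done

As said above, the resulting history is a dumb stream of snapshots, useful only for rollbacks.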
¹ «Programs must be written for people to read, and only incidentally for machines to execute.» — Abelson & Sussman, "Structure and Interpretation of Computer Programs", preface to the first edition.
I strongly suggest you have a deeper look at Git.
It may look difficult at the beginning, but you just need to spend some time learning it. All the problems above can be solved easily once you learn the basics. There is also a nice tutorial on GitHub for learning Git, no need to install anything: https://try.github.io/levels/1/challenges/1.
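For the basics, the whole personal workflow fits in a handful of commands. A sketch, assuming a private remote repository already exists (the URL is hypothetical):

    # One-time setup in your website folder:
    git init
    git remote add origin git@github.com:you/my-website.git
    # At the end of each working session:
    git add --all
    git commit -m "work from $(hostname) on $(date +%F)"
    git push origin master
    # First time on another of your 3 work places:
    git clone git@github.com:you/my-website.git
    # Every visit after that, just pick up the latest work:
    git pull origin master

No merging or branching is needed for single-person use; commit, push, and pull cover everything.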

ModX: how to update database without overriding content

I am working on a ModX website (mainly templates, but also system settings, user management, etc.) while a development site is already online and the customer has started to input content.
I haven't found a practical solution for pushing my work online (layouts are stored in the database) without overwriting the content input by the customer (which is in the database as well).
My present workflow consists of first replacing my local modx_site_content table with the one extracted from the online database, then pushing this hybrid database online. Not practical, plus I am not sure user changes are confined to modx_site_content only.
There must be a better workflow though! How do you handle this?
I don't think it gets any easier than selecting the tables you need and exporting only those into the live environment. Assuming you only work on templating, the template, snippet & chunk tables are all you need to export.
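For example, with mysqldump (a hedged sketch; the database names are made up, and the exact table names depend on your MODX version and table prefix):

    # Export only the templating tables from the development database...
    mysqldump -u devuser -p dev_modx \
        modx_site_templates modx_site_snippets modx_site_htmlsnippets \
        > templating.sql
    # ...and load just those tables into the live database.
    mysql -u liveuser -p live_modx < templating.sql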
We usually take a copy, develop, and merge only once, when the new features are supposed to go live; this minimizes the trouble. Also, the client can continue working normally until D-day.
If you're doing a lot of snippet work, you could always just include an actual PHP file instead, work with your editor directly on those files, and connect them to Git and whatnot.
If your project is not really big, you can store your chunks, resources, etc. in separate files (there is an option called "Static Resource") and then manage your changes with Git. Otherwise, you need to store user data in a separate table and deploy the whole database with Fabric, for example.

Website may have up to 10 files with same name/purpose, no version control

I am new at working for a large company with various people working on the same files. Sadly, we don't have version control, and I often find myself cross-eyed. For lack of better terminology, we have a dev site, a quality-assurance site, and the live site. We have most files in two languages. Since the network-connected drives have an average transfer rate of 15 KB/sec, we often copy the files locally before working on them. Also, contractors send us new versions of files, but we may have made changes on our side, and everything gets screwed up.
Basically, I'm working with 6-10 files with the same name and the same purpose. Does anyone have any tips on how I can keep them straight? I use Beyond Compare 2 to see the differences, but a program that compares all the files' timestamps to see which is most current might help.
Thoughts:
1) Get a version control system (Git); otherwise you will continue to have more and more pain.
2) Create an includes/lib folder and reduce those 6-10 files down (to 1).
I suggest you take the lead: put your code in version control and push your team to move to the new repository. It'll make everybody's life easier and, most importantly, reduce the chances of merge errors.
Assuming you cannot convince the powers that be to actually use source code control, why not try using Mercurial purely locally? Hopefully you can insulate yourself from some of the noise. You could even make fake users for the contractors and commit those changes as though they were actually doing it.
It shouldn't be too hard to get a bureaucrat to see how nice a good gatekeeper like Mercurial or Git would be. It's kind of like helpful red tape!
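A sketch of what that local-only setup could look like (the contractor name and messages are made up):

    # One-time: turn your local working copy into a Mercurial repository.
    cd ~/work/site
    hg init
    hg commit -A -m "baseline of the current live files"
    # After copying a contractor's delivery over your working copy:
    hg commit -A -u "Contractor <contractor@example.com>" -m "contractor delivery"
    # See exactly what their delivery changed:
    hg diff --change tip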

How should I start with tracking file changes/versions?

I've been working with a lot of my files on the go recently, and in the process I've often accumulated several copies of files in different stages of completion/revision. I'm working on any number of projects at a given time, so it's not always easy to remember or figure out quickly which version I should continue working on.
What options would you recommend that let me track changes locally and, if possible, with files I work on at a remote location? I've never worked with file versioning or tracking systems, so I'm not sure what direction I should be looking in. I work mostly with HTML, CSS, and PHP.
Any help is awesomely appreciated! Thanks.
P.S. I don't know if this should be a separate question, but what options are available for the same type of thing, change tracking/logging, for files on a server? Preferably something that not only vaguely notes that a file has changed, but tracks the specific changes that have occurred in the files.
It seems to me that GitHub is a perfect choice for your requirements. You can create a repository to maintain the history; it's easy to use, and it is free:
https://github.com/
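Even plain Git, before involving GitHub, answers the P.S. as well, since it records the specific line-level changes. A small sketch (the file names are made up):

    cd ~/projects/my-site
    git init
    git add .
    git commit -m "initial snapshot"
    # ...edit index.php, style.css, etc., then record the changes:
    git add -u
    git commit -m "rework header layout"
    # Inspect the specific changes made to one file over time:
    git log -p -- style.css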

What is the best solution for maintaining backup and revision control on live websites?

As part of my job I work with several live websites. We need an efficient means of maintaining backups of the live folders over time. Additionally, updating these sites can be a pain, especially if a change happens to break in the live environment for whatever reason.
What would be ideal would be hassle-free source control. I implemented SVN for a while which was great as a semi-solution for backup as well as revision control (easy reversion of temporary or breaking changes) etc.
Unfortunately, SVN places hidden .svn directories everywhere, which cause problems, especially when other developers make folder structure changes or copy/move website directories. I've heard the argument that this is a matter of education, etc., but the approach taken by SVN is simply not a practical solution for us.
I am thinking that maybe an incremental backup solution may be better.
Other possibilities include:
SVK, which is command-line only, which is a problem for us. Besides, I am unsure how appropriate it would be.
Mercurial, perhaps with some triggers to hide the distributed component, which is not required in this case and would be unnecessarily complicated for other developers.
I experimented briefly with Mercurial, but couldn't find a nice way to keep the repository separate and constantly in sync with the live folder working copy. Maybe as a source control solution (making the repository and the live folder the same place), combined with another backup solution, this could be the way to go.
One downside of Mercurial is that it doesn't place empty folders under source control, which is problematic for websites, which often have empty folders as placeholder locations for file uploads, etc.
Rsync, which I haven't really investigated.
I'd really appreciate your advice on the best way to maintain backups of live websites, ideally with an easy means of retrieving past versions quickly.
Replies to answers:
#Kibbee:
It's not so much about education as no familiarity with anything but VSS, and a lack of time/effort to learn anything else.
The xcopy/7-zip approach sounds reasonable, I guess, but wouldn't it quickly take up a lot of room?
As far as source control goes, I'd like the source control to just say "this is the state of the folder now; I'll deal with that, and if I can't match stuff up, that's your fault, I'll just start new histories" rather than fail hard.
#Steve M:
Yeah that's a nicer way of doing it but would require a significant cultural change. Having said that I very much like this approach.
#mk:
Nice, I didn't think about using rsync to deploy. Does this only upload the differences? Overwriting the entire live directory every time we make a change would be problematic due to site downtime.
I am still curious to see if there are any more traditional options.
You can still use SVN, but instead of doing a checkout on your live environment, do an export, that way no .svn directories will be created. The downside, of course, is that no code changes on your live environment can take place. This is a good thing.
As a general rule, code changes on production systems should never be allowed. The change should be made and tested in a development/test/UAT environment, then once confirmed as OK, you can tag that code in SVN with something like RELEASE-x-x-x. Then, on the live system, export the code with that tag.
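The difference in one line each (the repository URL and path are illustrative):

    # checkout leaves hidden .svn administrative directories in every folder:
    svn checkout http://svnserver/repo/tags/RELEASE-1-0-0 /var/www/site
    # export produces clean files only (--force allows exporting into an existing directory):
    svn export --force http://svnserver/repo/tags/RELEASE-1-0-0 /var/www/site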
We use option 3. Rsync. I wrote a bash script to do this along with some extra checking, but here are the basics of what it does.
Make a tag for pushing to live.
Run svn export on that tag.
rsync to live.
So far it has been working out. We don't have to worry about user conflicts or have a separate user for running svn up on the production machine.
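The core of such a script might look like this (hosts and paths are made up; the real script would add error checking):

    #!/bin/sh
    TAG="RELEASE-$(date +%Y%m%d)"
    # 1. Make a tag for pushing to live.
    svn copy http://svnserver/repo/trunk "http://svnserver/repo/tags/${TAG}" \
        -m "tag ${TAG} for deployment"
    # 2. Run svn export on that tag.
    svn export "http://svnserver/repo/tags/${TAG}" "/tmp/${TAG}"
    # 3. rsync to live: only differences are transferred, and --delete
    #    removes files that are gone from the tag.
    rsync -az --delete "/tmp/${TAG}/" deploy@livehost:/var/www/site/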
Any source control solution you pick is going to have problems if people are moving, deleting, or adding files and not telling the source control system about it. I'm not aware of any source control system that could solve this problem.
In the case where you just can't educate the people working on the project [1], you may just have to go with daily snapshots. Something as simple as a batch file using xcopy to a network drive, possibly with 7-Zip on the command line to compress it so it doesn't take up too much space, would probably be the simplest solution.
[1] I would highly disbelieve this; it's probably just a case of people being too stubborn and not willing to learn or do "extra work". Never mind how much time source control could save them when they have to go back to previous versions, or when two people have edited the same file.
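The same daily-snapshot idea, sketched with tar on a Unix-like system instead of xcopy/7-zip (the paths are made up):

    #!/bin/sh
    # Compress today's copy of the site onto a mounted network drive.
    DAY=$(date +%Y-%m-%d)
    tar -czf "/mnt/backup/site-${DAY}.tar.gz" -C /var/www site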
rsync will only upload the differences. I haven't personally used it, but Mark Pilgrim wrote a long time ago about how it even handles binary diffs brilliantly.
svn+rsync sounds like a fantastic solution. I'll have to try that in the future.