I am looking into improving the backup process that a group of animators uses. Currently they manually back up their work onto external hard drives or DVDs, taking full copies of everything. The data consists of thousands of high-resolution images, project files from various video-editing applications, and sound files. Basically everything is binary data, and nothing should ever be merged on check-in.
Should I investigate version control systems that I would use as a software developer (Subversion, Git, etc.), or is there a class of version control systems intended for non-software data that would suit these needs better?
You could also check out AlienBrain. It's a project asset management system designed for artists.
If your scope is just "backup" then I'd say stick to backup solutions.
But if you are thinking about the whole lifecycle of the animator's work, then the type of use typically falls into the "Digital Asset Management" category for the very reasons you mention: huge data volumes; binary formats.
Since version control (SCM) software is usually designed for text files that can be diffed and merged, such tools tend not to do well with binary formats in high volume. While your average web graphics won't be an issue for (software) version control tools, you mention video, which puts you in another league.
The bad news (maybe - depends on your business) is that DAM is dominated by the big end of town. Atmospherian has mentioned AlienBrain, which is a good representative of the niche offerings for artists. At the other end of the spectrum you have more general-purpose offerings like Oracle's UCM (formerly Stellent). Make sure you check the price tags, though.
There must be open source or lower cost alternatives available - but I don't know them, sorry.
What does seem to be very common are custom in-house solutions. Unlike managing code, where changes to the files themselves have their own significance, managing digital assets tends to focus on the metadata (the image/video is just an associated blob). And since many shops have their own particular production workflow, the territory is ripe for some skunkworks programming (if that's your bent - go for it!).
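If you do go the skunkworks route, the core of such a tool can be surprisingly small: the asset is stored as an opaque blob, and the real work is in the metadata record. Here is a minimal sketch in Python; the paths and metadata fields are invented for illustration, not taken from any product.

```python
# Minimal skunkworks asset catalog: the file itself is an opaque blob;
# what we actually manage is the metadata record alongside it.
# All paths and fields below are hypothetical examples.
import hashlib
import json
import shutil
import time
from pathlib import Path

STORE = Path("/mnt/assets/store")      # content-addressed blob storage (assumed shared disk)
CATALOG = Path("/mnt/assets/catalog")  # one JSON metadata record per registered version

def register_asset(src: str, project: str, artist: str, notes: str = "") -> str:
    """Copy a file into the blob store and record its metadata; return the content id."""
    data = Path(src).read_bytes()
    digest = hashlib.sha1(data).hexdigest()
    blob = STORE / digest[:2] / digest          # fan out by the first two hex chars
    blob.parent.mkdir(parents=True, exist_ok=True)
    if not blob.exists():                       # identical content is stored only once
        shutil.copy2(src, blob)
    CATALOG.mkdir(parents=True, exist_ok=True)
    record = {
        "sha1": digest,
        "original_name": Path(src).name,
        "project": project,
        "artist": artist,
        "notes": notes,
        "registered_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
    }
    (CATALOG / (digest + ".json")).write_text(json.dumps(record, indent=2))
    return digest
```

From there, search, reporting, and workflow are just operations over the JSON records, which is exactly why these in-house tools stay manageable.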
So while I'm not recommending any particular products, I suggest that if you think "digital asset management" rather than "version control" when scouting for solutions, you will probably find answers better suited to your needs.
Your question is a little unclear - you seem to have conflated version control and backup.
If what you want is version control, then take a look at the list on Wikipedia: Comparison of revision control software. That shows most of the widely known version control systems and their basic features. You're looking for something you can set up to force users to check out before they edit. Be aware that commercial solutions range in price from moderately expensive up to 'You want HOW much?'
If what you want is backup software, then I'd start at List of backup software on Wikipedia. There are a lot more choices in the backup software arena, and there are a lot of price points.
Either way, figure in the creation of an admin position (either as part of someone's job or a new person altogether, if you're big enough). I've worked with backup and version control systems that didn't have an admin, and it's a problem. Either no one takes care of problems, or everyone gets their fingers in there and really screws things up. Making it part of someone's job (officially) is the best way to limit damage.
I think ClearCase would work for you. The reason being that everything is a VOB (versioned object), no matter what it is! Check it out.
From your description, it sounds like you would do pretty well with some basic backup software such as Retrospect. Using daily backups of workstations, only changed data would be backed up and it would be easy to roll back to an earlier version of a file if needed.
What you don't get from such a setup is the ability to check out / check in files and get warnings about conflicts.
Vidyatel has editing software that can compare video content and find the difference between video versions based on the video alone.
The result is output as an EDL/TC.
It might help.
You should take a look at boar. It is exactly what you want, "version control and backup for photos, videos and other binary files". It is version control designed for large binary files.
I am working on a website of 3,000+ pages that is updated on a daily basis. It's already built on an open-source CMS. However, we cannot simply continue to apply hot fixes on a regular basis. We need to replace the entire system, and I anticipate having to replace it again every 1-2 years. We don't have the staff to work on a replacement system while the current one is being maintained, as that results in duplicate effort. We also cannot have a "code freeze" while we work on the new site.
So, this amounts to changing the tire while driving. Or fixing the wings while flying. Or all sorts of analogies.
This brings me to a concept called "continuous migration." I read this article here: https://www.acquia.com/blog/dont-wait-migrate-drupal-continuous-migration
The writer's suggestion is to use a CDN like Fastly. The idea is that a CDN allows you to switch between a legacy system and a new system on a URL basis. In theory, this sounds like an approach that would work. The article claims that you can do this with Varnish, but that Fastly makes the job easier. I don't work much with Varnish, so I can't really verify those claims.
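If I understand the article correctly, the core mechanism is just path-based routing in front of the two systems. Here is a toy sketch of the idea in Python; the hostnames, ports, and path prefixes are all made up:

```python
# Toy illustration of URL-based routing between a legacy and a new backend -
# the same idea Varnish or Fastly implement at the edge. A real setup would
# also forward request/response headers and handle backend errors; this
# only moves the body.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

LEGACY = "http://legacy.internal:8080"    # assumed old CMS
NEW = "http://new.internal:8081"          # assumed replacement system
MIGRATED_PREFIXES = ("/blog", "/events")  # sections already rebuilt on the new system

class Router(BaseHTTPRequestHandler):
    def do_GET(self):
        backend = NEW if self.path.startswith(MIGRATED_PREFIXES) else LEGACY
        with urlopen(backend + self.path) as upstream:
            status = upstream.status
            body = upstream.read()
        self.send_response(status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8000), Router).serve_forever()
```

As sections are migrated, you add their prefixes to the routing rule, which is essentially what the Varnish/Fastly configuration would express.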
I also don't know if this is a good idea or if there are better alternatives. I looked at Fastly's pricing scheme, and I simply cannot translate it into a specific price point; these cryptic cloud-service pricing plans don't make sense to me. I don't know what kind of bandwidth the website uses, since another agency manages the website's servers.
Can someone help me understand whether using an online CDN would be better than using something like Varnish? Are there free or cheaper solutions? Can someone tell me what this amounts to, approximately, on a monthly or annual basis? Are there other, better ways to roll out a new website on a phased basis for a large site?
Thanks!
I don't think I have exact answers to your question, but maybe my answer helps a little bit.
I don't think the CDN itself gives you the advantage; the advantage is simply that you have more than one system.
Changes to the code
In professional environments I'm used to having three different CMS installations. The first is the development system, usually on my PC. That system is used to develop extensions, fix bugs, and so on, supported by unit tests. The code is committed to a revision control system (like SVN, CVS, or Git). A continuous integration system checks the commits to the RCS. When a feature is implemented (or some bugs are fixed), a named tag is created. This tagged version is then installed on a test system where developers, customers, and users can test the implementation. After a successful test, exactly this tagged version is installed on the production system.
At first sight this looks time-consuming. But it isn't, because most of the steps can be automated. And the biggest advantage is that the customer can test the change on a test system, which makes it very unlikely that an error occurs only on your production system. (A precondition is that your systems are built on a similar/equal environment.)
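To give an idea of how little glue the automation needs, here is a rough sketch of the "install exactly this tagged version" step, assuming Git and an rsync-reachable target; the repository URL, tag name, and target path are placeholders:

```python
# Sketch of a "deploy exactly the tagged version" step: clone the tag into a
# scratch directory and sync it to the target host. Assumes git and rsync
# are available; all names below are placeholders.
import subprocess
import tempfile

def deploy_tag(repo_url: str, tag: str, target: str) -> None:
    with tempfile.TemporaryDirectory() as workdir:
        subprocess.run(
            ["git", "clone", "--branch", tag, "--depth", "1", repo_url, workdir],
            check=True,
        )
        subprocess.run(
            ["rsync", "-a", "--delete", "--exclude", ".git", workdir + "/", target],
            check=True,
        )

# e.g. deploy_tag("git@example.com:site.git", "release-1.4", "deploy@test-host:/var/www/site")
```

The same function serves the test and the production system; only the target differs, which is what guarantees that the version on production is identical to the one that was tested.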
Changes to the content
If your code changes the way your content is processed, it is an advantage when your CMS has strong workflow support. Then you can easily add a step to your workflow that decides whether the content is old and has to be migrated for the current document. This way you get a continuous migration of the content.
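A sketch of what such a workflow step could look like; the schema numbers and the individual migration functions are invented for illustration:

```python
# Hypothetical workflow step: before a document is processed, check whether
# it was stored under an older content schema and migrate it step by step.
CURRENT_SCHEMA = 3

def _v1_to_v2(doc: dict) -> dict:
    doc = dict(doc)
    doc["body"] = doc.pop("text", "")   # v1 -> v2: the body field was renamed
    doc["schema"] = 2
    return doc

def _v2_to_v3(doc: dict) -> dict:
    doc = dict(doc)
    doc.setdefault("tags", [])          # v2 -> v3: documents gained a tags list
    doc["schema"] = 3
    return doc

MIGRATIONS = {1: _v1_to_v2, 2: _v2_to_v3}

def migrate_if_old(doc: dict) -> dict:
    """Apply pending migrations one version at a time until the document is current."""
    while doc.get("schema", 1) < CURRENT_SCHEMA:
        doc = MIGRATIONS[doc.get("schema", 1)](doc)
    return doc
```

Because each document is migrated the first time the workflow touches it, the content converges to the new schema gradually instead of in one big cut-over.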
HTH
Varnish is a cache rather than a CDN. It intercepts page requests and delivers a cached version if one exists.
A CDN will serve up content (images, JS, and other resources) from an off-server location, typically in the cloud.
Pricing for cloud-based solutions is often very cryptic, as it's quite complicated technology.
I would be careful with continuous migration. I've done both methods in the past (continuous and full migrations) and I have to say, continuous is a pain. It means double the admin time for everything, and assumes your requirements are the same at all points in time.
Unfortunately, I would say you're better off with a proper rebuild on a 1-2 year basis than with continuous migration, but obviously you know your situation best.
I would suggest you maybe also consider a hybrid approach? Build yourself an export tool to keep all of your content in a transferable state like CSV/XML/JSON so you can just import it into a new system when ready. This means you can incorporate new build requests when you need them in a new system (what's the point of a new system if it does exactly the same as the old one?) and you get to keep all your content. Plus you don't need to build and maintain two CMSes at the same time.
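A very rough sketch of what such an export tool could look like; fetch_pages() is a placeholder for your CMS's API or database query, and the field names are invented:

```python
# Dump every page to a neutral JSON shape so it can be re-imported into
# whatever system comes next. fetch_pages() and the fields are placeholders.
import json
from pathlib import Path

def fetch_pages():
    """Placeholder: yield page records from the current CMS (API call or SQL)."""
    yield {"id": 1, "title": "Home", "body": "<p>Welcome</p>", "updated": "2015-01-01"}

def export_site(outdir: str = "export") -> None:
    out = Path(outdir)
    out.mkdir(exist_ok=True)
    for page in fetch_pages():
        # One file per page keeps the export diffable and partially importable.
        (out / f"page-{page['id']}.json").write_text(json.dumps(page, indent=2))

if __name__ == "__main__":
    export_site()
```

Run it on a schedule and your content is always one import away from any new system.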
Now (6:13 pm, Jun 1, 2012): I resign myself to learning Git and GitHub so that I can do version control. I won't need to mail copies of the (compressed) code to myself, but I still don't understand the mechanism after a day of looking at this stuff.
I get the SHA-1 concept for uniquely identifying a file, and using the first 2 characters of the hash as a directory name. But I'm still confused about the updates, pointers, and merge business.
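One thing that did click for me: the blob naming is reproducible in a few lines, so I can at least verify that part of my understanding (a sketch, not authoritative):

```python
# Reproducing Git's blob id by hand: SHA-1 over "blob <size>\0" + content.
import hashlib

content = b"hello world\n"
header = b"blob %d\x00" % len(content)
digest = hashlib.sha1(header + content).hexdigest()

print(digest)                                           # matches: git hash-object <file>
print(".git/objects/%s/%s" % (digest[:2], digest[2:]))  # where the (zlib-compressed) object lives
```

From what I've read, the "pointers" part is the same trick one level up: a commit object stores the hash of a tree (the directory snapshot), and a branch is just a file containing the hash of a commit, so "updating" a branch means overwriting one 40-character pointer.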
Previously: I have multiple versions of programs, so I can regress to an earlier one to solve a problem.
I used to like to compress the one I was using and send it to myself via email, but today when I did that the compressed version was too small (49 KB instead of 6 MB). So I guess I am referencing the "workspace" (the extension on the app is ".xcworkspace").
I probably shouldn't waste too much time on this problem, since it is merely a backup, but on the other hand, having the full size is an indication that the whole app is self-contained, rather than pointers to elsewhere that may be inadvertently changed or destroyed.
Is there any way to "undo" my current version to have all the correct data, or is it really tough?
From personal experience I agree with the other commenters that Git is the way to go, or even Mercurial. The learning curve flattens after a while, especially if your needs are modest.
As to the need for a "Poor Man's Version Control", sometimes you do need one. For example, you work at an employer that does not allow downloading and use of non-corporate software, and the centralized VCS is not allowed to be used for ad hoc, experimental, or skunkworks projects.
Related post: poor mans source control zip project files on build
I'm not sure how to get back any changes without knowing more about your setup, but I can recommend that you look into a different arrangement: your email-an-archive-to-yourself system sounds like a poor man's revision control system, except worse than poor, because there are plenty of great RCS tools available for free.
I recommend you spend an hour or so and read about git. If you learn a few commands you can have a complete change history of your project, and jump back to any point in time you like. (And then change history, creating alternate timelines, become your own grandparent, and cause all sorts of problems/adventures.) Most of the time version control is used in the context of a development team, but it provides a lot of benefit even for a lone wolf.
I need to come up with a CM process for PLC code.
Currently, the system is developed using RSLogix 5000. The build product is a monolithic file that can be loaded onto a PLC for execution and edited directly in the development environment. With multiple developers, this has become a problem: they're stepping on each other's changes.
As an analogy, it's as if, when doing Java development, the only way to edit and save the source were to load a *.jar file into your IDE, make the change, and then save it back to the jar file. This is less than ideal.
How can I coordinate changes between multiple developers working with PLCs?
If we are talking about one big binary file, then a VCS (centralized or decentralized) is not the best tool for the job.
An external reference store (a shared disk, for instance), where a batch job copies and labels the current PLC state, is better.
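A minimal sketch of such a batch job; the paths and the label convention are placeholders:

```python
# Copy the current PLC program file to a shared archive under a
# timestamped, labelled name. Paths below are placeholders.
import shutil
import time
from pathlib import Path

ARCHIVE = Path("//fileserver/plc-archive")  # assumed shared disk

def snapshot(plc_file: str, label: str) -> Path:
    src = Path(plc_file)
    ARCHIVE.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = ARCHIVE / ("%s_%s_%s%s" % (src.stem, stamp, label, src.suffix))
    shutil.copy2(src, dest)  # copy2 also preserves the file's own timestamp
    return dest

# e.g. snapshot("C:/projects/line4.ACD", "pre-commissioning")
```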
See "Tracking Software History"
To avert discontinuities in the historical record of revisions, old versions of programs must be stored.
“We take it a step further, though. Using our MDT AutoSave, we actually go out and interrogate the equipment. Overnight or at whatever frequency is specified, the software reads the programs in the PLCs and then compares that information to the last known program. The version-control software will copy the new program and store it and [then] compare it to the last one.
Launching version control is fairly simple. Required is software installation and then hardware configuration. “You would need a server and a couple of weeks of engineering and you’re good to go,” Perysyn says. However, his company uses a “shrink-wrap approach” that involves installing the software and then customization by users filling in the blanks.
That being said, when you have multiple changes from multiple developers, you need an integration environment where a first delivery can be done and validated, before pushing it to the actual server.
See also this post.
I use Unity Pro, so this may not apply for other brands.
Unity can export an "archive" file, which is XML describing the PLC program and I/O setup in its entirety. After commissioning changes, I create an export and check it into my local Git repo. This gets me an annotated history of changes, but no visual comparison. I can always use UnityDiff for that.
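The check-in itself is easy to script once the export exists. A sketch (the repo path, archive file name, and message are examples from my own setup style; exporting from Unity Pro itself remains a manual step here):

```python
# Commit a freshly exported Unity Pro archive to a local Git repo.
# Assumes git is on the PATH; the export itself is done manually first.
import subprocess

def commit_archive(repo: str, archive_name: str, message: str) -> None:
    subprocess.run(["git", "-C", repo, "add", archive_name], check=True)
    subprocess.run(["git", "-C", repo, "commit", "-m", message], check=True)

# e.g. commit_archive("C:/plc/line4-repo", "line4.xef", "Tuned conveyor speed after commissioning")
```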
Check out http://www.mdtsoft.com/ also
You need a specialized versioning system for PLCs, like versiondog.
From the manufacturer:
"Special support with Smart Compares for SIMATIC S5, SIMATIC S7,
SIMATIC PCS 7, WinCC, WinCC flexible, InTouch, CoDeSys, TwinCAT,
Phoenix PC WORX, RSLogix, Schneider Modsoft, Schneider Concept,
Schneider Unity, SINUMERIK 840D, Bosch IndraWorks and more. Also robot
programs from ABB and Kuka and office related data formats like
Microsoft Word, Microsoft Excel and Adobe PDF are perfectly supported
by versiondog.
Update: Here is a screenshot showing a ladder version compare. I guess that's what most PLC folks are interested in. We also use it to schedule an e-mail report if the PLC's offline and online application versions do not match, as an alarm that something has been changed in the PLC but not put into the version control server.
About RSLogix5000 specifically, I have seen developers use an emulated PLC and make their changes online. The final product once developed is then put together with all the comments (as they are not contained in the PLC) and then commissioned. There are issues with changes that cannot be done online, such as AOIs. There are tools in place to stop two people editing the same logic online at once and to take ownership of sections. Backups can be done in the form of uploads, but there isn't any way to track changes.
It is a messy problem, and messier still when you are maintaining a system, since you want an .ACD that you can go online with; unless you are somehow doing a diff with the RSLogix compare tool, you just see unreadable machine code like "+|Éû³´¬ÙÆW×晵‚>Ù".
The most common revision control I have seen (sadly) is just saving the latest file, then taking a copy and adding the current date to the file name, like the recommended control.com post described.
RSLogix5000 has always prohibited multiple users from opening and editing the same .ACD simultaneously. However, if multiple users have identical .ACD files, open them, and all make connections to the same target controller, they can each edit on the controller simultaneously, but only if they are working on different routines. Others' edits appear automatically if they look at another programmer's routine.
Note that working online like this is usually done with the PLC running, sometimes even with the target system (some kind of machine) operating. This kind of arrangement is used to complete work faster, or in some cases because the system is huge. No one develops like this, as it is really a debug tool and impractical for significant changes.
If one programmer finishes, and another is not done, the unfinished work of the other will be saved to the first programmer's .ACD when they save. Whoever saves last will have everyone's work.
Like others have mentioned in this thread, using file date is fairly reasonable. Some companies use a version control variable that is usually displayed on a connected HMI. Other companies use a separate document that documents who and what changes. Sometimes version notes are placed in a lengthy rung comment in the main routine.
My company uses a separate change log, and dated archive copies are maintained. Multiple programmers are only used in the most extreme cases. Someone is always designated to maintain the offline file integrity, usually the person who will be working the longest, or the project manager.
It is important to note that rung comments are not carried from one user to another before RSLogix5000 v21 because previous versions didn't store comments on the controller.
All this said, you might be trying to manage offline development. I haven't seen any sophisticated methods for this. Usually programmers write the needed routines separately, and a project manager will assemble them into a single project. The cleanest approach I've seen is where a project manager will create an architecture with global functionality, and assign routine work to others, giving them a copy of the .ACD to work with. They return the .ACD with changes, and the project manager copies and pastes their routines into the "master" project.
This is a very good question and it really depends on what you want it to do.
If you are only using Rockwell equipment it might be helpful to look at their solution; I think it's called FactoryTalk AssetCentre.
Currently I am looking into using Bazaar from Canonical.
One thing that VonC pointed out is that a piece of software that can interrogate the PLC is a definite plus - not a must in my opinion, but it sure as hell helps.
Am I reading your question properly - you have multiple developers working on the same PLC code at the same time? It's a scary thought, but I know it sometimes needs to happen. Siemens PLCs are a bit easier to program with multiple developers, but I would assign one person to consolidate and test all the changes before committing to the PLC. Any VCS will let you create branches for every developer, but how you get them to consolidate their changes is the million dollar question.
Bart.
A simple thing to do would be to run a text diff on the .l5k files, so you can easily see whether a developer has been messing with a part of the file that is outside of their scope.
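For example, with Python's standard difflib (assuming the exports sit side by side; the file names are invented):

```python
# Unified diff of two RSLogix 5000 .l5k exports: shows exactly which
# lines changed between versions. File names are examples.
import difflib
from pathlib import Path

old = Path("line4_v01.l5k").read_text(errors="replace").splitlines()
new = Path("line4_v02.l5k").read_text(errors="replace").splitlines()

for line in difflib.unified_diff(old, new, "line4_v01.l5k", "line4_v02.l5k", lineterm=""):
    print(line)
```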
I saw this question just now from a link at stack exchange: Are There Realistic/Useful Solutions for Source Control for Ladder Logic Programs. Rather than have a link only answer, I'll dupe my answer here:
There is actually a canned solution - from GE-IP of all places. Check out Proficy Change Management. This product does version control from a PLC control systems point of view, rather than a pure version control of files point of view - it works as a layer sitting on top of a VCS (the scary part is that originally this VCS was Visual SourceSafe) and handles rights management, reporting and checkout/checkin.
While the product is from GE-IP, it is designed to support a variety of PLC and HMI systems out of the box.
Full disclosure: I used to work for a company selling and installing PCM (but that was 7 years ago). So if you ask me what it was like back then, I'm likely to tell you where it all went wrong!
In my company we just started a trial with Copia.io.
Check it out; our first tests look very promising!
It brings branching, merging, ladder diff, etc. for multiple PLC platforms (Rockwell, Siemens, Codesys).
PS: I work for a company that builds machines; we were looking for versiondog-like solutions with a bit more power in collaboration and diffing capabilities. I have used tools like Mercurial, Git, and Tortoise at past companies (not for PLCs, though).
I know that a VCS is absolutely critical for a developer to increase productivity and protect the code; no doubts about it. But what about a designer using, say, Photoshop? (This isn't specific to any tool; it's just to make my point clearer.)
VCSs use delta compression to store different versions of files. This works very well for code, but for images it's a problem. Raster image files are binary formats, though vector image files are text (SVG comes to mind) and pose no problem. The problem comes with .psd files (and any other image "source" file): those can get pretty big, and since I'm not familiar with the format, I'll consider them binary files. How would a VCS work in this condition?
The repository could get pretty darned big if the VCS server isn't able to diff the files efficiently (or worse, can't at all), and over time this can become a really big pain when someone needs to check out the repository (or clone it if using a DVCS).
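To illustrate the concern, here is a crude experiment; it treats a re-compressed file as a stand-in for a binary format, and uses the shared prefix plus suffix as a rough proxy for how much of the old version a delta could reuse (synthetic data, rough numbers only):

```python
# A one-character edit barely changes a text file, but after compression the
# byte stream diverges from the edit point on, so a delta reuses almost nothing.
import zlib

text_a = ("layer 0042: opacity 80% blend normal\n" * 5000).encode()
text_b = text_a.replace(b"opacity 80%", b"opacity 85%", 1)

def shared_fraction(x: bytes, y: bytes) -> float:
    """Fraction of bytes shared as a common prefix plus common suffix."""
    p = 0
    while p < min(len(x), len(y)) and x[p] == y[p]:
        p += 1
    s = 0
    while s < min(len(x), len(y)) - p and x[-1 - s] == y[-1 - s]:
        s += 1
    return (p + s) / max(len(x), len(y))

print(shared_fraction(text_a, text_b))   # close to 1.0: a tiny delta suffices
print(shared_fraction(zlib.compress(text_a), zlib.compress(text_b)))  # far lower
```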
Have any of you used a VCS for this purpose? How well does it work? I'm mostly interested in Mercurial, though this is a general situation that applies to any VCS.
Designers usually use specialized tools like AlienBrain, Adobe VersionCue, or similar, which are essentially version control systems that understand images and other media assets, allowing things like diffing two images.
Designers IMHO should definitely use VCS systems, at least as a means of versioning and backup - their stuff is just as important as specs, documentation, code, deploy scripts, and everything else that makes up a project.
I do not know if there are bridges between "Asset Management Systems" like the mentioned ones and Developer VCS' systems though.
Version control systems are useful for ANYONE doing work that they might need an older version of at some later date. That said, I have set up all my creative friends with Subversion (in the past), and now I recommend Git. Even those doing video editing with hundreds of gigs of video. They can archive off the projects when they get final payment. Drive space is CHEAP, cheaper than ever before; size isn't an issue in any modern VCS. Being able to revert to a previous working state, or to experiment with something without losing data and manually managing multiple "temp" directories, is invaluable if you bill by the hour.
Yes
Don't worry about the size; if you run out of space, just buy a larger hard drive.
Losing information will be far costlier.
In addition to a VCS (any will do, as you won't be needing delta storage), do regular backups.
When doing a checkout you shouldn't be standing at the root of the repository, but rather at the specific branch for your project; that way it won't be slower than a simple copy of that folder.
I definitely recommend using version control for any type of file you care about or can't afford to lose. Disk space is cheap, and as has already been pointed out, it'd be far worse to lose a bunch of important files than to spend a few extra bucks on a new HDD. I recommend Subversion since it has file locking, an important feature when working with binary files under version control, to prevent ugly or impossible merge conflicts.
I believe so, especially if you wish to track changes over time or need to roll back to previous versions. Centralized source control may be the way to go if you're worried about the size.
I'm currently working at a small web development company; we mostly do campaign sites and other promotional stuff. For our first year we've been using a "server" for sharing project files - a plain Windows machine with a network share. But this isn't exactly future-proof.
SVN is great for code (it's what we use now), but I want the comfort of versioning (or at least some form of syncing) for all or most of our files.
What I essentially want is something that does what subversion does for code, but for our documents/psd/pdf files.
I realize Subversion handles binary files too, but I feel it might be a bit overkill for our purposes.
It doesn't necessarily need all the bells and whistles of a full version control system, but something that removes the need for incremental naming (Notes_1.23.doc) and lessens the chance of overwriting something by mistake.
It also needs to be multi-platform, handle large files (100 MB+), and be usable by somewhat non-technical people.
SVN is great for binaries, too. If you're afraid you can't compare revisions, I can tell you that it is possible for Word docs using TortoiseSVN.
But I do not know what you mean by "expanding the versioning". SVN is not a document management system.
Edit:
but I feel it might be a bit overkill for our purposes
If you are already using SVN and it fulfils your purposes, why bother with a second system?
If you have a Windows 2003 server, you can have a look at SharePoint Services 3.0 (http://technet.microsoft.com/en-us/windowsserver/sharepoint/bb684453.aspx).
It can do version control for documents and has nice integration with Office, starting with Office XP, though Office 2003 and 2007 work better. Office and PDF files can be indexed (via the Adobe IFilter) and searched. You can also add IFilters to search metadata in your documents.
Regarding large files: by default the maximum file size is 50 MB, but it can be configured.
We've just moved over to Perforce and have been really happy with it. It's a commercial product, but it's so powerful and easy to use that it's worth the price per seat IMHO.
A decent folder structure and naming scheme?
VCSs don't really handle images and such very well - would it be possible to keep the code in a VCS (SVN/Git/Mercurial, etc.), alongside a sensible folder structure for the binary assets (source photos, Photoshop PSD files, Illustrator files, and so on)?
It wouldn't handle syncing, but a central file-server would achieve the same thing.
It would require some enforcement and kitten-herding to get people to name things properly, but I think having a version folder for each asset (like someproject/asset/header_logo/v01/header_logo_v01.psd) would basically be like a VCS, but easier to move between revisions (no vcs checkout blah -r 234 when a client decides they preferred v02 over v03).
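If you wanted to take the manual error out of that scheme, a small helper could pick the next version number itself (the layout matches the example above; all names are made up):

```python
# Create the next vNN folder for an asset and copy the working file into it,
# named to match. Directory layout follows the someproject/asset/... example.
import re
import shutil
from pathlib import Path

def new_version(asset_dir: str, source_file: str) -> Path:
    asset = Path(asset_dir)                    # e.g. someproject/asset/header_logo
    versions = [
        int(m.group(1))
        for d in asset.iterdir() if d.is_dir()
        for m in [re.fullmatch(r"v(\d+)", d.name)] if m
    ]
    nxt = max(versions, default=0) + 1
    src = Path(source_file)
    dest_dir = asset / ("v%02d" % nxt)
    dest_dir.mkdir()
    dest = dest_dir / ("%s_v%02d%s" % (src.stem, nxt, src.suffix))
    shutil.copy2(src, dest)
    return dest

# e.g. new_version("someproject/asset/header_logo", "working/header_logo.psd")
```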
Your question is interesting because you're specifying that it be suitable for a small office. At the enterprise level, I would recommend something along the lines of EMC Documentum's eRoom, but obviously that's going to be way more than you need, and more than you want cost-wise as well. I'm not sure of the licensing details, but I've heard that if your office has MS Office, you have access to SharePoint, which might work well for you. I'm also sure there are a lot of SaaS implementations of this kind of thing, so you may want to look at that, keeping in mind that the servers will not be hosted by you; if the material is extremely sensitive, that's obviously not the proper route.
You might want to consider using a Mac as your server and using Time Machine to back up your shared folders. Doing this gives you automatic backups and allows you to share through Samba so everyone can have a network drive on their computer. A full Mac server is probably overkill; a Mac mini or a repurposed desktop machine would do for a small office.
You might also consider Amazon's S3 service for offsite backups. Since it's a pay-as-you-go service, this can scale with use, and if you decide to move to something else you can always download your data and take it elsewhere.
Windows Vista features local file versioning in its file system, which can be useful, but is limited in terms of teamwork. However, if somebody overwrites somebody else's file, a new version is stored as it should be.
Also consider KnowledgeTree. Have a look at it; some demos/screenshots are available at http://www.knowledgetree.com/
It has a free open-source Community Edition, so it's cost-effective. We haven't tried it yet, but we chose it over other systems for a small business looking for a document versioning solution.