How do others keep a clipart repository up to date? - deployment

We have a clipart library with our application. It has recently grown in size (based on marketing's feedback) to 2GB of data.
The application installer is now 1GB and really a bit unmanageable for some of the countries we distribute to.
What solutions would you use to install a library like that and keep it up to date?

One possibility may be to let customers select subsets (some/all/none) of the library.
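A rough sketch of how subset selection and incremental updates could fit together, assuming the library is published with a manifest that maps each file path to a content hash. The manifest format, the function names, and the idea of selecting subsets by top-level folder prefix are all assumptions for illustration, not part of any existing installer:

```python
import hashlib
from pathlib import Path


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def plan_update(local: dict[str, str], remote: dict[str, str],
                wanted_prefixes: tuple[str, ...] = ("",)) -> tuple[list[str], list[str]]:
    """Return (files to download, files to delete) for the selected subsets.

    wanted_prefixes is a hypothetical way to express "some/all/none":
    the default ("",) matches every path, an empty tuple matches nothing.
    """
    selected = {p: h for p, h in remote.items() if p.startswith(wanted_prefixes)}
    # Download anything that is missing locally or whose hash has changed.
    to_download = sorted(p for p, h in selected.items() if local.get(p) != h)
    # Delete local files that were removed upstream or deselected by the user.
    to_delete = sorted(p for p in local if p not in selected)
    return to_download, to_delete
```

With something like this, the installer only needs to ship the application plus a small manifest, and the clipart itself is fetched (and later refreshed) file by file.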

Related

How to implement continuous migration for large website?

I am working on a website of 3,000+ pages that is updated on a daily basis. It's already built on an open source CMS. However, we cannot simply continue to apply hot fixes on a regular basis. We need to replace the entire system and I anticipate the need to replace the entire system on a 1-2 year basis. We don't have the staff to work on a replacement system while the other is being worked on, as it results in duplicate effort. We also cannot have a "code freeze" while we work on the new site.
So, this amounts to changing the tire while driving. Or fixing the wings while flying. Or all sorts of analogies.
This brings me to a concept called "continuous migration." I read this article here: https://www.acquia.com/blog/dont-wait-migrate-drupal-continuous-migration
The writer's suggestion is to use a CDN like Fastly. The idea is that a CDN allows you to switch between a legacy system and a new system on a URL basis. This idea, in theory, sounds like a great idea that would work. This article claims that you can do this with Varnish but Fastly makes the job easier. I don't work much with Varnish, so I can't really verify its claims.
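To make the idea concrete, here is a minimal sketch of the routing decision itself, independent of whether Fastly, Varnish, or something else sits in front. The origin hostnames and the list of migrated sections are made up for illustration; a real setup would express the same rule in the proxy's own configuration language:

```python
# Hypothetical origins and migrated sections, for illustration only.
LEGACY_ORIGIN = "https://legacy.example.com"
NEW_ORIGIN = "https://new.example.com"
MIGRATED_PREFIXES = ("/blog/", "/about")


def choose_origin(path: str) -> str:
    """Pick the backend for a request path: new system if migrated, else legacy."""
    for prefix in MIGRATED_PREFIXES:
        bare = prefix.rstrip("/")
        # Match the section root and everything below it, but not
        # unrelated paths that merely share the same leading characters.
        if path == bare or path.startswith(bare + "/"):
            return NEW_ORIGIN
    return LEGACY_ORIGIN
```

Migrating another section then amounts to adding one prefix to the list, and rolling back amounts to removing it.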
I also don't know if this is a good idea or if there are better alternatives. I looked at Fastly's pricing scheme, and I simply cannot translate it into a specific price point. I don't understand these cryptic cloud-service pricing plans; they don't make sense to me. I don't know what kind of bandwidth the website uses. Another agency manages the website's servers.
Can someone help me understand whether using an online CDN would be better than using something like Varnish? Are there free or cheaper solutions? Can someone tell me what this amounts to, approximately, on a monthly or annual basis? Are there any other, better ways to roll out a new website in phases for a large website?
Thanks!
I don't think I have the exact answers to your question, but maybe my answer helps a little bit.
I don't think that the CDN itself gives you an advantage. The advantage comes from having more than one system.
Changes to the code
In professional environments I'm used to having three different CMS installations. The first is the development system, usually on my PC. That system is used to develop the extensions, fix bugs and so on, supported by unit tests. The code is committed to a revision control system (like SVN, CVS or Git). A continuous integration system checks the commits to the RCS. When a feature is implemented (or some bugs are fixed) a named tag is created. Then this tagged version is installed on a test system where developers, customers and users can test the implementation. After a successful test, exactly this tagged version is installed on the production system.
At first sight this looks time consuming. But it isn't, because most of the steps can be automated. And the biggest advantage is that the customer can test the change on a test system. And it is very unlikely that an error occurs only on your production system. (A precondition is that your systems are built on similar or identical environments.)
Changes to the content
If your code changes the way your content is processed, it is an advantage when your CMS has strong workflow support. Then you can easily add a step to your workflow which decides if the content is old and has to be migrated for the current document. This way you have a continuous migration of the content.
HTH
Varnish is a cache rather than a CDN. It intercepts page requests and delivers a cached version if one exists.
A CDN will serve up contents (images, JS, other resources etc) from an off-server location, typically in the cloud.
Pricing for cloud-based solutions is often very cryptic because it's quite complicated technology.
I would be careful with continuous migration. I've done both methods in the past (continuous and full migrations) and I have to say, continuous is a pain. It means double the admin time for everything, and assumes your requirements are the same at all points in time.
Unfortunately, I would say you're better off with a proper rebuild on a 1-2 year basis than a continuous migration, but obviously you know best about that.
I would suggest you maybe also consider a hybrid approach? Build yourself an export tool to keep all of your content in a transferable state like CSV/XML/JSON so you can just import it into a new system when ready. This means you can incorporate new build requests when you need them in a new system (what's the point of a new system if it does exactly the same as the old one?) and you get to keep all your content. Plus you don't need to build and maintain two CMSs all the time.
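As a minimal sketch of what such an export tool boils down to, assuming the content can be reduced to a handful of fields per page. The `Page` fields here are hypothetical; a real CMS will have its own schema and you would read the records from its database or API:

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Page:
    # Hypothetical fields; substitute whatever your CMS actually stores.
    url: str
    title: str
    body: str
    updated: str  # ISO 8601 date


def export_pages(pages: list[Page]) -> str:
    """Serialize pages to a CMS-neutral JSON document."""
    return json.dumps([asdict(p) for p in pages], indent=2, ensure_ascii=False)


def import_pages(payload: str) -> list[Page]:
    """Rebuild Page objects from a previously exported JSON document."""
    return [Page(**record) for record in json.loads(payload)]
```

The point of the round trip is that the JSON, not either CMS, becomes the durable copy of the content.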

Event-based analytics package that won't break the bank with high volume

I'm using an application that is very interactive and is now at the point of requiring a real analytics solution. We generate roughly 2.5-3 million events per month (and growing), and would like to build reports to analyze cohorts of users, funneling, etc. The reports are standard enough that it would seem feasible to use an existing service.
However, given the volume of data I am worried that the costs of using a hosted analytics solution like MixPanel will become very expensive very quickly. I've also looked into building a traditional star-schema data warehouse with offline background processes (I know very little about data warehousing).
This is a Ruby application with a PostgreSQL backend.
What are my options, both build and buy, to answer such questions?
Why not build your own?
Check out this open source project as an example:
http://www.warefeed.com
It is very basic, and you will have to build the data mart features you need for your case.
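For a sense of scale, a funnel report over raw events is not much code. Below is a small sketch in Python rather than Ruby, assuming events are available as `(user_id, event_name, timestamp)` tuples; the tuple layout and the step names are assumptions, and at a few million events per month the same logic would more likely live in a SQL query against PostgreSQL:

```python
from collections import defaultdict


def funnel_counts(events, steps):
    """Count how many users reached each funnel step, in order.

    events: iterable of (user_id, event_name, timestamp) tuples.
    steps: ordered list of event names that define the funnel.
    """
    by_user = defaultdict(list)
    for user_id, name, ts in events:
        by_user[user_id].append((ts, name))

    counts = [0] * len(steps)
    for user_events in by_user.values():
        next_step = 0
        # Walk each user's events chronologically; a step only counts
        # once the previous steps have already happened.
        for _, name in sorted(user_events):
            if next_step < len(steps) and name == steps[next_step]:
                counts[next_step] += 1
                next_step += 1
    return counts
```

Cohort reports follow the same pattern: group users by the date of their first event, then run the counts per group.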

Version control for video editing work

I am looking into improving the backup process a group of animators use. Currently they back up their work into external hard drives or DVDs manually, taking full copies of everything. The data consists of thousands of high resolution images, project files of various video editing software and sound files. Basically everything is binary data and nothing should ever be merged on checkin.
Should I investigate version control systems that I would use as a software developer (Subversion, Git etc.), or is there a class of version control systems intended for non-SW data that would suit these needs better?
You could also check out AlienBrain. It's a project asset management system designed for artists.
If your scope is just "backup" then I'd say stick to backup solutions.
But if you are thinking about the whole lifecycle of the animator's work, then the type of use typically falls into the "Digital Asset Management" category for the very reasons you mention: huge data volumes; binary formats.
Since version control (SCM) software is usually designed for text files that can be diff'd and merged, they tend not to do so well with binary formats in high volume. While your average web graphics are not going to be an issue for (software) version control tools, you mention video, which puts you in another league.
The bad news (maybe - depends on your business) is that DAM is dominated by the big end of town. @Atmospherian has mentioned AlienBrain, which is a good representative of a niche offering for artists. At the other end of the spectrum you have more general purpose offerings like Oracle's UCM (formerly Stellent). Make sure you check the price tags though.
There must be open source or lower cost alternatives available - but I don't know them, sorry.
What does seem to be very common are custom in-house solutions. Unlike managing code, where changes to the files themselves have their own significance, managing digital assets tends to focus on the metadata (the image/video is just an associated blob). And since many shops have their own particular production workflow, it makes the territory ripe for some skunkworks programming (if that's your bent - go for it!).
So while I'm not recommending any particular products, I suggest if you think "digital asset management" rather than "version control" when scouting for solutions you will probably find answers more suited to your needs.
Your question is a little unclear - you seem to have conflated version control and backup.
If what you want is version control, then take a look at the list on Wikipedia: Comparison of revision control software. That shows most of the widely known version control systems and their basic features. You're looking for something that you can set up to force users to check out before they edit. Be aware that commercial solutions range in price from moderately expensive up to 'You want HOW much?'
If what you want is backup software, then I'd start at List of backup software on Wikipedia. There are a lot more choices in the backup software arena, and there are a lot of price points.
Either way, figure in the creation of an admin position (either as part of someone's job or a new person altogether, if you're big enough). I've worked with backup and version control systems that didn't have an admin and it's a problem. Either no one takes care of problems, or everyone gets their fingers in there and really screws things up. Either way, making it part of someone's job (officially) is the best way to limit damage.
I think ClearCase would work for you, because everything is stored in a VOB (Versioned Object Base), no matter what kind of file it is. Take a look.
From your description, it sounds like you would do pretty well with some basic backup software such as Retrospect. Using daily backups of workstations, only changed data would be backed up and it would be easy to roll back to an earlier version of a file if needed.
What you don't get from such a setup is the ability to check out / check in files and get warnings about conflicts.
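To illustrate the "only changed data" part, here is a small sketch of an incremental copy using only the Python standard library. It is not how Retrospect works internally; it only shows the principle, and it keeps just the latest copy of each file rather than a history of versions:

```python
import filecmp
import shutil
from pathlib import Path


def incremental_backup(source: Path, dest: Path) -> list[str]:
    """Copy files that are new or changed since the last run; return their relative paths."""
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = dest / rel
        # filecmp.cmp first compares size and modification time and only
        # reads file contents when those stat signatures differ.
        if dst.exists() and filecmp.cmp(src, dst):
            continue  # unchanged since the last backup
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 preserves the modification time
        copied.append(rel.as_posix())
    return copied
```

Rolling back to earlier versions would additionally need each run to write into its own dated snapshot folder, which is the part a dedicated backup product handles for you.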
Vidyatel has editing software that can compare video content and find the differences between video versions based on the video alone.
The result is output as EDL/TC.
It might help.
You should take a look at boar. It is exactly what you want, "version control and backup for photos, videos and other binary files". It is version control designed for large binary files.

Are automatic upgrades a realistic feature to expect from enterprise Web applications?

Most of the work I do is with what could be considered enterprise Web applications. These projects have large budgets, longer timelines (from 3-12 months), and heavy customizations. Because as developers we have been touting the idea of the Web as the next desktop OS, customers are coming to expect the software running on this "new OS" to react the same as on the desktop. That includes easy to manage automatic upgrades. In other words, "An update is available. Do you want to upgrade?" Is this even a realistic expectation? Can anyone speak from experience on trying to implement this feature?
At my company we have enterprise installations ranging into the thousands of seats. If we implemented an auto-upgrade, our customers would mutiny!
Large installations have peculiar issues that don't apply to small ones. For example, with 2000 users (not all of whom are, let us say, the most sophisticated of tool users), tool training is a big deal: training time, internal demos, internal process documents, etc. They cannot unleash a new feature or UI change without a chance to understand how it fits in their process and therefore what their internal best practices are and how to communicate that to their users.
Also when applications fail, it's the internal IT team who are responsible. Therefore, they want time to install a new version in a test area, beat it up, and deploy on a Saturday only when they're good and ready.
I can see the value in making minor patches easier to install, particularly when the patch is just a bug fix and not anything that would require retraining, and if the admins still get final say over when it's installed. But even then, I don't believe anyone has ever asked for this! Whether because they don't want it or because they are trained not to expect it, it doesn't seem worth it.
Well, it really depends on your business model, but for a lot of applications the SaaS model can end up biting you. It's great for a lot of things, but for some larger applications the users are not investing as significant an amount up front and could possibly move to something else before you've made any money.
See
http://news.zdnet.com/2424-9595_22-218408.html
and here
http://www.25hoursaday.com/weblog/2008/07/21/SoftwareAsAServiceWhenYourBusinessModelBecomesAParadox.aspx
for more information
One of the primary reasons to implement an application as a web application is that you get automatic upgrades for free. Why would users be getting prompted for upgrades on a web app?
For Windows applications, the "update is available, do you want to upgrade?" functionality is provided by Microsoft using ClickOnce, which I have used in an enterprise environment successfully -- there are a few gotchas but for the most part it is a good way to manage automatic deployment and upgrade of Windows apps.
For mobile apps, you can also implement auto-upgrades, although it is a little trickier.
In any case, to answer your question in a broad sense, I don't know if it is expected that all enterprise apps should make upgrading easy, but it certainly is worth the money from an IT support standpoint to architect them to allow for easy upgrading.
If you're providing a hosted solution, I wouldn't bother. Let the upgrade happen silently (perhaps with a notice that you did it). If you're selling an application that's hosted on their servers, let the upgrade decision be made by a single owner, not every user of the app.

What is the best way to handle files for a small office?

I'm currently working at a small web development company; we mostly do campaign sites and other promotional stuff. For our first year we've been using a "server" for sharing project files: a plain Windows machine with a network share. But this isn't exactly future proof.
SVN is great for code (it's what we use now), but I want to have the comfort of versioning (or at least some form of syncing) for all or most of our files.
What I essentially want is something that does what subversion does for code, but for our documents/psd/pdf files.
I realize subversion handles binary files too, but I feel it might be a bit overkill for our purposes.
It doesn't necessarily need all the bells and whistles of a full version control system, but something that removes the need for incremental naming (Notes_1.23.doc) and lessens the chance of overwriting something by mistake.
It also needs to be multiplatform, handle large files (100 MB+) and be usable by somewhat non-technical people.
SVN is great for binaries, too. If you're afraid you can't compare revisions, I can tell you that it is possible for Word docs, using Tortoise.
But I do not know what you mean by "expanding the versioning". SVN is not a document management system.
Edit:
"but I feel it might be a bit overkill for our purposes"
If you are already using SVN and it fulfils your purposes, why bother with a second system?
If you have a Windows Server 2003 machine, you can have a look at SharePoint Services 3.0 (http://technet.microsoft.com/en-us/windowsserver/sharepoint/bb684453.aspx).
It can do version control for documents, and has a nice integration with Office, starting with Office XP, but Office 2003 and 2007 are better. Office and PDF files can be indexed (via Adobe IFilter) and searched. You can also add IFilters to search metadata in your documents.
Regarding large files, by default the max filesize is 50MB, but it can be configured.
We've just moved over to Perforce and have been really happy with it. It's a commercial product, but it's so powerful and easy to use that it's worth the price per seat IMHO.
A decent folder structure and naming scheme?
VCSs don't really handle images and such very well - would it be possible to have the code in a VCS (SVN/Git/Mercurial etc.), alongside a sensible folder structure for the binary assets (source photos, Photoshop PSD files, Illustrator files and so on)?
It wouldn't handle syncing, but a central file-server would achieve the same thing.
It would require some enforcing and kitten-herding to get people to name things properly, but I think having a version folder for each asset (like someproject/asset/header_logo/v01/header_logo_v01.psd) will basically be like a VCS, but easier to move between different revisions (no vcs checkout blah -r 234 when a client decides they preferred v02 to v03).
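Some of the kitten-herding can be taken over by a tiny helper that works out where the next version goes. A sketch, assuming the vNN folder convention from the example path above; the function name is made up:

```python
import re
from pathlib import Path

VERSION_DIR = re.compile(r"^v(\d+)$")


def next_version_path(asset_dir: Path, extension: str) -> Path:
    """Return where the next version of an asset should be saved.

    For asset_dir .../header_logo containing v01 and v02, this gives
    .../header_logo/v03/header_logo_v03<extension>.
    """
    versions = []
    if asset_dir.is_dir():
        for entry in asset_dir.iterdir():
            match = VERSION_DIR.match(entry.name)
            if entry.is_dir() and match:
                versions.append(int(match.group(1)))
    number = max(versions, default=0) + 1
    tag = f"v{number:02d}"
    return asset_dir / tag / f"{asset_dir.name}_{tag}{extension}"
```

A "save new version" script built on this means nobody has to remember the naming scheme, and existing versions are never overwritten.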
Your question is interesting because you're specifying that it be suitable for a small office. At the enterprise level, I would recommend something along the lines of EMC Documentum's eRoom, but obviously that's going to be way more than you need, and more than you want cost-wise as well. I'm not sure of the licensing details on this, but I've heard that if your office has MS Office, you have access to SharePoint, which might work well for you. I'm also sure there are a lot of SaaS implementations of this kind of stuff, so you may want to look at that, keeping in mind that the servers will not be hosted by you, so if the material is extremely sensitive, that's obviously not the proper route.
You might want to consider using a Mac as your server and using Time Machine to back up your shared folders. Doing this gives you automatic backups and allows you to share through Samba so everyone can have a network drive on their computer. A Mac server is probably overkill. A Mac Mini or a repurposed desktop machine would do for a small office.
You might also consider Amazon's S3 service to do off-site backups. Since it's a pay-as-you-go service this can scale with use, and if you feel you want to move to something else you can always download your data and take it somewhere else.
Windows Vista features local file versioning in its file system, which can be useful, but is limited in terms of teamwork. However, if somebody overwrites somebody else's file, a new version is stored as it should be.
Also consider KnowledgeTree. Have a look at it, some demos/screenshots are available at
http://www.knowledgetree.com/
It has a free open source Community Edition, so it's cost effective. We haven't tried it yet, but we chose this one over other systems for a small business looking for a document versioning solution.