What time should I deploy a build to production? - deployment

My users use the site fairly equally around the clock. Is there a rule of thumb for build timing?
International audience, single cluster of servers on Eastern Time, but the site gets hit well into the morning by international clients.
One database, several web servers, so if a change doesn't touch the database it's simple: deploy whenever.
But when the site has to come down, when would you, as a programmer, be least annoyed to see SO be down for, say, 15 minutes?

If there's truly no good time from the users' perspective, then I'd suggest doing it when your team has the most time to recover from any build-related disaster.

Here's what I have done and it's worked well for me:
Get a site traffic analysis tool that will graph hourly user load
Select the low point in the graph for doing updates (see the sketch below)
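For example, a rough sketch (not part of the original answer) that tallies hits per hour from a common-log-format access log to find the quietest window; the log path and timestamp format are assumptions:
    # Count requests per hour of day from an access log and report the
    # quietest hour. Assumes common log format timestamps like
    # [10/Oct/2000:13:55:36 -0700]; adjust the regex for your server.
    import re
    from collections import Counter

    hits_per_hour = Counter()
    timestamp = re.compile(r"\[\d{2}/\w{3}/\d{4}:(\d{2}):")  # captures the hour

    with open("access.log") as log:
        for line in log:
            match = timestamp.search(line)
            if match:
                hits_per_hour[int(match.group(1))] += 1

    for hour, hits in sorted(hits_per_hour.items()):
        print(f"{hour:02d}:00  {hits} hits")

    if hits_per_hour:
        quietest = min(hits_per_hour, key=hits_per_hour.get)
        print(f"Quietest hour: {quietest:02d}:00")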

If you're small, then yeah, find when your lowest usage period is, and do it then (for us personally, usually around 1AM-3AM PST is the lowest dip...but it never drops to 0 of course). Once you start growing to having a larger userbase, if you want people to take you seriously you'll need to design your application such that you can upgrade without downtime. This is not simple, and it often involves having multiple servers.
I've spent ages trying to get our application to this point. The best I've come up with so far is to run both the old version and the new version at the same time for a couple of hours. Users logged in at the time of the switchover stay on the old version until they log out; the next time they come in, they go to the new version. Any users coming on after the switchover go straight to the new version. It's still not foolproof, but it's pretty good.
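A rough, hypothetical sketch of that session-pinning idea in Python; the backend addresses, the session store, and the v1/v2 labels are assumptions, not the answerer's actual setup:
    # Pin each session to an application version at login and keep routing
    # it to the matching backend until logout. "session" is any dict-like
    # per-user store (cookie-backed, Redis, etc.).
    OLD_BACKEND = "http://app-v1.internal:8000"  # hypothetical addresses
    NEW_BACKEND = "http://app-v2.internal:8000"

    def on_login(session):
        # Sessions created before the switchover were pinned to "v1" at
        # their login time; anyone logging in afterwards gets "v2".
        session.setdefault("app_version", "v2")

    def backend_for(session):
        """Return the backend this request should be proxied to."""
        return OLD_BACKEND if session.get("app_version") == "v1" else NEW_BACKEND

    def on_logout(session):
        # Dropping the pin means the next login lands on the new version.
        session.pop("app_version", None)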

What kind of an application is it? Most sites that I use tend to update around 2AM or 3AM.

Use a second site, and hotswap as needed.

The issue with hot-swapping is that the database would still be shared, and breaking database changes would bring the stand-in down as well.

I guess you have to ask your clients.
In any case, there are always the wee hours of the morning. If you're talking about a locally available website, I don't think users will mind an "under maintenance" notice at 2 AM in their time zone.

Depends on your location: 4 AM East Coast / 1 AM West Coast is typically the lightest time.

Pick a few times that you'd like to do it and offer them as choices to the decider-types. Whatever you do, put up a "down for routine maintenance" page while you deploy.

Check the time of least usage
Clone/copy/update the latest production code to another directory
If there are any database migrations to be done, perform those that are required and do not conflict with the old code base
At the time of least usage, move the symlink to point to the latest code (see the sketch below)
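A minimal sketch of the final symlink flip, assuming a layout like /srv/app/releases/<version> with /srv/app/current as the live pointer (paths and names are illustrative, not from the answer):
    # Point the "current" symlink at a new release atomically: build the new
    # link under a temporary name, then rename it over the old one.
    # rename/os.replace is atomic on POSIX, so requests never see a gap.
    import os

    RELEASES = "/srv/app/releases"   # assumed layout
    CURRENT = "/srv/app/current"

    def activate(version):
        target = os.path.join(RELEASES, version)
        tmp_link = CURRENT + ".tmp"
        if os.path.lexists(tmp_link):
            os.remove(tmp_link)
        os.symlink(target, tmp_link)
        os.replace(tmp_link, CURRENT)

    # activate("2024-01-15-release")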

First use an analysis tool to try to determine your typical "light" traffic times. Depending on the site and your location in the world in comparison to most of your users, it could be 4 AM, it could be 1 PM, who knows. Then, once you have a good timeframe nailed down, make your deployment process as automated as possible so that it happens quickly and minimizes your site's downtime.


IBM BlueMix Free Plan Pricing

I created a simple Python Flask application with 256 MB to play around with.
They state that they give up to 2 GB of RAM free... so I was way below it.
After 30 minutes I see that I'm billed $0.02.
I tried to post this in their "stackoverflow" variant but I got a 500 error :).. so my last resort is to ask here; maybe someone can clarify.
The fear is that you put an app there, forget about it, and then at the end of the month you wake up with a $999 bill.
I don't really understand your concern, but I'm still going to try to help.
If you are afraid that you will be charged, this is not going to happen while you have the 30-day trial account. You will only have to pay if you decide to keep the account after the 30 days are over.
If you are afraid that you are being over-charged, you can always keep track of what you have spent through "View Full Usage Details", which has all the pricing details (per region, per app, etc.). You can also predict how much an app will cost beforehand using the calculator on the platform (https://new-console.au-syd.bluemix.net/?direct=classic/#/pricing/cloudOEPaneId=pricing&paneId=pricingSheet)
Lastly, if you are afraid that you will spend more than you want, there is a spending notification policy (which applies only to paid accounts, of course). I believe there is also an option to stop applications if they exceed a limit, but personally I haven't used this feature yet.
I hope this helps :)

Memcached - Pros and Cons

We have a website at swalif.com which is like a news website based on forums. We are currently using a MySQL database and things are getting slow. We decided to go with the Sphinx Search Server to speed things up, and it's been going quite well.
Recently we heard of something called 'memcached' and, just going through it, we think we should look into it before moving to a search server completely.
My question is: what are the pros and cons of using 'memcached'? It is a fairly new topic to us.
Thank you
I just got my site set up with memcached a couple months back and it's amazing. The pros are rather obvious: it can be used to cache information that is expensive to gather. The best example is an expensive MySQL query. Check your slow query log; that's a good starting point for parts to target. I had one main page that took 2.5 seconds to generate on the server (horrible, I know). I had thought about changing the way it was written, but it would have been very complicated. I put memcached in on the "difficult" parts of that page and now it's down to 0.001 seconds. It's just amazing.
There is one main con that I've run into: if you update your content, you have to delete all the keys associated with that content so that your front end will re-fetch and cache the new data; if you don't, you get stale content. I have tens of thousands of entries in my memcached and it's difficult to delete all the appropriate ones. One solution is to set your key expiration to something short (say 24 hours). If you do that, you know that your site will reflect the newest content, at worst, 24 hours after a change. If you can live with that, this problem is rather moot.
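For illustration, a sketch of that read-through caching plus the delete-on-update/short-TTL approach, using the pymemcache client as one example (the client choice, key names, and the query function are placeholders, not the answerer's code):
    # Cache the expensive page render in memcached with a 24-hour expiry,
    # and delete the key explicitly when the underlying content changes.
    from pymemcache.client.base import Client

    cache = Client(("127.0.0.1", 11211))
    ONE_DAY = 24 * 60 * 60  # short-ish expiry so stale entries age out

    def run_expensive_query_and_render():
        # Placeholder for the slow MySQL query + templating described above.
        return "<html>...front page...</html>"

    def front_page_html():
        cached = cache.get("front_page")
        if cached is not None:
            return cached.decode("utf-8")
        html = run_expensive_query_and_render()      # the slow 2.5 s part
        cache.set("front_page", html, expire=ONE_DAY)
        return html

    def on_content_update():
        # Invalidate explicitly so readers don't have to wait out the TTL.
        cache.delete("front_page")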
Bottom line, it's one of the best tools I've ever seen. It took me less than a day to install it on the lion's share of my rather big site, and the impact was tremendous.

Time Period vs Functionality Trial Period

What is better for Windows Forms applications, or what do you all prefer?
An application with a trial period limited by date (like free to use for 30 days), or do you prefer to limit the functionality of the free version?
I think that it depends on the application type.
I think time limited is best if:
No data is saved in the application that will become unavailable if I don't buy it.
It is an every-day kind of application, not something that is just used once or twice.
I think function limited is best if:
The application is used just once or in a short period of time.
The application can be limited in such a way that it can prove its power, but not provide useful results.
I've marked this as community wiki, feel free to add things to the lists.
I like # of uses better than a date limit. I often try something once, forget about it for a while, then want to try it again only to be locked out. Having a set number of uses unlimited by time is much better. Limited functionality can be OK depending on the app, but full functionality limited by usage is better IMO.
"The customer is always right"; provide both and let him/her chose.
We do both.
Fully functional 14-day trial period
Reverts to restricted DEMO functionality after 14 days (a sketch of this check follows below)
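Purely as an illustration of that check (in a Windows Forms app this would live in C#, but the idea is the same), a minimal sketch in Python with an assumed marker-file location and a hypothetical feature whitelist:
    # Record the first run, then allow full functionality for 14 days and a
    # restricted DEMO feature set afterwards.
    import json, os, time

    TRIAL_DAYS = 14
    MARKER = os.path.expanduser("~/.myapp_first_run")  # hypothetical path

    def trial_active():
        if not os.path.exists(MARKER):
            with open(MARKER, "w") as f:
                json.dump({"first_run": time.time()}, f)
            return True
        with open(MARKER) as f:
            first_run = json.load(f)["first_run"]
        return (time.time() - first_run) < TRIAL_DAYS * 24 * 60 * 60

    def feature_enabled(name):
        # In DEMO mode only a whitelisted subset of features stays on.
        return trial_active() or name in {"open", "view", "print"}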
Some users may forget to use or test your app after installation. After a couple of weeks the user starts your app and the trial has expired -> the user is angry. So think about the possibility of a trial extension :-)

How is Accurev Performance?

How is performance in the current version (4.7) of Accurev?
Time to check out per 100 MB, per GB?
Time to commit per number of files or MB?
Responsiveness of the GUI with 100+ streams?
I just had a demo of Accurev, and the streams look like a lightweight way to model workflow around code/projects. I've heard people praising Accurev for the streams back end and complaining about performance. Accurev appears to have worked on the performance, but I'd like to get some real world data to make sure it isn't a case of demos-well-runs-less-well.
Does anyone have Accurev performance anecdotes or (even better) data from testing?
I don't have any numbers but I can tell you where we have noticed performance issues.
Our builds typically use 30-40K files from source control. In my workspace currently there are over 66K files including build intermediate and output files, over 15GB in size. To keep AccuRev working responsively we aggressively use the ignore elements so AccuRev ignores any intermediate files such as *.obj. In addition we use the time stamp optimization. In general running an update is quick, but the project sizes are typically 5-10 people so normally only a couple of dozen files come down if you update daily. Even if someone made changes that touched lots of files speed is not an issue. On the other hand a full populate of all 30K+ files is slow. I don't have a time since I seldom do this and on the rare occasion I do, I run the populate when I'm going to lunch or a meeting. I expect it could be as much as 10 minutes. In general source files come down very quickly, but we have some large binary files, 10-20MB, that take a couple of seconds each.
If the exclude rules and ignore elements are not correctly configured, AccuRev can take a couple of minutes to run an update for workspaces of this size. When I hear other developers complaining about the speed, I know something is misconfigured and we get it straightened out.
A year or so ago one of the projects updated Boost, with 25K+ files, and also added Firefox to the repository (I forget the size, but it made Boost look small). They also added ICU, wrote a lot of software, and modified countless files. In all, I recall there were approximately 250K+ files sitting in a stream. I unfortunately decided that all their good code should be promoted to the root so all projects could share it. This turned out to be a little beyond what AccuRev could handle well: it was a multi-hour process getting all the changes promoted. As I recall, once Firefox was promoted the rest went smoothly - perhaps a single transaction with over 100K files was the issue?
I recently updated Boost and so had to keep and promote 25K+ files. It took a minute or two, which is not unreasonable considering the number of files and the size of the binaries.
As for the number of streams, we have over 800 streams and workspaces. Performance here is not an issue. In general I find the large number of streams hard to navigate, so I run a filtered view of just my workspaces and just the streams I'm interested in. However, when I need to look at the unfiltered list to find something, performance is fine.
As a final note, AccuRev support is terrific - we call them the voice in the sky. Every now and again we shoot ourselves in the foot using AccuRev and wind up clueless about how to fix things. Almost always we did something dumb and then tried something dumber to fix it. Eventually we place a support request and the next thing we know they are walking us through the steps to righteousness, either on the phone or in a GoToMeeting. I've even contacted them for trivial things that I just don't have time to figure out when I'm having a hectic day, and they kindly walk me through it rather than telling me to RTFM.
Edit 2014: We can now get acceptable X-Windows performance by using the commercial version of RealVNC.
Original comment: This answer applies to any version of AccuRev, not just 4.7.
Firstly, GUI performance might be OK if you can use the web client. If you can't use the web client and you want GUI performance, then you'd better be using Windows, or have all your developers in one place, i.e. where the AccuRev server is located. Try to run the GUI on X-Windows over a WAN? Forget it: our experience has been dozens of seconds or minutes for basic point-and-click operations. This is over a fairly good WAN about 800 miles distant, with an almost optimal ping time. This is not a failing of AccuRev but of X-Windows, and you'll likely have similar problems with other X applications over a WAN. So avoid basic X if you possibly can. Currently we cannot, and our WAN users are forcibly relegated to the command line only.
The basic problem is that AccuRev is centralized and you can't increase the speed of light. I believe you can get around WAN latency by running AccuRev Replication Servers, but that still does not properly address the problem if you have remote developers at single-person offices over VPN. It is ironic that the replication servers somewhat turn this centralized VCS into a form of DVCS.
If you don't have replication servers, then a horrible but somewhat workable work-around is to use a delta-synchronization tool such as rsync to sync your source tree between your local machine where you can run the GUI (i.e. the GUI running directly on your Windows or Linux laptop) and the machine where you're actually working (e.g. a UNIX machine 1,000 miles away). Another option is to use something like VNC, which works better over a WAN than X, connecting to a virtual desktop at the AccuRev server's location and using X from there.
At my workplace more than one team has resorted to using Mercurial on the side and promoting to AccuRev only when it's strictly necessary. As Stephen Nutt points out above, other necessary work is to use the time-stamp optimization and ignores. We also have our AccuRev admins (yes, it requires you to employ people to babysit it) complain when we need to include large numbers of files, despite the fact that they form a core part of our product and MUST be included and version controlled. Draw your own conclusions.

How fast can you get a fixed bug into production?

I'm working with 2 very different applications.
App #1 is a web app where I have direct access to the FTP, so fixing bugs is pretty easy. Cat A bugs are usually fixed within the next day. No problems here.
App #2 is an oil business document control app, where we have to go through two acceptance test phases - end-user testing and system testing. Any bugs discovered after this phase will remain until the next version, usually 2-3 months away. Every new release package is a huge cost. It's really hard to explain to the end users that they have to live with some of the bugs until the next version.
How do you relate to critical bugs that can't be fixed immediately?
The faster I fix bugs the more bugs I find I need to fix.
In my personal opinion, the situation you describe is a very deep structural problem, and it should have been dealt with before the project started. Every programmer should know at least one person who can directly push changes if needed, and the procedure for this must be clear. Honestly, what about security or database problems with potential data loss? Of course, if you can't fix it directly, inform the staff and tell them "please don't do this", but honestly the best way is to get the problem out of the world ASAP. I had a similar case in a terminal application where the program simply quit working after a button was pressed twice. The fix was trivial, but no one was allowed to apply it, and it literally cost hours for all the people depending on this thing to run. Demand a shortcut for important changes!
The speed which management allows you to fix a bug is directly related to the cost management will endure until the bug is fixed.
I'm a 1-man team. Nothing stands between me and my bugs :)
It really depends on a combination of the organisation size, system size, importance of the system, and impact of the bug, e.g.:
One Man Shop or Low Impact System (quickest - App#1 above)
Time to fix bug = time to find bug + time to code fix + time to deploy to production
Large Organisation or Important System (longest - App#2 above)
Time to fix bug = time to find bug + time to document & prioritise bug + time to estimate cost + time to approve work on fix + time to design fix + time to document fix + time to code fix + time to document test plan + time to test fix + time to regression test + time to performance/load test + time to schedule & approve deployment + time to deploy fix
Edit: "How many Microsoft employees does it take to change a lightbulb?" (http://blogs.msdn.com/ericlippert/archive/2003/10/28/53298.aspx) is an interesting read on the topic.
The answer would be a ratio of how much access one has to the production environment to the quantity of lives or money at stake.
Workarounds.
I've had previous experience where a user deemed a feature dead due to a bug, notified us, waited until the bug was fixed, and then told us that during the downtime on that section they had been entering information into their old Excel version of the application (an Oracle APEX migration from Excel) - and then nicely asked us for the turnaround time on importing the data from their Excel file again. The turnaround for that was longer than the downtime for the original bug.