How is the deployment process handled in Scrum methodology?

We are developing a complex system using Scrum with one-week sprints and a team of 6 developers.
Each developer's machine is kept up to date once changes are tested and integrated on the development branches, and the developers integrate their changes into a common test server daily.
But the production system is critical enough that any issue or downtime costs a lot of money, and the deployment process is slow, hard and delicate. Even though the changes are tested and deployed to a test server, problems sometimes arise when we try to publish the whole week's progress as a single batch. So we have chosen to run a pre-release process after all of the week's development is completed and deployed to the test server: we run full feature tests on the whole week's changes on the test server, then publish the week's batch to a preproduction server. Sometimes everything goes fine, but sometimes new problems appear during the deployment or in the published changes. We then plan the highly delicate production deployment and execute it on the next night we can, avoiding any downtime for the customer's work.
Now we are having discussions with the customer, who argues this is not Scrum because he isn't getting the sprint result on the last day of the sprint, but three days later. But obviously we can't start the pre-release and release process until the sprint is completely finished - so, the next day - and then the system's complexity and criticality forces us to secure the deployment process as much as possible, and the customer's production usage also requires some special operation scheduling.
Are we working against the Scrum guidelines? Where does the deployment process fit in Scrum? Is Scrum appropriate for this project?

the deployment process is slow, hard and delicate.
When a deployment process is hard, it tends to mean organisations deploy less frequently. If they deploy less often, releases become bigger, more difficult and more critical. This tends to create even more reluctance to release.
This negative cycle works against Agile because it means organisations struggle to respond to change.
The best thing you can do is try to break out of this cycle by improving the release process. This may be difficult and consume time and resources, but the benefits are significant.
If you can automate your releases, you tend to reduce the risk. With the risk lowered, releasing more frequently becomes possible. Frequent releases mean that each release is smaller and you can quickly fix bugs if necessary.
Frequent releases also make the customer happier as they get more opportunities to provide feedback. The more feedback they give, the sooner the product will be what they want.
Perhaps a good place to start would be to automate the releases you currently make to the common test server. Once you have been doing this for a while you should have the confidence to use the same process on production.
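To make that concrete, here is a minimal sketch of what an automated release to the test server might look like. Everything in it (host name, paths, the test entry point) is an assumption for illustration, not something taken from the question:

    #!/usr/bin/env python3
    # Minimal sketch of an automated release to a test server.
    # All host names, paths and commands below are assumptions for illustration.
    import subprocess
    import sys

    TEST_SERVER = "testserver.example.com"   # assumed host name
    DEPLOY_PATH = "/var/www/app"             # assumed target path

    def run(cmd):
        """Run a shell command and abort the release on any failure."""
        print("+ " + " ".join(cmd))
        subprocess.run(cmd, check=True)

    def main():
        run(["git", "pull", "--ff-only"])            # get the integrated changes
        run(["./run_tests.sh"])                      # assumed test entry point
        run(["rsync", "-az", "--delete", "./build/",
             f"{TEST_SERVER}:{DEPLOY_PATH}"])        # push the build to the test server
        print("Release to test server completed.")

    if __name__ == "__main__":
        try:
            main()
        except subprocess.CalledProcessError:
            sys.exit("Release aborted: a step failed.")

Once a script like this runs every release without manual steps, the same structure can be grown towards the preproduction and production targets, which is where the risk reduction really pays off.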

Barnaby has the ideal answer. In the meantime, one possibility is to have a repeating story each sprint to release the approved stories from the previous sprint. Per the Scrum Guide, a team only delivers "a potentially releasable Increment of (a) 'Done' product at the end of each Sprint." The key word is "potentially." In addition to the problem you face, I have been in companies that only released once a quarter because that is what the customers wanted. If your customer wants a release every sprint, great, but nothing in the guide requires that to happen in the same sprint the stories are accepted in.
To clarify based on the comment, let's assume a team is using a traditional Dev-Test-Stage-Production architecture. The customer would review the changes in Stage (during the Demonstration Ceremony if not before). Accepted stories go into the release, which moves to Prod as part of the recurring "Release" story in the next sprint.

Related

How to use replication in combination with version control system?

The situation is as follows:
Our company operates two main production sites, communicating over a WAN. We develop a piece of software internally which uses about 100 GB of disk space on our servers (application data deployed to our customers, with a lot of images). In order to improve performance, our network administrators chose DFS replication (every 6 hours). This means that our users (people within the company) do not have to wait (sometimes 2-3 hours) to download the files they need, because the files are available locally (over the LAN).
The problem is that the algorithm used by DFS replication is "Last Writer Wins". So, in the case of simultaneous changes (during development/maintenance), the file with the latest date wins. I would like to avoid such data loss.
I am the project manager for the overall development process. What I want to do is introduce people to version control systems to tackle the simultaneous-modification problem. I plan to use Mercurial for several reasons, mainly because it is distributed, simple to explain, usable for personal work, free, and (most importantly) has great merging capabilities. However, the benefit of the version control system when used locally (LAN) is lost because of the replication process (WAN), which doesn't know how to merge.
Some possible solutions are to:
use only version control over the WAN (hoping that compression will be enough to speed things up)
use only DFS, and track changes manually (error-prone)
find a work-around with both methods
The team is small (about 10 people). Your help and experience are appreciated.
If it were me, I'd have a "central" repository at each location, with the developers from each site working on a different branch. One of those should probably be chosen as the "main" branch (ideally the one that will be making the most changes), although in practice it won't really matter much.
Each team's repo should be synchronized regularly (e.g., daily, on your 6 hour schedule, or even more often) with the repo from the other location, to reflect changes made in that branch. Then they would be merged to the site's branch (ideally this would be done automatically as part of the same update, but the exact details of how that merge will happen may vary, depending on your VCS of choice and your branching model).
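Since Mercurial is the tool being considered, the scheduled sync could be a small script like the sketch below, run at each site. The repository path, remote alias and branch names are assumptions for illustration, and merge conflicts would still need a human to resolve:

    #!/usr/bin/env python3
    # Sketch of a periodic sync between two site repositories using Mercurial.
    # Paths, the remote alias and branch names are assumptions for illustration.
    import subprocess

    LOCAL_REPO = "/srv/hg/project"      # assumed local "central" repo for this site
    REMOTE = "othersite"                # assumed alias defined in .hg/hgrc [paths]
    SITE_BRANCH = "site-a"              # assumed branch this site works on
    OTHER_BRANCH = "site-b"             # assumed branch the other site works on

    def hg(*args):
        subprocess.run(["hg", *args], cwd=LOCAL_REPO, check=True)

    # Pull everything the other site has pushed since the last sync.
    hg("pull", REMOTE)

    # Merge the other site's branch into this site's branch.
    # This aborts if there is nothing new to merge or if conflicts need a human.
    hg("update", SITE_BRANCH)
    hg("merge", OTHER_BRANCH)
    hg("commit", "-m", "Automated sync merge from " + OTHER_BRANCH)

    # Push the merged result back so the other site sees it on its next pull.
    hg("push", REMOTE)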
Remember: "sync early, sync often"

iPhone app versions: release micro-updates often, or major updates less frequently?

I released a new iPhone app 5 days ago. Already it has received high ratings, and many downloads, so I think it can be quite successful. (It's currently ranked in the top 10 paid music apps.)
What do you think is the best release strategy:
Release many micro-updates, often. (Just 1 or 2 new features per update, as they are completed.)
or
Release major updates less frequently. (Perhaps one new version every 1 or 2 months.)
The app is currently priced at $0.99 USD. Originally I planned to raise the price after the first major update. But if the app continues to sell well, I may leave the price alone.
Just curious to know how others have handled their app release cycles. Thanks!
Apple suggests:
High frequency updates - crashes or data loss: Updates to your application that address crashing and data loss should be submitted as frequently as necessary. Fixing as many related bugs as possible in each update is highly recommended.
Medium frequency updates - minor enhancements and usability improvements: Consider a release schedule between two to four weeks that groups together updates which do not affect the core functionality of your application, such as user interface improvements, spelling corrections, and minor functionality enhancements.
Low frequency updates - new features: Applications with new features should be submitted on a periodic, monthly basis. A high frequency of new feature updates suggests poor development planning and can be confusing to your customers.
They say that submitting updates very often may impact the time it takes for your updates to get approved (because each update has to be checked manually by the App Store reviewers).
One reason for not updating too frequently is that frequent updates can be annoying to users. Each time, they have to key in their password and wait for the update to download.

MS CRM recursive workflow and performance

I’m about to write a workflow in CRM that calls itself every day. This is a recursive workflow.
It will run on half a million entities each day and deactivate any record that has not been updated in the past 3 days.
I'm worried about performance. Has anyone else done this?
I haven't personally implemented anything like this, but that's 500,000 records that are floating around in the DB that the async service has to keep track of, which is going to tax your hardware. In addition, CRM keeps track of recursive workflow instances. I don't have the exact specs in front of me, but if a workflow calls itself a set number of times within a certain timeframe, CRM will kill the workflow.
Could you just write a console app that asks the Crm Service for records that haven't been updated in three days, and then deactivate them? Run it as a scheduled task once a day, and then your CRM system doesn't have the burden of keeping track of all those running workflow instances.
EDIT: Ah, I see now you might have been thinking of one workflow that runs on all the records as opposed to workflows running on each record. benjynito's advice makes sense if you go this route, although I still think a scheduled task would be more appropriate than using workflow.
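For illustration only, a scheduled job along those lines might look like the sketch below. The organisation URL, entity name, field names and status values are all assumptions (and the original question predates the current Web API, so the exact service calls would differ in older CRM versions):

    #!/usr/bin/env python3
    # Sketch of a scheduled job that deactivates records not updated in 3 days.
    # The URL, entity name, field names and status codes below are assumptions;
    # adapt them to your CRM version and authentication setup.
    from datetime import datetime, timedelta, timezone
    import requests

    BASE_URL = "https://yourorg.example.com/api/data/v9.0"   # assumed endpoint
    HEADERS = {"Authorization": "Bearer <token>",            # auth is out of scope here
               "Content-Type": "application/json"}

    cutoff = (datetime.now(timezone.utc) - timedelta(days=3)).strftime("%Y-%m-%dT%H:%M:%SZ")

    # Only fetch the records we actually intend to deactivate.
    query = (f"{BASE_URL}/accounts"
             f"?$select=accountid&$filter=modifiedon lt {cutoff} and statecode eq 0")
    records = requests.get(query, headers=HEADERS).json().get("value", [])

    for record in records:
        # Set the record inactive; exact statecode/statuscode values depend on the entity.
        requests.patch(f"{BASE_URL}/accounts({record['accountid']})",
                       headers=HEADERS,
                       json={"statecode": 1, "statuscode": 2})

Run once a day as a scheduled task, this keeps all of the work outside the async workflow service.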
You'll want to make sure your workflow is running in non-peak hours. Assuming you have an on-premise installation you should be able to get away with that. If you're using a hosted instance, you might be worried about one organization running the workflow while another organization is using the system. Use the timeout and maybe a custom workflow activity, if necessary, to force the start time to a certain period.
I'm assuming you'll be as efficient as possible in figuring out which records to deactivate. (i.e. Query Expression would only bring back the records you'll be deactivating).
The built-in infinite loop-protection offered by CRM shouldn't kill your workflow instances. It stops after a call depth of 8, but it resets to 1 if no calls are made for an hour. So the fact that you're doing this once a day should make you OK on the recursive workflow front.

How fast can you get a fixed bug into production?

I'm working with 2 very different applications.
App #1 is a web app where I have direct access to the FTP, so fixing bugs is pretty easy. Cat A bugs are usually fixed within the next day. No problems here.
App #2 is an oil-business document control app, where we have to go through two acceptance test phases - end-user testing and system testing. Any bug discovered after this phase will remain until the next version, usually 2-3 months later. Every new release package is a huge cost. It's really hard to explain to the end users that they have to live with some of the bugs until the next version.
How do you relate to critical bugs that can't be fixed immediately?
The faster I fix bugs the more bugs I find I need to fix.
In my personal opinion, the situation you describe is a very deep structural problem, and it should have been dealt with before the project started. Every programmer should know at least one person who can push changes directly if needed, and the procedure for this must be clear. Honestly, what about security issues or database problems with potential data loss? Of course, if you can't fix it directly, inform the staff and tell them "please don't do this", but the best way is to get the problem out of the world as soon as possible. I had a similar case in a terminal application where a program simply quit working after a button was pressed twice. The fix was trivial, but no one was allowed to apply it, and it literally cost hours for all the people depending on this thing to run. Demand a shortcut for important changes!
The speed which management allows you to fix a bug is directly related to the cost management will endure until the bug is fixed.
I'm a 1-man team. Nothing stands between me and my bugs :)
It really depends on a combination of the organisation size, system size, importance of the system and impact of the bug, e.g.:
One Man Shop or Low Impact System (quickest - App#1 above)
Time to fix bug = time to find bug + time to code fix + time to deploy to production
Large Organisation or Important System (longest - App#2 above)
Time to fix bug = time to find bug + time to document & prioritise bug + time to estimate cost + time to approve work on fix + time to design fix + time to document fix + time to code fix + time to document test plan + time to test fix + time to regression test + time to performance/load test + time to schedule & approve deployment + time to deploy fix
Edit: "How many Microsoft employees does it take to change a lightbulb?" (http://blogs.msdn.com/ericlippert/archive/2003/10/28/53298.aspx) is an interesting read on the topic.
The answer would be a ratio of how much access one has to the production environment to the quantity of lives or money at stake.
Workarounds.
I've had a previous experience where a user deemed a feature dead due to a bug, notified us, and waited until the bug was fixed. They then told us that during the downtime on that section they had been entering information into their old Excel version of the application (it was an Oracle APEX migration from Excel), and nicely asked us what the turnaround time would be for us to import the data from their Excel application again. The turnaround for that was longer than the downtime for the original bug.

What time should I build to production?

My users use the site pretty equally 24/7. Is there a meme for build timing?
International audience, single cluster of servers on Eastern Time, but it gets hit well into the morning by international clients.
One DB and several web servers, so if there are no DB changes it's simple - deploy whenever.
But when the site has to come down, when would you, as a programmer, be least annoyed to see SO down for, say, 15 minutes?
If there's truly no good time from the users' perspective, then I'd suggest doing it when your team has the most time to recover from any build-related disaster.
Here's what I have done and its worked well for me:
Get a site traffic analysis tool which will graph hourly user load
Select the low point in the graph for doing updates
If you're small, then yeah, find when your lowest usage period is, and do it then (for us personally, usually around 1AM-3AM PST is the lowest dip...but it never drops to 0 of course). Once you start growing to having a larger userbase, if you want people to take you seriously you'll need to design your application such that you can upgrade without downtime. This is not simple, and it often involves having multiple servers.
I've spent ages trying to get our application to this point. The best I've come up with so far is to run both the old version and the new version at the same time for a couple of hours. Users logged in at the time of the switchover stay on the old version until they log out. The next time they come in, they go to the new version. Any users coming on after the switchover get sent straight to the new version. It's still not foolproof, but it's pretty good.
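As a rough sketch of that switchover rule (not the poster's actual implementation), the routing decision can be reduced to comparing when a session started with the switchover time; the session field and backend addresses are assumptions:

    # Sketch of routing users to the old or new app version during a switchover.
    # The session field and backend addresses are assumptions for illustration.
    from datetime import datetime, timezone

    SWITCHOVER_AT = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)  # assumed cutover time
    OLD_BACKEND = "http://app-v1.internal"    # assumed backend addresses
    NEW_BACKEND = "http://app-v2.internal"

    def backend_for(session_started_at: datetime) -> str:
        """Users already logged in before the switchover stay on the old version
        until they log out; everyone else goes to the new version."""
        if session_started_at < SWITCHOVER_AT:
            return OLD_BACKEND
        return NEW_BACKEND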
What kind of an application is it? Most sites that I use tend to update around 2AM or 3AM.
Use a second site, and hotswap as needed.
The issue with hot-swapping is that the database would still be shared, and breaking changes would bring the stand-in down as well.
I guess you have to ask your clients.
In any case, there's the wee hours of the morning. If you're talking about a locally available website, I do not think users will mind if they get an "under maintenance" notice at 2 am in their time zone.
Depends on your location: 4AM East Coast / 1AM West Coast is typically the lightest time.
Pick a few times that you'd like to do it and offer them as choices to the decider-types. Whatever you do, put up a "down for routine maintenance" page while you deploy.
Check the time of least usage
Clone/copy/update the latest production code to another directory
If there are any database migrations to be done, perform those that are required and do not conflict with the old code base
At the time of least usage, move the symlink to point to the latest code (see the sketch below)
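A minimal sketch of that symlink swap, assuming each release lives in its own directory and the web server serves whatever a "current" link points to (the paths and names are assumptions):

    #!/usr/bin/env python3
    # Sketch of a symlink-swap deployment: each release sits in its own directory
    # and the web server serves whatever "current" points to. Paths are assumptions.
    import os
    import subprocess

    RELEASES_DIR = "/srv/app/releases"
    CURRENT_LINK = "/srv/app/current"
    NEW_RELEASE = os.path.join(RELEASES_DIR, "2010-05-01")   # assumed new release dir

    # Build the new release next to the old one; the live site is untouched so far.
    subprocess.run(["git", "clone", "--depth", "1",
                    "https://example.com/repo.git", NEW_RELEASE], check=True)

    # Swap the symlink atomically: create a temporary link, then rename over the old one.
    tmp_link = CURRENT_LINK + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(NEW_RELEASE, tmp_link)
    os.replace(tmp_link, CURRENT_LINK)   # os.replace is atomic on POSIX

Rolling back is then just pointing the symlink at the previous release directory.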
First use an analysis tool to try and determine your typically "light" traffic times. Depending on the site and your location in the world in comparison to most of your users, it could be 4am, it could be 1pm, who knows. Then, once you have a good timeframe nailed down, make sure to have your deployment process as automated as possible, so that it happens quickly to minimize the downtime of your site.