Essential techniques for pinpointing missing requirements? - specifications

An initial draft of the requirements specification has been completed, and now it is time to take stock of the requirements and review the specification. Part of this process is to make sure that there are no sizeable gaps in the specification. Needless to say, gaps lead to highly inaccurate estimates, inevitable scope creep later in the project and ultimately to a death march.
What are good, efficient techniques for pinpointing missing and implicit requirements?
This question is about practical techniques, not general advice, principles or guidelines.
Missing requirements are anything crucial for completeness of the product or service but not thought of or forgotten about.
Implicit requirements are things that users or customers naturally assume will be a standard part of the software without having to ask for them explicitly.
I am happy to revisit the accepted answer, as long as someone submits a better, more comprehensive solution.

Continued, frequent, frank, and two-way communication with the customer strikes me as the main 'technique' as far as I'm concerned.

It depends.
It depends on whether you're being paid to deliver what you said you'd deliver or to deliver high quality software to the client.
If the former, simply eliminate ambiguity from the specifications and then build what you agreed to. Try to stay away from anything not measurable (like "fast", "cool", "snappy", etc...).
If the latter, what Galwegian said + time or simply cut everything not absolutely drop-dead critical and build that as quickly as you can. Production has a remarkable way of illuminating what you missed in Analysis.

Evaluate the lifecycle of the elements of the model with respect to a generic, overall model such as
acquisition --> stewardship --> disposal
Do you know where every entity comes from and how you're going to get it into your system?
Do you know where every entity, once acquired, will reside, and for how long?
Do you know what to do with each entity when it is no longer needed?
For a more fine-grained analysis of the lifecycle of the entities in the spec, make a CRUDE matrix for the major entities in the requirements; this is a matrix with the operations/applications as the rows and the entities as the columns. In each cell, put a C if the application Creates the entity, R for Reads, U for Updates, D for Deletes, or E for "Edits"; 'E' encompasses C, R, U, and D (most 'master table maintenance' apps will be Es). Then check each column for C, R, U, and D (or E); if one is missing (except E), figure out if it is needed. The rows and columns of the matrix can be rearranged (manually or using affinity analysis) to form cohesive groups of entities and applications which generally correspond to subsystems; this may assist with physical system distribution later.
It is also useful to add a "User" entity column to the CRUDE matrix and specify for each application (or feature or functional area or whatever you want to call the processing/behavioral aspects of the requirements) whether it takes Input from the user, produces Output for the user, or Interacts with the user (I use I, O, and N for this, and always make the User the first column). This helps identify where user interfaces for data entry and reports will be required.
The goal is to check the completeness of the specification; the techniques above are useful to check whether the lifecycle of each entity is 'closed' with respect to the entities and applications identified.
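The column check on the CRUDE matrix described above can be sketched in a few lines of Python. This is only an illustration: the applications, entities, and cell contents are made up, and the matrix is held as a plain dictionary rather than in any particular tool.

```python
# Hypothetical CRUDE matrix: rows are applications, columns are entities.
# Each cell lists the operations the application performs on the entity.
crude_matrix = {
    "Order entry":          {"Customer": "R", "Order": "CU", "Invoice": ""},
    "Billing run":          {"Customer": "R", "Order": "R",  "Invoice": "C"},
    "Customer maintenance": {"Customer": "E", "Order": "",   "Invoice": ""},
}


def missing_operations(matrix):
    """Return, per entity, the CRUD letters that no application covers."""
    covered = {}
    for cells in matrix.values():
        for entity, letters in cells.items():
            ops = covered.setdefault(entity, set())
            for letter in letters:
                # 'E' ("Edits") stands for all of C, R, U and D.
                ops.update("CRUD" if letter == "E" else letter)
    return {entity: sorted(set("CRUD") - ops) for entity, ops in covered.items()}


gaps = missing_operations(crude_matrix)
for entity, missing in gaps.items():
    print(entity, "is missing:", "".join(missing) or "nothing")
```

Every letter reported as missing is a question to take back to the customer (for example, "who deletes an Order, and when?"), not automatically a defect in the spec.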

Here's how you find the missing requirements.
1. Break the requirements down into tiny little increments. Really small. Something that can be built in two weeks or less. You'll find a lot of gaps.
2. Prioritize those into what would be best to have first, what's next down to what doesn't really matter very much. You'll find that some of the gap-fillers didn't matter. You'll also find that some of the original "requirements" are merely desirable.
3. Debate the differences of opinion as to what's most important to the end users and why. Two users will have three opinions. You'll find that some users have no clue, and none of their "requirements" are required. You'll find that some people have no spine, and things they aren't brave enough to say out loud are "required".
4. Get a consensus on the top two or three only. Don't argue out every nuance. It isn't possible to envision software. It isn't possible for anyone to envision what software will be like and how they will use it. Most people's "requirements" are descriptions of how they struggle to work around the inadequate business processes they're stuck with today.
5. Build the highest-priority, most important part first. Give it to users.
6. GOTO 1 and repeat the process.
"Wait," you say, "What about the overall budget?" What about it? You can never know the overall budget. Do the following.
Look at each increment defined in step 1. Provide a price-per-increment. In priority order. That way someone can pick as much or as little as they want. There's no large, scary "Big Budgetary Estimate With A Lot Of Zeroes". It's all negotiable.

I have been using a modeling methodology called Behavior Engineering (bE) that uses the original specification text to create the resulting model. Once you have the model, it is easier to identify missing or incomplete sections of the requirements.
I have used the methodology on about six projects so far, ranging from less than a hundred requirements to over 1300 requirements. If you want to know more, I would suggest going to www.behaviorengineering.org; there are some really good papers there regarding the methodology.
The company I work for has created a tool to perform the modeling. The work rate to actually create the model is about 5 requirements an hour for a novice and about 13 requirements an hour for an expert. The cool thing about the methodology is that you don't really need to know anything about the domain the specification is written for. Using just the user text, such as nouns and verbs, the modeller will find gaps in the model in a very short period of time.
I hope this helps
Michael Larsen

How about building a prototype?

While reading tons of literature about software requirements, I found these two interesting books:
Problem Frames: Analysing & Structuring Software Development Problems by Michael Jackson (not a singer! :-).
Practical Software Requirements: A Manual of Content and Style by Benjamin Kovitz.
These two authors really stand out from the crowd because, in my humble opinion, they are making a really good attempt to turn development of requirements into a very systematic process - more like engineering than art or black magic. In particular, Michael Jackson's definition of what requirements really are - I think it is the cleanest and most precise that I've ever seen.
I wouldn't be doing these authors a good service by trying to describe their approach in a short posting here, so I am not going to do that. But I will try to explain why their approach seems to be extremely relevant to your question: it allows you to boil down most (not all, but most!) of your requirements development work to processing a bunch of check-lists telling you what requirements you have to define to cover all important aspects of the entire customer's problem. In other words, this approach is supposed to minimize the risk of missing important requirements (including those that often remain implicit).
I know it may sound like magic, but it isn't. It still takes a substantial mental effort to come to those "magic" check-lists: you have to articulate the customer's problem first, then analyze it thoroughly, and finally dissect it into so-called "problem frames" (which come with those magic check-lists only when they closely match a few typical problem frames defined by authors). Like I said, this approach does not promise to make everything simple. But it definitely promises to make requirements development process as systematic as possible.
If requirements development in your current project is already quite far along, it may not be feasible to try to apply the Problem Frames Approach at this point (although it greatly depends on how your current requirements are organized). Still, I highly recommend reading those two books - they contain a lot of wisdom that you may still be able to apply to the current project.
My last important notes about these books:
As far as I understand, Mr. Jackson is the original author of the idea of "problem frames". His book is quite academic and theoretical, but it is very, very readable and even entertaining.
Mr. Kovitz's book tries to demonstrate how Mr. Jackson's ideas can be applied in real practice. It also contains tons of useful information on writing and organizing the actual requirements and requirements documents.
You can probably start with Kovitz's book (and refer to Mr. Jackson's book only if you really need to dig deeper on the theoretical side). But I am sure that, at the end of the day, you should read both books, and you won't regret that. :-)
HTH...

I agree with Galwegian. The technique described is far more efficient than the "wait for customer to yell at us" approach.


How bad is SLOC (source lines of code) as a metric? [closed]

We are documenting our software development process. For technical people, this is pretty easy: iterative development with internal milestones every four weeks, external every 3 months.
However, the purpose of this exercise is to expose things for our project management in terms that they can understand. Specifically, these non-technical managers need metrics that they can understand.
I understand our options for metrics well and have proposed a whole set (requirements met and actual costs vs. budgeted costs are two of my favorites). However, we do have some old hands involved and they tend to hang onto metrics like SLOC.
I understand the temptation of SLOC: it seems easy for non-software people to understand and it seems like the closest analog of a physical thing (it's just like counting punched cards back in the old days!).
So here's the question: how can I explain the dangers of SLOC to a non-technical person?
Here's some concrete motivation: we work on a fairly mature deployed system that has years of history behind it. As we add features, SLOC tends to stay approximately level or even decrease (refactoring removes old / dead code, new features are really just adjustments of existing, etc). To a non-programmer manager, a non-increasing SLOC in a development project is perplexing at best....
Clarifying in response to a recent answer below: remember, I'm arguing that SLOC is a bad metric for the purposes of measuring project progress. I'm not arguing that it is a number that's not worth collecting. It requires extensive context to do anything useful with it and most program managers don't have that context.
Someone said:
"Using SLOC to measure software progress is like using kg for measuring progress on aircraft manufacturing"
It is totally inappropriate as it encourages bad practices like:
Copy-Paste-Syndrome
Avoiding refactoring that would make things easier
Stuffing with meaningless comments
...
The only use is that it can help you to estimate how much paper to put in the printer when you do a printout of the complete source tree.
The issue with SLOC is that it's an easy metric to game. Being productive does not equate to producing more code. So the way I've explained it to people, barring what Skilldrick said, is this:
The more lines of code there are the more complicated something gets.
The more complicated something gets, the harder it is to understand it.
Before I add a new feature or fix a bug I need to understand it.
Understanding takes time.
Time costs money.
Smaller code -> easier to understand -> cheaper to add new features
Bean counters can understand that.
Show them the difference between:
for (int i = 0; i < 10; i++) {
    print i;
}
and
print 0;
print 1;
print 2;
...
print 9;
And ask them whether 10 SLOC or 3 SLOC is better.
In response to the comments:
It doesn't take long to explain how a for loop works.
After you show them this, say "we now need to print numbers up to 100 - here's how you make that change." and show how much longer it takes to change the non-DRY code.

			
				
I disagree on SLOC being a bad metric. It may be moot to go into a years-old question with eleven answers, but I'll still add another.
Most arguments call it a bad metric because it is not suited to directly measure productivity. That is a strange argument; it assumes the metric to be used in an insane way. With this reasoning, one could call the Kelvin a bad unit because it is unsuited to measure distance.
Code length is a viable measure of ballast.
The amount of non-comment code lines correlates with:
undetected errors
maintenance costs
training time for new contributors
migration costs
new feature costs
and many more similar kinds of costs, like the cost of optimization.
Of course SLOC count isn't a precise measure of any of these. Code can be anywhere between very nice and very ugly to manage. But it can be assumed that code length is rarely free, and thus, longer code is often harder to manage.
If I were managing a team of programmers, I would very much want to keep track of the ballast it creates or removes.
Explain that SLOC is an excellent measurement of the lines of code in the application, nothing else. The number of lines in a book, or the length of a film doesn't determine how good it is. You can improve a film and shorten it, you can improve an application and reduce the lines of code.
Pretty bad (-:
A much better idea would be to count the test cases, rather than the code.
The idea is this: a developer should commit a test case that fails, then commit the fix in next build, and the test case should pass ... just measure how many test cases the developer added.
As a bonus collect coverage stats (branch coverage is better than line coverage here).
You don't judge how good a plane is (how many features it has, how it performs...) based on its weight (SLOC).
When you want your plane to fly higher, longer and perform better, you don't add weight to it. You replace parts of it with lighter/better materials. You strip off parts you don't need as to not add unnecessary weight.
I believe SLOC is a great metric. It tells you how large your system is. That is good for judging complexity and resources. And it helps you prepare the next developer for working on a codebase.
But SLOC count should be analyzed only AFTER other appropriate code quality metrics have been applied. So...
Do NOT write 2 lines of code when 1 will do, unless the 2-line version makes the code 2 times easier to maintain.
Do NOT fluff code with unnecessary comments just to fluff SLOC count.
Do NOT pay people by SLOC count.
I have been managing software projects for 30 years. I use SLOC count all the time, to help understand mature systems. I have never found it useful to even glance at SLOC count until a project is near version 1.0 release.
Basically, during the development process, I worry about quality, performance, usability, and conformance to specifications. Get those right, and the project will probably be a success. When the dust settles, look at SLOC count. You might be surprised that you got SO much out of 5,000 lines of code. And you might be surprised that you got SO little! (But SLOC count does not affect quality, performance, usability, and conformance to specification.)
And always code like the person who will be working on your code next is a violent psychopath who knows where you live.
Cheers,
Uncle Chip
Even modern code metrics tools criticize SLOC counting; I like the point made in the ProjectCodeMeter FAQ:
What's wrong with counting Lines Of Code (SLOC / LLOC)?
Why SLOC is bad as an individual metric of productivity
Think of code as a block of clay/stone. You need to carve, say 10 statues. It's not how many statues you carve that counts. It's how well you've carved it that counts. Similarly it's not how many lines you've written but how well they are functioning. In case of code LOC can backfire as a metric this way.
Productivity also changes when writing a complex piece of code. It takes a second to write a print statement but a lot of time to write a complex piece of logic. Not all fingers are equal.
How SLOC can be used to your benefit
I think SLOC for defect % is a good metric. Yes the difficulty level comes into play but this is a good parameter that the managers can throw around while doing business. Try to think from their perspective too. They don't hate you or your work, but they need to tell customers that you're the best and for that they need something tangible. Give them what you can :)
SLOC can be changed dramatically by putting extra empty lines ("for readability") or by putting or removal of comments. So relying on SLOC only can lead to confusion.
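A small Python sketch makes the point concrete. The `count_lines` helper is hypothetical and deliberately naive: it assumes `#`-style comments and treats any non-blank, non-comment line as code. Real SLOC tools apply more careful rules.

```python
def count_lines(source):
    """Return (physical_lines, code_lines) for Python-style source text."""
    lines = source.splitlines()
    # A "code line" here is any line that is neither blank nor a '#' comment.
    code = [ln for ln in lines if ln.strip() and not ln.strip().startswith("#")]
    return len(lines), len(code)


compact = "for i in range(10):\n    print(i)\n"
padded = (
    "# loop over the digits\n"
    "\n"
    "for i in range(10):\n"
    "\n"
    "    # print one digit\n"
    "    print(i)\n"
)

compact_counts = count_lines(compact)
padded_counts = count_lines(padded)
print(compact_counts, padded_counts)
```

Both snippets do exactly the same work, yet the physical line count triples while the non-comment count stays the same, so any SLOC figure needs to state which counting rule was used.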
Why don't they understand that the SLOC hasn't changed, but the software does more than it did yesterday because you've added new features, or fixed bugs?
Now explain it to them like this. Measuring how much work was done in your code by comparing the lines of code is the same as measuring how many features are in your cell phone by comparing its size. Cell phones have decreased in size over 20 years while adding more features because of technological improvements and techniques. Good code follows this same principle, as we can express the same logic in fewer and fewer lines of code, making it faster to run, easier to maintain, and simpler to understand as we improve our understanding of the problem and introduce new techniques for development.
I would get them to focus on the business value returned through feature development, maintenance, and bug fixes. If whoever is happy with the software says they can see improvement don't sweat the SLOC.
Go read this:
https://stackoverflow.com/questions/3800707/what-is-negative-code

Best Dijkstra papers to explain this quote?

I was enjoying "The Humble Programmer" earlier today and ran across this choice quote:
Therefore, for the time being and perhaps forever, the rules of the second kind present themselves as elements of discipline required from the programmer. Some of the rules I have in mind are so clear that they can be taught and that there never needs to be an argument as to whether a given program violates them or not. Examples are the requirements that no loop should be written down without providing a proof for termination nor without stating the relation whose invariance will not be destroyed by the execution of the repeatable statement.
I'm looking for which of Dijkstra's 1300+ writings best describe in further detail rules such as he was describing above.
Page 5 through 18: http://userweb.cs.utexas.edu/users/EWD/ewd02xx/EWD249.PDF
Mid. page 3 through end: http://userweb.cs.utexas.edu/users/EWD/ewd04xx/EWD473.PDF
End page 5 through end: http://userweb.cs.utexas.edu/users/EWD/ewd06xx/EWD641.PDF
All: http://userweb.cs.utexas.edu/users/EWD/transcriptions/EWD02xx/EWD261.html (Dutch; translation below)
Note: Dijkstra numbers his pages starting at 0. The page numbers given above start at 1; they are the PDF page numbers, not the numbers written on the pages.
My translation of EWD261 in English:
How to program mathematically
A (well-defined) programme is structured just like a (well-defined) mathematical theory. The programmers' work is not different from that of a creative mathematician.
There are small, but important, differences, though:
There are not many basic concepts in programming and they are not difficult to comprehend (though misleadingly simple); this is why it is ideal for development practice. (Besides this, there is the demand for correctness: the programme should really work!)
In most mathematical education one learns about existing theorems, viz. equipping a student with a specific (detailed) set of concepts; a programmer, however, has to develop the needed concepts himself. Programming requires abstraction, which leads to a type of creativity, while the same in mathematics is limited to applying existing theorems.
Because programmes are big and nevertheless have to work, programmers will learn how to develop carefully and consciously. This is exactly what one should teach! To teach extensive knowledge is, for me, not justified.

Looking for examples where knowledge of discrete mathematics is helpful [closed]

Inspired by watching Michael Feathers' SCNA talk "Self-Education and the Craftsman", I am interested to hear about practical examples in software development where discrete mathematics has proved helpful.
Discrete math has touched every aspect of software development, as software development is based on computer science at its core.
http://en.wikipedia.org/wiki/Discrete_math
Read that link. You will see that there are numerous practical applications, although this wikipedia entry speaks mainly in theoretical terms.
Techniques I learned in my discrete math course from university helped me quite a bit with the Professor Layton games.
That counts as helpful... right?
There are a lot of real-life examples where map coloring algorithms are helpful, besides just for coloring maps. The question on my final exam had to do with traffic light programming on a six-way intersection.
As San Jacinto indicates, the fundamentals of programming are very much bound up in discrete mathematics. Moreover, 'discrete mathematics' is a very broad term. These things perhaps make it harder to pick out particular examples. I can come up with a handful, but there are many, many others.
Compiler implementation is a good source of examples: obviously there's automata / formal language theory in there; register allocation can be expressed in terms of graph colouring; the classic data flow analyses used in optimizing compilers can be expressed in terms of functions on lattice-like algebraic structures.
A simple example of the use of directed graphs is a build system that orders individual tasks according to their dependencies by performing a topological sort. I suspect that if you tried to solve this problem without having the concept of a directed graph then you'd probably end up trying to track the dependencies all the way through the build with fiddly book-keeping code (and then finding that your handling of cyclic dependencies was less than elegant).
Clearly most programmers don't write their own optimizing compilers or build systems, so I'll pick an example from my own experience. There is a company that provides road data for satnav systems. They wanted automatic integrity checks on their data, one of which was that the network should all be connected up, i.e. it should be possible to get to anywhere from any starting point. Checking the data by trying to find routes between all pairs of positions would be impractical. However, it is possible to derive a directed graph from the road network data (in such a way that it encodes things like turning restrictions) such that the problem is reduced to finding the strongly connected components of the graph - a standard graph-theoretic problem for which efficient algorithms exist.
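As a rough illustration of the road-network check, here is a Python sketch. It does not enumerate the strongly connected components (Tarjan's or Kosaraju's algorithm would do that); it only answers the narrower question "is the whole graph one strongly connected component?" by searching forwards and backwards from a single node. The edge list is a made-up stand-in for graph edges derived from real road data.

```python
from collections import deque


def reachable(adjacency, start):
    """Return the set of nodes reachable from start by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def is_strongly_connected(edges):
    """True if every node can reach every other node along directed edges."""
    forward, backward, nodes = {}, {}, set()
    for src, dst in edges:
        forward.setdefault(src, []).append(dst)
        backward.setdefault(dst, []).append(src)
        nodes.update((src, dst))
    if not nodes:
        return True
    start = next(iter(nodes))
    # Everything must be reachable from start, and start from everything.
    return reachable(forward, start) == nodes and reachable(backward, start) == nodes


ring_road = [("A", "B"), ("B", "C"), ("C", "A")]
with_dead_end = ring_road + [("C", "D")]  # one-way street into D, no way out

print(is_strongly_connected(ring_road), is_strongly_connected(with_dead_end))
```

Two linear-time searches replace route-finding between all pairs of positions, which is the impracticality the answer refers to.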
I've been taking a course on software testing, and 3 of the lectures were dedicated to reviewing discrete mathematics, in relation to testing. Thinking about test plans in those terms seems to really help make testing more effective.
Understanding of set theory in particular is especially important for database development.
I'm sure there are numerous other applications, but those are two that come to mind here.
Just one example of many, many...
In build systems it's popular to use topological sorting of the jobs to be done.
By build system I mean any system where we have to manage jobs with a dependency relation.
It can be compiling a program, generating a document, constructing a building, organizing a conference - so there are applications in task management tools, collaboration tools etc.
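For illustration, Python's standard library (3.9 and later) ships a topological sorter in the `graphlib` module. The job names below are invented; the mapping goes from each job to the set of jobs it depends on.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical jobs: each key depends on the jobs in its value set.
dependencies = {
    "app": {"libcore", "libnet"},
    "libnet": {"libcore"},
    "libcore": set(),
}

# static_order() yields every job after all of its dependencies.
build_order = list(TopologicalSorter(dependencies).static_order())
print(build_order)


def has_cycle(deps):
    """Return True if the dependency relation contains a cycle."""
    try:
        list(TopologicalSorter(deps).static_order())
    except CycleError:
        return True
    return False


print(has_cycle({"a": {"b"}, "b": {"a"}}))
```

Cyclic dependencies, which tend to be awkward to handle with ad hoc book-keeping, surface here as a single `CycleError`.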
I believe testing itself properly proceeds from modus tollens, a concept of propositional logic (and hence discrete math), modus tollens being:
P=>Q. !Q, therefore !P.
If you plug in "If the feature is working properly, the test will pass" for P=>Q, and then take !Q as given ("the test did not pass"), then, if all these statements are factually correct, you have a valid, sound basis for returning the feature for a fix. By contrast, many, maybe most testers operate by the principle:
"If the program is working properly, the test will pass. The test passed, therefore the program is working properly."
This can be written as: P=>Q. Q, therefore P.
But this is the fallacy of "affirming the consequent" and does not show what the tester believes it shows. That is, they mistakenly believe that the feature has been "validated" and can be shipped. When Q is given, P may in fact either be true or it may be untrue for P=>Q, and this can be shown with a truth table.
Modus tollens is core to Karl Popper's notion of science as falsification, and testing should proceed in much the same way. We're attempting to falsify the claim that the feature always works under every explicit and implicit circumstance, rather than attempting to verify that it works in the narrow sense that it can work in some prescribed way.
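The truth-table argument can be checked mechanically. The Python sketch below enumerates all four assignments of P and Q and calls an argument form valid when no row makes every premise true and the conclusion false; the function names are just for this illustration.

```python
from itertools import product


def implies(p, q):
    """Material implication: P => Q is false only when P is true and Q is false."""
    return (not p) or q


def is_valid(premises, conclusion):
    """Valid means: in every row where all premises hold, the conclusion holds."""
    return all(
        conclusion(p, q)
        for p, q in product([True, False], repeat=2)
        if all(premise(p, q) for premise in premises)
    )


# Modus tollens: P => Q, not Q, therefore not P.
modus_tollens = is_valid([implies, lambda p, q: not q], lambda p, q: not p)

# Affirming the consequent: P => Q, Q, therefore P.
affirming_consequent = is_valid([implies, lambda p, q: q], lambda p, q: p)

print(modus_tollens, affirming_consequent)
```

The counterexample row for affirming the consequent is P false, Q true: the test passes although the feature is not working properly.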

How do I adapt my recommendation engine to cold starts?

I am curious what methods or approaches exist to overcome the "cold start" problem: when a new user or item enters the system, making recommendations is difficult because there is no information about the new entity yet.
I can think of doing some prediction-based recommendation (using attributes like gender, nationality and so on).
You can cold start a recommendation system.
There are two types of recommendation systems: collaborative filtering and content-based. Content-based systems use meta data about the things you are recommending. The question is then what meta data is important. The second approach is collaborative filtering, which doesn't care about the meta data; it just uses what people did or said about an item to make a recommendation. With collaborative filtering you don't have to worry about what terms in the meta data are important. In fact you don't need any meta data to make the recommendation. The problem with collaborative filtering is that you need data. Before you have enough data you can use content-based recommendations. You can provide recommendations that are based on both methods: at the beginning use 100% content-based, then as you get more data start to mix in collaborative filtering.
That is the method I have used in the past.
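A minimal sketch of that blending idea in Python follows. The linear ramp and the threshold of 50 interactions are assumptions made for the example, not something the approach prescribes; both scores are assumed to be on the same 0 to 1 scale.

```python
def blend_weight(interaction_count, ramp=50):
    """Share of the final score given to collaborative filtering (0.0 to 1.0).

    ramp is a hypothetical tuning knob: the number of recorded interactions
    at which collaborative filtering fully takes over.
    """
    return min(interaction_count / ramp, 1.0)


def hybrid_score(content_score, collaborative_score, interaction_count, ramp=50):
    """Blend the two scores, starting at 100% content-based."""
    w = blend_weight(interaction_count, ramp)
    return (1 - w) * content_score + w * collaborative_score


# A brand-new item relies purely on its meta data; a well-known one does not.
print(hybrid_score(0.8, 0.2, interaction_count=0))
print(hybrid_score(0.8, 0.2, interaction_count=25))
print(hybrid_score(0.8, 0.2, interaction_count=500))
```

In practice the weight would more likely be tuned against held-out data than fixed by hand.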
Another common technique is to treat the content-based portion as a simple search problem. You just put in meta data as the text or body of your document then index your documents. You can do this with Lucene & Solr without writing any code.
If you want to know how basic collaborative filtering works, check out Chapter 2 of "Programming Collective Intelligence" by Toby Segaran.
Maybe there are times you just shouldn't make a recommendation? "Insufficient data" should qualify as one of those times.
I just don't see how prediction recommendations based on "gender, nationality and so on" will amount to more than stereotyping.
IIRC, places such as Amazon built up their databases for a while before rolling out recommendations. It's not the kind of thing you want to get wrong; there are lots of stories out there about inappropriate recommendations based on insufficient data.
Working on this problem myself, but this paper from Microsoft on Boltzmann machines looks worthwhile: http://research.microsoft.com/pubs/81783/gunawardana09__unified_approac_build_hybrid_recom_system.pdf
This has been asked several times before (naturally, I cannot find those questions now :/), but the general conclusion was that it's better to avoid such recommendations. In various parts of the world the same names belong to different sexes, and so on ...
Recommendations based on "similar users liked..." clearly must wait. You can give out coupons or other incentives to survey respondents if you are absolutely committed to doing predictions based on user similarity.
There are two other ways to cold-start a recommendation engine.
Build a model yourself.
Get your suppliers to fill in key information to a skeleton model. (Also may require $ incentives.)
Lots of potential pitfalls in all of these, which are too common sense to mention.
As you might expect, there is no free lunch here. But think about it this way: recommendation engines are not a business plan. They merely enhance the business plan.
There are three things needed to address the Cold-Start Problem:
The data must have been profiled such that you have many different features (with product data the term used for 'feature' is often 'classification facets'). If you don't properly profile data as it comes in the door, your recommendation engine will stay 'cold' as it has nothing with which to classify recommendations.
MOST IMPORTANT: You need a user-feedback loop with which users can review the personalization engine's suggestions. For example, a Yes/No button for 'Was This Suggestion Helpful?' should queue a review that moves participants from one training dataset (i.e. the 'Recommend' training dataset) to another training dataset (i.e. the 'DO NOT Recommend' training dataset).
The model used for (Recommend/DO NOT Recommend) suggestions should never be considered to be a one-size-fits-all recommendation. In addition to classifying the product or service to suggest to a customer, how the firm classifies each specific customer matters too. If functioning properly, one should expect that customers with different features will get different suggestions for (Recommend/DO NOT Recommend) in a given situation. That would be the 'personalization' part of personalization engines.

What makes a good spec? [closed]

One of the items in the Joel Test is that a project/company should have a specification.
I'm wondering what makes a spec good. Some companies will write volumes of useless specification that no one ever reads, others will not write anything down because "no one will read any of it anyway". So, what do you put into your spec? What is the good balance between the two extremes? Is there something particularly important that really, really (!) should always be recorded in a specification?
The best spec is one that:
Exists
Describes WHAT, not HOW (no solutions)
Can be interpreted in as few ways as possible
Is widely-distributed
Is agreed-upon as being THE spec by all parties involved
Is concise
Is consistent
Is updated regularly as requirements change
Describes as much of the problem as is possible and practical
Is testable
What to put in a spec
You need to look at the audience of the spec and work out what they need to know. Is it just a document between you and a business sponsor? In this case it can probably be fairly lightweight. If it's a functional spec for a 100+ man-year J2EE project it will probably need a bit more detail.
The audience
The key question is: who is going to read the spec? A spec will have several potential sets of stakeholders:
The business owner who is signing off the system.
The developer who is building the system (which may or may not be you).
QA people who have to write test plans for it.
Maintenance staff wanting to understand the system.
Developers or analysts on other projects who may want to integrate other systems into it.
Requirements of typical key stakeholders:
The business owner needs to have a clear idea of what the system workflows and business rules are so they can have a fighting chance of understanding what they have agreed to. If they are the only major audience of the spec, concentrate on the user interface, screen-to-screen workflow and business and data validation rules.
Developers need a data model, data validation rules, some or all of the user interface design and enough description of the expected system behaviour so they know what to build. If you are writing for developers concentrate on the user interface, mapping to data model and rules in the user interface. This should be more detailed than if you are doing the development yourself because you are acting as an intermediary in a communication between two third parties.
If you are specifying an interface between two systems, this has to be very precise.
QA staff need enough information to work out how to test and validate the logic, validation and expected user interface behaviour of the application. A spec intended for developers and QA staff needs to be fairly unambiguous.
Maintenance staff need much the same information as developers plus a system roadmap document describing the architecture.
Integrators need a data model and clear definitions of any interfaces.
Key components of a spec:
I'm assuming that one is writing specs for business apps, so the content below is geared to this. Specs for other types of systems will have different emphasis. In my experience the key elements of a functional spec are:
User Interface: screen mockups and a description of the interaction behaviour of the system and workflow between screens.
Data Model: Definition of the data items and mapping to the user interface. User interface mappings are normally done in the bits of the spec describing the user interface.
Data Validation and Business Rules: What checks for correctness need to be made on the data and what computations are being made, along with definitions. Examples can be quite useful here.
Definitions of interfaces: If you have interfaces exposed that other systems can use, you need to specify those pretty tightly. The simpler internet RFCs give quite good examples of protocol designs and are quite a good start for examples of interface documents. Clearly defining interfaces isn't easy but will almost certainly save you grief down the track.
Glue: this is where use cases, workflow diagrams and other requirements related artifacts help. Generally an exhaustive listing of these is pointless, but there will be key areas within the system where this type of documentation helps to put items in context. My experience is that selective inclusion of use cases and other requirements level descriptions does a lot to add clarity and meaning to a spec but writing up a user story for every single interaction with the system is a waste of time.
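To illustrate the point under Data Validation and Business Rules that examples can be useful, here is a minimal Python sketch of a rule written down together with its examples. The rule itself ("order quantity must be a whole number from 1 to 999"), the function names and the example values are all invented for illustration; they are not taken from any real spec.

```python
# Hypothetical rule, invented for illustration:
# "Order quantity must be a whole number from 1 to 999 inclusive."

def validate_quantity(value):
    """Return (ok, message) for a raw order-quantity field."""
    text = str(value).strip()
    # Accept plain ASCII digits only, so int() below cannot fail.
    if not (text.isascii() and text.isdigit()):
        return False, "quantity must be a whole number"
    quantity = int(text)
    if not 1 <= quantity <= 999:
        return False, "quantity must be between 1 and 999"
    return True, ""

# The examples a spec might list next to the rule: (raw input, accepted?)
SPEC_EXAMPLES = [
    ("1", True),
    ("999", True),
    ("0", False),
    ("1000", False),
    ("2.5", False),
    ("", False),
]

def check_examples():
    """Return the examples whose outcome disagrees with the spec."""
    return [(raw, expected) for raw, expected in SPEC_EXAMPLES
            if validate_quantity(raw)[0] != expected]
```

Listing the boundary cases (0, 1, 999, 1000) next to the rule is what removes the ambiguity about whether the limits are inclusive; the same list can later double as test data.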
Joel (of 'on software' fame) wrote a good series of articles on this called Painless Functional Specification which I've referred people to on quite a few occasions. It's quite a good set of articles and well worth a read. In my opinion, your objective is to clearly explain what the system is supposed to do in a way that minimises ambiguity. It's quite useful to think of the spec as a reference document - what might the various stakeholders want to be able to easily look up.
Having written a glib set of bullet points about specs, the clear communication part is harder than it looks. Specs are actually non-trivial technical documents and are quite a test of one's technical writing and editorial skills. You are actually in the business of writing a document that describes what someone is supposed to build. Doing good specs is a bit of an art.
The pay-off for doing specs is that no-one else wants to do them. As you've written what is probably the only document of any importance for the system, you get to call the shots. Anyone else with an agenda has to either lobby you to change the spec or somehow impose a competing spec on the project. This is a good example of the pen being mightier than the sword.
EDIT: It has been my experience that debate about the distinction between 'how' and 'what' tends to be pretty self-serving. On any non-trivial project the data model and user interface will have multiple stakeholders, not all of whom are the system's developers. Working in data warehousing will give one a taste for the chaos that ensues when an application data model is allowed to become a free-for-all, and PFS should give one a feel for the potential set of stakeholders a spec has to cater to.
The fact that someone owns a data model or user interface design doesn't mean that these are just decided by fiat - there can be a discourse and negotiation process. However, as a project gets larger the value of ownership and consistency in these gets greater. It's been my observation in the past that the best way to appreciate the value of a good analyst is to see the damage done by a bad one.
In my experience a spec will have more chance of being read if it has the following:
Use diagrams where possible - pictures are worth 1000 words
Have a title page that clearly indicates what the spec is describing
Have a style that is used throughout the document. Make all headers the same font, size and style. Make the font the same all the way through, use the same bullet styles etc
DON'T WAFFLE - Be clear, concise and to the point, and don't add extra cruft to pad out your document. If a point can't be explained in a few lines of text, then maybe you need to break it down further
I have seen companies where the person writing the spec doesn't understand the system. It's almost a way of learning the system by writing the spec. This usually ends in tears...
As someone who develops bespoke software for clients, the best spec is the one which the customer has signed.
It doesn't matter how refined your spec is - if the customer hasn't explicitly agreed to it in writing, they'll change it and expect you to roll with their changes seamlessly, wrecking your beautiful architecture...
Good specs should contain requirements that are measurable and verifiable. When looking at each requirement, you should be able to easily answer the question, "How can I prove I have fulfilled this requirement?".
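As a rough illustration of "How can I prove I have fulfilled this requirement?", the Python sketch below turns a vague wish ("search should be fast") into a requirement that can be checked against measurements. The requirement, the threshold and the function names are hypothetical, chosen only to show the shape of a verifiable statement.

```python
# Hypothetical requirement, invented for illustration:
# "95% of search requests complete within 200 ms."
THRESHOLD_MS = 200
PERCENTILE = 95

def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    # 1-based rank, rounded up, using integer arithmetic only.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]

def requirement_met(latencies_ms):
    """True when the measured latencies satisfy the requirement."""
    return percentile(latencies_ms, PERCENTILE) <= THRESHOLD_MS
```

Given a list of measured latencies, `requirement_met` gives a yes-or-no answer, which is something "fast" or "snappy" can never do.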
Read Joel's series of "Painless Functional Specifications" followups to the Joel Test article. They also appear in the "Joel on Software" book.
Depends on how big the project is and (like all architecture decisions) what the constraints are. A good start is
a short description, a "one pager"
a context diagram -- where are the boundaries, what interacts with the system?
use cases/user stories
a GUI prototype or paper prototype, if applicable
a description of the needed nonfunctional requirements (performance etc.)
Best of all is to have an acceptance test, i.e., a testable statement of things that can be checked, along with an agreement that when those things are done, the project is complete.
It also helps if you start by stating the goal the user has, or the general idea behind a certain function, rather than filling in the exact implementation. Specifying the implementation always feels to me like narrowing down open-mindedness and ruling out more creative (or more usable) solutions. So you should keep "all options open".
Example
You're writing software to measure "X".
Instead of stating:
There has to be a start button and a save button.
Use:
The user has to be able to start a measurement and save it.
Why?
Because in the first situation you already determined what the solution has to be, while the second situation gives you flexibility on how to implement something. Now this may seem trivial, but I have the feeling "programmers" tend to think more in solutions rather than in problems (or situations). When you add more functionality this becomes more obvious, because then it might have been better to use a wizard or automate the process, but you already narrowed the ideas down to using buttons.
For functional requirements—or, more specifically, behavioral requirements—I like to use Cucumber and Gherkin.
Here’s an example of a simple and short specification for a new feature in a simple mapping application. The feature allows small businesses to sign up to the mapping platform and add their places of business on a Google Maps-like service.
Feature: Allow new businesses to appear on the map
  Scenario Outline: Businesses should provide required data
    Given a <business> at <location>
    When <business> signs up to the map platform
    Then it <should?> be added to the platform
    And its name <should?> appear on the map at <location>
    Examples: Business name and location should be required
      | business         | location | should?   |
      | UNNAMED BUSINESS | NOWHERE  | shouldn't |
    Examples: Allow only businesses with correct names
      | business         | location                  | should?   |
      | Back to Black    | 8114 2nd Street, Stockton | should    |
      | UNNAMED BUSINESS | 8114 2nd Street, Stockton | shouldn't |
    Examples: Allow businesses with two or more establishments
      | business   | location                | should? |
      | Deep Lemon | 6750 Street South, Reno | should  |
      | Deep Lemon | 289 Laurel Drive, Reno  | should  |
    Examples: Allow only suitable locations
      | business | location                | should?   |
      | Anchor   | 77 Chapel Road, Chicago | should    |
      | Anchor   | Chicago River, Chicago  | shouldn't |
      | Anchor   | NOWHERE                 | shouldn't |
This specification looks deceptively simple, but is in fact quite powerful.
Good specifications are clear, unambiguous and concrete. They don't need to be deciphered in order to write working code. That’s exactly what Gherkin specs are. They’re best served short and simple. Instead of writing a long ass specification document, you let the specification suite evolve along with your product by writing new specs in every iteration.
Gherkin is a business-readable language for writing specification documents based on the Given-When-Then template. The template can be automated into acceptance tests. Automating the specification ensures it stays up to date because the captured conversation is directly tied to testing code. This way, tests can be used as documentation, because Gherkin features have to change every time the code changes.
When each business rule is given an automated test, Gherkin specifications become so-called executable specifications—specifications that can be run as computer programs. The program tests whether the acceptance criteria were implemented correctly. So at the end of the day, we get a yes-or-no answer to the question of whether our product is actually doing what we expect it to do—which in itself is very valuable, as it contributes to making software of better quality.
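To make the idea of an executable specification concrete, here is a minimal plain-Python sketch that runs the rows of the Examples tables against a toy implementation and reports any mismatch. It does not use Cucumber or any Gherkin tooling; `sign_up`, `SUITABLE_LOCATIONS` and `run_specification` are hypothetical names, and the toy rules (reject the placeholder name, accept only a fixed set of addresses) are assumptions standing in for a real platform.

```python
# Toy stand-in for the real platform; every name here is hypothetical.
SUITABLE_LOCATIONS = {
    "8114 2nd Street, Stockton",
    "6750 Street South, Reno",
    "289 Laurel Drive, Reno",
    "77 Chapel Road, Chicago",
}

def sign_up(business, location, platform):
    """Add the business to the platform if name and location are acceptable."""
    if business == "UNNAMED BUSINESS" or location not in SUITABLE_LOCATIONS:
        return False
    platform.setdefault(business, set()).add(location)
    return True

# Rows copied from the Examples tables: (business, location, should be added?)
EXAMPLES = [
    ("UNNAMED BUSINESS", "NOWHERE", False),
    ("Back to Black", "8114 2nd Street, Stockton", True),
    ("UNNAMED BUSINESS", "8114 2nd Street, Stockton", False),
    ("Deep Lemon", "6750 Street South, Reno", True),
    ("Deep Lemon", "289 Laurel Drive, Reno", True),
    ("Anchor", "77 Chapel Road, Chicago", True),
    ("Anchor", "Chicago River, Chicago", False),
    ("Anchor", "NOWHERE", False),
]

def run_specification():
    """Return the rows whose observed outcome differs from the expected one."""
    platform = {}
    failures = []
    for business, location, expected in EXAMPLES:
        added = sign_up(business, location, platform)          # "Then it should be added"
        on_map = location in platform.get(business, set())     # "And its name should appear"
        if added != expected or on_map != expected:
            failures.append((business, location, expected))
    return failures
```

An empty list from `run_specification()` is the yes answer; any returned row points at a business rule the implementation no longer honours. A real Cucumber setup would bind the Given/When/Then lines to step definitions instead of a hand-written loop.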
The direct connection between Gherkin specifications and testing code often reduces the damage of waste by creating and cultivating a system of living documentation. Thanks to frequent validation of tests, as in continuous integration systems, you can know that Given-When-Thens are still up to date—and when you trust your tests, you can use the corresponding Gherkin specifications as documentation for the entire system.
In fact, there’s an entire methodology called Specification by Example that uses tools like Gherkin. Specification by Example's practices reduce the possibility of misunderstandings and rework by giving you a framework for talking with business stakeholders, one that forces you to use concrete, discrete, unambiguous examples in your specification documents.
If you want to read more about Cucumber, Gherkin, BDD, and Specification by Example, I wrote a book on the subject. “Writing Great Specifications” explores the art of writing great scenarios and will help you make executable specifications a core part of your development process.
If you are interested in buying “Writing Great Specifications,” you can save 39% with the promo code 39nicieja2 :)
I think writing "Use cases" should save you a bunch of pages
+1 #KiwiBastard, and I would add: write in bullet points and make each bullet testable.
A blueprint that describes all of the critical information necessary for the implementation, but doesn't waste any effort on describing all of the trivial or obvious information that is also necessary.
It should just be enough information to ensure that the implementation is "as expected", without providing too much additional noise that isn't necessary.
In practice, most people get this wrong, as they focus on the easy stuff (which is the least necessary) and shy away from the hard stuff (which is what you really really want to lock down). I've seen way too many 2-inch documents that completely and utterly miss the point, and very few 3-page ones that hit it dead on.
Specs don't have to be long, they just have to contain the right stuff!
(hint: if the programmer didn't look at that page while coding, it probably wasn't required)
Paul.