Specification: Use cases for CRUD - specifications

I am writing a Product requirements specification. In this document I must describe the ways that the user can interact with the system in a very high level. Several of these operations are "Create-Read-Update-Delete" on some objects.
The question is, when writing use cases for these operations, what is the right way to do so? Can I write only one Use Case called "Manage Object x" and then have these operations as included Use Cases? Or do I have to create one use case per operation, per object? The problem I see with the last approach is that I would be writing quite a few pages that I feel do not really contribute to the understanding of the problem.
What is the best practice?

The original concept for use cases was that they, like actors, and class definitions, and -- frankly everything -- enjoy inheritance, as well as <<uses>> and <<extends>> relationships.
A Use Case superclass ("CRUD") makes sense. A lot of use cases are trivial extensions to "CRUD" with an entity type plugged into the use case.
A few use cases will be interesting extensions to "CRUD" with variant processing scenarios for -- maybe -- a fancy search as part of Retrieve, or a multi-step process for Create or Update, or a complex confirmation for Delete.
Feel free to use inheritance to simplify and normalize your use cases. If you use a UML tool, you'll notice that Use Cases have an "inheritance" arrow available to them.

The answer really depends on how complex the interactions are and how many variations are possible from object to object. There are two real reasons why I suggest that you develop specific use cases for each CRUD
(a) If you really are only doing a high-level summary of the interaction then the overhead is very small
(b) I've found it useful to specify a set of generic Use Cases for modifying 'Resources' and then extending / overriding particular steps for particular objects. Obviously the common behaviour is captured in the generic 'Resource' use cases.
As your understanding of the domain develops (i.e. as business users dump more requirements on you), you are more likely to add to the CRUD rather than remove it.

It makes sense to distinguish between workflow cases and resource/object lifecycles.
They interact but they are not the same; it makes sense to specify them both.
Use case scenarios or more extended workflow specifications typically describe how a case may proceed through the system's workflow. This will typically include interaction with various different resources. These interactions can often be characterized as C,R,U or D.
Resource lifecycles provide the process model of what may happen to a particular (type of) resource (object). They are often trivial "flower" models that say: any of C,R,U,D may happen to this resource in any order, so they are not very interesting by themselves.
The link between the two is that steps from the workflow and from the lifecycles coincide.

I feel representation - as long as it makes sense and is readable - does not matter. Conforming to the UML spec in all details is especially irrelevant.
What does matter, that you spec clearly states the operations and operation types the implementaton requires.
C: What form of insert operations exists. Can you insert rows not fully populated? Can you insert rows without an ID? Can you retrieve the ID last inserted? Can you cancel an insert selectively? What happens on duplicate keys or constraints failure? Is there a REPLACE INTO equivalent?
R: By what fields can you select? Can you do arbitrary grouping, orders? Can you create aggregate fields, aliases? How can you retrieve embedded (has many etc.) data? How do you specify depth of recursion, limits?
U, D: see R + C

Related

Shall REST API Design pay Respect to Inheritance?

This is more of a question regarding design rather than implementation. I have been working on a personal application with Python Flask backend and Vue.js SPA frontend for quite some time now and am now a bit puzzled on how to design the actual REST API.
My backend heavily depends on inheritance, e.g. there's the broad notion of an Account and then there is the ManagedAccount, exhibiting all properties of the former but also additional ones. Then there are Transactions, PredictedTransactions, RecurringTransactions etc., I guess you get the gist.
I read up on REST API design beforehand, e.g. on the Overflow blog and before I was working with IDs, I had the following design:
/accounts/ [GET] - returns all Accounts including ManagedAccounts with Account-specific properties
/accounts/managed/ [GET] - returns all ManagedAccounts with ManagedAccount-specific properties
/accounts/ [POST] - creates a new Account
/accounts/managed/ [POST] - creates a new ManagedAccount
Now, this worked quite well until I introduced ID-dependent functions, e.g.
/accounts/<:id> [PATCH] - modifies the received data in Account with given ID, can also be used to alter Account-specific properties in ManagedAccounts
/accounts/managed/<:id> [PATCH] - modifies the received data in ManagedAccount with given ID
/accounts/<:id> [DELETE] - deletes an Account, can also be used for ManagedAccount
For all IDs I am using RFC4122-compliant UUIDs in version 4, so there definitely will never be a collision within the route. Still, this behavior triggers an assertion error with Flask, which can be circumvented, but got me to rather double-check whether this design approach has any pitfalls I have not yet considered.
Two alternatives came to my mind:
Segregating the parent classes under a separate path, e.g. using /accounts/all/ for Account. This requires knowing the inheritance depth beforehand (e.g. would need breaking changes if I were to introduce a class inheriting from ManagedAccount).
Not incorporating inheritance in the routes, e.g. using /managed_accounts/ or /accounts_managed/. This would be the easiest method implementation-wise, but would - in my opinion - fail to give an accurate representation of the data model.
From the beginning on I want to give this API consistent guarantees as defined in the backend data model, e.g. for the Transaction I can currently assure that /transactions/executed/ and /transactions/predicted/ return mutually exclusive sets that, when joined, equal that returned by /transactions/.
Right now I am able to effectively utilize this on front-end side, e.g. poll data from Account in general and get specifics for ManagedAccount when required. Is the design I am currently pursuing a 'good' REST API design or am I clinging to concepts that are not applicable in this domain? If so, which design approach would make more sense and especially, why and in what situations would it be beneficial to the current one?
REST doesn't care what spellings you use for your URI, so long as your identifiers are consistent with the production rules defined in RFC 3986.
/d501c2c2-5729-469c-8eed-3643e1b06504
That UUID, of itself, is a perfectly satisfactory identifier for REST purposes.
This is more of a question regarding design
Yes, exactly -- the machines don't care what spellings you use for your resource identifiers, which gives you some freedom to address other concerns. The usual choice here is to make them easier for humans (easier for people to implement, easier for people to document, easier for operators to recognize in a log... lots of possibilities).
"No ambiguities in our automated routing" is a perfectly reasonable design constraint.
Path segments are used for the hierarchical parts of an identifier. There's no particular reason, however, that your identifier hierarchy needs to be aligned with "inheritance" hierarchy.
One long standing design principle on the web is that Cool URI Don't Change; so before introducing inheritance cues in your identifiers, be sure to consider how stable your inheritance design is.

CQRS Commands with common operations - duplicated code

Lets assume that we have several commands sharing common logic. For example i have Document that have several states. We have mutating operation that is possible on some states, but some of the logic is different depending on its state. Making one Command using If statements for more than 3 states is confusing. Its better for each operation make separate command, but what to do with common logic ?
We have to fetch data from DB, validate, generate some side documents, write to audit table and other stuff. So it looks like it should be common place and making meaningless Helper class is the worst option. I assume that this operations can / don't require transaction.
I have read http://scrapbook.qujck.com/holistic-abstractions-take-2/ and CQRS code duplication in commands . I am looking for other options.
#Redgood, If I am not mistaken, some of the things you describe belongs to the Domain.
Make sure that your business/domain "logic" is not spilling outside the domain. I do use ICommand interfaces to mark my commands and I do have some logic in there but only for data type validations or other types of integrity checks.
Keep it at that. From a Command perspective all you care about is that the data contained in the command is Good. That's it. So, make sure that all your methods in your Command are just to enforce that integrity.

Is it RESTful do DELETE collections?

Some say it's "often not desirable" for a REST server to allow the DELETEion of the entire collection of entities.
DELETE http://www.example.com/customers
Is this a real rule for achieving RESTful nirvana?
And what about sub-collections, defined by query parameters?
DELETE http://www.example.com/customers?gender=m
The answer to this depends more on the requirements and risks of your application than on the inherent RESTfulness of either construct.
It's "not often desirable" to delete an entire collection if you imagine the collection as something with enduring importance like a customer list. It doesn't break with some essential REST wisdom.
If the collection contains information that a user should be able to delete, and potentially a lot of such information, DELETE of the entire collection can be the nicest REST-ish way to go, rather than run a lot of individual DELETEs.
Deleting based on criteria (e.g. the query parameter) is so essential to some applications that if the REST police declared it Officially UnRESTful I would continue to do it without shame.
(They actually say "not often desirable," which one might interpret slightly differently than "often not desirable.")
Yes, it's RESTful. If you have a valid use case, it's fine to do it. Your second scenario (deleting with a query) is frequently useful, and can be an easy way to reduce the number of HTTP requests the client has to make.
Edit: as #peeskillet says, do consider if you actually want to delete something, versus change some flag on the record (e.g. "active").

Meteor method vs. deny/allow rules

In Meteor, when should I prefer a method over a deny rule?
It seems to me that allow/deny rules should be favoured, as their goal is more explicit, and one knows where to look for them.
However, in the Discover Meteor book, preventing duplicate insertions (“duplicate” being defined as adding a document whose url property is already defined in some other document of the same collection) is said to have to be defined through a method (and left as an exercise to the reader, chapter 8.3).
I think I am able to implement this check in a way that I find much clearer:
Posts.deny({
update: function(userId, post, fieldNames, modifier) {
return Posts.findOne({ url: modifier.$set.url, _id: { $ne: post._id } });
}
});
(N.B. if you know the example, yes, I voluntarily left out the “only a subset of the attributes is modified” check from the question to be more specific.)
I understand that there are other update operators than $set in Mongo, but they look typed and I don't feel like leaving a security hole open.
So: are there any flaws in my deny rule? Independently, should I favour a method? What would I gain from it? What would I lose?
Normally I try to avoid subjective answers, but this is a really important debate. First I'd recommend reading Meteor Methods vs Client-Side Operations from the Discover Meteor blog. Note that at Edthena we exclusively use methods for reasons which should become evident.
Methods
pro
Methods can correctly enforce schema and validation rules of arbitrary complexity without the need of an outside library. Side note - check is an excellent tool for validating the structure of your inputs.
Each method is a single source of truth in your application. If you create a 'posts.insert' method, you can easily ensure it is the only way in your app to insert posts.
con
Methods require an imperative style, and they tend to be verbose in relation to the number of validations required for an operation.
Client-side Operations
pro
allow/deny has a simple declarative style.
con
Validating schema and permissions on an update operation is infinitely hard. If you need to enforce a schema you'll need to use an outside library like collection2. This reason alone should give you pause.
Modifications can be spread all over your application. Therefore, it may be tricky to identify why a particular database operation happened.
Summary
In my opinion, allow/deny is more aesthetically pleasing, however it's fundamental weakness is in enforcing permissions (particularly on updates). I would recommend client-side operations in cases where:
Your codebase is relatively small - so it's easy to grep for all instances where a particular modifier occurs.
You don't have many developers - so you don't need to all agree that there is one and only one way to insert into X collection.
You have simple permission rules - e.g. only the owner of a document can modify any aspect of it.
In my opinion, using client-side operations is a reasonable choice when building an MVP, but I'd switch to methods for all other situations.
update 2/22/15
Sashko Stubailo created a proposal to replace allow/deny with insert/update/remove methods.
update 6/1/16
The meteor guide takes the position that allow/deny should always be avoided.

How to start working with a large decision table

Today I've been presented with a fun challenge and I want your input on how you would deal with this situation.
So the problem is the following (I've converted it to demo data as the real problem wouldn't make much sense without knowing the company dictionary by heart).
We have a decision table that has a minimum of 16 conditions. Because it is an impossible feat to manage all of them (2^16 possibilities) we've decided to only list the exceptions. Like this:
As an example I've only added 10 conditions but in reality there are (for now) 16. The basic idea is that we have one baseline (the default) which is valid for everyone and all the exceptions to this default.
Example:
You have a foreigner who is also a pirate.
If you go through all the exceptions one by one, and condition by condition you remove the exceptions that have at least one condition that fails. In the end you'll end up with the following two exceptions that are valid for our case. The match is on the IsPirate and the IsForeigner condition. But as you can see there are 2 results here, well 3 actually if you count the default.
Our solution
Now what we came up with on how to solve this is that in the GUI where you are adding these exceptions, there should run an algorithm which checks for such cases and force you to define the exception more specifically. This is only still a theory and hasn't been tested out but we think it could work this way.
My Question
I'm looking for alternative solutions that make the rules manageable and prevent the problem I've shown in the example.
Your problem seem to be resolution of conflicting rules. When multiple rules match your input, (your foreigner and pirate) and they end up recommending different things (your cangetjob and cangetevicted), you need a strategy for resolution of this conflict.
What you mentioned is one way of resolution -- which is to remove the conflict in the first place. However, this may not always be possible, and not always desirable because when a user adds a new rule that conflicts with a set of old rules (which he/she did not write), the user may not know how to revise it to remove the conflict.
Another possible resolution method is prioritization. Mark a priority on each rule (based on things like the user's own authority etc.), sort the matching rules according to priority, and apply in ascending sequence of priority. This usually works and is much simpler to manage (e.g. everybody knows that the top boss's rules are final!)
Prioritization may also be used to mark a certain rule as "global override". In your example, you may want to make "IsPirate" as an override rule -- which means that it overrides settings for normal people. In other words, once you're a pirate, you're treated differently. This make it very easy to design a system in which you have a bunch of normal business rules governing 90% of the cases, then a set of "exceptions" that are treated differently, automatically overriding certain things. In this case, you should also consider making "?" available in the output columns as well.
One other possible resolution method is to include attributes in each of your conditions. For example, certain conditions must have no "zeros" in order to pass (? doesn't matter). Some conditions must have at least one "one" in order to pass. In other words, mark each condition as either "AND", "OR", or "XOR". Some popular file-system security uses this model. For example, CanGetJob may be AND (you want to be stringent on rights-to-work). CanBeEvicted may be OR -- you may want to evict even a foreigner if he is also a pirate.
An enhancement on the AND/OR method is to provide a threshold that the total result must exceed before passing that condition. For example, putting CanGetJob at a threshold of 2 then it must get at least two 1's in order to return 1. This is sometimes useful on conditions that are not clearly black-and-white.
You can mix resolution methods: e.g. first prioritize, then use AND/OR to resolve rules with similar priorities.
The possibilities are limitless and really depends on what your actual needs are.
To me this problem reminds business rules engine where there is no known algorithm to define outputs from inputs (e.g. using boolean logic) but the user (typically some sort of administrator) has to define all or some the logic itself.
This might sound a bit of an overkill but OTOH this provides virtually limit-less extension capabilities: you don't have to code any new business logic, just define a new rule set.
As I understand your problem, you are looking for a nice way to visualise the editing for these rules. But this all depends on your programming language and the tool you select for this. Java, for example, has JBoss Drools. Quoting their page:
Drools Guvnor provides a (logically
centralized) repository to store you
business knowledge, and a web-based
environment that allows business users
to view and (within certain
constraints) possibly update the
business logic directly.
You could possibly use this generic tool or write your own.
Everything depends on what your actual rules will look like. Rules like 'IF has an even number of these properties THEN' would be painful to represent in this format, whereas rules like 'IF pirate and not geek THEN' are easy.
You can 'avoid the ambiguity' by stating that you'll always be taking the first actual match, in other words your rules have a priority. You'd then want to flag rules which have no effect because they are 'shadowed' by rules higher up. They're not hard to find, so it's something your program should do.
Your interface could also indicate groups of rules where rules within the group can be in any order without changing the outcomes. This will add clarity to what the rules are really saying.
If some of your outputs are relatively independent of the others, you will also get a more compact and much clearer table by allowing question marks in the output. In that design the scan for first matching rule is done once for each output. Consider for example if 'HasChildren' is the only factor relevant to 'Can Be Evicted'. With question marks in the outputs (= no effect) you could be halving the number of exception rules.
My background for this is circuit logic design, not business logic. What you're designing is similar to, but not the same as, a PLA. As long as your actual rules are close to sum of products then it can work well. If your rules aren't, for example the 'even number of these properties' rule, then the grid like presentation will break down in a combinatorial explosion of cases. Your best hope if your rules are arbitrary is to get a clearer more compact presentation with either equations or with diagrams like a circuit diagram. To be avoided, if you can.
If you are looking for a Decision Engine with a GUI, than you can try this one: http://gandalf.nebo15.com/
We just released it, it's open source and production ready.
You probably need some kind of inference engine. Think about doing it in prolog.