State pattern for form validation - forms

I need to capture user input using a form. Each field within the form will undergo validation. The field will be either valid or invalid. Depending on the user input, certain parts of the form may be enabled, disabled, filtered or otherwise modified.
I am considering the state pattern to model the state transitions through the form. Each state will affect how the form is displayed, filtered etc. However, my understanding of the state pattern is that it would require a very large number of states to represent my form.
For example, if I have 10 fields that can each be valid or invalid, that is:
2^10 = 1024 combinations.
That is an enormous number of states to represent in code, and I have grossly simplified the problem.
Questions:
Am I misunderstanding how to implement the state pattern for my problem?
If not, is the state pattern the wrong solution to my problem?
If yes to the last question, what is a good general solution?

Am I misunderstanding how to implement the state pattern for my problem?
I think you've understood it correctly.
If not, is the state pattern the wrong solution to my problem?
Yes. The State pattern is a good solution when there are a limited number of states (conditions). This is not true in your case.
If yes to the last question, what is a good general solution?
I would recommend using the Specification pattern. You can have any number of rules attached to your input fields. The rules can determine if the field should be enabled or disabled, visible or hidden. Also worth noting is that the rules can be easily unit tested separately.
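As a rough illustration of the idea, here is a minimal sketch of composable specifications in Python. The field names (`name`, `email`, `country`, `state`, `submit`) and the helpers `filled` and `equals` are hypothetical, chosen only for the example; they do not come from the question.

```python
from dataclasses import dataclass
from typing import Callable, Dict

Form = Dict[str, str]  # field name -> raw user input


@dataclass(frozen=True)
class Spec:
    """A composable predicate over the whole form."""
    test: Callable[[Form], bool]

    def is_satisfied_by(self, form: Form) -> bool:
        return self.test(form)

    def __and__(self, other: "Spec") -> "Spec":
        return Spec(lambda f: self.test(f) and other.test(f))

    def __or__(self, other: "Spec") -> "Spec":
        return Spec(lambda f: self.test(f) or other.test(f))

    def __invert__(self) -> "Spec":
        return Spec(lambda f: not self.test(f))


def filled(name: str) -> Spec:
    """Hypothetical rule: the field contains non-blank text."""
    return Spec(lambda f: bool(f.get(name, "").strip()))


def equals(name: str, value: str) -> Spec:
    """Hypothetical rule: the field has exactly this value."""
    return Spec(lambda f: f.get(name) == value)


# One "enabled" rule per dependent field (illustrative only).
enabled_rules = {
    "state": equals("country", "US"),
    "submit": filled("name") & filled("email"),
}


def enabled_fields(form: Form) -> Dict[str, bool]:
    """Evaluate every rule against the current form contents."""
    return {field: spec.is_satisfied_by(form) for field, spec in enabled_rules.items()}
```

Each rule is a small object that can be unit tested on its own, and the number of rules grows with the number of fields rather than with the number of field combinations.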

Can I avoid a relation loop in my database design?

I am trying to design database tables for the case shown below. I also have an account defined, but it's not important to my problem.
There is a list of operations (expenses). Each operation can take place at a specified POI, and places can optionally be grouped in chains. Each operation can have a recipient, specifically a shop chain.
My current design looks like below. I could even remove chain table in favor of direct reference to recipient, but it still leaves a loop between tables. Effectively, single row could contain references to place and receiving account having different recipient defined.
The only solution I can see is a table check to exclude described case, but I'm wondering: is there a better fix?
As far as I can tell there isn't anything fundamentally wrong with your design. There's no need to change it just because it contains a loop. The loop in this case doesn't even appear to be a circular dependency. If you believe your current design accurately models what it is intended to then I see no need to change it.

Are there exceptions to the rule that requirements should be atomic?

I'm reviewing a requirements spec where some of the requirements include the word "and" or sometimes even a list of required functionality.
I am mostly thinking these should be broken up, but this has the downside of making a long document even longer and less readable, which in practice may mean its intended audience ends up skimming it or reading only sections rather than absorbing the whole thing.
However, there are some requirements where it seems a bit silly to break them up. E.g: there are a lot of get/set operations, which always go together - it seems a bit overkill to always break them up into "The user shall be able to get...", "The user shall be able to set..." Other examples are enable/disable, validation lists, supported platforms/browsers etc.
Just wondering if anyone has had similar thoughts and whether it might sometimes be OK to break the rule of atomicity?
My opinion is that you do not have to break up the requirements, as long as you uniquely identify them. E.g. "[REQ1] The user should be able to [a] set ... and [b] get ..." In this way you keep the document readable and also keep the possibility of separately tracing the atomic parts.

How to structure a RESTful URI with multiple inter-related parameters

I'm building a RESTful API in which the user can issue a query about a given object, with a weight attached to that object. E.g.:
http://host.domain.com/cars?id=100&weight=50
(This is a contrived, simplified example, so apologies if this doesn't make much semantic sense!)
The complication is that the user might need to combine multiple objects in a single query. What I'm wondering is if there is a standard RESTful way to do this? For example, options that occur to me include:
http://host.domain.com/cars?id1=100&weight1=50&id2=200&weight2=90
http://host.domain.com/cars?ids=100,200&weights=50,90
I don't like the second one, because, for example, weights are optional, so you'd need to allow something like this:
http://host.domain.com/cars?ids=100,200&weights=,90
The first one seems preferable to me, but it seems like it could become complicated, particularly as I already have indexed arguments (e.g. x1, x2) meaning I'll need to have two levels of indexes (x1_1, x1_2, ...)
Anyone know of a standard approach to this kind of thing? Or can anyone think of a pragmatic, sensible solution?
I am not sure your question is covered by Cool URIs - http://www.w3.org/TR/cooluris/
My personal choice, with no citations to support it, would be to firstly get rid of the query string using the server configuration (redirects or aliases), so that the base resource would appear as:
http://host.domain.com/cars
The list of IDs and weights could then be appended (in the URI's 'path info'), delimited as you see fit -- semi-colons, or slashes. My choice would be the latter, simply as it makes the URI cleaner to read and easier to type. The only time that becomes a problem is if weights are sometimes omitted, though that could be overcome if the IDs were alphanumeric (perhaps hashes), and the weights always numeric.
I still don't know if this is right or not, and LeeGee's suggestion seems reasonable, but I've ended up going with something like this:
http://host.domain.com/cars?id_1=100&weight_1=50&id_2=200&weight_2=90
It ends up creating ugly looking URIs, but it seems to me that they're consistent, and unambiguous, particularly when optional arguments are omitted.
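For what it's worth, the indexed form is straightforward to parse on the server side. The sketch below uses Python's standard `urllib.parse`; the function name `parse_weighted_ids` and the choice to treat a missing or empty weight as `None` are assumptions made for the example.

```python
import re
from urllib.parse import parse_qsl, urlsplit


def parse_weighted_ids(url):
    """Group id_N / weight_N query parameters into (id, weight) pairs."""
    params = dict(parse_qsl(urlsplit(url).query, keep_blank_values=True))
    items = {}
    for key, value in params.items():
        m = re.fullmatch(r"(id|weight)_(\d+)", key)
        if m:
            # Collect id and weight under their shared numeric index.
            items.setdefault(int(m.group(2)), {})[m.group(1)] = value

    result = []
    for index in sorted(items):
        entry = items[index]
        if "id" not in entry:
            raise ValueError(f"weight_{index} has no matching id_{index}")
        weight = entry.get("weight")
        # Weights are optional: a missing or empty weight becomes None.
        result.append((int(entry["id"]), int(weight) if weight else None))
    return result
```

Because each weight is tied to its id by index, omitting `weight_2` is unambiguous, which is the property the comma-separated form lacks.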

How to start working with a large decision table

Today I've been presented with a fun challenge and I want your input on how you would deal with this situation.
So the problem is the following (I've converted it to demo data as the real problem wouldn't make much sense without knowing the company dictionary by heart).
We have a decision table that has a minimum of 16 conditions. Because it is an impossible feat to manage all of them (2^16 possibilities) we've decided to only list the exceptions. Like this:
As an example I've only added 10 conditions but in reality there are (for now) 16. The basic idea is that we have one baseline (the default) which is valid for everyone and all the exceptions to this default.
Example:
You have a foreigner who is also a pirate.
If you go through all the exceptions one by one, and condition by condition you remove the exceptions that have at least one condition that fails. In the end you'll end up with the following two exceptions that are valid for our case. The match is on the IsPirate and the IsForeigner condition. But as you can see there are 2 results here, well 3 actually if you count the default.
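The elimination step described above can be sketched roughly as follows. The rule data is made up for illustration (the real table is not reproduced here), and a "?" cell is modelled simply by leaving the condition out of the rule's `when` dictionary.

```python
def matches(conditions, subject):
    """A rule matches when every condition it lists agrees with the subject.

    Conditions the rule does not list are "don't care" ("?").
    Facts missing from the subject are treated as False.
    """
    return all(subject.get(name, False) == wanted for name, wanted in conditions.items())


def matching_exceptions(exceptions, subject):
    """Drop every exception that has at least one failing condition."""
    return [rule for rule in exceptions if matches(rule["when"], subject)]


# Hypothetical demo data, not the real table.
exceptions = [
    {"when": {"IsPirate": True}, "then": {"CanGetJob": False}},
    {"when": {"IsForeigner": True}, "then": {"CanBeEvicted": True}},
    {"when": {"IsPirate": True, "HasChildren": True}, "then": {"CanBeEvicted": False}},
]

subject = {"IsPirate": True, "IsForeigner": True}
remaining = matching_exceptions(exceptions, subject)
```

With this data, two exceptions remain for the foreign pirate, which is the ambiguity the example is meant to show.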
Our solution
What we came up with is that the GUI where you add these exceptions should run an algorithm that checks for such cases and forces you to define the exception more specifically. This is still only a theory and hasn't been tested, but we think it could work this way.
My Question
I'm looking for alternative solutions that make the rules manageable and prevent the problem I've shown in the example.
Your problem seems to be the resolution of conflicting rules. When multiple rules match your input (your foreigner and pirate) and they end up recommending different things (your cangetjob and cangetevicted), you need a strategy for resolving the conflict.
What you mentioned is one way of resolution -- which is to remove the conflict in the first place. However, this may not always be possible, and not always desirable because when a user adds a new rule that conflicts with a set of old rules (which he/she did not write), the user may not know how to revise it to remove the conflict.
Another possible resolution method is prioritization. Mark a priority on each rule (based on things like the user's own authority etc.), sort the matching rules according to priority, and apply in ascending sequence of priority. This usually works and is much simpler to manage (e.g. everybody knows that the top boss's rules are final!)
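A minimal sketch of priority-based resolution, assuming each matching rule carries a numeric `priority` field (a hypothetical name) and that a higher number means more authority:

```python
def resolve_by_priority(default, matching_rules):
    """Apply matching rules in ascending priority so the highest one wins."""
    outcome = dict(default)
    for rule in sorted(matching_rules, key=lambda r: r["priority"]):
        # Later (higher-priority) rules overwrite earlier settings.
        outcome.update(rule["then"])
    return outcome


# Hypothetical data for illustration.
default = {"CanGetJob": True, "CanBeEvicted": False}
matching_rules = [
    {"priority": 5, "then": {"CanBeEvicted": True}},
    {"priority": 1, "then": {"CanGetJob": False, "CanBeEvicted": False}},
]
outcome = resolve_by_priority(default, matching_rules)
```

Rules with equal priority keep their original relative order here because `sorted` is stable, so ties still need a documented policy.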
Prioritization may also be used to mark a certain rule as a "global override". In your example, you may want to make "IsPirate" an override rule, which means that it overrides settings for normal people. In other words, once you're a pirate, you're treated differently. This makes it very easy to design a system in which you have a bunch of normal business rules governing 90% of the cases, then a set of "exceptions" that are treated differently, automatically overriding certain things. In this case, you should also consider making "?" available in the output columns as well.
One other possible resolution method is to include attributes in each of your conditions. For example, certain conditions must have no "zeros" in order to pass (? doesn't matter). Some conditions must have at least one "one" in order to pass. In other words, mark each condition as either "AND", "OR", or "XOR". Some popular file-system security uses this model. For example, CanGetJob may be AND (you want to be stringent on rights-to-work). CanBeEvicted may be OR -- you may want to evict even a foreigner if he is also a pirate.
An enhancement on the AND/OR method is to provide a threshold that the total result must exceed before passing that condition. For example, putting CanGetJob at a threshold of 2 then it must get at least two 1's in order to return 1. This is sometimes useful on conditions that are not clearly black-and-white.
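The AND/OR and threshold ideas can be sketched as a single combining function. Here each matching rule casts a vote of 1, 0, or `None` (standing in for "?") on one output column; the function name `combine` and its parameters are assumptions for the example, and XOR is left out because its meaning for more than two votes would need to be defined first.

```python
def combine(votes, mode="AND", threshold=None):
    """Combine the votes that matching rules cast on one output column.

    votes: iterable of 1, 0, or None ("?", which is ignored).
    mode: "AND" passes when there are no zeros, "OR" when there is at least one 1.
    threshold: if given, pass when the number of 1 votes reaches it.
    """
    cast = [v for v in votes if v is not None]
    ones = sum(cast)
    if threshold is not None:
        return ones >= threshold
    if mode == "AND":
        return all(cast)
    if mode == "OR":
        return any(cast)
    raise ValueError(f"unsupported mode: {mode}")
```

For example, `combine([1, 0, None], "AND")` fails while the same votes pass under `"OR"`, and `combine([1, 0, 1], threshold=2)` passes because two rules voted 1.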
You can mix resolution methods: e.g. first prioritize, then use AND/OR to resolve rules with similar priorities.
The possibilities are limitless and really depends on what your actual needs are.
To me this problem is reminiscent of a business rules engine, where there is no known algorithm to derive outputs from inputs (e.g. using boolean logic), so the user (typically some sort of administrator) has to define all or some of the logic.
This might sound like a bit of overkill, but on the other hand it provides virtually limitless extension capabilities: you don't have to code any new business logic, just define a new rule set.
As I understand your problem, you are looking for a nice way to visualise the editing for these rules. But this all depends on your programming language and the tool you select for this. Java, for example, has JBoss Drools. Quoting their page:
Drools Guvnor provides a (logically centralized) repository to store you business knowledge, and a web-based environment that allows business users to view and (within certain constraints) possibly update the business logic directly.
You could possibly use this generic tool or write your own.
Everything depends on what your actual rules will look like. Rules like 'IF has an even number of these properties THEN' would be painful to represent in this format, whereas rules like 'IF pirate and not geek THEN' are easy.
You can 'avoid the ambiguity' by stating that you'll always be taking the first actual match, in other words your rules have a priority. You'd then want to flag rules which have no effect because they are 'shadowed' by rules higher up. They're not hard to find, so it's something your program should do.
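A rough sketch of first-match evaluation plus a check for shadowed rules. It assumes rules are kept in priority order and that "don't care" conditions are simply omitted from a rule's `when` dictionary. The check only detects a rule shadowed by a single earlier rule, not by a combination of earlier rules.

```python
def first_match(rules, subject):
    """Return the first rule whose listed conditions all hold, or None."""
    for rule in rules:
        if all(subject.get(k, False) == v for k, v in rule["when"].items()):
            return rule
    return None


def shadowed(rules):
    """Return (earlier_index, later_index) pairs where the later rule can never fire.

    An earlier rule shadows a later one when every condition the earlier
    rule requires is also required, with the same value, by the later rule.
    """
    hits = []
    for j, later in enumerate(rules):
        for i in range(j):
            earlier = rules[i]["when"]
            if all(later["when"].get(k) == v for k, v in earlier.items()):
                hits.append((i, j))
                break
    return hits


# Hypothetical rules, highest priority first.
rules = [
    {"when": {"IsPirate": True}},
    {"when": {"IsPirate": True, "IsGeek": False}},  # never reached: rule 0 catches it
    {"when": {"IsGeek": True}},
]
```

Running `shadowed(rules)` on this data flags the second rule, which is the kind of warning the editing interface could show.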
Your interface could also indicate groups of rules where rules within the group can be in any order without changing the outcomes. This will add clarity to what the rules are really saying.
If some of your outputs are relatively independent of the others, you will also get a more compact and much clearer table by allowing question marks in the output. In that design the scan for first matching rule is done once for each output. Consider for example if 'HasChildren' is the only factor relevant to 'Can Be Evicted'. With question marks in the outputs (= no effect) you could be halving the number of exception rules.
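Scanning once per output could look something like the sketch below. A "?" in an output column is modelled by leaving that output out of the rule's `then` dictionary; the rule data is hypothetical.

```python
def decide(rules, default, subject):
    """For each output, take the first matching rule that actually sets it."""
    result = {}
    for output, fallback in default.items():
        result[output] = fallback
        for rule in rules:
            value = rule["then"].get(output)  # None means "?" (no effect)
            if value is None:
                continue
            if all(subject.get(k, False) == v for k, v in rule["when"].items()):
                result[output] = value
                break
    return result


# Hypothetical rules: only HasChildren matters for eviction.
rules = [
    {"when": {"HasChildren": True}, "then": {"CanBeEvicted": False}},
    {"when": {"IsPirate": True}, "then": {"CanGetJob": False, "CanBeEvicted": True}},
]
default = {"CanGetJob": True, "CanBeEvicted": True}
decision = decide(rules, default, {"IsPirate": True, "HasChildren": True})
```

Here the pirate with children loses the job through the second rule but is protected from eviction by the first, without needing a combined pirate-with-children exception.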
My background for this is circuit logic design, not business logic. What you're designing is similar to, but not the same as, a PLA. As long as your actual rules are close to sum of products then it can work well. If your rules aren't, for example the 'even number of these properties' rule, then the grid like presentation will break down in a combinatorial explosion of cases. Your best hope if your rules are arbitrary is to get a clearer more compact presentation with either equations or with diagrams like a circuit diagram. To be avoided, if you can.
If you are looking for a Decision Engine with a GUI, then you can try this one: http://gandalf.nebo15.com/
We just released it, it's open source and production ready.
You probably need some kind of inference engine. Think about doing it in Prolog.

A simple/practical example of fuzzy c-means algorithm

I am writing my master thesis on the subject of dynamic keystroke authentication. To support ongoing research, I am writing code to test out different methods of feature extraction and feature matching.
My current simple approach just checks if the reference password keycodes matches the currently typed in keycodes and also checks if the keypress times (dwell) and the key-to-key times (flight) are the same as reference times +/- 100ms (tolerance). This is of course very limited and I want to extend it with some sort of fuzzy c-means pattern matching.
For each key the features look like: keycode, dwelltime, flighttime (first flighttime is always 0).
Obviously the keycodes can be taken out of the fuzzy algorithm because they have to be exactly the same.
In this context, what would a practical implementation of fuzzy c-means look like?
Generally, you would do the following:
1. Determine how many clusters you want (2? "Authentic" and "Fake"?)
2. Determine what elements you want to cluster (individual keystrokes? login attempts?)
3. Determine what your feature vectors will look like (dwell time, flight time?)
4. Determine what distance metric you will be using (how will you measure the distance of each sample from each cluster?)
5. Create exemplar training data for each cluster type (what does an authentic login look like?)
6. Run the FCM algorithm on the training data to generate the clusters
7. To create the membership vector for any given login attempt sample, run it through the FCM algorithm using the clusters you found in step 6
8. Use the resulting membership vector to determine (based on some threshold criteria) whether the login attempt is authentic
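To make steps 6 and 7 concrete, here is a small pure-Python sketch of the standard fuzzy c-means updates with Euclidean distance. The sample points (dwell time, flight time in ms), the two starting centers, and the fuzzifier `m=2.0` are all assumptions chosen for illustration, not values from the question.

```python
import math


def memberships(point, centers, m=2.0):
    """Degree of membership of one point in each cluster (sums to 1)."""
    dists = [math.dist(point, c) for c in centers]
    if 0.0 in dists:
        # The point sits exactly on a center: give it full membership there.
        hits = [d == 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


def fuzzy_c_means(points, centers, m=2.0, iterations=100, tol=1e-6):
    """Alternate membership and center updates until the centers stop moving."""
    centers = [list(c) for c in centers]
    dims = len(points[0])
    for _ in range(iterations):
        u = [memberships(p, centers, m) for p in points]
        new_centers = []
        for i in range(len(centers)):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            new_centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dims)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers


# Hypothetical training data: (dwell_ms, flight_ms) per sample.
points = [(95, 150), (105, 155), (100, 145), (210, 300), (190, 310), (200, 290)]
centers = fuzzy_c_means(points, centers=[points[0], points[3]])

# Step 7: membership vector for a new sample against the learned centers.
u_new = memberships((102, 152), centers)
```

A threshold on `u_new` (step 8) would then decide acceptance; choosing that threshold, and deciding what the second cluster should even represent, is left open here.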
I'm not an expert, but this seems like an odd approach to determining whether a login attempt is authentic or not. I've seen FCM used for pattern recognition (e.g. which facial expression am I making?), which makes sense because you're dealing with several categories (e.g. happy, sad, angry, etc.) with defining characteristics. In your case, you really only have one category (authentic) with defining characteristics. Non-authentic keystrokes are simply "not like" authentic keystrokes, so they won't cluster.
Perhaps I am missing something?
I don't think you really want to do clustering here. You might want to do some proper fuzzy matching though instead of just allowing some delta on each value.
For clustering, you need to have many data points. Additionally, you'd need to know the proper number of means you need.
But what are these multiple objects meant to be? You have one data point for every keycode. You don't want to have the user type the password 100 times to see if he can do it consistently. And even then, what do you expect the clusters to be? You already know which keycode comes at which position; you don't want to find out which keycodes the user uses for his password...
Sorry, I really don't see any clustering here. The term "fuzzy" seems to have misled you to this clustering algorithm. Try "fuzzy logic" instead.
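As one possible reading of "fuzzy matching rather than a fixed delta", the sketch below grades each timing with a triangular membership function and averages the grades. The function names, the 100 ms spread, and the use of a plain average are assumptions for illustration; a real system would need to tune or replace all three.

```python
def closeness(observed_ms, reference_ms, spread_ms=100.0):
    """Triangular membership: 1.0 at a perfect match, 0.0 at spread_ms or more away."""
    return max(0.0, 1.0 - abs(observed_ms - reference_ms) / spread_ms)


def match_score(observed, reference, spread_ms=100.0):
    """Average closeness over all dwell/flight timings, in the range 0.0 to 1.0."""
    if not reference or len(observed) != len(reference):
        return 0.0
    scores = [closeness(o, r, spread_ms) for o, r in zip(observed, reference)]
    return sum(scores) / len(scores)
```

Unlike a hard +/- 100 ms window, a timing that is 50 ms off contributes 0.5 instead of a full pass, so `match_score([100, 150], [100, 100])` gives 0.75 and the acceptance threshold can be set on the overall score.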