What is the benefit of Connectedness as defined by Resource Oriented Architecture (ROA)? The way I understand it, the crux of Connectedness is the ability to crawl the entire application state using only the root URIs.
But how useful is that really?
For example, imagine that HTTP GET http://example.com/users/joe returns a link to http://examples.com/uses/joe/bookmarks.
Unless you're writing a dumb web crawler (and even then I wonder), you still need to teach the client what each link means at compile-time. That is, the client needs to know that the "bookmarks URI" returns a URI to Bookmark resources, and then pass control over to special Bookmark-handling algorithms. You can't just pass links blindly to some general client method. Since you need this logic anyway:
What's the difference between the client figuring out the URI at runtime versus providing it at compile-time (making http://example.com/users/bookmarks a root URI)?
Why is linking using http://example.com/users/joe/bookmarks/2 preferred to id="2"?
The only benefit I can think of is the ability to change the path of non-root URIs over time, but this breaks cached links so it's not really desirable anyway. What am I missing?
You are right that changing Uris is not desirable but it does happen and using complete Uris instead of constructing them makes change easier to deal with.
One other benefit is that your client application can easily retrieve resources from multiple hosts. If you allowed your client to build the URI's the client would need to know on which host certain resources reside. This is not a big deal when all of the resources live on a single host but it becomes more tricky when you are aggregating data from multiple hosts.
My last thought is that maybe you are oversimplifying the notion of connectedness by looking at it as a static network of links. Sure the client needs to know about the possible existence of certain links within a resource but it does not necessarily need to know exactly what are the consequences of following that link.
Let me try an give an example: A user is placing an order for some items and they are ready to submit their cart. The submit link may actually go to two different places depending on whether the order will be delivered locally or internationally. Maybe orders over a certain value need to go through an extra step. The client just knows that it has to follow the submit link, but it does not have compiled in knowledge of where to go next. Sure you could build a common "next step" type of resource so the client could have this knowledge explicitly but by having the server deliver the link dynamically you introduce a lot less client-server coupling.
I think of the links in resources as placeholders for what the user could choose to do. Who will do the work and how it will be done is determined by what uri the server attaches to that link.
Its easier to extend, and you could write small apps and scripts to work along with the core application fairly easily.
Added: Well the whole point starts with the idea that you don't specify at compile-time how to convert URIs to uids in a hardcoded fashion, instead you might use dictionaries or parsing to do that, giving you a much more flexible system.
Then later on say someone else decides to change the URI syntax, that person could write a small script that translates URIs without ever touching your core Application. Another benefit is if your URIs are logical other users, even within a corporate scenario, can easily write Mash-ups to make use of your system, without touching your original App or even recompiling it.
Of course the counter side to the whole argument is that it'll take you longer to implement a URI based system over a simple UID system. But if your App will be used by others in a regular fashion, that initial time investment will greatly payback (it could be said to have a good extensibility based ROI).
Added: And another point which is a mater of tastes to some degree is the URI itself will be a better Name, because it conveys a logical and defined meaning
I'll add my own answer:
It is far easier to follow server-provided URIs than construct them yourself. This is especially true as resource relationships become too complex to be expressed in simple rules. It's easier to code the logic once in the server than re-implement it in numerous clients.
The relationship between resources may change even if individual resource URIs remain unchanged. For example, imagine Google Maps indexes their map tiles from 0 to 100, counting from the top-left to the bottom-right of the screen. If Google Maps were to change the scale of their tiles, clients that calculate relative tile indexes would break.
Custom IDs identify a resource. URIs go a step further by identifying how to retrieve the resource representation. This simplifies the logic of read-only clients such as web-crawlers or clients that download opaque resources such as video or audio files.
Related
In my years specifying and designing REST APIs, I'm increasingly finding that its very similar to designing a website where the user's journey and the actions and links are story-boarded and critical to the UX.
With my API designs currently, I return links in items and at the bottom of resources. They perform actions, mutate state or bring back other resources.
But its as if each link opens in a new tab; the client explores down a new route and their next options may narrow as they go.
If this were a website, it wouldn't necessarily be a good design. The user would have to either open links in new tabs or back-up the stack all the time to get things done.
Good sites are forward only, or indeed have a way to indicate a branch off the main flow, i.e. links automatically opening in new windows (via anchor tag target).
So should a good REST API be designed as if the client discards the current resource and advances to the next and is always advancing forward?
Or do we assume the client is building a map as it goes, like um a Roomba exploring our living room?
The thing with the map concept is that the knowledge that one should return to a previous resource, of the many it might know about, is in a sentient human, a guess. Computers are incapable of guessing and so its need programming, and this implies out-of-band static documentation and breaks REST.
In my years specifying and designing REST APIs, I'm increasingly finding that its very similar to designing a website
Yes - a good REST API looks a lot like a machine readable web site.
So should a good REST API be designed as if the client discards the current resource and advances to the next and is always advancing forward?
Sort of - the client is permitted to cache representations; so if you present a link, the client may "follow" the link to the cached representation rather than using the server.
That also means that the client may, at its discretion, "hit the back button" to go off and do something else (for example, if the link that it was hoping to find isn't present, it might try to achieve its goal another way). This is part of the motivation for the "stateless" constraint; the server doesn't have to pretend to know the client's currently displayed page to interpret a message.
Computers are incapable of guessing and so its need programming, and this implies out-of-band static documentation and breaks REST.
Fielding, writing in 2008
Of course the client has prior knowledge. Every protocol, every media type definition, every URI scheme, and every link relationship type constitutes prior knowledge that the client must know (or learn) in order to make use of that knowledge. REST doesn’t eliminate the need for a clue. What REST does is concentrate that need for prior knowledge into readily standardizable forms. That is the essential distinction between data-oriented and control-oriented integration.
I found this nugget in Fielding's original work.
https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
The model application is therefore an engine that moves from one state to the next by examining and choosing from among the alternative state transitions in the current set of representations. Not surprisingly, this exactly matches the user interface of a hypermedia browser. However, the style does not assume that all applications are browsers. In fact, the application details are hidden from the server by the generic connector interface, and thus a user agent could equally be an automated robot performing information retrieval for an indexing service, a personal agent looking for data that matches certain criteria, or a maintenance spider busy patrolling the information for broken references or modified content [39].
It reads like a great REST application would be built to be forward only, like a great website should be simple to use even without a back button, including advancing to a previously-seen representation (home and search links always available).
Interestingly we tend to really think about user journeys in web design, and the term journey is a common part of our developer language, but in API design this hasn't yet permeated.
I have read a lot of discussions here on SO, watched Jon Moore's presentation (which explained a lot, btw) and read over Roy Fielding's blog post on HATEOAS but I still feel a little in the dark when it comes to client design.
API Question
For now, I'm simply returning xhtml with forms/anchors and definition lists to represent the resources. The following snippet details how I lay out forms/anchors/lists.
# anchors
<li class='docs_url/#resourcename'>
<a rel='self' href='resource location'></a>
</li>
# forms
<form action='action_url' method='whatever_method' class='???'></form>
# lists
<dl class='docs_url/#resourcename'>
<dt>property</dt>
<dd>value</dd>
</dl>
My question is mainly for forms. In Jon's talk he documents form types such as (add_location_form) etc. and the required inputs for them. I don't have a lot of resources but I was thinking of abstract form types (add , delete, update, etc) and just note in the documentation that for (add, update) that you must send a valid representation of the target resource and with delete that you must send the identifier.
Question 1: With the notion of HATEOAS, shouldn't we really just make the client "discover" the form (by classing them add,delete,update etc) and just send back all the data we gave them? My real question here (not meant to be a discussion) is does this follow good practice?
Client Question
Following HATEOAS, with our actions on resources being discover-able, how does this effect client code (consumers of the api) and their ui. It sounds great that following these principals that the UI should only display actions that are available but how is that implemented?
My current approach is parsing the response as xml and usin xpath to look for the actions which are known at the time of client development (documented form classes ie. add,delete,update) and display the ui controls if they are available.
Question 2: Am I wrong in my way of discovery? Or is this too much magic as far as the client is concerned ( knowing the form classes )? Wouldn't this assume that the client knows which actions are available for each resource ( which may be fine because it is sort of a reason for creating the client, right? ) and should the mapping of actions (form classes) to resources be documented, or just document the form classes and allow the client (and client developer) to research and discover them?
I know I'm everywhere with this, but any insight is much appreciated. I'll mark answered a response that answers any of these two questions well. Thanks!
No, you're pretty much spot on.
Browsers simply render the HTML payload and rely on the Human to actually interpret, find meaning, and potentially populate the forms appropriately.
Machine clients, so far, tend to do quite badly at the "interpret" part. So, instead developers have to make the decisions in advance and guide the machine client in excruciating detail.
Ideally, a "smart" HATEOS client would have certain facts, and be aware of context so that it could better map those facts to the requirements of the service.
Because that's what we do, right? We see a form "Oh, they want name, address, credit card number". We know not only what "name", "address", and "credit card" number mean, we also can intuit that they mean MY name, or the name of the person on the credit card, or the name of the person being shipped to.
Machines fail pretty handily at the "intuit" part as well. So as a developer, you get to code in the logic of what you think may be necessary to determine the correct facts and how they are placed.
But, back to the ideal client, it would see each form, "know" what the fields wanted, consult its internal list of "facts", and then properly populate the payload for the request and finally make the request.
You can see that a trivial, and obviously brittle, way to do that is to simply map the parameter names to the internal data. When the parameter name is "name", you may hard code that to something like: firstName + " " + lastName. Or you may consider the actual rel to "know" they're talking about shipping, and use: shipTo.firstName + " " + shipTo.lastName.
Over time, ideally you could build up a collection of mappings and such so that if suddenly a payload introduced a new field, and it happened to be a field you already know about, you could fill that in as well "automatically" without change to the client.
But the simply truth is, that while this can be done, it's pretty much not done. The semantics are usually way to vague, you'd have to code in new "intuition" each time for each new payload anyway, so you may as well code to the payload directly and be done with it.
The key thing, though, especially about HATEOS, is that you don't "force" your data on to the server. The server tells you what it wants, especially if they're giving you forms.
So the thought process is not "Oh, if I want a shipping invoice, I see that, right now, they want name, address and order number, and they want it url encoded, and they want it sent to http://example.com/shipping_invoice. so I'll just always send: name + "&" + address + "&" + orderNumber every time to http://example.com/shipping_invoice. Easy!".
Rather what you want to do is "I see they want a name, address, and order number. So what I'll do is for each request, I will read their form. I will check what fields they want each time. If they want name, I will give them name. If they want address, I will give them address. If they want order number, I will give them order number. And if they have any PRE-POPULATED fields (or even "hidden" fields), I will send those back too, and I will send it in the encoding they asked for, assuming I support it, to the URL I got from the action field of the FORM tag.".
You can see in the former case, you're ASSUMING that they want that payload every time. Just like if you were hard coding URLs. Whereas with the second, maybe they decided that the name and address are redundant, so they don't ask for it any more. Maybe they added some nice defaults for new functionality that you may not support yet. Maybe they changed the encoding to multi-part? Or changed the endpoint URL. Who knows.
You can only send what you know when you code the client, right? If they change things, then you can only do what you can do. If they add fields, hopefully they add fields that are not required. But if they break the interface, hey, they break the interface and you get to log an error. Not much you can do there.
But the more that you leverage HATEOS part, the more of it they make available to you so you can be more flexible: forms to fill out, following redirects properly, paying attention to encoding and media types, the more flexible your client becomes.
In the end, most folks simply don't do it in their clients. They hard code the heck out of them because it's simple, and they assume that the back end is not changing rapidly enough to matter, or that any downtime if such change does happen is acceptable until they correct the client. More typically, especially with internal systems, you'll simply get an email from the developers "hey were changing XYZ API, and it's going live on March 1st. Please update your clients and coordinate with the release team during integration testing. kthx".
That's just the reality. That doesn't mean you shouldn't do it, or that you shouldn't make your servers more friendly to smarter clients. Remember a bad client that assumes everything does not invalidate a good REST based system. These systems work just fine with awful clients. wget ftw, eh?
I've searched the web for this bit to no avail - I Hope some one can point me in the right direction. I'm happy to look things up, but its knowing where to start.
I am creating an iPhone app which takes content updates from a webserver and will also push feedback there. Whilst the content is obviously available via the app, I don't want the source address to be discovered and published my some unhelpful person so that it all becomes freely available.
I'm therefore looking at placing it in a mySQL database and possibly writing some PHP routines to provide access to my http(s) requests. That's all pretty new to me but I can probably do it. However, I'm not sure where to start with the security question. Something simple and straightforward would be great. Also, any guidance on whether to stick with the XML parser I currently have or to switch to JSON would be much appreciated.
The content consists of straightforward data but also html and images.
Doing exactly what you want (prevent users from 'unauthorized' apps to get access to this data') is rather difficult because at the end of the day, any access codes and/or URLs will be stored in your app for someone to dig up and exploit.
If you can, consider authenticating against the USER not the App. So that even if there is a 3rd party app created that can access this data from where ever you store it, you can still disable it on a per-user basis.
Like everything in the field of Information Security, you have to consider the cost-benefit. You need to weigh-up the value of your data vs. the cost of your security both in terms of actual development cost and the cost of protecting it as well as the cost of inconveniencing users to the point that you can't sell your data at all.
Good luck!
I'm writing a web application for public consumption...How do you get over/ deal with the fear of User Input? As a web developer, you know the tricks and holes that exist that can be exploited particularly on the web which are made all the more easier with add-ons like Firebug etc
Sometimes it's so overwhelming you just want to forget the whole deal (does make you appreciate Intranet Development though!)
Sorry if this isn't a question that can be answered simply, but perhaps ideas or strategies that are helpful...Thanks!
One word: server-side validation (ok, that may have been three words).
There's lots of sound advice in other answers, but I'll add a less "programming" answer:
Have a plan for dealing with it.
Be ready for the contingency that malicious users do manage to sneak something past you. Have plans in place to mitigate damage, restore clean and complete data, and communicate with users (and potentially other interested parties such as the issuers of any credit card details you hold) to tell them what's going on. Know how you will detect the breach and close it. Know that key operational and development personnel are reachable, so that a bad guy striking at 5:01pm on the Friday before a public holiday won't get 72+ clear hours before you can go offline let alone start fixing things.
Having plans in place won't help you stop bad user input, but it should help a bit with overcoming your fears.
If its "security" related concerns you need to just push through it, security and exploits are a fact of life in software, and they need to be addressed head-on as part of the development process.
Here are some suggestions:
Keep it in perspective - Security, Exploits and compromises are going to happen to any application which is popular or useful, be prepared for them and expect them to occur
Test it, then test it again - QA, Acceptance testing and sign off should be first class parts of your design and production process, even if you are a one-man shop. Enlist users to test as a dedicated (and vocal) user will be your most useful tool in finding problems
Know your platform - Make sure you know the technology, and hardware you are deploying on. Ensure that relevant patches and security updates are applied
research - look at applications similar to your own and see what issues they experience, surf their forums, read their bug logs etc.
Be realistic - You are not going to be able to fix every bug and close every hole. Pick the most impactful ones and address those
Lots of eyes - Enlist as many people to review your designs and code as possible. This should be in addition to your QA resources
You don't get over it.
Check everything at server side - validate input again, check permissions, etc.
Sanitize all data.
That's very easy to write in bold letter and a little harder to do in practice.
Something I always did was wrap all user strings in an object, something like StringWrapper which forces you to call an encoding method to get the string. In other words, just provide access to s.htmlEncode() s.urlEncode().htmlEncode() etc. Of course you need to get the raw string so you can have a s.rawString() method, but now you have something you can grep for to review all uses of raw strings.
So when you come to 'echo userString' you will get a type error, and you are then reminded to encode/escape the string through the public methods.
Some other general things:
Prefer white-lists over black lists
Don't go overboard with stripping out bad input. I want to be able to use the < character in posts/comments/etc! Just make sure you encode data correctly
Use parameterized SQL queries. If you are SQL escaping user input yourself, you are doing it wrong.
First, I'll try to comfort you a bit by pointing out that it's good to be paranoid. Just as it's good to be a little scared while driving, it's good to be afraid of user input. Assume the worst as much as you can, and you won't be disappointed.
Second, program defensively. Assume any communication you have with the outside world is entirely compromised. Take in only parameters that the user should be able to control. Expose only that data that the user should be able to see.
Sanitize input. Sanitize sanitize sanitize. If it's input that will be displayed on the site (nicknames for a leaderboard, messages on a forum, anything), sanitize it appropriately. If it's input that might be sent to SQL, sanitize that too. In fact, don't even write SQL directly, use an intermediary of some sort.
There's really only one thing you can't defend from if you're using HTTP. If you use a cookie to identify somebody's identity, there's nothing you can do from preventing somebody else in a coffeehouse from sniffing the cookie of somebody else in that coffee house if they're both using the same wireless connection. As long as they're not using a secure connection, nothing can save you from that. Even Gmail isn't safe from that attack. The only thing you can do is make sure an authorization cookie can't last forever, and consider making them re-login before they do something big like change password or buy something.
But don't sweat it. A lot of the security details have been taken care of by whatever system you're building on top of (you ARE building on top of SOMETHING, aren't you? Spring MVC? Rails? Struts? ). It's really not that tough. If there's big money at stake, you can pay a security auditing company to try and break it. If there's not, just try to think of everything reasonable and fix holes when they're found.
But don't stop being paranoid. They're always out to get you. That's just part of being popular.
P.S. One more hint. If you have javascript like this:
if( document.forms["myForm"]["payment"].value < 0 ) {
alert("You must enter a positive number!");
return false;
}
Then you'd sure as hell have code in the backend that goes:
verify( input.payment >= 0 )
"Quote" everything so that it can not have any meaning in the 'target' language: SQL, HTML, JavaScript, etc.
This will get in the way of course, so you have to be careful to identify when this needs special handling, like through administrative privileges to deal with some if the data.
There are multiple types of injection and cross-site scripting (see this earlier answer), but there are defenses against all of them. You'll clearly want to look at stored procedures, white-listing (e.g. for HTML input), and validation, to start.
Beyond that, it's hard to give general advice. Other people have given some good tips, such as always doing server-side validation and researching past attacks.
Be vigilant, but not afraid.
No validation in web-application layer.
All validations and security checks should be done by the domain layer or business layer.
Throw exceptions with valid error messages and let these execptions be caught and processed at presentation layer or web-application.
You can use validation framework
to automate validations with the help
of custom validation attributes.
http://imar.spaanjaars.com/QuickDocId.aspx?quickdoc=477
There should be some documentation of known exploits for the language/system you're using. I know the Zend PHP Certification covers that issue a bit and you can read the study guide.
Why not hire an expert to audit your applications from time to time? It's a worthwhile investment considering your level of concern.
Our client always say: "Deal with my users as they dont differentiate between the date and text fields!!"
I code in Java, and my code is full of asserts i assume everything is wrong from the client and i check it all at server.
#1 thing for me is to always construct static SQL queries and pass your data as parameters. This limits the quoting issues you have to deal with enormously. See also http://xkcd.com/327/
This also has performance benefits, as you can re-use the prepared queries.
There are actually only 2 things you need to take care with:
Avoid SQL injection. Use parameterized queries to save user-controlled input in database. In Java terms: use PreparedStatement. In PHP terms: use mysql_real_escape_string() or PDO.
Avoid XSS. Escape user-controlled input during display. In Java/JSP terms: use JSTL <c:out>. In PHP terms: use htmlspecialchars().
That's all. You don't need to worry about the format of the data. Just about the way how you handle it.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
We have written a software package for a particular niche industry. This package has been pretty successful, to the extent that we have signed up several different clients in the industry, who use us as a hosted solution provider, and many others are knocking on our doors. If we achieve the kind of success that we're aiming for, we will have literally hundreds of clients, each with their own web site hosted on our servers.
Trouble is, each client comes in with their own little customizations and tweaks that they need for their own local circumstances and conditions, often (but not always) based on local state or even county legislation or bureaucracy. So while probably 90-95% of the system is the same across all clients, we're going to have to build and support these little customizations.
Moreover, the system is still very much a work in progress. There are enhancements and bug fixes happening continually on the core system that need to be applied across all clients.
We are writing code in .NET (ASP, C#), MS-SQL 2005 is our DB server, and we're using SourceGear Vault as our source control system. I have worked with branching in Vault before, and it's great if you only need to keep 2 or 3 branches synchronized - but we're looking at maintaining hundreds of branches, which is just unthinkable.
My question is: How do you recommend we manage all this?
I expect answers will be addressing things like object architecture, web server architecture, source control management, developer teams etc. I have a few ideas of my own, but I have no real experience in managing something like this, and I'd really appreciate hearing from people who have done this sort of thing before.
Thanks!
I would recommend against maintaining separate code branches per customer. This is a nightmare to maintain working code against your Core.
I do recommend you do implement the Strategy Pattern and cover your "customer customizations" with automated tests (e.g. Unit & Functional) whenever you are changing your Core.
UPDATE:
I recommend that before you get too many customers, you need to establish a system of creating and updating each of their websites. How involved you get is going to be balanced by your current revenue stream of course, but you should have an end in mind.
For example, when you just signed up Customer X (hopefully all via the web), their website will be created in XX minutes and send the customer an email stating it's ready.
You definitely want to setup a Continuous Integration (CI) environment. TeamCity is a great tool, and free.
With this in place, you'll be able to check your updates in a staging environment and can then apply those patches across your production instances.
Bottom Line: Once you get over a handful of customers, you need to start thinking about automating your operations and your deployment as yet another application to itself.
UPDATE: This post highlights the negative effects of branching per customer.
Our software has very similar requirements and I've picked up a few things over the years.
First of all, such customizations will cost you both in the short and long-term. If you have control over it, place some checks and balances such that sales & marketing do not over-zealously sell customizations.
I agree with the other posters that say NOT to use source control to manage this. It should be built into the project architecture wherever possible. When I first began working for my current employer, source control was being used for this and it quickly became a nightmare.
We use a separate database for each client, mainly because for many of our clients, the law or the client themselves require it due to privacy concerns, etc...
I would say that the business logic differences have probably been the least difficult part of the experience for us (your mileage may vary depending on the nature of the customizations required). For us, most variations in business logic can be broken down into a set of configuration values which we store in an xml file that is modified upon deployment (if machine specific) or stored in a client-specific folder and kept in source control (explained below). The business logic obtains these values at runtime and adjusts its execution appropriately. You can use this in concert with various strategy and factory patterns as well -- config fields can contain names of strategies etc... . Also, unit testing can be used to verify that you haven't broken things for other clients when you make changes. Currently, adding most new clients to the system involves simply mixing/matching the appropriate config values (as far as business logic is concerned).
More of a problem for us is managing the content of the site itself including the pages/style sheets/text strings/images, all of which our clients often want customized. The current approach that I've taken for this is to create a folder tree for each client that mirrors the main site - this tree is rooted at a folder named "custom" that is located in the main site folder and deployed with the site. Content placed in the client-specific set of folders either overrides or merges with the default content (depending on file type). At runtime the correct file is chosen based on the current context (user, language, etc...). The site can be made to serve multiple clients this way. Efficiency may also be a concern - you can use caching, etc... to make it faster (I use a custom VirtualPathProvider). The largest problem we run into is the burden of visually testing all of these pages when we need to make changes. Basically, to be 100% sure you haven't broken something in a client's custom setup when you have changed a shared stylesheet, image, etc... you would have to visually inspect every single page after any significant design change. I've developed some "feel" over time as to what changes can be comfortably made without breaking things, but it's still not a foolproof system by any means.
In my case I also have no control other than offering my opinion over which visual/code customizations are sold so MANY more of them than I would like have been sold and implemented.
This is not something that you want to solve with source control management, but within the architecture of your application.
I would come up with some sort of plugin like architecture. Which plugins to use for which website would then become a configuration issue and not a source control issue.
This allows you to use branches, etc. for the stuff that they are intended for: parallel development of code between (or maybe even over) releases. Each plugin becomes a seperate project (or subproject) within your source code system. This also allows you to combine all plugins and your main application into one visual studio solution to help with dependency analisys etc.
Loosely coupling the various components in your application is the best way to go.
As mention before, source control does not sound like a good solution for your problem. To me it sounds that is better yo have a single code base using a multi-tenant architecture. This way you get a lot of benefits in terms of managing your application, load on the service, scalability, etc.
Our product using this approach and what we have is some (a lot) of core functionality that is the same for all clients, custom modules that are used by one or more clients and at the core a the "customization" is a simple workflow engine that uses different workflows for different clients, so each clients gets the core functionality, its own workflow(s) and some extended set of modules that are either client specific or generalized for more that one client.
Here's something to get you started on multi-tenancy architecture:
Multi-Tenant Data Architecture
SaaS database tenancy patterns
Without more info, such as types of client specific customization, one can only guess how deep or superficial the changes are. Some simple/standard approaches to consider:
If you can keep a central config specifying the uniqueness from client to client
If you can centralize the business rules to one class or group of classes
If you can store the business rules in the database and pull out based on client
If the business rules can all be DB/SQL based (each client having their own DB
Overall hard coding differences based on client name/id is very problematic, keeping different code bases per client is costly (think of the complete testing/retesting time required for the 90% that doesn't change)...I think more info is required to properly answer (give some specifics)
Layer the application. One of those layers contains customizations and should be able to be pulled out at any time without affect on the rest of the system. Application- and DB-level "triggers" (quoted because they may or many not employ actual DB triggers) that call customer-specific code or are parametrized with customer keys) are very helpful.
Core should never be customized, but you must layer it in somewhere, even if it is simplistic web filtering.
What we have is a a core datbase that has the functionality that all clients get. Then each client has a separate database that contains the customizations for that client. This is expensive in terms of maintenance. The other problem is that when two clients ask for a simliar functionality, it is often done differnetly by the two separate teams. There is currently little done to share custiomizations between clients and make common ones become part of the core application. Each client has their own application portal, so we don't have the worry about a change to one client affecting some other client.
Right now we are looking at changing to a process using a rules engine, but there is some concern that the perfomance won't be there for the number of records we need to be able to process. However, in your circumstances, this might be a viable alternative.
I've used some applications that offered the following customizations:
Web pages were configurable - we could drag fields out of view, position them where we wanted with our own name for the field label.
Add our own views or stored procedures and use them in: data grids (along with an update proc) and reports. Each client would need their own database.
Custom mapping of Excel files to import data into system.
Add our own calculated fields.
Ability to run custom scripts on forms during various events.
Identify our own custom fields.
If you clients are larger companies, you're almost going to need your own SDK, API's, etc.