Auto completion of a field, design for server side scalability

Auto completion of a field, design for server side scalability - autocomplete

When handling autocompletion feature for a form field where every character typed by a user triggers an api call for suggestions, how do you proxy this call to scale?
Direct from java script is not possible due to cross domain restrictions, and not secure because that would expose the api keys.
Moving this to the controller or model, would incur a lot of queries to the server side that would put heavy burden on them when the active user base has reached a certain limit.
Whats the standard industry practice for such a feature?

You'll need to be very smart on the client and on the server.
Use a lot of caching everywhere to avoid extra work. Use CORS or JSONP. And frankly speaking this is a lot of work. Not speaking of Lucene/SOLR being not very autocomplete capable engine.
Btw: look at www.rockitsearch.com . It has implementation autocomple with all the basic features. All you'll need to do is: register and export your data there. And then integrate your widget on your website.

Not sure what you mean by "proxy this call", but in general:
You can use JSONP for cross domain queries. But you pay performance penalty on a client side.
It's OK to query same domain. There is no single answer since topic is very generic. How you scale depends on your infrastructure. If application is designed to scale horizontally you scale just by adding more servers to your servers pool. Which is pretty simple using Amazon or Azure cloud services. It is also important to optimize database queries and indexes so that database responds fast. If user base is big you can even have multiple copies of the same databases to help with performance.
Don't worry about optimizations prematurely since you may never get to that point. If you get it is good problem to have and in this case solution is trivial.

Related

Managing UI requirements in a microservice architecture

We have different client applications (each is built with a different UI and is targeted to a different sales channels) that are used to capture orders that ultimately need to be processed by our factory.
At first we decided to offer a single "order" microservice that would be used by all these client applications for business rules execution and data storage. This microservice will also trigger our backoffice processes such as client profile update, order analysis, documents storage to our electronic vault, invoicing, communications, etc.
The challenge we are facing is that these client applications are developed by teams that are external to ours (we are a backoffice team only). Each team responsible to develop a client application will be able to offer a different UX to their users (some will allow to save orders in an incomplete state, some wil allow to capture data using a specific worflow, some will use text fields instead of listboxes for some values, etc.).
This diversity of behaviors from client applications is an issue because our microservice logic will become very complex to be able to support all those UI requirements. Moreover, everytime a change will be made to one of the client applications, we will have to modify our microservice which is a case of strong coupling.
My questions are: What would be your best advice to manage this issue? Should we let each application capture the data the way it wants (and persist it if needed in its own database) and let them call our microservice only when an order is complete and compliant to our API contract?
Should we keep our idea of having a single "order" microservice for everyone and force each client application to capture the data the same way?
Any other option?
We want to reduce the duplication of data and business rules in our ecosystem but in the same time we don't want our 'order' microservice to become a mess.
Many thanks for your help.

Moreover, everytime a change will be made to one of the client applications, we will have to modify our microservice which is a case of strong coupling.
This rings alarm bells for me. A change to a UI shouldn't require a change to a backend service. (The exception would be if a new feature were being added to a system and the backend service needed to play a part in supporting that feature, but I wouldn't just call that a change to a client.) As you have said, it's strong coupling, and that's something to be avoided in a microservices environment.
Ideally, your service should provide a generic, programmatic API that is flexible enough to support multiple UIs (or other non-UI applications) without having any knowledge of how the UIs work.
It sounds like you have some decisions to make about what responsibilities your service will and won't take on:
Does it make more sense for your generic orders service to facilitate the storage/retrieval/completion of incomplete orders, or to force its clients to manage this somewhere else?
Does it make more sense for your generic service to provide facilities to assist in the tracking of workflows, or to force the UIs that need that functionality to find it elsewhere?
For clients that want to show list boxes, does it make sense for your generic orders service to provide APIs that aid in populating those boxes?
Should we let each application capture the data the way it wants (and persist it if needed in its own database) and let them call our microservice only when an order is complete and compliant to our API contract?
It really depends on whether you think that's the most sensible way for your service to behave. Something that will play into that will be how similar or dissimilar the needs of each UI is. If 4 out of 5 UIs have the same needs, it could well make sense to support that generically in your service. If every single UI behaves differently to the others, putting that functionality in your generic orders service would amount to storing frontend code somewhere that it doesn't belong.
It seems like there might also be some organisational considerations to these decisions. If the teams using your service are only frontend teams (i.e. without capacity/skills to build backend services), then someone will still have to build the backend functionality they require.
Should we keep our idea of having a single "order" microservice for everyone and force each client application to capture the data the same way?
Yes to the idea of having a single order service with a generic interface for everyone. With regards to forcing client applications to capture data a certain way, your API will only dictate what they need to do to create an order. You can't (and shouldn't) force anything on them about the way they capture the data before calling your service. They can do it however they like. The questions are really around whether your service supports various models of capture or pushes that responsibility back to the frontend.
What would be your best advice to manage this issue?
Collaborate with the teams that will use the service. Gather as much information as you can about the use cases in which they intend to use it. Discover what is common for the majority and choose what of that you will support. Create a semi-formal spec (e.g. well-documented Open API), share it with the client teams, ask for feedback, and iterate. For the parts of the UIs that aren't common across clients, strongly consider telling those teams they'll need to support those elements of their design themselves, especially if they represent significant work on your end.

How to structure a RESTful backend API with a database?

I want to make an API using REST which interacts (stores) data in a database.
While I was reading some design patterns and I came across remote facade, and the book I was reading mentions that the role of this facade is to translate the course grained methods from the remote calls into fine grained local calls, and that it should not have any extra logic. As an explaination, it says that the program should still work without this facade.
Here's an example
Yet I have two questions:
Considering I also have a database, does it make sense to split the general call into specific calls for each attribute? Doesn't it make more sense to just have a general "get data" method that runs one query against the database and converts it into an usable object, to reduce the number of database calls? So instead of splitting the get address to get street, get city, get zip, make on db call for all that info.
With all this in mind, and, in my case using golang, how should the project be structured in terms of files and functions?
I will have the main file with all the endpoints from the REST API, calling the controllers that handle these requests.
I will have a set of files that define those controllers. Are these controllers the remote facade? Should those methods not have logic in that case, and just call the equivalent local methods?
Should the local methods call the database directly, or should they use some sort of helper class that accesses the database?
Assuming all questions are positive, does the following structure make sense?
Main
Controllers
Domain
Database helper

First and foremost, as Mike Amundsen has stated
Your data model is not your object model is not your resource model is not your affordance model
Jim Webber did say something very similar, that by implementing a REST architecture you have an integration model, in the form of the Web, which is governed by HTTP and the other being the domain model. Resources adept and project your domain model to the world, though there is no 1:1 mapping between the data in your database and the representations you send out. A typical REST system does have many more resources than you have DB entries in your domain model.
With that being said, it is hard to give concrete advice on how you should structure your project, especially in terms of a certain framework you want to use. In regards to Robert "Uncle Bob" C. Martin on looking at the code structure, it should tell you something about the intent of the application and not about the framework¹ you use. According to him Architecture is about intent. Though what you usually see is the default-structure imposed by a framework such as Maven, Ruby on Rails, ... For golang you should probably read through certain documentation or blogs which might or might not give you some ideas.
In terms of accessing the database you might either try to follow a micro-service architecture where each service maintains their own database or you attempt something like a distributed monolith that acts as one cohesive system and shares the database among all its parts. In case you scale to the broad and a couple of parallel services consume data, i.e. in case of a message broker, you might need a distributed lock and/or queue to guarantee that the data is not consumed by multiple instances at the same time.
What you should do, however, is design your data layer in a way that it does scale well. What many developers often forget or underestimate is the benefit they can gain from caching. Links are basically used on the Web to reference from one resource to an other and giving the relation some semantic context by the utilization of well-defined link-relation names. Link relations also allow a server to control its own namespace and change URIs as needed. But URIs are not only pointers to a resource a client can invoke but also keys for a cache. Caching can take place on multiple locations. On the server side to avoid costly calculations or look ups on the client side to avoid sending requests out in general or on intermediary hops which allow to take away pressure from heavily requested servers. Fielding made caching even a constraint that needs to be respected.
In regards to what attributes you should create queries for is totally dependent on the use case you attempt to depict. In case of the address example given it does make sense to return the address information all at once as the street or zip code is rarely queried on its own. If the address is part of some user or employee data it is more vague whether to return that information as part of the user or employee data or just as a link that should be queried on its own as part of a further request. What you return may also depend on the capabilities of the media-type client and your service agree upon (content-type negotiation).
If you implement something like a grouping for i.e. some football players and certain categories they belong to, such as their teams and whether they are offense or defense players, you might have a Team A resource that includes all of the players as embedded data. Within the DB you could have either an own table for teams and references to the respective player or the team could just be a column in the player table. We don't know and a client usually doesn't bother as well. From a design perspective you should however be aware of the benefits and consequences of including all the players at the same time in regards to providing links to the respective player or using a mixed approach of presenting some base data and a link to learn further details.
The latter approach is probably the most sensible way as this gives a client enough information to determine whether more detailed data is needed or not. If needed a simple GET request to the provided URI is enough, which might be served by a cache and thus never reach the actual server at all. The first approach has for sure the disadvantage that it doesn't reuse caching optimally and may return way more data then actually needed. The approach to include links only may not provide enough information forcing the client to perform a follow-up request to learn data about the team member. But as mentioned before, you as the service designer decide which URIs or queries are returned to the client and thus can design your system and data model accordingly.
In general what you do in a REST architecture is providing a client with choices. It is good practice to design the overall interaction flow as a state machine which is traversed through receiving requests and returning responses. As REST uses the same interaction model as the Web, it probably feels more natural to design the whole system as if you'd implement it for the Web and then apply the design to your REST system.
Whether controllers should contain business logic or not is primarily an opinionated question. As Jim Webber correctly stated, HTTP, which is the de-facto transport layer of REST, is an
application protocol whose application domain is the transfer of documents over a network. That is what HTTP does. It moves documents around. ... HTTP is an application protocol, but it is NOT YOUR application protocol.
He further points out that you have to narrow HTTP into a domain application protocol and trigger business activities as a side-effect of moving documents around the network. So, it's the side-effect of moving documents over the network that triggers your business logic. There is no straight rule whether to include business logic in your controller or not, but usually you try to keep the business logic in yet their own layer, i.e. as a service that you just invoke from within the controller. That allows to test the business logic without the need of the controller and thus without the need of a real HTTP request.
While this answer can't provide more detailed information, partly due to the broad nature of the question itself, I hope I could shed some light in what areas you should put in some thoughts and that your data model is not necessarily your resource or affordance model.

RESTful API runtime discoverability / HATEOAS client design

For a SaaS startup I'm involved in, I am building both a RESTful web API and a couple of client apps on different platforms that consume it. I think I've got the API figured out, but now I'm turning to the clients. As I've been reading about REST, I see that a key part of REST is discovery, but there seems to be a lot of debate between two different interpretations of what discovery really means:
Developer discovery: The developer hard-codes copious amounts of API details into the client, such as resource URI's, query parameters, supported HTTP methods, and other details that they've discovered through browsing the docs and experimenting with the API's responses. This type of discovery IMHO necessitates cool linkage and the API versioning question, and leads to hard coupling of the client code to the API. Not much better than if using a well-documented collection of RPC's it seems.
Runtime discovery - The client app itself is able to figure out everything it needs with little or no out-of-band information (presumably, only a knowledge of the media types the API deals with.) Links can be hot. But to make the API very efficient, a lot of link templating for query parameters seems to be needed, which makes out-of-band info creep back in. There are possibly other difficulties I haven't thought of yet since I haven't gotten to that point in development. But I do like the idea of loose coupling.
Runtime discovery seems to be the holy grail of REST, but I'm seeing precious little discussion about how to implement such a client. Almost all REST sources I've found seem to assume Developer discovery. Anyone know of some Runtime discovery resources? Best practices? Examples or libraries with real code? I'm working in PHP (Zend Framework) for one client. Objective-C (iOS) for the other.
Is Runtime discovery a realistic goal, given the present set of tools and knowledge in the developer community? I can write my client to treat all of the URI's in an opaque manner, but how to do this most efficiently is a question, especially over low-bandwidth connections. Anyway, URI's are only part of the equation. What about link templating in the Runtime context? How about communicating what methods are supported, aside from making a lot of OPTIONS requests?

This is definitely a tough nut to crack. At Google, we've implemented our Discovery Service that all our new APIs are built against. The TL;DR version is we generate a JSON Schema-like spec that our clients can parse - many of them dynamically.
That results means easier SDK upgrades for the developer and easy/better maintenance for us.
By no means the perfect solution, but many of our devs seem to like.
See link for more details (and make sure to watch the vid.)

Fascinating. What you are describing is basically the HATEOAS principle. What is HATEOAS you ask? Read this: http://en.wikipedia.org/wiki/HATEOAS
In layman's terms, HATEOAS means link following. This approach decouples your client from specific URL's and gives you the flexibility to change your API without breaking anyone.

You did your home work and you got to the heart of it: runtime discovery is holy grail. Don't chase it.
UDDI tells a poignant story of runtime discovery: http://en.wikipedia.org/wiki/Universal_Description_Discovery_and_Integration

One of the requirements that should be satisfied before you can call an API 'RESTful' is that it should be possible to write a generic client application on top of that API. With the generic client, a user should be able to access all the API's functionality. A generic client is a client application that does not assume that any resource has a specific structure beyond the structure that is defined by the media type. For example, a web browser is a generic client that knows how to interpret HTML, including HTML forms etc.
Now, suppose we have a HTTP/JSON API for a web shop and we want to build a HTML/CSS/JavaScript client that gives our customers an excellent user experience. Would it be a realistic option to let that client be a generic client application? No. We want to provide a specific look-and-feel for every specific data element and every specific application state. We don't want to include all knowledge about these presentation-specifics in the API, on the contrary, the client should define the look and feel and the API should only carry the data. This implies that the client has hard-coded coupling of specific resource elements to specific layouts and user interactions.
Is this the end of HATEOAS and thus the end of REST? Yes and no.
Yes, because if we hard-code knowledge about the API into the client, we loose the benefit of HATEOAS: server-side changes may break the client.
No, for two reasons:
Being "RESTful" is a property of the API, not of the client. As long as it is possible, in theory, to build a generic client that offers all capabilities of the API, the API can be called RESTful. The fact that clients don't obey the rules, is not the API's fault. The fact that a generic client would have a lousy user experience is not an issue. Why is it important to know that it is possible to have a generic client, if we don't actually have that generic client? This brings me to the second reason:
A RESTful API offers clients the option to choose how generic they want to be, i.e. how resilient to server-side changes they want to be. Clients which need to provide a great user experience may still be resilient to URI changes, to changes in default values and more. Clients doing batch jobs without user interaction may be resilient to other kinds of changes.
If you are interested in practical examples, checkout my JAREST paper. The last section is about HATEOAS. You will see that with JAREST, even highly interactive and visually attractive clients can be quite resilient to server-side changes, though not 100%.

I think the important point about HATEOAS is not that it is some holy grail client-side, but that it isolates the client from URI changes - it is assumed you are using known (or developer discovered custom) Link Relations that will allow the system to know which link for an object is the editable form. The important point is to use a media type that is hypermedia aware (e.g. HTML, XHTML, etc).

You write:
To make the API very efficient, a lot of link templating for query parameters seems to be needed, which makes out-of-band info creep back in.
If that link template is supplied in the previous request, then there is no out-of-band information. For example a HTML search form uses link templating (/search?q=%#) to generate a URL (/search?q=hateoas), but nothing is known by the client (the web browser) other than how to use HTML forms and GET.

Scala/Lift: Does it require special http routing?

Since Lift is stateful, each subsequent request to a page/site must go back to the same server. Presumably that means that the front-end load balancer needs to keep track of which client is talking to which server.
How does that work out for hosting on places like Heroku/Elastic Beanstalk, where the load balancer is all done automagically for you by the service? I know if you are setting up all your machines yourself you can set the routing to do the correct thing, but how does it work on these PaaS type hosts where all this is meant to be done for you?
EDIT: Google App Engine would have the same limitations, if i am not mistaken?

Heroku will distribute requests between dynos (processes) evenly so I believe you would have to use some form of sessions serialisation for a stateful Lift app. I believe Elastic Beanstalk does have some facilities to support this however (as ELB does).
David Pollock writes about how to use Lift in a stateless way and also talks generally about the design of Lift in this area here.

Lift is not really intended to be used in a pure stateless mode, its possible, but its not where the framework excels. ELB does indeed have support for sticky sessions, which is the configuration you need to take up in order to use Lift successfully in nearly any environment.
More broadly, this "sticky session" functionality can be achieved with either software of L4 hardware balancing. You might be interested in chapter 15 of Lift in Action which spends a fair amount of time discussing this very subject and the various session serialisation strategies if you really want that.

Web UI to a restful interface, good idea?

I am working on a experimental website (which is accessible through web browser) that will act as a front-end to a restful interface (a sub-system). The website will serve as an interface between a user and the restful interface, as it will make http requests to the restful interface for almost all database operations. Authentication will probably be done using openid and authorization for the database operations will be done via oAuth.
Just out of curiousity, is this a feasible solution or I should develop two systems that accesses the database in parallel (i.e. the website has its own data access logic, and the restful interface has another data access logic)? And what are the pros/cons if I insist on doing it this way (it is just an experiment project for me to learn things like how OpenID and oAuth work in real life anyway) besides there will be more database queries and http requests generated for each transaction?

Your concept sounds quite feasible. I'd say that you'll get some fairly good wins out of this approach. For starters you'll get a large degree of code reuse since you'll be able to put other front ends on top of the RESTful service. Additionally, you'll be able to unit test this architecture with relative ease. Finally, you'll be able to give 3rd party developers access to the same API that you use (subject possibly to some restrictions) which will be a huge win when it comes to attracting customers and developers to your platform.
On the down side, depending on how you structure your back end you could run into the standard problem of granularity. Too much granularity and you'll end up making lots of connections for very little amounts of data. Too little and you'll get more data than you need in some cases. As for security, you should be able to lock down the back end so that requests can only be made under certain conditions: requests contain an authorization token, api key, etc.

Sounds good, but I'd recommend that you do this only if you plan to open up the restful API for other UI's to use, or simply to learn something cool. Support HTML XML and JSON for the interface.
Otherwise, use a great MVC framework instead (asp.net MVC, rails, cakephp). You'll end up with the same basic result but you'll be "strongerly" typed to the database.

with a modern javascript library your approach is quite straightforward.
ExtJS now has always had Ajax support, but it is now able to do this via a REST interface.
So, your ExtJS user interface components populate receive a URL. They populate themselves via a GET to the URL, and store update via POST to the URL.
This has worked really well on a project I'm currently working on. By applying RESTful principles there's an almost clinical separation between the front & backends - meaning it would be trivial undertaking to replace other. Plus, the API barely needs documenting, since it's an implementation of an existing mature standard.
Good luck,
Ian

woow! A question from 2009! And it's funny to read the answers. Many people seem to disagree with the web services approach and JS front end - which has nowadays become kind of standard, known as Single Page Applications..

I think the general approach you outline is quite feasible -- the main pro is flexibility, the main con is that it won't protect clueless users against their own ((expletive deleted)) abuses. As most users are likely to be clueless, this isn't feasible for mass consumption... but, it's fine for really leet users!-)

So to clarify, you want to have your web UI call into your web service, which in turn calls into the database?
This is exactly the path I took for a recent project and I think it was a mistake because you end up creating a lot of extra work. Here's why:
When you are coding your web service, you will create a library to wrap database calls, which is typical. No problem there.
But then when you code your web UI, you will end up creating another library to wrap calls into the REST interface... because otherwise it will get cumbersome making all the raw HTTP calls.
So you essentially created 2 data access libraries, one to wrap DB and the other to wrap the Web service calls. This basically doubles the amount of work you do, because for every operation on a resource, you will end up implementing in both libraries. This gets tiring real fast.
The simpler alternative is to create a single library that wraps access to the database, as before, then use that library from BOTH the web UI and web service.
This is assuming that your web UI and web service reside on the same network and both have direct access to the backend database server (which was the case for me). In this setup having both go directly to the database is also a lot more efficient then having the UI go through the web service.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse