Universal RESTful Data API framework

We are implementing an operational data store (ODS) for some key data sets ("single view of customer", "single view of employee") to provide meaningful, integrated data primarily to front office applications (primarily B2E, i.e. both provider and consumer are controlled internally, with no external exposure).
All the "single views" will be centered around a slow changing master business entity ("customer", "employee", "asset", "product") which has related child/satellite entities which are more transactional/fast changing in nature (i.e. bookings, orders, payments etc.). The different "single views" will be overlapping and interconnected.
So this ODS would become a data abstraction layer between disparate "systems of record" and vertical systems of engagement, providing a traversable data universe that decouples clients from producers.
Of course, an ODS is pointless if there is no way to access the data. Therefore, I am looking for an elegant way to implement a resource-based data services layer on top of the ODS with the following characteristics:
Stateless
REST Level 3 (HATEOAS), but most likely limited to GETs (i.e. create, update, and delete would be executed against the actual "system of record" in order to apply business logic); see the sketch after this list for what such a read-only, link-driven representation could look like
abstraction between the enterprise data model (domain entities are exposed as representations) and the physical data model (actual data is stored in physical tables); no leakage of the physical data model
support for common query features, such as ordering, paging, etc.
easy to use for developers, decoupled from physical data model and backend system changes
Ideally not constrained to relational databases, even though this could be optional for now.
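To make the "traversable data universe" idea concrete, here is a minimal sketch, assuming a plain JAX-RS stack; the resource, attribute names, and URIs are made up for illustration only. It shows a read-only "single view of customer" representation that exposes domain attributes and links to satellite entities rather than anything from the physical model.

```java
// Illustrative sketch only (assumed names and URIs): a JAX-RS resource that
// serves a read-only, link-driven (HATEOAS) "single view of customer".
import java.util.LinkedHashMap;
import java.util.Map;
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;

@Path("/customers")
public class CustomerResource {

    @GET
    @Path("/{id}")
    @Produces(MediaType.APPLICATION_JSON)
    public Map<String, Object> getCustomer(@PathParam("id") String id) {
        // The representation exposes the enterprise/domain entity, not the physical tables.
        Map<String, Object> view = new LinkedHashMap<>();
        view.put("customerId", id);
        view.put("name", "Jane Doe"); // mapped from whatever physical sources back the ODS

        // Satellite, fast-changing entities are reachable through links rather than
        // embedded joins, so clients traverse the data universe without knowing the
        // physical model or the producing systems.
        Map<String, String> links = new LinkedHashMap<>();
        links.put("self", "/customers/" + id);
        links.put("bookings", "/customers/" + id + "/bookings");
        links.put("payments", "/customers/" + id + "/payments");
        view.put("_links", links);
        return view;
    }
}
```

Write operations would still go against the actual systems of record, as described above.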
The key standard I came across for that would be OData, but there are a couple of concerns:
It is very MS-centric, both as a standard (now at OASIS) and, more importantly, from an implementation perspective (.NET, WCF Data Services). We are an enterprise Java shop.
A couple of frontrunners (Netflix, eBay) have sunsetted their OData services, which is always concerning.
OData has been touted as a magic black box which exposes the underlying data model more or less automatically. That means there is no "human translation"/modelling in between, and the service layer therefore doesn't provide the abstraction that is one of the key requirements.
Now, some of these disadvantages might not be that critical, as we don't plan to expose the data services layer to the external world, but rather use it within our own environment (i.e. a rather small number of consumers which can be controlled). But then the question is how much value OData adds.
I do know that there are no free lunches out there :-)
Are there any other approaches for implementing a generic data access layer?
thx a lot, Nick

To answer your concerns about OData:
The statement that OData is MS-centric is no longer true. And there is very good news for you, especially since you're using Java to write services or clients: the Apache Olingo project is maintained and developed in an open source manner, and Microsoft is just one of its contributors. The project is based on OData V2 and will also support OData V4.
It's true that Netflix and eBay no longer expose their OData services. But according to the monitoring of the OData ecosystem on OData.org, new OData services and clients are published constantly. (OData.org has started to accept contributions to the ecosystem.)
I'm not sure if I correctly understood this item. But if I did, OData vocabularies (a part of the OData standard) may be able to resolve your concern. With the OData model annotated with terms, you can add more "human specification" to the capabilities of the service and more control over the data and the model. The result is that the client will be able to intelligently "human translate" the data and the model to better interact with the service. What is even better about vocabularies is that, although there are canonical namespaces of annotations reserved by the protocol, you can also define your own vocabulary and have it adopted by whoever wants to consume your service, since OData vocabularies are extensible, as defined in the OData protocol.
In addition, although your plan is only to expose a data service for limited access, OData's nature as a queryable, RESTful data API, along with compelling new OData V4 features such as delta responses, asynchronous requests, and server-side aggregation, will definitely help you write a more efficient and powerful data publishing and consumption story. A sketch of what such a query can look like is shown below.
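As a minimal illustration of the query features mentioned above (filtering, ordering, paging, expanding related entities), here is a sketch that simply assembles a standard OData V4 URL by hand; the service root, entity set, and property names are assumptions, not an existing service.

```java
// Builds an OData V4 query URL using only standard system query options.
// The host, entity set, and property names below are made up for illustration.
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class ODataQueryExample {
    public static void main(String[] args) {
        String serviceRoot = "https://ods.example.internal/odata";
        String url = serviceRoot + "/Customers"
                + "?$filter=" + URLEncoder.encode("country eq 'CH'", StandardCharsets.UTF_8)
                + "&$orderby=" + URLEncoder.encode("lastName asc", StandardCharsets.UTF_8)
                + "&$top=50&$skip=100"   // paging: the third page of 50
                + "&$expand=bookings";   // pull a related satellite entity set in the same call
        System.out.println(url);
    }
}
```

A Java client library such as Apache Olingo can issue the same kind of query without hand-building URLs.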


Multiple GraphQL "hops" in end-to-end flow?

I am working on an enterprise-level system and am trying to understand if my idea is super inefficient.
Our company is looking to use GraphQL, and we want to use it as a way to assist the front-end client in retrieving data, but also as a data abstraction over our raw data. What I mean is:
If we have GraphQL closer to the client as one instance (that GraphQL server would sit in front of our domain REST services), but then we also have GraphQL sitting atop the data layer, does that present any issues?
I know the question might arise: "Why don't you have GraphQL over the domain services, and GraphQL over the data, but then federate those into a gateway and have clients pull from there!" But one of the tenets we are sticking to at our company is that there must be an abstraction over our data. So, we either abstract that data via a REST API (which we do now), or we have GraphQL over the data acting as the abstraction.
So given that "data abstraction" requirement, I want to understand if there are any issues with the two "hops"/instances of GraphQL in the end-to-end flow?
This is a common pattern. We used this for our backend services, which received GraphQL on the domain layer and then used Prisma for the data layer.
I have two recommendations from our experience.
Try, as best as possible, to auto-generate both your resolvers and your data API using a language-specific tool.
Do testing against the domain layer to make sure that nothing from the data layer slips through. It will be tempting to do simple "pass-through" requests, as the two schemas will often start off synchronized, and you may wind up accidentally passing through data you don't want going to the client.
(Shameless plug!) For the second one, Meeshkan does this sort of testing in an automated fashion, and there are plenty of testing frameworks you can use to execute hand-written tests as well (e.g. Cucumber). A crude sketch of such a check is shown below.
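As a minimal, hand-rolled sketch of the second recommendation (not how Meeshkan or any particular framework does it), the following check scans the domain-facing schema file for field names that should only exist in the data layer; the file name and the forbidden field names are assumptions.

```java
// Crude leak check: fail if any data-layer-only field name appears in the
// domain-facing GraphQL SDL. File name and field names are illustrative only.
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class SchemaLeakCheck {
    public static void main(String[] args) throws Exception {
        String domainSdl = Files.readString(Path.of("domain-schema.graphqls"));
        // Fields that belong to the data layer and must not be passed through to clients.
        List<String> dataLayerOnly = List.of("internalRowId", "tenantShardKey", "rawPaymentBlob");
        for (String field : dataLayerOnly) {
            if (domainSdl.contains(field)) {
                throw new AssertionError("Data-layer field leaked into domain schema: " + field);
            }
        }
        System.out.println("No data-layer-only fields found in the domain schema.");
    }
}
```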

How to structure a RESTful backend API with a database?

I want to make a REST API which interacts with (stores data in) a database.
While reading about some design patterns, I came across Remote Facade, and the book I was reading mentions that the role of this facade is to translate the coarse-grained methods from the remote calls into fine-grained local calls, and that it should not have any extra logic. As an explanation, it says that the program should still work without this facade.
Here's an example
Yet I have two questions:
Considering I also have a database, does it make sense to split the general call into specific calls for each attribute? Doesn't it make more sense to just have a general "get data" method that runs one query against the database and converts it into a usable object, to reduce the number of database calls? So instead of splitting the get address into get street, get city, and get zip, make one DB call for all that info.
With all this in mind, and, in my case using golang, how should the project be structured in terms of files and functions?
I will have the main file with all the endpoints from the REST API, calling the controllers that handle these requests.
I will have a set of files that define those controllers. Are these controllers the remote facade? Should those methods not have logic in that case, and just call the equivalent local methods?
Should the local methods call the database directly, or should they use some sort of helper class that accesses the database?
Assuming all questions are positive, does the following structure make sense?
Main
Controllers
Domain
Database helper
First and foremost, as Mike Amundsen has stated:
Your data model is not your object model is not your resource model is not your affordance model
Jim Webber said something very similar: by implementing a REST architecture you have an integration model, in the form of the Web, which is governed by HTTP, and alongside it the domain model. Resources adapt and project your domain model to the world, though there is no 1:1 mapping between the data in your database and the representations you send out. A typical REST system has many more resources than you have DB entries in your domain model.
With that being said, it is hard to give concrete advice on how you should structure your project, especially in terms of a certain framework you want to use. According to Robert "Uncle Bob" C. Martin, looking at the code structure should tell you something about the intent of the application and not about the framework you use; in his words, architecture is about intent. What you usually see instead is the default structure imposed by a framework such as Maven, Ruby on Rails, and so on. For golang you should probably read through certain documentation or blogs, which might or might not give you some ideas.
In terms of accessing the database, you might either follow a microservice architecture, where each service maintains its own database, or attempt something like a distributed monolith that acts as one cohesive system and shares the database among all its parts. In case you scale out and a couple of parallel services consume data, e.g. via a message broker, you might need a distributed lock and/or queue to guarantee that the data is not consumed by multiple instances at the same time.
What you should do, however, is design your data layer in a way that scales well. What many developers often forget or underestimate is the benefit they can gain from caching. Links are basically used on the Web to reference from one resource to another, giving the relation some semantic context through the use of well-defined link-relation names. Link relations also allow a server to control its own namespace and change URIs as needed. But URIs are not only pointers to a resource a client can invoke; they are also keys for a cache. Caching can take place in multiple locations: on the server side to avoid costly calculations or lookups, on the client side to avoid sending requests out at all, or on intermediary hops, which take pressure off heavily requested servers. Fielding even made caching a constraint that needs to be respected. The sketch below shows a response opting into caching.
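As a minimal sketch of letting responses participate in HTTP caching (assuming a JAX-RS stack; the resource, URI, and max-age value are made up for illustration):

```java
// Illustrative only: a GET handler that marks its representation as cacheable,
// so clients and intermediaries can reuse it, keyed by the resource URI.
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.core.CacheControl;
import javax.ws.rs.core.EntityTag;
import javax.ws.rs.core.Response;

@Path("/players")
public class PlayerResource {

    @GET
    @Path("/{id}")
    public Response getPlayer(@PathParam("id") String id) {
        String body = "{\"id\":\"" + id + "\"}"; // real representation lookup omitted
        CacheControl cacheControl = new CacheControl();
        cacheControl.setMaxAge(300); // clients/intermediaries may reuse this for 5 minutes
        return Response.ok(body)
                .cacheControl(cacheControl)
                .tag(new EntityTag(Integer.toHexString(body.hashCode()))) // enables conditional GETs
                .build();
    }
}
```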
Which attributes you should create queries for depends entirely on the use case you are trying to support. In the case of the address example given, it does make sense to return the address information all at once, as the street or zip code is rarely queried on its own. If the address is part of some user or employee data, it is less clear whether to return that information as part of the user or employee data, or just as a link that should be queried on its own in a further request. What you return may also depend on the capabilities of the media type your client and your service agree upon (content-type negotiation).
If you implement something like a grouping, e.g. some football players and certain categories they belong to, such as their teams and whether they are offense or defense players, you might have a Team A resource that includes all of the players as embedded data. Within the DB you could either have a dedicated table for teams with references to the respective players, or the team could just be a column in the player table. We don't know, and a client usually doesn't care either. From a design perspective you should, however, be aware of the benefits and consequences of including all the players at once, of providing only links to the respective players, or of using a mixed approach of presenting some base data plus a link to learn further details.
The latter approach is probably the most sensible way, as it gives a client enough information to determine whether more detailed data is needed or not. If needed, a simple GET request to the provided URI is enough, which might be served by a cache and thus never reach the actual server at all. The first approach has the clear disadvantage that it doesn't reuse caching optimally and may return far more data than actually needed. The links-only approach may not provide enough information, forcing the client to perform a follow-up request to learn data about the team member. But as mentioned before, you as the service designer decide which URIs or queries are returned to the client and thus can design your system and data model accordingly. A sketch of the mixed representation is shown below.
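For illustration only (the names, positions, and URIs are assumptions), here is what the mixed approach could look like as a representation: a few base attributes per player plus a link a client can follow for details.

```java
// Illustrative only: a "mixed" Team representation with a little base data per
// player and a link a client can follow (and a cache can serve) for details.
import java.util.List;
import java.util.Map;

public class TeamRepresentationExample {
    public static Map<String, Object> teamA() {
        return Map.of(
                "name", "Team A",
                "players", List.of(
                        Map.of("name", "Player 1", "position", "offense", "href", "/players/1"),
                        Map.of("name", "Player 2", "position", "defense", "href", "/players/2")));
    }
}
```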
In general what you do in a REST architecture is providing a client with choices. It is good practice to design the overall interaction flow as a state machine which is traversed through receiving requests and returning responses. As REST uses the same interaction model as the Web, it probably feels more natural to design the whole system as if you'd implement it for the Web and then apply the design to your REST system.
Whether controllers should contain business logic or not is largely a matter of opinion. As Jim Webber correctly stated, HTTP, which is the de facto transport layer of REST, is an
application protocol whose application domain is the transfer of documents over a network. That is what HTTP does. It moves documents around. ... HTTP is an application protocol, but it is NOT YOUR application protocol.
He further points out that you have to narrow HTTP into a domain application protocol and trigger business activities as a side effect of moving documents around the network. So it is the side effect of moving documents over the network that triggers your business logic. There is no hard rule on whether to include business logic in your controller or not, but usually you keep the business logic in its own layer, e.g. as a service that you simply invoke from within the controller. That allows you to test the business logic without the controller and thus without a real HTTP request, as the sketch below shows.
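A minimal sketch of that separation (the class and method names are assumptions, not tied to any framework): the controller only translates the incoming document into a call on a service that holds the business logic, so the service can be unit-tested on its own.

```java
// Illustrative only: business logic lives in a service; the controller merely
// maps the transferred document onto a domain call.
class OrderService {
    // The business activity triggered "as a side effect of moving documents around".
    Receipt placeOrder(OrderDocument doc) {
        // validation, pricing, persistence ... omitted in this sketch
        return new Receipt(doc.id());
    }
}

class OrderController {
    private final OrderService service;

    OrderController(OrderService service) {
        this.service = service;
    }

    // Handler for e.g. POST /orders: no business rules here, just delegation.
    Receipt handlePost(OrderDocument doc) {
        return service.placeOrder(doc);
    }
}

record OrderDocument(String id) {}

record Receipt(String orderId) {}
```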
While this answer can't provide more detailed information, partly due to the broad nature of the question itself, I hope I could shed some light on the areas you should put some thought into, and on the fact that your data model is not necessarily your resource or affordance model.

Why is OData preferred over SCIM?

I recently studied the OData and SCIM protocols. Both provide REST services, but I am not clear about the disadvantages of SCIM that make OData the better choice for RESTful services. Could someone please help me understand the differences?
That's a great question, and one I've had to deal with in the development of our private cloud identity platform. Here's my (long-winded) take:
OData (http://odata.org) is a set of standards for modeling and accessing data of all sorts: relational, hierarchical, graph, you name it. It provides a language for defining data models (read: schema) and a standard approach to defining the APIs to access that data. OData started out firmly rooted in XML, but has evolved to HTTP/JSON over the last couple of years.
SCIM (http://www.simplecloud.info/) started with a much more limited scope: to provide a REST API for provisioning identity data, i.e. users and groups. SCIM v2 expanded to include schema extensibility as well as search and CRUD operations. In this sense, SCIM’s evolution is very similar to that of Service Provisioning Markup Language (SPML). In a lot of ways SCIM is a RESTful, JSON-y version of SPML.
The two standards started with different goals, but have converged somewhat in terms of functionality. Even so, OData is substantially more expressive. A few areas come to mind:
Query expressions – SCIM is limited to expressions comparing attributes to literal values, e.g. ~/Users?filter=surname eq "smith", combined with "and"s and "or"s. OData supports comparison of attributes to other attributes, e.g. ?$filter=Surname eq Title, and even comparison of attributes of different entities. I suspect that SCIM's limitation in this regard comes from the influence of LDAP on its designers and early use cases.
Queries on complex data models – Both SCIM and OData support exposing relationships between resources using URLs. This allows you to create richer data models. But the SCIM query language doesn't allow operations on navigation ("$ref") attributes, so you can't navigate between resources as part of a query. OData incorporates navigation properties as part of the data model, and as a consequence, you can use a single OData query to (for instance) "return all the people who work for my boss's boss who are assigned to my project". You can of course do this with SCIM, but it's up to the client application to do the navigation, maintain state, and so on. OData is more like SQL in that it lets you offload query complexity to the server. That's a good idea in general, and critical when the client is a constrained device like a phone.
Functions and Actions – SCIM defines an API for CRUD and search operations, but that's all. OData provides for the definition of other kinds of APIs. For instance, you can define an entity-bound API called "SendEmail" that accepts a message as one of its parameters, and the system would send an email to the entity in context. This is a really powerful notion in my mind, and it allows applications to discover APIs the same way they can discover everything else about an OData system. Functions can be used in query expressions too, so you can ask things like "return all the customers whose …".
Type system – SCIM doesn't really provide a type system per se. Resources are classified by a resource type that defines the allowed attributes for resources of that type, but that's it. OData defines inheritance of types, abstract types, "open" (extensible) types and type conversion, and supplies a much wider range of primitive types (e.g. geolocation, streams, enumerations). OData also allows you to explicitly define referential constraints between entities.
Extensibility – SCIM (in v2) supports the addition of new resource types and additional attributes to existing resource types. Because OData is data-model driven, everything is extensible, including entities, properties, relationships, functions, and actions.
If OData is such a richer approach to modeling data and defining APIs than SCIM, why bother with SCIM at all? Because SCIM was developed to solve a very specific problem (identity provisioning), it is quite a bit simpler to understand and implement than OData. If your application can provision identities to one system using SCIM 1.1, it can probably provision to lots of others. While provisioning identity information using OData is no more difficult than with SCIM, it's more likely that each target OData platform will expose a somewhat different model, which means you potentially have to tweak your code for each target. That's what SCIM was intended to address, and it succeeds at that. But if you want to expose a rich data model and a corresponding set of APIs (even if for identity information), OData is far and away the better choice.
SCIM is specifically for identity management (e.g., users and groups). OData is a general purpose framework for building RESTful web services. You could create a SCIM service using OData (with some minor differences in URI and payload formats).
Like any pattern, to compare or judge them you need to understand the problem they're meant to address. They are an attempt to address enterprise application integration: applications from different vendors, on different OSs, with different APIs, need some common protocol to communicate. Think of a finance app talking to an HR app from a different vendor.
So why REST? Because REST is built on HTTP, which is a lowest-common-denominator standard (but not a bad one), meaning that everyone can play, and port 80 is the only port open in every firewall.
OData is the more mature and comprehensive standard, but keep in mind it's a standard, not an implementation. With most vendors, the implementation of the standard is incomplete or dated (Microsoft, Oracle, SAP). So before committing to anything, ensure the implementation has the features you need in the real world.

RESTful web service for systems integration understanding

I am, indeed, new to RESTful services, and while I feel I understand the concepts, I am resistant to some aspects of their use in my current project.
The project involves the provision of some form data from another system. Project members insist that the form data should be broken down into "resources", as there are customers, customer addresses, etc. on the form.
So it's all about how granular the REST API is... the form data is not complete and actionable until we have all of it (and there's very little at that). And, in fact, I guess we will have to prepare some integrator on the service side to assemble all of these resource bits before we can use them, because at present we have no persistence for them; or, more specifically, we have persistence for them but need to hide the data before it becomes actionable.
Again, at present this is point-to-point communication without any business case for sharing or service composition.
So, I'm of the opinion that a single "form" service using a POST is an acceptable optimization and, due to the amount of work it cuts for us, a pragmatic approach.
What am I not getting about doing it the hard and expensive way?
If you don't need a high-level definition that requires a heavier structure, with well-formed and heavyweight XML, its DTD, a WSDL, etc., then the best choice is REST: it is lighter and uses HTTP.
Here you can find a better explanation:
WSDL vs REST Pros and Cons

How can I set up OData and EF without coupling to my database structure?

I really like OData (WCF Data Services). In past projects I have coded up so many web services just to allow different ways to read my data.
OData gives great flexibility for the clients to have the data as they need it.
However, in a discussion today, a co-worker pointed out that how we are doing OData is little more than giving the client application a connection to the database.
Here is how we are setting up our WCF Data Service (Note: this is the traditional way)
Create an Entity Framework (EF) data model of our database
Publish that model with WCF Data Services
Add Security to the OData feed
(This is where it is better than a direct connection to the SQL Server)
My co-worker (correctly) pointed out that all our clients will be coupled to the database now. (If a table or column is refactored then the clients will have to change too)
EF offers a bit of flexibility in how your data is presented and could be used to hide some minor database changes that don't affect the client apps. But I have found it to be quite limited. (See this post for an example.) I have found that the POCO templates (while nice for allowing separation of the model and the entities) also do not offer very much flexibility.
So, the question: What do I tell my co-worker? How do I setup my WCF Data Services so they are using business oriented contracts (like they would be if every read operation used a standard WCF Soap based service)?
Just to be clear, let me ask this a different way. How can I decouple EF from WCF Data Services. I am fine to make up my own contracts and use AutoMapper to convert between them. But I would like to not go directly from EF to OData.
NOTE: I still want to use EF as my ORM. Rolling my own ORM is not really a solution...
If you use your own custom classes instead of the classes generated directly by EF, you will also have to change the provider for WCF Data Services. That means you will no longer pass the EF context as the generic parameter to the DataService base class. This is fine if you have read-only services, but once you expect any data modifications from clients, you will have a lot of work to do.
Data services based on an EF context support data modifications. All other data services use the reflection provider, which is read-only by default until you implement IUpdatable on your custom "service context class".
Data services are a technology for quickly creating services that expose your data. They are coupled to their context, and it is the responsibility of the context to provide the abstraction. If you want to make quick and easy services, you are dependent on the features supported by EF mapping. You can build some abstractions in the EDMX and create projections (DefiningQuery, QueryView), etc., but all these features have limitations (for example, projections are read-only unless you use stored procedures for modifications).
Data services are not the same as providing a connection to the database. There is one very big difference: a connection to the database only ensures access and execution permissions, but it does not ensure data security. WCF Data Services offer data security because you can create interceptors which add filters to queries so that only the data the user is allowed to see is retrieved, or which check whether the user is allowed to modify the data. That is the difference you can tell your colleague about.
Regarding abstraction: do you want a quick, easy solution or not? You can inject an abstraction layer between the service and the ORM, but you need to implement the method mentioned above and you have to test it.
The simplest approach:
DO NOT PUBLISH YOUR TABLES ;)
Make a separate schema
Add views to it
Put those views into EF and publish them.
The views are decoupled from the tables and thus can be simplified and refactored separately.
This is a standard approach, also used for reporting.
Apart from achieving more granular data authorisation (based on certain field values, etc.), OData also allows your data to be accessed via open standards like JSON/XML over HTTP using OAuth. This is very useful for web/mobile applications. You could create a web service to expose your data, but that would warrant a change every time your clients' data requirements change (e.g. extra fields needed), whereas OData allows this via OData queries. In a big enterprise this is also useful for designing security at the infrastructure level, as it only allows text-based (HTTP) calls, which can be inspected/verified for security threats by network firewalls.
You have some other options for your OData client. Have a look at Simple.OData.Client, described in this article: http://www.codeproject.com/Articles/686240/reasons-to-consume-OData-feeds-using-Simple-ODa
And in case you are familiar with Simple.Data microORM, there is an OData adapter for it:
https://github.com/simplefx/Simple.OData/wiki
UPDATE. My recommendations are about client choice, while your question is about setting up your server side, so of course they are not what you are asking for. I will leave my answer, however, so you are aware of the client alternatives.