I am working on an enterprise-level system and am trying to understand if my idea is super inefficient.
Our company is looking to use GraphQL, and we want to use it as a way to assist the front-end client in retrieving data, but also as a data abstraction over our raw data. What I mean is:
If we have GraphQL closer to the client as one instance (that GraphQL server would sit in front of our domain REST services), but then we also have GraphQL sitting atop the data layer, does that present any issues?
I know the question might arise: "Why don't you have GraphQL over the domain services, and GraphQL over the data, but then federate those into a gateway and have clients pull from there!" But one of the tenants we are sticking to at our company is there must be an abstraction over our data. So, we either abstract that data via a REST API (which we do now), or we have GraphQL over the data and act as the abstraction.
So given that "data abstraction" requirement, I want to understand if there are any issues with the two "hops"/instances of GraphQL in the end-to-end flow?
This is a common pattern. We used this for our backend services, which received graphql on the domain layer and then used prisma for the data layer.
I have two recommendations from our experience.
Try, as best as possible, to auto-generate both your resolvers and your data API using a language, specific tool.
Do testing against the domain layer to make sure that nothing from the data layer slips through. It will be tempting to do simple "pass through" requests as the two schemas will often start off synchronized, and you may wind up accidentally passing through data you don't want going to the client.
(Shameless plug!) For the second one, Meeshkan does this sort of testing in an automated fashion, and there are plenty of testing frameworks you can use to execute hand-written tests as well (ie cucumber.
Related
I want to make an API using REST which interacts (stores) data in a database.
While I was reading some design patterns and I came across remote facade, and the book I was reading mentions that the role of this facade is to translate the course grained methods from the remote calls into fine grained local calls, and that it should not have any extra logic. As an explaination, it says that the program should still work without this facade.
Here's an example
Yet I have two questions:
Considering I also have a database, does it make sense to split the general call into specific calls for each attribute? Doesn't it make more sense to just have a general "get data" method that runs one query against the database and converts it into an usable object, to reduce the number of database calls? So instead of splitting the get address to get street, get city, get zip, make on db call for all that info.
With all this in mind, and, in my case using golang, how should the project be structured in terms of files and functions?
I will have the main file with all the endpoints from the REST API, calling the controllers that handle these requests.
I will have a set of files that define those controllers. Are these controllers the remote facade? Should those methods not have logic in that case, and just call the equivalent local methods?
Should the local methods call the database directly, or should they use some sort of helper class that accesses the database?
Assuming all questions are positive, does the following structure make sense?
Main
Controllers
Domain
Database helper
First and foremost, as Mike Amundsen has stated
Your data model is not your object model is not your resource model is not your affordance model
Jim Webber did say something very similar, that by implementing a REST architecture you have an integration model, in the form of the Web, which is governed by HTTP and the other being the domain model. Resources adept and project your domain model to the world, though there is no 1:1 mapping between the data in your database and the representations you send out. A typical REST system does have many more resources than you have DB entries in your domain model.
With that being said, it is hard to give concrete advice on how you should structure your project, especially in terms of a certain framework you want to use. In regards to Robert "Uncle Bob" C. Martin on looking at the code structure, it should tell you something about the intent of the application and not about the framework¹ you use. According to him Architecture is about intent. Though what you usually see is the default-structure imposed by a framework such as Maven, Ruby on Rails, ... For golang you should probably read through certain documentation or blogs which might or might not give you some ideas.
In terms of accessing the database you might either try to follow a micro-service architecture where each service maintains their own database or you attempt something like a distributed monolith that acts as one cohesive system and shares the database among all its parts. In case you scale to the broad and a couple of parallel services consume data, i.e. in case of a message broker, you might need a distributed lock and/or queue to guarantee that the data is not consumed by multiple instances at the same time.
What you should do, however, is design your data layer in a way that it does scale well. What many developers often forget or underestimate is the benefit they can gain from caching. Links are basically used on the Web to reference from one resource to an other and giving the relation some semantic context by the utilization of well-defined link-relation names. Link relations also allow a server to control its own namespace and change URIs as needed. But URIs are not only pointers to a resource a client can invoke but also keys for a cache. Caching can take place on multiple locations. On the server side to avoid costly calculations or look ups on the client side to avoid sending requests out in general or on intermediary hops which allow to take away pressure from heavily requested servers. Fielding made caching even a constraint that needs to be respected.
In regards to what attributes you should create queries for is totally dependent on the use case you attempt to depict. In case of the address example given it does make sense to return the address information all at once as the street or zip code is rarely queried on its own. If the address is part of some user or employee data it is more vague whether to return that information as part of the user or employee data or just as a link that should be queried on its own as part of a further request. What you return may also depend on the capabilities of the media-type client and your service agree upon (content-type negotiation).
If you implement something like a grouping for i.e. some football players and certain categories they belong to, such as their teams and whether they are offense or defense players, you might have a Team A resource that includes all of the players as embedded data. Within the DB you could have either an own table for teams and references to the respective player or the team could just be a column in the player table. We don't know and a client usually doesn't bother as well. From a design perspective you should however be aware of the benefits and consequences of including all the players at the same time in regards to providing links to the respective player or using a mixed approach of presenting some base data and a link to learn further details.
The latter approach is probably the most sensible way as this gives a client enough information to determine whether more detailed data is needed or not. If needed a simple GET request to the provided URI is enough, which might be served by a cache and thus never reach the actual server at all. The first approach has for sure the disadvantage that it doesn't reuse caching optimally and may return way more data then actually needed. The approach to include links only may not provide enough information forcing the client to perform a follow-up request to learn data about the team member. But as mentioned before, you as the service designer decide which URIs or queries are returned to the client and thus can design your system and data model accordingly.
In general what you do in a REST architecture is providing a client with choices. It is good practice to design the overall interaction flow as a state machine which is traversed through receiving requests and returning responses. As REST uses the same interaction model as the Web, it probably feels more natural to design the whole system as if you'd implement it for the Web and then apply the design to your REST system.
Whether controllers should contain business logic or not is primarily an opinionated question. As Jim Webber correctly stated, HTTP, which is the de-facto transport layer of REST, is an
application protocol whose application domain is the transfer of documents over a network. That is what HTTP does. It moves documents around. ... HTTP is an application protocol, but it is NOT YOUR application protocol.
He further points out that you have to narrow HTTP into a domain application protocol and trigger business activities as a side-effect of moving documents around the network. So, it's the side-effect of moving documents over the network that triggers your business logic. There is no straight rule whether to include business logic in your controller or not, but usually you try to keep the business logic in yet their own layer, i.e. as a service that you just invoke from within the controller. That allows to test the business logic without the need of the controller and thus without the need of a real HTTP request.
While this answer can't provide more detailed information, partly due to the broad nature of the question itself, I hope I could shed some light in what areas you should put in some thoughts and that your data model is not necessarily your resource or affordance model.
I have been reading about the articles on the web about the benefits of graphql but so far I have not been able to find a single benefit of it.
One of the most common benefits mentioned in those articles are below?
No Overfetching with GraphQL.
Reducing number of calls made from client side.
Data Load Control Granularity
Evolve your API without versions.
Those above all makes sense but it is not the graphql itself that provides these benefits. Any second layer api written in java/python or any other language would be able to provide this benefits too. It is basically introducing another layer of abstraction above the data retrieval systems, rest or whatever, and decoupling the client side from that layer. After you do that everything you can do with graphql can also be done with any other language too.
Anyone can implement a say scala server that retrieves the data from various api's integrates them, create objects internally and feeds the client with only the relevant part of the data with total control on the data. This api can be easily versioned and released accordingly. Considering the syntax of graphql and how cumbersome it is and difficulty of creating a good cache around it, I can't see why would you use it really.
So the overall question is there any benefits of graphql that is provided to the application because of the graphql itself and not because you implement another layer of abstraction between your applications and your api's?
Best practices known as REST existed earlier, too.
GraphQL is more standarized than REST, safer (no injections) and syntax gives great flexibility in the area of quickly changing client needs.
It's just a good standard of best practices.
I feel GrapgQL is another example of overengineering. I would say "Best standards and practices" are "Keeping It Simple."
Breaking down and object and building a custom one before sending it to the client is very basic.
At my company we've decided on a microservice architecture for a new project.
We've taken a look at GraphQL and realised its potential and advantages for using as our single API endpoint.
What we disagree on is how the communication should be done between GraphQL and each micro service. Some argue for REST, others say we should also have a graphQL endpoint for each service.
I was wondering what are some of the pros and cons of each.
For example, having everything in graphQL seems a bit redundant, as we'd be replicating parts of the schema in each service.
On the other hand, we're using GraphQL to avoid some REST pitfalls. We're afraid having REST endpoints will nullify the advantages gained from gQL.
Has anyone come across a similar dilemma?
None of us are experienced with GraphQL, so is there some obvious pro and con here that we might be missing?
Thanks in advance!
Great question! Sounds like you're asking how to set up your architecture for GraphQL and microservices, and why.
Background
I would recommend using GraphQL since it's best use case is to consolidate data sources in a clean way and expose all that data to you via one standardized API. On the flip side, one of the main problems with using microservices is that it's hard to wrangle all the different functions that you can possibly have. And as your application grows, it becomes a major problem with consolidating all these microservice functions.
The benefits of using these technologies are tremendous since now you essentially have a GraphQL API gateway that allows you to access your microservices from your client as if it were a single monolithic app, but you also get the many benefits of using microservices from a performance and efficiency standpoint.
Architecture
So the architecture I would recommend is to have a GraphQL proxy sitting in front of your microservices, and in your GraphQL query and mutation resolvers, call out to the function that you need to retrieve the necessary data.
It doesn't really matter all that much between having a GraphQL gateway in front of GraphQL microservices or a GraphQL gateway in front of REST endpoints, although I would actually argue that it would be simpler to expose your microservice functions as REST endpoints since each function should theoretically serve only one purpose. You won't need the extra overhead and complexities of GraphQL in this case since there shouldn't be too much relational logic going on behind the scenes.
If you're looking for microservice providers the best ones that I've seen are AWS Lambda, Webtask, Azure Functions, and Google Cloud Functions. And you can use Serverless as a way to manage and deploy these microservice functions.
For example:
import request from 'request';
// GraphQL resolver to get authors
const resolverMap = {
Query: {
author(obj, args, context, info) {
// GET request to fetch authors from my microservice
return request.get('https://example.com/my-authors-microservice');
},
},
};
GraphQL Service
This is something that we've been exploring at Scaphold as well in case you'd like to rely on a service to help you manage this workflow. We first provide a GraphQL backend service that helps you get started with GraphQL in a matter of minutes, and then allow you to append your own microservices (i.e. custom logic) to your GraphQL API as a composition of functions. It's essentially the most advanced webhook system that's gives you flexibility and control over how to call out to your microservices.
Feel free to also join the Serverless GraphQL Meetup in SF if you're in the area :)
Hope this helps!
My company has been using GraphQL in production for about a year. Maintaining the schemas in our "Platform API" and also in our microservices became arduous. Developers kept asking us why they needed to do double work and what the benefit was. Especially since we required in-depth code reviews to change/update the production GraphQL schema
Apollo GraphQL released schema stitching which has solved most of the problems we were having. Essentially individual microservices each maintain their own GraphQL endpoint, then our Node.js Platform API stitches them all together. The resulting API is a client developer's dream, and the backend developers get the level of autonomy about their code they're used to. I highly recommend trying schema stitching. We've been adopting it incrementally for a few months and it's been wonderful.
As an added benefit, while defining our sub-schemas we started decoupling certain microservices, instead relying on the stitched data extensions to fill in holes in objects. Feels like the missing piece in DDD
You are asking about how to use GraphQL in a microservice architecture. One approach you are considering is that all microservices are GraphQL. The other approach is using GraphQL as the API gateway and REST for the backend data APIs.
In a recent evaluation which includes load tests of Node based data API microservices, I concluded that Express (REST) was more efficient than Apollo (GraphQL). It turns out that the general purpose parsing and executing of GraphQL queries can be relatively expensive when compared to JSON parsing with specific, hand coded API handlers. In light of that discovery, I would suggest keeping the data APIs RESTful.
I'm working an a PHP/JS based project, where I like to introduce Domain Driven Design in the backend. I find that commands and queries are a better way to express my public domain, than CRUD, so I like to build my HTTP-based API following the CQS principle. It is not quiet CQRS since I want to use the same model for the command and query side, however many principles are the same. For API documentation I use Swagger.
I found an article which exposes CQRS through REST resources (https://www.infoq.com/articles/rest-api-on-cqrs). They use 5LMT to distinct commands, which Swagger does not support. Moreover, don't I loose the benefit of the intention-revealing interface which CQS provides by putting it into a resource-oriented REST API? I didn't find any articles or products which expose commands and queries directly through an HTTP-based backend.
So my question is: Is it a good idea to expose commands and queries directly through the API. It would look something like this:
POST /api/module1/command1
GET /api/module1/query1
...
It wouldn't be REST but I don't see how REST brings anything beneficial to the table. Maintaining REST resource would introduce yet another model. Moreover, having commands and queries in the URL would allow to use features like routing frameworks and access logs.
The commands and queries are an implementation detail. This is apparent from the fact that they wouldn't exist at all if you had chosen an alternative style.
A RESTful API usually (if done right) follows the conceptual domain model. The conceptual domain model is not an implementation detail, because it is in your users heads and is the a source for requirements for your system.
Thus, a RESTful API is usually much easier to understand, because the clients (developers) have to understand the conceptual domain model anyway, and RESTful interfaces follow the concepts of such a model. The same is not true for a queries and commands based API.
So we have a trade-off
You already identified the drawbacks of building a RESTful API around the commands and queries, and I pointed out the drawbacks of your suggestion. My advice would be the following:
If you're building an API that other teams or even customers consume, then go the RESTful way. It will be much easier for the clients to understand your API.
If, on the other hand, the API is only an internal one that is e.g. used by a JS front-end that your team builds, and you have no external clients on the API, then your suggestion of exposing the commands and queries directly can be short-cut that's worth the (above mentioned) drawbacks.
If you take the shortcut, be honest to yourself and acknowledge it as such. This means that as soon as your requirements change and you now have external clients, you should probably build a RESTful API.
don't I loose the benefit of the intention-revealing interface which
CQS provides by putting it into a resource-oriented REST API?
Intention revealing for whom? A client side programmer? A server side programmer? In the same team/org that maintains the domain model? Outside of that team/org? Someone on the internet who would access your API naively by just probing a starting URI with an OPTIONS request? Someone who would have access to the full API documentation with URIs and payloads structure?
REST is orthogonal to CQRS. In the end, no matter how you expose your resources on the web, domain notions will be reflected somewhere, whether in the URI, the payloads, the media types. I don't think using DDD or CQRS should influence the way you design your API that much.
We have a web2py application that we want to connect to an EmberJS client. The idea is to use the responsive capabilities of EmberJS to keep the client updated writing minimal code.
We have (REST) primitives which are in charge of creating / updating the underlying datastore (CouchDB). These primitives are sometimes complex and covering corner cases, involving the creation of several documents, connecting them, validating configuration parameters, ... This is implemented in the backend. We would like to avoid duplicating the full modelling of the data in our EmberJS application, and avoid duplicating the logic implemented by those primitives.
I have some questions:
does it make sense in EmberJS to just model a subset of the data in the documents? We would just create models for the small amount of properties that the user is able to interact with. The client would not see the full CouchDB documents, just the data necessary for display / interaction.
is it possible to connect EmberJS to a REST interface, without having to fully model the underlying data in the database?
does it make sense in EmberJS to just model a subset of the data in the documents?
Yes. There is no need to create ember models for objects/properties that user will not need to interact with.
is it possible to connect EmberJS to a REST interface, without having to fully model the underlying data in the database?
Definitely that is possible, it's a fairly common use case. The best way to get started is by building a small MVP that works with just couple of models. Once you've got that wired up it will be easy to add more domain objects.
The tricky part (especially at first) will be mapping your rest endpoints to the ember-data REST adapter. The adapter will work out-of-box with some REST endpoints - see the REST Adapter - but connecting a CouchDB datastore will probably require some customization. The tools for this are still evolving, have a look at ember-data integration tests to see what is available.