What is the best way to organize OpenAPI code generated clients? [closed]

OpenAPI client code generation allows for two use-cases in the case of in-house clients:
Service clients:
The generated client code is tied to the OpenAPI specification of a specific service, covers that service's entire API, and is controlled and maintained (as much as generated code needs to be) by the same developers who own the service code. There is no code duplication.
Usage-dependent clients:
Each use-case generates its own client, based on the endpoints that particular use-case consumes. You get smaller clients that cover only the endpoints in use, and the people responsible for maintaining those clients are the same people responsible for the specific use-case. There can be code duplication between different generated clients.
If I compare the two options, I see the following:
+-----------------------+------------------------------+------------------------+
| Criteria              | Service-based clients        | Use-case-based clients |
+-----------------------+------------------------------+------------------------+
| Ownership             | Service maintainers          | Use-case maintainers   |
| Code duplication      | No duplication               | A lot of duplication   |
| Client dependency     | Dependency exists            | No dependency          |
| Client code stability | When upgrading client        | When updating client   |
| Base URL              | Single per client            | Per endpoint (group)   |
| Language/Tech stack   | Depends on service           | Depends on client      |
+-----------------------+------------------------------+------------------------+
In the case of outside use of the API, those same scenarios also exist: if we release only the OpenAPI specification, we are supporting use-case-based clients, and if we release a client library, we are supporting service-based clients (they are not mutually exclusive).
In my case I am talking about in-house usage, which means that knowledge of the division between services is shared between the service providers and the service users.
Does anyone have good arguments I have missed regarding either of these two approaches, or a third approach I have overlooked?
Right now I am leaning towards use-case-based clients for in-house use, since that will allow us to treat the generated code as a completely transparent layer, leaving only the OpenAPI specs as the layer of communication between teams.

Related

Manually spawn stateful pod instances

I'm working on a project where I need to spawn 1 instance per user (customer).
I figured it makes sense to create some sort of manager to handle that and host it somewhere. Kubernetes seems like a good choice since it can be hosted virtually anywhere and it will automate a lot of things (e.g. ensuring instances keep running on failure).
All entities are in Python and have a corresponding Flask API.
                         InstanceManager           Instance (user1)
                          .-----------.             .--------.
POST /instances/user3 -->|            |------------>|        |---vol1
                         |            |             '--------'
                         |            |-----.
                          '...........'      \     Instance (user2)
                                              \     .--------.
                                               '--> |        |---vol2
                                                    '--------'
Now I can't seem to figure out how to translate this into Kubernetes.
My thinking:
Instance is a StatefulSet since I want the data to be maintained through restarts.
InstanceManager is a Service with a database attached to track user to instance IP (for health checks, etc).
I'm pretty lost on how to make InstanceManager spawn a new instance on an incoming POST request. I did a lot of digging (Operators, namespaces, etc.) but nothing seems straightforward. Namely, I don't even seem to be able to do that via kubectl. Am I thinking totally wrong about how Kubernetes works?
I've made some progress and thought I'd share.
Essentially you need to interact with the Kubernetes REST API directly instead of applying a static YAML file or using kubectl, ideally with one of the numerous client libraries out there.
In our case there are two options:
Create a namespace per user and then a service in that namespace
Create a new service with a unique name for each user
The first approach seems more sensible since using namespaces gives a lot of other benefits (network control, resource allocation, etc.).
The service itself can point to a StatefulSet or a pod, depending on the situation.
There's another gotcha (and possibly more): namespaces, pod names, etc. all need to conform to RFC 1123. So for namespaces you can't simply use email addresses or even base64; you'll need to use something like user-100 and keep a mapping table to get back to the actual user.
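To make the first approach concrete, here is a minimal sketch using the official Kubernetes Python client (the question's stack is Python). The user-<n> naming scheme and the function name are my assumptions, not something from the question:

```python
# Minimal sketch, assuming the official Kubernetes Python client
# (pip install kubernetes). The user-<n> naming scheme is an
# assumption chosen to satisfy RFC 1123 label rules.
from kubernetes import client, config

def create_user_namespace(user_index: int) -> str:
    config.load_kube_config()  # use load_incluster_config() when running in-cluster
    # RFC 1123 labels: lowercase alphanumerics and "-", max 63 chars,
    # so map users to synthetic names and keep a lookup table elsewhere.
    name = f"user-{user_index}"
    ns = client.V1Namespace(metadata=client.V1ObjectMeta(name=name))
    client.CoreV1Api().create_namespace(ns)
    return name
```

The InstanceManager's POST handler would call something like this first, then create the StatefulSet and Service inside the new namespace the same way, via AppsV1Api and CoreV1Api.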

How to sync values between redis and PostgreSQL table rows?

I'm building an API product, where users can use multiple APIs (I give them an API key).
I want to monitor the API calls each user makes, so I know how many calls they've made (and, by extension, how much to charge them).
Here's my table: user_apis
| API Name | User ID | Total API calls | Successful API calls | Failed API Calls |
|----------|---------|-----------------|----------------------|------------------|
| cat_data | 1 | 15 | 10 | 5 |
| dog_data | 1 | 3 | 3 | 0 |
| cat_data | 2 | 1 | 0 | 0 |
A user can use different types of APIs, say User 1 uses 2 APIs cat_data and dog_data.
Now, when I'm handling the routes that correspond to cat_data and dog_data APIs, I need to quickly monitor the API call in some kind of middleware.
If User 1's API key hits the endpoint for cat_data, I currently count that hit in Redis in a very crude format: key=[user_api_key_total_calls], value=[number_of_calls+1]. I've built a sort of API gateway middleware through which every request passes and gets tracked in Redis.
I need to use redis (or some in memory store) because these APIs might see very high usage (>100 req/sec), so I cannot make a read + write call to a database on every API endpoint hit.
That brings us to the question: How do I make sure I maintain the integrity of this data in the database?
If I don't write to database every time (say, I choose to write after every 1000 calls) and if the redis instance goes down, then that data is lost! How do I avoid this?
There are multiple solutions for this:
First of all, you can persist the Redis data to avoid data loss; Redis's reliability is not inferior to PostgreSQL or other databases.
For syncing the database and Redis, you can update all the database rows in one query, say once per 15 seconds. But to build that query you need to know which user IDs were active since the last sync; for this you can save the list of active user IDs in a Redis set.
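A rough sketch of that approach, using redis-py and psycopg2; the key names, the active_users set, and the snake_case user_apis column names are assumptions for illustration:

```python
# Sketch only: per-request counting in Redis plus a periodic flush to
# PostgreSQL. Key layout and column names are assumptions.
import redis
import psycopg2

r = redis.Redis()

def track_call(user_id: int, api_name: str, ok: bool) -> None:
    # Called from the gateway middleware on every request.
    pipe = r.pipeline()
    pipe.hincrby(f"calls:{user_id}:{api_name}", "total", 1)
    pipe.hincrby(f"calls:{user_id}:{api_name}", "success" if ok else "failed", 1)
    pipe.sadd("active_users", f"{user_id}:{api_name}")  # touched since last sync
    pipe.execute()

def sync_to_postgres(conn) -> None:
    # Run every ~15 seconds; flush only counters touched since the last sync.
    # Redis holds the running totals, so we overwrite the DB values.
    for member in r.spop("active_users", r.scard("active_users")) or []:
        user_id, api_name = member.decode().split(":")
        counts = {k.decode(): int(v)
                  for k, v in r.hgetall(f"calls:{user_id}:{api_name}").items()}
        with conn.cursor() as cur:
            cur.execute(
                """UPDATE user_apis
                   SET total_api_calls = %s,
                       successful_api_calls = %s,
                       failed_api_calls = %s
                   WHERE user_id = %s AND api_name = %s""",
                (counts.get("total", 0), counts.get("success", 0),
                 counts.get("failed", 0), user_id, api_name),
            )
    conn.commit()
```

Combined with Redis AOF persistence, the worst case on a crash is losing the last few seconds of counter updates rather than everything since the previous database write.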

How to design REST API for SaaS with multiple "companies" per account

I am currently working on a SaaS for companies to manage their business data (employees, invoices, orders, products, ...). The current API design is as follows:
GET /employees?limit=10&offset=0
GET /employees/ID
POST /employees
and so on, for every model. In addition you can apply more filters with query parameters.
Until now I checked which company the logged-in account belongs to. However, now I want an account to be able to be a "member" of multiple organisations. E.g. if the company using the platform hires an "expert", they should be able to grant his account access (make him a member).
The question: how should I implement this in the API design? I've come up with three solutions, but don't really know which one is best practice.
Solution 1:
GET /ORGANISATION-ID/employees?limit=10&offset=0
Solution 2:
GET /employees?limit=10&offset=0&organisationId=ORGANISATION-ID
Solution 3:
The URI stays the same, but a header is set:
| Header name    | Value              |
|----------------|--------------------|
| Authentication | Bearer TOKEN       |
| Organisation   | ID ORGANISATION-ID |
| ...            | ...                |
Note: The Authentication header is always set.
I personally think solution number 3 is the most elegant, but I am not sure whether it's inappropriate to use headers for this. Solution 2 is confusing, I think, and solution 1 would cause all endpoints to start with the organisation id, which isn't very nice.
Generally I find that the best way to handle this is for every path in your API to represent a single resource.
To me that means that, in your case, everything should be namespaced under an organization:
https://api.example.org/org/[orgid]/employees
This way it's very obvious for a member of multiple organizations that there are multiple employee lists.
A similar public example might be GitHub: everything in GitHub is either (a) namespaced under a user, (b) namespaced under an organization, or (c) a top-level GitHub endpoint.
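As a sketch of what the org-scoped layout can look like in code (Flask here, since it is common in this stack; the is_member helper and the account_id lookup are hypothetical):

```python
# Minimal Flask sketch of org-scoped routing: every path is namespaced
# under the organisation, and membership is checked before any data is
# returned. is_member is a hypothetical stand-in for a real query.
from flask import Flask, abort, g

app = Flask(__name__)

def is_member(account_id: str, org_id: str) -> bool:
    return True  # placeholder membership lookup

@app.get("/org/<org_id>/employees")
def list_employees(org_id: str):
    account_id = g.get("account_id")  # assumed to be set by auth middleware
    if not is_member(account_id, org_id):
        abort(403)
    # Scope the query to the organisation taken from the path.
    return {"organisation": org_id, "employees": []}
```

With this shape, the organisation is an explicit part of every resource's identity, so two members of different organisations can never accidentally share a URL that resolves to the same employee list.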

what is a proper api uri version format? [closed]

I inherited an API in development from a previous developer which uses the following URI pattern for versioning:
http://localhost:8080/api/v1/users/
Is this the proper format for API versioning? For example, if a client wants v2 of the User API, then the following URI format would appear to indicate that a v2 of the API exists for users, but not necessarily that a v2 of the API exists for all entities:
http://localhost:8080/api/users/v2
So it seems like the second URI would be more granular. Is one of these patterns more correct or more commonly used than the other?
I'd use the former: /api/v1/users/ because:
If the new version of the API is exposed as a separate instance of a web application, then configuring site-wide routing is a lot easier: just route all requests under /api/v1/... to the old version and /api/v2/... to the new version. If the v2 is at the end of the URI, configuring routing becomes a pain.
The new version of the API may render higher path segments obsolete; what if "users" were replaced with "identities", for example? Then the "v2" suffix becomes meaningless.
Resource paths are hierarchies, and a version indicator after a resource should refer to a version of that resource. E.g. if you wanted the 2nd snapshot of User 123's state, that might be /api/users/123/v2, and the 300th snapshot /api/users/123/v300; putting the API version information there makes this unclear.
Resource paths should read from left to right in "least-to-most specific" order. The API version at the end messes up that logical order.
Similarly, it makes sense that /api/v1/users represents "Users under Version 1 of the API" - instead of /api/users/v1 in which case it becomes unclear ("Version 1 users of the API"? "A user named 'v1' under the API?", etc).
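For illustration, here is how the /api/v1-vs-/api/v2 routing could look with Flask blueprints; the handlers are placeholders, but the point is that each version is mounted in exactly one place:

```python
# Sketch: two API versions mounted under /api/v1 and /api/v2 via
# Flask blueprints, so version routing lives in one spot.
from flask import Flask, Blueprint

v1 = Blueprint("v1", __name__)
v2 = Blueprint("v2", __name__)

@v1.get("/users/<int:user_id>")
def get_user_v1(user_id: int):
    return {"id": user_id, "schema": "v1"}

@v2.get("/users/<int:user_id>")
def get_user_v2(user_id: int):
    return {"id": user_id, "schema": "v2"}

app = Flask(__name__)
app.register_blueprint(v1, url_prefix="/api/v1")
app.register_blueprint(v2, url_prefix="/api/v2")
```

If v2 later lives in a separate deployment, the same prefix rule moves unchanged into the reverse proxy's configuration.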
There are other approaches to consider: if your versioning is only concerned with different schema versions of your DTOs, while the underlying data and business logic remain the same, then you could let clients specify the DTO schema version they want in a querystring value:
GET /api/users/123?dtoVersion=1
GET /api/users/123?dtoVersion=2

REST API: Should we use PUT or DELETE to update resource partially?

We have a website like Pluralsight where authors and customers register. Authors publish their courses and customers can rate those courses. The table structure looks like this:
author table (basic information about the author; one-to-one):
| authorId | name  | contact    | email          | rating |
|----------|-------|------------|----------------|--------|
| 1        | sahil | 9971343992 | shaf#gmail.com | 3.2    |
authorRating (ratings given to an author by customers; one-to-many):
| Id | authorId | customerId | rating |
|----|----------|------------|--------|
| 1  | 1        | 101        | 2.7    |
| 2  | 1        | 201        | 3.7    |
The rating in the author table gets updated whenever a record is inserted/updated/deleted in the authorRating table; a complex algorithm finalizes the rating in the author table based on the authorRating records.
We have created following APIs for that:
PUT api/author/1/rating: if there's any change in the authorRating table, we recompute the rating of that author and call this API to pass the new rating. It accepts a rating and adds/updates it in the author table. If the author table doesn't have id=1, it returns a validation error.
DELETE api/author/1/rating: this removes the rating for author id=1, i.e. sets it to NULL. If the author table doesn't have id=1, it returns a validation error.
Is this the right API design? Or should we expose only the PUT API and, if clients send the rating as null, set it to null in the author table?
OR should we consider using PATCH here?
Since you are modifying only a field of one structure, I think PATCH fits better here, but it should be sent to the parent resource:
PATCH api/author/1
For these rating operations, I would use something like:
To insert a new rating for author 1, use POST /api/author/1/rating
To update a rating for author 1, use PATCH /api/author/1/rating. The authorRating table may hold data that you don't want to change (like the author and customer ids), and you are only updating some fields, in this case the rating.
To delete author's 1 rating, DELETE /api/author/1/rating, as you explained, makes sense.
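A minimal sketch of those two endpoints in Flask; author_exists and set_author_rating are hypothetical stand-ins for the real persistence layer, and 404 stands in for the question's "validation error":

```python
# Hedged sketch of the PATCH/DELETE rating endpoints described above.
# author_exists and set_author_rating are hypothetical placeholders.
from flask import Flask, abort, request

app = Flask(__name__)

def author_exists(author_id: int) -> bool:
    return True  # hypothetical lookup against the author table

def set_author_rating(author_id: int, rating) -> None:
    pass  # hypothetical write to author.rating (None maps to NULL)

@app.patch("/api/author/<int:author_id>/rating")
def update_rating(author_id: int):
    if not author_exists(author_id):
        abort(404)
    rating = request.get_json()["rating"]
    set_author_rating(author_id, rating)
    return {"authorId": author_id, "rating": rating}

@app.delete("/api/author/<int:author_id>/rating")
def delete_rating(author_id: int):
    if not author_exists(author_id):
        abort(404)
    set_author_rating(author_id, None)  # sets the column to NULL
    return "", 204
```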
It is common practice to use the POST method for RESTful APIs. You can post virtually any message and handle it by its parameters: you can POST a delete, an update, or another command, depending on your needs.
HTTP is a protocol which defines methods that allow you to manipulate resources, i.e. files or data on the internet, so to say. Any business logic triggered by one of the invoked operations is more or less just a side effect of manipulating these resources. While certain things can be achieved in multiple ways, the operations differ (slightly) in the semantics they convey.
PUT as specified in RFC 7231, which replaces the current representation with the one provided in the request, does state the following about partial updates:
Partial content updates are possible by targeting a separately identified resource with state that overlaps a portion of the larger resource, or by using a different method that has been specifically defined for partial updates (for example, the PATCH method defined in RFC5789).
So you have either the option of "overlapping" resources, where updating the other resource has the effect of also changing the overlapping data and thus the data in the actual resource as well, or of using PATCH.
The former can be thought of as certain information of another resource being embedded in the actual resource, so that updating the other resource also changes the state of the actual resource as a consequence. Think of a user and their address, for example.
According to Roy Fielding, who wrote the following in his dissertation:
The key abstraction of information in REST is a resource. Any information that can be named can be a resource: a document or image, a temporal service (e.g. "today's weather in Los Angeles"), a collection of other resources, a non-virtual object (e.g. a person), and so on. In other words, any concept that might be the target of an author's hypertext reference must fit within the definition of a resource. A resource is a conceptual mapping to a set of entities, not the entity that corresponds to the mapping at any particular point in time.
resources should be named and referenced via their own unique identifiers. A direct mapping of entities to resources, though, is often not desirable, as a resource can and probably should contain more information for the client to take further actions.
It is therefore up to you whether you consider a rating to be a good entity and/or a good resource. I'm not a big fan of it myself, though that is an opinionated position.
DELETE has some special semantics. It does not actually guarantee the removal of a file; rather, it makes the resource unavailable by removing the association (the URI) to that particular resource. What happens to the deleted resource is up to the implementation.
The DELETE method requests that the origin server remove the association between the target resource and its current functionality. In effect, this method is similar to the rm command in UNIX: it expresses a deletion operation on the URI mapping of the origin server rather than an expectation that the previously associated information be deleted.
...
If the target resource has one or more current representations, they might or might not be destroyed by the origin server, and the associated storage might or might not be reclaimed, depending entirely on the nature of the resource and its implementation by the origin server (which are beyond the scope of this specification). ... In general, it is assumed that the origin server will only allow DELETE on resources for which it has a prescribed mechanism for accomplishing the deletion.
Usually DELETE should only be used on resources that were previously created via PUT or POST and whose creation was confirmed via a Location response header.
With that being said, you asked whether this is the right API design or not. Actually, there is no right or wrong, and which stance you take is primarily an opinionated one. As long as you stay within the bounds of the HTTP protocol specification (in your particular case), you don't violate REST architectural principles. If you design your rating resources in a way that makes them uniquely identifiable, you can use DELETE to unreference the respective rating from the author (and probably delete the data from your DB), or send a PUT request with the new content of that rating resource to the respective endpoint.
Keep in mind, though, that the server should do its best to teach the client which actions it can take next, without requiring the client to have out-of-band information about your API; otherwise you will couple the client to your API and may cause problems when you change it in the future.