I'm developing a microservice (RESTful) project that uses Keycloak as its IAM. I could create the realm, client, users, etc. for authentication, but my concern is: should I manage users only in Keycloak, or should I also create my own user table in my microservice?
Should I manage users only on Keycloak or create my own user table in my microservice?
First you need to check what one can (and cannot) do with Keycloak regarding user management, compared with your current (and possible future) requirements. If it does not completely fulfill your requirements, then you can either extend Keycloak, adapt your requirements, or (probably the most straightforward solution) have your own user table in your microservice.
You might also want to create your own user table for performance reasons. Depending on how slow it is to access Keycloak in your setup, you might consider using that user table as a caching mechanism for quick access to user-related information.
The problem with having that user table is that, depending on which user information is stored in Keycloak and which in the user table, you might have to keep the two in sync. Moreover, if some information exists only in the user table and not in Keycloak, and you need that information in the tokens, you will have to think about how to handle such situations.
Personally, I would try to avoid creating the user table unless it is really necessary. So a complete answer to your question will most likely be highly dependent on your own needs.
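If you do end up with such a user table used as a cache, one common pattern is to provision or refresh the local row lazily whenever a request arrives with a valid token. A minimal sketch, assuming a JDBC `DataSource`, a PostgreSQL database and a hypothetical `local_user` table keyed by the token's `sub` claim:

```java
// Minimal sketch: lazily provision/refresh a local user row keyed by the
// Keycloak subject ("sub" claim). Table and column names are hypothetical.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

public class LocalUserCache {

    private final DataSource dataSource;

    public LocalUserCache(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Inserts the user on first sight, otherwise refreshes the cached fields. */
    public void upsert(String keycloakSubject, String email, String displayName) throws SQLException {
        String sql = """
            INSERT INTO local_user (keycloak_sub, email, display_name)
            VALUES (?, ?, ?)
            ON CONFLICT (keycloak_sub)
            DO UPDATE SET email = EXCLUDED.email, display_name = EXCLUDED.display_name
            """; // PostgreSQL upsert syntax; adapt for other databases
        try (Connection c = dataSource.getConnection();
             PreparedStatement ps = c.prepareStatement(sql)) {
            ps.setString(1, keycloakSubject);
            ps.setString(2, email);
            ps.setString(3, displayName);
            ps.executeUpdate();
        }
    }
}
```

Keycloak remains the source of truth here; the local row only mirrors the token's claims, which keeps the synchronization problem mentioned above to a minimum.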
Related
I have an existing API connected to an AWS PostgreSQL database that uses AWS Cognito for User authentication.
The goal is for users to insert data via the API, with some field mapped to their Cognito id, and to retrieve the same data. The idea is for each user to only have access to the data 'owned' by them, similar to the way row-level security works.
But I do not want to create a role for each user which seems to be necessary.
The idea would be that I somehow set up a connection to the PostgreSQL DB with the user_id without creating a user, and handle the accessible data via a policy, or somehow pass the data to the policy directly.
What would be an ideal way to do this, or is creating a PG user for each user a necessity for this setup?
Thanks in advance
EDIT: I am currently querying the database through my backend with custom code. But I would rather have a system where, instead of writing the code myself, PostgreSQL handles the security itself using policies (or something similar). I fully understand how PostgreSQL row-level security works with roles and policies, and I would prefer a system where PostgreSQL does the major work without me implementing custom back-end logic, and preferably without creating thousands of PostgreSQL roles for the users.
You should not allow users to make a direct connection to the database.
Instead, they should make requests to your back-end, where you have business logic that determines what each user is permitted to access. Your back-end then makes the appropriate calls to the database and returns the response to the user.
This is a much safer approach because it prevents users from having direct access to your database, and it is also a better architecture because it allows you to swap out the database engine for another one without impacting your service.
The database is for your application, not for your users.
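For illustration, a minimal sketch of that back-end pattern, assuming the Cognito `sub` claim has already been extracted from a verified JWT, and a hypothetical `items` table with an `owner_sub` column:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

public class ItemRepository {

    private final DataSource dataSource;

    public ItemRepository(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Reads only ever return rows owned by the authenticated user. */
    public List<String> findOwnedItems(String cognitoSub) throws SQLException {
        String sql = "SELECT payload FROM items WHERE owner_sub = ?";
        List<String> result = new ArrayList<>();
        try (Connection c = dataSource.getConnection();
             PreparedStatement ps = c.prepareStatement(sql)) {
            ps.setString(1, cognitoSub);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    result.add(rs.getString("payload"));
                }
            }
        }
        return result;
    }

    /** Writes are stamped with the caller's Cognito id, never a client-supplied value. */
    public void insertItem(String cognitoSub, String payload) throws SQLException {
        String sql = "INSERT INTO items (owner_sub, payload) VALUES (?, ?)";
        try (Connection c = dataSource.getConnection();
             PreparedStatement ps = c.prepareStatement(sql)) {
            ps.setString(1, cognitoSub);
            ps.setString(2, payload);
            ps.executeUpdate();
        }
    }
}
```

The database itself only ever sees a single application role; the "row-level" restriction lives in the back-end, which never trusts an owner id supplied in the request body.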
We already have a DB with users.
Do we have to migrate all records to the Keycloak DB, or can we just implement the User Storage SPI?
We don't want to migrate the records, because we also have to keep supporting the old DB; this brings problems because we would need to synchronize the two databases.
Can you please describe what the problems with this approach could be, and give your advice on how to resolve them?
USER DATA SOURCES
Moving to a system such as Keycloak will require an architectural decision on how to manage user fields. Some user fields will need migrating to an identity database managed by Keycloak. Applications can then receive updates to these fields within tokens.
KEYCLOAK DATA
Keycloak will expect to have its own user account storage, and this is where each user's subject claim will originate from. If a new user signs up, the user will be created here before being created in your business data.
Keycloak's user data will include fields such as name and email, since those are used in flows such as forgot password. You can keep most other user fields in your business data if you prefer.
So to summarize, a migration will be needed, but you don't have to migrate all user fields.
BUSINESS DATA
This may include other user fields that you want to keep where they are, but also include in access tokens and use for authorization in APIs. Examples are values like roles, permissions, tenant ID, partner ID and subscription level.
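As an illustration of using such claims in an API, here is a minimal sketch with the auth0 `java-jwt` library, assuming the access token carries hypothetical `tenant_id` and `roles` claims and that the issuer's RSA public key has already been obtained (e.g. from its JWKS endpoint):

```java
import java.security.interfaces.RSAPublicKey;
import java.util.List;

import com.auth0.jwt.JWT;
import com.auth0.jwt.JWTVerifier;
import com.auth0.jwt.algorithms.Algorithm;
import com.auth0.jwt.interfaces.DecodedJWT;

public class AccessTokenAuthorizer {

    private final JWTVerifier verifier;

    public AccessTokenAuthorizer(RSAPublicKey issuerKey, String expectedIssuer) {
        // Signature and issuer are verified; an invalid token makes verify() throw.
        this.verifier = JWT.require(Algorithm.RSA256(issuerKey, null))
                .withIssuer(expectedIssuer)
                .build();
    }

    /** Returns true if the verified token grants the given role within the given tenant. */
    public boolean canAccessTenant(String accessToken, String tenantId, String requiredRole) {
        DecodedJWT jwt = verifier.verify(accessToken);
        String tokenTenant = jwt.getClaim("tenant_id").asString();        // hypothetical claim
        List<String> roles = jwt.getClaim("roles").asList(String.class);  // hypothetical claim
        return tenantId.equals(tokenTenant) && roles != null && roles.contains(requiredRole);
    }
}
```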
DESIGN STEPS
My recent blog post walks through some examples and suggests a way to think through your end-to-end flows. There are a couple of different user data scenarios mentioned there.
It is worth spending a day or two sketching out how you want your system to work, in particular how your APIs will authorize requests and how you will manage both existing and new users. This avoids finding expensive problems later.
Consider the following scenario: you have an SSO service (let's say Keycloak) and X applications that have their own databases, where somewhere in each database you're referencing a user_id. How do you handle this? How do you satisfy the foreign key constraint problem? Should one synchronise Keycloak and the applications? How? What are some best practices? What are some experiences?
I've been using Keycloak for several years, and in my experience there are several scenarios regarding synchronizing user data between Keycloak and your application's database:
Your application is the owner of the user data.
Keycloak is only used for authentication/authorization purposes. In this scenario, your application creates/updates a Keycloak user using the Admin REST API when needed (a minimal sketch of such a call appears at the end of this answer).
Keycloak is the owner of the user data and you don't need more info than the userid in your database.
In this scenario everything regarding users could be managed by Keycloak (registration, user account parameters, even resource sharing using the authorization services).
Users would be referenced by userid in the database when needed.
NB: You can easily add custom data to the user in Keycloak using user attributes, but one interesting possibility is to extend the user model directly using this: https://www.keycloak.org/docs/latest/server_development/index.html#_extensions_jpa
Keycloak is the owner of the user data and you need more than just the user id (email, first name, etc.).
If performance is not an issue, you could retrieve user info via the Admin REST API when needed.
If performance is an issue, you'll need a copy of Keycloak's user data in your app's database, and you would want that copy to be updated on every user change.
To do that, you could implement callbacks in Keycloak (using SPIs: https://www.keycloak.org/docs/latest/server_development/index.html#_events) that will notify your application when a user is created/updated (see the event listener sketch at the end of this answer).
NB: You could also use a Change Data Capture tool (like Debezium: https://debezium.io/) to synchronize Keycloak's database with yours.
There are pros and cons to each scenario; you'll have to choose the one that best suits your needs :)
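For the first scenario, here is a minimal sketch of creating a Keycloak user through the Admin REST API using the JDK's `HttpClient`. The base URL and realm name are placeholders, and obtaining the admin access token (e.g. via the client credentials grant of a service account) is left out:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class KeycloakUserCreator {

    private final HttpClient http = HttpClient.newHttpClient();
    private final String baseUrl; // e.g. "https://keycloak.example.com" (older Keycloak versions add "/auth")
    private final String realm;   // e.g. "myrealm"

    public KeycloakUserCreator(String baseUrl, String realm) {
        this.baseUrl = baseUrl;
        this.realm = realm;
    }

    /** Creates a user; on success Keycloak returns 201 with the new user's id in the Location header. */
    public String createUser(String adminAccessToken, String username, String email) throws Exception {
        // Naive JSON building for the sake of the sketch; use a JSON library in real code.
        String body = "{\"username\": \"" + username + "\", \"email\": \"" + email + "\", \"enabled\": true}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/admin/realms/" + realm + "/users"))
                .header("Authorization", "Bearer " + adminAccessToken)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 201) {
            throw new IllegalStateException("User creation failed: HTTP " + response.statusCode());
        }
        String location = response.headers().firstValue("Location").orElseThrow();
        return location.substring(location.lastIndexOf('/') + 1);
    }
}
```

For the third scenario, a rough sketch of an event listener SPI that pushes user changes to your application. `notifyMyApplication` is a placeholder for whatever transport you use (HTTP call, message queue, ...), and the matching `EventListenerProviderFactory` plus the `META-INF/services` registration are omitted for brevity:

```java
import org.keycloak.events.Event;
import org.keycloak.events.EventListenerProvider;
import org.keycloak.events.EventType;
import org.keycloak.events.admin.AdminEvent;

public class UserSyncEventListener implements EventListenerProvider {

    @Override
    public void onEvent(Event event) {
        // Fired for end-user actions such as self-registration or profile updates.
        if (event.getType() == EventType.REGISTER || event.getType() == EventType.UPDATE_PROFILE) {
            notifyMyApplication(event.getRealmId(), event.getUserId());
        }
    }

    @Override
    public void onEvent(AdminEvent adminEvent, boolean includeRepresentation) {
        // Fired for changes made through the admin console or the Admin REST API.
        notifyMyApplication(adminEvent.getRealmId(), adminEvent.getResourcePath());
    }

    @Override
    public void close() {
        // Nothing to clean up in this sketch.
    }

    private void notifyMyApplication(String realmId, String userReference) {
        // Placeholder: call your application's API or publish to a queue here.
    }
}
```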
The problem to face lies in the design of a RESTful API that can manage requests from multiple roles in an RBAC-based solution.
Currently we have different resources that can be accessed by different users, who can have one or more roles grouped according to their privileges.
The API we're trying to define must be as clear as possible to the client, but without the overhead of adding additional metadata to the URL that could damage or even conflict with REST practices and definitions. Therefore, we must avoid at all costs including information about the roles inside the URL. The plan is to use JWT tokens that carry in their payloads the info needed to know which permissions the user making the request has.
Having raised our current situation, let's provide an example and state the problem to solve:
Suppose we have *financiers* and *providers* as users with some roles, and both want to access **attentions** (our resource). Should we add, before the resource **attentions**, information about the *user* who's trying to access the resource?
The endpoints in that case should be defined (as an example) as:
https://example.com/api/v1/financiers/:id/attentions
https://example.com/api/v1/providers/:id/attentions
This way we're attempting to inform the respective controllers that we want the **attentions** for that specific role/user, which are, in some way, a sub-resource of them.
On the other hand, we could simply implement a much simpler endpoint as follows:
https://example.com/api/v1/attentions
The logic about which attentions to return from the database would now be implemented in a single method that must handle these two roles (and potentially new ones that could come up in future features). All the information needed would be obtained from the token's payload, exposing a much more generic API and freeing the web client from the responsibility of choosing which endpoint to call depending on the role.
I want to highlight that the attentions are managed in a microservices architecture and, hence, the logic to retrieve them is gathered in a single service. The cost of having the API gateway route the two (and potentially more) endpoints from the first solution is a variable not to be dismissed in our specific situation.
Having exposed our current situation:
Which would be the best approach to handle this issue?
Is there another alternative not contemplated that could ease the role management and provide a clean API to expose to the client?
In the second solution, is it correct to return only the attentions accessible to that specific user based on the roles they have? Isn't it counterintuitive to access an endpoint and get only some of the resources from that collection (and not all), based on one's role?
I hope that someone can clarify the approach we're taking, as I have found little to no literature regarding this issue.
There are multiple solutions for this kind of filtering, and the developer has to select one depending on the given situation.
From my experience, I can list the following.
Structure
When data can't be accessed directly and the developer has to use a relation (i.e. a table JOIN), the URL has to include both the main and sub entities. Before going with this approach, a good check is to ask whether the same URL can be used with POST.
Example
If we have to fetch the list of roles assigned to a specific user, or want to assign additional roles, then we can use:
GET users/:uid/roles
POST users/:uid/roles
Security
In multi-tenant systems each user can have his/her own private resources, i.e. other users are prohibited from accessing those resources. The developer should store the tenancy information and filter the resources according to the current authentication, without bothering the client or requiring any additional info in the URL (see the sketch after this example).
Example
Photo album of the user
GET photos
POST photos
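For illustration, a minimal JAX-RS sketch of that idea: the owner is taken from the authenticated principal, never from the URL or the request body. The `Photo` and `PhotoRepository` types are hypothetical stand-ins for your own model and data access:

```java
import java.util.List;

import jakarta.ws.rs.GET;
import jakarta.ws.rs.POST;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.core.Context;
import jakarta.ws.rs.core.SecurityContext;

@Path("/photos")
public class PhotosResource {

    private final PhotoRepository photos;

    public PhotosResource(PhotoRepository photos) {
        this.photos = photos;
    }

    @GET
    public List<Photo> listOwnPhotos(@Context SecurityContext security) {
        // The current user comes from the authentication layer, not from the URL.
        String ownerId = security.getUserPrincipal().getName();
        return photos.findByOwner(ownerId);
    }

    @POST
    public Photo upload(Photo photo, @Context SecurityContext security) {
        String ownerId = security.getUserPrincipal().getName();
        return photos.save(ownerId, photo);
    }
}

record Photo(String id, String title) {}

interface PhotoRepository {
    List<Photo> findByOwner(String ownerId);
    Photo save(String ownerId, Photo photo);
}
```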
Search
If the filtering is not security or structure related, but the client still wants to filter the result set for his scenario, then the developer should use the query string for the filtering.
Example
The client has to fetch messages from his/her inbox or outbox, wants only the messages that have not yet been read, or wants to search his/her inbox:
GET messages?folder=inbox
GET messages?folder=inbox&status=unread
GET messages?search=nasir
I'm building a service which can be used anonymously; however, the user has the ability to share content on his/her Facebook and/or Twitter profiles. Upon authorizing the applications, I wish to store basic information about the users and link it to the content they are sharing.
Usually services require authentication prior to usage, which solves this problem; however, in my case authentication comes at the very last stage, and it's split into 4 paths:
[Facebook + Twitter]
[Facebook alone]
[Twitter alone]
[Nothing]
However, doing the above will create redundant data in the database, i.e. I will have the Facebook information and Twitter information in separate tables with no linkage between them and no relation to the post.
What's the best approach to prevent this? Is the solution at the data modeling level? At the code level? Or both?
Has this been done before?
I have created a flow chart of how the merging of account data can be done; however, this process might create overhead at the database level, as it will require searching for entries using the very long FacebookID / TwitterID.
If extra information is required please state it in a comment.
Thank you
The way I would handle this is to separate the concept of user identity from the concept of authentication used by your application. For example, at the data model level, have a Users table store basic user information and an Authentications table store user credentials/tokens associated with a particular authentication provider.
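A rough sketch of that model as JPA entities (field and table names are just placeholders); the important part is the unique (provider, provider_uid) pair pointing at a single user:

```java
import jakarta.persistence.Column;
import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.Id;
import jakarta.persistence.ManyToOne;
import jakarta.persistence.Table;
import jakarta.persistence.UniqueConstraint;

@Entity
@Table(name = "users")
class User {
    @Id @GeneratedValue
    Long id;

    String displayName; // basic identity info, independent of any provider
}

// One row per linked provider account (facebook, twitter, ...).
@Entity
@Table(name = "authentications",
       uniqueConstraints = @UniqueConstraint(columnNames = {"provider", "provider_uid"}))
class Authentication {
    @Id @GeneratedValue
    Long id;

    @ManyToOne
    User user; // several authentications can point at the same user

    @Column(name = "provider")
    String provider;     // e.g. "facebook" or "twitter"

    @Column(name = "provider_uid")
    String providerUid;  // the provider's own user id

    @Column(name = "access_token")
    String accessToken;  // provider credentials/tokens
}
```

Content then references `users.id`, so it does not matter whether a user arrived via Facebook, Twitter, or both, and merging two provisional accounts becomes a matter of repointing their Authentications (and content) rows at the surviving user.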
At the code level, if you are planning to stick with third-party authentication, I would recommend looking into building a layer that can shield your application from having to deal directly with various OAuth providers.
In the Ruby/Rails world, this is accomplished by a combination of Devise, which manages user identities (it also allows you to have built-in username/password authentication, but it does not sound like you are interested in that), and OmniAuth, which delivers authentication against multiple providers.
An example application incorporating both is available here: Devise + OmniAuth.
Finally, a RailsCast on the subject is here: OmniAuth Part 1.
I realize that you may not be working in Ruby/Rails, but these materials may provide you with inspiration for the architecture you are trying to achieve.