Document databases that support REST-style JSON over HTTP access seem ideal for supporting AJAX-rich applications where the browser is making direct calls to the database, bypassing the traditional web server / application logic components. An example of this might be retrieving user preferences once a user has been authenticated. (BBC Homepage might be a good example of this, prior to crashing under the load!)
The problem with this scenario is security: if a user is authenticated by a web server (e.g. basic forms authentication), how is this identity carried over to the document DB? Is the only answer to proxy all requests to the DB through the web server anyway, i.e. secure the document DB so that there is no direct external access?
This seems to make the most sense, and is the easiest to implement, but I was wondering whether anyone out there had any experience and/or advice on using document DBs in a heterogeneous environment?
This probably differs in every database you mention. Here's how it works in CouchDB.
CouchDB allows you to manage users and roles.
You can use the validate_doc_update function in your design documents to restrict document creation/update. For example, you can write a validation that denies document update to anyone but its author.
To restrict who can read documents from a database, you can edit the /db_name/_security document and list the users or roles.
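As a rough illustration, the _security document is a small JSON object with "admins" and "members" sections, each holding "names" and "roles". The sketch below only builds that JSON in Python; the helper name and the user and role names are made up for the example, and you would still have to PUT the result to /db_name/_security as a database admin.

```python
import json


def build_security_doc(member_names, member_roles, admin_names=(), admin_roles=()):
    """Build the body of a CouchDB /db_name/_security document."""
    return {
        "admins": {"names": list(admin_names), "roles": list(admin_roles)},
        "members": {"names": list(member_names), "roles": list(member_roles)},
    }


# Hypothetical users and roles, for illustration only.
security_doc = build_security_doc(["alice"], ["editors"], admin_names=["root"])
security_json = json.dumps(security_doc, indent=2)
print(security_json)
```

With a non-empty members section, only the listed users and roles (plus admins) should be able to read from the database.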
However, I don't think you can make the read access more granular (i.e. allow a user to read only the documents they created).
To achieve that, you have to put CouchDB behind a proxy and use views to serve the documents to authenticated users. You can still use CouchDB user management this way. The proxy just hides direct access to the database.
For more detailed info, check the security overview on CouchDB wiki, the security chapter of the Relax book and this short screencast.
Well, I only have experience with CouchDB, but hope I can help you nonetheless.
CouchDB has a validation process built in: you write your validation rules in JavaScript and have access to the group the current user is in. It's basically all handled by CouchDB itself, so you don't have to care how you get to the login information.
Related
I'm designing a fairly complex backend and now I have a doubt. Is it a good idea in Keycloak to put users into different Keycloak groups by country when I create them, during sign-in for example?
I was thinking that it could be useful to better manage users in the future.
What do you think?
There is no single answer to this question. It clearly depends on your application. If in the future your application will provide services based on the country of each user, it might be a good idea, as your application could get this information about the user directly from Keycloak.
If you are planning to do some research on your users, it might also be a good idea, as some statistics might be country related or you might want country-related outputs (to relocate your cloud instances near the majority of your users, etc.).
There might be faster database lookups with such additional information, but I don't know if Keycloak currently provides functionality for this. On the other hand, if I sign up to your service while I am chilling on holiday on the other side of the world from where I usually live, your record will be useless. Therefore this could bring more issues to the implementation of your application while you might not need it at all.
If you have no plans for such functionality, there is simply no reason to do it. Present web services tend to store more data than they actually need to. For example, in the majority of recent database leaks you can see the LAST geographic coordinates stored with each user. While these might be used for precise advertisement targeting and unnecessary user screening, there is really no reason to store the last geographic coordinates of each user. Such information might change with each user login and should be determined at "runtime". If services do not benefit from such data, users are put under threat for no reason.
You should determine what is needed by your application and what is not. You should never store or expose any additional information about your users, regardless of how well your application is secured.
So I was using a basic 'if authenticated user' placeholder rule for Firestore when I started using Geofirex. However, when I try to query/use the database with Geofirex, my security rules block it. I'm currently running without any rules for the sake of development, so I know everything works, but I have no idea how to add rules to allow this library or have the library identified with the user. Is there even a way to do this?
If a library runs in the same process as the rest of your application code, there is no way to set up separate security for that library. All requests coming from the application are (and should be) treated equally, as there's nothing that inherently makes the library code more trusted than the code of your own application, or the code that a malicious user may write.
What you can do is create an additional collection that only contains the location of each object and its key. You could then point Geofirex to that collection and allow read access to this data to all users, while securing access to the other, more sensitive data about each object. This is what the original GeoFire libraries from Firebase did, and while it leads to more code (to read the additional data objects), it makes it much simpler to secure data access.
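To make the split concrete, here is a minimal sketch in plain Python with no Firebase calls. The function name, the field names ("position", "owner_email") and the record itself are assumptions for illustration; the actual location format Geofirex expects may differ.

```python
def split_record(key, record, location_field="position"):
    """Split one record into a public location doc and a private detail doc."""
    # Public doc: only the key and the location, safe to expose to all users.
    public_doc = {"key": key, location_field: record[location_field]}
    # Private doc: everything else, to be kept behind stricter rules.
    private_doc = {k: v for k, v in record.items() if k != location_field}
    return public_doc, private_doc


# Hypothetical record, for illustration only.
shop = {
    "name": "Cafe",
    "owner_email": "owner@example.com",
    "position": {"lat": 51.5, "lng": -0.12},
}
public_doc, private_doc = split_record("shop_1", shop)
print(public_doc)
print(private_doc)
```

The public document would go into the collection the geo library queries, and the private one into a collection that only authenticated users can read, looked up by the shared key.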
I am trying to shift towards a serverless architecture when it comes to building REST APIs. I come from a Ruby on Rails background.
I have successfully understood and adopted services such as API Gateway, Cognito, RDS and Lambda functions; however, I am struggling with putting it all together in an optimal way.
My case is the following. I have a simple user-based platform where there are multiple resources related to application members, say a blog application.
I have used Cognito for authentication and Aurora as the database service for keeping things like articles and likes.
Since the database and Cognito user pool are decoupled, it is hard for me to do things like:
Fetching users that liked particular article
Fetching users comments
It seems problematic to me because I need to pass some unique Cognito user identifier (retrieved during the authorization phase in API Gateway) to a Lambda function, which will then save the database record with an external reference to this user. On the other hand, if I were to fetch particular users, I must first fetch their identifiers from my relational database and then request the users' details from the Cognito user pool. I lack a standard way of accessing the current user in my Lambda functions, as well as a mechanism for easily associating a database record with that user.
I have not found any convincing recommended patterns for designing such applications, even though it seems like a very common problem, and I am having a hard time deciding whether my approach is correct.
I would appreciate some comments on what patterns to consider when designing a simple user-based platform and what the pitfalls of my solution are. Any articles and examples would also be very helpful.
Thanks in advance.
These sound like standard problems associated with distributed, independent databases. You can no longer delegate all relationships to the database and get a result aggregating them in some way. You have to do the work yourself by calling one database, then the other.
For a case like this:
Fetching users that liked particular article
You would look up the "likes" database to determine user IDs of those who liked it, then look up the "users" database to determine user details such as name and avatar.
Most patterns follow standard database advice, e.g. in the above example, you could follow the performance-oriented pattern of de-normalising - store user data such as name and avatar against each "like", as long as you feel the extra storage and burden of keeping it consistent is justified by the reduction in queries (probably too many Likes to justify this).
Another important practice is using bulk queries to avoid N+1 queries. This is what Rails does with the includes syntax, but you may have to do it yourself here. In my example, it should only take two queries because the second query should get all required user data in one go, by querying for users matching the list of user IDs.
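Here is a minimal sketch of that two-query approach. Everything in it is a stand-in: an in-memory SQLite table plays the "likes" database, a plain dict plays the user store, and fetch_users_bulk is a hypothetical helper. Whether your real user store offers a true bulk lookup is something you'd need to check.

```python
import sqlite3

# In-memory stand-in for the "likes" database.
likes_db = sqlite3.connect(":memory:")
likes_db.execute("CREATE TABLE likes (article_id INTEGER, user_id TEXT)")
likes_db.executemany(
    "INSERT INTO likes VALUES (?, ?)",
    [(1, "u1"), (1, "u2"), (2, "u3")],
)

# Hypothetical stand-in for the separate user store.
USERS = {
    "u1": {"name": "Ann"},
    "u2": {"name": "Bob"},
    "u3": {"name": "Cy"},
}


def fetch_users_bulk(user_ids):
    """One lookup for the whole list of IDs (stands in for a bulk query)."""
    return {uid: USERS[uid] for uid in user_ids if uid in USERS}


def users_who_liked(article_id):
    # Query 1: user IDs from the likes store.
    rows = likes_db.execute(
        "SELECT user_id FROM likes WHERE article_id = ? ORDER BY user_id",
        (article_id,),
    ).fetchall()
    user_ids = [row[0] for row in rows]
    # Query 2: all user details in one go, not one call per user.
    users = fetch_users_bulk(user_ids)
    return [users[uid]["name"] for uid in user_ids if uid in users]


print(users_who_liked(1))
```

The point is that the number of calls stays at two no matter how many users liked the article.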
Finally, I'd suggest you try to abstract things. This kind of code gets messy fast, so be sure to build a well-encapsulated data layer that isolates application code from dealing with the mess of multiple databases.
For example I have 2 databases. One of them is called ecommerce which contains real customer information. Another is called ec1 which basically contains only views from tables of ecommerce.
We use our ec1 database to connect to our website or apps. How secure is this method in terms of back end security?
Only exposing ec1 is better than exposing ecommerce because you can reset ec1 using your "safe" values in case of corruption and you can keep some secret data only stored in ecommerce if it doesn't need to be used by your website or your app.
However, this is only a small portion of backend security. Having two different databases with real data and data views doesn't matter a lot if someone can access your server OR can corrupt your data.
I mean, if someone finds a way to get some data they should not be authorized to read, it is bad even if it comes from ec1 and not from ecommerce.
So yeah, exposing only views is a BETTER solution, but nothing can be said about the overall security because it mostly doesn't depend on that.
EDIT: A detailed explanation of backend security is way beyond the scope of a simple Stack Overflow answer (and I am probably not the best teacher), but for basic server security you must take care of:
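To show what "only exposing views" buys you, here is a small sketch using SQLite from Python. It is a simplification: it uses a single in-memory database rather than two, the table and column names are invented, and SQLite has no user accounts, so the permissions side is not shown at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical "real" table holding sensitive customer data.
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, card_number TEXT)"
)
conn.execute("INSERT INTO customers (name, card_number) VALUES ('Ann', '4111-0000')")

# The view exposes only the columns the website actually needs.
conn.execute("CREATE VIEW customers_public AS SELECT id, name FROM customers")

cursor = conn.execute("SELECT * FROM customers_public")
exposed_columns = [col[0] for col in cursor.description]
rows = cursor.fetchall()
print(exposed_columns, rows)
```

Code that can only query customers_public never sees card_number, but anyone who can reach the underlying table or the server itself still can.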
- A firewall that blocks every request except those from your web apps.
- Updated software
- good database passwords
- The user you use for your application queries must only be able to perform operations on the ec1 database, while the views should be generated by a cron job using a different user
These are the main security tips that come to my mind
A little backstory
I have to develop a web application for college. This web application has to do with managing different locations using Google Maps, like pinning new locations, adding custom descriptions and so on. The login part is done using Facebook (login with Facebook). The more interesting part is that the queries (client-server) have to be done using REST.
The part that I'm trying to understand
If I use a database to store my users' unique IDs and their online status (online/offline), and somehow (I haven't actually settled on the idea) keep a JSON on the server that contains each user's pinned locations, would all this actually be OK with the REST paradigm?
I find mixed answers on the internet and I don't know how to think about the statelessness of the application correctly. A session would not be created, but the credentials from the database would be necessary for the users to communicate with each other.
The other side of the question
Supposing that I'm mistaken and I shouldn't use the database to store the credentials and locations like that, how am I supposed to keep all that data? I'm thinking of something like JSON cached client-side, but what if my client changes computers, wouldn't this mean that they lose all their data? (Also, wouldn't this handicap MVC by not having a model?) How do I really keep track of everything?
You're making this way too hard on yourself, try to keep it simple since you probably have a deadline. REST is a way of using APIs with HTTP verbs like GET, POST, PUT, and DELETE. It says nothing about how to store the data behind your APIs.
As for storing the data, a database should be fine. Storing it as JSON in the db could work, but in the end you'll have to parse the json every time that you want to use it, so I would suggest that you store it in a DB in such a way that it can be read easily.
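As a rough sketch of "store it in a way that can be read easily", the pinned locations could live in ordinary table rows and only be turned into JSON when the API responds. This uses Python's built-in sqlite3 just to keep the example self-contained; the table layout, IDs and place names are all invented.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pins (user_id TEXT, lat REAL, lng REAL, description TEXT)"
)
conn.executemany(
    "INSERT INTO pins VALUES (?, ?, ?, ?)",
    [
        ("fb_1", 44.43, 26.10, "Library"),
        ("fb_1", 44.44, 26.05, "Dorm"),
        ("fb_2", 45.75, 21.23, "Cafe"),
    ],
)


def pins_for_user(user_id):
    """Return one user's pins as JSON, e.g. for the body of a GET response."""
    rows = conn.execute(
        "SELECT lat, lng, description FROM pins "
        "WHERE user_id = ? ORDER BY description",
        (user_id,),
    ).fetchall()
    return json.dumps(
        [{"lat": lat, "lng": lng, "description": desc} for lat, lng, desc in rows]
    )


print(pins_for_user("fb_1"))
```

Because each pin is its own row, you can filter or update a single pin with a query instead of loading and re-saving one big JSON blob per user.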
For a beginner (especially if you're doing this for a school project), I would definitely suggest that you set up a relational database like Microsoft SQL Server (Microsoft stack) or a MySQL/PostgreSQL database (I think this is what they'd use on Linux), but if you wanna skip the relational DB approach (because it might not be all that "easy" to get going), you can always try a NoSQL database like MongoDB.
Relevant links to help:
http://rest.elkstein.org/ (REST explained)
http://www.restapitutorial.com/lessons/httpmethods.html (REST verbs)
http://en.wikipedia.org/wiki/Relational_database (what is a relational db)
http://en.wikipedia.org/wiki/Database_normalization (Kinda the goal of relational db.. but note you can go too far...http://lemire.me/blog/archives/2010/12/02/over-normalization-is-bad-for-you/)
http://www.mongodb.com/nosql-explained (NoSQL explanation)