Keeping users out of specific MarkLogic databases

Keeping users out of specific MarkLogic databases - marklogic-10

My question is kind of similar to this question, but not quite :
Hide a marklogic database to specific user (permissions)
Background - up until now, developers who use database X were all admins on the server ( this is a historic config that we have recently inherited ), but now we want to have new developers added to the server who definitely wont be admins, and who will have a new database Y added to the server.
What we want to do is have several groups of developers using the same MarkLogic 10 server, but have it so developer group X can only work in their database X, and Developer group Y can only work in database Y. We dont care if they can see all databases on the server.
Does this mean we have to apply permissions to every document in every database to do this, or can we control this via a roles that limit access to specific databases?
Can someone suggest the right way to achieve this please?
Thanks in advance.

You have two tools to work with:
Granular privileges which allow you limit the scope of a privilege to a specific resource (such as database or forest)
Document permissions unique to documents reflective of their respective set of intended users on each database as you already mentioned
However, in my experience, I've generally found this use case is better served by having many small dev clusters rather than one large one as resource contention (one app team pushing CPU to 100%) can become too much of an issue. It is pretty quick and painless to spin up and tear down dev clusters on AWS or Azure. Or, if you're self-hosting, you could look at running multiple MarkLogic Containers on a single host.

Related

Postgresql Multiple Database VS Multiple Schemas

We are in the process of building a cluster for our hosted services at work, the final product will be used to host multiple separate services. We are in the middle of deciding on how we want to setup our databases. We are running a postgresql database server which all services in the cluster will use. The debate right now is whether to give each service its own schema in a single database or to give each service its own database.
We just aren't sure which is the better solution for us. None of our services have a common structure and data does not need to be shared. What we are more concerned about is ease of use.
Here's what we care most about, we are really hoping for an objective vs opinion based answer.
Backups
Disaster recovery - all services vs individual
Security between services
Performance
For some additional information, the cluster is hosted within AWS with our database being an RDS instance.

This is what PostgreSQL official docs says:
Databases are physically separated and access control is managed at the connection level. If one PostgreSQL server instance is to house projects or users that should be separate and for the most part unaware of each other, it is therefore recommendable to put them into separate databases. If the projects or users are interrelated and should be able to use each other's resources they should be put in the same database, but possibly into separate schemas. Schemas are a purely logical structure and who can access what is managed by the privilege system.
Source: http://www.postgresql.org/docs/8.0/static/managing-databases.html

Disaster recovery - all services vs individual
You can dump and restore one database at a time. You can dump and restore one schema at a time. You can also dump schemas that match a pattern.
Security between services
I presume you mean isolation between databases and isolation between schemas. The isolation between databases is stronger and more "natural" for developers concerned with "ease of use". For example, if you use one database per service, every developer can just use the public schema for all development. This might seem "easier" than adding schemas to the search path, or "easier" than using schema.object when programming.
It depends in part on how you manage privileges for the roles you use for development, and on how you manage privileges in each database or schema. You can change default privileges.
Performance
I don't see a measurable difference. YMMV.

building a multi-tenant mongoDB database initially, but later remove it

I have a design issue that i'm facing and because i am relatively new to mongodb i think i need some help to make the right decision.
problem:
i am building a type of social networking website for let's call group A consumers. I also need to build this same type of website for group B consumers. initially, i want to keep them separate with no interaction/sharing between the two groups but i do not want to maintain two separate websites. so a multi-tenant solution is ideal. the tricky part of this problem is that at SOME point in the future, i want to create a website for BOTH group B and A consumers, essentially merging them into 1 website. this 1 website will have all users from the original groups A and B but now they can all see each other, interact with each other, friend each other, etc.
is the right path to first create a multi-tenant mongo database, then later how easy is it to remove this multi-tenancy?

I would suggest that you do not create and drop the databases. Instead you can have the application with 2 tenants like facebook and g+ with their own set of users. However at some point of time in the future, you can just share your facebook user to g+ or the otherwise. in this case, there is no need to drop / merge tenant based tables or database and they will remain intact.
Your application should have the multi-tenancy capabilities that enable user sharing across tenants or linking users across tenants and that is the sure shot approach.

Meteor -MongoDB - Single Database or Multiple Databases for SaaS Offering

This is from another question bust i think it should be answered by the meteor team because i can't find a straight answer so far.
"..We have decided to use MongoDB for a SaaS offering we are creating. Each company that signs up gets their own url (mycompany.domain.com) and their own private set of users, projects, etc... Since we are using a NoSQL solution, and wouldn't have to manage pushing out schema updates to every database like we would with MySQL, I am wondering if it would be better to have one huge database containing all the data, or to have one database per client..."
So, can i have with meteor aproach (with one meteor project/server):
1) Different Url for each company
2) Different database (in the same monodb server) for each company and for that specific company users.

If you look at meteor's own hosting they use a mongodb server from MongoHQ. You could use multiple meteor servers with the single mongodb server and multiple databases.
I would think it depends more on your apps design, Meteor can use either design.
1) You could use the publish functions to provide each client with only his/her own records from one huge DB, use a way to get the subdomain http host into the publish function so it only gives out data for that set.
2) Use seperate meteor instances connecting to their own mongodb database on one server, and use some kind of proxy to server them to the subdomains. You could push each one with whatever data you would like, even perhaps separate app sets.
It would really depend on what you're building. If you want to only have to update one set of data so it updates for everyone you could go with 1), so if your use case requires this it might be a better option to go with.
The benefit of using seperate meteor instances is primarily customization. Its really hard to get the gist of what you want with the details you've given, so ill cut it short: If you want the ability of each client to be very different use 2), otherwise use 1)
If you look at Meteor.com's hosting I think each deployment is given its own database, the main reason: customization, everyones deployment is likely to be completely different.
UPDATE:
As of March 2014, there is a third party atmosphere package meteor-dbproxy that allows you to use multiple mongodb servers (as well as separate oplog integration endpoints) in your backend, thus allowing you db-level sandboxed multi-tenancy.

From a MongoDB point of view, you can do a database per client. The current stable MongoDB version, 2.2, has database level locking opposed to the large global lock of previous versions.
This way, if one of your clients is hammering the system, they don't affect your other clients with a global lock.

SaaS Architecture Question from Newbie

I have developed a number of departmental client-server applications, and am now ready to begin working on moving one of these applications to a SaaS model. I have done some basic web development, but I'm a newbie when it comes to SaaS architectures.
One of the first questions that comes to mind as I try to design the architecture is the question of single vs. multi tenancy. The pros and cons of each vary significantly depending on the type of application and scale required, so I'd like to describe my application and scale needs below, and hope others can comment on how I should get started with the architecture.
The client-server application currently consists of a Firebird database and a Windows application. The database contains about 20 tables containing a few thousand records in 4 primary tables, and a few hundred records in various lookup and related tables. Although the number of records is small, the size can get large, as the database can contain large BLOBS. Each customer sets up their own database and has a handful of users within the organization connected to it. When I update the db schema, a new windows application is released, and it checks the db schema and then applies the updates as needed.
For the SaaS application, I am designing for 100's (not 1000's or millions) of new customers per year. My first thought was to go with a multi tenancy model to make updates easy (shut down apply the updates to one database, and then start up). On the other hand, a single tenancy model would provide a means to roll updates out to a group of customers at a time, and spread the risk of data corruption - i.e. if something goes wrong with a database, it will impact one customer instead of all customers. With this idea, I was thinking of having a single web front-end which would connect to a single customer database upon login. Thus, when a new customer creates an account, a new database would be created (each customer would have their own db with multiple users as needed for the customer).
In this model, a db update would require either a process to go through each db to apply schema changes, or a trigger upon logging in to initiate a schema update similar to the client-server model currently in use.
Can anyone point me to information for similar applications which have been ported from client-server to SaaS? Or provide any pointers to consider? Basically I'm looking for architecture examples of taking a departmental application and making it available as a self service website for multiple customers. Thanks for any suggestions, resources, etc.

Good questions.
One thing that comes to mind is that if you have multiple databases which you roll out in a staged manner to reduce the likelihood of breaking all of your customers, you will have to address the issue of what to do if the db structure changes. You will either have to be very rigorous with respect to maintaining backward compatibility, or else deploy separate versions of your code base and somehow manage which tenants are associated with which databases.
We are providing our application using a SaaS model as well.
It was, initially a Windows app which worked similar to your multiple database proposal. Upon login, the win app would authenticate against a single "licensee" database which would then respond with connection information for a database specific to that licensee. The nice thing about this was that it provided 1) physical separation of licensee data, which our customers liked and 2) enabled us to physically locate the database on a server geographically closer to the users which both improves performance and avoids some potentially tricky legal and regulatory issues with respect to providing data across country boundaries.
Of course, since the app was a thick client app, we could get away with making database changes and pushing them out to one licensee at a time. When we were ready to upgrade, we could push out an updated thick client in conjunction with the new database - thereby ensuring that the codebase was a match with the database. As long as the common "licensee" authentication database stayed consistent, this worked fairly well.
On the other hand, though, this solution brought with it all of the problems of maintaining and managing a thick client approach which finally lead us down the thing client, browser-based approach.
In our new model, everything is in a single database. When we have updates, we push both the code and the db out at the same time. This solves the problem of keeping the code base consistent with the database structure. However, we are now confronted with the issues mentioned in #s 1 and 2, above, which we have yet to resolve.
I hope this provides some food for thought for you.
I, too, am interested in this question.
Thanks for the post.
-S

Using Postgresql as middle layer. Need opinion

I need some opinions.
I'm going to develop a POS and inventory software for a friend. This is a one man small scale project so I want to make the architecture as simple as possible.
I'm using Winform to develop the GUI (web interface doesn't make sense for POS software). For the database, I am using Postgresql.
The program will control access based on user roles, so either I have to develop a middle tier, using a web server, to control user access or I can just set user priveleges directly in Postgresql.
Developing a middle tier will be time consuming, and the maintenance will be more complex. So I prefer to set access control directly in the database.
Now it appears that using database to control user access is troublesome. I have to set priveleges for each role. Not to mention that for some tables, the priveleges are at column level. This makes reasoning about the security very hard.
So what I'm doing now is to set all the tables to be inaccessible except by superusers. The program will connect to the database using public role. Because the tables are inaccessible by public, I'm going to make publicly accessible stored functions with SECURITY DEFINER (with superuser role). The only way to access the tables is by using these functions.
I'll put the user roles and passwords in a table. Because the user table itself is inaccessible by non-superuser, I'll make a login function, let's call it fn_login(username, password). fn_login will return a session key if login is successful.
To call other functions, we need to supply session key for the user, e.g.: fn_purchase_list(session_key), fn_purchase_new(session_key, purchase_id, ...).
That way, I'm treating the stored functions as APIs. Adding new user will be easier as I only need to add new rows in the user table rather than adding new Postgresql roles. I won't need to set priveleges at column level. All controls will be done programmatically.
So what do you think? Is this approach feasible and scalable? Is there a better way to do it?
Thanks!

I believe there is a better way to do it. But since you haven't discussed what type of security you need, I cannot elaborate on specifics.
Since you are developing the application code in .NET, that code needs to be trusted (unlike a web application). Therefore, why don't you simply implement your roles and permissions in the application code, rather than the database?
My concern with your stated approach is the human overhead of stored procedures. Would much rather see you write the stated functions in C#, rather than in PostgreSQL. Then, standard version control and software development techniques could apply.

If you wait until somebody has at your database to check security, I think you'll be too late. That's a client/server mentality that went out at the end of the 90s. It's part of the reason why n-tier architectures came into vogue. Client/server can't scale horizontally as well as an n-tier solution.
I'd advise that you take better advantage of the middle tier. Security should be a cross-cutting concern that's further up the stack than your persistence layer.

If the MANAGEMENT of the database security is the issue, then you should add the task of automating that management. That means that you can store higher level data with the database tables, and then your application can convert that data in to the appropriate details and artifacts that the database requires.
It sounds like the database has the detail that you need, you just need to facilitate the management of that detail, and roll that in to your app.

My honest advice: Do not invent POS and inventory software. Take one of existing projects and make it better.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse