Data-at-rest encryption for NoSQL - MongoDB

Prototyping a project with Mongo & Spring Boot and thinking it does a lot of what I want. However, I really need encrypted data-at-rest, which would seem to mean I have to purchase the Enterprise edition. Since I don't have a budget yet, I am wondering if there is another alternative that people have found useful? I think DynamoDB can be used in a local & test environment. Or is it viable to encrypt the data at the application level and still have great performance for my CRUD operations?

I've done application-level encryption with DynamoDB before with some success. My issues were not really with DynamoDB but with the encryption in the application.
First, encryption/decryption is very expensive. I had to more than double the number of servers I was using just to handle the extra CPU load. Your mileage may vary. In my case, I was using Node.js and the servers suddenly switched from being I/O bound to being CPU bound.
Second, doing encryption/decryption application side adds a lot of complexity to your app. You will almost certainly need to parallelize the encryption/decryption to minimize the added latency that it will cause. Also, you will need to figure out a secure way of sharing the keys.
Last, application level encryption will make some DynamoDB operations unavailable to you. For example, conditions probably won't make sense anymore for encrypted values.
Long story short, I wouldn't recommend application level encryption regardless of the database.
DynamoDB now supports what they call Server-Side Encryption at Rest. Personally I think that name is a little confusing but from their perspective, your application is the client and DynamoDB is the server.
Amazon DynamoDB encryption at rest helps you secure your application data in Amazon DynamoDB tables further using AWS-managed encryption keys stored in AWS Key Management Service (KMS). Encryption at rest is fully transparent to the user, with all DynamoDB queries working seamlessly on encrypted data. With this new capability, it has never been easier to use DynamoDB for security-sensitive applications with strict encryption compliance and regulatory requirements.
Blog post about DynamoDB encryption at rest
You simply enable encryption when you create a new table and DynamoDB takes care of the rest. Your data (tables, local secondary indexes, and global secondary indexes) will be encrypted using AES-256 and a service-default AWS Key Management Service (KMS) key. The encryption adds no storage overhead and is completely transparent; you can insert, query, scan, and delete items as before. The team did not observe any changes in latency after enabling encryption and running several different workloads on an encrypted DynamoDB table.
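As an illustrative sketch, enabling server-side encryption when creating a table can be done from the AWS CLI with the `--sse-specification` flag (table and attribute names below are placeholders; this requires AWS credentials and is not runnable offline):

```shell
# Create a table with server-side encryption using a KMS-managed key.
aws dynamodb create-table \
  --table-name Users \
  --attribute-definitions AttributeName=userId,AttributeType=S \
  --key-schema AttributeName=userId,KeyType=HASH \
  --billing-mode PAY_PER_REQUEST \
  --sse-specification Enabled=true,SSEType=KMS
```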

Related

Should I have a separate database to store financial data for each user in my PostgreSQL server?

I am creating accounting/invoicing software and my database is in PostgreSQL. Should I create a separate database for each user, since the data is sensitive financial data? Or is having a user foreign key secure enough? If I am hosting the database on AWS, I understand that I could have a few DB servers across multiple availability zones and regions, so that if one is compromised it wouldn't affect everyone even if many users have info stored in a single database. Is this safe enough? Thanks!
In general, no. Encrypt the data so that if someone exfiltrates a dump they can't actually use it without the decryption key. If you're worried that someone with admin access can see users' information, then you might want to consider user-level encryption for all fields related to personally identifiable information.
There are a few ways you could go about it, but I wouldn't create a new DB for every customer. It will be too expensive and a pain to maintain and evolve.
To me, this sounds like you are creating a multi-tenant application.
I’d personally use the row-level security feature in Postgres (see this article) or create a separate Schema for each Customer.
You can add an extra layer of protection with encryption at rest. AWS supports it (link).

Choosing the correct approach to build a multi-tenant architecture with Azure Cosmos DB (MongoDB)

I am a little confused about choosing a suitable approach for creating databases/collections for a multi-tenant system in MongoDB on the Cosmos DB API.
I would have 500 tenants for my application, where each tenant's data may grow up to 3-5GB and initially each tenant may need minimum RUs (400 RU/s).
For this use case I have a few options to go with:
1. PartitionKey (per tenant)
2. Container w/ shared throughput (per tenant)
3. Container w/ dedicated throughput (per tenant)
4. Database Account (per tenant)
Considering performance isolation, cost, availability, and security, may I know which option would be suitable for the mentioned use case?
Please let me know your inputs, as I have little exposure to NoSQL and the Cosmos track.
The answer is potentially multiple options and it depends on your specific tenant use cases.
Tenant/Partition is the least expensive with a marginal cost per tenant of zero. This is a great option for providing a "free-tier" in your app but you can scale this up to a paid tier for your customers too. Max storage size is 20GB. With this scheme you will need to implement your own resource governance. You will need to ensure customers are not "running hot" and consuming throughput and storage that is drastically out of line from other users. However if you're building a multi-tenant app, resource governance is something you should already be doing.
Tenant/Container is more expensive: the minimum throughput for a container is 400 RU/s, which works out to about $25/month per tenant. This is ideal when you have tenants that are very large and require isolation from others in the previous tier.
Tenant/Account has the same marginal cost as Tenant/Container. This is useful if you have customers with GDPR requirements that prevent or require replication into specific Azure regions.
Note that I DO NOT recommend Tenant/Container using shared database throughput. The reason is that with this scheme all containers share the same throughput, which is what you already get with Tenant/Partition, except that performance with shared database throughput is not predictable, so it is not a good choice. Additionally, you are limited to 25 containers per database, further making it a poor choice.
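The per-tenant resource governance mentioned for the Tenant/Partition tier can be as simple as metering consumed request units per tenant in a time window and throttling tenants running hot. A minimal sketch (class name and limits are made up for illustration):

```javascript
// Track per-tenant request-unit consumption in a window and throttle
// tenants that exceed their budget.
class TenantGovernor {
  constructor(ruLimitPerWindow) {
    this.limit = ruLimitPerWindow;
    this.usage = new Map(); // tenantId -> RUs consumed this window
  }

  // Returns true if the request fits in the tenant's budget, false otherwise.
  charge(tenantId, requestUnits) {
    const used = (this.usage.get(tenantId) || 0) + requestUnits;
    if (used > this.limit) return false; // throttle this tenant
    this.usage.set(tenantId, used);
    return true;
  }

  resetWindow() { this.usage.clear(); } // call on a timer, e.g. every second
}

const gov = new TenantGovernor(100); // 100 RU per window for the free tier
console.log(gov.charge('free-tier-tenant', 80)); // true: within budget
console.log(gov.charge('free-tier-tenant', 30)); // false: would exceed 100 RU
```

A production version would also meter storage per partition key and feed into the tier-migration mechanism described below.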
Finally, for your app you will need to implement a mechanism to migrate customers from one tier to another. You will also of course require some sort of auth-n/auth-z mechanism. For Cosmos DB you can optionally use our native users and permissions and use resource tokens to secure access to data.
We did a presentation on this last year at BUILD with one of our customers, Citrix, who built their own cloud offering on top of Azure using Cosmos DB as their user metadata store. Definitely worth checking out; it will give you more details and insights: Mission-critical multi-tenant apps with Cosmos DB
PS: if you are building a new service on Cosmos DB I recommend using our Core (SQL) API rather than MongoDB. This is our native service and you will get the best performance and features. Our MongoDB API is the best choice for customers who are looking to migrate and want a fully managed MongoDB experience.
Hope this is helpful.

Bring your own encryption - Postgresql 10 - Multi-tenant db

New to Stack so forgive me if I'm doing something incorrectly here...
We are currently working in PostgreSQL 10, servicing multiple customers, set up using centralized tables partitioned by a customer id. The setup is AWS/EC2, with the traditional ETL approach of each client file going into its own table in an import schema (import.clientA_members) before being standardized and going to prod.members, with clientA used as a partition.
Recently, a few clients have brought "Bring your own key" requirements and I am struggling with the best approach to implement this. The setup seems pretty straightforward if we had a database per client but as it stands right now that would cause some pretty big shifts on how we approach everything here from scalability and efficiency to reporting and analytics.
Has anyone here run into this situation? I'm not a DBA, more of an operator, so I'm not sure what the options are in terms of encrypting partitions with different keys, and having those keys work well with the AWS Key Management Service.
Thanks in advance!

MongoDB: protect database files from anonymous access

I created a MongoDB database following this guide:
http://docs.mongodb.org/manual/tutorial/enable-authentication-without-bypass/
created database
created admin-user
run mongodb with --auth parameter
that works fine.
but how can I really protect the database files from anonymous access?
If someone took my database files and ran mongod without the --auth parameter, they would have access to the whole database.
Is there a way to protect the database files themselves, so they can't just be loaded by a mongod running without --auth?
Best regards
Tobias
Encrypting data files is only part of an overall security strategy - if someone has access to copy any files from your computer or a backup, they may also be able to snag your encryption keys from the same source. The MongoDB manual has a Security section which covers general best practices including access control, network exposure, auditing, and a high level checklist.
If you want to encrypt your MongoDB data files you will need to look into a solution for "encryption at rest".
As of MongoDB 2.6, there is no built-in support for data encryption, but there are a number of open source as well as commercial solutions available.
The broad categories of encryption at rest are application level or storage encryption (which can be used independently or together, depending on your requirements). Encryption will add some performance overhead for disk I/O, so you should consider this in your testing & evaluation of a suitable solution for your requirements.
A few examples of encryption at rest solutions are:
LUKS (Linux Unified Key Setup)
Windows Bitlocker Drive Encryption
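As an illustrative sketch of the storage-encryption option, the LUKS route might look like the following (device path and mapping name are placeholders; these commands are destructive and require root, so treat this purely as an outline):

```shell
# Encrypt a dedicated block device for MongoDB's dbPath with LUKS,
# then open, format, and mount it.
cryptsetup luksFormat /dev/sdb1
cryptsetup open /dev/sdb1 mongo_data
mkfs.ext4 /dev/mapper/mongo_data
mount /dev/mapper/mongo_data /var/lib/mongodb
```

With this approach, the data files on disk are unreadable without the LUKS passphrase/key, independent of whether mongod is started with --auth.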
For more information on supported options, have a read of the Encryption at Rest section of the MongoDB security documentation.

Sharing objects between Node.js servers with a memcached / Couchbase cluster

I was looking for a way to share an object across a cluster of several nodes, and after a bit of research I thought it would be best to use Redis pub/sub. Then I saw that Redis doesn't support clustering yet, which means a system based on Redis would have a single point of failure. Since high availability is a key feature for me, this solution is not applicable.
At the moment, I am looking into 2 other solutions for this issue:
Memcached
Couchbase
I have 2 questions:
On top of which solution would it be more efficient to simulate pub/sub?
Which is better when keeping clusters in mind?
I was hoping that someone out there has faced similar issues and can share their experience.
I think it's a bad idea to use memcached or Couchbase for pub/sub. Neither solution provides built-in pub/sub functions, and implementing pub/sub on the application side can generate a lot of ops/sec against the memcached/Couchbase server, and as a result you'll get slow performance.
Couchbase persists data to disk, so for temporary storage it's better to use memcached. It will be faster and will not load your disk.
If you can avoid that "pub/sub" and use memcached/Couchbase just as a simple HA shared key-value store - do it. It will be much better than pub/sub.
When you install Couchbase Server it provides two types of buckets: couchbase (with disk persistence, the ability to create views, etc.) and memcached (in-memory key-value storage only). Both bucket types act the same way in clusters. Couchbase also supports memcached API calls, so you don't need to change code to test both variants.
I've tried using a memcached provider for socket.io "pub/sub" sharing, but as I mentioned before, it's ugly. In my case there were a few Node.js servers with socket.io, so instead of shared storage I implemented something like "p2p messaging" between the servers on top of sockets.
UPD: If you have such a large amount of data, it may be better not to have one shared store, but to use something like sharding with "predictable" data location.