NoSQL development/production database

I am in charge of the database for an application we are developing, and I am starting to get confused about how to use my development database.
I understand that having two separate databases is useful: it helps when developing new features or changing the database structure, and this is why we actually have a production database and a development database. However, as the project grows, I am slowly getting confused about how I should use the development database and the development environment as a whole.
Our data is stored in Firestore, which is a NoSQL database. We use it to store real-time data that needs to be accessed both by users and by a growing number of scripts that process the data to create more. This real-time data is also useful while developing, to monitor the behavior related to the changes we made for a specific feature (on our test app).
So my question is:
Should my development database be a copy of the production database (mirroring every insert, update, delete, ...), and should we duplicate all the scripts that process the data (one set in the production environment and one in the development environment)? In that case I would need to create connections between the two databases, and the costs of storing and processing the data would double.
Or should I just use the development database as a database with the same structure as my production database but with less data, and only pull some data or activate some pipelines that redirect real-time data when I need to test a new feature or a change to my database?
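For concreteness, the second option can stay quite small. Below is a sketch of a Cloud Function, deployed in the production project only while a feature is under test, that mirrors one collection into the development project. Everything here is an assumption for illustration: the 'orders' collection, the 'my-app-dev' project ID, and that the function's service account has write access to the dev project.
const functions = require('firebase-functions');
const admin = require('firebase-admin');

// Default app: the production project this function runs in.
admin.initializeApp();

// A second app targeting the development project (hypothetical project ID);
// assumes the function's credentials are allowed to write to that project.
const devApp = admin.initializeApp({ projectId: 'my-app-dev' }, 'dev');
const devDb = devApp.firestore();

// Mirror every write on one collection into the development project.
exports.mirrorOrders = functions.firestore
  .document('orders/{orderId}')
  .onWrite((change, context) => {
    const ref = devDb.collection('orders').doc(context.params.orderId);
    if (!change.after.exists) {
      return ref.delete(); // the document was deleted in production
    }
    return ref.set(change.after.data()); // created or updated
  });
Deleting the function when the test is over keeps the duplicated storage and processing costs temporary instead of permanent.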
Also, if you know a good book that I could read on the subject, I'd take it!
Thank you,

Related

How to achieve synchronization and backup using Realm Sync?

I am planning to develop a cross-platform app for iOS and Android and would like to sync the data via Realm Sync. Realm Sync is serverless. Let's assume the following case: A user only has one device, uses the app, saves data and the device breaks down. Then there would be no way for that user to recover their data, since Realm Sync is just a sync and not a backup, right? But how can you implement synchronization and backup within the framework of Realm? What role does MongoDB Atlas play?
Many thanks in advance!
Note: I am an undergraduate student.
Realm Sync is cloud based, meaning there's a cloud-based server that stores the app data remotely.
The data is stored locally first and then automatically synced to the cloud at a later time (usually within seconds or faster). That's why Realm is considered an "offline-first" database; data is stored locally (offline) and then copied/stored online.
Realm can either be a purely local database with no cloud sync, or it can be synchronized online as a "real-time" cloud database (or both!).
When sync is used, the database that backs Realm is MongoDB, which is a NoSQL database.
Generally speaking, when a Realm client app is developed, the SDK you choose is the layer between your code and objects and the MongoDB back-end database.
The SDK allows you to code in an object-oriented way without the need to directly work with the low level NoSQL objects.
You create models, relationships and the UI and the SDK takes the app data, massages it, and stores it in MongoDB as NoSQL.
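As a rough illustration of that layer, here is a minimal sketch using the Realm JavaScript SDK. The Task model and its fields are invented for the example, and sync is omitted so it runs as a local-only Realm:
const Realm = require('realm');

// A hypothetical model; the SDK maps objects like this to NoSQL storage.
const TaskSchema = {
  name: 'Task',
  primaryKey: '_id',
  properties: {
    _id: 'objectId',
    title: 'string',
    done: { type: 'bool', default: false },
  },
};

async function main() {
  const realm = await Realm.open({ schema: [TaskSchema] });

  // Writes land locally first; with sync enabled the SDK uploads them later.
  realm.write(() => {
    realm.create('Task', { _id: new Realm.BSON.ObjectId(), title: 'Buy milk' });
  });

  const openTasks = realm.objects('Task').filtered('done == false');
  console.log('Open tasks:', openTasks.length);
  realm.close();
}

main();
With sync enabled, the same write call would also be pushed to MongoDB Atlas by the SDK; the model code itself does not change.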
As a followup so future readers don't have to read through all the comments:
Q) what does the term "serverless" mean?
A) I like this definition
Serverless offloads all management responsibility for backend cloud infrastructure and operations tasks - provisioning, scheduling, scaling, patching and more - to the cloud provider
and the more straight-to-the-point:
Serverless is a cloud development model that allows developers to build and run applications without having to manage servers.
Q) Doesn't that mean that the data is then only temporarily stored in the cloud?
A) Not at all. As mentioned above, Realm data is always written locally first and then the SDK syncs it to the cloud. If you totally erase your device, then once you reinstall and run the app, Realm will pull the data back down from the server (sync).
Q) Because the prices for Realm Sync also do not include storage space costs: Pricing
A) That link includes storage and costs per plan. In summary: the Shared plan is up to 5GB, Serverless is up to 1TB, and Dedicated is 4TB per shard. Shared is $0/month, Serverless is $0.30 per million reads, and Dedicated is a flat $57/month.
Q) Is it then possible to save the user data on this (an Atlas cluster)?
A) YES! That's the whole idea! BUT: if you're developing in Realm, it's your job to craft a great app using the SDK and let the SDK interface with Atlas (MongoDB) on the backend. Depending on your use case, you may rarely need to do anything with Atlas directly.
The big picture is that when coding with Realm, you work with objects, structures and relationships in a much more natural object-oriented way; the SDK does the heavy lifting of taking that data and 'converting' it for storage on the Realm Sync server (MongoDB Atlas), since it ends up being NoSQL.

What is the most lightweight way to get last modified datetime from an Aqueduct application?

I'm writing a very small REST API using Dart/Aqueduct hosted on Heroku utilizing PostgreSQL.
When communicating with this API, I need to fetch all the data and store it locally in the application. On reboot, the application will ask the API whether any data has changed in any of the databases, which only get modified through this API.
My question is: How do I check, whether data has been modified? Storing it in the Aqueduct channel is not viable, as Heroku will boot servers up as needed (and multiple at the same time) changing this modified date time each time.
PostgreSQL cannot supply this last-modified datetime (https://dba.stackexchange.com/questions/58214/getting-last-modification-date-of-a-postgresql-database-table), so what do I do? Is the only way to do this to have a separate table storing this information? Can this be done in a more lightweight way, so that I wouldn't have to query the databases when e.g. calling https://my-api.com/lastModified? Should I serve a static file, which would be written to on each data modification?
Maybe there exists a smart, lightweight solution!
Cheers :)
Yes, use a separate table to store the date; the ORM is already doing something similar to track migrations. I'm not sure what lightweight means in this context, but you won't run into any problems with this approach, whereas a file or in-memory storage won't work at all, for exactly the multi-dyno reason you mention.
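A minimal sketch of that separate-table approach, written with node-postgres rather than Aqueduct purely for illustration; the items table and the single-row last_modified table are assumptions:
const { Pool } = require('pg');
const pool = new Pool(); // connection details come from the PG* environment variables

// One-time setup: a single-row table holding the last modification time.
async function ensureLastModifiedTable() {
  await pool.query(`
    CREATE TABLE IF NOT EXISTS last_modified (
      id integer PRIMARY KEY DEFAULT 1 CHECK (id = 1),
      modified_at timestamptz NOT NULL DEFAULT now()
    )`);
  await pool.query(
    'INSERT INTO last_modified (id) VALUES (1) ON CONFLICT (id) DO NOTHING');
}

// Touch the timestamp in the same transaction as every data modification.
async function saveItem(item) {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');
    await client.query('INSERT INTO items (name) VALUES ($1)', [item.name]);
    await client.query('UPDATE last_modified SET modified_at = now() WHERE id = 1');
    await client.query('COMMIT');
  } catch (e) {
    await client.query('ROLLBACK');
    throw e;
  } finally {
    client.release();
  }
}

// The /lastModified handler then needs only one cheap primary-key read.
async function lastModified() {
  const { rows } = await pool.query(
    'SELECT modified_at FROM last_modified WHERE id = 1');
  return rows[0].modified_at;
}
Because the timestamp is updated in the same transaction as the data change, the endpoint stays correct no matter how many dynos Heroku boots.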

Ionic mobile app using Cassandra, what about local storage?

I am working on a project using Ionic for the mobile side, and I have a web app as well, linked to a Cassandra database.
I need data synchronization between the mobile device (local storage) and the server-hosted Cassandra database. I use the cassandra-driver to connect to the database, but then I realized how problematic it is to convert the data to another type of database (SQLite, for example).
Should I rather use another database than Cassandra to make the synchronization easier? (I need a NoSQL solution.)
The choice of database depends on the type of data you want to store. Cassandra is a column-oriented database. It performs very well when you have to deal with large amounts of data, but it has many limitations on the queries you can use to pull data. For that reason, it might require additional effort to develop something that you could easily do with some other database. So the real question is whether you really need Cassandra.
If you are using it only for a mobile application, I don't think you will have enough data to exploit Cassandra's benefits.
In your place, I would rather consider some other databases, such as MongoDB if JSON is an appropriate format for your data, or Redis if your data is key/value pairs.

Local, file-based database for an Electron application

We are working on an application that will be offered both as a web-based and as a cross-platform desktop solution by means of Electron.
Due to customer requirements, the desktop client cannot make use of "the cloud" to store data; all data should be stored in the local machine or, even better, the user should have the option to keep the database/data file on an external HDD so that another user on the same local network can use the same data file.
We've been looking at NeDB, PouchDB, etc., but all of these use either Web SQL or IndexedDB in the browser itself to store the data.
NeDB can theoretically use the file system, but that seems only possible for Node Webkit apps.
Another option is of course MongoDB, but it requires setting up a database server. Seeing as our users will set that up on their own machines, that will work for one user only but would make it very hard for them to share the data (note: assume users with little technical know-how).
Is there a way to force NeDB to persist data in a file instead of the in-browser database?
Alternatively, does any one know of a file-based, compact database that plays well with electron/node?
We'd preferably like to use a NoSQL database, but options of file-based SQL databases will be considered as well.
I have some experience with NeDB in an Electron app and I can say it will definitely work on the filesystem.
How are you initializing NeDB (or whatever your database choice is)? Also, are you initializing it in the main or renderer process? If you can share that, I think we could trace the issue to a configuration issue.
This is how you start NeDB with a persistent data-store that saves to disk.
var Datastore = require('nedb');
// Passing a filename makes NeDB persist to disk; autoload opens the file immediately.
var db = new Datastore({ filename: 'path/to/datafile', autoload: true });
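For completeness, a small usage sketch against that datastore (the documents and fields are invented for the example):
// Every insert is appended to the datafile and survives app restarts.
db.insert({ name: 'Ada', role: 'admin' }, function (err, newDoc) {
  if (err) return console.error(err);
  console.log('saved with _id', newDoc._id);
});

// Queries use a MongoDB-like syntax.
db.find({ role: 'admin' }, function (err, docs) {
  if (err) return console.error(err);
  console.log('admins found:', docs.length);
});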
I think MongoDB is going to be overkill for an Electron app (it's really meant to be a high-performance, distributed database running in the cloud).
Another option you could consider is LevelDB (a key/value store that can persist to the filesystem), which is popular in the Node community. (EDIT 4/17/17: IndexedDB uses LevelDB under the hood, so if you go that route, you may as well just use that.)
One aspect I would definitely evaluate carefully is: how difficult is this database going to be to package and distribute? How do I integrate it into my build system? Level and NeDB can be included simply via npm install, and any native code compilation is handled seamlessly with node-gyp, which is as simple as it gets. However, bundling Mongo, for example, will require some work to get a working build for each platform.

SaaS Architecture Question from Newbie

I have developed a number of departmental client-server applications, and am now ready to begin working on moving one of these applications to a SaaS model. I have done some basic web development, but I'm a newbie when it comes to SaaS architectures.
One of the first questions that comes to mind as I try to design the architecture is the question of single vs. multi tenancy. The pros and cons of each vary significantly depending on the type of application and scale required, so I'd like to describe my application and scale needs below, and hope others can comment on how I should get started with the architecture.
The client-server application currently consists of a Firebird database and a Windows application. The database contains about 20 tables, with a few thousand records in 4 primary tables and a few hundred records in various lookup and related tables. Although the number of records is small, the size can get large, as the database can contain large BLOBs. Each customer sets up their own database and has a handful of users within the organization connected to it. When I update the db schema, a new Windows application is released; it checks the db schema and then applies the updates as needed.
For the SaaS application, I am designing for hundreds (not thousands or millions) of new customers per year. My first thought was to go with a multi-tenancy model to make updates easy (shut down, apply the updates to the one database, and start up). On the other hand, a single-tenancy model would provide a means to roll updates out to a group of customers at a time and spread the risk of data corruption, i.e. if something goes wrong with a database, it will impact one customer instead of all customers. With this idea, I was thinking of having a single web front-end which would connect to a single customer database upon login. Thus, when a new customer creates an account, a new database would be created (each customer would have their own db, with multiple users as needed for that customer).
In this model, a db update would require either a process that goes through each db to apply schema changes, or a trigger upon login that initiates a schema update, similar to the client-server model currently in use.
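As a rough illustration of that per-database update process, here is a sketch using PostgreSQL and node-postgres purely as stand-ins (the schema_version table, the migration list, and the tenant connection strings are all assumptions; the same shape works for other databases):
const { Client } = require('pg');

// A hypothetical ordered list of schema changes, one entry per version.
const MIGRATIONS = [
  { version: 2, sql: 'ALTER TABLE documents ADD COLUMN tags text' },
  { version: 3, sql: 'CREATE INDEX documents_tags_idx ON documents (tags)' },
];

// Bring one tenant database up to the latest schema version.
async function migrateTenant(connectionString) {
  const client = new Client({ connectionString });
  await client.connect();
  try {
    const { rows } = await client.query('SELECT version FROM schema_version');
    let current = rows[0].version;
    for (const m of MIGRATIONS.filter((m) => m.version > current)) {
      await client.query('BEGIN');
      await client.query(m.sql);
      await client.query('UPDATE schema_version SET version = $1', [m.version]);
      await client.query('COMMIT');
      current = m.version;
    }
  } catch (e) {
    await client.query('ROLLBACK');
    throw e;
  } finally {
    await client.end();
  }
}

// Staged rollout: migrate one group of tenants at a time, not all at once.
async function migrateGroup(tenantConnectionStrings) {
  for (const conn of tenantConnectionStrings) {
    await migrateTenant(conn);
  }
}
Running this group by group gives the staged-rollout benefit described above, at the cost of keeping the deployed code backward compatible with databases that have not been migrated yet.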
Can anyone point me to information for similar applications which have been ported from client-server to SaaS? Or provide any pointers to consider? Basically I'm looking for architecture examples of taking a departmental application and making it available as a self service website for multiple customers. Thanks for any suggestions, resources, etc.
Good questions.
One thing that comes to mind is that if you have multiple databases which you roll out in a staged manner to reduce the likelihood of breaking all of your customers, you will have to address the issue of what to do if the db structure changes. You will either have to be very rigorous with respect to maintaining backward compatibility, or else deploy separate versions of your code base and somehow manage which tenants are associated with which databases.
We are providing our application using a SaaS model as well.
It was initially a Windows app which worked similarly to your multiple-database proposal. Upon login, the win app would authenticate against a single "licensee" database, which would then respond with connection information for a database specific to that licensee. The nice thing about this was that it 1) provided physical separation of licensee data, which our customers liked, and 2) enabled us to physically locate the database on a server geographically closer to the users, which both improves performance and avoids some potentially tricky legal and regulatory issues with respect to providing data across country boundaries.
Of course, since the app was a thick-client app, we could get away with making database changes and pushing them out to one licensee at a time. When we were ready to upgrade, we could push out an updated thick client in conjunction with the new database, thereby ensuring that the codebase matched the database. As long as the common "licensee" authentication database stayed consistent, this worked fairly well.
On the other hand, though, this solution brought with it all the problems of maintaining and managing a thick-client approach, which finally led us down the thin-client, browser-based path.
In our new model, everything is in a single database. When we have updates, we push both the code and the db out at the same time. This solves the problem of keeping the code base consistent with the database structure. However, we are now confronted with the issues mentioned in points 1 and 2 above, which we have yet to resolve.
I hope this provides some food for thought for you.
I, too, am interested in this question.
Thanks for the post.
-S