Suppose we have several (100) nodes in an IOT network. Each node has limited resources. There is a postgresql database server running in one of these nodes. Now every node has several (4-5) processes which need to interact with this server to perform some insert and select queries. Each query response has to be as fast as possible for the process to work as it should. Now i think of some ways to do this are :
Each process in a node makes one database client and performs queries.
All processes in a node send their queries to a destination in localhost itself from where the queries are performed through an optimum number of database clients. This way we have some sort of control over the number of database clients like optimisation of queries getting performed through a priority queue implementation or performing queries in separate thread/process through a separate database client in each thread/process. In this case somewhat we have the control over the optimisation of number of clients,number of threads/processes , priority of in what order queries must be executed.
Each node sends all queries through some network protocol directly to the database server which then uses a limited number of database clients performing queries now in its own localhost database and then returning the response to each node through same channel. This way it increases the latency but keeps number of clients minimum. Plus we can also implement some optimisation here running each client in different process/thread etc. Database interaction can be faster in this case since number of clients can be kept minimum, it is running in localhost machine itself but it adds some overhead to transfer the query response data back to the node's process.
In order to keep the resource usage as minimum as possible in every node and queries response as fast as possible , what is the best strategy to solve this problem ?
Without knowing the networking details, option 3 would be used normally.
Reasons:
Authentication: Typically you do not want to use database users to authenticate IoT devices.
Security: By using a specific IoT protocol, you can be sure to use TLS and certificate based server authentication.
Protocol compatibility: In upgrade cases, you must ensure that you can upgrade the client nodes independently from the server nodes, or vice versa. This may not be the case for database protocol.
Related
I am currently planning some server infrastructure. I have two servers in different locations. My apps (apis and stuff) are running on both of them. The client connects to the nearest (best connection). In case of failure of one server the other can process the requests. I want to use mongodb for my projects. The first idea is to use a replica set, therefore I can ensure the data is consistent. If one server fails the data is still accessible and the secondary switches to primary. When the app on the primary server wants to use the data, it is fine, but the other server must connect to to the primary server in order to handle data (that would solve the failover, but not the "best connection" problem). In Mongodb there is an option to read data from secondary servers, but then I have to ensure, that the inserts (only possible on primary) are consistent on every secondary. There is also an option for this "writeConcern". Is it possible to somehow specify “writeConcern on specific secondary”? Because If an add a second secondary without the apps on it, "writeConcern" on every secondary would not be necessary. And if I specify a specific value I don't really know on which secondary the data is available, right ?
Summary: I want to reduce the connections between the servers when the api is called.
Please share some thought or Ideas to fix my problem.
Writes can only be done on primaries.
To control which secondary the reads are directed to, you can use max staleness as well as tags.
that the inserts (only possible on primary) are consistent on every secondary.
I don't understand what you mean by this phrase.
If you have two geographically separated datacenters, A and B, it is physically impossible to write data in A and instantly see it in B. You must either wait for the write to propagate or wait for the read to fetch data from the remote node.
To pay the cost at write time, set your write concern to the number of nodes in the deployment (2, in your proposal). To pay the cost at read time, use primary reads.
Note that merely setting write concern equal to the number of nodes doesn't make all nodes have the same data at all times - it just makes your application only consider the write successful when all nodes have received it. The primary can still be ahead of a particular secondary in terms of operations committed.
And, as noted in comments, a two-node replica set will not accept writes unless both members are operational, which is why it is generally not a useful configuration to employ.
Summary: I want to reduce the connections between the servers when the api is called.
This has nothing to do with the rest of the question, and if you really mean this it's a premature optimization.
If what you want is faster network I/O I suggest looking into setting up better connectivity between your application and your database (for example, I imagine AWS would offer pretty good connectivity between their various regions).
I am using mongodb as a database for my game servers. When I started them, they were bound to just one region, I used the free tier of Mongodb Atlas in that region, India, and it worked perfectly with around 20ms of latency even on high load.
Now when I'm trying to scale up my servers and reach other regions like US-East, the latency jumps up to 500ms.
Is there a way I could open up two mongodb servers on both my India and US instances, which would always be in complete sync with each other, while the game server process would just use localhost to connect to the specific replica. I'm using pyMongo.
The players have no connection with the database, it's the game server process that manages it
Is there a way I could open up two mongodb servers on both my India and US instances, which would always be in complete sync with each other, while the game server process would just use localhost to connect to the specific replica.
Assuming you are using a replica set, I believe the closest you can come to achieving the stated requirement for reads is by:
Performing all writes with the write concern w=(# of nodes in the deployment), if you have two servers then use w=2.
Performing reads with read preference=nearest and read concern=majority.
I say "closest" because I suspect read concern=majority on a secondary could return stale data even with w=(# of nodes in deployment), since I imagine the majority commit point would necessarily lag behind the point when each secondary commits each document. To guarantee that you are reading current data you must read from the primary.
Note that such a setup has (at least) two additional major drawbacks:
If any of the servers becomes unavailable, your application ceases to be able to write any data to the database.
Each write will wait for all servers in the deployment to store it, hence all writes will be slow.
Achieving the same requirement for reads and writes is not physically possible. (You essentially want to be able to write a document in, say, India in 20ms and have it be "instantly" available in US, i.e. be able to retrieve it in US 20 ms later, but it takes a minimum of 500 ms for data to travel from India to US.)
I am currently reading up on some distributed systems design patterns. One of the designs patterns when you have to deal with a lot of data (billions of entires or multiple peta bytes) would be to spread it out across multiple servers or storage units.
One of the solutions for this is to use a Consistent hash. This should result in an even spread across all servers in the hash.
The concept is rather simple: we can just add new servers and only the servers in the range would be affected, and if you loose servers the remaining servers in the consistent hash would take over. This is when all servers in the hash have the same data (in memory, disk or database).
My question is how do we handle adding and removing servers from a consistent hash where there are so much data that it can't be stored on a single host. How do they figure out what data to store and what not too?
Example:
Let say that we have 2 machines running, "0" and "1". They are starting to reach 60% of their maximum capacity, so we decide to add an additional machine "2". Now a large part the data on machine 0 has to be migrated to machine 2.
How would we automate so this will happen without downtime and while being reliable.
My own suggested approach would be that the service hosing consistent hash and the machines would have be aware of how to transfer data between each other. When a new machine is added, will the consistent hash service calculate the affected hash ranges. Then inform the affect machine
of the affected hash range and that they need to transfer affected data to machine 2. Once the affected machines are done transferring their data, they would ACK back to the consistent hash service. Once all affected services are done transferring data, the consistent hash service would start sending data to machine 2, and inform the affected machine that they can remove their transferred data now. If we have peta bytes on each server can this process take a long time. We there for need to keep track of what entires where changes during the transfer so we can ensure to sync them after, or we can submit the write/updates to both machine 0 and 2 during the transfer.
My approach would work, but i feel it is a little risky with all the backs and forth, so i would like to hear if there is a better way.
How would we automate so this will happen without downtime and while being reliable?
It depends on the technology used to store your data, but for example in Cassandra, there is no "central" entity that governs the process and it is done like almost everything else; by having nodes gossiping with each other. There is no downtime when a new node joins the cluster (performance might be slightly impacted though).
The process is as follow:
The new node joining the cluster is defined as an empty node without system tables or data.
When a new node joins the cluster using the auto bootstrap feature, it will perform the following operations
- Contact the seed nodes to learn about gossip state.
- Transition to Up and Joining state (to indicate it is joining the cluster; represented by UJ in the nodetool status).
- Contact the seed nodes to ensure schema agreement.
- Calculate the tokens that it will become responsible for.
- Stream replica data associated with the tokens it is responsible for from the former owners.
- Transition to Up and Normal state once streaming is complete (to indicate it is now part of the cluster; represented by UN in the nodetool status).
Taken from https://thelastpickle.com/blog/2017/05/23/auto-bootstrapping-part1.html
So when the joining node is in the Joining State, it is receiving data from other nodes but not ready for reads until the process is complete (Up status).
DataStax also has some material on this https://academy.datastax.com/units/2017-ring-dse-foundations-apache-cassandra?path=developer&resource=ds201-datastax-enterprise-6-foundations-of-apache-cassandra
I am currently researching different databases to use for my next project. I was wanting to use a decentralized database. For example Apache Cassandra claims to be decentralized. MongoDB however says it uses replication. From what I can see, as far as these databases are concerned, replication and decentralization are basically the same thing. Is that correct or is there some difference/feature between decentralization and replication that I'm missing?
Short answer, no, replication and decentralization are two different things. As a simple example, let's say you have three instances (i1, i2 and i3) that replicate the same data. You also have a client that fetches data from only i1. If i1 goes down you will still have the data replicated to i2 and i3 as a backup. But since i1 is down the client has no way of getting the data. This an example of a centralized database with single point of failure.
A centralized database has a centralized location that the majority of requests goes through. It could, as in Mongo DB's case be instances that route queries to instances that can handle the query.
A decentralized database is obviously the opposite. In Cassandra any node in a cluster can handle any request. This node is called the coordinator for the request. The node then reads/writes data from/to the nodes that are responsible for that data before returning a result to the client.
Decentralization means that there should be no single point of failure in your application architecture. These systems will provide deployment scheme, where there's no leader (or master) elected during the service life-cycle. These are often deliver services in a peer-to-peer fashion.
Replication means, that simply your data is copied over to another server instance to ensure redundancy and failure tolerance. Client requests can still be served from copies, but your system should ensure some level of "consistency", when making copies.
Cassandra serves requests in a peer-to-peer fashion. Meaning that clients can initiate requests to any node participating in the cluster. It also provides replication and tunable consistency.
MongoDB offers master/slave deployment, so it's not considered as decentralized. You can deliver a multi-master, to ensure that requests can still be served if master node goes down. It also provides replication out-of-the box.
Links
Cassandra's tunable consistency
MongoDB's master-slave configuration
Introduction to Cassandra's architecture
We have a data system in which writes and reads can be made in a couple of geographic locations which have high network latency between them (crossing a few continents, but not this slow). We can live with 'last write wins' conflict resolution, especially since edits can't be meaningfully merged.
I'd ideally like to use a distributed system that allows fast, local reads and writes, and copes with the replication and write propagation over the slow connection in the background. Do the datacenter-aware features in e.g. Voldemort or Cassandra deliver this?
It's either this, or we roll our own, probably based on collecting writes using something like
rsync and sorting out the conflict resolution ourselves.
You should be able to get the behavior you're looking for using Voldemort. (I can't speak to Cassandra, but imagine that it's similarly possible using it.)
The key settings in the configuration will be:
replication-factor — This is the total number of times the data is stored. Each put or delete operation must eventually hit this many nodes. A replication factor of n means it can be possible to tolerate up to n - 1 node failures without data loss.
required-reads — The least number of reads that can succeed without throwing an exception.
required-writes — The least number of writes that can succeed without the client getting back an exception.
So for your situation, the replication would be set to whatever number made sense for your redundancy requirements, while both required-reads and required-writes would be set to 1. Reads and writes would return quickly, with a concomitant risk of stale or lost data, and the data would only be replicated to the other nodes afterwards.
I have no experience with Voldemort, so I can only comment on Cassandra.
You can deploy Cassandra to multiple datacenters with an inter-DC latency higher than a few milliseconds (see http://spyced.blogspot.com/2010/04/cassandra-fact-vs-fiction.html).
To ensure fast local reads, you can configure the cluster to replicate your data to a certain number of nodes in each datacenter (see "Network Topology Strategy"). For example, you specify that there should always be two replica in each data center. So even when you lose a node in a data center, you will still be able to read your data locally.
Write requests can be sent to any node in a Cassandra cluster. So for fast writes, your clients would always speak to a local node. The node receiving the request (the "coordinator") will replicate the data to other nodes (in other datacenters) in the background. If nodes are down, the write request will still succeed and the coordinator will replicate the data to the failed nodes at a later time ("hinted handoff").
Conflict resolution is based on a client-supplied timestamp.
If you need more than eventual consistency, Cassandra offers several consistency options (including datacenter-aware options).