What is meant by Distributed System? - distributed-computing

I am reading about distributed systems and am confused about what the term really means.
I understand that, at a high level, it means a set of different machines that work together to achieve a single goal.
But this definition seems too broad and loose. I would like to give some points to explain the reasons for my confusion:
I see a lot of people referring to microservices as a distributed system, where functionality such as Order, Payment, etc. is distributed across different services, whereas others refer to multiple instances of the Order service, which serve customers and possibly use some consensus algorithm to agree on shared state (e.g. the current inventory level).
When talking about distributed databases, I see a lot of people talk about different nodes, each of which stores/serves part of the user requests, e.g. records with primary keys 'A-C' on the first node, 'D-F' on the second node, etc. At a high level this looks like sharding.
When talking about distributed rate limiting, some refer to multiple application nodes (so-called distributed application nodes) using a single rate limiter, while others say that the rate limiter itself has multiple nodes with a shared cache (like Redis).
It feels like people use "distributed systems" to refer to microservices architecture, horizontal scaling, partitioning (sharding) and anything in between.

I am reading about distributed systems and am confused about what the term really means.
As commented by #ReinhardMänner, a good general definition of a distributed system (DS) is given at https://en.wikipedia.org/wiki/Distributed_computing
A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another from any system. The components interact with one another in order to achieve a common goal.
Anything that fits the above definition can be referred to as a DS. All the examples mentioned, such as microservices, distributed databases, etc., are specific applications of the concept or implementation details.
The statement "X is a distributed system" does not inherently imply any such details; they must be specified explicitly for each DS. E.g. a distributed database does not necessarily use sharding.

I'll also draw from Wikipedia, but I think that the second part of the quote is more important:
A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another from any system. The components interact with one another in order to achieve a common goal. Three significant challenges of distributed systems are: maintaining concurrency of components, overcoming the lack of a global clock, and managing the independent failure of components. When a component of one system fails, the entire system does not fail.
A system that constantly has to overcome these problems, even if all services are on the same node, or if they communicate via pipes/streams/files, is effectively a distributed system.
Now, trying to clear up your confusion:
Horizontal scaling was there with monoliths before microservices. Horizontal scaling is basically achieved by division of compute resources.
Division of compute requires dealing with synchronization, node failure, and multiple clocks. But that is still cheaper than scaling vertically. That's where you might turn to consensus, either by implementing it in the application, by using a dedicated service such as ZooKeeper, or by abusing a DB table for that purpose.
Monoliths present 2 problems that microservices solve: address-space dependency (i.e. someone's component may crash the whole process and thus your component) and long startup times.
While microservices solve these problems, these problems aren't what makes them into a "distributed system". It doesn't matter if the different processes/nodes run the same software (monolith) or not (microservices), it matters that they are different processes that can't easily communicate directly (e.g. via function calls that promise not to fail).
In databases, scaling horizontally is also cheaper than scaling vertically. The two components of horizontal DB scaling are division of compute (effectively, a distributed system) and division of storage (sharding), as you mentioned, e.g. A-C, D-F, etc.
Sharding of storage does not define distributed systems - a single compute node can handle multiple storage nodes. It's just that it's much more useful for a database that divides compute to also shard its storage, so you often see them together.
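To make the A-C / D-F idea concrete, here is a minimal sketch of range-based shard routing. The shard boundaries and node names are made up for illustration; real databases pick and rebalance ranges in their own ways.

```python
import bisect

# Hypothetical layout: each shard owns keys whose first letter falls in an
# inclusive range. Upper bounds are kept sorted so bisect can find the owner.
SHARD_UPPER_BOUNDS = ["C", "F", "Z"]          # A-C, D-F, G-Z
SHARD_NAMES = ["node-1", "node-2", "node-3"]  # made-up node names


def shard_for(key: str) -> str:
    """Return the node that stores the record with this primary key."""
    if not key:
        raise ValueError("empty key")
    first = key[0].upper()
    if not "A" <= first <= "Z":
        raise ValueError(f"unsupported key: {key!r}")
    # Index of the first upper bound that is >= the key's first letter.
    index = bisect.bisect_left(SHARD_UPPER_BOUNDS, first)
    return SHARD_NAMES[index]
```

Note that this only decides where data lives. Whether one compute node or many use the routing table is a separate question, which is the point made above.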
Distributed rate limiting falls under "maintaining concurrency of components". If every node does its own rate limiting, and they don't communicate, then the system-wide rate cannot be enforced. If they wait for each other to coordinate enforcement, they aren't concurrent.
Usually the solution is "approximate" rate limiting where components synchronize "occasionally".
If your components can't easily (= no latency) agree on a global rate limit, that's usually because they can't easily agree on a global anything. In that case, you're effectively dealing with a distributed system, even if all components are just threads in the same process.
(That could happen e.g. if you plan to scale out but haven't done so yet, so you don't allow your threads to communicate directly.)
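As a rough illustration of "approximate" rate limiting with occasional synchronization, here is a small sketch. The SharedCounter class is an in-memory stand-in for a shared store such as Redis, and all names and parameters (limit, sync_every) are assumptions for the example. It covers a single time window only.

```python
class SharedCounter:
    """In-memory stand-in for a shared store such as Redis (hypothetical)."""

    def __init__(self):
        self.total = 0

    def add_and_get(self, amount):
        self.total += amount
        return self.total


class ApproximateLimiter:
    """Per-node limiter that reports to the shared counter only occasionally."""

    def __init__(self, shared, limit, sync_every):
        self.shared = shared
        self.limit = limit
        self.sync_every = sync_every
        self.pending = 0      # admitted locally, not yet reported
        self.known_total = 0  # global total as of this node's last sync

    def allow(self):
        # The decision uses a possibly stale view of the global total.
        if self.known_total + self.pending >= self.limit:
            return False
        self.pending += 1
        if self.pending >= self.sync_every:
            # Occasional synchronization: report local work, refresh the view.
            self.known_total = self.shared.add_and_get(self.pending)
            self.pending = 0
        return True
```

With a single node this enforces the limit exactly. With several nodes the stale view lets the system overshoot the limit somewhat, which is the price of not coordinating on every request.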

Related

How to properly define and differentiate between nodes, processes, transactions & operations?

As part of my research I need to provide the reader with a comprehensive introduction to distributed systems. I am currently struggling with properly defining a number of the concepts that recur in the literature on distributed systems and transactions. These are (a) nodes, (b) processes, (c) transactions, and (d) operations. I could really use some help in understanding how they relate to each other, as I seem to continuously mix up nodes with processes and transactions with operations. Any input is appreciated!
I have already tried to grasp these concepts by researching the following literature:
Distributed Systems: Concepts and Design (G. Coulouris et al.)
A brief introduction to distributed systems (A.S. Tanenbaum)
I'm not sure exactly what ambiguity you perceive in these terms, which makes it harder to give the right answer. The terms have the same meaning in distributed systems terminology as in any other part of information technology.
To be more concrete:
A node is usually "a machine" that runs one or more processes. A process executes operations. Operations may be grouped into a transaction (a transaction is composed of operations).
I quickly searched the resources you referred to, and one of them says:
A computing element, which we will generally refer to as a node, can be either a hardware device or a software process.
The node runs processes. But the node itself can be real hardware (a machine) or a virtual machine, which is a process that runs on some machine (real hardware).
From the distributed-systems perspective it doesn't matter what the node is in reality (real hardware or virtual software); it is a "container" for running processes.
A process is "a runtime". It processes something: numbers, data, messages... The chunks of work processed inside the process are operations. E.g. you save data to a database, and you do it as an operation.
A transaction defines a unit of work that consists of several operations. The transaction gives you guarantees over those operations. What those guarantees are depends on the model you use. If you think about ACID transactions (as defined in the 1983 paper Principles of Transaction-Oriented Database Recovery), then you are guaranteed that either all operations are processed successfully or none of them is (A), that consistency is maintained (C), that parallel transactions do not interfere (I), and that the transaction outcome is persistent (D).
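To show the operation/transaction relationship in code, here is a small sketch using Python's built-in sqlite3 module. The account table and the transfer function are made up for the example. Two UPDATE operations are grouped into one transaction, and if the second one fails, the first one is rolled back as well (the "A" in ACID).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def balances(conn):
    """Return the current balances as a dict, for inspection."""
    return dict(conn.execute("SELECT name, balance FROM account"))


def transfer(conn, src, dst, amount):
    """Two operations in one transaction: both apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # May violate the CHECK constraint and raise IntegrityError.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

The credit is deliberately executed before the debit, so that a failed debit demonstrates the rollback of an operation that had already been applied.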

Why can't CP systems also be CAP?

My understanding of the CAP acronym is as follows:
Consistent: every read gets the most recent write
Available: every node is available
Partition Tolerant: the system can continue upholding A and C promises when the network connection between nodes goes down
Assuming my understanding is more or less on track, something is bothering me.
AFAIK, availability is achieved via any of the following techniques:
Load balancing
Replication to a disaster recovery system
So if I have a system that I already know is CP, why can't I "make it full CAP" by applying one of these techniques to make it available as well? I'm sure I'm missing something important here, just not sure what.
It's partition tolerance that you got wrong.
As long as there isn't any partitioning happening, systems can be consistent and available. There are CA systems which say, we don't care about partitions. You can have them running inside racks with server hardware and make partitioning extremely unlikely. The problem is, what if partitions occur?
The system can either choose to
continue providing the service, hoping the other server is down rather than providing the same service and serving different data - choosing availability (AP)
stop providing the service, because it couldn't guarantee consistency anymore, since it doesn't know if the other server is down or in fact up and running and just the communication between these two broke off - choosing consistency (CP)
The idea of the CAP theorem is that you cannot provide both Availability AND Consistency once partitioning occurs. You can either go for availability and hope for the best, or play it safe and be unavailable, but consistent.
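The choice can be sketched with a toy model of two replicas. Everything here (the Replica class, the "CP"/"AP" mode labels, the partition helper) is made up for illustration, and replication while the network is healthy is left out; the sketch only models what each replica does once it is cut off.

```python
class Replica:
    """One replica of a single value; mode is "CP" or "AP" (illustrative)."""

    def __init__(self, cluster_size, mode):
        self.cluster_size = cluster_size
        self.mode = mode
        self.reachable = cluster_size  # replicas it can reach, counting itself
        self.value = None

    def has_majority(self):
        return self.reachable > self.cluster_size // 2

    def write(self, value):
        # CP: refuse service when consistency can no longer be guaranteed.
        if self.mode == "CP" and not self.has_majority():
            raise RuntimeError("unavailable during partition")
        # AP: keep serving and hope the other side is not diverging.
        self.value = value


def partition(replicas):
    """Cut every replica off from all the others."""
    for replica in replicas:
        replica.reachable = 1
```

After partition(), two AP replicas both accept writes and can end up with different values, while two CP replicas both refuse writes because neither can see a majority.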
Here are 2 great posts, which should make it clear:
You Can’t Sacrifice Partition Tolerance shows the idea that every truly distributed system needs to deal with partitioning now and then, and hence CA systems will break instantly at the first occurrence of a partition
CAP Twelve Years Later: How the "Rules" Have Changed is slightly more up to date and presents the CAP theorem as more flexible: developers can choose how applications behave during partitioning and can sacrifice a bit of consistency to gain some availability, ...
So to finally answer your question: if you take a CP system and replicate it more often, you might run into the overhead of the messages sent between the nodes of the system to keep it consistent. And if a substantial part of the nodes fails, or a network partition occurs without any part having a clear majority, the system won't be able to continue operating, as it can no longer guarantee consistency. But yes, these lines are getting more blurred now, and I think the references I've provided will give you a much better understanding.

In Oracle RAC, will an application be faster, if there is a subset of the code using a separate Oracle service to the same database?

For example, I have an application that does a lot of audit trail writing. Lots. It slows things down. If I create a separate service on my Oracle RAC just for audit CRUD, would that help speed things up in my application?
In other words, I point most of the application to the main service listening on my RAC via SCAN. I take the subset of my application that does the audit trail data manipulation and point it to a separate service that points to the same schema as the main service.
As with anything else, it depends. You'd need to be a lot more specific about your application, what services you'd define, your workloads, your goals, etc. Realistically, you'd need to test it in your environment to know for sure.
A separate service could allow you to segregate the workload of one application (the one writing the audit trail) from the workload of other applications by having different sets of nodes in the cluster running each service (under normal operation). That can help ensure that the higher priority application (presumably not writing the audit trail) has a set amount of hardware to handle its workload even if the lower priority thread is running at full throttle. Of course, since all the nodes are sharing the same disk, if the bottleneck is disk I/O, that segregation of workload may not accomplish much.
Separating the services on different sets of nodes can also impact how frequently a particular service is getting blocks from the local node's buffer cache rather than requesting them from the other node and waiting for them to be shipped over the interconnect. It's quite possible that an application that is constantly writing to log tables might end up spending quite a bit of time waiting for a small number of hot blocks (such as the right-most block in the primary key index for the log table) to get shipped back and forth between different nodes. If all the audit records are being written on just one node (or on a smaller number of nodes), that hot block will always be available in the local buffer cache. On the other hand, if writing the audit trail involves querying the database to get information about a change, separating the workload may mean that blocks that were in the local cache (because they were just changed) are now getting shipped across the interconnect, and you could end up hurting performance.
Separating the services even if they're running on the same set of nodes may also be useful if you plan on managing them differently. For example, you can configure Oracle Resource Manager rules to give priority to sessions that use one service over another. That can be a more fine-grained way to allocate resources to different workloads than running the services on different nodes. But it can also add more overhead.

Difference between centralized and distributed computing

Can anyone tell me the differences between centralized and distributed computing?
Centralized
A system with a centralized multiprocessor parallel architecture. Since the late 1980s, centralized systems have been progressively replaced by distributed systems.
Characteristics of a centralized system
Non-autonomous components
Usually homogeneous technology
Multiple users share the same resources at all times
Single point of control
Single point of failure
Distributed
A set of tightly coupled programs executing on one or more computers, which are interconnected through a network and coordinate their actions. These programs know about one another and carry out tasks that none could carry out in isolation.
Characteristics of a distributed system
Autonomous components
Mostly built using heterogeneous technology
System components may be used exclusively
Concurrent processes can execute
Multiple points of failure
Requirements of a distributed system
Scalability: possibility of adding new hosts
Openness: easily extended and modified
Heterogeneity: supports various H/W and S/W platforms
Resource sharing: H/W, S/W and data
Fault tolerance: ability to function correctly even if faults occur
Centralized: all calculations are done on one particular computer (system). Example: you have a dedicated server for calculating data.
Distributed: the calculation is distributed to multiple computers. Example: when you have a large amount of data then you can divide it and send each part to particular computers which will make the calculations for their part.
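The "divide the data and let each computer calculate its part" idea can be sketched as follows. Threads stand in for separate computers here, which is an assumption made to keep the example self-contained; the function names are made up.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Cut data into at most `parts` contiguous chunks of similar size."""
    size = max(1, -(-len(data) // parts))  # ceiling division, at least 1
    return [data[i:i + size] for i in range(0, len(data), size)]


def distributed_sum(data, workers=3):
    """Sum each chunk on its own worker, then combine the partial results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)
```

In the centralized version the whole thing would simply be sum(data) on one machine. In a real distributed setup the workers would be reached over a network, so the combine step would also have to deal with slow or failed workers.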
Main basic differences are:
distributed systems have no global state
no shared memory
no shared variables
distributed systems have no shared time clock
therefore ordering events is difficult
distributed systems can have race conditions
for race conditions see http://en.wikipedia.org/wiki/Race_condition
So "computing" in a distributed environment is very difficult. Do you have a concrete question about programming models or something else?
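One well-known way to order events without a shared clock is a Lamport logical clock. This is my own addition rather than something the answer above prescribes, and the class below is only a minimal sketch of the idea.

```python
class LamportClock:
    """Logical clock: counts events instead of reading wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Record a local event."""
        self.time += 1
        return self.time

    def send(self):
        """Record a send event; the returned timestamp travels with the message."""
        return self.tick()

    def receive(self, message_time):
        """Record a receive event, moving past the sender's timestamp."""
        self.time = max(self.time, message_time) + 1
        return self.time
```

The guarantee is that if one event causally precedes another (for example, a send and its matching receive), the earlier one gets the smaller timestamp. Events that are unrelated may still be ordered arbitrarily.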
Centralized Systems
"In Centralized Systems, several jobs are done on a particular central processing unit (CPU)"
Distributed Systems
"In Distributed Systems, jobs are distributed among several processors. The processors are interconnected by a computer network"
(Sheheryar, NUML)
Briefly, Centralized computing, as the name itself depicts, is concerned with just a single server. The particular operation is being held at this server location and nowhere else.
Distributed computing is held where the system requirement is quite large, and the job is distributed to several processors and the solutions are then combined together, keeping in mind that the processors are interconnected by a computer network.
Centralized system: a system in which computing is done at a central location, using terminals attached to a central computer. In brief: a mainframe and dumb terminals, where all computation is done on the mainframe through the terminals.
Distributed system: a collection of independent computers that appears to its users as a single coherent system. The hardware is distributed, consisting of n processing elements (processor and memory). The software is also distributed: there is no centralized OS, each processing element has its own OS, there is no physically centralized file system, and inter-process communication is via message passing at the lowest level.
Big note: the main difference is reliability. In a distributed system, if one machine crashes, the system as a whole can still survive.
METHOD OF ARBITRATION: In all but the simplest systems, more than one module may need control of the bus.
In a centralized scheme, a single hardware device, referred to as a bus controller or arbiter, is responsible for allocating time on the bus.
In a distributed scheme, there is no central controller. Rather, each module contains access control logic and the modules act together to share the bus.
In a centralized system, if the server fails, the whole system is affected, because the server controls the whole operation.
In a distributed system, if one computer fails, it doesn't affect the operation of the other computers, because they are independent and their operation is distributed.
Let us try to understand this with an example.
Say you are carrying a large amount of money. You are in a crowded train, where your pocket may be picked and you might lose money. What is the ideal strategy for carrying money?
Put all money in a single pocket: In this case, it is easy for you to just put the money in the pocket and be done. When you go back home, you can simply take out money from the pocket and count it. But wait. What if your pocket is picked? You lose ALL the money (bankrupt? eh!). Seems like it is not the best idea to store all the money in a SINGLE pocket. Let us think what else we can do
Divide your money: Put some of it in the left pocket, put some in the right pocket and maybe put some in your bag (which has a limited capacity). You need to devise a strategy to divide the money you have. Also, when you go back home, you will have to spend time taking the money out of the different pockets and gathering it in one place. However, we are in a better situation now. If one of our pockets (or the bag) is picked, we do not lose ALL of the money. The chances of your bag, left pocket and right pocket all being picked are fairly low. With a little overhead of dividing money, you can now avoid losing all of your money. Isn’t that better?
This is how distributed systems work. They divide the information (money in your case) and keep it on different machines (pockets and bags for us). This way, if one of the machines goes down, we are not at a big loss. That is, we do not have a single point of failure.
Another important thing that distributed systems implement is data replication. They put replicas of the same information on multiple machines. This way, if one of the machines goes down, we do not lose the information. So, we now have something called fault tolerance.
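The replication idea can be sketched like this. The ReplicatedStore class is a toy with everything in one process, and its names are made up; real systems also have to bring crashed replicas back in sync, which is not shown.

```python
class ReplicatedStore:
    """Toy key-value store that keeps a copy of every item on each replica."""

    def __init__(self, replica_count):
        self.replicas = [dict() for _ in range(replica_count)]
        self.alive = [True] * replica_count

    def put(self, key, value):
        # Write the same information to every machine that is still up.
        for index, replica in enumerate(self.replicas):
            if self.alive[index]:
                replica[key] = value

    def get(self, key):
        # Any surviving replica that has the key can answer.
        for index, replica in enumerate(self.replicas):
            if self.alive[index] and key in replica:
                return replica[key]
        raise KeyError(key)

    def crash(self, index):
        """Simulate a machine going down and losing its data."""
        self.alive[index] = False
        self.replicas[index].clear()
```

As long as at least one replica holding the key survives, get() still returns the value. Only when every replica has crashed is the information lost.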

avoiding overuse of consensus protocols in a distributed system

I'm new to distributed systems, and I'm reading about "simple Paxos". It creates a lot of chatter and I'm thinking about performance implications.
Let's say you're building a globally-distributed database, with several small-ish clusters located in different locations. It seems important to minimize the amount of cross-site communication.
What are the decisions you definitely need to use consensus for? The only one I thought of for sure was deciding whether to add or remove a node (or set of nodes?) from the network. It seems like this is necessary for vector clocks to work. Another I was less sure about was deciding on an ordering for writes to the same location, but should this be done by a leader which is elected via Paxos?
It would be nice to avoid having all nodes in the system making decisions together. Could a few nodes at each local cluster participate in cross-cluster decisions, and all local nodes communicate using a local Paxos to determine local answers to cross-site questions? The latency would be the same assuming the network is not saturated, but the cross-site network traffic would be much lighter.
Let's say you can split your database's tables along rows, and assign each subset of rows to a subset of nodes. Is it normal to elect a set of nodes to contain each subset of the data using Paxos across all machines in the system, and then only run Paxos between those nodes for all operations dealing with that subset of data?
And a catch-all: are there any other design-related or algorithmic optimizations people are doing to address this?
Good questions, and good insights!
It creates a lot of chatter and I'm thinking about performance implications.
Let's say you're building a globally-distributed database, with several small-ish clusters located in different locations. It seems important to minimize the amount of cross-site communication.
What are the decisions you definitely need to use consensus for? The only one I thought of for sure was deciding whether to add or remove a node (or set of nodes?) from the network. It seems like this is necessary for vector clocks to work. Another I was less sure about was deciding on an ordering for writes to the same location, but should this be done by a leader which is elected via Paxos?
Yes, performance is a problem that my team has seen in practice as well. We maintain a consistent database & distributed lock manager, and originally used Paxos for all writes, some reads and cluster membership updates.
Here are some of the optimizations we did:
As much as possible, nodes sent the transitions to a Distinguished Proposer/Learner (elected via Paxos), which
decided on write ordering, and
batched transitions while waiting for the response from the prior instance. (But batching too much also caused problems.)
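The batching optimization could look roughly like the sketch below. This is not our actual code: the class, its method names and the max_batch cap are made up for illustration, and the consensus round itself is reduced to an on_decided() callback.

```python
from collections import deque


class BatchingProposer:
    """Queues transitions while one consensus instance is in flight."""

    def __init__(self, max_batch):
        self.max_batch = max_batch  # cap, since batching too much also hurts
        self.queue = deque()
        self.in_flight = None       # batch currently being decided, if any
        self.decided = []           # decided batches, in order

    def submit(self, transition):
        self.queue.append(transition)
        self._maybe_propose()

    def on_decided(self):
        """Called when the in-flight instance has been decided."""
        if self.in_flight is None:
            return
        self.decided.append(self.in_flight)
        self.in_flight = None
        self._maybe_propose()

    def _maybe_propose(self):
        # Start a new instance only when none is pending; everything that
        # queued up in the meantime goes out together as one batch.
        if self.in_flight is None and self.queue:
            count = min(self.max_batch, len(self.queue))
            self.in_flight = [self.queue.popleft() for _ in range(count)]
```

The first transition goes out alone, and whatever arrives while it is being decided is proposed as a single batch afterwards, so the number of consensus rounds grows more slowly than the number of writes.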
We had considered using multi-paxos but we ended up doing something cooler (see below).
With these optimizations, we were still hurting for performance, so we split our server into three layers. The bottom layer is Paxos; it does what you suggest, viz. it merely decides the node membership of the middle layer. The middle layer is a custom, in-house, high-speed chain consensus protocol, which does consensus & ordering for the DB. (BTW, chain consensus can be viewed as Vertical Paxos.) The top layer now just maintains the database/locks & client connections. This design has led to several orders of magnitude of latency and throughput improvement.
It would be nice to avoid having all nodes in the system making decisions together. Could a few nodes at each local cluster participate in cross-cluster decisions, and all local nodes communicate using a local Paxos to determine local answers to cross-site questions? The latency would be the same assuming the network is not saturated, but the cross-site network traffic would be much lighter.
Let's say you can split your database's tables along rows, and assign each subset of rows to a subset of nodes. Is it normal to elect a set of nodes to contain each subset of the data using Paxos across all machines in the system, and then only run Paxos between those nodes for all operations dealing with that subset of data?
These two together remind me of the Google Spanner paper. If you skip over the parts about time, it's essentially doing 2PC globally and Paxos on the shards. (IIRC.)
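For reference, the 2PC half of that combination can be sketched as below. The Participant class and its fields are made up for the example. In the design described above each participant would itself be a Paxos-replicated shard; here it is just a plain object, and coordinator failure is not handled.

```python
class Participant:
    """One shard taking part in a two-phase commit (illustrative only)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        """Phase 1: vote yes only if the local work can be made durable."""
        self.state = "prepared" if self.can_commit else "refused"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: commit everywhere only if every participant votes yes."""
    votes = [p.prepare() for p in participants]  # phase 1: collect votes
    if all(votes):
        for p in participants:                   # phase 2: commit
            p.commit()
        return True
    for p in participants:                       # phase 2: abort
        p.abort()
    return False
```

The point of the split is that the expensive cross-site round only happens for transactions that touch several shards, while ordering within one shard stays inside that shard's own group.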