RDMA atomic operations' implementation

I have heard that RDMA reads and writes are implemented roughly like this: when a request arrives, the NIC finds the physical page and then uses DMA to move the data to the NIC and on to the target.
That is straightforward for reads and writes, but it seems odd for atomic operations. My question is: are RDMA atomic operations implemented the same way as reads and writes, and how? More specifically, what is the relationship between the CPU's atomic operations (like compare-and-swap) and RDMA's corresponding operations?

RDMA atomic operations are implemented using PCI Express read and write operations. As such, they do not provide atomicity with respect to the CPU's atomic operations, nor with respect to other HCAs.
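For concreteness, here is a minimal sketch of how such an operation is posted through the libibverbs API. The function name and the connection setup are assumptions; only the work-request fields belong to the real API, and they assume an already-connected RC queue pair plus a registered 8-byte local buffer and a remote address/rkey obtained out of band.

    /* Sketch: post an RDMA atomic compare-and-swap with libibverbs. */
    #include <infiniband/verbs.h>
    #include <stdint.h>

    int post_atomic_cas(struct ibv_qp *qp,
                        struct ibv_mr *local_mr, uint64_t *local_buf,
                        uint64_t remote_addr, uint32_t rkey,
                        uint64_t expected, uint64_t desired)
    {
        struct ibv_sge sge = {
            .addr   = (uintptr_t)local_buf,   /* old remote value lands here */
            .length = sizeof(uint64_t),       /* RDMA atomics operate on 8 bytes */
            .lkey   = local_mr->lkey,
        };
        struct ibv_send_wr wr = {
            .wr_id      = 1,
            .sg_list    = &sge,
            .num_sge    = 1,
            .opcode     = IBV_WR_ATOMIC_CMP_AND_SWP,
            .send_flags = IBV_SEND_SIGNALED,
        };
        wr.wr.atomic.remote_addr = remote_addr;  /* must be 8-byte aligned */
        wr.wr.atomic.rkey        = rkey;
        wr.wr.atomic.compare_add = expected;     /* swap only if remote == expected */
        wr.wr.atomic.swap        = desired;

        struct ibv_send_wr *bad_wr = NULL;
        return ibv_post_send(qp, &wr, &bad_wr);  /* completion arrives on the send CQ */
    }

Consistent with the answer above, such a swap is atomic only relative to other atomic operations handled by the same HCA, not relative to a compare-and-swap executed by the host CPU.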

Related

Does paxos provide true linearizable consistency or not?

I think I might be confusing concepts here, but it seems to me like Paxos would provide linearizable consistency for systems that implement it.
I know Cassandra uses it. I'm not 100% clear on how, but assuming a leader is elected and that single leader does all the writes, communication is synchronous and real-time linearizability is achieved, right?
But consensus algorithms like Paxos are generally considered partially synchronous because they only require a quorum (not communication with 100% of nodes). Does that also mean they're not truly linearizable?
Maybe because only a quorum is required, a node could fall out of sync, and that would break linearizability?
A linearizable system does not need to be synchronous. Linearizability is a safety property: it says "nothing bad happens" but it doesn't affect linearizability if nothing good happens either. Any reads or writes that do not return (or that return an error) can be ignored when checking for linearizability. This means it's perfectly possible for a system to be linearizable even if one or more of the nodes are faulty or partitioned or running slowly.
Paxos is commonly used to implement a replicated state machine: a system that executes a sequence of operations on multiple nodes at once. Since the operations are deterministic and the nodes all agree on the operations to run and the sequence in which to run them, the nodes all converge to the same state (eventually).
You can implement a linearizable system using Paxos by having the operations in the sequence be reads and writes, using the fact that the operations are placed in a totally-ordered sequence (i.e. linearized) by the Paxos protocol.
It's important to put the reads in the sequence as well as the writes. Imagine instead you only used Paxos to agree on the writes, and served reads directly from a node's local state. If the node serving the reads is partitioned from the other nodes then it would serve stale reads, violating linearizability. Each read must involve a quorum of nodes to ensure that the returned value is fresh, which means (effectively) putting the read into the sequence alongside the writes.
(There are some tricks you can play to make reads a bit more efficient than writes, given that reads commute with each other and don't need to be persisted to disk, but you can't escape the need to contact a quorum of nodes for both read and write operations.)
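As a toy illustration of the state-machine side of this (not a Paxos implementation; the structures and names are made up for the example), here is a C sketch in which reads are entries in the agreed sequence, so a read sequenced after a write cannot return a stale value:

    /* Every replica applies the same totally ordered log of commands; a read's
     * result is whatever value the register holds at the read's slot. */
    #include <stdio.h>

    enum op_kind { OP_WRITE, OP_READ };

    struct command {
        enum op_kind kind;
        int value;        /* used only for OP_WRITE */
    };

    /* Apply the log the consensus protocol agreed on; identical on every node. */
    static void apply_log(const struct command *log, int n)
    {
        int reg = 0;  /* the replicated register's state */
        for (int i = 0; i < n; i++) {
            if (log[i].kind == OP_WRITE)
                reg = log[i].value;
            else
                printf("read at slot %d returns %d\n", i, reg);
        }
    }

    int main(void)
    {
        /* The total order linearizes the operations: a read sequenced after a
         * write must observe it. */
        struct command log[] = {
            { OP_WRITE, 42 },
            { OP_READ,  0  },   /* returns 42 */
            { OP_WRITE, 7  },
            { OP_READ,  0  },   /* returns 7 */
        };
        apply_log(log, 4);
        return 0;
    }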

atomic operation definition and multiprocessor

I'm learning about synchronization and I'm now confused about the definition of an atomic operation. Through searching, I could only find that an atomic operation is an uninterruptible operation.
If that's the case, wouldn't atomic operations only be meaningful on a uniprocessor system, since on a multiprocessor system many operations can run simultaneously?
This link explains it pretty much perfectly (emphasis mine):
On multiprocessor systems, ensuring atomicity exists is a little harder. It is still possible to use a lock (e.g. a spinlock) the same as on single processor systems, but merely using a single instruction or disabling interrupts will not guarantee atomic access. You must also ensure that no other processor or core in the system attempts to access the data you are working with. The easiest way to achieve this is to ensure that the instructions you are using assert the 'LOCK' signal on the bus, which prevents any other processor in the system from accessing the memory at the same time. On x86 processors, some instructions automatically lock the bus (e.g. 'XCHG') while others require you to specify a 'LOCK' prefix to the instruction to achieve this (e.g. 'CMPXCHG', which you should write as 'LOCK CMPXCHG op1, op2').
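For reference, this is roughly how that looks from portable C11 rather than hand-written assembly; the compare-and-swap call is what a compiler typically lowers to a locked CMPXCHG on x86 (the spinlock itself is only an illustrative sketch):

    #include <stdatomic.h>

    static atomic_int lock_word = 0;   /* 0 = free, 1 = held */

    void spin_lock(void)
    {
        int expected = 0;
        /* Retry until we atomically change 0 -> 1; the locked compare-and-swap
         * keeps every other processor out of the location while it runs. */
        while (!atomic_compare_exchange_weak(&lock_word, &expected, 1))
            expected = 0;   /* on failure, 'expected' is overwritten with the
                               current value, so reset it before retrying */
    }

    void spin_unlock(void)
    {
        atomic_store(&lock_word, 0);   /* a plain atomic store releases the lock */
    }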

Atomic Instructions | Maintain Data Consistency

Atomic instructions are those which execute as a whole and cannot be interrupted.
Is it also necessary that the data they operate on isn't manipulated during execution, i.e. by an instruction executing on another core that accesses the atomic instruction's data?
I'm currently enrolled in the "Operating Systems" course at my college.
Is it also necessary that the data it operates on isn't manipulated during execution?
Yes.
And that is why such instructions can be expensive to execute, possibly taking hundreds of cycles, since they may involve locking the CPUs' buses and checking that no other CPUs (not just other cores: multi-socket systems have to be included) are accessing the affected memory.
See also this answer.
There are two concepts at work here:
1) Atomic instructions are those that a processor cannot interrupt.
2) Interlocked instructions are those that lock the memory bus and invalidate [sections of] the CPU caches.
An interlocked instruction will always be atomic. An atomic instruction might not be (and often is not) interlocked.
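A small C11 sketch of that distinction, assuming a mainstream CPU such as x86: an aligned word-sized load is atomic without being interlocked, while a read-modify-write such as fetch-and-add has to be interlocked.

    #include <stdatomic.h>

    atomic_int counter = 0;

    int read_counter(void)
    {
        /* Atomic but not interlocked: a single aligned load can never observe
         * a torn value, and it locks nothing. */
        return atomic_load(&counter);
    }

    void bump_counter(void)
    {
        /* Interlocked: on x86 this typically becomes LOCK XADD, which keeps
         * other processors away from the location for the whole
         * read-modify-write. */
        atomic_fetch_add(&counter, 1);
    }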

Can single-processor systems handle multi-level queue scheduling?

I know that in Asymmetric multiprocessing one processor can make all the scheduling decisions whilst the others execute user code only. But is it possible for a single-processor system to allow for multi-level queue scheduling? And why?
Certainly a single processor system can use multi-level queue scheduling (MLQS). The MLQS algorithm is used to decide which process to run next when a processor becomes available. The algorithm doesn't require that there be more than one processor in the system. As a matter of fact, the algorithm is most efficient if there is only one processor. In a multi-processor system the data structure would need some sort of locking to prevent it from being corrupted.
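A minimal sketch of what that decision looks like on a single processor, with the structures and names invented for illustration: the scheduler just scans the fixed-priority levels and dispatches the first ready process it finds, and no locking is needed for the algorithm itself.

    #include <stddef.h>

    #define NUM_LEVELS 3   /* e.g. 0 = system, 1 = interactive, 2 = batch */

    struct process {
        int pid;
        struct process *next;
    };

    /* One ready queue per level. On a uniprocessor it is enough to avoid
     * preemption (e.g. disable interrupts) while the scheduler runs. */
    static struct process *ready_queue[NUM_LEVELS];

    struct process *pick_next(void)
    {
        for (int level = 0; level < NUM_LEVELS; level++) {
            if (ready_queue[level] != NULL) {
                struct process *p = ready_queue[level];
                ready_queue[level] = p->next;   /* dequeue from the front */
                return p;
            }
        }
        return NULL;   /* nothing runnable: idle */
    }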

MongoDB Concurrency

I'm a little confused about MongoDB's read locking. How many concurrent read operations can a single collection support?
As written in the link given by tk: http://www.mongodb.org/pages/viewpage.action?pageId=7209296
Read/Write Lock
mongod uses a read/write lock for many operations. Any number of concurrent read operations are allowed, but typically only one write operation (although some write operations yield, and in the future more concurrency will be added). The write lock acquisition is greedy: a pending write lock acquisition will prevent further read lock acquisitions until fulfilled.
It's a typical read/write lock. You can check here: http://www.mongodb.org/pages/viewpage.action?pageId=7209296
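To make the quoted behaviour concrete, here is a sketch (not MongoDB's actual code) of a writer-greedy read/write lock built on pthreads, in which a pending writer blocks new readers until it is served:

    #include <pthread.h>

    struct rwlock {
        pthread_mutex_t m;
        pthread_cond_t  ok_to_read;
        pthread_cond_t  ok_to_write;
        int readers;          /* active readers */
        int writer;           /* 1 if a writer holds the lock */
        int writers_waiting;  /* pending write acquisitions */
    };

    #define RWLOCK_INITIALIZER \
        { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, \
          PTHREAD_COND_INITIALIZER, 0, 0, 0 }

    void read_lock(struct rwlock *l)
    {
        pthread_mutex_lock(&l->m);
        /* Block new readers while a writer is active OR waiting (greedy writers). */
        while (l->writer || l->writers_waiting > 0)
            pthread_cond_wait(&l->ok_to_read, &l->m);
        l->readers++;
        pthread_mutex_unlock(&l->m);
    }

    void read_unlock(struct rwlock *l)
    {
        pthread_mutex_lock(&l->m);
        if (--l->readers == 0)
            pthread_cond_signal(&l->ok_to_write);
        pthread_mutex_unlock(&l->m);
    }

    void write_lock(struct rwlock *l)
    {
        pthread_mutex_lock(&l->m);
        l->writers_waiting++;
        while (l->writer || l->readers > 0)
            pthread_cond_wait(&l->ok_to_write, &l->m);
        l->writers_waiting--;
        l->writer = 1;
        pthread_mutex_unlock(&l->m);
    }

    void write_unlock(struct rwlock *l)
    {
        pthread_mutex_lock(&l->m);
        l->writer = 0;
        pthread_cond_signal(&l->ok_to_write);     /* prefer a waiting writer */
        pthread_cond_broadcast(&l->ok_to_read);   /* otherwise wake the readers */
        pthread_mutex_unlock(&l->m);
    }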