What does it mean for Mach ports to be "globally unique"?

I read here in the GNU Hurd docs that Mach ports are "globally unique." I know that one of the design goals for Mach was for it to be easily used in distributed computing environments, so does this mean Mach ports are globally unique across all Mach systems at any given point in time? This would mean that Mach ports can be used as references to objects that live on remote Mach systems, which is convenient. But how is this implemented?
Or does this statement mean Mach ports are only guaranteed to be globally unique on a single Mach system at any given point in time?

Related

Is PCI "CF8h/CFCh" IO port addresses only applicable to processors with an IO address space?

Some CPUs, like x86 processors, have two address spaces: one for memory and one for IO, with different instructions to access each.
The PCI 3.0 spec also mentions some important IO addresses:
Two DWORD I/O locations are used to generate configuration transactions for PC-AT compatible systems. The first DWORD location (CF8h) references a read/write register that is named CONFIG_ADDRESS. The second DWORD address (CFCh) references a read/write register named CONFIG_DATA.
So it seems the PCI 3.0 spec is tightly coupled to processors that implement an IO address space, and that this is the a priori knowledge that SW/FW writers should have.
So what about other processor architectures that don't have an IO address space, like ARM? How can they interact with the PCI configuration space?
The paragraph immediately preceding the one quoted in the question directly addresses the question. It says:
Systems must provide a mechanism that allows software to generate PCI configuration transactions. ...
For PC-AT compatible systems, the mechanism for generating configuration transactions is defined and specified in this section. ...
For other system architectures, the method of generating configuration transactions is not defined in this specification.
In other words, systems that are not PC-AT compatible must provide a mechanism, but it is specified elsewhere. The PCI spec isn't tightly coupled to PC-AT systems, but it doesn't define the mechanism for other types of systems.
The paragraph in the question only applies to PC-AT compatible systems.
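For concreteness, here is a minimal sketch of that PC-AT mechanism in C, assuming an x86 Linux host where the iopl()/outl()/inl() helpers from <sys/io.h> are available and the program runs with sufficient privilege; on a real system you would normally let the OS do this (e.g. via setpci or sysfs) rather than touching the ports yourself.

/* Sketch of a PCI configuration read using the legacy PC-AT mechanism
 * (CONFIG_ADDRESS at 0xCF8, CONFIG_DATA at 0xCFC). Assumes an x86 Linux
 * host and root privileges; real code would normally go through the OS.
 */
#include <stdio.h>
#include <stdint.h>
#include <sys/io.h>   /* iopl(), outl(), inl() -- x86 Linux only */

static uint32_t pci_cfg_read32(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t off)
{
    /* Bit 31 enables the transaction; bits 23:16 bus, 15:11 device,
     * 10:8 function, 7:2 register number (dword-aligned offset). */
    uint32_t addr = (1u << 31) | ((uint32_t)bus << 16) |
                    ((uint32_t)dev << 11) | ((uint32_t)fn << 8) |
                    (off & 0xFC);
    outl(addr, 0xCF8);      /* write CONFIG_ADDRESS */
    return inl(0xCFC);      /* read  CONFIG_DATA    */
}

int main(void)
{
    if (iopl(3) != 0) {     /* need I/O privilege for port access */
        perror("iopl");
        return 1;
    }
    /* Read the vendor/device ID of bus 0, device 0, function 0. */
    uint32_t id = pci_cfg_read32(0, 0, 0, 0x00);
    printf("vendor=%04x device=%04x\n", id & 0xFFFF, id >> 16);
    return 0;
}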
The quote below from here clears things up:
The method for generating configuration cycles is host dependent. In IA machines, special I/O ports are used. On other platforms, the PCI configuration space can be memory-mapped to certain address locations corresponding to the PCI host bridge in the host address domain.
And
I/O space can be accessed differently on different platforms. Processors with special I/O instructions, like the Intel processor family, access the I/O space with in and out instructions. Machines without special I/O instructions will map to the address locations corresponding to the PCI host bridge in the host address domain. When the processor accesses the memory-mapped addresses, an I/O request will be sent to the PCI host bridge, which then translates the addresses into I/O cycles and puts them on the PCI bus.
So for non-IA platforms, MMIO can simply be used instead, and the platform specs should document the memory-mapped address of the PCI host bridge as the a priori knowledge for SW/FW writers.
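As a rough illustration of that memory-mapped approach, here is a sketch of an ECAM-style configuration read. The ECAM_BASE value is invented for the example; on a real platform it would come from the device tree, the ACPI MCFG table, or the SoC manual.

/* Sketch of memory-mapped PCI configuration access (ECAM-style), as used
 * on platforms without a separate I/O address space. ECAM_BASE is a
 * hypothetical host-bridge window address.
 */
#include <stdint.h>

#define ECAM_BASE 0x40000000UL   /* hypothetical -- comes from the platform */

static inline uint32_t pci_ecam_read32(uint8_t bus, uint8_t dev,
                                       uint8_t fn, uint16_t off)
{
    /* ECAM layout: bus[27:20], device[19:15], function[14:12], offset[11:0] */
    uintptr_t addr = ECAM_BASE |
                     ((uintptr_t)bus << 20) |
                     ((uintptr_t)dev << 15) |
                     ((uintptr_t)fn  << 12) |
                     (off & 0xFFC);
    /* A plain load; the host bridge turns it into a configuration cycle. */
    return *(volatile uint32_t *)addr;
}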
ADD 1 - 14:36 2023/2/5
From the digital design perspective, the host CPU and the PCIe subsystem are just two separate IP blocks, and the communication between them is carried by a bunch of digital signals in the form of address/data/control lines. As long as the signals can be conveyed, the communication can be made.
For x86 CPUs, the memory address space and IO address space are, at bottom, just different uses of the address lines. I don't think there's any strong reason that memory addresses cannot be used to communicate with the PCIe subsystem; using I/O addresses for PCIe was simply the more logical choice back then, because PCIe was deemed I/O.
So the really critical thing, I think, is to convey the digital signals in the proper format between IPs. PCIe is independent of CPU architectures and cares nothing about which lines are used. For ARM, there's nothing unnatural about using memory addresses, i.e., MMIO. After all, they are digital signals and are capable of passing the necessary information properly.

What is meant by Distributed System?

I am reading about distributed systems and getting confused about what it really means.
I understand that, at a high level, it means a set of different machines that work together to achieve a single goal.
But this definition seems too broad and loose. I would like to give some points to explain the reasons for my confusion:
I see a lot of people referring to microservices as a distributed system, where functionalities like Order, Payment, etc. are distributed across different services, whereas others refer to multiple instances of the Order service that try to serve customers and possibly use some consensus algorithm to agree on shared state (e.g. the current inventory level).
When talking about distributed databases, I see a lot of people talk about different nodes that each store/serve a part of the data, such as records with primary keys from 'A-C' on the first node, 'D-F' on the second node, etc. At a high level this looks like sharding.
When talking about distributed rate limiting, some refer to multiple application nodes (so-called distributed application nodes) using a single rate limiter, while others mean that the rate limiter itself has multiple nodes with a shared cache (like Redis).
It feels like people use "distributed systems" to refer to microservices architecture, horizontal scaling, partitioning (sharding), and anything in between.
I am reading about distributed systems and getting confused about what it really means.
As commented by #ReinhardMänner, a good general definition of a distributed system (DS) is at https://en.wikipedia.org/wiki/Distributed_computing
A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another from any system. The components interact with one another in order to achieve a common goal.
Anything that fits the above definition can be referred to as a DS. All the mentioned examples, such as micro-services, distributed databases, etc., are specific applications of the concept or implementation details.
The statement "X is a distributed system" does not inherently imply any of those details; they must be explicitly specified for each DS. For example, a distributed database does not necessarily mean the use of sharding.
I'll also draw from Wikipedia, but I think that the second part of the quote is more important:
A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another from any system. The components interact with one another in order to achieve a common goal. Three significant challenges of distributed systems are: maintaining concurrency of components, overcoming the lack of a global clock, and managing the independent failure of components. When a component of one system fails, the entire system does not fail.
A system that constantly has to overcome these problems, even if all services are on the same node, or if they communicate via pipes/streams/files, is effectively a distributed system.
Now, trying to clear up your confusion:
Horizontal scaling was there with monoliths before microservices. Horizontal scaling is basically achieved by division of compute resources.
Division of compute requires dealing with synchronization, node failure, and multiple clocks, but that is still cheaper than scaling vertically. That's where you might turn to consensus, either by implementing it in the application, by using a dedicated service such as ZooKeeper, or by abusing a DB table for that purpose.
Monoliths present 2 problems that microservices solve: address-space dependency (i.e. someone's component may crash the whole process and thus your component) and long startup times.
While microservices solve these problems, these problems aren't what makes them into a "distributed system". It doesn't matter if the different processes/nodes run the same software (monolith) or not (microservices), it matters that they are different processes that can't easily communicate directly (e.g. via function calls that promise not to fail).
In databases, scaling horizontally is also cheaper than scaling vertically. The two components of horizontal DB scaling are division of compute (effectively, a distributed system) and division of storage (sharding), as you mentioned, e.g. A-C, D-F, etc.
Sharding of storage does not define distributed systems - a single compute node can handle multiple storage nodes. It's just that it's much more useful for a database that divides compute to also shard its storage, so you often see them together.
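As a toy illustration of the A-C / D-F style of range sharding mentioned above (the shard count and ranges are made up purely for the example):

/* Minimal sketch of range-based sharding: records are routed to a storage
 * node by the first letter of their key ('A'-'C' to shard 0, 'D'-'F' to
 * shard 1, and so on). Shard count and ranges are illustrative only.
 */
#include <ctype.h>

#define NUM_SHARDS 9   /* hypothetical: 26 letters, 3 letters per shard */

static int shard_for_key(const char *key)
{
    int c = toupper((unsigned char)key[0]);
    if (c < 'A' || c > 'Z')
        return 0;                  /* fallback shard for non-alphabetic keys */
    return (c - 'A') / 3;          /* 'A'-'C' -> 0, 'D'-'F' -> 1, ... */
}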
Distributed rate limiting falls under "maintaining concurrency of components". If every node does its own rate limiting, and they don't communicate, then the system-wide rate cannot be enforced. If they wait for each other to coordinate enforcement, they aren't concurrent.
Usually the solution is "approximate" rate limiting where components synchronize "occasionally".
If your components can't easily (= with no latency) agree on a global rate limit, that's usually because they can't easily agree on a global anything. In that case, you're effectively dealing with a distributed system, even if all components are just threads in the same process.
(that could happen e.g. if you plan to scale out but haven't done so yet, so you don't allow your threads to communicate directly.)
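As a rough sketch of that "approximate" approach, here is what a per-node limiter that only synchronizes occasionally might look like; sync_global_share() is a hypothetical stand-in for a call to a shared store such as Redis.

/* Minimal sketch of approximate distributed rate limiting: each node
 * enforces a local share of the global limit and only synchronizes
 * occasionally, so most decisions are purely local.
 */
#include <stdbool.h>
#include <time.h>

#define SYNC_INTERVAL_SEC 5

static long   local_allowance;     /* requests this node may still admit */
static time_t last_sync;

/* Hypothetical: asks the shared store for this node's share of the
 * remaining global budget for the current window. */
extern long sync_global_share(void);

static bool allow_request(void)
{
    time_t now = time(NULL);
    if (now - last_sync >= SYNC_INTERVAL_SEC) {
        local_allowance = sync_global_share();   /* occasional sync */
        last_sync = now;
    }
    if (local_allowance > 0) {
        local_allowance--;                       /* purely local decision */
        return true;
    }
    return false;                                /* over the (approximate) limit */
}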

How does the CPU generate logical addresses?

The CPU generates logical addresses. These logical addresses are then converted into physical addresses by a special unit, the MMU. This is written in many books, including Galvin (slides 6-7).
But I want to know how the CPU generates a logical address, and what that means.
It is just a simplification.
The CPU doesn't generate logical addresses. They are stored in your executable file; the CPU reads your program and extracts these addresses.
Here (slide 7) Galvin says:
In MMU scheme, the value in the relocation register is added to every address generated by a user process at the time it is sent to memory.
The user program deals with logical addresses; it never sees the real physical addresses.
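To make the quoted scheme concrete, here is a tiny simulation of what the hardware does with a relocation (base) register and a limit register; the register values are invented for the example.

/* Tiny illustration of the relocation-register scheme quoted above: the
 * MMU adds the relocation (base) register to every logical address and
 * checks it against a limit register. Values are made up for the example.
 */
#include <stdio.h>
#include <stdint.h>

#define RELOCATION_REG 0x14000u   /* hypothetical base of the process image */
#define LIMIT_REG      0x08000u   /* hypothetical size of the process image */

static uint32_t translate(uint32_t logical)
{
    if (logical >= LIMIT_REG) {
        fprintf(stderr, "trap: address out of range\n");
        return 0;
    }
    return RELOCATION_REG + logical;   /* what the MMU does in hardware */
}

int main(void)
{
    printf("logical 0x0346 -> physical 0x%x\n", (unsigned)translate(0x0346));
    return 0;
}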
The CPU does not generate logical addresses. Logical to physical address mapping is defined by the operating system. The operating system sets up page tables that define the mapping.
The processor defines the structure of the page tables. The operating system defines the content of the page tables.
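A minimal sketch of that split, assuming a single-level table and 4 KiB pages purely for illustration:

/* Sketch of the page-table idea: the OS fills a table mapping virtual page
 * numbers to physical frame numbers, and the hardware walks it on every
 * access. Single-level table and 4 KiB pages are assumed for simplicity.
 */
#include <stdint.h>

#define PAGE_SHIFT 12                    /* 4 KiB pages */
#define NUM_PAGES  1024                  /* tiny address space for the example */

static uint32_t page_table[NUM_PAGES];   /* filled in by the "OS": vpn -> frame */

static uint32_t translate(uint32_t virt)
{
    uint32_t vpn    = virt >> PAGE_SHIFT;          /* virtual page number */
    uint32_t offset = virt & ((1u << PAGE_SHIFT) - 1);
    uint32_t frame  = page_table[vpn];             /* structure: defined by the CPU */
    return (frame << PAGE_SHIFT) | offset;         /* content: defined by the OS   */
}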

PLC capability and operating principles

I come from a C/C++ background, a lot of which has been in an embedded systems context. None of those embedded systems have involved PLCs; it never made sense to have one CPU doing all its C/C++ logic, then surrendering control of the I/O to some other device when (usually) you could just do it yourself because the I/O was directly connected to your CPU.
With the advent of EtherCAT, we are seeing advantages in moving our I/O onto EtherCAT for its flexibility, modularity, etc. However, the preferred mode of driving a lot of the EtherCAT hardware seems to be via a PLC. In the case of the Beckhoff TwinCAT PLC environment, trying to bypass the PLC seems to be either technically difficult or expensive or both.
Which makes us want to know a lot of things about PLCs... starting with:
is it best to think about them as a serial processing device, parallel processing device, or neither (does it depend on the specific device)?
are they a "Turing Complete" general purpose computing device, or do they have limitations?
do they run the entire PLC program (loops and all) every PLC cycle?
if the PLC I/O is not controlling some industrial process under the supervision of a maintenance department, and/or takes place on millisecond timescales, might those be good reasons to make full use of more modern programming techniques (structured text rather than ladder diagrams, for example), in contrast to advice in the likes of this answer?
Just to cover both interpretations of serial and parallel: PLC logic processing is sequential.
Most PLCs can be programmed via serial, USB or Ethernet connections.
As regards the devices that PLCs connect to, they are usually serial. For instance, many industrial control system networks use Profibus, which is a serial-bus-based communication protocol; typically Profibus uses the RS-485 serial interface. I can't really think of a place where I have seen parallel communication. Most are serial (MODBUS, DeviceNet, etc.); with parallel you have problems with the extra cost of cabling, noise, long distances, etc.
Yes, PLC languages are Turing complete, but probably not as convenient as other programming languages. For example, with a Siemens PLC you have a choice of how you implement the logic: Ladder, S7 Graph (these are graphics-based), Statement List (instruction-based), Function Block Diagram, or Structured Control Language (similar to Pascal). This is a nice article comparing PLC programming languages with guidelines for how to choose a language: http://www.automation.com/pdf_articles/IEC_Programming_Thayer_L.pdf
The PLC scan time is the time taken by a PLC to read the inputs, execute the whole program and, based on the logic just processed, update the outputs accordingly. The scan time is not deterministic, as it depends on inputs, outputs, timers, memory, etc. Usually PLCs are used where speed is required; for slower processes a DCS can be used. It would be usual to see execution times of between 4-6 ms. With most PLCs you can modify the default maximum cycle monitoring time; if this time expires, the CPU can be commanded to stop or an interrupt can be triggered with the required logic. Note that in many cases a scan time greater than 1 second is "undesirable"!
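For readers coming from C, the scan cycle described above boils down to something like the following loop; read_inputs(), run_logic(), write_outputs() and watchdog_expired() are placeholders for whatever a particular runtime actually provides.

/* Sketch of the classic PLC scan cycle: read all inputs, execute the
 * whole program against that snapshot, then write all outputs.
 */
#include <stdbool.h>

extern void read_inputs(void);      /* latch the physical inputs into an image table */
extern void run_logic(void);        /* execute the entire program top to bottom      */
extern void write_outputs(void);    /* copy the output image to the physical outputs */
extern bool watchdog_expired(void); /* maximum cycle monitoring time exceeded?       */

int main(void)
{
    for (;;) {                      /* the scan repeats forever */
        read_inputs();
        run_logic();
        write_outputs();
        if (watchdog_expired()) {
            /* typical reactions: stop the CPU or raise a fault/interrupt */
            break;
        }
    }
    return 0;
}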
In my experience, nearly all of the PLCs I have worked on are never composed of simple ladder logic networks. PLCs are not simple representations of physical relays. They are used to control intricate, often safety-critical processes, interacting with a multitude of different devices and equipment. Also, in the majority of cases you have a SCADA system to implement, and you may have enterprise-level applications (MES, ERP) to consider. Many processes require complex scheduling and logic control algorithms -- vial filling, biopharma, electrical, oil & gas... there is a long list. As per the above link, it depends on your need, but modern processes often dictate the need for more than a simple program composed of a few ladder networks.
More "modern" programming language (actually ST is more modern than C) often means also more complexity on the program, which is something that should be avoided in the PLC world. They are Real Time machines, where the cycle times, maintainability, robustness and clarity are far more important than regular PC (which are not RT) and embedded world. If PLCs would be programmed with same way as most handheld devices, we would be living in the world where having lights on would be totally random act, since the powerplant just tilted because of programming bug.
A Murrays answer is better than I would ever write, but since I can't comment yet I wanted to underline these parts I wrote here.

Are there applications where the number of network ports is not enough?

In TCP/IP, the port number is specified by a 16-bit field, yielding a total of 65536 port numbers. However, the lower range (I don't really know how far it goes) is reserved for the system and cannot be utilized by applications. Assuming that 60,000 port numbers are available, that should be more than plenty for most network applications. Yet MMORPG games often have tens of thousands of concurrently connected users at a time.
This got me wondering: Are there situations where a network application can run out of ports? How can this limitation be worked around?
You don't need one port per connection.
A connection is uniquely identified by a tuple of (host address, host port, remote address, remote port). It's likely your host IP address is the same for each connection, but you can still service 100,000 clients on a single machine with just one port. (In theory: you'll run into problems, unrelated to ports, before that.)
The canonical starter resource for this problem is Dan Kegel's C10K page from 1999.
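A minimal sketch of why one listening port is enough, using the BSD sockets API: every accept() returns a new socket, but all of them share the same local port and are told apart by the remote (address, port) half of the tuple. Error handling is omitted for brevity.

/* One listening port serving many clients: each accept() yields a new
 * socket, yet every connection still uses local port 8080; connections
 * are distinguished by the remote address and port.
 */
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);            /* the one and only local port */

    bind(lfd, (struct sockaddr *)&addr, sizeof(addr));
    listen(lfd, 128);

    for (;;) {
        int cfd = accept(lfd, NULL, NULL);  /* new socket, same local port */
        if (cfd < 0)
            break;
        /* A real server would hand cfd to a thread or event loop here;
         * thousands of these can coexist because the remote (addr, port)
         * differs for each connection. */
        close(cfd);
    }
    return 0;
}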
The lower range you refer to is probably the range below 1024 on most Unix-like systems. This range is reserved for privileged applications; an application running as a normal user cannot start listening on ports below 1024.
An upper range is often used by the OS for return ports and NAT when creating connections.
In short, because of how TCP works, ports can run out if a lot of connections are made and then closed. The limitation can be mitigated to some extent by using long-lived connections, one for each client.
In HTTP, this means using HTTP 1.1 and keep-alive.
There are 2^16 = 65536 ports per IP address. In other words, for a computer with one IP address to run out of ports, it would have to use more than 65536 ports, which will never happen naturally!
You have to understand that a socket (IP + port) is the endpoint of end-to-end communication.
IPv4 is 32-bit, so let's say it can address around 2^32 computers publicly (regardless of NATing).
So there are 2^16 * 2^32 = 2^48 possible public sockets (on the order of 10^14), so there will not be a conflict (again, regardless of NATing).
However, IPv6 was introduced to allow more public IPs.