Zilog Z80 I, R registers purpose

There are I and R registers in the control section of the Z80 CPU. What are their purpose and usage?

The R register is the memory refresh register. It's used to refresh dynamic RAM. Essentially, it is incremented on each instruction and placed on the address bus (when not in use for fetching or storing data) so that dynamic RAM chips can be refreshed.
You can ignore the R register, although people did use it as a source of semi-random numbers.
The I register is the interrupt vector base register. In interrupt mode 2, the Z80 has a table of 128 interrupt vectors. The I register tells you the page in RAM where that table is.
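To make the IM 2 mechanism concrete, here is a minimal C sketch (not Z80 code) of the address arithmetic involved, assuming the usual little-endian table layout; read_mem is a made-up stand-in for a memory access:

    #include <stdint.h>

    /* The I register supplies the high byte of the table address and the byte
     * the interrupting device places on the data bus supplies the low byte;
     * the 16-bit handler address is then read from that slot. */
    extern uint8_t read_mem(uint16_t addr);   /* hypothetical memory accessor */

    uint16_t im2_handler_address(uint8_t i_reg, uint8_t bus_byte)
    {
        uint16_t slot = (uint16_t)((i_reg << 8) | bus_byte);
        return (uint16_t)(read_mem(slot) | (read_mem((uint16_t)(slot + 1)) << 8));
    }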

Related

What is the benefit of having the registers as a part of memory in AVR microcontrollers?

Larger memories have higher decoding delay; why is the register file a part of the memory then?
Does it only mean that the registers are "mapped" SRAM registers that are stored inside the microprocessor?
If not, what would be the benefit of using registers, as they won't be any faster than accessing RAM? Furthermore, what would be the use of them at all? I mean, these are just a part of the memory, so I don't see the point of having them anymore. Having them would be just as costly as referencing memory.
The picture is taken from The AVR Microcontroller and Embedded Systems: Using Assembly and C by Muhammad Ali Mazidi, Sarmad Naimi, and Sepehr Naimi.
AVR has some instructions with indirect addressing, for example LD (LDD) – Load Indirect From Data Space to Register using Z:
Loads one byte indirect with or without displacement from the data space to a register. [...]
The data location is pointed to by the Z (16-bit) Pointer Register in the Register File.
So now you can move from a register by loading its data-space address into Z, allowing indirect or indexed register-to-register moves. Certainly one can think of some usage where such indirect access would save the odd instruction.
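As a concrete illustration of the register file living in the data space, here is a hedged C sketch for a classic AVR such as the ATmega328P, where R0..R31 are mapped at data addresses 0x00..0x1F (newer families such as AVR xmega/Dx no longer do this); peek_r16 is an invented name:

    #include <stdint.h>

    /* On classic AVRs, R16 appears at data-space address 0x10, so it can be
     * reached through an ordinary pointer (which the compiler turns into an
     * indirect LD through X/Y/Z). */
    static volatile uint8_t * const R16_ADDR = (volatile uint8_t *)0x10u;

    uint8_t peek_r16(void)
    {
        return *R16_ADDR;
    }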
what would be the benefit of using registers as they won't be any faster than accessing RAM?
Accessing the general-purpose registers is faster than accessing RAM.
First of all, let us define how "fast" is measured in microcontrollers: fast means how many cycles an instruction takes to execute. Look at the AVR architecture.
The general-purpose registers (GPRs) are inputs to the ALU, and the GPRs are controlled by the instruction register (2 bytes wide), which holds the next instruction fetched from code memory.
Let us examine a simple instruction, ADD Rd, Rr, where Rd and Rr are any two registers in the GPRs, so 0 <= r, d <= 31 and each of r and d can be represented in 5 bits. Now open the "AVR Instruction Set Manual" (page 32) and look at the opcode for this simple ADD instruction: 0000 11rd dddd rrrr. Because this opcode is two bytes (the code-memory width), it is fetched, decoded and executed in one cycle (under the concept of pipelining, of course). Only one cycle seems cool to me.
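To see how the two 5-bit register numbers fit into one 16-bit word, here is a small C sketch that decodes the ADD encoding quoted above; the field layout follows the 0000 11rd dddd rrrr pattern from the instruction set manual:

    #include <stdint.h>
    #include <stdio.h>

    /* Decode ADD Rd, Rr (0000 11rd dddd rrrr): bits 8:4 hold d4..d0,
     * bit 9 holds r4 and bits 3:0 hold r3..r0. */
    static void decode_add(uint16_t op)
    {
        unsigned d = (op >> 4) & 0x1F;
        unsigned r = (op & 0x0F) | ((op >> 5) & 0x10);
        printf("ADD R%u, R%u\n", d, r);
    }

    int main(void)
    {
        decode_add(0x0F01);   /* 0000 1111 0000 0001 -> ADD R16, R17 */
        return 0;
    }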
I mean these are just a part of the memory so I don't see the point of having them anymore. Having them would be just as costly as referencing memory
You suggest making all of RAM an input to the ALU; this is a very bad idea: a memory address takes 2 bytes.
If you have 2 operands per instruction, as in the ADD instruction, you would need 4 bytes just to encode the operands, plus 1 more byte for the opcode of the operation itself: 5 bytes in total, which is a waste of memory!
Furthermore, this architecture can only fetch 2 bytes at a time (the instruction-register width), so you would need to spend more cycles fetching the code from code memory, which is a waste of cycles and gives a slower system.
Register numbers are only 4 or 5 bits wide, depending on the instruction, allowing 2 per instruction with room to spare in a 16-bit instruction word.
Conclusion: the GPRs are crucial for saving both code memory and program execution time.
Larger memories have higher decoding delay; why is the register file a part of the memory then?
When the CPU deals with the GPRs it only accesses the first 32 positions, not the whole data space.
Final comment
Don't worry about the timing diagrams of the different RAM technologies, because you have no control over them. Who does have control? The architecture designers, and they set the limit on the maximum crystal frequency you can use with their architecture; within that, everything will be fine. You only need to be concerned about the cycles your application consumes.

Why do registers exist and how do they work together with the CPU?

So I am currently learning Operating Systems and Programming.
I want to know how the registers work in detail.
All I know is that there is the main memory and the CPU, which takes addresses and instructions from main memory with the help of the address bus.
There is also something called the MCC (Memory Controller Chip), which helps in fetching memory locations from RAM.
On the internet, it says a register is temporary storage and that data in registers can be accessed faster than in RAM.
But I want to really understand the deep-down process of how they work, as they also come in sizes like 32 bits and 16 bits. I am really confused!
I'm not a native English speaker, so pardon some perhaps incorrect terminology. I hope this will be a little bit helpful.
Why do registers exist
When a user program is running on the CPU, it works in a 'dynamic' sense: we must store incoming source data and any intermediate data, and do specific calculations upon them. Memory devices are needed, and we have a choice among flip-flops, on-chip RAM/ROM, and off-chip RAM/ROM.
The term 'register' in the programmer's model is actually a D flip-flop in the physical circuit, which is a memory device that can hold a single bit. An IC design consists of a standard-cell part (including the registers mentioned before, and AND/OR/etc. gates) and hard macros (like SRAM). As the technology node advances, the standard cells' delays get smaller and smaller. The auto place-and-route tool will place a register and its related surrounding logic nearby, to make sure the logic can run at the specified 3.0/4.0 GHz speed target. For some practical reasons (which I'm not quite sure of, because I don't do layout), we tend to place hard macros around the periphery, leading to much longer metal wires. This, plus SRAM's own characteristics, means on-chip SRAM is normally slower than D flip-flops. If the memory device is off the chip, say an external flash chip or a KGD (known good die), it will be slower still, since the signals have to traverse two more I/O devices, which have much larger delays.
How they work together with the CPU
Each register is assigned a different 'address' (which may not be visible to the programmer). That is implemented by adding address-decode logic. For instance, when the CPU is going to execute an instruction mov R1, 0x12, the address-decode logic sees the binary code of R1 and selects only those flip-flops corresponding to R1. Then the data 0x12 is stored (written) into those flip-flops. The same goes for the read process.
Regarding "they are also of 32 bits and 16 bits something like that": the bit width is not a problem. Both flip-flops and a word in RAM can have a bit width of N, as long as the same address selects N flip-flops, or N bits in RAM, at one time.
Registers are small memories that reside inside the processor (what you call the CPU). Their role is to hold the operands for fast processor calculations and to store the results. A register is usually designated by a name (AL, BX, ECX, RDX, cr3, RIP, R0, R8, R15, etc.) and has a size, which is the number of bits it can store (4, 8, 16, 32, 64, 128 bits). Other registers have special meanings, and their bits control the state or provide information about the state of the processor.
There are not many registers (because they are very expensive). Together they have a capacity of only a few kilobytes, so they can't store all the code and data of your program, which can run to gigabytes. That is the role of the central memory (what you call RAM). This big memory can hold gigabytes of data, and each byte has its own address. However, it only holds data while the computer is turned on. The RAM resides outside the CPU chip and interacts with it via the memory controller chip, which acts as the interface between the CPU and RAM.
On top of that, there is the hard drive that stores your data when you turn off your computer.
That is a very simple view to get you started.

How to figure out the interrupt source on I/O APIC?

I understand that the I/O APIC chip has 24 pins; usually a single-chip system will map pins 0~23 to IRQs 32~55 respectively. Furthermore, I can edit the related RTEs to install interrupt handler functions.
But how can I figure out the interrupt source on each I/O APIC pin?
I understand that it is related to ACPI, but in detail how should I do this? Is it mapped in some ACPI table, or should I use AML to check it?
Thank you very much!!
The general steps (for a modern OS) are:
Preparation
a) Parse the ACPI "APIC/MADT" table to determine if PIC chips exist (PCAT_COMPAT flag), how many IO APICs there are, and how many inputs each IO APIC has (see the MADT-walking sketch after this list). If ACPI doesn't exist, you might want to try searching for/parsing the older "MultiProcessor Spec." table and extracting the same information; however, if ACPI does exist it's possible that the "MultiProcessor Spec." table is designed to provide a "minimum stub" that contains no real information (so you must check ACPI first and prefer using ACPI if it exists), and it may not be worth the hassle of supporting systems that don't support ACPI (especially if the operating system requires a 64-bit CPU, etc.).
b) Parse the ACPI "FADT" to determine if MSI may (or must not) be enabled.
c) Determine if the OS will use PIC alone, IO APICs alone, or IO APIC plus MSI. Note that this can (should?) take into account the operating system's own boot parameters and/or configuration (e.g. so if there's a compatibility problem the end user can work around the problem).
d) If PIC chips exist; mask all IRQs in the PIC chips, then reconfigure the PIC chips (to set whatever "base vector number" you want them to use - e.g. maybe so that the master PIC is interrupt vectors 32 to 39 and the slave is vectors 40 to 47). If IO APIC/s exist, mask all IRQs in each IO APIC. Note: if the PIC chips exist they both have a "spurious IRQ" that can't be masked, so if you don't want to use PIC chips it's still a good idea to reconfigure the PIC chips such that their spurious IRQs (and the interrupt handlers for them) aren't going to be in the way.
e) Use an ACPI AML interpreter to execute the _PIC object; to inform ACPI/AML that you will be using either IO APIC or PIC. Note that "OS uses PIC" is the default for backward compatibility, so this step could be skipped if you're not using IO APIC.
f) Configure the local APIC in each CPU (not covered here).
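A hedged sketch of step a), walking the MADT to find the PCAT_COMPAT flag and the IO APIC records (type 1 entries). How the table is located and mapped is platform code and is assumed here; enumerate_ioapics is an invented name:

    #include <stdint.h>

    struct sdt_header {                       /* standard ACPI table header (36 bytes) */
        char     signature[4];
        uint32_t length;
        uint8_t  revision, checksum;
        char     oem_id[6], oem_table_id[8];
        uint32_t oem_revision, creator_id, creator_revision;
    } __attribute__((packed));

    void enumerate_ioapics(const uint8_t *madt)
    {
        const struct sdt_header *h = (const struct sdt_header *)madt;
        uint32_t flags = *(const uint32_t *)(madt + 40);   /* after header + LocalApicAddress */
        int pic_chips_exist = flags & 1;                   /* PCAT_COMPAT */
        (void)pic_chips_exist;

        for (uint32_t off = 44; off + 2 <= h->length; ) {  /* variable-length records */
            uint8_t type = madt[off], len = madt[off + 1];
            if (len < 2)
                break;                                     /* malformed entry, stop */
            if (type == 1) {                               /* I/O APIC record */
                uint8_t  id        = madt[off + 2];
                uint32_t mmio_base = *(const uint32_t *)(madt + off + 4);
                uint32_t gsi_base  = *(const uint32_t *)(madt + off + 8);
                (void)id; (void)mmio_base; (void)gsi_base; /* record these for later routing */
            }
            off += len;
        }
    }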
Devices
Before starting a device driver for a device:
a) Figure out the device's details (e.g. use "class, subclass and programming interface" fields from PCI configuration space to figure out what the device is) and check if you actually have a device driver for it; and decide if you want the device to use PCI IRQs or MSI.
b1) If the device will be using PCI IRQs and if the OS is using PIC chips (and not IO APICs); get the "Interrupt Line" field from the device's PCI configuration space and determine which interrupt vector it will be by adding the corresponding PIC chip's "base interrupt vector" to it.
b2) If the device will be using PCI IRQs (and not MSI) and if the OS is using IO APIC and not PIC; determine which "interrupt pin at the PCI slot" the device uses by reading the "Interrupt Pin" field from the device's PCI configuration space. Then use an ACPI AML interpreter to execute the _PRT object and get a current (not forgetting that PCI-E supports "hot-plug") PCI IRQ routing table. Use this table (and the PCI device's "bus:device:function" address and which "interrupt pin" it uses) to determine where the PCI IRQ is connected (e.g. which global interrupt, which determines which input of which IO APIC). Then, if you haven't already (because the same interrupt line is shared by a different device), use some kind of "interrupt vector manager" to allocate an interrupt vector for the PCI IRQ, and configure the IO APIC input to generate that interrupt vector (see the redirection-table sketch after this list). Note that (for IO APIC and MSI) "interrupt vector" determines "IRQ priority", so for high-speed/latency-sensitive devices (e.g. a network card) you'll want interrupt vectors that imply "high IRQ priority", and for slower/less latency-sensitive devices (e.g. a USB controller) you'll want interrupt vectors that imply "lower IRQ priority".
b3) If the device will be using MSI; determine how many consecutive interrupt vectors the device wants; then use some kind of "interrupt vector manager" to try to allocate as many consecutive interrupt vectors as the device wants. Note that it is possible to give the device fewer interrupt vectors than it wants.
c) Regardless of how it happened, you now know which interrupt vector/s the device will use. Start the device driver that's suitable for the device, and tell the device driver which interrupt vectors its device will use (and which MMIO regions, etc).
Note: There are more advanced ways to assign interrupt vectors than "first come, first served"; and there's probably no technical reason why you can't re-evaluate/re-assign interrupt vectors later as part of some kind of dynamic optimization scheme (e.g. re-allocating interrupt vectors so they're given to frequently used PCI devices instead of idle/unused PCI devices).
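And a hedged sketch of the last part of b2), programming one IO APIC redirection-table entry through the standard IOREGSEL/IOWIN pair (entry n lives at indirect registers 0x10 + 2*n and 0x11 + 2*n); ioapic_route is an invented name and ioapic_base is assumed to be an already-mapped MMIO address:

    #include <stdint.h>

    static void ioapic_write(uintptr_t base, uint8_t reg, uint32_t val)
    {
        *(volatile uint32_t *)(base + 0x00) = reg;     /* IOREGSEL: select register */
        *(volatile uint32_t *)(base + 0x10) = val;     /* IOWIN: write its value    */
    }

    /* Route IO APIC input 'pin' to 'vector': fixed delivery, physical destination,
     * edge-triggered, active-high, unmasked, delivered to local APIC 'apic_id'.
     * A level-triggered, active-low PCI IRQ would also set bits 13 and 15 of 'low'. */
    void ioapic_route(uintptr_t ioapic_base, unsigned pin, uint8_t vector, uint8_t apic_id)
    {
        uint32_t low  = vector;
        uint32_t high = (uint32_t)apic_id << 24;       /* destination field (bits 63:56) */

        ioapic_write(ioapic_base, (uint8_t)(0x11 + 2 * pin), high);
        ioapic_write(ioapic_base, (uint8_t)(0x10 + 2 * pin), low);   /* unmask last */
    }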

How is the CR8 register used to prioritize interrupts in an x86-64 CPU?

I'm reading the Intel documentation on control registers, but I'm struggling to understand how the CR8 register is used. To quote the docs (2-18 Vol. 3A):
Task Priority Level (bit 3:0 of CR8) — This sets the threshold value corresponding to the highest-priority interrupt to be blocked. A value of 0 means all interrupts are enabled. This field is available in 64-bit mode. A value of 15 means all interrupts will be disabled.
I have 3 quick questions, if you don't mind:
So bits 3 through 0 of CR8 make up those 16 levels of priority values. But the priority of what? A running "thread", I assume, correct?
But what is that priority value in CR8 compared to when an interrupt is received to see if it has to be blocked or not?
When an interrupt is blocked, what does it mean? Is it "delayed" until later time, or is it just discarded, i.e. lost?
CR8 indicates the current priority of the CPU. When an interrupt is pending, bits 7:4 of the interrupt vector number are compared to CR8. If the vector is greater, it is serviced; otherwise it is held pending until CR8 is set to a lower value.
Assuming the APIC is in use, it has an IRR (Interrupt Request Register) with one bit per interrupt vector number. When that bit is set, the interrupt is pending. It can stay that way forever.
When an interrupt arrives, it is ORed into the IRR. If the interrupt is already pending (that is, the IRR bit for that vector is already set), the new interrupt is merged with the prior one. (You could say it is dropped, but I don't think of it that way; instead, I say the two are combined into one.) Because of this merging, interrupt service routines must be designed to process all the work that is ready, rather than expecting a distinct interrupt for each unit of work.
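A toy C model of the two ideas above (the priority-class comparison against CR8, and the merge-into-IRR behaviour); the names irr, interrupt_arrives and can_service are invented:

    #include <stdint.h>

    static uint32_t irr[8];                          /* 256 pending bits, one per vector */

    void interrupt_arrives(uint8_t vector)
    {
        irr[vector / 32] |= 1u << (vector % 32);     /* ORed in; a duplicate just merges */
    }

    int can_service(uint8_t vector, uint8_t cr8)
    {
        return (vector >> 4) > (cr8 & 0x0F);         /* priority class vs. CR8[3:0] */
    }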
Another related point is that Windows (and I assume Linux) tries to keep the IRQ level of a CPU as low as possible at all times. Interrupt service routines do as little work as possible at their elevated hardware interrupt level and then queue a deferred procedure call (DPC) to do the rest of their work at DPC IRQ level. The DPC will normally be serviced immediately, unless another IRQ has arrived, because DPCs are at a higher priority than normal processes.
Once a CPU starts executing a DPC it will then execute all the DPCs in its per-CPU DPC queue before returning the CPU IRQL to zero to allow normal threads to resume.
The advantage of doing it this way is that an incoming hardware IRQ of any priority can interrupt a DPC and get its own DPC onto the queue almost immediately, so it never gets missed.
I should also try to explain (as I understand it 😁) the difference between the IRQ level of a CPU and the priority of an IRQ.
Before Control Register 8 became available with x64, the CPU had no notion of an IRQ level.
The designers of Windows NT decided that every logical processor in a system should have a NOTIONAL IRQ level, stored in a per-CPU data structure called the processor control block. They decided there should be 32 levels, for no reason I know of 😁.
Software and hardware interrupts are also assigned a level, so if they are above the level the CPU is currently at, they are allowed to continue.
Windows does NOT make use of the interrupt priority assigned by the PIC/APIC hardware; instead it uses the interrupt mask bits in them. The various pins are assigned a vector number, and then they get a level.
When Windows raises the IRQL of a CPU in its PCB, it also reprograms the interrupt mask of the PIC/APIC. But not straight away.
Every interrupt that occurs causes the Windows trap dispatcher to execute and compare the IRQ's level with the CPU's IRQL; if the IRQ level is higher, the interrupt goes ahead, and if not, THEN Windows reprograms the mask and returns to the executing thread instead.
The reason for that is that reprogramming the PIC takes time, and if no lower-level IRQ comes in then Windows can save itself a job.
On x64 there IS CR8, and I am still looking at how that works.

LPC1768 linker script: why this RAM start address?

In the LPC1768 linker script, why should the RAM start address be given as 0x100000C8, i.e. RAM (rwx) : ORIGIN = 0x100000C8, LENGTH = 0x7F38?
If I set it to 0x10000000, the system crashes when I enable UART interrupts.
The cause of your crash should be providing a big clue: that offset of 0xC8 (200 bytes) is there to allow space for the interrupt vector table.
Not all applications require the vector table to be in RAM; if you have a fixed program in flash then the vector table can be there as well. But if you are using a bootloader to run code from RAM and want that code to include interrupt service routines, you need to place a vector table in RAM and update the register which points to it. See for example AN10866.
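As an illustration (a hedged sketch, not lifted from AN10866), relocating the table on the LPC1768's Cortex-M3 amounts to copying the vectors into the reserved 0xC8 bytes at the bottom of local SRAM and pointing the standard VTOR register (0xE000ED08) at them; relocate_vector_table is an invented name:

    #include <stdint.h>

    #define VTOR       (*(volatile uint32_t *)0xE000ED08u)  /* Cortex-M3 Vector Table Offset Register */
    #define VECS_BYTES 0xC8u                                /* space reserved below ORIGIN = 0x100000C8 */

    void relocate_vector_table(void)
    {
        volatile uint32_t *dst       = (volatile uint32_t *)0x10000000u;        /* start of local SRAM */
        const volatile uint32_t *src = (const volatile uint32_t *)0x00000000u;  /* table in flash */

        for (uint32_t i = 0; i < VECS_BYTES / 4u; i++)
            dst[i] = src[i];                                /* copy the existing vectors */

        VTOR = 0x10000000u;                                 /* point the NVIC at the RAM copy */
        __asm volatile ("dsb");                             /* make sure the write has taken effect */
    }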