What Problems Might Occur if I Overlap Multi-Register Modbus Data Items? - modbus

It is common to use 2 registers to read/write a floating point value in Modbus.
My question is what problems or compatibility issues arise if I specify my device register map with overlapping data as follows:
40001 (float a), 40002 (float b), 40003 (float c), 40004 (float d), and so on.
Float (a) can be read at 40001 with FC03, number of registers is 2.
Float (b) can be read at 40002 with FC03, number of registers is 2.
Float (a) and (b) can be read at 40001 with FC03, number of registers is 4.

This would make your device Modbus-like rather than Modbus-compatible.
The drawback is that plenty of Modbus clients, mostly SCADA systems, would simply stop working with such a register map. So if you don't care about third-party clients, you can do it, but what is the purpose?
UPD
Also, you get undefined behavior when reading registers that belong to different values at the same time. What is the expected output of reading a single word at 40002? The least significant word of a or the most significant word of b?
How do I read two consecutive numbers (a and b)?
"Modbus is already only modbus-like when it comes to multi-register values"
Wrong, it is still Modbus. Whenever you read multi-register values or implement timestamps, you explicitly define them in the documentation, and your rules should not violate general Modbus rules like the ones mentioned above. There is nothing wrong with stating that you are using 2 registers with MSB-first byte order.
So the answer is it could work for some specific cases but is generally not usable at all.
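To make the ambiguity concrete, here is a minimal Python sketch. It assumes a flat bank of 16-bit holding registers and a high-word-first float encoding; neither is stated in the question, and the helper names float_to_regs and regs_to_float are hypothetical.

```python
import struct

def float_to_regs(value):
    """Split an IEEE-754 float32 into two 16-bit registers, high word first (assumed order)."""
    hi, lo = struct.unpack(">HH", struct.pack(">f", value))
    return [hi, lo]

def regs_to_float(regs):
    """Rebuild a float32 from two 16-bit registers, high word first."""
    return struct.unpack(">f", struct.pack(">HH", regs[0], regs[1]))[0]

# Offsets 0..3 stand for registers 40001..40004 in an ordinary flat register bank.
bank = [0] * 4

bank[0:2] = float_to_regs(1.5)  # float a written at 40001 (uses 40001-40002)
bank[1:3] = float_to_regs(2.5)  # float b written at 40002 (uses 40002-40003)

a_read = regs_to_float(bank[0:2])  # register 40002 now holds the first word of b
b_read = regs_to_float(bank[1:3])
```

After both writes, b reads back as 2.5, but a no longer reads back as 1.5 because its second register was overwritten. A device using the overlapping map would therefore have to treat each address as something other than a plain 16-bit register, which is what generic clients do not expect.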

Related

What is the difference between Program Status Word (PSW) and Program Counter (PC)?

In an Operating Systems course, the instructor introduced PSW and PC when he talked about Interrupt Handling.
His explanation was
PC holds the address of the next instruction to be fetched
PSW contains execution status information
But later I searched online and found that PSW = PC + status register. This makes me quite confused.
On the one hand, I am not sure what "execution status information" refers to. On the other hand, if PSW has the functions of a PC, why do we still need it?
Appreciate any explanation.
This isn't really standardized terminology. Most architectures have some register that plays the role of a status word, containing bits to indicate things like whether an add instruction caused a carry. But different architectures give it different names, and what exactly is included can vary widely. I'm not aware of any architecture that includes the program counter as part of their status word, but if they want to do that, well, who's going to stop them?
This is the kind of thing where you just have to look at the definition given by whatever book or article you are reading (or infer it from context), and realize that a different author may use the word differently.
In general, interrupts are hardware level subroutine calls. They do the same thing as a subroutine call (change the algorithm that the processor is executing) however they do it without warning the "executing code" that they are now operating.
In order not to damage the "executing code", all the information it was using must be saved. This includes the Program Counter (usually saved to the stack by the interrupt hardware, in the same way that a subroutine call does) and all of the registers that the interrupt function will alter; these must be saved by pushing them onto the stack. The registers must be restored before the return-from-interrupt (RETI) instruction; the PC is restored by the RETI itself.
The PSW (often called the flag register) is a very important register and must generally be saved first. It contains bits like Zero (the last calculation produced a zero result), Carry (the last calculation produced a carry, i.e. the result is bigger than the register can hold) and several other flags. I suggest that you read the data sheet of an 8-bit microcontroller for an idea of what these flags might be. Suffice it to say that these flags are needed in order to perform conditional jumps. And whilst they will often be ignored, you can't take that chance.
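As a rough illustration of what such flag bits record, the following Python sketch models an 8-bit add that also produces Zero and Carry flags. The function name add8 and the dictionary layout are invented for this example and do not correspond to any particular processor.

```python
def add8(a, b):
    """Add two 8-bit values and report the Zero and Carry flags of the result."""
    total = a + b
    result = total & 0xFF          # only 8 bits fit in the register
    flags = {
        "carry": total > 0xFF,     # the true sum did not fit in 8 bits
        "zero": result == 0,       # the stored result is zero
    }
    return result, flags

result, flags = add8(200, 100)     # 300 does not fit in 8 bits
wrapped, wrapped_flags = add8(128, 128)
```

A conditional jump such as "branch if carry" only looks at these bits, which is why an interrupt handler that performs arithmetic must save and restore them.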
You are probably correct that your instructor is using the term PSW to mean all of the registers.
The subject of interrupts involves concepts that are common to subroutine calls in general (e.g. don't leave data that you don't want overwritten in a register before entering a subroutine) and, later on in operating systems, to the context switches that occur during multitasking.
Peter

CPUs with instructions with more than two branch destinations

Processors usually come with jmp instructions that continue from a different fixed location and may depend on some condition. So the out-degree is two at most.
Are there any processors out there that have a single instruction that branches to one of three or more fixed locations?
There are a lot of reasons to assume / guess no, but I'm not familiar with enough ISAs to give a definite no. Especially if we include historical early computers from the 50s and 60s; they often have very odd stuff compared to modern systems.
Normally you just use an indirect branch (target address in a register or from memory, or looked up from a compressed table with ARM tbb) if you need anything other than taken vs. fall-through, so there's very little benefit to spending an opcode on a funky direct branch instruction with 2 non-fallthrough destinations.
Also, you'd need space in the instruction encoding for either 2 separate targets, or else some special rule like fall-through, PC + offset, PC + offset*2 (i.e. jump twice as far forwards or backwards). Using it would require laying out code with targets at specific offsets. You do sometimes make a table of fixed-size blocks of instructions and compute an offset into it (instead of looking up an address from a table of addresses), but having an instruction that forced you to do that sounds unlikely.
The condition itself could be a register being - / 0 / + as a 3-way condition, or FLAGS being less-than, equal, or greater-than. Or something else.
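The usual substitute, a computed index into a table of targets followed by an indirect branch, can be sketched at a high level as follows. Python only stands in for the machine-level idea here; the function name and the target labels are made up for illustration.

```python
def three_way_target(value, targets):
    """Select one of three targets from the sign of value (- / 0 / +)."""
    # (value > 0) - (value < 0) is -1, 0 or +1; shift it to a table index 0..2.
    index = (value > 0) - (value < 0) + 1
    return targets[index]

targets = ("on_negative", "on_zero", "on_positive")
chosen = three_way_target(-5, targets)
```

The selection is a table lookup rather than a dedicated three-target instruction, which is the point made above: an indirect branch already covers this case.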
So it sounds very unlikely, and a complication to branch-prediction (unless you just treat it as indirect, in which case why bother).
But I wouldn't be shocked if there's some combination of conditions that make it make sense on some ISA. Maybe if there's a special-case handler address in some special register, and the normal case involves taken or fall-through?
But if we allow one of the target addresses to come from a register or other internal state, any branch that can fault would count. Consider a hypothetical ISA with a compare-and-branch on memory, like Intel with macro-fused cmp [rdi], eax / jne rel32 which decodes to a single internal uop.
Then the possible targets are:
fall-through to RIP
taken to RIP+rel32
#PF fault to the page-fault handler address (loaded from memory on x86-64).

Register Ranges in HLSL?

I am currently refactoring a large chunk of old code and have finally dove into the HLSL section where my knowledge is minimal due to being out of practice. I've come across some documentation online that specifies which registers are to be used for which purposes:
t – for shader resource views (SRV)
s – for samplers
u – for unordered access views (UAV)
b – for constant buffer views (CBV)
This part is pretty self explanatory. If I want to create a constant buffer, I can just declare as:
cbuffer LightBuffer: register(b0) { };
cbuffer CameraBuffer: register(b1) { };
cbuffer MaterialBuffer: register(b2) { };
cbuffer ViewBuffer: register(b3) { };
However, originating from the world of MIPS Assembly I can't help but wonder if there are finite and restricted ranges on these. For example, temporary registers are restricted to a range of t0 - t7 in MIPS Assembly. In the case of HLSL I haven't been able to find any documentation surrounding this topic as everything seems to point to assembly languages and microprocessors (such as the 8051 if you'd like a random topic to read up on).
Is there a set range for the four register types in HLSL or do I just continue as much as needed in a sequential fashion and let the underlying assembly handle the messy details?
Note
I have answered this question partially, as I am unable to find a range for u currently; however, if someone has a better, more detailed answer than what I've given through testing, then feel free to post it and I will mark that as the correct answer. I will leave this question open until December 1st, 2018 to give others a chance to give a better answer for future readers.
Resource slot counts (for D3D11; D3D12 expands on them) are specified on the Resource Limits MSDN page.
The ones which are of interest for you here are :
D3D11_COMMONSHADER_INPUT_RESOURCE_REGISTER_COUNT (which is t) = 128
D3D11_COMMONSHADER_SAMPLER_SLOT_COUNT (which is s) = 16
D3D11_COMMONSHADER_CONSTANT_BUFFER_HW_SLOT_COUNT (which is b) = 15, but one slot is reserved to hold constant data from shaders when needed (for example, if you have a large static const array)
The u case is different, as it depends on the feature level (and, to be honest, is a vendor/OS version mess):
D3D_FEATURE_LEVEL_11_1 or greater: 64 slots
D3D_FEATURE_LEVEL_11_0: always 8, although some cards/drivers may support 64. You need at least Windows 8 for that (it might also be available on Windows 7 with a platform update). I do not recall a way to test whether 64 is supported (many NVIDIA cards in the 700 range do, for example).
D3D_FEATURE_LEVEL_10_1: either 0 or 1; there's a way to check if compute is supported
You need to perform a feature check:
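D3D11_FEATURE_DATA_D3D10_X_HARDWARE_OPTIONS checkData = {};
// CheckFeatureSupport also takes the size of the structure being filled in
HRESULT hr = d3dDevice->CheckFeatureSupport(D3D11_FEATURE_D3D10_X_HARDWARE_OPTIONS, &checkData, sizeof(checkData));
BOOL computeSupport = SUCCEEDED(hr) && checkData.ComputeShaders_Plus_RawAndStructuredBuffers_Via_Shader_4_x;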
Please note that on some OS/driver versions I had this flag returning TRUE while the feature was not supported (Intel was doing that on Win7/8), so in that case the only valid solution was to try to create either a small raw / byte address buffer or a structured buffer and check the HRESULT.
As a side note, feature level 10 and below are for quite old configurations nowadays, so except for rare scenarios you can probably safely ignore them (I just leave this for information purposes).
Since it's usually a long wait time for these types of questions I tested the b register by attempting to create a cbuffer in register b51. This failed as I expected and luckily SharpDX spit out an exception that stated it has a maximum of 14. So for the sake of future readers I am testing all four register types and posting back the ranges I find successful.
b has a range of b0 - b13.
s has a range of s0 - s15.
t has a range of t0 - t127.
u has a range that I have not yet determined.
At the current moment, I am unable to find a range for the u register as I have no examples of it in my code, and haven't actually ever used it. If someone comes along that does have an example usage then feel free to test it and update this post for future readers.
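For reference, the ranges found in this test can be captured in a small checker. This is only a sketch in Python: the function name register_in_range is hypothetical, the limits are the ones reported above for this particular setup, and u is deliberately left unknown.

```python
import re

# Slot counts (exclusive upper bounds) as found by the test above.
# "u" is omitted because no range was determined for it.
SLOT_COUNTS = {"b": 14, "s": 16, "t": 128}

def register_in_range(name):
    """Return True/False for b, s and t registers, or None when the range is unknown."""
    match = re.fullmatch(r"([bstu])(\d+)", name)
    if match is None:
        raise ValueError(f"not a register name: {name!r}")
    kind, index = match.group(1), int(match.group(2))
    if kind not in SLOT_COUNTS:
        return None  # no limit recorded for this register type
    return index < SLOT_COUNTS[kind]

b51_ok = register_in_range("b51")  # the slot that failed in the test above
```

A check like this only mirrors what the compiler reported here; it does not replace the compiler's own validation.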
I did find a contradiction to my findings above in the documentation linked in my question; they have an example using a t register above the noted range in this answer:
Texture2D a[10000] : register(t0);
Texture2D b[10000] : register(t10000);
ConstantBuffer<myConstants> c[10000] : register(b0);
Note
I would like to point out that I am using the SharpDX version of the HLSL compiler and so I am unsure if these ranges vary from compiler to compiler; I heavily doubt that they do, but you can never be too sure until you try to exceed them. GLSL may be the same due to being similar to HLSL, but it could also be very different.

AMD64 ABI Function Calling Sequence for arguments of type __m256

I've been reading through this section for a while, but I can't seem to figure it out. I'm on AMD64 ABI Draft 0.99.6, page 18, section 3.2.3 Parameter Passing, and there's the following text:
Arguments of type __m256 are split into four eightbyte chunks. The least significant one belongs to class SSE and all the others to class SSEUP.
I'm confused because it sounds like I use three SSEUP registers and only one SSE, but that seems wasteful of the other two SSE registers associated with the SSEUP. Am I misreading it? I probably won't even use this datatype, but I've been confused on this text for quite a while. Can someone give an example of how this would work? I'm probably missing something obvious.
Page 18 just contains a list of definitions necessary for a later discussion of the algorithm used to pass the parameters of a function.
Particularly, the SSE class is always passed in a new vector register, the first available of %xmm0-%xmm7.
Note that these names refer to the lower 128-bit parts of the registers, but it's better to think of them in terms of variable-size vector registers %v0-%v7.
The SSEUP class is passed in the next available 64-bit (eightbyte) chunk of the last vector register used.
__m256 is then passed, on processors that support AVX, using a single %ymm register: the lower 64 bits get the SSE class, and hence a new %v0 register, while the other three 64-bit chunks get SSEUP, thereby reusing the %v0 register.
Here's the relevant quote from the document:
If the class is SSE, the next available vector register is used, the registers
are taken in the order from %xmm0 to %xmm7.
If the class is SSEUP, the eightbyte is passed in the next available eightbyte
chunk of the last used vector register.
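The two quoted rules can be traced with a short Python sketch. It models only the SSE and SSEUP classes for vector arguments, uses the %v0-%v7 naming from above, and the function names are invented for this illustration.

```python
def classify_vector(size_bytes):
    """Classify a vector argument: first eightbyte is SSE, the rest are SSEUP."""
    eightbytes = size_bytes // 8
    return ["SSE"] + ["SSEUP"] * (eightbytes - 1)

def assign_registers(classes):
    """Return (register, eightbyte chunk index) for each classified eightbyte."""
    next_reg = 0
    reg = None
    chunk = 0
    placement = []
    for cls in classes:
        if cls == "SSE":
            # SSE: take the next available vector register, starting at chunk 0.
            reg, chunk = next_reg, 0
            next_reg += 1
        elif cls == "SSEUP":
            if reg is None:
                raise ValueError("SSEUP must follow an SSE eightbyte")
            # SSEUP: next eightbyte chunk of the last used vector register.
            chunk += 1
        else:
            raise ValueError(f"class not modelled here: {cls}")
        placement.append((f"%v{reg}", chunk))
    return placement

one_m256 = assign_registers(classify_vector(32))
two_m256 = assign_registers(classify_vector(32) + classify_vector(32))
```

A single __m256 therefore lands entirely in %v0 (chunks 0 to 3), and a second __m256 argument starts a fresh register, %v1, so no register is wasted.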
The SSEUP class was introduced earlier in the ABI and it is still present today.
You can quickly consult Version 0.9 to see the differences: the types __m256 and __m512 were not present, for example.
For compilers that don't support the new ABI with the __m256 type, or for compilers that do support it but target processors without AVX support, that type is usually an aggregate of two __m128 and thus, by the rules described later (particularly the post-merge rules), it is passed in memory:
For compilers using the old ABI:
If the size of an object is larger than two eightbytes, or in C++, is a non-POD structure or union type, or contains unaligned fields, it has class MEMORY.
For compilers using the new ABI:
If the size of the aggregate exceeds two eightbytes and the first eightbyte isn't SSE or any other eightbyte isn't SSEUP, the whole argument is passed in memory.
The standard is admittedly confusing, mostly because it has to address backward compatibility. The SSE and SSEUP classifications are handy in an architecture where the vector registers keep widening and a broad range of different sizes already exists.

What is the data format for the device address using libMPSSE I2C?

I am attempting to use libMPSSE to perform I2C communications. The example code listed in the attached document connects to a 24LC024H EEPROM device.
The address for the device used in the example, as defined in its documentation, is 1010XXX_, where the X's are configurable. In the example's associated diagram you can see the values are configured to be 1. It also states that the R/W bit (_) should not be included, meaning the address passed to the library should be 10101110. The address actually used in the example code is 0x57, which is 01010111.
I do not see how we got from A to B here. I cannot figure out how to format the address of the device I am trying to communicate with, nor can I find any documentation spelling it out. The only documentation on the address parameter says:
Address of the I2C slave. This is a 7bit value and it
should not contain the data direction bit, i.e. the
decimal value passed should be always less than 128
This is confusing, since the data direction bit is usually the LSB.
I was updating my question to clarify what the address should be, and a coincidence in the editor caused the answer to smack me in the face.
By "should not be included" they do not mean that the bit should be zero, but rather that it should be completely nonexistent. To them this means shifting the address bits down to remove it as the LSB. It also implies that the MSB should always be zero, even though that is not explicitly defined anywhere.
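The arithmetic can be checked with a few lines of Python. The helper names below are hypothetical and are not part of libMPSSE; they only show the relationship between the 8-bit form with the R/W bit and the 7-bit form the library expects.

```python
def to_7bit(addr_with_rw):
    """Drop the R/W bit (the LSB) from an 8-bit I2C address byte."""
    return (addr_with_rw >> 1) & 0x7F

def to_address_byte(addr_7bit, read):
    """Rebuild the byte sent on the bus: 7-bit address followed by the R/W bit."""
    return ((addr_7bit & 0x7F) << 1) | (1 if read else 0)

library_address = to_7bit(0b10101110)  # 1010111 with the R/W position removed
```

Here library_address is 0x57, matching the value used in the example code, and it is always below 128 as the documentation requires.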