What's so special about 0x55AA? - operating-system

I have encountered 0x55AA in 2 scenarios:
The final 2 bytes of the boot sector in the legacy boot process contain 0x55AA.
The first 2 bytes of an Option ROM must be 0x55AA.
So what's special about 0x55AA?
The binary version of 0x55AA is 0101010110101010. Is it because the 0s and 1s are evenly interleaved? That doesn't strike me as a strong criterion.

0x55AA is a "signature word". It is used as the "end of sector" marker in the last 2 bytes of a 512-byte boot record. This includes the MBR and its extended boot records, as well as the protective MBR of the newer GPT.
References:
Image from Master Boot Record - microsoft.com.
How Basic Disks and Volumes Work - microsoft.com.
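As a rough illustration of the check described above, here is a minimal Python sketch that tests a 512-byte sector for the signature. The function name `has_boot_signature` and the dummy sector are made up for this example. It assumes the usual on-disk layout, with byte 0x55 at offset 510 and byte 0xAA at offset 511, which is why the same value is often written as 0xAA55 when read as a little-endian 16-bit word.

```python
SECTOR_SIZE = 512  # size of a legacy boot sector in bytes


def has_boot_signature(sector: bytes) -> bool:
    """Return True if the sector ends with the bytes 0x55, 0xAA."""
    if len(sector) != SECTOR_SIZE:
        return False
    return sector[510] == 0x55 and sector[511] == 0xAA


# Dummy sector for illustration: 510 zero bytes followed by the signature.
sector = bytes(510) + bytes([0x55, 0xAA])
print(has_boot_signature(sector))

# The same two bytes read as a little-endian 16-bit word.
print(hex(int.from_bytes(sector[510:512], "little")))
```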

There is nothing magical or mystical about that combination. Implementers needed a way to determine whether the first sector of a device was bootable (a boot signature), and that combination was chosen because it is so unlikely to occur by chance in the last two bytes of a sector. Similarly, the SMBIOS entry point can be found by scanning the BIOS for the _SM_ signature, which must lie on a segment (16-byte) boundary, like this:
Find_SMBIOS:
push ds
push bx ; Preserve essential
push si
; Establish DS:BX to point to base of BIOS code
mov ax, 0xf000
mov ds, ax ; Segment where table lives
xor bx, bx ; Initial pointer
mov eax, '_SM_' ; Scan buffer for this signature
; Loop has a maximum of 4096 iterations. As the table is probably near the top of the buffer,
; cycling through it backwards saves time. In my test bed, Bochs 2.6.5 BIOS-bochs-latest, it took
; 1,451 iterations.
.L0: sub bx, 16 ; Bump pointer to previous segment
jnz .J0
; Return NULL in AX and set CF. Either AX or flag can be tested on return.
mov ax, bx
stc
jmp .Done
; Did we find the signature at this 16-byte boundary?
.J0: cmp [bx], eax
jnz .L0 ; NZ, keep looking
; Calculate checksum to verify position
mov cx, 15
mov ax, cx
mov si, bx ; DS:SI = Table entry point
; Compute checksum on next 15 bytes
.L1: lodsb
add ah, al
loop .L1
or ah, ah
jnz .L0 ; Invalid, try to find another occurrence
; As the entry point is 16-byte aligned, we can do this to determine the segment.
shr bx, 4
mov ax, ds
add ax, bx
clc ; NC, found signature
.Done:
pop si
pop bx ; Restore essential
pop ds
ret
That signature is easily identifiable in a hex dump and it fits into a 32-bit register. Whether those two criteria were the deciding factors, I don't know, but again, the probability of 0x5f4d535f appearing on a 16-byte boundary by chance is very low.
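The same scan can be sketched in a higher-level language. The Python below is only an illustration, not a translation of the listing above: it walks forward rather than backward, and it assumes the SMBIOS 2.x 32-bit entry point layout, in which the byte at offset 5 holds the structure length and all bytes over that length sum to zero modulo 256. The names `find_smbios` and `make_entry` are hypothetical, and the BIOS image is fabricated for the demonstration.

```python
from typing import Optional


def find_smbios(bios: bytes) -> Optional[int]:
    """Return the offset of a valid '_SM_' entry point in `bios`, or None."""
    # Only 16-byte boundaries are candidates.
    for offset in range(0, len(bios) - 16 + 1, 16):
        if bios[offset:offset + 4] != b"_SM_":
            continue
        length = bios[offset + 5]  # assumed: entry point length field
        if length < 6 or offset + length > len(bios):
            continue
        # Assumed rule: all bytes of the structure sum to 0 modulo 256.
        if sum(bios[offset:offset + length]) % 256 == 0:
            return offset
    return None


def make_entry(length: int = 0x1F) -> bytes:
    """Build a fake entry point with a correct checksum (illustration only)."""
    body = bytearray(length)
    body[0:4] = b"_SM_"
    body[5] = length
    body[4] = (-sum(body)) % 256  # checksum byte makes the total sum zero
    return bytes(body)


# Fabricated 64 KiB image with one entry point on a 16-byte boundary.
image = bytearray(0x10000)
image[0x1230:0x1230 + 0x1F] = make_entry()
print(hex(find_smbios(bytes(image))))
```

A signature that sits off a 16-byte boundary, or one whose checksum fails, is skipped, which is what keeps false positives so unlikely.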

Related

the usage of offset for storing (stw) in powerpc assembly

long long int i=57745158985; #the C code
0000000000100004: li r7,13
0000000000100008: lis r8,29153
000000000010000c: ori r8,r8,0x3349
0000000000100010: stw r7,24(rsp)
0000000000100014: stw r8,28(rsp)
0000000000100018: lfd fp0,24(rsp)
000000000010001c: stfd fp0,8(rsp)
Can anyone explain the part of after the ori instruction? Thanks in advance.
It looks like this is on a 32-bit big endian machine. I will assume i is a local variable.
Starting with these instructions...
li r7,13
lis r8,29153
ori r8,r8,0x3349
After these instructions:
r7 contains 13
r8 contains ((29153 << 16) | 0x3349)
The required value for i is 57745158985, which is equal to
(13<<32) | ((29153 << 16) | 0x3349)
Clearly this value is too big to fit in a single 32-bit register.
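The arithmetic can be checked quickly. This is a small Python sketch in which the variable names simply mirror the registers:

```python
r7 = 13                        # li r7,13
r8 = (29153 << 16) | 0x3349    # lis r8,29153 then ori r8,r8,0x3349

# Combine the high word (r7) and the low word (r8) into one 64-bit value.
i = (r7 << 32) | r8

print(i)       # prints 57745158985
print(hex(i))  # prints 0xd71e13349
```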
The next instructions are where the 64-bit local variable i is "created" on the stack.
stw r7,24(rsp)
stw r8,28(rsp)
rsp is the stack pointer for the function.
Here i is being initialized to its initial value of 57745158985.
stw r7,24(rsp) stores the four bytes of r7 starting at an offset of 24 bytes into the stack.
stw r8,28(rsp) stores the four bytes of r8 starting at an offset of 28 bytes into the stack.
So i is the 8 bytes starting from an offset of 24 on the stack.
As this is a big-endian architecture the most significant bytes are placed first in memory.
Placing the value of r7 at the lower address acts like the (13<<32) when the 8 bytes are considered as one long long int.
These next instructions load the value of i into a floating point register and save it at a different location on the stack.
lfd fp0,24(rsp)
stfd fp0,8(rsp)
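To make the memory layout concrete, here is a hedged Python model of the stack frame. The 32-byte `frame` buffer is an assumption made for illustration, and the lfd/stfd pair is modelled as a plain 8-byte copy, since the value only passes through the floating point register unchanged.

```python
import struct

r7 = 13
r8 = (29153 << 16) | 0x3349

frame = bytearray(32)                   # hypothetical stack frame, rsp = offset 0

struct.pack_into(">I", frame, 24, r7)   # stw r7,24(rsp): high word, big-endian
struct.pack_into(">I", frame, 28, r8)   # stw r8,28(rsp): low word, big-endian

# Reading the 8 bytes at offset 24 as one big-endian 64-bit integer gives i.
(i,) = struct.unpack_from(">q", frame, 24)
print(i)  # prints 57745158985

# lfd fp0,24(rsp) then stfd fp0,8(rsp): copy the same 8 bytes to offset 8.
frame[8:16] = frame[24:32]
print(frame[8:16].hex())
```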
These 3 are loading up two 32 bit literal values into GPRs r7 and r8
0000000000100004: li r7,13
0000000000100008: lis r8,29153
000000000010000c: ori r8,r8,0x3349
These two are storing the two 32 bit values out to consecutive 32 bit memory locations pointed to by rsp (which is the stack pointer == r1) + 24
0000000000100010: stw r7,24(rsp)
0000000000100014: stw r8,28(rsp)
This is a 64 bit load from the same location (i.e. rsp + 24) into floating point register 0 (i.e. fp0). (You can't move GPRs to FPRs directly on this processor, so you go via memory.)
0000000000100018: lfd fp0,24(rsp)
This is storing the same 64 bit FPR0 out to a different offset from the stack pointer.
000000000010001c: stfd fp0,8(rsp)

Can someone explain to me this emu8086 code?

mov bx,offset array
dec bx
mov cx,100
next: inc bx
cmp [bx],0FFH
loope next
Can you explain why we DEC BX and then INC BX again? Looking for a complete answer, thanks.
We decrement bx before the loop, then the first instruction in the loop increments bx.
This way, on the first iteration of the loop, bx is again pointing at the beginning of array. On the second iteration, it points at the second item, and so on.
It might initially seem more straightforward to do something like:
mov bx, offset array
next: cmp [bx], 0ffh
inc bx
loopne next
The problem with this is that we're depending on the cmp to set the Z flag, which is used by the loopne instruction--but the inc instruction also affects the Z flag, so this would lose the result from the cmp, so the loopne wouldn't work correctly any more.
That having been said, this seems to be doing roughly the same thing as repne scasb can do:
mov di, offset array
mov al, 0ffh
mov cx, 100
repne scasb ; this instruction implements the entire loop
The big difference is that repne scasb always searches in an array whose base is given in es:di, which can sometimes be clumsy to deal with (e.g., if you're already using es to point to something else).
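The flag problem described above can be modelled in a few lines of Python. This is a sketch under stated assumptions: the function names are made up, only the Z flag and 16-bit registers are modelled, the array is taken to start at offset 0, and both functions follow the loopne form of the loop.

```python
def scan_pre_increment(data, target, count):
    """Models: dec bx / next: inc bx / cmp [bx],target / loopne next."""
    bx = (0 - 1) & 0xFFFF           # mov bx, offset array ; dec bx
    cx = count
    while True:
        bx = (bx + 1) & 0xFFFF      # inc bx (its flags are overwritten by cmp)
        zf = data[bx] == target     # cmp sets ZF
        cx -= 1                     # loopne decrements CX, flags untouched
        if cx == 0 or zf:
            return bx, zf


def scan_post_increment(data, target, count):
    """Models: next: cmp [bx],target / inc bx / loopne next (the broken form)."""
    bx = 0
    cx = count
    while True:
        zf = data[bx] == target     # cmp sets ZF ...
        bx = (bx + 1) & 0xFFFF      # ... but inc bx then overwrites it
        zf = bx == 0                # ZF now reflects the result of inc
        cx -= 1
        if cx == 0 or zf:
            return bx, zf


data = [1, 2, 0xFF, 4]
print(scan_pre_increment(data, 0xFF, 4))   # prints (2, True)
print(scan_post_increment(data, 0xFF, 4))  # prints (4, False)
```

In the second form the match at index 2 is missed, because the flag tested by the loop instruction comes from inc rather than cmp.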

How can I multiply two vectors in NASM x86 assembly?

I am trying to multiply two vectors of floating point values in assembly code. I am using NASM preprocessor on Intel x86_64 architecture.
Recently I have learned about SSE extension for Intel assembly, so I was trying to implement packed multiplication using SSE instructions and tail recursion. Here is my function:
mul:
movdqa xmm0, [rdi]
movdqa xmm1, [rsi]
mulps xmm0, xmm1
movdqa [rdi], xmm0
add rdi, 16
add rsi, 16
sub rdx, 4
cmp rdx, 0
jg mul
ret
It multiplies two vectors value by value and saves results into the first vector. Pointers to vectors are stored in rdi and rsi registers accordingly, size of both should be written in rdx.
The program written above hits a segmentation fault on the second instruction. I think I am using SSE incorrectly.
Is there any other way to use packed multiplication? Or am I just doing something wrong here?

6502 assembly get data from a block of memory

I've been learning 6502 assembly using the CBM programming studio. I'm reading books by Jim Butterfield and Richard Mansfield. Both books discuss how one can use a method (I think it was indirect addressing) to get data from a block of memory (like messages), but there isn't an example. Could someone provide one, please? I don't care what method is used.
It's fairly straightforward. You set up a pair of zero page addresses to hold the address of the start of the block and then use indirect indexing by Y to access bytes within the block. The instruction LDA ($80),Y reads the bytes at $80 and $81 as a 16 bit address ($81 contains the highest 8 bits), then adds Y, then reads the byte at the resulting address.
Note that, if you know the address in advance, you do not need to use indirect addressing, you can use absolute indexed.
The following routine demos both address modes. It copies the 10 bytes at a location specified in the X and Y registers (X is the low byte, Y is the high byte) to the locations starting at $0400.
stx $80 ; Store the low byte of the source address in ZP
sty $81 ; Store the high byte of the source in ZP
ldy #0 ; zero the index
loop: lda ($80),y ; Get a byte from the source
sta $0400,y ; Store it at the destination
iny ; Increment the index
cpy #10 ; Have we done 10 bytes?
bne loop ; Go round again if not
Note that there is an obvious optimisation in the above, but I'll leave that as an exercise for the reader.
Edit: OK, here is the obvious optimisation, as per i486's comment:
stx $80 ; Store the low byte of the source address in ZP
sty $81 ; Store the high byte of the source in ZP
ldy #9 ; initialise to the highest index
loop: lda ($80),y ; Get a byte from the source
sta $0400,y ; Store it at the destination
dey ; Decrement the index
bpl loop ; Go round again if index is still >= 0
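For readers who find it easier to follow in a high-level language, here is a hedged Python model of what LDA ($80),Y does and of the copy routine. The 64 KiB `mem` array, the source address $2000 and the function names are assumptions made for illustration.

```python
def lda_indirect_y(mem, zp, y):
    """Model LDA (zp),Y: read a 16-bit little-endian pointer at zp, add Y, load."""
    base = mem[zp] | (mem[(zp + 1) & 0xFF] << 8)
    return mem[(base + y) & 0xFFFF]


def copy_block(mem, src, dest=0x0400, count=10):
    """Model the routine: pointer in $80/$81, indexed store to dest."""
    mem[0x80] = src & 0xFF          # stx $80 (low byte of the source address)
    mem[0x81] = src >> 8            # sty $81 (high byte of the source address)
    for y in range(count):          # ldy #0 ... iny / cpy #10 / bne loop
        mem[dest + y] = lda_indirect_y(mem, 0x80, y)   # lda ($80),y / sta $0400,y


mem = bytearray(0x10000)            # hypothetical 64 KiB address space
mem[0x2000:0x200A] = b"HELLO6502!"  # 10-byte message at an assumed address
copy_block(mem, 0x2000)
print(bytes(mem[0x0400:0x040A]))
```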

harmonic series with x86-64 assembly

I'm trying to compute a harmonic series.
Right now I enter the number I want the sum to go up to.
When I enter a small number like 1.2, the program just hangs. It doesn't crash, and it seems to be doing calculations,
but it never finishes.
Here is my code:
denominator:
xor r14,r14 ;zero out r14 register
add r14, 2 ;start counter at 2
fld1 ;load 1 into st0
fxch st2
denomLoop:
fld1
mov [divisor], r14 ;put 1 into st0
fidiv dword [divisor] ;divide st0 by r14
inc r14 ;increment r14
fst qword [currentSum] ;pop current sum value into currentSum
jmp addParts
addParts:
fld qword [currentSum]
fadd st2 ;add result of first division to 1
fxch st2 ;place result of addition into st2
fld qword [realNumber] ;place real number into st0
;compare to see if greater than inputed value
fcom st2 ;compare st0 with st2
fstsw ax ;needed to do floating point comparisons on FPU
sahf ;needed to do floating point comaprisons on FPU
jg done ;jump if greater than
jmp denomLoop ;jump if less than
The code basically computes 1/2, 1/3, 1/4 and so on, adds each term to a running sum, and then compares to see whether the sum has gone above the value I entered. Once it has, it should exit the loop.
Do you see my error?
This line seems suspicious:
fst qword [currentSum] ;pop current sum value into currentSum
Contrary to the comment, fst stores the top of the stack into memory WITHOUT popping it. You want fstp if you want to pop it.
Overall, the stack behavior of your program seems suspicious -- it pushes various things onto the fp stack but never pops anything. After a couple of iterations, the stack will overflow and wrap around. Depending on your settings, you'll then either get an exception or get bogus values if you don't have exceptions enabled.
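As a reference for what the loop is meant to compute, here is a minimal Python sketch of the intended algorithm, assuming the sum starts at 1 and terms are added until the sum exceeds the entered value. The function name `harmonic_until` is made up for this example. It can be used to check the output of the assembly version once the stack handling is fixed.

```python
def harmonic_until(limit: float):
    """Add 1/2, 1/3, ... to 1 until the sum exceeds `limit`.

    Returns the last denominator used and the final sum.
    """
    total = 1.0
    n = 1
    while total <= limit:
        n += 1
        total += 1.0 / n
    return n, total


print(harmonic_until(1.2))  # prints (2, 1.5)
```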