Can someone explain to me this emu8086 code? - emu8086

mov bx,offset array
dec bx
mov cx,100
next: inc bx
cmp [bx],0FFH
loope next
Can you explain why we DEC BX and then INC BX again? I'm looking for a complete answer, thanks.

We decrement bx before the loop, then the first instruction in the loop increments bx.
This way, on the first iteration of the loop, bx is again pointing at the beginning of array. On the second iteration, it points at the second item, and so on.
It might initially seem more straightforward to do something like:
mov bx, offset array
next: cmp [bx], 0ffh
inc bx
loopne next
The problem with this is that loopne relies on the Z flag set by cmp, but inc also modifies the Z flag. With inc sitting between cmp and loopne, the result of the comparison is lost, so loopne no longer works correctly.
That having been said, this does roughly the same thing that repne scasb can do (or repe scasb, for the loope form used in the question):
mov di, offset array
mov al, 0ffh
mov cx, 100
repne scasb ; this instruction implements the entire loop
The big difference is that repne scasb always searches in an array whose base is given in es:di, which can sometimes be clumsy to deal with (e.g., if you're already using es to point to something else).
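To make the pointer and flag behaviour concrete, here is a minimal Python model of the loop from the question. It is only a sketch: the function name scan and its parameters are made up for illustration, the array offset is taken as 0, and ZF and CX are modelled as plain variables.

```python
def scan(array, value=0xFF, count=100, loop_while_equal=True):
    """Model of: dec bx / next: inc bx / cmp [bx],value / loope (or loopne) next.

    loop_while_equal=True models loope, False models loopne.
    Returns (bx, zf, cx) as they would stand when the loop exits.
    """
    bx = 0            # mov bx, offset array (offset assumed to be 0 here)
    bx -= 1           # dec bx: step back so the first inc lands on element 0
    cx = count        # mov cx, count
    while True:
        bx += 1                       # inc bx runs BEFORE cmp, so it cannot clobber ZF
        zf = array[bx] == value       # cmp [bx], value sets ZF when the bytes are equal
        cx -= 1                       # loope/loopne decrement CX without touching flags
        if cx == 0 or zf != loop_while_equal:
            break                     # fall out of the loop
    return bx, zf, cx


if __name__ == "__main__":
    data = [0xFF, 0xFF, 0x12] + [0] * 97
    print(scan(data))                          # loope: stops at the first byte that differs
    print(scan(data, loop_while_equal=False))  # loopne: stops at the first byte that matches
```

Because the increment happens before the comparison, bx still points at the element that ended the loop when the loop exits, which is what the caller usually wants.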

Related

How can I multiply two vectors in NASM x86 assembly?

I am trying to multiply two vectors of floating point values in assembly code. I am using NASM preprocessor on Intel x86_64 architecture.
Recently I have learned about SSE extension for Intel assembly, so I was trying to implement packed multiplication using SSE instructions and tail recursion. Here is my function:
mul:
movdqa xmm0, [rdi]
movdqa xmm1, [rsi]
mulps xmm0, xmm1
movdqa [rdi], xmm0
add rdi, 16
add rsi, 16
sub rdx, 4
cmp rdx, 0
jg mul
ret
It multiplies two vectors value by value and saves the results into the first vector. Pointers to the vectors are stored in the rdi and rsi registers respectively, and the size of both should be written in rdx.
The program above crashes with a segmentation fault on the second instruction. I think I am using SSE incorrectly.
Is there any other way to use packed multiplication? Or am I just doing something wrong here?
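For reference, the intended computation can be sketched in Python as below. This is only a model of what the routine is meant to produce, not a fix: the names mul_packed and is_aligned16 are hypothetical, and the alignment helper is included because movdqa requires its memory operand to be 16-byte aligned.

```python
def is_aligned16(address):
    """True when an address is a multiple of 16, as movdqa expects for memory operands."""
    return address % 16 == 0


def mul_packed(a, b):
    """Multiply a by b element-wise, in place, walking the data four lanes at a time."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    for start in range(0, len(a), 4):            # one packed step = 4 floats = 16 bytes
        for i in range(start, min(start + 4, len(a))):
            a[i] *= b[i]                         # the result overwrites the first vector
    return a


if __name__ == "__main__":
    print(mul_packed([1.0, 2.0, 3.0, 4.0, 5.0], [2.0] * 5))
    print(is_aligned16(0x1000), is_aligned16(0x1004))
```

Unlike the assembly loop, this model does not touch elements past the end when the length is not a multiple of 4.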

Get a negative result out of sub instruction x86_64

I've been stuck on a rather simple instruction, but everything in assembly only seems simple until I try to really understand it haha.
I've paid attention to this post which clarifies some stuff but I'm still confused: Understanding intel SUB instruction
Here is a super simple situation:
Two strings are sent to function for comparison. I stripped down to keep the essentials.
function:
mov al, [rdi] ; == 0
sub al, [rsi] ; == 32
ret ; returns 224 and not -32 like I would like
I am guessing I need to check the overflow flag right after the sub instruction: if it's on, then my result is negative. Or maybe the sign flag, more logically (both seem right in this case?)
From that, I would need to subtract 256 from rax to make it -32 before returning... But that seems a little weird to me; there has got to be a cleaner way?
The 8-bit values 224 and -32 are the same on x86 and x86-64, since it uses two's complement to represent negative numbers. Subtracting 256 wouldn't help you. It's up to the caller of the function to choose to interpret the result as signed (-128 to 127, in which case it's -32), or unsigned (0 to 255, in which case it's 224). I assume you're asking this because your code that calls this function is interpreting it wrongly. This could happen if you accidentally used movzx on the result instead of movsx to extend it, or worse, if you used ax, eax, or rax without extending it first (in which case you could end up with complete garbage like 0x11223344556677e0).
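The point about interpretation can be illustrated with a short Python sketch. The helper names sub8 and as_signed8 are invented for this example; they model an 8-bit register and the signed reading that movsx would produce.

```python
def sub8(a, b):
    """8-bit subtraction as `sub al, ...` performs it: the result wraps modulo 256."""
    return (a - b) & 0xFF


def as_signed8(value):
    """Read the low 8 bits as a two's complement number (what movsx would yield)."""
    value &= 0xFF
    return value - 256 if value >= 0x80 else value


if __name__ == "__main__":
    result = sub8(0, 32)
    print(result)                 # the unsigned reading of the byte
    print(as_signed8(result))     # the signed reading of the same byte
    # Stale upper bits do not change the low byte, only how the full register reads.
    print(hex(0x11223344556677E0 & 0xFF))
```

The bit pattern is identical in both readings; only the extension step chosen by the caller differs.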

What's so special about 0x55AA?

I have encountered 0x55AA in 2 scenarios:
The final 2 bytes of the boot sector in the legacy boot process contain 0x55AA.
The first 2 bytes of an Option ROM must be 0x55AA.
So what's special about 0x55AA?
The binary version of 0x55AA is 0101010110101010. Is it because it is evenly interleaved 0s and 1s? But I don't see that as a strong criterion.
0x55AA is a "signature word". It is used as the "end of sector" marker in the last 2 bytes of a 512-byte boot record. This includes the MBR and its extended boot records, as well as the protective MBR of the newer GPT.
References:
Image from Master Boot Record - microsoft.com.
How Basic Disks and Volumes Work - microsoft.com.
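As a small illustration of how the signature is checked, the following Python sketch tests the last two bytes of a 512-byte sector. The function name has_boot_signature is made up for this example. Note the byte order: the byte at offset 510 is 0x55 and the byte at offset 511 is 0xAA, so reading the pair as a little-endian 16-bit word gives 0xAA55.

```python
def has_boot_signature(sector):
    """True when a 512-byte sector ends with the bytes 0x55, 0xAA."""
    return len(sector) >= 512 and sector[510] == 0x55 and sector[511] == 0xAA


if __name__ == "__main__":
    sector = bytearray(512)
    print(has_boot_signature(sector))           # blank sector, no signature yet
    sector[510:512] = b"\x55\xAA"
    print(has_boot_signature(sector))
    # The same two bytes read as a little-endian word:
    print(hex(int.from_bytes(sector[510:512], "little")))
```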
There is nothing magical or mystical about that combination. Implementers needed a means of determining whether the first sector of a device was bootable (a boot signature), and that combination was chosen because it is so unlikely to occur by chance in the last two bytes of a sector.
Similarly, the SMBIOS entry point can be found by scanning the BIOS for the _SM_ signature, which must be on a 16-byte (paragraph) boundary, like this:
Find_SMBIOS:
push ds
push bx ; Preserve essential
push si
; Establish DS:BX to point to base of BIOS code
mov ax, 0xf000
mov ds, ax ; Segment where table lives
xor bx, bx ; Initial pointer
; Loop has a maximum of 4096 iterations. As the table is probably at the top of the buffer,
; cycling through it backwards saves time. In my test bed, Bochs 2.6.5 BIOS-bochs-latest,
; it took 1,451 iterations.
.L0: mov eax, '_SM_' ; Signature to scan for, reloaded here because the checksum code below clobbers AX
sub bx, 16 ; Bump pointer to previous 16-byte paragraph
jnz .J0
; Return NULL in AX and set CF. Either AX or flag can be tested on return.
mov ax, bx
stc
jmp .Done
; Did we find the signature at this paragraph
.J0: cmp [bx], eax
jnz .L0 ; NZ, keep looking
; Calculate checksum to verify position
mov cx, 15
mov ax, cx
mov si, bx ; DS:SI = Table entry point
; Compute checksum on next 15 bytes
.L1: lodsb
add ah, al
loop .L1
or ah, ah
jnz .L0 ; Invalid, try to find another occurrence
; As entry point is paragraph aligned, we can do this to determine segment.
shr bx, 4
mov ax, ds
add ax, bx
clc ; NC, found signature
.Done:
pop si
pop bx ; Restore essential
pop ds
ret
That signature is easily identifiable in a hex dump and it fits into a 32-bit register. Whether those two criteria were precipitating factors, I don't know, but again, the probability of 0x5f4d535f appearing on an even 16-byte boundary is very low.
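The same search can be sketched in Python over a dump of the BIOS segment. This is a model under stated assumptions, not a translation of the listing: it scans forwards rather than backwards, the name find_smbios and its parameters are hypothetical, and it assumes the 32-bit SMBIOS 2.x entry point layout, in which the byte at offset 5 gives the structure length and all bytes of the structure sum to zero modulo 256 (rather than the fixed 15-byte sum used above).

```python
def find_smbios(bios, base_segment=0xF000):
    """Return the segment of the '_SM_' entry point in a BIOS dump, or None.

    Only 16-byte (paragraph) boundaries are examined, since the anchor
    string is required to be paragraph aligned.
    """
    for offset in range(0, len(bios) - 5, 16):
        if bios[offset:offset + 4] != b"_SM_":
            continue
        length = bios[offset + 5]                 # assumed: entry point length field
        if length == 0 or offset + length > len(bios):
            continue
        if sum(bios[offset:offset + length]) & 0xFF == 0:
            return base_segment + (offset >> 4)   # paragraph offset converted to a segment
    return None


if __name__ == "__main__":
    dump = bytearray(0x1000)
    entry = bytearray(0x1F)
    entry[0:4] = b"_SM_"
    entry[5] = 0x1F
    entry[4] = (-sum(entry)) & 0xFF               # make the structure sum to zero
    dump[0x230:0x230 + len(entry)] = entry
    print(hex(find_smbios(bytes(dump))))
```

The checksum step is what guards against a chance occurrence of the four signature bytes on a paragraph boundary.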

Negating numbers in Y86

I know there would be many methods to do it and I'm trying to find the most efficient way.
One particular way I'm trying to avoid is subtracting the number from zero, since that would involve transferring the result from the register that used to hold 0 back to the register that held the original number, which would be a pain.
From the limited information I can find online about Y86, it is a simplified version of x86. The x86 instruction set has a NEG instruction to negate a number. Y86 does not. You will probably have to subtract your value from 0.
Sorry, Y86 is limited, so almost any operation that you can imagine will end up costing more calories than a simple subtraction from 0.
What we can do is optimize the establishment of 0 (by using XOR) and the preservation/restoration of the interim value (by using the stack).
The following code works:
#
# Negate a number in %ebx by subtracting it from 0
#
Start:
irmovl $999, %eax // Some random value to prove non-destructiveness
irmovl Stack, %esp // Set the stack
pushl %eax // Preserve
Go:
irmovl $300, %ebx
xorl %eax, %eax
subl %ebx,%eax
rrmovl %eax, %ebx
Finish:
popl %eax // Restore
halt
.pos 0x0100
Stack:
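The arithmetic behind the listing can be checked with a short Python model of 32-bit registers. The names negate32 and to_signed32 are invented for this sketch; the steps mirror the xorl, subl, and rrmovl sequence.

```python
MASK32 = 0xFFFFFFFF


def negate32(value):
    """Negate a 32-bit value the way the listing does: zero a scratch register, then subtract."""
    scratch = value ^ value                 # xorl %eax, %eax: anything XOR itself is 0
    scratch = (scratch - value) & MASK32    # subl %ebx, %eax: eax = eax - ebx, wrapped to 32 bits
    return scratch                          # rrmovl %eax, %ebx: copy the result back


def to_signed32(value):
    """Read a 32-bit pattern as a two's complement number."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


if __name__ == "__main__":
    pattern = negate32(300)
    print(hex(pattern), to_signed32(pattern))
```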

harmonic series with x86-64 assembly

Trying to compute a harmonic series.
Right now I'm entering the number I want the sum to go up to.
When I enter a small number like 1.2, the program just hangs. It doesn't crash, and it seems to be doing calculations,
but it never finishes.
Here is my code:
denominator:
xor r14,r14 ;zero out r14 register
add r14, 2 ;start counter at 2
fld1 ;load 1 into st0
fxch st2
denomLoop:
fld1
mov [divisor], r14 ;put 1 into st0
fidiv dword [divisor] ;divide st0 by r14
inc r14 ;increment r14
fst qword [currentSum] ;pop current sum value into currentSum
jmp addParts
addParts:
fld qword [currentSum]
fadd st2 ;add result of first division to 1
fxch st2 ;place result of addition into st2
fld qword [realNumber] ;place real number into st0
;compare to see if greater than inputed value
fcom st2 ;compare st0 with st2
fstsw ax ;needed to do floating point comparisons on FPU
sahf ;needed to do floating point comaprisons on FPU
jg done ;jump if greater than
jmp denomLoop ;jump if less than
The code basically computes 1/2, 1/3, 1/4, and so on, adds each term to a running sum, and then compares to see whether the sum has gone above the value I entered. Once it has, it should exit the loop.
Do you guys see my error?
This line seems suspicious:
fst qword [currentSum] ;pop current sum value into currentSum
Contrary to the comment, fst stores the top of the stack into memory WITHOUT popping it. You want fstp if you want to pop it.
Overall, the stack behavior of your program seems suspicious -- it pushes various things onto the fp stack but never pops anything. After a couple of iterations, the stack will overflow and wrap around. Depending on your settings, you'll then either get an exception or get bogus values if you don't have exceptions enabled.
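As a reference for what the loop is supposed to compute, here is a minimal Python sketch of the intended algorithm. The function name harmonic_until is made up for this example, and it assumes the sum starts at 1 and the first added term is 1/2, as in the question. Comparing the assembly's intermediate values against it may help when checking the stack handling.

```python
def harmonic_until(limit):
    """Add 1/2, 1/3, ... to a running sum that starts at 1 until it exceeds limit.

    Returns (last_denominator, total).
    """
    total = 1.0
    denominator = 1
    while total <= limit:
        denominator += 1              # the counter starts at 2 on the first pass
        total += 1.0 / denominator    # add the next term to the running sum
    return denominator, total


if __name__ == "__main__":
    print(harmonic_until(1.2))
    print(harmonic_until(2.0))
```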