Storing a String in a Variable and printing it out in 8086 Assembly language - emu8086

;CODE FOR PRINTING A STRING IN 8086 ASSEMBLY LANGUAGE:
.model small
.stack 100h
.data
msg db 'hello$'
.code
main proc
mov dx,@data      ;@data = segment address of .data
mov ds,dx
mov dx,offset msg ;equivalent: lea dx,msg
mov ah,9          ;DOS function 09h: print '$'-terminated string at DS:DX
int 21h
mov ah,4ch        ;DOS function 4Ch: terminate program
int 21h
main endp
end main
MY QUESTIONS:
db can store 8 bits of data, but 'hello$' is 6 bytes in size (1 char = 8 bits). How can it store the string if the string is larger than db's capacity?
If I write MOV DX,MSG it shows an error (as DX is an 8-bit register and the string is larger than its capacity), but it works when written as MOV DX,OFFSET msg or LEA DX,msg. Can you explain what OFFSET and LEA do?

db can store 8 bits of data, but 'hello$' is 6 bytes in size (1 char = 8 bits). How can it store the string if the string is larger than db's capacity?
It does not store 'hello$' as a whole in one byte. A db directive with a multi-character string defines one byte per character, and msg refers to the OFFSET of 'hello$', which in this case is the address of 'h' (the starting character of your string).
This is how your string will be stored in memory:
Let's say DS:SI (registers which point to some memory address/location, usually the address of the variable you have declared in your program) holds address 07200h, and let's assume the first character 'h' has been assigned this address. All the characters following it are then stored at contiguous memory locations, so address 07200h stores 'h', 07201h stores 'e', 07202h stores 'l', and so on. In this way the msg label only has to record the offset of the first byte, because it knows it will find all the other characters of the string following that offset (arrays are stored in contiguous memory).
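Since each character sits in its own byte, you can read the individual bytes back through the label. A minimal sketch in the same MASM/emu8086 syntax (the register choice is just for illustration):
mov bx, offset msg   ; BX = address of the first byte ('h')
mov al, [bx]         ; AL = 'h' (byte at msg+0)
mov al, [bx+1]       ; AL = 'e' (byte at msg+1)
mov al, msg+2        ; same idea, addressed directly: AL = 'l'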
If I write MOV DX,MSG it shows an error (as DX is an 8-bit register and the string is larger than its capacity), but it works when written as MOV DX,OFFSET msg or LEA DX,msg. Can you explain what OFFSET and LEA do?
First of all, DX is not an 8-bit register; it is a 16-bit register. Service 09h of INT 21h needs the offset of your string to be placed in the DX register; from there it keeps printing characters on the console until it encounters '$' (the string-termination character). MOV DX,MSG fails for a different reason: msg was defined with db, so it is a byte-sized memory operand, and moving a byte into the 16-bit DX register is an operand-size mismatch, which is why the assembler throws an error. mov dx, offset msg and lea dx, msg (Load Effective Address) both place the offset of the string in the DX register.
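To summarize the three forms (a sketch; the first line is left commented out because it does not assemble):
;mov dx, msg         ; error: msg is a byte-sized memory operand, DX is 16-bit
mov dx, offset msg   ; OK: OFFSET is resolved by the assembler to an immediate
lea dx, msg          ; OK: the CPU computes the effective address at run time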

OFFSET stores the address of the first char, in this case 'h'. So DX doesn't hold the whole string but the address of the first char, and LEA presumably works like OFFSET. That solves the second question, if I am right. But how can db store 6 bytes of characters when it is capable of dealing with only 8 bits, or 1 byte?

Iterate through alphabet in Swift explanation

I accidentally wrote this simple code to print the alphabet in the terminal:
var alpha:Int = 97
while (alpha <= 122) {
    write(1, &alpha, 1)
    alpha += 1
}
write(1, "\n", 1)
//I'm using write() function from C, to avoid newline on each symbol
And I've got this output:
abcdefghijklmnopqrstuvwxyz
Program ended with exit code: 0
So, here is the question: Why does it work?
In my logic, it should display a row of numbers, because an integer variable is being used. In C, it would be a char variable, so we would mean that we point to a character at some index of the ASCII table. Then:
char alpha = 97;
would be a code point for the character 'a', and by incrementing the alpha variable in a loop we would display each element of ASCII through the 122nd.
In Swift though, I couldn't assign an integer to a Character or String type variable. I used an Int and then declared several variables to assign a UnicodeScalar, but accidentally I found out that when I call write, I point to my integer, not to the new variable of UnicodeScalar type, and yet it works! The code is very short and readable, but I don't completely understand how it works, or why it works at all.
Has anyone had such situation?
Why does it work?
This works “by chance” because the integer is stored in little-endian byte order.
The integer 97 is stored in memory as 8 bytes
0x61 0x00 0x00 0x00 0x00 0x00 0x00 0x00
and in write(1, &alpha, 1), the address of that memory location is passed to the write system call. Since the last parameter (nbyte) is 1, the first byte at that memory address is written to the standard output: that is 0x61 or 97, the ASCII code of the letter a.
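If you want to see those bytes for yourself, here is a small sketch (assuming a 64-bit little-endian platform such as x86_64 or arm64; Foundation is imported only for String(format:)):
import Foundation

var alpha: Int = 97
withUnsafeBytes(of: &alpha) { raw in
    // On little-endian hardware the least significant byte comes first,
    // so raw[0] == 0x61 ('a') and the remaining bytes are zero.
    print(raw.map { String(format: "0x%02x", $0) }.joined(separator: " "))
}
// prints: 0x61 0x00 0x00 0x00 0x00 0x00 0x00 0x00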
In Swift though, I couldn't assign an integer to Character or String type variable.
The Swift equivalent of char is CChar, a type alias for Int8:
var alpha: CChar = 97
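With CChar the original loop can be rewritten to stay closer to the C version (a sketch; write comes from Darwin on macOS, or Glibc on Linux):
import Darwin  // for write(); use Glibc on Linux

var alpha: CChar = 97          // 'a'
while alpha <= 122 {           // 'z'
    write(1, &alpha, 1)        // CChar is one byte, so no endianness caveat
    alpha += 1
}
write(1, "\n", 1)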
Here is a solution which does not rely on the memory layout and works for non-ASCII characters as well:
let first: UnicodeScalar = "α"
let last: UnicodeScalar = "ω"
for v in first.value...last.value {
    if let c = UnicodeScalar(v) {
        print(c, terminator: "")
    }
}
print()
// αβγδεζηθικλμνξοπρςστυφχψω

difference between integer*4 and int32 when using fread in matlab

I want to read a file with MATLAB. The first 200 bytes of this file are unnecessary, so I skip them, and the rest should be read 4 bytes at a time. Because of this, I wrote simple code such as the below:
[fidr, message] = fopen('myfile.format', 'r', 'n');
extra = fread(fidr, 200, 'int8');        % skip the unneeded first 200 bytes
fidTemp = fopen('mynewfile.format', 'w');
Temp_c = 0;                              % count of 4-byte values copied
while ~feof(fidr)
    Tempc = fread(fidr, 1, 'int32');
    fwrite(fidTemp, Tempc, 'integer*4');
    Temp_c = Temp_c + 1;
end
fclose(fidr);
fclose(fidTemp);                         % flush before re-reading
[fidr11, message11] = fopen('mynewfile.format');
mynewfile = fread(fidr11, 'int32');
When I read the MATLAB help for fread and fwrite, I noticed that for signed 32-bit (4-byte) values they mention both int32 and integer*4, but they did not say what the difference is. Is there any difference between them, or are they the same?
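For what it's worth, precision strings of the form integer*4 are documented as Fortran-style synonyms for the MATLAB class names, so int32 and integer*4 should behave identically. A quick sanity check along these lines (with a throwaway test file) would confirm it:
% Write four known 32-bit values, then read them back both ways.
fid = fopen('precision_test.bin', 'w');
fwrite(fid, int32([1 -2 300000 -400000]), 'int32');
fclose(fid);

fid = fopen('precision_test.bin', 'r');
a = fread(fid, 4, 'int32');
fseek(fid, 0, 'bof');            % rewind and re-read the same bytes
b = fread(fid, 4, 'integer*4');
fclose(fid);

isequal(a, b)                    % expected: 1 (the reads are identical)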

perl bitwise AND and bitwise shifting

I was reading an example code snippet for the module Net::Pcap::Easy, and I came across this piece of code:
my $l3protlen = ord substr $raw_bytes, 14, 1;
my $l3prot = $l3protlen & 0xf0 >> 2; # the protocol part
return unless $l3prot == 4; # return unless IPv4
my $l4prot = ord substr $packet, 23, 1;
return unless $l4prot == '7';
After doing a full hex dump of the raw packet $raw_bytes, I can see that this is an Ethernet frame, and not a TCP/UDP packet. Can someone please explain what the above code does?
For parsing the frame, I looked up this page.
Now onto the Perl...
my $l3protlen = ord substr $raw_bytes, 14, 1;
Extract the 15th byte (character) from $raw_bytes, and convert to its ordinal value (e.g. a character 'A' would be converted to an integer 65 (0x41), assuming the character set is ASCII). This is how Perl can handle binary data as if it were a string (e.g. passing it to substr) but then let you get the binary values back out and handle them as numbers. (But remember TMTOWTDI.)
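For example, a quick illustration with a made-up buffer:
my $raw  = "\x00\x11\x22\x45\xff";
my $byte = ord substr $raw, 3, 1;      # 4th byte, offset 3
printf "0x%02X (%d)\n", $byte, $byte;  # prints: 0x45 (69)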
In the Ethernet frame, the first 14 bytes are the MAC header (6 bytes each for the destination and source MAC addresses, followed by the 2-byte EtherType, which was probably 0x0800 for IPv4 - you could have checked this). Following this, the 15th byte is the start of the Ethernet data payload: the first byte of the IP header contains the Version (upper 4 bits) and the Header Length in DWORDs (lower 4 bits).
Now it looks to me like there is a bug in the next line of this sample code, but it may well normally work by a fluke!
my $l3prot = $l3protlen & 0xf0 >> 2; # the protocol part
In Perl, >> has higher precedence than &, so this will be equivalent to
my $l3prot = $l3protlen & (0xf0 >> 2);
or if you prefer
my $l3prot = $l3protlen & 0x3c;
So this extracts bits 2 - 5 from the $l3prot value: the mask value 0x3c is 0011 1100 in binary. So for example a value of 0x86 (in binary, 1000 0110) would become 0x04 (binary 0000 0100).
In fact a 'normal' IPv4 value is 0x45, i.e. protocol type 4, header length 5 dwords. Mask that with 0x3c and you get... 4! But only by fluke: you have tested the top 2 bits of the length, not the protocol type!
This line should surely be
my $l3prot = ($l3protlen & 0xf0) >> 4;
(note brackets for precedence and a shift of 4 bits, not 2). (I found this same mistake in the CPAN documentation so I guess it's probably quite widely spread.)
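You can see both the fluke and the failure with two sample header bytes (version 4 with header lengths 5 and 15; the second value is contrived just to expose the bug):
use strict;
use warnings;

for my $byte (0x45, 0x4f) {
    my $buggy = $byte & 0xf0 >> 2;     # parsed as $byte & 0x3c
    my $fixed = ($byte & 0xf0) >> 4;   # the intended version nibble
    printf "byte=0x%02X  buggy=%2d  fixed=%d\n", $byte, $buggy, $fixed;
}
# byte=0x45  buggy= 4  fixed=4   <- the fluke: both come out as 4
# byte=0x4F  buggy=12  fixed=4   <- the mask picks up header-length bits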
return unless $l3prot == 4; # return unless IPv4
For IPv4 we expect this value to be 4 - if it isn't, jump out of the function right away. (So the wrong code above gives the result which lets this be interpreted as an IPv4 packet, but only by luck.)
my $l4prot = ord substr $packet, 23, 1;
Now extract the 24th byte and convert to ordinal value in the same way. This is the Protocol byte from the IP header:
return unless $l4prot == '7';
We expect this to be 7 - if it isn't jump out of the function right away. (According to IANA, 7 is "Core-based trees"... but I guess you know which protocols you are interested in!)

implementation of sha-256 in perl

I'm trying very hard to implement the SHA-256 algorithm. I have problems with the padding of the message. For SHA-256 you have to append a single 1 bit at the end of the message, which I have achieved so far with $message .= (chr 0x80);
The next step should be to fill the empty space (the rest of the 512-bit block) with 0's.
I calculated the number of zero bits k from the formula l + 1 + k ≡ 448 (mod 512) and then append them to the message.
Now my problem: in the last 64-bit block, I must append the binary representation of the length of the message and fill the rest with 0's again. Since Perl handles its data types by itself, there is no "byte" datatype. How can I figure out which value I should append?
please see also the official specification:
http://csrc.nist.gov/publications/fips/fips180-3/fips180-3_final.pdf
If at all possible, pull something off the shelf. You do not want to roll your own SHA-256 implementation because to get official blessing, you would have to have it certified.
That said, the specification is
5.1.1 SHA-1, SHA-224 and SHA-256
Suppose that the length of the message, M, is l bits. Append the bit 1 to the end of the message, followed by k zero bits, where k is the smallest, non-negative solution to the equation
l + 1 + k ≡ 448 mod 512
Then append the 64-bit block that is equal to the number l expressed using a binary representation. For example, the (8-bit ASCII) message “abc” has length 8 × 3 = 24, so the message is padded with a one bit, then 448 - (24 + 1) = 423 zero bits, and then the message length, to become the 512-bit padded message
                               423 bits     64 bits
                              .----^----. .----^----.
01100001 01100010 01100011  1  00 … 00    00…011000
   “a”      “b”      “c”                    l = 24
The length of the padded message should now be a multiple of 512 bits.
You might be tempted to use vec because it allows you to address single bits, but you would have to work around funky addressing.
If bits is 4 or less, the string is broken into bytes, then the bits of each byte are broken into 8/BITS groups. Bits of a byte are numbered in a little-endian-ish way, as in 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80. For example, breaking the single input byte chr(0x36) into two groups gives a list (0x6, 0x3); breaking it into 4 groups gives (0x2, 0x1, 0x3, 0x0).
Instead, a pack template of B* specifies
A bit string (descending bit order inside each byte).
and N
An unsigned long (32-bit) in "network" (big-endian) order.
The latter is useful for assembling the message length. Although pack has a Q parameter for quad, the result is in the native order.
Start with a bit of prep work
our($UPPER32BITS, $LOWER32BITS);
BEGIN {
    use Config;
    die "$0: $^X not configured for 64-bit ints"
        unless $Config{use64bitint};
    # create non-portable 64-bit masks as constants
    no warnings "portable";
    *UPPER32BITS = \0xffff_ffff_0000_0000;
    *LOWER32BITS = \0x0000_0000_ffff_ffff;
}
Then you can define pad_message as
sub pad_message {
    use bytes;
    my ($msg) = @_;
    my $l = bytes::length($msg) * 8;
    # smallest non-negative k with l + 1 + k ≡ 448 (mod 512)
    my $k = (448 - ($l + 1)) % 512;
    # append a 1 bit followed by $k zero bits
    $msg .= pack "B*", 1 . 0 x $k;
    # add big-endian 64-bit length
    $msg .= pack "NN", (($l & $UPPER32BITS) >> 32), ($l & $LOWER32BITS);
    die "$0: bad length: ", bytes::length $msg
        if (bytes::length($msg) * 8) % 512;
    $msg;
}
Say the code prints the padded message with
my $padded = pad_message "abc";
# break into multiple lines for readability
for (unpack("H*", $padded) =~ /(.{64})/g) {
    print $_, "\n";
}
Then the output is
6162638000000000000000000000000000000000000000000000000000000000
0000000000000000000000000000000000000000000000000000000000000018
which matches the specification.
First of all I hope you do this just as an exercise -- there is a Digest::SHA module in core that already computes SHA-256 just fine.
Note that $message .= (chr 0x80); appends one byte, not one bit. If you really need bitwise manipulation, take a look at the vec function.
To get the binary representation of an integer, you should use pack. To get it into 64 bits (big-endian, and counting bits rather than bytes, as the spec requires), do something like
$message .= pack 'Q>', 8 * length($message);
Note that the 'Q' format is only available on 64-bit perls; if yours isn't one, simply concatenate a big-endian 32-bit zero with a big-endian 32-bit value (pack format N).
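For instance, a sketch of that 32-bit fallback (assuming the message length in bits fits into 32 bits):
my $bitlen = 8 * length($message);
$message .= pack 'NN', 0, $bitlen;  # high 32 bits zero, low 32 bits = bit length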

Interpreting a zero when reading data from the serial line in MATLAB using 'fscanf'

I have a stream of data coming in over the serial line from an Arduino board. The stream looks like this:
0x43 0x03 0x39 0x00 0x0D 0x0A
The first two bytes (0x43 and 0x03) are single-byte integer values. The next two bytes (0x39 and 0x00) are a single 16-bit little-endian signed integer value. The final two bytes (0x10 and 0x13) are supposed to be a terminator sequence ("\r\n").
I am using MATLAB to read in this data. I create a serial connection, open it, and read in the data. Unfortunately, I am running into problems with using 0x00 as a byte value because fscanf simply considers it to be the null-terminator of a string.
Here is some sample code:
%Create and open serial connection
serialcon = serial('COM5');
fopen(serialcon);
firstChar = fscanf(serialcon, '%c', 1); %Read 0x43
secondChar = fscanf(serialcon, '%c', 1); %Read 0x03
integerByteChars = fscanf(serialcon, '%c', 2); %Read 0x39 and 0x00
fscanf(serialcon, '%c'); %Read until end-of-line
integerBytes = uint8(integerByteChars); %value should be (in hex): [ 0x39 0x00 ]
integerValue = typecast(integerBytes, 'uint16'); %value should be (in hex): 0x0039
Unfortunately, what happens is "integerByteChars" is not a 2-element array as I would like it to be, but rather a 1-element array because fscanf just considers 0x00 to be a null-terminating string value. This surprises me, however, because I am inputting the data using '%c' and not '%s' (which is used for strings).
What I need is a function that will read these bytes as data even if it's a zero byte and not throw it away. What functions are available to me that will do that? Can fscanf be coerced into doing so?
fread would be a good way of doing this.
You could read all 6 bytes with:
data = fread(serialcon, 6, 'uint8')
and then work through the vector that is returned.
firstChar = data(1);
secondChar = data(2);
integerValue = data(3) + data(4) * 256; % Need to check endian calc
if data(5) ~= 13 || data(6) ~= 10
    error('Not terminated correctly')
end
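Since the question describes those two bytes as a signed 16-bit value, typecast is a safer alternative to the manual sum above, which would not handle negative values (a sketch; typecast uses the machine's native byte order, which is little-endian on typical MATLAB platforms):
% Reinterpret the two little-endian bytes as one signed 16-bit integer.
integerValue = typecast(uint8(data(3:4)), 'int16');   % [0x39 0x00] -> 57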
BTW, are you sure you have your CR/LF ASCII values correct?