Learning JavaScript and haven't learned about toString - numbers

What is 0xff.toString()? Why?
When I type this into my console I get the output 255, but why, when 0xff is 225 in base 10?

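Because 0xff is not a base-10 literal. The 0x prefix specifically means the number is written in hexadecimal, and 0xff is 255 in decimal. toString() with no argument formats the number in base 10, so you get "255".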
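The same distinction between how a number is written and what it is worth can be sketched in Python (used here only for illustration instead of JavaScript; str and format play roughly the role that toString() and toString(16) play there):

```python
# 0xff is just another way of writing the integer 255 in source code.
value = 0xff
assert value == 255

# Converting to text uses base 10 unless another base is requested,
# much like toString() without a radix argument in JavaScript.
decimal_text = str(value)          # base 10
hex_text = format(value, "x")      # base 16, no prefix
binary_text = format(value, "b")   # base 2, no prefix

print(decimal_text, hex_text, binary_text)
```

The literal's notation is gone once the value is stored; only the formatting call decides which base you see.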

Related

Deciphering an unknown character encoding

I am trying to understand the string character encoding of a proprietary file format (fp7-file format from Filemaker Pro).
I found that each character is obfuscated by XOR with 0b01011010 and that the string length is encoded using a single starting byte (max string length in Filemaker is 100 characters).
Encoding is a variable byte encoding, where by default ISO 8859-1 (Western) is used to encode most characters.
If a Unicode character outside ISO 8859-1 is to be encoded, some sort of control characters are included in the string that modify the decoding of the next character or of several following characters. These control characters use the ASCII control character space (0x01 to 0x1f in particular). This is where I am stuck, as I can't seem to find a pattern in how these control characters work.
Some examples of what I think I have found:
When encountering the control character 0x11, the following characters are created by adding 0x40 to the byte value, e.g. the character Ā (Unicode U+0100) is encoded as 0x11 0xC0 (0xC0 + 0x40 = 0x100).
When encountering the control character 0x10 the previous control character seems to be reset.
When encountering the control character 0x03, the next (only the next!) character is created by adding 0x100 to the byte value. If the control character 0x03 is preceded by 0x1b, then all following characters are created by adding 0x100.
An example string (0_ĀĐĠİŀŐŠŰƀƐƠưǀǐǠǰȀ), its unicode code points and the encoding in Filemaker:
char 0 _ Ā Đ Ġ İ ŀ Ő Š Ű ƀ Ɛ Ơ ư ǀ ǐ Ǡ ǰ Ȁ
unicode 30 5f 100 110 120 130 140 150 160 170 180 190 1a0 1b0 1c0 1d0 1e0 1f0 200
encoded 30 5f 11 c0 d0 e0 f0 3 40 3 50 3 60 3 70 1b 3 80 90 a0 b0 c0 d0 e0 f0 1c 4 80
As you can see the characters 0 and _ are encoded with their direct unicode/ASCII value. The characters ĀĐĠİ are encoded using the 0x11 control byte. Then ŀŐŠŰ are encoded using 0x03 for each character, then 0x1B 0x03 are used to encode the next 8 characters, etc.
Does this encoding scheme look familiar to anybody?
The rules are simple for characters up to 0x200, but then become more and more confusing, even to the point where they seem position dependent.
I can provide more examples for a weekend of puzzles and joy.
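The rules observed so far can be written down as a minimal sketch. This only restates the observations above and is not a description of the real format: the names deobfuscate and decode_observed are hypothetical, the way the control bytes interact (a one-shot 0x03 taking priority over a pending 0x11 mode) is an assumption read off the example string, and any control byte not covered above, such as the 0x1c 0x04 sequence, is rejected rather than guessed at.

```python
XOR_KEY = 0b01011010  # 0x5A, the obfuscation byte observed above


def deobfuscate(raw):
    """Undo the per-byte XOR obfuscation."""
    return bytes(b ^ XOR_KEY for b in raw)


def decode_observed(data):
    """Decode already de-obfuscated bytes using only the rules seen so far."""
    out = []
    locked = 0        # offset applied to all following characters
    one_shot = None   # offset applied to the next character only
    i = 0
    while i < len(data):
        b = data[i]
        if b == 0x11:
            locked = 0x40
        elif b == 0x10:
            locked = 0                      # reset the previous control
        elif b == 0x1B and i + 1 < len(data) and data[i + 1] == 0x03:
            locked = 0x100                  # 0x1b 0x03: shift everything after
            i += 1
        elif b == 0x03:
            one_shot = 0x100                # shift only the next character
        elif b < 0x20:
            raise ValueError(f"control byte {b:#04x} not understood yet")
        else:
            offset = one_shot if one_shot is not None else locked
            out.append(chr(b + offset))
            one_shot = None
        i += 1
    return "".join(out)


# The example string from above, without the trailing 0x1c 0x04 0x80.
sample = bytes([0x30, 0x5F, 0x11, 0xC0, 0xD0, 0xE0, 0xF0,
                0x03, 0x40, 0x03, 0x50, 0x03, 0x60, 0x03, 0x70,
                0x1B, 0x03, 0x80, 0x90, 0xA0, 0xB0, 0xC0, 0xD0, 0xE0, 0xF0])
print([hex(ord(c)) for c in decode_observed(sample)])
```

Raising on unknown control bytes makes it easy to see exactly where the known rules stop applying when more sample strings are fed in.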

Convert from int to binary or hex in mikroc

I have an int value from 0 to 255 and I want to convert that value to hex or binary so I can write it to an 8-bit register (PIC18F µC).
How can I do this conversion?
I tried to use the IntToHex function from the Conversion Library, but the output of this function is a char value, and that is where I got stuck.
I'm using mikroC for PIC.
Where should I start?
Thanks!
This is a common problem. Many people don't realize that decimal 15, hex F, octal 17 and binary 1111 are all the same value.
The different number systems are for humans; to the CPU, it's all binary!
When OP says,
I have an int value from 0 to 255 and I want to convert that value to
hex or binary so I can write it to an 8-bit register (PIC18F µC).
This reflects that misunderstanding, probably because the debugger is configured to show decimal values while the sample code or datasheet uses hex values for register operations.
So when you get an int value from 0 to 255, you can write that number directly to an 8-bit register. You don't have to convert it to hex. Hex is just a representation that makes life easier for humans.
What you can do (and this is good practice) is add an explicit cast:
REG_VALUE = (unsigned char) int_value;
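As a rough illustration of the same point outside mikroC, here is a small Python sketch. The helper name to_8bit is made up for this example, and masking with 0xFF is only meant to mimic the effect of the unsigned char cast on a platform with 8-bit chars.

```python
# One value, four notations: the integer itself is identical.
assert 15 == 0xF == 0o17 == 0b1111


def to_8bit(int_value):
    """Keep only the low 8 bits, similar in effect to (unsigned char)."""
    return int_value & 0xFF


reg_value = to_8bit(200)

# The same stored number, displayed in three different bases.
print(reg_value, hex(reg_value), bin(reg_value))
```

Nothing is converted before the write; hex and binary only come into play when the value is displayed.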

UTF-8 in decimal

Is representing UTF-8 encoding in decimals even possible? I think only values till 255 would be correct, am I right?
As far as I know, we can only represent UTF-8 in hex or binary form.
I think it is possible. Let's look at an example:
The Unicode code point for ∫ is U+222B.
Its UTF-8 encoding is E2 88 AB, in hexadecimal representation. In octal, this would be 342 210 253. In decimal, it would be 226 136 171. That is, if you represent each byte separately.
If you look at the same 3 bytes as a single number, you have E288AB in hexadecimal; 70504253 in octal; and 14846123 in decimal.
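These figures are easy to check yourself. A short Python sketch (the variable names are just for illustration):

```python
encoded = "\u222b".encode("utf-8")   # U+222B INTEGRAL, three bytes in UTF-8

# Each byte shown separately in three bases.
per_byte_hex = [format(b, "02X") for b in encoded]
per_byte_octal = [format(b, "o") for b in encoded]
per_byte_decimal = list(encoded)

# The same three bytes read as one big-endian number.
as_single_number = int.from_bytes(encoded, "big")

print(per_byte_hex, per_byte_octal, per_byte_decimal, as_single_number)
```

Every byte is a value from 0 to 255, so the per-byte decimal form never needs more than three digits.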

LZW TIFF decoding

When decoding tiff files with LZW decompression, the first 9 bits in the encoded bitstream should be "256", the clear-code.
But when I read it I get a 128, which I just can't figure out. I created the file with GDAL.
My code reading the file is:
val res = (for {
  i <- 0 until next
  if (bitSet.get(i + index))
} yield (1 << i)).sum
The index is the index in the encoded bitstream and next is how many bits I should read (starting with 9).
So my question is: why do I read a 128 instead of a 256? When printing the input bitstream, the first bit that is set to 1 is bit number 8 (index 7).
The file in question is: https://dl.dropboxusercontent.com/u/42266515/aspect_lzw.tif
Thanks!
Thanks for posting the sample image. There's nothing wrong with the image; the first code is 0x100 (256). You must remember that TIFF LZW codes are packed in "Motorola" (big-endian) order, most significant bit first. The first two bytes of the compressed data are 0x80 0x00. In binary, that's 10000000 00000000. The first 9 bits (when read in the correct order) are 100000000, which is 256. Your code gives the first set bit the weight of bit 7 (1 << 7), which is why you get 128. You must gather the bits in big-endian order and then you'll be able to decode it correctly. Here is a sample byte stream:
If the data from the file is: 0x80 0x01 0x25 0x43 0x7E
The bits are (laid out in big-endian order)
10000000 00000001 00100101 01000011 01111110
Taking 9-bit codes from this bitstream will get you:
100000000 (256), 000000100 (4), 100101010 (298), ...
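To make the bit order concrete, here is a minimal Python sketch of reading fixed-width codes most-significant-bit first. The function names are hypothetical, and a real TIFF LZW decoder also has to grow the code width from 9 bits upward as the table fills, which this sketch ignores.

```python
def read_codes_msb_first(data, width=9):
    """Split a byte string into width-bit codes, taking bits MSB first."""
    codes = []
    acc = 0      # bits gathered so far, oldest bits in the highest positions
    nbits = 0    # number of valid bits currently held in acc
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= width:
            nbits -= width
            codes.append((acc >> nbits) & ((1 << width) - 1))
        acc &= (1 << nbits) - 1   # keep only the leftover bits
    return codes


def first_code_lsb_first(data, width=9):
    """What you get if bit 0 of each byte is treated as the first bit."""
    return int.from_bytes(data, "little") & ((1 << width) - 1)


sample = bytes([0x80, 0x01, 0x25, 0x43, 0x7E])
print(read_codes_msb_first(sample))               # [256, 4, 298, 55]
print(first_code_lsb_first(bytes([0x80, 0x00])))  # 128
```

The second function reproduces the 128 from the question: with the bits taken in the opposite order, the single set bit lands at weight 1 << 7 instead of 1 << 8.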

0x00000000 hexadecimal?

I had always been taught 0–9 to represent values zero to nine, and A, B, C, D, E, F for 10-15.
I see this format 0x00000000 and it doesn't fit into the pattern of hexadecimal. Is there a guide or a tutor somewhere that can explain it?
I googled for hexadecimal but I can't find any explanation of it.
So my 2nd question is, is there a name for the 0x00000000 format?
0x simply tells you that the number after it is in hex,
so 0x00 is 0, 0x10 is 16, 0x11 is 17, etc.
The 0x is just a prefix (used in C and many other programming languages) to mean that the following number is in base 16.
Other notations that have been used for hex include:
$ABCD
ABCDh
X'ABCD'
"ABCD"X
Yes, it is hexadecimal.
Without a prefix you couldn't write a value like A, for example: C and Java compilers would treat it as a variable identifier. The 0x prefix tells the compiler that it is a hexadecimal number, so:
int ten_i = 10;
int ten_h = 0xA;
ten_i == ten_h; // this boolean expression is true
The leading zeroes don't change the value; by convention they hint at the size: 0x0080 suggests the number is stored in two bytes, and 0x00000080 suggests four bytes. This notation is often used for flags: if a certain bit is set, that feature is enabled.
P.S. As an off-topic note: if the number starts with 0, it is interpreted as an octal number, for example 010 == 8. Here 0 is also a prefix.
Everything after the x are hex digits (the 0x is just a prefix to designate hex), representing 32 bits (if you were to put 0xFFFFFFFF in binary, it would be 1111 1111 1111 1111 1111 1111 1111 1111).
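The points above about leading zeroes, digit count and flags can be checked with a few lines of Python. This is only a sketch: FEATURE_FLAG and settings are made-up names, and Python writes octal as 0o10 rather than with a bare leading 0.

```python
# Leading zeroes change how a number is written, not its value.
assert 0x00000080 == 0x0080 == 0x80 == 128

# Each hex digit stands for 4 bits, so 8 digits cover 32 bits.
assert 0xFFFFFFFF == 2**32 - 1
assert (0xFFFFFFFF).bit_length() == 32

# Zero-padded formatting; the width includes the "0x" prefix.
two_bytes = format(0x80, "#06x")
four_bytes = format(0x80, "#010x")
print(two_bytes, four_bytes)

# Flags: test a single bit with a mask (hypothetical flag value).
FEATURE_FLAG = 0x00000080
settings = 0x00000081
feature_enabled = bool(settings & FEATURE_FLAG)
print(feature_enabled)
```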
Hexadecimal numbers are often prefixed with 0x to indicate that they are hexadecimal.
In this case there are 8 digits, each representing 4 bits, so that is 32 bits, or one word on a 32-bit machine. I'm guessing you saw this in an error message and that it is a memory address. This value means null, as the hex value is 0.