Is there any reason why when I run the following
var name = "A"
withUnsafeBytes(of: &name, { bytes in
    print(bytes)
    for byte in bytes.enumerated() {
        print(byte.element, byte.offset)
    }
})
The last byte is 255?
I expected the bytes to just contain 65 as that is the ASCII code!
That is, byte 0 is 65 (as expected) and byte 15 is 255 (all the rest are zeroed)
Why is byte 15 255?
struct String is an (opaque) structure containing pointers to the actual character storage. Short strings are stored directly in those pointers, which is why in your particular case the byte 65 is printed first.
If you run your code with
var name = "A really looooong string"
then you'll notice that there is no direct connection between the output of the program and the characters of the string.
If the intention is to enumerate the bytes of the UTF-8 representation of the string then
for byte in name.utf8 { print(byte) }
is the correct way.
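For comparison, the same idea sketched in Python (an illustration only, not Swift): enumerating the UTF-8 bytes of the string's contents, rather than the raw memory of the string variable, yields only the byte 65 for "A".

name = "A"
for offset, byte in enumerate(name.encode("utf-8")):
    print(byte, offset)  # prints: 65 0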
My problem is to retrieve real-time data from an inverter (Voltronic family).
The inverter has a server and, if correctly asked, can send back information according to a communication protocol.
The communication is done through the serial port.
In particular a string similar to "XXXX" + <crc> + CR has to be sent, and the relevant data are sent back.
In my case the only string I need to send is "QPIGS". In that case I get back a lot of information that will allow me to build a sort of control desk.
Since the string I need is always and only this one, I calculated offline the <crc> that I need to complete the request.
The <crc> value is composed of two bytes, "·©". The first is the "middle dot", hex b7, and the second is the "copyright sign", hex a9.
So the complete string should be "QPIGS·©". If I add the CR in PowerShell ("`r"), the complete string should be "QPIGS·©`r".
The script is very simple:
$port= new-Object System.IO.Ports.SerialPort COM1,2400,None,8,one
$port.ReadTimeout = 1000
$port.open()
$str='QPIGS·©`r'
$port.WriteLine($str)
Start-Sleep -Milliseconds 300
while ($x = $port.ReadExisting())
{
    Write-Host $x
}
$port.Close()
But unfortunately it didn't work.
The inverter recognises the string, but it does not match what it was expecting, so it sends back a NACK response. The exchange happens but is not successful.
In order to investigate more deeply I used a serial port sniffer to see what was really sent to the inverter, and I found that the following was sent:
175 15/10/2022 17:06:29 IRP_MJ_WRITE DOWN 51 50 49 47 53 3f 3f 0a QPIGS??. 8 8 COM1
instead of what I was expecting
175 15/10/2022 17:06:29 IRP_MJ_WRITE DOWN 51 50 49 47 53 b7 a9 0d QPIGS·©. 8 8 COM1
It seems that the two <crc> bytes are ignored and replaced with two ? characters, hex 3f.
I imagine it is an encoding problem, but I can't find a solution.
Thanks for your help.
Tip of the hat to CherryDT for his comments and links to relevant related posts.
the complete string should be "QPIGS·©" [ + "`r" for a CR]
If you send strings to a serial port, its character encoding matters.
The default encoding is ASCII, which means that only Unicode characters in the 7-bit ASCII subrange can be sent, which excludes · (MIDDLE DOT, U+00B7) and © (COPYRIGHT SIGN, U+00A9) - that is, any Unicode character whose code point is greater than 0x7f (127) is "lossily" converted to a verbatim ASCII-range ? character, 0x3f (63).
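As a quick illustration (a Python sketch, not part of the PowerShell answer), encoding the command with an ASCII encoder and '?' substitution shows exactly where the 0x3f bytes in the sniffer capture come from:

# ASCII encoding with '?' replacement for the non-ASCII characters
print('QPIGS·©\r'.encode('ascii', errors='replace').hex())
# -> 51504947533f3f0d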
You have two basic options:
Avoid string processing altogether and send an array of bytes: Convert the QPIGS substring to an array of (ASCII-range) byte values and append the byte values 0xb7 and 0xa9, followed by a CR (0xd):
Because .NET strings are Unicode strings (encoded as UTF-16 code units), you can take advantage of the fact that the code-point range 0x0 - 0x7f coincides with the ASCII code-point range, so you can simply cast ASCII-range characters to [byte[]] (via a [char[]] cast):
# Results in the following byte array:
# [byte[]] (0x51, 0x50, 0x49, 0x47, 0x53, 0xb7, 0xa9, 0xd)
[byte[]] $bytes = [char[]] 'QPIGS' + 0xb7, 0xa9, [char] "`r"
$port.Write($bytes, 0, $bytes.Count)
Use the port's .Encoding property to specify a character encoding in which the string "QPIGS·©`r" results in the desired byte values:
In this case you need the Windows-1252 encoding, where · is represented as byte value 0xb7, and © as 0xa9, and all ASCII-range characters are represented by their usual byte values:
$port.Encoding = [System.Text.Encoding]::GetEncoding(1252)
$port.Write("QPIGS·©`r")
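The same sketch in Python (again just an illustration) confirms that Windows-1252 maps '·' to 0xb7 and '©' to 0xa9, so the command encodes to the byte sequence the inverter expects:

# Windows-1252 encoding of the full command, shown as hex
print('QPIGS·©\r'.encode('cp1252').hex())
# -> 5150494753b7a90d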
I understand that you have a hex string and perform SHA256 on it twice and then byte-swap the final hex string. The goal of this code is to find a Merkle Root by concatenating two transactions. I would like to understand what's going on in the background a bit more. What exactly are you decoding and encoding?
import hashlib
transaction_hex = "93a05cac6ae03dd55172534c53be0738a50257bb3be69fff2c7595d677ad53666e344634584d07b8d8bc017680f342bc6aad523da31bc2b19e1ec0921078e872"
transaction_bin = transaction_hex.decode('hex')
hash = hashlib.sha256(hashlib.sha256(transaction_bin).digest()).digest()
hash.encode('hex_codec')
'38805219c8ac7e9a96416d706dc1d8f638b12f46b94dfd1362b5d16cf62e68ff'
hash[::-1].encode('hex_codec')
'ff682ef66cd1b56213fd4db9462fb138f6d8c16d706d41969a7eacc819528038'
transaction_hex is a regular string of lower-case ASCII characters, and the decode() method with the 'hex' argument changes it to a (binary) string (or a bytes object in Python 3) with bytes 0x93, 0xa0, etc. In C it would be an array of unsigned char, of length 64 in this case.
This array/byte string of length 64 is then hashed with SHA256, and its result (another binary string, of size 32) is hashed again. So hash is a string of length 32, or a bytes object of that length in Python 3. Then encode('hex_codec') is a synonym for encode('hex') in Python 2; in Python 3 neither exists on bytes objects, and you would use the bytes.hex() method instead. It outputs an ASCII (lower-case hex) string again, replacing each raw byte (which is just a small integer) with the two-character string that is its hexadecimal representation. So the last line reverses the double hash and outputs it as hexadecimal, in a form which I usually call "lowercase hex ASCII".
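For reference, here is a Python 3 equivalent of the snippet above (the original relies on the Python 2-only 'hex' string codec); bytes.fromhex() takes the place of decode('hex'), and bytes.hex() takes the place of encode('hex_codec'):

import hashlib

transaction_hex = "93a05cac6ae03dd55172534c53be0738a50257bb3be69fff2c7595d677ad53666e344634584d07b8d8bc017680f342bc6aad523da31bc2b19e1ec0921078e872"
transaction_bin = bytes.fromhex(transaction_hex)  # 64 raw bytes
# double SHA-256 of the raw bytes
digest = hashlib.sha256(hashlib.sha256(transaction_bin).digest()).digest()
print(digest.hex())        # hex of the double hash
print(digest[::-1].hex())  # byte-reversed hex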
I accidentally wrote this simple code to print the alphabet in the terminal:
var alpha:Int = 97
while (alpha <= 122) {
    write(1, &alpha, 1)
    alpha += 1
}
write(1, "\n", 1)
//I'm using write() function from C, to avoid newline on each symbol
And I've got this output:
abcdefghijklmnopqrstuvwxyz
Program ended with exit code: 0
So, here is the question: Why does it work?
In my logic, it should display a row of numbers, because an integer variable is being used. In C it would be a char variable, so we would mean that we point to a character at some index in ASCII. Then:
char alpha = 97;
would be the code for the character 'a', and by incrementing the alpha variable in a loop we would display each ASCII character up to the 122nd.
In Swift, though, I couldn't assign an integer to a Character or String type variable. I used an Int and then declared several variables to assign a UnicodeScalar, but accidentally I found out that when I call write I point to my integer, not the new variable of UnicodeScalar type, and yet it works! The code is very short and readable, but I don't completely understand how it works, or why it works at all.
Has anyone had such situation?
Why does it work?
This works “by chance” because the integer is stored in little-endian byte order.
The integer 97 is stored in memory as 8 bytes
0x61 0x00 0x00 0x00 0x00 0x00 0x00 0x00
and in write(1, &alpha, 1), the address of that memory location is
passed to the write system call. Since the last parameter (nbyte)
is 1, the first byte at that memory address is written to the
standard output: That is 0x61 or 97, the ASCII code of the letter
a.
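The same observation can be sketched in Python (purely as a language-agnostic illustration): the little-endian byte layout of the 64-bit integer 97 starts with 0x61, which is exactly the byte that write() picks up.

# little-endian representation of 97 as 8 bytes, shown as hex
print((97).to_bytes(8, "little").hex())  # -> 6100000000000000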
In Swift though, I couldn't assign an integer to Character or String type variable.
The Swift equivalent of char is CChar, a type alias for Int8:
var alpha: CChar = 97
Here is a solution which does not rely on the memory layout and works for non-ASCII characters as well:
let first: UnicodeScalar = "α"
let last: UnicodeScalar = "ω"
for v in first.value...last.value {
    if let c = UnicodeScalar(v) {
        print(c, terminator: "")
    }
}
print()
// αβγδεζηθικλμνξοπρςστυφχψω
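(A rough Python counterpart of that loop, for illustration only: iterate over Unicode code points rather than over a raw integer's bytes.)

for v in range(ord("α"), ord("ω") + 1):
    print(chr(v), end="")
print()
# αβγδεζηθικλμνξοπρςστυφχψω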
I need to convert the given text (not in file format) into binary values and store them in a single array that is to be given as input to another function in MATLAB.
Example:
Hi how are you ?
It is to be converted into binary and stored in an array. I have used the dec2bin() function but I did not succeed in getting the required output.
Sounds a bit like a trick question. In MATLAB, a character array (string) is just a different representation of 16-bit unsigned character codes.
>> str = 'Hi, how are you?'
str =
Hi, how are you?
>> whos str
  Name      Size            Bytes  Class    Attributes

  str       1x16               32  char
Note that the 16 characters occupy 32 bytes, or 2 bytes (16-bits) per character. From the documentation for char:
Valid codes range from 0 to 65535, where codes 0 through 127 correspond to 7-bit ASCII characters. The characters that MATLAB® can process (other than 7-bit ASCII characters) depend upon your current locale setting. To convert characters into a numeric array, use the double function.
Now, you could use double as it recommends to get the character codes into double arrays, but a minimal representation would simply involve uint16:
int16bStr = uint16(str)
To split this into bytes, typecast into 8-bit integers:
typecast(int16bStr,'uint8')
which yields 32 uint8 values (bytes), which are suitable for conversion to binary representation with dec2bin, if you want to see the binary (but these arrays are already binary data).
If you don't expect anything other than ASCII characters, just throw out the extra bits from the start:
>> int8bStr = uint8(str)

int8bStr =

   72  105   44   32  104  111  119   32   97  114  101   32  121  111  117   63

>> binStr = reshape(dec2bin(int8bStr.'),1,[])

binStr =

110011101110111001111111111111110000001001001011111011000000 <...snip...>
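The same idea can be sketched in Python (an illustration only, not MATLAB code): take each character code and format it as a fixed-width binary string, then concatenate the results.

text = 'Hi, how are you?'
bits = ''.join(format(code, '08b') for code in text.encode('ascii'))
print(bits[:16])  # 0100100001101001 -> 'H' (72) followed by 'i' (105)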
I have a task to generate a Base64 string from an HTTP response, which will be used by an external program. Before doing this, I knew that if you use Base64, the length of the string is extended, e.g. 57 * 4/3 = 76.
But I have no idea what needs to be handled if the HTTP response is longer than 57 bytes! So I did not do any special handling for the HTTP response, I just directly converted "$response->content" to what I want. (Actually the length of the response is more than 57 bytes.)
3. The unexpected thing is that the length of the encoded string does not exactly follow the 4/3 rule! When the length of the input string is 57, the length of the encoded string is 77. When the length of the input string is 114, the encoded string is 154. Why?
4. When I try to use the Base64 output from an external C# arg(), it seems it can only receive the first 57 bytes.
#Sample Code
use MIME::Base64;

my $cont = $response->content;
$cont = substr ($cont, 0, 57);
my $encode = encode_base64($cont);
printf("Length Before Decode = %d.\n",length($cont));
printf("Length After Decode = %d.\n",length($encode));
#
According to this note, encode_base64() adds a newline: by default it breaks the encoded output into lines of no more than 76 characters, each terminated by "\n". Are you counting those newlines in the encoded string length? They account for your extra characters: 57 input bytes give 76 Base64 characters plus 1 newline = 77, and 114 input bytes give 152 characters split over two lines plus 2 newlines = 154.
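For illustration, Python's MIME-style Base64 behaves the same way (a sketch of the behaviour only, not the Perl code itself): a newline is appended after every 76-character line of output, which is where the extra bytes come from.

import base64

print(len(base64.encodebytes(b'x' * 57)))   # 77  = 76 chars + 1 newline
print(len(base64.encodebytes(b'x' * 114)))  # 154 = 152 chars + 2 newlines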