Does Rust have a way to convert several bytes to a number? [duplicate]

This question already has answers here:
Converting number primitives (i32, f64, etc) to byte representations
And, conversely, to convert a number to a byte array?
I'd like to avoid using transmute, but maximum performance is what matters most.

A u32 being 4 bytes, you may be able to use std::mem::transmute to interpret a [u8; 4] as a u32; however:
beware of alignment
beware of endianness
A no-dependency solution is simply to perform the maths, following in Rob Pike's footsteps:
fn as_u32_be(array: &[u8; 4]) -> u32 {
    ((array[0] as u32) << 24) +
    ((array[1] as u32) << 16) +
    ((array[2] as u32) << 8) +
    ((array[3] as u32) << 0)
}

fn as_u32_le(array: &[u8; 4]) -> u32 {
    ((array[0] as u32) << 0) +
    ((array[1] as u32) << 8) +
    ((array[2] as u32) << 16) +
    ((array[3] as u32) << 24)
}
It compiles down to reasonably efficient code.
If dependencies are an option though, using the byteorder crate is just simpler.
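For illustration, a minimal sketch of both routes; byteorder's ByteOrder trait does the read, and since Rust 1.32 std's u32::from_be_bytes needs no dependency at all:
use byteorder::{BigEndian, ByteOrder};

fn main() {
    let bytes = [0x12u8, 0x34, 0x56, 0x78];
    // Via the byteorder crate (declared as a dependency in Cargo.toml).
    let via_crate = BigEndian::read_u32(&bytes);
    // Via std alone (Rust 1.32+).
    let via_std = u32::from_be_bytes(bytes);
    assert_eq!(via_crate, 0x1234_5678);
    assert_eq!(via_crate, via_std);
}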

There is T::from_str_radix to convert from a string (you can choose the base and T can be any integer type).
To convert an integer to a String you can use format!:
format!("{:x}", 42) == "2a"
format!("{:X}", 42) == "2A"
To reinterpret an integer as bytes, just use the byteorder crate.
Old answer, I don't advise this any more:
If you want to convert between u32 and [u8; 4] (for example) you can use transmute; that's what it is for.
Note also that Rust has to_be and to_le functions to deal with endianness:
unsafe { std::mem::transmute::<u32, [u8; 4]>(42u32.to_le()) } == [42, 0, 0, 0]
unsafe { std::mem::transmute::<u32, [u8; 4]>(42u32.to_be()) } == [0, 0, 0, 42]
unsafe { std::mem::transmute::<[u8; 4], u32>([0, 0, 0, 42]) }.to_le() == 0x2a000000
unsafe { std::mem::transmute::<[u8; 4], u32>([0, 0, 0, 42]) }.to_be() == 0x0000002a

Related

Bitwise and arithmetic operations in Swift

Honestly speaking, porting to Swift 3 (from Obj-C) is going hard. This is the easiest but most Swift-specific question.
public func readByte() -> UInt8
{
    // ...
}
public func readShortInteger() -> Int16
{
    return (self.readByte() << 8) + self.readByte();
}
I'm getting an error message from the compiler: "Binary operator + cannot be applied to two UInt8 operands."
What is wrong?
ps. What a shame ;)
readByte returns a UInt8 so:
You cannot shift a UInt8 left by 8 bits; you'd lose all its bits.
The type of the expression is UInt8 which cannot fit the Int16 value it is computing.
The type of the expression is UInt8 which is not the annotated return type Int16.
func readShortInteger() -> Int16
{
    let highByte = self.readByte()
    let lowByte = self.readByte()
    return Int16(highByte) << 8 | Int16(lowByte)
}
While Swift has a strictly left-to-right evaluation order of operands, I refactored the code to make it explicit which byte is read first and which is read second.
Also, an OR operator is more self-documenting and semantic.
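For comparison, here is the same widen-then-combine fix sketched in Rust; read_byte is a hypothetical stand-in for the original readByte():
// Widen each byte to the target width *before* shifting, then OR the halves.
fn read_short_integer(read_byte: &mut impl FnMut() -> u8) -> i16 {
    let high = read_byte(); // read order made explicit
    let low = read_byte();
    ((high as i16) << 8) | (low as i16)
}

fn main() {
    let data = [0x12u8, 0x34];
    let mut pos = 0;
    let mut next = || { let b = data[pos]; pos += 1; b };
    assert_eq!(read_short_integer(&mut next), 0x1234);
}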
Apple has some great Swift documentation on this, here:
https://developer.apple.com/library/content/documentation/Swift/Conceptual/Swift_Programming_Language/AdvancedOperators.html
let shiftBits: UInt8 = 4 // 00000100 in binary
shiftBits << 1 // 00001000
shiftBits << 2 // 00010000
shiftBits << 5 // 10000000
shiftBits << 6 // 00000000
shiftBits >> 2 // 00000001

Bitshifting and sign

Let me start with the problem:
def word(byte1 : Byte, byte2 : Byte, byte3 : Byte, byte4 : Byte) : Int = {
  ((byte4 << 0)) | ((byte3 << 8)) | ((byte2 << 16)) | ((byte1 << 24))
}
The goal here is pretty simple. Given 4 bytes, pack them into an Int.
The code above does not work because each Byte is sign-extended when it is promoted to Int for the shift. For example, this:
word(0xFA.toByte, 0xFB.toByte, 0xFC.toByte, 0xFD.toByte).formatted("%02X")
Produces FFFFFFFD when I would have expected FAFBFCFD.
Making the problem smaller:
0xFE.toByte << 8
Produces -512 (0xFFFFFE00 in two's complement), not 0xFE00.
How can I do a shift without the sign issues?
AND the bytes with 0xFF to undo the effects of sign extension before the shift:
((byte4 & 0xFF) << 0) | ((byte3 & 0xFF) << 8) | ...
Your suspicion is correct, and @user2357112 answers your question.
Alternatively, you can use ByteBuffer as a clean alternative:
import java.nio.ByteBuffer

def word(byte1 : Byte, byte2 : Byte, byte3 : Byte, byte4 : Byte) : Int =
  ByteBuffer.wrap(Array(byte1, byte2, byte3, byte4)).getInt
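For illustration, the same pitfall and both fixes sketched in Rust (an i8 widens to i32 with sign extension, just like Scala's Byte):
fn word(b1: i8, b2: i8, b3: i8, b4: i8) -> i32 {
    // Mask each sign-extended byte back to 0..=255 before shifting.
    ((b1 as i32 & 0xFF) << 24)
        | ((b2 as i32 & 0xFF) << 16)
        | ((b3 as i32 & 0xFF) << 8)
        | (b4 as i32 & 0xFF)
}

fn main() {
    let w = word(0xFAu8 as i8, 0xFBu8 as i8, 0xFCu8 as i8, 0xFDu8 as i8);
    assert_eq!(w as u32, 0xFAFBFCFD);
    // The ByteBuffer alternative maps to from_be_bytes in Rust:
    let v = i32::from_be_bytes([0xFA, 0xFB, 0xFC, 0xFD]);
    assert_eq!(v, w);
}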

Why does "UInt64(1 << 63)" crash?

println(UInt8(1 << 7)) // OK
println(UInt16(1 << 15)) // OK
println(UInt32(1 << 31)) // OK
println(UInt64(1 << 63)) // Crash
I would like to understand why this happens for UInt64 only. Thanks!
Edit:
To make matters more confusing, the following all work:
println(1 << UInt8(7))
println(1 << UInt16(15))
println(1 << UInt32(31))
println(1 << UInt64(63))
My guess is that an intermediate result produced by computing 1 << 63 is too large.
Try println(UInt64(1) << UInt64(63)).
The type inferrer didn't do its job well: it decided that 1 << 63 is a UInt32 and used this overload: func <<(lhs: UInt32, rhs: UInt32) -> UInt32
println(1 << UInt64(63)) works because the compiler knows that since UInt64(63) is a UInt64, then the integer literal 1 is inferred to be a UInt64, therefore the operation results in a UInt64 and is not out of bounds.
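Translated into Rust terms as a sketch, the cure is the same: give the literal an explicit wide type so the shift happens in 64 bits:
fn main() {
    // Both operands are u64, so the shift is performed in 64 bits.
    let x: u64 = 1u64 << 63;
    assert_eq!(x, 0x8000_0000_0000_0000);
    // `1i32 << 63` would shift in the 32-bit default type instead and
    // is rejected at compile time (the shift amount exceeds the width).
}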

Is there a better way to detect endianness in .NET than BitConverter.IsLittleEndian?

It would be nice if the .NET Framework's BitConverter class simply offered methods that explicitly returned an array of bytes in the requested endianness.
I've done some functions like this in other code, but is there a shorter more direct way? (efficiency is key since this concept is used a TON in various crypto and password derivation contexts, including PBKDF2, Skein, HMAC, BLAKE2, AES and others)
// convert an unsigned int into an array of bytes, BIG ENDIAN,
// per the spec section 5.2 step 3 for PBKDF2 RFC2898
static internal byte[] IntToBytes(uint i)
{
    byte[] bytes = BitConverter.GetBytes(i);
    if (!BitConverter.IsLittleEndian)
    {
        return bytes;
    }
    else
    {
        Array.Reverse(bytes);
        return bytes;
    }
}
I also see that others struggle with this question, and I haven't seen a good answer yet :( How to deal with 'Endianness'
The way I convert between integers and byte[] is by using bitshifts with fixed endianness. You don't need to worry about host endianness with such code. When you care that much about performance, you should avoid allocating a new array each time.
In my crypto library I use:
public static UInt32 LoadLittleEndian32(byte[] buf, int offset)
{
    return
        (UInt32)(buf[offset + 0])
        | (((UInt32)(buf[offset + 1])) << 8)
        | (((UInt32)(buf[offset + 2])) << 16)
        | (((UInt32)(buf[offset + 3])) << 24);
}

public static void StoreLittleEndian32(byte[] buf, int offset, UInt32 value)
{
    buf[offset + 0] = (byte)value;
    buf[offset + 1] = (byte)(value >> 8);
    buf[offset + 2] = (byte)(value >> 16);
    buf[offset + 3] = (byte)(value >> 24);
}

With big endian you just need to change the shift amounts or the offsets:

public static void StoreBigEndian32(byte[] buf, int offset, UInt32 value)
{
    buf[offset + 3] = (byte)value;
    buf[offset + 2] = (byte)(value >> 8);
    buf[offset + 1] = (byte)(value >> 16);
    buf[offset + 0] = (byte)(value >> 24);
}
If you're targeting .NET 4.5, it can be useful to mark these methods with [MethodImpl(MethodImplOptions.AggressiveInlining)].
Another performance tip for crypto is to avoid arrays as much as possible: load the data from the array at the beginning of the function, run everything on local variables, and only at the very end copy the results back to the array.
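As a sketch of the same idea in Rust, for comparison (std's u32::from_le_bytes/to_le_bytes express the identical operation):
fn load_little_endian32(buf: &[u8], offset: usize) -> u32 {
    (buf[offset] as u32)
        | ((buf[offset + 1] as u32) << 8)
        | ((buf[offset + 2] as u32) << 16)
        | ((buf[offset + 3] as u32) << 24)
}

fn store_little_endian32(buf: &mut [u8], offset: usize, value: u32) {
    buf[offset] = value as u8;
    buf[offset + 1] = (value >> 8) as u8;
    buf[offset + 2] = (value >> 16) as u8;
    buf[offset + 3] = (value >> 24) as u8;
}

fn main() {
    let mut buf = [0u8; 4];
    store_little_endian32(&mut buf, 0, 0x1234_5678);
    assert_eq!(buf, [0x78, 0x56, 0x34, 0x12]);
    assert_eq!(load_little_endian32(&buf, 0), 0x1234_5678);
}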

Three boolean values saved in one tinyint

Probably a simple question, but I seem to be suffering from programmer's block. :)
I have three boolean values: A, B, and C. I would like to save the state combination as an unsigned tinyint (max 255) into a database and be able to derive the states from the saved integer.
Even though there are only a limited number of combinations, I would like to avoid hard-coding each state combination to a specific value (something like if A=true and B=true has the value 1).
I tried assigning values to the variables (A=1, B=2, C=3) and then adding, but then I can't differentiate A and B being true from only C being true (both sum to 3).
I am stumped but pretty sure that it is possible.
Thanks
Binary maths, I think. Give each flag a position that's a power of 2 (1, 2, 4, 8, etc.); then you can use the 'bitwise and' operator & to determine each value.
Say A = 1, B = 2, C = 4
00000111 => A, B and C => 7
00000101 => A and C => 5
00000100 => C => 4
then to determine them:
if (val & 4) // same as if (C)
if (val & 2) // same as if (B)
if (val & 1) // same as if (A)
if ((val & 4) && (val & 2)) // same as if (C and B)
No need for a state table.
Edit: to reflect comment
If the tinyint has a maximum value of 255, you have 8 bits to play with and can store 8 boolean values in there.
binary math, as others have said
encoding:
myTinyInt = A*1 + B*2 + C*4 (assuming you convert A, B, C to 0 or 1 beforehand)
decoding:
bool A = (myTinyInt & 1) != 0 (& is the bitwise AND operator in many languages)
bool B = (myTinyInt & 2) != 0
bool C = (myTinyInt & 4) != 0
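A minimal sketch of that encode/decode in Rust, with the masks given names rather than magic numbers:
// Each flag occupies one bit: A = 1, B = 2, C = 4.
const A: u8 = 1 << 0;
const B: u8 = 1 << 1;
const C: u8 = 1 << 2;

fn encode(a: bool, b: bool, c: bool) -> u8 {
    (a as u8) * A + (b as u8) * B + (c as u8) * C
}

fn decode(v: u8) -> (bool, bool, bool) {
    ((v & A) != 0, (v & B) != 0, (v & C) != 0)
}

fn main() {
    let v = encode(true, false, true); // A and C set
    assert_eq!(v, 5);                  // 0b00000101
    assert_eq!(decode(v), (true, false, true));
}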
I'll add that you should find a way to avoid magic numbers. You can build the masks as constants using a left shift by the bit position of the flag of interest in the bit field. (Wow... that makes almost no sense.) An example in C++ would be:
#include <cstdint>

enum Flags {
    kBitMask_A = (1 << 0),
    kBitMask_B = (1 << 1),
    kBitMask_C = (1 << 2),
};

int main() {
    uint8_t byte = 0;        // byte = 0b00000000
    byte |= kBitMask_A;      // Set A, byte = 0b00000001
    byte |= kBitMask_C;      // Set C, byte = 0b00000101
    if (byte & kBitMask_A) { // Test A, (0b00000101 & 0b00000001) = T
        byte &= ~kBitMask_A; // Clear A, byte = 0b00000100
    }
}
In any case, I would recommend looking for Bitset support in your favorite programming language. Many languages will abstract the logical operations away behind normal arithmetic or "test/set" operations.
Need to use binary...
A = 1,
B = 2,
C = 4,
D = 8,
E = 16,
F = 32,
G = 64,
H = 128
This means A + B = 3 but C = 4. You'll never have two conflicting values. I've listed the maximum you can have for a single byte: 8 values, or bits.