How do I convert a string into a vector of bytes in Rust?

That might be the dumbest Rustlang question ever but I promise I tried my best to find the answer in the documentation or any other place on the web.
I can convert a string to a vector of bytes like this:
let bar = bytes!("some string");
Unfortunately I can't do it this way:
let foo = "some string";
let bar = bytes!(foo);
Because bytes! expects a string literal.
But then, how do I get my foo converted into a vector of bytes?

(&str).as_bytes gives you a view of a string as a &[u8] byte slice (it can be called on a String too, since String derefs to str), and String.into_bytes consumes a String to give you a Vec<u8>.
Use the .as_bytes version if you don't need ownership of the bytes.
fn main() {
let string = "foo";
println!("{:?}", string.as_bytes()); // prints [102, 111, 111]
}
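If you do need ownership of the bytes, a minimal sketch of the String::into_bytes route (same "foo" example as above):
fn main() {
    let string = String::from("foo");
    let bytes: Vec<u8> = string.into_bytes(); // consumes the String, reusing its buffer
    println!("{:?}", bytes); // also prints [102, 111, 111]
}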
BTW, the naming conventions for conversion functions are helpful in situations like these, because they tell you roughly what name to look for.

To expand on the answers above, here are a few different conversions between types.
&str to &[u8]:
let my_string: &str = "some string";
let my_bytes: &[u8] = my_string.as_bytes();
&str to Vec<u8>:
let my_string: &str = "some string";
let my_bytes: Vec<u8> = my_string.as_bytes().to_vec();
String to &[u8]:
let my_string: String = "some string".to_owned();
let my_bytes: &[u8] = my_string.as_bytes();
String to Vec<u8>:
let my_string: String = "some string".to_owned();
let my_bytes: Vec<u8> = my_string.into_bytes();
Specifying the variable type is optional in all cases; it is only added here to avoid confusion.
Playground link: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=5ad228e45a38b4f097bbbba49100ecfc

`let v1: Vec<u8> = string.into_bytes(); // takes ownership (consumes the String)`
`let v2: &[u8] = string.as_bytes(); // only borrows`
These two are alternatives and behave differently: some library APIs need ownership of the bytes, and if you pass them an as_bytes() slice you can get a compiler error along the lines of "must be 'static".
For example, tokio_uring::fs::File::write_at() takes ownership of its buffer, so give it owned bytes.
If borrowing is enough, use as_bytes().
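To see the difference in one place, here is a small sketch; the two functions are stand-ins made up for illustration (they mimic APIs like the write_at call above), not taken from any real crate:
fn takes_ownership(_buf: Vec<u8>) { /* e.g. an async write that must own its buffer */ }
fn borrows(_buf: &[u8]) { /* only needs to look at the bytes */ }

fn main() {
    let s = String::from("some string");
    borrows(s.as_bytes());           // fine: just a borrow, s is still usable here
    takes_ownership(s.into_bytes()); // consumes s, handing over an owned Vec<u8>
}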

Related

Inconsistent ranges of strings with greek letter

I'll show the code and the output, since it's easier to explain the issue.
Code and output in the commented lines:
let greekLetter = "β"
let string1 = greekLetter
/// string2 is the same as string1 but converted to NSString then back to String
let string2 = String(NSString(string: greekLetter))
print(string1.range(of: greekLetter)!)
/// prints: Index(_rawBits: 0)..<Index(_rawBits: 131072)
print(string2.range(of: greekLetter)!)
/// prints: Index(_rawBits: 0)..<Index(_rawBits: 65536)
The problem: A String that contains a greek letter returns a range that is different from the same String with the same greek letter that was converted to NSString and then back to String again.
Any ideas why?
Why this question is raised:
I'm doing some parsing and I need to find the range of a specific substring and then insert something else in its place. Because of the wrong ranges returned, strings get inserted at the wrong position, due to the wrong lower/upper bound locations.
UPDATE 2:
Let's say I have a task: in a given string "β-1" change "1" to "2". And this string comes from the server.
Please look at this code sample:
let wordWithGreekLetter = "β-1"
var string1 = wordWithGreekLetter
let data = """
{ "name" : "\(wordWithGreekLetter)" }
""".data(using: String.Encoding.utf8)
struct User: Decodable {
let name: String
}
let user = try! JSONDecoder().decode(User.self, from: data!)
/// string2 is the same as string1 but decoded from the data
var string2 = user.name
let rangeOfNumberOne1 = string1.range(of: "1")!
string1.removeSubrange(rangeOfNumberOne1)
string1.insert("2", at: rangeOfNumberOne1.lowerBound)
/// RESULT: string1 = "β-2"
let rangeOfNumberOne2 = string2.range(of: "1")!
string2.removeSubrange(rangeOfNumberOne2)
string2.insert("2", at: rangeOfNumberOne2.lowerBound)
/// RESULT: string2 = "β2-"
As Rob explained in Why is startIndex not equal to endIndex shifted to startIndex position in String in Swift?, the raw bits of the index are an implementation detail, and you should not care about that value.
The actual problem is that (quote from Collection):
Saved indices may become invalid as a result of mutating operations.
so rangeOfNumberOne1/2 may no longer be valid after you call removeSubrange() on the string.
In this particular case this may happen for string2 (which is bridged from an NSString) because removing a character may reorganize the internal storage. But this is pure speculation: all that matters is that the current code exhibits undefined behavior.
If you replace
let rangeOfNumberOne1 = string1.range(of: "1")!
string1.removeSubrange(rangeOfNumberOne1)
string1.insert("2", at: rangeOfNumberOne1.lowerBound)
by
let rangeOfNumberOne1 = string1.range(of: "1")!
string1.replaceSubrange(rangeOfNumberOne1, with: "2")
(and similarly for string2) then you'll get the same result "β-2" for both strings.
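Putting the fix together, a minimal sketch (import Foundation is assumed for range(of:) and the NSString round trip; the literal "β-1" stands in for the decoded server string):
import Foundation

var string1 = "β-1"
var string2 = String(NSString(string: "β-1")) // bridged copy, like the decoded string

// Replace in a single step, so no saved index outlives a mutation of the string.
string1.replaceSubrange(string1.range(of: "1")!, with: "2")
string2.replaceSubrange(string2.range(of: "1")!, with: "2")

print(string1) // β-2
print(string2) // β-2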

Swift, stringWithFormat, %s gives strange results

I searched for an answer the entire day but nothing really came close to answering my issue. I am trying to use stringWithFormat in Swift but while using printf format strings. The actual issue I have is with the %s. I can't seem to get to the original string no matter how I try this.
Any help would be much much appreciated (or workarounds).
Things I already did: tried all available encodings for the cString, and tried creating an ObjC function to use for this, but when I passed the arguments from Swift I ran into the same strange issue with the %s, even though when hardcoded in the ObjC function body it prints the actual correct String.
Please find below the sample code.
Many thanks!
var str = "Age %2$i, Name: %1$s"
let name = "Michael".cString(using: .utf8)!
let a = String.init(format: str, name, 1234)
The expected result is quite clear, I presume; however, I get something like this instead of the correct name:
"Age 1234, Name: ÿQ5"
Use withCString() to invoke a function with the C string
representation of a Swift string. Also note that %ld is the correct
format for a Swift Int (which can be a 32-bit or 64-bit integer).
let str = "Age %2$ld, Name: %1$s"
let name = "Michael"
let a = name.withCString { String(format: str, $0, 1234) }
print(a) // Age 1234, Name: Michael
Another possible option would be to create a (temporary) copy
of the C string representation
(using the fact that a Swift string is automatically converted to a C string when passed to a C function taking a const char * parameter,
as explained in String value to UnsafePointer<UInt8> function parameter behavior):
let str = "Age %2$ld, Name: %1$s"
let name = "Michael"
let nameCString = strdup(name)!
let a = String(format: str, nameCString, 1234)
print(a)
free(nameCString)
I assume that your code does not work as expected because name
(which has type [CChar] in your code) is bridged to an NSArray,
and then the address of that array is passed to the string
formatting method.
Use "%1$#" instead of "%1$s", and don't use the cString call.
This works for me:
var str = "Age %2$i, Name: %1$#"
let name = "Michael"
let a = String.init(format: str, name, 1234)

Swift: How to convert a String to UInt8 array?

How do you convert a String to UInt8 array?
var str = "test"
var ar : [UInt8]
ar = str
Lots of different ways, depending on how you want to handle non-ASCII characters.
But the simplest code would be to use the utf8 view:
let string = "hello"
let array: [UInt8] = Array(string.utf8)
Note, this will result in multi-byte characters being represented as multiple entries in the array, i.e.:
let string = "é"
print(Array(string.utf8))
prints out [195, 169]
There’s also .nulTerminatedUTF8, which does the same thing, but then adds a nul character to the end if your plan is to pass this somewhere as a C string (though if you’re doing that, you can probably also use .withCString or just use the implicit conversion for bridged C functions).
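For the C-string case mentioned just above, a small sketch (strlen is only a stand-in for any C function taking a const char * parameter):
import Foundation // for strlen (Darwin/Glibc)

let string = "hello"
// Implicit conversion: a Swift String can be passed directly where a C function
// expects a const char * parameter.
let len1 = strlen(string) // 5
// The explicit version via withCString:
let len2 = string.withCString { strlen($0) }
print(len1, len2) // 5 5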
let str = "test"
let byteArray = [UInt8](str.utf8)
Swift 4
func stringToUInt8Array() {
    let str: String = "Swift 4"
    let strToUInt8: [UInt8] = [UInt8](str.utf8)
    print(strToUInt8)
}
I came to this question looking for how to convert to an Int8 array. This is how I'm doing it, but surely there's a less loopy way:
Method on an extension of String:
public func int8Array() -> [Int8] {
    var retVal: [Int8] = []
    for thing in self.utf16 {
        retVal.append(Int8(thing))
    }
    return retVal
}
Note: storing a UTF-16 code unit (2 bytes) in an Int8 (1 byte) loses information, and the Int8(thing) conversion above will actually trap at runtime for any code unit greater than 127 (i.e. anything outside ASCII).
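A shorter alternative, if what you actually want are C-style char bytes: a sketch (the method name int8ArrayUTF8 is made up here) that walks the utf8 view and reinterprets each byte's bit pattern, so it never traps:
extension String {
    /// UTF-8 bytes reinterpreted as signed bytes (what C's char usually is).
    public func int8ArrayUTF8() -> [Int8] {
        return self.utf8.map { Int8(bitPattern: $0) }
    }
}

print("é".int8ArrayUTF8()) // prints [-61, -87]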

Convert a string to base64

I need a simple thing: encode a string in base64. I found an example:
extern crate serialize;
use serialize::base64::{mod, ToBase64};
use serialize::hex::FromHex;
fn main() {
let input = "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d";
let result = input.from_hex().unwrap().as_slice().to_base64(base64::STANDARD);
println!("{}", result);
}
Which seems to work, but I don't understand why input contains only characters in HEX. Moreover, this Python code produces a different result:
base64.b64encode(input) # =>
'NDkyNzZkMjA2YjY5NmM2YzY5NmU2NzIwNzk2Zjc1NzIyMDYyNzI2MTY5NmUyMDZjNjk2YjY1MjA2MTIwNzA2ZjY5NzM2ZjZlNmY3NTczMjA2ZDc1NzM2ODcyNmY2ZjZk'
So I decided to do the following:
//....
let input = "some string 123";
let result2 = input.unwrap().as_slice().to_base64(base64::STANDARD);
let result3 = input.as_slice().to_base64(base64::STANDARD);
And it didn't compile due to the errors:
error: type `&str` does not implement any method in scope named `unwrap`
test1.rs:9 let result2 = input.unwrap().as_slice().to_base64(base64::STANDARD);
^~~~~~~~
test1.rs:9:34: 9:44 error: multiple applicable methods in scope [E0034]
So how do I encode a simple string in base64?
If you don't have hex input, try this:
let result = input.as_bytes().to_base64(base64::STANDARD);
to_base64 is only defined for a slice of bytes so you have to first call as_bytes on the string:
extern crate serialize;
use serialize::base64::{mod, ToBase64};
fn main() {
let input = "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d";
let result = input.as_bytes().to_base64(base64::STANDARD);
println!("{}", result);
}
Input has the type &'static str:
let input = "some string 123";
There is no unwrap defined for &'static str:
let result2 = input.unwrap().as_slice().to_base64(base64::STANDARD);
You already have a slice (&str) but you need &[u8]:
let result3 = input.as_slice().to_base64(base64::STANDARD);

Interpolate String Loaded From File

I can't figure out how to load a string from a file and have variables referenced in that string be interpolated.
Let's say there is a text file at filePath that has these contents:
Hello there, \(name)!
I can load this file into a string with:
let string = String.stringWithContentsOfFile(filePath, encoding: NSUTF8StringEncoding, error: nil)!
In my class, I have loaded a name in: let name = "George"
I'd like this new string to interpolate the \(name) using my constant, so that its value is Hello there, George!. (In reality the text file is a much larger template with lots of strings that need to be swapped in.)
I see String has a convertFromStringInterpolation method but I can't figure out if that's the right way to do this. Does anyone have any ideas?
This cannot be done as you intend, because it goes against type safety at compile time (the compiler cannot check type safety on the variables that you are trying to refer to in the string file).
As a workaround, you can manually define a replacement table, as follows:
// Extend String to conform to the Printable protocol
extension String: Printable
{
public var description: String { return self }
}
var string = "Hello there, [firstName] [lastName]. You are [height]cm tall and [age] years old!"
let firstName = "John"
let lastName = "Appleseed"
let age = 33
let height = 1.74
let tokenTable: [String: Printable] = [
"[firstName]": firstName,
"[lastName]": lastName,
"[age]": age,
"[height]": height]
for (token, value) in tokenTable
{
string = string.stringByReplacingOccurrencesOfString(token, withString: value.description)
}
println(string)
// Prints: "Hello there, John Appleseed. You are 1.74cm tall and 33 years old!"
You can store entities of any type as the values of tokenTable, as long as they conform to the Printable protocol.
To automate things further, you could define the tokenTable constant in a separate Swift file, and auto-generate that file by using a separate script to extract the tokens from your string-containing file.
Note that this approach will probably be quite inefficient with very large string files (but not much more inefficient than reading the whole string into memory in the first place). If that is a problem, consider processing the string file in a buffered way.
There is no built in mechanism for doing this, you will have to create your own.
Here is an example of a VERY rudimentary version:
var values = [
"name": "George"
]
var textFromFile = "Hello there, <name>!"
var parts = split(textFromFile, {$0 == "<" || $0 == ">"}, maxSplit: 10, allowEmptySlices: true)
var output = ""
for index in 0 ..< parts.count {
    if index % 2 == 0 {
        // If it is even, it is not a variable
        output += parts[index]
    }
    else {
        // If it is odd, it is a variable so look it up
        if let value = values[parts[index]] {
            output += value
        }
        else {
            output += "NOT_FOUND"
        }
    }
}
println(output) // "Hello there, George!"
Depending on your use case, you will probably have to make this much more robust.