rust - Is there a good way to convert UTF-8 bytes in text (&str) to UTF-8 characters?

I have an array of strings (&str), each holding one UTF-8 byte written in hexadecimal. I want to convert this array into the UTF-8 characters those bytes encode.

For example, from this:

["21", "22", "23", "24"]: [&str]

to

!"#$

use std::convert::TryFrom;

fn main() {
    let input = ["21", "22", "23", "24"];

    let result: String = input
        .iter()
        .map(|s| u32::from_str_radix(s, 16).unwrap()) // hex string to u32
        .map(|u| char::try_from(u).unwrap()) // u32 to char (fails if it is not a valid Unicode scalar value)
        .collect();

    assert_eq!(result, "!\"#$");
}

You can then replace the unwrap calls with proper error handling if you want. I would do that with a custom iterator:

use std::convert::TryFrom;
use std::error::Error;

struct HexSliceToChars<'a> {
    slice: &'a [&'a str],
    index: usize,
}

impl<'a> HexSliceToChars<'a> {
    fn new(slice: &'a [&'a str]) -> Self {
        HexSliceToChars { slice, index: 0 }
    }
}

impl<'a> Iterator for HexSliceToChars<'a> {
    type Item = Result<char, Box<dyn Error>>;

    fn next(&mut self) -> Option<Self::Item> {
        self.slice.get(self.index).map(|s| {
            // Advance first, so a failed conversion is not yielded again on the next call.
            self.index += 1;

            let u = u32::from_str_radix(s, 16)?;
            let c = char::try_from(u)?;

            Ok(c)
        })
    }
}

fn main() {
    let input = ["21", "22", "23", "24"];
    let result: Result<String, _> = HexSliceToChars::new(&input).collect();
    // Error handling
    let result = result.unwrap();

    assert_eq!(result, "!\"#$");
}
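
One caveat with both versions above: char::try_from(u32) treats each value as a Unicode code point, not as a UTF-8 byte. For ASCII input such as the example in the question the two coincide, but for multi-byte sequences they differ. A minimal sketch of the difference, assuming an illustrative input "C3", "A9" (the UTF-8 encoding of é, not taken from the question) and two hypothetical helper names:

```rust
use std::convert::TryFrom;

// Hypothetical helper: treats every hex value as its own Unicode code point.
fn decode_as_code_points(input: &[&str]) -> String {
    input
        .iter()
        .map(|s| u32::from_str_radix(s, 16).unwrap())
        .map(|u| char::try_from(u).unwrap())
        .collect()
}

// Hypothetical helper: treats the hex values as bytes and decodes them as UTF-8.
fn decode_as_utf8_bytes(input: &[&str]) -> String {
    let bytes: Vec<u8> = input
        .iter()
        .map(|s| u8::from_str_radix(s, 16).unwrap())
        .collect();
    String::from_utf8(bytes).unwrap()
}

fn main() {
    // Illustrative input: C3 A9 is the two-byte UTF-8 encoding of U+00E9.
    let input = ["C3", "A9"];

    // Two characters, U+00C3 and U+00A9.
    assert_eq!(decode_as_code_points(&input), "\u{C3}\u{A9}");
    // One character, U+00E9.
    assert_eq!(decode_as_utf8_bytes(&input), "\u{E9}");
}
```

If the input can contain non-ASCII text, the byte-wise variant is probably the one that matches the question.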

I think you're looking for the function std::str::from_utf8. From the docs:

use std::str;

// some bytes, in a stack-allocated array
let sparkle_heart = [240, 159, 146, 150];

// We know these bytes are valid, so just use `unwrap()`.
let sparkle_heart = str::from_utf8(&sparkle_heart).unwrap();

assert_eq!("💖", sparkle_heart);
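
When the bytes are not known to be valid, from_utf8 returns an Err holding a Utf8Error instead. A small sketch of handling that case; the helper name describe and the sample bytes are illustrative and not part of the docs example:

```rust
use std::str;

// Hypothetical helper: reports either the decoded text or where decoding failed.
fn describe(bytes: &[u8]) -> String {
    match str::from_utf8(bytes) {
        Ok(text) => format!("valid: {}", text),
        // `valid_up_to` is the length in bytes of the longest valid prefix.
        Err(e) => format!("invalid after {} byte(s)", e.valid_up_to()),
    }
}

fn main() {
    assert_eq!(describe(&[0x21, 0x22]), "valid: !\"");
    // The byte 0xFF never appears in well-formed UTF-8.
    assert_eq!(describe(&[0x21, 0xFF]), "invalid after 1 byte(s)");
}
```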

For your example, you could change the input literal to numeric bytes:

fn main() {
    let input = [0x21, 0x22, 0x23, 0x24];
    let result = std::str::from_utf8(&input).unwrap();
    assert_eq!(result, "!\"#$");
}

or keep it stringly-typed and parse the hexadecimal strings into bytes first:

fn main() {
    let input = ["21", "22", "23", "24"];
    let parsed: Result<Vec<u8>, std::num::ParseIntError> = input
        .iter()
        .map(|s| u8::from_str_radix(s, 16))
        .collect();
    let parsed = parsed.expect("Couldn't parse input as hexadecimal");
    let result = String::from_utf8(parsed).expect("Input contains invalid UTF-8");
    assert_eq!(result, "!\"#$");
}
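
To propagate both kinds of failure to the caller instead of panicking with expect, the same steps can be wrapped in a function. This is only a sketch: the function name hex_to_string is hypothetical, and boxing the error is one choice among several.

```rust
use std::error::Error;

// Hypothetical helper: parses each item as a hex byte, then decodes the bytes as UTF-8.
fn hex_to_string(input: &[&str]) -> Result<String, Box<dyn Error>> {
    let bytes = input
        .iter()
        .map(|s| u8::from_str_radix(s, 16))
        .collect::<Result<Vec<u8>, _>>()?; // stops at the first ParseIntError
    Ok(String::from_utf8(bytes)?) // FromUtf8Error if the bytes are not valid UTF-8
}

fn main() {
    assert_eq!(hex_to_string(&["21", "22", "23", "24"]).unwrap(), "!\"#$");
    assert!(hex_to_string(&["zz"]).is_err()); // not hexadecimal
    assert!(hex_to_string(&["FF"]).is_err()); // not valid UTF-8
}
```

Box<dyn Error> keeps the signature short; a dedicated error enum would let callers tell the two failure cases apart.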