r0bin.xyz


Reading Config Files with Rust

Oct 14, 2024 10 min tech rust coding crates +2 confee lilap

While working on lilap, I thought it’d be fun to give publishing a Rust crate a shot. The crate doesn’t do anything complicated or flashy; it’s a simple configuration file parser that keeps key-value pairs in a HashMap. My goal wasn’t to do something that had never been done before, but to share something I found useful. I also focused on making the API feel nice to use and ensuring that error handling was solid. In this post, I’ll take you through confee, the crate I published. We’ll go over design decisions and dive into the implementation for them juicy details.

Why did I create confee? Well, I needed a simple way to store key-value pair defaults that users could easily overwrite with a configuration file. I also wanted smooth and effortless type conversions, in order to use values more effectively in applications. I noticed that Rust doesn’t have any of this functionality built-in, and most of the crates I found on crates.io, at least in the first couple of pages, didn’t quite fit the bill. Some had too many features or too many dependencies, while others parsed more complex configuration languages. So, I decided to make confee, and publish it for others to use.

confee is structured like most other library crates out there:

/
├── Cargo.lock
├── Cargo.toml
├── README.md
├── examples/
│   ├── example.conf
│   └── main.rs
├── src/
│   ├── conf.rs
│   └── lib.rs
└── target/

I won’t spend time going over every little thing, only things that are relevant to confee, because the Rust docs are amazing!

Bam!

// ...
pub struct Conf {
    pairs: HashMap<String, String>,
    delim: Option<char>,
    conf_file_name: String,
    updated: bool,
    empty_string: String,
}
// ...

Did I scare you? Sorry.

At first, I wasn’t sure whether to make the pairs field private or public. Making it public would give users more flexibility and allow them to interact with the HashMap directly, but I started to question whether that was a good idea. For the use cases of confee, there really shouldn’t be a reason for users to access the underlying data structure directly. Once the configuration defaults set during initialization are updated with values from the configuration file, those values shouldn’t need to change again; think of them like command line arguments. So, what’s the point of exposing pairs? It just complicates things.

What about delim and conf_file_name? Enter with_delim() and with_file().

impl Conf {
    // ...
    pub fn with_delim(&mut self, delim: char) -> &mut Self {
        self.delim = Some(delim);
        self
    }
    // ...
    pub fn with_file(&mut self, conf_file_name: &str) -> &mut Self {
        self.conf_file_name = conf_file_name.to_string();
        self
    }
    // ...
}

Ok, do these look like setters to you? If they do, that’s because they are. I just like the look of this better:

    // ...
    conf.with_delim('=');
    conf.with_file(conf_file_name);
    // or
    conf.with_delim('=').and_file(conf_file_name);
    // because this is more ergonomic at the end:
    conf.with_file(conf_file_name).and_delim('=').update();
    // you might read this as:
    // update conf with file conf_file_name where delim is '='

Don’t worry, and_file() and and_delim() call with_file() and with_delim() respectively, they’re just eye candy.
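As a rough sketch of how such aliases can be wired up: the Settings struct below is a hypothetical stand-in reduced to the two fields being set, not confee's actual source. Each setter returns &mut Self so calls can be chained, and each and_* method simply forwards to its with_* counterpart.

```rust
// Hypothetical stand-in for Conf, reduced to the two fields being set.
#[derive(Debug, Default)]
struct Settings {
    delim: Option<char>,
    file_name: String,
}

impl Settings {
    // Each setter returns `&mut Self` so that calls can be chained.
    fn with_delim(&mut self, delim: char) -> &mut Self {
        self.delim = Some(delim);
        self
    }

    fn with_file(&mut self, file_name: &str) -> &mut Self {
        self.file_name = file_name.to_string();
        self
    }

    // The `and_*` aliases only forward to the `with_*` setters.
    fn and_delim(&mut self, delim: char) -> &mut Self {
        self.with_delim(delim)
    }

    fn and_file(&mut self, file_name: &str) -> &mut Self {
        self.with_file(file_name)
    }
}
```

Because every method hands back the same mutable reference, the order of the calls in a chain does not matter; only the last value set for each field sticks.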

The remaining fields, updated and empty_string, aren’t really relevant to this post. However, I have to mention that it’s a bit frustrating that the language rules don’t allow returning something like &"".to_string(), because the temporary String is dropped as soon as the function returns. Instead, you end up having to create a field called empty_string or something, just to return a reference to an empty string. I know it’s an issue with lifetimes, but man, it’s a bit annoying…
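For the curious, here is a minimal sketch of one possible alternative to the extra field. It is not what confee does, and the names EMPTY and value_or_empty are made up for illustration. It relies on String::new() being a const fn, so an empty String can live in a static and be borrowed for any lifetime.

```rust
use std::collections::HashMap;

// An empty String stored in a static lives for the whole program,
// so a reference to it can be returned from any function.
static EMPTY: String = String::new();

// Hypothetical helper: return the stored value, or an empty string if `key` is absent.
fn value_or_empty<'a>(pairs: &'a HashMap<String, String>, key: &str) -> &'a String {
    // Writing `pairs.get(key).unwrap_or(&"".to_string())` here is rejected by the
    // compiler: the temporary String would be dropped when the function returns,
    // leaving the returned reference dangling.
    pairs.get(key).unwrap_or(&EMPTY)
}
```

The trade-off is a module-level static instead of a per-instance field; both give the reference something to point at that outlives the function call.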

While designing the impl block for Conf, I wasn’t sure whether to use new() or something similar for initialization. Then, when I checked out what HashMap allows, it was like a light bulb went off in my head. I realized I could name the initialization function from(), since we’ll be creating Conf from the defaults that the user of the library provides.

impl Conf {

    pub fn from<const N: usize>(defaults: [(String, String); N]) -> Self {
        Self {
            pairs: HashMap::from(defaults),
            delim: None,
            conf_file_name: "".to_string(),
            empty_string: "".to_string(),
            updated: false,
        }
    }
    // ...
}

It’s starting to come together:

    // ...
    let mut conf = Conf::from([
        ("foo".to_string(), "bar".to_string()),
        ("yee".to_string(), "haw".to_string()),
    ]);
    match conf.with_file(conf_file_name).and_delim('=').update() {
        Ok(_)  => println!("Successfully updated configuration!"),
        Err(e) => panic!("Error updating configuration: {}", e),
    }
    // ...

Just take a look at how beautiful that looks! I can’t help but get emotional, so elegant, yet so simple.

Alright, now that we know how to create an instance of Conf and set some key fields to parse a configuration file, whether to overwrite some defaults or not, let’s dive into how the parsing actually works.

    // ...
    pub fn update(&mut self) -> Result<(), String> {
        let lines = self.read_lines()?;
        for line in lines {
            let i = line
                .find(self.delim.unwrap_or(DEFAULT_DELIM))
                .ok_or_else(|| format!("No delimiter found in line: {}", line))?;
            let key = line[..i].trim();
            let value = line[i + 1..].trim();
            self.pairs
                .entry(key.to_string())
                .and_modify(|v| *v = value.to_string());
        }
        self.updated = true;
        Ok(())
    }
    fn read_lines(&self) -> Result<Vec<String>, String> {
        let contents = fs::read_to_string(&self.conf_file_name).map_err(|e| e.to_string())?;
        Ok(contents.lines().map(String::from).collect())
    }
    // ...

Let’s walk through what’s happening here:

  1. Call read_lines() and store the lines read from the file in lines. The ? operator returns early if read_lines() fails, and otherwise assigns the resulting Vec<String> to lines.
    1. Read all of conf_file_name into a single string, contents. fs::read_to_string() returns a Result, so map_err() converts any I/O error into a String describing the failure, and ? passes it along.
    2. Calling lines() yields each line as a string slice, which we convert to a String with map(String::from) (a plain function can be passed anywhere a closure is expected), and collect() gathers the strings into a vector.
  2. Loop through each line in lines, since Vec is iterable.
  3. Try to find the first instance of delim in line, where delim is the default delimiter character if the user has not set one. If line does not contain delim, exit early with an error; if it does, store the index of the delim character in i.
  4. Extract the name of the property and its value into key and value respectively, and remove any whitespace at either end of the strings.
  5. Look up key in pairs, which is a HashMap<String, String>, but don’t insert key if it doesn’t exist. Overwrite the stored value with value only if key is already present.
  6. Set updated and return success.
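Steps 3 to 5 can also be sketched as two small free functions. This is an illustration under assumptions rather than confee's code: parse_line and apply_pair are hypothetical names, and the split uses str::split_once() in place of find() plus slicing, which avoids the manual i + 1 offset (that offset assumes the delimiter is a single byte).

```rust
use std::collections::HashMap;

/// Split one line at the first `delim` and trim both halves (steps 3 and 4).
fn parse_line(line: &str, delim: char) -> Result<(&str, &str), String> {
    let (key, value) = line
        .split_once(delim)
        .ok_or_else(|| format!("No delimiter found in line: {}", line))?;
    Ok((key.trim(), value.trim()))
}

/// Overwrite an existing default and ignore keys that were never declared (step 5).
fn apply_pair(pairs: &mut HashMap<String, String>, key: &str, value: &str) {
    pairs
        .entry(key.to_string())
        // No `or_insert` follows, so an unknown key leaves the map untouched.
        .and_modify(|v| *v = value.to_string());
}
```

With a defaults map containing only port, applying the line "port = 9090" replaces the default, while a line such as "colour = red" is parsed successfully but changes nothing.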

Now, we can parse our configuration files, knowing exactly what’s happening. Let’s take a look at one last thing before we can fully understand this example.

How do we index into the HashMap if we are not exposing pairs? This is where trait impl blocks come in handy:

impl Index<&str> for Conf {
    type Output = String;

    fn index(&self, key: &str) -> &Self::Output {
        self.pairs.get(key).unwrap_or(&self.empty_string)
    }
}

This is Rust’s equivalent of operator overloading in C++: implementing the Index trait lets us use square brackets on a Conf, which allows us to do this:

    // ...
    let mut conf = Conf::from([
        ("foo".to_string(), "bar".to_string()),
        ("yee".to_string(), "haw".to_string()),
    ]);
    println!("foo: {}", conf["foo"]);
    // ...

That syntax is important and can come in handy from time to time. But where things really get interesting is here:

impl Conf {
    // ...
    pub fn get<T: FromStr>(&self, key: &str) -> Option<T> {
        self.pairs.get(key).and_then(|v| v.parse::<T>().ok())
    }
}

This method lets us look up key in pairs and then tries to convert the String value to a type T, as long as T implements the FromStr trait. It returns None if the key is missing or the conversion fails. We can now be total nerds:

    // ...
    let mut conf = Conf::from([
        ("log".to_string(), "stdout".to_string()),
        ("dir".to_string(), "/var/www/html/".to_string()),
        ("addr".to_string(), "127.0.0.1".to_string()),
        ("port".to_string(), "8080".to_string()),
    ]);
    match conf.with_file(&args[1]).update() {
        Ok(_) => println!("Successfully updated configuration!"),
        Err(e) => panic!("Error updating configuration: {}", e),
    }
    let dir: PathBuf = conf.get("dir").unwrap();
    let addr: IpAddr = conf.get("addr").unwrap();
    let port: u16 = conf.get("port").unwrap();
    // ...

Since get() returns an Option, it’s a good idea to handle the None case rather than using unwrap(), which could cause a panic. This approach keeps the underlying data structure simple, and it allows values to be converted to any type the user needs, while still handling errors properly.
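Here is a small sketch of what handling the None case might look like. The free function get() mirrors the method shown earlier but takes the map directly so the example stands alone; port_or_default, require_addr, and the fallback port 8080 are assumptions made up for illustration.

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::str::FromStr;

/// Stand-in for `Conf::get()`: look up `key` and parse the value into `T`.
fn get<T: FromStr>(pairs: &HashMap<String, String>, key: &str) -> Option<T> {
    pairs.get(key).and_then(|v| v.parse::<T>().ok())
}

/// Fall back to a default port when the value is missing or not a valid u16.
fn port_or_default(pairs: &HashMap<String, String>) -> u16 {
    get(pairs, "port").unwrap_or(8080)
}

/// Turn a missing or malformed address into an error the caller can report.
fn require_addr(pairs: &HashMap<String, String>) -> Result<IpAddr, String> {
    get(pairs, "addr").ok_or_else(|| "addr is missing or not a valid IP address".to_string())
}
```

unwrap_or() suits values with a sensible fallback, while ok_or_else() suits values the program cannot run without, since it lets the caller decide how to report the problem instead of panicking.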

So that’s confee. I hope you had fun diving into it with me. I do plan on updating it in the future, especially as part of lilap.