While working on lilap, I thought it’d be fun to give publishing a Rust crate a shot. The crate doesn’t do anything complicated or flashy; it’s a simple configuration file parser that keeps key-value pairs in a HashMap. My goal wasn’t to do something that had never been done before, but to share something I found useful. I also focused on making the API feel nice to use and ensuring that error handling was solid. In this post, I’ll take you through confee, the crate I published. We’ll go over design decisions and dive into the implementation for them juicy details.
Why did I create confee? Well, I needed a simple way to store key-value pair defaults that users could easily overwrite with a configuration file. I also wanted smooth and effortless type conversions, in order to use values more effectively in applications. I noticed that Rust doesn’t have any of this functionality built-in, and most of the crates I found on crates.io, at least in the first couple of pages, didn’t quite fit the bill. Some had too many features or too many dependencies, while others parsed more complex configuration languages. So, I decided to make confee, and publish it for others to use.
confee is structured like most other library crates out there:
/
├── Cargo.lock
├── Cargo.toml
├── README.md
├── examples/
│ ├── example.conf
│ └── main.rs
├── src/
│ ├── conf.rs
│ └── lib.rs
└── target/
I won’t spend time going over every little thing, only things that are relevant to confee, because the Rust docs are amazing!
Bam!
// ...
pub struct Conf {
    pairs: HashMap<String, String>,
    delim: Option<char>,
    conf_file_name: String,
    updated: bool,
    empty_string: String,
}
// ...
Did I scare you? Sorry.
At first, I wasn’t sure whether to make the pairs field private or public. Making it public would give users more flexibility and allow them to interact with the HashMap directly, but I started to question whether that was a good idea. For the use cases of confee, there really shouldn’t be a reason for users to access the underlying data structure directly. Once the configuration defaults set during initialization are updated with values from the configuration file, those values shouldn’t need to change again; think of them like command line arguments. So, what’s the point of exposing pairs? It just complicates things.
What about delim and conf_file_name? Enter with_delim() and with_file().
impl Conf {
    // ...
    pub fn with_delim(&mut self, delim: char) -> &mut Self {
        self.delim = Some(delim);
        self
    }
    // ...
    pub fn with_file(&mut self, conf_file_name: &str) -> &mut Self {
        self.conf_file_name = conf_file_name.to_string();
        self
    }
    // ...
}
Ok, do these look like setters to you? If they do, that’s because they are. I just like the look of this better:
// ...
conf.with_delim('=');
conf.with_file(conf_file_name);
// or
conf.with_delim('=').and_file(conf_file_name);
// because this is more ergonomic at the end:
conf.with_file(conf_file_name).and_delim('=').update();
// you might read this as:
// update conf with file conf_file_name where delim is '='
Don’t worry, and_file() and and_delim() call with_file() and with_delim() respectively; they’re just eye candy.
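Here is a minimal sketch of how those aliases might delegate. The struct is a pared-down stand-in for Conf with only the two fields the setters touch, and the bodies of and_delim() and and_file() are my assumption based on the description above, not a copy of the crate’s source:

```rust
/// Pared-down stand-in for `Conf`, keeping only the fields the setters touch.
pub struct Conf {
    delim: Option<char>,
    conf_file_name: String,
}

impl Conf {
    pub fn with_delim(&mut self, delim: char) -> &mut Self {
        self.delim = Some(delim);
        self
    }

    pub fn with_file(&mut self, conf_file_name: &str) -> &mut Self {
        self.conf_file_name = conf_file_name.to_string();
        self
    }

    // Assumed bodies: each alias simply forwards to its `with_` counterpart,
    // so the chain keeps returning `&mut Self`.
    pub fn and_delim(&mut self, delim: char) -> &mut Self {
        self.with_delim(delim)
    }

    pub fn and_file(&mut self, conf_file_name: &str) -> &mut Self {
        self.with_file(conf_file_name)
    }
}
```

Because every method hands back &mut Self, the calls can be chained in any order, which is all the "eye candy" needs.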
The remaining fields, updated and empty_string, aren’t really relevant to this post. However, I have to mention that it’s a bit frustrating that the language rules don’t allow returning something like &"".to_string() from a function, since the temporary String is dropped before the reference can be used. Instead, you end up having to create a field called empty_string or something, just to return a reference to an empty string. I know it’s an issue with lifetimes, but man, it’s a bit annoying…
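For what it’s worth, here is one possible way around the extra field, sketched under the assumption that lookups are allowed to hand back a string slice (&str) instead of a &String. The type StrConf is hypothetical and is not how confee does it; the idea is that the literal "" has a 'static lifetime, so it can be returned by reference without being stored anywhere:

```rust
use std::collections::HashMap;
use std::ops::Index;

/// Hypothetical variant of `Conf` that needs no `empty_string` field.
pub struct StrConf {
    pairs: HashMap<String, String>,
}

impl Index<&str> for StrConf {
    // Assumption: indexing yields `str` rather than `String`.
    type Output = str;

    fn index(&self, key: &str) -> &Self::Output {
        // `""` is a `&'static str`, so it outlives any borrow of `self`.
        self.pairs.get(key).map(String::as_str).unwrap_or("")
    }
}
```

The trade-off is that callers get a &str back, which is usually fine for printing and comparing but is a different public type than &String.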
While designing the impl block for Conf, I wasn’t sure whether to use new() or something similar for initialization. Then, when I checked out what HashMap allows, it was like a light bulb went off in my head. I realized I could name the initialization function from(), since we’ll be creating Conf from the defaults that the user of the library provides.
impl Conf {
    pub fn from<const N: usize>(defaults: [(String, String); N]) -> Self {
        Self {
            pairs: HashMap::from(defaults),
            delim: None,
            conf_file_name: "".to_string(),
            empty_string: "".to_string(),
            updated: false,
        }
    }
    // ...
}
It’s starting to come together:
// ...
let mut conf = Conf::from([
    ("foo".to_string(), "bar".to_string()),
    ("yee".to_string(), "haw".to_string()),
]);
match conf.with_file(conf_file_name).and_delim('=').update() {
    Ok(_) => println!("Successfully updated configuration!"),
    Err(e) => panic!("Error updating configuration: {}", e),
}
// ...
Just take a look at how beautiful that looks! I can’t help but get emotional, so elegant, yet so simple.
Alright, now that we know how to create an instance of Conf and set some key fields to parse a configuration file, whether to overwrite some defaults or not, let’s dive into how the parsing actually works.
// ...
pub fn update(&mut self) -> Result<(), String> {
    let lines = self.read_lines()?;
    let delim = self.delim.unwrap_or(DEFAULT_DELIM);
    for line in lines {
        let i = line
            .find(delim)
            .ok_or_else(|| format!("No delimiter found in line: {}", line))?;
        let key = line[..i].trim();
        // Skip the delimiter by its UTF-8 width so a multi-byte delimiter cannot cause a panic.
        let value = line[i + delim.len_utf8()..].trim();
        self.pairs
            .entry(key.to_string())
            .and_modify(|v| *v = value.to_string());
    }
    self.updated = true;
    Ok(())
}

fn read_lines(&self) -> Result<Vec<String>, String> {
    let contents = fs::read_to_string(&self.conf_file_name).map_err(|e| e.to_string())?;
    Ok(contents.lines().map(String::from).collect())
}
// ...
Let’s walk through what’s happening here:
1. Call read_lines() and store all the lines read from the file in lines. The ? operator propagates any error that occurs, returning early on failure or assigning a Vec<String> to lines on success.
2. Inside read_lines(), read the file named by conf_file_name and store it as one string in contents. Because read_to_string() returns a Result, handle the error, passing along the reason for the failure.
3. lines() yields each line as a string slice, which we then convert to String with map(String::from) (remember, a function can be passed wherever a closure is expected), and place all the strings in a vector with collect().
4. Iterate over lines, since Vec is iterable.
5. Look for delim in line, where delim is the default delimiter character if it has not been set by the user. If line does not contain delim, exit early with an error; if it does, store the index of the delim character in i.
6. Slice line into key and value respectively, and remove any whitespace at either end of the strings.
7. Update pairs, which is a HashMap<String, String>, but don’t insert key if it doesn’t exist. The entry for key is overwritten with value only if key is already present.
8. Set updated and return success.
Now, we can parse our configuration files, knowing exactly what’s happening. Let’s take a look at one last thing before we can fully understand this example.
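To make the per-line steps easier to poke at in isolation, here is a rough sketch that pulls them out into two free functions working on a plain HashMap. The names split_pair and apply_lines are hypothetical, and I’ve swapped find() plus slicing for str::split_once(), which does the same "split at the first delimiter" job in one call:

```rust
use std::collections::HashMap;

/// Hypothetical helper: splits one line at the first `delim` and trims both halves.
fn split_pair(line: &str, delim: char) -> Result<(&str, &str), String> {
    let (key, value) = line
        .split_once(delim)
        .ok_or_else(|| format!("No delimiter found in line: {}", line))?;
    Ok((key.trim(), value.trim()))
}

/// Hypothetical helper: overwrites the values of existing keys only.
/// Keys that are not already in `pairs` are silently ignored.
fn apply_lines(
    pairs: &mut HashMap<String, String>,
    contents: &str,
    delim: char,
) -> Result<(), String> {
    for line in contents.lines() {
        let (key, value) = split_pair(line, delim)?;
        pairs
            .entry(key.to_string())
            .and_modify(|v| *v = value.to_string());
    }
    Ok(())
}
```

Feeding it "port = 9090" with a default for port replaces that default, a line with an unknown key changes nothing, and a line without the delimiter comes back as an Err, which mirrors the behavior walked through above.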
How do we index into the HashMap if we are not exposing pairs? This is where trait impl blocks come in handy:
impl Index<&str> for Conf {
    type Output = String;
    fn index(&self, key: &str) -> &Self::Output {
        self.pairs.get(key).unwrap_or(&self.empty_string)
    }
}
This is like operator overloading in C++, and allows us to do this:
// ...
let mut conf = Conf::from([
("foo".to_string(), "bar".to_string()),
("yee".to_string(), "haw".to_string()),
]);
println!("foo: {}", conf["foo"]);
// ...
That syntax is important and can come in handy from time to time. But where things really get interesting is here:
impl Conf {
    // ...
    pub fn get<T: FromStr>(&self, key: &str) -> Option<T> {
        self.pairs.get(key).and_then(|v| v.parse::<T>().ok())
    }
}
This method lets us look up key in pairs and then tries to convert the String value to a type T, as long as T implements the FromStr trait, returning None if the key is missing or the conversion fails. We can now be total nerds:
// ...
let mut conf = Conf::from([
    ("log".to_string(), "stdout".to_string()),
    ("dir".to_string(), "/var/www/html/".to_string()),
    ("addr".to_string(), "127.0.0.1".to_string()),
    ("port".to_string(), "8080".to_string()),
]);
match conf.with_file(&args[1]).update() {
    Ok(_) => println!("Successfully updated configuration!"),
    Err(e) => panic!("Error updating configuration: {}", e),
}
let dir: PathBuf = conf.get("dir").unwrap();
let addr: IpAddr = conf.get("addr").unwrap();
let port: u16 = conf.get("port").unwrap();
// ...
Since get() returns an Option, it’s a good idea to handle the None case rather than using unwrap(), which could cause a panic. This approach keeps the underlying data structure simple, and it allows values to be converted to any type the user needs, while still handling errors properly.
So that’s confee. I hope you had fun diving into it with me. I do plan on updating it in the future, especially as part of lilap.