Rust Lifetimes and Iterators

I’ve recently learned a new piece of Rust syntax related to specifying lifetimes with types that don’t have an explicit lifetime defined.

Confused already? So was I. The TL;DR is just below.

To save time for anyone who might already know about this “trick”, here’s the resulting function definition:

pub fn events(&self) -> impl Iterator<Item = Event> + '_;

The rest of this article is about explaining the + '_ part, why it’s needed and what it solves. Also, VSCode + RLS currently does not highlight this properly, which tells me this is a pretty niche feature.

Before I begin explaining what the issue was and how it’s solved, I think it’s best to show the overall problem I was trying to “code my way around”.

The Problem Overview

I’ve been trying to refactor the code in Texel ASCII Art Editor so that I can abstract away the dependency on Termion. The main goal was to add support for Crossterm and thus enable Windows compilation. Termion and Crossterm are both “terminal libraries” that let you read input (keys etc.) and write text and commands to the terminal.

I already abstracted the drawing parts but still had to handle the input events loop which looked something like this:

// adds termion event parsing to Stdin
use termion::input::TermRead;

// construct event mapper from a character map stored on disk
let input_map = InputMap::from(config.char_map);

// c contains termion specific Result<termion::event::Event, std::io::Error>
for c in stdin().events() {
    // maps termion events to internal Texel events (e.g. 'x' to 'SaveAndQuit')
    let mapped = input_map.map_input(c.unwrap());
    // do stuff with internal event
}

I needed to abstract away this whole loop and mapping. I thought that if I did it all in one go and refactored the existing input_map code into something like an InputSource generic struct, I could easily swap implementations around.

The idea was to have a loop that looked like this:

// construct input source + mapper from a character map stored on disk
let input_source = InputSource::from(config.char_map);

for mapped in input_source.events() {
    // do stuff with internal event
}

The Code

The idea seemed straightforward at first. I’ve put the original InputMap code into the newly created InputSource such that:

// Event is the internal Texel Event I want, e.g. `SaveAndQuit`
type RawMap = HashMap<termion::event::Event, Event>;

pub struct InputSource {
    map: RawMap, // to be used by the new iterator for mapping TEvent -> Event
}

Now I needed to add an iterator that can consume the original termion::input::Events iterator that’s returned by the “expanded” Stdin::events() call.

This new iterator needs to unwrap each item, unwrap the Result inside it as well, and then perform the original InputMap::map_input call somehow. The definition is:

struct MappedIter<'a> {
    source: termion::input::Events<Stdin>, // termion iterator
    map: &'a RawMap, // reference to the hash map, lives at least as long as this iterator
}

impl MappedIter<'_> {
    fn map_input(&self, raw_event: termion::event::Event) -> Event {
        // original mapping code
    }
}

impl Iterator for MappedIter<'_> {
    type Item = Event;

    fn next(&mut self) -> Option<Event> {
        // unwrap termion's Result and map to internal Event
        match self.source.next() {
            None => None,
            Some(result) => Some(self.map_input(result.unwrap())),
        }
    }
}

So far so good. I have a working “iterator-mapper” constructed with a basic ownership model. InputSource gets constructed with the required mapping from the disk (config file) and stores it as a HashMap.

A note about the first use of '_: it means anonymous lifetime and in my mind translates roughly to “explicit lifetime, but elided”, so that we don’t need to name it and there’s no need to declare it at the impl level.

In both our implementation blocks here we don’t really use the lifetime for anything and so it can just be “elided explicitly” here.
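To make the “elided explicitly” idea concrete, here is a minimal sketch using only the standard library. Lookup is a hypothetical stand-in for MappedIter (it is not part of Texel); the two impl blocks are equivalent ways of writing the same header, one naming the lifetime and one using '_.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for MappedIter: a struct that only borrows a map.
struct Lookup<'a> {
    map: &'a HashMap<char, u32>,
}

// Long form: declare the lifetime on the impl, then use it on the type.
impl<'a> Lookup<'a> {
    fn get(&self, key: char) -> Option<u32> {
        self.map.get(&key).copied()
    }
}

// Short form: '_ says "there is a lifetime here, but this block never names it".
impl Lookup<'_> {
    fn contains(&self, key: char) -> bool {
        self.map.contains_key(&key)
    }
}

// Small helper that exercises a method from each impl block.
fn lookup_demo() -> (Option<u32>, bool) {
    let mut map = HashMap::new();
    map.insert('x', 1);
    let lookup = Lookup { map: &map };
    (lookup.get('x'), lookup.contains('y'))
}
```

Both forms compile to the same thing; the short form simply saves a declaration when nothing in the block needs to refer to the lifetime by name.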

The only remaining part is the InputSource::events method which should create a new mapping iterator each round. Here’s how that looked in my first iteration:

impl InputSource {
    pub fn events(&self) -> MappedIter<'_> {
        MappedIter {
            source: stdin().events(), // termion's iterator
            map: &self.map, // pass reference to the event map
        }
    }
}

The first thing to notice here is that we use the anonymous lifetime '_ again, but in the return type. This has a different meaning than in the implementation blocks before. Whereas before it meant we don’t need an explicit named lifetime since we don’t “use” it, here it means “single lifetime for output”. We’re basically saying that the return value borrows from self and thus must not outlive this InputSource.
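As a rough illustration of what the elided form expands to, the sketch below writes the same method twice, once with '_ and once with a named lifetime. Source and NameIter are hypothetical types invented for this example, built on the standard library only.

```rust
// Hypothetical owner of some data, playing the role of InputSource.
struct Source {
    names: Vec<String>,
}

// Hypothetical borrowing iterator, playing the role of MappedIter.
struct NameIter<'a> {
    inner: std::slice::Iter<'a, String>,
}

impl<'a> Iterator for NameIter<'a> {
    type Item = &'a str;

    fn next(&mut self) -> Option<&'a str> {
        self.inner.next().map(|s| s.as_str())
    }
}

impl Source {
    // Elided form: '_ in the return type is tied to the borrow of self.
    fn iter_elided(&self) -> NameIter<'_> {
        NameIter { inner: self.names.iter() }
    }

    // The same signature with the lifetime spelled out.
    fn iter_named<'a>(&'a self) -> NameIter<'a> {
        NameIter { inner: self.names.iter() }
    }
}
```

In both versions the iterator cannot be used after the Source it came from has been dropped; the elided form just leaves the name out.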

This code worked as-is but it has one major flaw. The MappedIter type is returned directly, which forces it to be public. MappedIter is, however, a termion-specific piece of code that I’d like to keep hidden. There’s no reason why the main program needs to “see” it either, since all I care about is that it’s an Iterator<Item = Event>.

I thought I could easily solve this by changing the definition to:

impl InputSource {
    pub fn events(&self) -> impl Iterator<Item = Event> {
        MappedIter {
            source: stdin().events(), // termion's iterator
            map: &self.map, // pass reference to the event map
        }
    }
}

This however produced:

this return type evaluates to the "'static" lifetime...

The Error

The error is a bit cryptic but I looked at my code again and got the basic gist of the issue.

It’s best visible by looking at the two versions of the function definition side-by-side:

pub fn events(&self) -> MappedIter<'_>; // works
pub fn events(&self) -> impl Iterator<Item = Event>; // doesn't

The basic typing is correct: MappedIter definitely implements Iterator<Item = Event>. The error also obviously points to a lifetime issue.

Having a second look made me realize that the real problem is that my code is “lossy” here. If MappedIter didn’t require an explicit lifetime definition this would “just work”.

But MappedIter<'a> does require a lifetime definition because it has a reference to a HashMap inside it. The ownership model is very basic: InputSource owns the HashMap, and MappedIter therefore has to live within its lifetime.
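For contrast, here is a sketch of the case that would “just work”: an iterator that owns everything it needs and so has no lifetime parameter. This is not what Texel does; OwningIter, events_owned and the char keys standing in for termion events are all assumptions made for the example.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Event {
    SaveAndQuit,
    Unknown,
}

pub struct InputSource {
    map: HashMap<char, Event>,
}

// Hypothetical variant of MappedIter that owns its map instead of borrowing it,
// so the struct needs no lifetime parameter.
struct OwningIter {
    source: std::vec::IntoIter<char>,
    map: HashMap<char, Event>,
}

impl Iterator for OwningIter {
    type Item = Event;

    fn next(&mut self) -> Option<Event> {
        let raw = self.source.next()?;
        Some(self.map.get(&raw).copied().unwrap_or(Event::Unknown))
    }
}

impl InputSource {
    pub fn new(map: HashMap<char, Event>) -> Self {
        InputSource { map }
    }

    // No lifetime bound needed: nothing in the returned value borrows from self.
    pub fn events_owned(&self, raw: Vec<char>) -> impl Iterator<Item = Event> {
        OwningIter {
            source: raw.into_iter(),
            map: self.map.clone(), // the price: one clone of the map per call
        }
    }
}
```

The trade-off is the clone on every call, which is exactly what borrowing the map was meant to avoid.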

The problem is that impl Iterator<Item = Event> does not specify a lifetime, and also has no lifetime in its own definition.

At this point I thought that using the impl keyword here was impossible. How can I specify the lifetime requirement to the return type if the return type itself is a foreign trait? It’s not like I can just add <'a> to Iterator.

The Solution

The solution, as mentioned at the start, is to add the odd-looking + '_ lifetime “addition” at the end. I actually went to an IRC channel and bothered a human about this, but had I read the output properly, I would have seen that the Rust compiler had already told me a few lines below the error:

help: you can add a constraint to the return type to make it last less than `'static` and match the anonymous lifetime #1 defined on the method body at 49:5
   |
49 |     pub fn events(&self) -> impl Iterator<Item = Event> + '_ {

Fixed code:

impl InputSource {
    pub fn events(&self) -> impl Iterator<Item = Event> + '_ {
        MappedIter {
            source: stdin().events(), // termion's iterator
            map: &self.map, // pass reference to the event map
        }
    }
}

And there we have it, right? Well, sure, it compiles, but... why does adding + '_ fix this and, more importantly, what does it mean?

Looking at the anonymous lifetime definition explains the core lifetime problem and to an extent the solution, but the examples shown there don’t do any kind of “summing”, so what’s going on here?

In order to understand why a plus sign is used here we need to look more at what the impl keyword means in the return type definition.

The documentation says it clearly: the syntax is fn name() -> impl Trait. This means that anything coming after the impl keyword in this context is a trait bound. The plus sign suddenly makes sense.

Traits can be “combined” with the plus sign and, more importantly for our use case, can also be combined with a lifetime bound. The whole expression becomes, in essence, a single combined bound on the returned type.
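A small sketch of that “summing”, assuming nothing beyond the standard library: the hypothetical function evens below returns a value that is an Iterator, is also Clone, and is additionally bounded by the lifetime of the borrowed slice.

```rust
// The return type lists three bounds joined with `+`:
// a trait, a second trait, and a lifetime.
fn evens(values: &[u32]) -> impl Iterator<Item = u32> + Clone + '_ {
    values.iter().copied().filter(|n| n % 2 == 0)
}

// Helper that uses both the Iterator and the Clone part of the bound.
fn evens_twice(values: &[u32]) -> (Vec<u32>, usize) {
    let iter = evens(values);
    let copy = iter.clone(); // allowed because of `+ Clone`
    (iter.collect(), copy.count())
}
```

The caller never learns the concrete type, only that it satisfies every item in the list.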

Adding + '_ just means we’re bounding our returned Iterator with an explicit lifetime that gets elided to mean “for the duration of the borrow of self”, thus fixing the “lossy” problem.
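Putting it all together, here is a self-contained sketch of the pattern that compiles without termion. It is an approximation of the Texel code, not a copy: plain char keys stand in for termion events, a keys string stands in for stdin, and the new constructor is an assumption added so the example can be driven from outside.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Event {
    SaveAndQuit,
    Unknown,
}

pub type RawMap = HashMap<char, Event>;

pub struct InputSource {
    keys: String, // hypothetical stand-in for stdin
    map: RawMap,
}

// Private on purpose: callers only ever see `impl Iterator<Item = Event>`.
struct MappedIter<'a> {
    source: std::str::Chars<'a>, // stand-in for termion's event iterator
    map: &'a RawMap,
}

impl Iterator for MappedIter<'_> {
    type Item = Event;

    fn next(&mut self) -> Option<Event> {
        let raw = self.source.next()?;
        Some(self.map.get(&raw).copied().unwrap_or(Event::Unknown))
    }
}

impl InputSource {
    pub fn new(keys: &str, map: RawMap) -> Self {
        InputSource { keys: keys.to_string(), map }
    }

    // `+ '_` states that the hidden type borrows from `self`.
    pub fn events(&self) -> impl Iterator<Item = Event> + '_ {
        MappedIter {
            source: self.keys.chars(),
            map: &self.map,
        }
    }
}
```

A caller can write for event in input_source.events() { ... } without MappedIter ever appearing in the public API.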

As an interesting side note, VSCode does not highlight this type of syntax properly at this time, which tells me this is a pretty niche syntax.