Rust's View of Memory

I talk about Rust a lot on here, and I think people might be interested in a post about Rust's memory model. Despite looking a lot like C++, Rust has somewhat different semantics, and I think that's a big part of why people bounce off it.

First, let's talk about C. It has manual memory management: I can allocate memory and I can free memory and when I do those things is up to me. I can have values on the stack, which might contain addresses of memory, hopefully memory I've allocated and initialized:

#include <stdlib.h>

int main(void) {
  char *x = malloc(10); // 10 bytes
  // . . . do stuff  . . .
  free(x);
}

If I pass a pointer to another function in C, nothing changes for the caller: it can still use the pointer afterwards, and it's still responsible for freeing the memory:

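#include <stdio.h>
#include <stdlib.h>

void f(char *x); // declared up front so main can call it

int main(void) {
  char *x = malloc(10); // 10 bytes
  x[0] = 'a'; // initialize it before f reads it
  f(x);
  free(x); // I still have to remember to free it
}

void f(char *x) {
  printf("%c", x[0]); // or something
}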

Rust introduces a concept of "lifetimes." Roughly speaking, the lifetime of a value is the scope in which it's usable, and values are automatically dropped (the equivalent of free) at the end of their lifetime:

fn main() {
  let x = vec![0u8; 10]; // a Vec of 10 bytes of zeroes, lifetime starts here
  // ...stuff
  // End of x's lifetime; implicitly dropped here
}

I can't use x before it's declared (obviously) and I can't use it after it's dropped; the scope and the lifetime are equivalent and the drop is inserted automatically.

Here's another example to make the scope thing clearer:

fn main() {
  let x = 0u8; // x's lifetime starts here
  {
    let y = 1u8; // y's starts here...
    println!("{}", y); // it's used here...
  } // ...and it ends here, the braces define a block scope, just like C
  // x's ends here
}
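If you want to see the implicit drops happen, here's a minimal sketch. Noisy and drop_order are names I made up for illustration; Noisy implements the standard Drop trait and writes its name into a shared log when it's dropped, so we can check the order afterwards:

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical helper type: records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

// Returns the order in which things happened.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _x = Noisy { name: "x", log: Rc::clone(&log) };
        {
            let _y = Noisy { name: "y", log: Rc::clone(&log) };
        } // _y's lifetime ends here, so it is dropped first
        log.borrow_mut().push("inner block done");
    } // _x's lifetime ends here
    let result = log.borrow().clone();
    result
}

fn main() {
    println!("{:?}", drop_order()); // ["y", "inner block done", "x"]
}
```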

Other things besides blocks can end a lifetime, though. The big one is passing a value as an argument. Every value has exactly one "owner," one scope that defines its lifetime and is responsible for dropping it at the end. When you pass a value to another function, that other function becomes the owner and drops it:

fn main() {
  let x = vec![0u8; 10]; // x's lifetime starts here
  foo(x); // and ends here!
  // No implicit drop here because foo has already taken ownership
}

// Takes a Vec<u8> which it owns
fn foo(x: Vec<u8>) {
  println!("{}", x[0]) // We use x here
  // and because we now own it, a drop is inserted here for it
}

This is called "move semantics" and is the core idea behind Rust's memory model: passing a value as a parameter ends its lifetime in the caller. It's illegal (as in, will not compile) to use a variable after it's been passed by value to another function:

fn main() {
  let x = vec![0u8; 10];
  foo(x);
  foo(x); // Won't compile; x was moved into foo by the first call
}

Using a value like this "consumes" it. Consider the C equivalent:

#include <stdio.h>
#include <stdlib.h>

void foo(char *x) {
  printf("%c", x[0]);
  free(x); // We own x now so we're responsible for freeing it
}

int main(void) {
  char *x = malloc(10);
  x[0] = 'a'; // initialize it before foo reads it
  foo(x);
  // using x here is (non-)obviously an error; foo freed it
}

foo is responsible for freeing whatever's passed into it, so passing something into it has to remove it from the caller's scope.
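With nothing but moves, the only way to keep using a value after a call is for the callee to hand ownership back in its return value. Here's a rough sketch of that; first_and_back is a hypothetical function I'm making up to show the shape:

```rust
// Hypothetical helper: takes ownership of the Vec, reads it, then hands it back.
fn first_and_back(v: Vec<u8>) -> (u8, Vec<u8>) {
    let first = v[0];
    (first, v) // v is moved out in the return value, so it is not dropped here
}

fn main() {
    let x = vec![7u8; 10];
    let (first, x) = first_and_back(x); // x moved in, then moved back out
    let (again, _x) = first_and_back(x); // legal: we got ownership back
    println!("{} {}", first, again); // 7 7
}
```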

This would of course be a colossal pain in the butt to program in, only being able to use anything once. So there are a few tools we get to make it usable. One of them is borrows:

fn main() {
  let x = vec![0u8; 10];
  foo(&x); // Pass a borrow of x to foo, this does not change ownership
  // we still own x so we have an implicit drop here
}

fn foo(x: &Vec<u8>) {
  // We can do things with x but are not responsible for dropping it
}
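Borrows come in two flavours: shared (&) for reading and mutable (&mut) for changing the value in place. A small sketch, with total and append_one as made-up helper names; neither takes ownership, so main can keep using x after every call:

```rust
// Hypothetical helper: a shared borrow lets us read the Vec.
fn total(v: &Vec<u8>) -> u32 {
    v.iter().map(|&b| u32::from(b)).sum()
}

// Hypothetical helper: a mutable borrow lets us modify the Vec in place.
fn append_one(v: &mut Vec<u8>) {
    v.push(1);
}

fn main() {
    let mut x = vec![0u8; 10];
    append_one(&mut x);
    append_one(&mut x); // x can be borrowed again; ownership never left main
    println!("{}", total(&x)); // 2
    println!("{}", x.len()); // 12
    // we still own x so it is dropped here
}
```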

And this obviously opens the door to shenanigans:

fn bar(borrows: &mut Vec<&Vec<u8>>) {
  let x = vec![0u8; 10]; // Allocate some memory...
  borrows.push(&x) // ...and store a borrow of it in a vec
  // x's lifetime ends here so we drop it
}

fn main() {
  let mut borrows: Vec<&Vec<u8>> = vec![]; // A vec of borrows-of-vecs
  bar(&mut borrows); // Pass (a borrow of) it to bar
  // Uh oh!
}

At the end of main, it owns borrows, which is a list of borrows-of-vecs, containing a borrow of bar's x, which has been dropped (at the end of bar because bar owned it). This is an error: we have a way to access a value that no longer exists! Luckily, this won't compile:

error[E0597]: `x` does not live long enough
 --> src/main.rs:3:16
  |
1 | fn bar(borrows: &mut Vec<&Vec<u8>>) {
  |                          - let's call the lifetime of this reference `'1`
2 |   let x = vec![0u8; 10]; // Allocate some memory...
  |       - binding `x` declared here
3 |   borrows.push(&x) // ...and store it in a vec
  |   -------------^^-
  |   |            |
  |   |            borrowed value does not live long enough
  |   argument requires that `x` is borrowed for `'1`
4 |   // x's lifetime ends here so we drop it
5 | }
  | - `x` dropped here while still borrowed

For more information about this error, try `rustc --explain E0597`.

This is Rust's famous "borrow checker," a static analysis pass in rustc that ensures no borrows are held longer than the lifetimes of the things they're borrowing.
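One way to get something like bar past the borrow checker is to store the Vec itself instead of a borrow of it, so ownership moves into the collection. This is just a sketch of one possible fix, and bar_owned is a name I invented for the variant:

```rust
// Hypothetical variant of bar: the outer Vec owns its elements.
fn bar_owned(owned: &mut Vec<Vec<u8>>) {
    let x = vec![0u8; 10]; // Allocate some memory...
    owned.push(x); // ...and move it into the vec
    // x was moved, so there is nothing left to drop here
}

fn main() {
    let mut owned: Vec<Vec<u8>> = vec![]; // A vec of vecs, no borrows involved
    bar_owned(&mut owned);
    println!("{}", owned[0].len()); // 10
    // owned (and the inner Vec it holds) is dropped here
}
```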

Another feature that helps us is Copy. Copy is a trait that simply means "if you memcpy this value, you get an identical, valid value back out." Lots of Rust builtin types implement Copy (think i32, bool, that sort of thing) and any struct can implement Copy if all its members do.

fn main() {
  let x = 10u32; // u32 implements Copy
  square(x); // Implicitly make a new, anonymous u32 and pass it in
  // The original x is still available here; the copy was consumed
}

fn square(x: u32) -> u32 {
  // We own this u32 and will drop it
  x * x
}

// This struct can be Copy because all its members are
// (Copy requires Clone, so we derive both)
#[derive(Clone, Copy)]
struct PointXY {
  x: i32,
  y: i32
}

There's a similar trait, Clone, that lets you define a method that creates a new value. It behaves similarly in that if you pass a clone to something the original value is unaffected, but because calling clone() is potentially slower than the compiler inserting a memcpy, you have to explicitly call clone() so performance problems aren't hidden in implicit code.
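To put Copy and Clone side by side, here's a small sketch. Path, sum_coords and count_points are names I made up for illustration; Path holds a Vec, so it can derive Clone but not Copy:

```rust
#[derive(Clone, Copy)]
struct PointXY {
    x: i32,
    y: i32,
}

// Hypothetical type: contains a Vec, so it can be Clone but not Copy.
#[derive(Clone)]
struct Path {
    points: Vec<PointXY>,
}

fn sum_coords(p: PointXY) -> i32 {
    p.x + p.y
}

fn count_points(path: Path) -> usize {
    path.points.len()
}

fn main() {
    let p = PointXY { x: 1, y: 2 };
    println!("{}", sum_coords(p)); // 3
    println!("{}", sum_coords(p)); // 3, p was copied implicitly both times

    let path = Path { points: vec![p, p] };
    println!("{}", count_points(path.clone())); // 2, explicit clone; path is untouched
    println!("{}", count_points(path)); // 2, this call moves path
}
```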

The Rust standard library also contains several utilities that help with memory. One of them is Arc, which gives you reference-counted pointers to things:

use std::sync::Arc;

fn main() {
  let s = Arc::new(vec![0u8; 1000000]); // A ref-counted pointer to some bytes
  // Cloning the Arc is very fast because it just increments the reference
  // count, it doesn't copy the actual Vec:
  baz(s.clone());
  baz(s);
}

fn baz(s: Arc<Vec<u8>>) {
  // We own (and will drop) the Arc, but dropping it just decrements
  // the counter. We haven't copied the Vec at all:
  println!("{}", s[0])
}
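You can watch the counter move with Arc::strong_count from the standard library. In this sketch I've changed baz to return the count it sees, which is my own tweak for illustration:

```rust
use std::sync::Arc;

// Variant of baz that reports how many Arcs currently point at the Vec.
fn baz(s: Arc<Vec<u8>>) -> usize {
    Arc::strong_count(&s)
    // s is dropped when baz returns, which decrements the count again
}

fn main() {
    let s = Arc::new(vec![0u8; 1_000_000]);
    println!("before: {}", Arc::strong_count(&s)); // before: 1
    let during = baz(Arc::clone(&s)); // the clone bumps the count
    println!("during: {}", during); // during: 2
    println!("after: {}", Arc::strong_count(&s)); // after: 1
}
```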

There are other tools like Cow and Box and so on, and other subtleties around thread safety, but this should give you a pretty good idea of how it works. By enforcing ownership of memory like this, Rust stops most memory leaks and, in safe code, all use-after-free errors and data races (though not race conditions in general). Which is not to say that it's impossible to write memory bugs in Rust code, but it gives you more assurances, without the overhead of a garbage collector.