KAIST-CS431: Lock Based API

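Concurrency APIs in the Standard Library#

The lock-based APIs in the Rust standard library are built around a few core primitives that offer different levels of concurrency control for different purposes. All of them live in the std::sync module. The most commonly used ones are: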

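std::sync::Mutex<T> (mutual exclusion lock)
- Purpose: The most common lock. It protects shared data by ensuring that only one thread can access the data at a time.
- Characteristics:
  - A thread that tries to acquire an already locked Mutex blocks until the lock is released.
  - Provides interior mutability (from a shared reference to the lock to a &mut T) through an RAII (Resource Acquisition Is Initialization) guard, MutexGuard. The lock is released automatically when the MutexGuard goes out of scope.
  - Is poisoning-aware: if a thread panics while holding the lock, the Mutex is marked as poisoned. Later attempts to acquire it return a PoisonError that still carries the MutexGuard (retrievable with into_inner()), so the caller can decide how to handle the possibly inconsistent data.
- When to use: When you need exclusive access to a shared resource, such as a global counter, a data structure, or configuration settings.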
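As a minimal sketch of the shared-counter case, the following wraps a counter in Arc<Mutex<_>> and increments it from several threads. The function name parallel_count and its parameters are made up for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `threads` workers that each add 1 to a shared counter
/// `per_thread` times, then returns the final value.
/// (Hypothetical helper, not part of any library.)
fn parallel_count(threads: usize, per_thread: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // lock() blocks until the mutex is free. The guard
                    // releases the lock when it goes out of scope.
                    let mut guard = counter.lock().unwrap();
                    *guard += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // Bind the value first so the guard is dropped before `counter`.
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("total = {}", parallel_count(4, 1000));
}
```

The unwrap() on lock() propagates a panic from another thread if the mutex was poisoned, which is the usual default.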
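std::sync::RwLock<T> (reader-writer lock)
- Purpose: Allows many readers to access the shared data at the same time, but only one writer with exclusive access.
- Characteristics:
  - Readers (read()): Multiple threads can hold the read lock in parallel as long as no writer holds the lock. The priority policy depends on the underlying platform, so a reader may still block while a writer is waiting.
  - Writers (write()): Only one thread can hold the write lock. While a writer holds the lock, all read and write requests block.
  - Also follows RAII, through RwLockReadGuard and RwLockWriteGuard.
  - Is poisoning-aware as well. An RwLock is poisoned only when a panic occurs while the write lock is held, not when a reader panics.
- When to use: When the data is read frequently but written rarely, RwLock can offer better concurrency than Mutex. Typical examples are a cache or a configuration object.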
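The read-mostly configuration case might look like the sketch below. The Config type and its get/set methods are hypothetical names chosen for this example.

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::thread;

/// A hypothetical read-mostly configuration store.
struct Config {
    values: RwLock<HashMap<String, String>>,
}

impl Config {
    fn new() -> Self {
        Config { values: RwLock::new(HashMap::new()) }
    }

    /// Takes the shared (read) lock; many readers may hold it at once.
    fn get(&self, key: &str) -> Option<String> {
        let map = self.values.read().unwrap();
        map.get(key).cloned()
    }

    /// Takes the exclusive (write) lock; blocks readers and other writers.
    fn set(&self, key: &str, value: &str) {
        let mut map = self.values.write().unwrap();
        map.insert(key.to_string(), value.to_string());
    }
}

fn main() {
    let config = Config::new();
    config.set("mode", "debug");
    // Several readers query the same entry concurrently.
    thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| assert_eq!(config.get("mode").as_deref(), Some("debug")));
        }
    });
}
```

Returning a cloned String keeps the read guard short-lived, at the cost of one allocation per lookup.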
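std::sync::Once (run exactly once)
- Purpose: Ensures that a piece of code (typically an initialization function) is executed only once during the lifetime of the program, even if several threads try to trigger it at the same time.
- Characteristics:
  - call_once() takes a closure. The first call actually runs the closure. Later calls wait for the first call to finish but do not run the closure again.
  - Commonly used for lazily initializing global data or for the singleton pattern.
- When to use: One-time global setup (for example a logging system or a configuration loader) or singletons. When the initialization also produces a value, higher-level wrappers are usually more convenient: the lazy_static crate, or std::sync::OnceLock (Rust 1.70+, see below).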
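A small sketch of call_once() under contention follows. The statics INIT and INIT_RUNS and both function names are invented for this example; the counter exists only to make the "runs once" property observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;
use std::thread;

static INIT: Once = Once::new();
// Counts how often the initializer body really ran (illustration only).
static INIT_RUNS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical setup routine that any thread may call.
fn ensure_initialized() {
    INIT.call_once(|| {
        INIT_RUNS.fetch_add(1, Ordering::SeqCst);
    });
}

/// Calls `ensure_initialized` from `threads` threads and reports
/// how many times the initializer body executed.
fn init_runs_after(threads: usize) -> usize {
    let handles: Vec<_> = (0..threads)
        .map(|_| thread::spawn(ensure_initialized))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    INIT_RUNS.load(Ordering::SeqCst)
}

fn main() {
    println!("initializer ran {} time(s)", init_runs_after(8));
}
```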
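std::sync::Barrier (barrier)
- Purpose: Synchronizes a group of threads so that none of them continues until all of them have reached a predefined point.
- Characteristics:
  - Threads wait by calling wait(). Once the number of threads calling wait() reaches the configured count, all waiting threads are released together.
  - wait() returns a BarrierWaitResult. Its is_leader() method returns true for exactly one, arbitrarily chosen, thread per round, which is useful for work that should be done only once.
- When to use: When several parallel tasks must be coordinated, for example starting the next phase only after the current one has finished everywhere, or aggregating results after all subtasks are done.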
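The phase pattern can be sketched as below: every thread records its phase-one work, waits at the barrier, and then checks that all phase-one work is visible. The function run_phases is a hypothetical helper.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Barrier};
use std::thread;

/// Runs `n` threads through one barrier. Returns the number of threads
/// that were reported as leader and whether every thread saw all of
/// phase one completed after the barrier.
fn run_phases(n: usize) -> (usize, bool) {
    let barrier = Arc::new(Barrier::new(n));
    let phase_one_done = Arc::new(AtomicUsize::new(0));
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let barrier = Arc::clone(&barrier);
            let done = Arc::clone(&phase_one_done);
            thread::spawn(move || {
                // Phase one: record that this thread finished its part.
                done.fetch_add(1, Ordering::SeqCst);
                // Block until all `n` threads have arrived.
                let result = barrier.wait();
                // Phase two: all phase-one increments happened already.
                let saw_all = done.load(Ordering::SeqCst) == n;
                (result.is_leader(), saw_all)
            })
        })
        .collect();

    let mut leaders = 0;
    let mut all_saw = true;
    for handle in handles {
        let (is_leader, saw_all) = handle.join().unwrap();
        if is_leader {
            leaders += 1;
        }
        all_saw &= saw_all;
    }
    (leaders, all_saw)
}

fn main() {
    let (leaders, all_saw) = run_phases(4);
    println!("leaders = {leaders}, all threads saw phase one = {all_saw}");
}
```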

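Newer lock-based APIs in the standard library:

Recent Rust releases added two more convenient APIs to std for common synchronization problems. OnceLock was stabilized in Rust 1.70 and LazyLock in Rust 1.80: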

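std::sync::OnceLock<T> (write-once cell)
- Purpose: Lazy, thread-safe, one-time initialization, especially for static and global variables. It can replace many uses of the lazy_static crate.
- Characteristics:
  - get_or_init(): If the OnceLock has not been initialized yet, it is initialized with the given closure and a reference to the stored value is returned. If it is already initialized, the reference is returned directly.
  - Thread safety: The stored value is set only once, and every thread observes the same result.
  - Safe lazy initialization without a hand-written Mutex or RwLock.
- When to use: Initializing global configuration, database connection pools, loggers and the like, especially when it is not known in advance when the resource will first be accessed.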
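A minimal sketch of get_or_init() on a static follows. The statics CONFIG and LOADS, the function config(), and the string "level=info" are placeholders; LOADS only counts how often the initializer ran.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

// Hypothetical global configuration string, filled in on first use.
static CONFIG: OnceLock<String> = OnceLock::new();
// Counts initializer runs (illustration only).
static LOADS: AtomicUsize = AtomicUsize::new(0);

/// Returns the global configuration, initializing it on the first call.
fn config() -> &'static str {
    CONFIG
        .get_or_init(|| {
            LOADS.fetch_add(1, Ordering::SeqCst);
            // Stand-in for reading a file or an environment variable.
            String::from("level=info")
        })
        .as_str()
}

fn main() {
    // Several threads race to be the first caller.
    thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| assert_eq!(config(), "level=info"));
        }
    });
    println!("loads = {}", LOADS.load(Ordering::SeqCst));
}
```

Unlike Once, the cell stores the produced value, so callers get a reference back instead of having to keep the result in a separate static.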
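std::sync::LazyLock<T> (lazily initialized value)
- Purpose: Similar to OnceLock, but the initializer is supplied when the value is declared and runs automatically on first access. The syntax is more concise, which suits static variables well. It is the standard library replacement for the lazy_static crate.
- Characteristics:
  - LazyLock is a struct that holds a value together with the closure that initializes it, and runs the closure the first time it is accessed through Deref.
  - Declaration: static MY_VALUE: LazyLock<MyType> = LazyLock::new(|| my_initializer());
  - Offers a smoother syntax than OnceLock because call sites do not need to pass the initializer.
- When to use: When you need a static value that is not initialized at program start but on first use, and that never changes for the rest of the program's lifetime.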
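The declaration form above can be sketched as a lazily built lookup table. This assumes a toolchain of Rust 1.80 or newer; the table STATUS_TEXT and the function status_text() are made up for the example.

```rust
use std::collections::HashMap;
use std::sync::LazyLock;

// Hypothetical lookup table, built on first access (requires Rust 1.80+).
static STATUS_TEXT: LazyLock<HashMap<u16, &'static str>> = LazyLock::new(|| {
    let mut table = HashMap::new();
    table.insert(200, "OK");
    table.insert(404, "Not Found");
    table
});

/// Looks up the text for a status code. The first call from any thread
/// triggers the initializer through Deref; later calls reuse the table.
fn status_text(code: u16) -> Option<&'static str> {
    STATUS_TEXT.get(&code).copied()
}

fn main() {
    println!("{:?}", status_text(200));
}
```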

Concurrency APIs in Third-Party Libraries#

parking_lot#

Amanieu/parking_lot: Compact and efficient synchronization primitives for Rust. Also provides an API for creating custom synchronization primitives. (github.com)

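An optimized third-party library of synchronization primitives. The main difference from the standard library is that the Mutex in parking_lot has no lock poisoning. Under normal circumstances it feels like this library is rarely needed.

For an explanation of lock poisoning, see the official blog post: Launching the Lock Poisoning Survey | Rust Blog (rust-lang.org)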

So what is poisoning anyway?#

Let’s say you have an Account that can update its balance:
impl Account {
    pub fn update_balance(&mut self, change: i32) {
        self.balance += change;
        self.changes.push(change);
    }
}
Let’s also say we have the invariant that balance == changes.sum(). We’ll call this the balance invariant. So at any point when interacting with an Account you can always depend on its balance being the sum of its changes, thanks to the balance invariant.

There’s a point in our update_balance method where the balance invariant isn’t maintained though:
impl Account {
    pub fn update_balance(&mut self, change: i32) {
        self.balance += change;
//      self.balance != self.changes.sum()
        self.changes.push(change);
    }
}
That seems ok, because we’re in the middle of a method with exclusive access to our Account and everything is back to good when we return. There isn’t a Result or ? to be seen so we know there’s no chance of an early return before the balance invariant is restored. Or so we think.

What if self.changes.push didn’t return normally? What if it panicked instead without actually doing anything? Then we’d return from update_balance early without restoring the balance invariant. That seems ok too, because a panic will start unwinding the thread it was called from, leaving no trace of any data it owned behind. Ignoring the Drop trait, no data means no broken invariants. Problem solved, right?

What if our Account wasn’t owned by that thread that panicked? What if it was shared with other threads as an Arc<Mutex<Account>>? Unwinding one thread isn’t going to protect other threads that could still access the Account, and they’re not going to know that it’s now invalid.

This is where poisoning comes in. The Mutex and RwLock types in the standard library use a strategy that makes panics (and by extension the possibility for broken invariants) observable. The next consumer of the lock, such as another thread that didn’t unwind, can decide at that point what to do about it. This is done by storing a switch in the lock itself that’s flipped when a panic causes a thread to unwind through its guard. Once that switch is flipped the lock is considered poisoned, and the next attempt to acquire it will receive an error instead of a guard.

The standard approach for dealing with a poisoned lock is to propagate the panic to the current thread by unwrapping the error it returns:
let mut guard = shared.lock().unwrap();
That way nobody can ever observe the possibly violated balance invariant on our shared Account.

That sounds great! So why would we want to remove it?

What’s wrong with lock poisoning?#

There’s nothing wrong with poisoning itself. It’s an excellent pattern for dealing with failures that can leave behind unworkable state. The question we’re really asking is whether it should be used by the standard locks, which are std::sync::Mutex and std::sync::RwLock. We’re asking whether it’s a standard lock’s job to implement poisoning. Just to avoid any confusion, we’ll distinguish the poisoning pattern from the API of the standard locks by calling the former poisoning and the latter lock poisoning. We’re just talking about lock poisoning.

In the previous section we motivated poisoning as a way to protect us from possibly broken invariants. Lock poisoning isn’t actually a tool for doing this in the way you might think. In general, a poisoned lock can’t tell whether or not any invariants are actually broken. It assumes that a lock is shared, so is likely going to outlive any individual thread that can access it. It also assumes that if a panic leaves any data behind then it’s more likely to be left in an unexpected state, because panics aren’t part of normal control flow in Rust. Everything could be fine after a panic, but the standard lock can’t guarantee it. Since there’s no guarantee there’s an escape hatch. We can always still get access to the state guarded by a poisoned lock:
let mut guard = shared.lock().unwrap_or_else(|err| err.into_inner());
All Rust code needs to remain free from any possible undefined behavior in the presence of panics, so ignoring panics is always safe. Rust doesn’t try to guarantee all safe code is free from logic bugs, so broken invariants that don’t potentially lead to undefined behavior aren’t strictly considered unsafe. Since ignoring lock poisoning is also always safe it doesn’t really give you a dependable tool to protect state from panics. You can always ignore it.
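To make the escape hatch concrete, here is a minimal sketch that poisons a Mutex<Account> and then recovers the guard with into_inner(). The Account fields follow the snippets above; the function poison_and_recover and the simulated panic are assumptions added for illustration.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Default)]
struct Account {
    balance: i32,
    changes: Vec<i32>,
}

/// Poisons a shared Account by panicking between the two update steps,
/// then recovers the data anyway. Returns
/// (was the lock poisoned, balance, sum of recorded changes).
fn poison_and_recover() -> (bool, i32, i32) {
    let shared = Arc::new(Mutex::new(Account::default()));

    let worker = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            let mut account = shared.lock().unwrap();
            account.balance += 10;
            // Simulated failure before `changes.push(10)` can run.
            // Unwinding through the guard poisons the mutex.
            panic!("simulated failure while holding the lock");
        })
    };
    // join() returns Err because the worker panicked; ignore it here.
    let _ = worker.join();

    let was_poisoned = shared.is_poisoned();
    // lock() now returns Err(PoisonError), but the guard is still inside.
    let account = shared.lock().unwrap_or_else(|err| err.into_inner());
    let recorded: i32 = account.changes.iter().sum();
    (was_poisoned, account.balance, recorded)
}

fn main() {
    let (poisoned, balance, recorded) = poison_and_recover();
    println!("poisoned = {poisoned}, balance = {balance}, recorded = {recorded}");
}
```

The recovered state has a balance that no longer matches the recorded changes, which is exactly the broken balance invariant that the poison flag hints at but cannot prevent anyone from reading.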

So lock poisoning doesn’t give you a tool for guaranteeing safety in the presence of panics. What it does give you is a way to propagate those panics to other threads. The machinery needed to do this adds costs to using the standard locks. There’s an ergonomic cost in having to call .lock().unwrap(), and a runtime cost in having to actually track state for panics.

With the standard locks you pay those costs whether you need to or not. That’s not typically how APIs in the standard library work. Instead, you compose costs together so you only pay for what you need.

crossbeam#

crossbeam-rs/crossbeam: Tools for concurrent programming in Rust (github.com)

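A toolkit for concurrent programming that provides many extensions beyond the standard library. The CS431 instructor is one of its maintainers.

Scope in crossbeam_utils::thread - Rust (docs.rs) confines the lifetime of spawned threads to a scope.
- The goal is to share non-'static variables safely;
- It works by introducing a scope and guaranteeing that every thread spawned in it terminates before the scope ends, so borrowed variables stay valid for as long as any thread uses them.
CachePadded in crossbeam_utils - Rust (docs.rs) pads a value to the size of a cache line to avoid false sharing.
crossbeam_channel - Rust (docs.rs) is a multi-producer multi-consumer channel library similar to channels in Go. It also provides the select! macro, which offers Go's select-style waiting on multiple channels.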
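The scoped-thread idea can be sketched with std::thread::scope, the standard library counterpart available since Rust 1.63. Note that this is not crossbeam's own API, whose signatures differ slightly; sum_halves is a hypothetical helper.

```rust
use std::thread;

/// Sums a borrowed slice with two scoped threads. The slice is not
/// 'static, which plain thread::spawn would reject.
fn sum_halves(data: &[i32]) -> i32 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        // Both threads borrow from `data`. The scope guarantees that
        // they have finished before `thread::scope` returns.
        let left_sum = s.spawn(|| left.iter().sum::<i32>());
        let right_sum = s.spawn(|| right.iter().sum::<i32>());
        left_sum.join().unwrap() + right_sum.join().unwrap()
    })
}

fn main() {
    let data = vec![1, 2, 3, 4, 5];
    println!("sum = {}", sum_halves(&data));
}
```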

False sharing#

Understanding False Sharing – Parallel Computing (wordpress.com)
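What CachePadded does can be approximated by hand with an alignment attribute. The sketch below is a stand-in, not crossbeam's implementation, and it assumes a 64-byte cache line, which varies between architectures. Padded and count_in_parallel are made-up names.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hand-rolled stand-in for a cache-padded value.
/// Assumption: cache lines are 64 bytes on the target machine.
#[repr(align(64))]
struct Padded(AtomicU64);

/// Two threads each hammer their own counter. Because every counter
/// sits on its own 64-byte boundary, updates to one do not share a
/// (64-byte) cache line with the other.
fn count_in_parallel(iterations: u64) -> (u64, u64) {
    let counters = [Padded(AtomicU64::new(0)), Padded(AtomicU64::new(0))];
    thread::scope(|s| {
        for counter in &counters {
            s.spawn(move || {
                for _ in 0..iterations {
                    counter.0.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    (
        counters[0].0.load(Ordering::Relaxed),
        counters[1].0.load(Ordering::Relaxed),
    )
}

fn main() {
    println!("{:?}", count_in_parallel(1_000_000));
    println!("size of Padded = {}", std::mem::size_of::<Padded>());
}
```

Without the attribute both AtomicU64 values would sit 8 bytes apart and most likely share one cache line, so each write by one thread would invalidate the line in the other core's cache even though the threads never touch the same counter.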

rayon#

rayon-rs/rayon: Rayon: A data parallelism library for Rust (github.com)

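Rayon is a data-parallelism library for Rust. It is very lightweight and makes it easy to convert a sequential computation into a parallel one, while guaranteeing freedom from data races.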

use rayon::prelude::*;
fn sum_of_squares(input: &[i32]) -> i32 {
    input.par_iter() // <-- just change that!
         .map(|&i| i * i)
         .sum()
}

cs431#

A lock-based API implemented by the course author, useful as a reference for learning how locks work.

cs431/spinlock.rs at main · kaist-cp/cs431 (github.com)
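For orientation before reading that file, here is a minimal test-and-set spin lock sketch. It is not the cs431 implementation: the SpinLock type, its closure-based with_lock method, and spin_count are all invented for this example.

```rust
use std::cell::UnsafeCell;
use std::hint;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Minimal spin lock sketch (not the cs431 code).
struct SpinLock<T> {
    locked: AtomicBool,
    value: UnsafeCell<T>,
}

// SAFETY: access to `value` is serialized by the `locked` flag.
unsafe impl<T: Send> Sync for SpinLock<T> {}

impl<T> SpinLock<T> {
    fn new(value: T) -> Self {
        SpinLock { locked: AtomicBool::new(false), value: UnsafeCell::new(value) }
    }

    /// Runs `f` with exclusive access to the protected value.
    /// Caveat: there is no guard type, so if `f` panics the lock
    /// stays held forever.
    fn with_lock<R>(&self, f: impl FnOnce(&mut T) -> R) -> R {
        // Acquire: spin until the flag flips from false to true.
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            hint::spin_loop();
        }
        // SAFETY: the flag is set, so no other thread is inside.
        let result = f(unsafe { &mut *self.value.get() });
        // Release: publish the writes and let the next thread in.
        self.locked.store(false, Ordering::Release);
        result
    }
}

/// Increments a spin-locked counter from several threads.
fn spin_count(threads: usize, per_thread: usize) -> usize {
    let lock = SpinLock::new(0usize);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    lock.with_lock(|v| *v += 1);
                }
            });
        }
    });
    lock.with_lock(|v| *v)
}

fn main() {
    println!("total = {}", spin_count(4, 1000));
}
```

The Acquire/Release pair is what makes writes inside one critical section visible to the next thread that takes the lock.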