# Rust Futures, Tasks, and Threads

Many operating systems have supplied threading-based concurrency models for decades now, and many programming languages have support for them as a result. However, they are not without their tradeoffs. On many operating systems, they use a fair bit of memory for each thread, and they come with some overhead for starting up and shutting down. Threads are also only an option when your operating system and hardware support them! Unlike mainstream desktop and mobile computers, some embedded systems do not have an OS at all, so they also do not have threads!

The async model provides a different—and ultimately complementary—set of tradeoffs. In the async model, concurrent operations do not require their own threads. Instead, they can run on tasks, as when we used trpl::spawn_task to kick off work from a synchronous function throughout the streams section. A task is a lot like a thread—but instead of being managed by the operating system, it is managed by library-level code: the runtime.
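
As a minimal sketch of what that looks like with the `trpl` crate used throughout these notes: `trpl::run` starts a runtime and blocks on a top-level future, and `trpl::spawn_task` hands a future to that runtime as a task.

```rust
use std::time::Duration;

fn main() {
    // `trpl::run` starts a runtime and blocks on the top-level future.
    trpl::run(async {
        // The runtime, not the operating system, manages this task.
        let handle = trpl::spawn_task(async {
            trpl::sleep(Duration::from_millis(10)).await;
            println!("hello from a runtime-managed task");
        });

        // Wait for the spawned task to finish before the runtime shuts down.
        handle.await.unwrap();
    });
}
```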

In the previous section, we saw that we could build a Stream by using an async channel and spawning an async task which we could call from synchronous code. We could do the exact same thing with a thread!

```rust
use std::{thread, time::Duration};

use trpl::{ReceiverStream, Stream};

fn get_intervals() -> impl Stream<Item = u32> {
    let (tx, rx) = trpl::channel();

    // This is *not* `trpl::spawn_task` but `std::thread::spawn`!
    thread::spawn(move || {
        let mut count = 0;
        loop {
            // Likewise, this is *not* `trpl::sleep` but `std::thread::sleep`!
            thread::sleep(Duration::from_millis(1));
            count += 1;

            if let Err(send_error) = tx.send(count) {
                eprintln!("Could not send interval {count}: {send_error}");
                break;
            };
        }
    });

    ReceiverStream::new(rx)
}
```
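
For comparison, this is essentially the async version from the previous section: the only changes are `trpl::spawn_task` in place of `thread::spawn` and `trpl::sleep` in place of `thread::sleep`.

```rust
use std::time::Duration;

use trpl::{ReceiverStream, Stream};

fn get_intervals() -> impl Stream<Item = u32> {
    let (tx, rx) = trpl::channel();

    // A runtime-managed task instead of an OS thread.
    trpl::spawn_task(async move {
        let mut count = 0;
        loop {
            // An async sleep yields to the runtime instead of blocking a thread.
            trpl::sleep(Duration::from_millis(1)).await;
            count += 1;

            if let Err(send_error) = tx.send(count) {
                eprintln!("Could not send interval {count}: {send_error}");
                break;
            };
        }
    });

    ReceiverStream::new(rx)
}
```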

However, there is a significant difference in how these two approaches behave, although we might have a hard time measuring it in this very simple example. We could spawn hundreds of thousands or even millions of async tasks on any modern personal computer. If we tried to do that with threads, we would literally run out of memory!
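
A rough sketch of that claim (not a benchmark; the exact limits depend on the runtime, the OS, and the machine): each spawned task is a small, runtime-managed allocation, whereas each OS thread needs its own stack.

```rust
fn main() {
    trpl::run(async {
        // Spawning a large number of tasks is cheap; doing the same with
        // `std::thread::spawn` would hit memory or OS limits far sooner.
        let handles: Vec<_> = (0..100_000u32)
            .map(|i| trpl::spawn_task(async move { i * 2 }))
            .collect();

        for handle in handles {
            handle.await.unwrap();
        }
        println!("all tasks finished");
    });
}
```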

However, there is a reason these APIs are so similar. Threads act as a boundary for sets of synchronous operations; concurrency is possible between threads. Tasks act as a boundary for sets of asynchronous operations; concurrency is possible both between and within tasks. In that regard, tasks are kind of like lightweight, runtime-managed threads with added capabilities that come from being managed by a runtime instead of by the operating system. Futures are an even more granular unit of concurrency, where each future may represent a tree of other futures. That is, the runtime—specifically, its executor—manages tasks, and tasks manage futures.
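
A small sketch of that "within a task" concurrency, using `trpl::join` (one of the helpers from the book's earlier async examples): both futures make progress inside a single task, with no extra task or thread spawned.

```rust
use std::time::Duration;

fn main() {
    trpl::run(async {
        let slow = async {
            trpl::sleep(Duration::from_millis(100)).await;
            "slow"
        };
        let fast = async {
            trpl::sleep(Duration::from_millis(10)).await;
            "fast"
        };

        // Both futures run concurrently inside this one task; the runtime's
        // executor polls them in turn until each completes.
        let (a, b) = trpl::join(slow, fast).await;
        println!("{a} and {b} both finished");
    });
}
```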

On the one hand, concurrency with threads is in some ways a simpler programming model than concurrency with async. Threads are somewhat “fire and forget”: they have no native equivalent to a future, so they simply run to completion without interruption except by the operating system itself. That is, they have no intra-task concurrency the way futures do. Threads in Rust also have no mechanisms for cancellation—a subject we have not covered in depth in this chapter, but which is implicit in the fact that whenever we ended a future, its state got cleaned up correctly.
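
To make the cancellation point concrete, here is a sketch using `trpl::race`, another of the book's helpers: when one future finishes, the other is simply dropped, which cancels it and cleans up its state. There is no comparable built-in way to stop a `std` thread from the outside.

```rust
use std::time::Duration;

fn main() {
    trpl::run(async {
        let long_running = async {
            trpl::sleep(Duration::from_secs(60)).await;
            println!("this line never prints");
        };
        let timeout = async {
            trpl::sleep(Duration::from_millis(50)).await;
        };

        // Whichever future finishes first "wins"; the loser is dropped,
        // which cancels it. Dropping a future is all cancellation takes.
        trpl::race(long_running, timeout).await;
        println!("moved on after the timeout");
    });
}
```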

Tasks, then, give additional control over futures, allowing you to choose where and how to group them. And it turns out that threads and tasks often work very well together, because tasks can (at least in some runtimes) be moved around between threads. We have not mentioned it up until now, but under the hood, the runtime we have been using, including the spawn_blocking and spawn_task functions, is multithreaded by default! Many runtimes use an approach called work stealing to transparently move tasks around between threads based on the current utilization of the threads, with the aim of improving the overall performance of the system. Building that kind of runtime actually requires threads and tasks, and therefore futures.
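
As a rough sketch of what "multithreaded by default" means, here is roughly the equivalent setup using tokio directly (the `trpl` crate re-exports from tokio and futures): the multi-thread runtime keeps a pool of worker threads and steals queued tasks from busy workers when others go idle. The explicit `worker_threads(4)` is only there to make the pool size visible in the example.

```rust
fn main() {
    // The default multi-thread runtime; `worker_threads` just makes the
    // thread pool size explicit for the sake of the example.
    let runtime = tokio::runtime::Builder::new_multi_thread()
        .worker_threads(4)
        .enable_all()
        .build()
        .unwrap();

    runtime.block_on(async {
        let handles: Vec<_> = (0..8)
            .map(|i| tokio::spawn(async move { i * i }))
            .collect();

        // Each task may have run on any of the four worker threads.
        for handle in handles {
            println!("{}", handle.await.unwrap());
        }
    });
}
```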

As a default way of thinking about which to use when:

- If the work is very parallelizable, like processing a bunch of data where each part can be processed separately, threads are a better choice.
- If the work is very concurrent, like handling messages from a bunch of different sources that may come in at different intervals or different rates, async is a better choice.
- If you need some mix of parallelism and concurrency, you do not have to choose between threads and async. You can use them together freely, letting each one serve the part it is best at, as in the sketch below.
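
For the mixed case, here is a sketch along the lines of the book's closing example: an OS thread produces values and blocks freely, while async code consumes them through a channel without tying up a thread per source.

```rust
use std::{thread, time::Duration};

fn main() {
    let (tx, mut rx) = trpl::channel();

    // An OS thread does the blocking, "parallel" work...
    thread::spawn(move || {
        for i in 1..11 {
            tx.send(i).unwrap();
            thread::sleep(Duration::from_secs(1));
        }
    });

    // ...while async code handles the concurrent side, receiving messages
    // as they arrive until the sender is dropped.
    trpl::run(async {
        while let Some(message) = rx.recv().await {
            println!("{message}");
        }
    });
}
```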
