mthreads - brown-cs1690/handout GitHub Wiki
In this project you will build on your solution for the uthreads
assignment to make it multiprocessor-safe. In uthreads, you
implemented a threading library with an N-to-1 (N uthreads multiplexed
on 1 "kernel" thread) threading model. In mthreads, you will be
implementing a library with an M-to-N (M uthreads multiplexed on N
"kernel" threads) threading model. You can think of pthreads
as kernel
threads. Having completed this assignment, your threading library will
support user-level threads running in parallel on many processors!
Another way of thinking about mthreads is that it's as if one were implementing threads in the kernel, in a system with multiple processors. Since we're doing all this as user code, we represent processors as POSIX threads (pthreads) and implement spin locks as POSIX mutexes. We represent interrupts using signals — masking interrupts then becomes masking signals.
Download the stencil here.
If this is your first time installing a Github Classroom assignment, or you need a refresher, please refer to the Github and Gradescope guide. This will walk you through how to get setup with Git and install an assignment.
Please be sure to run and compile your code in a Linux environment. Some options are working on the department machines (over SSH) or setting up a virtual environment using Vagrant/VirtualBox. Please see our Developing Locally Guide for more details.
This project is due on March 1st at 11:59 PM ET.
We know that debugging uthreads
has been a bit hectic for M1 Mac Users, due to issues with emulation (especially when using gdb
). For this assignment, you will be running the program natively (i.e., compile it to ARM assembly and run it as a native program), albeit still in the container.
To do so, we've implemented a new flag in the run-container
script. Before proceeding, please pull changes from the os-devcontainer
repository (using git pull
). Please also rebuild the container, using ./build-container
. Once you do so, you can run:
./run-container --no-cc
to disable cross-compliation tools (i.e., tools allowing you to compile x86-64 code on an ARM computer).
Once you do so, you can run mthreads
normally. To use gdb
, you do not need to pass-in any special parameter anymore. You can proceed by running gdb test
(or any other binary you'd like to run).
NB: If you want to run Weenix in the future, you must exit the container and run the run-container
script without the --no-cc
flag.
M-to-N, also known as Hybrid Threading, is a threading model in which user-level application threads are multiplexed on a pool of kernel-level threads. Such kernel-level threads can be referred to as "virtual processors", since these threads may or may not actually be running in parallel -- N kernel threads may themselves share a single CPU.
In uthreads, each user-level thread had a context that the kernel
thread, which was actually the main thread of the program, switched into
when the associated thread was scheduled to run. In mthreads, the
story is basically the same only that instead of having just one main
thread, we have a number of pthreads
, each of which represents a
kernel thread in an OS. These pthreads
are referred to in the stencil
as a lightweight-processes (LWP). LWPs execute one runnable uthread
at
a time.
M-to-N threading is considerably more complicated to implement than N-to-1 threading because, in addition to masking preemption in the right places, the model must synchronize kernel-thread access to data structures including, but not limited to, the run queue, user-level mutexes/condition variables, and user-level thread structures. In fact, this is most of the work you will be doing.
When implementing this assignment it is also important to keep in mind
that a user-level thread may be scheduled on a number of different kernel
threads during its lifetime. This means that in a
routine like uthread_switch
, we need to ensure that data that is
necessary before and after a user thread is scheduled is not optimized
away into a CPU register by the compiler.
Finally, you will find that there are several instances (documented in the stencil) where you will have to perform some operation on a user-thread member after the thread in question has switched off the processor. Reasons for doing this are described in the stencil, but you are strongly encouraged to read chapter 5 of the textbook, which discusses various multiprocessor issues like this. Particular egregious errors include two processors executing the same thread, and a currently-executing thread's stack being freed while some user-level thread is executing on that stack.
mthreads will make use of a lot of code that should already be
familiar to you from the uthreads assignment. The logic for creating
threads, joining and detaching threads, scheduling, and dealing with
mutexes and condition variables is the same, only now you will find that
the concerned structures have been augmented with a pthread_mutex_t
,
which should be used in the appropriate contexts to synchronize various
LWPs contending for access.
You only need to worry about modifying functions with TODOs. Each of
these also contains a call to our NOT_YET_IMPLEMENTED
macro. To see a
list of all the functions you have yet to modify/implement, run
make nyi
in the project directory.
The majority the work in the assignment will go into implementing
uthread_switch
and lwp_switch
, which is the routine where LWPs
execute independent of any uthread
context, waiting for a runnable
uthread
to become available. Here's a list of functions you'll need to
implement / modify while working on this assignment:
// uthread.c
uthread_exit(void *status);
uthread_join(uthread_id_t uid, void **return_value);
uthread_detach(uthread_id_t uid);
make_reapable(uthread_t *uth);
// uthread_sched.c
uthread_yield(void);
uthread_wake(uthread_t *uth);
uthread_setprio(uthread_id_t id, int prio);
uthread_startonrunq(uthread_id_t id, int prio);
uthread_switch(utqueue_t *q, int saveonrq, pthread_mutex_t *m);
lwp_switch(void);
lwp_park(void);
uthread_runq_enqueue(uthread *thr);
uthread_runq_requeue(uthread *thr, int oldprio);
// uthread_mtx.c
uthread_mtx_lock(uthread_mtx_t *mtx);
uthread_mtx_trylock(uthread_mtx_t *mtx);
uthread_mtx_unlock(uthread_mtx_t *mtx);
// uthread_cond.c
uthread_cond_wait(uthread_cond_t *cond, uthread_mtx_t *mtx);
uthread_cond_broadcast(uthread_cond_t *cond);
uthread_cond_signal(uthread_cond_t *cond);
It may help to implement uthread_switch
and lwp_switch
before
proceeding to the other functions even though they require more work
because a) these routines interact with each other in a somewhat
complicated way and b) they make the majority of the heavy-weight
scheduling decisions that involve switching between the context of a
uthread
and the context of an LWP. The understanding you gain from
implementing these two functions should make editing the remaining ones
much easier.
Many of the functions mentioned above (particularly lwp_switch
) have
extensive comments in the source code which explain what is expected of
you, but to save you some time, we will give you a brief summary of how
the system works as a whole.
The first thing that any executable that uses your threads package
should call is uthread_start
. This should be called exactly once and
is responsible for setting up global data structures such as the
uthreads
array (as it did in uthreads).
At the end of uthread_start
, create_first_thr
is called. This
function is not only responsible for setting up the main thread and the reaper
thread, and placing them both on the run queue. It is also responsible for
setting up all the LWPs, by calling uthread_start_lwp
.
uthread_start_lwp
spawns a pthread which runs lwp_start
. It is
important to note that, at this point, each of these new LWPs has
its own ut_curthr
and curlwp
variables. This can be seen from the
declarations near the top of uthread.c
:
// uthread.c
__thread uthread_t *ut_curthr = 0;
__thread lwp_t *curlwp;
The __thread
storage class keyword is an extension to GCC that can be
used alone or with the extern
or static
qualifiers for a global,
file-scoped static, or function-scoped static variable to ensure that
each thread is allocated its own private copy of that variable. When
used with extern
or static
, __thread
must appear as the last
storage class keyword. Addresses of such variables can be used by other
threads, though you shouldn't need to worry about that for this
assignment. This type of storage is known as thread-local storage (TLS). GCC implements it for
POSIX threads (which you use as your LWPs; not for your uthreads. Note that TLS is not async-signal safe. This means that if a signal handler can access a TLS item, you must make sure that
when threads access the item, signals are masked.
The implications of this are really important and somewhat subtle. As in uthreads, time slicing is implemented using the SIGVTALRM signal, whose signal handler calls uthread_yield
. Thus, effectively, the thread being yielded to is in the signal handler. In particular, it could happen (somewhat unlikely, but possible) that a user thread is interrupted while accessing a TLS variable, whose value is computed while the user thread is running on one LWP, but the value is used when it's running on another. This can cause problems that are very difficult to identify and debug. To avoid this, make sure that VTALRM
signals are masked while a TLS variable is being accessed and used.
Each LWP can refer to its own context in
addition to the context of the uthread
that it is currently running.
Thus we can expect that ut_curthr
(the pointer to the user-thread information of the thread an LWP is currently running) will change frequently throughout
the LWP's lifetime (in lwp_switch
) as it switches among various
user-level threads, whereas curlwp
will remain constant for each LWP. so that
various uthreads
can switch back into the context of their invoking
LWPs if they need to block or yield. A
uthread
running within uthread_switch
uses the thread-local curlwp
to identify the context information that makes it possible for it to
jump back into lwp_switch
, whereas an LWP uses the ut_curthr->ut_ctx
to identify the uthread context it is to switch into.
By the end of uthead_start
all the LWPs are ready to begin multiplexing
uthread
s. The remainder of the assignment consists of ensuring that
the uthread
API (that you implemented in the previous assignment) is
thread-safe and multi-processor safe. The mthreads library still
supports user-level preemption, however your TAs have provided you with
a solution to uthreads that deals with this appropriately in most of
the uthreads API you will be working with.
In this assignment there will be multiple threads running at a time,
and on multiple processors. Thus, you must make sure you guard
data structures appropriately. Be sure to review uthread.h
, which
contains definitions for uthread_t
and lwp_t
, as well as other
struct definitions, for hints as to which data structure accesses need to
be synchronized. Importantly, you cannot make the assumption that
each uthread
is scheduled on exactly one LWP at a time. This is
something that you must ensure with proper synchronization!
The stencil as provided will not work because uthreads aren't synchronized, so running the stencil as given will not give you any information. You may, however, assume that any functions that do not have TODOs will work properly once you make the required functions (described in the Assignment section) thread-safe and multiprocessor-safe. You will not be held accountable for potential problems with the stencil, but you must modify your API to work with the stencil. If you write an alternate implementation that doesn't work with the given stencil, you will receive no credit.
Understanding the concepts from uthreads will be very important in mthreads. You may want to reread the uthreads assignment, including the sections Swapping Contexts, Time Slicing and Preemption, Dangers of Preemption, and the Reaper.
Conceptually, the reaper in mthreads serves the same purpose it did in
uthreads
. However, because the reaper can now run in parallel with
other uthread
s, we must be sure that it doesn't clean up resources
associated with a thread that has just called uthread_exit
before that
thread has switched away from its context and is no longer calling
functions on that context's stack. For this reason, before the reaper
calls uthread_destroy
, it must first attempt to lock the exited
thread's ut_pmut mutex to be sure that it is no longer using its
context.
Currently, it is not possible to take any (already compiled) program and have it use mthreads instead of pthreads. In order to use mthreads, you will need to add all of mthreads's files to its project, modify it to use the mthreads API functions and recompile it.
As always, use of gdb
will make your life much easier when trying to
get this assignment up and running. However, gdb
can sometimes get
confused in multithreaded programs, and may have trouble printing stack
traces. If that happens, don't worry; it doesn't mean your code is
broken. Also, since uthreads is built as a library, gdb
won't find
the symbols in it right away, so tell it to wait for the "future shared
library to load". See the Debugging Guide for a list of commands to use in gdb when debugging multithreaded programs.
We can't stress enough how important test code is in an assignment like
this. Without proper test code, finding bugs will be next to impossible.
Make sure to test all sorts of situations with lots of threads at
different priority levels. The Makefile
included with the assignment
will compile a simple test program which uses the uthreads functions,
just to get you started (run ./test
from the directory your uthreads
library is in). As a note, the threads may not print 'hello' in exact
order. If it runs and exits cleanly, most of your basic functionality is
working, but be sure to test more complicated cases. There are also a
series of preemption tests which you can run with ./preemptive_test <test>
where <test> is a test number, 0 through 6.
Judicious use of assert()
will help you both understand your threads
package and debug it. This is your first real systems-level coding
project, and it is highly recommended that you assert a general sane
state of the system whenever you enter a function. Thinking about what a
"sane state" means should lead to a greater understanding of what is
happening at any given time and what could go wrong.
To hand in your project please refer to the Github and Gradescope guide.
Remember to include a README that briefly describes any important design decisions you made and mentions known bugs!
You can use the cs1670_reformat
script (./cs1670_reformat *.c *.h
) to improve the format of your code, but you will not be graded on whether or not you have run the script.