Project 2: Userprograms

For project 2, we had to implement processes and system calls. This page discusses how they were implemented and what exactly project 2 entails. After you read this, you should have a good idea of what project 2 was about.

I'm assuming you've read the Pintos project 2 documentation from the Stanford website. If not, read it, or some things on this page won't make sense to you. You don't have to understand all of the Stanford documentation; just read through it at least once. If you find it too hard to understand, you can fortunately also find PowerPoint slides online that professors have made to document this project, such as this one. They won't be nearly as detailed, but, as slides usually do, they contain more visual aids.

Introduction: A Scenario

Before I talk about the project 2 code, I think it would be a good idea to talk about the life cycle of a user process/thread/program. For Pintos, you can think of a process, a thread, and a user program as representing the same thing. Keep in mind that they really shouldn't be thought of this way, but for the sake of making this document easier to write, I'll talk about them as if they were the same. I'll be sure to point out their differences where applicable, though.

For the life cycle of a process, the first thing that happens is that a human asks the computer to run one of its programs. Let's call this User Program A, or UPA. For example, UPA could be just a hello world program, with main.c containing:

// User Program A
#include <stdio.h>
int main(void) {
    printf("Hello World\n");
    return 0; // essentially the same as saying exit(0);
    /* FYI, exit(0) is handled through a system call interrupt,
     * which then calls syscall_handler() in the file
     * userprog/syscall.c
     */
}
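
To make the comment about return 0 a bit more concrete: as far as I can tell, the user-side C library in Pintos supplies an entry point that runs before main() and hands main's return value straight to exit(). The sketch below imitates that idea on an ordinary machine. The names fake_exit, fake_start, user_main, and last_exit_status are hypothetical, and fake_exit only records the status instead of trapping into a kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the exit() system call: a real one would trap
   into the kernel; this one only records the status for inspection. */
static int last_exit_status = -1;

static void fake_exit(int status) {
    last_exit_status = status;
}

/* The user program's main(), reduced to the part that matters here. */
static int user_main(int argc, char *argv[]) {
    (void)argc;
    (void)argv;
    return 0;
}

/* Hypothetical stand-in for the entry point that runs before main():
   whatever main() returns is handed straight to exit(). */
static void fake_start(int (*entry)(int, char *[]), int argc, char *argv[]) {
    assert(entry != NULL);
    fake_exit(entry(argc, argv));
}
```

Calling fake_start(user_main, 1, argv) leaves 0 in last_exit_status, which is the sense in which return 0 and exit(0) end up being the same thing.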

UPA, or User Program A, would then be compiled and run by the user. The user would probably do this through the shell and type:

Let's label this command A:

$ gcc -o UPA main.c

Let's label this command B:

$ ./UPA

In Pintos, we must do things a bit differently if we want to run this program on our Pintos OS. One difference is that we actually get to see what goes on inside an operating system when the user types the two commands above. Another is that we can't really run gcc on Pintos (I'll explain why a bit later). We have to get by with loading the already compiled program, the executable binary file, onto the Pintos file system disk. So basically we are only running command B, since we made sure to load the binary file onto the disk beforehand.

To better illustrate what I just said, I'm going to step through the entire process of how we can run the program UPA with pintos:

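On your own OS, such as Ubuntu or whatever flavor of Linux you enjoy using to develop code for this class (I point this out in case I offend some of you), you would compile the program with gcc or whatever you normally use. (In practice, a Pintos user program has to be built against the Pintos user library rather than your host's C library, for example with the Makefile in the examples/ directory; plain gcc is shown here to keep the scenario simple.)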
```
$ gcc -o /home/user/Documents/UPA main.c
```
You would then load the generated binary file onto your Pintos disk.
```
$ pintos -p /home/user/Documents/UPA -a UPA -- -q
```
As you can probably tell, -p tells the pintos script where the program you want to load is on your own machine, and -a tells it what name to give the file on the Pintos disk.
This is the step that most resembles the user typing in command B from above. Later in this document, however, we will actually get to see what the OS does under the hood to get the UPA program running:
```
$ pintos -q run "UPA"
```
On a side note, you would run UPA with arguments like this:
```
$ pintos -q run "UPA argument1 argument2 argument 3"
```
In case I offend some of you, you can do it like this as well:
```
$ pintos -q run 'UPA argument1 argument2 argument 3'
```
Please don't do this. Without the quotes, Pintos treats the extra words as separate commands rather than as arguments to UPA:
```
$ pintos -q run UPA argument1 argument2 argument 3
```
Quiz: How many arguments are there?
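
If you want to check your answer to the quiz, a small sketch like the one below splits a command line on spaces the way a simple argument parser might. count_tokens is a hypothetical helper of mine, not something from the Pintos source, and it assumes that the space character is the only separator.

```c
#include <assert.h>
#include <string.h>

/* Count space-separated tokens in a command line.
   Hypothetical helper for illustration only. */
static int count_tokens(const char *cmdline) {
    assert(cmdline != NULL);
    int count = 0;
    const char *p = cmdline;
    for (;;) {
        p += strspn(p, " ");      /* skip any run of spaces */
        if (*p == '\0')
            break;                /* nothing left but the terminator */
        count++;                  /* p now points at the start of a token */
        p += strcspn(p, " ");     /* jump to the end of that token */
    }
    return count;
}
```

Calling count_tokens on the quoted string from the run command and subtracting one for the program name gives the number of arguments.
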
You can combine everything into one command: creating a disk, loading the program, booting the OS, and running UPA. Note the -f and -q after the --, which are passed on to the Pintos kernel and tell it to format the disk and to quit once UPA is done.
```
$ pintos --filesys-size=2 -p /home/user/Documents/UPA -a UPA -- -f -q run "UPA"
```
On a side note, the "pintos" command used in the steps above is a Perl script located in the utils/ directory. From what I can tell, the script handles options such as -p and -a itself, passes everything after the -- (such as -f, -q, and run) on to the kernel, and then boots the actual Pintos OS. Once the OS is booted, the main thread, or the god thread, is initialized and starts running the code from init.c (I'll talk about that in a bit). The code from init.c, running on the main thread, then takes in the "run 'UPA'" command (I'll talk more about that in a bit as well). I just want to point out that there is a pintos script and also a Pintos OS that actually gets booted, in case you mix these two things together or think of them as the same.

When we boot into pintos, it needs a thread to basically be a very crude shell. The crude shell takes in arguments such as "run."

So when we run UPA, Pintos, or more specifically the main thread running the crude shell code from init.c, sees that we have written run, which tells the OS that we are about to run a program. It then looks at the second argument, which is UPA, and calls process_execute with it:

// Note: if we had typed run "UPA arg1", process_execute would get that whole string instead.
// No argument parsing goes on here.
process_wait(process_execute("UPA"));
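
The crude shell idea can be sketched as a small table that maps a command word to a handler. This is only my approximation of the approach, not a copy of init.c: the struct layout, the single "run" entry, and last_run_task are assumptions made for illustration, and run_task just records its argument where the kernel would call process_wait(process_execute(...)).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry in a hypothetical action table: a command word, how many
   words it consumes (itself included), and the function to call. */
struct action {
    const char *name;
    int argc;
    void (*function)(char **argv);
};

static const char *last_run_task = NULL;   /* what "run" was last given */

static void run_task(char **argv) {
    /* The kernel would call process_wait(process_execute(argv[1])) here;
       the sketch only records the task string. */
    last_run_task = argv[1];
}

static const struct action actions[] = {
    { "run", 2, run_task },
    { NULL, 0, NULL },
};

/* Walk argv, look each word up in the table, and invoke its handler.
   Returns 0 on success, -1 on an unknown action or a missing argument. */
static int run_actions(char **argv) {
    assert(argv != NULL);
    while (*argv != NULL) {
        const struct action *a;
        for (a = actions; a->name != NULL; a++)
            if (strcmp(*argv, a->name) == 0)
                break;
        if (a->name == NULL)
            return -1;                     /* unknown action */
        for (int i = 1; i < a->argc; i++)
            if (argv[i] == NULL)
                return -1;                 /* action needs more words */
        a->function(argv);
        argv += a->argc;                   /* move on to the next action */
    }
    return 0;
}
```

With this shape, { "run", "UPA arg1" } hands the whole string "UPA arg1" to run_task, while { "run", "UPA", "argument1" } fails on "argument1" because it is looked up as an action of its own. That is the reason the quotes matter.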

process_execute will then call thread_create from thread.c to actually create the thread for running UPA. If you look in thread_create, a thread is created and handed to the scheduler. So, once the call to thread_create finishes, the created thread is free to start running, depending on how the scheduler is run. The main thread then waits for the child to finish loading. Here is the general idea in code:

// Pseudocode
tid_t process_execute("UPA")
{
    tid_t tid = thread_create();   // create the thread
    wait_for_child_done_signal();  // wait for the child to finish setting itself up for running UPA
    if (child couldn't set itself up)
        return -1;
    return tid;
}
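
Here is a runnable sketch of that wait-for-load handshake. It uses POSIX threads so that it can be tried on an ordinary Linux machine (compile with -pthread); inside Pintos you would more likely reach for the kernel's own synchronization primitives, such as its semaphores. Every name in it (load_sync, child_start, execute_and_wait_for_load, should_succeed) is hypothetical, the load itself is faked by a flag, and the returned 1 is just a placeholder thread id.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Shared state for the load handshake. All names are hypothetical. */
struct load_sync {
    pthread_mutex_t lock;
    pthread_cond_t done;     /* signalled once the child has finished loading */
    bool finished;           /* has the child reported back yet? */
    bool success;            /* did the (simulated) load succeed? */
    bool should_succeed;     /* knob for the sketch: outcome the child reports */
};

/* Child side: pretend to load the executable, then wake the parent. */
static void *child_start(void *aux) {
    struct load_sync *s = aux;
    assert(s != NULL);
    bool ok = s->should_succeed;           /* stands in for load() */
    pthread_mutex_lock(&s->lock);
    s->success = ok;
    s->finished = true;
    pthread_cond_signal(&s->done);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Parent side: create the child, block until it reports back, and return
   a placeholder thread id of 1 on success or -1 if the load failed. */
static int execute_and_wait_for_load(bool should_succeed) {
    struct load_sync s;
    pthread_t child;
    int result = -1;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.done, NULL);
    s.finished = false;
    s.success = false;
    s.should_succeed = should_succeed;

    if (pthread_create(&child, NULL, child_start, &s) == 0) {
        pthread_mutex_lock(&s.lock);
        while (!s.finished)                /* loop guards against spurious wakeups */
            pthread_cond_wait(&s.done, &s.lock);
        if (s.success)
            result = 1;
        pthread_mutex_unlock(&s.lock);
        pthread_join(child, NULL);         /* sketch only: keeps s alive until the child is done */
    }
    pthread_cond_destroy(&s.done);
    pthread_mutex_destroy(&s.lock);
    return result;
}
```

The point of the design is that the parent cannot return a thread id before it knows whether the load worked, so the child has to report its result before the parent moves on.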

So while the main thread is waiting, the child thread goes into start_process. Note how in thread_create we passed in a pointer to the start_process function, so the child thread knows to call start_process from process.c. start_process is where we load the binary file of UPA into the thread's memory and then set up the stack to contain UPA's arguments; in this case there will be none. Since the main thread is waiting on the child, once the child finishes loading in start_process, it notifies its parent, in this case the main thread, that it has finished. Here's the general idea in code:

// Pseudocode
void start_process("UPA")
{
    // Load the UPA binary (executable) file from disk into the thread's allocated memory
    // so the program can run, then set up any other things such as the stack.
    bool success = load("UPA");

    // Notify the parent that we have finished setting up, so the parent knows to stop waiting.
    send_child_done_signal_to_parent();

    if (!success)
        exit(-1);   // the child thread exits if the load function failed

    /* A few lines of code here cause the executable's machine code to start running, thus running UPA. */
}

bool load("UPA")
{
    /* For a few lines it attempts to load the executable file.
     * On a side note, if we had not copied the binary file onto the disk earlier,
     * the UPA executable would not be found, and load would fail. */

    setup_stack();   // called only if nothing has failed so far, e.g. no running out of memory

    return true;     // reached only if load() is successful
}
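
The part of the stack setup that deals with arguments can be sketched on an ordinary machine with a byte array standing in for the stack page. The layout below follows my reading of the program startup section of the Stanford documentation (strings first, then padding, a null sentinel, the argv[i] pointers, argv, argc, and a fake return address). The 256-byte stack, the 0xC0000000 top address, and all helper names (at, push, push_word, read_word, setup_args) are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 256u          /* tiny stand-in for one stack page */
#define STACK_TOP  0xC0000000u   /* assumed address just above the stack */
#define MAX_ARGS   16

static uint8_t stack_mem[STACK_SIZE];

/* Translate a simulated 32-bit user address into a pointer into stack_mem. */
static uint8_t *at(uint32_t addr) {
    assert(addr <= STACK_TOP && addr >= STACK_TOP - STACK_SIZE);
    return stack_mem + (addr - (STACK_TOP - STACK_SIZE));
}

/* Copy len bytes just below sp and return the lowered stack pointer. */
static uint32_t push(uint32_t sp, const void *src, uint32_t len) {
    sp -= len;
    memcpy(at(sp), src, len);
    return sp;
}

static uint32_t push_word(uint32_t sp, uint32_t word) {
    return push(sp, &word, sizeof word);
}

static uint32_t read_word(uint32_t addr) {
    uint32_t word;
    memcpy(&word, at(addr), sizeof word);
    return word;
}

/* Lay out argc and argv for a 32-bit program; returns the final stack pointer. */
static uint32_t setup_args(int argc, const char *argv[]) {
    uint32_t addr[MAX_ARGS];
    uint32_t sp = STACK_TOP;
    const uint8_t zero = 0;

    assert(argc >= 0 && argc <= MAX_ARGS);

    for (int i = argc - 1; i >= 0; i--) {      /* 1. the strings themselves */
        sp = push(sp, argv[i], (uint32_t)strlen(argv[i]) + 1);
        addr[i] = sp;
    }
    while (sp % 4 != 0)                        /* 2. pad down to a word boundary */
        sp = push(sp, &zero, 1);
    sp = push_word(sp, 0);                     /* 3. argv[argc], a null pointer */
    for (int i = argc - 1; i >= 0; i--)        /* 4. argv[i] pointers, last first */
        sp = push_word(sp, addr[i]);
    uint32_t argv_addr = sp;                   /* address of argv[0] */
    sp = push_word(sp, argv_addr);             /* 5. argv */
    sp = push_word(sp, (uint32_t)argc);        /* 6. argc */
    sp = push_word(sp, 0);                     /* 7. fake return address */
    return sp;
}
```

After setup_args(2, args) with args holding "UPA" and "arg1", the word at sp is the fake return address, sp + 4 holds argc, and sp + 8 holds the address of argv[0], which can be followed with read_word and at to get the strings back.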

Once UPA finishes, the main thread finishes too, and the OS then shuts down; note the -q argument in the combined command above. The main thread, which runs the crude shell code from init.c, is not created with process_execute (which uses thread_create) the way the thread for running UPA was. Thus, any initialization that you do for your thread/process implementation also has to be done for the main thread. The main thread uses thread_init from thread.c to initialize itself. For example, in both thread_create and thread_init you will find the same code used to initialize a process's file table linked list and child list.
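
One way to avoid forgetting the main thread is to put the per-process initialization in a single helper and call it from both set-up paths. The sketch below shows that idea together with a very small child list. It is not the code from our implementation: it uses a fixed array where Pintos code would normally use the kernel's linked list, and all names (child_record, process_info, process_info_init, add_child, child_exited, wait_for_child) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 8

/* Hypothetical record a parent keeps about each child it started. */
struct child_record {
    int tid;
    int exit_status;
    bool exited;
    bool waited;
};

/* Hypothetical per-process bookkeeping that every thread needs, whether it
   came from the thread_create path or is the main thread set up at boot. */
struct process_info {
    struct child_record children[MAX_CHILDREN];
    size_t child_count;
};

/* Shared initializer: call it from both set-up paths so neither is missed. */
static void process_info_init(struct process_info *p) {
    assert(p != NULL);
    p->child_count = 0;
}

static struct child_record *find_child(struct process_info *p, int tid) {
    for (size_t i = 0; i < p->child_count; i++)
        if (p->children[i].tid == tid)
            return &p->children[i];
    return NULL;
}

static bool add_child(struct process_info *p, int tid) {
    if (p->child_count == MAX_CHILDREN)
        return false;
    struct child_record *c = &p->children[p->child_count++];
    c->tid = tid;
    c->exit_status = -1;
    c->exited = false;
    c->waited = false;
    return true;
}

/* Called on behalf of a child when it exits, so the parent can read the status later. */
static void child_exited(struct process_info *p, int tid, int status) {
    struct child_record *c = find_child(p, tid);
    if (c != NULL) {
        c->exit_status = status;
        c->exited = true;
    }
}

/* Returns the child's exit status, or -1 if tid is not a child of this
   process or has already been waited for. */
static int wait_for_child(struct process_info *p, int tid) {
    struct child_record *c = find_child(p, tid);
    if (c == NULL || c->waited)
        return -1;
    if (!c->exited)
        return -1;   /* sketch only: a real kernel would block here instead */
    c->waited = true;
    return c->exit_status;
}
```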

Differences between a process, a thread, and a user program.

Although these three things can be thought of as the same, as pointed out earlier, we shouldn't do that. There are some differences. A process is one abstraction level higher than a thread. A thread deals more with the lower-level side of things, such as scheduling or allocating memory for the thread. A process is more involved with the user program that it runs. A process has a parent-child relationship and, as seen from the project 2 code, it loads the user program, makes sure it is set up correctly, and makes sure that the user program code actually runs. A thread doesn't worry about that. It worries about how threads are scheduled, which one should run, and, as said before, memory allocation for things not related to the user program code itself. A user program can be thought of as the highest level of abstraction, which involves only the user program code. More importantly, a user program can create threads and/or processes. Several threads created by the same user program share memory, whereas a newly started user program does not share any memory with the program that started it. A process sees its own virtual address space, whereas a thread does not have an address space of its own.

Summary

After reading everything on the Project 2: Userprograms wiki page, you should have an understanding of what exactly needs to be done for project 2. Just to summarize, we have to implement processes so that there is a parent-child relationship between them, and we have to implement system calls so that processes can ask the operating system to do things that they can't do themselves, such as exec(), where the process creates a child process, or exit(), where the process ends its life. Thus, we are implementing the tools that a user program needs in order to run correctly. For example, the UPA program needs the system calls exit() and write() to function properly. It also expects the OS to know how to actually run UPA, which in this case means knowing how to make a thread and set it up properly so that the UPA code can run. Hopefully you now see why project 2 is called Userprograms. Gcc does not run on Pintos, probably because it needs a few system calls or the like that we will never touch in Pintos.
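
To give a feel for what "implementing system calls" means for UPA, here is a minimal sketch of a dispatcher that handles only exit() and write(). It is not our syscall_handler: the call numbers, the sketch_process struct, and the idea of passing the arguments as an array of words are assumptions for illustration, and it skips the validation of user pointers that a real handler must do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical system call numbers, used by this sketch only. */
enum { SKETCH_SYS_EXIT = 1, SKETCH_SYS_WRITE = 2 };

/* Hypothetical per-process state touched by the two calls. */
struct sketch_process {
    bool exited;
    int exit_status;
    char output[64];       /* stands in for the console */
    size_t output_len;
};

/* words[0] is the call number; the call's arguments follow it, much as they
   would sit on the user stack. Returns the call's result, or -1 on error. */
static int syscall_dispatch(struct sketch_process *p, const uintptr_t *words) {
    assert(p != NULL && words != NULL);
    switch (words[0]) {
    case SKETCH_SYS_EXIT:
        p->exited = true;
        p->exit_status = (int)words[1];
        return 0;
    case SKETCH_SYS_WRITE: {
        int fd = (int)words[1];
        const char *buf = (const char *)words[2];   /* a real handler must validate this */
        size_t size = (size_t)words[3];
        size_t room = sizeof p->output - p->output_len;
        if (fd != 1)
            return -1;                 /* the sketch handles the console only */
        if (size > room)
            size = room;               /* write only what fits */
        memcpy(p->output + p->output_len, buf, size);
        p->output_len += size;
        return (int)size;              /* number of bytes written */
    }
    default:
        return -1;                     /* unknown system call */
    }
}
```

For UPA, printf would eventually turn into a SKETCH_SYS_WRITE request for "Hello World\n" on file descriptor 1, and returning from main would turn into a SKETCH_SYS_EXIT request with status 0.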

If you would like to see the implementation, along with a wiki that describes it, you can view it here.

Pintos Project User Programs - ZipingL/OperatingSystemsNotes GitHub Wiki
