Project Pintos Filesys - ZipingL/OperatingSystemsNotes GitHub Wiki

Pintos Filesys

Intro

For Filesys, we were required to implement sub directories and files beyond the root level. Currently, pintos only supports reading and creating files from root level directory, the only directory right now. We have to add the ability to add new directories and sub directories and sub files. With directories implemented, the required system calls for directories can then be implemented as noted in the stanford doc, e.g. chdir().

Lastly, we are required to implement growth for these files/directories. Writing past the EOF should not return an error or 0 anymore, it should instead grow the file so that the writing can fit. Only when there is no file disk memory left, a writing past EOF should return error.

A scenario

I'm gonna give an example before going into more details just to give more clarity. The key terms would then be explained after. An example would be say I create a new file 0 bytes, called "A".

Suppose that in this pintos fs_device layout: There are no directories except for root. The fs_device memory block is a total of 15 sectors large. The top row is the actual data type. The bottom row is the sector number. Any empty blocks are unused.

Fyi, pintos uses the same bitmap for its inode and data blocks, which makes things simpler for us.

BITMAP_INODE ROOT_DIR_INODE A
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14

Under the hood

The OS would then probably get a system call: create(), which calls filesys_create():

            create("./A", 0);
            create("A", 0);
            create("/A", 0);

Any of the above commands would create file A in the root directory. Thus, filesys_create() needs to parse the filepath correctly and check if it exists.

Then the filesys_create() does 4 things:

  1. It allocates an inode sector from the bitmap.

  2. It then adds the file's entry, containing which sector it is at and the string name of it, to the directory as specified by the path of the file.

  3. Then it will allocate any sectors for the actual file data. Since file size is zero, no sectors will be allocated for the data.

  4. Note, how it closed the directory which had the new file, at the end of the creation process. This is important, since close either deletes a file, or it closes it. By closing, meaning it saves the file's data that was read and manipulated on memory, into the the file disk memory, thus updating the file on the disk, more on this later. Note, directory.c/h is written for you, since it calls inode_write/inode_create. Inode_write, and inode_create, along with other functions in inode.c, needs to be changed of course.

             filesys_create("/A", 0) {
                 char file_name[2] = {'A', 0};
                 bool success = (dir != NULL
                           && free_map_allocate (1, &inode_sector) // 1)
                           && inode_create (inode_sector, initial_size) // 2)
                           && dir_add (dir, name, inode_sector)); // 3)
                 if (!success && inode_sector != 0) 
                   free_map_release (inode_sector, 1);
                 dir_close (dir); // 4)
             }
    

Inode_create() then makes a struct called inode_disk. struct inode_disk is the inode which will be on the filesystem, thus the size is required to be DATA_BLOCK_SIZE, as it is intended to be only on data block large. The version of the inode which is on memory is called struct inode. An inode and its inode_disk represent the same file. The key difference is the inode on memory does not have a size limit (the struct itself can be as large as you'd like). With these two inodes in mind, the inode on memory needs to make sure it has the latest version of the inode from the filesystem. The inode on disk (inode_disk) needs to make sure it gets properly updated. Proper design and use of opening and closing files and directories would insure this can happen: for an example, the importance of dir_close in filesys_create(). In regards to opening files/dirs, understanding when to reopen directory is important.

Note in this scenario, block_sector_t sectors would be 0, and thus free_map_allocate will not allocate inode sectors using the bitmap since the number of specified sectors to allocate is 0. Note that in this file implementation, free_map_allocate() has an input parameter sectors. It simply allocates the required number of sectors consecutively. For example. if I requested 4 sectors, I would get for example sectors 1,2,3,4. To read the file data, I don't have to worry much about where the data starts and ends, as I can look at the sectors 1,2,3,4 as one long array on the file. This becomes an issue when you no longer allocate sectors consecutively, since the boundaries of the data would then be more complex than just one long array, as seen in double indirect, and indirect blocks.

Note, for (i = 0; i < sectors; i++), which writes the inode_disk struct and its datablocks into the filesystem memory.

            bool
            inode_create (block_sector_t sector, off_t length)
            {
              struct inode_disk *disk_inode = NULL;
              bool success = false;
            
              ASSERT (length >= 0);
            
              /* If this assertion fails, the inode structure is not exactly
                 one sector in size, and you should fix that. */
              ASSERT (sizeof *disk_inode == BLOCK_SECTOR_SIZE);
            
              disk_inode = calloc (1, sizeof *disk_inode);
              if (disk_inode != NULL)
                {
                  size_t sectors = bytes_to_sectors (length);
                  disk_inode->length = length;
                  disk_inode->magic = INODE_MAGIC;
                  if (free_map_allocate (sectors, &disk_inode->start)) 
                    {
                      block_write (fs_device, sector, disk_inode);
                      if (sectors > 0) 
                        {
                          static char zeros[BLOCK_SECTOR_SIZE];
                          size_t i;
                          
                          for (i = 0; i < sectors; i++) 
                            block_write (fs_device, disk_inode->start + i, zeros);
                        }
                      success = true; 
                    } 
                  free (disk_inode);
                }
              return success;
            }

Now say that I call write to file A. Normally, this would fail, since I would be writing past the EOF. Since we must implement file growth, we would then need to create more data blocks for the file and update any metadata accordingly.

Since we are implementing indirect and dbl_indirect blocks, any code utilizing inode metadata would need to be changed to acknowledge indirect and dbl_indirect data blocks, not just direct.

Terms

Bitmap

Keeps tracks of which sectors are being used for both the inode and data blocks. A block is just one sector.The root dir is at sector 1. The bitmap is at sector 0. Thus, you can signify an invalid sector number with the number 0, as the bitmap is at sector 0. free_map_allocate() allocates sectors based on the bitmap.

Sectors and Blocks

This is just a number that maps to a block. This number is then further changed by block_read/block_write, by adding another constant to the sector number to get to the fs_device (filesystem memory) sector number. Think of it as a partiion, where all the file data and directory data goes into. Note how block_read/block_write take in "fs_device" as a parameter. A block contains either an inode or data.

Inode and Inode Disk

Recall an inode holds metadata for a file or directory, including where the data blocks are located. Inode is the inode on the memory. Inode_disk is the inode on the file disk memory. Inode can be any size as its in memory, inode_disk can only be 512 bytes large as it sits in one block in the filesystem disk memory. Both need to be correctly updated to insure they match each other at the required occasations, and not match at the accepatble occasations as noted in the standford doc.

Persistence

Pintos tests whether or not the contents of the file disk memory are correct after the OS is shutdown. As noted in the stanford doc, directories and growth need to be implemented to pass these tests since the tar program which is ran to archive the filesystem uses all growth/directory features. This created archive is then compared with the solution.

The Tar program basically goes through all the directories and files in filesystem disk starting from the root directory and copies the contents into a file it creates. The newly created archive file thus can grow to become quite large.

Our Plan of Attack

  • Implement Directories (key file filesys.c)
    • All directory tests passing only
  • Implement Growth: indirect blocks (key file inode.c)
    • All growth/directory tests passing only
    • Some persistence tests passing
  • Implement Growth: dbl_indirect blocks (key file inode.c)
    • All persistence tests passing