Virtual File System Help Guide - brown-cs1690/handout GitHub Wiki

Resources

What is Virtual File System (VFS)?

  • VFS is an interface (API) between the high-level kernel and the various possible filesystems (e.g. ramfs, s5fs, procfs, ext4, zfs, ntfs)
    • You'll be using ramfs as your filesystem for VFS, but you'll be writing a filesystem during S5FS that will replace ramfs
    • The design here allows us to define general functions for filesystems such as namev_lookup() and do_write() which will utilize a vnode's corresponding function that is defined by the filesystem. For example, namev_lookup() will use a vnode's lookup operation. During VFS, that would then call ramfs_lookup()
      • In ramfs.c, look at the ramfs_dir_vops and ramfs_file_vops. You're not required to know how these work, but it would be helpful to look through ramfs.c because you will be writing similar functionality for S5FS and the ramfs functions are called when you use a vnode's operation ("vnops")
  • It partially defines operations like open(), close(), read(), write(), mkdir(), mknod(), etc., which call upon underlying functions such as do_open() and do_read()
    • You will be writing these syscalls in VM. They ultimately call the do_ functions that you write in VFS
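
The dispatch described above can be pictured with a small user-space sketch. Everything here is a hypothetical stand-in (toy_vnode_t, toy_vnode_ops_t, toy_ramfs_lookup(), toy_lookup()), not the Weenix definitions. It only shows how a generic function reaches a filesystem's routine through a function pointer stored on the vnode.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for vnode_t and its operations table. */
struct toy_vnode;

typedef struct toy_vnode_ops {
    /* Filesystem-specific lookup: find `name` inside directory `dir`. */
    int (*lookup)(struct toy_vnode *dir, const char *name,
                  struct toy_vnode **out);
} toy_vnode_ops_t;

typedef struct toy_vnode {
    const char *name;
    const toy_vnode_ops_t *ops; /* filled in by the filesystem */
    struct toy_vnode *child;    /* toy directory: at most one entry */
} toy_vnode_t;

/* Plays the role of ramfs_lookup(): the filesystem's own implementation. */
int toy_ramfs_lookup(toy_vnode_t *dir, const char *name, toy_vnode_t **out)
{
    if (dir->child != NULL && strcmp(dir->child->name, name) == 0) {
        *out = dir->child;
        return 0;
    }
    return -ENOENT; /* negative error code, as in the kernel */
}

const toy_vnode_ops_t toy_ramfs_ops = { .lookup = toy_ramfs_lookup };

/* Plays the role of namev_lookup(): it never names the filesystem,
 * it only calls through the operations table on the vnode. */
int toy_lookup(toy_vnode_t *dir, const char *name, toy_vnode_t **out)
{
    assert(dir != NULL && name != NULL && out != NULL);
    if (dir->ops == NULL || dir->ops->lookup == NULL) {
        return -ENOTDIR; /* this vnode cannot be searched */
    }
    return dir->ops->lookup(dir, name, out);
}
```

Swapping in a different operations table changes which routine toy_lookup() reaches without touching toy_lookup() itself. That is the property that lets S5FS replace ramfs later.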

Note for 1670 Students

These help guides are oriented for students completing the lab, so some steps may not be applicable to you (such as modifying proc.c)

Data Structures

Vnodes (vnode_t)

  • Can represent files, directories, links, block devices, and character devices. The vnode operations (vn_ops) and their associated routines vary depending on what the vnode represents. These operations manipulate the data the vnode represents (reading, writing, etc.)
  • Each one is initialized by the filesystem
  • One for each directory and file (and link, etc.)
  • Contains operations (methods) for this object (function pointers to the underlying filesystem functions)
  • Contains attributes such as type of vnode and pointer to inode (filesystem implementation's counterpart to vnode)

Files (file_t)

  • Represents files on the filesystem
  • Contains the position in the file you're currently at
  • Contains the mode in which the file was opened (read, write, append, max)
  • Contains the underlying vnode corresponding to the file
  • Files are "open" when they're being pointed to in a process's file descriptor table. If a file descriptor is pointing to NULL then that file descriptor is not being used
    • Multiple processes or different file descriptor entries in the same process can point to the same file_t , so it's important to manage your locks correctly!
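
As a rough illustration of the "NULL means unused" convention for file descriptor tables, here is a minimal sketch. The names toy_file_t, TOY_NFILES, and toy_get_empty_fd() are made up for this example, and returning -EMFILE when the table is full is an assumption borrowed from open(2). Check the stencil comments for what your helper should return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_NFILES 4 /* assumed tiny table; the real limit is NFILES in config.h */

typedef struct toy_file {
    int refcount;
} toy_file_t;

/* Returns the lowest unused descriptor, or -EMFILE if every slot is taken. */
int toy_get_empty_fd(toy_file_t *table[])
{
    assert(table != NULL);
    for (int fd = 0; fd < TOY_NFILES; fd++) {
        if (table[fd] == NULL) {
            return fd; /* a NULL slot is a free descriptor */
        }
    }
    return -EMFILE;
}
```

Note that two slots may legitimately hold the same toy_file_t pointer, which mirrors several descriptors sharing one file_t.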

File System (fs_t)

  • Represents the filesystem itself
  • Includes info like the root vnode of the filesystem ("/"), FS ops (like reading vnodes, i.e. initializing a vnode from the underlying filesystem)
  • Only one mounted filesystem is required - and the mounting is hard-coded (don't need to worry about it). For more information about mounting, refer to the VFS Handout

Memory Management and Locking

  • Make sure to lock the vnode if you're using its vn_ops or modifying a vnode's field -- vlock(), vunlock(), vlock_in_order(), and vunlock_in_order() will be essential for managing vnode locks
    • These functions actually lock and unlock the underlying memory object for a vnode to protect access to the memory object's page frames when an operation blocks (when interacting with the disk). This will make more sense in later projects!
  • We only want vnodes to be around when the file is open or in-use to be efficient with memory
    • Vnodes are reconstructed by the filesystem by reading the corresponding data off the disk (filesystem op read_vnode())
    • To minimize space usage, we can keep a reference count and free the vnode when the ref count drops to zero
    • To manage refcounts for vnodes we use vref() (increment refcount) and vput() (decrement refcount). If a vnode is locked you can use vput_locked()
    • Weenix panics on shutdown if the refcount is nonzero
  • Be sure to properly manage refcounts in the functions you write. Think about the net gain/loss in refcounts for each function. It depends on whether the vnode is still in use after the function returns, for example when the caller keeps using a vnode that the function obtained for it
  • Be sure to look at ramfs routines that affect refcounts such as ramfs_lookup() -- you want to vput() the vnode if you're no longer using it after ramfs_lookup() is called
  • You will not have to call vget() directly to retrieve vnodes during VFS. However, you may end up calling fget() directly to retrieve files. The alternative is accessing the process's file descriptor table directly for the file, with the difference being that fget() refs the file whereas accessing the file through the file descriptor table directly doesn't
    • Refcounts for vnodes and files are different. A file_t's reference count indicates how many times a file_t has been referenced (in other words, the number of file descriptor entries that refer to it). The reference count of its corresponding vnode will increase when the file is created and it will decrease when the file's reference count drops to zero
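
To make the "net gain/loss" idea concrete, here is a toy reference counter. The names rc_vnode_t, rc_vnode_create(), rc_vref(), and rc_vput() are hypothetical and only loosely modeled on the vref()/vput() behavior described above. Having rc_vput() take a pointer to the caller's pointer and clear it is a design choice of this sketch, so do not assume the real signatures match.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct rc_vnode {
    int refcount;
} rc_vnode_t;

/* Number of toy vnodes currently allocated; should be 0 at "shutdown". */
int rc_live_vnodes = 0;

/* Creates a vnode that starts with one reference, owned by the caller. */
rc_vnode_t *rc_vnode_create(void)
{
    rc_vnode_t *vn = malloc(sizeof(*vn));
    if (vn == NULL) {
        return NULL;
    }
    vn->refcount = 1;
    rc_live_vnodes++;
    return vn;
}

/* Takes an additional reference. */
void rc_vref(rc_vnode_t *vn)
{
    assert(vn != NULL && vn->refcount > 0);
    vn->refcount++;
}

/* Drops one reference and clears the caller's pointer so it cannot be
 * reused by accident; frees the vnode when the last reference goes away. */
void rc_vput(rc_vnode_t **vnp)
{
    assert(vnp != NULL && *vnp != NULL && (*vnp)->refcount > 0);
    rc_vnode_t *vn = *vnp;
    *vnp = NULL;
    if (--vn->refcount == 0) {
        free(vn);
        rc_live_vnodes--;
    }
}
```

A function that obtains a reference and hands the vnode back to its caller has a net gain of one. A function that only uses the vnode internally should put it before returning, for a net change of zero.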

Macros and Flags

Helpful Macros

  • There are macros that you should use during VFS. Some of these macros are defined in config.h
    • NFILES - maximum number of open files
    • NAME_LEN - maximum length for directory entry name
  • There are more macros defined in stat.h. To use these macros, you should pass in a vnode's vn_mode.
    • S_ISCHR(mode) - returns 1 if the vnode is a char device
    • S_ISDIR(mode) - returns 1 if the vnode is a directory
    • S_ISBLK(mode) - returns 1 if the vnode is a block device
    • S_ISREG(mode) - returns 1 if the vnode is a regular file

OFLAGS, File Modes, and Masking

  • There will be flags passed into do_open() and namev_open() referred to as "oflags". These flags are defined in fcntl.h
    • O_CREAT - create the file if it's non-existent
    • O_TRUNC - truncate the file to zero length
    • O_APPEND - append to the file
    • O_RDONLY - file is read-only
    • O_WRONLY - file is write-only
    • O_RDWR - file is accessible to reading and writing
  • To take advantage of flags, we use bit masking!
    • For example, to check if a file is open only for writing, you can check if (oflags & O_WRONLY) == O_WRONLY
  • There are flags that specify the mode in which a file was opened to restrict operations that can be performed on the underlying vnode. These flags are defined in file.h
    • FMODE_READ, FMODE_WRITE, FMODE_APPEND, FMODE_MAX_VALUE (all three)
  • In do_open() you will be converting oflags into file modes
    • If you find that a file's access is read-only then the mode will be FMODE_READ
    • To combine multiple mode flags, use | (for example, FMODE_READ | FMODE_WRITE if the file is accessible for both reading and writing)
    • If a file is accessible to reading and writing and the O_APPEND oflag is set, then you should use FMODE_MAX_VALUE
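
The conversion from oflags to file modes can be sketched as below. All TOY_ constants and their numeric values are assumptions made for this example, so check fcntl.h and file.h for the real ones. Rejecting O_WRONLY combined with O_RDWR is likewise a guess at what "invalid flags" means. One detail worth noticing: if O_RDONLY is 0, as assumed here, then (oflags & O_RDONLY) == O_RDONLY is true for every input, so read-only has to be detected as the absence of the other two access flags.

```c
#include <assert.h>

/* Assumed values for illustration only. */
#define TOY_O_RDONLY 0x000
#define TOY_O_WRONLY 0x001
#define TOY_O_RDWR   0x002
#define TOY_O_CREAT  0x100
#define TOY_O_TRUNC  0x200
#define TOY_O_APPEND 0x400

#define TOY_FMODE_READ   1
#define TOY_FMODE_WRITE  2
#define TOY_FMODE_APPEND 4
#define TOY_FMODE_MAX_VALUE \
    (TOY_FMODE_READ | TOY_FMODE_WRITE | TOY_FMODE_APPEND)

/* Returns the file mode bits for `oflags`, or -1 if the access flags
 * contradict each other. */
int toy_oflags_to_fmode(int oflags)
{
    int access = oflags & (TOY_O_WRONLY | TOY_O_RDWR);
    int mode;

    if (access == (TOY_O_WRONLY | TOY_O_RDWR)) {
        return -1; /* write-only and read-write at once */
    } else if (access == TOY_O_RDWR) {
        mode = TOY_FMODE_READ | TOY_FMODE_WRITE;
    } else if (access == TOY_O_WRONLY) {
        mode = TOY_FMODE_WRITE;
    } else {
        mode = TOY_FMODE_READ; /* neither bit set: read-only */
    }

    if (oflags & TOY_O_APPEND) {
        mode |= TOY_FMODE_APPEND;
    }

    assert(mode > 0 && mode <= TOY_FMODE_MAX_VALUE);
    return mode;
}
```

Flags such as O_CREAT and O_TRUNC do not affect the mode. They are masked out here and would be handled elsewhere in the open path.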

To-Dos and To-Do'nts

What is completed for you?

  • Filesystem setup (mounting)
  • Filesystem (ramfs)

What do you have to complete?

  • Name operations (vnode lookup)
    • namev_lookup(), namev_dir(), namev_open()
  • System calls (do_open(), do_close(), etc.) in vfs_syscall.c and open.c
  • chardev_file_[read,write] in vnode_specials.c
  • Modify procs for cleanup and setup (proc_create(), proc_destroy(), proc_cleanup()) in proc.c

Namev

namev_lookup()

  • Given a vnode for a directory and a filename in that directory, return the vnode for that particular file if it exists
  • Mostly delegates to filesystem-specific directory lookup routine (vn_ops->lookup)
  • Don't forget to check for errors and special cases!

namev_dir()

  • Given a full path, return the base part as a string and the vnode for the directory part
  • /usr/bin/ls: /usr/bin = directory, ls = base
  • Use basename on department filesystem or your VM to get a feel for this (basename /bin/bash)
  • Use dirname to extract the directory
  • Parse the path into parts using namev_tokenize() and for each part of the path call namev_lookup() down the tree until you get to the end
    • Test namev_tokenize() in an online C editor. It's very helpful

namev_dir() example

  • Path = “/usr/bin/ls”
    • First namev_lookup() “usr” in the root directory (because it starts with a forward slash). Then, lookup "bin" in the vnode given by namev_lookup(). Return that vnode and return "ls" for the name
  • Manage refcounts correctly!
  • Make sure you can handle cases like these (use the department filesystem or VM to see what should happen when you use a path like below with various commands and system calls!):
    • ////course/////////////cs169//////
    • namev_tokenize() will be very helpful for this
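
If you want to experiment with path splitting outside the kernel, a stand-alone sketch like the one below may help. toy_next_component() and toy_split_path() are hypothetical helpers written for this guide. They are not namev_tokenize() or namev_dir(), and they do no vnode lookups or refcounting. They only show how repeated and trailing slashes can be skipped while tracking the last component.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Skips any run of '/' and returns the start of the next component,
 * storing its length in *len and advancing *cursor past it.
 * Returns NULL when no component is left. */
const char *toy_next_component(const char **cursor, size_t *len)
{
    const char *p = *cursor;
    while (*p == '/') {
        p++;
    }
    if (*p == '\0') {
        *cursor = p;
        *len = 0;
        return NULL;
    }
    const char *start = p;
    while (*p != '\0' && *p != '/') {
        p++;
    }
    *len = (size_t)(p - start);
    *cursor = p;
    return start;
}

/* Reports the final component of `path` (the base) and how many earlier
 * components a directory walk would have to look up before reaching it.
 * The base points into `path` and is NOT NUL-terminated: use *base_len. */
void toy_split_path(const char *path, const char **base,
                    size_t *base_len, size_t *ndirs)
{
    assert(path != NULL && base != NULL && base_len != NULL && ndirs != NULL);
    const char *cursor = path;
    const char *tok;
    size_t len;

    *base = NULL;
    *base_len = 0;
    *ndirs = 0;
    while ((tok = toy_next_component(&cursor, &len)) != NULL) {
        if (*base != NULL) {
            (*ndirs)++; /* the previous component was a directory */
        }
        *base = tok;
        *base_len = len;
    }
}
```

For "/usr/bin/ls" this reports two directory components and the base "ls". For a path made only of slashes it reports no base at all, which is one of the special cases your real implementation has to decide how to handle.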

namev_open()

  • Given a pathname and the flags from open(2), returns the vnode for the full path (creating it, if O_CREAT is specified and the file doesn't exist)
    • Uses namev_dir(), namev_lookup(), and if O_CREAT is specified, the mknod vnode operation

Syscalls

  • Reference the man pages for the syscalls!
  • Use the functions that you wrote in namev.c
  • Helpful files for these syscalls: file.h, fcntl.h, stat.h, and vnode.h

do_mkdir() example

  • Suppose we call do_mkdir("/dev")
    • We want to create a directory called dev in the root directory
    • We call namev_dir() to get the root directory vnode and the basename "dev"
    • We do some error checking
    • We use the parent directory's vnode mkdir operation
      • ramfs_mkdir() will create the directory inside the root directory
    • Now, the root directory has a dev directory!
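
The "error checking" step and the rule about returning negative error codes can be sketched as follows. The functions toy_mkdir_checks() and toy_mkdir(), the TOY_NAME_LEN value, and the particular set and order of checks are all assumptions for illustration, loosely based on the errors listed in mkdir(2). The stencil comments are the authority on what your do_mkdir() must check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_NAME_LEN 28 /* assumed limit; the real NAME_LEN is in config.h */

/* Returns 0 if the directory may be created, or a negative errno. */
int toy_mkdir_checks(int parent_is_dir, size_t name_len, int already_exists)
{
    if (!parent_is_dir) {
        return -ENOTDIR;
    }
    if (name_len > TOY_NAME_LEN) {
        return -ENAMETOOLONG;
    }
    if (already_exists) {
        return -EEXIST;
    }
    return 0;
}

/* Caller: passes a lower-level error up unchanged instead of inventing
 * a new code. Setting *created stands in for invoking the parent
 * vnode's mkdir operation. */
int toy_mkdir(int parent_is_dir, size_t name_len, int already_exists,
              int *created)
{
    assert(created != NULL);
    *created = 0;

    int err = toy_mkdir_checks(parent_is_dir, name_len, already_exists);
    if (err < 0) {
        return err; /* propagate the negative code as-is */
    }
    *created = 1;
    return 0;
}
```

In real code, every early return like the one above is also a point where any vnode reference obtained earlier has to be released.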

Files you'll be using during the project

You will be directly modifying the following files:

  • vfs_syscall.c
    • do_x operations that get called by syscalls you write in VM. These operations call the underlying vnode's operation to perform the action
  • namev.c
    • Operations dealing with path parsing. These functions will be widely used in syscalls, so it's recommended that you start here
  • open.c
    • File containing do_open() and a helper function for getting empty file descriptors
  • vnode_specials.c
    • Special vn_ops for block and char devices
  • proc.c
    • Manage file descriptor tables upon creating and destroying a process

It will be helpful to refer to the following files:

  • file.c/file.h
    • Functions for managing file refcount and fields for file_t struct
  • vnode.c/vnode.h
    • Functions for managing vnode locks and refcount
    • Fields for vnode_t struct
    • Descriptions for different vn_ops
  • ramfs.c
    • File system written for use during VFS. These functions are used when you call a vnode's operation such as vn_op->lookup calling ramfs_lookup()
  • kernel/include/fs/stat.h
    • Useful for looking at vn_mode masks and macros
  • kernel/include/fs/fcntl.h
    • Modes for file access. Helpful when implementing do_open() and namev_open()
  • kernel/include/config.h
    • Macros for various parts of Weenix, including filesystem and vfs configuration parameters

Testing

  • Write your own test code
  • Use the tests in vfstest.c (make sure you're using the file within the kernel folder as there is a vfstest.c that looks very similar in the usr folder)
    • kshell command: vfstest or call vfstest_main() in kmain.c (as you've done for Drivers and Procs)
      • Weenix should be able to halt cleanly even after running vfstest multiple times!
    • Make sure the relevant debug printouts are on so you can see vfstest's full output
    • You can comment out tests to make sure that individual tests/sections pass (and don't prevent Weenix from halting cleanly)
      • If you're encountering refcount issues, it helps to comment out every function in the test suite and then uncomment them one at a time to see where the refcount issue occurs

Debugging

  • Use dbg() macro to log information
    • dbg(DBG_VFS, "Something bad happened %d\n", val)
    • First arg is a debug mode - there are lots of different debug modes
    • Use this for reference-count debugging!
  • Enable/disable debug printouts by setting INIT_DBG_MODES variable in kernel/util/debug.c
    • When your Weenix runs, you'll see things outputted like test results and other helpful information based on the flags you've enabled
    • Use test, vfs, fref, and vnref
    • You can find a list of possible flags in debug.h

FAQs

  • How can I tell whether a specified path is referring to a directory or a file? 
    • A path ending in a trailing slash implies a directory
    • “a/b/”: b must be a directory
    • “a/b”: b may be either a file or a directory
  • For propagating errors, generally return errors returned from lower level functions. Read the man pages + stencil comments for what to return in what scenario! 
    • Note that you will be returning a negative error code
  • What should the output be if I pass all tests? How do I debug reference count issues? 
    • First make sure Weenix can halt cleanly without running vfstest
    • Systematically comment out the test functions in vfstest_main, run vfstest, and see if Weenix can halt
    • Try setting a conditional breakpoint on vref for the vnode of interest. Keep track of the vnode ref counts yourself and break if it reaches a number it is not supposed to be

Getting Started

  • Double check that VFS = 1 is set in Config.mk and run make clean all
  • Read. Read through all of the documentation we give you. It will save you a lot of time if you spend some time reading through things and understanding how VFS works. You won't get a 100% understanding on your first read, but it's good to have some baseline understanding of what you're implementing
  • Start with namev_lookup(), namev_dir() and namev_open()
  • Move onto the system calls. They will utilize your namev functions
    • Refer to man pages for syscalls as they tell you what particular error is used. Errors are also documented in the comments
    • Remember to return negative error codes
  • S5FS will rely heavily on this project, so make sure you make it very robust