Rethinking the FHS - Part 2

2013-04-30

The Layout So Far

After considerable deliberation, a fairly clear file hierarchy had taken shape. The root level would simply be composed of a collection of user and group directories plus the single System directory, all of which share essentially the same structure.

System/
  Data
  Files
  Library
  Mount
  Programs
  Settings
User1/
  Data
  Desktop
  Files
  Library
  Programs
  Settings

Here Data is similar to var, Files to home, and Settings to etc from the FHS. The Programs directory contains software organized like the GoboLinux directory of the same name. For a user it holds personal software (like Rootless); for System it holds software to be used by all. Library is like usr, but contains mostly symlinks back to Programs and is broken into categorical directories with more human-readable names like Commands, Fonts, APIs, etc. The Desktop directory is simply a place for the user to link into other directories or files for convenient access; for System, it is replaced by Mount.

Along Comes Objectroot

Then two days ago I came across objectroot. This well-considered alternate file hierarchy raises some new questions and perspectives that need to be considered for any modern operating system. In particular, it raises the point that a single computer doesn't have to run just one operating system. It can run multiple operating systems, either by dual booting or virtually and simultaneously. So why then should there be only one System, or what objectroot calls a host? Good question.

Likewise, objectroot envisions the separation of vendors, which it calls org, independent of hosts. This also means that no user would ever have private installs of software. Rather, all software is stored under an org/*publisher*/ directory. Personal access comes only via links to this location, whereas access for all users is via the common user-group directory.

hosts/
  host1
  host2
org/
  org1
    ... programs go here ...
  org2
users/
  common
  user1
  user2

There is, however, a complication to this design. If an org has a compiled program, then it will almost certainly have to be compiled against a specific host for reasons like architecture and compilation options, e.g. one host could be i386 while another is ARM, or one Linux and another AROS. If a user switches to a different host, say via a reboot, then a different build of the software will likely be required. In that case, does it really make sense to have software located outside of a host's hierarchy? It does make good sense in some cases, such as clusters where all the hosts are identical. But what about wholly different operating systems?

To make this clear, let's take the stark example of a system that dual boots Windows and Linux. We will simplify things a bit just so the directory listings don't run off the page, but it makes no difference to the outcome.

hosts/
  Windows/
  Linux/
org/
  microsoft.com/
    word.exe
  mozilla.org/
    firefox
users/
  johndoe/
    cmd/
      word    -> ?
      firefox -> ?

Now, despite what objectroot's official documentation says, we can't exactly just point the user's word program to /org/microsoft.com/word.exe. If the user logs into the Linux host, then he would have a link to an executable that at the very least would fail, and at worst cause a really ugly crash. The fact is, Word is not a Linux program! To correct this, we would have to route the link indirectly via the host.

hosts/
  self -> Linux
  Windows/
    bin/
      word -> /org/microsoft.com/word.exe
      firefox -> /org/mozilla.org/firefox
  Linux/
    bin/
      firefox -> /org/mozilla.org/firefox
org/
  microsoft.com/
    word.exe
  mozilla.org/
    firefox
users/
  johndoe/
    cmd/
      word    -> /hosts/self/bin/word
      firefox -> /hosts/self/bin/firefox

Now if the user is running Linux, the user's directory still contains a reference to the word program, but it is a dead link, so it can't do anything. We use hosts/self to act as a reference to whatever the current host is. Note that this can't be a normal file system link. Rather, it has to be some kind of pseudo object akin to the proc and dev entries in Linux so that it can be dynamically adjusted, for instance when running a host instance via VMware.
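
To make the indirection concrete, here is a minimal Python sketch that builds the layout above in a scratch directory and checks which of the user's commands resolve. It rests on several assumptions: an ordinary symlink stands in for the dynamic hosts/self pseudo object, the absolute links are re-rooted under a temporary directory, the user's links live in cmd/ as in the earlier listing, and the helper names build_layout and is_usable are made up for illustration. It also assumes a file system that supports symlinks.

```python
import os
import tempfile


def build_layout(root, current_host):
    """Create the example tree under root; hosts/self points at current_host."""

    def touch(rel):
        # Create an empty placeholder file standing in for a real program.
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()

    def link(rel, target_rel):
        # Create a symlink at rel that points at target_rel (both under root).
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        os.symlink(os.path.join(root, target_rel), path)

    touch("org/microsoft.com/word.exe")
    touch("org/mozilla.org/firefox")
    link("hosts/Windows/bin/word", "org/microsoft.com/word.exe")
    link("hosts/Windows/bin/firefox", "org/mozilla.org/firefox")
    link("hosts/Linux/bin/firefox", "org/mozilla.org/firefox")
    # Assumption: a plain symlink plays the role of the hosts/self pseudo object.
    link("hosts/self", "hosts/" + current_host)
    link("users/johndoe/cmd/word", "hosts/self/bin/word")
    link("users/johndoe/cmd/firefox", "hosts/self/bin/firefox")


def is_usable(root, command):
    """Return True if the user's link for command resolves to an existing file."""
    # os.path.exists follows symlinks, so a dead link yields False.
    return os.path.exists(os.path.join(root, "users/johndoe/cmd", command))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        build_layout(root, "Linux")
        for command in ("firefox", "word"):
            print(command, "usable:", is_usable(root, command))
```

Building the same tree with "Windows" as the current host makes both links resolve, which is the whole point of routing through hosts/self.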

Okay, so we have basically solved our problem. But in the end it is still rather unsatisfying, and we've made some large assumptions. We've assumed that all the operating systems support symlinks and that they can all share the same file system, and there are probably a few other issues we've overlooked. What becomes clear is that while objectroot makes sense conceptually, in practice it's not so easy. It works well between multiple hosts of more or less the same operating system, ideally with each host being identical. But at that point, what was the point? Nevertheless, I think its overall concept still holds a great deal of potential.

Taking a Step Back

Let's take a step back for a moment and ask ourselves if these divisions really make the most practical sense. Are we making things more complicated than need be? Is there perhaps a simpler way? The classic approach, used to this very day, is to think in terms of the host first, under which one finds installed software and users. It is rare to see things like a shared usr, which only occurs in very special circumstances.

But how do we address objectroot's point of multiple OSs per machine and user? Well, might we instead turn the whole design on its head and say that what is primary is not the host, but rather the user? Let's consider that. Let's make the user primary, so that under each user's directory we find a variety of hosts, and under each host we find vendors with software. We would end up with something like this:

user1/
  Hosts/
    host1/
      Software/
        vendor1/
        ...
    host2/
      Software/
        vendor1/
          ...
user2/
  ...

In some ways this makes perfect sense. After all, isn't the choice of operating system a personal one? On the other hand, each user will very likely have duplicate hosts and software installs. And I mean crazily so, because there will be at least one operating system installed for each user! But it's not quite that problematic, because we can bring in groups to which different users can belong and thus have access to a shared set of hosts. After that, the only redundancy is in the software installs for each host (something we are already quite used to).

Another difficulty of supporting this, which for now I will only mention in passing, is that it would require a user to log in before the operating system is booted. That would require either some BIOS hackery or at least a mini boot host that can pass off complete control to another host.

A more pressing difficulty arises, however, when we consider how some of our other directories fit into this design. For instance, where does Library go? Remember, Library is primarily an index of the Programs content. Would it go under the user or under the host? It would seem that, insofar as any directory might contain links to a specific host, it would have to be located in that host's directory. That fact rather forces our hand, and we get something like:

user1/
  Data
  Files
  Hosts/
    host1/
      Desktop
      Library
      Programs/
        vendor1/
          ...
      Settings

It is questionable whether Data can really be independent of the host, but it is at least imaginable. To play it safe, it would probably be wise to link /user1/Hosts/host1/Data -> /user1/Data if it is possible. The same could be done with Files.
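
As a rough illustration of that "if it is possible" caveat, the following Python sketch tries to link a host's Data and Files back to the user-level directories and falls back to plain host-local directories when the link cannot be created. The function name share_with_host and the fallback behavior are my own assumptions for the sake of the example, not part of any existing design.

```python
import os


def share_with_host(user_root, host, names=("Data", "Files")):
    """Link Hosts/<host>/<name> -> <user_root>/<name>; return the names linked."""
    linked = []
    host_dir = os.path.join(user_root, "Hosts", host)
    os.makedirs(host_dir, exist_ok=True)
    for name in names:
        target = os.path.join(user_root, name)
        os.makedirs(target, exist_ok=True)
        link = os.path.join(host_dir, name)
        if os.path.lexists(link):
            # Something is already there; leave it untouched.
            continue
        try:
            os.symlink(target, link, target_is_directory=True)
            linked.append(name)
        except (OSError, NotImplementedError):
            # Assumed fallback: the host keeps its own private directory.
            os.makedirs(link, exist_ok=True)
    return linked
```

A host that cannot create the link simply ends up with its own Data directory, which is the "play it safe" case where Data is not shared.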

We could go one step further toward objectroot in our design by moving the Programs directory out of the host. This would bring back the compile-host dependency issue we mentioned above. However, we already know one way to mitigate the issue. Furthermore, we could adopt a scheme similar to the Nix package manager, which would allow software to be truly host independent.

user1/
  Data
  Files
  Hosts/
    host1/
      Desktop
      Library
      Settings
  Programs/
    vendor1/
      ...
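
One way to picture such a scheme is to derive each program's directory name from a hash of everything that distinguishes a build, including the host it was compiled for, so that builds for different hosts can sit side by side under one Programs directory. The Python sketch below is only loosely inspired by Nix and is not its actual algorithm; the function name program_dir, the choice of inputs, and the 12-character digest are all assumptions made for illustration.

```python
import hashlib


def program_dir(user, vendor, name, version, arch, os_name):
    """Return a Programs path that is unique per build configuration."""
    # Everything that makes one build differ from another goes into the key.
    key = "|".join([vendor, name, version, arch, os_name])
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:12]
    return f"/{user}/Programs/{vendor}/{digest}-{name}-{version}"


if __name__ == "__main__":
    # The vendor and program names here are placeholders.
    print(program_dir("user1", "vendor1", "editor", "1.0", "i386", "Linux"))
    print(program_dir("user1", "vendor1", "editor", "1.0", "arm", "Linux"))
```

A host's Library would then link only to the builds that match its own architecture and operating system, so an i386 host and an ARM host never step on each other's installs.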

The Take Away

The primary idea to take away from this consideration of objectroot is that the user, and by extension user groups, are paramount. Users should come first, not operating systems. What we have today is primarily an artifact of history. Once upon a time, a computer had one operating system and one operating system only. In addition, it was very expensive, so it only made sense to have multiple users share it. Consequently, the host naturally took precedence over the user. Today these factors are no longer necessities. Computers can be cheap and they can run multiple operating systems, even simultaneously. The future designs of our computing systems need to adjust accordingly.