Friday, 29 May 2009

Thoughts on Unix File System Structure

There is a school of thought that believes the Unix File System layout (or Filesystem Hierarchy Standard) is needlessly complicated. Some would go as far as to claim that it is fundamentally broken.

The recent post on OSNews by Thom Holwerda (and especially many of the comments) provides some good examples of the perceived problems and the oft-touted solutions.

There is a counter argument though, which goes something like this: "the Unix File System has been evolving for well over 30 years, isn't it strange that no-one noticed just how broken it is?"

Let's look at some of the arguments for and against the current system.

To make this easier, I'll pick up on some of the comments and see if they can be answered.

One quick point. In most cases I'm going to refer to Unix where this covers all Unix-based operating systems. If I say something specific to GNU/Linux or another operating system then I'll name it.

I'm just trying to explain that many people are put off diving further into the intricacies of the computer simply because of how daunting everything is. By making a system easy to use and understand not only at the very highest level (the UI) but also all the levels below that, we might enable more people to actually *understand* their computers better, which would be beneficial to *all* of us.

I am of the strong belief that there is no sane reason WHATSOEVER why we couldn't make computers easier to use and understand on ALL levels, and not just at the top - other than geek job security.

This is a good place to start. The layout is there to simplify maintenance of the system, not to complicate it. This has nothing to do with "Job Security" - it's more to do with making a usable, maintainable system. Having people dipping into the OS structure (whether it be Windows, Unix or MacOS) would create MORE work for the support geek, not less.

I'll give you a real life example. Around ten years ago I installed Linux for a friend. This was back in the days when installing it was still a bit of an art. At the time getting XWindows up and running was cause for celebration, and as for working sound, IN YOUR DREAMS BUDDY!

After an hour or so of fiddling around with the config files and xf86config (remember that?), and making sure that the correct packages were installed, I gave him a quick run-through of how the system worked. As he had come from a DOS/Windows background I'd configured everything to look pretty similar, showed him how the basic commands worked (use "ls" instead of "dir", "cd" works about the same, "rm" instead of "del" and so forth) and gave a quick guided tour of XWindows, X11Amp and the other installed goodies.

He collared me the next day: "I thought you said this Linux stuff was stable. I restarted it and now I can't get back in! It's shit!"

On further investigation what he had done became apparent. He'd had a wander through the file system, picked some files that "didn't look important" (including /etc/passwd in this case) and deleted them to free up a bit of space.

This doesn't just happen with Linux. A year or so later the Telecoms manager at my company phoned me because his PC wouldn't boot any more. He'd been trying to upgrade Internet Explorer to the latest version and had run out of space on his C:\ drive. He'd managed to find a folder that didn't look that important but "had a lot of stuff in" it that he "didn't need" and deleted it. His PC had crashed part-way through and now it wouldn't start any more. Sadly the junk folder he'd chosen was called C:\WINDOWS.

OK, so these are vaguely amusing war stories but what is my point? Well, my point is this: Users don't understand operating systems. I'd go as far as to say that they shouldn't actually have to. In the majority of cases the best thing that a user can do is to not mess with the underlying OS at all. Hiding as much of it as possible from them is A Very Good Thing Indeed.

As is traditional at this point, let's turn to our old friend the analogy. Many people drive cars. You sit down, turn the ignition, grab the steering wheel, press down the accelerator and off you go (yes, I know that there is a little more to it than that, but you get the general idea). Now, how many drivers could strip an engine? How about the gears - do you know how they work? Could you strip the gearbox down if you needed to and reassemble it in working condition afterwards?

The fact is that you don't need to know the mechanics of a car in order to drive one. Although anyone could pop open the bonnet and have a root around inside most people don't. If there is a problem, they take it to a garage.

Of course, some people DO tinker with their cars. They take a great deal of pride in being able to maintain and even customise their car. Is what they do easy? Of course not. Can anyone do it? No. Only an idiot would imagine that everyone can do everything, some degree of knowledge or learning may be required. This isn't meant to be an insult, but it is a fact.

To come back to the point of the analogy, is the Car any less useful because people don't understand how it works? Of course not.

This follows through to computers. Most people can use their computer quite happily with no idea of the underlying mechanisms. If they have problems then they can get in touch with their friendly neighbourhood technician. There is nothing stopping them learning about it if they want, just don't expect it to be easy. Just like a car, an operating system (and its component parts) is made to fulfil a function, not to be played around with.

pulseaudio is yet another layer on top of a broken audio foundation. Adding layers does not make things better, it just hides it a little longer.

Another good example of mistaken thinking. Abstraction can be a very good thing, and pulseaudio is an excellent example of this. Let's see how this works.

Just for the sake of argument, let's say we were trying to write a simple audio player on GNU/Linux. Now, how do you make it play sounds? At a very basic level you might write directly to /dev/dsp. So now your app plays sounds. It might lock the /dev/dsp device, but hey, this is just a simple example.

Let's up the stakes a bit and try and port the app to, say, Windows. What happened to /dev/dsp? It doesn't exist. How about MacOS X? Nope, not likely to work here either.
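To make the portability point concrete, here is a minimal probe (a sketch only, not a real player) showing that the "write straight to the device" trick depends entirely on the old OSS device existing on the machine you happen to be running on:

```shell
#!/bin/sh
# Sketch: writing raw PCM straight to /dev/dsp only works where the
# old OSS character device actually exists -- Linux/*BSD with OSS,
# and nowhere else.
if [ -c /dev/dsp ]; then
    status="/dev/dsp present: an app could write PCM straight to it"
else
    status="no /dev/dsp here: the direct-write approach cannot work"
fi
echo "$status"
```

Run it on Windows (under any Unix-like shell) or MacOS X and you'll get the second message every time, which is exactly the problem.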

How does this relate to abstraction? Well, if our audio app uses pulseaudio to play its sound it will now work on any platform that pulseaudio is supported on. For something like KDE, which is aiming to be a cross-platform environment, this makes coding your apps an awful lot easier.

In other words the GNU/Linux audio foundation isn't broken, it just doesn't exist on Windows.

Why bin? Because that's where your 'binaries' are, right? oh, except there are programs now that are text files run through an interpreter, so that doesn't really apply. A user's files aren't under /usr, my webserver by default isn't under /svr, it's under /var/www. /etc? Yeah, something about etcetera really says 'config files'. Seriously, who thought /etc was a good name?

This is the biggie. To answer this, it is necessary to look at and understand where Unix came from.

First, another quick experiment. Try and find a Unix reference manual from any time in the last twenty years or so. The command references are still likely to work. Any shell scripts (providing you are using the modern version of the same interpreter) are also likely to work without any changes.

In the earliest days of Unix space was at a premium. Shorter command names meant shorter scripts (and less space in the file allocation tables). This is why the "base" commands are only two characters long, for example, "ls", "cd", "rm", "du" and so forth. Although we don't have the same physical limits these days there are a lot of scripts out there that rely on the short versions of the file names. Keeping them the same means that people don't have to re-learn all their skills with each new release of the OS (something that Microsoft could learn from).
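As a quick illustration, a throwaway session built entirely from those short base commands still runs unchanged under any modern POSIX shell (the scratch directory is just to avoid touching any real files):

```shell
#!/bin/sh
# The classic two-letter commands, much as they appear in Unix manuals
# going back decades. This runs unchanged under any POSIX shell today.
dir=$(mktemp -d)            # scratch directory so we touch nothing real
cd "$dir" || exit 1
echo hello > f1
cp f1 f2                    # cp: copy a file
files=$(ls | tr '\n' ' ')   # ls: list the directory contents
rm f1 f2                    # rm: remove them again
cd / && rmdir "$dir"
echo "$files"
```

A script like this from a reference manual printed twenty years ago would behave identically, which is precisely the compatibility the short names help preserve.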

This also follows through to the file system layout (again, I'm going to simplify this a bit, but hopefully you'll get the idea).

At the root of our Unix system we find these main folders:

/ -- root
/bin -- binaries
/sbin -- system tools (e.g. fdisk, hdparm, fsck)
/lib -- libraries
/etc -- configuration files / scripts / anything that doesn't fit in the other directories

These are the most basic parts of your Unix system. These are the base commands and libraries that are required to give you a bootable system with access to a network.
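A quick sanity check of the base tier (note: on modern merged-/usr distributions some of these are symlinks into /usr, but the names still resolve at the root):

```shell
#!/bin/sh
# The base tier should be present at the root of any Unix-like system.
# On merged-/usr distros /bin, /sbin and /lib are symlinks, but the
# names still work; the loop simply skips anything that's missing.
found=""
for d in /bin /sbin /lib /etc; do
    [ -e "$d" ] && found="$found $d"
done
echo "base directories present:$found"
```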

Moving down the tree, we come to /usr.

/usr -- root
/usr/bin -- binaries
/usr/sbin -- system tools
/usr/lib -- libraries
/usr/etc -- configuration files / scripts / anything that doesn't fit in the other directories

This is the next level up. /usr is NOT where user files are stored, nor is it for user-built versions of applications. In this case "usr" stands for Unix System Resources (although originally this was the location of users' home directories). This is where the vendor-provided files live (the stuff that isn't part of the standard base files). For those who argue about everything being shoved into /usr by Ubuntu, RedHat or whoever, this is actually where it SHOULD go. Anything in here should have been provided by the distro maintainers. Between them, / and /usr should contain everything that your operating system needs. All applications, all configuration files, everything.

So what about /usr/local?

/usr/local -- root
/usr/local/bin -- binaries
/usr/local/sbin -- system tools
/usr/local/lib -- libraries
/usr/local/etc -- configuration files / scripts / anything that doesn't fit in the other directories

The /usr/local section of the file system is where any binaries that YOU create are stored, along with their configuration files. If you wanted to create a custom version of any application it should appear in here. This keeps your stuff separate from what the vendor provides, and in theory prevents you from permanently damaging the operating system. If you do manage to balls things up totally then deleting /usr/local should be enough to fix it again (as all the vendor provided files should still be intact and untouched).
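The shadow-then-recover behaviour is easy to demonstrate. This sketch simulates the vendor and local tiers inside a scratch directory (the `greet` command and its paths are entirely hypothetical) rather than touching the real file system:

```shell
#!/bin/sh
# Simulate a vendor-provided command and a custom build of the same
# command in /usr/local, using a scratch directory instead of the real /.
root=$(mktemp -d)
mkdir -p "$root/usr/bin" "$root/usr/local/bin"
printf '#!/bin/sh\necho vendor\n' > "$root/usr/bin/greet"
printf '#!/bin/sh\necho custom\n' > "$root/usr/local/bin/greet"
chmod +x "$root/usr/bin/greet" "$root/usr/local/bin/greet"

# /usr/local is searched first, so your build shadows the vendor's...
PATH="$root/usr/local/bin:$root/usr/bin:$PATH"
before=$(greet)

# ...and deleting /usr/local puts the vendor copy straight back.
rm -rf "$root/usr/local"
hash -r 2>/dev/null || true   # drop any cached command location
after=$(greet)

echo "with /usr/local: $before; after deleting it: $after"
rm -rf "$root"
```

The same PATH ordering (/usr/local/bin before /usr/bin before /bin) is why your custom build wins while it exists, and why wiping /usr/local restores stock behaviour.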

Another benefit of this approach is that once your root system is installed, the actual location of /usr becomes irrelevant. It could just as easily be on a shared network drive as on your local disk. If disk space is at a premium this can be a very effective way of working. It also means that every user has the same base system, because they are running the same apps from the same place.

OK, so that's not as useful for a single-user system, but it is still functionality that is used in some places. Just because YOU don't use it, doesn't mean it isn't useful.

Before anyone pipes up: yes, I am fully aware of /opt, /var, /tmp, /dev and so forth. All of these have their uses, but they are not relevant for the purposes of this discussion.

For a start, it has a gaping hole: he doesn't explain how you separate "System" from "Programs".

That's a big giant gaping hole in Linux, not in Thom's proposed filesystem layout. There's no such distrinction in a Linux distro, as there's no such thing as "the OS" vs "the user apps". Once someone gets the balls to stand up and say "this is the OS" and package it separately from "the user apps", the FHS will never change.

Actually GNU/Linux and Unix already do separate the OS from the User Apps. Remember our three levels? The bottom level is the OS - the bit you need to get a working system (/bin, /sbin, /lib and so on). Anything in /usr or above is a user app. Yes, you may see XFree86 as essential, but GNU/Linux can run without it. The same goes for Mozilla, Firefox and anything else in /usr or /usr/local.

* * *

The biggest problem there is with operating systems in general (not just GNU/Linux) is that for some reason people assume that it should all be easy. The desktop is easy to use therefore the underlying system should also be easy to use.

This is a very strange form of logic. Simplifying where necessary is a good thing, providing it doesn't impact on functionality or reliability. To go back to our car analogy there would be an argument for simplifying the innards of the car to make it much easier to understand and maintain for the common user. As a thought experiment, let's try it.

Let's start with the gearbox. Much too complicated and a potential point of failure; choosing a good default gear should do away with the need for that. How about a petrol engine? All that internal combustion malarkey sounds a bit dangerous to me. Running a vehicle based on small controlled explosions? Stuff that for a game of soldiers! Let's replace that with an electric one. But wait, maybe some people don't understand how an electric motor works either. So on second thoughts, let's replace it with a pedal-driven one.

Hmm, it's a bit heavy to pedal, so let's remove most of the metal bodywork; a canvas roof should suffice (plus it's easy to repair or replace).

Anti-lock brakes? They'd have to go as well; disc brakes are much simpler. Power steering? Not really needed now, drop that too. We can also leave out the airbags as we won't be going that fast now anyway.

So what are we left with? Basically a four-wheeled bicycle. Handy in some circumstances, easy to maintain but not necessarily as useful as what we started with.

Yes, this is taking it to the extremes, but that is the equivalent of what people are suggesting is done to the Unix file system. Let's remove everything that we don't understand the reasons for and just use what is left. Sadly what is left may be easy to understand, but its functionality would likely be crippled.

* * *

Does any of this mean that projects (like GoBoLinux, for example) shouldn't experiment and try different things? Of course not. Finding new (and potentially better) ways of doing things can end up as a benefit to everyone. But making changes just for the sake of being different is not so good.

Looking closer at GoBoLinux, it adds one hell of a lot of complexity to the system just to keep things working (have a look at all the symlinks and ask yourself why they are needed), whilst losing some of the benefits of the traditional Unix system.

Reading the GoBoLinux documentation gives plenty of information on why they have chosen their approach. It also reinforces some of the points made above, especially with regard to the three-tier approach of traditional Unix.

* * *

In the end, used properly, the current Unix File System Layout actually works rather well. Changing to something else isn't going to solve the problem of people ignoring a standard; all it will achieve is change for the sake of it, and chances are some benefits will be lost in the process.
