A Gentle Introduction to Linux Userspace

Linux is an operating system kernel: the part of the software on your computer that runs at the highest privilege level. It contains the drivers and defines the syscalls used by your regular day-to-day software, like games, web browsers, or services. There are other kernels, like Windows NT (the backbone of modern Windows), FreeBSD, OpenBSD, Minix, Plan 9, etc., but they are either important enough to get their own presentation (Windows), similar enough that you can probably follow along with this (*BSD), or obscure enough that no one really cares about them (Minix, Plan 9).

As a small disclaimer, the title to this document is a lie. There is no such thing as a gentle introduction to Linux, and anyone saying otherwise is lying. Linux is expert-friendly, as it is easy to work with if you know what you are doing. However, when you first start out, you are basically being thrown into a pool full of sharks. Have fun!

Introduction to Initialization

Systemd is an implementation of what is called the init system of a computer. It is responsible for starting up the userland (i.e. everything that runs unprivileged, outside the kernel, including user applications). Starting up the userland involves launching background processes (called daemons). There are other implementations like runit (used in Void), OpenRC (used in Gentoo/Alpine/TrueOS), and SysV init (basically obsolete), but they are all in limited use compared to systemd on Linux. The BSD operating systems use their own init systems, but their usage is somewhat similar to Linux.

The init system also starts not-so-background processes, like a display manager (e.g. gdm, sddm, lxdm, or lightdm), which is used to start/stop the user's desktop environment (e.g. gnome, kde, lxde, xfce, cinnamon, mate, or enlightenment) and provide a graphical interface so a user can log in to their desktop. The desktop environments all do things differently, and the fact that you can swap them out is why Linux is so customizable on the desktop. They are also responsible for basic tasks like starting graphical applications, which they also all do differently, so I will not be covering that in this guide. This guide is mainly about Linux on the server.

The way an init system generally works is by starting up a series of units, which are specific processes that the sysadmin wants to have running continuously on that system. These could be web servers, SSH servers, or anything else for that matter. One could also be the display manager unit, of which every user-facing graphical application is a child. Services are generally installed via the system's package manager, which varies greatly between distributions, but the package generally includes all the configuration needed to register the service with whatever init system the distro uses. As a result, the general workflow for installing a service onto a server is: install the package via the package manager, enable the service so that the init system starts it at boot, and start the service.

For example, on Debian, which uses the APT package manager and the systemd init system, the process to install a MariaDB server looks like this (note that the package is called mariadb-server, while the systemd unit it installs is called mariadb):

$ sudo apt install mariadb-server
$ sudo systemctl enable mariadb
$ sudo systemctl start mariadb

To verify that this was done correctly, try the following:

$ sudo systemctl status mariadb

After that, you would use a MySQL/MariaDB-compatible administration tool to configure the database for whatever you want to do with it. But these steps are all you really need to get the service running on your system.
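To give you an idea of what the init system actually reads when you enable a service, here is a minimal sketch of a systemd unit file. It is written from a shell script into a scratch directory so that it does not touch your real system. The service name hello-web and its binary path are made up for illustration; real units are normally shipped by the package and live under /usr/lib/systemd/system or /etc/systemd/system.

```shell
#!/bin/sh
# Sketch only: "hello-web" and /usr/local/bin/hello-web are hypothetical names.
# The unit is written to a temporary directory, not to a real systemd path.
unit_dir=$(mktemp -d)
unit_file="$unit_dir/hello-web.service"

cat > "$unit_file" <<'EOF'
[Unit]
Description=Hypothetical example web service
After=network.target

[Service]
ExecStart=/usr/local/bin/hello-web --port 8080
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

# Show what was written.
cat "$unit_file"
```

The [Install] section is what systemctl enable acts on: it says which target should pull the unit in at boot. The [Service] section is what systemctl start acts on.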

On other systems, installing a service uses different commands, but the steps remain the same. Here is the same process on OpenBSD, which uses doas instead of sudo, pkg_add instead of APT, and rcctl instead of systemctl:

$ doas pkg_add mariadb-server
$ doas mysql_install_db
$ doas rcctl enable mysqld
$ doas rcctl start mysqld

Note that on unix-esque systems, you can also find information about all of these commands in their manual pages: handy documentation about the things on your computer, provided by the distribution. Use apropos to search and man to read (I'm using OpenBSD's manual pages because they are very good). Use them to read about any nuances or additional options that I leave out in this document.
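As a rough sketch of that workflow, the snippet below searches the manual for a keyword and then points at how to read one of the results. The keyword ssh is just an example, and the check for apropos is there only because minimal installs (containers, for instance) often ship without manual pages.

```shell
#!/bin/sh
topic="ssh"   # example keyword; substitute whatever you are looking for

if command -v apropos >/dev/null 2>&1; then
    # List up to five manual pages whose name or description mentions the topic.
    # "|| true" keeps the script going if the manual index has not been built yet.
    apropos "$topic" 2>/dev/null | head -n 5 || true
    man_status="searched"
else
    echo "apropos is not installed on this system"
    man_status="unavailable"
fi

# To read a page from the results, pass its name (and optionally its
# section number) to man, for example: man 8 sshd
echo "next step: man <name>, for example: man 8 sshd"
```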

Very Common Services

Let's now go over some common services, so that when you find them on your system, you know what they are and what they do.

  • sshd: secure shell server. The secure shell is a service to enable remote access to the command line of the machine. The most common implementation is OpenSSH, but an embedded system (e.g. a router) may use dropbear instead. While you use the ssh command to remote in from your own machine, the remote machine must have the sshd service enabled.
  • httpd: a web server. This is a service that allows the server to respond to HTTP requests and serve HTML web pages to a client's web browser. The two most common implementations are apache and nginx. Web servers can also serve dynamic content, which means running a web application rather than just handing out static pages. It is common for these applications to connect to the web server using either CGI or FastCGI. If you see a service called php-fpm, this is PHP's FastCGI process manager, which runs PHP scripts on behalf of a running web server.
  • avahi: a service discovery daemon. Service discovery, often built on mDNS, lets a machine find services offered by other devices on the local area network. This is often used to find printers or wifi-enabled cameras, so that you can browse them wirelessly. Avahi is the only useful Linux implementation (currently), though Apple has an implementation for Windows and macOS called Bonjour, in case you were wondering what that was. The purpose of running the daemon is so that, in addition to being able to find other services, your server can also advertise services that other users can discover.
  • cups: the Common UNIX Printing System. This daemon was long maintained by Apple (the version shipped on most Linux distributions is now developed by OpenPrinting), and is the de-facto standard for finding and using printers on unix-esque operating systems. It relies on avahi to discover network printers, and is required if you want your system to print documents. There are a couple of different cups-related services: cups-browsed is useful for allowing your system to find printers, while cups-server is useful if you are trying to create a print server.
  • pulseaudio: long the de-facto standard for Linux audio management (newer distributions increasingly ship PipeWire instead). This is usually run as a user-specific daemon within the desktop environment, so you do not have to worry about it from a sysadmin perspective, but it can be used to create a networked audio server. Some alternatives include ALSA (which is part of the Linux kernel, and which pulse itself uses underneath), JACK (for DAWs), and sndio (for OpenBSD fans).
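If you want to check which of these are actually running on a machine, one way is to look at the process names the kernel exposes. The sketch below assumes a Linux system with /proc mounted, and the daemon names in the list (apache2 versus httpd, for example) are my guesses at common process names, which vary between distributions.

```shell
#!/bin/sh
# is_running NAME: succeed if any process reports NAME in /proc/<pid>/comm.
# Linux-specific; the kernel truncates these names to 15 characters.
is_running() {
    for f in /proc/[0-9]*/comm; do
        [ -r "$f" ] || continue
        # A process may exit between the check and the read, so hide errors.
        if [ "$(cat "$f" 2>/dev/null)" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# Assumed process names; adjust for your distribution.
for name in sshd nginx apache2 httpd avahi-daemon cupsd pulseaudio; do
    if is_running "$name"; then
        echo "$name: running"
    else
        echo "$name: not running"
    fi
done
```

On a systemd machine, systemctl status with the unit name gives you the same answer with more detail, but this version has no dependency on the init system.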

Common Networking Services

There are a lot of daemons that are components of systemd (anything with a systemd-* prefix) or that usually run alongside it on the desktop, such as colord or PolKit. Do not worry about those for now. There are also many daemons that are essential for getting your system connected to the internet. These are:

  • dhcpcd: a daemon to fetch dynamic IP addresses from a DHCP server (usually your router). You do not really need this if your network only uses static IP addresses.
  • wpa_supplicant: a daemon to connect your computer to wireless networks
  • NetworkManager: a user-friendly daemon to automatically manage network connections
  • ModemManager: a daemon to manage broadband connections, often for mobile devices

Less Common Services

Some other services that you may find are:

  • A DNS server: DNS is used to resolve domain names (e.g. dns.google) to IP addresses (e.g. 8.8.8.8). There are different servers used for slightly different purposes. NSD is used to create an authoritative domain name server, while Unbound is used to create a caching resolver for a LAN.
  • An SMTP server: mail servers are used for email communications, and it is not uncommon for people to host their own email servers. The classic implementation is sendmail, but Postfix, Exim, and OpenSMTPD are also common.
  • Samba: a collection of services for file and printer sharing using Microsoft's SMB protocol. (rsync is hosted by the same project, but it is a separate tool and not part of the Samba services.)
  • An NTP service: keeps the system clock synchronized with time servers on the network. There is a reference implementation, but it really sucks big time, so use OpenBSD's OpenNTPD (or maybe DragonFlyBSD's dntpd, though I do not know whether it has been ported to Linux).

Filesystem

Now that you have a general idea of the types of services that run on a Linux server, we should probably discuss how the filesystem of a Linux machine is laid out.

  • /bin, /usr/bin, /usr/local/bin - binary executables. For a while, /bin was for core system executables and /usr/bin was for everything else (generally installed with the package manager), but that distinction was sort of artificial, so on most modern distributions one is symlinked to the other. /usr/local/ is generally for applications that get installed outside the package management system.
  • /lib, /usr/lib, /usr/local/lib - binary libraries. The same distinctions apply here that apply to /bin, but this holds libraries of shared code for usage between multiple executables, generally suffixed with *.so or *.so.(version) (this would be *.dll on Windows).
  • /include, /usr/include, /usr/local/include - header files for development purposes.
  • /sbin, /usr/sbin, /usr/local/sbin, /libexec - system administration tools and init system stuff. Generally only accessible to the super user, though sometimes symlinked to */bin.
  • /share, /usr/share, /usr/local/share - application specific data files, including manual pages, default configurations, and game assets.
  • /etc - system configuration files. For all information pertaining to the setup of the system, as well as configuration files for system daemons
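  • /proc - a special filesystem that exposes every process running on the machine as a directory. This is Linux's API for process management, used by tools like top. It is inspired by Plan 9, and you should not count on it on the BSDs (OpenBSD has removed it, and FreeBSD does not mount it by default).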
  • /dev - a special filesystem that is mapped to all devices attached to the system. You can use it to see what hardware is recognized by the system, and what the system refers to it as. You can get an idea of what correlates to what by using dmesg.
  • /tmp - temporary files that are only used while the system is running, and are typically cleared at shutdown or on the next boot.
  • /home - contains the home directories of user accounts on the system, which contain whatever a user wants them to contain. The exception is the root user, whose home directory is /root.
  • /var - contains the persistent non-config related files used by system services, such as IPC sockets and databases
  • /mnt - a mount point for temporarily attached filesystems, such as external storage devices
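A quick way to see how your own system arranges these is to ask, for each path, whether it is a real directory, a symlink to somewhere else, or missing. The sketch below does that; the describe_dir helper is my own name, and the list of paths is just the ones discussed above.

```shell
#!/bin/sh
# describe_dir PATH: report whether PATH is a symlink, a directory, or absent.
describe_dir() {
    if [ -L "$1" ]; then
        # Check for a symlink first, since -d would follow it.
        echo "$1 -> $(readlink "$1")"
    elif [ -d "$1" ]; then
        echo "$1 (directory)"
    else
        echo "$1 (absent)"
    fi
}

for d in /bin /sbin /lib /usr/bin /usr/sbin /usr/lib /usr/local/bin \
         /etc /proc /dev /tmp /home /var /mnt; do
    describe_dir "$d"
done
```

On a distribution that has merged /bin into /usr/bin, the first few lines come back as symlinks rather than directories.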

As an important thing to note, development of C/C++ software on unix-based systems is easier than on Windows in part because of the lib and include directories, which provide a standard location for libraries and their headers. Take your time and explore what is put in each of these directories as you figure out your system.
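As a starting point for that exploration, here is a small sketch that counts the shared objects in each library directory. The list_shared_libs helper is a name I made up, and it simply matches on the *.so and *.so.(version) naming convention rather than inspecting the files.

```shell
#!/bin/sh
# list_shared_libs DIR: print files and symlinks under DIR that are named
# like shared objects. The trailing slash makes find descend into DIR even
# when DIR itself is a symlink (as /lib often is).
list_shared_libs() {
    find "$1/" \( -type f -o -type l \) \
        \( -name '*.so' -o -name '*.so.*' \) 2>/dev/null
}

for dir in /lib /usr/lib /usr/local/lib; do
    [ -d "$dir" ] || continue
    count=$(list_shared_libs "$dir" | wc -l)
    echo "$dir: $count shared objects"
done
```

If /lib is a symlink to /usr/lib on your system, the first two counts will be identical, because they are the same directory.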

Conclusion

Now that you understand what gets run on your system and where stuff is located, we can talk a little bit about how to actually do various things on your system. Setting up a service is done by changing configuration files in /etc and moving the files that get served around in /var, and if you get lost, there is probably a default configuration stored somewhere in /usr/share. The specifics for a service are probably covered by its manual page, accessible with apropos and man.