Linux Process State

Overview

The Linux kernel exposes a great deal of per-process information through the /proc filesystem. This is useful for tasks such as:

  • Forensics – What is this process that’s running and where did it come from? What files/resources does it have open?
  • Automation – Being able to programmatically retrieve information about processes without trying to parse the output of commands.

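We’ll look at a few common, useful objects under the proc filesystem. Reading these entries does no harm, so please do look around. (Some entries are also writable and change process settings, so stick to reading while exploring.)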

Under /proc, there is a subdirectory for each process currently running on the system, named after its process ID (PID). There is also a special symbolic link called self that points to the PID subdirectory of whichever process accesses it. In the example below, that is the ls command itself rather than your shell:

$ ls /proc/self/
arch_status         fd                 net            setgroups
attr                fdinfo             ns             smaps
autogroup           gid_map            numa_maps      smaps_rollup
auxv                io                 oom_adj        stack
cgroup              ksm_merging_pages  oom_score      stat
clear_refs          ksm_stat           oom_score_adj  statm
cmdline             limits             pagemap        status
comm                loginuid           patch_state    syscall
coredump_filter     map_files          personality    task
cpu_resctrl_groups  maps               projid_map     timens_offsets
cpuset              mem                root           timers
cwd                 mountinfo          sched          timerslack_ns
environ             mounts             schedstat      uid_map
exe                 mountstats         sessionid      wchan
$
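The same lookup can be done programmatically. The following is a minimal Python sketch, assuming a Linux system with /proc mounted; the function names resolve_proc_self and list_proc_entries are made up for illustration. It shows that /proc/self resolves to the PID of the process doing the lookup, and lists the entries for a given PID.

```python
import os


def resolve_proc_self():
    """Return the PID that /proc/self points to for the calling process."""
    # /proc/self is a symlink whose target is the caller's own PID, e.g. "12345".
    return int(os.readlink("/proc/self"))


def list_proc_entries(pid="self"):
    """Return the sorted names of the entries under /proc/<pid>."""
    return sorted(os.listdir(f"/proc/{pid}"))


if __name__ == "__main__":
    print("PID via /proc/self:", resolve_proc_self())
    print("PID via getpid():  ", os.getpid())
    print("First entries:     ", list_proc_entries()[:5])
```

The exact set of entries varies with the kernel version and configuration, so code should not assume that every file shown above is present.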

We’ll take a look at a few examples below, and potentially look at others in future articles.

What is that process?

Let’s say, for example, we see a running process called ./foo.

$ ps ax
 ...
 269792 pts/2    Sl     0:00 ./foo
 ...
$

The ./ prefix suggests that it was started from the directory containing the executable, but this output doesn’t tell us which directory that is, or the full path to the executable. There are multiple ways to find out, but the proc filesystem answers both questions quickly and in a way that is also automation-friendly.

Under the /proc/<pid> virtual directory (where <pid> is the process ID of the process we’re interested in), there are two symbolic links of interest:

  1. The exe symlink, which points to the binary used to instantiate this running process, and
  2. The cwd symlink, which points to the current working directory of the process.

Using the PID 269792 belonging to the process in the output above, we can see both of these files and what they point to:

$ ls -l /proc/269792/{exe,cwd}
lrwxrwxrwx 1 matthew matthew 0 Oct 25 15:38 /proc/269792/cwd -> /home/matthew/src/rust/foo
lrwxrwxrwx 1 matthew matthew 0 Oct 25 15:38 /proc/269792/exe -> /home/matthew/src/rust/foo/foo
$

From this, we can tell that the program foo was started from the binary /home/matthew/src/rust/foo/foo, and that its current working directory (the directory against which relative paths are resolved) is the directory that contains the binary, /home/matthew/src/rust/foo. Keep in mind that cwd reflects the directory the process is in right now; a process can change it after starting.
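For automation, the two symlinks can be read directly instead of parsing ls output. Here is a minimal Python sketch, assuming Linux with /proc mounted; process_origin is a hypothetical helper name, not part of any library.

```python
import os


def process_origin(pid):
    """Return the executable path and working directory of a process.

    Raises PermissionError if we are not allowed to inspect the process,
    and FileNotFoundError if it has exited or has no executable
    (kernel threads, for example).
    """
    base = f"/proc/{pid}"
    return {
        "exe": os.readlink(f"{base}/exe"),  # binary the process was started from
        "cwd": os.readlink(f"{base}/cwd"),  # its current working directory
    }


if __name__ == "__main__":
    # Inspect this Python process itself; substitute any PID you may read.
    info = process_origin(os.getpid())
    print("exe:", info["exe"])
    print("cwd:", info["cwd"])
```

If the binary has been deleted from disk since the process started, the kernel appends " (deleted)" to the exe target, which can be a useful signal in itself during an investigation.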

As a side benefit, the ls -l output above also shows the owner of the files. Entries under /proc/<pid> are normally owned by the user and group the process is running as, so the ownership tells us who is running it.
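The ownership can also be read programmatically. This is a small Python sketch under the same assumptions (Linux, /proc mounted); process_owner is a made-up helper name. It reads the owner of the /proc/<pid> directory and translates the numeric UID into a user name where one exists.

```python
import os
import pwd


def process_owner(pid):
    """Return (uid, username) for the owner of /proc/<pid>.

    username is None if the UID has no entry in the user database.
    """
    uid = os.stat(f"/proc/{pid}").st_uid
    try:
        name = pwd.getpwuid(uid).pw_name
    except KeyError:
        name = None
    return uid, name


if __name__ == "__main__":
    uid, name = process_owner(os.getpid())
    print(f"uid={uid} user={name}")
```

One caveat: for processes the kernel marks as non-dumpable (some setuid programs, for instance), these entries are owned by root instead, so treat the result as a strong hint rather than proof.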

Which files does it have open?

Now that we know where this process was started from, we likely want to know more about what it’s doing. One avenue of investigation is to see which files it has open. The fd subdirectory contains one symbolic link per open file descriptor, named after the descriptor number and pointing to the file that descriptor refers to. Using the above process as our example again:

$ ls -l /proc/269792/fd/
total 0
lrwx------ 1 matthew matthew 64 Oct 25 15:56 0 -> /dev/pts/2
lrwx------ 1 matthew matthew 64 Oct 25 15:56 1 -> /dev/pts/2
lrwx------ 1 matthew matthew 64 Oct 25 15:56 2 -> /dev/pts/2
lrwx------ 1 matthew matthew 64 Oct 25 15:56 3 -> /tmp/.data/log
$

File descriptors 0, 1, and 2 are by convention the stdin, stdout, and stderr of the process, and here all three point to the terminal /dev/pts/2 (the same pts/2 shown in the ps output earlier). We can also see that it has a file open called log in the /tmp/.data directory. That might be a good place to continue our search.
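The fd directory is just as easy to walk from a script. The following Python sketch, again assuming Linux with /proc mounted and using the hypothetical helper name open_files, builds a mapping from descriptor number to link target.

```python
import os


def open_files(pid):
    """Return {fd_number: target} for every open descriptor of a process."""
    fd_dir = f"/proc/{pid}/fd"
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except FileNotFoundError:
            # The descriptor was closed between listing and reading; skip it.
            continue
    return targets


if __name__ == "__main__":
    for fd, target in sorted(open_files(os.getpid()).items()):
        print(fd, "->", target)
```

Not every target is a path on disk: sockets and pipes show up as names such as socket:[...] or pipe:[...], so a script that wants only regular files should filter for targets that start with a slash.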

Summary

We’ve taken a quick look at a few objects under the /proc directory that can be used for forensics or in automation to find out information about currently running processes in real time. We’ll likely look into others in the future. Suggestions welcome.