UTLK - CHAPTER 3 Processes
10 Jun 2018Processes, Lightweight Processes, and Threads
In this book, the term “process” refers to an instance of a program in execution. From the kernel’s point of view, a process is an entity to which system resources (CPU time, memory, etc.) are allocated.
In older Unix systems, when a process is created, it shares the same code as its
parent (the text
segment) while having its own data
segment so that changes
it makes won’t affect the parent and vice versa. Modern Unix systems need to
support multithreaded applications, i.e., multiple execution flows of the same
program need to share some section of the data segment. In earlier Linux kernel
versions, multithreaded applications are implemented in User Mode, by the POSIX
threading libraries (pthread).
In newer Linux versions, Light Weight Processes (LWP) are used to create threads. LWPs are processes that can share some resources like open files, the address space, and so on. Whenever one of them modifies a shared resource, the others immediately see the change. A straight forward way to implement multithreaded applications is to associate a LWP to each user thread. Many pthread libraries do this.
Thread Groups. Linux uses thread groups to manage a set of LWPs, which
together act as a whole with regards to some system calls like getpid()
,
kill()
, _exit()
. We are going to discuss these at length later, but
basically these syscalls all use the thread group leader’s ID as process ID,
treating a group of LWPs as one process.
Process Descriptor
Process descriptor is the data structure that encodes everything the kernel needs to know about a process - whether is running on a CPU or blocked on some events, its scheduling priority, address space allocated, open files, etc. In Linux, the type task_struct is defined for process descriptors (figure 3-1). The six data structures on the right hand side refer to specific resources owned by the process, and will be covered in future chapters. This chapter focuses on two of them: process state and process parent/child relationships.
Figure 3-1. The Linux Process Descriptor
Process State
This field describes the state of the process - running, stopped etc. The current implementation defines it as an array of flags (bits), each describes a possible state. All states are mutually exclusive, hence at any given time, only one flag is set and others are all cleared. Possible states (source):
TASK_RUNNING
. The process is either running on a CPU or waiting to be executed.TASK_INTERRUPTIBLE
. The process is suspended (sleeping) until some condition is met - an hardware interrupt (e.g., disk controller), a lock is released, a signal is received, etc.TASK_UNINTERRUPTIBLE
. Similar to above, except that delivering a signal to the sleeping process leaves its state unchanged. That is, it can be waken up only when if certain condition is met, say, a hardware interrupt.TASK_STOPPED
. Process execution has been stopped; upon receivingSIGSTOP
,SIGTSTP
,SIGTTIN
orSIGTTOU
.TASK_TRACED
. Process execution has been stopped by a debugger - thinkptrace
.
Apart from these five states, the state
field (and the exit_state
field) can
have two additional states. As their name suggests, a process can reach on of
these states only when they are terminated:
EXIT_ZOMBIE
. Process is terminated, but its parent hasn’t calledwait4()
orwaitpid()
yet. This state is important because the kernel cannot discard a process’s descriptor until its parent has called await()
-like syscall as per a Unit design principle. This will be detailed later in this chapter.EXIT_DEAD
. The final state: the process is being removed from the system because its parent has calledwait4()
orwaitpid()
for it. The next phase of the process’ life is the descriptor being deleted. So you may ask, “why not just delete the descriptor already?”. Well, having this last state is useful to avoid race conditions due to other threads that executewait()
-like calls on the same process (see Chapter 5).
The value of the state field is usually set with a simple assignment, e.g.:
p->state = TASK_RUNNING;
The kernel also uses the set_task_state and set_current_state macros to set the state of a specified process or the current running process. These macros also ensure atomicity of these instructions. See Chapter 5 for more details on this.
Identifying a Process
Process descriptor pointers. Each execution context that can be scheduled
independently must have its own process descriptor; even lightweight processes
(threads) have their own task_struct
. The strict one-to-one mapping between
processes and process descriptors makes the address of the task_struct
structure a useful means for the kernel to identify processes. These addresses
are referred to as process descriptor pointers.
PIDs. On the other hand, users like to address a process by an integer
called the Process ID or PID. To satisfy this requirement the kernel stores
PID in the pid
field of the task_struct
. PIDs are numbered sequentially and
there is an upper limit (determined by /proc/sys/kernel/pid_max
). When the
kernel reaches this limit, it must start recycling the lower, unused PIDs. If
the rate of process/thread creation exceeds the rate of PID recycle, process
creation is going to fail with likely an EAGAIN (-11)
error. The default for
/proc/sys/kernel/pid_max
is usually 32K, and can be lifted up to 4,194,303 on
a 64-bit system.
pidmap_array. When recycling PID numbers, the kernel needs to know which of
the 32K numbers are currently assigned and which are unused. It uses a bit map
for that, called the pidmap_array. On 32-bit systems, a page contains 32K bits
(4K bytes), which is perfect for storage of a pidmap_array
. On 64-bit systems,
multiple pages might be needed for the pidmap_array
depending on the value of
/proc/sys/kernel/pid_max
. These pages are never released.
Threads. Each lightweight process has its own PID. To satisfy the user
application requirement that multi-threaded applications should have only one
PID, the kernel groups lightweight processes that belong to the same application
(thread group), and makes system calls like getpid()
return the PID of the
thread group leader. In each LWP’s task_struct
, the tgid
field stores the
PID of the thread group leader and for the thread group leader itself, that
value is equal to its pid
value.
PID to process descriptor pointer. Many system calls like kill()
use PID
to denote the affected process. It is thus important to make the process
descriptor lookup efficient. As we will see soon enough, Linux has a few tricks
to do that.