2. Interlude - Process API

Marwan
Author
1. The fork() System Call

Overview of fork()

The fork() system call is used to create a new process in Unix-like operating systems. When a process calls fork(), it creates a duplicate of itself. This newly created process is called the child process, while the original process that called fork() is the parent process.


Understanding the Code (p1.c)

Code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    printf("hello world (pid:%d)\n", (int) getpid());
    int rc = fork(); // Creating a new process
    if (rc < 0) {
        // fork() failed
        fprintf(stderr, "fork failed\n");
        exit(1);
    } else if (rc == 0) {
        // This block runs in the child process
        printf("hello, I am child (pid:%d)\n", (int) getpid());
    } else {
        // This block runs in the parent process
        printf("hello, I am parent of %d (pid:%d)\n", rc, (int) getpid());
    }
    return 0;
}

Step-by-Step Execution of p1.c
  1. The program starts execution in the main process (parent).

  2. It prints:

    hello world (pid:29146)  
    

    Here, 29146 is the process ID (PID) of the original process.

  3. The program then calls fork(), which creates a new child process.

  4. After fork(), both parent and child execute independently.

Return Values of fork()
  • In the parent process, fork() returns the PID of the child (e.g., 29147).
  • In the child process, fork() returns 0.
Two Possible Outputs

Since the OS scheduler decides which process runs first, we might see:

  1. Parent runs first:

    hello world (pid:29146)  
    hello, I am parent of 29147 (pid:29146)  
    hello, I am child (pid:29147)  
    
  2. Child runs first:

    hello world (pid:29146)  
    hello, I am child (pid:29147)  
    hello, I am parent of 29147 (pid:29146)  
    

    The order varies due to non-determinism in process scheduling.
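The "duplicate of itself" point is worth seeing concretely: after fork(), the child works on its own copy of the parent's memory. The following is a minimal sketch, not part of p1.c; the helper name parent_value_after_child_write() is made up for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's value of x after the child
 * has modified its own copy, or -1 if fork() fails. */
int parent_value_after_child_write(void) {
    int x = 42;
    pid_t rc = fork();
    if (rc < 0) {
        return -1;                /* fork failed */
    }
    if (rc == 0) {
        x = 100;                  /* changes only the child's copy */
        _exit(x == 100 ? 0 : 1);  /* leave without touching the parent */
    }
    waitpid(rc, NULL, 0);         /* let the child finish first */
    return x;                     /* the parent's copy is unchanged */
}
```

Calling this from main() and printing the result should show 42, since the assignment in the child never reaches the parent's address space.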

Understanding fork() with wait() (p2.c)

The wait() system call ensures that the parent waits for the child process to complete before continuing.

Code (p2.c)
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(int argc, char *argv[]) {
    printf("hello world (pid:%d)\n", (int) getpid());
    int rc = fork();
    if (rc < 0) {
        // fork failed
        fprintf(stderr, "fork failed\n");
        exit(1);
    } else if (rc == 0) {
        // Child process
        printf("hello, I am child (pid:%d)\n", (int) getpid());
    } else {
        // Parent process waits for the child to complete
        int rc_wait = wait(NULL);
        printf("hello, I am parent of %d (rc_wait:%d) (pid:%d)\n",
               rc, rc_wait, (int) getpid());
    }
    return 0;
}
How wait() Works
  1. Parent calls fork() and creates a child process.
  2. Parent executes wait(NULL), which pauses execution until the child finishes.
  3. The child process runs and prints its message.
  4. Once the child terminates, the parent resumes execution and prints its message.
Possible Output (p2.c)
hello world (pid:30000)  
hello, I am child (pid:30001)  
hello, I am parent of 30001 (rc_wait:30001) (pid:30000)  

Here, rc_wait:30001 confirms that the parent successfully waited for the child.


2. What is wait()?

Overview of wait()

  • wait() makes a parent process pause until a child process terminates.
  • The parent ensures that the child completes before continuing.
  • It also reaps the child, which prevents zombie processes (processes that have exited but still occupy a process-table entry until the parent collects their exit status).

Syntax of wait()

#include <sys/types.h>
#include <sys/wait.h>

pid_t wait(int *wstatus);
  • Returns the PID of the terminated child on success.
  • Blocks until a child exits; if a child has already exited, it returns immediately.
  • If the caller has no children, it returns -1 and sets errno to ECHILD.

Practical Example: Using wait()

Let's modify our fork() example to include wait(), ensuring the parent waits for the child to complete.

Code: wait() in Action

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main() {
    printf("Parent process (PID: %d)\n", getpid());
    int rc = fork();
    if (rc < 0) {
        fprintf(stderr, "Fork failed\n");
        exit(1);
    } else if (rc == 0) {
        // Child process
        printf("Child process (PID: %d)\n", getpid());
        sleep(2);
        printf("Child exiting...\n");
        exit(0);
    } else {
        // Parent process waits for child
        int status;
        int child_pid = wait(&status);
        printf("Parent: Child %d finished execution.\n", child_pid);
    }
    return 0;
}
Expected Output
Parent process (PID: 12345)  
Child process (PID: 12346)  
Child exiting...  
Parent: Child 12346 finished execution.  

Why does this happen?

  • The parent process calls wait().
  • The child prints its messages and then exits.
  • The parent resumes execution only after the child terminates.
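The example above passes &status to wait() but never inspects it. As a rough sketch of how the status could be decoded, the standard WIFEXITED and WEXITSTATUS macros from sys/wait.h extract the child's exit code; the helper name child_exit_code() is an assumption for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits with `code` and returns
 * the exit code the parent observes, or -1 on error or abnormal exit. */
int child_exit_code(int code) {
    pid_t rc = fork();
    if (rc < 0) return -1;          /* fork failed */
    if (rc == 0) _exit(code);       /* child: exit right away */

    int status;
    if (waitpid(rc, &status, 0) == -1) return -1;
    if (WIFEXITED(status)) {
        return WEXITSTATUS(status); /* low 8 bits of the child's exit code */
    }
    return -1;                      /* e.g. the child was killed by a signal */
}
```

Only the low 8 bits of the exit code are reported, so values should stay in the range 0 to 255.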

Why Does wait() Make Execution Deterministic?

Without wait(), both parent and child can run concurrently, meaning their outputs may appear in any order.

Example without wait():

Parent process (PID: 12345)  
Parent process finished  
Child process (PID: 12346)  
Child exiting...  

The parent might finish before the child, leaving the child orphaned (it is then adopted by init, which reaps it when it exits).

With wait(), the output is always:

Parent process (PID: 12345)  
Child process (PID: 12346)  
Child exiting...  
Parent: Child 12346 finished execution.  

✅ The child always finishes before the parent resumes execution.


Handling Multiple Child Processes with wait()

What if a parent creates multiple children? The wait() system call can handle this.

Code: Multiple Children with wait()
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main() {
    printf("Parent process (PID: %d)\n", getpid());
    for (int i = 0; i < 2; i++) {
        int rc = fork();
        if (rc < 0) {
            fprintf(stderr, "Fork failed\n");
            exit(1);
        } else if (rc == 0) {
            printf("Child %d (PID: %d) started\n", i + 1, getpid());
            sleep(2);
            printf("Child %d (PID: %d) exiting...\n", i + 1, getpid());
            exit(0); // Child must not continue the loop and fork again
        }
    }
    // Parent waits for both children
    int status;
    while (wait(&status) > 0) {
        printf("A child process finished.\n");
    }
    printf("Parent finishing execution.\n");
    return 0;
}
Possible Output (the children run concurrently, so their lines may appear in a different order)
Parent process (PID: 12345)  
Child 1 (PID: 12346) started  
Child 2 (PID: 12347) started  
Child 1 (PID: 12346) exiting...  
A child process finished.  
Child 2 (PID: 12347) exiting...  
A child process finished.  
Parent finishing execution.  

✅ The parent waits for all children to exit before finishing.


Using waitpid() for More Control

waitpid() allows selective waiting for a specific child, unlike wait(), which waits for any child.

Syntax of waitpid()
pid_t waitpid(pid_t pid, int *wstatus, int options);
  • pid > 0 → Waits for a specific process.
  • pid == -1 → Equivalent to wait(), waits for any child.
  • options → Additional behavior (e.g., WNOHANG to prevent blocking).
Example: Using waitpid()
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main() {
    printf("Parent process (PID: %d)\n", getpid());
    int rc = fork();
    if (rc < 0) {
        // Without this check, a failed fork() would reach waitpid(-1, ...)
        fprintf(stderr, "Fork failed\n");
        exit(1);
    } else if (rc == 0) {
        // Child process
        printf("Child process (PID: %d) running...\n", getpid());
        sleep(2);
        printf("Child exiting...\n");
        exit(0);
    } else {
        // Parent waits for specific child
        int status;
        waitpid(rc, &status, 0);
        printf("Parent: Child %d finished execution.\n", rc);
    }
    return 0;
}

waitpid() is useful when you need to wait for a specific child instead of any available child.
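With more than one child, waitpid() lets the parent choose the order in which it collects them. The sketch below is illustrative only: reap_second_child_first() is a hypothetical helper, and the exit codes 1 and 2 are arbitrary markers for the two children.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: starts child A (exit code 1) and child B (exit code 2),
 * then reaps B before A. Stores the codes in reaping order.
 * Returns 0 on success, -1 on error. */
int reap_second_child_first(int *first_code, int *second_code) {
    pid_t a = fork();
    if (a < 0) return -1;
    if (a == 0) _exit(1);          /* child A */

    pid_t b = fork();
    if (b < 0) { waitpid(a, NULL, 0); return -1; }
    if (b == 0) _exit(2);          /* child B */

    int status;
    /* Wait for B specifically, even if A has already exited. */
    if (waitpid(b, &status, 0) != b || !WIFEXITED(status)) return -1;
    *first_code = WEXITSTATUS(status);

    /* A's exit status is kept by the kernel until it is collected here. */
    if (waitpid(a, &status, 0) != a || !WIFEXITED(status)) return -1;
    *second_code = WEXITSTATUS(status);
    return 0;
}
```

A plain wait() would return whichever child happened to finish first, so the order of the two codes would not be under the parent's control.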


Common Issues and Debugging wait()

Issue 1: Parent Finishes Before Child

If the parent terminates before the child, the child becomes orphaned and is adopted by the init process.

Solution: Use wait()
int status; wait(&status);
Issue 2: Zombie Processes

A zombie process is a process that has exited but is still listed in the process table because the parent didn't collect its exit status.

Solution: Ensure Parent Calls wait()
while (wait(NULL) > 0);
Issue 3: wait() Blocks the Parent Forever

If the child never terminates, wait() can block the parent indefinitely.

Solution: Use waitpid() with WNOHANG Option
while (waitpid(-1, NULL, WNOHANG) > 0);

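With WNOHANG, waitpid() never blocks: it returns 0 if children exist but none has exited yet, so this loop reaps only the children that have already terminated and then moves on.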
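A parent that wants to keep working while a child runs can poll for it. The following is a sketch, with assumed timings (a 200 ms child and 20 ms between polls) and a hypothetical helper name poll_until_child_exits().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that sleeps about 200 ms, then polls
 * for it with WNOHANG. Returns the number of polls that found the child
 * still running, or -1 on error. */
int poll_until_child_exits(void) {
    pid_t rc = fork();
    if (rc < 0) return -1;
    if (rc == 0) {
        struct timespec nap = {0, 200 * 1000 * 1000};   /* 200 ms */
        nanosleep(&nap, NULL);
        _exit(0);
    }

    int still_running = 0;
    for (;;) {
        pid_t done = waitpid(rc, NULL, WNOHANG);
        if (done == rc) break;        /* child has exited and is now reaped */
        if (done == -1) return -1;    /* unexpected error */
        still_running++;              /* done == 0: child not finished yet */
        struct timespec pause = {0, 20 * 1000 * 1000};  /* 20 ms */
        nanosleep(&pause, NULL);      /* the parent could do other work here */
    }
    return still_running;
}
```

The trade-off is that polling burns some CPU time and adds latency of up to one polling interval before the parent notices the exit.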


3. The exec() System Call

Overview

The exec() family of system calls plays a crucial role in process management by replacing the current process image with a new program. Let's break it down thoroughly, exploring how it works, its variants, memory changes, and real-world applications.


1. What Exactly Does exec() Do?

  • The exec() system call loads a new program into the current process’s memory and begins executing it.
  • Unlike fork(), which creates a new process, exec() replaces the existing process without creating a new one.
  • The PID remains the same, but the code, stack, and heap are replaced with the new program's data.

Example Scenario:

  • Suppose we have a shell that allows users to run commands. When a user types ls, the shell:
    1. Creates a child process (fork()) to avoid replacing itself.
    2. Calls exec() in the child to replace the process image with /bin/ls.
    3. The ls program runs, displaying the directory contents.

2. Execution Flow of exec() in p3.c

Code Analysis:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/wait.h>

int main(int argc, char *argv[]) {
    printf("hello world (pid:%d)\n", (int) getpid());
    int rc = fork();
    if (rc < 0) {
        // Fork failed
        fprintf(stderr, "fork failed\n");
        exit(1);
    } else if (rc == 0) {
        // Child process
        printf("hello, I am child (pid:%d)\n", (int) getpid());
        // Prepare arguments for the new program
        char *myargs[3];
        myargs[0] = strdup("wc");   // Program name
        myargs[1] = strdup("p3.c"); // Argument (filename)
        myargs[2] = NULL;           // Terminate argument list
        execvp(myargs[0], myargs);  // Replaces child with wc program
        // This line will only execute if execvp fails
        printf("this shouldn't print out");
    } else {
        // Parent process waits for child
        int rc_wait = wait(NULL);
        printf("hello, I am parent of %d (rc_wait:%d) (pid:%d)\n",
               rc, rc_wait, (int) getpid());
    }
    return 0;
}
Step-by-Step Breakdown
| Step | Action |
|------|--------|
| 1 | Parent process starts execution and prints hello world with its PID. |
| 2 | fork() is called, creating a child process. |
| 3 | Parent gets child's PID as rc, child gets rc = 0. |
| 4 | Child process executes execvp("wc", myargs). |
| 5 | execvp() replaces the child's code with the wc program (word count). |
| 6 | Parent process waits for the child to finish execution. |
| 7 | Parent prints message after child completes. |

3. Memory Layout Changes

When a process calls exec(), its memory layout changes drastically:

Before exec() (Executing p3.c)
| Section | Contents |
|---------|----------|
| Text | Code of p3.c |
| Heap | Allocated memory using malloc(), etc. |
| Stack | Function calls, local variables |
| Data | Global variables |
After exec() (Executing wc p3.c)
| Section | Contents |
|---------|----------|
| Text | Code of /usr/bin/wc |
| Heap | Cleared and initialized for wc |
| Stack | Reset for wc |
| Data | Cleared and initialized for wc |

Thus, the old program is completely wiped out, and the process now runs as wc.


4. Variants of exec()

The exec() family includes multiple functions, each offering different capabilities:

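| Function | Arguments | Searches PATH? | Accepts a custom environment? |
|----------|-----------|----------------|-------------------------------|
| execl() | List of arguments | ❌ No | ❌ No |
| execv() | Array of arguments | ❌ No | ❌ No |
| execlp() | List of arguments | ✅ Yes | ❌ No |
| execvp() | Array of arguments | ✅ Yes | ❌ No |
| execle() | List + environment variables | ❌ No | ✅ Yes |
| execve() | Array + environment variables | ❌ No | ✅ Yes |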

Example Differences:

execl("/bin/ls", "ls", "-l", NULL);        // Using absolute path
execvp("ls", args);                        // Searches in PATH
execle("/bin/ls", "ls", "-l", NULL, env);  // Passes custom environment
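To make the environment-passing variant concrete, here is a minimal sketch that runs /bin/sh with an environment containing a single variable. The variable name GREETING and the helper run_with_custom_env() are assumptions made up for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs /bin/sh with an environment containing only
 * GREETING=hello. The shell exits with 0 if it can see that variable.
 * Returns the shell's exit code, or -1 on error. */
int run_with_custom_env(void) {
    pid_t rc = fork();
    if (rc < 0) return -1;
    if (rc == 0) {
        char *env[] = { "GREETING=hello", NULL };
        /* The argument list ends with a null pointer; env comes after it. */
        execle("/bin/sh", "sh", "-c", "test \"$GREETING\" = hello",
               (char *)NULL, env);
        _exit(127);   /* reached only if execle() failed */
    }
    int status;
    if (waitpid(rc, &status, 0) == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because env replaces the whole environment, the new program does not inherit PATH or any other variable from the caller unless it is listed explicitly.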

5. Handling Errors in exec()

Since exec() never returns on success, any code after exec() only runs if it fails.

  • If the file does not exist or is not executable, exec() returns -1, and errno is set.
Error Handling Example
if (execvp(myargs[0], myargs) == -1) {
    perror("exec failed");
    exit(1);
}
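In a fork-then-exec setting, the parent usually also wants to know that exec failed in the child. One possible sketch follows, assuming the common shell convention of exit code 127 for "command could not be run"; spawn_and_wait() is a hypothetical helper name.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] (looked up in PATH) and returns its
 * exit code. Returns 127 if exec failed in the child, -1 on fork/wait errors. */
int spawn_and_wait(char *const argv[]) {
    pid_t rc = fork();
    if (rc < 0) return -1;
    if (rc == 0) {
        execvp(argv[0], argv);
        perror("execvp");   /* only reached when execvp() fails */
        _exit(127);         /* report the failure through the exit code */
    }
    int status;
    if (waitpid(rc, &status, 0) == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child uses _exit() rather than exit() so that it does not flush stdio buffers it inherited from the parent.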

6. Real-World Applications of exec()

  • Shells (e.g., Bash, Zsh, Fish)
    • fork() creates a new process for each command.
    • exec() replaces the new process with the actual command.
  • Process Managers (e.g., systemd, init, supervisord)
    • Used to spawn and manage system processes.
  • Web Servers (e.g., Apache running CGI scripts)
    • The server forks a child and uses exec() to run the backend script (PHP, Python).
  • Language Runtimes (e.g., Python's subprocess module, Java's ProcessBuilder)
    • Provide fork()/exec()-style APIs for launching external programs.

4. Why? Motivating The API

Why Separate fork() and exec()?

The combination of fork() and exec() may seem odd at first, but it's a fundamental design choice in UNIX systems. The main reason is that it allows modifications to be made between process creation (fork()) and execution (exec()). This flexibility is crucial in building shell functionalities such as redirection (>, <), pipes (|), and environment modifications.

How Shells Work with fork() and exec()

A shell is simply a user program that:

  1. Reads user input (e.g., ls -l or wc p3.c > newfile.txt).
  2. Calls fork() to create a new process.
  3. In the child process, it calls exec() to replace the process with the desired program (e.g., ls, wc).
  4. The parent process calls wait() to wait for the child to finish execution before showing the next prompt.
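Steps 2 to 4 can be sketched as a single function that runs one command line. This is a simplified illustration, not how a real shell parses input: it splits only on whitespace, ignores quoting, and uses an arbitrary limit of 16 arguments. The name run_line() is hypothetical.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_ARGS 16   /* arbitrary limit for this sketch */

/* Hypothetical helper: splits `line` on whitespace (modifying it in place),
 * runs the command, and returns its exit code.
 * Returns 127 if exec fails, -1 on fork/wait errors, 0 for an empty line. */
int run_line(char *line) {
    char *argv[MAX_ARGS + 1];
    int argc = 0;
    for (char *tok = strtok(line, " \t\n");
         tok != NULL && argc < MAX_ARGS;
         tok = strtok(NULL, " \t\n")) {
        argv[argc++] = tok;
    }
    argv[argc] = NULL;                /* exec expects a NULL-terminated array */
    if (argc == 0) return 0;          /* nothing to run */

    pid_t rc = fork();                /* step 2: create a new process */
    if (rc < 0) return -1;
    if (rc == 0) {
        execvp(argv[0], argv);        /* step 3: become the requested program */
        _exit(127);                   /* exec failed */
    }
    int status;                       /* step 4: wait before the next prompt */
    if (waitpid(rc, &status, 0) == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell's main loop would read a line (step 1), pass it to a function like this, and then print the next prompt.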

Example: Output Redirection (> in Shells)

Consider the command:

wc p3.c > newfile.txt

This means "Run wc p3.c but store the output in newfile.txt instead of displaying it on the screen."

How the Shell Implements It
  1. The shell calls fork() to create a child process.
  2. In the child:
    • It closes STDOUT_FILENO (standard output, usually the screen).
    • It opens the file newfile.txt. The OS assigns it the first available file descriptor, which is STDOUT_FILENO (file descriptor 1).
    • It calls exec() to replace itself with wc p3.c.
  3. The output of wc is now redirected to newfile.txt.
Code Implementation (p4.c)
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/wait.h>

int main(int argc, char *argv[]) {
    int rc = fork();
    if (rc < 0) {
        // Fork failed
        fprintf(stderr, "fork failed\n");
        exit(1);
    } else if (rc == 0) {
        // Child: Redirect output to a file
        close(STDOUT_FILENO);
        open("./p4.output", O_CREAT | O_WRONLY | O_TRUNC, S_IRWXU);
        // Execute wc program
        char *myargs[3];
        myargs[0] = "wc";   // Program: wc (word count)
        myargs[1] = "p4.c"; // Argument: file to count words in
        myargs[2] = NULL;   // Null-terminated array
        execvp(myargs[0], myargs);
        // If exec succeeds, this won't print
        printf("This shouldn't print out\n");
    } else {
        // Parent: Wait for child to finish
        wait(NULL);
    }
    return 0;
}

How File Descriptors Enable Redirection

In UNIX-like systems, file descriptors are assigned as follows:

  • 0 → Standard Input (STDIN_FILENO)
  • 1 → Standard Output (STDOUT_FILENO)
  • 2 → Standard Error (STDERR_FILENO)

By closing STDOUT_FILENO and opening a file, the new file descriptor takes the place of standard output. As a result, any printf() or other write operations in the child process get redirected to the file instead of the terminal.
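The close-then-open trick depends on open() returning the lowest free descriptor. A commonly used alternative is dup2(), which copies a descriptor onto a chosen number explicitly. The sketch below is a variation on the idea, not the p4.c code; echo_to_file() is a hypothetical helper, and the exit codes 126 and 127 are arbitrary failure markers.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `echo hello` with standard output sent to `path`.
 * Returns the child's exit code, or -1 on fork/wait errors. */
int echo_to_file(const char *path) {
    pid_t rc = fork();
    if (rc < 0) return -1;
    if (rc == 0) {
        int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
        if (fd < 0) _exit(126);
        if (fd != STDOUT_FILENO) {
            /* Make descriptor 1 refer to the file, then drop the extra one. */
            if (dup2(fd, STDOUT_FILENO) < 0) _exit(126);
            close(fd);
        }
        execlp("echo", "echo", "hello", (char *)NULL);
        _exit(127);   /* reached only if execlp() failed */
    }
    int status;
    if (waitpid(rc, &status, 0) == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The redirection survives exec because open file descriptors are inherited by the new program; echo simply writes to descriptor 1 as usual.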


Example: Output of Running p4.c

prompt> ./p4  
prompt> cat p4.output  
32   109   846 p4.c  

Here, wc has counted 32 lines, 109 words, and 846 bytes in p4.c, but the result was written to p4.output, not the terminal.


Pipes (|) in Shells

Pipes (|) work similarly, but instead of redirecting output to a file, the shell connects the output of one process to the input of another. Example:

grep -o foo file | wc -l

This counts occurrences of "foo" in file.

How the Shell Implements Pipes
  1. The shell creates a pipe using pipe(), which provides two file descriptors:
    • One for writing (output of grep).
    • One for reading (input to wc).
  2. The shell fork()s two child processes:
    • The first child runs grep -o foo file, with its output redirected to the pipe.
    • The second child runs wc -l, reading from the pipe instead of the keyboard.
  3. The parent waits for both processes to complete.
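The three steps above can be sketched in C as follows. This is an illustrative outline rather than a real shell's implementation; run_pipeline() is a hypothetical helper that runs `left | right` and reports the exit code of the right-hand command.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `left | right` and returns the exit code of
 * `right`, or -1 on error. Both arrays must be NULL-terminated. */
int run_pipeline(char *const left[], char *const right[]) {
    int fds[2];                        /* fds[0]: read end, fds[1]: write end */
    if (pipe(fds) == -1) return -1;

    pid_t producer = fork();
    if (producer < 0) { close(fds[0]); close(fds[1]); return -1; }
    if (producer == 0) {
        dup2(fds[1], STDOUT_FILENO);   /* stdout now writes into the pipe */
        close(fds[0]);
        close(fds[1]);
        execvp(left[0], left);
        _exit(127);
    }

    pid_t consumer = fork();
    if (consumer < 0) {
        close(fds[0]);
        close(fds[1]);
        waitpid(producer, NULL, 0);
        return -1;
    }
    if (consumer == 0) {
        dup2(fds[0], STDIN_FILENO);    /* stdin now reads from the pipe */
        close(fds[0]);
        close(fds[1]);
        execvp(right[0], right);
        _exit(127);
    }

    /* The parent must close both ends; otherwise the consumer never sees
     * end-of-file, because a write end would still be open. */
    close(fds[0]);
    close(fds[1]);

    int status;
    waitpid(producer, NULL, 0);
    if (waitpid(consumer, &status, 0) == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, passing {"echo", "foo", NULL} and {"grep", "-q", "foo", NULL} mirrors `echo foo | grep -q foo`, where grep's exit code tells whether the pattern was found.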