2. Interlude - Process API
1. The fork() System Call
Overview of fork()
The fork() system call is used to create a new process in Unix-like operating systems. When a process calls fork(), it creates a duplicate of itself. This newly created process is called the child process, while the original process that called fork() is the parent process.
Understanding the Code (p1.c)
Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(int argc, char *argv[]) {
        printf("hello world (pid:%d)\n", (int) getpid());
        int rc = fork();  // Creating a new process
        if (rc < 0) {
            // fork() failed
            fprintf(stderr, "fork failed\n");
            exit(1);
        } else if (rc == 0) {
            // This block runs in the child process
            printf("hello, I am child (pid:%d)\n", (int) getpid());
        } else {
            // This block runs in the parent process
            printf("hello, I am parent of %d (pid:%d)\n", rc, (int) getpid());
        }
        return 0;
    }
Step-by-Step Execution of p1.c
- The program starts execution in the main process (parent).
- It prints:
  hello world (pid:29146)
  Here, 29146 is the process ID (PID) of the original process.
- The program then calls fork(), which creates a new child process.
- After fork(), both parent and child execute independently, each continuing from the point where fork() returns. The child does not start over at main(), which is why "hello world" is printed only once.
Return Values of fork()
- In the parent process, fork() returns the PID of the child (e.g., 29147).
- In the child process, fork() returns 0.
- If fork() fails, it returns -1 in the parent and no child is created.
Two Possible Outputs
Since the OS scheduler decides which process runs first, we might see:
- Parent runs first:
  hello world (pid:29146)
  hello, I am parent of 29147 (pid:29146)
  hello, I am child (pid:29147)
- Child runs first:
  hello world (pid:29146)
  hello, I am child (pid:29147)
  hello, I am parent of 29147 (pid:29146)
The order varies due to non-determinism in process scheduling.
Understanding fork() with wait() (p2.c)
The wait() system call ensures that the parent waits for the child process to complete before continuing.
Code (p2.c)
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(int argc, char *argv[]) {
        printf("hello world (pid:%d)\n", (int) getpid());
        int rc = fork();
        if (rc < 0) {
            // fork failed
            fprintf(stderr, "fork failed\n");
            exit(1);
        } else if (rc == 0) {
            // Child process
            printf("hello, I am child (pid:%d)\n", (int) getpid());
        } else {
            // Parent process waits for the child to complete
            int rc_wait = wait(NULL);
            printf("hello, I am parent of %d (rc_wait:%d) (pid:%d)\n",
                   rc, rc_wait, (int) getpid());
        }
        return 0;
    }
How wait() Works
- Parent calls fork() and creates a child process.
- Parent executes wait(NULL), which pauses execution until the child finishes.
- The child process runs and prints its message.
- Once the child terminates, the parent resumes execution and prints its message.
Possible Output (p2.c)
hello world (pid:30000)
hello, I am child (pid:30001)
hello, I am parent of 30001 (rc_wait:30001) (pid:30000)
Here, rc_wait:30001 matches the child's PID. wait() returns the PID of the child it waited for, which confirms that the parent successfully waited for this child.
2. What is wait()?
Overview of wait()
- wait() makes a parent process pause until a child process terminates.
- The parent ensures that the child completes before continuing.
- This prevents zombie processes (processes that have exited but still occupy a process-table entry because their exit status has not been collected).
Syntax of wait()
    #include <sys/types.h>
    #include <sys/wait.h>

    pid_t wait(int *wstatus);
- Returns: The PID of the terminated child.
- If a child has already exited, wait() returns immediately; otherwise it blocks until one does.
- If the calling process has no children, it returns -1 and sets errno to ECHILD.
Practical Example: Using wait()
Let's modify our fork() example to include wait(), ensuring the parent waits for the child to complete.
Code: wait() in Action
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        printf("Parent process (PID: %d)\n", (int) getpid());
        int rc = fork();
        if (rc < 0) {
            fprintf(stderr, "Fork failed\n");
            exit(1);
        } else if (rc == 0) {
            // Child process
            printf("Child process (PID: %d)\n", (int) getpid());
            sleep(2);
            printf("Child exiting...\n");
            exit(0);
        } else {
            // Parent process waits for child
            int status;
            int child_pid = wait(&status);
            printf("Parent: Child %d finished execution.\n", child_pid);
        }
        return 0;
    }
Expected Output
Parent process (PID: 12345)
Child process (PID: 12346)
Child exiting...
Parent: Child 12346 finished execution.
✅ Why does this happen?
- The parent process calls wait().
- The child prints its messages and then exits.
- The parent resumes execution only after the child terminates.
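Why Does wait() Make Execution Deterministic?
Without wait(), both parent and child can run concurrently, meaning their outputs may appear in any order.
Example without wait() (assuming the parent prints a final line instead of waiting):
Parent process (PID: 12345)
Parent process finished
Child process (PID: 12346)
Child exiting...
The parent might finish before the child. The child is then an orphan, not a zombie: it keeps running and is adopted by init. A zombie is the opposite case, where the child exits first and the parent never calls wait().
With wait(), the output is always: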
Parent process (PID: 12345)
Child process (PID: 12346)
Child exiting...
Parent: Child 12346 finished execution.
✅ The child always finishes before the parent resumes execution.
Handling Multiple Child Processes with wait()
What if a parent creates multiple children? Each call to wait() collects one terminated child, so the parent calls it in a loop until no children remain.
Code: Multiple Children with wait()
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        printf("Parent process (PID: %d)\n", (int) getpid());
        for (int i = 0; i < 2; i++) {
            int rc = fork();
            if (rc < 0) {
                fprintf(stderr, "Fork failed\n");
                exit(1);
            } else if (rc == 0) {
                printf("Child %d (PID: %d) started\n", i + 1, (int) getpid());
                sleep(2);
                printf("Child %d (PID: %d) exiting...\n", i + 1, (int) getpid());
                exit(0);  // child leaves here, so it never forks again
            }
        }
        // Parent waits for both children; wait() returns -1 once none are left
        int status;
        while (wait(&status) > 0) {
            printf("A child process finished.\n");
        }
        printf("Parent finishing execution.\n");
        return 0;
    }
Possible Output (the children run concurrently, so the order of their lines can vary)
Parent process (PID: 12345)
Child 1 (PID: 12346) started
Child 2 (PID: 12347) started
Child 1 (PID: 12346) exiting...
A child process finished.
Child 2 (PID: 12347) exiting...
A child process finished.
Parent finishing execution.
✅ The parent waits for all children to exit before finishing.
Using waitpid() for More Control
waitpid() allows selective waiting for a specific child, unlike wait(), which waits for any child.
Syntax of waitpid()
    pid_t waitpid(pid_t pid, int *wstatus, int options);
- pid > 0 → Waits for the child with that PID.
- pid == -1 → Waits for any child; with options set to 0 this is equivalent to wait().
- options → Additional behavior (e.g., WNOHANG to return immediately instead of blocking).
Example: Using waitpid()
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void) {
        printf("Parent process (PID: %d)\n", (int) getpid());
        int rc = fork();
        if (rc < 0) {
            fprintf(stderr, "Fork failed\n");
            exit(1);
        } else if (rc == 0) {
            // Child process
            printf("Child process (PID: %d) running...\n", (int) getpid());
            sleep(2);
            printf("Child exiting...\n");
            exit(0);
        } else {
            // Parent waits for specific child
            int status;
            waitpid(rc, &status, 0);
            printf("Parent: Child %d finished execution.\n", rc);
        }
        return 0;
    }
✅ waitpid() is useful when you need to wait for a specific child instead of any available child.
Common Issues and Debugging wait()
❌ Issue 1: Parent Finishes Before Child
If the parent terminates before the child, the child becomes orphaned and is adopted by the init process.
✅ Solution: Use wait()
    int status;
    wait(&status);
❌ Issue 2: Zombie Processes
A zombie process is a process that has exited but is still listed in the process table because the parent didn't collect its exit status.
✅ Solution: Ensure Parent Calls wait()
    while (wait(NULL) > 0)
        ;  // reap every child
❌ Issue 3: wait() Blocks the Parent Forever
If the child never terminates, wait() can block the parent indefinitely.
✅ Solution: Use waitpid() with WNOHANG Option
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;  // reap only children that have already exited
This makes waitpid() non-blocking: it returns 0 right away if no child has exited yet, so the parent can do other work and check again later.
3. The exec() System Call
Overview
The exec() family of calls replaces the current process image with a new program. The sections below cover how it works, its variants, what happens to memory, and where it is used in practice.
1. What Exactly Does exec() Do?
- The exec() system call loads a new program into the current process’s memory and begins executing it.
- Unlike fork(), which creates a new process, exec() replaces the existing process without creating a new one.
- The PID remains the same, but the code, stack, and heap are replaced with the new program's data.
Example Scenario:
- Suppose we have a shell that allows users to run commands. When a user types ls, the shell:
  - Creates a child process (fork()) to avoid replacing itself.
  - Calls exec() in the child to replace the process image with /bin/ls.
  - The ls program runs, displaying the directory contents.
2. Execution Flow of exec() in p3.c
Code Analysis:
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <string.h>
    #include <sys/wait.h>

    int main(int argc, char *argv[]) {
        printf("hello world (pid:%d)\n", (int) getpid());
        int rc = fork();
        if (rc < 0) {
            // Fork failed
            fprintf(stderr, "fork failed\n");
            exit(1);
        } else if (rc == 0) {
            // Child process
            printf("hello, I am child (pid:%d)\n", (int) getpid());
            // Prepare arguments for the new program
            char *myargs[3];
            myargs[0] = strdup("wc");    // Program name
            myargs[1] = strdup("p3.c");  // Argument (filename)
            myargs[2] = NULL;            // Terminate argument list
            execvp(myargs[0], myargs);   // Replaces child with wc program
            // This line will only execute if execvp fails
            printf("this shouldn't print out");
        } else {
            // Parent process waits for child
            int rc_wait = wait(NULL);
            printf("hello, I am parent of %d (rc_wait:%d) (pid:%d)\n",
                   rc, rc_wait, (int) getpid());
        }
        return 0;
    }
Step-by-Step Breakdown
Step | Action |
---|---|
1 | Parent process starts execution and prints hello world with its PID. |
2 | fork() is called, creating a child process. |
3 | Parent gets child’s PID as rc, child gets rc = 0. |
4 | Child process executes execvp("wc", myargs). |
5 | execvp() replaces the child’s code with the wc program (word count). |
6 | Parent process waits for the child to finish execution. |
7 | Parent prints message after child completes. |
3. Memory Layout Changes
When a process calls exec(), its memory layout changes drastically:
Before exec() (Executing p3.c)
Section | Contents |
---|---|
Text | Compiled code of p3.c |
Heap | Allocated memory using malloc(), etc. |
Stack | Function calls, local variables |
Data | Global variables |
After exec() (Executing wc p3.c)
Section | Contents |
---|---|
Text | Code of /usr/bin/wc |
Heap | Discarded; wc starts with a fresh heap |
Stack | Reset for wc |
Data | Replaced by the global variables of wc |
Thus, the old program's memory is completely wiped out, and the process now runs as wc. State that the kernel keeps for the process, such as the PID and open file descriptors (unless marked close-on-exec), carries over, which is what makes shell redirection possible.
4. Variants of exec()
There is no single function named exec(); the name refers to a family of functions, each offering different capabilities. On Linux, execve() is the underlying system call and the others are library wrappers around it.
Function | Arguments | Searches PATH? | Takes a custom environment? |
---|---|---|---|
execl() | List of arguments | ❌ No | ❌ No |
execv() | Array of arguments | ❌ No | ❌ No |
execlp() | List of arguments | ✅ Yes | ❌ No |
execvp() | Array of arguments | ✅ Yes | ❌ No |
execle() | List + environment variables | ❌ No | ✅ Yes |
execve() | Array + environment variables | ❌ No | ✅ Yes |
The variants without an environment argument pass the caller's current environment to the new program.
Example Differences:
    execl("/bin/ls", "ls", "-l", (char *) NULL);        // Using absolute path
    execvp("ls", args);                                 // Searches in PATH; args is a NULL-terminated array
    execle("/bin/ls", "ls", "-l", (char *) NULL, env);  // Passes custom environment
5. Handling Errors in exec()
Since exec() never returns on success, any code after exec() only runs if it fails.
- If the file does not exist or is not executable, exec() returns -1, and errno is set.
Error Handling Example
    if (execvp(myargs[0], myargs) == -1) {
        perror("exec failed");
        exit(1);
    }
6. Real-World Applications of exec()
- Shells (e.g., Bash, Zsh, Fish)
  - fork() creates a new process for each command.
  - exec() replaces the new process with the actual command.
- Process Managers (e.g., systemd, init, supervisord)
  - Used to spawn and manage system processes.
- Web Servers (e.g., Apache, Nginx, CGI scripts)
  - exec() helps execute backend scripts (PHP, Python), for example through CGI.
- Programming Languages (e.g., Python, Java)
  - Their standard libraries expose process-launching APIs that, on Unix-like systems, typically rely on exec() to run external programs.
4. Why? Motivating The API
Why Separate fork() and exec()?
The combination of fork() and exec() may seem odd at first, but it's a fundamental design choice in UNIX systems. The main reason is that it allows modifications to be made between process creation (fork()) and execution (exec()). This flexibility is crucial in building shell functionalities such as redirection (>, <), pipes (|), and environment modifications.
How Shells Work with fork() and exec()
A shell is simply a user program that:
- Reads user input (e.g., ls -l or wc p3.c > newfile.txt).
- Calls fork() to create a new process.
- In the child process, calls exec() to replace the process with the desired program (e.g., ls, wc).
- In the parent process, calls wait() to wait for the child to finish execution before showing the next prompt.
Example: Output Redirection (> in Shells)
Consider the command:
wc p3.c > newfile.txt
This means "Run wc p3.c but store the output in newfile.txt instead of displaying it on the screen."
How the Shell Implements It
- The shell calls fork() to create a child process.
- In the child:
  - It closes STDOUT_FILENO (standard output, usually the screen).
  - It opens the file newfile.txt. The OS assigns it the lowest unused file descriptor, which is now STDOUT_FILENO (file descriptor 1), assuming descriptor 0 is still open.
  - It calls exec() to replace itself with wc p3.c.
- The output of wc is now redirected to newfile.txt.
Code Implementation (p4.c)
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/stat.h>   // S_IRWXU
    #include <sys/wait.h>

    int main(int argc, char *argv[]) {
        int rc = fork();
        if (rc < 0) {
            // Fork failed
            fprintf(stderr, "fork failed\n");
            exit(1);
        } else if (rc == 0) {
            // Child: Redirect output to a file
            close(STDOUT_FILENO);
            open("./p4.output", O_CREAT | O_WRONLY | O_TRUNC, S_IRWXU);

            // Execute wc program
            char *myargs[3];
            myargs[0] = "wc";    // Program: wc (word count)
            myargs[1] = "p4.c";  // Argument: file to count words in
            myargs[2] = NULL;    // Null-terminated array
            execvp(myargs[0], myargs);

            // Reached only if execvp() fails
            printf("This shouldn't print out\n");
        } else {
            // Parent: Wait for child to finish
            wait(NULL);
        }
        return 0;
    }
How File Descriptors Enable Redirection
In UNIX-like systems, file descriptors are assigned as follows:
- 0 → Standard Input (STDIN_FILENO)
- 1 → Standard Output (STDOUT_FILENO)
- 2 → Standard Error (STDERR_FILENO)
By closing STDOUT_FILENO and opening a file, the new file descriptor takes the place of standard output. Open descriptors survive exec(), so any printf() or other write to standard output, whether by the child or by the program it executes, goes to the file instead of the terminal.
Example: Output of Running p4.c
prompt> ./p4
prompt> cat p4.output
32 109 846 p4.c
Here, wc has counted 32 lines, 109 words, and 846 bytes in p4.c, but the result was written to p4.output, not the terminal.
Pipes (|) in Shells
Pipes (|) work similarly, but instead of redirecting output to a file, the shell connects the output of one process to the input of another. Example:
grep -o foo file | wc -l
This counts occurrences of "foo" in file: grep -o prints each match on its own line, and wc -l counts those lines.
How the Shell Implements Pipes
- The shell creates a pipe using pipe(), which provides two file descriptors:
  - One for writing (output of grep).
  - One for reading (input to wc).
- The shell fork()s two child processes:
  - The first child runs grep -o foo file, with its output redirected to the pipe.
  - The second child runs wc -l, reading from the pipe instead of the keyboard.
- The parent closes its own copies of both pipe ends (otherwise wc never sees end-of-file) and waits for both processes to complete.