So from the view of the process executing the for loop:
the thread of execution is a continuous sequence of instructions:
1,2,4,3,2,4,3,........
However, from the point of view of the CPU:
this thread of execution is interleaved with the threads of execution of all the other processes running at the time,
each sequence being interrupted at each context switch.
Indeed this thread may even be interrupted within this process if the process receives a signal for which it has a handler installed.
The obvious extension of this model of computation is for a single process to have multiple threads of execution.
Overview of Threads
Threads are a programming tool for creating parallelism in programs
Processes are also a programming tool for creating parallelism in programs.
One spawns a process to complete a task in parallel and uses inter-process communication (e.g. pipes, FIFOs or mapped memory) to synchronise & communicate.
One spawns a thread to complete a task in parallel and uses shared memory to synchronise & communicate.
The main difference is that threads share much much more!
Threading on Linux has a bit of a history
A thread consists of the information necessary to represent an execution context within a program.
Everything within a process is shared among the threads of a process including:
The text of the executable
The global and heap memory
The stacks (each thread has its own stack, but all of them live in the same address space)
The file descriptors
You can see that mutual exclusion is going to be important when developing multi-threaded software!
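As a minimal sketch of why mutual exclusion matters, the code below has several threads increment one global counter, with a pthread mutex around each update. The names `counter`, `increment`, `run_counter_demo` and the constants are illustrative assumptions, not part of the course material.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4        /* illustrative values */
#define INCREMENTS  100000

static long counter = 0;     /* global: visible to every thread */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread adds INCREMENTS to the shared counter, one step at a time. */
static void *increment(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;                       /* critical section */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs NUM_THREADS incrementing threads and returns the final count,
   or -1 if a thread could not be created or joined. */
long run_counter_demo(void)
{
    pthread_t tids[NUM_THREADS];
    counter = 0;
    for (int i = 0; i < NUM_THREADS; i++)
        if (pthread_create(&tids[i], NULL, increment, NULL) != 0)
            return -1;
    for (int i = 0; i < NUM_THREADS; i++)
        if (pthread_join(tids[i], NULL) != 0)
            return -1;
    return counter;
}
```

Without the lock, `counter++` is a read-modify-write on shared memory, so updates from different threads can be lost and the final count may come out lower than expected.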
The UNIX standard for threading is called pthreads.
It's not the greatest standard in the world, but it is the only one.
It is only quite recently that Linux has had an implementation that comes close to conforming:
The new [Native POSIX Thread Library for Linux (NPTL)](http://people.redhat.com/drepper/nptl-design.pdf)
This library was introduced around the transition from the 2.4 to the 2.5 kernel, depending on the Linux distribution.
The older library, LinuxThreads, is not nearly so conformant.
#include <sys/types.h> /* for pid_t */
#include <unistd.h> /* for getppid */
#include <pthread.h> /* for threads */
#include <stdio.h> /* for fprintf, printf */
#include <stdlib.h> /* for exit */
#include <string.h> /* for strerror */

static pid_t mainThread;
static pid_t childThread;

/* Record the parent pid as seen from the new thread. */
void *child(void *arg){
    childThread = getppid();
    return arg;
}

int main(void){
    pthread_t tid;
    int err;
    mainThread = getppid();
    /* pthread functions return an error code; they do not set errno */
    if((err = pthread_create(&tid, NULL, child, NULL)) != 0){
        fprintf(stderr, "Could not create thread: %s\n", strerror(err));
        exit(EXIT_FAILURE);
    } else {
        pthread_join(tid, NULL);
    }
    /* Under LinuxThreads the new thread's parent is the manager thread */
    if(childThread == mainThread){
        printf("Your system is using NPTL\n");
    } else {
        printf("Your system is using LinuxThreads\n");
    }
    exit(EXIT_SUCCESS);
}
Processes vs Threads
Once upon a time, Unix processes were heavy-weight processes:
Their creation was relatively expensive (requires copying memory and executables, allocating pids, scheduling etc).
The parallelism was reasonably coarse grained: a process executes for a while and is then suspended by a context switch; its next turn depends on how many other processes are running.
However, on Linux, fork is implemented using copy-on-write pages.
The only penalty incurred by fork is the time and memory required to duplicate the parent's page tables, and to create a unique task structure for the child.
In fact both fork and thread creation in LinuxThreads and NPTL are implemented by the same Linux system call clone, differing only in the flags used!
A thread can be thought of as a type of light-weight process:
their creation is cheap (perhaps cheaper than fork);
the parallelism could be extremely fine (compared to a fork);
A thread is simply a particular sequence of execution steps through a single process's executable.
So two threads within a process may execute in parallel, and switching between them is cheaper than a switch between processes because the address space stays the same.
Separate threads within a process share the same memory (& the same executable),
Hence two threads that use strtok, for example, can cause havoc.
This hints at the idea of thread-safe functions.
A thread is small & cheap, consisting merely of:
a program counter
register set
stack
state
Consequently creating them is cheap.
On the other hand, since they share memory & executable,
programming with them can be a subtle business, similar to programming with signals.
A typical problem well solved by threads is monitoring several file descriptors.
There are two classic approaches in UNIX:
Method A: Use non-blocking I/O
By setting a flag using fcntl we can force read to return immediately if there is nothing to read (it returns -1 and sets errno to be EAGAIN)
Thus it is a simple matter to continuously monitor several file descriptors by repeatedly checking them, i.e. busy waiting.
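Method A might be sketched as follows. `set_nonblocking` and `try_read` are hypothetical helper names, and the return value -2 for "nothing to read yet" is a convention chosen here, not something `read` itself defines.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Turns on O_NONBLOCK for fd, keeping its other status flags.
   Returns 0 on success, -1 on failure (errno is set by fcntl). */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* One polling attempt on a non-blocking descriptor.
   Returns the number of bytes read (> 0), 0 at end of file,
   -1 on a real error, and -2 if nothing is available yet. */
ssize_t try_read(int fd, char *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

A monitoring loop would call `try_read` on each descriptor in turn; the cost is that the loop burns CPU time even when no descriptor has data.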
Method B:
Use the select operation.
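A rough sketch of Method B for two descriptors is given below. The function name `read_from_either` and its `which` output parameter are assumptions made for this example.

```c
#include <sys/select.h>
#include <unistd.h>

/* Blocks until fd_a or fd_b has data, then reads from the ready one
   (fd_a is preferred if both are ready). Returns the result of read(),
   or -1 if select() fails. *which is set to the descriptor that was read. */
ssize_t read_from_either(int fd_a, int fd_b, char *buf, size_t len, int *which)
{
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd_a, &readfds);
    FD_SET(fd_b, &readfds);
    int maxfd = fd_a > fd_b ? fd_a : fd_b;

    /* NULL timeout: sleep until at least one descriptor is readable */
    if (select(maxfd + 1, &readfds, NULL, NULL, NULL) == -1)
        return -1;
    *which = FD_ISSET(fd_a, &readfds) ? fd_a : fd_b;
    return read(*which, buf, len);
}
```

Unlike busy waiting, the process sleeps inside `select` until the kernel reports that one of the descriptors is ready.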
Other possibilities
Method C:
Use Asynchronous I/O with SIGPOLL
This is not part of POSIX though.
Method D:
Use POSIX.1b Asynchronous I/O
Method E:
Use poll to block
poll was not in the original POSIX standard, although it has been included since POSIX.1-2001
Another classic use would be a multi-threaded server
int f(int i){
return i + 2;
}
int g(int i){
return f(i) + 2;
}
Reentrant Code
Basic principles:
Reentrant code may not hold any static or global data
Reentrant code may not modify its own code
Reentrant code may not call non-reentrant code
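The first principle can be illustrated with a pair of functions that do the same job; both names are hypothetical. The first keeps its state in a static variable, the second makes the caller own the state, in the same spirit as `strtok` versus `strtok_r`.

```c
/* NOT reentrant: the running total lives in a static variable, so two
   overlapping calls read and write the same storage. */
int next_id_static(void)
{
    static int last = 0;
    return ++last;
}

/* Reentrant: the state is held by the caller and passed in, so each
   caller (thread or handler) can keep its own independent sequence. */
int next_id_r(int *last)
{
    return ++*last;
}
```

The reentrant version touches only its argument and locals, so it is safe to interrupt and re-enter provided each caller passes its own `last`.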
Final points - it is possible for a function to be thread-safe but not reentrant:
int myFunction(){
    mutex_lock();
    /* ... function body code ... */
    mutex_unlock();
}
If this function is used as part of a reentrant interrupt handler and a second call is made while the first is within the mutual exclusion block, a deadlock may occur in the second call.
This situation will arise if a second interrupt occurs while the first is executing the mutual exclusion block.
Multi-threaded Examples
In this example we will consider the two programs:
The first monitors a file in the main thread.
The second creates a separate thread to do the same task.
The file descriptor is being passed into the library procedure via a pointer to void, the C way of circumventing the type checker.
Why do we need to circumvent the type checker?
Remember C does not have any form of polymorphism!
Keep this in mind when we look at the pthread primitives.
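To make the pointer-to-void idiom concrete, here is a sketch in which a thread receives its inputs and returns its result through a single `void *`. The `struct range`, `sum_range` and `sum_in_thread` names are invented for this example and are not the course's `process_fd` library.

```c
#include <pthread.h>

/* Hypothetical argument block: pthread_create passes exactly one void *. */
struct range {
    int lo, hi;     /* inputs */
    long sum;       /* output, filled in by the thread */
};

static void *sum_range(void *arg)
{
    struct range *r = arg;      /* void * converts back without a cast in C */
    r->sum = 0;
    for (int i = r->lo; i <= r->hi; i++)
        r->sum += i;
    return r;                   /* handed back through pthread_join */
}

/* Sums lo..hi in a separate thread.
   Returns 0 on success, otherwise the pthread error code. */
int sum_in_thread(int lo, int hi, long *out)
{
    struct range r = { lo, hi, 0 };
    pthread_t tid;
    void *result;
    int err = pthread_create(&tid, NULL, sum_range, &r);
    if (err != 0)
        return err;
    err = pthread_join(tid, &result);
    if (err != 0)
        return err;
    *out = ((struct range *)result)->sum;
    return 0;
}
```

The compiler cannot check that `sum_range` is given a `struct range *`; passing any other pointer compiles cleanly and fails at run time, which is the price of circumventing the type checker.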
To do this we first present a simple library of procedures.
exampl01.c - A simple single-threaded function call
#include "process.h"
#include "headers.h"
int main(){
    int fd_1 = open("makefile", O_RDONLY);
    if(fd_1 == -1)
        perror("Could not open makefile");
    else
        process_fd(&fd_1);
    return 0;
}
exampl02.c - A simple multi-threaded function call
#include "process.h"
#include "headers.h"
int main(){
    pthread_t tid;
    int fd = open("makefile", O_RDONLY);
    if(fd == -1)
        perror("Could not open makefile");
    else if(pthread_create(&tid, NULL, process_fd, (void *)&fd) != 0)
        perror("Could not create thread");
    else if(pthread_join(tid, NULL) != 0)
        perror("Could not join with thread");
    else exit(EXIT_SUCCESS);
    exit(EXIT_FAILURE);
}
Compile all multithreaded code with the -pthread flag.
It affects include files in three ways:
The include files define prototypes for the reentrant variants of some of the standard library functions, e.g. strtok_r() as a reentrant equivalent to strtok().
When compiling with -pthread, some <stdio.h> functions are no longer defined as macros, e.g. getc() and putc(). In a multithreaded program, such functions require additional locking, which the macros don't perform, so we must call functions instead.
More importantly, <errno.h> redefines errno so that errno refers to the thread-specific errno location, rather than the global errno variable.
Why would a shared errno be a bad idea? - Race conditions: Which thread is producing the error?
The following operations are the basic POSIX thread management operations.
They require #include <pthread.h>, and the program must be compiled and linked using gcc with the flag: -pthread
They are:
pthread_create
pthread_join
pthread_exit
Most pthread routines return zero on success, and a non-zero error code on failure.
Some functions (pthread_exit for example) do not return a value.
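The three operations and the error-code convention can be put together in one short sketch. `worker` and `run_worker` are hypothetical names; the point is that failures are reported through the return value and decoded with `strerror`, not read from errno.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static void *worker(void *arg)
{
    /* pthread_exit ends only this thread; its argument becomes the
       value that pthread_join delivers. It does not return. */
    pthread_exit(arg);
}

/* Creates a thread, waits for it, and returns the value it exited with
   (NULL if anything failed). */
void *run_worker(void *value)
{
    pthread_t tid;
    void *result = NULL;
    int err = pthread_create(&tid, NULL, worker, value);
    if (err != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return NULL;
    }
    err = pthread_join(tid, &result);
    if (err != 0) {
        fprintf(stderr, "pthread_join: %s\n", strerror(err));
        return NULL;
    }
    return result;
}
```

Returning `arg` from `worker` would have the same effect as calling `pthread_exit(arg)`; the explicit call is used here only to show the third operation.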
Summary
Overview of Threads
Processes vs Threads
Thread Creation
Reentrant Code
Multi-threaded Examples
Questions?
Reading
Chapter 11 and Section 10.6 of Chapter 10 from Advanced Programming in the UNIX Environment