Sunday, October 25, 2009

Section 11.4.  Thread Creation










11.4. Thread Creation


The traditional UNIX process model supports only one thread of control per process. Conceptually, this is the same as a threads-based model whereby each process is made up of only one thread. With pthreads, when a program runs, it also starts out as a single process with a single thread of control. As the program runs, its behavior should be indistinguishable from the traditional process, until it creates more threads of control. Additional threads can be created by calling the pthread_create function.



[View full width]

#include <pthread.h>

int pthread_create(pthread_t *restrict tidp,
const pthread_attr_t *restrict
attr,
void *(*start_rtn)(void), void
*restrict arg);



Returns: 0 if OK, error number on failure



The memory location pointed to by tidp is set to the thread ID of the newly created thread when pthread_create returns successfully. The attr argument is used to customize various thread attributes. We'll cover thread attributes in Section 12.3, but for now, we'll set this to NULL to create a thread with the default attributes.


The newly created thread starts running at the address of the start_rtn function. This function takes a single argument, arg, which is a typeless pointer. If you need to pass more than one argument to the start_rtn function, then you need to store them in a structure and pass the address of the structure in arg.


When a thread is created, there is no guarantee which runs first: the newly created thread or the calling thread. The newly created thread has access to the process address space and inherits the calling thread's floating-point environment and signal mask; however, the set of pending signals for the thread is cleared.


Note that the pthread functions usually return an error code when they fail. They don't set errno like the other POSIX functions. The per thread copy of errno is provided only for compatibility with existing functions that use it. With threads, it is cleaner to return the error code from the function, thereby restricting the scope of the error to the function that caused it, instead of relying on some global state that is changed as a side effect of the function.



Example

Although there is no portable way to print the thread ID, we can write a small test program that does, to gain some insight into how threads work. The program in Figure 11.2 creates one thread and prints the process and thread IDs of the new thread and the initial thread.


This example has two oddities, necessary to handle races between the main thread and the new thread. (We'll learn better ways to deal with these later in this chapter.) The first is the need to sleep in the main thread. If it doesn't sleep, the main thread might exit, thereby terminating the entire process before the new thread gets a chance to run. This behavior is dependent on the operating system's threads implementation and scheduling algorithms.


The second oddity is that the new thread obtains its thread ID by calling pthread_self instead of reading it out of shared memory or receiving it as an argument to its thread-start routine. Recall that pthread_create will return the thread ID of the newly created thread through the first parameter (tidp). In our example, the main thread stores this in ntid, but the new thread can't safely use it. If the new thread runs before the main thread returns from calling pthread_create, then the new thread will see the uninitialized contents of ntid instead of the thread ID.


Running the program in Figure 11.2 on Solaris gives us




$ ./a.out
main thread: pid 7225 tid 1 (0x1)
new thread: pid 7225 tid 4 (0x4)



As we expect, both threads have the same process ID, but different thread IDs. Running the program in Figure 11.2 on FreeBSD gives us




$ ./a.out
main thread: pid 14954 tid 134529024 (0x804c000)
new thread: pid 14954 tid 134530048 (0x804c400)



As we expect, both threads have the same process ID. If we look at the thread IDs as decimal integers, the values look strange, but if we look at them in hexadecimal, they make more sense. As we noted earlier, FreeBSD uses a pointer to the thread data structure for its thread ID.


We would expect Mac OS X to be similar to FreeBSD; however, the thread ID for the main thread is from a different address range than the thread IDs for threads created with pthread_create:




$ ./a.out
main thread: pid 779 tid 2684396012 (0xa000a1ec)
new thread: pid 779 tid 25166336 (0x1800200)



Running the same program on Linux gives us slightly different results:




$ ./a.out
new thread: pid 6628 tid 1026 (0x402)
main thread: pid 6626 tid 1024 (0x400)



The Linux thread IDs look more reasonable, but the process IDs don't match. This is an artifact of the Linux threads implementation, where the clone system call is used to implement pthread_create. The clone system call creates a child process that can share a configurable amount of its parent's execution context, such as file descriptors and memory.


Note also that the output from the main thread appears before the output from the thread we create, except on Linux. This illustrates that we can't make any assumptions about how threads will be scheduled.




Figure 11.2. Printing thread IDs



#include "apue.h"
#include <pthread.h>

pthread_t ntid;

void
printids(const char *s)
{
pid_t pid;
pthread_t tid;

pid = getpid();
tid = pthread_self();
printf("%s pid %u tid %u (0x%x)\n", s, (unsigned int)pid,
(unsigned int)tid, (unsigned int)tid);
}

void *
thr_fn(void *arg)
{
printids("new thread: ");
return((void *)0);
}

int
main(void)
{
int err;

err = pthread_create(&ntid, NULL, thr_fn, NULL);
if (err != 0)
err_quit("can't create thread: %s\n", strerror(err));
printids("main thread:");
sleep(1);
exit(0);
}












    No comments: