Threading Models
• Master/slave (aka Boss/worker, Dispatcher/worker)
• Work crew (aka Peer)
• Pipeline
© 2007 Stevens Institute of Technology
1
Master/Slave Model
• 1 master thread receives input
• N slave threads do work based on input
Two variations on the idea: on-demand slaves, thread pool
2
On Demand Master/Slave
Master ...
1. Reads input(s)
2. Creates slave, passing “work request” as argument (i.e., pthread_create)
3. Perhaps waits for slave to terminate (i.e., pthread_join)
Slave ...
1. Does work specified by its argument (including any output)
2. Terminates
Adv: simple to program
Disadv: latency/overhead of create/join on every input
3
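A minimal sketch of the on-demand variation, assuming a hypothetical work request (square an integer) and a hypothetical helper name handle_on_demand. It only illustrates the create/join pair per input; a real master would read its requests from input.

```c
#include <pthread.h>

/* Hypothetical work request: square a number. */
struct request {
    int input;
    int output;
};

/* Slave: does the work named by its argument, then terminates. */
static void *slave(void *arg)
{
    struct request *req = arg;
    req->output = req->input * req->input;
    return NULL;
}

/* Master side: one create/join pair per input (the latency cost). */
int handle_on_demand(int input)
{
    struct request req = { input, 0 };
    pthread_t tid;

    if (pthread_create(&tid, NULL, slave, &req) != 0)
        return -1;               /* could not create a slave */
    pthread_join(tid, NULL);     /* wait for the slave to terminate */
    return req.output;
}
```

Calling handle_on_demand(7) creates one slave, waits for it, and returns its result. Passing a pointer to a stack variable is safe here only because the master joins before returning.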
[Illustration: on-demand master/slave. Labels: Input, read, master, assign work, slaves, write, Output]
4
Thread Pool Master/Slave
Typically master is 1st thread, i.e., starts from main()
Master creates pool of N slaves before reading any input
Master ...
1. Reads input(s)
2. Writes “work request” record to queue
Slave ...
1. Reads work request record from queue
2. Does work, including any output
Adv: no create/join latency
Disadv: queue shared among N+1 threads
Disadv: how to choose N?
5
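A rough sketch of the thread-pool variation, assuming a fixed-size array as the shared queue, a pool size of 4, and squaring as the work. The names run_pool, NSLAVES and QSIZE are illustrative. It uses a Pthreads mutex and condition variable to protect the queue; those primitives are covered later in these notes.

```c
#include <pthread.h>

#define NSLAVES 4    /* assumed pool size N */
#define QSIZE   64   /* assumed capacity; this sketch never overfills it */

static int queue[QSIZE];
static int head, tail;     /* remove at head, add at tail */
static int done;           /* set when the master has no more input */
static long total;         /* combined "output" of all slaves */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

static void *slave(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (head == tail && !done)
            pthread_cond_wait(&nonempty, &lock);
        if (head == tail) {             /* queue drained, master done */
            pthread_mutex_unlock(&lock);
            return NULL;
        }
        int req = queue[head++];        /* read work request record */
        pthread_mutex_unlock(&lock);

        long result = (long)req * req;  /* do the work outside the lock */

        pthread_mutex_lock(&lock);
        total += result;
        pthread_mutex_unlock(&lock);
    }
}

/* Master: creates the pool first, then queues requests 1..nrequests. */
long run_pool(int nrequests)
{
    pthread_t tid[NSLAVES];

    if (nrequests < 0 || nrequests > QSIZE)
        return -1;
    head = tail = done = 0;
    total = 0;
    for (int i = 0; i < NSLAVES; i++)
        pthread_create(&tid[i], NULL, slave, NULL);

    for (int i = 1; i <= nrequests; i++) {
        pthread_mutex_lock(&lock);
        queue[tail++] = i;              /* write work request record */
        pthread_cond_signal(&nonempty);
        pthread_mutex_unlock(&lock);
    }

    pthread_mutex_lock(&lock);
    done = 1;
    pthread_cond_broadcast(&nonempty);  /* let every idle slave exit */
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < NSLAVES; i++)
        pthread_join(tid[i], NULL);
    return total;
}
```

The queue is touched by all N+1 threads, so every access is under the lock; the work itself runs unlocked.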
[Illustration: thread pool master/slave. Labels: Input, read, master, assign work, shared queue, get work, slaves, write, Output]
6
Workcrew Model
All/most threads are “peers”
Two tiny variations on the idea:
• 1st thread creates others then waits for them to terminate, then it terminates
• 1st thread creates others then calls “do work” function just like others
In both cases, program ends when all threads terminate
7
[Illustration: workcrew. Labels: Input, read, peers, write, Output]
8
Workcrew Input/Output
Each worker thread does its own input & output
Adv: no shared queue of work requests as in master/slave
Disadv: not all programs can be easily structured so each thread can do I/O independently
Workcrew model works best with “data parallel” problems: problem can be divided into predictable number of disjoint sub-problems
Workcrew model well suited for multiprocessor
9
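A small sketch of a data-parallel workcrew, assuming the problem is summing an in-memory array split into disjoint slices. The array contents, NPEERS, and crew_sum are illustrative assumptions. It follows the second variation: the first thread creates the others and then does its own share.

```c
#include <pthread.h>

#define NPEERS 4
#define NDATA  1000

static int data[NDATA];

/* Each peer gets its own disjoint slice and its own result slot. */
struct slice {
    int begin, end;
    long sum;
};

static void *do_work(void *arg)
{
    struct slice *s = arg;
    s->sum = 0;
    for (int i = s->begin; i < s->end; i++)
        s->sum += data[i];
    return NULL;
}

long crew_sum(void)
{
    struct slice part[NPEERS];
    pthread_t tid[NPEERS];
    int chunk = NDATA / NPEERS;

    for (int i = 0; i < NDATA; i++)
        data[i] = i + 1;                      /* sample input: 1..NDATA */

    for (int p = 0; p < NPEERS; p++) {
        part[p].begin = p * chunk;
        part[p].end = (p == NPEERS - 1) ? NDATA : (p + 1) * chunk;
    }

    /* 1st thread creates the others, then calls do_work like the others. */
    for (int p = 1; p < NPEERS; p++)
        pthread_create(&tid[p], NULL, do_work, &part[p]);
    do_work(&part[0]);

    long sum = part[0].sum;
    for (int p = 1; p < NPEERS; p++) {
        pthread_join(tid[p], NULL);
        sum += part[p].sum;
    }
    return sum;
}
```

No queue and no lock are needed because no two peers touch the same slice or the same result slot.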
Multi-thread Input/Output, I
In master/slave, master thread is only thread reading input (e.g., read(2))
In workcrew, each thread has separate input
What about shared I/O, i.e., multiple threads read/write common descriptor?
Results are “implementation dependent” ⇒ AVOID!
10
Multi-thread Input/Output, II
UNIX I/O is byte-oriented, not record-oriented
Example: threads A and B each attempt a 50-byte read on the same descriptor
Thread A may get first 30 bytes, thread B following 20 bytes ... both awaiting remaining bytes (20 for thread A, 30 for thread B) that will never come
Similar story for write
Moral: each thread should have its own descriptors OR access to shared descriptor should be serialized
11
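One way to serialize access to a shared descriptor, sketched under the assumption that records have a fixed length known to the caller. The helper name read_record and the single global lock fd_lock are hypothetical. The lock is held until the whole record has arrived, so no other thread can take bytes from the middle of it.

```c
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

/* One lock for the shared descriptor (assumed: all readers use it). */
static pthread_mutex_t fd_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Read one len-byte record from fd.
 * Returns len on success, a short count if end-of-file came first,
 * or -1 on a read error.
 */
ssize_t read_record(int fd, void *buf, size_t len)
{
    size_t got = 0;
    ssize_t n = 0;

    pthread_mutex_lock(&fd_lock);
    while (got < len) {
        n = read(fd, (char *)buf + got, len - got);
        if (n < 0 && errno == EINTR)
            continue;                /* interrupted: retry */
        if (n <= 0)
            break;                   /* error or end-of-file */
        got += (size_t)n;            /* partial read: keep going */
    }
    pthread_mutex_unlock(&fd_lock);

    return (n < 0) ? -1 : (ssize_t)got;
}
```

The cost is that a thread blocked inside read() keeps every other reader out, which is the price of treating a byte stream as records.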
Pipeline Model
Like an assembly line: predictable number of “stages” to problem
Thread dedicated to particular stage
1 or more threads per stage
Adv: easy programming for problems with natural pipeline structure
Disadv: only certain problems fit this pattern
Disadv: need shared queue between EACH pair of stages
Disadv: pipeline no faster than slowest stage
12
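A sketch of a three-stage pipeline, assuming one thread per stage and a one-slot queue between each pair of adjacent stages. The stage functions, the struct chan type, and the END_OF_INPUT sentinel are all illustrative assumptions. Stage 1 generates 1..n, stage 2 doubles each item, stage 3 sums them.

```c
#include <pthread.h>

/* One-slot queue shared by a pair of adjacent stages. */
struct chan {
    pthread_mutex_t lock;
    pthread_cond_t changed;
    int full;
    int value;
};

#define CHAN_INIT { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }
#define END_OF_INPUT (-1)   /* assumed sentinel; real items are >= 0 */

static void chan_put(struct chan *c, int v)
{
    pthread_mutex_lock(&c->lock);
    while (c->full)
        pthread_cond_wait(&c->changed, &c->lock);
    c->value = v;
    c->full = 1;
    pthread_cond_broadcast(&c->changed);
    pthread_mutex_unlock(&c->lock);
}

static int chan_get(struct chan *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->full)
        pthread_cond_wait(&c->changed, &c->lock);
    int v = c->value;
    c->full = 0;
    pthread_cond_broadcast(&c->changed);
    pthread_mutex_unlock(&c->lock);
    return v;
}

static struct chan q1 = CHAN_INIT, q2 = CHAN_INIT;

/* Stage 2: doubles each item from q1 and passes it on to q2. */
static void *stage_double(void *arg)
{
    (void)arg;
    for (;;) {
        int v = chan_get(&q1);
        chan_put(&q2, v == END_OF_INPUT ? v : 2 * v);
        if (v == END_OF_INPUT)
            return NULL;
    }
}

/* Stage 3: sums the items arriving on q2. */
static void *stage_sum(void *arg)
{
    long *sum = arg;
    int v;
    while ((v = chan_get(&q2)) != END_OF_INPUT)
        *sum += v;
    return NULL;
}

long run_pipeline(int n)
{
    pthread_t t2, t3;
    long sum = 0;

    pthread_create(&t2, NULL, stage_double, NULL);
    pthread_create(&t3, NULL, stage_sum, &sum);
    for (int i = 1; i <= n; i++)     /* stage 1: generate the input */
        chan_put(&q1, i);
    chan_put(&q1, END_OF_INPUT);
    pthread_join(t2, NULL);
    pthread_join(t3, NULL);
    return sum;
}
```

With one-slot queues a slow stage stalls its neighbors almost immediately, which makes the "no faster than slowest stage" point easy to observe.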
Producer/Consumer Problem, I
Work queue shared among threads is an example of the producer/consumer problem
Aka “bounded buffer problem”
1 or more threads write to one end of queue/buffer
1 or more threads “read from” the other end of queue/buffer, but it’s really a write operation on the queue because items are removed
13
Producer/Consumer Problem, II
Potential concurrency: if queue is long enough, can add & delete simultaneously
Potential problems:
• Multiple simultaneous adds
• Multiple simultaneous deletes
• Simultaneous add & delete of “short” queue, e.g., linked list pointers might go awry
14
Semaphore Example II: Producer/Consumer
    /* used for critical section */
    int mutex = 1;
    /* these are "condition variables" */
    int spaces = buffer_size;
    int items = 0;

    Producer:
        P(spaces);
        P(mutex);
        <add to buffer>
        V(mutex);
        V(items);

    Consumer:
        P(items);
        P(mutex);
        <remove from buffer>
        V(mutex);
        V(spaces);
15
Semaphore Example II: Producer/Consumer
Condition variables indicate the state (i.e., condition) of the buffer
In this case, programmer must think hard about where to place P, V!
16
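The semaphore solution can be sketched with POSIX unnamed semaphores, where sem_wait plays the role of P and sem_post the role of V. The buffer size, item count, and the driver run_semaphores are assumptions for illustration. Unnamed semaphores via sem_init are assumed to be available, which holds on Linux but not on every platform.

```c
#include <pthread.h>
#include <semaphore.h>

#define BUFFER_SIZE 4
#define NITEMS 100

static int buffer[BUFFER_SIZE];
static int in, out;                 /* indexes into the circular buffer */
static sem_t mutex, spaces, items;

static void *producer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= NITEMS; i++) {
        sem_wait(&spaces);          /* P(spaces) */
        sem_wait(&mutex);           /* P(mutex)  */
        buffer[in] = i;             /* add to buffer */
        in = (in + 1) % BUFFER_SIZE;
        sem_post(&mutex);           /* V(mutex)  */
        sem_post(&items);           /* V(items)  */
    }
    return NULL;
}

static void *consumer(void *arg)
{
    long *sum = arg;
    for (int i = 0; i < NITEMS; i++) {
        sem_wait(&items);           /* P(items)  */
        sem_wait(&mutex);           /* P(mutex)  */
        *sum += buffer[out];        /* remove from buffer */
        out = (out + 1) % BUFFER_SIZE;
        sem_post(&mutex);           /* V(mutex)  */
        sem_post(&spaces);          /* V(spaces) */
    }
    return NULL;
}

long run_semaphores(void)
{
    pthread_t p, c;
    long sum = 0;

    in = out = 0;
    sem_init(&mutex, 0, 1);             /* critical section */
    sem_init(&spaces, 0, BUFFER_SIZE);  /* free slots */
    sem_init(&items, 0, 0);             /* filled slots */

    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, &sum);
    pthread_join(p, NULL);
    pthread_join(c, NULL);

    sem_destroy(&mutex);
    sem_destroy(&spaces);
    sem_destroy(&items);
    return sum;
}
```

The order of the two sem_wait calls in each loop matches the slide: the counting semaphore first, the mutex second.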
Producer/Consumer Notes, I
Q: Why is mutex needed?
A: To ensure at most 1 producer or consumer is accessing the buffer at a time
Q: Why are TWO condition variables needed?
A: A semaphore operation affects the value of only one condition variable, but there are two kinds of sleepers: producers sleeping while the buffer is full and consumers sleeping while it is empty. So need one semaphore variable (spaces) that wakes producers when a space appears AND ANOTHER (items) that wakes consumers when an item appears.
17
Producer/Consumer Notes, II
Q: Why are P/V for spaces & items placed OUTSIDE mutex?
A: Danger of deadlock otherwise. Consider what could happen if code were instead
    Producer: P(mutex); P(spaces); ...
    Consumer: P(mutex); P(items); ...
18
Producer/Consumer Notes, III
This could happen if buffer were empty:
1. Consumer executes P(mutex): passes thru
2. Consumer executes P(items): blocks waiting for Producer’s V(items)
3. Producer executes P(mutex): blocks, because Consumer still holds mutex
Result: deadlock. Producer can never reach V(items), so Consumer never wakes up to release mutex
19
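The three steps can be replayed without actually hanging by using sem_trywait, which reports that a P would block instead of blocking. This is a single-threaded illustration, not a real deadlock; the function name wrong_order_deadlocks is hypothetical, and POSIX unnamed semaphores are assumed to be available.

```c
#include <errno.h>
#include <semaphore.h>

/* Returns 1 if, with an empty buffer, both sides end up unable to proceed. */
int wrong_order_deadlocks(void)
{
    sem_t mutex, items;
    int consumer_stuck, producer_stuck;

    sem_init(&mutex, 0, 1);
    sem_init(&items, 0, 0);      /* buffer is empty */

    /* 1. Consumer executes P(mutex): passes through. */
    sem_wait(&mutex);

    /* 2. Consumer executes P(items): would block (no items yet). */
    consumer_stuck = (sem_trywait(&items) == -1 && errno == EAGAIN);

    /* 3. Producer executes P(mutex): would block (consumer holds it). */
    producer_stuck = (sem_trywait(&mutex) == -1 && errno == EAGAIN);

    sem_destroy(&mutex);
    sem_destroy(&items);
    return consumer_stuck && producer_stuck;
}
```

Both flags come out true: the consumer waits for an item while holding mutex, and the only thread that could supply an item cannot get mutex.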
What’s Going On, I
First step:
    /* producer */
    P(mutex);
    <add to buffer>
    V(mutex);
    <signal to waiters for items: there is another item>

    /* consumer */
    P(mutex);
    <remove from buffer>
    V(mutex);
    <signal to waiters for spaces: there is another space>
20
What’s Going On, II
Second step:
    /* producer */
    <if (full) then pause until non-full>
    P(mutex);
    <add to buffer>
    V(mutex);
    <signal to waiters for items: there is another item>

    /* consumer */
    <if (empty) then pause until non-empty>
    P(mutex);
    <remove from buffer>
    V(mutex);
    <signal to waiters for spaces: there is another space>
Compare to monitor solution on page 29
21
Semaphore Summary
Semaphores accomplish two purposes:
1. Limiting access (P)
2. “Signaling” sleeper when access becomes permissible (V)
Semaphores awkward for non-trivial situations: must translate real synchronization condition into non-intuitive increment/decrement of integer, e.g., producer/consumer solution
Semaphore can be implemented in OS or library
22
Monitors, I
Originally, a language-level idea
A monitor is:
• Shared data
• Set of monitor procedures that are the ONLY ONES to access the data
• A lock, for implementing “monitor invariant”
• Syntactic sugar for encapsulating all the above (also hiding the lock)
• Two operations wait, signal (also sometimes a third: broadcast)
23
Monitors, II
Also, a behavioral constraint, the monitor invariant: no more than one monitor procedure will “run” at any time
What run actually means is tricky (see below)
Monitor invariant enforced by (1) compiler emitting code that uses lower-level method to make monitor be critical section, and by (2) threads exiting properly
24
Example
At most one of foo, bar may run at any moment:
    begin monitor;
    // monitor data
    int var1;
    int var2;

    void foo() {
        // statements accessing var1 and/or var2
    }
    void bar() {
        // statements accessing var1 and/or var2
    }
    end monitor;
25
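C has no monitor construct, so the lock a compiler would hide has to be written by hand. The sketch below assumes foo moves one unit from var1 to var2 and bar moves it back; those bodies, the starting values, and the helpers monitor_total and run_monitor_demo are invented for illustration. The point is only that every procedure touching the data takes the same lock on entry and drops it on exit.

```c
#include <pthread.h>

/* Monitor data, plus the lock a monitor-aware compiler would hide. */
static pthread_mutex_t monitor_lock = PTHREAD_MUTEX_INITIALIZER;
static int var1 = 100;
static int var2 = 0;

void foo(void)                      /* assumed body: var1 -> var2 */
{
    pthread_mutex_lock(&monitor_lock);      /* "hidden" get lock */
    var1--;
    var2++;
    pthread_mutex_unlock(&monitor_lock);    /* "hidden" drop lock */
}

void bar(void)                      /* assumed body: var2 -> var1 */
{
    pthread_mutex_lock(&monitor_lock);
    var2--;
    var1++;
    pthread_mutex_unlock(&monitor_lock);
}

int monitor_total(void)
{
    pthread_mutex_lock(&monitor_lock);
    int total = var1 + var2;
    pthread_mutex_unlock(&monitor_lock);
    return total;
}

static void *caller(void *arg)
{
    int use_foo = *(int *)arg;
    for (int i = 0; i < 10000; i++) {
        if (use_foo)
            foo();
        else
            bar();
    }
    return NULL;
}

/* Two threads hammer foo and bar; the total must be unchanged. */
int run_monitor_demo(void)
{
    pthread_t a, b;
    int yes = 1, no = 0;

    pthread_create(&a, NULL, caller, &yes);
    pthread_create(&b, NULL, caller, &no);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return monitor_total();
}
```

Nothing stops other code from touching var1 or var2 without the lock; in C the "ONLY ONES to access the data" rule is a convention, not something the language enforces.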
Monitors, III
To enter monitor, thread must have lock
Behavior of monitor operations:
• Wait: atomically releases lock then waits on condition. When thread re-awakens, lock has been re-acquired (to allow entry into monitor)
• Broadcast: enables ALL waiting threads to run. When call returns, lock is still held
• Signal: an optimization of broadcast; enables ONE waiting thread to run (thread is picked according to some policy, just as with semaphores). When call returns, lock is still held.
26
Example Implementations
Wait:
    drop lock                 // these two atomic so that
    wait on condition         // thread certain to be waiting ...
    ... (get scheduled) ...
    get lock
    return
Signal:
    indicate to scheduler that some waiter may run
    return
27
Monitor Example: Producer/Consumer, I
    begin monitor;
    /* condition variables */
    condition items;
    condition spaces;
    char buffer[];
    /* also: pointers into buffer */
28
Monitor Example: Producer/Consumer, II
    void produce() {
        /* hidden action: get lock */
        if (<buffer is full>)
            WAIT(spaces);
        <produce>
        SIGNAL(items);
        /* hidden action: drop lock */
        return;
    }
    void consume() {
        /* hidden action: get lock */
        if (<buffer is empty>)
            WAIT(items);
        <consume>
        SIGNAL(spaces);
        /* hidden action: drop lock */
        return;
    }
    end monitor;
29
Monitor Example: Producer/Consumer, III
Monitor invariant (implemented by hidden lock) ensures critical section for <produce> and <consume>
Hidden get/drop lock actions performed by code automatically produced by compiler, which knows what a monitor is because monitor is a language-level concept
30
Monitor Example: Producer/Consumer, IV
Compare to semaphore solution on page 21
1. Monitor invariant (an application-INdependent condition) eliminates need for mutex semaphore
Note: lock is OUTSIDE wait/signal, therefore this solution permits less concurrency than semaphore solution (see page 15)
31
Monitor Example: Producer/Consumer, V
2. Application-dependent condition (i.e., buffer must never overflow/underflow) is enforced by placement of wait and signal, similar to semaphore P/V
“Hoare formulation” uses:
    if (<app-specific condition>) { ... }
“Mesa formulation” uses:
    while (<app-specific condition>) { ... }
Adv of monitor over semaphore: writing “app-specific condition” is more natural than P/V on an integer
32
Monitors, IV
Beyond the monitor invariant, it is typical for procedures to have an application-specific condition for signaling; e.g., signal when buffer empties or fills
Programmer places
    if (not app-specific condition) then wait(cv)
at start of function (Hoare formulation)
Wait ends when another monitor procedure executes signal (or broadcast) with same cv as argument; this will awaken some waiter
cv is a condition variable (better term would be “condition name,” because a condition variable is not a variable in the usual sense)
33
Monitors, V
Waiters and signalers agree to perform all communication about the establishment of the application-specific condition through wait and signal on the relevant condition variable
Since at most 1 thread can be “in the monitor,” what to do with signaling thread?
(Waiter will awaken in the monitor, making both signaler and waiter “in the monitor” simultaneously)
34
Monitors, VI
Alternatives:
1. Halt signaler & run signaled thread immediately. (Hoare’s formulation.)
   Note: when a thread is halted this way, it is not considered “in the monitor” because scheduler ensures it isn’t running.
2. Require signaler to exit the monitor immediately; merely a convention. (Mesa formulation.)
35
Thread Scheduling Policies
Which sleeper to wake up? Random? FCFS?
Policy for which signal-ee to wake up should be purposely left unspecified; user can assume nothing
Broadcast: wake up all & let compete according to policy
Implementor is free to choose easiest/fastest/best policy
36
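A sketch of broadcast in Pthreads, assuming a simple one-shot "gate" that several threads wait for. The names gate_open, passed, and run_gate are illustrative. All waiters are released by one pthread_cond_broadcast and then compete for the mutex in whatever order the scheduler chooses, so the code counts how many got through rather than relying on any order.

```c
#include <pthread.h>

#define NWAITERS 3

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t opened = PTHREAD_COND_INITIALIZER;
static int gate_open;    /* the application-specific condition */
static int passed;       /* how many waiters got through */

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    while (!gate_open)                   /* re-check after every wakeup */
        pthread_cond_wait(&opened, &lock);
    passed++;                            /* arrival order is unspecified */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int run_gate(void)
{
    pthread_t tid[NWAITERS];

    gate_open = 0;
    passed = 0;
    for (int i = 0; i < NWAITERS; i++)
        pthread_create(&tid[i], NULL, waiter, NULL);

    pthread_mutex_lock(&lock);
    gate_open = 1;
    pthread_cond_broadcast(&opened);     /* wake ALL waiters */
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < NWAITERS; i++)
        pthread_join(tid[i], NULL);
    return passed;
}
```

A waiter that starts after the gate is already open never sleeps at all, because it tests gate_open before waiting; that is why the flag, not the signal, carries the state.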
Mesa Monitors, I
In Hoare’s formulation, programmer puts
    if (not app-specific condition) then wait
at start of monitor procedures that have application-specific consistency conditions on the data
This works because signaled procedure executes immediately (it is the VERY NEXT ONE to execute) after signaler executes signal()
But ... this assumes that thread scheduler cooperates
37
Mesa Monitors, II
Q: How to have monitors when ...
1. Language doesn’t have monitor primitive (e.g., C/C++)
2. OS thread scheduler doesn’t/can’t cooperate (e.g., UNIX)
A1: Create language that includes monitor & schedules its own threads (e.g., Java)
(Such a complicated language runtime system is possible only on OS that supports threads!)
A2: See below
38
Mesa Signaling Primitives
Mesa begat C-Threads begat POSIX Pthreads
Pthreads API:
    pthread_cond_wait(pthread_cond_t *c, pthread_mutex_t *m)
    pthread_cond_signal(pthread_cond_t *c)
    pthread_cond_broadcast(pthread_cond_t *c)
    pthread_mutex_lock(pthread_mutex_t *m)
    pthread_mutex_unlock(pthread_mutex_t *m)
39
Use of Mesa Condition Variables, I
Operation of pthread_cond_wait(c, m):
1. Set thread state to “waiting”
2. Add thread to set of threads waiting for this condition variable
3. unlock(m)
4. while (thread state == "waiting") thread sleeps for a while;
   When thread finishes sleeping, its state is "ready". Eventually, scheduler will pick it to run. Then its state will be "running".
5. lock(m)
   Thread is now running but may have to wait for lock
40
Use of Mesa Condition Variables, II
Operation of pthread_cond_signal(c):
1. Select thread from condition variable’s wait-set and set its state to "ready"
Operation of pthread_mutex_lock(m):
1. Spin-wait using TAS or similar mechanism
2. m = thread’s ID;
Operation of pthread_mutex_unlock(m):
    m = 0;
41
Use of Mesa Condition Variables, III
    /* shared by all threads, so declared and initialized outside produce() */
    pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    pthread_cond_t spaces = PTHREAD_COND_INITIALIZER;
    pthread_cond_t items = PTHREAD_COND_INITIALIZER;

    void produce() {
        pthread_mutex_lock(&mutex);
        while (<buffer is full>)
            pthread_cond_wait(&spaces, &mutex);
        <produce data>
        pthread_cond_signal(&items);
        pthread_mutex_unlock(&mutex);
        return;
    }
Mutex enforces monitor invariant; mutex implemented by programmer, not language
Data must be in state associated with app-specific condition spaces BEFORE this operation runs, and will be in state assoc. w/ app-specific condition items AFTER operation runs
42
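Filling in the placeholders gives a complete bounded buffer, sketched here under assumed details: a circular int buffer of 4 slots, a count field as the full/empty test, and a driver run_bounded_buffer with two producers and two consumers. None of these specifics come from the slides. Both procedures wait in a while loop, as the Mesa formulation requires.

```c
#include <pthread.h>

#define BUFFER_SIZE 4
#define NITEMS 100     /* items per producer, and per consumer */

static int buffer[BUFFER_SIZE];
static int count, in, out;
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t spaces = PTHREAD_COND_INITIALIZER;
static pthread_cond_t items = PTHREAD_COND_INITIALIZER;

void produce(int value)
{
    pthread_mutex_lock(&mutex);
    while (count == BUFFER_SIZE)            /* buffer is full */
        pthread_cond_wait(&spaces, &mutex);
    buffer[in] = value;
    in = (in + 1) % BUFFER_SIZE;
    count++;
    pthread_cond_signal(&items);            /* there is another item */
    pthread_mutex_unlock(&mutex);
}

int consume(void)
{
    pthread_mutex_lock(&mutex);
    while (count == 0)                      /* buffer is empty */
        pthread_cond_wait(&items, &mutex);
    int value = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    count--;
    pthread_cond_signal(&spaces);           /* there is another space */
    pthread_mutex_unlock(&mutex);
    return value;
}

static void *producer_thread(void *arg)
{
    (void)arg;
    for (int i = 1; i <= NITEMS; i++)
        produce(i);
    return NULL;
}

static void *consumer_thread(void *arg)
{
    long *sum = arg;
    for (int i = 0; i < NITEMS; i++)
        *sum += consume();
    return NULL;
}

long run_bounded_buffer(void)
{
    pthread_t p[2], c[2];
    long sum[2] = { 0, 0 };

    for (int i = 0; i < 2; i++) {
        pthread_create(&p[i], NULL, producer_thread, NULL);
        pthread_create(&c[i], NULL, consumer_thread, &sum[i]);
    }
    for (int i = 0; i < 2; i++) {
        pthread_join(p[i], NULL);
        pthread_join(c[i], NULL);
    }
    return sum[0] + sum[1];
}
```

With two consumers, one can take the item that another was signaled for; the while loop is what sends the late consumer back to wait instead of reading an empty buffer.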
Use of Mesa Condition Variables, IV
In Hoare’s formulation, after signal(), signaled thread is guaranteed next to run
Therefore, wait with:
    if (<not condition>) wait();
43
Use of Mesa Condition Variables, V
In Pthreads for C/UNIX ...
1. Other threads may run between signal-er and signal-ee
   Between when thread A’s signal() zeroes thread B’s wait indicator and when thread B waits for the lock, some random thread C may have run and undone application-specific condition that A established for B
   ⇒ A (or some other) will have to re-run and re-establish it
44
Use of Mesa Condition Variables, VI
2. Therefore, only monitor invariant (and nothing more) may be assumed
3. Therefore, condition must be re-checked when signal-ee awakens
4. Therefore, waiters wait with:
    while (<not condition>) wait();
45
Example, I
Ideal execution:
0. Thread A holds lock & is in monitor procedure
1. Thread A establishes application-specific condition on monitor data
2. Thread A signals appropriate condition variable, thereby changing state of Thread B from waiting to non-waiting
3. Thread A drops lock
4. Thread A returns from monitor procedure
5. Thread B is scheduled
6. Thread B, which is no longer waiting, tries to get lock
7. Thread B succeeds in getting lock & its pthread_cond_wait returns
8. Thread B tests application-specific condition in test of while statement
9. Test is passed & thread B enters critical section of monitor procedure
46
Example, II
In Mesa formulation (OS thread scheduler doesn’t know about monitors), what COULD happen:
... same as above, steps 0 thru 4
5. Thread C is scheduled
6. Thread C gets lock, enters its critical section, and changes monitor data so as to UNDO thread B’s application-specific condition
... Thread C eventually leaves its monitor procedure
... same as above, steps 5 thru 8
9. Test FAILS!
10. Thread B calls wait again
This execution shows why thread must re-test application-specific condition after returning from wait
47
Use of Mesa Condition Variables, VII
This conventional use of wait and signal provides proper monitor entry & exit
Costs are:
1. Extra evaluation of condition (after return from wait)
2. Possible starvation, since no scheduler cooperation
Benefits are:
1. Decouple OS thread scheduler policy from application monitors: can have monitors w/o language or OS support
2. Does not require forced (and possibly non-optimal) Hoare-style switch to signaled thread
48