April 17th Lecture
====================

critical section - a section of code that accesses shared variables

solving the synchronization problem:
 * identify all of a program's critical sections
 * add entry code before each CS
 * add exit code after each CS
 * entry and exit code are designed to allow only 1 thread at a time
   to execute CS code
 * many different synchronization mechanisms can be used

In general:

    T1              T2
    ---             ---
    enter()         enter()
    CS()            CS()
    exit()          exit()
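
A minimal Python sketch of the enter/CS/exit pattern above. Assumptions (not
from the lecture): threading.Lock stands in for the entry/exit mechanism, the
shared variable is a counter, and exit() is named leave() so it does not
shadow Python's built-in exit().

```python
import threading

counter = 0                  # the shared variable
lock = threading.Lock()      # stand-in for whatever mechanism enter/exit use


def enter():
    lock.acquire()           # entry code: blocks while another thread is in the CS


def leave():
    lock.release()           # exit code: lets one waiting thread proceed


def worker(iterations):
    global counter
    for _ in range(iterations):
        enter()
        tmp = counter        # critical section: read-modify-write of shared data
        counter = tmp + 1
        leave()


def run(num_threads=2, iterations=1000):
    """Run the workers and return the final counter value."""
    global counter
    counter = 0
    threads = [threading.Thread(target=worker, args=(iterations,))
               for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

With the entry and exit code in place, no increment should be lost, so run()
is expected to return num_threads * iterations.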

Desirable properties of solutions:
 * mutual exclusion - at most 1 thread in CS at a time
 * absence of deadlock - if multiple threads are trying to enter CS,
   one is guaranteed to succeed
 * absence of unnecessary delay - a thread in non-CS code does not prevent
   other threads from entering their CS
 * fairness ("no starvation") - a thread trying to enter its CS will
   eventually succeed

Solution #1: Strict Alternation

turn = 0    (shared; says which thread may enter its CS next)

T0                      T1
===                     ===
while (1) {             while (1) {
    while (turn != 0)       while (turn != 1)
        ;                       ;
    CS();                   CS();
    turn = 1;               turn = 0;
    nonCS();                nonCS();
}                       }

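Problems:
 * busy-waiting is wasteful (a waiting thread burns CPU time in its
   while loop)
 * unnecessary delay - threads must enter in strict T0, T1, T0, ... order,
   so a thread that wants to reenter its CS has to wait for the other
   thread to take its turn, even if that thread is in non-CS code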
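
The unnecessary delay can be seen in a small single-threaded sketch. The
helper names (try_enter, leave, demo) are made up for illustration, and
try_enter is a non-blocking version of the "while (turn != i)" test so that
the outcome can be inspected without real threads.

```python
def try_enter(state, tid):
    """Non-blocking form of the entry test: may thread `tid` enter right now?"""
    return state["turn"] == tid


def leave(state, tid):
    """Exit code: hand the turn to the other thread."""
    state["turn"] = 1 - tid


def demo():
    state = {"turn": 0}
    log = []

    log.append(("T0 first entry", try_enter(state, 0)))
    leave(state, 0)                      # T0 finishes its CS, turn becomes 1

    # T1 is still in nonCS() and has no interest in its CS, so the CS is
    # empty, yet T0 is refused because it is not T0's turn.
    log.append(("T0 second entry", try_enter(state, 0)))

    log.append(("T1 entry", try_enter(state, 1)))
    leave(state, 1)                      # only now does the turn return to T0

    log.append(("T0 after T1's turn", try_enter(state, 0)))
    return log
```

The second attempt by T0 fails even though nobody is in the CS, which is
exactly the "absence of unnecessary delay" property being violated.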


General idea: lock variables
 * "lock" L is an int (initially 0); 0=unlocked, 1=locked
 * before entering a CS, a thread tests L
   - 0 => set L=1, enter CS, set L=0 when done
   - 1 => wait

Problem with naive implementation (the test and the set are separate steps):
 * two threads A & B both about to enter CS
 * A reads L, sees it as 0
 * (context switch)
 * B reads L, sees it as 0
 * B sets L=1
 * B enters CS
 * (context switch)
 * A sets L=1
 * A enters CS
Both threads are in the CS at the same time!
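
The interleaving above can be replayed deterministically. In this sketch
(all names are illustrative), each thread's entry code is a Python generator
and every "yield" marks a point where a context switch may happen; a schedule
is just the order in which the threads get to run one step.

```python
def naive_enter(shared, name, trace):
    """Naive lock entry code; each `yield` is a possible context switch."""
    while True:
        seen = shared["L"]                   # test: read the lock
        trace.append(f"{name} reads L={seen}")
        yield                                # switch possible between test and set
        if seen == 0:
            shared["L"] = 1                  # set: uses the (possibly stale) read
            trace.append(f"{name} sets L=1")
            shared["in_cs"].add(name)
            trace.append(f"{name} enters CS")
            return
        # seen == 1: lock looked busy, go around the loop and test again


def run_schedule(schedule):
    """Run threads A and B one step at a time in the given order."""
    shared = {"L": 0, "in_cs": set()}
    trace = []
    threads = {n: naive_enter(shared, n, trace) for n in ("A", "B")}
    for name in schedule:
        try:
            next(threads[name])
        except StopIteration:
            pass                             # that thread is already in its CS
    return sorted(shared["in_cs"]), trace
```

Usage: run_schedule(["A", "B", "B", "A"]) reproduces the bad interleaving and
puts both threads in the CS, while run_schedule(["A", "A", "B", "B"]) lets
only A in.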

Solution #2: Locking via TSL instruction
 * TSL = "test and set lock"
 * hardware support for locking available on *some* architectures
 * pseudo assembly:  "TSL R, lock"
 * does two things atomically:
    - moves "lock" value to register
    - writes 1 to lock

atomic operation - an operation that happens as an indivisible unit (i.e.,
  no context switch is possible in the middle, and no other CPU can access
  the lock word between the read and the write)

To use TSL:

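    enter_CS:
        tsl R0, lock        ; atomically: R0 = lock, lock = 1
        cmp R0, #0          ; was the lock free (0) before we set it?
        jne enter_CS        ; no: someone else holds it, so try again
        ret                 ; yes: we now hold the lock

    leave_CS:
        load R0, #0
        store lock, R0      ; lock = 0 (unlocked)
        ret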
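
Python has no TSL instruction, so the sketch below only emulates one. The
class TSLWord and its internal threading.Lock are assumptions that stand in
for the hardware's atomicity guarantee; enter_cs and leave_cs then follow the
assembly above.

```python
import threading
import time


class TSLWord:
    """A lock word with an emulated atomic test-and-set."""

    def __init__(self):
        self.value = 0                    # 0 = unlocked, 1 = locked
        self._hw = threading.Lock()       # stands in for hardware atomicity

    def test_and_set(self):
        with self._hw:                    # read and write happen as one unit
            old = self.value
            self.value = 1
            return old

    def clear(self):
        with self._hw:
            self.value = 0


def enter_cs(lock):
    while lock.test_and_set() != 0:       # old value 1: someone else holds it
        time.sleep(0)                     # let other threads run, then retry


def leave_cs(lock):
    lock.clear()


def run(num_threads=2, iterations=1000):
    """Increment a shared total under the emulated TSL lock; return the total."""
    lock = TSLWord()
    total = [0]

    def worker():
        for _ in range(iterations):
            enter_cs(lock)
            tmp = total[0]                # critical section
            total[0] = tmp + 1
            leave_cs(lock)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]
```

Only the thread whose test_and_set() returns 0 changed the lock from unlocked
to locked, so only that thread proceeds into the CS.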

Note the above uses a "spin lock"; a "sleep lock" may be a better choice

spin lock
 * only useful on multi-CPU systems (on a single CPU, the lock holder
   cannot run and release the lock while the waiter is spinning)
 * used for very short duration waits to avoid context switch overhead

sleep lock
 * must be used on single-CPU systems
 * used for cases where delay is expected to dwarf context switch time

hybrid lock
 * spin a few times, then sleep if lock is still held
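
A rough sketch of the hybrid idea in Python. HybridLock, spin_limit, and the
"spin"/"sleep" return values are made up for illustration; the spin phase
uses non-blocking acquire attempts on a threading.Lock, and the sleep phase
is a normal blocking acquire.

```python
import threading


class HybridLock:
    """Spin a bounded number of times, then fall back to sleeping."""

    def __init__(self, spin_limit=100):
        self._lock = threading.Lock()
        self.spin_limit = spin_limit

    def acquire(self, timeout=-1):
        """Return "spin" or "sleep" to say how the lock was obtained,
        or None if the sleep phase timed out (timeout=-1 waits forever)."""
        for _ in range(self.spin_limit):
            if self._lock.acquire(blocking=False):   # spin phase: just retry
                return "spin"
        if self._lock.acquire(timeout=timeout):      # sleep phase: block
            return "sleep"
        return None

    def release(self):
        self._lock.release()
```

Choosing spin_limit is the tuning knob: it should cover waits that are
shorter than a context switch, and no more.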

FreeBSD Lock Manager: supports lock upgrades, downgrades, etc.
 * shared lock (okay for multiple processes to hold a shared lock)
 * exclusive lock (no other processes may hold a shared or exclusive lock)
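
The shared/exclusive semantics can be illustrated with a small user-space
sketch. This is not FreeBSD's lock manager or its API; the class and method
names are hypothetical, it is built on threading.Condition, and it makes no
attempt at fairness between shared and exclusive requests.

```python
import threading


class SharedExclusiveLock:
    def __init__(self):
        self._cond = threading.Condition()
        self._shared = 0           # number of current shared holders
        self._exclusive = False    # True while an exclusive holder exists

    def acquire_shared(self, blocking=True):
        with self._cond:
            while self._exclusive:             # shared holders may coexist
                if not blocking:
                    return False
                self._cond.wait()
            self._shared += 1
            return True

    def release_shared(self):
        with self._cond:
            self._shared -= 1
            if self._shared == 0:
                self._cond.notify_all()

    def acquire_exclusive(self, blocking=True):
        with self._cond:
            while self._exclusive or self._shared > 0:   # needs the lock alone
                if not blocking:
                    return False
                self._cond.wait()
            self._exclusive = True
            return True

    def release_exclusive(self):
        with self._cond:
            self._exclusive = False
            self._cond.notify_all()

    def downgrade(self):
        """Turn the caller's exclusive hold into a shared hold."""
        with self._cond:
            self._exclusive = False
            self._shared += 1
            self._cond.notify_all()            # waiting shared requests may proceed
```

A downgrade never has to wait, since the exclusive holder is by definition
the only holder; an upgrade is harder because other shared holders may exist.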


Solution #3: Dekker's Algorithm
=================================
f0 = false
f1 = false
turn = 0

T0:                             T1:
    f0 = true                       f1 = true
    while f1:                       while f0:
        if turn != 0:                   if turn != 1:
            f0 = false                      f1 = false
            while turn != 0:                while turn != 1:
                wait                            wait
            f0 = true                       f1 = true
    CS()                            CS()
    turn = 1                        turn = 0
    f0 = false                      f1 = false

f0, f1 = flags for intent to enter CS
turn = which thread has priority
 - withdraw intention to enter until turn is given to current thread

Note that if the non-CS is fast enough, it's possible for a thread to reenter
the CS before the other thread notices its new priority

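Possible problem:  compiler optimizations!
 - the compiler may remove writes to f0, f1 (they appear to be unused
   variables) or read them only once before a wait loop
 - use the "volatile" keyword to solve this problem

CPU instruction reordering could also be a problem if memory barriers aren't
used ("volatile" does not stop the CPU from reordering memory accesses)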
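
A sketch of Dekker's algorithm with two Python threads. Assumptions: it
relies on CPython's global interpreter lock to keep reads and writes of the
flags and turn in program order (a C version would need volatile/atomics and
memory barriers, as noted above), time.sleep(0) plays the role of "wait", and
the shared counter is only there to detect lost updates. Each thread raises
its flag again after the inner wait and hands over the turn in its exit code.

```python
import threading
import time

flag = [False, False]    # f0, f1: intent to enter the CS
turn = 0                 # which thread has priority
counter = 0              # shared data protected by the algorithm


def enter(me):
    other = 1 - me
    flag[me] = True
    while flag[other]:
        if turn != me:
            flag[me] = False          # withdraw intent while it's not our turn
            while turn != me:
                time.sleep(0)         # wait
            flag[me] = True           # we have priority now: ask again
        time.sleep(0)                 # let the other thread make progress


def leave(me):
    global turn
    turn = 1 - me                     # give priority to the other thread
    flag[me] = False


def worker(me, iterations):
    global counter
    for _ in range(iterations):
        enter(me)
        tmp = counter                 # critical section
        counter = tmp + 1
        leave(me)


def run(iterations=500):
    """Run T0 and T1 and return the final counter value."""
    global turn, counter
    flag[0] = flag[1] = False
    turn = 0
    counter = 0
    threads = [threading.Thread(target=worker, args=(me, iterations))
               for me in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

If mutual exclusion holds, no increment is lost and run(n) returns 2 * n.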

Solution #4: Peterson's Algorithm
==================================

f0 = 0
f1 = 0
turn = 0

T0:                             T1:
    f0 = 1                          f1 = 1
    turn = 1                        turn = 0
    while (f1 && turn == 1)         while (f0 && turn == 0)
        wait                            wait
    CS()                            CS()
    f0 = 0                          f1 = 0
    non-CS()                        non-CS()

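T0 leaves its wait loop (and enters its CS) only if:
 - f1 == 0 or turn == 0
(symmetrically, T1 enters only if f0 == 0 or turn == 1)

 * a thread can reenter if the other isn't interested
 * if the other thread is waiting, the current thread can enter its CS only
   once more before the waiting thread gets in ("bounded waiting")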
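
A sketch of Peterson's algorithm with two Python threads, under the same
assumptions as any CPython demo of this kind: the global interpreter lock
keeps the reads and writes of the flags and turn in program order (real
hardware would need memory barriers), time.sleep(0) plays the role of "wait",
and the counter is illustrative.

```python
import threading
import time

flag = [0, 0]     # f0, f1: 1 means "I want to enter my CS"
turn = 0          # written last by whichever thread must wait on a tie
counter = 0       # shared data protected by the algorithm


def enter(me):
    global turn
    other = 1 - me
    flag[me] = 1
    turn = other                              # politely let the other go first
    while flag[other] and turn == other:
        time.sleep(0)                         # wait


def leave(me):
    flag[me] = 0


def worker(me, iterations):
    global counter
    for _ in range(iterations):
        enter(me)
        tmp = counter                         # critical section
        counter = tmp + 1
        leave(me)


def run(iterations=500):
    """Run T0 and T1 and return the final counter value."""
    global turn, counter
    flag[0] = flag[1] = 0
    turn = 0
    counter = 0
    threads = [threading.Thread(target=worker, args=(me, iterations))
               for me in (0, 1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter
```

Compared with Dekker's algorithm, the entry code has a single wait loop and
the exit code only clears the flag.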