Operating Systems Lecture 16: Concurrency bugs
Understanding Multi-Threaded Bugs
Introduction to Multi-Threaded Bugs
- Writing multi-threaded programs is challenging because concurrency introduces non-deterministic bugs: whether they appear depends on the order in which the scheduler runs the threads, which makes them difficult to reproduce and debug.
Types of Bugs in Concurrent Programs
- There are two main types of bugs:
- Deadlock Bugs: Threads become stuck, unable to proceed, causing the program to freeze.
- Non-Deadlock Bugs: Threads continue executing but crash or produce wrong results, such as reading incorrect values from shared data. A classic example is two threads incrementing a shared counter without synchronization, so some increments are lost.
Non-Deadlock Bugs
Atomicity Bugs
- Atomicity bugs occur when a programmer assumes a section of code executes atomically, but the thread can be interrupted partway through, violating that assumption. The fix is to use locks to guarantee mutual exclusion over the critical section.
Order Violation Bugs
- Order violation bugs arise when one thread assumes another has completed its task before proceeding, which may not happen due to scheduling by the operating system. Using condition variables helps ensure correct ordering between threads.
Example of Atomicity Bug
- In a classic example, Thread 1 checks that a shared pointer is non-NULL before printing the data it refers to, while Thread 2 may set that pointer to NULL. If Thread 1 is interrupted between the check and the print, it can dereference NULL and crash with a segmentation fault even though its check succeeded. Holding a lock across both the check and the use makes them atomic and prevents the crash.
Example of Order Violation Bug
- If Thread 1 initializes a variable and Thread 2 assumes it is already initialized without checking, Thread 2 can read uninitialized data whenever the scheduler happens to run it first. A condition variable fixes this: Thread 2 waits, and Thread 1 signals it once initialization completes.
Exploring Deadlock Bugs
Classic Deadlock Scenario
- A classic deadlock occurs when:
- Thread 1 acquires Lock 1 and then Lock 2.
- Thread 2 acquires Lock 2 and then attempts to acquire Lock 1.
This leaves each thread waiting indefinitely for the lock the other holds: neither can proceed, and neither will release the lock it already owns.
Conditions for Deadlocks
To understand how deadlocks occur, four necessary conditions must hold:
- Mutual Exclusion: At least one resource (e.g., lock) must be held in an exclusive mode.
- Hold and Wait: Threads hold resources already allocated to them while waiting to acquire additional ones.
- No Preemption: Resources cannot be forcibly taken from threads holding them; they must voluntarily release them.
- Circular Wait: A cycle of threads exists in which each thread holds a resource that the next thread in the cycle is waiting for.
Deadlock Prevention Techniques
Understanding Deadlocks
- A deadlock can be prevented by ensuring that at least one of the four necessary conditions can never hold.
Circular Wait Prevention
- To prevent circular wait, always acquire locks in a consistent order. For instance, if two threads consistently acquire lock L1 before L2, a deadlock situation is avoided.
Lock Ordering
- Implementing an ordering discipline for locks is crucial. A single total ordering over every lock in a large program is hard to maintain, so orderings are often partial (covering groups of related locks) or derived mechanically from the lock variables themselves.
- A simple mechanical scheme: when acquiring two locks (say M1 and M2), compare their addresses and always take the lock at the lower address first. Every caller then agrees on the acquisition order, so circular waits cannot form.
Avoiding Hold and Wait
- Another technique attacks the hold-and-wait condition directly: a thread first acquires a single global "prevention" lock and only then acquires all the locks it needs, releasing the prevention lock afterwards. No thread ever holds some of its locks while waiting for others.
- However, this requires knowing all needed locks up front and serializes every acquisition phase; the coarse-grained approach can therefore reduce concurrency and hurt performance.
Operating System Solutions
- Deadlock-avoidance algorithms such as Dijkstra's Banker's Algorithm allow an operating system to schedule threads safely when it knows their lock requirements in advance.
- For instance, if four threads need specific locks (L1 and L2), knowing their requirements allows the OS to schedule them in a way that avoids simultaneous requests for conflicting resources.
- The key strategy is to run threads that contend for the same locks one after another rather than concurrently; the deadlock is avoided at the cost of reduced parallelism.
Practical Limitations of Deadlock Algorithms
- Despite these theoretical solutions, a general-purpose operating system rarely knows in advance which locks each thread will acquire. Without that visibility, avoidance algorithms are hard to apply, so they are seldom used in practice today.