Parallel Programming
There are a variety of ways to harness the the computer's ability to execute operations in parallel.
Consider an example where you are cooking in a restaurant. An order comes in for eggs and toast. In a synchronous context, you would cook the eggs, and when the eggs are finished, you would cook the toast. You would then put the eggs and the toast on a plate and serve the meal.
Asynchronous Programming
Asynchronous programming is about tasks. As an asynchonrous cook, you would start cooking the eggs and set a timer. You would then start the toast cooking, and set another timer. While both are cooking would you clean the sink. When the timers go off, you put the eggs on a plate and the toast on a plate and serve them.
In asynchronous single-threaded workflows you have a graph of tasks where some tasks depend on the results of others; as each task completes it invokes the code that schedules the next task that can run, given the results of the just-completed task. However, this often requires just one worker to perform all tasks.
Multithreaded Programming
Multithreaded programming is about workers. As a multithreaded cook, you would hire two more cooks, one of which will cook eggs and one of which will cook toast. You now need to figure out how to coordinate the cooks so that they do not conflict with one another when using shared resources in the kitchen. You also have to pay the cooks.
In multithreaded workflows you assign tasks to workers. Many tasks are not processor-bound, but if they are, it makes sense to use as many worker threads as there are processors to work in parallel.
Synchronization Challenges
Multithreaded operations must synchronize access to share resources, as problems can occur when multiple threads try to modify these resources at the same time.
Race Conditions
A race condition occurs when the correctedness of a program depends on the relative timing of events in two or more concurrent operations.
Lost Update
An operation writes a value. A second operation concurrently writes a second value on top of the first. The first value is now lost to any additional concurrent operations which require the first value rather than the second. Those operations will have incorrect results at the end of processing.
Dirty Read
An operation may attempt to write a value, but for whatever reason it fails or aborts. This value should no longer exist in the system. A dirty read occurs when other concurrent operations read this value before the first operation aborts.
Incorrect Summary
One operation reads a summary over the values of all instances of a repeated value, while a second concurrent operation updates some instances of that value. The resulting summary does not reflect a correct result for any precedence order between the two operations, but rather a random result dependent upon timing.
Deadlock
Deadlock occurs when one operation is waiting for the completion of a second operation, while the second is waiting on the completion of the first.
Concurrency
Concurrency the ability of the parts of a program, algorithm, or problem to be excecuted out-of-order or in partial order without affecting the final outcome of the work. This can significantly improve the overall speed of execution in multi-processor/multi-core systems.
Concurrency is achived typically through the sharing of resources, or via message passing.
Concurrency Control
Concurrency control ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible. Introducing concurrency control into a system means applying operation constraints which typically result in some performance reduction. These constraints can easily become quite complicated.
ACID
ACID is a concurrency control mechanism applied to Database Management Systems, implemented through the concept of a transaction, a unit of work that typically encapsulates a variable number of database operations.
- Atomicity: A transaction is atomic of it commits either all or none of the requested operations.
- Consistency: Consistency requires that every operation must leave the database in a consistent or correct state.
- Isolation: Isolation ensures that one transaction cannot interfere with another transcation.
- Durability: Durability requires that the effects of successful transcations must persist through database crashes.
Optimistic Concurrency
Optimistic concurrency assumes an operation will succeed, and delays looking for any concurrency violations until the end of execution. If concurrency was violated, the operation is aborted. This approach typically incurs overhead. It works well when concurrency violations are relatively low.
Pessimistic Concurrency
Pessimistic concurrency blocks the completion of an operation until when it causes concurrency violations, until the possibility of a violation is removed. This approach typically incurs a reduction in performance.
Semi-Optimistic Concurrency
Semi-optimistic concurrency blocks for some operations and not others.
Concurrency Methods
Locking
A lock can be placed on a shared resource, so that only one operation may modify that resource at a time. Other operations are blocked and must wait.
Timestamp Ordering
Operations or transactions are assigned a timestamp. Operations are controlled by checking timestamp order.
Commitment Ordering
Operations or transactions are assigned a precedence and are executed in order.