Pipeline: Throughput and Latency

Consider a system (such as a network) where jobs (such as messages) arrive and, once completed, leave the system.

Throughput measures the average number of jobs completed per second.

Latency measures the time to complete a job. It can be worst-case or average; we will typically consider worst-case.

The owners of any system want to maximize throughput (to maximize the money they get!), while users of a system want low latencies so they don't waste their time using it.

Consider a medical centre. If the surgery keeps you waiting for a long time, it does so to ensure that when the doctor is ready for the next patient, there is one waiting. The practice is optimizing the surgery for throughput, not minimizing your latency.

A busy traffic signal should typically optimize for throughput by having each signal direction stay on for a long time; this minimizes the startup overhead incurred every time the signal changes. However, that means that even if there is nobody else at the intersection, you may have to wait a long time for the light to change if you are unlucky and arrive just as your light turns red. This illustrates that throughput is more important for busy systems and latency is more important for idle systems.

Pipelining increases throughput; minimising and standardizing the latency of each stage helps this to happen.

Pipelining doesn't help the latency of a single task; it helps the throughput of the entire workload. Pipelining is possible when multiple tasks operate simultaneously, which is only possible if they are using different resources.
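
The difference between the two measures can be illustrated with a small model. The sketch below assumes an idealised pipeline, which these notes do not specify: every stage takes exactly one clock cycle, each stage uses its own resource, and there are no stalls. The function names are hypothetical and chosen only for illustration.

```python
def unpipelined_cycles(num_tasks: int, num_stages: int) -> int:
    """Cycles needed when each task runs through all stages before the next starts."""
    return num_tasks * num_stages


def pipelined_cycles(num_tasks: int, num_stages: int) -> int:
    """Cycles needed in an ideal pipeline (assumption: one cycle per stage, no stalls).

    The first task takes num_stages cycles; every later task finishes
    one cycle after the task in front of it.
    """
    if num_tasks == 0:
        return 0
    return num_stages + (num_tasks - 1)


def throughput(num_tasks: int, total_cycles: int) -> float:
    """Average number of tasks completed per cycle."""
    return num_tasks / total_cycles


if __name__ == "__main__":
    stages = 5  # latency of a single task, in cycles, with or without pipelining
    for tasks in (1, 10, 1000):
        slow = unpipelined_cycles(tasks, stages)
        fast = pipelined_cycles(tasks, stages)
        print(
            f"{tasks} tasks: unpipelined {slow} cycles "
            f"({throughput(tasks, slow):.2f} tasks/cycle), "
            f"pipelined {fast} cycles "
            f"({throughput(tasks, fast):.2f} tasks/cycle)"
        )
```

Under these assumptions the latency of each task stays at 5 cycles in both cases, while the pipelined throughput approaches one task per cycle as the number of tasks grows.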

-------------------------------

The diagram below shows four instructions going through the MIPS pipeline stages.

Latency: the time to completely execute a given task, for example the time to complete one instruction (5 clock cycles in the diagram).
Throughput: the amount of work that can be done over a period of time. In the diagram, 4 instructions complete in 9 clock cycles, so the throughput is 4 instructions / 9 cycles ≈ 0.44 instructions/cycle.
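
The arithmetic above can be checked with a few lines of code. This is only a sketch using the figures quoted in these notes (5 cycles of latency, 4 instructions in 9 cycles). The clock period is a hypothetical value, not something the notes state, and is included only to show how cycle counts convert to time.

```python
def throughput_per_cycle(instructions_completed: int, elapsed_cycles: int) -> float:
    """Average number of instructions completed per clock cycle."""
    return instructions_completed / elapsed_cycles


LATENCY_CYCLES = 5      # cycles to complete one instruction (from the notes)
INSTRUCTIONS = 4        # instructions completed (from the notes)
ELAPSED_CYCLES = 9      # cycles elapsed (from the notes)
CLOCK_PERIOD_NS = 1.0   # hypothetical clock period, assumed for illustration

if __name__ == "__main__":
    ipc = throughput_per_cycle(INSTRUCTIONS, ELAPSED_CYCLES)
    latency_ns = LATENCY_CYCLES * CLOCK_PERIOD_NS
    print(f"latency:    {LATENCY_CYCLES} cycles = {latency_ns:.1f} ns")
    print(f"throughput: {ipc:.2f} instructions/cycle")
```

Note that latency is measured per instruction, while throughput is measured over the whole group of instructions.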