How many tasks can be completed in a given period of time
measurement: business units/second
transactions/second (TPS)
number of bits/second
Bandwidth:
Theoretical Maximum Data Rate
In practice, throughput stays below bandwidth because of real-world overheads, errors and inefficiencies.
In Electrical Engineering (EE), bandwidth has a different meaning and is calculated as:
High_used_frequency - Low_used_frequency
Unit is Hz
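As a worked example of the EE definition, assume a classic telephone voice channel that uses frequencies from 300 Hz to 3400 Hz. The channel and the symbols B, f_high and f_low are chosen here for illustration only:

```latex
B = f_{\text{high}} - f_{\text{low}} = 3400\,\text{Hz} - 300\,\text{Hz} = 3100\,\text{Hz}
```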
Latency
How long will a single task take?
Unchecked latency problems will become throughput problems
If you cannot improve latency, focus on throughput
measurement: milliseconds
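One way to see why latency problems turn into throughput problems is the rough sketch below. It assumes a simplified model in which each worker handles exactly one task at a time, so per-task latency caps how many tasks can finish per second. The function name `max_throughput` and the sample numbers are hypothetical.

```python
def max_throughput(latency_ms: float, workers: int = 1) -> float:
    """Upper bound on tasks/second when each worker processes one task at a time."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # One worker finishes 1000 / latency_ms tasks per second at best.
    return workers * 1000.0 / latency_ms


if __name__ == "__main__":
    # Doubling latency halves the ceiling; adding a worker restores it.
    for latency, workers in ((10, 1), (20, 1), (20, 2)):
        tps = max_throughput(latency, workers)
        print(f"latency={latency} ms, workers={workers}: at most {tps:.0f} tasks/s")
```

Under this model, if latency cannot be reduced, the only remaining lever is to run more tasks at once, which is the "focus on throughput" advice above.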
Throughput vs Latency
Increasing Bandwidth does not make data arrive faster
It only lets you send more data at once
If it takes 50 ms for a signal to reach a server (due to speed-of-light limits, network hops, etc.), no amount of bandwidth changes that delay
You could have 1 Gbps or 100 Gbps, but the very first byte still takes the same time to arrive.
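The point above can be sketched with a simple model, assuming delivery time is roughly a fixed latency plus payload size divided by bandwidth. The function name `transfer_time` and the payload sizes are illustrative assumptions, and real networks add further effects (protocol handshakes, congestion) that this ignores.

```python
def transfer_time(size_bits: float, bandwidth_bps: float, latency_s: float) -> float:
    """Approximate delivery time: fixed latency plus time to push the bits out."""
    return latency_s + size_bits / bandwidth_bps


LATENCY_S = 0.050            # 50 ms, as in the example above
ONE_BYTE_BITS = 8
ONE_GIGABYTE_BITS = 8 * 10**9

if __name__ == "__main__":
    for gbps in (1, 100):
        bandwidth = gbps * 10**9
        first_byte = transfer_time(ONE_BYTE_BITS, bandwidth, LATENCY_S)
        bulk = transfer_time(ONE_GIGABYTE_BITS, bandwidth, LATENCY_S)
        print(f"{gbps:>3} Gbps: first byte {first_byte * 1000:.3f} ms, 1 GB {bulk:.2f} s")
```

In this model the first byte arrives after about 50 ms at both speeds, while the 1 GB transfer drops from roughly 8 s to roughly 0.13 s. More bandwidth helps the bulk transfer, not the initial delay.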
Scalability
It describes the ability to improve throughput or capacity when additional computing resources (such as additional CPUs, memory, storage or I/O bandwidth) are added
Multithreading overheads
When threading is employed effectively, these costs are more than made up for by greater throughput, responsiveness or capacity
Coordinating between threads:
Locking
Signaling
Memory Synchronization
Increased Context switching
Thread creation and teardown
Scheduling overhead
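To make a few of these costs concrete, here is a minimal sketch of locking around a shared counter, using Python's standard `threading` module. The function `run_counter` and its parameters are hypothetical. Each thread must be created, started and joined (creation and teardown), and every increment acquires a lock (coordination).

```python
import threading


def run_counter(num_threads: int, increments: int) -> int:
    """Increment a shared counter from several threads, guarded by a lock."""
    counter = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal counter
        for _ in range(increments):
            # Only one thread at a time may update the shared value.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()   # thread creation cost is paid here
    for t in threads:
        t.join()    # wait for every thread to finish before reading the result
    return counter


if __name__ == "__main__":
    print(run_counter(num_threads=4, increments=10_000))
```

The lock keeps the total correct, but every acquisition is work that a single-threaded loop would not need. This is the kind of overhead that threading has to earn back through better throughput or responsiveness.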
Amdahl’s Law
F = fraction of the calculation that must be executed serially
N = number of processors
$$\text{Speedup} \le \frac{1}{F + \frac{1 - F}{N}}$$
Implications
As N→∞, the maximum speedup approaches 1/F
If a program is 50% serial
Max speedup is 2x
If a program is 10% serial
10 processors, speedup = 5.3x
100 processors, speedup = 9.2x
Max speedup = 10x
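The figures above can be checked with a short sketch of Amdahl's Law, using F as the serial fraction and N as the processor count as defined earlier. The function names `amdahl_speedup` and `max_speedup` are illustrative.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Upper bound on speedup: 1 / (F + (1 - F) / N)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def max_speedup(serial_fraction: float) -> float:
    """Limit of the bound as N grows without bound: 1 / F."""
    if serial_fraction == 0:
        return float("inf")
    return 1.0 / serial_fraction


if __name__ == "__main__":
    for n in (10, 100):
        print(f"F=0.10, N={n}: speedup <= {amdahl_speedup(0.10, n):.1f}x")
    print(f"F=0.10, N->inf: speedup <= {max_speedup(0.10):.0f}x")
    print(f"F=0.50, N->inf: speedup <= {max_speedup(0.50):.0f}x")
```

Going from 10 to 100 processors adds ten times the hardware but less than doubles the bound, because the serial 10% dominates as N grows.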
Resource Bound Performance
When the performance of an activity is limited by availability of a particular resource, we say it is bound by that resource:
CPU bound
I/O bound
Memory bound
CPU bound and I/O bound
A program is CPU bound if it would go faster if the CPU were faster, i.e. it spends the majority of its time simply using the CPU (doing calculations)
tend to have few and long CPU bursts.
examples:
matrix multiplication
graphics operations
High-Performance Computing (HPC) systems
A program is I/O bound if it would go faster if the I/O subsystem were faster.
characterized by many short CPU bursts
it includes disk, networking and communication
examples:
word processing systems
web applications
copying files
downloading files
A CPU burst is a stretch of time during which a process executes on the CPU without interruption, before it has to wait (for example, on I/O).
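The I/O-bound case can be illustrated with a rough sketch in which `time.sleep` stands in for waiting on a disk or network. The function names and delay values are hypothetical. Because an I/O-bound task spends most of its time waiting and very little computing, overlapping the waits with threads shortens the total time.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay_s: float) -> float:
    """Stand-in for an I/O call: the CPU is idle while we wait."""
    time.sleep(delay_s)
    return delay_s


def run_sequential(delays: list[float]) -> float:
    """Perform the waits one after another and return elapsed seconds."""
    start = time.perf_counter()
    for d in delays:
        fake_io(d)
    return time.perf_counter() - start


def run_threaded(delays: list[float]) -> float:
    """Overlap the waits using one thread per task and return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        list(pool.map(fake_io, delays))
    return time.perf_counter() - start


if __name__ == "__main__":
    delays = [0.05] * 4
    print(f"sequential: {run_sequential(delays):.2f} s")
    print(f"threaded:   {run_threaded(delays):.2f} s")
```

The sequential run takes about the sum of the delays, while the threaded run takes about as long as the longest single delay. A CPU-bound task would not benefit in the same way, since it is limited by computation and not by waiting.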