Chapter 2: Computer Architecture Fundamentals
Introduction
To write system software, you must understand the hardware it runs on. This chapter explores the fundamental components of computer architecture: the CPU, memory, storage, I/O devices, and the buses that connect them. Understanding how these pieces work together is essential for system programming.
Why This Matters
When you write system-level code, you're not just writing instructions for an abstract machine - you're writing for real hardware with specific characteristics. The CPU has a limited number of registers. Memory access takes time. Storage is even slower. Understanding these constraints helps you write efficient, correct system software.
How to Study This Chapter
- Visualize - Draw diagrams of how components connect
- Research your own hardware - Look up your CPU specs
- Compare - Think about how phone CPUs differ from desktop CPUs
- Connect - Relate each component to programs you've written
The Basic Computer Model
At its core, nearly every general-purpose computer follows the von Neumann architecture (named after mathematician John von Neumann):
+-------------+ +-------------+
| | <----> | |
| CPU | | Memory |
| | <----> | |
+-------------+ +-------------+
^ ^
| |
v v
+-------------+ +-------------+
| Storage | <----> | I/O |
+-------------+ +-------------+
Key principle: Both program instructions and data are stored in memory and can be manipulated by the CPU.
The Central Processing Unit (CPU)
The CPU is the "brain" of the computer. It fetches instructions from memory, decodes them, and executes them.
CPU Components
1. Arithmetic Logic Unit (ALU)
- Performs mathematical operations (add, subtract, multiply, divide)
- Performs logical operations (AND, OR, NOT, XOR)
- Compares values
2. Control Unit (CU)
- Fetches instructions from memory
- Decodes instructions
- Manages execution
- Controls other components
3. Registers
- Tiny, ultra-fast storage locations inside the CPU
- Typically 8, 16, 32, or 64 bits
- Different types:
- General-purpose registers - Store data and addresses
- Program Counter (PC) - Points to next instruction
- Stack Pointer (SP) - Points to top of stack
- Instruction Register (IR) - Holds current instruction
- Flags/Status Register - Stores condition codes
4. Cache
- Fast memory between CPU and main memory
- Multiple levels: L1, L2, L3
- L1 is fastest but smallest (~32-64 KB)
- L3 is slowest but largest (several MB to tens of MB)
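To make the ALU and the flags register more concrete, here is a minimal sketch of a toy ALU. The 8-bit width, the operation names, and the flag names are assumptions chosen for illustration; real processors differ in detail.

```python
# Toy 8-bit ALU: a sketch, not a model of any real processor.
MASK = 0xFF  # assumed register width of 8 bits


def alu(op, a, b=0):
    """Return (result, flags) for one operation on 8-bit operands."""
    a &= MASK
    b &= MASK
    carry = False
    if op == "ADD":
        raw = a + b
        carry = raw > MASK      # result did not fit in 8 bits
    elif op == "SUB":
        raw = a - b
        carry = raw < 0         # a borrow was needed
    elif op == "AND":
        raw = a & b
    elif op == "OR":
        raw = a | b
    elif op == "XOR":
        raw = a ^ b
    elif op == "NOT":
        raw = ~a
    else:
        raise ValueError(f"unknown operation: {op}")

    result = raw & MASK         # keep only the low 8 bits
    flags = {
        "zero": result == 0,
        "carry": carry,
        "negative": bool(result & 0x80),  # top bit set
    }
    return result, flags


result, flags = alu("ADD", 200, 100)
print(result, flags)
```

Comparison falls out of subtraction in this sketch: `alu("SUB", x, y)` sets the zero flag exactly when `x` and `y` are equal, which is one common way for a CPU to compare values.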
The Instruction Cycle (Fetch-Decode-Execute)
Every program runs through this cycle, billions of times per second:
1. FETCH
- Read instruction from memory address in PC
- Load into Instruction Register
- Increment PC
2. DECODE
- Determine what the instruction means
- Identify operands needed
3. EXECUTE
- Perform the operation
- Write results back to registers or memory
4. REPEAT
Example: Adding Two Numbers
Assembly pseudocode:
LOAD R1, [0x1000] ; Load value from memory address 0x1000 into R1
LOAD R2, [0x1004] ; Load value from memory address 0x1004 into R2
ADD R3, R1, R2 ; Add R1 and R2, store result in R3
STORE R3, [0x1008] ; Store R3 to memory address 0x1008
For each instruction:
- Fetch: Get instruction from memory
- Decode: Determine it's a LOAD/ADD/STORE
- Execute: Perform the memory access or arithmetic
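The cycle can be sketched as a small simulator that runs the four-instruction program above. This is an illustration under simplifying assumptions: instructions are Python tuples held in a separate list (a real von Neumann machine keeps encoded instructions in the same memory as data), memory is a dictionary, and the sample values 7 and 35 are made up.

```python
def run(program, memory):
    """Execute a list of (opcode, operands...) tuples; return the registers."""
    registers = {"R1": 0, "R2": 0, "R3": 0}
    pc = 0  # program counter: index of the next instruction

    while pc < len(program):
        # FETCH: copy the instruction into the instruction register, advance PC
        ir = program[pc]
        pc += 1

        # DECODE: separate the opcode from its operands
        opcode, *operands = ir

        # EXECUTE: perform the memory access or arithmetic
        if opcode == "LOAD":
            reg, address = operands
            registers[reg] = memory[address]
        elif opcode == "ADD":
            dest, src1, src2 = operands
            registers[dest] = registers[src1] + registers[src2]
        elif opcode == "STORE":
            reg, address = operands
            memory[address] = registers[reg]
        else:
            raise ValueError(f"unknown opcode: {opcode}")

    return registers


program = [
    ("LOAD", "R1", 0x1000),
    ("LOAD", "R2", 0x1004),
    ("ADD", "R3", "R1", "R2"),
    ("STORE", "R3", 0x1008),
]
memory = {0x1000: 7, 0x1004: 35, 0x1008: 0}  # sample data, chosen arbitrarily

registers = run(program, memory)
print(memory[0x1008])  # prints 42
```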
Clock Speed and Performance
Clock Speed (measured in GHz - gigahertz)
- How many clock cycles the CPU completes per second
- 3.5 GHz = 3.5 billion cycles per second
- Higher isn't always better: the work done per cycle and power efficiency matter too
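A short calculation shows why clock speed alone does not decide performance. The sketch below uses the standard relation time = instructions x cycles per instruction / clock rate. The two CPUs and their cycles-per-instruction figures are hypothetical numbers chosen only to illustrate the point.

```python
def cycle_time_ns(clock_ghz):
    """Duration of one clock cycle in nanoseconds."""
    return 1.0 / clock_ghz


def execution_time_s(instructions, cycles_per_instruction, clock_ghz):
    """Seconds needed to run the given number of instructions."""
    total_cycles = instructions * cycles_per_instruction
    return total_cycles / (clock_ghz * 1e9)


print(f"One cycle at 3.5 GHz lasts {cycle_time_ns(3.5):.3f} ns")

# Hypothetical CPUs running the same one-billion-instruction program.
fast_clock = execution_time_s(1e9, cycles_per_instruction=2.0, clock_ghz=3.5)
slow_clock = execution_time_s(1e9, cycles_per_instruction=1.0, clock_ghz=3.0)
print(f"3.5 GHz, 2.0 cycles/instruction: {fast_clock:.3f} s")
print(f"3.0 GHz, 1.0 cycles/instruction: {slow_clock:.3f} s")
```

Under these assumed figures the CPU with the lower clock speed finishes first, because it needs fewer cycles for each instruction.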
Cores
- Modern CPUs have multiple cores
- Each core can execute independently
- Enables parallel processing
Threads
- Some CPUs run multiple hardware threads per core (simultaneous multithreading, which Intel markets as Hyper-Threading)
- Improves utilization when one thread is waiting
Memory
Memory stores both program instructions and data. Understanding memory is crucial for system programming.
Memory Hierarchy
From fastest to slowest:
Level               Approx. latency        Typical size
Registers (CPU)     ~1 cycle               Bytes
L1 Cache            ~4 cycles              ~32-64 KB
L2 Cache            ~12 cycles             ~256-512 KB
L3 Cache            ~40 cycles             ~8-32 MB
Main Memory (RAM)   ~200 cycles            GB range
SSD Storage         ~50,000 cycles         ~100 GB - TB
HDD Storage         ~10,000,000 cycles     TB range
Key insight: There's a trade-off between speed, size, and cost. The faster a level is, the smaller it is and the more it costs per byte.
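To see why the hierarchy pays off, the following sketch estimates the average cost of a memory access using the cache and RAM latencies listed above. The hit rates are invented for illustration, and the model assumes each latency is the total cost of an access served at that level.

```python
# Latencies in cycles, taken from the hierarchy listed above.
LATENCY_CYCLES = {"L1": 4, "L2": 12, "L3": 40, "RAM": 200}


def average_access_cycles(hit_rates):
    """Average cycles per access.

    hit_rates maps "L1", "L2", "L3" to the chance that a request
    reaching that level is served there (assumed values).
    """
    remaining = 1.0  # fraction of accesses not yet served
    total = 0.0
    for level in ("L1", "L2", "L3"):
        served = remaining * hit_rates[level]
        total += served * LATENCY_CYCLES[level]
        remaining -= served
    # Whatever misses every cache goes to main memory.
    total += remaining * LATENCY_CYCLES["RAM"]
    return total


assumed_hit_rates = {"L1": 0.95, "L2": 0.80, "L3": 0.50}
print(f"{average_access_cycles(assumed_hit_rates):.2f} cycles on average")
```

With these assumed hit rates the average stays close to the L1 latency even though RAM is about 50 times slower, which is the practical benefit of caching.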
RAM (Random Access Memory)
Characteristics:
- Volatile (loses data when power is off)
- Any location can be accessed in roughly the same time (hence "random access")
- Organized as an array of bytes
- Each byte has a unique address
Types:
- DRAM (Dynamic RAM) - Main memory, needs refresh
- SRAM (Static RAM) - Used for cache, faster, more expensive
Memory Organization:
Address Value
0x0000 0x4A
0x0001 0x7C
0x0002 0x91
...
0xFFFF 0x23
Each address points to one byte (8 bits).
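The byte-array view of memory can be modeled directly with a Python `bytearray`. This is a sketch only: the helper names `read_u32` and `write_u32` are hypothetical, and storing the least significant byte first (little-endian order) is an assumption, since byte order varies between architectures.

```python
# 65,536 one-byte cells: addresses 0x0000 through 0xFFFF.
memory = bytearray(0x10000)

# Reproduce the sample contents shown above.
memory[0x0000] = 0x4A
memory[0x0001] = 0x7C
memory[0x0002] = 0x91
memory[0xFFFF] = 0x23


def write_u32(mem, address, value):
    """Store a 32-bit value in 4 consecutive bytes (little-endian assumed)."""
    mem[address:address + 4] = value.to_bytes(4, "little")


def read_u32(mem, address):
    """Read 4 consecutive bytes back as one unsigned integer."""
    return int.from_bytes(mem[address:address + 4], "little")


write_u32(memory, 0x0100, 0x12345678)
for address in range(0x0100, 0x0104):
    print(f"0x{address:04X}  0x{memory[address]:02X}")
```

A value wider than one byte simply occupies several neighboring addresses; the address of the value is the address of its first byte.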
Memory Access
Reading from Memory:
- CPU puts address on address bus
- Memory controller retrieves data
- Data travels back on data bus
- CPU stores in register
Writing to Memory:
- CPU puts address on address bus
- CPU puts data on data bus
- Memory controller stores data at address
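The read and write sequences above can be sketched as a single transaction function. In this toy model the `address` argument stands in for the address bus, `data` for the data bus, and the `write` flag for a read/write control signal. The class name and its interface are hypothetical.

```python
class ToyMemoryController:
    """Minimal stand-in for a memory controller with one-byte cells."""

    def __init__(self, size):
        self.cells = bytearray(size)

    def transaction(self, address, write, data=None):
        """Perform one read or write, as selected by the write signal."""
        if not 0 <= address < len(self.cells):
            raise IndexError(f"address 0x{address:X} is out of range")
        if write:
            self.cells[address] = data   # store what arrived on the data bus
            return None
        return self.cells[address]       # send the stored byte back


controller = ToyMemoryController(size=256)

# Write: the CPU supplies both the address and the data.
controller.transaction(address=0x10, write=True, data=0x4A)

# Read: the CPU supplies only the address and keeps the result in a register.
register = controller.transaction(address=0x10, write=False)
print(hex(register))
```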
Memory Size Units
1 Byte (B) = 8 bits
1 Kilobyte (KB) = 1,024 bytes
1 Megabyte (MB) = 1,024 KB = 1,048,576 bytes
1 Gigabyte (GB) = 1,024 MB = ~1 billion bytes
1 Terabyte (TB) = 1,024 GB = ~1 trillion bytes
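Note: Sometimes KB means 1,000 bytes (decimal, common on storage packaging) and sometimes 1,024 bytes (binary, common for RAM). Context matters. The IEC units KiB, MiB, GiB, and TiB always mean powers of 1,024.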
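The gap between decimal and binary units grows with size, as this short calculation shows. The constant and function names are chosen for illustration.

```python
# Binary units: each step is a factor of 1,024.
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4


def decimal_tb_to_binary_tib(terabytes):
    """Convert decimal terabytes (10**12 bytes) to binary units (2**40 bytes)."""
    return terabytes * 10 ** 12 / TIB


print(f"1 GiB = {GIB:,} bytes")
print(f"A drive sold as 1 TB holds about {decimal_tb_to_binary_tib(1):.3f} TiB")
```

This is why a drive labeled in decimal units appears smaller when a tool reports its capacity in binary units.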
Storage
Storage is non-volatile - it persists data even when powered off.
Hard Disk Drives (HDD)
How they work:
- Spinning magnetic platters
- Read/write heads move across surface
- Mechanical movement = relatively slow
- Typical speed: 7200 RPM or 5400 RPM
Characteristics:
- High capacity (several TB)
- Inexpensive per GB
- Slow random access (~10 ms)
- Can fail mechanically
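Part of the HDD access time comes straight from the rotation speed: on average, the platter must turn half a revolution before the wanted data passes under the head. The sketch below computes that component only; head movement (seek time) is extra and is not modeled here.

```python
def rotation_time_ms(rpm):
    """Milliseconds needed for one full revolution of the platter."""
    return 60_000 / rpm


def average_rotational_latency_ms(rpm):
    """On average the target sector is half a revolution away."""
    return rotation_time_ms(rpm) / 2


for rpm in (7200, 5400):
    print(
        f"{rpm} RPM: {rotation_time_ms(rpm):.2f} ms per revolution, "
        f"{average_rotational_latency_ms(rpm):.2f} ms average rotational latency"
    )
```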
Solid State Drives (SSD)
How they work:
- Flash memory chips (no moving parts)
- Electronically store data
- Much faster than HDD
Characteristics:
- Fast (microseconds vs milliseconds)
- More expensive per GB than HDD
- Limited number of write cycles per cell (though high enough for typical use)
- No moving parts, so no mechanical failure (the electronics can still fail)
Storage Interface
Common interfaces:
- SATA - Serial ATA, common for HDDs and SSDs
- NVMe - Much faster, uses PCIe interface
- USB - External storage
Buses
Buses are communication pathways that connect components.
Types of Buses
1. Data Bus
- Carries actual data between CPU, memory, and devices
- Width (8-bit, 16-bit, 32-bit, 64-bit) affects performance
- Bidirectional
2. Address Bus
- Carries memory addresses
- Width determines maximum addressable memory
- 32-bit = 4 GB max, 64-bit = 16 exabytes max
- Unidirectional (CPU to memory)
3. Control Bus
- Carries control signals
- Read/Write signals
- Clock signals
- Interrupt signals
Example: 32-bit vs 64-bit Systems
32-bit:
- 32-bit registers
- 32-bit address bus → maximum 4 GB RAM (2³² bytes)
- Each memory address is 32 bits
64-bit:
- 64-bit registers
- 64-bit address bus → theoretically 16 exabytes (2⁶⁴ bytes); in practice, CPUs implement fewer address bits than this
- Each memory address is 64 bits
- Can handle larger numbers in a single operation
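The limits quoted above follow directly from powers of two. A quick check, with function names chosen for illustration:

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


def max_unsigned(register_bits):
    """Largest unsigned integer that fits in a register of this width."""
    return 2 ** register_bits - 1


GIB = 1024 ** 3
EIB = 1024 ** 6

print(f"32-bit: {addressable_bytes(32):,} addresses = {addressable_bytes(32) // GIB} GiB")
print(f"64-bit: {addressable_bytes(64):,} addresses = {addressable_bytes(64) // EIB} EiB")
print(f"Largest 32-bit unsigned value: {max_unsigned(32):,}")
print(f"Largest 64-bit unsigned value: {max_unsigned(64):,}")
```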
Input/Output (I/O)
I/O devices let computers interact with the external world.
Common I/O Devices
Input:
- Keyboard
- Mouse
- Camera
- Microphone
- Network interface
Output:
- Display
- Speakers
- Printer
- Network interface
I/O Methods
1. Port-Mapped I/O
- Special I/O instructions (IN, OUT in x86)
- Separate address space for devices
2. Memory-Mapped I/O
- Devices mapped to memory addresses
- Use regular memory read/write operations
- Simpler programming model
3. Direct Memory Access (DMA)
- Device can transfer data to and from memory without the CPU copying each byte
- CPU sets up transfer, then device handles it
- Much more efficient for large transfers
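Memory-mapped I/O can be illustrated with a toy bus that routes each address either to RAM or to a device register. Everything here is hypothetical: the address `0xFF00`, the name `UART_DATA`, and the list standing in for the device's hardware are assumptions made for the sketch.

```python
class ToyBus:
    """Routes each address to RAM or to a single device register."""

    UART_DATA = 0xFF00  # hypothetical address of the device's data register

    def __init__(self):
        self.ram = bytearray(0xFF00)  # RAM covers 0x0000 through 0xFEFF
        self.device_output = []       # stands in for the device hardware

    def write(self, address, value):
        if address == self.UART_DATA:
            # The device, not RAM, receives this byte.
            self.device_output.append(chr(value))
        else:
            self.ram[address] = value

    def read(self, address):
        if address == self.UART_DATA:
            return 0  # this toy device never has input waiting
        return self.ram[address]


bus = ToyBus()
for byte in b"Hi":
    bus.write(ToyBus.UART_DATA, byte)  # looks like an ordinary memory store
bus.write(0x0010, 0x4A)                # a real memory store

print("".join(bus.device_output), hex(bus.read(0x0010)))
```

The program issues the same kind of write in both cases; only the address decides whether RAM or the device responds. With port-mapped I/O, the device access would use a separate instruction and address space instead.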
Putting It All Together
When you run a program:
- CPU fetches instruction from memory
- Instruction might load data from memory into registers
- ALU performs operations on register data
- Results stored back to memory
- If data isn't in cache, fetch from RAM
- If reading/writing files, access storage
- If user input needed, read from I/O devices
- If output needed, send to I/O devices
Everything cycles through this pattern, billions of times per second.
Key Concepts
- CPU fetches, decodes, and executes instructions
- Registers are the fastest storage, inside the CPU
- Cache bridges the speed gap between CPU and RAM
- RAM stores programs and data, volatile
- Storage (HDD/SSD) persists data, much slower than RAM
- Buses connect components and transfer data
- I/O devices enable external interaction
Common Mistakes
- Confusing RAM and storage - RAM is temporary, storage is persistent
- Ignoring caching - Cache behavior dramatically affects performance
- Assuming instant memory access - Memory has latency
- Overlooking word size - 32-bit vs 64-bit matters
- Forgetting I/O is slow - Disk and network I/O are bottlenecks
Debugging Tips
- Check memory size - Know your RAM limits
- Monitor CPU usage - Use tools like top, htop, Task Manager
- Profile cache performance - Cache misses slow programs
- Consider storage speed - SSD vs HDD makes huge difference
- Watch for I/O bottlenecks - Often the slowest part
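A few basic facts about your machine can also be read from a script. The sketch below uses only the Python standard library; the function name `describe_machine` is hypothetical. Note that the pointer width describes the running Python build, which may be 32-bit even on a 64-bit CPU.

```python
import os
import platform
import struct


def describe_machine():
    """Collect a few basic facts about the current machine."""
    return {
        # Number of logical CPUs; None if it cannot be determined.
        "logical_cpus": os.cpu_count(),
        # Architecture name as reported by the OS, such as "x86_64" or "arm64".
        "architecture": platform.machine(),
        # Width of a pointer in this Python build, in bits.
        "pointer_bits": struct.calcsize("P") * 8,
    }


for name, value in describe_machine().items():
    print(f"{name}: {value}")
```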
Mini Exercises
- Find your CPU model and look up its specifications (cores, clock speed, cache sizes)
- Check how much RAM your computer has
- Determine if you have an SSD, HDD, or both
- Calculate: How many memory addresses can a 32-bit system have?
- Draw a diagram of the von Neumann architecture
- Research: What is the clock speed of your CPU in GHz?
- Find out: How many cores does your CPU have?
- Calculate: If a CPU runs at 3 GHz, how many cycles per second?
- Look up: What is the size of your CPU's L1, L2, and L3 caches?
- Investigate: What interface does your storage use (SATA, NVMe)?
Review Questions
- What are the main components of a CPU?
- Describe the fetch-decode-execute cycle.
- Why is there a memory hierarchy? Why not just use registers for everything?
- What's the difference between volatile and non-volatile memory?
- How does the width of the address bus affect maximum memory?
Reference Checklist
By the end of this chapter, you should be able to:
- Explain the von Neumann architecture
- Describe the components of a CPU
- Understand the fetch-decode-execute cycle
- Explain the memory hierarchy
- Differentiate between RAM and storage
- Understand the role of buses
- Explain 32-bit vs 64-bit systems
- Describe basic I/O mechanisms
Next Steps
Now that you understand computer architecture, the next chapter dives into how computers represent data. You'll learn about binary, hexadecimal, character encoding, and how everything in a computer is ultimately just numbers.
Key Takeaway: Computers are organized into a hierarchy: CPU (with registers and cache), RAM, storage, and I/O, all connected by buses. Understanding this structure is fundamental to writing efficient system software.