Chapter 2: Computer Architecture Fundamentals

Introduction

To write system software, you must understand the hardware it runs on. This chapter explores the fundamental components of computer architecture: the CPU, memory, storage, I/O devices, and the buses that connect them. Understanding how these pieces work together is essential for system programming.

Why This Matters

When you write system-level code, you're not just writing instructions for an abstract machine - you're writing for real hardware with specific characteristics. The CPU has a limited number of registers. Memory access takes time. Storage is even slower. Understanding these constraints helps you write efficient, correct system software.

How to Study This Chapter

Visualize - Draw diagrams of how components connect
Research your own hardware - Look up your CPU specs
Compare - Think about how phone CPUs differ from desktop CPUs
Connect - Relate each component to programs you've written

The Basic Computer Model

At their core, most general-purpose computers follow the von Neumann architecture (named after mathematician John von Neumann):

+-------------+          +-------------+
|             |  <---->  |             |
|     CPU     |          |   Memory    |
|             |  <---->  |             |
+-------------+          +-------------+
      ^                        ^
      |                        |
      v                        v
+-------------+          +-------------+
|   Storage   |  <---->  |    I/O      |
+-------------+          +-------------+

Key principle: Both program instructions and data are stored in memory and can be manipulated by the CPU.

The Central Processing Unit (CPU)

The CPU is the "brain" of the computer. It fetches instructions from memory, decodes them, and executes them.

CPU Components

1. Arithmetic Logic Unit (ALU)

Performs mathematical operations (add, subtract, multiply, divide)
Performs logical operations (AND, OR, NOT, XOR)
Compares values
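
The operations above can be sketched as a toy ALU in Python. This is only an illustration: the function name alu, the operation mnemonics, the 8-bit width, and the two flags are assumptions made for the example, not a description of any particular CPU.

```python
MASK = 0xFF  # assume an 8-bit data path for this toy model


def alu(op, a, b=0):
    """Return (result, flags) for one operation of a toy 8-bit ALU."""
    carry = False
    if op == "ADD":
        raw = a + b
        carry = raw > MASK      # result did not fit in 8 bits
    elif op == "SUB":
        raw = a - b
        carry = a < b           # a borrow was needed
    elif op == "AND":
        raw = a & b
    elif op == "OR":
        raw = a | b
    elif op == "XOR":
        raw = a ^ b
    elif op == "NOT":
        raw = ~a
    else:
        raise ValueError(f"unknown operation: {op}")
    result = raw & MASK         # keep only the low 8 bits
    return result, {"zero": result == 0, "carry": carry}


result, flags = alu("ADD", 200, 100)
print(result, flags["carry"])   # 44 True

# Comparing two values: subtract them and look at the zero flag
_, flags = alu("SUB", 7, 7)
print(flags["zero"])            # True
```

Note how comparison falls out of subtraction: if the result is zero, the two values were equal. The flags here play the role of the condition codes kept in a status register.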

2. Control Unit (CU)

Fetches instructions from memory
Decodes instructions
Manages execution
Controls other components

3. Registers

Tiny, ultra-fast storage locations inside the CPU
Typically 8, 16, 32, or 64 bits
Different types:
- General-purpose registers - Store data and addresses
- Program Counter (PC) - Points to next instruction
- Stack Pointer (SP) - Points to top of stack
- Instruction Register (IR) - Holds current instruction
- Flags/Status Register - Stores condition codes

4. Cache

Fast memory between CPU and main memory
Multiple levels: L1, L2, L3
L1 is fastest but smallest (~32-64 KB)
L3 is slowest but largest (typically several MB to tens of MB)
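
To get a feel for why caches help, here is a minimal sketch of one simple cache organization, a direct-mapped cache. The class name, the 4-line by 16-byte geometry, and the hit/miss counters are assumptions chosen to keep the example small; real caches are far larger and usually more sophisticated.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that only tracks hits and misses."""

    def __init__(self, num_lines=4, line_size=16):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # which memory block each line holds
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        block = addr // self.line_size   # memory block containing addr
        index = block % self.num_lines   # cache line that block maps to
        tag = block // self.num_lines    # distinguishes blocks sharing a line
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.misses += 1
        self.tags[index] = tag           # simulate fetching the block from RAM
        return False


cache = DirectMappedCache()
for addr in range(64):                   # read 64 consecutive bytes
    cache.access(addr)
print(cache.hits, cache.misses)          # 60 4
```

Sequential access misses only once per 16-byte line, so 60 of the 64 reads are served from the cache. Access patterns that jump around memory would see far more misses.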

The Instruction Cycle (Fetch-Decode-Execute)

Every program runs through this cycle, billions of times per second:

1. FETCH
   - Read instruction from memory address in PC
   - Load into Instruction Register
   - Increment PC

2. DECODE
   - Determine what the instruction means
   - Identify operands needed

3. EXECUTE
   - Perform the operation
   - Write results back to registers or memory

4. REPEAT

Example: Adding Two Numbers

Assembly pseudocode:
  LOAD R1, [0x1000]    ; Load value from memory address 0x1000 into R1
  LOAD R2, [0x1004]    ; Load value from memory address 0x1004 into R2
  ADD R3, R1, R2       ; Add R1 and R2, store result in R3
  STORE R3, [0x1008]   ; Store R3 to memory address 0x1008

For each instruction:

Fetch: Get instruction from memory
Decode: Determine it's a LOAD/ADD/STORE
Execute: Perform the memory access or arithmetic
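
The same four instructions can be run through a small simulator to make the cycle concrete. This is a sketch under simplifying assumptions: instructions are Python tuples rather than encoded bytes, the program is kept in a separate list instead of in the same memory as the data, and the values 5 and 7 are made up for the example.

```python
def run(program, memory):
    """Run a list of (opcode, operands...) tuples on a toy machine."""
    registers = {"R1": 0, "R2": 0, "R3": 0}
    pc = 0                                   # program counter
    while pc < len(program):
        instruction = program[pc]            # FETCH the instruction at PC
        pc += 1                              # increment PC
        opcode, *operands = instruction      # DECODE opcode and operands
        if opcode == "LOAD":                 # EXECUTE
            reg, addr = operands
            registers[reg] = memory[addr]
        elif opcode == "ADD":
            dest, src1, src2 = operands
            registers[dest] = registers[src1] + registers[src2]
        elif opcode == "STORE":
            reg, addr = operands
            memory[addr] = registers[reg]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return registers


program = [
    ("LOAD", "R1", 0x1000),
    ("LOAD", "R2", 0x1004),
    ("ADD", "R3", "R1", "R2"),
    ("STORE", "R3", 0x1008),
]
memory = {0x1000: 5, 0x1004: 7, 0x1008: 0}   # sample data, not from a real program
registers = run(program, memory)
print(memory[0x1008])                        # 12
```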

Clock Speed and Performance

Clock Speed (measured in GHz - gigahertz)

How many clock cycles the CPU completes per second
3.5 GHz = 3.5 billion cycles per second
Higher isn't always better (work done per cycle and power efficiency matter too)
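
The arithmetic behind these figures is simple enough to check directly. The helper names below are made up for this sketch.

```python
def cycles_per_second(clock_ghz):
    """Clock cycles per second for a clock speed given in GHz."""
    return clock_ghz * 1e9


def cycle_time_ns(clock_ghz):
    """Duration of one clock cycle in nanoseconds."""
    return 1.0 / clock_ghz       # 1 GHz means one cycle per nanosecond


print(f"{cycles_per_second(3.5):.2e} cycles/s")   # 3.50e+09 cycles/s
print(f"{cycle_time_ns(3.5):.3f} ns per cycle")   # 0.286 ns per cycle
```

At 3.5 GHz a single cycle lasts well under a nanosecond, which is why latencies later in this chapter are easiest to compare when counted in cycles.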

Cores

Modern CPUs have multiple cores
Each core can execute independently
Enables parallel processing

Threads

Some CPUs run multiple hardware threads per core (simultaneous multithreading; Intel calls its version Hyper-Threading)
Improves utilization when one thread is waiting
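
One quick way to see how many execution units your operating system exposes is Python's os.cpu_count(). Keep in mind that it reports logical CPUs (hardware threads), so on a machine with two threads per core the number is typically twice the physical core count.

```python
import os

logical_cpus = os.cpu_count()    # may be None if it cannot be determined
if logical_cpus is None:
    print("Could not determine the number of logical CPUs")
else:
    print(f"Logical CPUs visible to the OS: {logical_cpus}")
```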

Memory

Memory stores both program instructions and data. Understanding memory is crucial for system programming.

Memory Hierarchy

From fastest to slowest:

Level               Typical latency       Typical capacity
Registers (CPU)     ~1 cycle              ~bytes
L1 Cache            ~4 cycles             ~32-64 KB
L2 Cache            ~12 cycles            ~256-512 KB
L3 Cache            ~40 cycles            ~8-32 MB
Main Memory (RAM)   ~200 cycles           GB range
SSD Storage         ~50,000 cycles        ~100 GB - TB
HDD Storage         ~10,000,000 cycles    TB range

Key insight: There's a trade-off between speed, size, and cost.
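
Cycle counts are easier to appreciate once converted to time. The sketch below uses the approximate cycle counts from the table above and assumes a 3 GHz clock purely for illustration; actual latencies vary from machine to machine.

```python
CLOCK_HZ = 3.0e9  # assumed 3 GHz clock, for illustration only

latency_cycles = {
    "Registers": 1,
    "L1 Cache": 4,
    "L2 Cache": 12,
    "L3 Cache": 40,
    "Main Memory (RAM)": 200,
    "SSD Storage": 50_000,
    "HDD Storage": 10_000_000,
}


def cycles_to_ns(cycles, clock_hz=CLOCK_HZ):
    """Convert a cycle count to nanoseconds at the given clock rate."""
    return cycles / clock_hz * 1e9


for level, cycles in latency_cycles.items():
    print(f"{level:<18} {cycles:>12,} cycles  {cycles_to_ns(cycles):>14,.1f} ns")
```

Under these assumptions a RAM access costs roughly 67 ns and an HDD access roughly 3.3 ms, a gap of several orders of magnitude between adjacent ends of the hierarchy.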

RAM (Random Access Memory)

Characteristics:

Volatile (loses data when power is off)
Any location can be accessed in roughly the same time (hence "random access")
Organized as an array of bytes
Each byte has a unique address

Types:

DRAM (Dynamic RAM) - Main memory, needs refresh
SRAM (Static RAM) - Used for cache, faster, more expensive

Memory Organization:

Address    Value
0x0000     0x4A
0x0001     0x7C
0x0002     0x91
...
0xFFFF     0x23

Each address points to one byte (8 bits).
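
This model of memory maps neatly onto a Python bytearray, where the index plays the role of the address. The sketch below assumes a 64 KB address space to match the table above; the variable name memory is just for the example.

```python
memory = bytearray(0x10000)      # 65,536 bytes: addresses 0x0000 through 0xFFFF

# Store the sample values from the table above
memory[0x0000] = 0x4A
memory[0x0001] = 0x7C
memory[0x0002] = 0x91
memory[0xFFFF] = 0x23

print(hex(memory[0x0001]))       # 0x7c
print(len(memory))               # 65536

# A single byte can only hold values 0 through 255
try:
    memory[0x0003] = 0x100
except ValueError:
    print("0x100 does not fit in one byte")
```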

Memory Access

Reading from Memory:

CPU puts address on address bus
Memory controller retrieves data
Data travels back on data bus
CPU stores in register

Writing to Memory:

CPU puts address on address bus
CPU puts data on data bus
Memory controller stores data at address

Memory Size Units

1 Byte (B)      = 8 bits
1 Kilobyte (KB) = 1,024 bytes
1 Megabyte (MB) = 1,024 KB = 1,048,576 bytes
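1 Gigabyte (GB) = 1,024 MB = 1,073,741,824 bytes (~1.07 billion)
1 Terabyte (TB) = 1,024 GB = 1,099,511,627,776 bytes (~1.1 trillion)

Note: Sometimes KB means 1,000 bytes (decimal), sometimes 1,024 (binary). Context matters: storage vendors usually quote decimal units, while memory sizes and many operating system tools use binary ones. The unambiguous names for the binary units are KiB, MiB, GiB, and TiB.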
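
The difference between the two conventions grows with size. As a small worked example, here is what happens when a capacity quoted in decimal units is displayed in binary units. The 1 TB figure is just a convenient example value.

```python
GIB = 1024 ** 3                      # one binary gigabyte (2**30 bytes)

advertised_bytes = 10 ** 12          # "1 TB" in decimal units
binary_gigabytes = advertised_bytes / GIB

print(f"{binary_gigabytes:.1f}")     # 931.3
```

So a drive sold as 1 TB in decimal units shows up as roughly 931 binary gigabytes, with no space actually missing.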

Storage

Storage is non-volatile - it persists data even when powered off.

Hard Disk Drives (HDD)

How they work:

Spinning magnetic platters
Read/write heads move across surface
Mechanical movement = relatively slow
Typical speed: 7200 RPM or 5400 RPM

Characteristics:

High capacity (several TB)
Inexpensive per GB
Slow random access (~10 ms)
Can fail mechanically
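
Part of that access time comes straight from the rotation speed: on average the drive must wait half a rotation for the right sector to pass under the head. The calculation below covers only this rotational component; moving the head to the right track (seek time) adds more. The function name is made up for this sketch.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for a sector to rotate under the head, in milliseconds."""
    seconds_per_rotation = 60.0 / rpm
    return seconds_per_rotation / 2 * 1000   # half a rotation on average


for rpm in (7200, 5400):
    print(f"{rpm} RPM: {avg_rotational_latency_ms(rpm):.2f} ms")
# 7200 RPM: 4.17 ms
# 5400 RPM: 5.56 ms
```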

Solid State Drives (SSD)

How they work:

Flash memory chips (no moving parts)
Electronically store data
Much faster than HDD

Characteristics:

Fast (microseconds vs milliseconds)
More expensive per GB than HDD
Limited number of write cycles per cell (high enough for typical use)
No moving parts, so no mechanical failure (though SSDs can still fail electronically)

Storage Interface

Common interfaces:

SATA - Serial ATA, common for HDDs and SSDs
NVMe - Much faster, uses PCIe interface
USB - External storage

Buses

Buses are communication pathways that connect components.

Types of Buses

1. Data Bus

Carries actual data between CPU, memory, and devices
Width (8-bit, 16-bit, 32-bit, 64-bit) affects performance
Bidirectional

2. Address Bus

Carries memory addresses
Width determines maximum addressable memory
With one byte per address: 32-bit = 4 GB max, 64-bit = 16 exabytes max
Unidirectional (CPU to memory)

3. Control Bus

Carries control signals
Read/Write signals
Clock signals
Interrupt signals

Example: 32-bit vs 64-bit Systems

32-bit:

32-bit registers
32-bit address bus → at most 4 GB of directly addressable memory (2³² bytes)
Each memory address is 32 bits

64-bit:

64-bit registers
64-bit address bus → theoretically 16 exabytes (2⁶⁴ bytes); real CPUs implement fewer address bits
Each memory address is 64 bits
Can handle larger numbers in a single operation
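
These limits follow directly from the number of distinct values an address can take. A quick check, assuming one byte per address and binary units (the helper name is made up for this sketch):

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


GB = 2 ** 30   # binary gigabyte
EB = 2 ** 60   # binary exabyte

print(addressable_bytes(32))          # 4294967296
print(addressable_bytes(32) // GB)    # 4
print(addressable_bytes(64) // EB)    # 16
```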

Input/Output (I/O)

I/O devices let computers interact with the external world.

Common I/O Devices

Input:

Keyboard
Mouse
Camera
Microphone
Network interface

Output:

Display
Speakers
Printer
Network interface

I/O Methods

1. Port-Mapped I/O

Special I/O instructions (IN, OUT in x86)
Separate address space for devices

2. Memory-Mapped I/O

Devices mapped to memory addresses
Use regular memory read/write operations
Simpler programming model

3. Direct Memory Access (DMA)

Device can transfer data to and from memory directly, without the CPU copying each byte
CPU sets up transfer, then device handles it
Much more efficient for large transfers
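
The idea behind memory-mapped I/O can be sketched with a toy bus that routes each address either to RAM or to a device. Everything here is hypothetical: the ToyBus class, the address 0xFF00, and the one-byte output device exist only for this example and do not correspond to real hardware.

```python
class ToyBus:
    """Routes addresses to RAM or to a made-up output device."""

    DEVICE_ADDR = 0xFF00             # hypothetical device register address

    def __init__(self):
        self.ram = bytearray(0x10000)
        self.device_output = []      # bytes the "device" has received

    def write(self, addr, value):
        if addr == self.DEVICE_ADDR:
            self.device_output.append(value)   # goes to the device, not RAM
        else:
            self.ram[addr] = value

    def read(self, addr):
        if addr == self.DEVICE_ADDR:
            return 0                 # this toy device has nothing to read back
        return self.ram[addr]


bus = ToyBus()
for byte in b"Hi":
    bus.write(ToyBus.DEVICE_ADDR, byte)   # looks like an ordinary memory write
bus.write(0x0010, 0x2A)                   # a real memory write

print(bytes(bus.device_output))           # b'Hi'
print(bus.read(0x0010))                   # 42
```

From the program's point of view both writes use the same operation; only the address decides whether the data lands in RAM or reaches the device. That is what makes the programming model simple.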

Putting It All Together

When you run a program:

CPU fetches instruction from memory
Instruction might load data from memory into registers
ALU performs operations on register data
Results stored back to memory
If data isn't in cache, fetch from RAM
If reading/writing files, access storage
If user input needed, read from I/O devices
If output needed, send to I/O devices

Everything cycles through this pattern, billions of times per second.

Key Concepts

CPU fetches, decodes, and executes instructions
Registers are the fastest storage, inside the CPU
Cache bridges the speed gap between CPU and RAM
RAM stores programs and data, volatile
Storage (HDD/SSD) persists data, much slower than RAM
Buses connect components and transfer data
I/O devices enable external interaction

Common Mistakes

Confusing RAM and storage - RAM is temporary (volatile), storage is persistent
Ignoring caching - Cache behavior dramatically affects performance
Assuming instant memory access - Memory has latency
Overlooking word size - 32-bit vs 64-bit matters
Forgetting I/O is slow - Disk and network I/O are bottlenecks

Debugging Tips

Check memory size - Know your RAM limits
Monitor CPU usage - Use tools like top, htop, Task Manager
Profile cache performance - Cache misses slow programs
Consider storage speed - SSD vs HDD makes huge difference
Watch for I/O bottlenecks - Often the slowest part

Mini Exercises

Find your CPU model and look up its specifications (cores, clock speed, cache sizes)
Check how much RAM your computer has
Determine if you have an SSD, HDD, or both
Calculate: How many memory addresses can a 32-bit system have?
Draw a diagram of the von Neumann architecture
Research: What is the clock speed of your CPU in GHz?
Find out: How many cores does your CPU have?
Calculate: If a CPU runs at 3 GHz, how many cycles per second?
Look up: What is the size of your CPU's L1, L2, and L3 caches?
Investigate: What interface does your storage use (SATA, NVMe)?

Review Questions

What are the main components of a CPU?
Describe the fetch-decode-execute cycle.
Why is there a memory hierarchy? Why not just use registers for everything?
What's the difference between volatile and non-volatile memory?
How does the width of the address bus affect maximum memory?

Reference Checklist

By the end of this chapter, you should be able to:

Explain the von Neumann architecture
Describe the components of a CPU
Understand the fetch-decode-execute cycle
Explain the memory hierarchy
Differentiate between RAM and storage
Understand the role of buses
Explain 32-bit vs 64-bit systems
Describe basic I/O mechanisms

Next Steps

Now that you understand computer architecture, the next chapter dives into how computers represent data. You'll learn about binary, hexadecimal, character encoding, and how everything in a computer is ultimately just numbers.

Key Takeaway: Computers are organized into a hierarchy: CPU (with registers and cache), RAM, storage, and I/O, all connected by buses. Understanding this structure is fundamental to writing efficient system software.