Chapter 2: Computer Architecture Fundamentals

Introduction

To write system software, you must understand the hardware it runs on. This chapter explores the fundamental components of computer architecture: the CPU, memory, storage, I/O devices, and the buses that connect them. Understanding how these pieces work together is essential for system programming.

Why This Matters

When you write system-level code, you're not just writing instructions for an abstract machine - you're writing for real hardware with specific characteristics. The CPU has a limited number of registers. Memory access takes time. Storage is even slower. Understanding these constraints helps you write efficient, correct system software.

How to Study This Chapter

  1. Visualize - Draw diagrams of how components connect
  2. Research your own hardware - Look up your CPU specs
  3. Compare - Think about how phone CPUs differ from desktop CPUs
  4. Connect - Relate each component to programs you've written

The Basic Computer Model

At its core, nearly every general-purpose computer follows the von Neumann architecture (named after mathematician John von Neumann):

+-------------+          +-------------+
|             |  <---->  |             |
|     CPU     |          |   Memory    |
|             |  <---->  |             |
+-------------+          +-------------+
      ^                        ^
      |                        |
      v                        v
+-------------+          +-------------+
|   Storage   |  <---->  |    I/O      |
+-------------+          +-------------+

Key principle: Both program instructions and data are stored in memory and can be manipulated by the CPU.

The Central Processing Unit (CPU)

The CPU is the "brain" of the computer. It fetches instructions from memory, decodes them, and executes them.

CPU Components

1. Arithmetic Logic Unit (ALU)

  • Performs mathematical operations (add, subtract, multiply, divide)
  • Performs logical operations (AND, OR, NOT, XOR)
  • Compares values

2. Control Unit (CU)

  • Fetches instructions from memory
  • Decodes instructions
  • Manages execution
  • Controls other components

3. Registers

  • Tiny, ultra-fast storage locations inside the CPU
  • Typically 8, 16, 32, or 64 bits
  • Different types:
    • General-purpose registers - Store data and addresses
    • Program Counter (PC) - Points to next instruction
    • Stack Pointer (SP) - Points to top of stack
    • Instruction Register (IR) - Holds current instruction
    • Flags/Status Register - Stores condition codes
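
To make the ALU and the flags register a little more concrete, here is a minimal sketch of an 8-bit ALU in Python. The function name alu, the operation names, and the two flags shown (zero and carry) are choices made for this illustration; real CPUs define their own operations and usually have more flags.

```python
def alu(op, a, b=0, bits=8):
    """Toy ALU: returns (result, flags) for unsigned values of the given width."""
    mask = (1 << bits) - 1          # 0xFF for 8 bits
    a &= mask
    b &= mask
    if op == "ADD":
        raw = a + b
    elif op == "SUB":
        raw = a - b
    elif op == "AND":
        raw = a & b
    elif op == "OR":
        raw = a | b
    elif op == "XOR":
        raw = a ^ b
    elif op == "NOT":
        raw = ~a
    else:
        raise ValueError(f"unknown operation: {op}")
    result = raw & mask             # keep only the bits that fit in the register
    flags = {
        "zero": result == 0,
        # carry (or borrow) only has meaning for arithmetic in this sketch
        "carry": op in ("ADD", "SUB") and not (0 <= raw <= mask),
    }
    return result, flags

print(alu("ADD", 200, 100))   # does not fit in 8 bits, so the carry flag is set
print(alu("SUB", 5, 5))       # equal values: the zero flag is set
```

Comparing two values is typically done this way: subtract them and look at the flags rather than at the result.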

4. Cache

  • Fast memory between CPU and main memory
  • Multiple levels: L1, L2, L3
  • L1 is fastest but smallest (~32-64 KB)
  • L2 sits in between (~256-512 KB)
  • L3 is slowest but largest (several MB to tens of MB)
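
Why a small cache helps at all depends on the access pattern. The sketch below simulates a toy direct-mapped cache (one common organization, assumed here for simplicity) and counts hits for two patterns. The function name count_cache_hits and the sizes (4 lines of 16 bytes) are made up for this example and are far smaller than any real cache.

```python
def count_cache_hits(addresses, num_lines=4, line_size=16):
    """Count hits for a sequence of byte addresses in a toy direct-mapped cache."""
    cached_block = [None] * num_lines    # which memory block each cache line holds
    hits = 0
    for address in addresses:
        block = address // line_size     # memory block containing this byte
        index = block % num_lines        # the one cache line this block may use
        if cached_block[index] == block:
            hits += 1
        else:
            cached_block[index] = block  # miss: load the block, evicting the old one
    return hits

sequential = range(64)                   # bytes 0, 1, 2, ... 63
strided = range(0, 64 * 64, 64)          # 64 accesses, each 64 bytes apart

print(count_cache_hits(sequential), "hits out of 64 (sequential)")
print(count_cache_hits(strided), "hits out of 64 (strided)")
```

The sequential pattern hits 60 times out of 64, because each miss brings in 16 neighboring bytes. The strided pattern never hits, because every access lands in a different block that maps to the same cache line.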

The Instruction Cycle (Fetch-Decode-Execute)

Every program runs through this cycle, billions of times per second:

1. FETCH
   - Read instruction from memory address in PC
   - Load into Instruction Register
   - Increment PC

2. DECODE
   - Determine what the instruction means
   - Identify operands needed

3. EXECUTE
   - Perform the operation
   - Write results back to registers or memory

4. REPEAT

Example: Adding Two Numbers

Assembly pseudocode:
  LOAD R1, [0x1000]    ; Load value from memory address 0x1000 into R1
  LOAD R2, [0x1004]    ; Load value from memory address 0x1004 into R2
  ADD R3, R1, R2       ; Add R1 and R2, store result in R3
  STORE R3, [0x1008]   ; Store R3 to memory address 0x1008

For each instruction:

  1. Fetch: Get instruction from memory
  2. Decode: Determine it's a LOAD/ADD/STORE
  3. Execute: Perform the memory access or arithmetic
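
The cycle can be sketched as a small simulator that runs the four instructions above. This is an illustration under simplifying assumptions: instructions are stored as Python tuples rather than binary encodings, each instruction occupies one address, and a HALT instruction (not part of the example above) is added so the loop can stop. The data values 7 and 5 are arbitrary. Note that instructions and data live in the same memory, as in the von Neumann model.

```python
def run(memory, pc=0):
    """Fetch, decode, and execute instructions until HALT. Returns the registers."""
    regs = {"R1": 0, "R2": 0, "R3": 0}
    while True:
        instruction = memory[pc]        # FETCH: read the instruction at address PC
        pc += 1                         #        and advance PC to the next one
        op, *operands = instruction     # DECODE: split into operation and operands
        if op == "LOAD":                # EXECUTE
            reg, address = operands
            regs[reg] = memory[address]
        elif op == "ADD":
            dest, src1, src2 = operands
            regs[dest] = regs[src1] + regs[src2]
        elif op == "STORE":
            reg, address = operands
            memory[address] = regs[reg]
        elif op == "HALT":              # hypothetical stop instruction for this sketch
            return regs
        else:
            raise ValueError(f"unknown instruction: {instruction}")

memory = {
    0: ("LOAD", "R1", 0x1000),
    1: ("LOAD", "R2", 0x1004),
    2: ("ADD", "R3", "R1", "R2"),
    3: ("STORE", "R3", 0x1008),
    4: ("HALT",),
    0x1000: 7,                          # first operand
    0x1004: 5,                          # second operand
    0x1008: 0,                          # the result will be written here
}

registers = run(memory)
print(registers["R3"], memory[0x1008])
```

After the run, both R3 and the memory word at 0x1008 hold 12.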

Clock Speed and Performance

Clock Speed (measured in GHz - gigahertz)

  • The number of clock cycles the CPU completes per second
  • 3.5 GHz = 3.5 billion cycles per second
  • Higher isn't always better: the work done per cycle and power efficiency matter too
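
The arithmetic behind these numbers is simple enough to sketch. Since 1 GHz is one cycle per nanosecond, the length of a cycle is just the reciprocal of the clock speed. The function names below are invented for this example.

```python
def cycle_time_ns(clock_ghz):
    """Duration of one clock cycle in nanoseconds (1 GHz = 1 cycle per ns)."""
    return 1.0 / clock_ghz

def cycles_in(duration_ns, clock_ghz):
    """How many clock cycles elapse during duration_ns nanoseconds."""
    return duration_ns * clock_ghz

print(f"One cycle at 3.5 GHz lasts {cycle_time_ns(3.5):.3f} ns")
print(f"A 100 ns wait at 3.5 GHz costs {cycles_in(100, 3.5):.0f} cycles")
```

The second function is the useful one in practice: it shows how many cycles a CPU spends idle while it waits for something slow.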

Cores

  • Modern CPUs have multiple cores
  • Each core can execute independently
  • Enables parallel processing

Threads

  • Some CPUs run multiple hardware threads per core (simultaneous multithreading; Intel calls it Hyper-Threading)
  • Keeps the core busy when one thread is stalled, for example while waiting on memory

Memory

Memory stores both program instructions and data. Understanding memory is crucial for system programming.

Memory Hierarchy

From fastest to slowest:

Level                Approx. latency       Typical size
Registers (CPU)      ~1 cycle              bytes
L1 Cache             ~4 cycles             ~32-64 KB
L2 Cache             ~12 cycles            ~256-512 KB
L3 Cache             ~40 cycles            ~8-32 MB
Main Memory (RAM)    ~200 cycles           GB range
SSD Storage          ~50,000 cycles        ~100 GB to several TB
HDD Storage          ~10,000,000 cycles    TB range

Key insight: There's a trade-off between speed, size, and cost. The faster a level is, the smaller it is and the more it costs per byte.
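
Cycle counts are hard to picture, so the sketch below converts the approximate latencies from the hierarchy into nanoseconds. It assumes a 3 GHz clock, which is an arbitrary choice for illustration, and the cycle counts themselves are rough figures that vary between machines.

```python
CLOCK_GHZ = 3.0  # assumed clock speed; at 3 GHz one cycle lasts 1/3 ns

# Approximate latencies in clock cycles, taken from the hierarchy above
LATENCY_CYCLES = {
    "Registers": 1,
    "L1 cache": 4,
    "L2 cache": 12,
    "L3 cache": 40,
    "Main memory (RAM)": 200,
    "SSD": 50_000,
    "HDD": 10_000_000,
}

def cycles_to_ns(cycles, clock_ghz=CLOCK_GHZ):
    """Convert a cycle count to nanoseconds at the given clock speed."""
    return cycles / clock_ghz

for level, cycles in LATENCY_CYCLES.items():
    slowdown = cycles / LATENCY_CYCLES["L1 cache"]
    print(f"{level:<18} {cycles_to_ns(cycles):>14,.1f} ns   {slowdown:>12,.2f}x L1")
```

Under these assumptions a trip to RAM costs about 67 ns and a hard disk access costs milliseconds, which is why keeping hot data in cache matters so much.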

RAM (Random Access Memory)

Characteristics:

  • Volatile (loses data when power is off)
  • Fast access to any location (hence "random access")
  • Organized as an array of bytes
  • Each byte has a unique address

Types:

  • DRAM (Dynamic RAM) - Main memory, needs refresh
  • SRAM (Static RAM) - Used for cache, faster, more expensive

Memory Organization:

Address    Value
0x0000     0x4A
0x0001     0x7C
0x0002     0x91
...
0xFFFF     0x23

Each address points to one byte (8 bits).
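
A byte-addressable memory can be modeled as a flat array indexed by address. The sketch below uses Python's built-in bytearray with the example values from the table; the 64 KB size simply matches the 0x0000 to 0xFFFF address range shown, not any particular machine.

```python
# 65,536 one-byte cells, addressed 0x0000 through 0xFFFF, all initially zero
memory = bytearray(0x10000)

# Store the example values from the table above
memory[0x0000] = 0x4A
memory[0x0001] = 0x7C
memory[0x0002] = 0x91
memory[0xFFFF] = 0x23

for address in (0x0000, 0x0001, 0x0002, 0xFFFF):
    print(f"0x{address:04X}     0x{memory[address]:02X}")

# Each cell holds exactly one byte, so values outside 0..255 are rejected
try:
    memory[0x0003] = 256
except ValueError as error:
    print("cannot store 256 in one byte:", error)
```

Anything larger than 255 has to be spread across several consecutive addresses.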

Memory Access

Reading from Memory:

  1. CPU puts address on address bus
  2. Memory controller retrieves data
  3. Data travels back on data bus
  4. CPU stores in register

Writing to Memory:

  1. CPU puts address on address bus
  2. CPU puts data on data bus
  3. Memory controller stores data at address

Memory Size Units

1 Byte (B)      = 8 bits
1 Kilobyte (KB) = 1,024 bytes
1 Megabyte (MB) = 1,024 KB = 1,048,576 bytes
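1 Gigabyte (GB) = 1,024 MB = 1,073,741,824 bytes (~1 billion)
1 Terabyte (TB) = 1,024 GB = 1,099,511,627,776 bytes (~1 trillion)

Note: KB sometimes means 1,000 bytes (decimal) and sometimes 1,024 bytes (binary). The unambiguous binary units are KiB, MiB, GiB, and TiB. Storage vendors usually use the decimal meaning, while RAM sizes use the binary one.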
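
The gap between the decimal and binary meanings grows with each unit, which a short calculation makes visible. The constant and function names below are made up for this example.

```python
# Binary units (powers of 1,024)
KIB, MIB, GIB, TIB = 2**10, 2**20, 2**30, 2**40

def decimal_gb_to_gib(decimal_gb):
    """Convert gigabytes counted in powers of 1,000 to units of 2**30 bytes."""
    return decimal_gb * 10**9 / GIB

print(f"1 MiB = {MIB:,} bytes")
print(f"A drive sold as 1,000 GB holds {decimal_gb_to_gib(1000):.2f} GiB")
print(f"Binary TB is {TIB / 10**12 - 1:.1%} larger than decimal TB")
```

This is why a drive sold as 1 TB (decimal) shows up as roughly 931 GB in tools that count in powers of 1,024.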

Storage

Storage is non-volatile - it persists data even when powered off.

Hard Disk Drives (HDD)

How they work:

  • Spinning magnetic platters
  • Read/write heads move across surface
  • Mechanical movement = relatively slow
  • Typical speed: 7200 RPM or 5400 RPM

Characteristics:

  • High capacity (several TB)
  • Inexpensive per GB
  • Slow random access (~10 ms)
  • Can fail mechanically
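
Part of that ~10 ms comes straight from the rotation speed. On average the platter has to turn half a revolution before the requested data passes under the head. The sketch below computes only this rotational component; the time to move the head (seek time) comes on top and is not modeled. The function names are chosen for this example.

```python
def rotation_time_ms(rpm):
    """Time for one full platter rotation, in milliseconds."""
    return 60_000 / rpm

def average_rotational_latency_ms(rpm):
    """On average the data is half a rotation away from the head."""
    return rotation_time_ms(rpm) / 2

for rpm in (5400, 7200):
    print(f"{rpm} RPM: {rotation_time_ms(rpm):.2f} ms per rotation, "
          f"{average_rotational_latency_ms(rpm):.2f} ms average wait")
```

At 7200 RPM the average wait is about 4.2 ms, and at 5400 RPM about 5.6 ms, before any seek time is added.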

Solid State Drives (SSD)

How they work:

  • Flash memory chips (no moving parts)
  • Electronically store data
  • Much faster than HDD

Characteristics:

  • Fast (microseconds vs milliseconds)
  • More expensive per GB than HDD
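  • Limited number of write cycles per cell (high enough that typical use rarely reaches it)
  • No moving parts, so no mechanical failure (electronic failure is still possible)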

Storage Interface

Common interfaces:

  • SATA - Serial ATA, common for HDDs and SSDs
  • NVMe - Much faster, uses PCIe interface
  • USB - External storage

Buses

Buses are communication pathways that connect components.

Types of Buses

1. Data Bus

  • Carries actual data between CPU, memory, and devices
  • Width (8-bit, 16-bit, 32-bit, 64-bit) affects performance
  • Bidirectional

2. Address Bus

  • Carries memory addresses
  • Width determines maximum addressable memory
  • 32-bit = 4 GB max, 64-bit = 16 exabytes max
  • Unidirectional (CPU to memory)

3. Control Bus

  • Carries control signals
  • Read/Write signals
  • Clock signals
  • Interrupt signals

Example: 32-bit vs 64-bit Systems

32-bit:

  • 32-bit registers
  • 32-bit address bus → maximum 4 GB RAM (2³² bytes)
  • Each memory address is 32 bits

64-bit:

  • 64-bit registers
  • 64-bit address bus → theoretically 16 exabytes (2⁶⁴ bytes); real CPUs implement fewer address bits, so the practical limit is lower
  • Each memory address is 64 bits
  • Can handle larger numbers in a single operation
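
These limits follow directly from the address width: n address bits can name 2ⁿ distinct bytes. A minimal sketch, with function and constant names invented for this example:

```python
GIB = 2**30   # bytes in one binary gigabyte
EIB = 2**60   # bytes in one binary exabyte

def addressable_bytes(address_bits):
    """Number of distinct byte addresses available with the given address width."""
    return 2 ** address_bits

print(f"16-bit: {addressable_bytes(16):,} bytes")
print(f"32-bit: {addressable_bytes(32):,} bytes = {addressable_bytes(32) // GIB} GB")
print(f"64-bit: {addressable_bytes(64):,} bytes = {addressable_bytes(64) // EIB} exabytes")
```

The GB and exabyte figures here use the binary (1,024-based) units from the Memory Size Units section.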

Input/Output (I/O)

I/O devices let computers interact with the external world.

Common I/O Devices

Input:

  • Keyboard
  • Mouse
  • Camera
  • Microphone
  • Network interface

Output:

  • Display
  • Speakers
  • Printer
  • Network interface

I/O Methods

1. Port-Mapped I/O

  • Special I/O instructions (IN, OUT in x86)
  • Separate address space for devices

2. Memory-Mapped I/O

  • Devices mapped to memory addresses
  • Use regular memory read/write operations
  • Simpler programming model

3. Direct Memory Access (DMA)

  • Device can access memory directly without CPU
  • CPU sets up transfer, then device handles it
  • Much more efficient for large transfers
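
Memory-mapped I/O is the easiest of these to sketch in software. In the toy model below, ordinary addresses go to RAM, while one address is claimed by a device. The class name Bus, the address 0xFF00, and the "console" device are all hypothetical and exist only in this example; real hardware defines its own address map.

```python
class Bus:
    """Toy address decoder: most addresses reach RAM, one reaches a device."""

    CONSOLE_DATA = 0xFF00   # hypothetical address of a console output register

    def __init__(self, ram_size=0xFF00):
        self.ram = bytearray(ram_size)
        self.console_output = []          # stands in for the actual device

    def write(self, address, value):
        if address == self.CONSOLE_DATA:
            # Same write operation as for RAM, but the device receives the byte
            self.console_output.append(chr(value))
        else:
            self.ram[address] = value

    def read(self, address):
        if address == self.CONSOLE_DATA:
            return 0                      # this toy device has nothing to read
        return self.ram[address]

bus = Bus()
bus.write(0x0010, 0x41)                   # ordinary memory write
for byte in b"Hi":
    bus.write(Bus.CONSOLE_DATA, byte)     # looks identical, but drives the device

print(hex(bus.read(0x0010)), "".join(bus.console_output))
```

The program uses the same write call for both cases, which is the sense in which memory-mapped I/O gives a simpler programming model than separate I/O instructions.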

Putting It All Together

When you run a program:

  1. CPU fetches instruction from memory
  2. Instruction might load data from memory into registers
  3. ALU performs operations on register data
  4. Results stored back to memory
  5. If data isn't in cache, fetch from RAM
  6. If reading/writing files, access storage
  7. If user input needed, read from I/O devices
  8. If output needed, send to I/O devices

Everything cycles through this pattern, billions of times per second.

Key Concepts

  • CPU fetches, decodes, and executes instructions
  • Registers are the fastest storage, inside the CPU
  • Cache bridges the speed gap between CPU and RAM
  • RAM stores programs and data, volatile
  • Storage (HDD/SSD) persists data, much slower than RAM
  • Buses connect components and transfer data
  • I/O devices enable external interaction

Common Mistakes

  1. Confusing RAM and storage - RAM is volatile working memory, storage persists without power
  2. Ignoring caching - Cache behavior dramatically affects performance
  3. Assuming instant memory access - Memory has latency
  4. Overlooking word size - 32-bit vs 64-bit matters
  5. Forgetting I/O is slow - Disk and network I/O are bottlenecks

Debugging Tips

  • Check memory size - Know your RAM limits
  • Monitor CPU usage - Use tools like top, htop, Task Manager
  • Profile cache performance - Cache misses slow programs
  • Consider storage speed - SSD vs HDD makes huge difference
  • Watch for I/O bottlenecks - Often the slowest part
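
A few basic facts about the machine can also be read from a script, which is a handy first step before reaching for heavier tools. The sketch below uses only Python's standard library; the function name describe_machine is made up for this example. Note that pointer_bits reports the width of the running Python build, which is not necessarily the width of the hardware.

```python
import os
import platform
import struct

def describe_machine():
    """Collect a few basic facts about the machine running this script."""
    return {
        "logical_cpus": os.cpu_count(),             # may be None if undetermined
        "architecture": platform.machine(),         # for example "x86_64" or "arm64"
        "pointer_bits": struct.calcsize("P") * 8,   # pointer width of this Python build
    }

info = describe_machine()
for name, value in info.items():
    print(f"{name}: {value}")
```

The logical CPU count includes hardware threads, so on a CPU with multiple threads per core it is larger than the number of physical cores.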

Mini Exercises

  1. Find your CPU model and look up its specifications (cores, clock speed, cache sizes)
  2. Check how much RAM your computer has
  3. Determine if you have an SSD, HDD, or both
  4. Calculate: How many memory addresses can a 32-bit system have?
  5. Draw a diagram of the von Neumann architecture
  6. Research: What is the clock speed of your CPU in GHz?
  7. Find out: How many cores does your CPU have?
  8. Calculate: If a CPU runs at 3 GHz, how many cycles per second?
  9. Look up: What is the size of your CPU's L1, L2, and L3 caches?
  10. Investigate: What interface does your storage use (SATA, NVMe)?

Review Questions

  1. What are the main components of a CPU?
  2. Describe the fetch-decode-execute cycle.
  3. Why is there a memory hierarchy? Why not just use registers for everything?
  4. What's the difference between volatile and non-volatile memory?
  5. How does the width of the address bus affect maximum memory?

Reference Checklist

By the end of this chapter, you should be able to:

  • Explain the von Neumann architecture
  • Describe the components of a CPU
  • Understand the fetch-decode-execute cycle
  • Explain the memory hierarchy
  • Differentiate between RAM and storage
  • Understand the role of buses
  • Explain 32-bit vs 64-bit systems
  • Describe basic I/O mechanisms

Next Steps

Now that you understand computer architecture, the next chapter dives into how computers represent data. You'll learn about binary, hexadecimal, character encoding, and how everything in a computer is ultimately just numbers.


Key Takeaway: Computers are organized into a hierarchy: CPU (with registers and cache), RAM, storage, and I/O, all connected by buses. Understanding this structure is fundamental to writing efficient system software.