Chapter 7: File API & Binary Data

Modern JavaScript applications handle complex binary data—images, videos, encrypted streams, and custom file formats—directly in the browser. This chapter provides a deep technical reference for the File API, Blobs, and low-level memory structures like ArrayBuffers and TypedArrays.


I. Core Binary Objects: Blob and File

The foundation of binary data in the browser rests on two primary interfaces: Blob and File.

1. The Blob (Binary Large Object)

A Blob is an immutable object representing raw data. It does not necessarily have a name or a file system location.

Technical Reference: new Blob(blobParts [, options])

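  • blobParts: An array (or other iterable) of Blob, ArrayBuffer, TypedArray, DataView, or string parts. The parts are concatenated in order; strings are encoded as UTF-8.
  • options: An optional object with:
    • type: MIME type (e.g., 'image/png'). Defaults to an empty string.
    • endings: How to handle line endings in string parts: 'transparent' (the default, leaves them unchanged) or 'native' (converts them to the platform convention).
  • Instance methods:
    • slice(start, end, contentType): Returns a new Blob containing the bytes from start up to, but not including, end.
    • arrayBuffer(): Returns a Promise resolving to an ArrayBuffer.
    • text(): Returns a Promise resolving to a string (decoded as UTF-8).
    • stream(): Returns a ReadableStream of the Blob's bytes.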
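
As an illustration of the constructor and instance methods above, the following minimal sketch builds a Blob from a string and a typed array, then reads it back in several ways. The helper name inspectBlob is hypothetical and exists only for this example.

```javascript
// Build a Blob from mixed parts: a string and a typed array ("world" in ASCII).
const parts = ['Hello, ', new Uint8Array([119, 111, 114, 108, 100])];
const blob = new Blob(parts, { type: 'text/plain' });

// Hypothetical helper: reads a Blob several ways and reports what it found.
async function inspectBlob(b) {
  const head = b.slice(0, 5, 'text/plain'); // bytes 0 through 4
  const headBytes = new Uint8Array(await head.arrayBuffer());
  return {
    size: b.size,                // total bytes across all parts
    type: b.type,                // MIME type passed to the constructor
    text: await b.text(),        // whole Blob decoded as UTF-8
    headText: await head.text(), // only the sliced range
    firstByte: headBytes[0],     // raw byte value of "H"
  };
}

inspectBlob(blob).then((info) => console.log(info));
```

Note that slice() returns another Blob, so the sliced range has to be read asynchronously just like the original.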

2. The File Object

A File inherits from Blob and adds specific properties related to the file system.

| Property | Type | Description |
| --- | --- | --- |
| name | string | The filename (e.g., "report.pdf"). |
| lastModified | number | Timestamp of the last modification. |
| size | number | The size in bytes (inherited from Blob). |
| type | string | The MIME type (inherited from Blob). |

Relationship: File extends Blob. A Blob holds the raw data; a File is that same raw data plus metadata.
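
To make the properties above concrete, here is a small sketch that constructs a File directly and summarizes its metadata. The helper names describeFile and makeSampleFile, the filename, and the timestamp are all hypothetical choices for this example; in a page, File objects more commonly come from a file input or a drag-and-drop event.

```javascript
// Hypothetical helper: summarizes the metadata a File adds on top of Blob.
function describeFile(file) {
  const modified = new Date(file.lastModified).toISOString();
  const type = file.type || 'unknown type';
  return `${file.name} (${type}, ${file.size} bytes, modified ${modified})`;
}

// Hypothetical helper: builds a File by hand, which is convenient in tests.
// The second argument is the name; type and lastModified are optional.
function makeSampleFile() {
  return new File(['{"ok":true}'], 'report.json', {
    type: 'application/json',
    lastModified: Date.UTC(2024, 0, 1), // fixed timestamp so output is stable
  });
}

// Usage (in an environment that provides the File constructor):
//   const file = makeSampleFile();
//   console.log(file instanceof Blob); // File inherits from Blob
//   console.log(describeFile(file));
```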


II. Reading & Consuming Binary Data

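To access the contents of a Blob or File, either read it asynchronously (with FileReader or the promise-based Blob methods text(), arrayBuffer(), and stream()) or hand it to the browser through an object URL.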

1. The FileReader API

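The FileReader object reads Blob or File contents into memory in one of several formats. It is event-based rather than promise-based: the read methods return nothing, and the data is available in reader.result once the load event fires.

| Method | Syntax | reader.result | Use Case |
| --- | --- | --- | --- |
| readAsText() | fr.readAsText(blob) | string | Reading .json, .txt, .csv files. |
| readAsDataURL() | fr.readAsDataURL(blob) | string (Base64 data: URL) | Small previews (data:image/...). |
| readAsArrayBuffer() | fr.readAsArrayBuffer(blob) | ArrayBuffer | Low-level binary manipulation. |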

Implementation: Event-Based Reading

const readFile = (file) => {
  const reader = new FileReader();

  // reader.result holds the decoded text once the load event fires.
  reader.onload = () => console.log('Content:', reader.result);
  reader.onerror = () => console.error('Read failed:', reader.error);
  reader.onprogress = (e) => {
    if (e.lengthComputable) {
      const pct = (e.loaded / e.total) * 100;
      console.log(`Loading: ${pct.toFixed(1)}%`);
    }
  };

  reader.readAsText(file);
};
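
Because FileReader reports through events, code written with async/await often wraps it in a Promise. The following is a minimal sketch of that pattern; readAsTextAsync is a hypothetical helper name, not part of the File API. For plain UTF-8 text, the built-in blob.text() achieves the same result without a wrapper.

```javascript
// Hypothetical helper: wraps the event-based FileReader in a Promise.
function readAsTextAsync(blob, encoding = 'utf-8') {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result);   // decoded string
    reader.onerror = () => reject(reader.error);    // DOMException from the reader
    reader.onabort = () => reject(new Error('Read aborted'));
    reader.readAsText(blob, encoding);
  });
}

// Browser usage sketch (assumes `input` is an <input type="file"> element):
//   const text = await readAsTextAsync(input.files[0]);
```

The wrapper is mainly useful when a non-UTF-8 encoding is needed, since readAsText accepts an encoding label as its second argument.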

2. URL.createObjectURL (Memory Reference)

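For media previews, URL.createObjectURL is usually preferable to readAsDataURL: it returns a short blob: URL that references the Blob's data in place, avoiding both the full read into a string and the roughly 33% size overhead of Base64 encoding.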

[!IMPORTANT] Memory Leak Warning: You must call URL.revokeObjectURL(url) when the reference is no longer needed (e.g., after the image loads or the component unmounts).
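
One way to make revocation hard to forget is to scope the URL to a callback. The sketch below assumes a hypothetical helper named withObjectURL; it is an illustration of the try/finally pattern, not a standard API.

```javascript
// Hypothetical helper: creates an object URL, hands it to a callback, and
// always revokes it afterwards, even if the callback throws or rejects.
async function withObjectURL(blob, useUrl) {
  const url = URL.createObjectURL(blob);
  try {
    return await useUrl(url);
  } finally {
    URL.revokeObjectURL(url);
  }
}

// Browser usage sketch (assumes `img` is an <img> element and `file` is a File):
//   await withObjectURL(file, (url) => new Promise((resolve, reject) => {
//     img.onload = resolve;
//     img.onerror = reject;
//     img.src = url;
//   }));
```

In the usage sketch the callback resolves only after the image has loaded, so the URL is not revoked while the browser still needs it.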


III. Low-Level Memory: ArrayBuffer & TypedArrays

An ArrayBuffer is a fixed-length, contiguous block of raw bytes. It has no element type of its own and cannot be read or written directly. To manipulate this memory with specific type semantics (e.g., treating bytes as 32-bit floats), you must overlay it with a TypedArray or a DataView.

1. The ArrayBuffer (The Memory Container)

An ArrayBuffer allocates a specific number of bytes, initialized to zero. By default it cannot be resized once created; newer engines also support resizable buffers, created by passing a maxByteLength option to the constructor.

  • Syntax: new ArrayBuffer(byteLength)
  • Transferability: When posting to a Web Worker, an ArrayBuffer can be transferred (not cloned) for zero-copy performance. Once transferred, the original buffer is "detached": its byteLength becomes 0 and it is no longer usable in the sending context.
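
Detachment can be observed without a worker. The sketch below uses structuredClone with a transfer list, which moves the buffer through the same transfer mechanism that worker.postMessage(buffer, [buffer]) uses; it assumes an environment where structuredClone is available.

```javascript
const buffer = new ArrayBuffer(16);
new Uint8Array(buffer)[0] = 42;

// Listing the buffer under `transfer` moves it instead of copying it.
const moved = structuredClone(buffer, { transfer: [buffer] });

console.log(buffer.byteLength);        // 0  (the original is now detached)
console.log(moved.byteLength);         // 16
console.log(new Uint8Array(moved)[0]); // 42 (contents travelled with it)
```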

2. The TypedArray Ecosystem

TypedArrays provide a window into the buffer where each element has a fixed bit-width and range.

| TypedArray | Byte Size | Range | Use Case |
| --- | --- | --- | --- |
| Int8Array | 1 | -128 to 127 | Signed 8-bit integers. |
| Uint8Array | 1 | 0 to 255 | Raw byte streams, file headers. |
| Uint8ClampedArray | 1 | 0 to 255 | Specialized for Canvas: clamps out-of-range values (e.g., 300 becomes 255). |
| Int16Array | 2 | -32768 to 32767 | 16-bit audio data. |
| Uint32Array | 4 | 0 to 4294967295 | Large counters, memory pointers. |
| Float32Array | 4 | ±1.2e-38 to ±3.4e38 (normal magnitudes) | WebGL: vertex and color data. |
| BigUint64Array | 8 | 0 to 2^64 - 1 | 64-bit unsigned integers; elements are BigInts (n suffix, e.g., 100n). |
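
The ranges in the table matter because out-of-range writes are converted silently rather than rejected. A short sketch of the wrapping, clamping, and BigInt behavior:

```javascript
// Out-of-range writes are converted, not rejected.
const u8 = Uint8Array.of(300, -1);             // wraps modulo 256
const i8 = Int8Array.of(200);                  // wraps into the signed range
const clamped = Uint8ClampedArray.of(300, -5); // clamps to the nearest bound
const big = BigUint64Array.of(2n ** 64n - 1n); // elements are BigInts

console.log(Array.from(u8));      // [44, 255]
console.log(Array.from(i8));      // [-56]
console.log(Array.from(clamped)); // [255, 0]
console.log(big[0]);              // 18446744073709551615n

// Mixing a plain Number into a BigInt array is an error, not a conversion.
let mixError = null;
try {
  big[0] = 1;
} catch (err) {
  mixError = err; // TypeError
}
```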

3. Memory Alignment and Multiple Views

One of the most powerful features of binary data is the ability to map multiple views to the same underlying buffer.

Shared ArrayBuffer (8 bytes): a Uint32Array sees two elements, [0] and [1], while a Uint8Array overlay sees the individual bytes 0-7. Changes to Uint8Array[0-3] immediately update Uint32Array[0].

const buffer = new ArrayBuffer(8);

const u32View = new Uint32Array(buffer);
const u8View = new Uint8Array(buffer);

u8View[0] = 0xFF;
u8View[1] = 0x00;
u8View[2] = 0x00;
u8View[3] = 0x00;

// On a Little-endian system, u32View[0] is now 255
console.log(u32View[0]); 
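
The alignment half of this topic concerns where a multi-byte view may begin: a TypedArray's byte offset into the buffer must be a multiple of its element size. The sketch below shows the failure case and the DataView alternative, which has no such restriction. The variable names are chosen for this example only.

```javascript
const buf = new ArrayBuffer(8);

// A Uint32Array view must start at a byte offset divisible by 4.
let alignmentError = null;
try {
  new Uint32Array(buf, 1, 1);
} catch (err) {
  alignmentError = err; // RangeError
}

// Offset 4 is aligned, so this one-element view covers bytes 4 through 7.
const aligned = new Uint32Array(buf, 4, 1);
aligned[0] = 0xdeadbeef;

// Detect the platform byte order that TypedArrays use.
const littleEndian = new Uint8Array(new Uint32Array([1]).buffer)[0] === 1;

// DataView has no alignment requirement and takes the byte order explicitly.
const dv = new DataView(buf);
const sameValue = dv.getUint32(4, littleEndian); // reads back 0xdeadbeef
const unaligned = dv.getUint32(1, littleEndian); // allowed: spans bytes 1 through 4
```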

4. subarray() vs slice(): Reference vs Copy

Understanding the difference between these two methods is critical for memory management and performance.

| Method | Behavior | Technical Result |
| --- | --- | --- |
| subarray(start, end) | View Creation | Returns a new TypedArray pointing to the same memory. Fast. |
| slice(start, end) | Data Cloning | Allocates new memory and copies the data. Slower. |

In short: subarray() yields another view on Buffer A (same buffer), whereas slice() yields Buffer B (new memory).
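
The difference is easiest to see by writing to the source array after taking both a view and a copy:

```javascript
const source = Uint8Array.of(10, 20, 30, 40);

const view = source.subarray(1, 3); // shares source.buffer
const copy = source.slice(1, 3);    // owns a new buffer

source[1] = 99;

console.log(Array.from(view)); // [99, 30]  (sees the write)
console.log(Array.from(copy)); // [20, 30]  (unaffected)
console.log(view.buffer === source.buffer); // true
console.log(copy.buffer === source.buffer); // false
console.log(view.byteOffset);               // 1
```

A practical consequence: a small subarray() keeps the entire underlying buffer alive, so slice() can be the better choice when only a small piece of a large buffer needs to be retained.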


5. DataView (Flexibility & Endianness)

DataView provides a low-level interface for reading and writing multiple types to a single buffer, with explicit control over Endianness.

const buffer = new ArrayBuffer(8);
const view = new DataView(buffer);

// Set Uint32 at byte 0, Big-endian (Network Byte Order): bytes 12 34 56 78
view.setUint32(0, 0x12345678, false);

// Read the same bytes back as Little-endian: the byte order is reversed
const leValue = view.getUint32(0, true); // 0x78563412
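
DataView is most useful when a single buffer mixes field sizes and byte orders, as binary file headers often do. The sketch below encodes and decodes an 8-byte header whose layout is invented purely for this example; writeHeader and readHeader are hypothetical helper names.

```javascript
// Hypothetical 8-byte header layout, invented for this example:
//   bytes 0-3: magic number, Uint32, big-endian
//   bytes 4-5: version, Uint16, little-endian
//   bytes 6-7: payload length, Uint16, little-endian
const HEADER_SIZE = 8;

function writeHeader({ magic, version, length }) {
  const view = new DataView(new ArrayBuffer(HEADER_SIZE));
  view.setUint32(0, magic, false);  // big-endian
  view.setUint16(4, version, true); // little-endian
  view.setUint16(6, length, true);  // little-endian
  return view.buffer;
}

function readHeader(buffer) {
  const view = new DataView(buffer);
  return {
    magic: view.getUint32(0, false),
    version: view.getUint16(4, true),
    length: view.getUint16(6, true),
  };
}

const encoded = writeHeader({ magic: 0xcafebabe, version: 2, length: 512 });
console.log(Array.from(new Uint8Array(encoded)));
// [202, 254, 186, 190, 2, 0, 0, 2]
console.log(readHeader(encoded));
```

Because every accessor states its byte order explicitly, the result is the same on any platform, which is not guaranteed when the same bytes are read through a multi-byte TypedArray.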

IV. Advanced Processing: Chunking & Slicing

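For multi-gigabyte files, reading the entire content into memory can exhaust available memory and crash the tab. Use Blob.slice() to process files in segments; slicing itself does not read any data, so only the chunk currently being read is held in memory.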

Implementation: Sequential Chunk Processor

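// processChunk and updateProgress are application-supplied callbacks.
const processInChunks = async (file) => {
  const CHUNK_SIZE = 1024 * 1024; // 1 MiB
  let offset = 0;

  while (offset < file.size) {
    // slice() clamps the end to file.size, so the last chunk may be shorter
    const chunk = file.slice(offset, offset + CHUNK_SIZE);
    const buffer = await chunk.arrayBuffer();

    // Process binary data (e.g., hashing or encryption); await in case it is async
    await processChunk(buffer);

    // Advance by the bytes actually read so progress never exceeds 1
    offset += buffer.byteLength;
    updateProgress(offset / file.size);
  }
};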
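
An alternative to manual slicing is the stream() method listed earlier, which lets the browser choose the chunk sizes. The following sketch is not the chunk processor above but a stream-based variant of the same idea; byteSum is a hypothetical helper that computes a simple 32-bit byte sum as a stand-in for real per-chunk work.

```javascript
// Hypothetical helper: sums every byte of a Blob without loading it whole.
async function byteSum(blob) {
  const reader = blob.stream().getReader();
  let sum = 0;
  let bytesRead = 0;

  while (true) {
    const { done, value } = await reader.read(); // value is a Uint8Array chunk
    if (done) break;
    for (const byte of value) {
      sum = (sum + byte) >>> 0; // keep the running total within 32 bits
    }
    bytesRead += value.byteLength;
  }
  return { sum, bytesRead };
}

// Usage: byteSum(file).then(({ sum, bytesRead }) => console.log(sum, bytesRead));
```

Manual slicing remains the better fit when chunk boundaries matter, for example when each chunk is uploaded separately or must align with a block size.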

V. Core Engineering Standards

1. Performance Mandates

  • Zero-Copy Transfers: Use Transferable objects when passing ArrayBuffers to Web Workers. This transfers ownership instead of cloning data.
  • Off-Main-Thread: Perform heavy binary processing (image manipulation, decompression) in a Web Worker to avoid blocking the UI thread.

2. Memory & Security Mandates

  • Revocation: Always call URL.revokeObjectURL() in a finally block or when a component unmounts.
  • XSS Prevention: Never insert binary data converted to strings into the DOM via innerHTML.
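  • Validation: Always validate file.type and file.size before processing to prevent DoS (Denial of Service) via massive files. Treat file.type as a hint only: browsers typically derive it from the file extension rather than from the content.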
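
As a sketch of the validation mandate, the function below checks size and declared type first, then confirms the content by comparing the first eight bytes against the PNG file signature. The helper name validatePngUpload, the 5 MiB default limit, and the PNG-only policy are assumptions made for this example, not requirements of the File API.

```javascript
// The fixed 8-byte signature that begins every PNG file.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Hypothetical helper; the size limit and PNG-only policy are example choices.
async function validatePngUpload(file, maxBytes = 5 * 1024 * 1024) {
  // Cheap metadata checks first: no bytes are read yet.
  if (file.size === 0) return { ok: false, reason: 'empty file' };
  if (file.size > maxBytes) return { ok: false, reason: 'file too large' };
  if (file.type !== 'image/png') return { ok: false, reason: 'unexpected MIME type' };

  // Read only the first 8 bytes and compare them with the signature.
  const headBlob = file.slice(0, PNG_SIGNATURE.length);
  const head = new Uint8Array(await headBlob.arrayBuffer());
  const matches =
    head.length === PNG_SIGNATURE.length &&
    PNG_SIGNATURE.every((byte, i) => head[i] === byte);

  return matches ? { ok: true } : { ok: false, reason: 'content is not a PNG' };
}
```

Client-side checks like this improve feedback and avoid wasted work, but a server that receives the upload still needs to validate it independently.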
