Chapter 7: File API & Binary Data
Modern JavaScript applications handle complex binary data—images, videos, encrypted streams, and custom file formats—directly in the browser. This chapter provides a deep technical reference for the File API, Blobs, and low-level memory structures like ArrayBuffers and TypedArrays.
I. Core Binary Objects: Blob and File
The foundation of binary data in the browser rests on two primary interfaces: Blob and File.
1. The Blob (Binary Large Object)
A Blob is an immutable object representing raw data. It does not necessarily have a name or a file system location.
Technical Reference: new Blob(blobParts [, options])
- `blobParts`: An `Array` of `Blob`, `ArrayBuffer`, `TypedArray`, `DataView`, or `USVString` objects.
- `options`: An object with:
  - `type`: MIME type (e.g., `'image/png'`).
  - `endings`: How to handle line endings (`'transparent'` or `'native'`).
- Methods:
  - `slice(start, end, contentType)`: Returns a new `Blob` containing data from a sub-range.
  - `arrayBuffer()`: Returns a `Promise` resolving to an `ArrayBuffer`.
  - `text()`: Returns a `Promise` resolving to a string.
  - `stream()`: Returns a `ReadableStream`.
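As a minimal sketch of the constructor and methods listed above (the variable names and sample contents are illustrative only), a Blob can be built from mixed parts and then read back:

```javascript
// Build a Blob from mixed parts: a string and a typed array.
const bytes = new Uint8Array([0x21, 0x0a]); // "!" and "\n"
const blob = new Blob(['Hello, Blob', bytes], { type: 'text/plain' });

// size and type are available synchronously.
const info = { size: blob.size, type: blob.type };

// slice() returns a new Blob over a byte range; the original is unchanged.
const firstFive = blob.slice(0, 5, 'text/plain');

// text() and arrayBuffer() are asynchronous and resolve with the contents.
async function inspectBlob() {
  const text = await firstFive.text();
  const buffer = await blob.arrayBuffer();
  return { text, byteLength: buffer.byteLength };
}
```

Calling `inspectBlob()` should resolve with the first five characters and the total byte length of the original Blob.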
2. The File Object
A File inherits from Blob and adds specific properties related to the file system.
| Property | Type | Description |
|---|---|---|
| name | string | The filename (e.g., "report.pdf"). |
| lastModified | number | Timestamp of the last modification, in milliseconds since the Unix epoch. |
| size | number | The size in bytes (inherited from Blob). |
| type | string | The MIME type (inherited from Blob). |
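The properties in the table can be read directly from any File. The helper below is a hypothetical sketch (the name `summarizeFile` and the fallback MIME type are assumptions for illustration) that turns them into a display-friendly summary:

```javascript
// Summarize the metadata of a File (or any object with the same properties).
function summarizeFile(file) {
  return {
    name: file.name,
    // type can be an empty string when the browser cannot determine it
    type: file.type || 'application/octet-stream',
    sizeKiB: Math.round((file.size / 1024) * 10) / 10,
    // lastModified is in milliseconds since the Unix epoch
    modified: new Date(file.lastModified).toISOString(),
  };
}
```

In a browser, the argument would typically come from the `files` list of an `<input type="file">` element.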
II. Reading & Consuming Binary Data
To access the contents of a Blob or File, you must use an asynchronous reader or a reference URL.
1. The FileReader API
The FileReader object reads Blob or File contents into memory using various formats.
| Method | Syntax | Result (reader.result) | Use Case |
|---|---|---|---|
| readAsText() | fr.readAsText(blob) | string | Reading .json, .txt, .csv files. |
| readAsDataURL() | fr.readAsDataURL(blob) | Base64-encoded data: URL string | Small previews (data:image/...). |
| readAsArrayBuffer() | fr.readAsArrayBuffer(blob) | ArrayBuffer | Low-level binary manipulation. |
Implementation: Event-Based Reading
const readFile = (file) => {
  const reader = new FileReader();
  // e.target is the reader; result holds the decoded text once loading finishes
  reader.onload = (e) => console.log('Content:', e.target.result);
  reader.onerror = () => console.error('Read failed:', reader.error);
  reader.onprogress = (e) => {
    if (e.lengthComputable) {
      const pct = (e.loaded / e.total) * 100;
      console.log(`Loading: ${pct.toFixed(1)}%`);
    }
  };
  reader.readAsText(file);
};
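Because FileReader reports results through events, it is often wrapped in a Promise so it can be used with `await`. The following is one possible sketch; the function name `readWithFileReader` is hypothetical, and for plain text or ArrayBuffer reads the `text()` and `arrayBuffer()` methods on Blob may be simpler.

```javascript
// Wrap FileReader in a Promise. `method` is one of the FileReader read
// methods: 'readAsText', 'readAsDataURL', or 'readAsArrayBuffer'.
function readWithFileReader(blob, method = 'readAsText') {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result);
    reader.onerror = () => reject(reader.error);
    reader.onabort = () => reject(new Error('Read aborted'));
    reader[method](blob);
  });
}
```

A caller could then write `const dataUrl = await readWithFileReader(file, 'readAsDataURL');` inside an async function.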
2. URL.createObjectURL (Memory Reference)
For media previews, createObjectURL is preferable to readAsDataURL: it returns a short URL that references the Blob's data in place, avoiding the roughly 33% size overhead of Base64 encoding.
[!IMPORTANT] Memory Leak Warning: You must call `URL.revokeObjectURL(url)` when the reference is no longer needed (e.g., after the image loads or the component unmounts).
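One way to make revocation hard to forget is to scope the URL to a callback. This is a sketch under that assumption; `withObjectURL` and its `use` parameter are hypothetical names, not part of the URL API.

```javascript
// Create an object URL, hand it to a callback, and always revoke it afterwards.
async function withObjectURL(blob, use) {
  const url = URL.createObjectURL(blob);
  try {
    return await use(url);
  } finally {
    URL.revokeObjectURL(url); // release the reference even if `use` throws
  }
}
```

In a browser, `use` might assign the URL to an `<img>` element's `src` and resolve once the image's `load` event fires, so the URL is only revoked after the image has been decoded.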
III. Low-Level Memory: ArrayBuffer & TypedArrays
An ArrayBuffer represents a fixed-length, contiguous block of raw bytes. It has no element type of its own and cannot be read or written directly. To manipulate this memory with specific type semantics (e.g., treating bytes as 32-bit floats), you must overlay it with a TypedArray or a DataView.
1. The ArrayBuffer (The Memory Container)
An ArrayBuffer allocates a specific number of bytes in memory, initialized to zero. By default it cannot be resized once created (recent engines also support opt-in resizable buffers via the `maxByteLength` option).
- Syntax: `new ArrayBuffer(byteLength)`
- Transferability: In multi-threaded environments (Web Workers), an `ArrayBuffer` can be transferred (not cloned) for zero-copy performance. Once transferred, the original buffer is "detached": its `byteLength` becomes 0 and it is no longer usable in the sending context.
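Detachment can be observed without a worker. The sketch below uses `structuredClone()` with a transfer list, which relies on the same transfer mechanism as `worker.postMessage(buffer, [buffer])`; the variable names are illustrative.

```javascript
const buffer = new ArrayBuffer(16);
new Uint8Array(buffer)[0] = 42;

// Transfer ownership instead of copying the bytes.
const moved = structuredClone(buffer, { transfer: [buffer] });

const originalLength = buffer.byteLength;   // 0: the original is detached
const movedLength = moved.byteLength;       // 16: the data now lives in `moved`
const firstByte = new Uint8Array(moved)[0]; // 42
```

With a real worker, the receiving side gets the buffer in `event.data` and the sender is left with the detached, zero-length original.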
2. The TypedArray Ecosystem
TypedArrays provide a window into the buffer where each element has a fixed bit-width and range.
| TypedArray | Byte Size | Range | Use Case |
|---|---|---|---|
| Int8Array | 1 | -128 to 127 | Signed 8-bit integers. |
| Uint8Array | 1 | 0 to 255 | Raw byte streams, file headers. |
| Uint8ClampedArray | 1 | 0 to 255 | Specialized for Canvas: Clamps values (e.g., 300 becomes 255). |
| Int16Array | 2 | -32768 to 32767 | 16-bit audio data. |
| Uint32Array | 4 | 0 to 4294967295 | Large counters, memory pointers. |
| Float32Array | 4 | ±3.4e38 (smallest positive normal value about 1.2e-38) | WebGL: Vertex and color data. |
| BigUint64Array | 8 | 0 to 2^64 - 1 | 64-bit unsigned integers; elements are BigInt values (written with the n suffix, e.g., 100n). |
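The ranges in the table determine what happens when a value does not fit. A short sketch of the conversion behavior, with arbitrary sample values:

```javascript
// Out-of-range integers wrap around (modulo 2^bits) in ordinary typed arrays.
const wrapped = new Uint8Array([300, -1]);        // [44, 255]
const signed = new Int8Array([200]);              // [-56]

// Uint8ClampedArray clamps to the 0..255 range instead of wrapping.
const clamped = new Uint8ClampedArray([300, -1]); // [255, 0]

// Float32Array stores the nearest 32-bit float, so precision can be lost.
const floats = new Float32Array([0.1]);
const isExact = floats[0] === 0.1;                // false

// 64-bit integer arrays hold BigInt values, not Numbers.
const big = new BigUint64Array([2n ** 64n - 1n]); // [18446744073709551615n]
```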
3. Memory Alignment and Multiple Views
One of the most powerful features of binary data is the ability to map multiple views to the same underlying buffer.
const buffer = new ArrayBuffer(8);
const u32View = new Uint32Array(buffer);
const u8View = new Uint8Array(buffer);
u8View[0] = 0xFF;
u8View[1] = 0x00;
u8View[2] = 0x00;
u8View[3] = 0x00;
// On a Little-endian system, u32View[0] is now 255
console.log(u32View[0]);
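Two practical details follow from sharing a buffer between views: a typed-array view must start at an offset that is a multiple of its element size, and multi-byte values depend on the host's byte order. The sketch below illustrates both; the variable names are illustrative.

```javascript
const buffer = new ArrayBuffer(8);

// A typed-array view must start at a multiple of its element size.
let misalignedError = null;
try {
  new Uint32Array(buffer, 2); // 2 is not a multiple of 4
} catch (err) {
  misalignedError = err; // RangeError
}

// An aligned offset works: this view covers bytes 4..7 only.
const upperHalf = new Uint32Array(buffer, 4, 1);

// Detect host byte order by writing a 16-bit value and inspecting byte 0.
const probe = new Uint16Array([0x00ff]);
const isLittleEndian = new Uint8Array(probe.buffer)[0] === 0xff;
```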
4. subarray() vs slice(): Reference vs Copy
Understanding the difference between these two methods is critical for memory management and performance.
| Method | Behavior | Technical Result |
|---|---|---|
| subarray(start, end) | View Creation | Returns a new TypedArray pointing to the same memory. Fast. |
| slice(start, end) | Data Cloning | Allocates new memory and copies the data. Slower. |
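The difference is easiest to see by mutating the source after taking both a view and a copy. A small sketch with arbitrary values:

```javascript
const source = new Uint8Array([10, 20, 30, 40]);

const view = source.subarray(1, 3); // shares memory with `source`
const copy = source.slice(1, 3);    // independent copy

source[1] = 99;

// view[0] is now 99 (same memory); copy[0] is still 20 (separate memory).
const sharesBuffer = view.buffer === source.buffer;   // true
const copyIsSeparate = copy.buffer !== source.buffer; // true
```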
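5. DataView (Flexibility & Endianness)
DataView provides a low-level interface for reading and writing values of different types at arbitrary byte offsets in a single buffer, with explicit control over endianness. Unlike TypedArrays, it has no alignment requirement, and its accessors default to big-endian.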
const buffer = new ArrayBuffer(8);
const view = new DataView(buffer);
// Set Uint32 at byte 0, Big-endian (Network Byte Order)
view.setUint32(0, 0x12345678, false);
// Read the same bytes back as Little-endian: the byte order is reversed
const leValue = view.getUint32(0, true); // 0x78563412
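A common use of DataView is parsing fixed-layout file headers. As an illustrative sketch, the function below (the name `readPngSize` is hypothetical) reads the image dimensions from the start of a PNG file, whose header fields are stored big-endian:

```javascript
// Read width and height from the start of a PNG file.
// Layout: 8-byte signature, then the IHDR chunk: 4-byte length, 4-byte type,
// followed by big-endian 32-bit width and height at byte offsets 16 and 20.
function readPngSize(buffer) {
  const signature = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
  const bytes = new Uint8Array(buffer);
  if (bytes.length < 24 || !signature.every((b, i) => bytes[i] === b)) {
    return null; // not a PNG, or too short to contain IHDR
  }
  const view = new DataView(buffer);
  return {
    width: view.getUint32(16, false), // false = big-endian
    height: view.getUint32(20, false),
  };
}
```

Only the first 24 bytes are needed, so a caller could pass `await file.slice(0, 24).arrayBuffer()` rather than reading the whole file.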
IV. Advanced Processing: Chunking & Slicing
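For multi-gigabyte files, reading the entire content into memory can exhaust available memory and crash the tab. Use Blob.slice() to process files in segments; slicing is cheap because it creates a reference to a byte range without reading any data.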
Implementation: Sequential Chunk Processor
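// processChunk and updateProgress are application-defined callbacks
const processInChunks = async (file) => {
  const CHUNK_SIZE = 1024 * 1024; // 1 MiB
  let offset = 0;
  while (offset < file.size) {
    // slice() clamps the end index, so the last chunk may be shorter
    const chunk = file.slice(offset, offset + CHUNK_SIZE);
    const buffer = await chunk.arrayBuffer();
    // Process binary data (e.g., hashing or encryption)
    processChunk(buffer);
    // Advance by the bytes actually read so progress never exceeds 1
    offset += buffer.byteLength;
    updateProgress(offset / file.size);
  }
};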
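The same slicing loop can also be expressed as an async generator, which separates reading from processing. This is a sketch, not a definitive implementation: `readChunks` and `checksum` are hypothetical names, and the additive checksum only stands in for real work such as hashing.

```javascript
// Yield a Blob or File as a sequence of Uint8Array chunks.
async function* readChunks(blob, chunkSize = 1024 * 1024) {
  for (let offset = 0; offset < blob.size; offset += chunkSize) {
    const chunk = blob.slice(offset, offset + chunkSize);
    yield new Uint8Array(await chunk.arrayBuffer());
  }
}

// Example consumer: a simple additive checksum (illustration only, not a hash).
async function checksum(blob, chunkSize) {
  let sum = 0;
  for await (const bytes of readChunks(blob, chunkSize)) {
    for (const byte of bytes) sum = (sum + byte) >>> 0;
  }
  return sum;
}
```

Only one chunk is held in memory at a time, so peak memory use is bounded by the chunk size rather than the file size.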
V. Core Engineering Standards
1. Performance Mandates
- Zero-Copy Transfers: Use `Transferable` objects when passing `ArrayBuffers` to Web Workers. This transfers ownership instead of cloning data.
- Off-Main-Thread: Perform heavy binary processing (image manipulation, decompression) in a Web Worker to avoid blocking the UI thread.
2. Memory & Security Mandates
- Revocation: Always call `URL.revokeObjectURL()` in a `finally` block or when a component unmounts.
- XSS Prevention: Never insert binary data converted to strings into the DOM via `innerHTML`.
- Validation: Always validate `file.type` and `file.size` before processing to prevent DoS (Denial of Service) via massive files. Treat `file.type` as a hint only: browsers infer it from the file extension rather than from the file's contents.
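A validation step along these lines might look like the sketch below. The function name, the rule object, and the specific limits are all assumptions chosen for illustration, not recommended values.

```javascript
// Reject files that are too large or of an unexpected type before reading them.
// The limits below are placeholder values, not recommendations.
const DEFAULT_RULES = {
  maxBytes: 10 * 1024 * 1024,
  allowedTypes: ['image/png', 'image/jpeg'],
};

function validateFile(file, rules = DEFAULT_RULES) {
  if (file.size === 0) return { ok: false, reason: 'empty file' };
  if (file.size > rules.maxBytes) return { ok: false, reason: 'file too large' };
  if (!rules.allowedTypes.includes(file.type)) {
    return { ok: false, reason: 'unsupported type' };
  }
  return { ok: true };
}
```

Checking size first keeps the cost constant regardless of file size, since no file contents are read at this stage.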