Chapter 3: Advanced CRUD (Update & Delete)

Updating and deleting documents in MongoDB is most efficient with update operators. Unlike a full document replacement, update operators let you change specific fields without resending the whole document, which reduces BSON serialization overhead, network payload, and the size of the resulting oplog entry. This "Surgical Update" pattern is critical for maintaining high throughput in write-intensive applications.

I. Atomic Update Internals

The updateOne() and updateMany() methods take Update Operators (e.g., $set, $inc, $push) that describe the change to apply. When an update is issued, the server applies it to the document in the WiredTiger cache. WiredTiger uses MVCC (Multi-Version Concurrency Control), so the update is stored as a new version of the document rather than overwriting the old one; readers that started earlier continue to see the previous version. The change is recorded in the Journal (WiredTiger's Write-Ahead Log) for durability, and modified pages are written to the data files later, at checkpoint time. Because data files are only rewritten at checkpoints, a small operator update costs a cache modification and a journal record rather than an immediate rewrite of the data file.

[Figure: BSON document { _id: 1, val: 10 } -> atomic op $inc: { val: 1 } -> WiredTiger cache page update (in-memory) -> delta written to Journal (WAL). Label: O(1) memory latency.]

1. Key Atomic Operators

  • $set: Replaces the value of a field with a specified value without affecting other fields.
  • $inc: Increments a field by a specified value. This is highly efficient for counters as it avoids the "Read-Modify-Write" race condition.
  • $push / $pull: Adds or removes items from an array. These operations are executed server-side, ensuring that the array remains consistent even under high concurrency.
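
As an illustration of the operators listed above, the following mongosh-style sketch combines $set, $inc, and $push in a single update document. The helper name recordLogin, the collection handle, and all field names are hypothetical, chosen only for this example.

```javascript
// Hypothetical helper: `users` is assumed to be a collection handle
// such as db.users in mongosh; all field names are illustrative.
function recordLogin(users, userId, when) {
  return users.updateOne(
    { _id: userId },
    {
      $set: { status: "ACTIVE" },   // overwrite one field, leave the others untouched
      $inc: { loginCount: 1 },      // server-side increment, no read-modify-write
      $push: { recentLogins: when } // append to an array on the server
    }
  );
}
```

In mongosh this might be called as recordLogin(db.users, 1, new Date()). All three changes are applied to the document as one atomic write.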

II. Upserts & FindAndModify

An Upsert (Update or Insert) updates the matching document or, if no document matches the filter, inserts a new one built from the filter and the update. It is idempotent only when the update itself is (a $set to a fixed value is; an $inc is not). For workflows requiring the updated document to be returned immediately, MongoDB provides the findOneAndUpdate() method. This method finds, modifies, and returns a single document atomically, making it the standard choice for implementing Job Queues or Sequence Generators.

// Atomic Job Claiming Pattern
db.jobs.findOneAndUpdate(
  { status: "PENDING" },
  { $set: { status: "PROCESSING", worker: "node_1" } },
  { returnDocument: "after", sort: { priority: -1 } }
)
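
The Sequence Generator use case mentioned above can be sketched in the same style by combining findOneAndUpdate() with upsert: true. This is a minimal sketch assuming mongosh, where findOneAndUpdate() returns the document itself; the helper name nextSequence, the counters collection, and the seq field are hypothetical.

```javascript
// Hypothetical helper: `counters` is assumed to be a collection handle
// such as db.counters in mongosh; the "seq" field name is illustrative.
function nextSequence(counters, name) {
  const doc = counters.findOneAndUpdate(
    { _id: name },        // _id always has a unique index
    { $inc: { seq: 1 } }, // on first use, $inc creates seq with the value 1
    { upsert: true, returnDocument: "after" }
  );
  return doc.seq;
}
```

In mongosh, nextSequence(db.counters, "orderId") would create the counter document on the first call and return an incremented value on each later call.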

III. Deletion Mechanics & Tombstones

Deletions in MongoDB do not immediately reclaim disk space. When a document is deleted via deleteOne() or deleteMany(), WiredTiger marks the corresponding record as "Deleted" in its B-Tree and adds the space to an internal Free List. This space is reused for future inserts. To fully reclaim disk space and defragment the data files, a compact command or an initial sync must be performed.
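
To make the reclamation step concrete, here is a minimal mongosh-style sketch that checks how much space WiredTiger reports as reusable and then issues the compact command. The helper names are hypothetical; the handles are assumed to be a collection such as db.events and the database object db.

```javascript
// Hypothetical helper: `collection` is assumed to be a mongosh collection handle.
// Reads the reusable-space figure from the WiredTiger section of the collection stats.
function reusableBytes(collection) {
  const stats = collection.stats();
  return stats.wiredTiger["block-manager"]["file bytes available for reuse"];
}

// Hypothetical helper: `database` is assumed to be the mongosh `db` object.
// Runs the compact command against the named collection on the connected member.
function compactCollection(database, collectionName) {
  return database.runCommand({ compact: collectionName });
}
```

A possible workflow is to call reusableBytes(db.events) first and run compactCollection(db, "events") only when the figure is large. In a replica set, compact is not replicated, so it would need to be run on each member where space should be reclaimed.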


IV. Production Anti-Patterns

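  • Unbounded Array Growth: Using $push on an array that grows indefinitely (e.g., logs). This can eventually hit the 16 MB BSON document size limit, and well before that point each update gets slower as the document grows and any multikey index on the array gains an entry per element.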
  • The "Double Update" Hazard: Issuing two separate updateOne() calls for the same document instead of combining them into a single atomic update. This doubles the network round-trips and Oplog entries.
  • Lack of Unique Indexes for Upserts: Relying on upsert: true without a unique index on the filter fields (or on the shard key in a sharded collection). Two concurrent upserts with the same filter can both find no match and both insert, leaving duplicate documents.
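
Two of the anti-patterns above have short mitigations that can be sketched in mongosh style: capping an array with the $each and $slice modifiers of $push, and backing an upsert with a unique index. The helper names, the events array, and the email field are hypothetical.

```javascript
// Hypothetical helper: append one entry and keep only the newest `maxEntries`.
// A negative $slice keeps the last N elements of the array.
function appendBounded(collection, id, entry, maxEntries) {
  return collection.updateOne(
    { _id: id },
    { $push: { events: { $each: [entry], $slice: -maxEntries } } }
  );
}

// Hypothetical helper: run once (e.g. at deploy time) so that the upsert
// below cannot leave two documents with the same email.
function ensureUniqueEmail(users) {
  return users.createIndex({ email: 1 }, { unique: true });
}

// Hypothetical helper: upsert keyed on the uniquely indexed field.
function upsertByEmail(users, email, fields) {
  return users.updateOne({ email: email }, { $set: fields }, { upsert: true });
}
```

With the unique index in place, a concurrent upsert that loses the race can no longer insert a second document; depending on timing it either updates the existing one or fails with a duplicate key error that the caller can retry.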

V. Performance Bottlenecks

  • Oplog Saturation: Large deleteMany() or updateMany() operations generate a massive stream of Oplog entries, which can saturate the network bandwidth and cause Replication Lag on secondaries.
  • Write Conflict Retries: In high-concurrency environments, multiple threads updating the same document will trigger Write Conflicts in WiredTiger, forcing the database to retry the operation and spiking CPU usage.
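  • Index Update Penalty: Every update to an indexed field requires a corresponding update to each B-Tree index that contains it. A field that is part of 5 different indexes incurs 5 index updates on top of the document write, so write cost grows with the number of indexes on frequently updated fields.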
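
One common way to soften the Oplog Saturation problem described above is to split a large delete into small batches with a pause between them, giving secondaries time to catch up. The following is a minimal mongosh-style sketch under that assumption, not a tuned implementation; the helper name deleteInBatches, the batch size, and the caller-supplied pause function are all hypothetical.

```javascript
// Hypothetical helper: `collection` is assumed to be a mongosh collection handle.
// `pause` is an optional caller-supplied function invoked after each batch
// (in mongosh it could be something like () => sleep(100)).
function deleteInBatches(collection, filter, batchSize, pause) {
  let total = 0;
  while (true) {
    // Fetch only the _id values of the next batch of matching documents.
    const ids = collection
      .find(filter, { _id: 1 })
      .limit(batchSize)
      .toArray()
      .map((doc) => doc._id);
    if (ids.length === 0) break; // nothing left to delete

    // Delete exactly that batch, so each round writes a bounded number of oplog entries.
    total += collection.deleteMany({ _id: { $in: ids } }).deletedCount;

    if (pause) pause(); // give replication a chance to keep up
  }
  return total;
}
```

A call such as deleteInBatches(db.sessions, { expired: true }, 1000, () => sleep(100)) would remove matching documents 1000 at a time. The right batch size and pause depend on the workload and on observed replication lag.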