Chapter 2: MongoDB CRUD Operations (Insertion & Querying)
CRUD operations (Create, Read, Update, Delete) are the basic primitives of data interaction in MongoDB. This chapter covers Creation and Querying: the insert and find methods, query operators, projections, cursor behavior, and production-grade implementation patterns.
I. Creating Documents: The Persistence Layer
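In MongoDB, "Creating" data involves serializing documents to BSON and transmitting them to the mongod process. With the default WiredTiger storage engine, a write to a single document is atomic, and WiredTiger's multiversion concurrency control (MVCC) gives each operation a consistent snapshot of the data.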
1. db.collection.insertOne(document, options)
This method inserts a single document into the collection. It is the primary tool for discrete, low-volume writes.
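- Parameters:
  - document (Object): The BSON document to persist. Avoid field names that contain . or start with $: older server and driver versions reject them, and newer versions (5.0 and later) support them only with restrictions on querying and updating.
  - options (Object):
    - writeConcern: The acknowledgment level for the write (e.g., { w: "majority", j: true, wtimeout: 5000 }).
    - bypassDocumentValidation (Boolean): If true, allows the write to ignore schema validation rules.
- Return Value: InsertOneResult
  - acknowledged: Boolean; true when the write ran with an acknowledged write concern.
  - insertedId: The _id of the inserted document.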
Code Example: Structured Financial Record
// Inserting a single transaction with a strict write concern
db.transactions.insertOne(
  {
    // UUID() stores the value as BinData subtype 4; BinData() itself expects base64 input
    txn_id: UUID("f47ac10b-58cc-4372-a567-0e02b2c3d479"),
    amount: NumberDecimal("1250.50"),
    currency: "USD",
    metadata: {
      gateway: "stripe",
      ip_address: "192.168.1.1"
    },
    created_at: ISODate("2026-04-15T10:00:00Z")
  },
  { writeConcern: { w: "majority", j: true } }
);
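Field names that contain . or start with $ can be caught on the client before a write is sent. The following is a minimal sketch in plain JavaScript, assuming documents are plain objects; findRiskyFieldNames is a hypothetical helper for illustration, not part of the shell or any MongoDB driver.

```javascript
// Hypothetical helper (not a MongoDB API): walks a plain-object document and
// returns the paths of field names that contain "." or start with "$".
function findRiskyFieldNames(doc, prefix = "") {
  const risky = [];
  for (const [key, value] of Object.entries(doc)) {
    const path = prefix ? `${prefix} > ${key}` : key;
    if (key.includes(".") || key.startsWith("$")) {
      risky.push(path);
    }
    // Recurse into nested objects and arrays (dates have no own fields to check).
    if (value !== null && typeof value === "object" && !(value instanceof Date)) {
      risky.push(...findRiskyFieldNames(value, path));
    }
  }
  return risky;
}

const candidate = {
  amount: 1250.5,
  "metadata.gateway": "stripe", // contains a dot
  metadata: { $source: "api", ip_address: "192.168.1.1" } // nested "$" prefix
};

console.log(findRiskyFieldNames(candidate));
// [ 'metadata.gateway', 'metadata > $source' ]
```

Running a check like this before insertOne keeps the decision about such names in application code rather than depending on server or driver version behavior.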
2. db.collection.insertMany(documents, options)
The insertMany method is the high-throughput choice for bulk data ingestion. It minimizes network overhead by sending documents to the server in batches instead of one round trip per document; the driver splits very large arrays into multiple batches.
- Parameters:
  - documents (Array): An array of documents to insert.
  - options (Object):
    - ordered (Boolean): If true (default), the operation stops at the first failure. If false, MongoDB continues to process the remaining documents even if one fails (e.g., due to a duplicate key).
- Return Value: InsertManyResult
  - insertedCount: Total documents successfully written.
  - insertedIds: A mapping of array indices to the _id values of the inserted documents (ObjectIds by default).
Code Example: Unordered Bulk Sensor Data
// High-throughput sensor logging with unordered insertion to maximize availability
db.sensor_readings.insertMany(
  [
    { sensor: "SN-001", temp: 22.5, ts: ISODate() },
    { sensor: "SN-002", temp: 24.1, ts: ISODate() },
    { sensor: "SN-003", temp: 21.8, ts: ISODate() }
  ],
  { ordered: false }
);
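The difference between ordered and unordered inserts is easiest to see when one document in the batch fails. The sketch below is a toy in-memory model written in plain JavaScript, not driver code: simulateInsertMany is a hypothetical function, a Set stands in for the unique _id index, and documents are processed sequentially (a real server is free to process an unordered batch in any order).

```javascript
// Toy in-memory model of insertMany semantics (an illustration, not driver code).
// A Set of existing _id values stands in for the unique _id index.
function simulateInsertMany(existingIds, docs, { ordered = true } = {}) {
  const insertedIds = {};
  const writeErrors = [];
  for (let i = 0; i < docs.length; i++) {
    const id = docs[i]._id;
    if (existingIds.has(id)) {
      writeErrors.push({ index: i, reason: "duplicate key" });
      if (ordered) break; // ordered: stop at the first failure
      continue;           // unordered: keep processing the rest
    }
    existingIds.add(id);
    insertedIds[i] = id;
  }
  return { insertedCount: Object.keys(insertedIds).length, insertedIds, writeErrors };
}

const batch = [{ _id: 1 }, { _id: 2 }, { _id: 2 }, { _id: 3 }];

const orderedResult = simulateInsertMany(new Set(), batch, { ordered: true });
const unorderedResult = simulateInsertMany(new Set(), batch, { ordered: false });

console.log(orderedResult.insertedCount);   // 2 (stops at index 2, never reaches _id 3)
console.log(unorderedResult.insertedCount); // 3 (skips index 2, still inserts _id 3)
```

In both modes the documents written before the failure stay written, so callers should inspect the reported errors rather than assume all-or-nothing behavior.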
II. Querying Documents: The Search & Filter Engine
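MongoDB's query planner chooses an execution path empirically: it generates candidate plans, runs them for a short trial period, selects the best performer, and caches the winning plan for queries of the same shape. Queries are expressed as BSON filter documents.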
1. db.collection.find(filter, projection)
Returns a Cursor to the documents that match the filter.
- Parameters:
  - filter (Object): Selection criteria. An empty object {} matches all documents.
  - projection (Object): Determines which fields are returned.
- Returns: A Cursor object (lazily evaluated).
2. Exhaustive Query Operators Reference
A. Comparison Operators
| Operator | Description | Example |
|---|---|---|
| $eq | Equality | { age: { $eq: 25 } } |
| $ne | Not Equal | { status: { $ne: "REJECTED" } } |
| $gt / $gte | Greater Than (or Equal) | { price: { $gt: 100 } } |
| $lt / $lte | Less Than (or Equal) | { score: { $lte: 50 } } |
| $in | In Array | { region: { $in: ["US", "EU"] } } |
| $nin | Not In Array | { category: { $nin: ["spam"] } } |
B. Logical Operators
- $and: Joins query clauses with a logical AND.
- $or: Joins query clauses with a logical OR.
- $nor: Joins query clauses with a logical NOR.
- $not: Inverts the effect of a query expression.
Code Example: Complex Logical Query
// Find active users in US or EU with premium status
// (top-level fields are combined with an implicit AND)
db.users.find({
  status: "ACTIVE",
  $or: [
    { region: "US" },
    { region: "EU" }
  ],
  "subscription.tier": "PREMIUM"
});
C. Element & Array Operators
- $exists: Matches documents that have the specified field.
- $type: Selects documents if a field is of the specified BSON type.
- $all: Matches arrays that contain all elements specified in the query.
- $elemMatch: Selects documents if at least one element in an array field matches all the specified query criteria.
Code Example: Array Element Matching
// Find documents where at least one sub-document in the 'scores' array
// has both score > 90 and type 'final'
db.students.find({
  scores: {
    $elemMatch: { score: { $gt: 90 }, type: "final" }
  }
});
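What $elemMatch adds is the requirement that a single array element satisfy every condition. The sketch below mimics that behavior, and the looser dot-notation alternative, with plain JavaScript predicates over an in-memory document; the function names and the sample student are illustrative assumptions, not the server's implementation.

```javascript
// Plain-JavaScript predicates that mimic the two matching behaviours.
// These are illustrative stand-ins, not the server's implementation.

// Like { scores: { $elemMatch: { score: { $gt: 90 }, type: "final" } } }:
// a single array element must satisfy both conditions.
function matchesWithElemMatch(student) {
  return student.scores.some((s) => s.score > 90 && s.type === "final");
}

// Like { "scores.score": { $gt: 90 }, "scores.type": "final" }:
// each condition may be satisfied by a different element.
function matchesWithDotNotation(student) {
  return (
    student.scores.some((s) => s.score > 90) &&
    student.scores.some((s) => s.type === "final")
  );
}

const student = {
  name: "sample",
  scores: [
    { type: "quiz", score: 95 },
    { type: "final", score: 70 }
  ]
};

console.log(matchesWithElemMatch(student));   // false
console.log(matchesWithDotNotation(student)); // true
```

The sample student scored 95 on a quiz and 70 on the final, so only the dot-notation form matches, which is rarely what a query for "final scores above 90" intends.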
III. Projections: Optimizing Network & Memory
Projections limit the data sent over the wire. This is critical for performance in environments with large documents.
- Inclusion: { name: 1, email: 1 } (only these fields are returned, plus _id unless it is explicitly excluded).
- Exclusion: { password: 0, ssn: 0 } (all fields except these are returned).
- $slice: Limits the number of elements returned in an array, e.g., { comments: { $slice: 5 } } (return only the first 5 comments).
Code Example: Covered Query with Projection
// If an index exists on {sku: 1, price: 1}, this query is "Covered"
// (The engine never touches the actual document on disk)
db.products.find(
  { sku: "XYZ-123" },
  { price: 1, _id: 0 }
);
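The inclusion and exclusion rules, including the special handling of _id, can be modeled in a few lines. The following is a toy sketch in plain JavaScript that handles top-level fields only; applyProjection is a hypothetical helper, and it deliberately ignores nested paths and operators such as $slice.

```javascript
// Toy model of top-level inclusion/exclusion projection on a plain object.
// Illustrative only: it ignores nested paths and operators such as $slice.
function applyProjection(doc, projection) {
  const { _id: idFlag, ...fields } = projection;
  const names = Object.keys(fields);
  const isInclusion = names.some((name) => fields[name] === 1);
  const result = {};
  if (isInclusion) {
    // Inclusion: keep only the listed fields, plus _id unless it is set to 0.
    if (idFlag !== 0 && "_id" in doc) result._id = doc._id;
    for (const name of names) {
      if (fields[name] === 1 && name in doc) result[name] = doc[name];
    }
  } else {
    // Exclusion: keep everything except the listed fields (and _id if set to 0).
    for (const [name, value] of Object.entries(doc)) {
      const excluded = fields[name] === 0 || (name === "_id" && idFlag === 0);
      if (!excluded) result[name] = value;
    }
  }
  return result;
}

const user = { _id: 7, name: "Ada", email: "ada@example.com", password: "secret" };

console.log(applyProjection(user, { name: 1, email: 1 })); // keeps _id, name, email
console.log(applyProjection(user, { password: 0 }));       // keeps _id, name, email
console.log(applyProjection(user, { name: 1, _id: 0 }));   // keeps name only
```

The third call mirrors the covered-query pattern: _id has to be excluded explicitly, otherwise it is returned even though the projection never names it.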
IV. The Cursor Lifecycle & Execution Plan
When find() is called, MongoDB does not immediately fetch all data. It returns a Cursor, which can be modified before execution.
1. Cursor Methods
- .sort({ field: 1 }): Orders the results. 1 = ascending, -1 = descending.
- .limit(n): Restricts the number of documents returned.
- .skip(n): Skips the first n documents; used as an offset for pagination.
- .hint(indexName): Forces the optimizer to use a specific index.
- .maxTimeMS(ms): Sets a hard time limit for query execution to prevent "Runaway Queries."
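Two properties of cursors are worth seeing in miniature: modifiers only record settings until the cursor is consumed, and the server applies sort, then skip, then limit regardless of the order in which they were chained. The sketch below is a toy cursor over an in-memory array in plain JavaScript; ToyCursor and its executed flag are assumptions made for illustration and do not reflect the driver's Cursor class.

```javascript
// Toy cursor over an in-memory array (illustrative, not the driver's Cursor).
// Modifiers only record settings; no work happens until toArray() is called.
class ToyCursor {
  constructor(docs) {
    this.docs = docs;
    this.sortSpec = null;
    this.skipCount = 0;
    this.limitCount = 0; // 0 means "no limit"
    this.executed = false;
  }
  sort(spec) { this.sortSpec = spec; return this; }
  skip(n) { this.skipCount = n; return this; }
  limit(n) { this.limitCount = n; return this; }
  toArray() {
    this.executed = true;
    let out = [...this.docs];
    if (this.sortSpec) {
      const [[field, direction]] = Object.entries(this.sortSpec);
      out.sort((a, b) => (a[field] < b[field] ? -1 : a[field] > b[field] ? 1 : 0) * direction);
    }
    // Sort is applied first, then skip, then limit, whatever the call order.
    out = out.slice(this.skipCount);
    return this.limitCount > 0 ? out.slice(0, this.limitCount) : out;
  }
}

const readings = [{ temp: 24.1 }, { temp: 21.8 }, { temp: 22.5 }, { temp: 25.0 }];
const cursor = new ToyCursor(readings).limit(2).skip(1).sort({ temp: -1 });

console.log(cursor.executed);  // false: nothing has run yet
console.log(cursor.toArray()); // [ { temp: 24.1 }, { temp: 22.5 } ]
```

Although limit(2) was chained first, the result is the second and third highest readings, because the limit is applied only after sorting and skipping.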
2. Query Execution Architecture
V. Production Anti-Patterns
- The "Pagination Trap": Using large .skip() values (e.g., page 1000). MongoDB must scan all previous records to reach the offset. Solution: Use Range-Based Pagination (filtering by _id > lastSeenId).
- Unanchored Regex: Querying { key: /.*suffix/ }. The pattern cannot be turned into index bounds, so MongoDB must examine every index key or, without an index, every document. Anchor regexes to the start with ^ whenever possible.
- Deep Array Exploration: Querying several fields of deeply nested elements in large arrays without $elemMatch. The conditions can then be satisfied by different array elements, which returns unintended matches and adds CPU cost.
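Range-based pagination replaces the growing offset with a filter on the last _id seen, so each page starts from an indexed position. The sketch below shows the loop in plain JavaScript against an in-memory array standing in for a collection; nextPageFilter and fetchPage are hypothetical helpers, and numeric _id values are assumed to keep the comparison simple.

```javascript
// Sketch of range-based ("keyset") pagination over an in-memory array that
// stands in for a collection sorted by _id. Helper names are hypothetical.

// Builds the filter for the next page: no filter for the first page,
// otherwise only documents whose _id is greater than the last one seen.
function nextPageFilter(lastSeenId) {
  return lastSeenId === null ? {} : { _id: { $gt: lastSeenId } };
}

// Stand-in for db.collection.find(filter).sort({ _id: 1 }).limit(pageSize).
function fetchPage(docs, filter, pageSize) {
  const lowerBound = filter._id ? filter._id.$gt : -Infinity;
  return docs
    .filter((doc) => doc._id > lowerBound)
    .sort((a, b) => a._id - b._id)
    .slice(0, pageSize);
}

const collection = [5, 1, 4, 2, 3].map((n) => ({ _id: n }));

let lastSeenId = null;
const pages = [];
while (true) {
  const page = fetchPage(collection, nextPageFilter(lastSeenId), 2);
  if (page.length === 0) break;
  pages.push(page.map((doc) => doc._id));
  lastSeenId = page[page.length - 1]._id; // remember where this page ended
}

console.log(pages); // [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]
```

The trade-off is that this approach supports "next page" navigation well but cannot jump directly to an arbitrary page number the way .skip() can.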
VI. Performance Bottlenecks
- Plan Cache Thrashing: When a query has many variations, the optimizer may spend too much time regenerating plans. Monitor plan cache statistics (e.g., with the $planCacheStats aggregation stage).
- In-Memory Sort Failure: If a query requires a sort and there is no supporting index, MongoDB sorts in memory with a 100 MB limit. If the sort exceeds this limit, the query fails with an error unless it is allowed to spill to disk (allowDiskUse).
- Serialization Latency: Fetching thousands of small documents instead of fewer large documents, or fetching full documents where a projection would do. Each document returned incurs BSON overhead.