Data modelling in MongoDB is best learned by connecting each rule to a concrete dataset, such as a product catalog or a user activity store. Start with the smallest collection query, observe the output, and then add one realistic constraint so the concept becomes practical.
The key habit for this lesson is to watch how document shape and index usage change as the model changes. That makes the topic easier to debug, easier to explain in interviews, and easier to use in real code without memorizing isolated syntax.
Data modelling in MongoDB is the process of deciding how to structure your documents and collections. Unlike relational databases where the schema is fixed, MongoDB gives you the flexibility to choose between embedding related data inside a document or storing it in separate collections with references. The right choice depends on your access patterns, data size, and relationship cardinality.
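The two fundamental approaches to modelling relationships in MongoDB are embedding and referencing. The examples below apply both to one-to-one, one-to-many, and many-to-many relationships: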
// Note: ids such as "u1" and "p1" are shortened placeholders; real ObjectId values are 24-character hex strings.
// ONE-TO-ONE: EMBEDDED (preferred when data is always accessed together)
{
  "_id": ObjectId("..."),
  "name": "Alice Johnson",
  "email": "alice@example.com",
  "profile": {
    "bio": "Software engineer with 8 years experience",
    "avatar": "https://cdn.example.com/avatars/alice.jpg",
    "website": "https://alice.dev"
  }
}
// ONE-TO-ONE: REFERENCED (use when profile is large or rarely needed)
// users collection
{ "_id": ObjectId("u1"), "name": "Alice Johnson", "profileId": ObjectId("p1") }
// profiles collection
{ "_id": ObjectId("p1"), "bio": "Software engineer...", "userId": ObjectId("u1") }
// ONE-TO-MANY: EMBEDDED (good for small, bounded arrays like addresses)
{
  "_id": ObjectId("u1"),
  "name": "Alice Johnson",
  "addresses": [
    { "type": "home", "street": "123 Main St", "city": "New York" },
    { "type": "work", "street": "456 Park Ave", "city": "New York" }
  ]
}
// ONE-TO-MANY: REFERENCED (better for large or unbounded sets like orders)
// users collection
{ "_id": ObjectId("u1"), "name": "Alice Johnson" }
// orders collection - each order references the user
{ "_id": ObjectId("o1"), "userId": ObjectId("u1"), "total": 99.99, "status": "shipped" }
{ "_id": ObjectId("o2"), "userId": ObjectId("u1"), "total": 45.00, "status": "pending" }
// Query all orders for a user
db.orders.find({ userId: ObjectId("u1") })
// MANY-TO-MANY: Students and Courses
// students collection
{
  "_id": ObjectId("s1"),
  "name": "Bob Smith",
  "enrolledCourses": [ObjectId("c1"), ObjectId("c2"), ObjectId("c3")]
}
// courses collection
{
  "_id": ObjectId("c1"),
  "title": "MongoDB Fundamentals",
  "enrolledStudents": [ObjectId("s1"), ObjectId("s2")]
}
// Find all courses a student is enrolled in
db.courses.find({ _id: { $in: [ObjectId("c1"), ObjectId("c2"), ObjectId("c3")] } })
// Or use $lookup aggregation for a join
db.students.aggregate([
  { $match: { _id: ObjectId("s1") } },
  { $lookup: {
      from: "courses",
      localField: "enrolledCourses",
      foreignField: "_id",
      as: "courses"
  }}
])
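To make the `$lookup` stage above less opaque, here is a minimal sketch in plain JavaScript of the join it performs for this example, run over in-memory arrays rather than a live database. The `lookupCourses` helper, the short string ids, and the extra course titles are illustrative assumptions, not MongoDB APIs or data from this lesson; the real stage matches ObjectId values on the server.

```javascript
// Illustrative in-memory data; short string ids stand in for ObjectId values.
const students = [
  { _id: "s1", name: "Bob Smith", enrolledCourses: ["c1", "c2", "c3"] }
];
const courses = [
  { _id: "c1", title: "MongoDB Fundamentals" },
  { _id: "c2", title: "Schema Design" },      // made-up title
  { _id: "c4", title: "Unrelated Course" }    // made-up title, not enrolled
];

// Hypothetical helper mimicking $lookup with localField "enrolledCourses"
// and foreignField "_id": each student gains a "courses" array holding the
// matching course documents. Ids without a matching course are skipped,
// and in this sketch a student without the array gets an empty result.
function lookupCourses(studentDocs, courseDocs) {
  return studentDocs.map((student) => {
    const wanted = new Set(student.enrolledCourses ?? []);
    return {
      ...student,
      courses: courseDocs.filter((course) => wanted.has(course._id))
    };
  });
}

const joined = lookupCourses(students, courses);
console.log(joined[0].courses.map((c) => c.title));
```

Note that "c3" is referenced by the student but has no course document, so it silently drops out of the joined array. Dangling references like this are one of the costs of the referenced model.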
| Factor | Embed | Reference |
|---|---|---|
| Access pattern | Data always accessed together | Data accessed independently |
| Data size | Small, bounded sub-documents | Large or unbounded arrays |
| Update frequency | Updated together with parent | Updated independently and frequently |
| Data sharing | Not shared across documents | Shared by multiple documents |
| Document size | Stays well under the 16 MB limit | Would exceed 16 MB if embedded |
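The table can be read as a checklist. The sketch below encodes it as a small JavaScript function. The property names (`accessedTogether`, `bounded`, and so on) and the rule that any single "reference" signal wins are assumptions made for illustration, not an official MongoDB rule.

```javascript
// Hypothetical checklist derived from the table above.
// Each property answers one row of the table for a given relationship.
function suggestModel(traits) {
  const reasons = [];
  if (!traits.accessedTogether) reasons.push("data is accessed independently");
  if (!traits.bounded) reasons.push("array is large or unbounded");
  if (traits.updatedIndependently) reasons.push("child is updated independently and frequently");
  if (traits.sharedAcrossDocuments) reasons.push("data is shared by multiple documents");
  if (traits.mayExceed16MB) reasons.push("embedding could exceed the 16 MB document limit");

  // Assumption: one reason to reference outweighs the convenience of embedding.
  return reasons.length === 0
    ? { model: "embed", reasons: ["all factors favour embedding"] }
    : { model: "reference", reasons };
}

// Addresses: few per user, read with the user, not shared.
const addressAdvice = suggestModel({
  accessedTogether: true, bounded: true, updatedIndependently: false,
  sharedAcrossDocuments: false, mayExceed16MB: false
});
console.log(addressAdvice.model);

// Orders: unbounded and queried on their own.
const orderAdvice = suggestModel({
  accessedTogether: false, bounded: false, updatedIndependently: true,
  sharedAcrossDocuments: false, mayExceed16MB: false
});
console.log(orderAdvice.model, orderAdvice.reasons);
```

Real designs often weigh these factors against each other rather than applying a strict rule, so treat the output as a prompt for discussion, not a verdict.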
MongoDB has several well-known patterns for common data modelling challenges, including the bucket, computed, and outlier patterns:
// BUCKET PATTERN: Group time-series data into hourly buckets
// Instead of one document per sensor reading, group them
{
  "_id": ObjectId("..."),
  "sensorId": "sensor_42",
  "date": ISODate("2024-06-01T10:00:00Z"),
  "readings": [
    { "ts": ISODate("2024-06-01T10:00:05Z"), "temp": 22.1 },
    { "ts": ISODate("2024-06-01T10:00:10Z"), "temp": 22.3 },
    { "ts": ISODate("2024-06-01T10:00:15Z"), "temp": 22.0 }
  ],
  "count": 3,
  "avgTemp": 22.13
}
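As a rough illustration of how individual readings end up in a bucket document like the one above, the following plain JavaScript sketch groups readings by sensor and UTC hour and derives `count` and `avgTemp`. The `bucketReadings` helper and the fourth sample reading are assumptions for illustration; in a real application the grouping would more likely happen at write time with an upsert.

```javascript
// Hypothetical helper: fold individual readings into one bucket per hour.
function bucketReadings(sensorId, readings) {
  const buckets = new Map();
  for (const reading of readings) {
    // Truncate the timestamp to the start of its UTC hour.
    const hourStart = new Date(reading.ts);
    hourStart.setUTCMinutes(0, 0, 0);
    const key = hourStart.toISOString();

    if (!buckets.has(key)) {
      buckets.set(key, { sensorId, date: hourStart, readings: [], count: 0, sumTemp: 0 });
    }
    const bucket = buckets.get(key);
    bucket.readings.push(reading);
    bucket.count += 1;
    bucket.sumTemp += reading.temp;
  }
  // Replace the running sum with the average, rounded to two decimals.
  return [...buckets.values()].map(({ sumTemp, ...bucket }) => ({
    ...bucket,
    avgTemp: Math.round((sumTemp / bucket.count) * 100) / 100
  }));
}

const sampleReadings = [
  { ts: new Date("2024-06-01T10:00:05Z"), temp: 22.1 },
  { ts: new Date("2024-06-01T10:00:10Z"), temp: 22.3 },
  { ts: new Date("2024-06-01T10:00:15Z"), temp: 22.0 },
  { ts: new Date("2024-06-01T11:00:05Z"), temp: 21.8 } // extra reading, next hour
];
const hourlyBuckets = bucketReadings("sensor_42", sampleReadings);
console.log(hourlyBuckets.map((b) => [b.date.toISOString(), b.count, b.avgTemp]));
```

The first three readings land in the 10:00 bucket and reproduce the `count` and `avgTemp` values shown in the document above, while the fourth opens a new bucket for 11:00.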
// COMPUTED PATTERN: Pre-compute expensive aggregations
// Instead of computing total revenue on every read, store it
{
  "_id": ObjectId("..."),
  "productId": "LAPTOP-001",
  "name": "ProBook 15",
  "totalSales": 1250,       // pre-computed
  "totalRevenue": 1623750,  // pre-computed
  "lastUpdated": ISODate("2024-06-01T00:00:00Z")
}
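The computed pattern only works if the stored totals are updated whenever a sale is recorded. Here is a minimal sketch of that update step in plain JavaScript. The `applySale` helper, the sale fields (`quantity`, `unitPrice`, `soldAt`), and the sample sale are assumptions for illustration; against a real collection the same change would typically be an atomic update using the `$inc` operator.

```javascript
// Hypothetical helper: apply one sale to the pre-computed totals and
// return the updated product document without mutating the original.
function applySale(product, sale) {
  return {
    ...product,
    totalSales: product.totalSales + sale.quantity,
    totalRevenue: product.totalRevenue + sale.quantity * sale.unitPrice,
    lastUpdated: sale.soldAt
  };
}

const laptop = {
  productId: "LAPTOP-001",
  name: "ProBook 15",
  totalSales: 1250,
  totalRevenue: 1623750,
  lastUpdated: new Date("2024-06-01T00:00:00Z")
};

// Made-up sale: two units at 1299 each.
const afterSale = applySale(laptop, {
  quantity: 2,
  unitPrice: 1299,
  soldAt: new Date("2024-06-02T09:30:00Z")
});
console.log(afterSale.totalSales, afterSale.totalRevenue); // 1252 1626348
```

Reads stay cheap because they fetch one document, at the cost of a slightly more expensive write path.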
// OUTLIER PATTERN: Handle documents that exceed normal array bounds
{
  "_id": ObjectId("..."),
  "postId": "viral-post-123",
  "title": "10 MongoDB Tips",
  "likes": [ObjectId("u1"), ObjectId("u2"), /* ... up to 1000 */],
  "hasExtraLikes": true  // flag indicating overflow documents exist
}
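One way to picture the outlier pattern is as a split: the main document keeps the first 1000 likes, and anything beyond that moves into separate overflow documents. The sketch below shows that split in plain JavaScript. The `splitLikes` helper and the shape of the overflow documents (`postId` plus a `likes` array) are assumptions for illustration, since the example above only shows the main document and its flag.

```javascript
// Hypothetical helper: keep at most `limit` likes in the main document and
// move the rest into overflow documents that point back to the post.
// Assumes `limit` is a positive integer.
function splitLikes(post, allLikes, limit = 1000) {
  const main = {
    ...post,
    likes: allLikes.slice(0, limit),
    hasExtraLikes: allLikes.length > limit
  };
  const overflow = [];
  for (let start = limit; start < allLikes.length; start += limit) {
    overflow.push({
      postId: post.postId,
      likes: allLikes.slice(start, start + limit)
    });
  }
  return { main, overflow };
}

// 2500 placeholder user ids: "u1", "u2", ...
const manyLikes = Array.from({ length: 2500 }, (_, i) => `u${i + 1}`);
const viral = splitLikes({ postId: "viral-post-123", title: "10 MongoDB Tips" }, manyLikes);
console.log(viral.main.likes.length, viral.main.hasExtraLikes, viral.overflow.length);
```

Typical posts never set the flag, so the common read path stays a single document fetch and only the rare viral post pays for extra queries.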
Choose a data model because it answers a specific access pattern, not because a pattern name looks familiar. In a real MongoDB task, first name the input, then name the transformation, then name the output. This small discipline shows whether the model is being used correctly or only copied from an example.
A reliable practice flow is: create the smallest working collection query, add one normal case, add one edge case such as missing, repeated, empty, or boundary input, and then confirm the result with an explain plan and sample documents. If the result surprises you, reduce the code until the behavior is visible again.
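For the "confirm with an explain plan" step, a query's plan can be obtained in mongosh with a call such as `db.orders.find({ userId: ... }).explain("executionStats")`. The sketch below shows one way to read such a plan in plain JavaScript by collecting its stage names. The `collectStages` helper is hypothetical, and the two plan objects are hand-written and heavily simplified, assuming an index named `userId_1` exists on the orders collection; real explain output contains many more fields and its layout can vary between server versions.

```javascript
// Collect every "stage" name found anywhere inside an explain plan object.
function collectStages(node, found = []) {
  if (Array.isArray(node)) {
    node.forEach((child) => collectStages(child, found));
  } else if (node !== null && typeof node === "object") {
    if (typeof node.stage === "string") found.push(node.stage);
    Object.values(node).forEach((child) => collectStages(child, found));
  }
  return found;
}

// Hand-written, simplified plans for illustration only.
const planWithoutIndex = { queryPlanner: { winningPlan: { stage: "COLLSCAN" } } };
const planWithIndex = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "userId_1" }
    }
  }
};

const stagesBefore = collectStages(planWithoutIndex.queryPlanner.winningPlan);
const stagesAfter = collectStages(planWithIndex.queryPlanner.winningPlan);
console.log(stagesBefore); // [ 'COLLSCAN' ]
console.log(stagesAfter);  // [ 'FETCH', 'IXSCAN' ]
```

A `COLLSCAN` stage means every document in the collection was examined, while `IXSCAN` means an index narrowed the search first. Comparing the two before and after a model or index change is a quick way to check that the change did what you predicted.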
The most common trap here is copying the syntax before understanding the behavior. Avoid it by writing one sentence before the code that explains why the chosen model (embedded or referenced) is the right choice. After the code runs, verify the lesson by changing one input and explaining the changed output.
Mistake: copying the syntax before understanding the behavior. Fix: write the expected behavior first, then make the example prove it.
Mistake: practicing only the perfect input. Fix: also test missing, repeated, empty, or boundary input before considering the lesson complete.
Mistake: looking only at the final output. Fix: trace document shape and index through each important step.
Use a modelling pattern when the problem matches the behavior shown in its example and when the result can be verified through an explain plan and sample documents.
Start with a tiny case, then test missing, repeated, empty, or boundary input. The main warning sign is copying the syntax before understanding the behavior.
Trace document shape and index, predict the result, run the example, and compare your prediction with the actual output.