Data modelling in MongoDB is the process of deciding how to structure your documents and collections. Unlike relational databases, where the schema is fixed up front, MongoDB lets you choose between embedding related data inside a document and storing it in separate collections linked by references. The right choice depends on your access patterns, data size, and relationship cardinality.
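The two fundamental approaches to modelling relationships in MongoDB are embedding and referencing. The examples below show how they apply to one-to-one, one-to-many, and many-to-many relationships: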
// ONE-TO-ONE: EMBEDDED (preferred when data is always accessed together)
{
  "_id": ObjectId("..."),
  "name": "Alice Johnson",
  "email": "alice@example.com",
  "profile": {
    "bio": "Software engineer with 8 years experience",
    "avatar": "https://cdn.example.com/avatars/alice.jpg",
    "website": "https://alice.dev"
  }
}
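// ONE-TO-ONE: REFERENCED (use when profile is large or rarely needed)
// Note: short ids such as "u1" and "p1" are placeholders for readability; a real ObjectId is written as a 24-character hex string
// users collection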
{ "_id": ObjectId("u1"), "name": "Alice Johnson", "profileId": ObjectId("p1") }
// profiles collection
{ "_id": ObjectId("p1"), "bio": "Software engineer...", "userId": ObjectId("u1") }
// ONE-TO-MANY: EMBEDDED (good for small, bounded arrays like addresses)
{
  "_id": ObjectId("u1"),
  "name": "Alice Johnson",
  "addresses": [
    { "type": "home", "street": "123 Main St", "city": "New York" },
    { "type": "work", "street": "456 Park Ave", "city": "New York" }
  ]
}
// ONE-TO-MANY: REFERENCED (better for large or unbounded sets like orders)
// users collection
{ "_id": ObjectId("u1"), "name": "Alice Johnson" }
// orders collection - each order references the user
{ "_id": ObjectId("o1"), "userId": ObjectId("u1"), "total": 99.99, "status": "shipped" }
{ "_id": ObjectId("o2"), "userId": ObjectId("u1"), "total": 45.00, "status": "pending" }
// Query all orders for a user
db.orders.find({ userId: ObjectId("u1") })
// MANY-TO-MANY: Students and Courses (two-way referencing)
// students collection
{
  "_id": ObjectId("s1"),
  "name": "Bob Smith",
  "enrolledCourses": [ObjectId("c1"), ObjectId("c2"), ObjectId("c3")]
}
// courses collection
{
  "_id": ObjectId("c1"),
  "title": "MongoDB Fundamentals",
  "enrolledStudents": [ObjectId("s1"), ObjectId("s2")]
}
// Find all courses a student is enrolled in
// (the ids in $in are the values of that student's enrolledCourses array)
db.courses.find({ _id: { $in: [ObjectId("c1"), ObjectId("c2"), ObjectId("c3")] } })
// Or use $lookup aggregation for a join
db.students.aggregate([
  { $match: { _id: ObjectId("s1") } },
  { $lookup: {
      from: "courses",
      localField: "enrolledCourses",
      foreignField: "_id",
      as: "courses"
  }}
])
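To make the join concrete, here is a minimal in-memory sketch in plain JavaScript of what the `$lookup` stage above does when `localField` is an array: each student gains a `courses` array holding every course whose `_id` appears in `enrolledCourses`. The helper name `lookupCourses`, the string ids standing in for ObjectId values, and the second course title are assumptions made for illustration; the real stage runs on the server.

```javascript
// In-memory illustration of the join; not a MongoDB API.
// String ids stand in for ObjectId values.
const students = [
  { _id: "s1", name: "Bob Smith", enrolledCourses: ["c1", "c2", "c3"] },
];
const courses = [
  { _id: "c1", title: "MongoDB Fundamentals", enrolledStudents: ["s1", "s2"] },
  { _id: "c2", title: "Example Course 2", enrolledStudents: ["s1"] }, // made-up title
  { _id: "c9", title: "Unrelated Course", enrolledStudents: [] },     // made-up title
];

// Hypothetical helper: attach matching course documents under "courses",
// mirroring localField: "enrolledCourses", foreignField: "_id", as: "courses".
function lookupCourses(student, allCourses) {
  const wanted = new Set(student.enrolledCourses);
  return { ...student, courses: allCourses.filter((c) => wanted.has(c._id)) };
}

const joined = lookupCourses(students[0], courses);
// Titles for c1 and c2; c3 has no course document in this sample, so it is simply absent.
console.log(joined.courses.map((c) => c.title));
```

Ids that match no document in the joined collection do not cause an error; they just contribute nothing to the output array.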
| Factor | Embed | Reference |
|---|---|---|
| Access pattern | Data always accessed together | Data accessed independently |
| Data size | Small, bounded sub-documents | Large or unbounded arrays |
| Update frequency | Updated together with parent | Updated independently and frequently |
| Data sharing | Not shared across documents | Shared by multiple documents |
| Document size | Stays well under the 16 MB document limit | Would approach or exceed 16 MB if embedded |
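The table can be read as a rough checklist. The sketch below encodes it as a plain JavaScript function. The function name `suggestModel`, its field names, and the rule that any single "reference" signal outweighs the rest are assumptions made for illustration, not an official MongoDB rule.

```javascript
// Hypothetical checklist helper mirroring the embed-vs-reference table.
const MAX_BSON_BYTES = 16 * 1024 * 1024; // MongoDB's 16 MB document size limit

function suggestModel(rel) {
  const reasons = [];
  if (!rel.accessedTogether) reasons.push("accessed independently");
  if (!rel.bounded) reasons.push("large or unbounded");
  if (rel.updatedIndependently) reasons.push("updated independently");
  if (rel.sharedAcrossDocuments) reasons.push("shared by multiple documents");
  if (rel.estimatedBytes >= MAX_BSON_BYTES) reasons.push("would exceed 16 MB");
  // Assumption: embed only when nothing in the table argues for referencing.
  return { choice: reasons.length === 0 ? "embed" : "reference", reasons };
}

// A user's addresses: small, bounded, always read with the user.
const addresses = suggestModel({
  accessedTogether: true, bounded: true, updatedIndependently: false,
  sharedAcrossDocuments: false, estimatedBytes: 500,
});

// A user's orders: unbounded and queried on their own.
const orders = suggestModel({
  accessedTogether: false, bounded: false, updatedIndependently: true,
  sharedAcrossDocuments: false, estimatedBytes: 500,
});

console.log(addresses.choice, orders.choice); // embed reference
```

In practice the factors are weighed against each other rather than applied mechanically, so treat the output as a starting point for discussion.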
MongoDB has several well-known patterns for common data modelling challenges:
// BUCKET PATTERN: Group time-series data into hourly buckets
// Instead of one document per sensor reading, group them
{
  "_id": ObjectId("..."),
  "sensorId": "sensor_42",
  "date": ISODate("2024-06-01T10:00:00Z"),
  "readings": [
    { "ts": ISODate("2024-06-01T10:00:05Z"), "temp": 22.1 },
    { "ts": ISODate("2024-06-01T10:00:10Z"), "temp": 22.3 },
    { "ts": ISODate("2024-06-01T10:00:15Z"), "temp": 22.0 }
  ],
  "count": 3,
  "avgTemp": 22.13
}
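One way to fill such a bucket is a single upsert per reading. The sketch below only builds the filter and update documents in plain JavaScript, so it runs without a database. The collection name `sensorBuckets`, the helper `bucketUpdateFor`, and the `sumTemp` field are assumptions: a running sum is kept because an average cannot be maintained with `$inc` alone, and `avgTemp` can then be derived as `sumTemp / count`.

```javascript
// Hypothetical helper: build the upsert for one sensor reading.
function bucketUpdateFor(sensorId, ts, temp) {
  // Truncate the timestamp to the start of its UTC hour to identify the bucket.
  const hourStart = new Date(ts.getTime());
  hourStart.setUTCMinutes(0, 0, 0);
  return {
    filter: { sensorId, date: hourStart },
    update: {
      $push: { readings: { ts, temp } },  // append the raw reading
      $inc: { count: 1, sumTemp: temp },  // keep running totals for the bucket
    },
  };
}

const { filter, update } = bucketUpdateFor(
  "sensor_42",
  new Date("2024-06-01T10:00:05Z"),
  22.1
);
console.log(filter.date.toISOString()); // 2024-06-01T10:00:00.000Z

// With a live connection this could be sent as (collection name assumed):
// db.sensorBuckets.updateOne(filter, update, { upsert: true })
```

Because of the upsert, the first reading of each hour creates the bucket document and later readings in the same hour are appended to it.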
// COMPUTED PATTERN: Pre-compute expensive aggregations
// Instead of computing total revenue on every read, store it
{
  "_id": ObjectId("..."),
  "productId": "LAPTOP-001",
  "name": "ProBook 15",
  "totalSales": 1250,       // pre-computed
  "totalRevenue": 1623750,  // pre-computed
  "lastUpdated": ISODate("2024-06-01T00:00:00Z")
}
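The stored totals have to be kept current at write time. A minimal sketch, assuming each sale arrives with a quantity and a unit price: the hypothetical helper `saleUpdate` builds an update document that increments both totals, and `applyUpdate` imitates `$inc` and `$set` in memory so the arithmetic can be checked without a database. The unit price of 1299 is an assumed value chosen to be consistent with the totals in the document above.

```javascript
// Hypothetical helper: update document for recording one sale.
function saleUpdate(quantity, unitPrice, now) {
  return {
    $inc: { totalSales: quantity, totalRevenue: quantity * unitPrice },
    $set: { lastUpdated: now },
  };
}

// In-memory stand-in for how $inc and $set change a document (illustration only).
function applyUpdate(doc, update) {
  const next = { ...doc, ...update.$set };
  for (const [field, delta] of Object.entries(update.$inc)) {
    next[field] = (next[field] ?? 0) + delta;
  }
  return next;
}

const product = {
  productId: "LAPTOP-001",
  name: "ProBook 15",
  totalSales: 1250,
  totalRevenue: 1623750,
};
const updated = applyUpdate(
  product,
  saleUpdate(2, 1299, new Date("2024-06-02T00:00:00Z"))
);
console.log(updated.totalSales, updated.totalRevenue); // 1252 1626348
```

The trade-off is a little extra work on every write in exchange for reads that no longer need to aggregate over all sales.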
// OUTLIER PATTERN: Handle documents that exceed normal array bounds
{
  "_id": ObjectId("..."),
  "postId": "viral-post-123",
  "title": "10 MongoDB Tips",
  "likes": [ObjectId("u1"), ObjectId("u2"), /* ... up to 1000 */],
  "hasExtraLikes": true  // flag indicating overflow documents exist
}
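A sketch of how a write path might honour that flag, in plain JavaScript with in-memory objects. The threshold of 1000 comes from the comment in the document above; the `overflow` array standing in for a separate overflow collection, the shape of its entries, and the helper name `addLike` are assumptions.

```javascript
const MAX_EMBEDDED_LIKES = 1000; // threshold taken from the "up to 1000" comment

// Hypothetical helper: embed the like while there is room, otherwise spill over.
function addLike(post, overflow, userId) {
  if (post.likes.length < MAX_EMBEDDED_LIKES) {
    post.likes.push(userId);
  } else {
    overflow.push({ postId: post.postId, userId }); // stands in for an overflow document
    post.hasExtraLikes = true;                      // tells readers to check overflow too
  }
}

const post = {
  postId: "viral-post-123",
  title: "10 MongoDB Tips",
  likes: [],
  hasExtraLikes: false,
};
const overflow = [];
for (let i = 1; i <= 1002; i++) addLike(post, overflow, `u${i}`);

console.log(post.likes.length, overflow.length, post.hasExtraLikes); // 1000 2 true
```

Typical posts never set the flag, so readers only pay for the extra overflow query on the rare outlier documents.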