
MongoDB Data Modelling — Embedded vs References

Introduction to Data Modelling

Data modelling in MongoDB is the process of deciding how to structure your documents and collections. Unlike relational databases, where the schema is fixed up front, MongoDB lets you choose between embedding related data inside a document and storing it in separate collections linked by references. The right choice depends on your access patterns, data size, and relationship cardinality (one-to-one, one-to-many, or many-to-many).

Embedded Documents vs References

The two fundamental approaches to modelling relationships in MongoDB are:

  • Embedding (Denormalization): Store related data inside the same document. Best for data that is always accessed together.
  • Referencing (Normalization): Store related data in separate collections and link them by _id. Best for large, frequently updated, or shared data.
One-to-One: Embedded vs Referenced
// ONE-TO-ONE: EMBEDDED (preferred when data is always accessed together)
{
  "_id": ObjectId("..."),
  "name": "Alice Johnson",
  "email": "alice@example.com",
  "profile": {
    "bio": "Software engineer with 8 years experience",
    "avatar": "https://cdn.example.com/avatars/alice.jpg",
    "website": "https://alice.dev"
  }
}

// ONE-TO-ONE: REFERENCED (use when profile is large or rarely needed)
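// users collection
// (ids such as "u1" and "p1" are shortened placeholders; a real ObjectId takes a 24-character hex string)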
{ "_id": ObjectId("u1"), "name": "Alice Johnson", "profileId": ObjectId("p1") }

// profiles collection
{ "_id": ObjectId("p1"), "bio": "Software engineer...", "userId": ObjectId("u1") }
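When the profile lives in its own collection, reading a user together with the profile takes either a second query or a $lookup stage. The following is a minimal sketch of the $lookup route, assuming the users and profiles collections and the profileId field from the example above. The function name userWithProfilePipeline is hypothetical and only serves the illustration.

```javascript
// Builds an aggregation pipeline that joins one user to its referenced profile.
// Assumes the `profiles` collection and the `profileId` field shown above.
function userWithProfilePipeline(userId) {
  return [
    // Select the single user we are interested in.
    { $match: { _id: userId } },
    // Pull in the matching profile document from the other collection.
    { $lookup: {
        from: "profiles",
        localField: "profileId",
        foreignField: "_id",
        as: "profile"
    } },
    // $lookup always produces an array; unwind it into a single sub-document.
    // preserveNullAndEmptyArrays keeps users that have no profile yet.
    { $unwind: { path: "$profile", preserveNullAndEmptyArrays: true } }
  ];
}
```

In mongosh this could be run as db.users.aggregate(userWithProfilePipeline(someUserId)), where someUserId is the user's real ObjectId. The result has the same shape as the embedded version, at the cost of a join on every read.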
One-to-Many: Embedded vs Referenced
// ONE-TO-MANY: EMBEDDED (good for small, bounded arrays like addresses)
{
  "_id": ObjectId("u1"),
  "name": "Alice Johnson",
  "addresses": [
    { "type": "home", "street": "123 Main St", "city": "New York" },
    { "type": "work", "street": "456 Park Ave", "city": "New York" }
  ]
}

// ONE-TO-MANY: REFERENCED (better for large or unbounded sets like orders)
// users collection
{ "_id": ObjectId("u1"), "name": "Alice Johnson" }

// orders collection - each order references the user
{ "_id": ObjectId("o1"), "userId": ObjectId("u1"), "total": 99.99, "status": "shipped" }
{ "_id": ObjectId("o2"), "userId": ObjectId("u1"), "total": 45.00, "status": "pending" }

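// Query all orders for a user
// (an index on userId lets this query avoid a full collection scan)
db.orders.createIndex({ userId: 1 })
db.orders.find({ userId: ObjectId("u1") })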
Many-to-Many: Array of References
// MANY-TO-MANY: Students and Courses
// students collection
{
  "_id": ObjectId("s1"),
  "name": "Bob Smith",
  "enrolledCourses": [ObjectId("c1"), ObjectId("c2"), ObjectId("c3")]
}

// courses collection
{
  "_id": ObjectId("c1"),
  "title": "MongoDB Fundamentals",
  "enrolledStudents": [ObjectId("s1"), ObjectId("s2")]
}

// Find all courses a student is enrolled in
db.courses.find({ _id: { $in: [ObjectId("c1"), ObjectId("c2"), ObjectId("c3")] } })

// Or use $lookup aggregation for a join
db.students.aggregate([
  { $match: { _id: ObjectId("s1") } },
  { $lookup: {
      from: "courses",
      localField: "enrolledCourses",
      foreignField: "_id",
      as: "courses"
  }}
])
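Because the example above stores the relationship on both sides, every enrolment has to update two documents. The sketch below shows one way to build those two updates, assuming the enrolledCourses and enrolledStudents fields from the example. The function name buildEnrollmentUpdates and the shape of its return value are hypothetical.

```javascript
// Builds the two updates needed to enrol a student in a course when the
// relationship is stored on both sides (two-way referencing).
function buildEnrollmentUpdates(studentId, courseId) {
  return {
    student: {
      filter: { _id: studentId },
      // $addToSet avoids duplicate ids if the enrolment is applied twice.
      update: { $addToSet: { enrolledCourses: courseId } }
    },
    course: {
      filter: { _id: courseId },
      update: { $addToSet: { enrolledStudents: studentId } }
    }
  };
}
```

In mongosh the result could be applied with db.students.updateOne(u.student.filter, u.student.update) followed by db.courses.updateOne(u.course.filter, u.course.update). The two writes are separate operations, so if they must succeed or fail together they need to run inside a multi-document transaction. Storing the array on one side only is a simpler alternative when you query in just one direction.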

When to Embed vs Reference

| Factor | Embed | Reference |
| --- | --- | --- |
| Access pattern | Data always accessed together | Data accessed independently |
| Data size | Small, bounded sub-documents | Large or unbounded arrays |
| Update frequency | Updated together with parent | Updated independently and frequently |
| Data sharing | Not shared across documents | Shared by multiple documents |
| Document size | Stays well under the 16 MB limit | Would exceed 16 MB if embedded |
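
The factors above can be read as a checklist. The sketch below encodes them as a small helper, purely as an illustration: the function name recommendModel, its input fields, and the rule that any single factor pointing to "reference" decides the outcome are all assumptions, and real designs often weigh the factors against each other.

```javascript
// MongoDB's maximum size for a single BSON document.
const MAX_BSON_BYTES = 16 * 1024 * 1024;

// Returns "embed" only when every factor favours embedding; otherwise
// returns "reference" together with the factors that tipped the decision.
function recommendModel(rel) {
  const reasons = [];
  if (!rel.accessedTogether) reasons.push("accessed independently");
  if (!rel.bounded) reasons.push("unbounded growth");
  if (rel.updatedIndependently) reasons.push("updated independently");
  if (rel.shared) reasons.push("shared by multiple documents");
  if (rel.estimatedBytes >= MAX_BSON_BYTES) reasons.push("would exceed 16 MB");
  return { model: reasons.length === 0 ? "embed" : "reference", reasons: reasons };
}

// A user's addresses: small, bounded, always read with the user.
const addresses = recommendModel({
  accessedTogether: true, bounded: true,
  updatedIndependently: false, shared: false, estimatedBytes: 2048
});

// A user's orders: grow without bound and are queried on their own.
const orders = recommendModel({
  accessedTogether: false, bounded: false,
  updatedIndependently: true, shared: false, estimatedBytes: 2048
});
```

With these inputs, addresses.model is "embed" and orders.model is "reference", which matches the one-to-many examples earlier on this page.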

Schema Design Patterns

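Several well-known schema design patterns address common data modelling challenges. Three of them are shown here: the bucket pattern groups many small records into one document, the computed pattern stores pre-calculated values so reads stay cheap, and the outlier pattern handles the few documents that outgrow the normal shape.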

Bucket, Computed, and Outlier Patterns
// BUCKET PATTERN: Group time-series data into hourly buckets
// Instead of one document per sensor reading, group them
{
  "_id": ObjectId("..."),
  "sensorId": "sensor_42",
  "date": ISODate("2024-06-01T10:00:00Z"),
  "readings": [
    { "ts": ISODate("2024-06-01T10:00:05Z"), "temp": 22.1 },
    { "ts": ISODate("2024-06-01T10:00:10Z"), "temp": 22.3 },
    { "ts": ISODate("2024-06-01T10:00:15Z"), "temp": 22.0 }
  ],
  "count": 3,
  "avgTemp": 22.13
}

// COMPUTED PATTERN: Pre-compute expensive aggregations
// Instead of computing total revenue on every read, store it
{
  "_id": ObjectId("..."),
  "productId": "LAPTOP-001",
  "name": "ProBook 15",
  "totalSales": 1250,        // pre-computed
  "totalRevenue": 1623750,   // pre-computed
  "lastUpdated": ISODate("2024-06-01T00:00:00Z")
}

// OUTLIER PATTERN: Handle documents that exceed normal array bounds
{
  "_id": ObjectId("..."),
  "postId": "viral-post-123",
  "title": "10 MongoDB Tips",
  "likes": [ObjectId("u1"), ObjectId("u2"), /* ... up to 1000 */],
  "hasExtraLikes": true   // flag indicating overflow documents exist
}
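
Returning to the bucket pattern at the top of this listing, each new sensor reading has to land in the bucket for its hour, and the bucket has to be created if it does not exist yet. A common way to do this is a single upsert. The sketch below builds the pieces of such an upsert, assuming the sensorId, date, readings, and count fields from the example. The function names and the sumTemp field are hypothetical: a running sum is stored here instead of avgTemp so that the average can be derived as sumTemp / count without re-reading the array.

```javascript
// Truncate a timestamp to the start of its UTC hour; this is the bucket key.
function hourStart(ts) {
  const d = new Date(ts.getTime());
  d.setUTCMinutes(0, 0, 0); // zero out minutes, seconds, and milliseconds
  return d;
}

// Builds the filter, update, and options for adding one reading to its bucket.
function buildBucketUpsert(sensorId, ts, temp) {
  return {
    // One bucket per sensor per hour.
    filter: { sensorId: sensorId, date: hourStart(ts) },
    update: {
      // Append the reading to the bucket's array.
      $push: { readings: { ts: ts, temp: temp } },
      // Keep the pre-computed fields in step with the array.
      $inc: { count: 1, sumTemp: temp }
    },
    // upsert creates the bucket on the first reading of the hour.
    options: { upsert: true }
  };
}
```

In mongosh the result could be applied as db.readings.updateOne(u.filter, u.update, u.options), where readings is an assumed collection name. Adding a condition such as count: { $lt: 200 } to the filter would cap the bucket size, but the bucket key would then need to distinguish several buckets within the same hour.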
