Tutorials Logic, IN info@tutorialslogic.com
Navigation
Home About Us Contact Us Blogs FAQs
Tutorials
All Tutorials
Services
Academic Projects Resume Writing Website Development
Practice
Quiz Challenge Interview Questions Certification Practice
Tools
Online Compiler JSON Formatter Regex Tester CSS Unit Converter Color Picker
Compiler Tools

MongoDB Aggregation Pipeline — $match, $group, $lookup

The Aggregation Pipeline

The aggregation pipeline is MongoDB's powerful data processing framework. Documents pass through a sequence of stages, where each stage transforms the data. It is the preferred way to perform complex data analysis, grouping, joining, and reshaping operations.

$match, $group, $project, $sort, $limit, $skip
// Count orders by status and sort by count descending
db.orders.aggregate([
  // Stage 1: filter only completed or shipped orders
  { $match: { status: { $in: ["completed", "shipped"] } } },

  // Stage 2: group by status and compute totals
  { $group: {
      _id: "$status",
      count: { $sum: 1 },
      totalRevenue: { $sum: "$total" },
      avgOrderValue: { $avg: "$total" }
  }},

  // Stage 3: reshape the output document
  { $project: {
      _id: 0,
      status: "$_id",
      count: 1,
      totalRevenue: { $round: ["$totalRevenue", 2] },
      avgOrderValue: { $round: ["$avgOrderValue", 2] }
  }},

  // Stage 4: sort by count descending
  { $sort: { count: -1 } },

  // Stage 5: limit to top 5
  { $limit: 5 }
])

$unwind and $lookup (Join)

$unwind and $lookup for Joins
// $unwind - deconstruct an array field into separate documents
db.orders.aggregate([
  { $unwind: "$items" },
  { $group: {
      _id: "$items.productId",
      totalQtySold: { $sum: "$items.qty" }
  }},
  { $sort: { totalQtySold: -1 } }
])

// $lookup - left outer join with another collection
db.orders.aggregate([
  { $lookup: {
      from: "users",           // collection to join
      localField: "userId",    // field in orders
      foreignField: "_id",     // field in users
      as: "customer"           // output array field name
  }},
  // $unwind to flatten the joined array (since it's a 1-to-1 join)
  { $unwind: "$customer" },
  { $project: {
      orderId: "$_id",
      customerName: "$customer.name",
      customerEmail: "$customer.email",
      total: 1,
      status: 1
  }}
])

$addFields, $count, and $facet

$addFields, $count, $facet
// $addFields - add computed fields without removing existing ones
db.products.aggregate([
  { $addFields: {
      discountedPrice: { $multiply: ["$price", 0.9] },
      fullName: { $concat: ["$brand", " ", "$name"] }
  }}
])

// $count - count documents passing through the pipeline
db.users.aggregate([
  { $match: { active: true } },
  { $count: "activeUsers" }
])
// Output: { activeUsers: 142 }

// $facet - run multiple sub-pipelines in parallel (useful for dashboards)
db.products.aggregate([
  { $facet: {
      "byCategory": [
        { $group: { _id: "$category", count: { $sum: 1 } } }
      ],
      "priceStats": [
        { $group: { _id: null, avgPrice: { $avg: "$price" }, maxPrice: { $max: "$price" } } }
      ],
      "topRated": [
        { $sort: { rating: -1 } },
        { $limit: 5 },
        { $project: { name: 1, rating: 1 } }
      ]
  }}
])

Common Aggregation Expressions

Aggregation Expressions: $sum, $avg, $concat, $dateToString
db.orders.aggregate([
  { $group: {
      _id: { $dateToString: { format: "%Y-%m", date: "$createdAt" } },
      totalOrders: { $sum: 1 },
      totalRevenue: { $sum: "$total" },
      avgRevenue: { $avg: "$total" },
      minOrder: { $min: "$total" },
      maxOrder: { $max: "$total" },
      allStatuses: { $addToSet: "$status" }
  }},
  { $sort: { _id: 1 } }
])

// String expressions
db.users.aggregate([
  { $project: {
      fullName: { $concat: ["$firstName", " ", "$lastName"] },
      upperEmail: { $toUpper: "$email" },
      nameLength: { $strLenCP: "$name" },
      // Conditional expression
      ageGroup: {
        $cond: {
          if: { $gte: ["$age", 18] },
          then: "adult",
          else: "minor"
        }
      }
  }}
])
Aggregation Pipeline Stages Reference
// Common pipeline stages:
// $match       - filter documents (like find() filter)
// $group       - group by field and compute aggregates
// $project     - reshape documents (include/exclude/compute fields)
// $sort        - sort documents
// $limit       - limit number of documents
// $skip        - skip N documents
// $unwind      - deconstruct array into separate documents
// $lookup      - left outer join with another collection
// $addFields   - add new computed fields
// $count       - count documents
// $facet       - multiple sub-pipelines in parallel
// $bucket      - categorize into buckets
// $bucketAuto  - auto-determine bucket boundaries
// $replaceRoot - replace root document with a sub-document
// $merge       - write pipeline results to a collection
// $out         - write results to a new collection
// $sample      - randomly select N documents
Key Takeaways
  • The aggregation pipeline processes documents through sequential stages - each stage transforms the data.
  • $match should always be the first stage to filter documents early and use indexes.
  • $group groups documents by a field and computes aggregates like $sum, $avg, $min, $max, $count.
  • $lookup performs a left outer join with another collection - similar to SQL JOIN.
  • $project reshapes documents - include/exclude fields, add computed fields, rename fields.
  • $unwind deconstructs an array field into separate documents - one document per array element.
  • Use $limit and $skip for pagination; always pair with $sort for consistent results.
Common Mistakes to Avoid
WRONG $group before $match
RIGHT $match before $group
Always filter with $match first to reduce the number of documents processed by subsequent stages. This also allows index usage.
WRONG $lookup without indexes on localField/foreignField
RIGHT Create indexes on both join fields before using $lookup
$lookup without indexes performs a full collection scan on the foreign collection - extremely slow on large datasets.
WRONG $unwind on a field that may not exist
RIGHT $unwind: { path: "$tags", preserveNullAndEmptyArrays: true }
Without preserveNullAndEmptyArrays, documents missing the field are silently dropped from results.

Frequently Asked Questions

Ready to Level Up Your Skills?

Explore 500+ free tutorials across 20+ languages and frameworks.