Aggregation Pipeline

Written by MacHamza Kargin

TL;DR

The MongoDB aggregation pipeline transforms data server-side: filter, group, unwind, join, and more. This guide focuses on practical stages with real-world snippets using the native MongoDB Node.js driver (pure MongoDB, no Mongoose). Master these patterns for analytics, reports, and complex queries without ODM overhead.

Preface

The aggregation pipeline is where MongoDB truly flexes, turning raw documents into insights with chained stages. Whether it's filtering sales data, totaling by category, handling arrays, or simulating joins, these patterns cover most everyday needs. We'll use the native MongoDB driver in Node.js (no Mongoose) for direct, lightweight control. Examples keep an e-commerce vibe. Let's dive in!

Introduction: What is Aggregation Pipeline?

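An aggregation pipeline is a series of stages that process documents sequentially: input → stages → output. Each stage receives the output of the previous one, and the whole pipeline runs inside the database, so only the final result crosses the network. Why it shines: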

  • Complex transformations without multiple queries
  • Handles nested/array data naturally
  • Perfect for dashboards, recommendations, reporting

MongoDB vs SQL: Quick Comparison

| Feature | MongoDB Aggregation | SQL Equivalent |
|---|---|---|
| Filtering | ✅ $match | ✅ WHERE |
| Grouping & Aggregations | ✅ $group + operators | ✅ GROUP BY + aggregates |
| Shaping Output | ✅ $project/$addFields | ✅ SELECT |
| Sorting & Limiting | ✅ $sort/$limit | ✅ ORDER BY/LIMIT |
| Joins | ✅ $lookup | ✅ JOIN |
| Array Handling | ✅ $unwind | ⚠️ Dialect-specific (e.g., UNNEST in PostgreSQL) |
| Multi-step Transformations | ✅ Pipeline stages | ⚠️ CTEs or subqueries |

Pipeline wins big on flexible, nested data.
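
To make the mapping in the table concrete, here is a minimal sketch of one pipeline with a roughly equivalent SQL clause noted beside each stage. The helper name topCategoriesPipeline and the field names (status, category, sales) are assumptions for illustration, and the SQL comments are approximations rather than exact translations.

```javascript
// Hypothetical helper: builds a pipeline that ranks categories by total sales.
// Field names (status, category, sales) are assumed for illustration.
function topCategoriesPipeline(limit = 5) {
  return [
    // WHERE status = 'active'
    { $match: { status: "active" } },
    // SELECT category, SUM(sales) AS totalSales ... GROUP BY category
    { $group: { _id: "$category", totalSales: { $sum: "$sales" } } },
    // ORDER BY totalSales DESC
    { $sort: { totalSales: -1 } },
    // LIMIT <limit>
    { $limit: limit },
  ];
}

// Usage (assumes a connected `db` handle from the native driver):
// const top = await db.collection("products").aggregate(topCategoriesPipeline(3)).toArray();
```

Because a pipeline is just an array of plain objects, it can be built, inspected, and unit-tested without a database connection.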

Setup for Aggregation (Native MongoDB Node.js Driver)

Terminal
npm install mongodb

Connect:

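JavaScript
import { MongoClient } from "mongodb";

// Fall back to a local instance when MONGODB_URI is not set.
const client = new MongoClient(
  process.env.MONGODB_URI || "mongodb://localhost:27017",
);
// Top-level await requires an ES module (a .mjs file or "type": "module").
await client.connect();
const db = client.db("myapp");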

We'll use collections directly (assume products, orders, and users collections exist).
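
The connection snippet never closes the client, which is fine for a long-running server but can keep a one-off script from exiting. One possible approach, sketched below, wraps the work in try/finally. The helper name withDatabase is hypothetical; it only relies on connect(), db(name), and close(), which MongoClient provides.

```javascript
// Hypothetical helper: runs `work(db)` and always closes the client afterwards.
// `client` must expose connect(), db(name), and close(), as MongoClient does.
async function withDatabase(client, dbName, work) {
  await client.connect();
  try {
    return await work(client.db(dbName));
  } finally {
    // Runs on success and on error, so the connection is not leaked.
    await client.close();
  }
}

// Usage (assumes `client` is a MongoClient instance):
// const activeCount = await withDatabase(client, "myapp", (db) =>
//   db.collection("products").countDocuments({ status: "active" }),
// );
```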

Aggregation Pipelines

Filter Data ($match)

Filter documents as early as possible so later stages process less data; a $match at the start of the pipeline can also use indexes.

JavaScript
// Keep only active products priced at 50 or more.
const results = await db
  .collection("products")
  .aggregate([{ $match: { status: "active", price: { $gte: 50 } } }])
  .toArray();
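
In practice the filter values often come from user input, and not every criterion is always present. The following is a minimal sketch of building the $match stage from optional filters; the helper name buildProductMatch and its parameter names are assumptions for illustration.

```javascript
// Hypothetical helper: builds a $match stage from optional filters.
// Only the criteria that are actually provided end up in the query.
function buildProductMatch({ status, minPrice, maxPrice } = {}) {
  const query = {};
  if (status !== undefined) query.status = status;

  const price = {};
  if (minPrice !== undefined) price.$gte = minPrice;
  if (maxPrice !== undefined) price.$lte = maxPrice;
  if (Object.keys(price).length > 0) query.price = price;

  return { $match: query };
}

// Usage (assumes a connected `db` handle):
// const stage = buildProductMatch({ status: "active", minPrice: 50 });
// const results = await db.collection("products").aggregate([stage]).toArray();
```

Called with no filters, the helper returns an empty $match, which lets every document through.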

Group and Total Data ($group + operators)

Aggregate totals, averages, counts by key.

JavaScript
const results = await db
  .collection("products")
  .aggregate([
    { $match: { status: "active" } },
    {
      $group: {
        _id: "$category", // one output document per category
        totalSales: { $sum: "$sales" },
        avgPrice: { $avg: "$price" },
        productCount: { $sum: 1 }, // adds 1 per document, i.e. a count
      },
    },
  ])
  .toArray();
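
To see what the $group stage computes, here is a plain-JavaScript sketch that mirrors its result shape on an in-memory array. It is an illustration only, not how MongoDB implements grouping; the function name groupByCategory and the sample data are made up, and sales is assumed to be numeric when present.

```javascript
// Plain-JavaScript illustration of the $group stage above (not MongoDB's implementation).
function groupByCategory(products) {
  const groups = new Map();
  for (const { category, sales = 0, price } of products) {
    const g = groups.get(category) ?? {
      _id: category,
      totalSales: 0,
      productCount: 0,
      priceSum: 0,
      priceCount: 0,
    };
    g.totalSales += sales; // $sum: "$sales"
    g.productCount += 1; // $sum: 1
    if (typeof price === "number") {
      // $avg ignores missing or non-numeric values
      g.priceSum += price;
      g.priceCount += 1;
    }
    groups.set(category, g);
  }
  // Drop the helper accumulators and derive the average.
  return [...groups.values()].map(({ priceSum, priceCount, ...rest }) => ({
    ...rest,
    avgPrice: priceCount > 0 ? priceSum / priceCount : null,
  }));
}

const sample = [
  { category: "books", price: 10, sales: 5 },
  { category: "books", price: 30, sales: 15 },
  { category: "toys", price: 20, sales: 7 },
];
console.log(groupByCategory(sample));
// books: totalSales 20, productCount 2, avgPrice 20
// toys:  totalSales 7,  productCount 1, avgPrice 20
```

Unlike this sketch, MongoDB does not guarantee the order of $group output, so add a $sort stage when order matters.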

Unwind Arrays and Group Data ($unwind + $group)

Deconstruct arrays, then aggregate (e.g., tags analytics).

JavaScript
const results = await db
  .collection("products")
  .aggregate([
    { $unwind: "$tags" }, // one document per tag
    {
      $group: {
        _id: "$tags",
        productsWithTag: { $sum: 1 },
        totalSales: { $sum: "$sales" },
      },
    },
    { $sort: { totalSales: -1 } }, // highest-selling tags first
  ])
  .toArray();
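
The effect of $unwind is easiest to see on a small in-memory array. The sketch below mirrors the stage's default behavior in plain JavaScript: one output document per array element, and documents whose tags field is missing, null, or an empty array are dropped. The function name unwindTags and the sample data are hypothetical.

```javascript
// Plain-JavaScript illustration of { $unwind: "$tags" } with default options.
function unwindTags(products) {
  return products.flatMap((doc) => {
    if (doc.tags === undefined || doc.tags === null) return []; // missing or null: dropped
    if (!Array.isArray(doc.tags)) return [doc]; // non-array value: passed through unchanged
    return doc.tags.map((tag) => ({ ...doc, tags: tag })); // empty array yields no documents
  });
}

const sample = [
  { _id: 1, sales: 10, tags: ["sale", "new"] },
  { _id: 2, sales: 4, tags: [] },
  { _id: 3, sales: 8 },
];
console.log(unwindTags(sample));
// [ { _id: 1, sales: 10, tags: "sale" }, { _id: 1, sales: 10, tags: "new" } ]
```

Note that product 1 now appears twice, so its sales count toward both tags. Per-tag totals therefore add up to more than overall sales whenever products carry several tags.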

Perform One-to-One Joins ($lookup basic)

Attach the referenced user document to each order via a foreign key, similar to Mongoose's populate.

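JavaScript
const results = await db
  .collection("orders")
  .aggregate([
    {
      $lookup: {
        from: "users", // collection to join
        localField: "userId", // field on orders
        foreignField: "_id", // field on users
        as: "userDetails", // output array of matches
      },
    },
    // Turn the one-element array into an object; keep orders with no matching user.
    { $unwind: { path: "$userDetails", preserveNullAndEmptyArrays: true } },
  ])
  .toArray();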
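
The $lookup plus $unwind combination above behaves like a left outer join. The plain-JavaScript sketch below shows the resulting document shape on in-memory data. It is an illustration under simplifying assumptions (string ids, unique user _id values), and the function name joinOrdersWithUsers is hypothetical.

```javascript
// Plain-JavaScript illustration of $lookup + $unwind (preserveNullAndEmptyArrays: true).
// Assumes user _id values are unique, so each order matches at most one user.
function joinOrdersWithUsers(orders, users) {
  const usersById = new Map(users.map((u) => [String(u._id), u]));
  return orders.map((order) => {
    const user = usersById.get(String(order.userId));
    // Unmatched orders are kept and simply have no userDetails field.
    return user ? { ...order, userDetails: user } : { ...order };
  });
}

const orders = [
  { _id: "o1", userId: "u1", total: 40 },
  { _id: "o2", userId: "u9", total: 15 },
];
const users = [{ _id: "u1", name: "Ada" }];
console.log(joinOrdersWithUsers(orders, users));
// o1 gains userDetails: { _id: "u1", name: "Ada" }; o2 is kept without userDetails
```

Without preserveNullAndEmptyArrays, order o2 would be dropped, which turns the result into an inner join.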

Perform Multi-Field Joins ($lookup with pipeline/let)

Join on multiple or computed fields (e.g., order items with product details).

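JavaScript
const results = await db
  .collection("orders")
  .aggregate([
    { $unwind: "$products" }, // one document per order line
    {
      $lookup: {
        from: "products",
        // Expose the order's field to the sub-pipeline as $$productId.
        let: { productId: "$products.productId" },
        pipeline: [
          { $match: { $expr: { $eq: ["$_id", "$$productId"] } } }, // join condition
          { $match: { status: "active" } }, // extra filter on the joined side
        ],
        as: "productInfo",
      },
    },
    { $unwind: "$productInfo" }, // also drops lines with no active product
  ])
  .toArray();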
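
The pipeline above returns one document per order line. A common follow-up, which is an addition here rather than part of the original pipeline, is to regroup the lines into one document per order. The sketch below assumes orders carry a userId field; the helper name and the output field names items, line, and product are hypothetical.

```javascript
// Hypothetical helper: same join as above, then one document per order again.
// Assumes orders have a userId field; items/line/product are made-up output names.
function ordersWithActiveProductsPipeline() {
  return [
    { $unwind: "$products" },
    {
      $lookup: {
        from: "products",
        let: { productId: "$products.productId" },
        pipeline: [
          // Join condition and status filter combined in a single $match.
          { $match: { status: "active", $expr: { $eq: ["$_id", "$$productId"] } } },
        ],
        as: "productInfo",
      },
    },
    { $unwind: "$productInfo" },
    {
      $group: {
        _id: "$_id", // the order's _id
        userId: { $first: "$userId" },
        items: { $push: { line: "$products", product: "$productInfo" } },
      },
    },
  ];
}

// Usage (assumes a connected `db` handle):
// const orders = await db.collection("orders")
//   .aggregate(ordersWithActiveProductsPipeline())
//   .toArray();
```

Orders whose lines all reference inactive products disappear from the result, because the second $unwind drops those lines before regrouping.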

Key Stages Cheat Sheet

| Stage | Purpose | Common With |
|---|---|---|
| $match | Filter early | Place as early as possible |
| $group | Aggregate by key | $sum, $avg, $push |
| $unwind | Flatten arrays | Arrays/tags/embedded docs |
| $lookup | Join collections | One-to-many or computed |
| $project | Reshape/select fields | Cleanup output |
| $sort/$limit | Order & paginate | Final steps |
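
As an example of the final "order & paginate" steps, here is a minimal sketch of a helper that produces the closing stages of a paginated pipeline. The helper name paginationStages is hypothetical. It uses $skip alongside $sort and $limit, and adds _id as a tie-breaker so documents with equal sort values keep a stable order across pages.

```javascript
// Hypothetical helper: returns the closing stages for page-based pagination.
function paginationStages({ sortField, page = 1, pageSize = 20 }) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be a positive integer");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be a positive integer");
  }
  return [
    // _id breaks ties so equal sort values do not shuffle between pages.
    { $sort: { [sortField]: -1, _id: 1 } },
    { $skip: (page - 1) * pageSize },
    { $limit: pageSize },
  ];
}

// Usage (assumes a connected `db` handle; name/price/sales are assumed fields):
// const pipeline = [
//   { $match: { status: "active" } },
//   { $project: { name: 1, price: 1, sales: 1 } },
//   ...paginationStages({ sortField: "sales", page: 2, pageSize: 10 }),
// ];
// const pageTwo = await db.collection("products").aggregate(pipeline).toArray();
```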

Advanced Tips

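  • Index the fields that early $match and $sort stages filter and order on (indexes mainly help at the start of a pipeline): await db.collection('products').createIndex({ status: 1, category: 1 })
  • Use .explain('executionStats') on the aggregation cursor to check index use and the number of documents examined
  • Combine with $facet to compute several result sets in one pass
  • Watch memory: a large $group or $sort can hit the per-stage memory limit (100 MB); pass { allowDiskUse: true } if needed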
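
The $facet, allowDiskUse, and explain tips can be combined as in the sketch below. The helper name catalogOverviewPipeline and the facet names (byCategory, topSellers, totals) are assumptions for illustration; the usage comments show where the options would go when calling the driver.

```javascript
// Hypothetical helper: one pass over active products, three result sets.
function catalogOverviewPipeline() {
  return [
    { $match: { status: "active" } },
    {
      $facet: {
        // Product count per category, largest first.
        byCategory: [
          { $group: { _id: "$category", productCount: { $sum: 1 } } },
          { $sort: { productCount: -1 } },
        ],
        // Five products with the highest sales.
        topSellers: [{ $sort: { sales: -1 } }, { $limit: 5 }],
        // Single document holding the number of active products.
        totals: [{ $count: "activeProducts" }],
      },
    },
  ];
}

// Usage (assumes a connected `db` handle):
// const products = db.collection("products");
// const [overview] = await products
//   .aggregate(catalogOverviewPipeline(), { allowDiskUse: true })
//   .toArray();
// const plan = await products
//   .aggregate(catalogOverviewPipeline())
//   .explain("executionStats");
```

$facet returns a single document whose fields hold the arrays produced by each sub-pipeline, which is why the usage destructures the first element.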

Conclusions

These patterns (filter, group, unwind, and join) cover most aggregation needs with pure MongoDB. Going native gives you full control with no extra abstraction layer between your code and the database. Chain them creatively and you'll handle complex data flows entirely in the DB. Practice on real data and it will click quickly. Got a tricky scenario? Share it and let's pipeline it!
