TIL — MongoDB Sorting and Aggregation Explained

When working with MongoDB, you may often need to sort or aggregate data to draw insights or perform operations on it.

Aggregation Pipelines in MongoDB

Aggregation is a powerful feature in MongoDB, allowing you to perform complex data transformations and analytics by passing documents through a series of stages. Each stage processes the input documents and generates output documents, which are then passed to the next stage in the pipeline.

The following are some common aggregation stages:

  1. $match: Filters the documents based on a specified condition.
  2. $group: Groups documents based on a specified expression.
  3. $sort: Sorts documents based on a specified field.
  4. $project: Selects specific fields from the documents.
  5. $limit: Limits the number of documents passed to the next stage.

Consider a MongoDB collection called “students” with the following documents:

[
{ "_id": 1, "name": "Alice", "age": 20 },
{ "_id": 2, "name": "Bob", "age": 22 },
{ "_id": 3, "name": "Carol", "age": 19 }
]

Let’s say we want to find the average age of students whose age is greater than or equal to 20. We would use the following code:

(async () => {
try {
await client.connect();
const db = client.db('test');
const students = db.collection('students');

const pipeline = [
{ $match: { age: { $gte: 20 } } },
{ $group: { _id: null, averageAge: { $avg: '$age' } } }
];

const result = await students.aggregate(pipeline).toArray();
console.log(result);
} catch (err) {
console.error(err);
} finally {
await client.close();
}
})();

Sorting in MongoDB

Sorting is a way to arrange documents in a specific order based on one or more fields. MongoDB provides the sort() method to perform sorting operations. The sort() method accepts an object containing the field(s) to be sorted and the order (1 for ascending and -1 for descending).

To sort the documents by age in ascending order, we would use the following code:

(async () => {
try {
await client.connect();
const db = client.db('test');
const students = db.collection('students');
const result = await students.find().sort({ age: 1 }).toArray();
console.log(result);
} catch (err) {
console.error(err);
} finally {
await client.close();
}
})();

MongoDB allows you to sort documents by multiple fields, meaning you can arrange the documents based on a primary field and, in case of ties, use secondary fields to break the tie. To sort by multiple fields, simply pass an object to the sort() method containing the field(s) and their corresponding sort order (1 for ascending and -1 for descending).

In the following example, we sort the students by ascending age first, if there is a tie, the students will then be sorted by name ascending.

const result = await students.find().sort({ age: 1, name: 1 }).toArray();

Sorting Strings

When sorting strings in ascending order, MongoDB uses lexicographic (dictionary) ordering based on the Unicode code points of the characters. In this ordering, strings are arranged according to their characters’ Unicode values, starting with the first character of each string and proceeding to the next character(s) when there is a tie.

Here’s an example to illustrate the concept of ascending order for strings:

Consider the following list of strings:

["banana", "apple", "grape", "cherry", "kiwi"]

Sorting this list in ascending order will result in the following arrangement:

["apple", "banana", "cherry", "grape", "kiwi"]

Keep in mind that, in lexicographic ordering, uppercase characters have lower Unicode code points than lowercase characters. Therefore, when sorting strings containing both uppercase and lowercase characters, uppercase characters will appear before lowercase characters. For example:

["Banana", "apple", "Grape", "cherry", "Kiwi"]

Sorting this list in ascending order will result in:

["Banana", "Grape", "Kiwi", "apple", "cherry"]

In real-world applications, if you need to perform case-insensitive sorting, you may need to create a case-insensitive index or apply case-insensitive collation to your query.