Example of Removing Duplicates with $unionWith

To see how to remove duplicates using $unionWith in MongoDB, we need some sample documents to query. The examples below use two collections, users and collection2. The users collection contains the following documents:

[
{
"_id": ObjectId("60f3727c81c1b4e14f252d12"),
"name": "Alice",
"email": "alice@example.com",
"age": 30
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d13"),
"name": "Bob",
"email": "bob@example.com",
"age": 35
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d14"),
"name": "Charlie",
"email": "charlie@example.com",
"age": 40
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d15"),
"name": "David",
"email": "david@example.com",
"age": 45
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d16"),
"name": "Eve",
"email": "eve@example.com",
"age": 50
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d17"),
"name": "Frank",
"email": "frank@example.com",
"age": 55
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d18"),
"name": "Alice",
"email": "alice@example.com",
"age": 60
}
]

collection2:

[
{
"_id": ObjectId("60f3727c81c1b4e14f252d19"),
"name": "Alice",
"email": "alice@example.com",
"age": 65
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d20"),
"name": "Bob",
"email": "bob@example.com",
"age": 70
}
]

Example 1: Remove duplicates based on the “name” field

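db.users.aggregate([
  // Append every document from collection2 to the stream (duplicates included)
  { $unionWith: { coll: "collection2" } },
  {
    // One group per distinct name; keep the first full document seen
    $group: {
      _id: "$name",
      doc: { $first: "$$ROOT" }
    }
  },
  // Promote the retained document back to the top level
  { $replaceRoot: { newRoot: "$doc" } }
])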

Output:

[
{
"_id": ObjectId("60f3727c81c1b4e14f252d12"),
"name": "Alice",
"email": "alice@example.com",
"age": 30
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d13"),
"name": "Bob",
"email": "bob@example.com",
"age": 35
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d14"),
"name": "Charlie",
"email": "charlie@example.com",
"age": 40
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d15"),
"name": "David",
"email": "david@example.com",
"age": 45
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d16"),
"name": "Eve",
"email": "eve@example.com",
"age": 50
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d17"),
"name": "Frank",
"email": "frank@example.com",
"age": 55
}
]

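Explanation: This pipeline appends the documents from collection2 to those from users ($unionWith keeps duplicates), groups the combined stream by the “name” field, and keeps the first document it encounters for each name. The $replaceRoot stage then promotes that retained document to the top level, so the result contains one document per distinct name. Two caveats apply: without a preceding $sort stage, which document counts as “first” is not guaranteed, and $group does not guarantee the order of its output, so the results may appear in a different order than shown. The pipeline also only returns a deduplicated result; it does not delete anything from either collection.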
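To make the three stages concrete, here is a minimal in-memory sketch in plain JavaScript. It does not call MongoDB at all: plain arrays stand in for the two collections (only a subset of the sample documents, without _id), and keepFirstPerKey is a hypothetical helper that imitates $group with $first followed by $replaceRoot.

```javascript
// In-memory illustration only: plain arrays stand in for the two collections.
const users = [
  { name: "Alice", email: "alice@example.com", age: 30 },
  { name: "Bob", email: "bob@example.com", age: 35 },
  { name: "Alice", email: "alice@example.com", age: 60 },
];
const collection2 = [
  { name: "Alice", email: "alice@example.com", age: 65 },
  { name: "Bob", email: "bob@example.com", age: 70 },
];

// Hypothetical helper: imitates $group with $first, then $replaceRoot.
function keepFirstPerKey(docs, field) {
  const firstSeen = new Map();
  for (const doc of docs) {
    // Only the first document for each key value is stored.
    if (!firstSeen.has(doc[field])) {
      firstSeen.set(doc[field], doc);
    }
  }
  return [...firstSeen.values()];
}

// $unionWith behaves like concatenation: duplicates are still present here.
const combined = [...users, ...collection2];
const deduped = keepFirstPerKey(combined, "name");

console.log(combined.length); // 5
console.log(deduped.map((d) => `${d.name}:${d.age}`)); // [ 'Alice:30', 'Bob:35' ]
```

Unlike this sketch, which always processes the arrays in order, the real pipeline only has a well-defined “first” document when a $sort stage precedes $group.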

Example 2: Remove duplicates based on the “email” field

To remove duplicates based on the “email” field, change the _id expression in the $group stage to "$email":

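db.users.aggregate([
  // Append every document from collection2 to the stream (duplicates included)
  { $unionWith: { coll: "collection2" } },
  {
    // One group per distinct email; keep the first full document seen
    $group: {
      _id: "$email",
      doc: { $first: "$$ROOT" }
    }
  },
  // Promote the retained document back to the top level
  { $replaceRoot: { newRoot: "$doc" } }
])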

Output:

[
{
"_id": ObjectId("60f3727c81c1b4e14f252d12"),
"name": "Alice",
"email": "alice@example.com",
"age": 30
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d13"),
"name": "Bob",
"email": "bob@example.com",
"age": 35
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d14"),
"name": "Charlie",
"email": "charlie@example.com",
"age": 40
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d15"),
"name": "David",
"email": "david@example.com",
"age": 45
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d16"),
"name": "Eve",
"email": "eve@example.com",
"age": 50
},
{
"_id": ObjectId("60f3727c81c1b4e14f252d17"),
"name": "Frank",
"email": "frank@example.com",
"age": 55
}
]

Explanation: This pipeline is identical to Example 1 except that it groups by the “email” field, so it retains the first document encountered for each email address. Because every duplicate in the sample data shares both its name and its email, the output matches Example 1.
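Since the two examples differ only in the grouping key, the pipeline can be generated from the field name. The sketch below is one way to do that; buildDedupPipeline is a hypothetical helper, not a MongoDB function, and the added $sort on _id is an assumption made here so that “first” is well defined.

```javascript
// Hypothetical helper (not part of MongoDB): builds the deduplication
// pipeline for any grouping field and secondary collection.
function buildDedupPipeline(field, otherCollection) {
  return [
    // Append the other collection's documents to the stream.
    { $unionWith: { coll: otherCollection } },
    // Assumption: order by _id so that $first picks a predictable document.
    { $sort: { _id: 1 } },
    // Group on the chosen field and keep the first full document per group.
    { $group: { _id: "$" + field, doc: { $first: "$$ROOT" } } },
    // Promote the retained document back to the top level.
    { $replaceRoot: { newRoot: "$doc" } },
  ];
}

const byEmail = buildDedupPipeline("email", "collection2");
console.log(JSON.stringify(byEmail, null, 2));

// In mongosh, the result would be passed straight to aggregate():
//   db.users.aggregate(buildDedupPipeline("email", "collection2"))
```

Sorting by a different field (for example age) would change which duplicate is kept, so the sort key is worth choosing deliberately.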

How to Remove Duplicates by using $unionWith in MongoDB?

Duplicate documents in a MongoDB collection can often lead to inefficiencies and inconsistencies in data management. However, MongoDB provides powerful aggregation features to help us solve such issues effectively.

In this article, we’ll explore how to remove duplicates using the $unionWith aggregation stage in MongoDB. We’ll cover the concepts, syntax, and practical examples to demonstrate its usage and effectiveness.

Understanding $unionWith

The $unionWith aggregation stage in MongoDB is used to combine documents from multiple collections or aggregation pipelines into a single stream of documents. It allows us to merge the results of different data sources which can be useful for various data processing tasks, including removing duplicates....
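As a brief illustration of the “collections or aggregation pipelines” wording: besides coll, $unionWith accepts an optional pipeline field that is applied to the other collection's documents before they are appended. The sketch below only builds the stage as a plain object; the $project stage inside it is an assumption chosen for illustration. Note that $unionWith itself never removes duplicates, so a later stage has to do that.

```javascript
// Sketch: pre-process collection2 before its documents join the stream.
const unionStage = {
  $unionWith: {
    coll: "collection2",
    // Illustrative choice: keep only the fields relevant to deduplication.
    pipeline: [{ $project: { _id: 0, name: 1, email: 1 } }],
  },
};

const pipeline = [unionStage];
console.log(JSON.stringify(pipeline, null, 2));

// In mongosh: db.users.aggregate(pipeline)
```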

Conclusion

Overall, we explored how to remove duplicates from a MongoDB collection using the $unionWith aggregation stage. We discussed the syntax and provided a step-by-step example to demonstrate its usage. By using the aggregation pipelines and $unionWith, MongoDB enables efficient and effective removal of duplicate documents, ensuring data integrity and consistency in your database. As you continue to work with MongoDB, mastering aggregation pipelines and their stages will prove invaluable for various data processing tasks....