
Removing documents from a collection in MongoDB

12 Apr 2015 | CPOL | 1 min read
How to remove documents from a collection more efficiently

Introduction

This tip covers the basics of removing data from the database, using the following two methods:

1) remove

2) drop

For part #1:

To remove all the documents from a collection:

> db.task1.remove({})

This removes all the documents in the task1 collection. Note that it does not remove the collection itself: all the meta information about it still exists.

Removing every document in a collection is not that common, though. Most of the time, we remove only the documents that match specific criteria:

> db.task1.remove({"user" : "Coldsky"})

Only the documents that match the criteria are removed.

Be careful with the remove method: removed data cannot be recovered.
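Because a remove cannot be undone, it can help to look at how many documents match before deleting them. Here is a minimal sketch of that idea. The helper name removeWithPreview and its confirmed flag are hypothetical (they are not part of MongoDB), and coll is assumed to be a shell collection object such as db.task1:

```javascript
// Hypothetical helper, not a MongoDB API: preview a delete before running it.
// `coll` is assumed to be a shell collection object, e.g. db.task1.
function removeWithPreview(coll, query, confirmed) {
    // Count the documents the criteria would match.
    var matched = coll.find(query).count();
    if (!confirmed) {
        // Dry run: report the count and leave the data untouched.
        return { matched: matched, removed: false };
    }
    coll.remove(query); // irreversible
    return { matched: matched, removed: true };
}
```

In the shell, calling removeWithPreview(db.task1, {"user" : "Coldsky"}, false) would only report the match count; passing true as the last argument performs the actual remove.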

As with the insert method, let's measure the speed of remove with a small function:

> var removeTime = function() {
      // Insert one million documents in 10 batches of 100,000.
      for (var i = 0; i < 10; i++) {
          var tasks = [];
          for (var j = 0; j < 100000; j++) {
              tasks[j] = {"user" : "Coldsky", "finished" : i*100000 + j, "unfinished" : 1000000 - i*100000 - j};
          }
          db.task2.insert(tasks);
      }
      // Time the removal of all documents.
      var start = (new Date()).getTime();
      db.task2.remove({});
      db.task2.findOne(); // follow-up query, so the remove has been processed before the timer stops
      var end = (new Date()).getTime();
      var diff = end - start;
      print("Remove 1M documents took " + diff + "ms");
  }
> removeTime()
Remove 1M documents took 9763ms

First, we use bulk inserts to add one million documents to the task2 collection, then we remove them. The remove takes about 9.8 seconds, which is roughly 100,000 documents per second. If that is not fast enough, MongoDB also provides the drop method for removing data; see part #2.
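The rate quoted above follows directly from the measured time. A small sketch of the arithmetic, using only the figures from the run above (the function name docsPerSecond is just an illustrative choice):

```javascript
// Convert a document count and an elapsed time in milliseconds
// into a throughput in documents per second.
function docsPerSecond(docCount, elapsedMs) {
    if (elapsedMs <= 0) {
        // A zero reading means the timer resolution was too coarse to measure.
        return Infinity;
    }
    return docCount / (elapsedMs / 1000);
}

// 1,000,000 documents removed in 9763 ms, as measured above.
var removeRate = docsPerSecond(1000000, 9763);
console.log("remove: about " + Math.round(removeRate / 1000) + "K documents per second");
```

Note that console.log is used here so the snippet also runs outside the shell; inside the mongo shell, print works the same way.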

For part #2:

Let's also write a function to drop one million documents: 

> var dropTime = function() {
      // Insert one million documents in 10 batches of 100,000.
      for (var i = 0; i < 10; i++) {
          var tasks = [];
          for (var j = 0; j < 100000; j++) {
              tasks[j] = {"user" : "Coldsky", "finished" : i*100000 + j, "unfinished" : 1000000 - i*100000 - j};
          }
          db.task2.insert(tasks);
      }
      // Time the drop of the whole collection.
      var start = (new Date()).getTime();
      db.task2.drop();
      var end = (new Date()).getTime();
      var diff = end - start;
      print("drop one million documents took " + diff + "ms");
  }
> dropTime()
drop one million documents took 1ms

The drop is dramatically faster than the remove, but it has a shortcoming: you cannot specify any criteria, so the whole collection goes. The collection's metadata, including its indexes, is removed as well, which the following command confirms:

> show collections

The task2 collection no longer appears in the current database.
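The same check can be done in a script rather than by reading the output of show collections. A minimal sketch, assuming the shell's db.getCollectionNames() method; the helper name collectionExists is hypothetical:

```javascript
// Hypothetical helper: report whether a collection with the given name
// exists in the database. `database` is assumed to be the shell's `db` object.
function collectionExists(database, name) {
    return database.getCollectionNames().indexOf(name) !== -1;
}
```

In the shell, collectionExists(db, "task2") should return false once the drop has run, while a collection emptied with remove({}) would still be listed.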

 

That's the end of this tip. Thanks for reading, and feel free to contact me if you have any questions.

Discussion is welcome.

 

 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

