
MongoDB Map Reduce

What is the use of map-reduce in MongoDB?

According to the MongoDB documentation, map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. MongoDB exposes map-reduce operations through the mapReduce command, which is commonly used for processing large data sets.

MapReduce Command

The basic syntax of the mapReduce command is as follows −

>db.collection.mapReduce(
   function() { emit(key, value); },                  // map function
   function(key, values) { return reduceFunction; },  // reduce function
   {
      out: collection,
      query: document,
      sort: document,
      limit: number
   }
)

The mapReduce command first queries the collection, then maps each matching document to key-value pairs. The values emitted for any key that has multiple values are then reduced to a single result.

In the syntax above −

  • map is a JavaScript function that maps each input document to a key and a value and emits them as a key-value pair.
  • reduce is a JavaScript function that reduces (combines) all the values emitted with the same key into a single value.
  • out specifies the location of the map-reduce result.
  • query specifies the optional selection criteria for choosing the input documents.
  • sort specifies the optional criteria for sorting the input documents.
  • limit specifies the optional maximum number of documents passed to the map function.
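The flow described above can be sketched in plain JavaScript, outside MongoDB. This is a minimal illustration under stated assumptions, not MongoDB's implementation: `simulateMapReduce`, its option handling, and the sample documents are hypothetical names made up for this sketch. It supports only an equality `query` and a `limit`, leaves out `sort` and `out`, and passes `emit` to the map function as an argument, whereas in MongoDB `emit` is available inside map directly.

```javascript
// Minimal in-memory sketch of the map-reduce flow (not MongoDB's implementation).
// simulateMapReduce and the sample data are hypothetical, for illustration only.
function simulateMapReduce(docs, map, reduce, options = {}) {
  const { query = {}, limit = Infinity } = options;

  // 1. query: keep documents whose fields equal every field in the query
  const selected = docs
    .filter((doc) => Object.keys(query).every((k) => doc[k] === query[k]))
    .slice(0, limit);

  // 2. map: each document calls emit(key, value); values are grouped by key
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of selected) {
    map.call(doc, emit); // `this` inside map is the current document
  }

  // 3. reduce: keys with several values are reduced; a single value passes through
  const out = [];
  for (const [key, values] of groups) {
    const value = values.length > 1 ? reduce(key, values) : values[0];
    out.push({ _id: key, value });
  }
  return out;
}

const docs = [
  { user_name: "mark", status: "active" },
  { user_name: "mark", status: "disabled" },
  { user_name: "tom", status: "active" },
  { user_name: "mark", status: "active" },
];

const result = simulateMapReduce(
  docs,
  function (emit) { emit(this.user_name, 1); },
  (key, values) => values.reduce((a, b) => a + b, 0),
  { query: { status: "active" } }
);
console.log(result);
```

With this sample data, mark has two active posts and tom has one, so reduce is called only for mark's key, and tom's single value is passed through unchanged.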

Using MapReduce

Consider the following document structure for storing user posts. Each document stores the post text, the user_name of the author, and the status of the post.

{
   "post_text": "phptpoint is an awesome website for tutorials",
   "user_name": "mark",
   "status":"active"
}

Now we will run mapReduce on our posts collection to select all the active posts, group them by user name, and count the number of posts for each user, using the following code.

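>db.posts.mapReduce(
   function() { emit(this.user_name, 1); },               // map: emit 1 for each post, keyed by user
   function(key, values) { return Array.sum(values); },   // reduce: add up the 1s for each user
   {
      query: { status: "active" },   // only active posts are passed to map
      out: "post_total"              // write the results to the post_total collection
   }
)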

The above mapReduce query outputs the following result −

{
   "result" : "post_total",
   "timeMillis" : 9,
   "counts" : {
      "input" : 4,
      "emit" : 4,
      "reduce" : 2,
      "output" : 2
   },
   "ok" : 1
}

The result shows that 4 documents matched the query (status: "active"), the map function emitted 4 key-value pairs, and the reduce function then combined the pairs that share a key, producing 2 output documents.
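The counts in this output can be reproduced with a small plain-JavaScript tally. This is only a sketch: the `posts` array is hypothetical sample data chosen to match the numbers above (four active posts by two users), and the counting rule for `reduce` assumes reduce runs once per key that has more than one value. On large inputs MongoDB may call reduce more than once for the same key, so the real figure can differ.

```javascript
// Hypothetical sample: four active posts and one inactive post, by two users.
const posts = [
  { user_name: "tom", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "tom", status: "active" },
  { user_name: "mark", status: "active" },
  { user_name: "tom", status: "inactive" },
];

const counts = { input: 0, emit: 0, reduce: 0, output: 0 };
const grouped = new Map();

for (const post of posts) {
  if (post.status !== "active") continue; // query: { status: "active" }
  counts.input += 1;                      // one more document reaches map
  const key = post.user_name;             // map: emit(this.user_name, 1)
  if (!grouped.has(key)) grouped.set(key, []);
  grouped.get(key).push(1);
  counts.emit += 1;                       // one key-value pair emitted
}

for (const values of grouped.values()) {
  if (values.length > 1) counts.reduce += 1; // assumed: one reduce call per multi-value key
  counts.output += 1;                        // one output document per distinct key
}

console.log(counts); // { input: 4, emit: 4, reduce: 2, output: 2 }
```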

To see the result of this mapReduce query, call the find() method on it −

>db.posts.mapReduce(
   function() { emit(this.user_name, 1); },
   function(key, values) { return Array.sum(values); },
   {
      query: { status: "active" },
      out: "post_total"
   }
).find()

The above query gives the following result, showing that both users, tom and mark, have two posts in the active state.

{ "_id" : "tom", "value" : 2 }
{ "_id" : "mark", "value" : 2 }

Similarly, mapReduce queries can be used to build large, complex aggregation queries. Because it accepts custom JavaScript functions, mapReduce is very flexible and powerful.
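As one illustration of custom JavaScript functions, the sketch below totals the number of words each user has posted. Everything here is an assumption made for the example: `mapWordCount`, `reduceWordCount`, the second sample post, and the stand-in `emit` that lets the functions run in plain JavaScript against an in-memory array instead of a MongoDB collection.

```javascript
// Hypothetical custom functions: total words posted per user.
// The stand-in emit() below collects values by key so the sketch runs without MongoDB.
const emitted = new Map();
function emit(key, value) {
  if (!emitted.has(key)) emitted.set(key, []);
  emitted.get(key).push(value);
}

function mapWordCount() {
  // custom logic: split the post text on whitespace and emit the word count
  emit(this.user_name, this.post_text.trim().split(/\s+/).length);
}

function reduceWordCount(key, values) {
  // summing keeps the result the same type as the emitted values
  return values.reduce((total, n) => total + n, 0);
}

// The first post is the sample document from this tutorial; the second is made up.
const samplePosts = [
  { post_text: "phptpoint is an awesome website for tutorials", user_name: "mark", status: "active" },
  { post_text: "map reduce groups values by key", user_name: "mark", status: "active" },
];

for (const post of samplePosts) mapWordCount.call(post);

const wordTotals = [...emitted].map(([key, values]) => ({
  _id: key,
  value: reduceWordCount(key, values),
}));
console.log(wordTotals);
```

The two posts contain 7 and 6 words, so the single output document for mark has a value of 13.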

