Friday, March 22, 2013

Mongodb group by count, update

The new Aggregation Framework from mongodb (available in version 2.1 and above) is pretty useful...rather than write mapreduce functions, you can use their aggregation framework to do simple group by, count(*), max and sum type of queries.


--This is how you insert into a collection based on query on another collection
 db.basecollection.aggregate( [  
{ $group: { _id: "$c1",              
   rowcount: { $sum: 1 },  
                    lastupdated: { $max: "$c2" } } }
 ], {$sort:{'lastupdated':1}} ).result.forEach(function(doc){
   db.aggcollection.insert(doc)});

The equivalent sql would be something like:
insert into your_aggregate_table (category, rowcount, last_updated)
select category, count(*), max(update_date)
from your_base_table
group by category


--update (insert if don't exist) a collection based on query on another collection
db.basecollection.aggregate( [  
{ $group: { _id: "$c1",              
   rowcount: { $sum: 1 },  
                    lastupdated: { $max: "$c2" } } }
 ]).result.forEach(function(doc){
   db.aggcollection.update({_id: doc._id}, { $set: {rowcount: doc.rowcount, lastupdated: doc.lastupdated}}, {upsert:true})
   });


No comments:

Post a Comment