Friday, February 22, 2013

MongoDB cheatsheet

I've been dabbling quite a bit with MongoDB lately due to my new hobby (developing iPhone apps).

Here's my list of MongoDB cheatsheet for the commands I run pretty often:

DROPPING COLLECTIONS
db.yourcollection.drop();

DELETIONS
--To remove documents from a collection with ObjectId greater than a given id.
db.yourcollection.remove({"_id":{$gt:ObjectId("51212d98e4b07347c88e0f6f")}})

--To remove documents from a collection with a Column greater than a value
db.yourcollection.remove({yourcolumn:{$gt:50}})

--To remove documents from a collection with ObjectId given an id. 
db.yourcollection.remove({"_id":ObjectId("50ff1185e4b0f9e7d1c23586")});

IMPORT csv file into a collection in MongoDB
mongoimport -h yourmongodbhost -d database -c collection -u user -p password --type csv -f column1name,column2name --file c:\filename.csv

QUERYING
--find a document where ObjectId = an id
db.yourcollection.find({"_id":ObjectId("5101c346ade1db93e6f8bdff")})

--return first 10 rows 
db.yourcollection..find().sort({_id:1}).limit(10);

--return last 10 rows, if you want to return last row, just change to limit(1)
db.yourcollection..find().sort({_id:-1}).limit(10);



Pig Latin - How to do a group by, count distinct

I had to convert this sql into pig recently, took me a few tries and it's not very intuitive to me, so if you ever come across this problem..here's a template you can copy.


SQL:
select month_year, osname, appname, count (distinct sessionid) as distinct_sessions
from tableA
where starttime >= '2013-01-01'
group by month_year, osname, appname;

PIG:

A = LOAD '/user/yourfolder/*/part*' USING PigStorage('\t') AS (startday:chararray, appname:chararray, osname:chararray, uniquesessionid:chararray);

B = FILTER A by startday matches '2013.*';

--project startday to month
C = FOREACH B GENERATE SUBSTRING(TRIM(startday), 0, 7) as monthyear, appname, osname, uniquesessionid;

D = GROUP C BY (monthyear, appname, osname);

SESSIONS = foreach D {
    unique_sessions = DISTINCT C.uniquesessionid;
    generate FLATTEN(group), COUNT(unique_sessions) as sessions;
};

STORE SESSIONS INTO '/user/yourfolder' USING PigStorage('\t');

Thursday, February 21, 2013

netezza nzsql command


nzsql -host YourServer -d YourDatabase -u YourUserName -pw YourPassword

Once you connect, you can just execute sql statements, end it with semi colon, hit enter and it will display the results of your sql.

prompt => select  * from yourtableA limit 10;