Thursday, January 14, 2016

Mongo DB getting started - CRUD operations - mongo shell

1. Insert

We can create json object this way:
doc={name:"Joe", age:32, profession:"hacker"}
{ "name" : "Joe", "age" : 32, "profession" : "hacker" }

After that we can put object into collection :
> db.people.insert(doc)
WriteResult({ "nInserted" : 1 })

Also, it's possible to put json object description just into insert method :
> db.people.insert({name:"Black", age:33, profession:"baker"})
WriteResult({ "nInserted" : 1 })


2. Select (Find)

Now we can check for just created records :
> db.people.find()
{ "_id" : ObjectId("56955a4eeab00cffbb649f10"), "name" : "Joe", "age" : 32, "profession" : "hacker" }
{ "_id" : ObjectId("56955d41eab00cffbb649f11"), "name" : "Black", "age" : 33, "profession" : "baker" }


_id field - is a primary key. If inserted object doesn't have such field(as in case above) - db will add it automatically. 

Method findOne() - returning just one random record as a result:
> db.people.findOne()
{
"_id" : ObjectId("56955a4eeab00cffbb649f10"),
"name" : "Joe",
"age" : 32,
"profession" : "hacker"
}

To find specific object we have to use search criteria : 
> db.people.findOne({name:"Black"})
{
"_id" : ObjectId("56955d41eab00cffbb649f11"),
"name" : "Black",
"age" : 33,
"profession" : "baker"
}

Also we can define which fields we want to be returned : 
> db.people.findOne({name:"Black"}, {name:true})
{ "_id" : ObjectId("56955d41eab00cffbb649f11"), "name" : "Black" }

- by default _id field is returned this way. So if we don't want to see it - we have explicitly hide it : 
> db.people.findOne({name:"Black"}, {name:true, _id:false})
{ "name" : "Black" }

For conditions like > and <  we need to make a substitution :> $gt, >=$gte, <$lt, <=$lte 
> db.people.find({age:{$gt:32, $lte:33}})
{ "_id" : ObjectId("56955d41eab00cffbb649f11"), "name" : "Black", "age" : 33, "profession" : "baker" }


For checking  of existence of some field we can use "exist" operator :  
> db.people.insert({name:"Unknown"})
WriteResult({ "nInserted" : 1 })
- record in contrary with previous ones doesn't have fields : age and profession.
To query such record :
> db.people.find({age:{$exists:false}})
{ "_id" : ObjectId("5695674feab00cffbb64aaca"), "name" : "Unknown" }

For regular expression queries : "regex" keyword :
> db.people.find({name:{$regex:"ck"}})
{ "_id" : ObjectId("56955d41eab00cffbb649f11"), "name" : "Black", "age" : 33, "profession" : "baker" }

For OR operator we can use "$or" :
> db.people.find({$or:[{name:{$regex:"ck"}},{age:{$exists:false}}]})
{ "_id" : ObjectId("56955d41eab00cffbb649f11"), "name" : "Black", "age" : 33, "profession" : "baker" }
{ "_id" : ObjectId("5695674feab00cffbb64aaca"), "name" : "Unknown" }

For AND operator  we can use ","
> db.people.find({name:{$regex:"ck",$regex:"Bl"}})
{ "_id" : ObjectId("56955d41eab00cffbb649f11"), "name" : "Black", "age" : 33, "profession" : "baker" }

3. Find  - Arrays

We have 3 documents with arrays : 
> db.accounts.find()
{ "_id" : ObjectId("5696aa17ab119bf16d1f92d7"), "name" : "Howard", "favorites" : [ "pretzels", "beer" ] }
{ "_id" : ObjectId("5696aa3aab119bf16d1f92d8"), "name" : "George", "favorites" : [ "ice cream", "pretzels" ] }
{ "_id" : ObjectId("5696abd0ab119bf16d1f92d9"), "name" : "Lucas", "favorites" : [ "beer", "cheese" ] }

With $all operator we can search for documents which contains all elements from list :  
> db.accounts.find({favorites:{$all:["beer","cheese"]}})
{ "_id" : ObjectId("5696abd0ab119bf16d1f92d9"), "name" : "Lucas", "favorites" : [ "beer", "cheese" ] }

With $in operator we can search for documents which contains at least one element from list :  
> db.accounts.find({favorites:{$in:["beer","cheese"]}})
{ "_id" : ObjectId("5696aa17ab119bf16d1f92d7"), "name" : "Howard", "favorites" : [ "pretzels", "beer" ] }
{ "_id" : ObjectId("5696abd0ab119bf16d1f92d9"), "name" : "Lucas", "favorites" : [ "beer", "cheese" ] }

4. Embedded documents.

Let's create a document with embedded document "email" :  
> db.users.insert({ name:"richard", email:{work:"richard@work.com", home:"richard@home.com"} })

To query this record by it embedded document we have to use "."  for complex field name :
> db.users.find({"email.work":"richard@work.com"})
{ "_id" : ObjectId("5696b64bab119bf16d1f92da"), "name" : "richard", "email" : { "work" : "richard@work.com", "home" : "richard@home.com" } }

5. Cursors 

We can create a cursor this way : 
> cur=db.users.find();null;
null

After we can just print it by typing it name. Or create a loop to print it content : 

> while (cur.hasNext()) printjson(cur.next())
{
"_id" : ObjectId("568d374c27073d1383c52d9e"),
"name" : "Andrew Erlichson",
"city_of_birth" : "Queens",
"favorite_color" : "blue"
}

We can also limit cursor resultset by executing "limit" method : 
cur.limit(5)

Also we can perform find() operation on our cursor : 
cur.find(....)

And sort : 
cur.sort({name:-1})

In addition we can skip several records from result set by "skip" method : 
cur.skip(2)

So, in case of executing : 
cur.sort({name: -1}).limit(5).skip(2)
Mongo will sort collection in reverse order by field "name", skip first 2 records and starting from record #3 it will return 5 records. 

Counting
To get just count of records without it returning we just have to replace "find" with "count" : 
> db.scores.count()
3000
> db.scores.count({type:"exam"})
1000

6. Updating

Let's say, we have a document : 
> db.people.find({name:"Unknown"})
{ "_id" : ObjectId("5695674feab00cffbb64aaca"), "name" : "Unknown" }

For update statement we need 2 objects : 1.matching object (like for "find" statement) 2.object which to be placed instead of original project  - it will replace the whole document, except _id field. 
> db.people.update({name:"Unknown"},{name:"Updated", age:100})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.people.find({name:"Updated"})
{ "_id" : ObjectId("5695674feab00cffbb64aaca"), "name" : "Updated", "age" : 100 }
As you can see, object id remained the same, but content of the document was replaced. 

If we don't want to replace the whole document - we can use $set operator - it will update fields if they are exists, and insert them if they don't exist: 
> db.people.update({name:"Joe"}, {$set:{city:"Houston"}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
- we just added city=Houston to existing fields of Joe, other fields remain the same :
> db.people.find({name:"Joe"})
{ "_id" : ObjectId("56955a4eeab00cffbb649f10"), "name" : "Joe", "age" : 32, "profession" : "hacker", "city" : "Houston" }
  
The opposite  operation is removing file - $unset operator. Using it we can remove just added field "city" : 
> db.people.update({name:"Joe"}, {$unset:{city:1}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.people.find({name:"Joe"})
{ "_id" : ObjectId("56955a4eeab00cffbb649f10"), "name" : "Joe", "age" : 32, "profession" : "hacker" }

7. Update - Arrays 

Let's create a record with array : 
> db.arrays.insert({_id:1, a:[1,2,3,4,5]})
WriteResult({ "nInserted" : 1 })

Now  we can update element of array : 
> db.arrays.update({_id:1}, {$set: {"a.2":33} })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.arrays.find({_id:1})
{ "_id" : 1, "a" : [ 1, 2, 33, 4, 5 ] }

To add/remove elements  : push/pop operators : 
> db.arrays.update({_id:1}, {$push: {"a":6} })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.arrays.find({_id:1})
{ "_id" : 1, "a" : [ 1, 2, 33, 4, 5, 6 ] }

> db.arrays.update({_id:1}, {$pop: {"a":1} })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.arrays.find({_id:1})
{ "_id" : 1, "a" : [ 1, 2, 33, 4, 5 ] }

To add several element - $pushAll operator: 
> db.arrays.update({_id:1}, {$pushAll: {"a":[6,7,8,9,10]} })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.arrays.find({_id:1})
{ "_id" : 1, "a" : [ 1, 2, 33, 4, 5, 6, 7, 8, 9, 10 ] }

To remove element by itposition  - $pull operator : 
> db.arrays.update({_id:1}, {$pull: {"a":1} })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.arrays.find({_id:1})
{ "_id" : 1, "a" : [ 2, 33, 4, 5, 6, 7, 8, 9, 10 ] }

If we want to remove several elements - we can use $pullAll operator : 
> db.arrays.update({_id:1}, {$pullAll: {"a":[1,2,3]} })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.arrays.find({_id:1})
{ "_id" : 1, "a" : [ 33, 4, 5, 6, 7, 8, 9, 10 ] }


$addToSet operator is using array as set and if value exists - it will not be added, if not  - added. 
> db.arrays.update({_id:1}, {$addToSet: {"a":1} })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
> db.arrays.find({_id:1})
{ "_id" : 1, "a" : [ 33, 4, 5, 6, 7, 8, 9, 10, 1 ] }
- element was added 

> db.arrays.update({_id:1}, {$addToSet: {"a":1} })
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 0 })
> db.arrays.find({_id:1})
{ "_id" : 1, "a" : [ 33, 4, 5, 6, 7, 8, 9, 10, 1 ] }
- element already exists, so it was not added again 

8. Upsert

Upsert is combination of update (if element exists) and insert (if not). 
> db.people.update({name:"Huan"}, {age:25})
WriteResult({ "nMatched" : 0, "nUpserted" : 0, "nModified" : 0 })
- Matched=0 - nothing was updated because Huan is not present yet. 

With upsert : 
> db.people.update({name:"Huan"}, {age:25}, {upsert:true})
WriteResult({
"nMatched" : 0,
"nUpserted" : 1,
"nModified" : 0,
"_id" : ObjectId("569802f643831b32e4777438")
})
- record was inserted 

Multi-update
To update all documents in collection we have to use empty search criteria {} as first argument, and also we have to add third argument {multi:true}  
> db.people.update({},{$set:{title:"Dr."}},{multi:true})
WriteResult({ "nMatched" : 5, "nUpserted" : 0, "nModified" : 5 })
> db.people.find()
{ "_id" : ObjectId("56955a4eeab00cffbb649f10"), "name" : "Joe", "age" : 32, "profession" : "hacker", "title" : "Dr." }
{ "_id" : ObjectId("56955d41eab00cffbb649f11"), "name" : "Black", "age" : 33, "profession" : "baker", "title" : "Dr." }

9.Removing

Syntax is very similar to find() : 
> db.people.remove({name:"Joe"})
WriteResult({ "nRemoved" : 1 })

To remove all documents from collection : 
> db.people.remove({})
WriteResult({ "nRemoved" : 4 })

Now collection is empty but it exists. We can drop it  : 
> db.people.drop()
true

10. Indexes

Creating of index is very simple :
 db.students.createIndex({"name":1})

To check if index is used we have to add "explain" to "find" command :
> db.students.find({"name" : "Wilburn Spiess"}).explain()
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "school.students",
"indexFilterSet" : false,
"parsedQuery" : {
"name" : {
"$eq" : "Wilburn Spiess"
}
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : {
"name" : 1
},
"indexName" : "name_1",
"isMultiKey" : false,
"isUnique" : false,
"isSparse" : false,
"isPartial" : false,
"indexVersion" : 1,
"direction" : "forward",
"indexBounds" : {
"name" : [
"[\"Wilburn Spiess\", \"Wilburn Spiess\"]"
]
}
}
},
"rejectedPlans" : [ ]
},
"serverInfo" : {
"host" : "ubuntu",
"port" : 27017,
"version" : "3.2.1",
"gitVersion" : "a14d55980c2cdc565d4704a7e3ad37e4e535c1b2"
},
"ok" : 1
}

- mongo used index scan.


To check which indexes we have in db, we have to execute  "db.collectionName.getIndexes()":
> db.students.getIndexes() db.students.createIndex({"name":1})
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "school.students"
},
{
"v" : 1,
"key" : {
"name" : 1
},
"name" : "name_1",
"ns" : "school.students"
}
]
- beside index which we created there is a index by "_id" field - it was created automatically.

To remove index we have to use the same index as for creation, but with "dropIndex" :
> db.students.dropIndex({"name":1})
{ "nIndexesWas" : 2, "ok" : 1 }

11. Geo indexes
Let's say we have 3 objects with coordinates in our collection :
> db.places.find()
{ "_id" : ObjectId("56accdbe4956cc1aa16f2bb4"), "name" : "home", "location" : [ 1, 1 ] }
{ "_id" : ObjectId("56accdc94956cc1aa16f2bb5"), "name" : "work", "location" : [ 3, 3 ] }
{ "_id" : ObjectId("56accdd74956cc1aa16f2bb6"), "name" : "shop", "location" : [ 5, 5 ] }


To make index based on coordinates, we have execute :
db.places.ensureIndex({"location":"2d","type":1})

Now can search for "closest" place:
> db.places.find({"location":{"$near":[3.5,4]}})
{ "_id" : ObjectId("56accdc94956cc1aa16f2bb5"), "name" : "work", "location" : [ 3, 3 ] }
{ "_id" : ObjectId("56accdd74956cc1aa16f2bb6"), "name" : "shop", "location" : [ 5, 5 ] }
{ "_id" : ObjectId("56accdbe4956cc1aa16f2bb4"), "name" : "home", "location" : [ 1, 1 ] }

Also it's possible to use latitude/longitude indexes. Collection :
> db.places.find()
{ "_id" : ObjectId("56acd22d4956cc1aa16f2bb7"), "store_id" : 8, "loc" : { "type" : "Point", "coordinates" : [ -37.47891236119904, 4.488667018711567 ] } }
{ "_id" : ObjectId("56acd2464956cc1aa16f2bb8"), "store_id" : 9, "loc" : { "type" : "Point", "coordinates" : [ -38.47891236119904, 5.488667018711567 ] } }
{ "_id" : ObjectId("56acd2584956cc1aa16f2bb9"), "store_id" : 10, "loc" : { "type" : "Point", "coordinates" : [ -39.47891236119904, 6.488667018711567 ] } }

Creating of index : 
> db.places.ensureIndex({"loc":"2dsphere"})

Search: 
> db.places.find({loc:{$near:{$geometry:{type:"Point",coordinates:[-37,4]},$maxDistance:200000}}})
{ "_id" : ObjectId("56acd22d4956cc1aa16f2bb7"), "store_id" : 8, "loc" : { "type" : "Point", "coordinates" : [ -37.47891236119904, 4.488667018711567 ] } }

11.Perfomance checking 

First, we can use system profiler : 
db.system.profile.find({millis:{$gt:1000}}).sort({ts:-1})

Second : mongostat

demien@ubuntu:~/Developer$ mongostat
insert query update delete getmore command % dirty % used flushes  vsize   res qr|qw ar|aw netIn netOut conn                      time
    *0    *0     *0     *0       0     1|0     0.0    0.0       0 249.0M 77.0M   0|0   0|0   79b    18k    2 2016-01-30T17:02:08+01:00
    *0    *0     *0     *0       0     1|0     0.0    0.0       0 249.0M 77.0M   0|0   0|0   79b    18k    2 2016-01-30T17:02:09+01:00
    *0    *0     *0     *0       0     1|0     0.0    0.0       0 249.0M 77.0M   0|0   0|0   79b    18k    2 2016-01-30T17:02:10+01:00
    *0    *0     *0     *0       0     1|0     0.0    0.0       0 249.0M 77.0M   0|0   0|0   79b    18k    2 2016-01-30T17:02:11+01:00

Third - mongotop:
demien@ubuntu:~/Developer$ mongotop 3
2016-01-30T21:48:14.855+0100 connected to: 127.0.0.1

                  ns    total    read    write    2016-01-30T21:48:17+01:00
  admin.system.roles      0ms     0ms      0ms                             
admin.system.version      0ms     0ms      0ms                             
          blog.posts      0ms     0ms      0ms                             
       blog.sessions      0ms     0ms      0ms                             
          blog.users      0ms     0ms      0ms                             
     course.findTest      0ms     0ms      0ms                             
        course.hello      0ms     0ms      0ms                             
   local.startup_log      0ms     0ms      0ms                             
local.system.replset      0ms     0ms      0ms                             
   m101.funnynumbers      0ms     0ms      0ms  


12. Aggregation

min/max/avg/sum :
> db.grades.aggregate([{$group:{"_id":"$state", "max_pop":{"$max":"$pop"}}}])
{ "_id" : "NY", "max_pop" : 51947 }
{ "_id" : "NJ", "max_pop" : 69646 }
{ "_id" : "CT", "max_pop" : 60670 }
{ "_id" : "CA", "max_pop" : 64996 }

Count - used "sum", but instead of field name, we use value 1
> db.grades.aggregate([{$group:{"_id":"$state", "postal_codes":{"$sum":1}}}])
{ "_id" : "NY", "postal_codes" : 50 }
{ "_id" : "NJ", "postal_codes" : 50 }
{ "_id" : "CT", "postal_codes" : 50 }
{ "_id" : "CA", "postal_codes" : 50 }

Push used to "collect" all elements by "group by" into array(like str_agg) : 
> db.grades.aggregate([{$group:{"_id":"$city", "postal_codes":{"$push":"$_id"}}}])
{ "_id" : "WOODBRIDGE", "postal_codes" : [ "07095" , "07095","07095"] }
{ "_id" : "STRATHMERE", "postal_codes" : [ "08248" ] }
{ "_id" : "PATERSON", "postal_codes" : [ "07502" ] }

AddToSet - the same as "push", but elements are added into set  - so we will not have duplicates
> db.grades.aggregate([{$group:{"_id":"$city", "postal_codes":{"$addToSet":"$_id"}}}])
{ "_id" : "WOODBRIDGE", "postal_codes" : [ "07095" ] }
{ "_id" : "STRATHMERE", "postal_codes" : [ "08248" ] }
{ "_id" : "PATERSON", "postal_codes" : [ "07502" ] }


Projection - used for creating another collection from current : 
Current collection : 
> db.grades.findOne()
{
"_id" : "92278",
"city" : "TWENTYNINE PALMS",
"pop" : 11412,
"state" : "CA"
}

 Creating another collection using "projection":
> db.grades.aggregate([{"$project": {
        "_id":0,
        "st":{"$toLower":"$state"},
        "details":{"city_name":"$city", "population":{$multiply:["$pop",10]}} 
}}])
{ "st" : "ca", "details" : { "city_name" : "TWENTYNINE PALMS", "population" : 114120 } }
{ "st" : "ct", "details" : { "city_name" : "WATERBURY", "population" : 251280 } }
{ "st" : "nj", "details" : { "city_name" : "HACKETTSTOWN", "population" : 234400 } }

Match - the same as "group by" in SQL DB. Shoud be used with another aggregate operations : 
> db.grades.aggregate([
    {"$match":{"state":"CA"}},
    {"$group":{"_id":"$city", "zip_codes":{"$addToSet":"$_id"}}}
]) 
{ "_id" : "LUDLOW", "zip_codes" : [ "92338" ] }
{ "_id" : "WILLOW CREEK", "zip_codes" : [ "95573" ] }
{ "_id" : "MURRIETA", "zip_codes" : [ "92563" ] }
{ "_id" : "SUISUN CITY", "zip_codes" : [ "94585" ] }

Also we can stage "project" and sort to previous step if we want to rename some fields and sort: 
> db.grades.aggregate([
       {"$match":{"state":"CA"}},
       {"$group":{"_id":"$city", "zip_codes":{"$addToSet":"$_id"}}},
       {"$project":{"_id":0,"city":"$_id","zip_codes":1}},
       {"$sort":{"city":1}}
])
{ "zip_codes" : [ "95915" ], "city" : "BELDEN" }
{ "zip_codes" : [ "90706" ], "city" : "BELLFLOWER" }
{ "zip_codes" : [ "93430" ], "city" : "CAYUCOS" }
{ "zip_codes" : [ "96107" ], "city" : "COLEVILLE" }

Also we can add : "skip" and "limit":
> db.grades.aggregate([{"$match":{"state":"CA"}},{"$group":{"_id":"$city", "zip_codes":{"$addToSet":"$_id"}}},{"$project":{"_id":0,"city":"$_id","zip_codes":1}},{"$sort":{"city":1}},{"$skip":5},{"$limit":3}])
{ "zip_codes" : [ "93941" ], "city" : "FORT ORD" }
{ "zip_codes" : [ "93720" ], "city" : "FRESNO" }
{ "zip_codes" : [ "95439" ], "city" : "FULTON" }

Unwind - used for transforming array into "flat" structure :   
> db.grades.findOne()
{
"_id" : "92278",
"city" : "TWENTYNINE PALMS",
"loc" : [
-116.06041,
34.237969
],
"pop" : 11412,
"state" : "CA"
}

with unwind:
> db.grades.aggregate([{"$unwind":"$loc"}])
{ "_id" : "92278", "city" : "TWENTYNINE PALMS", "loc" : -116.06041, "pop" : 11412, "state" : "CA" }
{ "_id" : "92278", "city" : "TWENTYNINE PALMS", "loc" : 34.237969, "pop" : 11412, "state" : "CA" }