MongoDB : What is Time to Live (TTL) index

TTL or time to live indexes are special single-field indexes that MongoDB can use to automatically remove documents from a collection after a certain amount of time. Data expiration is useful for certain types of information like machine generated event data, logs, and session information that only need to persist in a database for a finite amount of time.

To create a TTL index, use the db.collection.createIndex() method with the expireAfterSeconds option on a field whose value is either a date or an array that contains date values.

For example, to create a TTL index on the lastModifiedDate field of the eventlog collection, use the following operation in the mongo shell:

db.eventlog.createIndex( { "lastModifiedDate": 1 }, { expireAfterSeconds: 3600 } )


Warning: The TTL index does not guarantee that expired data will be deleted immediately. 
There may be a delay between the time a document expires and the time that MongoDB removes the document from the database.
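
As a rough model of the rule above, a document becomes eligible for removal once the indexed date plus expireAfterSeconds lies in the past, and for an array field the earliest date decides. The sketch below is plain JavaScript, not MongoDB code: expiresAt and isExpired are hypothetical helper names used only for illustration. The actual deletion is done by a background task in mongod that runs roughly once a minute, which is one reason for the delay mentioned in the warning.

```javascript
// Illustrative model of TTL expiry; expiresAt and isExpired are made-up helpers.
function expiresAt(fieldValue, expireAfterSeconds) {
  // Accept a single Date or an array that may contain Dates.
  const dates = (Array.isArray(fieldValue) ? fieldValue : [fieldValue])
    .filter((v) => v instanceof Date);
  if (dates.length === 0) return null; // no date value: the document never expires
  // For arrays, the earliest date decides the expiry time.
  const earliest = Math.min(...dates.map((d) => d.getTime()));
  return new Date(earliest + expireAfterSeconds * 1000);
}

function isExpired(fieldValue, expireAfterSeconds, now) {
  const at = expiresAt(fieldValue, expireAfterSeconds);
  return at !== null && at.getTime() <= now.getTime();
}

const lastModifiedDate = new Date("2016-03-14T21:00:00Z");
console.log(expiresAt(lastModifiedDate, 3600).toISOString());
// 2016-03-14T22:00:00.000Z
```

With expireAfterSeconds set to 3600 as in the createIndex() call above, a document last modified at 21:00 UTC is a candidate for deletion from 22:00 UTC onwards.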

 

MongoDB : How to make query result look nice in mongo shell

Whenever we run a find() query on a MongoDB collection, the shell fills up with dense single-line documents that are hard to read.

 

> db.devices.find()
{ "_id" : ObjectId("55ff79deb86a4b0eb1110ba4"), "time_allocated" : null,  "is_allocated" : false, "aggr_zone" : "xxx", "dc" : "DFW1", "time_suspended" : null, "device_type" : "server", "core_template_id" : "12345", "device_swapped_to" : null, "is_suspended" : false, "time_created" : "Sun Sep 20 22:30:38 2015", "device_id" : 48080, "is_decommed" : false }
{ "_id" : ObjectId("55ff79deb86a4b0eb1110ba5"), "time_allocated" : null,  "is_allocated" : false, "aggr_zone" : "xxx", "dc" : "ORD1", "time_suspended" : null, "device_type" : "server", "core_template_id" : "54321", "device_swapped_to" : null, "is_suspended" : false, "time_created" : "Sun Sep 20 22:30:38 2015", "device_id" : 45244, "is_decommed" : false }


 

There are two ways to make the query results easier to read.
1. Use pretty()

> db.devices.find().pretty()
{
	"_id" : ObjectId("55ff79deb86a4b0eb1110ba4"),
	"time_allocated" : null,
	"is_allocated" : false,
	"aggr_zone" : "xxx",
	"dc" : "DFW1",
	"time_suspended" : null,
	"device_type" : "server",
	"core_template_id" : "12345",
	"device_swapped_to" : null,
	"is_suspended" : false,
	"time_created" : "Sun Sep 20 22:30:38 2015",
	"device_id" : 48080,
	"is_decommed" : false
}
{
	"_id" : ObjectId("55ff79deb86a4b0eb1110ba5"),
	"time_allocated" : null,
	"is_allocated" : false,
	"aggr_zone" : "xxx",
	"dc" : "ORD1",
	"time_suspended" : null,
	"device_type" : "server",
	"core_template_id" : "54321",
	"device_swapped_to" : null,
	"is_suspended" : false,
	"time_created" : "Sun Sep 20 22:30:38 2015",
	"device_id" : 45244,
	"is_decommed" : false
}

Note: This returns a cursor that the shell iterates in batches, so you need to type "it" to see the next 20 records.
2. Use toArray()

> db.devices.find().limit(2).toArray()
[
	{
		"_id" : ObjectId("55ff79deb86a4b0eb1110ba4"),
		"time_allocated" : null,
		"is_allocated" : false,
		"aggr_zone" : "xxx",
		"dc" : "DFW1",
		"time_suspended" : null,
		"device_type" : "server",
		"core_template_id" : "12345",
		"device_swapped_to" : null,
		"is_suspended" : false,
		"time_created" : "Sun Sep 20 22:30:38 2015",
		"device_id" : 48080,
		"is_decommed" : false
	},
	{
		"_id" : ObjectId("55ff79deb86a4b0eb1110ba5"),
		"time_allocated" : null,
		"is_allocated" : false,
		"aggr_zone" : "xxx",
		"dc" : "ORD1",
		"time_suspended" : null,
		"device_type" : "server",
		"core_template_id" : "54321",
		"device_swapped_to" : null,
		"is_suspended" : false,
		"time_created" : "Sun Sep 20 22:30:38 2015",
		"device_id" : 45244,
		"is_decommed" : false
	}
]
>

Note: toArray() prints every matching record at once, with no iterator, so use limit(). Without a sort(), there is no guarantee which records the limit returns.


What is Memcached ?

As the name suggests, Memcached is a cache. It can be thought of as a big bucket of key-value pairs residing in RAM, which delivers frequently used data almost instantly by avoiding access to the data source.

Memcached is a general-purpose distributed memory caching system. It is often used to speed up dynamic database-driven websites by caching data and objects in RAM to reduce the number of times an external data source (such as a database or API) must be read.

Memcached’s APIs provide a very large hash table distributed across multiple machines. When the table is full, subsequent inserts cause older data to be purged in least recently used (LRU) order.[3][4] Applications using Memcached typically layer requests and additions into RAM before falling back on a slower backing store, such as a database.
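
To make the least recently used (LRU) purge order concrete, here is a minimal sketch in JavaScript. LruCache is a hypothetical class written for this post, not part of any Memcached client, and real Memcached eviction is more involved; the sketch only shows the basic idea of dropping the entry that has gone unused the longest when the table is full.

```javascript
// Minimal LRU cache sketch; LruCache is a hypothetical class, not a Memcached API.
class LruCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.items = new Map(); // a Map remembers insertion order: oldest entry first
  }

  get(key) {
    if (!this.items.has(key)) return undefined;
    const value = this.items.get(key);
    // Re-insert so the key becomes the most recently used entry.
    this.items.delete(key);
    this.items.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.items.has(key)) {
      this.items.delete(key);
    } else if (this.items.size >= this.capacity) {
      // Table is full: evict the least recently used key (first in the Map).
      const oldestKey = this.items.keys().next().value;
      this.items.delete(oldestKey);
    }
    this.items.set(key, value);
  }
}

const lru = new LruCache(2);
lru.set("userrow:1", { name: "a" });
lru.set("userrow:2", { name: "b" });
lru.get("userrow:1");                // touch key 1, so key 2 is now the oldest
lru.set("userrow:3", { name: "c" }); // table is full: userrow:2 is evicted
console.log([...lru.items.keys()]);  // [ 'userrow:1', 'userrow:3' ]
```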

Converting database or object creation queries to use Memcached is simple. Typically, when using straight database queries, example code would be as follows:

 function get_foo(int userid) {
    data = db_select("SELECT * FROM users WHERE userid = ?", userid);
    return data;
 }

After conversion to Memcached, the same call might look like the following:

 function get_foo(int userid) {
    /* first try the cache */
    data = memcached_fetch("userrow:" + userid);
    if (!data) {
       /* not found : request database */
       data = db_select("SELECT * FROM users WHERE userid = ?", userid);
       /* then store in cache until next get */
       memcached_add("userrow:" + userid, data);
    }
    return data;
 }

The client would first check whether a Memcached value with the unique key “userrow:userid” exists, where userid is some number. If the result does not exist, it would select from the database as usual, and set the unique key using the Memcached API add function call.

However, if only this API call were modified, the server would end up fetching incorrect data following any database update actions: the Memcached “view” of the data would become out of date. Therefore, in addition to creating an “add” call, an update call would also be needed using the Memcached set function.

 function update_foo(int userid, string dbUpdateString) {
   /* first update database */
    result = db_execute(dbUpdateString);
    if (result) {
       /* database update successful : fetch data to be stored in cache */
       data = db_select("SELECT * FROM users WHERE userid = ?", userid);
       /* the previous line could also look like data = createDataFromDBString(dbUpdateString); */
       /* then store in cache until next get */
       memcached_set("userrow:" + userid, data);
    }
 }

This call would update the currently cached data to match the new data in the database, assuming the database query succeeds. An alternative approach would be to invalidate the cache with the Memcached delete function, so that subsequent fetches result in a cache miss. Similar action would need to be taken when database records were deleted, to maintain either a correct or incomplete cache.
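
The read, update, and delete paths described above can be sketched as runnable JavaScript. This is only an illustration under stated assumptions: cache and db are in-memory Map objects standing in for a Memcached client and a database, and getFoo, updateFoo, deleteFoo, and dbSelect are hypothetical names modelled on the pseudocode, not real library calls.

```javascript
// Cache-aside sketch. cache and db are in-memory stand-ins (assumptions),
// not a real Memcached client or database driver.
const cache = new Map();
const db = new Map([[42, { userid: 42, name: "alice" }]]);
let dbReads = 0; // counts how often the "database" is actually read

function dbSelect(userid) {
  dbReads += 1;
  const row = db.get(userid);
  return row ? { ...row } : null;
}

function getFoo(userid) {
  const key = "userrow:" + userid;
  if (cache.has(key)) return cache.get(key); // cache hit
  const data = dbSelect(userid);             // cache miss: read the database
  if (data !== null) cache.set(key, data);   // store for the next get
  return data;
}

function updateFoo(userid, changes) {
  const row = db.get(userid);
  if (!row) return false;
  db.set(userid, { ...row, ...changes });           // first update the database
  cache.set("userrow:" + userid, dbSelect(userid)); // then refresh the cached copy
  return true;
}

function deleteFoo(userid) {
  db.delete(userid);
  cache.delete("userrow:" + userid); // invalidate so later gets are cache misses
}

getFoo(42);
getFoo(42);
console.log(dbReads);         // 1: the second call was served from the cache
updateFoo(42, { name: "bob" });
console.log(getFoo(42).name); // bob
```

Refreshing the cache inside updateFoo corresponds to the "set" approach; calling cache.delete there instead would be the invalidation alternative, at the cost of one extra database read on the next get.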

What is ISO date format

As globalization turns the world into one big family, we need a common standard to avoid confusion, especially about dates and times. If every country or person wrote dates in a different format, it would be hard to work out the exact date when communicating across countries. ISO therefore defined a date format for everyone to follow.

International Standard ISO 8601 specifies numeric representations of date and time. This standard notation helps to avoid confusion in international communication caused by the many different national notations and increases the portability of computer user interfaces.

YYYY-MM-DDThh:mm:ss.sTZD

where:

     YYYY = four-digit year

     MM   = two-digit month (01=January, etc.)

     DD   = two-digit day of month (01 through 31)

     hh   = two digits of hour (00 through 23) (am/pm NOT allowed)

     mm   = two digits of minute (00 through 59)

     ss   = two digits of second (00 through 59)

     s    = one or more digits representing a decimal fraction of a second

     TZD  = time zone designator (Z or +hh:mm or -hh:mm)

example: 

 1997-07-16T19:20:30.45Z (the trailing Z means zero offset, i.e. the time is given in UTC)

 1997-07-16T19:20:30.45+01:00 (the trailing +01:00 means the time is given in a time zone one hour ahead of UTC)
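
As a quick check of the two examples, the sketch below parses them with the built-in JavaScript Date and prints them back in UTC. It assumes a Node.js runtime; Node accepts the two-digit fraction used here, although the JavaScript specification itself only describes a three-digit fraction, so other engines may be stricter.

```javascript
// Both example strings parse natively; toISOString() always reports UTC.
const utc = new Date("1997-07-16T19:20:30.45Z");
const plusOne = new Date("1997-07-16T19:20:30.45+01:00");

console.log(utc.toISOString());     // 1997-07-16T19:20:30.450Z
console.log(plusOne.toISOString()); // 1997-07-16T18:20:30.450Z

// The same wall-clock reading at +01:00 is one hour earlier in UTC.
const diffMinutes = (utc.getTime() - plusOne.getTime()) / 60000;
console.log(diffMinutes);           // 60
```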

Advantages of the ISO 8601 standard date notation compared to other commonly used variants:

  • easily readable and writeable by software (no ‘JAN’, ‘FEB’, … table necessary)
  • easily comparable and sortable with a trivial string comparison
  • language independent
  • can not be confused with other popular date notations
  • consistency with the common 24h time notation system, where the larger units (hours) are also written in front of the smaller ones (minutes and seconds)
  • strings containing a date followed by a time are also easily comparable and sortable (e.g. write “1995-02-04 22:45:00”)
  • the notation is short and has constant length, which makes both keyboard data entry and table layout easier
  • identical to the Chinese date notation, so the largest cultural group (>25%) on this planet is already familiar with it 🙂
  • date notations with the order “year, month, day” are in addition already widely used e.g. in Japan, Korea, Hungary, Sweden, Finland, Denmark, and a few other countries and people in the U.S. are already used to at least the “month, day” order
  • a 4-digit year representation avoids overflow problems after 2099-12-31
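
The "sortable with a trivial string comparison" point can be checked with a short JavaScript sketch. The sample timestamps are made up for illustration, and the property only holds when all strings use the same layout and the same time zone designator (here Z).

```javascript
// Sample timestamps (illustrative values), all in the same layout and all in UTC.
const stamps = [
  "1997-07-16T19:20:30Z",
  "1995-02-04T22:45:00Z",
  "2016-03-14T21:27:21Z",
  "1997-07-16T08:05:00Z",
];

// Plain string sort versus a sort on the parsed date values.
const byString = [...stamps].sort();
const byTime = [...stamps].sort((a, b) => new Date(a) - new Date(b));

console.log(byString.join() === byTime.join()); // true
console.log(byString[0]);                       // 1995-02-04T22:45:00Z
```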

MongoDB : Convert Oplog timestamp into ISO date

The easiest way to see the most recent changes (in any database and any collection) is to query the oplog sorted by timestamp in descending order. The oplog lives in the local database; oplog.$main is the collection used by master/slave deployments, while replica sets use oplog.rs.

> db.oplog.$main.find().sort({"ts":-1}).limit(5)

Output : { "ts" : Timestamp(1457990841, 1), "op" : "i", "ns" : "nis.devices", "o" : { "_id" : ObjectId("56e72cb9389c38c2aeeab698"), "name" : "premaseemmarch" } }

This shows the 5 most recent operations across all collections (we simply sorted by timestamp in reverse order). To check when an operation happened, convert its Timestamp to an ISO date. The first argument of the Timestamp is the number of seconds since the Unix epoch: "ts" : Timestamp(1457990841, 1)

Way 1 : 

> new Date(1000* 1457990841)

ISODate("2016-03-14T21:27:21Z")

Way 2 : 

> x = Timestamp(1457989386, 1)

Timestamp(1457989386, 1)

> new Date(x.t * 1000)
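
Both conversions can be reproduced outside the shell. The sketch below is plain JavaScript for Node.js: the shell's Timestamp type is not available there, so a plain object with t (seconds) and i (ordinal) fields stands in for it, and oplogTsToDate is a hypothetical helper name.

```javascript
// Stand-in for a shell Timestamp: t = seconds since the Unix epoch, i = ordinal.
const ts = { t: 1457990841, i: 1 };

function oplogTsToDate(timestamp) {
  return new Date(timestamp.t * 1000); // Date expects milliseconds
}

console.log(oplogTsToDate(ts).toISOString());
// 2016-03-14T21:27:21.000Z
console.log(oplogTsToDate({ t: 1457989386, i: 1 }).toISOString());
// 2016-03-14T21:03:06.000Z
```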

MongoDb Tutorial by Premaseem (Free video course on MongoDB)

Hi Friends,

I am a certified MongoDB expert and thought I would share my knowledge so that everyone can benefit from it. It’s a free YouTube video series of short lectures covering 31 different topics.

Free video tutorial link :  https://www.youtube.com/playlist?list=PL13Vva6TJcSsAFUsZwYpJOfR-ENWypLAe


Use this free tutorial and share it with friends 🙂

MongoDB : Script to run Sharding with replica set on local machine

This simple script helps you run a sharded cluster with multiple replica sets on your local box. It kills any running mongod and mongos processes and deletes /data/config and /data/shard*, so use it only on a test machine. (On Linux, run the script, or the individual commands, with sudo or as root.)

 

# MongoDB
# script to start a sharded environment on localhost

# clean everything up
echo "killing mongod and mongos"
killall mongod
killall mongos
echo "removing data files"
rm -rf /data/config
rm -rf /data/shard*


# start a replica set and tell it that it will be shard0
echo "starting servers for shard 0"
mkdir -p /data/shard0/rs0 /data/shard0/rs1 /data/shard0/rs2
mongod --replSet s0 --logpath "s0-r0.log" --dbpath /data/shard0/rs0 --port 37017 --fork --shardsvr --smallfiles
mongod --replSet s0 --logpath "s0-r1.log" --dbpath /data/shard0/rs1 --port 37018 --fork --shardsvr --smallfiles
mongod --replSet s0 --logpath "s0-r2.log" --dbpath /data/shard0/rs2 --port 37019 --fork --shardsvr --smallfiles

sleep 5
# connect to one server and initiate the set
echo "Configuring s0 replica set"
mongo --port 37017 << 'EOF'
config = { _id: "s0", members:[
          { _id : 0, host : "localhost:37017" },
          { _id : 1, host : "localhost:37018" },
          { _id : 2, host : "localhost:37019" }]};
rs.initiate(config)
EOF

# start a replica set and tell it that it will be shard1
echo "starting servers for shard 1"
mkdir -p /data/shard1/rs0 /data/shard1/rs1 /data/shard1/rs2
mongod --replSet s1 --logpath "s1-r0.log" --dbpath /data/shard1/rs0 --port 47017 --fork --shardsvr --smallfiles
mongod --replSet s1 --logpath "s1-r1.log" --dbpath /data/shard1/rs1 --port 47018 --fork --shardsvr --smallfiles
mongod --replSet s1 --logpath "s1-r2.log" --dbpath /data/shard1/rs2 --port 47019 --fork --shardsvr --smallfiles

sleep 5

echo "Configuring s1 replica set"
mongo --port 47017 << 'EOF'
config = { _id: "s1", members:[
          { _id : 0, host : "localhost:47017" },
          { _id : 1, host : "localhost:47018" },
          { _id : 2, host : "localhost:47019" }]};
rs.initiate(config)
EOF

# start a replica set and tell it that it will be shard2
echo "starting servers for shard 2"
mkdir -p /data/shard2/rs0 /data/shard2/rs1 /data/shard2/rs2
mongod --replSet s2 --logpath "s2-r0.log" --dbpath /data/shard2/rs0 --port 57017 --fork --shardsvr --smallfiles
mongod --replSet s2 --logpath "s2-r1.log" --dbpath /data/shard2/rs1 --port 57018 --fork --shardsvr --smallfiles
mongod --replSet s2 --logpath "s2-r2.log" --dbpath /data/shard2/rs2 --port 57019 --fork --shardsvr --smallfiles

sleep 5

echo "Configuring s2 replica set"
mongo --port 57017 << 'EOF'
config = { _id: "s2", members:[
          { _id : 0, host : "localhost:57017" },
          { _id : 1, host : "localhost:57018" },
          { _id : 2, host : "localhost:57019" }]};
rs.initiate(config)
EOF


# now start 3 config servers
echo "Starting config servers"
mkdir -p /data/config/config-a /data/config/config-b /data/config/config-c
mongod --logpath "cfg-a.log" --dbpath /data/config/config-a --port 57040 --fork --configsvr --smallfiles
mongod --logpath "cfg-b.log" --dbpath /data/config/config-b --port 57041 --fork --configsvr --smallfiles
mongod --logpath "cfg-c.log" --dbpath /data/config/config-c --port 57042 --fork --configsvr --smallfiles


# now start the mongos on a standard port
mongos --logpath "mongos-1.log" --configdb localhost:57040,localhost:57041,localhost:57042 --fork
echo "Waiting 60 seconds for the replica sets to fully come online"
sleep 60
echo "Connecting to mongos and enabling sharding"

# add shards and enable sharding on the test db
mongo <<'EOF'
db.adminCommand( { addshard : "s0/"+"localhost:37017" } );
db.adminCommand( { addshard : "s1/"+"localhost:47017" } );
db.adminCommand( { addshard : "s2/"+"localhost:57017" } );
db.adminCommand({enableSharding: "school"})
db.adminCommand({shardCollection: "school.students", key: {student_id:1}});
EOF


 

NOTE:

Newer MongoDB versions don’t support mirrored config servers, so the config servers must be set up as an ordinary replica set. To do so, replace the three config server mongod lines (lines 71-73 of the script) with these:

mongod --replSet cs --logpath "cfg-a.log" --dbpath /data/config/config-a --port 57040 --fork --configsvr --smallfiles
mongod --replSet cs --logpath "cfg-b.log" --dbpath /data/config/config-b --port 57041 --fork --configsvr --smallfiles
mongod --replSet cs --logpath "cfg-c.log" --dbpath /data/config/config-c --port 57042 --fork --configsvr --smallfiles

Then add these lines right after them:

echo "Configuring config servers"
mongo --port 57040 << 'EOF'
config = { _id: "cs", members:[
          { _id : 0, host : "localhost:57040" },
          { _id : 1, host : "localhost:57041" },
          { _id : 2, host : "localhost:57042" }]};
rs.initiate(config)
EOF

Finally, replace the mongos startup line with this one:

mongos --logpath "mongos-1.log" --configdb cs/localhost:57040,localhost:57041,localhost:57042 --fork

How to write and execute mongoDB scripts

Sometimes we need to save a series of mongo shell commands and run them again in the same order (for reuse and automation, and to avoid errors). The best solution is to save them in a file. Any extension works; js is the usual choice, but I use mjs, which stands for mongo javascript. The command below runs the file from bash:

aseem278$ mongo < /Users/asee2278scripts/mongoScript.mjs

TIP: To write queries efficiently, edit the script in an IDE to get auto-completion, formatting, and JSON validation, and then redirect the file into mongo.

Sample mongoScript.mjs

use school

db.scores.drop()

var types = ['exam', 'homework', 'quiz']

// insert one exam, homework and quiz score for each of 100 students
for (var student_id = 0; student_id < 100; student_id++) {
    for (var type = 0; type < 3; type++) {
        var r = {'student_id': student_id, 'type': types[type], 'score': Math.random() * 100};
        db.scores.insert(r);
    }
}

How to export data from mongoDB

Make sure a mongod process is running, and run the command below from bash, not from the mongo shell. It exports the data of a single collection; to export an entire database, use mongodump.

MQ428GG8WL:week2 asee2278$ mongoexport --db students --collection grades --out gradesOut.json

2016-01-14T14:21:18.813-0600 connected to: localhost
2016-01-14T14:21:18.831-0600 exported 800 record

Sample exported record would look like :

{"_id":{"$oid":"50906d7fa3c412bb040eb583"},"score":92.6244233936537,"student_id":3,"type":"exam"}
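
By default mongoexport writes one JSON document per line, with the ObjectId wrapped in a "$oid" field. A minimal sketch of reading such a line back in plain JavaScript (Node.js assumed), using the sample record above:

```javascript
// One line of the exported file; the ObjectId appears as {"$oid": "..."}.
const line = '{"_id":{"$oid":"50906d7fa3c412bb040eb583"},"score":92.6244233936537,"student_id":3,"type":"exam"}';

const record = JSON.parse(line);

console.log(record._id.$oid);                // 50906d7fa3c412bb040eb583
console.log(record.type, record.student_id); // exam 3
console.log(record.score.toFixed(2));        // 92.62
```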