MongoDB

May 12, 2021

This weekend I attended a two-day workshop on MongoDB delivered by Vimal Daga sir, and I learned a lot of MongoDB concepts in a short span of time.

This post is a glimpse of what I have learnt in these two days:

Data model

A data model is the way we organize and manage data. It determines how the data is stored and which kind of database suits the use case.

SQL

SQL (Structured Query Language) is the query language of relational databases, where the data is arranged in tables with a fixed schema and is therefore not flexible.

NoSQL

NoSQL is short for "Not only SQL". Here there is no fixed schema, and the data is typically stored as key-value pairs. It is also known as a flexible data model.

Document oriented Database

A document-oriented DB is a data model where a single row (record) is treated as one single document.

JSON

JSON, short for JavaScript Object Notation, is a text format for arranging data in a particular notation, so that the data stays uniform from one tool to another.
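As a small illustration of that uniformity, here is a minimal sketch using Python's standard json module. The student record is invented for this example and is not from the workshop.

```python
import json

# Hypothetical record, invented for illustration.
student = {"name": "Asha", "age": 27, "skills": ["python", "mongodb"]}

# Serialize the Python dict into JSON text that any other tool can read.
text = json.dumps(student, sort_keys=True)
print(text)

# Parse the JSON text back into a Python dict.
restored = json.loads(text)
print(restored == student)
```

The same text could be read by the mongo shell, Compass, or any other JSON-aware tool, which is the point of having one notation.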

CRUD

CRUD operations in MongoDB can be done in several different ways, depending on the requirements.

CRUD operations :

Create or insert operations : add new documents to a collection. If the collection does not currently exist, insert operations will create the collection.
Read operations : retrieve documents from a collection; i.e. query a collection for documents
Update operations : modify existing documents in a collection
Delete operations : remove documents from a collection
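The four operations above can be sketched in Python as follows. This is only a sketch: it assumes a pymongo-style collection object and uses the pymongo Collection methods insert_one, find_one, update_one, delete_one and count_documents. The document contents are invented for illustration.

```python
def crud_demo(collection):
    """Run one create, read, update and delete on a pymongo-style collection."""
    # Create: insert a new document (the collection is created if it is missing).
    collection.insert_one({"name": "Asha", "age": 27})

    # Read: fetch the document back with a query filter.
    age_before = collection.find_one({"name": "Asha"})["age"]

    # Update: change one field of the matching document with $set.
    collection.update_one({"name": "Asha"}, {"$set": {"age": 28}})
    age_after = collection.find_one({"name": "Asha"})["age"]

    # Delete: remove the document again and count what is left.
    collection.delete_one({"name": "Asha"})
    remaining = collection.count_documents({"name": "Asha"})

    return age_before, age_after, remaining
```

With a server running locally, this could be called as crud_demo(pymongo.MongoClient('mongodb://127.0.0.1:27017')["workshop"]["students"]), where the database name "workshop" and the collection name "students" are hypothetical.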

MongoDB using CLI and GUI

To use MongoDB we first have to install it, along with MongoDB Compass for the GUI. After installation, MongoDB gives us four options to work with it:

CLI
GUI (MongoDB Compass)
API
Cloud

For the CLI we only have to add the directory where MongoDB is installed to the path, and then run the command

>> mongo

use <database_name> : switch to the database (it is created when data is first stored in it); afterwards db refers to this database
show dbs : list the databases
db.createCollection("<collection_name>") : create a collection
db.<collection_name>.insert({"field": "data", "field": "data", ...}) : insert data into the collection as a document
db.<collection_name>.find() : search the documents of the collection
db.<collection_name>.drop() : delete the collection
db.dropDatabase() : delete the current database
db.<collection_name>.find().pretty() : give the output in nicely formatted JSON form
db.<collection_name>.explain().find() : show how the query on the collection is executed (the query plan)
db.<collection_name>.getIndexes() : return an array of documents that identify and describe the existing indexes on the collection, including hidden indexes
db.<collection_name>.createIndex({ <key>: 1 }) : create an index on the key, with 1 for ascending order and -1 for descending order
db.<collection_name>.explain("executionStats").find() : explain the query together with detailed execution statistics

For the GUI we have to provide the IP:port_no of the server to connect to it, and after connecting we can use MongoDB with buttons and clicks in GUI mode.

Pymongo

To integrate MongoDB with Python we can use the pymongo library. We then use its MongoClient class to define our code as a client of the server.

>> pip install pymongo

>> import pymongo

>> Client = pymongo.MongoClient('mongodb://127.0.0.1:27017')

This connects Python to the Mongo server, and the Client variable now works as the MongoDB client object in Python.

To configure Compass we just install it from the official website and then feed in the URL of the server.

To upload datasets into MongoDB we can use the mongoimport command-line tool, which is run from the system terminal rather than from the mongo prompt.

Indexes

Every document in a collection has an _id field whose value is unique within that collection, and MongoDB indexes this field automatically.

The _id also acts as the primary key, as it always uniquely identifies one and only one document.

Indexing is a way of arranging the data so that search operations in the database are faster: the data is ordered by some criterion, which lets us traverse a smaller amount of data to find something.

Sharding

Sharding in MongoDB is a way to divide the data across multiple nodes (shards) based on some criterion, the shard key, so that the load is spread out and we get better I/O speed.
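The idea of dividing data by a criterion can be sketched with a toy hash-based router. This is only an illustration of the concept, not how MongoDB implements sharding internally. The shard names and the user_id key are hypothetical.

```python
import hashlib

SHARDS = ["shard0", "shard1", "shard2"]  # hypothetical node names

def shard_for(key_value, shards=SHARDS):
    """Pick a shard from a stable hash of the shard key value."""
    digest = hashlib.sha256(str(key_value).encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]

def distribute(docs, shard_key):
    """Group documents by the shard that their key value maps to."""
    placement = {name: [] for name in SHARDS}
    for doc in docs:
        placement[shard_for(doc[shard_key])].append(doc)
    return placement

# Hypothetical documents, invented for illustration.
docs = [{"user_id": i} for i in range(30)]
placement = distribute(docs, "user_id")
for name, shard_docs in placement.items():
    print(name, len(shard_docs))
```

Because the same key value always maps to the same shard, a query for one user_id only has to touch one node.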

COLLSCAN

By default, the find function in MongoDB goes to every document, compares it against the query, and then gives the result. This is quite time consuming if we have a huge database. This default behaviour is known as a collection scan (COLLSCAN).

IXSCAN

An index stores the values of a particular field, together with references to their documents, in sorted order. If we search on that field after creating the index, the query is answered by index scanning (IXSCAN).
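The difference between the two scans can be sketched in plain Python. This is a toy model and not MongoDB's actual implementation: the "index" is simply a sorted list of (age, _id) pairs searched with bisect, and the documents are invented for illustration.

```python
import bisect

# Hypothetical documents, invented for illustration.
docs = [{"_id": i, "age": 18 + (i * 7) % 40} for i in range(1000)]

def collection_scan(docs, age):
    """Compare every document against the query, like a COLLSCAN."""
    examined = 0
    matches = []
    for doc in docs:
        examined += 1
        if doc["age"] == age:
            matches.append(doc["_id"])
    return matches, examined

# A toy "index": (age, _id) pairs kept in ascending order of age.
age_index = sorted((doc["age"], doc["_id"]) for doc in docs)

def index_scan(index, age):
    """Binary-search the sorted index and read only the matching entries."""
    pos = bisect.bisect_left(index, (age,))
    matches = []
    while pos < len(index) and index[pos][0] == age:
        matches.append(index[pos][1])
        pos += 1
    return matches, len(matches)

print(collection_scan(docs, 25)[1], "documents examined by the scan")
print(index_scan(age_index, 25)[1], "index entries read")
```

Both functions return the same _id values, but the collection scan touches every document while the index scan reads only the entries that match.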

By default, MongoDB creates one index, on the _id field, sorted in ascending order.

MongoDB also supports user-defined indexes on multiple fields, i.e. compound indexes. For example, if the requirement is to search for females older than 25, we need to examine two keys, age and gender, and a compound index on both can serve that query.
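The females-older-than-25 example can be sketched as follows. The index_spec and query values show how the index and filter could be written for pymongo's create_index and find. The rest is a toy model of a compound index as a sorted list of tuples, and the people are invented for illustration.

```python
import bisect

# Hypothetical people, invented for illustration.
people = [
    {"_id": 1, "gender": "female", "age": 31},
    {"_id": 2, "gender": "male", "age": 40},
    {"_id": 3, "gender": "female", "age": 22},
    {"_id": 4, "gender": "female", "age": 26},
    {"_id": 5, "gender": "male", "age": 19},
]

# How the index and the query could be written for pymongo:
index_spec = [("gender", 1), ("age", 1)]          # for create_index(index_spec)
query = {"gender": "female", "age": {"$gt": 25}}  # for find(query)

# Toy compound index: entries sorted by gender first, then by age.
compound_index = sorted((p["gender"], p["age"], p["_id"]) for p in people)

def females_older_than(index, min_age):
    """Jump past all ("female", age <= min_age) entries, then read until gender changes."""
    start = bisect.bisect_right(index, ("female", min_age, float("inf")))
    ids = []
    for gender, age, doc_id in index[start:]:
        if gender != "female":
            break
        ids.append(doc_id)
    return ids

print(females_older_than(compound_index, 25))
```

Because the entries are ordered by gender and then by age, all matching entries sit next to each other and the search never looks at the male entries beyond the first one.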

Aggregation Framework

For aggregation, MongoDB has the Aggregation Framework. We create a pipeline to do the aggregation. A pipeline consists of multiple stages, and the output of every stage is passed to the next stage.

$match : for searching
$group : for grouping
$sort : for sorting
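A pipeline built from these three stages could look like the sketch below. The pipeline list is written in the form accepted by pymongo's collection.aggregate(). The plain-Python functions underneath only imitate what each stage does, so that the flow from stage to stage is visible without a server. The orders data is invented for illustration.

```python
from collections import defaultdict

# Hypothetical orders, invented for illustration.
orders = [
    {"city": "Jaipur", "status": "paid", "amount": 120},
    {"city": "Delhi", "status": "paid", "amount": 300},
    {"city": "Jaipur", "status": "cancelled", "amount": 50},
    {"city": "Jaipur", "status": "paid", "amount": 80},
]

# The pipeline as it could be passed to collection.aggregate(pipeline).
pipeline = [
    {"$match": {"status": "paid"}},
    {"$group": {"_id": "$city", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
]

def match_stage(docs):
    """Imitates $match: keep only the paid orders."""
    return [d for d in docs if d["status"] == "paid"]

def group_stage(docs):
    """Imitates $group: sum the amounts per city."""
    totals = defaultdict(int)
    for d in docs:
        totals[d["city"]] += d["amount"]
    return [{"_id": city, "total": total} for city, total in totals.items()]

def sort_stage(docs):
    """Imitates $sort with -1: highest total first."""
    return sorted(docs, key=lambda d: d["total"], reverse=True)

# Each stage receives the output of the previous one.
result = sort_stage(group_stage(match_stage(orders)))
print(result)
```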

In an enterprise we need a multi-node setup to avoid a single point of failure. If our single node goes down for any reason, the entire database becomes unavailable. Therefore, to avoid that risk, we always use a multi-node cluster.

Replicaset

We always keep copies of the data on multiple nodes, and such a group of nodes is known as a replica set.

In databases we can also store data on different nodes on the basis of groups and categories.

Clustering Architecture

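MongoDB also supports a clustering architecture, described in the workshop as a master-slave architecture. The master stores the metadata and has a router program that tells the client where the data it is looking for is located. (In a MongoDB sharded cluster, these roles are played by the config servers and the mongos router.)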

MongoDB Atlas

MongoDB Atlas is a fully managed cloud database developed by the same people who built MongoDB. It is a Mongo cloud that can set up an entire MongoDB database, and behind the scenes it uses cloud providers like AWS, GCP, and Azure to set up the clusters.

Reference Data Model

Sometimes some fields are common to many of our documents. Instead of repeating them as a field or embedded document in each main document, we put that information in a separate, centralized document and store a reference to it in the main document. This is known as the reference data model.
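The difference between embedding and referencing can be sketched with plain Python dictionaries standing in for documents. The employee and department data, and the department_id field name, are invented for illustration.

```python
# Hypothetical central documents, keyed by _id for easy lookup.
departments = {
    "d1": {"_id": "d1", "name": "Engineering", "floor": 3},
}

# Embedded model: the department details are repeated in every employee.
employees_embedded = [
    {"name": "Asha", "department": {"name": "Engineering", "floor": 3}},
    {"name": "Ravi", "department": {"name": "Engineering", "floor": 3}},
]

# Reference model: each employee stores only the department's _id.
employees_referenced = [
    {"name": "Asha", "department_id": "d1"},
    {"name": "Ravi", "department_id": "d1"},
]

def resolve(employee, departments):
    """Follow the reference to build the full view of one employee."""
    dept = departments[employee["department_id"]]
    return {
        "name": employee["name"],
        "department": {"name": dept["name"], "floor": dept["floor"]},
    }

# One change in the central document is seen by every employee that references it.
departments["d1"]["floor"] = 4
resolved = [resolve(e, departments) for e in employees_referenced]
print(resolved)
```

In the embedded model the same change would have to be made in every employee document, which is the duplication the reference model avoids.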
