Sep 28, 2017 | Tech Talk

Let’s move to NoSQL Databases with MongoDB – Mazarin

Article by Kaushal Senevirathne

NoSQL

Introduction to NoSQL

NoSQL databases provide a mechanism to store and retrieve data in a distributed database, and are mostly used with big data and real-time web applications. Non-relational databases have existed since the late 1960s, but they were not known as “NoSQL” back then; the name became popular much later. Even though some NoSQL databases support SQL-like query languages, NoSQL is not a replacement for SQL. It is rather a complementary addition to RDBMS and SQL. MongoDB, Bigtable, Redis, Neo4j, RavenDB, Cassandra, HBase and CouchDB are popular NoSQL databases available in the market.

 

Why do we need NoSQL?

Today’s web, mobile and IoT applications have one or more of the following characteristics.

  • Support large numbers of concurrent users (tens of thousands, perhaps millions)
  • Deliver highly responsive experiences
  • Are available at all times, with no downtime
  • Handle semi-structured and unstructured data
  • Rapidly adapt to changing requirements with frequent updates and new features

Since it was challenging to achieve these characteristics with typical relational databases, the requirement for NoSQL emerged.

 

CAP theorem & NoSQL

NoSQL databases are often described in terms of Brewer’s CAP theorem, which Eric Brewer presented in 2000. The theorem describes three guarantees that a distributed system may offer.

The CAP theorem covers three guarantees: Consistency, Availability and Partition Tolerance. It is impossible for a distributed system to provide all three simultaneously, so a combination of two is chosen. No distributed system is safe from network failures, so network partitioning generally has to be tolerated, and the real choice is between consistency and availability while a partition lasts. When choosing consistency over availability, the system will return an error or a time-out if particular information cannot be guaranteed to be up to date due to network partitioning. When choosing availability over consistency, the system will always process the query and try to return the most recent available version of the information, even if it cannot guarantee it is up to date due to network partitioning. When the distributed system is running normally without network failure, both availability and consistency can be satisfied.
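The trade-off described above can be illustrated with a small Python sketch. It is a toy model, not how any particular database is implemented: the `Replica` class, the "CP"/"AP" labels and the `StaleReadError` exception are names invented for this illustration.

```python
class StaleReadError(Exception):
    """Raised by a CP replica that cannot confirm its data is current."""


class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP" (hypothetical labels)
        self.value = None
        self.partitioned = False  # True while cut off from the other nodes

    def replicate(self, value):
        # Updates from the rest of the cluster only arrive when connected.
        if not self.partitioned:
            self.value = value

    def read(self):
        if self.partitioned and self.mode == "CP":
            # Consistency over availability: refuse rather than risk stale data.
            raise StaleReadError("cannot guarantee an up-to-date value")
        # Availability over consistency: answer with the latest value we have.
        return self.value


cp, ap = Replica("CP"), Replica("AP")
for node in (cp, ap):
    node.replicate("v1")
    node.partitioned = True
    node.replicate("v2")   # lost: the partition blocks replication

print(ap.read())           # answers, but the value may be stale
try:
    cp.read()
except StaleReadError as err:
    print("CP replica refused:", err)
```

Once the partition heals (`partitioned = False`) and replication catches up, both replicas can again answer with the current value.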

 

Types of NoSQL Databases

There are four main types of NoSQL databases, all designed for storing, retrieving, and managing information.

 

Key-Value Database

A key-value database stores each item of data under a unique key, which is used to find the record quickly. The database treats the value as opaque: there are no fields to update, so the entire value must be replaced when a change is made. Redis is one of the most popular key-value databases.
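As a rough sketch of that behaviour, the Python snippet below uses a plain dictionary in place of a real key-value server. The `put` and `get` helpers and the `user:42` key are made up for the example; they are not the API of Redis or any other product.

```python
import json

store = {}  # stands in for the database: key -> opaque value


def put(key, value):
    # The store does not look inside the value; it is serialized as one blob.
    store[key] = json.dumps(value)


def get(key):
    return json.loads(store[key])


put("user:42", {"name": "Ann", "city": "Colombo"})

# There is no "update one field" operation: read, modify, write back whole.
user = get("user:42")
user["city"] = "Kandy"
put("user:42", user)

print(get("user:42"))
```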

 

Graph Database

A graph database represents and stores data using nodes, edges and properties. Its strength is in traversing the connections between the nodes. However, many graph databases work best when all the data fits on one machine, which limits their scalability. Neo4j is a Java-based graph database.
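To make "traversing the connections" concrete, here is a minimal Python sketch of a breadth-first traversal over nodes and labelled edges. The data and the `reachable` function are illustrative assumptions; a real graph database would express this in its own query language.

```python
from collections import deque

# Nodes carry properties; edges are (source, label, target) triples.
nodes = {
    "alice": {"type": "person"},
    "bob": {"type": "person"},
    "carol": {"type": "person"},
    "dave": {"type": "person"},
}
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("carol", "FOLLOWS", "dave"),
]


def reachable(start, label, max_hops):
    """Breadth-first traversal along edges with the given label."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue
        for source, edge_label, target in edges:
            if source == node and edge_label == label and target not in seen:
                seen.add(target)
                queue.append((target, hops + 1))
    return seen - {start}


# Everyone alice can reach within two FOLLOWS hops.
print(sorted(reachable("alice", "FOLLOWS", 2)))
```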

 

Column Family Database

A column family database maps each row key to a value which is a set of columns. It is designed to store and process very large amounts of data distributed over many machines. Each column consists of a column name, a value and a timestamp. There are two types of column families: the standard column family, which contains only columns, and the super column family, which contains a map of columns.
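The layout of a standard column family can be sketched in Python as a nested dictionary. The `put_column` helper is hypothetical, and the rule that the newest timestamp wins is an assumption borrowed from how some column stores resolve conflicting writes.

```python
import time

# row key -> {column name -> (value, timestamp)}
standard_family = {}


def put_column(row_key, name, value, timestamp=None):
    """Write one column; the newest timestamp wins on conflict (assumed rule)."""
    ts = time.time() if timestamp is None else timestamp
    row = standard_family.setdefault(row_key, {})
    current = row.get(name)
    if current is None or ts >= current[1]:
        row[name] = (value, ts)


put_column("user:1", "name", "Ann", timestamp=1)
put_column("user:1", "email", "ann@example.com", timestamp=1)
put_column("user:1", "email", "ann@mazarin.example", timestamp=2)
put_column("user:1", "email", "old@example.com", timestamp=0)  # ignored: older

# Rows need not share the same columns.
put_column("user:2", "name", "Bob", timestamp=1)

print(standard_family["user:1"]["email"])
```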

 

Document Database

A document database is essentially a subclass of the key-value store that stores each record as a “document”. Unlike a relational database, it stores all the information for a given object in a single instance. Because the database understands the structure of a document, it can also query and index individual fields. MongoDB is a leading document database.

 

RDBMS vs NoSQL


SQL Schema vs NoSQL Schemaless

In SQL it is mandatory to define tables, fields and field types, while it is optional to define primary keys, foreign keys, indexes, triggers and stored procedures. The data structure is fixed, and the schema must be designed and implemented before any business logic.

In NoSQL it is not necessary to define the document design, collections and so on up front. The data structure is not fixed, and data can be added anywhere, at any time. This is better suited to projects where the initial data requirements are difficult to ascertain.
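The contrast can be tried out with Python's built-in `sqlite3` module standing in for an RDBMS and a plain list of dictionaries standing in for a document collection. This is only a sketch; the `users` table, the `twitter` field and the `rejected` flag are invented for the example.

```python
import sqlite3

# Relational side: the table and its columns must exist before any insert.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
db.execute("INSERT INTO users (id, name) VALUES (1, 'Ann')")

try:
    # A field that is not part of the schema is rejected.
    db.execute("INSERT INTO users (id, name, twitter) VALUES (2, 'Bob', '@bob')")
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# Document side: a collection is just a list of documents here, and each
# document can carry its own set of fields.
users = []
users.append({"_id": 1, "name": "Ann"})
users.append({"_id": 2, "name": "Bob", "twitter": "@bob"})

print(rejected, [sorted(doc) for doc in users])
```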

 

SQL vs NoSQL Scaling

An RDBMS is not designed to run efficiently on clusters. It is largely limited to scaling up, that is, adding more processors, memory, and storage to a single physical server. This becomes expensive as enterprises have to purchase larger servers, and it can result in downtime if the database has to be taken offline to perform hardware upgrades.

In contrast, NoSQL databases run well on clusters and scale out by adding more servers, on demand and without downtime. They were engineered to distribute reads, writes, and storage across machines, which makes them comparatively easy to install, configure, and scale.
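One simple way to picture how data is spread across servers is hash-based placement, sketched below in Python. The `shard_for` function and the server names are hypothetical, and modulo hashing is only one possible scheme: production systems typically use range-based chunks or consistent hashing so that fewer keys move when a server is added.

```python
import hashlib


def shard_for(key, servers):
    """Pick a server for a key by hashing the key (simple modulo placement)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


keys = [f"user:{i}" for i in range(1000)]

# The same keys spread over two servers, then over four.
for servers in (["db1", "db2"], ["db1", "db2", "db3", "db4"]):
    load = {name: 0 for name in servers}
    for key in keys:
        load[shard_for(key, servers)] += 1
    print(len(servers), "servers:", load)
```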

 

SQL Normalization vs NoSQL De-normalization

SQL data is read and written by disassembling and reassembling objects, which results in inefficiency, as illustrated in diagram 1.5-a.

On the other hand, NoSQL databases read and write data in formats including XML, YAML and JSON, as well as binary forms like BSON. This eliminates the object-relational impedance mismatch and the overhead of ORM frameworks, which leads to faster queries. Denormalized data is inefficient to update when the same values are duplicated in many places, but normalization techniques can also be used in NoSQL, as shown in diagram 1.5-b.

Figure 1: Normalization techniques (Source: https://www.couchbase.com/resources/why-nosql)

Introduction to MongoDB

MongoDB is an open source, cross-platform, document-oriented database that provides high performance, high availability and easy scalability. It has become one of the most popular NoSQL databases in the market since its first release in 2009. The MongoDB server is open source, which means users can install and use it free of charge. There are many MongoDB clients (drivers) available to connect applications written in different languages. According to the following report, MongoDB became the fastest growing NoSQL database.

Figure 2: NoSQL databases (Source: https://www.slideshare.net/mongodb/webinar-how-to-visually-explore-and-manipulate-your-mongodb-data)

 

Why use MongoDB?

There are many reasons to use MongoDB over a traditional RDBMS. MongoDB stores data as BSON, a binary encoding of JSON-like documents, so a document can be stored regardless of the number of attributes it has. It suits systems which need to maintain mixed types of data sets in a single collection. MongoDB servers can be easily configured for a cluster environment. A cluster supports a large number of concurrent connections using the resources of all its servers, and keeps the data highly available, which minimizes downtime. A cluster can also absorb fast data growth and high write volumes. MongoDB can select and process large data sets without slowing down the system or its operations, whereas with an RDBMS the data is often selected and processed in small batches in order to keep database performance at an optimal level.

 

Data Modeling

MongoDB doesn’t need a declared data structure as an RDBMS does; a collection has a dynamic schema. In practice, the documents in a collection usually share similar fields and structure. Since MongoDB has a flexible document structure, the data model that best fits the system requirements can be chosen.

Data in a collection can be modeled in two ways: the embedded (denormalized) data model and the normalized data model.

 

Embedded Data Model (AKA denormalized model)

In this model, all related data is contained in a single document. Embedding allows faster read operations than the normalized model, and allows related data to be read, written and updated with a single database operation.

As an example, consider the following diagram. The document has user id, user name, contact and access fields. The contact and access fields could be normalized into separate documents, but here they are maintained in the same document.

 

Figure 3: Embedded Data Model (Source: https://docs.mongodb.com/manual/core/data-model-design/)
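Since the diagram may not be visible here, the embedded model can be approximated in Python as a nested dictionary. The field values are placeholders chosen for illustration, not data taken from the figure.

```python
import json

# One document holds the user together with its contact and access data.
user = {
    "_id": 1,
    "username": "123xyz",
    "contact": {"phone": "123-456-7890", "email": "xyz@example.com"},
    "access": {"level": 5, "group": "dev"},
}

# One read returns everything that belongs to the user.
print(json.dumps(user, indent=2))

# One write changes the embedded data along with the parent document.
user["access"]["level"] = 6
```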

Normalized Data Model

This model keeps related data in multiple documents. The main document (parent) has a relationship with its sub-documents (children) through references. The normalized data model is useful for representing data in multiple hierarchies and nested arrays of data. MongoDB does not enforce foreign key constraints, so the application has to maintain the relationships and resolve the references when it performs CRUD operations.

Consider the following diagram as an example.

The document has user id, user name, contact and access fields. The contact and access fields can be considered as normalize-able data which is kept in separate documents.

Figure 4: Normalized Data Model (Source: https://docs.mongodb.com/manual/core/data-model-design/)
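The same idea in a small Python sketch: the user, contact and access data live in separate collections and are linked by a `user_id` reference that the application resolves itself. The collection names, field values and the `load_user` helper are assumptions made for this example.

```python
# Three separate "collections"; children point at the parent via user_id.
users = {1: {"_id": 1, "username": "123xyz"}}
contacts = [
    {"_id": 10, "user_id": 1, "phone": "123-456-7890", "email": "xyz@example.com"}
]
access = [{"_id": 20, "user_id": 1, "level": 5, "group": "dev"}]


def load_user(user_id):
    """Resolve the references by hand: one lookup per collection."""
    user = dict(users[user_id])
    user["contact"] = next(c for c in contacts if c["user_id"] == user_id)
    user["access"] = next(a for a in access if a["user_id"] == user_id)
    return user


print(load_user(1))
```

Reading a complete user now takes three lookups instead of one, which is the price paid for not duplicating the contact and access data.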

Install MongoDB using Docker

Creating a MongoDB instance using Docker is very simple. The following steps show how to install a MongoDB instance in Docker.

Create a Dockerfile for MongoDB

The following Dockerfile creates a Docker image of MongoDB version 3.0.1 on Ubuntu 14.04.

FROM   ubuntu:14.04
MAINTAINER chpa@mazarin.lk

# Import MongoDB public GPG key AND create a MongoDB list file
RUN apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
RUN echo "deb http://repo.mongodb.org/apt/ubuntu "$(lsb_release -sc)"/mongodb-org/3.0 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-3.0.list

# Update apt-get sources AND install MongoDB
RUN apt-get update && apt-get install -y mongodb-org=3.0.1 mongodb-org-server=3.0.1 mongodb-org-shell=3.0.1 mongodb-org-mongos=3.0.1 mongodb-org-tools=3.0.1

# Create the MongoDB data directory
RUN mkdir -p /data/db

# Expose port 27017 from the container to the host
EXPOSE 27017

# Set /usr/bin/mongod as the dockerized entry-point application
ENTRYPOINT ["/usr/bin/mongod"]

Build the Docker image.

To build the image, execute the following command in the directory where the Dockerfile is located. The trailing dot is the build context.

docker build --tag mazarin/mongo:v1 .

Run the created Docker image.

The following command creates a Docker container instance named “mongo” and publishes port 27017. The container’s /data/db directory is mounted from the data directory under the current host directory, so the data resides outside the container.

docker run --name mongo -p 27017:27017 -v $(pwd)/data:/data/db -d mazarin/mongo:v1

Now we have successfully created a Docker container running MongoDB version 3.0.1. Any MongoDB client can access this instance through the default MongoDB port 27017.
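Before pointing a client at the container, it can be handy to confirm that the published port accepts connections. The Python sketch below uses only the standard library; `is_port_open` is a helper written for this post, and it checks TCP reachability only, not that the listener is actually MongoDB.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 27017 is the port published by the docker run command above.
print("MongoDB port reachable:", is_port_open("localhost", 27017))
```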

 

Create a replica set using MongoDB

Scalability is considered a key element of NoSQL database design, and the MongoDB architecture is designed to address it. MongoDB uses replica sets to keep copies of the data on several servers, which provides redundancy, high availability and the option to read from secondaries. The steps below create a MongoDB replica set on a local machine using Docker.

1. Multiple MongoDB instances are needed to create a replica set. Run the Docker image three times with different ports and a separate data directory for each instance, since instances cannot share one data directory. The additional option “--replSet” indicates that the instance belongs to a replica set named “rs0”.

docker run --name mongo_001 -p 28001:27017 -v $(pwd)/data1:/data/db -d mazarin/mongo:v1 --replSet rs0

docker run --name mongo_002 -p 28002:27017 -v $(pwd)/data2:/data/db -d mazarin/mongo:v1 --replSet rs0

docker run --name mongo_003 -p 28003:27017 -v $(pwd)/data3:/data/db -d mazarin/mongo:v1 --replSet rs0

2. Log in to any MongoDB instance using a preferred client.

mongo --port 28001

3. Initiate the replica set from that shell. The instance shows as SECONDARY at first and automatically becomes PRIMARY within a few seconds.

rs.initiate()


4. Find the IP addresses of the other running MongoDB containers so that they can be added to the primary instance.

docker inspect mongo_002 mongo_003 | grep IPAddress

 

5. Add each secondary MongoDB instance to the primary instance, using the IP addresses found in the previous step.

rs.add("172.18.0.2")

 

6. Check the status of the replica-set, after adding secondary mongo instances

rs.status();

Now we have successfully created a mongodb replica set. Add some data to the PRIMARY and you can read the same data from the SECONDARY mongo server.
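As an alternative to adding members one at a time, `rs.initiate()` also accepts a configuration document that lists all members up front. The Python sketch below only builds and prints such a document; `replica_set_config` is a helper invented for this post, and the IP addresses are placeholders to be replaced with the ones reported by docker inspect.

```python
import json


def replica_set_config(name, hosts):
    """Build a replica set configuration document for the given hosts."""
    return {
        "_id": name,
        "members": [
            {"_id": index, "host": host} for index, host in enumerate(hosts)
        ],
    }


# Hypothetical container addresses; use the ones reported by docker inspect.
config = replica_set_config(
    "rs0", ["172.18.0.2:27017", "172.18.0.3:27017", "172.18.0.4:27017"]
)

# Paste the printed line into the mongo shell of one instance.
print("rs.initiate(" + json.dumps(config) + ")")
```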

 

Important

    • If the status of the cluster keeps showing STARTUP, check the host name of the primary server. If it is not an IP address, run the commands below on the primary server to correct it.
cfg = rs.conf()
cfg.members[0].host = ""
rs.reconfig(cfg)
rs.conf()
    • To read data from a SECONDARY server, you may have to execute the command below on that SECONDARY server.
rs.slaveOk()

References

  • https://en.wikipedia.org/wiki/Graph_database
  • http://data-magnum.com/lesson-5-key-value-stores-aka-tuple-stores/
  • http://www.getbreezenow.com/zza-mongo
  • https://10kloc.wordpress.com/tag/column-family/
  • https://www.couchbase.com/resources/why-nosql
  • https://www.slideshare.net/mongodb/webinar-how-to-visually-explore-and-manipulate-your-mongodb-data
  • https://docs.mongodb.com/manual/core/data-model-design/

Authors

  • Kaushal Senevirathne
  • Charith Padmasiri
  • Sirikumara Ranathunga
