
MongoDB Data Modeling and its Process

05th Sep, 2023

    Learn how to use data modelling in MongoDB when working with different types of data, and understand why data models matter and how they are used. The main challenge in data modelling is balancing the application's needs, the database engine's performance characteristics, and the data retrieval patterns. When designing data models, always consider how the application uses the data (i.e. queries, updates, and data processing) as well as the inherent structure of the data itself.

    What is Data Modeling in MongoDB?

    In general, data modelling is the analysis of data items in a database and how they relate to other objects in that database. For example, we can have a users collection and a profile collection in MongoDB: the users collection contains a list of usernames for a specific application, whereas the profile collection contains each user's profile settings.

    In data modelling, we must create a relationship that connects each user to the appropriate profile. In a nutshell, data modelling is the first step in database design, as well as the foundation for object-oriented programming. It also provides an indication of how the physical application will appear as development progresses. An example of an application-database integration architecture is shown below.

    [Figure: MongoDB Data Modelling]

    Types of Data Models in MongoDB (Document Structure)

    The structure of documents in MongoDB plays a significant role in determining which technique to use for a given set of data. In general, there are two types of MongoDB document model:

    • Embedded Data Model
    • Reference Data Model

    1. Embedded Data Model [with Example]

    In this case, related data is stored as a field value or an array within a single document. The main benefit of this method is that data is denormalized, making it possible to manipulate related data in a single database operation. As a result, the efficiency of CRUD operations improves, and fewer queries are required. Consider the following document as an example:

    {
        "_id" : ObjectId("5b98bfe7e8b9ab9875e4c80c"),
        "StudentName" : "Ishan Jain",
        "Settings" : {
            "location" : "Embassy",
            "ParentPhone" : 123987456,
            "bus" : "KAZ 450G",
            "distance" : "4",
            "placeLocation" : {
                "lat" : -0.376252,
                "lng" : 36.937389
            }
        }
    }

    In this set of data, we have a student with his name and some additional information. The Settings field holds an embedded object, and inside it the placeLocation field holds another embedded object with the latitude and longitude. All of this student's data has been compiled into a single document. If we need to get all of this student's information, we simply run:

    db.students.findOne({StudentName : "Ishan Jain"})
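
    MongoDB can also address a single embedded field with dot notation, for example "Settings.placeLocation.lat". The following is a minimal sketch in plain JavaScript, not MongoDB's own implementation, of how such a path walks through the embedded objects. The helper name getByPath is hypothetical, and the document is the sample student from above with the ObjectId left out so that the sketch runs without a database.

```javascript
// Sample embedded document (the ObjectId is omitted so no database is needed).
const student = {
  StudentName: "Ishan Jain",
  Settings: {
    location: "Embassy",
    ParentPhone: 123987456,
    bus: "KAZ 450G",
    distance: "4",
    placeLocation: { lat: -0.376252, lng: 36.937389 }
  }
};

// Hypothetical helper: follow a dot-separated path through nested objects.
// Returns undefined as soon as any step of the path is missing.
function getByPath(doc, path) {
  return path.split(".").reduce(
    (value, key) =>
      value !== null && typeof value === "object" ? value[key] : undefined,
    doc
  );
}

console.log(getByPath(student, "Settings.placeLocation.lat")); // -0.376252
console.log(getByPath(student, "Settings.bus"));               // KAZ 450G
```

    In the shell, the same path can be used directly as a filter key, as in db.students.findOne({ "Settings.bus" : "KAZ 450G" }).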

    Pros of Embedded Model

    1. Increased data access speed 
    2. Reduced data inconsistency 
    3. Reduced CRUD operations 

    Cons of Embedded Model

    1. Restricted document size
    2. Data duplication
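
    The document size restriction refers to MongoDB's limit of 16 MiB for a single BSON document, which an embedded array that keeps growing can eventually reach. As a rough illustration only, the sketch below estimates a document's size from its JSON text. The helper approximateSizeBytes and the trips array are hypothetical, and the real BSON size will differ from this estimate.

```javascript
// MongoDB limits a single BSON document to 16 MiB.
const MAX_BSON_BYTES = 16 * 1024 * 1024;

// Hypothetical helper: approximate a document's size by its UTF-8 JSON length.
// This is an estimate only; the actual BSON encoding differs.
function approximateSizeBytes(doc) {
  return new TextEncoder().encode(JSON.stringify(doc)).length;
}

// A student document that embeds a growing array (illustrative data).
const studentWithTrips = { StudentName: "Ishan Jain", trips: [] };
for (let i = 0; i < 1000; i++) {
  studentWithTrips.trips.push({ day: i, bus: "KAZ 450G", distance: "4" });
}

const estimatedSize = approximateSizeBytes(studentWithTrips);
console.log(estimatedSize < MAX_BSON_BYTES); // true: still far below the limit
```

    When an embedded array can grow without bound, moving its items into their own collection avoids running into the limit.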

    2. Reference Data Model [with Example]

    In this model, the related data is stored in separate documents, with a reference linking them. The sample data can be restructured as follows:

    User Document:

    {
        "_id" : xyz,
        "StudentName" : "Ishan Jain",
        "ParentPhone" : 123987456
    }

    Settings Document:

    {
        "id" : xyz,
        "location" : "Embassy",
        "bus" : "KAZ 450G",
        "distance" : "4",
        "lat" : -0.376252,
        "lng" : 36.937389
    }

    Although the documents are separate, they are linked because the id field of the settings document holds the same value as the _id field of the user document. The MongoDB data model has been normalized as a result. We must, however, issue additional queries to access information from a related document, which increases execution time.
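
    The extra lookup can be sketched in plain JavaScript as follows. This is an illustration under stated assumptions: in-memory arrays stand in for the two collections, the value 1 stands in for the xyz placeholder, and the helper name findStudentWithSettings is hypothetical.

```javascript
// In-memory arrays stand in for the two collections (no database is used).
const users = [
  { _id: 1, StudentName: "Ishan Jain", ParentPhone: 123987456 }
];
const settings = [
  { id: 1, location: "Embassy", bus: "KAZ 450G", distance: "4",
    lat: -0.376252, lng: 36.937389 }
];

// Hypothetical helper: resolve the reference with two separate lookups.
function findStudentWithSettings(name) {
  // Lookup 1: find the user document by name.
  const user = users.find((u) => u.StudentName === name);
  if (!user) return null;
  // Lookup 2: find the settings document whose id matches the user's _id.
  const userSettings = settings.find((s) => s.id === user._id) || null;
  return { ...user, Settings: userSettings };
}

console.log(findStudentWithSettings("Ishan Jain").Settings.bus); // KAZ 450G
```

    Against a real database, each find call here would be its own query, which is where the additional execution time comes from.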

    Pros of Reference Data Model

    1. Data consistency 
    2. Improved data integrity 
    3. Improved cache utilization 
    4. Improved flexibility 
    5. Faster writes
    6. Efficient hardware utilization

    Cons of Reference Data Model

    1. Multiple lookups
    2. Several queries must be issued to complete a single operation

    The Process of Data Modeling in MongoDB

    Good data modelling in MongoDB boosts database performance, but it requires balancing several factors, such as:

    • Patterns of data retrieval
    • Application requirements such as queries, updates, and data processing
    • The chosen database engine's performance characteristics
    • The data's own inherent structure

    Key Considerations for MongoDB Data Modeling

    When deciding on the best data model, there are several factors to consider. These aspects differ depending on the stage of the Data Lifecycle for which we are designing. These elements are as follows:

    1. Data Creation, Modification Speed, and Frequency: Small amounts of data should be captured quickly while maintaining consistency.
    2. Data Retrieval Speed: The ability to retrieve small or large amounts of data for reporting and analysis.
    3. ACID Properties: The atomicity, consistency, isolation, and durability of transactions.
    4. Business Scope: Whether the data involves one or more departments or business functions.
    5. Access to the Finest Grain of Data: Different data use cases may necessitate access to the finest level of detail or various levels of aggregation.

    Other factors may exist, but the ones mentioned above have a significant impact on the decision-making process for selecting the best data model.

    MongoDB Data Modeling Schema

    For a given set of data, a schema is essentially a skeleton of fields and the data type that each field should contain. In SQL databases, all rows in a table have the same columns, and each column holds its defined data type. MongoDB, however, comes with a flexible schema that doesn't require all documents to conform to the same structure.

    1. Flexible Schema

    In MongoDB, a flexible schema means that documents in a collection do not have to have the same fields, and that a field's data type can vary between documents. The main benefit of this concept is that it allows you to add new fields, delete existing ones, or change a field's value to a different type, resulting in a new structure for the document.
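
    To make this concrete, the sketch below summarises which value types each field takes across a handful of documents that could sit in the same collection. It is plain JavaScript working on an in-memory array, not a MongoDB feature; the helper fieldTypes and the sample documents are hypothetical.

```javascript
// Documents with different shapes, as a flexible schema allows (illustrative data).
const mixedClients = [
  { _id: 1, name: "Abhresh", phone: "+91 123987456", city: "Pune" },
  { _id: 2, name: "Bhavesh", city: "Kota" },   // no phone field
  { _id: 3, name: "Chetan", phone: 123987456 } // phone stored as a number
];

// Hypothetical helper: map each field name to the set of JavaScript types seen.
function fieldTypes(docs) {
  const summary = {};
  for (const doc of docs) {
    for (const [field, value] of Object.entries(doc)) {
      if (!summary[field]) summary[field] = new Set();
      summary[field].add(typeof value);
    }
  }
  return summary;
}

const summary = fieldTypes(mixedClients);
console.log([...summary.phone]); // the types seen for "phone"
console.log("city" in mixedClients[2]); // false: the third document has no city
```

    A summary like this is one way to see how far the documents in a collection have drifted apart before deciding whether to add validation rules.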

    2. Rigid Schema

    Even though documents may differ from one another, you may decide to enforce a rigid schema. A rigid schema requires all documents in a collection to have the same structure, which you enforce by creating document validation rules that protect data integrity during insert and update operations.

    3. Schema Validation Levels

    There are three levels of validation:

    • Strict: The default validation level in MongoDB. Validation rules are applied to all inserts and updates.
    • Moderate: Validation rules are applied to inserts and to updates of existing documents that already meet the validation criteria. Updates to existing documents that do not meet the criteria are not checked.
    • Off: Disables validation, so no validation is performed on the documents.


    Insert the data below into a clients collection.

    db.clients.insertMany([
       { "_id" : 1, "name" : "Abhresh", "phone" : "+91 123987456", "city" : "Pune", "status" : "Married" },
       { "_id" : 2, "name" : "Bhavesh", "city" : "Kota" }
    ]);

    If we apply the moderate validation level using:

    db.runCommand( {
       collMod: "clients",
       validator: { $jsonSchema: {
          bsonType: "object",
          required: [ "phone", "name" ],
          properties: {
             phone: {
                bsonType: "string",
                description: "must be a string and is required"
             },
             name: {
                bsonType: "string",
                description: "must be a string and is required"
             }
          }
       } },
       validationLevel: "moderate"
    } )

    With the moderate level, the validation rules are applied to updates of the document with _id 1, because it already meets all the criteria.

    Updates to the second document are not validated, because it lacks the required phone field and therefore does not meet the criteria.
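
    The rule for the three levels can be written down as a small decision function. This is a sketch of the behaviour described in this section, not MongoDB's internal code; the function name shouldValidate and its parameters are hypothetical.

```javascript
// Hypothetical helper mirroring the three validation levels described above.
// operation: "insert" or "update".
// existingDocIsValid: whether the stored document already satisfies the rules
// (only meaningful for updates).
function shouldValidate(level, operation, existingDocIsValid) {
  if (level === "off") return false;   // no validation at all
  if (level === "strict") return true; // every insert and update is checked
  if (level === "moderate") {
    // Inserts are always checked; updates only if the document was already valid.
    return operation === "insert" || existingDocIsValid === true;
  }
  throw new Error(`unknown validation level: ${level}`);
}

// Document 1 has both required fields; document 2 lacks "phone".
console.log(shouldValidate("moderate", "update", true));  // true  (_id 1)
console.log(shouldValidate("moderate", "update", false)); // false (_id 2)
console.log(shouldValidate("strict", "update", false));   // true
```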

    4. Schema Validation Actions

    Some inserts or updates will produce documents that violate the validation rules. The validation action determines how MongoDB responds when this occurs.

    MongoDB offers two actions for documents that fail to pass the validation rules:

    • Error: The default MongoDB action, which rejects any insert or update that does not meet the validation criteria.
    • Warn: This action logs the violation in the MongoDB log, but it does not prevent the insert or update operation from continuing.

    As an example:

    db.createCollection("students", {
       validator: { $jsonSchema: {
             bsonType: "object",
             required: [ "name", "gpa" ],
             properties: {
                name: {
                   bsonType: "string",
                   description: "must be a string and is required"
                },
                gpa: {
                   bsonType: [ "double" ],
                   minimum: 0,
                   description: "must be a double and is required"
                }
             }
       } },
       validationAction: "warn"
    } )

    If we try to insert a document like this:

    db.students.insertOne( { name: "Ishan", status: "Updated" } );

    Because the validation action is set to warn, the document will be saved even though gpa is a required field in the schema, and a warning will be recorded in the MongoDB log.
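
    The difference between the two actions can be sketched in plain JavaScript. This illustration checks only for missing required fields and uses an array as a stand-in for the MongoDB log; the helpers missingRequired and applyValidationAction are hypothetical and do not reproduce MongoDB's actual log format.

```javascript
// Hypothetical helper: list the required fields that a document lacks.
function missingRequired(doc, required) {
  return required.filter((field) => !(field in doc));
}

// Hypothetical helper: apply a validation action to a candidate document.
// Returns true when the write would be accepted.
function applyValidationAction(action, doc, required, log) {
  const missing = missingRequired(doc, required);
  if (missing.length === 0) return true; // document is valid
  if (action === "warn") {
    log.push(`validation warning: missing ${missing.join(", ")}`);
    return true;                         // the write continues
  }
  return false;                          // "error": the write is rejected
}

const warningLog = [];
const candidate = { name: "Ishan", status: "Updated" };
console.log(applyValidationAction("warn", candidate, ["name", "gpa"], warningLog));  // true
console.log(applyValidationAction("error", candidate, ["name", "gpa"], warningLog)); // false
console.log(warningLog.length); // 1: only the "warn" call logged a violation
```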


    Benefits of Using MongoDB Data Modelling

    Data modelling may appear to be an abstract process, far removed from the data analytics projects that generate measurable value for the organisation. However, data modelling is necessary foundational work that allows data to be stored more easily in a database and has a positive impact on data analytics. These are some of the key advantages of data modelling and the reasons organizations will continue to use it:

    1. Higher data quality

    The visual representation of requirements and business rules enables developers to anticipate what could lead to large-scale data corruption before it occurs. Data models enable developers to define rules that monitor data quality, reducing the possibility of errors.

    2. Improved internal communication about data and data processes

    Creating data models forces the business to define how data is generated and moved across applications.

    3. Reduced development and maintenance costs

    Data modelling exposes errors and inconsistencies early in the process, making them easier and less expensive to fix.

    4. Improved performance

    An organised database operates more efficiently; a well-designed data model spares the database from exhaustive searching, so queries return results more quickly.

    Data Modelling Trend

    You now understand how data modelling in MongoDB differs from relational DBMSs, particularly in terms of schema. Organizations are increasingly involving business users to solve the problem of data quality. Modern data preparation platforms now enable business users to prepare data for specific analytic initiatives themselves, rather than burdening developers with the task of building data models and resolving all data quality issues.


    Abhresh Sugandhi


    Abhresh is a corporate trainer with a decade of experience in technical training, including virtual webinars and instructor-led sessions. He has created courses, tutorials, and articles for organizations. He is also the founder of Nikasio.com, which offers multiple services in technical training, project consulting, content development, etc.
