MongoDB Interview Questions

Work through these basic and advanced MongoDB interview questions to prepare for a MongoDB developer role. They cover the most-asked topics, including how MongoDB supports efficient querying against array fields and how recursive queries are supported within MongoDB. With these questions you will understand the structure of MongoDB and its different applications, which will help you qualify for roles such as MEAN Stack Developer, Backend Engineer, Front-end Developer and many more.

Beginner

db.<collection>.find().skip(n * (p - 1)).limit(n)

Note: n is the page size and p is the page number, counting from 1. For the first page the skip value is 0, so skip() is not needed.

limit(n) limits the documents returned from the cursor to n, and skip(k) skips the first k documents from the cursor.
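The arithmetic behind skip/limit pagination can be sketched in plain JavaScript. This is only an illustration on an in-memory array, not a call against a real collection; pageWindow and fetchPage are hypothetical helper names.

```javascript
// Hypothetical helper: turn a 1-based page number and a page size
// into the values that would be passed to skip() and limit().
function pageWindow(pageNumber, pageSize) {
  if (!Number.isInteger(pageNumber) || pageNumber < 1) {
    throw new RangeError("pageNumber must be an integer >= 1");
  }
  if (!Number.isInteger(pageSize) || pageSize < 1) {
    throw new RangeError("pageSize must be an integer >= 1");
  }
  return { skip: (pageNumber - 1) * pageSize, limit: pageSize };
}

// Emulate the cursor on an in-memory array instead of a real collection.
function fetchPage(docs, pageNumber, pageSize) {
  const { skip, limit } = pageWindow(pageNumber, pageSize);
  return docs.slice(skip, skip + limit);
}

const pagedDocs = [1, 2, 3, 4, 5, 6, 7].map((n) => ({ _id: n }));
const secondPage = fetchPage(pagedDocs, 2, 3); // documents with _id 4, 5, 6
```

With a real collection the two values would go to skip() and limit() on the cursor, usually together with a sort() so that the pages are stable between requests.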

This can be achieved in MongoDB using the $type operator. A null value, i.e., the BSON type null, has the type number 10. Using this type number, only those documents whose value is null can be retrieved.

Take the example of the below two documents in the startup collection

{ _id: 1, name: "XYZ Tech", website: null }
{ _id: 2, name: "ABC Pvt Ltd" }

The query { website : { $type: 10 } } will retrieve only those documents where the website is null; in the above case it would be the startup "XYZ Tech".

Note: The query { website : null }, on the other hand, will match documents where the website is null as well as documents where the website field does not exist. For the above collection data, this query will return both startups.
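The difference between the two filters can be emulated on plain JavaScript objects. This is a simplified sketch for top-level fields only, not how the server evaluates queries; isExplicitNull and isNullOrMissing are hypothetical names.

```javascript
// In-memory copies of the two startup documents from the example.
const startupDocs = [
  { _id: 1, name: "XYZ Tech", website: null },
  { _id: 2, name: "ABC Pvt Ltd" },
];

// Behaves like { field: { $type: 10 } }: the field must exist and hold null.
function isExplicitNull(doc, field) {
  return Object.prototype.hasOwnProperty.call(doc, field) && doc[field] === null;
}

// Behaves like { field: null }: the field holds null or is absent.
function isNullOrMissing(doc, field) {
  return !Object.prototype.hasOwnProperty.call(doc, field) || doc[field] === null;
}

const explicitNulls = startupDocs.filter((d) => isExplicitNull(d, "website"));
const nullOrMissing = startupDocs.filter((d) => isNullOrMissing(d, "website"));
```

Here explicitNulls holds only "XYZ Tech", while nullOrMissing holds both startups.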

The $exists operator, when set to true, matches only those documents that contain the field specified in the query, including documents where the value of that field is null.

For the following documents in the employee collection

{ _id: 1, name: "Jonas", linkedInProfile: null }
{ _id: 2, name: "Williams" }

The query { linkedInProfile: { $exists: true } } will return only the employee "Jonas".

Advanced

Compound indexes support not only queries that match all the index fields but also queries on the index prefixes.

Consider the following compound index

{ "accountHolder": 1, "accountNumber": 1, "currency": 1 }

The index prefixes are

{ accountHolder: 1 }
{ accountHolder: 1, accountNumber: 1 }

The query plan can use this index if the query has the following fields

  • accountHolder
  • accountHolder and accountNumber
  • accountHolder and accountNumber and currency

Note: The order of fields in the index definition is very important. The index can be used only when the query includes the leading (leftmost) fields of the index; the order in which the fields are written in the query itself does not matter.
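The prefix rule can be illustrated with a small plain-JavaScript sketch. It is a deliberate simplification: the real query planner is more nuanced (for example, a query on accountHolder and currency can still use the accountHolder part of the index). indexPrefixes and matchesIndexPrefix are hypothetical names.

```javascript
// List the proper prefixes of a compound index, given its fields
// in definition order.
function indexPrefixes(indexFields) {
  const prefixes = [];
  for (let i = 1; i < indexFields.length; i++) {
    prefixes.push(indexFields.slice(0, i));
  }
  return prefixes;
}

// Simplified check: do the queried fields form a leading prefix of the
// index (or the full index)? The order of names in queryFields is irrelevant.
function matchesIndexPrefix(indexFields, queryFields) {
  const wanted = new Set(queryFields);
  if (wanted.size === 0 || wanted.size > indexFields.length) return false;
  return indexFields.slice(0, wanted.size).every((f) => wanted.has(f));
}

const accountIndex = ["accountHolder", "accountNumber", "currency"];
const prefixes = indexPrefixes(accountIndex);
const reordered = matchesIndexPrefix(accountIndex, ["accountNumber", "accountHolder"]);
const missingLeader = matchesIndexPrefix(accountIndex, ["accountNumber"]);
```

reordered is true because both leading fields are present, and missingLeader is false because the query leaves out accountHolder.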

The $addToSet operator should be used with the $each modifier for this. The $each modifier allows the $addToSet operator to add multiple values to the array field.

For example, startups are tagged with the technology skills that they excel in

{ _id: 5, name: "XYZ Technology", skills: [ "Big Data", "AI", "Cloud" ] }

Now the startup needs to be updated with additional skills

db.startups.update(
   { _id: 5 },
   { $addToSet: { skills: { $each: [ "Machine Learning", "RPA" ] } } }
)

The resultant document after update()

{ _id: 5, name: "XYZ Technology", skills: [ "Big Data", "AI", "Cloud", "Machine Learning", "RPA" ] }

Note: $addToSet does not guarantee any particular ordering of elements in the modified set. Duplicate items will not be added.
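The effect of $addToSet combined with $each can be emulated in plain JavaScript. This sketch only handles primitive values such as strings and is not the server's implementation; addEachToSet is a hypothetical name.

```javascript
// Emulates { $addToSet: { <field>: { $each: values } } } for primitive values:
// every value is added only if the array does not already contain it.
function addEachToSet(array, values) {
  const result = [...array];
  for (const value of values) {
    if (!result.includes(value)) {
      result.push(value);
    }
  }
  return result;
}

const currentSkills = ["Big Data", "AI", "Cloud"];
// "AI" is already present, so it is skipped.
const updatedSkills = addEachToSet(currentSkills, ["Machine Learning", "AI", "RPA"]);
```

Without $each, the whole array [ "Machine Learning", "RPA" ] would be added to skills as one nested element, which is why the modifier is needed.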

When "fast reads" are the single most important criterion, embedded documents can be the best way to model one-to-one and one-to-many relationships.

Consider the example of certifications awarded to an employee. In the example below, the certification data is embedded in the employee document, which is a denormalized way of storing data

{
   _id: "10",
   name: "Sarah Jones",
   certifications: [
      {
         certification: "Certified Project Management Professional",
         certifying_auth: "PMI",
         date: "06/06/2015"
      },
      {
         certification: "Oracle Certified Professional",
         certifying_auth: "Oracle Corporation",
         date: "10/10/2017"
      }
   ]
}

In a normalized form, there would be a reference to the employee document from the certificate document, for example

{
   employee_id: "10",
   certification: "Certified Project Management Professional",
   certifying_auth: "PMI",
   date: "06/06/2015"
}

Embedded documents are best used when the entire relationship data needs to be frequently retrieved together. The data can be retrieved via a single query, which avoids a second lookup and is hence much faster.

Note: Embedded documents should not grow unbounded, otherwise they can slow down both read and write operations (a single document is also limited to 16 MB). Other factors like consistency and frequency of data change should be considered before making the final design decision for the application.
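The relationship between the two models can be sketched in plain JavaScript: the embedded form is what results when the normalized certificate documents are folded into the employee document. embedCertifications is a hypothetical helper, and the second normalized certificate is assumed from the embedded example.

```javascript
// Normalized form: one employee document plus separate certificate
// documents that reference it through employee_id.
const employeeDoc = { _id: "10", name: "Sarah Jones" };
const certificateDocs = [
  {
    employee_id: "10",
    certification: "Certified Project Management Professional",
    certifying_auth: "PMI",
    date: "06/06/2015",
  },
  {
    employee_id: "10",
    certification: "Oracle Certified Professional",
    certifying_auth: "Oracle Corporation",
    date: "10/10/2017",
  },
];

// Fold the referencing documents into the employee (denormalized form).
function embedCertifications(employee, certificates) {
  const certifications = certificates
    .filter((c) => c.employee_id === employee._id)
    .map(({ employee_id, ...rest }) => rest); // drop the now-redundant reference
  return { ...employee, certifications };
}

const embeddedEmployee = embedCertifications(employeeDoc, certificateDocs);
```

Reading embeddedEmployee needs one lookup, whereas the normalized form needs the employee lookup plus a second lookup by employee_id.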

MongoDB provides db.collection.explain(), cursor.explain() and the explain command to give information on the query plan. The results of explain contain a lot of information, the key items being

  • rejectedPlans, other plans considered by the database (if any)
  • winningPlan, the plan selected by the query optimizer for execution
  • executionStats, detailed execution statistics for the winning plan (returned when explain is run in the executionStats or allPlansExecution verbosity mode)
  • IXSCAN stage, if the query planner selects an index, the explain result will include this stage. This is one of the key things to look for when analyzing the query plan for performance optimization
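Checking an explain result for an IXSCAN stage can be sketched as a walk over the nested plan stages. The explainResult object below is a hypothetical, heavily trimmed example (the index name is made up), and collectStages is a hypothetical helper; real output has many more fields.

```javascript
// Hypothetical, trimmed-down explain() result.
const explainResult = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "skills_1" },
    },
    rejectedPlans: [],
  },
};

// Collect stage names by following inputStage / inputStages recursively.
function collectStages(plan, stages = []) {
  if (!plan) return stages;
  stages.push(plan.stage);
  if (plan.inputStage) collectStages(plan.inputStage, stages);
  for (const child of plan.inputStages || []) collectStages(child, stages);
  return stages;
}

const planStages = collectStages(explainResult.queryPlanner.winningPlan);
const usesIndexScan = planStages.includes("IXSCAN");
```

A plan that shows COLLSCAN instead of IXSCAN scans the whole collection. Some newer server versions nest the winning plan differently, so the path to the first stage may need adjusting.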

Recursive queries can be performed within a collection using $graphLookup, which is an aggregation pipeline stage.

If a collection has a self-referencing field, like the classic example of the manager of an employee, then a query to get the entire reporting structure under the manager "David" would look like this

db.employees.aggregate( [
   { $match: { name: "David" } },
   {
      $graphLookup: {
         from: "employees",
         startWith: "$name",
         connectFromField: "name",
         connectToField: "manager",
         as: "Reporting Structure"
      }
   }
] )

For the following documents in the employee collection,

{ "_id" : 4, "name" : "David", "manager" : "Sarah" }
{ "_id" : 5, "name" : "John", "manager" : "David" }
{ "_id" : 6, "name" : "Richard", "manager" : "John" }
{ "_id" : 7, "name" : "Stacy", "manager" : "Richard" }

the above $graphLookup operation returns the document for "David" with the following 3 documents in its "Reporting Structure" array (in no guaranteed order)

{ "_id" : 5, "name" : "John", "manager" : "David" }
{ "_id" : 6, "name" : "Richard", "manager" : "John" }
{ "_id" : 7, "name" : "Stacy", "manager" : "Richard" }

The hierarchy starts with "David", whose name is passed to startWith, and from there the documents for each member of that reporting hierarchy are fetched recursively by matching each name found against the manager field.
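The recursion can be emulated in plain JavaScript as a breadth-first traversal. This is a rough sketch of collecting everyone who reports, directly or indirectly, to "David" by following names into the manager field. It is not the server's algorithm, and the graphLookup function below is a hypothetical helper rather than a MongoDB API.

```javascript
const employeeDocs = [
  { _id: 4, name: "David", manager: "Sarah" },
  { _id: 5, name: "John", manager: "David" },
  { _id: 6, name: "Richard", manager: "John" },
  { _id: 7, name: "Stacy", manager: "Richard" },
];

// Start from startValue, find documents whose connectToField equals it,
// then repeat with each found document's connectFromField value.
function graphLookup(docs, startValue, connectFromField, connectToField) {
  const found = [];
  const seenIds = new Set(); // guards against cycles in the data
  let frontier = [startValue];
  while (frontier.length > 0) {
    const next = [];
    for (const doc of docs) {
      if (frontier.includes(doc[connectToField]) && !seenIds.has(doc._id)) {
        seenIds.add(doc._id);
        found.push(doc);
        next.push(doc[connectFromField]);
      }
    }
    frontier = next;
  }
  return found;
}

// Everyone below David: follow name -> manager.
const reportsOfDavid = graphLookup(employeeDocs, "David", "name", "manager");
// Everyone above Stacy: start from her manager and follow manager -> name.
const managersOfStacy = graphLookup(employeeDocs, "Richard", "manager", "name");
```

Swapping the two connect fields reverses the direction of the traversal, which is how the same stage can answer both "who reports to X" and "who does X report to".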

The $graphLookup looks like this for a query on the employees collection where "manager" is the self-referencing field

db.employees.aggregate( [
   { $match: { name: "David" } },
   {
      $graphLookup: {
         from: "employees",
         startWith: "$name",
         connectFromField: "name",
         connectToField: "manager",
         as: "Reporting Structure"
      }
   }
] )

The value of as, which is "Reporting Structure" in this case, is the name of the array field added to each output document. It contains the documents traversed by the $graphLookup, starting from that output document.

For the following documents in the employee collection,

{ "_id" : 4, "name" : "David", "manager" : "Sarah" }
{ "_id" : 5, "name" : "John", "manager" : "David" }
{ "_id" : 6, "name" : "Richard", "manager" : "John" }
{ "_id" : 7, "name" : "Stacy", "manager" : "Richard" }

the pipeline returns a single output document, and its "Reporting Structure" would look like this (the order of the array elements is not guaranteed)

{
   "_id" : 4, "name" : "David", "manager" : "Sarah",
   "Reporting Structure" : [
      { "_id" : 5, "name" : "John", "manager" : "David" },
      { "_id" : 6, "name" : "Richard", "manager" : "John" },
      { "_id" : 7, "name" : "Stacy", "manager" : "Richard" }
   ]
}

Yes, there is a much simpler way of achieving this without having to do it programmatically. The $unwind operator deconstructs an array field, resulting in one document for each element.

Consider user "John" with multiple addresses

{ "_id" : 1, "name" : "John", "addresses" : [ "Permanent Addr", "Temporary Addr", "Office Addr" ] }

db.users.aggregate( [ { $unwind : "$addresses" } ] )

would result in 3 documents, one for each of the addresses

{ "_id" : 1, "name" : "John", "addresses" : "Permanent Addr" }
{ "_id" : 1, "name" : "John", "addresses" : "Temporary Addr" }
{ "_id" : 1, "name" : "John", "addresses" : "Office Addr" }
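What $unwind does by default can be emulated in a few lines of plain JavaScript. This is a sketch of the default behaviour only (no options such as preserving empty arrays), and unwind is a hypothetical helper name.

```javascript
// Emit one copy of each document per array element. Documents where the
// field is missing, null, or an empty array produce no output.
function unwind(docs, field) {
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    if (value === undefined || value === null) continue;
    const elements = Array.isArray(value) ? value : [value];
    for (const element of elements) {
      out.push({ ...doc, [field]: element });
    }
  }
  return out;
}

const userDocs = [
  { _id: 1, name: "John", addresses: ["Permanent Addr", "Temporary Addr", "Office Addr"] },
];
const unwoundUsers = unwind(userDocs, "addresses"); // 3 documents, one per address
```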

MongoDB supports capped collections, which are fixed-size collections. Once the allocated space is filled up, space is made for new documents by removing (overwriting) the oldest documents. The insertion order is preserved, and if a query does not specify any ordering then the results are returned in insertion order. The oplog.rs collection is a capped collection, which ensures that the collection of logs does not grow infinitely.
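The overwrite behaviour can be pictured with a small in-memory buffer. This is only an analogy: the limit here is a document count, whereas a real capped collection is primarily limited by its size in bytes. CappedBuffer is a hypothetical class.

```javascript
// Fixed-capacity buffer that discards the oldest entry when full
// and always returns entries in insertion order.
class CappedBuffer {
  constructor(maxDocs) {
    this.maxDocs = maxDocs;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    if (this.docs.length > this.maxDocs) {
      this.docs.shift(); // drop the oldest document
    }
  }

  find() {
    return [...this.docs];
  }
}

const eventLog = new CappedBuffer(3);
for (let seq = 1; seq <= 5; seq++) {
  eventLog.insert({ seq });
}
const remaining = eventLog.find(); // entries with seq 3, 4, 5
```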

A query that is able to return its entire result using only the index is called a covered query. This is one of the optimization techniques that can be used with queries for faster retrieval of data. A query can be a covered query only if

  • all the fields in the query are part of an index and
  • all the fields that are returned are also part of the same index (so _id must be excluded in the projection unless it is part of the index)

Since everything is part of the index, there is no need for the query to check the documents for any information.
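The two conditions can be written down as a simplified check in plain JavaScript. This sketch ignores further restrictions that apply in practice (for example around array fields), and isCoveredQuery is a hypothetical helper; the field names are only illustrative.

```javascript
// indexFields: fields of the index
// queryFields: fields used in the query filter
// projection:  projection document, e.g. { accountNumber: 1, _id: 0 }
function isCoveredQuery(indexFields, queryFields, projection) {
  const indexed = new Set(indexFields);
  const included = Object.keys(projection).filter((f) => projection[f] === 1);
  // Without any included field the whole document is returned.
  if (included.length === 0) return false;
  // _id is returned by default unless it is explicitly excluded.
  const returned = projection._id === 0 ? included : [...included, "_id"];
  return (
    queryFields.every((f) => indexed.has(f)) &&
    returned.every((f) => indexed.has(f))
  );
}

const coveringIndex = ["accountHolder", "accountNumber"];
const covered = isCoveredQuery(coveringIndex, ["accountHolder"], { accountNumber: 1, _id: 0 });
const notCovered = isCoveredQuery(coveringIndex, ["accountHolder"], { accountNumber: 1 });
```

notCovered is false only because _id is returned implicitly, which is a common reason a query that looks covered still has to read the documents.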

Multikey indexes can be used for supporting efficient querying against array fields. MongoDB creates an index key for each element in the array.

Note: MongoDB will automatically create a multikey index if any indexed field is an array, no separate indication required.

Consider the startups collection with an array of skills

{ _id: 1, name: "XYZ Technology", skills: [ "Big Data", "AI", "Cloud" ] }

Multikey indexes allow searching on the values in the skills array

db.startups.createIndex( { skills :  1 } )

The query db.startups.find( { skills :  "AI" } ) will use this index on skills to return the matching document.
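The idea of one index key per array element can be sketched with an in-memory map. This is only a conceptual picture of a multikey index, not its actual storage format. buildMultikeyIndex is a hypothetical helper, and the second startup document is made up for illustration.

```javascript
const startupsWithSkills = [
  { _id: 1, name: "XYZ Technology", skills: ["Big Data", "AI", "Cloud"] },
  { _id: 2, name: "ABC Pvt Ltd", skills: ["AI", "RPA"] }, // hypothetical document
];

// Map every distinct array element to the _id values of the documents
// that contain it.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const value of new Set(values)) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

const skillsIndex = buildMultikeyIndex(startupsWithSkills, "skills");
const idsWithAI = skillsIndex.get("AI"); // _id values 1 and 2
```

Looking up "AI" becomes a single key lookup instead of a scan over every skills array.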

The 3 array projection operators, i.e., $, $elemMatch and $slice, are used to limit the contents of an array returned in the query results.

For example, db.startups.find( {}, { skills: { $slice: 2 } } ) selects the first 2 items from the skills array for each document returned.

Starting in version 4.0, multi-document transactions are possible in MongoDB (on replica sets; support for sharded clusters was added in version 4.2). Before this version, atomic operations were possible only on a single document.

With embedded documents and arrays, data in the documents are generally denormalized and stored in a single structure. With this as the recommended data model, MongoDB's single document atomicity is sufficient for most of the applications. 

Multi-document transactions now enable the remaining small percentage of applications which require this (due to related data spread across documents) to depend on the database to handle transactions automatically rather than implement this programmatically into their application (which can cause performance overheads).

Note: In most cases the performance cost of multi-document transactions is higher, hence they should be used judiciously.

In the case of an error, whether the remaining operations get processed or not is determined by whether the bulk operation is ordered or unordered. If it is ordered, then MongoDB will not process the remaining operations, whereas if it is unordered, MongoDB will continue to process the remaining operations.

Note: "ordered" is an optional Boolean parameter that can be passed to bulkWrite(); by default it is true.
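The difference in error handling can be emulated with plain functions standing in for the write operations. This sketch runs everything sequentially, whereas the server is free to execute unordered operations in any order. runBulk is a hypothetical helper and not the bulkWrite() API.

```javascript
// Run a list of operations; on error either stop (ordered) or
// record the error and continue (unordered).
function runBulk(operations, { ordered = true } = {}) {
  const succeeded = [];
  const errors = [];
  for (let i = 0; i < operations.length; i++) {
    try {
      succeeded.push(operations[i]());
    } catch (err) {
      errors.push({ index: i, message: err.message });
      if (ordered) break;
    }
  }
  return { succeeded, errors };
}

// Stand-ins for three writes; the second one fails.
const bulkOps = [
  () => "insert A",
  () => {
    throw new Error("duplicate key");
  },
  () => "insert C",
];

const orderedResult = runBulk(bulkOps); // stops after the failure
const unorderedResult = runBulk(bulkOps, { ordered: false }); // continues past it
```

In the ordered run only "insert A" succeeds; in the unordered run "insert A" and "insert C" both succeed and the failure is reported alongside them.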

The MongoDB Enterprise version includes auditing capability, and it is fairly easy to set up. Some salient features of auditing in MongoDB are

  • DML, DDL as well as authentication and authorization actions can be captured.
  • Logging every event will impact performance, usage of audit filters is recommended to log only specific events.
  • Audit logs can be written in multiple formats and to various destinations: to the console, to syslog, or to a file (JSON / BSON). Performance-wise, writing to a file in BSON format is better than in JSON format.
  • The file can be passed to the MongoDB utility bsondump for a human readable output.

Note: Auditing adds performance overhead and the amount of overhead is determined by a combination of the several factors listed above. The specific needs of the application should be taken into account to arrive at the optimal configuration.

Once selected, the shard key is difficult to change: older MongoDB versions did not allow changing it at all, and resharding a collection in newer versions (5.0 onward) is an expensive operation. Hence it should be chosen after a lot of consideration. The distribution of a collection's documents across the shards of the cluster is based on the shard key. Effective chunk distribution is important for efficient querying and writing, and it depends directly on the shard key. That is why choosing the right shard key up front is of utmost importance.

When any text content within a document needs to be searchable, all the string fields of the document can be indexed using the $** wildcard specifier.

db.articles.createIndex( { "$**" : "text" } )

Note: Any new string field added to the document after creating the index will automatically be indexed. When the data is huge, wildcard text indexes will have an impact on performance and hence should be used with due consideration.

Description

MongoDB is an open-source NoSQL database that uses a document-oriented data model and a non-structured query language. It overcame one of the biggest pitfalls of traditional database systems, namely scalability. MongoDB is used by some of the biggest companies in the world and offers a distinctive set of features for handling unstructured data.

MongoDB is used by companies across multiple domains. The research found that 26,929 companies are using it. These companies are most often found in the United States and in the Computer Software industry, and typically have 10-50 employees and a revenue of 1 million to 10 million dollars.

There is a huge demand for professionals who are qualified and certified in both the basics and the advanced features of MongoDB, and they can expect a promising career. Organizations around the world are using MongoDB to meet the fast-changing requirements of their customers.

The MongoDB interview questions and answers are prepared by experienced industry experts and can prove very useful for newcomers as well as experienced professionals who want to become a MongoDB Developer. They will help you strengthen your technical skills, prepare for a new job test and quickly revise the concepts. Going through them will give you in-depth knowledge and help you ace your MongoDB interview.

To relieve you of the worry and burden of preparing for your upcoming interviews, we have compiled the above MongoDB interview questions with answers prepared by industry experts. These common interview questions on MongoDB will help you ace your MongoDB interview.

Learning MongoDB will definitely give a boost to your career because the demand for MongoDB in the market is increasing at a tremendous pace. All the best!
