In this article, you will learn about one of the leading NoSQL databases - MongoDB. You will understand the basics of MongoDB and how data is stored in a NoSQL database, but most of the article will focus on the data types that are supported by MongoDB.
MongoDB is a cross-platform, document-oriented NoSQL database. It is known for its high scalability, high availability, and strong performance compared to a similar SQL database such as MySQL.
In a document database such as MongoDB, data is stored as a set of key-value pairs. Here is an example.
"name": "Knowledgehut"
When related key-value pairs are stored together as a set, the set is known as a document. Here is an example of a document that contains data about an employee.
{ "employee_name": "John Doe", "employee_skills": "UI Design", "employee_salary": 40000, "employee_status": true, }
In the document above, you can see that we have stored multiple values for an employee. This is very similar to how we store data within a row in a typical RDBMS. Similar documents are stored together within a collection. You can think of collections as the NoSQL equivalent of an RDBMS table, with some key differences that we are not going to discuss in this article.
In the above document, you can see that we have 4 different key-value pairs. The values can be of different types, for example, in this case the employee_name and employee_skills have the values of String type, employee_salary is of the number type and the employee_status is of the boolean type.
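As a rough illustration, plain JavaScript can report the type of each value in the employee document. This is only a sketch: `typeof` reports JavaScript types, which are coarser than MongoDB's types (for example, it cannot tell an integer from a double), and the variable names are chosen here for illustration.

```javascript
// The employee document from above, written as a plain JavaScript object.
const employee = {
  employee_name: "John Doe",
  employee_skills: "UI Design",
  employee_salary: 40000,
  employee_status: true,
};

// Map each key to the JavaScript type of its value.
const fieldTypes = Object.fromEntries(
  Object.entries(employee).map(([key, value]) => [key, typeof value])
);

console.log(fieldTypes);
```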
Having these (and more) data types in MongoDB allows us to store the data in a more efficient format and also perform highly efficient and robust queries on the stored data.
Using the correct data type for storing the data fields in a document is crucial to the success of the database system. Here are some of the most used data types available in MongoDB.
We will have a look at all of these with examples but before that, let’s have a look at JSON and BSON to understand how MongoDB stores the data.
JSON stands for JavaScript Object Notation. It is a very common format used by APIs and web services to return the data to the client. This format is widely used because of its simplicity and ease of parsing. Most modern programming languages do not need an additional application layer to parse JSON data.
JSON objects are simple associative containers, where the data is stored as a set of key-value pairs. In this case, a key is mapped to a value (which can be a number, string, boolean, null, array, or even another object).
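To see which value types survive a trip through JSON, here is a small sketch using JavaScript's built-in `JSON.stringify` and `JSON.parse`. The object and its values are made up for illustration. Note that JSON itself has no function type, so the `greet` method is silently dropped during serialization.

```javascript
// An illustrative object mixing JSON-compatible values with a function.
const original = {
  name: "Knowledgehut",
  rating: 4.5,
  active: true,
  tags: ["nosql", "mongodb"],
  nested: { level: 1 },
  notes: null,
  greet() {
    return "hi"; // functions are not part of JSON
  },
};

// Serialize to JSON text, then parse it back into an object.
const jsonText = JSON.stringify(original);
const parsed = JSON.parse(jsonText);

console.log(jsonText);
console.log("greet" in parsed); // false: the function was not serialized
```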
MongoDB also stores the data as JSON documents but the JSON data is binary encoded. This results in BSON. BSON simply stands for Binary JSON. BSON’s binary structure encodes type and length information, which allows it to be parsed much more quickly and therefore delivers better performance.
In a nutshell, MongoDB stores data in BSON format both internally, and over the network, but that doesn’t mean you can’t think of MongoDB as a JSON database. Anything you can represent in JSON can be natively stored in MongoDB, and retrieved just as easily in JSON.
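To make the "type and length information" concrete, here is a minimal sketch of a BSON encoder that handles only two value types, strings and 32-bit integers. It assumes a Node.js runtime (for `Buffer`), and the function names `encodeElement` and `encodeDocument` are hypothetical helpers written for this article, not part of any MongoDB library. Real applications should rely on the official drivers for encoding.

```javascript
// Encode one key-value pair as a BSON element: type byte, key as a
// NUL-terminated string, then the value in a type-specific layout.
function encodeElement(key, value) {
  const name = Buffer.concat([Buffer.from(key, "utf8"), Buffer.from([0x00])]);
  if (typeof value === "string") {
    const text = Buffer.from(value, "utf8");
    const length = Buffer.alloc(4);
    length.writeInt32LE(text.length + 1); // byte length includes the trailing NUL
    return Buffer.concat([Buffer.from([0x02]), name, length, text, Buffer.from([0x00])]);
  }
  if (Number.isInteger(value) && value >= -(2 ** 31) && value < 2 ** 31) {
    const num = Buffer.alloc(4);
    num.writeInt32LE(value);
    return Buffer.concat([Buffer.from([0x10]), name, num]);
  }
  throw new TypeError("this sketch only handles strings and 32-bit integers");
}

// A BSON document is: total size (int32), the elements, and a final NUL byte.
function encodeDocument(doc) {
  const body = Buffer.concat(
    Object.entries(doc).map(([key, value]) => encodeElement(key, value))
  );
  const size = Buffer.alloc(4);
  size.writeInt32LE(4 + body.length + 1);
  return Buffer.concat([size, body, Buffer.from([0x00])]);
}

const encodedDoc = encodeDocument({ hello: "world" });
console.log(encodedDoc.length); // 22
console.log(encodedDoc.toString("hex"));
```

Because every document and string starts with its length, a parser can skip over fields it does not need without scanning for delimiters, which is one reason BSON can be traversed faster than JSON text.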
Let’s have a look at each data type offered by MongoDB with examples and understand the best use-cases for them.
{ "employee_name": "John Doe", "employee_skills": "UI Design", "employee_salary": 40000, "employee_status": true, }
The above document has two keys whose values are of the String type: employee_name and employee_skills. Strings are the simplest values and are used to represent a sequence of characters. BSON strings are UTF-8 encoded.
{ "employee_name": "John Doe", "employee_skills": "UI Design", "employee_salary": 40000, "employee_status": true, }
The key employee_salary stores a whole number and is therefore of the type Integer. BSON distinguishes between 32-bit and 64-bit integers.
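{ "employee_name": "John Doe", "employee_skills": "UI Design", "employee_score": 97.67, "employee_status": true, }
The key employee_score stores a floating-point value and is therefore of the type Double.
{ "employee_name": "John Doe", "employee_skills": "UI Design", "employee_score": 97.67, "employee_status": true, }
The key employee_status stores either true or false and is therefore of the type Boolean. Booleans use less storage than an integer or string and avoid any unexpected side effects of comparison.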
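One such side effect is easy to reproduce in plain JavaScript. The sketch below compares a real boolean with the string "false" (the variable names are illustrative). Any non-empty string is truthy, so a status stored as a string can be misread by application code.

```javascript
// The same status stored as a boolean and as a string.
const asBoolean = { employee_status: false };
const asString = { employee_status: "false" };

// A real boolean behaves as expected in a condition.
const labelFromBoolean = asBoolean.employee_status ? "active" : "inactive";

// The non-empty string "false" is truthy, so it is treated as true.
const labelFromString = asString.employee_status ? "active" : "inactive";

console.log(labelFromBoolean); // inactive
console.log(labelFromString); // active
```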
{ "employee_name": "John Doe", "employee_skills": ["UI Design", "Graphic Design", "2D Animation"], "employee_score": 97.67, "employee_status": true, }
In the above example, the employee_skills field contains an array in which every value is a String.
Here is another example where instead of an array of a simple type (String), documents are embedded within the array.
{ "item_code": "1234-ABCD", "item_price": 49.99, "item_stock": [{ "warehouse": "Warehouse A", "qty": 1200 }, { "warehouse": "Warehouse B", "qty": 900 }], }
In the above document, the field item_stock contains an array of embedded documents.
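Once such a document is read into application code, the embedded documents can be processed like any other array. Here is a small sketch in plain JavaScript that totals the stock across warehouses. The variable names are illustrative.

```javascript
// The item document from above, as a plain JavaScript object.
const stockItem = {
  item_code: "1234-ABCD",
  item_price: 49.99,
  item_stock: [
    { warehouse: "Warehouse A", qty: 1200 },
    { warehouse: "Warehouse B", qty: 900 },
  ],
};

// Sum the qty field of every embedded document.
const totalQty = stockItem.item_stock.reduce((sum, entry) => sum + entry.qty, 0);

// Collect the warehouse names.
const warehouses = stockItem.item_stock.map((entry) => entry.warehouse);

console.log(totalQty); // 2100
console.log(warehouses); // [ 'Warehouse A', 'Warehouse B' ]
```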
Here is an example of how a date is stored in a document.
{ "student_name": "Bob Stan", "student_dob": ISODate("2006-02-10T10:50:42.389Z"), "student_marks": 78.98 }
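In the above example, the stored date can be easily converted to a readable format using JavaScript's new Date("2006-02-10T10:50:42.389Z") constructor. The printed form depends on the local time zone of the machine. On a machine set to India Standard Time, it returns the following output.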
Fri Feb 10 2006 16:20:42 GMT+0530 (India Standard Time)
Internally, Date objects are stored as a signed 64-bit integer representing the number of milliseconds since the Unix epoch (Jan 1, 1970).
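JavaScript's own `Date` uses the same milliseconds-since-epoch representation, so the idea can be sketched without a database. The example below converts the date of birth from the document above to milliseconds and back. It uses `toISOString()`, which always prints in UTC, so the result does not depend on the local time zone.

```javascript
// Parse the ISO string used in the student document.
const dob = new Date("2006-02-10T10:50:42.389Z");

// Milliseconds elapsed since 1970-01-01T00:00:00Z.
const millis = dob.getTime();

// Rebuild the date from the integer alone.
const restored = new Date(millis);

console.log(millis);
console.log(restored.toISOString()); // 2006-02-10T10:50:42.389Z
```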
{ "item_code": "1234-ABCD", "item_price": 49.99, "item_dimensions": { "item_height": 1200, "item_width": 100, "item_depth": 900, }, "item_availability": true, }
In the above example, the item_dimensions field is an embedded document as it contains its own set of key-value pairs. This field therefore is of the type Object.
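MongoDB queries address fields inside an embedded document with dot notation, for example "item_dimensions.item_height". The sketch below mimics that lookup on a plain JavaScript object. The `getByPath` helper is hypothetical and written only for this illustration. It is not a MongoDB API.

```javascript
// The item document from above, as a plain JavaScript object.
const dimensionedItem = {
  item_code: "1234-ABCD",
  item_price: 49.99,
  item_dimensions: { item_height: 1200, item_width: 100, item_depth: 900 },
  item_availability: true,
};

// Hypothetical helper: walk a dotted path one key at a time and return
// undefined as soon as a step is missing.
function getByPath(doc, path) {
  return path
    .split(".")
    .reduce((current, key) => (current != null ? current[key] : undefined), doc);
}

console.log(getByPath(dimensionedItem, "item_dimensions.item_height")); // 1200
console.log(getByPath(dimensionedItem, "item_dimensions.item_weight")); // undefined
```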
Here is what the timestamp value looks like in the document when it is queried.
{ "item_code": "1234-ABCD", "item_price": 49.99, "item_created": Timestamp(1412180887, 1), "item_availability": true, }
The timestamp data type is generally used to keep track of when a document was created or updated. The new Timestamp() function is used during insertion, and the server automatically fills in the current timestamp for the field.
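A BSON timestamp is a 64-bit value: the high 32 bits hold seconds since the Unix epoch, and the low 32 bits hold an incrementing ordinal for operations within the same second. The sketch below decodes the two arguments of Timestamp(1412180887, 1) from the example in plain JavaScript. The variable names are illustrative, and the packing uses `BigInt` because the combined value does not fit in a 32-bit integer.

```javascript
// The two parts of Timestamp(1412180887, 1).
const tsSeconds = 1412180887;
const tsIncrement = 1;

// The first part is seconds since the epoch, so it converts to a date.
const tsDate = new Date(tsSeconds * 1000);
console.log(tsDate.toISOString()); // 2014-10-01T16:28:07.000Z

// Pack both parts into one 64-bit value, then unpack them again.
const packed = (BigInt(tsSeconds) << 32n) | BigInt(tsIncrement);
const unpackedSeconds = Number(packed >> 32n);
const unpackedIncrement = Number(packed & 0xffffffffn);

console.log(unpackedSeconds, unpackedIncrement);
```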
{ "item_code": "1234-ABCD", "item_price": 49.99, "item_color": null, "item_availability": true, }
For most querying purposes, this is similar to the following document, where the field is completely absent. A query for a null item_color matches both documents.
{ "item_code": "1234-ABCD", "item_price": 49.99, "item_availability": true, }
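In MongoDB, an equality filter such as { item_color: null } matches documents where the field is null as well as documents where the field is missing. The sketch below mimics that behavior on plain JavaScript objects. The `isNullOrMissing` helper is hypothetical and exists only for this illustration.

```javascript
// One document stores an explicit null, the other omits the field.
const withNull = { item_code: "1234-ABCD", item_price: 49.99, item_color: null, item_availability: true };
const withoutField = { item_code: "1234-ABCD", item_price: 49.99, item_availability: true };

// Hypothetical helper: true when the field is null or not present at all.
function isNullOrMissing(doc, field) {
  return !(field in doc) || doc[field] === null;
}

console.log(isNullOrMissing(withNull, "item_color")); // true
console.log(isNullOrMissing(withoutField, "item_color")); // true

// The two cases are still distinguishable when that matters.
console.log("item_color" in withNull); // true
console.log("item_color" in withoutField); // false
```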
Here is an example.
{ "_id": ObjectId("5349b4ddd2781d08c09890f3"), "item_code": "1234-ABCD", "item_price": 49.99, "item_availability": true, }
If you do not specify an _id field explicitly, MongoDB automatically adds one to every document, with a generated value of the ObjectId type.
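An ObjectId is a 12-byte value, usually shown as 24 hexadecimal characters, and its first 4 bytes are a creation timestamp in seconds since the Unix epoch. The sketch below extracts that timestamp from the example value using plain JavaScript string handling. The variable names are illustrative. Official drivers expose this through their own ObjectId helpers.

```javascript
// The 24-character hexadecimal form of the example ObjectId.
const objectIdHex = "5349b4ddd2781d08c09890f3";

// The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
const createdSeconds = parseInt(objectIdHex.slice(0, 8), 16);
const createdAt = new Date(createdSeconds * 1000);

console.log(createdSeconds);
console.log(createdAt.toISOString());
```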
Here is an example.
{ "_id": ObjectId("5349b4ddd2781d08c09890f3"), "item_code": "1234-ABCD", "item_price": 49.99, "item_availability": true, "item_picture": BinData(1, "wekud3298eyx2398ey293..."), }
The second argument to BinData is the base64 representation of the binary content.
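Base64 is simply a text encoding of raw bytes. As a small sketch, assuming a Node.js runtime (for `Buffer`), the example below encodes the first four bytes of a PNG file signature to base64 and decodes them back. The byte values are chosen for illustration and are unrelated to the truncated string in the document above.

```javascript
// Four raw bytes: the start of a PNG file signature.
const rawBytes = Buffer.from([0x89, 0x50, 0x4e, 0x47]);

// Encode the bytes as base64 text, as it would appear inside BinData(...).
const base64Text = rawBytes.toString("base64");

// Decode the text back into the original bytes.
const decodedBytes = Buffer.from(base64Text, "base64");

console.log(base64Text); // iVBORw==
console.log(decodedBytes.equals(rawBytes)); // true
```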
{ "item_code": "1234-ABCD", "item_price": 49.99, "item_color": undefined, "item_availability": true, }
The Undefined type is deprecated in MongoDB 4.4.
{ "item_code": "1234-ABCD", "item_price": 49.99, "item_color": undefined, "item_prefix": /%_Y675%/, }
BSON defines two types for JavaScript code: JavaScript, for functions without closures, and JavaScript with Scope, for functions with closures. JavaScript with Scope is deprecated in MongoDB 4.4.
These are the most prominent data types in MongoDB. BSON supports more data types than JSON. Over time, some older and rarely used data types are deprecated or removed, while support for newer types improves. This is an ongoing process.