What is Node JS Process?

How to use the global process module

Node.js is built in C++ with the V8 engine embedded in it, providing an environment to execute JavaScript outside the browser. Many of the accessible global methods are actually wrappers around methods which make calls to core C libraries.

Node.js has many built-in global identifiers that are available to developers. There are also some which are accessible at the module level via module inheritance.

A few of the global objects are:

  • global - It is a namespace accessible across the application. Setting a property on this namespace makes it accessible throughout the running process.
  • process - Process is a module which provides interaction with the current Node.js process.
  • console - Console is a module mostly used for logging information or errors. It wraps the STDIO functionality of a process.
  • setTimeout(), clearTimeout(), setInterval(), clearInterval() - All these can be categorized as timer functions.

Some of the globals accessible via module inheritance are module, exports, __filename, __dirname, require() etc.

In this article, we attempt to understand more about the ‘process’ object and its details with examples. The ‘process’ object is a global which is an instance of EventEmitter and can be accessed directly. It can also be accessed explicitly via require:

const process = require('process');

The process object has a property ‘argv’, which is an array containing the arguments passed to the node process.

Create a simple index.js file and let’s log process.argv:

console.log(process.argv) 

Type ‘node index.js’ in the terminal. On pressing Enter, you should see output like the following:

[
  '/Users/***/.nvm/versions/node/v12.20.1/bin/node',
  '/Users/***/index.js'
]

Now, let’s pass in some extra parameters, say ‘node index.js test’, and you should see them appended to the array.
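
For example, ‘node index.js test’ would output something like the below (the paths will differ on your machine):

[
  '/Users/***/.nvm/versions/node/v12.20.1/bin/node',
  '/Users/***/index.js',
  'test'
]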
Also note that process has ‘process.stdout’ and ‘process.stderr’, which help us send messages to the standard output channel and the standard error channel respectively.
In fact, console.log internally does process.stdout.write(msg + '\n').
console.log('Hello world') is the same as process.stdout.write('Hello world' + '\n').
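
As a quick illustration (a minimal sketch), the snippet below writes one line to each channel; if you redirect output with ‘node index.js > out.log’, the stderr line still shows up in the terminal because it goes to the error channel:

process.stdout.write('This goes to standard output' + '\n');
process.stderr.write('This goes to standard error' + '\n');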
Files can be read as a stream, and we can pipe this to process.stdout. For example, replace the content of index.js with the code below.

var fs = require('fs');
fs.createReadStream(__filename).pipe(process.stdout);

Running ‘node index.js’ should display the content of the file.

Another interesting thing about Node.js is that when it has nothing left to do, it exits the process. Let’s understand this with an example.

setTimeout(() => { 
process.stdout.write('Executed after 1000 ms' + '\n'); 
}, 1000)

This one waits for 1 second, outputs ‘Executed after 1000 ms’, and then the process terminates.

If we want the process to run forever, we can replace the setTimeout with setInterval, which executes the callback repeatedly after every interval. Then the only way to exit is by pressing ‘Ctrl+C’, or if the process crashes.
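
For example, a minimal sketch with setInterval:

setInterval(() => {
    process.stdout.write('Executed after 1000 ms' + '\n');
}, 1000);

This prints the message every second until you press ‘Ctrl+C’.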

To get a quick walkthrough of the properties, methods and events on the process object, add ‘console.log(process)’ to index.js and run node index.js. Most of them are self-explanatory from their names.

{
version: 'v12.20.1', // current version of node
versions: {…}, // gives insight about the node and its core components like V8 engine version 
arch: 'x64', 
platform: 'darwin', 
release: {…}, // details of node source and version of lts. 
moduleLoadList: [...], // details of modules available with node. 
binding: [Function: binding], 
_events: [Object: null prototype] {  
newListener: [Function: startListeningIfSignal], // whenever a new listener is added  
removeListener: [Function: stopListeningIfSignal], // existing listener is removed  
warning: [Function: onWarning],  
 SIGWINCH: [Function]  
},  
_eventsCount: 4,  
_maxListeners: undefined,  
domain: null,  
_exiting: false,  
config: {  
target_defaults: {…},  
variables: {...}  
},  
cpuUsage: [Function: cpuUsage],  
resourceUsage: [Function: resourceUsage],  
memoryUsage: [Function: memoryUsage],  
kill: [Function: kill],  
exit: [Function: exit],  
openStdin: [Function],  
getuid: [Function: getuid],  
geteuid: [Function: geteuid],  
getgid: [Function: getgid],  
getegid: [Function: getegid],  
getgroups: [Function: getgroups],  
allowedNodeEnvironmentFlags: [Getter/Setter],  
assert: [Function: deprecated],  
features: {…},  
setUncaughtExceptionCaptureCallback: [Function: setUncaughtExceptionCaptureCallback],  
hasUncaughtExceptionCaptureCallback: [Function: hasUncaughtExceptionCaptureCallback],  
emitWarning: [Function: emitWarning],  
nextTick: [Function: nextTick],  
stdout: [Getter],  
stdin: [Getter],  
stderr: [Getter],  
abort: [Function: abort],  
umask: [Function: wrappedUmask],  
chdir: [Function: wrappedChdir],  
cwd: [Function: wrappedCwd],  
initgroups: [Function: initgroups],  
setgroups: [Function: setgroups],  
setegid: [Function],  
seteuid: [Function], 
setgid: [Function], 
setuid: [Function], 
env: {…}, // environment details for the node application 
title: 'node', 
argv: [  
   '/Users/srgada/.nvm/versions/node/v12.20.1/bin/node', 
    '/Users/srgada/index.js' 
 ], 
execArgv: [], 
pid: 29708, 
ppid: 19496, 
execPath: '/Users/srgada/.nvm/versions/node/v12.20.1/bin/node', 
debugPort: 9229, 
argv0: 'node', 
mainModule: Module {…} // details of the main starting file or module. Deprecated in recent versions; use require.main instead
 }

Let’s take a look at a few of the properties which are most used or required.

  • pid – gives the process id
  • platform – the OS platform, e.g. linux or darwin
  • version – the node version
  • title – process name; by default it is node and can be changed
  • execPath – path of the node executable
  • argv – arguments passed (see the sketch after this list)
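
A minimal sketch that logs these properties (the values in the comments are examples and will differ on your machine):

console.log(process.pid);      // e.g. 29708
console.log(process.platform); // e.g. 'darwin'
console.log(process.version);  // e.g. 'v12.20.1'
console.log(process.title);    // 'node' by default
console.log(process.execPath); // path to the node binary
console.log(process.argv);     // arguments passed to the process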

Some common methods are:

  • exit – exits the process; accepts the exit code as an argument (a short sketch of exit and cwd follows the nextTick example below).
  • cwd – gets the current working directory; to change it we can use ‘chdir’.
  • nextTick – as the name suggests, it places the callback passed to it at the front of the next iteration of the event loop. It is different from setTimeout with a 0 ms delay, since nextTick callbacks run before any timers.
process.nextTick(() => {  
   console.log('Got triggered in the next iteration of event loop');  
});  
setTimeout(() => {  
   console.log("Even after nextTick is executed");  
}, 0);  
console.log("First text to be printed"); 

Output: 

First text to be printed
Got triggered in the next iteration of event loop
Even after nextTick is executed
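
For exit and cwd/chdir, a minimal sketch (the '/tmp' target is just an example and assumes a Unix-like system):

console.log('Current directory: ' + process.cwd());
process.chdir('/tmp');  // change the working directory
console.log('New directory: ' + process.cwd());
process.exit(0);        // exit with code 0 (success)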

EVENTS: 
To log or perform any cleanup before exiting the process, we can hook into the ‘exit’ event, which is raised when process.exit is invoked or the event loop has no more work to do.

console.log(process.argv);
process.on('exit', () => {  
   console.log('Perform any clean up like saving or releasing any memory');  
});

‘exit’ is fired after the event loop is terminated. As a result, we can’t perform any async work in the handler. So if you want to perform asynchronous calls, like saving content to a db, hook into ‘beforeExit’ instead, which is emitted when Node.js has emptied its event loop and has no additional work scheduled.

process.on('beforeExit', code => {
    // Can make asynchronous calls
    setTimeout(() => {
      console.log(`Process will exit with code: ${code}`)
      process.exit(code)
    }, 1000)
});
process.on('exit', code => {
    // Only synchronous calls
    console.log(`Process exited with code: ${code}`)
});
console.log('After this, process will try to exit');

Another event, ‘uncaughtException’, as the name suggests, is raised when there is an unhandled exception in the application. By default, when an unhandled exception occurs, the Node.js application prints the stack trace and exits.

process.on('exit', () => {  
    console.log('Perform any clean up like saving or releasing any memory');  
});  
process.on('uncaughtException', (err) => {  
    console.error('An unhandled exception is raised. Look at stack for more details');  
    console.error(err.stack);
    process.exit(1);
});
var test = {};  
//raises an exception.  
test.unKnownObject.toString();

Output

An unhandled exception is raised. Look at stack for more details
TypeError: Cannot read property 'toString' of undefined
at Object.<anonymous> (/Users/srgada/index.js:10:20)
at Module._compile (internal/modules/cjs/loader.js:999:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:1027:10)
at Module.load (internal/modules/cjs/loader.js:863:32)
at Function.Module._load (internal/modules/cjs/loader.js:708:14)
at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:60:12)
at internal/main/run_main_module.js:17:47
Perform any clean up like saving or releasing any memory

Similar to ‘uncaughtException’, there is a newer event called ‘unhandledRejection’. It is raised when a promise is rejected and no rejection handler is attached to it.
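
A minimal sketch of hooking into it (the error message is just an example):

process.on('unhandledRejection', (reason) => {
    console.error('Unhandled rejection:', reason);
    process.exit(1);
});
// A rejected promise with no .catch() handler triggers the event.
Promise.reject(new Error('No handler attached'));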

In both cases, it is expected that the application will crash and should not continue, since it might be in an undefined state. If you are wondering why someone would hook into these events, it is to perform synchronous cleanup of allocated resources (e.g. file descriptors, handles, etc.) before shutting down the process.

Note: the ‘beforeExit’ event is not fired when there is an ‘uncaughtException’ or when process.exit is called explicitly.

Signal Events: Signals are events delivered by the operating system to the Node.js process. The most common among them are SIGTERM and SIGINT, both related to process termination. Signals are not available in worker threads. Let’s look at an example for SIGINT:

setInterval(() => {
    console.log('continued process');
}, 1000);
process.on('SIGINT', signal => {
    console.log(`Process ${process.pid} has been interrupted`)
    process.exit(0)
});

In the terminal, execute node index.js. This process runs continuously without any exit criteria because of setInterval.

Press Ctrl+C: the ‘SIGINT’ event is raised to the node application and captured in the handler. Because of the process.exit call in the handler, the process exits.
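
SIGTERM can be handled the same way; it is the signal process managers typically send to request a shutdown, for example via ‘kill <pid>’ from another terminal on a Unix-like system:

process.on('SIGTERM', signal => {
    console.log(`Process ${process.pid} received SIGTERM`);
    process.exit(0);
});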

Node.js runs as a single-threaded process. In some cases, you may want specific logic to run in a child process rather than in the main one, so that if any crash happens, the main process stays alive.

Taking the previous example of displaying the content of the index.js file, let’s do it this time with the help of ‘child_process’ module.

var exec = require('child_process').exec;
exec('cat index.js', (err, stdout, stderr) => {
    console.log(stdout);
});

Note: cat is a binary available on macOS and Linux. This may vary based on your operating system.
‘spawn’ on the child_process module is similar to exec, but it gives more granular control over how the processes are executed.
Let’s spin off a child process from the parent process and pass data from the child process to the parent process.

var spawn = require('child_process').spawn;
if (process.argv[2] === 'childProcess')
{
   console.log('Child process is running');
} else {
   var child = spawn(process.execPath, [__filename, 'childProcess']);
   child.stdout.on('data', (data) => {
      console.log('from child:', data.toString());
   })
}

Overview:

  • At line 6, spawn is provided with the process to execute; the second parameter is the array of arguments passed to it.
  • Since we want to spin off another node process, we rely on ‘process.execPath’.
  • ‘__filename’ is the name of the current file, i.e. index.js, and the second argument is ‘childProcess’.
  • When the child process is spawned [like node index.js childProcess], the condition at line 2 is satisfied and the child writes its message to stdout, which flows to the parent process.
  • The parent process captures the child’s stdout on the ‘data’ event and prints the same to its own stdout.

 Output: 

from child: Child process is running

As we learnt earlier, stdout is a stream, and we can pipe child.stdout to process.stdout directly instead of listening to the ‘data’ event.

Replace lines 7, 8 and 9 with this:

child.stdout.pipe(process.stdout);

Another shorthand version is to provide a third parameter to spawn, without the need to pipe, as shown below. [It basically inherits all stdio from the parent and does the piping for you.]

var child = spawn(process.execPath, [__filename, 'childProcess'], {
    stdio: 'inherit'
});

Note that each child process is self-contained, and data is not shared across multiple child processes or with the parent.

What if you want to transfer data, or control the child process, for example terminate it if needed?

The third parameter of spawn contains the stdio option, which is basically an array of stdin, stdout and stderr. Passing null takes the defaults. We can pass a fourth item, ‘pipe’, which opens an extra channel (file descriptor 3) for sending data from the child process.

var spawn = require('child_process').spawn;
if (process.argv[2] === 'childProcess') {
    var net = require('net');
    var pipe = new net.Socket({ fd: 3 });
    pipe.write('Terminate me');
} else {
    var child = spawn(process.execPath, [__filename, 'childProcess'], {
        stdio: [null, null, null, 'pipe']
    });
    child.stdio[3].on('data', (data) => {
        if (data.toString() === 'Terminate me') {
            console.log('Terminated child process');
            child.kill();
        }
    });
}

From the above code snippet, we can see that the child creates a socket to the parent over file descriptor 3 and sends the data. The parent listens on that channel and performs the required operation, i.e. terminates the child process.
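
For Node-to-Node children there is also ‘child_process.fork’, which sets up a message channel for you. The sketch below is an alternative to the socket approach above (not part of the original example) and achieves the same termination handshake with send() and the ‘message’ event:

var fork = require('child_process').fork;
if (process.argv[2] === 'childProcess') {
    process.send('Terminate me');  // child: message the parent over the IPC channel
} else {
    var child = fork(__filename, ['childProcess']);
    child.on('message', (msg) => { // parent: listen for messages from the child
        if (msg === 'Terminate me') {
            console.log('Terminated child process');
            child.kill();
        }
    });
}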

Conclusion

Node.js is single-threaded with non-blocking I/O, and works great as a single process. But what if we want to scale up and have a distributed application? Regardless of how performant the server is, a single thread can only support a limited load. To overcome this, Node.js has to work with multiple processes, and even across multiple machines.
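
The built-in ‘cluster’ module is one way to do this on a single machine; a minimal sketch that forks one worker per CPU core:

var cluster = require('cluster');
var os = require('os');
if (cluster.isMaster) {
    // fork one worker process per CPU core
    os.cpus().forEach(() => cluster.fork());
} else {
    console.log(`Worker ${process.pid} started`);
}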

Sumanth Reddy

Author

Full stack, UI Architect with 14+ years of experience in web, desktop and mobile application development, with strong JavaScript/.NET programming skills.

Strong experience in the Microsoft tech stack and certified as an OCI Associate.

Go-to guy for integration of applications and for building a mobile app by zeroing in on the tech stack. Mainly experienced in engineering-based IIoT products and Marketing Automation products deployed on-premise and on-cloud.

Join the Discussion

Your email address will not be published. Required fields are marked *

Suggested Blogs

The 2021 Learning Path To Becoming a Full Stack Web Developer

Full stack developer roles are among the hottest careers in the tech space now. These talented folks can develop a whole product from scratch. A full stack developer is a combination of Front-end developer and Backend developer. These two in themselves are full time jobs and most people make careers out of one of them. So, we will start with Front-end roadmap and then go to Back-end roadmap. A person interested in becoming a Full-stack developer needs to have proficiency in both the front end and back-end tools, just like I started as a Front-end developer and later on become a Full stack developer by mastering JavaScript backend technologies and databases.The demand for Full Stack Web DeveloperThe demand for Full stack developers is the highest in early-stage startups, where they want to create a Minimum Viable Product at the earliest to showcase to the investors. It is also a nice skill to have in addition to frontend technologies or backend technologies alone, since an employer prefers people with both skills.There are a lot of technologies to learn to be a Full-Stack developer. We will discuss about them in the coming sections.   List of technologies to master to become a Full-Stack developer A full-stack developer is actually a combination of Frontend developer and Backend developer. We need to master both, and both have different Roadmaps. Let’s start with the basics. The frontend is the web-site which we see and it is primarily made with HTML and CSS.  JavaScript was also used earlier but nowadays, it is created with JavaScript frameworks like ReactJS, Angular or Vue. All these frameworks require one to learn the basics of HTML, CSS, & JavaScript. So, we need to learn the basics followed by at least one framework.In the backend we have a lot of technologies and databases also. So, we need to choose one backend framework from Java (Spring Framework), JavaScript (NodeJS) etc and then also learn databases. Databases are divided into two categories, which is NoSQL(MongoDB) and SQL(PostgreSQL, MySQL, Oracle) databases. So, you need to choose one of the databases.We are also required to know about DevOps, which is a practice of harmonizing development and operations whereby the entire pipeline from development, testing, deployment, continuous integration and feedback is automated. The knowledge of either AWS or Azure based cloud ecosystem is required, and also CI/CD like Jenkins and containerizing & orchestrating applications using Docker and Kubernetes.1. Frontend RoadmapLearn the BasicsPlease refer to the attached figure for Front-end roadmap, as we will be referring to this throughout this article. We have to start our journey by learning HTML, CSS and JavaScript which is the base for a web-app or website. HTML has changed a bit over the years, with the introduction of HTML 5 and semantics tags, so make sure to update yourself. JavaScript which was released in 1995, didn’t change much during the next 20 years. But once more and more developers started using it, the ECMA committee decided to add some very nice features and enhance the language, and renamed it ES6 in 2015. After that they regularly added new features to the language and have just released ES2020 in June 2020, which has many additional features. So, learn the basic JavaScript first and then upgrade to ES6 and newer versions. CSS is what makes a website or web-app beautiful, and is often considered the hardest part by a developer. 
Earlier, CSS was very confusing and had a steep learning curve, because of the use of floats to create a layout. Developers usually used to work with CSS frameworks like bootstrap to design a site. But things have changed a lot with the invention of CSS Grid and Flexbox. Some of the best resources to learn the basics are - html.specdeveloper.mozilla.HTMLStyle CSSdeveloper.mozilla.CSSdeveloper.mozilla.JavaScriptGetting Deeper Now, just learning JavaScript and some basic CSS will not make you a good Front-end developer as you have to take a deep dive into JavaScript. We will discuss CSS later, after learning the essentials of JavaScript.JavaScript EssentialsThere are many things associated with JavaScript which we need to learn before moving forward.The Terminal The first thing to learn is to work in a terminal, and master some of the basic commands. If you are on a Mac, it’s already based on Linux and runs most Linux commands. If you are working on Windows then you must install git bash, which will give you a Linux environment to work with. In JavaScript frameworks, we need to run a lot of commands from the terminal, like if we want to install a third-party dependency by npm.  The basics of Linux can be learnt from their official site.Version ControlNext, learning version control is very important because we should always keep our code in some remote repository like Github. The industry works on Git, which is version control software. It is completely command-based and is used heavily everywhere. Learn the basic commands which will be useful even for an individual developer. Later on, when working with teams, more advanced knowledge of the git command is required.Through the git commands, we store our code in repositories. The most popular ones are Github and Bit Bucket, so we need to learn how to store and link them.The basics of git can be learnt from this awesome tutorial.freecodecamp.orgTask Runners Task runners are applications which are used to automate tasks required in projects. These tasks include minification of JavaScript and CSS files, CSS preprocessing like from SASS to CSS, image optimization and Unit testing. The three popular task runners are npm scripts, gulp and grunt. The npm script is nothing but the package.json file which comes with React projects or is created in a Node.js project using npm init. Gulp and Grunt are much bigger applications and also have a plugin ecosystem that is suited for large JavaScript projects. The basics for these two technologies can be learnt from here. Module Loader and Bundler Both module loaders and bundlers are required for large JavaScript applications. Knowledge of both is required, if the project you are working is a big Vanilla JavaScript project. When a large JavaScript application consists of hundreds of files, the module loader takes care of the dependency and makes sure all the modules are loaded when the application is executed. Examples are RequireJS and SystemJS.Module bundlers also do the same thing, building it at the time of application build rather than at the runtime. Popular examples are Webpack and Rollup. Testing Testing nowadays is very important in any type of project. There are two types of testing; one is known as Unit testing and other as end-to-end testing. For unit testing we write test cases and the most popular tool nowadays is Jest. End-to-end testing is automated testing, which emulates the whole app. Suppose, an app has a login screen and then it shows posts. 
The testing tool will run the web-app to check whether all the functionalities are done correctly. The two most popular options today are Puppeteer and Cypress. The tutorials to refer for these topics are - Libraries and FrameworkThey are the most important part of the JavaScript ecosystem nowadays. It all started with the release of AngularJS in 2010. Before that period most enterprise apps were made in Java and were desktop apps. But AngularJS changed everything, because it made it easy to manage big projects with JavaScript and helped to create complex web-apps.1. React It is the most popular JavaScript library today and is used by both enterprises and startups that have a huge ecosystem. It is not a complete framework like Angular and we have to install third party dependencies for most things. But if you want to learn a framework that will get you a job, then that framework would be ReactJS, and its demand is not going away for the next 5 years. The component approach and its easy learning curve have made React more popular than other frameworks. A good starting tutorial for React isState Management In React state management can sometimes become complex, when we need to share data between components. We generally take help of external packages in it with the most popular being Redux. But we also have other state management libraries like XState and Recoil. Server-side rendering With performance becoming important nowadays, Server-Side Rendering speeds up the React projects even faster. In SSR projects, the React code is rendered on the server and the client browser directly receives the HTML, CSS, JS bundle. The only framework to do it is NextJS. Static Site Generators Lot of sites don’t need to be updated frequently and it is the place where the only Static Site Generator for ReactJS, which is GatsbyJS shines. With the help of GatsbyJS we can create extremely fast static sites and it gets into Wordpress domain a lot with it. GatsbyJS also has a huge ecosystem of plugins, which enhances its functionalities. React Testing Unit testing is a very important part of ReactJS projects, especially the ones which are very large. Unit testing ensures that we have lower bugs in Production build. The two popular libraries are – Enzyme and Jest. 2. Angular It is a complete framework and unlike React requires very few external dependencies. Everything is built within Angular and we don’t have to go outside for more features. Since it was among the earliest frameworks, older projects are in Angular and it is still widely used in enterprises. A good tutorial to learn Angular is below. 3. Vue Vue is another very popular JavaScript library, which has the best features of both ReactJS and Angular and has become very popular in recent years. It is widely used in both enterprise and startups. A good tutorial to start with Vue is below. 4. NuxtJS It is used for Server-Side Rendering in Vue projects and is similar to the NextJS framework used in ReactJS for SSR.  5. Svelte It is the newest of all frameworks/libraries and has become quite popular, but still not used much in enterprises and startups. It is different from React, Vue and Angular and converts the app at build time rather than at run time as in the other three. Good tutorials to start with Svelte are below. CSS Deep DiveA lot has changed in CSS after it included CSS Grid and Flexbox; it has become much easier for developers to work with. 
CSS Essentials It is now mandatory for frontend developers to learn CSS Grid and Flexbox, because through it we can develop beautiful layouts with ease. More companies are moving away from CSS Frameworks and have started working with CSS Grid and Flexbox, which are now supported by all browsers. Good tutorials to learn Flexbox and CSS Grid are below. Preprocessors CSS preprocessors are used to add special functionalities in CSS, which it lacks. An example is Sass, which adds special features like variables and nested rules in CSS and is widely used in the industry for larger projects. The other popular one is PostCSS, in which we can use custom plugin and tools in CSS. CSS Frameworks Frameworks were very popular from the early days of CSS, when it was very complicated because of floats. Bootstrap This is the most popular and oldest CSS framework; easy to learn and also has a wide variety of elements, templates and interfaces. Bulma It is another CSS framework, which is very popular and much easier to use than bootstrap. Tailwind CSS This is a fairly new CSS framework and is quite popular nowadays. It follows a different approach than the other frameworks and contains easier classes. Styled Components (React) This is a CSS in JS library and is for React only. It is used to create components out of every style and is very popular in the React world.  CI/CDThe Continuous Integration/ Continuous deployment is mainly used by DevOps. But a frontend engineer should know its basics. It is used to build, test and deploy applications automatically.Github Actions  It is a freely available CI/CD pipeline, which directly integrates to your github based project and can be used in a variety of languages. Deployment It is again a task which mainly falls into the domain of Backend engineers and DevOps, but a frontend engineer should know some basic and simple tools. Static Deployment These products are mainly used to deploy static sites, which consists of HTML, CSS and JavaScript only. Two very popular services are Amazon S3 and Surge.sh Node Application Deployment The projects containing node code cannot be deployed using static deployment. Even if the project is a simple ReactJS project, it also uses node for processing. These applications require services which run the Node code and deploy it. The three most popular services are Vercel, Firebase and Netlify. 2. Backend Roadmap (Including Storage, Services & Deployment)Understanding the BackendBackend is the part of the website that provides the functionality, allowing people to browse their favorite site, purchase a product and log into their account, for instance. All data related to a user or a product or anything else are generally stored in databases or CMS (Content Management System) and when a user visits any website, they are retrieved from there and shown. One of the responsibilities of a backend engineer involves writing APIs, which actually interact with the database and get the data. They are also involved in writing schemas of database and creating the structure of databases. Backend EssentialsFor a backend engineer, working in a Linux environment is an essential skill. A lot of the configurations are done on the terminal. So, he or she should be very good with Linux commands.Also, they should know both commands and the use of any git powered platforms like Github or bitbucket.Languages and FrameworksAll of the popular languages have some framework, which has been used for backend development. 
These frameworks are generally used to create API endpoints, which are used to fetch or store data in the database. For example, when we scroll articles on Facebook, these articles are fetched from a database and we use the GET method to fetch them. Similarly, when we write an article and hit submit, it uses POST method.Now, different frameworks implement this GET, POST and other APIs also referred to as RESTful APIs in their own way.Java Java is by far the oldest and the most used language for backend development. It is also used for a variety of other tasks like Android development, but it shines in the backend because of its multithreading abilities. So, enterprise grade web-apps and web-apps with a lot of traffic prefer Java, because it handles loads better. The most popular frameworks for backend development in Java are Spring Framework and Hibernate. Some good beginner's tutorials are - JavaScript It is a very popular choice for backend development, because on the frontend side JavaScript is the only choice. So, a lot of frontend engineers can take this choice to become Full-stack developers. Node.js It allows developers to use JavaScript to write server-side code, through which they can write APIs. Actually, the API part can be done by numerous frameworks of Node.js out of which Express is widely used. The other popular framework is Fastify. Some good beginner's tutorials are - Python Python is one of the most popular languages among developers and has been used in a variety of fields. The two most popular frameworks for Python are Flask and Django. Some good beginner tutorials are - C# It is a very popular programming language which was developed by Microsoft and it has the power of C++. Its popularity increased once the .NET framework was released for backend development. As Microsoft is very popular in enterprises, the .NET framework is generally preferred in enterprises. A good tutorial to learn .NET is - Go Go language which is also referred to as Golang, has gained popularity in recent years. It is used a lot in Backend programming and the two popular frameworks are Gin and Beego. DatabaseFor a Backend engineer, after making APIs with framework based on language, it's time to learn about Databases. Databases are used to store most of the things which we see in a web-app, from user login credentials to user posts and everything else. In the earlier days we only used to have one type of Database and that was Relational databases, which use tables to store data. Now we have two other categories also, one being NoSQL databases and the other In-memory databases. 1. Relational databases Relational databases allow you to create, update and delete data stored in a table format. This type of database mostly uses SQL language to access the data, hence is also known as an SQL database. MySQL It is one of the oldest databases and was released in 1995. It is an open-source database and was very popular in the 2000s with the rise of LAMP (Linux, Apache, MySQL, PHP) stack. It is still widely in use, but there are other popular Relational databases. A good tutorial to learn MySQL is - PostgreSQL PostgreSQL, which is also known as Postgres is also an old open-source Relational database, which was released in 1996. But it gained popularity recently, as it goes very well with modern stacks containing NodeJS and other backend technologies. A good tutorial to learn PostgreSQL is - Oracle is the most popular and oldest relational database. 
It was released in 1979 and still remains the number one preference for enterprise customers. All the big banks and other organizations, run on Oracle databases. So, the knowledge of Oracle is a must in many companies for an Engineer. A good tutorial to learn Oracle is - MS-SQL MS-SQL is also known as Microsoft SQL and is yet another commercial Relational database. It has got different editions, used by different audiences. It is also heavily used by enterprise users and powers a whole lot of big systems around the world. A good tutorial to learn MS-SQL is - 2. NoSQL databases NoSQL databases are also called non-SQL databases. The NoSQL databases mainly store data as key-value pairs, but some of them also use a SQL-like structure. These databases have become hugely popular in the 21st century, with the rise of large web-apps which have a lot of concurrent users. These databases can take huge loads, even millions of data connections, required by web-apps like Facebook, Amazon and others. Beside this, it is very easy to horizontally scale  a NoSQL database by adding more clusters, which is a problem in Relational Databases. MongoDB It is the most popular NoSQL database, used by almost every modern app. It is a free to use database, but the hosting is charged if we host on popular cloud services like MongoDB atlas. Its knowledge is a must for backend engineers, who work on the modern stack. MongoDB uses json like documents to store data. A good tutorial to learn MongoDB is - It is a proprietary database service provided by Amazon. It is quite similar to MongoDB and uses key-value pairs to store data. It is also a part of the popular AWS services. A good tutorial to learn DynamoDB is-Cassandra is an open-source and free to use NoSQL database . It takes a different approach when compared to other NoSQL databases, because we use commands like SQL, which are known as CQL (Cassandra Query Language). A good tutorial to learn Cassandra is - 3. In-memory databases The in-memory database is a database, which keeps all of the data in the RAM. This means it is the fastest among all databases.  The most popular and widely used in-memory database is Redis. Redis Redis (Remote Dictionary Server) is an in-memory database, which stores data in RAM in a json like key-value format. It keeps the data persistent by updating everything in the transaction log, because when systems are shut down their RAM is wiped clean. A good tutorial to learn Redis - StorageStoring the data is an important part of any application. Although this is mainly DevOps territory, every backend developer should know the basics for the same. We need to store the database data and also the backend code. Beside this the frontend code must also be stored somewhere. Nowadays everything is stored in the cloud, which is preferred by individuals, startups and enterprises. The two most popular cloud-based storages are – Amazon S3 Azure Blob Storage Good beginner's tutorials for both areServices and APIsThese are theoretical concepts and are implemented by various services, but a backend engineer should know them and how to use them. Restful APIs This is by far the most popular way to get data from a database. It was made more popular, with the rise of web-apps. We do GET, PUT, POST and DELETE operations to read, update, create or delete data from databases. We have earlier discussed different languages and frameworks, which have their own implementations for these operations. 
Microservices Architecture In microservice architecture, we divide a large and complex project into small, independent services. Each of these is responsible for a specific task and communicates with other services through simple APIs. Each service is built by a small team from the beginning, and separated by boundaries which make it easier to scale up the development effort if needed. GraphQL It is the hottest new kid in the block, which is an alternative to the Restful APIs. The problem with Restful APIs is that if you want some data stored in database, you need to get the whole data sent by the endpoint. On the other hand, with GraphQL, you get a query type language which can return only the part of the data which you require.  DevOps & DeploymentA backend engineer requires a fair bit of DevOps knowledge. So, we will next deep dive into the methodologies in DevOps. 1. Containerization & Orchestration Containers are a method of building, packaging and deploying software. They are similar to but not the same thing as virtual machines (VMs). One of the primary differences is that containers are isolated or abstracted away from the underlying operating system and infrastructure that they run on. In the simplest terms, a container includes both an application’s code and everything that code needs to run properly. Container orchestration is the automatic process of managing the work of individual containers for applications based on microservice architecture. The popular Containerization and Orchestration tools are – Kubernetes Docker Good beginner's tutorials for both are -2. DevOps DevOps is a set of practices that combine software development (Dev) and IT operations (Ops). It aims to shorten the systems development life cycle and provide continuous delivery with high software quality. The two most popular DevOps services are AWS and Azure. Both of them are cloud based and are market leaders. Both of these platforms contain a wide variety of similar services. AWS It consists of over 200 products and services for storage, database, analytics, deployment, serverless function and many more. AWS is the market leader as of now with 33% of market share. The AWS certifications are also one of the most in-demand certifications and a must for frontend engineers as well as Backend engineers. Azure Microsoft Azure is second in terms of market share of cloud-based platforms, with 18% of the market. It also consists of SaaS (Software as a Service), PaaS (Platform as a Service) and IaaS (Infrastructure as a Service) like AWS. 3. PaaS (Platform as a Service) There are several smaller players, which provide Platform as a Service and are much easier to use than services like AWS and Azure. With these services you can directly deploy your React or other web-apps, by just hosting them on GitHub and pushing the code. These services are preferred a lot by freelancers, hobbyists and small companies as they don’t require investment in learning complicated services like AWS and Azure. The three most popular PaaS services are Digital Ocean Heroku Netlify 4. Serverless Serverless computing is an execution model where the cloud provider (AWS, Azure, or Google Cloud) is responsible for executing a piece of code by dynamically allocating resources and only charging for the number of resources used to run the code. 
The code is typically run inside stateless containers that can be triggered by a variety of events including http requests, database events, queuing services, monitoring alerts, file uploads, scheduled events (cron jobs), etc. The code that is sent to the cloud provider for execution is usually in the form of a function. AWS Lambda It is an event-driven, serverless platform which is part of AWS. The various languages supported by AWS Lambda are Node.js, Python, Java, Go, Ruby and .NET. AWS Lambda was designed for use cases such as updates to DynamoDB tables, responding to a website click etc. After that it will “spin down” the database service, to save resources. Azure Functions They are quite similar to AWS Lambda, but are for Microsoft Azure. Azure functions have a browser-based interface to write code to respond to events generated by http requests etc. The service accepts programming languages like C#, F#, Node.js, Python, PHP and Java. Serverless Framework It is an open-source web-framework written using Node.js. The popular services like AWS Lambda, Azure functions and Google cloud functions are based on it. CI/CD A backend developer should know the popular CI/CD (Continuous Integration/Continuous deployment) tools. These tools help to automate the whole process of building, testing and deployment of applications. Github Actions It is a freely available CI/CD pipeline, which directly integrates to your GitHub based project and can be used in variety of languages. Jenkins Jenkins is the most popular CI/CD automation tool, which helps in building, testing and deployment of applications. Jenkins was written in Java and over the years has been built to support over 1400 plugins, which extend its functionalities. Circle CI Circle CI is also a CI/CD automation tool, which is cloud based and so it is different from Jenkins. It is much easier to use than Jenkins, but has a smaller community and lower user base. SecuritySecurity is an important aspect of any application. Most applications containing user personal data, like email etc, are often targeted by hackers. OWASP The Open Web Application Security Project (or OWASP), is a non-profit organization dedicated to web application security. They have free material available on their website, making it possible for anyone to improve their web application security. Protecting Services & databases against threats Hackers target databases of popular web-apps on a regular basis to get sensitive information about their customers. This data is then sold to the highest bidder on the dark-net. When such public breaches are reported, then it's a reputation loss for the enterprise also. So, a lot of emphasis should be given to Authentication, Access, Backups, and Encryption while setting up a database. The databases should also be monitored for any suspicious activities. Besides this the API routes also need to be protected, so that the hacker cannot manipulate them. Career roles Most of the companies hire Frontend developers, Backend developers and DevOps engineers separately. This is because most of the enterprise projects are huge, in which roles and responsibilities are distributed. But there is a huge demand for Full Stack developers in the startup sector in US and India. These companies need specialists who can get the product out as soon as possible with agile and small teams. Top companies hiringAlmost every company on the planet is hiring web-developers or outsourcing the development work. 
Since the past decade, the demand for developers has risen exponentially. The top technology companies which hire full stack developers are Facebook, Amazon, Apple, Netflix, Google, Uber, Flipkart, Microsoft and more.  The sites of each of these companies are web-apps (excluding Apple and Microsoft), with complex frontend and backend systems. The frontend generally consists of React or Angular and the backend is a combination of various technologies. The DevOps part is also quite important in these web-apps as they handle millions of concurrent connections at once.Salaries  The salary of a beginner Frontend developer in India starts from Rs. 300,000($ 3980) per year in service-based companies to Rs. 12,00,000($ 15,971) per year in the top tech companies mentioned above. The salary of a Beginner Full-Stack developer in India starts at Rs. 4,50,000 ($ 5989) per year in service companies to Rs. 12,00,000($ 15,971) per year in top tech companies. The salary for an entry level Frontend developer in USA is $ 59,213 per year and for an entry level Full stack developer is $ 61,042 per year.Below are some sources for salaries. Top regions where there is demand There are plenty of remote and freelancing opportunities in web-development across the world. The two countries with most developers and top tech companies are USA and India. Silicon Valley, which is the San Francisco Bay Area, in Northern California, USA is the hub of technology companies.  The top city in India to start a developer job is the Silicon Valley of India – Bengaluru. The number of jobs is more than all the other cities combined and it also has a very good startup ecosystem. Almost all the big technology companies mentioned earlier and top Indian service companies are located in the city. After Bengaluru, the city where the greatest number of technology jobs are based is Hyderabad, followed by Chennai and then Pune. Entry PointsThe demand for web-developers is high and anyone with a passion for creating apps can become a web-developer. An Engineering degree is not mandatory to land a job as a web developer.  The most in-demand skill today and for the next 5 years is React and its ecosystem. So, if you know HTML, CSS, JavaScript and React, it is impossible to not get a job. Career Pathway  Most people start as an intern Front-end developer or Intern Full-Stack developer and in many cases Intern Backend developer. Many companies directly hire junior Frontend/Backend/Full-stack developers.  After that, the next step is the role of Senior Frontend/Backend/Full-stack developers. Many Frontend and Backend developers become full stack developers at this level, by learning additional technologies. Senior resources in Frontend/Backend/Full-stack can then go on to assume Team Lead roles. These people manage small teams in addition to being individual contributors.  After this a professional can become a Project manager, whose main responsibility is managing the team. Another role is that of Technical Project Manager, who manages the team and also has hands-on knowledge in Technology. The last role at this level is that of a Software Architect, who handles and designs big projects and has to look at every aspect of the technology to create the enterprise app. Generally Full-stack developers are preferred in this role, as they need to know all technologies. The highest career milestone is CTO or Chief Technology Officer, who handles all the technology teams and makes all technology decisions in a Technology company. 
Job Specialization
There are some Full stack development specializations seen in the industry nowadays. Full stack developers who work with React on the frontend and Java on the backend are in great demand, as are developers who work with Angular on the frontend and .NET on the backend.

How KnowledgeHut can help
All these free resources are a great place to start your Frontend or Full-Stack journey. Besides these, there are many other free resources on the internet, but they may not be organized or follow a structured approach. This is where KnowledgeHut can make a difference and serve as a one-stop alternative with its comprehensive instructor-led live classes. The courses are taught by industry experts and are perfect for aspirants who wish to become Frontend or Full-Stack developers. Links to some of the popular courses by KnowledgeHut are appended below:
CSS3
JavaScript
ReactJS
NodeJS
Devops

Conclusion
This completes our article on the Full stack developer journey, combining the Frontend and Backend roadmaps. Many people become backend developers first by working with languages like Java, and then go on to learn React to become full stack developers. Equally, many developers learn frontend development first with frameworks like React, and then become full stack developers by learning Node.js. The latter path is easier, because React and Node.js use the same language: JavaScript. We hope you have found this blog useful, and can now take the right path to become a full stack developer. Good luck on your learning journey!
How to Create a Collection in MongoDB?

In recent years, MongoDB has retained its top spot among NoSQL databases. The term NoSQL means non-relational. MongoDB is an open-source, document-oriented NoSQL database that allows users to query data without having to know SQL. You can read more about MongoDB here. MongoDB stores data in the form of collections, and in this blog we will learn how to create a collection in MongoDB.

Prerequisites
To follow this blog, you should have the latest version of MongoDB installed on your machine. You can download MongoDB for your operating system from this link. Let's start by understanding the term collection.

What is a collection in MongoDB?
We know that MongoDB stores data in the form of documents. All similar documents are stored together in a collection, which is analogous to a table in a SQL database or any relational database. Each object in MongoDB is called a document, and the documents put together form a collection (the MongoDB website illustrates this with an example collection of documents).

Creating a Collection in MongoDB
There are a couple of methods to create a collection in MongoDB: the createCollection() method, and creating the collection on the fly. Let's have a look at each of them one by one. To start, we need a mongo server running on our local machine. Open the terminal on Mac and Linux, or PowerShell/command prompt on Windows, and run the following command.

Command: mongod

This fires up the MongoDB server. To run MongoDB commands, we need to start a MongoDB shell. For this, open a new window in terminal or PowerShell or command prompt and run the following. We shall refer to this window as the mongo shell in the rest of this article.

Command: mongo

This opens a shell where we can run MongoDB commands. Let's create a database and use it so that we can test our examples. Run the following command in the mongo shell.

Command: use myDB

This creates a new database with the name myDB and switches to it, so that we can work with it.

The createCollection() Method
Using the createCollection() method, we can create an empty collection, i.e., one that does not yet contain any documents. The syntax of the createCollection() command is as follows:

Syntax: db.createCollection(name, options)

The createCollection() method takes 2 parameters: the first is the name of the collection, which is a string, and the second is an options object used to configure the collection. The options object is optional. To create a collection without passing options, run the following command in the mongo shell.

Command: db.createCollection("testCollection")

This creates a collection with the name testCollection inside the myDB database. To see the collection, run the following command inside the mongo shell.

Command: show collections

This should show all the collections we have inside the database.
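As a side note, the same steps can be scripted from Node.js with the official mongodb driver. The minimal sketch below assumes a local server running on the default port 27017 and simply mirrors the shell commands above.

const { MongoClient } = require('mongodb');

async function main() {
   // Connect to the local MongoDB server started with "mongod"
   const client = new MongoClient('mongodb://localhost:27017');
   await client.connect();
   // Equivalent of "use myDB" followed by db.createCollection("testCollection")
   const db = client.db('myDB');
   await db.createCollection('testCollection');
   // Equivalent of "show collections"
   const collections = await db.listCollections().toArray();
   console.log(collections.map(c => c.name));
   await client.close();
}

main().catch(console.error);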
We can add additional configurations to the collection. For example, we can add validation or create a capped collection using the second parameter, the options object.

Configuration options for creating a collection in MongoDB
The basic syntax for configuring a collection in the createCollection() method is shown below; the angle-bracket placeholders indicate the expected type of each value.

Syntax:
db.createCollection( <name>,
   {
      capped: <boolean>,
      autoIndexId: <boolean>,
      size: <number>,
      max: <number>,
      storageEngine: <document>,
      validator: <document>,
      validationLevel: <string>,
      validationAction: <string>,
      indexOptionDefaults: <document>,
      viewOn: <string>,        // Added in MongoDB 3.4
      pipeline: <pipeline>,    // Added in MongoDB 3.4
      collation: <document>,   // Added in MongoDB 3.4
      writeConcern: <document>
   } )

Let's look at the options in detail.

capped (boolean, optional): To create a capped collection, specify true. If you specify true, you must also set a maximum size in the size field.
autoIndexId (boolean, optional): Specify false to disable the automatic creation of an index on the _id field.
size (number, optional): The maximum size in bytes for a capped collection. Once a capped collection reaches its maximum size, MongoDB removes the older documents to make space for the new documents. The size field is required for capped collections and ignored for other collections.
max (number, optional): The maximum number of documents allowed in the capped collection. The size limit takes precedence over this limit: if a capped collection reaches the size limit before it reaches the maximum number of documents, MongoDB removes old documents. If you prefer to use the max limit, ensure that the size limit, which is required for a capped collection, is sufficient to contain the maximum number of documents.
storageEngine (document, optional): Available for the WiredTiger storage engine only. Allows users to specify storage engine configuration on a per-collection basis when creating a collection.
validator (document, optional): Allows users to specify validation rules or expressions for the collection.
validationLevel (string, optional): Determines how strictly MongoDB applies the validation rules to existing documents during an update.
validationAction (string, optional): Determines whether to raise an error on invalid documents, or just warn about the violations and allow invalid documents to be inserted.
indexOptionDefaults (document, optional): Allows users to specify a default configuration for indexes when creating a collection.
viewOn (string): The name of the source collection or view from which to create a view. The name is not the full namespace of the collection or view, i.e. it does not include the database name, and it implies the same database as the view to create. You must create views in the same database as the source collection.
pipeline (array): An array of aggregation pipeline stage(s); creates the view by applying the specified pipeline to the viewOn collection or view.
collation (document): Specifies the default collation for the collection. Collation allows users to specify language-specific rules for string comparison, such as rules for lettercase and accent marks.
writeConcern (document, optional): A document that expresses the write concern for the operation. Omit to use the default write concern.

To know more about the options, go to this link.

Example of Create Collection in MongoDB
An example of creating a collection with options before inserting documents is shown below.
Run the below command in the mongo shell.

Command: db.createCollection("anotherCollection", { capped : true, autoIndexId : true, size : 6142800, max : 10000 } )

This creates a capped collection.

What is a capped collection?
A fixed-size collection that automatically overwrites its oldest entries when it reaches its maximum size. The MongoDB oplog that is used in replication is a capped collection. See more about capped collections and the oplog over here.

Create a Collection with Document Validation
MongoDB can perform schema validation during updates and insertions. In other words, we can validate each document before it is updated or inserted into the collection. To specify the validation rules for a collection, we use db.createCollection() with the validator option. MongoDB supports JSON Schema validation; to specify it, use the $jsonSchema operator in your validator expression. This is the recommended way to perform validation in MongoDB.

What is $jsonSchema?
The $jsonSchema operator matches documents that satisfy the specified JSON Schema. It has the following syntax.

Syntax: { $jsonSchema: <JSON Schema object> }

An example JSON Schema object is given below.

Example:
{
   $jsonSchema: {
      required: [ "name", "year", "skills", "address" ],
      properties: {
         name: {
            bsonType: "string",
            description: "must be a string and is required"
         },
         address: {
            bsonType: "object",
            required: [ "zipcode" ],
            properties: {
               "street": { bsonType: "string" },
               "zipcode": { bsonType: "string" }
            }
         }
      }
   }
}

To create a collection with validation rules, run the below command in the mongo shell.

Command:
db.createCollection("employees", {
   validator: {
      $jsonSchema: {
         bsonType: "object",
         required: [ "name", "year", "skills", "address" ],
         properties: {
            name: {
               bsonType: "string",
               description: "must be a string and is required"
            },
            year: {
               bsonType: "int",
               minimum: 2017,
               maximum: 2021,
               description: "must be an integer in [ 2017, 2021 ] and is required"
            },
            skills: {
               enum: [ "JavaScript", "React", "Mongodb", null ],
               description: "can only be one of the enum values and is required"
            },
            salary: {
               bsonType: [ "double" ],
               description: "must be a double if the field exists"
            },
            address: {
               bsonType: "object",
               required: [ "city" ],
               properties: {
                  street: {
                     bsonType: "string",
                     description: "must be a string if the field exists"
                  },
                  city: {
                     bsonType: "string",
                     description: "must be a string and is required"
                  }
               }
            }
         }
      }
   }
})

This creates a collection with validation. Now if you run the show collections command, the employees collection should show up.
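To see the validator in action, try inserting a document that breaks the rules. A quick sketch against the employees collection created above:

db.employees.insert({ name: 123 })
// The write fails with a "Document failed validation" error:
// name must be a string, and the other required fields are missing.

Only documents that satisfy the $jsonSchema rules will be accepted into the collection.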
Now, let's look at the second method, creating the collection in MongoDB on the fly.

Creating the Collection in MongoDB on the fly
One of the best things about MongoDB is that you need not create a collection before you insert a document into it. We can simply insert a document, and MongoDB creates the collection on the fly. Use the below syntax to create a collection this way.

Syntax: db.collection_name.insert({key: value, key: value, …})

To create a collection on the fly, run the following command in the mongo shell.

Command:
db.students.insert({ name: "Sai", age: 18, class: 10 })

This creates a collection with the name students in the database. To confirm, you can run the show collections command; the students collection should appear in the list. To check whether the document was successfully inserted, run the below command in the mongo shell.

Syntax: db.collection_name.find()

Command: db.students.find()

This should show all the documents inside the collection.

Conclusion
In this blog you have seen how to create a collection in MongoDB using different methods, along with examples. MongoDB is a rapidly growing technology because it is flexible, fast, and easy to work with, and many companies use it as their go-to database. Learning MongoDB is recommended by many web developers, as it boosts your chances of getting a job as well.

How to do MongoDB Back Up, Restoration & Migration

Popular among both enterprises and startups, MongoDB is a database that is perfectly suited for web apps that need to scale up once the user base increases. MongoDB differs from traditional relational databases in that it stores data as json-like objects instead of tables. In this post, we will learn to back up and restore a MongoDB database.

Most software products offer an import and export feature, which in database terms deals with human-readable formats. The backup and restore operations, on the other hand, use MongoDB-specific data that preserves MongoDB's attributes. So, when migrating a database, we should prefer backup and restore over import and export. Keep in mind that the source and target systems need to be compatible, meaning both should be Windows, or both should be Unix-like systems such as Ubuntu or macOS.

Prerequisites
We are using Windows 10 in this tutorial. Please make sure you have downloaded and installed the MongoDB Community Server. The setup is straightforward, and you will find plenty of good articles on the internet detailing it. Please also ensure that you have added it to the PATH environment variable on your PC.

Backup Considerations
In a production environment, backups act as a snapshot of the database at a certain point in time. Large and complex databases do fail or can be hacked; if that happens, we can use the last backup file to restore the database to the point before it failed. These are some of the factors that should be taken into consideration when planning for recovery.

1. Recovery Point Objective
The recovery point objective defines how much data we are willing to lose between a backup and a restoration. For critical data like banking information, continuous backup is preferred and backups should be taken several times during the day. On the other hand, if the data doesn't change frequently, a backup every 6 months may suffice.

2. Recovery Time Objective
This tells how quickly the restoration can be done. During restoration the application will be down for some time; this downtime should be minimized, or customers will be inconvenienced, possibly resulting in loss of business or of customer trust.

3. Database and Snapshot Isolation
This refers to the distance between the primary database server and the backup server. If they are close together, say in the same building, the recovery time reduces; however, a physical event such as a fire could then destroy the backup along with the primary database.

4. Restoration Process
We should always test our backups on test servers to confirm they will work when a restoration is actually required.

5. Available Storage
Database backups generally take a lot of space and in most cases will never be needed. So we should try to minimize the space taken on disk, for example by archiving the database dump into a compressed file (a mongodump example of this appears later in this post).

6. Complexity of Deployment
The backup strategy should be easy to set up and should be automated, so that we don't have to remember to take backups at regular intervals.

Understanding the Basics
The first thing to know is that MongoDB uses the json and bson (binary json) formats for storing data. People coming from a JavaScript background will recognize json objects, which store key-value pairs. json is also the preferred format for receiving data from, or sending data to, an API endpoint.
You can inspect the json data of a MongoDB database in many tools and online editors; even the popular Windows application Notepad++ has a json viewer. A json document looks something like { "name": "Sai", "age": 18 }, so it is very convenient to work with, especially for developers. But json does not support all the data types available in bson, so for backup and restore we should use the binary bson format.

The second thing to keep in mind is that MongoDB automatically creates database and collection names if they don't exist during restore operations.

Third, since MongoDB is a document-based database, in many use cases we store large amounts of data in one collection, such as the whole text of an article. MongoDB is also used extensively with large databases and big data, so reading and inserting the data can consume a lot of CPU, memory and disk space. We should therefore run backups during non-peak hours, such as at night.

As mentioned earlier, we could use import and export functions for backup and restoration of MongoDB databases, but we should instead use the dedicated commands mongodump and mongorestore to back up and restore respectively.

MongoDB backup
We will first cover backing up the MongoDB database, for which we use the mongodump command. First open the Windows command prompt and go to the location where MongoDB is installed. If you chose the default settings while installing MongoDB through the installer, it will be in a location like C:\Program Files\MongoDB\Server\4.4\bin (the version number may differ if you are reading this blog in the future). Note that it is better to run the command prompt in Admin mode. Once the command prompt is open, change the directory to the MongoDB bin folder with the below command.

cd C:\Program Files\MongoDB\Server\4.4\bin

Now, enter mongod and press Enter. It will show some json-formatted log output. We can back up to any location; for this post I am backing up to a Backup folder on my Desktop, which I created through the command line. Next we run the mongodump command, which should also be present in our MongoDB bin folder. If it is not present, we need to download the MongoDB Database Tools package, which contains it, and copy the exe files from the download into the MongoDB bin folder.

MongoDB Backup with no option
Run the mongodump command from the bin directory. With no arguments, a dump of the whole database is written to a default dump folder in the current directory.

MongoDB Backup to an output directory
Run the mongodump command from the bin directory with the --out argument, which specifies the directory in which the backup will be kept. In our case we give the Backup folder that we created earlier on the Desktop.

mongodump --out C:\Users\pc\Desktop\Backup

Now go to the desktop and you can find the backup created in the Backup folder.

MongoDB Backup of a specific database
MongoDB also allows us to back up a specific database using the --db option of mongodump. I have an 'example' database, so to back up only that one, I use the below command.

mongodump --db example --out C:\Users\pc\Desktop\Backup

As you can see in the output, only the example database was backed up.
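As noted under Available Storage above, dumps can take a lot of disk space. mongodump can compress its output into a single archive file using its standard --gzip and --archive options; a sketch (the file path is illustrative):

mongodump --db example --gzip --archive=C:\Users\pc\Desktop\Backup\example.gz

The matching restore command is mongorestore --gzip --archive=<file>.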
MongoDB Backup of a specific collection
If we want to back up only a specific collection, we need to use the --collection option and give the collection name. Note that the database name is mandatory in this case, as MongoDB needs to know which database to search for the collection. I have a products collection within the example database, so to back up only that, I use the below command.

mongodump --db example --out C:\Users\pc\Desktop\Backup --collection products

As you can see in the output, only the products collection from the example database was backed up.

MongoDB Backup from remote MongoDB instances
We can take backups from remote MongoDB instances as well. I have several MongoDB databases for my personal projects on MongoDB Atlas, the cloud database service for MongoDB (which has a free tier). To back up remote databases, we pass the connection string with the --uri parameter. I used the below command.

mongodump --uri "mongodb+srv://xxxx:xxxxxxxxxxx@cluster0.suvl2.mongodb.net/xxxxxDB?retryWrites=true&w=majority" --out C:\Users\pc\Desktop\Backup

You can see in the output the backup of the remote instance.

MongoDB Backup procedures
We should make the backup procedure as automated as possible. One of the best ways is to use a cron job, so that it runs every day. As discussed earlier, it is best to run the backup at night when the database has the least load. Setting up a cron job is easier on Linux or macOS, because the Windows equivalent is not as convenient; alternatively, on Windows you can install MongoDB inside WSL2, which supports Ubuntu. Suppose that on a Linux host running a MongoDB instance you want to run the backup at 04:04 am daily. Open the cron editor by running the below command in the terminal.

sudo crontab -e

Then, in the cron editor, add a line like the one below for our case.

4 4 * * * mongodump --out /var/backups/mongobackups/`date +"%m-%d-%y"`

Restoring and migrating a MongoDB database
When we restore a MongoDB database from a backup, we get an exact copy of the MongoDB data, including the indexes. We restore MongoDB using the mongorestore command, which works only with the binary backups produced by mongodump. We already took a backup of the example database earlier into our Backup folder, and we will use the below command to restore it. In the arguments we first specify the name of the database with the --db option; then, with --drop, we make sure the existing example database is dropped first; and in the final argument we specify the path of our backup.

mongorestore --db example --drop C:\Users\pc\Desktop\Backup\example

Now, if we check in the terminal, our example database has been restored properly.

Conclusion
In this article, we have learned about MongoDB backup and restore: the different options for creating backups, and why and when backups are required. Keep learning!