A Gentle Introduction to Parallelism in Node.js
Imagine you run a busy bakery with a large number of customers coming in every day. You have several ovens, each capable of baking delicious bread. However, if you only have one baker working, that baker can only use one oven at a time, leaving the other ovens unused and many customers waiting longer for their bread.
To make sure all the ovens are being used and all customers are served quickly, you hire more bakers. Each baker can now work with one oven, so all the ovens are busy baking bread simultaneously. This way, you can serve more customers faster, making your bakery efficient and popular.
Your Node.js application is like the bakery: the incoming requests from users are the customers, the CPU cores are the ovens, and the worker processes handling the tasks are the bakers.
By default, a Node.js application runs your JavaScript on a single thread, so it can keep only one CPU core busy at a time, like one baker using one oven. Even if your server has multiple CPU cores, the application won't take full advantage of them, leaving many "ovens" (cores) idle.
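To make the "one baker" limit concrete, here is a minimal sketch that keeps the single JavaScript thread busy and shows that a timer cannot fire until the work is done. The `busyWait` helper and the 10 ms and 200 ms figures are illustrative assumptions, not part of any Node.js API.

```javascript
// Busy-wait for `ms` milliseconds, keeping the one JavaScript thread occupied.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Spin: nothing else can run on this thread until the loop ends.
  }
}

const scheduledAt = Date.now();
let timerDelay = null; // set once the timer callback finally runs

// Ask for a callback in 10 ms...
setTimeout(() => {
  timerDelay = Date.now() - scheduledAt;
  console.log(`Timer requested after 10 ms, fired after ${timerDelay} ms`);
}, 10);

// ...then keep the thread busy for 200 ms with simulated CPU work.
busyWait(200);
console.log(`Busy work done. Has the timer fired yet? ${timerDelay !== null}`);
```

The second message reports `false`: the timer callback has to wait for the busy loop, so it fires after roughly 200 ms rather than 10 ms. A single process can only do one piece of JavaScript work at a time, no matter how many cores are idle.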
Forking Processes with the cluster Module
To make full use of all CPU cores, you can fork multiple processes using the cluster module in Node.js. This is like hiring more bakers so each core (oven) is busy. Here's how and why you do it:
- Coordinator and Workers: When you use the cluster module, you create a coordinator process (the bakery manager) and multiple worker processes (the bakers). The coordinator is responsible for managing the workers, while the workers handle the actual tasks (baking bread).
- Forking Workers: The coordinator process forks worker processes. Each worker is a separate operating-system process with its own event loop, so the operating system can schedule the workers on different cores. On a server with 6 cores, forking 6 workers lets all 6 cores do work at the same time.
- Handling Requests: Incoming connections (customers) are spread across the worker processes. By default on every platform except Windows, the coordinator accepts each connection and hands it to a worker in round-robin order. Each worker then handles its requests independently, just as each baker bakes in their own oven, so no single worker is overwhelmed.
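The idea of handing customers to bakers in turn can be sketched as a tiny round-robin picker. This is only a toy model of the scheduling idea, not the cluster module's actual implementation; `createRoundRobin` and the worker ids are hypothetical names chosen for illustration.

```javascript
// Returns a function that hands out worker ids in round-robin order.
function createRoundRobin(workerIds) {
  if (workerIds.length === 0) {
    throw new Error('at least one worker is required');
  }
  let next = 0; // index of the worker that receives the next request
  return function pickWorker() {
    const id = workerIds[next];
    next = (next + 1) % workerIds.length; // wrap around after the last worker
    return id;
  };
}

// Simulate seven incoming requests being assigned to three workers.
const pickWorker = createRoundRobin([1, 2, 3]);
const assignments = [];
for (let request = 1; request <= 7; request++) {
  assignments.push(pickWorker());
}
console.log(assignments.join(' ')); // prints "1 2 3 1 2 3 1"
```

Each worker receives every third request, which is why no single baker ends up with the whole queue.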
How to Implement Forking in Your Node.js Application
Continuing the bakery analogy, with a coordinator (manager) overseeing bakers (worker processes) and ovens (CPU cores), here is how to set this up in a Node.js application so that every core is put to work.
Step 1: Create the Coordinator Process
In our bakery, the coordinator (manager) oversees the entire operation, ensuring that each baker is assigned to an oven and that the bakery runs smoothly. Similarly, in a Node.js application, the coordinator process is the main process that manages worker processes.
The coordinator:
- Oversees the entire operation,
- Assigns incoming requests to bakers (worker processes), and
- Watches for bakers that stop working and starts replacements.
const cluster = require('cluster');
const os = require('os');

// Get the number of CPU cores available
const numCPUs = os.cpus().length;

// cluster.isPrimary requires Node.js 16 or later; older versions use cluster.isMaster
if (cluster.isPrimary) { // Coordinator process
  console.log(`Coordinator process is running with PID: ${process.pid}`);

  // Fork worker processes
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // Listen for exiting workers and restart them
  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died. Restarting...`);
    cluster.fork();
  });
}
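The exit handler above replaces every worker that exits, whatever the reason. One possible refinement, sketched here under stated assumptions, is to replace only workers that appear to have crashed. The helper names `shouldRestart` and `watchWorkers` are hypothetical, and treating a clean exit (code 0) as "do not restart" is a policy choice made for illustration; `worker.exitedAfterDisconnect` is the cluster module's flag for a worker that left after `worker.disconnect()`.

```javascript
// Decide whether an exited worker should be replaced.
function shouldRestart(worker, code, signal) {
  // True when the worker left after worker.disconnect(): a planned shutdown.
  if (worker.exitedAfterDisconnect) {
    return false;
  }
  // A non-zero exit code or a terminating signal suggests a crash.
  return code !== 0 || Boolean(signal);
}

// Attach the policy to the cluster module (or any object with on() and fork()).
function watchWorkers(clusterModule) {
  clusterModule.on('exit', (worker, code, signal) => {
    if (shouldRestart(worker, code, signal)) {
      console.log(`Worker ${worker.process.pid} exited unexpectedly. Restarting...`);
      clusterModule.fork();
    }
  });
}

// Example decisions for three hypothetical exits.
console.log(shouldRestart({ exitedAfterDisconnect: false }, 1, null));         // true
console.log(shouldRestart({ exitedAfterDisconnect: false }, null, 'SIGKILL')); // true
console.log(shouldRestart({ exitedAfterDisconnect: true }, 0, null));          // false
```

In the coordinator branch you would call `watchWorkers(cluster)` in place of the unconditional `cluster.on('exit', ...)` handler.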
Step 2: Fork Worker Processes
Just as the bakery manager hires bakers and assigns them to ovens, the coordinator forks the worker processes. Each forked worker runs the same script but takes the else branch below, where it starts its own HTTP server and handles requests independently, like a baker at their own oven.
if (cluster.isPrimary) {
  // Coordinator process code (see above)
} else {
  // Worker process code
  const http = require('http');

  http.createServer((req, res) => {
    res.writeHead(200);
    res.end('Hello World\n');
  }).listen(8000, () => {
    console.log(`Worker ${process.pid} started`);
  });
}
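When experimenting, it can help to see which baker served each customer. The sketch below is a variant of the worker's request handler that includes the process ID in the response; `handleRequest` and `createWorkerServer` are hypothetical names introduced for this example.

```javascript
const http = require('http');

// Request handler that reports which process produced the response.
function handleRequest(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end(`Hello from worker ${process.pid}\n`);
}

// Build the server without listening, so the caller decides the port.
function createWorkerServer() {
  return http.createServer(handleRequest);
}
```

In the worker branch you could then call `createWorkerServer().listen(8000)`. Requesting the page several times should show different process IDs as connections are handed to different workers.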
Step 3: Workers Handle Tasks
In our bakery, each baker (worker process) receives a portion of the incoming tasks (customer orders) and processes them independently, using one CPU core each. This ensures all ovens (CPU cores) are fully utilized, and customers are served efficiently.
Each worker process:
- Handles incoming HTTP requests,
- Processes its requests independently of the other workers, so several cores can be busy at once, and
- Is restarted by the coordinator if it fails, to keep the service running.
Putting the coordinator and worker branches together gives the complete program:
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Get the number of CPU cores available
const numCPUs = os.cpus().length;

if (cluster.isPrimary) {
  // Coordinator process
  console.log(`Coordinator process is running with PID: ${process.pid}`);

  // Fork worker processes
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // Listen for exiting workers and restart them
  cluster.on('exit', (worker, code, signal) => {
    console.log(`Worker ${worker.process.pid} died. Restarting...`);
    cluster.fork();
  });
} else {
  // Worker process
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end('Hello World\n');
  }).listen(8000, () => {
    console.log(`Worker ${process.pid} started`);
  });
}
Explanation
The coordinator process starts first. It forks worker processes equal to the number of CPU cores available. It also listens for any worker processes that exit unexpectedly and restarts them to ensure continuous operation.
Each worker process runs an HTTP server that handles incoming requests independently. Every worker calls listen(8000), but the cluster module lets them share that port instead of failing with an address-in-use error. Because the workers are separate processes, the operating system can run them on different cores at the same time.
With this approach, your Node.js application can serve requests in parallel across the available CPU cores, and a crashed worker is replaced rather than taking the whole service down.
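The program forks one worker per reported core. If you ever want fewer bakers than ovens, for example to leave a core free for other services, the count could be made configurable. The following is a minimal sketch under that assumption: `detectCores`, `chooseWorkerCount`, and the `WORKER_COUNT` environment variable are hypothetical names, while `os.availableParallelism()` is a real function that only exists in newer Node.js releases, hence the fallback.

```javascript
const os = require('os');

// Number of logical cores; fall back to os.cpus().length when
// os.availableParallelism() is not available in this Node.js version.
function detectCores() {
  if (typeof os.availableParallelism === 'function') {
    return os.availableParallelism();
  }
  return Math.max(1, os.cpus().length);
}

// Pick how many workers to fork: the requested number if it is a valid
// positive integer, capped at the number of cores; otherwise one per core.
function chooseWorkerCount(requested, cores = detectCores()) {
  const parsed = Number.parseInt(requested, 10);
  if (!Number.isInteger(parsed) || parsed < 1) {
    return cores;
  }
  return Math.min(parsed, cores);
}

// WORKER_COUNT is a hypothetical environment variable used for illustration.
console.log(`Would fork ${chooseWorkerCount(process.env.WORKER_COUNT)} workers`);
```

In the coordinator branch, the loop bound `numCPUs` would be replaced by the value returned from `chooseWorkerCount`.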
Summary
In this article, we explored utilizing parallelism in a Node.js application using the cluster module through a bakery analogy. Imagine your Node.js app as a bustling bakery: incoming requests are customers, CPU cores are ovens, and worker processes are bakers. Initially, one baker using one oven leaves many ovens idle and customers waiting. Hiring more bakers ensures all ovens are in use and customers are served promptly.
Similarly, the cluster module lets Node.js fork multiple worker processes that the operating system can spread across CPU cores. The coordinator process, like a bakery manager, hands incoming connections to the workers and restarts any that fail. The result is an application that uses the cores it has been given and keeps serving requests when an individual worker crashes.
Happy coding!