How to Reduce Node.js Memory Usage by 70%

If you are running a Node.js application in production, you have likely encountered the dreaded wall of text that ends with FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory.

In the early days of a project, throwing hardware at the problem (adding more servers or moving to larger cloud instances) feels like an acceptable solution. However, as traffic grows, this strategy quickly becomes financially unsustainable. We recently hit this wall. Our cloud hosting costs were skyrocketing, and our API endpoints were experiencing severe latency spikes due to aggressive garbage collection.

Here is the exact blueprint of how we diagnosed the bottlenecks, plugged the memory leaks, and optimized our application for long-term stability.

1. Understanding the Enemy: The V8 Memory Model

Before you can optimize memory, you have to understand how Node.js allocates it. Node.js runs on Google’s V8 JavaScript engine, which uses a generational garbage collection system. Memory is divided into two primary segments:

New Space (Young Generation): This is where new objects are created. It is small, fast, and collected frequently by a process called the Scavenger. Most objects “die young” and are quickly cleared out here.
Old Space (Old Generation): If an object survives multiple garbage collection cycles in the New Space, it is promoted to the Old Space. This space is much larger and is managed by the Mark-Sweep and Mark-Compact algorithms. Garbage collection here is computationally expensive and can “stop the world,” pausing your application’s execution.

Our problem was that too many objects were surviving the Scavenger, flooding the Old Space, and forcing the Mark-Sweep collector to work overtime. This resulted in high memory usage and high CPU spikes.
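
To see how the two generations are sized in a running process, the built-in v8 module exposes per-space statistics. The following is a minimal sketch: the helper name summarizeHeapSpaces is made up for illustration, and the exact set of spaces reported varies between Node.js versions.

```javascript
const v8 = require('v8');

// Hypothetical helper: summarize used and committed size (in MB) for each V8 heap space.
function summarizeHeapSpaces() {
    const toMB = (bytes) => Number((bytes / 1024 / 1024).toFixed(2));
    return v8.getHeapSpaceStatistics().map((space) => ({
        name: space.space_name,              // e.g. 'new_space', 'old_space'
        usedMB: toMB(space.space_used_size),
        committedMB: toMB(space.space_size),
    }));
}

console.table(summarizeHeapSpaces());
```

Logging this periodically under load is one way to confirm whether old_space keeps growing while new_space stays roughly flat, which is the pattern described above.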

2. Diagnosing the Bottleneck: Profiling Over Guessing

The first rule of optimization is to never guess. We needed empirical data to see exactly what was eating our RAM.

Taking a Heap Snapshot

We started by generating heap snapshots. Using the --inspect flag, we connected our local Node.js process to Chrome DevTools.

We booted our application with node --inspect index.js.
We opened Chrome and navigated to chrome://inspect.
We took a baseline snapshot of the memory.
We ran a load-testing script (using a tool like Artillery or JMeter) to simulate heavy user traffic.
We took a second snapshot and used the Comparison view in DevTools.

The comparison view was illuminating. It showed us the “Delta”—the objects that were created between snapshot A and snapshot B but were never garbage collected.
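
When attaching DevTools to a process is not practical, a snapshot can also be written from inside the process and opened in DevTools later. This is a sketch under assumptions rather than the setup described above: the helper name dumpHeapSnapshot, the file naming, and the choice of SIGUSR2 as the trigger are all illustrative.

```javascript
const v8 = require('v8');
const os = require('os');
const path = require('path');

// Hypothetical helper: write a .heapsnapshot file that can be loaded in the
// Memory tab of Chrome DevTools. Writing a snapshot blocks the event loop and
// needs extra memory while it runs, so trigger it sparingly.
function dumpHeapSnapshot(directory = os.tmpdir()) {
    const file = path.join(directory, `heap-${process.pid}-${Date.now()}.heapsnapshot`);
    return v8.writeHeapSnapshot(file); // returns the path of the written file
}

// Assumption: SIGUSR2 is free in your setup. It is not available on Windows,
// and some process managers use it for their own purposes.
process.on('SIGUSR2', () => {
    console.log('Heap snapshot written to', dumpHeapSnapshot());
});
```

Sending the signal once before and once after a load test (for example with kill -USR2 followed by the process ID) produces two files that can be compared in the same Comparison view.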

The Findings

Our profiling revealed three major culprits:

Massive strings and buffers being held in memory during file and network operations.
Unbounded in-memory caches that grew infinitely.
Memory leaks caused by improper use of closures and event listeners.

3. Strategy 1: Embracing Node.js Streams

One of the most significant architectural flaws we uncovered was how we were handling large data payloads. Our application frequently processed large CSV exports and image uploads.

Our original code looked something like this:

const fs = require('fs');

// Bad: Reading the entire file into memory at once
app.get('/download-report', async (req, res) => {
    const fileBuffer = await fs.promises.readFile('./massive-report.csv');
    res.send(fileBuffer);
});

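When a user requested a 500MB report, Node.js loaded that entire 500MB file into memory as a single Buffer. Buffer contents are allocated outside the V8 heap, but they still count toward the memory of the process. If four users requested the report simultaneously, the process was holding roughly 2GB of file data, and our application immediately crashed with an Out of Memory (OOM) error.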

We refactored our file handling and network requests to use Streams. Streams process data piece by piece (in chunks) rather than loading the whole payload into memory.

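const fs = require('fs');
const { pipeline } = require('stream');

// Good: Piping the file stream directly to the response
app.get('/download-report', (req, res) => {
    const readStream = fs.createReadStream('./massive-report.csv');
    // pipeline() pipes like .pipe(), but also destroys both streams if the
    // read fails or the client disconnects, so no file handle is left open.
    pipeline(readStream, res, (err) => {
        if (err) console.error('Report download failed:', err.message);
    });
});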

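By piping the read stream directly into the response stream, our memory usage for file downloads dropped from hundreds of megabytes to tens of kilobytes per request, because fs.createReadStream reads the file in 64KB chunks by default and waits for the response to drain before reading more. This single change accounted for roughly 30% of our total memory reduction.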
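
The same idea applies when the payload is generated rather than read from disk. The sketch below streams CSV lines from a generator instead of building one large string; csvLines, streamCsv, and the id/email columns are hypothetical names chosen for illustration, and the CSV formatting does no escaping.

```javascript
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// Hypothetical row source: yields one CSV line at a time instead of
// building the whole export as a single string. No quoting or escaping
// is done here, so it is only suitable for simple values.
function* csvLines(rows) {
    yield 'id,email\n';
    for (const row of rows) {
        yield `${row.id},${row.email}\n`;
    }
}

// Streams the generated lines into any writable destination (an HTTP
// response, a file stream) and resolves once everything has been flushed.
async function streamCsv(rows, destination) {
    await pipeline(Readable.from(csvLines(rows)), destination);
}
```

In a route handler, the destination would be the response object, and rows could be any iterable that produces records lazily.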

4. Strategy 2: Taming the ORM and Database Queries

Object-Relational Mappers (ORMs) like Prisma, Sequelize, or TypeORM are fantastic for developer productivity, but they can be disastrous for memory if left unchecked.

We found a background job that was fetching user records to send out a weekly newsletter. The query looked innocuous:

// Fetching 50,000 users into memory at once
const users = await User.findAll({ where: { subscribed: true } });

Fetching 50,000 records from the database doesn’t just hold the raw data in memory; the ORM “hydrates” those records, turning each row into a heavy JavaScript object with attached methods, getters, and setters. This resulted in massive memory bloat.

We implemented two fixes for database interactions:

Pagination and Cursors: Instead of fetching all records at once, we processed them in batches of 500 using limit and offset (or cursor-based pagination).
Raw Queries for Read-Only Data: When we didn’t need to update the records (like generating a report or sending an email), we bypassed the ORM’s hydration process entirely. By passing { raw: true } in Sequelize (or equivalent commands in other ORMs), we received lightweight, plain JavaScript objects, drastically reducing the memory footprint.
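
The batching approach can be sketched independently of any particular ORM. In the example below, fetchBatch and sendEmail are hypothetical functions injected by the caller, and the rows are assumed to have a numeric, increasing id greater than zero. With Sequelize, fetchBatch could be a findAll call with where, order, limit, and raw: true.

```javascript
// Yields rows in batches using keyset (cursor) pagination.
// fetchBatch(afterId, limit) is a hypothetical function that returns up to
// `limit` plain rows with id greater than `afterId`, ordered by id.
async function* inBatches(fetchBatch, batchSize = 500) {
    let afterId = 0;
    while (true) {
        const rows = await fetchBatch(afterId, batchSize);
        if (rows.length === 0) return;
        yield rows;
        // Continue after the last id we saw instead of using an offset.
        afterId = rows[rows.length - 1].id;
        // A short batch means there is nothing left to fetch.
        if (rows.length < batchSize) return;
    }
}

// Usage sketch: only one batch of users is held in memory at a time.
async function sendNewsletter(fetchBatch, sendEmail) {
    let sent = 0;
    for await (const batch of inBatches(fetchBatch)) {
        for (const user of batch) {
            await sendEmail(user);
            sent += 1;
        }
    }
    return sent;
}
```

Compared with limit and offset, continuing from the last seen id avoids making the database skip over all previously processed rows on every batch.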

5. Strategy 3: Eliminating In-Memory Caching Leaks

Caching is essential for performance, but an unbounded cache is just a memory leak in disguise.

To speed up API responses, a previous developer had implemented a simple in-memory cache using a native JavaScript Map:

const responseCache = new Map();

app.get('/api/data', async (req, res) => {
    const key = req.url;
    if (responseCache.has(key)) {
        return res.json(responseCache.get(key));
    }
    const data = await fetchExpensiveData();
    responseCache.set(key, data); // Memory leak! No expiration.
    res.json(data);
});

Because there was no mechanism to clear old entries, this Map grew indefinitely until the server crashed.

We replaced native Maps with two safer alternatives:

LRU Cache (Least Recently Used): For small, process-level caching, we adopted the lru-cache npm package. This allows you to set a strict maximum number of items (e.g., 500). When the cache is full, the least recently used items (those that have gone longest without being read or written) are automatically evicted.
Redis: For larger datasets and distributed caching across multiple Node.js instances, we moved the cache out of the Node process entirely and into a dedicated Redis server. This freed up the V8 heap to focus purely on application logic.
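
To make the eviction policy concrete, here is a minimal LRU cache built on the insertion order of a Map. It is illustrative only: TinyLRU is a made-up class, not the implementation or API of the lru-cache package, and it has no TTL or size-in-bytes accounting.

```javascript
// Minimal LRU cache: a Map iterates keys in insertion order, so the first
// key is always the least recently used one as long as we re-insert on access.
class TinyLRU {
    constructor(max) {
        this.max = max;
        this.map = new Map();
    }

    get(key) {
        if (!this.map.has(key)) return undefined;
        const value = this.map.get(key);
        // Re-insert so this key becomes the most recently used.
        this.map.delete(key);
        this.map.set(key, value);
        return value;
    }

    set(key, value) {
        this.map.delete(key);
        this.map.set(key, value);
        if (this.map.size > this.max) {
            // Evict the least recently used entry.
            const lruKey = this.map.keys().next().value;
            this.map.delete(lruKey);
        }
    }

    get size() {
        return this.map.size;
    }
}
```

The important property is that memory is bounded by max regardless of how many distinct keys the application sees.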

6. Strategy 4: Fixing Closure and Event Listener Leaks

JavaScript’s scoping rules make it very easy to accidentally hold onto references longer than intended.

The Event Listener Trap

Node.js relies heavily on the Event-Driven architecture. However, if you attach an event listener to a long-lived object (like a global server instance or a database connection pool) but fail to remove it, the garbage collector can never clean up the callback function or the variables it references.

We aggressively audited our use of EventEmitter. Whenever we used .on(), we ensured there was a corresponding .removeListener() or .off() when the lifecycle of that specific operation ended. We also paid close attention to the MaxListenersExceededWarning in our logs, which Node.js prints when more than 10 listeners (the default limit) are registered for the same event on one emitter, and which often points to a listener leak.
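
One pattern that makes the pairing hard to forget is to attach and detach the listener in the same function, with the removal in a finally block. This is a sketch: pool is a plain EventEmitter standing in for a long-lived object, and runOperation is a hypothetical helper.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a long-lived object such as a connection pool.
const pool = new EventEmitter();

// Hypothetical per-request operation that needs to react to pool errors
// only while it is running.
async function runOperation(work) {
    const onError = (err) => console.error('pool error during operation:', err.message);
    pool.on('error', onError);
    try {
        return await work();
    } finally {
        // Always detach, even if work() throws, so the callback and
        // everything it closes over can be garbage collected.
        pool.off('error', onError);
    }
}
```

For listeners that should fire at most one time, .once() removes the listener automatically after the first event.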

Closure Scope Retention

Closures can unintentionally keep large objects alive. Consider this scenario:

function processData() {
    const massiveObject = getMassiveData();

    return function logData() {
        console.log(massiveObject.id);
    }
}

Even though logData only needs the id, the entire massiveObject is kept in memory for as long as logData itself is reachable, because the closure captures the variable that references it. We refactored our closures to copy out only the primitive values they needed, allowing the garbage collector to reclaim the large objects once nothing else referenced them.
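
A refactored version of the example might look like the following sketch. The getMassiveData stand-in and its payload size are invented so that the snippet runs on its own.

```javascript
// Stand-in for getMassiveData(); the payload size is arbitrary.
function getMassiveData() {
    return { id: 42, payload: new Array(1e6).fill('x') };
}

function processData() {
    const massiveObject = getMassiveData();
    // Copy out the one primitive the callback needs.
    const id = massiveObject.id;

    // The returned closure references `id` only, not massiveObject.
    return function logData() {
        console.log(id);
        return id;
    };
}
```

One caveat: closures created in the same scope share their captured variables, so if any other inner function in processData still referenced massiveObject, the object could stay alive for as long as any of those closures does.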

7. Strategy 5: Fine-Tuning the Runtime Environment

Once the code was optimized, we looked at how Node.js was interacting with the server environment.

Adjusting `--max-old-space-size`

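Node.js does not automatically size its heap to match your container. Older releases capped the old-generation heap at roughly 1.5GB on 64-bit systems, while newer releases derive the default from the memory available on the machine, so the exact limit depends on your Node.js version and environment. If your server has 4GB of RAM, Node.js may not use it unless you tell it to. While fixing the code is always the priority, adjusting the memory limit can provide necessary breathing room for heavily concurrent applications.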

We updated our start scripts (and Dockerfiles) to explicitly set the memory limit closer to our container’s actual capacity:

node --max-old-space-size=3072 index.js
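
To confirm that the flag actually reached the process (for example, when it is passed through a Dockerfile or the NODE_OPTIONS environment variable), the effective limit can be read back at startup. A small sketch, with heapLimitMB as an illustrative helper name:

```javascript
const v8 = require('v8');

// Reports the total heap limit V8 is running with, in MB.
function heapLimitMB() {
    return Math.round(v8.getHeapStatistics().heap_size_limit / 1024 / 1024);
}

console.log(`V8 heap size limit: ${heapLimitMB()} MB`);
```

The reported number covers the whole heap rather than only the old space, so expect it to be somewhat higher than the value passed to --max-old-space-size.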

Process Management with Clustering

Because Node.js runs application JavaScript on a single thread, a single process can effectively use only one CPU core. If that one process accumulates too much memory garbage, it drags down the entire application.

We utilized PM2 (and later Kubernetes native deployments) to implement Node.js Clustering. By spinning up multiple smaller worker processes (one per CPU core) rather than one monolithic process, the garbage collection load was distributed. If one worker’s memory spiked, it could be gracefully restarted without taking down the entire API.

The Results and Key Takeaways

After deploying these changes incrementally over three weeks, the results were undeniable:

Memory Usage: Dropped from an average of 1.2GB per pod to a stable 350MB (a ~70% reduction).
Performance: The 99th percentile (p99) API latency dropped by 40% because the V8 engine was no longer spending valuable CPU cycles trying to aggressively clean up the bloated Old Space.
Infrastructure Costs: We were able to downsize our cloud instances and reduce the number of running containers, cutting our monthly hosting bill significantly.

Summary Checklist for Node.js Memory Optimization

If you are facing similar scaling issues, start here:

Do not guess; profile. Take heap snapshots and compare them to find the true source of the leak.
Stop buffering, start streaming. Never load large files or payloads entirely into memory.
Control your ORM. Use pagination, cursors, and raw queries to avoid hydrating massive data sets.
Cap your caches. Never use unbounded Maps or Arrays for caching. Use an LRU cache or an external service like Redis.
Watch your listeners. Ensure every event listener is properly removed when it is no longer needed.

Reducing Node.js memory usage isn’t about finding a single magic configuration flag. It is about understanding how the V8 engine manages data, respecting the limits of single-threaded architecture, and writing disciplined, predictable JavaScript. By applying these principles, you can build backend systems that scale gracefully without devouring your infrastructure budget.

Andriy Kravets

Andriy Kravets is a writer and an experienced .NET developer who enjoys using .NET for everyday development. He likes to build cross-platform libraries and software with .NET.