```javascript
// This is the specific block of “clever” garbage that took down the
// payment gateway at 3:00 AM on a Sunday.
// Node v20.11.1 – Production Environment
var requestCache = {}; // Global scope leak

app.use((req, res, next) => {
  var correlationId = req.headers['x-correlation-id'] || Math.random().toString();

  // "Clever" optimization to avoid DB lookups
  if (requestCache[correlationId]) {
    res.send(requestCache[correlationId].payload);
    // Missing return statement. Execution continues.
  }

  db.query('SELECT * FROM transactions WHERE id = ?', [req.body.id], (err, result) => {
    if (err) {
      // No error handling. Just logging to a global variable.
      lastError = err;
    }

    new Promise((resolve, reject) => {
      // Wrapping a callback in a promise manually for no reason
      const processed = transformData(result);

      // Mutating a global object without a cleanup strategy
      requestCache[correlationId] = {
        payload: processed,
        timestamp: Date.now()
      };
      resolve(processed);
    }).then(data => {
      res.status(200).json(data);
    });
    // Unhandled promise rejection if transformData throws
  });
});
```
## The Global Scope is a Graveyard
Look at the first declaration: `var requestCache = {};`. This is not just a variable; it is a memory leak masquerading as a feature. In a Node.js environment, module-level scope persists for the lifetime of the process. By attaching an ever-growing object to that scope, you have effectively created a black hole for the heap.
Every single request that hit the middleware added a key to this object. Because the "clever" developer who wrote this didn't implement a Time-To-Live (TTL) or a Least Recently Used (LRU) eviction policy, the object grew until the V8 engine hit its default heap limit.
Here is the reality: the garbage collector (GC) cannot reclaim memory that is still reachable. Since `requestCache` is in the global module scope, every entry inside it is considered "reachable." Even after the HTTP response was sent, the data persisted. This isn't just bad code; it's a fundamental misunderstanding of how the V8 heap is structured.
When we ran the post-mortem diagnostics, the heap snapshot was damning.
```text
# node --inspect-brk index.js
# In Chrome DevTools:
# Snapshot 1: 24MB (Startup)
# Snapshot 2: 840MB (After 10,000 requests)
# Snapshot 3: 1.4GB (FATAL ERROR: Reached heap limit Allocation failed - JavaScript heap out of memory)
(node:45201) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues.
<--- Last few GCs --->
[45201:0x6000000] 15201 ms: Mark-sweep 1380.5 (1420.2) -> 1380.1 (1420.2) MB, 420.1 / 0.0 ms (average mu = 0.152, current mu = 0.002) allocation failure; scavenge might not succeed
```
The V8 engine spent 420ms trying to perform a Mark-sweep on a heap that was 99% full of un-evictable garbage. This is why following JavaScript best practices isn't optional; it's a matter of keeping the server alive. A simple `Map` with a size check or, better yet, an external store like Redis, would have prevented this. But no, we had to be “clever.”
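For the in-process option, here is a minimal sketch of that “`Map` with a size check”: naive oldest-first eviction plus a lazy TTL check. The cap and TTL values are illustrative, not tuned numbers.

```javascript
// Bounded cache sketch: a Map capped at MAX_ENTRIES, evicting the oldest
// entry on insert (Map iterates in insertion order), with a lazy TTL.
const MAX_ENTRIES = 10_000; // illustrative cap
const cache = new Map();

function cacheSet(key, payload, ttlMs = 300_000) {
  if (cache.size >= MAX_ENTRIES) {
    // Evict the oldest key; keys() yields entries in insertion order.
    cache.delete(cache.keys().next().value);
  }
  cache.set(key, { payload, expiresAt: Date.now() + ttlMs });
}

function cacheGet(key) {
  const entry = cache.get(key);
  if (!entry) return undefined;
  if (entry.expiresAt < Date.now()) {
    cache.delete(key); // expired: clean up lazily on read
    return undefined;
  }
  return entry.payload;
}
```

Even this naive version is bounded: memory stops growing at the cap instead of eating the heap. An LRU library or Redis is still the better answer for production.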
## Your Async Logic is a Race Condition Waiting to Happen
The middleware mixes legacy callbacks with “clever” manual Promise wrappers. Look at the `db.query` block. It's a callback. Inside that callback, someone decided to instantiate a `new Promise`.
Listen: if you are manually wrapping existing logic in `new Promise`, you are likely doing it wrong. This specific implementation failed to catch errors in `transformData(result)`. If that function throws an exception, the promise is rejected, but there is no `.catch()` block and no `await` with a `try/catch`.
In Node.js v20, an unhandled promise rejection will, by default, crash the process. But even before the crash, we saw the “Dual Response” bug. Because the developer forgot to `return` after `res.send(requestCache[correlationId].payload)`, the code continued to execute. It queried the database and attempted to send a second response for the same request.
```text
Error [ERR_HTTP_HEADERS_SENT]: Cannot set headers after they are sent to the client
    at ServerResponse.setHeader (node:_http_outgoing:648:11)
    at ServerResponse.header (express/lib/response.js:794:10)
    at ServerResponse.send (express/lib/response.js:174:12)
```
This flooded the logs and put unnecessary load on the database. The Event Loop was spinning its wheels processing queries for requests that were already closed. This is the cost of “clever” shortcuts. You think you’re saving a few lines of code, but you’re actually destroying the predictability of the execution stack.
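For reference, the dual-response bug is a one-keyword fix. A sketch, extracting the cache check into a hypothetical `handleCached` helper so the early exit is explicit:

```javascript
// Hypothetical extraction of the cache-hit path from the broken middleware.
// requestCache and correlationId are the same names the broken code used.
function handleCached(req, res, requestCache, correlationId) {
  if (requestCache[correlationId]) {
    // `return` ends the handler here. Without it, execution falls through
    // to the DB query and a second res.send(), raising ERR_HTTP_HEADERS_SENT.
    return res.send(requestCache[correlationId].payload);
  }
  return null; // cache miss: caller proceeds to the DB lookup
}
```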
## The Immutability Myth and the `const` Lie
I see this in every code review. A developer uses `const` and thinks they have created an immutable data structure.
```javascript
const config = { db: 'prod', timeout: 5000 };
config.timeout = 0; // Perfectly valid JS
```
In the production failure, a “clever” developer used `const` for a shared state object, then proceeded to mutate its properties across different middleware functions. They assumed that because the variable couldn't be reassigned, the data was safe.
Here is the reality: `const` only prevents the binding from being reassigned. The underlying object in the heap is still fully mutable. During the outage, one request modified a shared configuration object to set a `debug: true` flag. Because that object was shared (thanks to the global scope obsession), every subsequent request started dumping full stack traces and PII into the logs.
If you want immutability, use `Object.freeze()`. But even then, it's a shallow freeze. If you have nested objects, you're still vulnerable. This is why we use functional patterns where we return new objects instead of mutating existing ones. It's not about being “pure”; it's about not having production state corrupted by a stray line of code in a utility function.
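Both failure modes fit in a few lines: `const` stops reassignment but not mutation, and `Object.freeze()` is shallow. A minimal sketch, including a hypothetical `deepFreeze` helper for the nested case:

```javascript
const config = Object.freeze({ db: 'prod', limits: { timeout: 5000 } });

// Writes to frozen top-level properties fail (silently in sloppy mode,
// TypeError in strict mode), so the demo guards with try/catch.
try { config.db = 'staging'; } catch (e) { /* TypeError under 'use strict' */ }

// The freeze is shallow: nested objects remain fully mutable.
config.limits.timeout = 0; // succeeds

// Hypothetical helper: recursively freeze nested plain objects.
function deepFreeze(obj) {
  for (const value of Object.values(obj)) {
    if (value !== null && typeof value === 'object' && !Object.isFrozen(value)) {
      deepFreeze(value);
    }
  }
  return Object.freeze(obj);
}
```

With `deepFreeze`, the nested write above would fail the same way the top-level one does.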
## Hidden Classes and Why Your Object Mutation Kills Performance
We need to talk about the V8 engine and why the requestCache object was not just a memory leak, but a performance killer. V8 uses “Hidden Classes” (also called Shapes) to optimize object property access. When you create an object, V8 assigns it a hidden class. If you add a property, V8 creates a new hidden class that transitions from the first one.
In the broken code, the `requestCache` object was being treated like a catch-all bucket. Properties were being added dynamically with keys that were random strings (the `correlationId`).
```javascript
// V8 optimization nightmare
requestCache['id_1'] = { ... };
requestCache['id_2'] = { ... };
```
Because the keys were dynamic and the structure of the values sometimes varied (some had `timestamp`, some didn't), V8's Inline Caching (IC) failed. The engine could no longer predict the memory offset of the properties. It fell back to “dictionary mode,” which is significantly slower.
When we looked at the execution profile, the `KeyedLoadIC` and `KeyedStoreIC` stubs were taking up 15% of the CPU time. That is 15% of our compute power wasted because someone didn't want to use a proper data structure. Using a `Map` is one of those JavaScript best practices that actually matters for the engine. A `Map` is designed for frequent additions and removals of dynamic keys. An `Object` is designed for a fixed shape. Use the right tool.
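A sketch of the same cache on a `Map`, keeping every stored value the same shape so the inline caches stay monomorphic (`cacheEntry` is a hypothetical helper, not from the original code):

```javascript
// Map is the right structure for dynamic correlation-ID keys.
const requestCache = new Map();

function cacheEntry(payload) {
  // Always create the same shape, { payload, timestamp } in the same order,
  // so every cached value shares one hidden class.
  return { payload, timestamp: Date.now() };
}

requestCache.set('id_1', cacheEntry({ amount: 100 }));
requestCache.set('id_2', cacheEntry({ amount: 250 }));

// Deletion is a first-class, cheap operation on a Map. On a plain object,
// repeated `delete` is exactly what forces V8 into dictionary mode.
requestCache.delete('id_1');
```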
## The Heap Snapshot Doesn’t Lie
During the incident, I took a heap dump. If you haven’t looked at a heap dump under pressure, you haven’t lived. It’s a graveyard of strings and leaked closures.
The most offensive part of the dump was the “Closure” section. Every time that middleware ran, it created a new closure for the db.query callback. That closure captured the req and res objects. Because the db.query was slow and the requestCache was holding onto references, we had thousands of IncomingMessage and ServerResponse objects sitting in memory long after the socket had closed.
```text
(Raw Heap Snapshot Analysis)
Object Name            |  Count  | Shallow Size | Retained Size
---------------------------------------------------------------
Closure                | 45,021  |    3.6 MB    |   450.2 MB
IncomingMessage (req)  | 12,104  |    2.1 MB    |   210.5 MB
ServerResponse (res)   | 12,104  |    2.4 MB    |   240.8 MB
system / Context       | 45,021  |    5.2 MB    |   380.1 MB
```
The “Retained Size” is what matters. That is the amount of memory that would be freed if that object was deleted. Those closures were holding onto the entire request context. This is why we don’t nest logic five levels deep. Every level of nesting is another opportunity to capture scope that you don’t need, preventing the GC from doing its job.
## Error Handling is Not an Afterthought
The code had a `lastError = err;` line. What is this? A diary? A log for ghosts?
If an error occurs in a database query, the request must be terminated with a 5xx status code, or it must be retried. Simply assigning the error to a global variable does nothing for the user and nothing for the system's stability.
Furthermore, the lack of a `try/catch` block around the `transformData` call meant that any malformed data from the database would crash the entire worker process. We are running Node v20; we have `async/await`. There is no excuse for this callback-hell-style negligence.
Here is how a professional would have written that block, following actual JavaScript best practices:
```javascript
app.use(async (req, res, next) => {
  const correlationId = req.headers['x-correlation-id'] || crypto.randomUUID();
  try {
    const cached = await cacheService.get(correlationId);
    if (cached) {
      return res.json(cached);
    }
    const [rows] = await db.execute('SELECT * FROM transactions WHERE id = ?', [req.body.id]);
    if (!rows.length) {
      return res.status(404).json({ error: 'Not Found' });
    }
    const processed = await transformData(rows[0]);
    await cacheService.set(correlationId, processed, { ttl: 300 });
    res.json(processed);
  } catch (err) {
    logger.error({ err, correlationId }, 'Transaction processing failed');
    next(err); // Pass to Express error handler
  }
});
```
Notice the differences. `async/await` makes the execution flow linear and readable. The `return` statements ensure that we never send multiple responses. The `try/catch` block ensures that any error, whether from the DB, the transformation, or the cache, is caught and passed to the centralized error handler. The cache is an external service with a TTL. No memory leaks. No “clever” manual promise wrappers.
## The Cost of “Clever” One-Liners
The original code used `Math.random().toString()` for a correlation ID. This is a collision nightmare in a high-traffic system. `Math.random()` is not cryptographically secure, and its output carries limited entropy. We saw duplicate IDs in the logs, which meant Request B was getting the cached data from Request A.
This is a data leak. This is a security incident.
Stop trying to save three seconds by using a built-in function that isn't fit for purpose. Use `crypto.randomUUID()`. It's built into Node.js. It's fast. It's correct.
Also, look at the npm audit results for the environment this was running in.
```text
# npm audit
found 12 vulnerabilities (4 moderate, 8 high) in 1402 packages

High:      Prototype Pollution in 'lodash'
High:      Remote Code Execution in 'qs'
Moderate:  Denial of Service in 'express'
```
We are running outdated, vulnerable versions of core libraries because “if it ain’t broke, don’t fix it.” Well, it broke. It broke spectacularly. Part of being a Senior Architect is realizing that your code doesn’t live in a vacuum. It lives in an ecosystem of dependencies that need constant maintenance.
## The Execution Context and the Event Loop
When the server started lagging, the “clever” developer’s first instinct was to add more workers. This made it worse. Each worker started consuming 1.5GB of RAM, hitting the node’s total memory limit and triggering the OOM (Out Of Memory) Killer in Linux.
The Event Loop was blocked. When the heap is full, the GC runs constantly. Since the GC is a “stop-the-world” operation in many phases, the Event Loop cannot process I/O.
We monitored the `eventLoopDelay`:
```text
# Monitoring Output
Event Loop Delay: 150ms
Event Loop Delay: 450ms
Event Loop Delay: 1200ms
Event Loop Delay: 5000ms (CRASH)
```
A delay of 5 seconds means the server is effectively dead. It cannot accept new connections. It cannot heartbeat to the load balancer. The load balancer then marks the instance as unhealthy and kills it, but the new instance just inherits the same “clever” code and the same traffic, leading to a cascading failure.
## Refactoring the Memory Management
The global requestCache has been removed. In its place, we’ve implemented a proper multi-layered caching strategy. But more importantly, we’ve addressed the way we handle object allocation.
We are now using `Buffer` pools for heavy data processing to keep the pressure off the V8 heap, and `stream.pipeline` for data transformation to ensure we aren't loading massive database result sets into memory all at once.
Look at this comparison of the “clever” way vs. the right way:
Clever Way (Broken):
```javascript
const data = await db.query('SELECT * FROM large_table');
const processed = data.map(row => heavyTransform(row));
res.json(processed); // Allocates a massive string in the heap
```
Right Way (Stable):
```javascript
const { Transform, pipeline } = require('node:stream');

// .cursor() is driver-dependent; it must expose rows as an object-mode stream.
const cursor = db.query('SELECT * FROM large_table').cursor();

pipeline(
  cursor,
  new Transform({
    objectMode: true,
    transform(row, _encoding, callback) {
      // Push each transformed row as it arrives instead of buffering the set.
      callback(null, JSON.stringify(heavyTransform(row)) + '\n');
    }
  }),
  res, // Streams data, constant memory footprint
  (err) => { if (err) next(err); }
);
```
The “Right Way” uses the same amount of memory whether you are processing 10 rows or 1,000,000 rows. This is what it means to build scalable systems. It’s not about how fast you can write the code; it’s about how the code behaves when the load increases by 100x.
## Final Directives for the Engineering Team
I am tired of cleaning up these messes. From this point forward, the following rules are non-negotiable:
- No Global State: If I see a variable defined in the module scope that isn't a constant configuration value, the PR will be rejected. Use dependency injection or a state management service.
- Async/Await Only: No more nested callbacks. No more manual Promise wrappers. If a library doesn't support promises, use `util.promisify`.
- Strict Error Handling: Every promise must have a `.catch()` or be inside a `try/catch`. Every Express middleware must call `next(err)` on failure.
- No “Clever” Hacks: If you think you've found a “cool” way to save a few CPU cycles by bypassing a standard pattern, you are wrong. Follow JavaScript best practices. The V8 engine is smarter than you; write code that it can optimize.
- Mandatory Linting: We are moving to a strict ESLint configuration that forbids the use of `var`, requires `await` for promises, and flags potential memory leaks.
The outage cost the company significant revenue and reputation. It was entirely preventable. It wasn’t caused by a surge in traffic or a hardware failure. It was caused by “clever” code written by people who didn’t respect the runtime.
Read the documentation. Understand the Event Loop. Respect the Heap. Or find another job where “clever” is an acceptable substitute for “correct.”
```bash
npx eslint . --config .eslintrc.strict.json --fix && npx npm-check-updates -u && npm install && node --trace-warnings --heap-prof index.js
```