How to debug memory leaks in a Node.js application on Heroku

Debugging memory leaks is rarely a piece of cake, especially when they only happen in production. The best way I’ve found to debug memory leaks in a Node.js application on Heroku is to analyze heap dumps.

Obtaining such heap dumps in production can be challenging, as it might be hard to connect remotely to a production instance with the debugger.

In this article, we will go through the steps needed to obtain and analyze heap dumps from a running Heroku dyno. This method will also work on other platforms as long as it is possible to perform similar operations.

To obtain the heap dump we need to:

  • Ensure the Node.js process has a debugger listening
  • Connect Chrome DevTools to the Node.js process
  • Collect the heap dump and download it locally

Before we get to the tutorial part of the post, though, we’ll cover some fundamentals. We start with a 101 on memory leaks. We’ll define the term, explaining what a memory leak is in general and in the context of Node.js. Then, we’ll explain why memory leaks are such a problem, and when they’re not. As it turns out, memory leaks can range in severity from “mostly harmless” to “borderline catastrophic”. It’s essential to be able to tell those scenarios apart when deciding how much effort and resources to divert to detecting and fixing memory leaks.

Finally, we share some basic tips you should be aware of if you want to write apps that perform well and don’t suffer from memory leaks. Then, it’s time to roll up your sleeves. Let’s get to it.

Node.js memory leak 101

We’ll soon get to the practical part of the post. But before that, it’s important we’re on the same page. What exactly is a Node.js memory leak? Why is this such a problem? These are the questions we’ll be answering right now.

Since it’s often cheaper to avoid a problem altogether than to fix it after it has happened, we’ll also share some tips and best practices you can adopt to prevent memory leaks from occurring in your Node.js apps.

What is a memory leak in Node.js?

A memory leak is a problem that can occur in many different programming languages. So, a Node.js memory leak isn’t necessarily different from a memory leak in other languages. Wikipedia defines a memory leak as “a type of resource leak that occurs when a computer program incorrectly manages memory allocations in a way that memory which is no longer needed is not released.”

Some programming languages, especially older ones, require programmers themselves to manage memory allocation in their applications. Such a process tends to be error-prone: developers might erase objects from memory that are still needed. The opposite is also true: programmers might fail to release the memory from an obsolete object. Over time, the amount of RAM needed by the application grows tremendously. More modern languages adopt automatic memory management. Through the use of a GC (garbage collection) process, the programs can periodically determine which objects are no longer in use and reclaim their memory.

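Node.js programs do make use of GC, but this doesn’t make them bulletproof against memory leaks. The GC can only reclaim objects that are no longer reachable from the running program. If the code keeps a reference to an object it no longer needs, for instance in a long-lived collection, the GC can’t deallocate that memory, and the result is a memory leak.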
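
As a minimal sketch of how that can happen, assume a hypothetical handler that stores something for every request in a module-level array. The names requestLog and handleRequest are made up for illustration. Because the array stays reachable for the lifetime of the process, the GC can never reclaim its entries:

```javascript
// Hypothetical example: a module-level array that is never trimmed.
const requestLog = [];

function handleRequest(request) {
  // Every call adds an entry that stays reachable through requestLog,
  // so the garbage collector is not allowed to reclaim it.
  requestLog.push({ url: request.url, receivedAt: Date.now() });
  return `handled ${request.url}`;
}

// Simulate 1,000 incoming requests.
for (let i = 0; i < 1000; i += 1) {
  handleRequest({ url: `/items/${i}` });
}

console.log(requestLog.length); // 1000
```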

Why should you care about memory leaks?

How much should you worry about memory leaks? It depends.

Memory leaks aren’t always a problem. They can even be harmless in some situations. There are two factors you should consider when deciding how much to care about memory leaks:

  1. How often they happen.
  2. The size of the leak.

Minor memory leaks that happen in apps that run only for a short time might be harmless and even go unnoticed. Desktop apps, for instance, usually have a lifetime short enough that memory leaks often don’t get to create much trouble. The situation might be completely different if we’re talking about a web server that runs 24/7. Even a small memory leak that happens very frequently (let’s say, at each request) can add up to enormous amounts of memory.

When it comes to Node.js apps, a small leak that happens on an execution path that doesn’t get executed often is not a big problem. If, on the other hand, the memory leak is large or frequent enough, it can make your app slow. When the application can’t get more RAM, it resorts to swap memory, which is much slower. If the situation is dramatic enough, the Node.js process itself can crash.
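
One way to judge how serious a leak is, sketched here under the assumption that logging from inside the process is acceptable, is to sample process.memoryUsage() at a fixed interval and watch whether heapUsed keeps climbing. The helper names and the 30-second interval are illustrative choices:

```javascript
// Convert a byte count to megabytes, rounded to one decimal place.
function toMegabytes(bytes) {
  return Math.round((bytes / 1024 / 1024) * 10) / 10;
}

// Return a compact summary of the current memory usage of this process.
function memorySnapshot() {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  return {
    rssMb: toMegabytes(rss),
    heapTotalMb: toMegabytes(heapTotal),
    heapUsedMb: toMegabytes(heapUsed),
  };
}

// Log a sample every 30 seconds (illustrative interval). unref() keeps the
// timer from preventing the process from exiting.
const memoryTimer = setInterval(() => {
  console.log('memory', memorySnapshot());
}, 30000);
memoryTimer.unref();
```

A heapUsed value that rises after every sample and never comes back down, even when traffic is quiet, is a hint that something is being retained.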

Best practices for avoiding a Node.js memory leak

We’re about to share the tips you can use to debug memory leaks in your Node.js apps. However, before we do that, we’ll share a list of best practices you can adopt to reduce the likelihood of a Node.js memory leak happening in the first place.

Understand how memory works in JavaScript

The first item in our list isn’t a best practice per se, but more of a general guideline. We urge you to seek education. In this particular scenario, it’s essential you acquire a proper understanding of how memory works in Node.js—and JavaScript as a whole—to not only be able to fight memory leaks but to write code that performs better and is safer overall.

This includes understanding the regions where objects are stored in memory in JavaScript. As is the case with many different programming languages, your variables in JavaScript are stored in two different areas of the memory: the stack and the heap. Understanding the nature and purposes of each of those areas can enable you to write code that is more effective and performs better in general.

Learn about garbage collection

Like many modern languages, JavaScript uses GC (garbage collection), also known as automatic memory management. The actual implementation of the GC depends on the JavaScript engine in use (V8, in the case of Node.js). But the general takeaway here is that GC processes, in JavaScript and otherwise, make assumptions about the objects in an application. These assumptions refer mainly to factors such as the lifetime and size of the objects. GC works with those assumptions to determine which objects are likely to be obsolete and ready for collection.

In short, you should learn about these assumptions and write code that conforms to them. Otherwise, you’ll be creating more pressure on the GC than you really need, harming the performance of your application.

Beware of globals

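Objects referenced by global variables don’t get collected for as long as those references exist, which usually means for the lifetime of the process. The overuse of globals is one of the most common causes of poor performance and memory leaks. It’s important that you use globals very cautiously, making sure to limit their number.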
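
To make the difference concrete, here is a small sketch contrasting a global collection with a function-scoped one. The names are hypothetical:

```javascript
// Hypothetical example: data attached to the global object stays reachable
// until it is explicitly removed.
globalThis.processedIds = [];

function processWithGlobal(id) {
  globalThis.processedIds.push(id); // grows for the lifetime of the process
}

function processWithLocal(ids) {
  // `seen` becomes unreachable when the function returns,
  // which makes it eligible for garbage collection.
  const seen = [];
  for (const id of ids) {
    seen.push(id);
  }
  return seen.length;
}

[1, 2, 3].forEach((id) => processWithGlobal(id));
console.log(globalThis.processedIds.length); // 3
console.log(processWithLocal([1, 2, 3])); // 3
```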

Use memory more effectively

Learning about stack and heap memory will allow you to use both more effectively. When possible, you should prioritize stack memory over heap memory, since stack memory usage doesn’t cause pressure on the GC. For the same reason, prefer using short-lived variables. In the same way, you should avoid creating big object trees. That’s especially true when the tree’s root is a long-lived object since that will make all of the leaves long-lived as well.

Also, avoid passing references to objects around unnecessarily, since that extends the lifetime of the object. Instead, prefer to copy the data you need from the object.
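
As a sketch of that last point, suppose some code needs to remember the method and URL of recent requests. Keeping the whole request object would also retain everything it references, while copying the two fields does not. The function name, field names, and the bound of 100 entries are assumptions for illustration:

```javascript
// Hypothetical example: keep only what is needed instead of the whole object.
const MAX_RECENT = 100; // illustrative bound so the list cannot grow forever
const recentRequests = [];

function rememberRequest(request) {
  // Copy the two small fields; the (potentially large) request object itself
  // is not retained by recentRequests.
  recentRequests.push({ method: request.method, url: request.url });
  if (recentRequests.length > MAX_RECENT) {
    recentRequests.shift(); // drop the oldest entry
  }
}

const bigRequest = {
  method: 'GET',
  url: '/reports',
  body: Buffer.alloc(1024 * 1024), // 1 MB payload that should not be retained
};
rememberRequest(bigRequest);

console.log(recentRequests[0]); // { method: 'GET', url: '/reports' }
```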

Enabling the Node.js inspector

Before we can analyze anything, we need to ensure that we have a debugger listening. There are two ways to enable the inspector on a Node.js process:

Solution 1: Changing the startup command

By default, Heroku starts a Node.js application by running npm start. Usually, this calls the start script defined in the package.json of the application. In the application used in this article, that script runs node bin/www.

Changing this script to add the --inspect flag (for example, node --inspect bin/www) will start the instances of the application with a debugger listening on a port that will be specified in the logs. The flag is described in the Node.js debugging documentation.

In total, this is what it looks like in the Heroku logs once this solution is implemented.

Figure: What changing the startup command looks like in the Heroku logs

Solution 2: Changing the process state through SSH

Solution 1 is the easiest way to enable an inspector in Node.js, but there are situations in which you can’t or won’t want to enable it. For example, you might not have access to the source code of the application and therefore can’t change the startup script. Or maybe you don’t want to change the state of all your production dynos and deploy your application only for debugging.

Fortunately, there is a way to send a signal to the process to enable a debugger session.

In order to do so, you will need the Heroku CLI to connect to the dyno through an SSH connection.

For all the following Heroku commands, you might need to add the --app <app_name> flag to tell the CLI which application to connect to. Also, by default, the CLI will connect to the dyno named web.1, and you might want to change that through the command line (see the Heroku documentation).

First, let’s connect to the dyno by running heroku ps:exec (Heroku might need to restart the dyno at this point).

Then, we need to identify the PID of the Node.js process, for example by listing the running processes with ps aux.

In our case, the process started with node bin/www has the PID 69. We will now send a signal to the process to let it know we need it to enable its debugger, by running kill -USR1 69.

As you can see, we have sent the USR1 signal to the process to change its state. Node.js starts its inspector when it receives SIGUSR1, as described in the Node.js documentation.

This is confirmed through the application’s logs on Heroku:

Figure: Changing the process state through signals

Attaching debugging tools to a Node.js process

In order to attach the debugging tools to our Node.js process, we need to make the websocket used by the debugger accessible on our local machine.

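To do that, we first need to identify the port we need to forward. This can be found in the logs of the application, in the line that starts with “Debugger listening on ws://” followed by the host and port.

In our case, this is port 9229, the default inspector port.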

To forward the port locally, let’s use the Heroku CLI by running heroku ps:forward 9229.

When port forwarding is established, we just need to open Chrome DevTools (going to chrome://inspect on Chrome) and after a few seconds, a target should be displayed under “Remote targets.”

If the target does not appear, make sure the port used is listed when clicking on “Configure.”

Collecting the heap dump and reading it

Now it’s time to collect and read the heap dump. First, click on the “inspect” link. This will open a new window with different tabs.

Find the “Memory” one — you should be prompted with the following window:

Click on “Take snapshot.” A new file will appear in the left hand side panel. Clicking on it will display the content of the heap:

In this view, objects are sorted by constructor. For the purpose of this walkthrough, I have introduced a memory leak in this application by creating an instance of the Access class for each request. This instance keeps a reference to the current HTTP request and is never cleaned up.
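
A minimal sketch of what such a leak could look like is shown below. The Access class and the allAccesses set are named in this walkthrough; everything else (the constructor body, the onRequest handler) is assumed for illustration and is not the application’s actual code:

```javascript
// Module-level set: it lives as long as the process does.
const allAccesses = new Set();

class Access {
  constructor(request) {
    // Holding the request keeps it (and everything it references) alive.
    this.request = request;
    this.createdAt = Date.now();
    // Registering the instance and never deleting it is the leak.
    allAccesses.add(this);
  }
}

// Hypothetical request handler: one Access instance per request.
function onRequest(request) {
  const access = new Access(request);
  return access.createdAt > 0 ? 'ok' : 'error';
}

// Simulate 500 requests; none of the Access instances can be collected.
for (let i = 0; i < 500; i += 1) {
  onRequest({ url: `/page/${i}`, headers: {} });
}

console.log(allAccesses.size); // 500
```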

You can see for yourself that this indeed leaks memory in the application.

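To detect constructors that have the biggest memory impact, let’s sort the items of this view by “Retained size”, which is the amount of memory that would be freed if the object and everything only it keeps alive were deleted (you can learn more about these terms on Chrome’s website).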

You can see that 24% of the process memory is held by these objects.

Now let’s look at how to identify where the leak is happening.

When expanding the list of the constructor, we can see all instances of this class. By selecting one of these instances, the list of retainers of this object is displayed:

In our case, the allAccesses set is clearly identified as the bad actor! With the location of the memory leak identified, we have everything we need to go off and fix it.

A few tips for debugging memory leaks in Node.js

Use the compare view

When suspecting a memory leak, you might want to take two separate heap dumps with a few minutes between them. Then, using the “comparison view”, you can identify which elements have been created between the snapshots.

Use constructors and classes in the code

As shown in the article, when reading the heap dump, elements are grouped by their constructor.

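Using classes and named constructors in your code, rather than only anonymous object literals, will make it more readable (and arguably more performant, but that’s probably a topic for another article). It will also save you a lot of time when hunting for a memory leak, because instances show up under a recognizable name in the heap dump. Do it; future you will be grateful.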
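
As a quick sketch of why this helps, with a hypothetical Session class: instances created through a class carry the class name as their constructor, while object literals all share the generic Object constructor, which makes them hard to tell apart in a snapshot.

```javascript
// Hypothetical example: a named class versus a plain object literal.
class Session {
  constructor(userId) {
    this.userId = userId;
  }
}

const namedInstance = new Session(42);
const plainObject = { userId: 42 };

console.log(namedInstance.constructor.name); // Session
console.log(plainObject.constructor.name); // Object
```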

Trigger a garbage collection before collecting the snapshot

At the top left-hand side of the Memory tab, there’s a little bin icon. Clicking on it will trigger a garbage collection in the application. Doing this before collecting a memory snapshot will remove elements that are not leaking and can therefore save you time when browsing the heap content.
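
Outside of DevTools, a collection can also be requested from code when the process is started with V8’s --expose-gc flag, which makes a global gc() function available. This is a sketch for local experiments rather than something to leave enabled in production, and the helper name is made up:

```javascript
// Request a garbage collection if the process was started with --expose-gc.
// Returns true when a collection was requested, false otherwise.
function collectGarbageIfExposed() {
  if (typeof globalThis.gc === 'function') {
    globalThis.gc();
    return true;
  }
  return false;
}

console.log(collectGarbageIfExposed());
```

Without the flag, the helper simply returns false, so the same code can run in both configurations.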

Conclusion

In this article, we’ve taken a look at how to debug memory leaks in a Node.js process running on Heroku by connecting and using a debugger. Feel free to contact me on Twitter if you have any questions or if you want to share your own tips with me!

If you’re looking for next steps or a more advanced way to debug memory leaks in Node.js on Heroku, try this: since the Heroku CLI is written in Node.js, you could write an automated tool to perform the collection and start analyzing heap dumps.
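
As a possible starting point for that kind of automation, Node.js can write a heap snapshot from inside the process with v8.writeHeapSnapshot(), and the resulting .heapsnapshot file can be loaded in the Memory tab of Chrome DevTools. The sketch below assumes a hypothetical 200 MB threshold and a made-up helper name, and it leaves out how the file would be retrieved from the dyno:

```javascript
const v8 = require('v8');

// Illustrative threshold: write a snapshot once the heap exceeds 200 MB.
const HEAP_THRESHOLD_BYTES = 200 * 1024 * 1024;

// Write a heap snapshot when heapUsed is above the threshold.
// `heapUsed` and `write` can be overridden, which keeps the function testable.
// Returns the snapshot file name, or null when nothing was written.
function snapshotIfAbove(
  thresholdBytes,
  heapUsed = process.memoryUsage().heapUsed,
  write = v8.writeHeapSnapshot
) {
  if (heapUsed <= thresholdBytes) {
    return null;
  }
  // Generating a snapshot is synchronous: it blocks the event loop and
  // needs additional memory while it runs.
  return write();
}

console.log(snapshotIfAbove(HEAP_THRESHOLD_BYTES));
```

Because writing a snapshot pauses the process, a tool built on this idea would probably want to limit how often it runs.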
