Building a native add-on for Node.js in 2019

Okay, but first: why the hell would you build a native add-on for Node.js?

The Node.js/JavaScript ecosystem is the most popular in the world, with more than 1 million packages hosted on npmjs.com. In addition, the latest features of the language and the incredible work by the Node.js/V8/libuv teams ensure that Node.js runs everywhere with great performance.

However, sometimes you might need something that is not offered by the runtime or the ecosystem, no matter how strong it is. For Sqreen, this situation arose when we set out to build our In-App WAF, and it is what prompted us to build a native add-on for Node.js.

A bit of context

At Sqreen, we protect applications by leveraging microagents in multiple technologies (Node.js, Ruby, Python, PHP, Java, and Go). When we decided to add In-App Web Application Firewall (WAF) capabilities to our agents, we had three potential solutions for implementing it:

  • Write it separately in each language we support — this solution was ruled out pretty quickly as too costly and time-intensive
  • Use the V8 runtime that is available in all of our agents
  • Build a native library and share it between the agents

The In-App WAF takes a traditional WAF’s approach of performing pattern-based protection at the perimeter and moves it inside a web application. This has several advantages over the traditional network-level deployment:

  • Sqreen can fingerprint the application’s stack and select only rules that are relevant, cutting down on false positives
  • The HTTP request is deserialized by the framework and then Sqreen applies the pattern to it. This gives a result way more accurate than a regular on-network WAF:
    • The pattern set is chosen according to the stack itself
    • The request is interpreted by the framework. The HTTP request is checked in its actual context

As we developed the In-App WAF, we made a series of implementation decisions and tradeoffs, and settled on building a native library, which we called libsqreen. This article will pick up on the items we discussed in the post above and take a look at how we built the native add-on for Node.js.

Back to Node.js

So here we are. We have built a native library that we plan to use in all of our agents. Thankfully, all the technologies we support provide a way to call native code.

To interface LibSqreen with Node.js, we only needed two things:

  • A static build of LibSqreen (libSqreenStatic.a)
  • Binding code to make LibSqreen available from Node.js

These bindings will give us a way to call the C functions from JavaScript. They will have to:

  • Convert the arguments from JavaScript to C
  • Call the C functions
  • Convert the results from C to JavaScript

What goes into creating the bindings

State of the art in 2019

There are two ways to interface C/C++ code with Node.js:

  • NAN (Native Abstractions for Node.js) – this is a set of tools that makes interfacing with V8 simpler and more stable across versions of Node.js. Our code could still be broken by changes to the V8 API, and it needs to be compiled for each major version of Node.js
  • N-API – this is a stable and V8-independent API for building add-ons for Node.js. You compile it once and it will run on all versions of Node.js that support N-API.

N-API is the best solution here. We would need to write the bindings only once, and they would remain stable across all versions of Node.js that support N-API. But there is a catch: each N-API version is only available on some versions of Node.js:


Node.js   N-API v3    N-API v4
v6.x      v6.14.2*    -
v8.x      v8.11.2     v8.16.0
v9.x      v9.11.0*    -
v10.x     v10.0.0     -
v11.x     v11.0.0     v11.8.0
v12.x     -           v12.0.0

(Source: https://nodejs.org/api/n-api.html#n_api_n_api_version_matrix)

It is also worth noting that code written against N-API v3 is compatible with N-API v4.

So at the end of the day, we need to write:

  • NAN bindings for Node.js 4.x, 5.x, 6.x, 7.x and 8.x
  • N-API v3 bindings that we will use when available
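
On the JavaScript side, the choice between the two sets of bindings can be made at load time. The following is a minimal sketch, not Sqreen's actual code: pickBinding is a hypothetical helper, and it assumes that process.versions.napi is defined on every Node.js release that ships N-API v3.

```javascript
// Hypothetical helper: decide which flavour of the bindings to load.
// `versions` defaults to process.versions; its `napi` field is a string
// such as '4' on releases that expose N-API, and undefined otherwise.
function pickBinding(versions = process.versions) {
  const napi = parseInt(versions.napi, 10);
  // N-API v3 code also runs on v4, so anything >= 3 is acceptable.
  if (Number.isInteger(napi) && napi >= 3) {
    return 'napi';
  }
  // Older runtimes fall back to the NAN build for their major version.
  return 'nan';
}

console.log(pickBinding());
```

Passing the versions object as a parameter keeps the helper easy to exercise against old runtimes without having to install them.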

So, let’s build a very useful add-on for Node.js with NAN and N-API: a function that returns the length of a string!

To build this, we will need the following for each of the two approaches:

  • A package.json file with node-gyp in the dependencies
  • A binding.gyp file
  • A binding file in C or C++

node-gyp is a build tool used to build Node.js add-ons. The binding.gyp file contains the instructions to build the add-on and will be consumed by node-gyp.

We will place the script 

"install": "node-gyp rebuild"

in our package.json, which means we can easily build the project with npm install.

In each implementation, our binding code will have four parts:

  • Imports
  • A method to obtain a String from a V8 or N-API parameter
  • A method we will expose to Node.js
  • An init method that will be called when the add-on is created to register the module

The binding for NAN/V8

The V8 API is written in C++, and so is the following code. In this section, I’ll walk through the steps needed to build our binding code in NAN.

Imports

In the case of NAN bindings, we only need to include nan.h (#include <nan.h>).

Obtain a string

In our example, the parameter is a string. We need to obtain it as a C++ string from a JavaScript one. We will then call a C++ function containing our core logic (here computing the length of the string) and return the result as a number to Node.js.

V8 uses Maybe<T> to expose data. Before accessing anything, we need to check if what we’re trying to access really exists:

Then it’s time to build a buffer to fill with the string. Since there has been a breaking API change between some versions of V8, we need to have two different tracks in our code:

Finally, we can build a string out of it and free our buffer:

Now that we have built our string, we can use it in our C++ function.

Method we will expose

Now that we have a string, the next step is pretty straightforward as most of the work is done in the previous method.

Here, we check that the function has been called with at least one argument. If so, we pass this argument to our get_string_from_item function.

Finally, we create a V8 Number with the help of Nan::New and set it as the return value of the function call.

Init the module

The next step is initializing the binding module. This is pretty easy as NAN provides us with two helpers:

Basically, NAN will provide an object named target on which we attach a property with the name of our function as key and the function as value.

Notice that we finish by calling the NODE_MODULE macro to register the module. If this call is missing, Node.js will not be able to load the module.

The full code of the binding is available here.

The binding for N-API

Building our binding for N-API is a bit easier than for NAN. N-API is a C API, so we will only need to write some C code here! Here are the steps involved with creating the binding for N-API:

Imports

Here, we also define the version of N-API we will use, so that node-gyp knows which version of the API to provide.

Obtain a string

In N-API, all actions return a napi_status. For each call we make, we need to check that this returned status is napi_ok.

In this code, we make a first call to napi_get_value_string_utf8 on line 12 to get the length of the string we will write. Then we allocate a buffer based on this length and call napi_get_value_string_utf8 again to fill it (on line 25).

Method we will expose

This method will return the length of a string.

This function follows the same principles as the NAN one:

  • On lines 9 to 17, we first check the arguments of the function
  • Then we call our get_utf8_string function on line 19
  • We create a number that we will return on line 24
  • And finally we return it on line 30

Init the module

Once again, we need to create a function to expose and attach it to the module. This is the C method we want to make available from JavaScript.

If you look at the last line, you will notice that we call the NAPI_MODULE macro. It is similar to the NODE_MODULE one.

The full code is available here.

Building for the world

Choosing build targets

Since we do not plan on releasing the sources of the module, we need to build it for multiple targets. In our case, there will be:

  • macOS
  • Windows
  • Linux with glibc – we will use CentOS 6 here to ensure we build with an older version of glibc
  • Linux with musl libc (mostly Alpine Linux)

In the example repo, we will ignore the Linux builds.

We will only focus on x86 architecture for now, but of course, we should eventually think about supporting other CPU architectures.

Remember, we also need to build for several versions of Node.js: one build per pre-N-API major version and one build for N-API.
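
To get a feel for the size of the resulting build matrix, here is a rough sketch. It is only an illustration: the platform labels, the object shape, and the buildMatrix helper are assumptions, while the list of NAN major versions is the one given earlier in this article.

```javascript
// Hypothetical labels for the four build targets listed above.
const platforms = ['darwin', 'win32', 'linux-glibc', 'linux-musl'];
// One NAN build is needed per pre-N-API major version of Node.js.
const nanMajors = [4, 5, 6, 7, 8];

function buildMatrix() {
  const targets = [];
  for (const platform of platforms) {
    for (const major of nanMajors) {
      targets.push({ platform, binding: 'nan', node: `${major}.x` });
    }
    // A single N-API build covers every release that supports N-API v3.
    targets.push({ platform, binding: 'napi', napiVersion: 3 });
  }
  return targets;
}

console.log(buildMatrix().length);
```

Each platform needs five NAN builds plus one N-API build, which is why dropping old Node.js versions shrinks the matrix much faster than dropping a platform.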

Building in the cloud

Before we could get started, we needed to choose a cloud platform for our modules. After a bit of research, we found that Azure Pipelines was an ideal platform:

  • It supports Windows and macOS
  • It has Docker support through Ubuntu
  • It provides free builds for open source projects

In our example here, we will only build and run the tests.

The builds are available here and the build manifest is in the repository.

Ultimately, these built add-ons are to be pushed to AWS S3 for distribution.

Distributing our Node.js native add-on

Now that we had chosen a cloud platform for our modules, we needed to choose a means of distributing them.

node-pre-gyp to the rescue

node-pre-gyp is a great tool for anyone who would like to distribute Node.js native add-ons:

When a package is installed, node-pre-gyp downloads the right tarball from a remote server and lets the JavaScript code know where to find it. 

It works based on a configuration placed in the package.json file:

The line "remote_path": "./nodejs/libsqreen/b20190916.65/{platform}-{libc}/{arch}/" is used to tell it which tarball to download. When running, it will fill the path with the following variables:

  • platform (darwin, win32, linux,…)
  • libc (glibc, musl, unknown)
  • arch (x64, ia32, …)
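
To make the substitution concrete, here is a minimal sketch of how such a template gets filled. fillRemotePath is a hypothetical helper written for illustration only; node-pre-gyp performs the real substitution internally, and the values passed in are example values.

```javascript
// Hypothetical re-implementation of the placeholder substitution.
// Unknown placeholders are left untouched so mistakes stay visible.
function fillRemotePath(template, values) {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    Object.prototype.hasOwnProperty.call(values, key) ? values[key] : match
  );
}

const template = './nodejs/libsqreen/b20190916.65/{platform}-{libc}/{arch}/';
const filled = fillRemotePath(template, {
  platform: 'linux', // example value
  libc: 'glibc',     // example value
  arch: 'x64',       // example value
});

console.log(filled);
// prints ./nodejs/libsqreen/b20190916.65/linux-glibc/x64/
```

Every combination of values that can occur on a user's machine needs a matching path in the bucket.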

It’s your job to ensure these paths exist in the S3 bucket, otherwise the call to npm install will fail! In our case, that is not a desired outcome. Fortunately, there is a way around it.

How to fail safe

The In-App WAF feature we recently introduced in Sqreen is not directly needed for the rest of the product to work. This means that if the native add-on is not available, only a certain set of features will be unavailable to the end user, while the rest of Sqreen’s monitoring and protection capabilities will work unhindered.

So, we decided to publish another module (named sq-native) that would only download the native add-on and make it available. This module has been set as an optional dependency of our main module.

This means that if for some reason, node-pre-gyp is not able to download the native add-on, the installation will show some errors in the console but the call to npm install will not fail. As such, Sqreen will continue to work in your application, just without the In-App WAF feature. 
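
On the JavaScript side, the main module then has to cope with the add-on being absent. The following is a sketch under assumptions rather than Sqreen's actual loader: loadOptional is a hypothetical helper, and its second parameter exists only so that the loading function can be swapped out.

```javascript
// Hypothetical helper: return the module if it can be loaded, null otherwise.
// Any error is swallowed on purpose, since a native add-on can fail to load
// for reasons other than being missing (for example an incompatible binary).
function loadOptional(name, load = require) {
  try {
    return load(name);
  } catch (err) {
    return null;
  }
}

const native = loadOptional('sq-native');
const wafEnabled = native !== null;

console.log(
  wafEnabled
    ? 'In-App WAF enabled'
    : 'In-App WAF disabled: native add-on not available'
);
```

The rest of the agent only needs to check the wafEnabled flag before calling into the native code.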

Conclusion: the quest is not over

This article has covered what the Sqreen native add-on looks like for Node.js. We touched on the process for building it and how we scoped out the need. This is not the end of the story, however. At this moment, our native add-on is available for all the platforms we support. But eventually, someone will want to run it on ARM processors and we will have to provide a solution here.

There are several ways to tackle this. We could extend our build matrix, for one, but another solution might be smarter: WebAssembly (WASM).

I have already built an alternative to our module using WASM. This would enable us to only build our add-on once and run it on all platforms for all versions of Node.js supporting WASM.

Regardless of what the future holds, building a native binding has set the stage for us to support the Sqreen In-App WAF in whatever way our users want to run it.
