Shrinking node_modules for AWS Lambda (My First “npm shock” and a Practical Fix)

I’m new to Node.js and not really familiar with the whole npm ecosystem. I honestly didn’t expect that a single npm install would explode into ~10MB spread across ~30 different folders.

After a few minutes of staring at it, I realized what’s actually inside those folders: not just runtime code.

What shocked me

Inside node_modules you get everything:

  • README files
  • source files
  • tests
  • examples
  • docs

And then the next level of madness: modules inside modules. Duplicated dependencies nested multiple levels deep (e.g. inherits, core-util-is, tedious…), even if they already exist elsewhere in the project tree.
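To see how bad the duplication really is, a small script can walk the tree and count how many copies of each package are installed. This is only a rough sketch: the function names listPackages and findDuplicates are made up for illustration, and it simply trusts the name field of every package.json it finds.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively collect every installed package as { name, version, dir }.
function listPackages(nodeModulesDir, found = []) {
  if (!fs.existsSync(nodeModulesDir)) return found;
  for (const entry of fs.readdirSync(nodeModulesDir, { withFileTypes: true })) {
    // Skip files and dot-folders such as .bin
    if (!entry.isDirectory() || entry.name.startsWith('.')) continue;
    const dir = path.join(nodeModulesDir, entry.name);
    if (entry.name.startsWith('@')) {
      // Scope folder: the real packages live one level deeper
      listPackages(dir, found);
      continue;
    }
    const manifest = path.join(dir, 'package.json');
    if (fs.existsSync(manifest)) {
      const { name, version } = JSON.parse(fs.readFileSync(manifest, 'utf8'));
      found.push({ name, version, dir });
    }
    // Descend into the package's own nested node_modules, if any
    listPackages(path.join(dir, 'node_modules'), found);
  }
  return found;
}

// Return [name, copies[]] pairs for every package installed more than once.
function findDuplicates(nodeModulesDir) {
  const byName = new Map();
  for (const pkg of listPackages(nodeModulesDir)) {
    if (!byName.has(pkg.name)) byName.set(pkg.name, []);
    byName.get(pkg.name).push(pkg);
  }
  return [...byName.entries()].filter(([, copies]) => copies.length > 1);
}

// Example:
// for (const [name, copies] of findDuplicates('./node_modules')) {
//   console.log(name, copies.map(c => `${c.version} in ${c.dir}`));
// }
```

Copies with different versions are usually there on purpose (two packages asking for incompatible ranges), while identical versions are candidates for the built-in npm dedupe command.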

Here’s a typical example: I’m in node_modules → package bl, and there’s another node_modules folder inside it, plus test, README and other “junk”… all for a ~10KB JS file.

The obvious assumption (that turned out wrong)

Once I accepted the chaos, I assumed there must be a “build for production” command that would “compile” dependencies into something lean:

  • only dist / lib
  • no tests / docs / examples
  • no duplicated nested dependencies
  • preferably minified

Aaaaannnnnd… nope. There isn’t (at least not as a standard npm feature that magically fixes it).

Why I cared: AWS Lambda size limits

Note: all of this happened in the context of AWS Lambda functions, which have strict size limits. Which limit applies depends on what exactly you upload: a zipped deployment package, its unzipped contents, or layers. There is also a storage quota per account and region.

It wasn’t a problem for me at that moment, but it can easily become one later, when the system grows and each Lambda drags along its own node_modules mess.
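To make "strict" a bit more concrete, here is a tiny sketch that compares package sizes against the two quotas that usually bite first: 50 MB for a zipped package uploaded directly, and 250 MB for the unzipped function code plus all its layers. Those numbers are the documented values as far as I know, so treat them as an assumption and check the current AWS docs. The names LAMBDA_LIMITS and checkLambdaSize are mine, not part of any AWS tooling.

```javascript
const MB = 1024 * 1024;

// Assumed quotas; verify against the current AWS Lambda documentation.
const LAMBDA_LIMITS = {
  zippedBytes: 50 * MB,    // zipped deployment package, direct upload
  unzippedBytes: 250 * MB, // unzipped function code plus all layers
};

// Return a list of human-readable problems; an empty list means "fits".
function checkLambdaSize({ zippedBytes, unzippedBytes }) {
  const problems = [];
  if (zippedBytes > LAMBDA_LIMITS.zippedBytes) {
    problems.push(`zipped package is ${(zippedBytes / MB).toFixed(1)} MB, limit is 50 MB`);
  }
  if (unzippedBytes > LAMBDA_LIMITS.unzippedBytes) {
    problems.push(`unzipped size is ${(unzippedBytes / MB).toFixed(1)} MB, limit is 250 MB`);
  }
  return problems;
}

// Example:
// console.log(checkLambdaSize({ zippedBytes: 12 * MB, unzippedBytes: 40 * MB }));
```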

The tool I found: npm-dist

I found a script called npm-dist that can somehow “shrink” modules by producing a dist output.

Why “somehow”? Because it can’t handle circular dependencies, and there are plenty of those in real projects. Still, it was something, and it helped reduce package size.

Basic usage

const Trim = require('npm-dist');
Trim("mssql");

But that works for a single package. I needed a loop.

Attempt 1: iterate by package-lock.json

First I used package-lock.json to iterate dependencies and shrink them:

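const Trim = require('npm-dist');
const packs = require('./package-lock.json');

// "dependencies" exists in lockfileVersion 1 and 2; version 3 lockfiles only have "packages".
Object.keys(packs.dependencies || {}).forEach(d => {
    console.log(d);
    Trim(d);
});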

I hit circular dependency issues on one of the modules, so I tried a more direct approach.
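One caveat about the lockfile approach: the top-level dependencies key only exists in lockfileVersion 1 and 2, while version 2 and 3 lockfiles list everything under packages, keyed by install path. A hedged sketch that collects package names from either shape could look like this (packageNamesFromLock is a hypothetical helper, not something npm or npm-dist provides):

```javascript
// Collect unique package names from a parsed package-lock.json object.
function packageNamesFromLock(lock) {
  const names = new Set();

  // lockfileVersion 1 and 2: top-level "dependencies", keyed by package name
  for (const name of Object.keys(lock.dependencies || {})) {
    names.add(name);
  }

  // lockfileVersion 2 and 3: "packages", keyed by path such as
  // "node_modules/bl/node_modules/inherits"; "" is the root project
  const marker = 'node_modules/';
  for (const key of Object.keys(lock.packages || {})) {
    const idx = key.lastIndexOf(marker);
    if (idx === -1) continue; // root project or a workspace folder
    names.add(key.slice(idx + marker.length));
  }

  return [...names].sort();
}

// Example:
// const names = packageNamesFromLock(require('./package-lock.json'));
```

Taking the part after the last node_modules/ means nested copies collapse into one name, which matches how the loop above calls Trim once per package name.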

Attempt 2: iterate by directories under node_modules

Instead of relying on dependency graph, I just enumerated folders in ./node_modules:

const Trim = require('npm-dist');
const fs = require('fs');

// Skip dot-folders (.bin and friends) and npm-dist itself.
const packs = fs.readdirSync("./node_modules").filter(
    d => !d.startsWith(".") && d !== 'npm-dist'
);
packs.forEach(d => {
    console.log(d);
    Trim(d);
});

I filtered out system folders (starting with a dot), excluded npm-dist itself, and then added the problematic modules to the exclusion filter by hand so the script would not terminate halfway.
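A slightly more forgiving variant of the same loop keeps an explicit skip list and wraps each call in try/catch, so one bad module gets reported instead of killing the whole run. This is a sketch under one big assumption: that the trim function throws on failure rather than hanging or exiting the process, which I have not verified for npm-dist. The trim function is passed in as a parameter, and trimAll is just an illustrative name.

```javascript
// Run `trim` over every name, skipping known-bad ones and recording failures.
function trimAll(names, trim, skip = new Set()) {
  const result = { done: [], skipped: [], failed: [] };
  for (const name of names) {
    if (skip.has(name)) {
      result.skipped.push(name);
      continue;
    }
    try {
      trim(name); // assumed to throw synchronously when it cannot handle a module
      result.done.push(name);
    } catch (err) {
      result.failed.push({ name, error: err.message });
    }
  }
  return result;
}

// Example (names taken from the directory listing shown earlier):
// const Trim = require('npm-dist');
// const report = trimAll(packs, Trim, new Set(['npm-dist']));
// console.log(report.failed);
```

Whatever ends up in failed is the list of modules that still need the manual treatment.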

Results (small project)

  • There were ~30 folders in node_modules
  • Only one module was problematic due to circular dependencies
  • Total size was ~4MB (excluding the problematic one)
  • A new dist directory was created by the script
  • Result size was ~1MB

As mentioned, I processed the problematic package separately, in isolation, and then copied its dist output into the main dist folder. Some manual adjustments were required, but it was worth it.
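For anyone who wants to reproduce the before/after numbers without eyeballing a file manager, a recursive size counter is enough. A minimal sketch, where dirSize is a made-up helper and the folder names in the example are the ones from this post:

```javascript
const fs = require('fs');
const path = require('path');

// Sum the sizes (in bytes) of all regular files under `dir`.
function dirSize(dir) {
  let total = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      total += dirSize(full);
    } else if (entry.isFile()) {
      total += fs.statSync(full).size;
    }
    // Symlinks and other special entries are ignored on purpose
  }
  return total;
}

// Example:
// const toMB = bytes => (bytes / (1024 * 1024)).toFixed(2) + ' MB';
// console.log('node_modules:', toMB(dirSize('./node_modules')));
// console.log('dist:        ', toMB(dirSize('./dist')));
```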

Final thoughts

I know this is not some “CI/CD regulated” or “production-oriented” process, but it helped me build a few minimized Lambda layers that I’m using right now, and I’m happy.
