Compromising file systems with npm dependencies

JavaScript dependencies from the npm ecosystem can execute arbitrary code on one's own computer as early as installation time by means of so-called “install hooks”. This bachelor thesis subjects the use of this feature to a risk analysis and recommends mitigation strategies. To this end, an analysis tool was developed and used to examine about 250,000 existing packages. Although this analysis did not reveal any obviously malicious behavior, several packages that stole information could be reported to npm. As a mitigation strategy, we finally recommend preventing code execution and strictly limiting system access for development projects.

JavaScript packages from the npm ecosystem can execute arbitrary code on developers’ machines upon installation. Our work analyzes the existing usage for its risk and proposes mitigation strategies. To do so, we built an analysis pipeline and used it to check around 250,000 packages for their usage of that ability. While this dynamic analysis did not yield any malicious results, we did find 15 packages exfiltrating sensitive information by manually examining the npm metadata. Mitigation-wise, we advocate for suppressing such code execution and limiting access to the system outside the development directory.

1 Introduction

This first chapter introduces the problems we are inspecting in this thesis and motivates our work, outlines its structure and states the research questions we set out to answer.

1.1 Motivation

Many of the applications we use in our day-to-day lives today are either in our browser (like Gmail, Twitter, Google Docs) or desktop apps built on web technologies (like Slack, Microsoft Teams, Discord, Visual Studio Code). At the very core of them: an abundance of ever-growing JavaScript code [1].

Much of that code comes from third-party dependencies called packages or modules. npm has established itself as the single most used source of JavaScript packages. It is both a website that stores metadata and code and a command line program to download and install those packages on a developer’s computer.

Upon installing a package (commonly using npm install the_package_to_install), npm allows the package to execute arbitrary code with so-called install hooks (also described as install scripts). This is not unique to npm, as many other package managers also allow this¹. But with the proliferation of JavaScript and the ever-increasing number of packages the average project depends on, the potential number of victims and the attack surface are continuously growing.

To put the issue into perspective: GitHub alone, which acquired npm on 2020-03-16 [2], reports about 1.7 newly found vulnerabilities in the npm ecosystem every day² (both intentionally malicious packages and accidental security flaws).

In this thesis, we analyze the usage of install hooks across the npm ecosystem. We survey the available metadata and prototype a scalable dynamic analysis pipeline to detect malicious install hook behavior.

1.2 Structure

This first chapter focuses on an introduction to our work in the context of the risks for the npm ecosystem and stating our research questions.

In the second chapter, we give a brief introduction to the background knowledge required to understand our work. This includes descriptions of npm (both the registry and the command line interface), Docker (containers, Dockerfiles and the Docker system), Celery (the Python work queue) and strace (the Linux system call logger).

Afterwards, we present our study design in chapter 3: how we extracted the metadata from the npm servers, implemented our analysis pipeline and used it to gather behavior summaries.

Chapter 4 then presents the raw figures of our analysis, while chapter 5 discusses their implications and attempts to answer our research questions.

Chapter 6 then contextualizes our work among other related papers and endeavors in the area of securing software supply chains.

Our work is wrapped up in chapter 7, where we draw our conclusions and briefly discuss possible future work to improve on it.

1.3 Research Questions

2 Background

In this chapter, we provide some background knowledge required to understand this work. We introduce the npm ecosystem, Docker, Celery and strace.

2.1 npm

First, we outline the components and functionality of the npm ecosystem. This includes the npm registry and the CLI tool, as well as package.json files for JavaScript projects.

2.1.1 Registry

npm is the default package registry for the JavaScript ecosystem. Private registries for enterprises and alternative open registries do exist, but are insignificant in their proliferation among JavaScript developers.

A package registry is a common concept among programming languages. It is a service allowing developers to upload code (and auxiliary files) for other developers to download and reuse. These bundles of code are often referred to as packages and, in the case of npm, are just tar archives (similar to zip files) and some metadata (authors, version, license, etc.) freely downloadable via HTTP(S). Some companies working with JavaScript host their own non-public registries for internal code.

As a simple example, consider the following trivial, but still widely used, npm package: is-odd³. In essence, it provides a JavaScript function which developers can use by importing it to check if a given input is an odd number:

2.1.2 package.json

JavaScript projects commonly use a file named package.json to define their metadata and dependencies. Let us have a look at this simple example file in Listing 2.1.

This project is called some-application and depends on the packages fancy-error-messages (at version 1.3.0) and typescript (at version 3.0.0). It also defines a build task and an install hook.

To run the build task or fetch and install the dependencies, JavaScript developers commonly use the npm CLI (short for command line interface) tool, which we will introduce next.

2.1.3 CLI Tool

npm is a CLI tool used to manage (install, update and remove) dependencies in JavaScript projects. It can also run project-defined shortcuts for common tasks.

Most notable and relevant for this thesis is the execution of install hooks upon running npm install. While Listing 2.1 only uses the install hook, npm allows the usage of multiple scripts as install hooks. Currently, these are (executed in this order):

npm executes these hooks for every package in the dependency tree, including indirect⁴ ones. This gives packages on npm the possibility to download auxiliary data like platform-specific binaries depending on the specific system, but also to execute malicious scripts. Execution of such install hook scripts can be suppressed by adding a (not widely used) option to the installation command: npm install --ignore-scripts.

Although the npm CLI tool is bundled with Node.js installations, multiple alternatives have emerged over time, with yarn⁵ and pnpm⁶ being two popular ones. While they also install dependencies and run scripts, this work focuses on the original and most commonly used npm tool.

2.2 Docker

The term “Docker”, as it is used in this work, is an umbrella term for containerization technology, developed by the company of the same name. It can be used to create OCI images [3] via “Dockerfiles” [4] and to manage and run containers.

Containerization is, generally speaking, a method of isolating software (and everything required to run it) from the rest of the operating system (see Section 2.2.3).

2.2.1 Dockerfiles

A Dockerfile can be described as a recipe with the steps needed to build and then run software with Docker.

2.2.2 Images

Images are the built blueprints for containers (which are introduced in the next section). Simplified, they are a tar-compressed directory with all the dependencies for the software packaged in them and instructions on how to run that software, as previously defined in the Dockerfile.

This reads in the content of the file Dockerfile, runs the instructions contained in it top to bottom, and stores the finished image under the name (“tag” in Docker lingo) “my-software:v1”.

2.2.3 Containers

Containers are running versions (instantiations) of images. Docker uses Linux kernel features like process, file system and network namespacing to isolate them from the operating system they are run on. For the most part, you cannot access the outer operating system from within a container.

You could create a new container from the example image from above with this command:

This will unpack the image’s content to a directory and run the script specified in the ENTRYPOINT, isolated to this unpacked directory. From the viewpoint of the packaged software, the container mostly behaves like an operating system on its own, just for the processes running inside it. Ignoring mounts for now, as they are not relevant for this thesis, all files written and changed in the container will be removed once the container is removed.

It should be noted, though, that, by default, containers do not offer the same level of isolation and protection as virtual machines [5], which (among other mechanisms) do not share the kernel with the host computer. For this reason, if the Linux kernel of the computer you run a Docker image on has a security flaw, you might be able to use it to break the container isolation and take over the host.

Over time, multiple projects have emerged that aim to bring container security on par with virtual machines. One of them, gVisor [6], is used in this thesis to protect the author's computer at the cost of a small performance penalty.

2.3 Celery

Celery⁷ is a Python task queue system, meaning that it can receive a number of computational tasks and execute them on so-called worker processes.

As depicted in the following diagram, its architecture consists of a dispatcher process which sends tasks to a task queue (e.g. RabbitMQ or Redis), from which the worker processes will pick them up to execute. Results are then sent back via the task queue to the dispatcher:

Let us take a look at an adapted sample from the Celery documentation [7]. A simple task can be defined like this:

To start some worker processes which can execute this task, we run a command like this:

Having some running worker processes, we can then add this multiply task to the queue so that it will be executed:

This will add the task to the queue. A worker process will take that task, execute it, send the result back to the queue, where it will be picked up again from our calling code and placed in the result variable. This system allows us to centrally control work to be done, but have it executed on many workers on (potentially) many computers.

2.4 Strace

strace is a tool for Linux, which can be used to log each system call a process makes. System calls are used by programs to access kernel functionality, like opening and reading files, but also making network requests, allocating memory and starting other processes.

Listing 2.2: Abbreviated example strace output for the true command

execve("/usr/bin/true", ["true"], 0x7fff9cab6a30 /* 61 vars */) = 0
brk(NULL) = 0x557631f8b000
arch_prctl(0x3001 /* ARCH_??? */, 0x7fff336b71d0) = -1 EINVAL (Invalid argument)
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6d6b6e2000
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
newfstatat(3, "", {st_mode=S_IFREG|0644, st_size=105751, ...}, AT_EMPTY_PATH) = 0
exit_group(0) = ?
+++ exited with 0 +++

Listing 2.2 shows multiple possible system calls, some of them accessing the file system. A few important ones are:

Using strace like this, we can filter for only file system related system calls to track file reading and writing done by the npm install command:

3 Study Design

Our research process is split up into multiple stages, forming an analysis pipeline:

In this chapter, we state our design goals and describe our implementation of this pipeline. The code of this pipeline is open source [8].

3.1 Design Goals

As npm currently hosts over two million packages⁸ with even more total versions, it was clear that we could not analyze all of them: access to university computing resources for this bachelor thesis was limited to two Linux servers providing a total of 28 threads, plus the author’s laptop. This heterogeneous infrastructure guided our design decisions.

The implementation, especially of the third step, needs to work with freely available components and be easily self-hostable on commodity systems, which may be placed behind restrictive firewalls or networks using NAT⁹. We therefore settled on the following technologies:

3.2 Infrastructure Setup

All analysis can be replicated using a single Linux machine, but we hosted the components on multiple physical servers for increased throughput:

A central Redis server was set up as a job queue for Celery. All worker processes connect to it, retrieve their work and report their status back once finished. The worker processes are provided with a connection string in the form of redis://redis.example.cs.tu-dortmund.de:4242/0.

For the result storage, we set up an internet-facing Minio server. It was configured to be publicly reachable via HTTPS, so that results could be accessed from anywhere, including the web app described in Section 3.6. Data upload was restricted to the worker processes using access credentials.

Servers used to run the worker processes were set up to run only the analysis Docker container and docuum, a utility that deletes old and unused Docker images when the disk fills up too much. Not running any other software ensures that potential malware cannot access any relevant files should it manage to escape the Docker isolation.

3.3 Gathering Metadata

npm runs a publicly accessible CouchDB database which contains a complete list of all packages. As CouchDB’s API is just JSON over HTTP(S), we were able to run a database query to get a single JSON document with all package details:

Aside from the first line with some metadata and the last one, each line in between is one JSON object representing one package¹⁰.

As this file was too large to fit into RAM, we used a Python script to read the file line by line, parsing just one line (that is, one package’s JSON) at a time and inserting each version which made use of install scripts into an SQLite database table of the following (abbreviated) layout:

This process results in an easy-to-query data format with just the information relevant to our analysis.

From there on, we perform some basic statistical analysis on this database and also extract a list of packages with at least one hook script set for dynamic behavior analysis.

Column Title	Example Value
`ID`	`pkg@1.1.0`
`NAME`	`pkg`
`VERSION`	`1.1.0`
`AUTHOR_NAME`	`jsdev@example.com`
`SCRIPT_PREINSTALL`
`SCRIPT_INSTALL`	`node ./install.js`
`SCRIPT_POSTINSTALL`
… other scripts …
`DIST_TARBALL`	https://registry.npmjs.org/pkg/-/pkg-1.1.0.tgz
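
The line-by-line import described in this section could be sketched roughly as follows. This is a minimal illustration and not the thesis code: the row shape (a `doc` object with `name` and `versions`, each version carrying `scripts` and `dist.tarball`), the table name `versions` and the reduced column set (without `AUTHOR_NAME` and the other scripts) are assumptions made for the example.

```python
import json
import sqlite3
from typing import Iterable, Iterator, Tuple

# Hook names checked in this sketch; the real table tracks more scripts.
HOOKS = ("preinstall", "install", "postinstall")


def iter_hook_versions(lines: Iterable[str]) -> Iterator[Tuple]:
    """Yield one row per package version that defines at least one hook.

    Only a single line (one package) is parsed at a time, so the whole
    dump never has to fit into memory.
    """
    for line in lines:
        line = line.strip().rstrip(",")
        if not (line.startswith("{") and line.endswith("}")):
            continue  # first and last line of the dump, or blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            continue
        doc = row.get("doc") or {}  # assumed row layout
        name = doc.get("name") or row.get("id")
        for version, meta in (doc.get("versions") or {}).items():
            scripts = meta.get("scripts") if isinstance(meta, dict) else None
            if not isinstance(scripts, dict):
                continue
            hooks = [scripts.get(hook) for hook in HOOKS]
            if not any(hooks):
                continue
            tarball = (meta.get("dist") or {}).get("tarball")
            yield (f"{name}@{version}", name, version, *hooks, tarball)


def store(conn: sqlite3.Connection, rows: Iterable[Tuple]) -> None:
    """Insert the rows into a (hypothetical) `versions` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS versions ("
        "id TEXT PRIMARY KEY, name TEXT, version TEXT, "
        "script_preinstall TEXT, script_install TEXT, "
        "script_postinstall TEXT, dist_tarball TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO versions VALUES (?, ?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()
```

Because `iter_hook_versions` accepts any iterable of lines, an open file object can be passed directly, for example `store(conn, iter_hook_versions(open("all_docs.json", encoding="utf-8")))`, where the file name is a placeholder.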

3.4 Statistical Metadata Analysis

To gain a first understanding of our dataset, we ran multiple exploratory SQL queries on the SQLite database to determine (among others):

We used the results of those queries to manually inspect packages to understand typical hook usage and to guide the filter tuning for our dynamic analysis.

3.5 Package Behavior Analysis

For further maliciousness analysis, we observe the dynamic behavior of install hooks. We implemented a pipeline as depicted in Figure 3.1, which we will discuss in this subchapter.

To make this bachelor thesis feasible, we reduced our scope to checking if a package’s install hooks changed or touched files they should not have changed or touched.

From the SQLite database, we exported the newest version using hooks for each package as a text file, with every line following the same pattern: package_name@package_version.
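
Reading such a line back requires some care, because scoped package names (such as @someorg/somepkg) themselves start with an @. A minimal sketch, assuming the version never contains an @, splits at the last occurrence; the function name `parse_spec` is hypothetical.

```python
from typing import Tuple


def parse_spec(line: str) -> Tuple[str, str]:
    """Split 'package_name@package_version' at the last '@'.

    Splitting from the right keeps the leading '@' of scoped
    package names intact.
    """
    name, sep, version = line.strip().rpartition("@")
    if not sep or not name or not version:
        raise ValueError(f"not a name@version line: {line!r}")
    return name, version
```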

A Python script, the Celery dispatcher, reads this file and creates analysis tasks for each package version in it. Using Celery, these tasks are queued in a Redis server for worker processes to pick up.

Each worker process pulls and executes tasks proportionally to the available CPUs of the computer it runs on.

For each analysis task, the worker starts a new Docker container with a dummy JavaScript project already set up and npm patched to wrap all script executions into the strace utility to catch every file system access invoked by package install hooks.

Inside that container, it downloads and extracts the tarball for the package version to analyze into the dummy project’s node_modules/ directory. Afterwards, the analysis task saves a list of changed files, which Docker calls diff. As no hooks were executed yet, this list of changed files just includes the freshly extracted package files. Then, all available install hooks (according to the package.json of the newly extracted package) are called via the patched npm CLI. This generates strace log files in /tmp, whose contents are subsequently extracted from the container to generate a list of file paths which were read.
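
Turning strace log lines into file paths can be sketched as follows. This is a simplified illustration rather than the parser used in the pipeline: the regular expressions, the selection of system calls and the write heuristic are assumptions, and the returned dictionary only mirrors part of the fields shown in Listing 3.1.

```python
import re
from typing import Optional

# Optional "[pid N] " or "N " prefix, then the syscall name and its arguments.
LINE_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(?P<syscall>\w+)\((?P<args>.*)$"
)
# First double-quoted argument, allowing escaped characters inside it.
PATH_RE = re.compile(r'"((?:[^"\\]|\\.)*)"')

# Illustrative subset of file system related system calls.
FS_SYSCALLS = {
    "open", "openat", "creat", "access", "execve", "stat", "lstat",
    "unlink", "unlinkat", "rename", "mkdir",
}
ALWAYS_WRITE = {"creat", "unlink", "unlinkat", "rename", "mkdir"}
WRITE_FLAGS = ("O_WRONLY", "O_RDWR", "O_CREAT", "O_TRUNC", "O_APPEND")


def parse_strace_line(line: str) -> Optional[dict]:
    """Return syscall, path and a write guess for one log line, else None."""
    match = LINE_RE.match(line.strip())
    if not match or match.group("syscall") not in FS_SYSCALLS:
        return None
    args = match.group("args")
    path = PATH_RE.search(args)
    if not path:
        return None
    syscall = match.group("syscall")
    maybe_write = syscall in ALWAYS_WRITE or any(f in args for f in WRITE_FLAGS)
    return {"syscall": syscall, "path": path.group(1), "maybe_write": maybe_write}
```

Lines that are not system calls (such as the final `+++ exited with 0 +++`) or that do not carry a path are simply skipped.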

As strace just logs the exact system calls used, paths are often relative. To be able to filter them against a standardized list, we simply assume that each call is made from within the dependencies directory and use Python’s os.path.join() and os.path.normpath() to bring them into a canonical absolute form, like the output of Docker’s diff.
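
This normalization step can be illustrated with a short sketch. The standard library function that collapses `.` and `..` segments is `normpath()`; the example uses the `posixpath` module (which is what `os.path` refers to on Linux) so that it behaves identically on any platform. The function name and the base directory are hypothetical.

```python
import posixpath


def canonicalize(path: str, base_dir: str) -> str:
    """Resolve a possibly relative path against an assumed working directory.

    join() keeps absolute paths as they are and prefixes relative ones
    with base_dir; normpath() then collapses '.' and '..' segments.
    Symbolic links are not resolved.
    """
    return posixpath.normpath(posixpath.join(base_dir, path))
```

For instance, `canonicalize("./lib/../install.js", "/app/node_modules/pkg")` yields `/app/node_modules/pkg/install.js`, while an absolute path such as `/etc/hosts` is returned unchanged.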

Another container file system change diff is then captured and the set difference between the current and the previous state is calculated, forming the list of file paths which were edited. These lists (edited and read file paths) are then filtered for irrelevant files (e.g. temporary and log files and read and execute access to common binaries and files in the demo package’s directory). A separate list of especially suspicious file path accesses [9] (e.g. ~/.bashrc and ~/.ssh) is generated from this filtered view. From these lists, the worker also calculates a risk score, which is described in more detail in the subsequent Section 3.7.
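
The diffing and filtering described here boils down to a few set operations. The following sketch shows the idea under stated assumptions: the ignore prefixes and sensitive patterns are placeholders chosen for the example and are not the filter lists used in the analysis.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Set

# Hypothetical filter lists, for illustration only.
IGNORED_PREFIXES = ("/tmp/", "/var/log/")
SENSITIVE_PATTERNS = ("*/.ssh", "*/.ssh/*", "*/.bashrc")


def edited_paths(before: Iterable[str], after: Iterable[str]) -> Set[str]:
    """Paths present in the second diff but not in the first one."""
    return set(after) - set(before)


def drop_irrelevant(paths: Iterable[str]) -> Set[str]:
    """Remove temporary and log files from a set of paths."""
    return {p for p in paths if not p.startswith(IGNORED_PREFIXES)}


def sensitive(paths: Iterable[str]) -> Set[str]:
    """Select paths that match one of the sensitive patterns."""
    return {
        p for p in paths
        if any(fnmatchcase(p, pattern) for pattern in SENSITIVE_PATTERNS)
    }
```

The same `drop_irrelevant` and `sensitive` helpers can be applied to the list of read paths obtained from the strace logs.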

The results are bundled into a JSON file (see Listing 3.1) which is then uploaded to a Minio server, an S3 compatible object storage server, for later retrieval and evaluation.

Listing 3.1: Abbreviated example of the JSON result

{
  "package_name": "@someorg/somepkg",
  "package_version": "0.1.17",
  "risk_factor": 0.16154565168975454,
  "hooks": [{ "name": "install", "script": "node ./steal.js" }],
  "hook_count": 1,
  "changed_files": [
    "/etc/hosts"
  ],
  "changed_files_count": 1,
  "read_files": [
    "/home/person/.ssh/authorized_keys"
  ],
  "read_files_count": 1,
  "forbidden_files_touched": [],
  "forbidden_files_touched_count": 0,
  "parsed_strace": [
    {
        "syscall": "execve",
        "pre_flags": null,
        "path": "/usr/local/bin/node",
        "post_flags": null,
        "maybe_write": false
    }
  ],
  "raw_strace": "..."
}

3.6 Analysis Results Web App

As a proof of concept for what can be done with these results, we developed a simple web application (seen in Figure 3.2) where developers can check their JavaScript project’s dependencies against our security analysis’ results by uploading the project’s package-lock.json file, which contains all direct and transitive dependencies used.

This web app just accesses the JSON files from Minio and renders their results in HTML. Figure 3.3 shows the detail view of such an analysis.

3.7 Risk Factor Calculation

To easily quantify the security risk of the various findings, we also calculate a so-called “Risk Factor” based on the static and dynamic behavior analysis described earlier.

This factor is a floating-point number ranging from 0 to 1, with higher numbers indicating a larger risk for security issues. We define the risk factor for an npm package as

$$\text{limit}(x,d) := \max\left(0, \frac{-1}{e^{x \cdot d}} + 1\right)$$

$$c_h := \sum \text{executed install hooks}$$

$$c_e := \sum \text{edited files} - \text{ignored files}$$

$$c_r := \sum \text{read files} - \text{ignored files}$$

$$c_s := \sum \text{touched sensitive files, e.g. .bashrc}$$

The limit(x,d) function is based on the exponential function and maps an arbitrarily large input down into [0,1], while ensuring that

d serves as a “dampener” variable, which controls how fast a growing x makes limit(x,d) approach 1.
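
The behavior of limit(x,d) can be explored with a direct transcription into Python. The sketch only covers the limit function as defined above and assumes non-negative inputs, since the counts it is applied to cannot be negative; how the individual counts are weighted and combined is not shown here.

```python
import math


def limit(x: float, d: float) -> float:
    """Map x >= 0 into [0, 1); d controls how fast the value approaches 1.

    Equivalent to max(0, -1 / e^(x*d) + 1).
    """
    return max(0.0, 1.0 - math.exp(-x * d))
```

With the same x, a smaller dampener gives a smaller value: `limit(5, 0.1)` is roughly 0.39, whereas `limit(5, 1.0)` is already above 0.99.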

This risk factor is calculated for every analyzed package and allows us to identify the most risky packages for manual analysis without having to manually review every result for potential risky behavior.

4 Research Results

This chapter lists the raw results we gathered from analyzing the metadata from npm as well as the dynamic execution of npm hooks.

4.1 Statistical Metadata

Analyzing the npm package metadata from 2022-03-13 brought up the following results.

The database dump listed 1,903,676 packages, 248,636 of them (13.06%) with at least one published version which uses hooks. Figure 4.1 shows the distribution of install hook usage among packages which used them at least once in their version history.

In total, 3,463,895 package versions tracked in our database [10] use install hooks.

Of those, 113,291 package versions trigger other npm scripts (the script starts with npm run) and are therefore just redirects to existing tasks.
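
A count of this kind can be obtained with a simple SQL query. The following sketch assumes a table named `versions` with one column per hook script, which is a simplification of the actual database layout; the function name is hypothetical.

```python
import sqlite3

# Assumed column names, one per install hook.
HOOK_COLUMNS = ("script_preinstall", "script_install", "script_postinstall")


def count_npm_run_redirects(conn: sqlite3.Connection) -> int:
    """Count versions where at least one hook script starts with 'npm run'."""
    condition = " OR ".join(
        f"TRIM({column}) LIKE 'npm run%'" for column in HOOK_COLUMNS
    )
    query = f"SELECT COUNT(*) FROM versions WHERE {condition}"
    return conn.execute(query).fetchone()[0]
```

Hooks that are not set are stored as NULL in this sketch and never match the LIKE condition.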

By manual inspection of script texts, we discovered 15 malicious packages, which are discussed in more detail in Chapter 5.

4.2 Package Behavior

All these 248,636 package versions with install hook scripts were scheduled for analysis. 245,772 (98.84 %) of the scheduled jobs succeeded and uploaded results. The ones that failed did so because they exceeded the resource limits we set (RAM, time) or for other miscellaneous reasons.

During dynamic behavior analysis, we tracked the count of read files, edited files and (for both of those groups combined) accesses which are not to be expected during package installation, and the number of hooks executed during installation. As visible in Table 4.1, most install hooks did not access more than two files, be it reading or writing. We did not detect any access to files in our “forbidden” filter list.

Table 4.1: Dynamic analysis results for 245,772 npm packages
Statistic	Median	Mean	Maximum
Files edited	1.000	1.454	3352.000
Files read	1.000	1.454	3352.000
Forbidden files edited or read	0.000	0.000	0.000
Install hooks executed	1.000	1.006	7.000
Risk factor	0.522	0.449	0.727

5 Discussion

5.1 Analysis – Metadata

As we saw in the previous chapter, only a small number of packages actually use hooks. Inspecting the usage, we found some notable antipatterns:

Projects are using their hooks to install dependencies. For example, the package java_process has its preinstall hook set to execute npm install autodts. One possible reason for this method of dependency installation (as opposed to just declaring the dependency in the package.json) might be to always install the newest available version. Needless to say, this bears the risk of breaking code once a package installed like this changes its public API or behavior.

Others, like 1337-docs, use their scripts to set up git hooks (git config core.hooksPath .githooks/). While this is expected from packages like husky, which are explicitly used as developer dependencies to ensure that every developer working on a package can use the same git hooks, most packages are not supposed to interact with the git repository.

A relatively small number of package versions (195) use sudo to perform actions with elevated privileges. Among those samples, we found packages trying to install dependencies (ttb-ibeacon → sudo apt-get install --fix-missing -y bluetooth bluez libbluetooth-dev libudev-dev), globally installing npm packages (dyna-ts-react-module-boilerplate → sudo npm install -g typescript) or just invoking some install scripts (rabbit-mailer → sudo ./install). Many of those packages have since published newer versions without using sudo, bringing their code into accordance with the npm documentation’s [11] suggestion:

5.2 Malicious packages

Skimming through the list of all possible values for install hooks, we noticed another suspicious pattern: packages which had their preinstall hook set to node index.js. For a normal package, you would expect the hook to be set to something similar to scripts/preinstall.js or hooks/first-time-setup.sh, but not to execute the primary script of the package, index.js. index.js normally serves as the external interface for a package’s functionality, exporting its APIs for other projects to interact with via JavaScript.

use nearly the same install hooks, which upload the hostname, username and environment variables to a remote host (Listing 5.1, Listing 5.2).

Other packages also exfiltrate this information, but use plain JavaScript to do so:

Listing 5.3: Information leakage via slightly obfuscated JavaScript in the widget-framework@972.988.0 package

const http = require('https');

function main() {
  var data = process.env || {};
  if (Object.keys(data).length < 10) {
    return;
  }

  req = http.request({
    host: ['964a4e924030bf1dbadda43f51807238', 'm', 'pipedream', 'net'].join('.'),
    path: '/' + (process.env.npm_package_name || ''),
    method: 'POST'
  }).on('error', function (err) {
  });

  req.write(Buffer.from(JSON.stringify(process.env)).toString('base64'));
  req.end();
}

main();

Listing 5.4: Leakage of specific env vars in the ac-appleicons@9.886.0 package

var http = require("https");

var filter = [
  { key: "npm_config_registry", val: ["taobao", "org"].join(".") },
  {
    key: "npm_config_registry",
    val: ["registry", "npmmirror", "com"].join("."),
  },
  { key: "USERNAME", val: ["daas", "admin"].join("") },
  { key: "_", val: "/usr/bin/python" },
  {
    key: "npm_config_metrics_registry",
    val: ["mirrors", "tencent", "com"].join("."),
  },
  [
    { key: "MAIL", val: ["", "var", "mail", "app"].join("/") },
    { key: "HOME", val: ["", "home", "app"].join("/") },
    { key: "USER", val: "app" },
  ],
];

function main() {
  var data = process.env || {};
  if (
    filter.some((entry) =>
      []
        .concat(entry)
        .every((item) => data[item.key] && data[item.key].includes(item.val))
    ) ||
    Object.keys(data).length < 10
  ) {
    return;
  }

  var req = http
    .request({
      host: [
        "5b2da2f35c4c084aa770bb0ddd58b0c7", "m", ["pip", "edream"].join(""), "net",
      ].join("."),
      path: "/" + (data.npm_package_name || ""),
      method: "POST",
    })
    .on("error", function (err) {});

  req.write(Buffer.from(JSON.stringify(data)).toString("base64"));
  req.end();
}

main();

We reported all these packages to npm on 2022-07-01 and 2022-07-02. Most of them had very low download numbers, suggesting that their author(s) did not manage to spread them far. All of these packages had in common that their newest released version did not contain malicious code. We assume that this was done to evade detection by tools which only check the newest versions. As we used only the latest versions with hooks for our analysis, our pipeline did not examine these samples.

Some other packages, like isq-web at version 2.0.1, perform similar data extraction, but limit it to the package name, hostname, home directory and current directory. Their scripts start with a notice at the top of the hook (Listing 5.5). The apparent author confirmed that they used such packages for dependency confusion attacks [12] on companies with bug bounties. In essence, dependency confusion targets misconfigured developer computers, which first try to download packages from the public npm registry instead of the company’s internal one. If a public package with the same name and version as an internal one exists, it is then downloaded instead of the internal one. Because of this relative harmlessness and dedication to security research, we did not report these packages to npm.

5.3 Analysis – Package Behavior

As visible in Table 4.1, most packages which employ hooks do not access files outside the project they are added to as a dependency. Most packages which we checked manually and which did access files outside the project directory, while theoretically questionable, did so for seemingly benign reasons. Many packages use hooks to compile native bindings using node-gyp, the Node.js mechanism to link JavaScript code with native, non-interpreted code. Another common pattern was the installation of other npm dependencies by running npm install some_other_package@version. A few packages did so outside of the project directory, leading to high numbers of changed files tracked.

5.4 Possible Security Improvements

Based on our findings from observing the package behavior, we can make some suggestions on how to counter these threats, sorted from easiest to implement to more involved methods.

A simple observation: do not run npm install with root permissions. Almost no package should need to install system-wide dependencies, as node-gyp allows the compilation of binaries for npm packages, which can then just be stored and used locally. If external packages are required, it is much safer to require the developer to explicitly install them manually using sudo.

Not using hooks at all is another effective method. They are often not needed and can be skipped by using the --ignore-scripts command line option or (to avoid typing it every time) by setting it as a user-wide setting in the npmrc file. If needed, hooks can be re-enabled on a per-project basis:

Disabling hooks in general risks breaking existing workflows, making manual exemptions necessary for some projects.

Users determined to restrict access even further could limit the reach of npm and its sub-processes, such as hooks, to the project’s directory by using Docker containers or tools like firejail¹¹ or bubblewrap¹². Throw-away code environments like GitPod¹³ and GitHub Codespaces¹⁴ are picking up momentum for fully remote and isolated per-project coding environments.

Of course, such security measures could also be integrated into npm, yarn or pnpm themselves, by changing the default to ignore-scripts=true and by using the isolation mechanisms provided by the currently used operating system to shield the developer’s data from packages.

5.5 Known Deficits of our Method

Detecting malicious behavior in software is a continuous back-and-forth of implementing new detection techniques and finding novel evasion methods. And just like with unit tests for software, it is only possible to prove the presence of malware, but not its absence. With this in mind, we identified and accepted a few deficits in our methodology in return for a timeline appropriate for a bachelor thesis.

Malicious software may sense being in a testing environment instead of a developer’s computer and skip executing its payload. There is plenty of attack surface to achieve this, with an easy one being the /.dockerenv file present in every Docker container. Our analysis containers have a small software footprint (no common desktop software installed) and the analysis follows a very specific, observable pattern. Sensing being analyzed and deploying countermeasures to stay undetected [13] is common for malware nowadays and may have led to our analysis being wrong in some cases.

Another deficit is the lacking sophistication of the risk factor calculation. Due to resource constraints, it was often only possible to evaluate the usefulness and distribution of the risk factor after running analysis on larger sets of packages, as we could simply not store every artifact of the dynamic analysis (like full file system diffs or all straces) for larger amounts of packages. Iterating on the set of path filters (irrelevant and relevant files) takes quite some time when using significant amounts of packages to test against.

6 Related Work

Neither the research area of securing software supply chains, including the focus on the npm ecosystem, nor the approach taken in this work is unique. A few notable works either inspired this thesis or were developed in parallel.

The well-known study “The Eval That Men Do” [14] from 2011 first analyzed potential security errors arising from the use of the JavaScript eval() call, which interprets and executes arbitrary strings of JavaScript from within JavaScript. Its authors did not use npm as their source of information, but crawled webpages for content. They patched WebKit (from Apple’s Safari) to be able to observe every call to eval. This contrasts with our approach, which observes every possible action, be it JavaScript or not, through strace. In return, we lose the ability to pinpoint the exact code which led to a given action.
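
To illustrate what observing file accesses through strace can look like, the following sketch extracts path, access mode and success from openat lines of a trace. It is a simplified assumption of the log format and not the parser used in this thesis: the regular expression and the names FileAccess and parse_openat are hypothetical, and paths containing escaped quotes or other file-related system calls are not handled.

```python
import re
from typing import NamedTuple, Optional

# Matches lines such as:
#   1234 openat(AT_FDCWD, "/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = 3
OPENAT_RE = re.compile(
    r'openat\((?P<dirfd>AT_FDCWD|\d+), "(?P<path>[^"]*)", (?P<flags>[A-Z_|]+)'
    r'(?:, \d+)?\) = (?P<result>-?\d+)'
)


class FileAccess(NamedTuple):
    path: str
    mode: str        # "read" or "write"
    succeeded: bool


def parse_openat(line: str) -> Optional[FileAccess]:
    """Return the file access described by one strace line, if it is an openat."""
    match = OPENAT_RE.search(line)
    if match is None:
        return None
    flags = set(match["flags"].split("|"))
    # Any flag that allows writing makes this a write access.
    mode = "write" if flags & {"O_WRONLY", "O_RDWR"} else "read"
    # A negative return value means the call failed (e.g. ENOENT, EACCES).
    return FileAccess(match["path"], mode, int(match["result"]) >= 0)
```

The sketch also shows the trade-off described above: the trace reveals which file was opened and how, but nothing about which line of JavaScript triggered the call.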

Duan et al. [15] took a broad look at not only the npm ecosystem, but also Python’s pip and Ruby’s gem, analyzed their contents and found and reported 339 malicious packages. They contribute a pipeline for this analysis, as well as categorizations of security issues and mitigation strategies and an overview of how each ecosystem employs them.

Another large-scale analysis which is methodologically quite similar to this work is the OSSF’s package analysis framework [16]. While it has not published a paper, this working group developed software to continuously monitor package repositories for changes and, building on that, analysis software which examines each package from different angles. The focus is on some “critical projects”, according to [17]. Their approach to detecting script behavior is very similar to ours: they also use and parse strace, but additionally use, for example, Wireshark to analyze network behavior.

A more commercial take on the issue of npm dependency security as a whole, which takes only a superficial look at hook security, is Socket.dev [18]. This startup’s product checks JavaScript projects for multiple issues (including name confusion, license issues and known malicious packages) and also calculates some risk scores, but can only warn of hooks being used in general (Figure 6.1).

We made use of Socket.dev in the work for this thesis, as it sometimes still offers the source code of malicious packages that npm has already removed after learning of their maliciousness.

7 Conclusion

We set out to analyze the usage of install hook scripts in the npm package registry and to gather evidence of their malicious usage. In this chapter, we try to answer our research questions and take a critical look at the shortcomings of our work.

7.1 Research Questions answered

7.2 Limitations

One area lacking refinement is the filtering of irrelevant file system accesses and the detection of malicious ones. In this thesis, we iteratively expanded manually curated lists of paths which we deemed irrelevant from a security perspective. While this improved the signal-to-noise ratio significantly, it also increased the risk of missing accesses which could actually be relevant.

The flip side was our list of critical path accesses, also manually curated, which would greatly increase a package’s risk factor. We split many of these lists into read and write accesses, as reading the DNS configuration is necessary, but changing it is not to be expected from an npm package. During our analysis of all 248,636 packages, we found none which accessed any critical path according to our list. Although we expected to find something, none of the packages appeared malicious in this respect, unless they evaded our analysis. Given our reduced scope and sophistication, we assume that such evasion did happen to some unknown extent.
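
The interplay of these lists can be sketched as follows. All patterns below are hypothetical examples and not the lists used in this thesis, and the three category names are likewise assumptions for this illustration. Critical patterns are checked before irrelevant ones, so that a path such as the DNS configuration can be harmless to read yet critical to write.

```python
from fnmatch import fnmatchcase

# Hypothetical example patterns, split by access mode.
CRITICAL = {
    "read": ["/etc/shadow", "/home/*/.ssh/*"],
    "write": ["/etc/resolv.conf", "/etc/passwd", "/home/*/.bashrc"],
}
IRRELEVANT = {
    "read": ["/etc/resolv.conf", "/usr/lib/*", "/proc/self/*"],
    "write": ["/tmp/*"],
}


def classify_access(path: str, mode: str) -> str:
    """Classify one access as "critical", "irrelevant" or "relevant".

    `mode` is "read" or "write". Anything matched by neither list stays
    "relevant" and would need manual review.
    """
    if any(fnmatchcase(path, pattern) for pattern in CRITICAL[mode]):
        return "critical"
    if any(fnmatchcase(path, pattern) for pattern in IRRELEVANT[mode]):
        return "irrelevant"
    return "relevant"
```

Every pattern added to the irrelevant lists silences a class of accesses for all packages at once, which is exactly where the risk of missing relevant accesses comes from.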

7.3 Future Work

As with many time-limited software projects, this thesis leaves abundant room for improvement. This section therefore makes a few suggestions for iterating on our design.

If available storage space allows for it, the package’s code, all changed files and the strace logs could be indexed for searching, possibly by using the open-source code search tool Sourcegraph. This way, it would become possible to search for newly found malicious patterns in other packages without having to re-run the whole analysis pipeline. Such an index is all the more useful because malicious actors usually do not link their npm packages to public git repositories containing the code from the packaged tar, so GitHub cannot be used to search for them. Even if they did, npm does not enforce a match between a linked git repository and the actual code in the npm package.

Currently, the analysis pipeline only tracks a few features, all related to install hooks. There are more potential signals of maliciousness that could be used to find interesting packages. Examples that come to mind are mismatching code between the package tar and the linked git repository, missing descriptions, or a description identical to that of another, differently named project.
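
Two of these signals can be computed from package metadata alone. The following sketch flags packages without a description and packages that share their description with a differently named package. The function name, the input shape (a mapping from package name to description) and the normalization by trimming and lower-casing are assumptions for this example; it is not part of the pipeline built for this thesis.

```python
from collections import defaultdict
from typing import Optional


def metadata_signals(descriptions: dict[str, Optional[str]]) -> dict[str, list[str]]:
    """Flag packages with a missing description or one shared with other packages."""
    missing = []
    names_by_text = defaultdict(list)
    for name, description in descriptions.items():
        # Normalize so that trivial whitespace or case changes still match.
        text = (description or "").strip().lower()
        if not text:
            missing.append(name)
        else:
            names_by_text[text].append(name)
    duplicated = sorted(
        name
        for names in names_by_text.values()
        if len(names) > 1
        for name in names
    )
    return {
        "missing_description": sorted(missing),
        "duplicate_description": duplicated,
    }
```

Neither signal is proof of malicious intent on its own, since forks and scaffolded projects legitimately share descriptions. They would rather serve as additional inputs to a risk factor.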

8 References

[1]

“HTTP archive: Page weight.” [Online]. Available: https://httparchive.org/reports/page-weight?start=2017_07_01&end=2022_03_01&view=grid. [Accessed: 28-Apr-2022]

[2]

“npm is joining GitHub,” The GitHub Blog, 16-Mar-2020. [Online]. Available: https://github.blog/2020-03-16-npm-is-joining-github/. [Accessed: 29-Apr-2022]

[3]

OCI image format specification. Open Container Initiative, 2022 [Online]. Available: https://github.com/opencontainers/image-spec/blob/02efb9a75ee11e05937b535cc5f228f9343ab2f5/image-layout.md. [Accessed: 30-Mar-2022]

[4]

Docker Dockerfile Specification. Docker, 2022 [Online]. Available: https://github.com/docker/cli/blob/68cad50f61cd5cdda9d7151c730f8dc103304b98/docs/reference/builder.md. [Accessed: 30-Mar-2022]

[5]

A. Randal, “The ideal versus the real,” ACM Computing Surveys (CSUR), Feb. 2020, doi: 10.1145/3365199. [Online]. Available: https://dl.acm.org/doi/abs/10.1145/3365199. [Accessed: 20-Jun-2022]

[6]

E. G. Young, P. Zhu, T. Caraza-Harter, A. C. Arpaci-Dusseau, and R. H. Arpaci-Dusseau, “The true cost of containing: A gVisor case study,” in 11th USENIX workshop on hot topics in cloud computing (HotCloud 19), 2019 [Online]. Available: https://www.usenix.org/conference/hotcloud19/presentation/young

[7]

“First steps with celery — celery 5.2.6 documentation.” [Online]. Available: https://docs.celeryq.dev/en/stable/getting-started/first-steps-with-celery.html#application. [Accessed: 04-May-2022]

[8]

J. Zohren, Source code for bachelor thesis "Compromising file systems with npm dependencies". Zenodo, 2022 [Online]. Available: https://doi.org/10.5281/zenodo.6832500

[9]

“The most important Linux files to protect (and how),” BeyondTrust. [Online]. Available: https://www.beyondtrust.com/blog/entry/important-linux-files-protect. [Accessed: 15-May-2022]

[10]

J. Zohren, “SQLite3 DB of npm package versions using install hooks.” Zenodo, Jul-2022 [Online]. Available: https://doi.org/10.5281/zenodo.6813093

[11]

“Scripts,” npm Docs. [Online]. Available: https://docs.npmjs.com/cli/v6/using-npm/scripts. [Accessed: 26-Jun-2022]

[12]

A. Birsan, “Dependency confusion: How I hacked into Apple, Microsoft and dozens of other companies,” Medium, 09-Feb-2021. [Online]. Available: https://medium.com/@alex.birsan/dependency-confusion-4a5d60fec610. [Accessed: 01-Jul-2022]

[13]

A. Afianian, S. Niksefat, B. Sadeghiyan, and D. Baptiste, “Malware dynamic analysis evasion techniques: A survey,” ACM Comput. Surv., vol. 52, no. 6, Nov. 2019, doi: 10.1145/3365001. [Online]. Available: https://doi.org/10.1145/3365001

[14]

G. Richards, C. Hammer, B. Burg, and J. Vitek, “The eval that men do,” in ECOOP 2011 – object-oriented programming, 2011, pp. 52–78, doi: 10.1007/978-3-642-22655-7_4.

[15]

R. Duan, O. Alrawi, R. P. Kasturi, R. Elder, B. Saltaformaggio, and W. Lee, “Towards measuring supply chain attacks on package managers for interpreted languages,” in Proceedings 2021 network and distributed system security symposium, 2021, doi: 10.14722/ndss.2021.23055 [Online]. Available: https://www.ndss-symposium.org/wp-content/uploads/ndss2021_1B-1_23055_paper.pdf. [Accessed: 07-May-2022]

[16]

“Package-analysis/case_studies.md at main · ossf/package-analysis. GitHub.” [Online]. Available: https://github.com/ossf/package-analysis. [Accessed: 07-May-2022]

[17]

WG securing critical projects. Open Source Security Foundation (OpenSSF), 2022 [Online]. Available: https://github.com/ossf/wg-securing-critical-projects. [Accessed: 27-Jun-2022]

[18]

“Introducing Socket,” Socket. [Online]. Available: https://socket.dev/blog/introducing-socket. [Accessed: 26-Jun-2022]