In a recent technical interview, I faced a series of challenges to be completed within a strict 24-hour timeframe. One such task was to containerize a Node.js application, and I’d like to share my approach and reasoning with you.
Note: I will use the terms “containerize” and “dockerize” interchangeably, as well as “Dockerfile” and “Containerfile”.
Hello World
The code was initially very simple and you can check it here. I modified it slightly by adding the Express module to make the process more interesting. The code imports the Express module and creates an Express application. The app starts a server that listens on port 8080 for connections and responds with “Hello World!” to requests to the root URL (/).
const express = require('express');
const app = express();
const port = 8080;

app.get('/', (_, res) => {
  res.send('Hello World!');
});

app.listen(port, () => {
  console.log(`Listening on port ${port}`);
});
Here are the steps to run the application locally:
- Create an `app` directory and switch to it.
- Run `npm init`.
- Install Express as a dependency: `npm install express`.
- In the `app` directory, create a file named `index.js` and copy in the code.
- Run the app: `node index.js`.
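Assuming Node.js and npm are already installed, the steps above boil down to the following commands (the `-y` flag simply accepts the `npm init` defaults):

```shell
mkdir app && cd app
npm init -y
npm install express
# create index.js with the code above, then:
node index.js
```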
Directory Structure
The directory structure looks like this:
app
├── index.js
├── node_modules
├── package.json
└── package-lock.json
Initial Dockerfile
I started with a basic Dockerfile (blatantly copied from the Node.js website). The Containerfile:
- uses the Node image, version 20,
- copies all the files from the current directory into the image directory `/usr/src/app`,
- runs `npm ci`, which performs a clean install of the dependencies,
- exposes port 8080 and runs the app.

The `npm ci` command is similar to `npm install`, except it’s meant to be used in automated environments such as test platforms, continuous integration, and deployment.
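Note that `npm ci` requires a `package-lock.json` to be present and fails if it is out of sync with `package.json`, which is exactly what you want in CI. For this app, the `package.json` produced by the steps above looks roughly like this (version numbers are illustrative):

```json
{
  "name": "app",
  "version": "1.0.0",
  "main": "index.js",
  "dependencies": {
    "express": "^4.18.2"
  }
}
```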
FROM node:20
# Create app directory
WORKDIR /usr/src/app
# Install app dependencies and bundle app source
COPY . .
RUN npm ci
EXPOSE 8080
CMD [ "node", "index.js" ]
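To try it out, build and run the image against a local Docker daemon (the image name `hello-node` is arbitrary):

```shell
docker build -t hello-node .
docker run --rm -p 8080:8080 hello-node
# in another terminal:
curl http://localhost:8080   # Hello World!
```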
.dockerignore file
A .dockerignore is a configuration file that describes files and directories that you want to exclude when building a Docker image.
.git
.gitignore
README.md
node_modules
npm-debug.log
This prevents your local modules and debug logs from being copied into your Docker image, where they could overwrite modules installed within the image.
Separating modules and app code
There are two main reasons to separate modules, packages, and source code: caching efficiency and dependency management.
- Caching efficiency: When building a Docker image, each step in the Dockerfile creates a new layer. These layers are cached by Docker, and if a layer remains unchanged between builds, Docker can reuse it from the cache. By separating the installation of app dependencies (usually managed through package.json) from the actual application code, you can take advantage of Docker’s layer caching. This way, only changes in the application code will result in rebuilding the layers that follow, making the build process more efficient.
- Dependency management: Node.js applications often have a lot of dependencies specified in the package.json file. Separating the installation of these dependencies into its own step (npm ci) ensures that dependencies are installed consistently across different builds of the container. This can prevent discrepancies between different versions of the application if the dependencies were installed during the application code copying phase.
FROM node:20
# Create app directory
WORKDIR /usr/src/app
# Install app dependencies
+ COPY package*.json .
RUN npm ci
+ COPY index.js .
EXPOSE 8080
CMD [ "node", "index.js" ]
Run as a non-root user
By default, Docker runs commands inside the container as root, which violates the Principle of Least Privilege (PoLP) when superuser permissions are not strictly required.
The `node` image provides a user named `node`, which can be selected at run time with the flag `-u node`. Alternatively, the `node` user can be specified in the Dockerfile with the `USER` instruction.
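For example, the flag can be used to run any command in the stock image as the unprivileged user, without changing the Dockerfile at all:

```shell
docker run --rm -u node node:20 whoami
# prints: node
```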
FROM node:20
# Create app directory
WORKDIR /usr/src/app
# Install app dependencies
COPY package*.json .
RUN npm ci
+ USER node
+ COPY --chown=node:node index.js .
EXPOSE 8080
CMD [ "node", "index.js" ]
Use smaller images
The default Node.js Docker image is based on a Debian Linux distribution, for example `node:bookworm`.
There are two other variants: `slim` and `alpine`.
The slim images, such as `node:bookworm-slim`, provide a functional Node.js environment and nothing more. This decreases the image size dramatically, from close to a gigabyte down to a few hundred MB. Furthermore, it reduces the software footprint and hence the number of vulnerabilities.
The `alpine` variant is based on the Alpine Linux distribution and is smaller still than the `slim` variant. It also has a very low vulnerability count compared to the slim variant. However, the alpine version is an unofficial image; it is experimental and may not be consistent.
Deterministic tags should be favoured: instead of `node:bookworm-slim` or `node:alpine`, specify the Node.js runtime version, such as `node:20.5.0-bookworm-slim` or `node:20.4.0-alpine3.17`.
+ FROM node:20.4.0-alpine3.17
# Create app directory
WORKDIR /usr/src/app
# Install app dependencies
COPY package*.json .
RUN npm ci
USER node
COPY --chown=node:node index.js .
EXPOSE 8080
CMD [ "node", "index.js" ]
A comparison of image sizes
REPOSITORY TAG IMAGE ID CREATED SIZE
node-20-alpine latest ae76c5b808a2 12 seconds ago 185MB
node-20.5.0-bookworm-slim latest 3e7be562ea13 12 days ago 253MB
node-20 latest 3abe3becfb39 14 minutes ago 1.1GB
Tini - short for Tiny Init
Tini (short for “Tiny Init”) is a small, lightweight init system designed specifically for managing processes within Docker containers or other lightweight environments. It addresses a common problem known as the “PID 1 problem” in containerized environments. In Unix-like operating systems, the init process with PID 1 has special responsibilities, including reaping orphaned child processes, handling signals, and managing the overall lifecycle of the system. However, when running containers, using a traditional init system as PID 1 can lead to various issues due to the isolation and process management characteristics of containers.
Here’s how Tini helps in a Docker container:
- Signal propagation and handling: Containers rely on signals (such as SIGTERM) to gracefully shut down and handle other lifecycle events. Traditional init systems can sometimes interfere with proper signal propagation. Tini acts as a signal proxy, ensuring that signals sent to the container are appropriately delivered to the processes within, preventing unexpected behavior during shutdown or other events.
- Process reaping: When a process within a container exits, it becomes a “zombie” process until its exit status is collected by the parent process (usually the init process). If the init process doesn’t reap these zombie processes, they can accumulate and negatively impact system resources. Tini performs proper process reaping, preventing the accumulation of zombie processes.
- Graceful shutdown: During container shutdown, Tini ensures that all processes within the container are given a chance to clean up properly. It sends the appropriate signals to processes, allowing them to gracefully terminate and release resources.
- Resource efficiency: Tini is designed to be minimal and lightweight, consuming very little memory and overhead. This is crucial in containerized environments where resource utilization is important.
- Compatibility: Tini is highly compatible with various container runtimes and orchestration tools. It’s often used as a drop-in replacement for the default init process in container images.
Tini was specifically designed to mitigate the challenges that arise from using traditional init systems within containers.
FROM node:20.4.0-alpine3.17
+ RUN apk add --no-cache tini
# Create app directory
WORKDIR /usr/src/app
# Install app dependencies
COPY package*.json .
RUN npm ci
USER node
COPY --chown=node:node index.js .
EXPOSE 8080
+ ENTRYPOINT ["/sbin/tini", "--"]
CMD [ "node", "index.js" ]
The `ENTRYPOINT` specifies a command that will always be executed when the container starts. The `CMD` specifies arguments that will be fed to the `ENTRYPOINT`.
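The interplay is easiest to see by looking at the command each invocation produces (image and file names are illustrative):

```dockerfile
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "index.js"]
# docker run my-image              runs: /sbin/tini -- node index.js
# docker run my-image node app.js  runs: /sbin/tini -- node app.js  (CMD overridden)
```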
Multi-stage build
+ FROM node:20.4.0-alpine3.17 AS base
RUN apk add --no-cache tini
# Create app directory
WORKDIR /usr/src/app
# Install app dependencies
COPY package*.json .
RUN npm ci
USER node
COPY --chown=node:node index.js .
+ FROM base AS app
EXPOSE 8080
ENTRYPOINT ["/sbin/tini", "--"]
CMD [ "node", "index.js" ]
With multi-stage builds, you use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don’t want in the final image.
Here’s a better example to illustrate multi-stage build:
# syntax=docker/dockerfile:1
FROM golang:1.21 as build
WORKDIR /src
COPY <<EOF /src/main.go
package main
import "fmt"
func main() {
fmt.Println("hello, world")
}
EOF
RUN go build -o /bin/hello ./main.go
FROM scratch
COPY --from=build /bin/hello /bin/hello
CMD ["/bin/hello"]
The second `FROM` instruction starts a new build stage with the `scratch` image as its base. The `COPY --from=build` line copies just the built artifact from the previous stage into this new stage.
The end result is a tiny production image with nothing but the binary inside. None of the build tools required to build the application are included in the resulting image.
BuildKit
Lastly, build your image with BuildKit. BuildKit is a builder backend used by many projects; buildx is a Docker CLI plugin for extended build capabilities with BuildKit.
Check the documentation on how to use the BuildKit builder.
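On recent Docker Engine versions BuildKit is the default builder; on older versions it can be enabled explicitly, or invoked through the buildx plugin (image name illustrative):

```shell
DOCKER_BUILDKIT=1 docker build -t hello-node .
# or, via buildx:
docker buildx build -t hello-node --load .
```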
Resources
- https://github.com/nodejs/docker-node/blob/main/docs/BestPractices.md
- https://docs.docker.com/build/building/multi-stage/