I have a monorepo managed by Yarn and I'd like to take advantage of Docker's layer cache to speed up my builds. To do so, I'd like to first copy the package.json and yarn.lock files, run yarn install, and then copy the rest of the files.
This is my repo structure:
packages/one/package.json
packages/one/index.js
packages/two/package.json
packages/two/index.js
package.json
yarn.lock
And this is the relevant part of the Dockerfile:
COPY package.json .
COPY yarn.lock .
COPY packages/**/package.json ./
RUN yarn install --pure-lockfile
COPY . .
The problem is that the third COPY command doesn't copy anything. How can I achieve the expected result?
cp packages/*/package.json ./ wouldn't yield something sensible. So I believe you should hard-code in your Dockerfile the paths of folders one and two...
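To see why flattening the manifests into one directory is a problem, the following sketch reproduces the situation outside Docker. It assumes a throwaway directory created with mktemp and a scratch folder named flat, neither of which comes from the original thread; only the packages/one and packages/two layout is taken from the question.

```shell
# Throwaway tree mirroring the layout from the question.
demo=$(mktemp -d)
mkdir -p "$demo/packages/one" "$demo/packages/two" "$demo/flat"
echo '{"name":"one"}' > "$demo/packages/one/package.json"
echo '{"name":"two"}' > "$demo/packages/two/package.json"

# Both sources share the basename "package.json", so they compete for a
# single destination path. Depending on the cp implementation, the second
# copy is refused or overwrites the first; either way one manifest is lost.
(cd "$demo" && cp packages/*/package.json flat/ 2>/dev/null) || true

flat_count=$(find "$demo/flat" -type f | wc -l | tr -d ' ')
echo "manifests in source tree: 2, manifests after flattening: $flat_count"
```

Whichever file survives, the per-package directory structure that yarn needs to resolve the workspaces is gone, which is why the answers in this thread all preserve the hierarchy in some way.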
There is a solution based on the multi-stage build feature:
FROM node:12.18.2-alpine3.11
WORKDIR /app
# Step 1: Copy the root package.json and yarn.lock
COPY ["package.json", "yarn.lock", "./"]
# Step 2: Copy whole app
COPY packages packages
# Step 3: Find and remove non-package.json files
RUN find packages -mindepth 2 -maxdepth 2 \! -name "package.json" -print | xargs rm -rf
# Step 4: Define second build stage
FROM node:12.18.2-alpine3.11
WORKDIR /app
# Step 5: Copy files from the first build stage.
COPY --from=0 /app .
RUN yarn install --frozen-lockfile
COPY . .
# To restore workspaces symlinks
RUN yarn install --frozen-lockfile
CMD yarn start
At Step 5, the layer cache is reused even if files in the packages directory have changed, as long as none of the package.json files did.
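The effect of the filter in Step 3 can be checked outside Docker. The sketch below is an illustration only: it assumes a scratch directory from mktemp standing in for the first stage's /app, and the src/util.js file is a made-up extra to show that nested source folders are removed too.

```shell
# Throwaway tree standing in for the first build stage's /app directory.
stage=$(mktemp -d)
mkdir -p "$stage/packages/one/src" "$stage/packages/two"
echo '{"name":"one"}' > "$stage/packages/one/package.json"
echo '{"name":"two"}' > "$stage/packages/two/package.json"
echo 'console.log(1)' > "$stage/packages/one/index.js"
echo 'console.log(2)' > "$stage/packages/one/src/util.js"
echo 'console.log(3)' > "$stage/packages/two/index.js"

# Same filter as Step 3: delete every entry directly inside a package
# folder (depth 2) unless it is named package.json.
(cd "$stage" && find packages -mindepth 2 -maxdepth 2 \! -name "package.json" -print | xargs rm -rf)

# List what the second stage would receive from COPY --from=0.
survivors=$(cd "$stage" && find packages -type f | sort)
echo "$survivors"
```

Only the two package.json files remain, so editing index.js or anything under src leaves the content handed to the second stage unchanged.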
As mentioned in the official Dockerfile reference for COPY <src> <dest>:
The COPY instruction copies new files or directories from <src> and adds them to the filesystem of the container at the path <dest>.
For your case:
Each <src> may contain wildcards and matching will be done using Go’s filepath.Match rules.
These are the rules. They contain this:
'*' matches any sequence of non-Separator characters
So try to use * instead of ** in your pattern.
If you can't technically enumerate all the subdirectories at stake in the Dockerfile (namely, writing COPY packages/one/package.json packages/one/
for each one), but want to copy all the files in two steps and take advantage of Docker's caching feature, you can try the following workaround:
Devise a wrapper script (say, in bash) that copies the required package.json files to a separate directory (say, .deps/) built with a similar hierarchy, then call docker build …
Adapt the Dockerfile to copy (and rename) the separate directory beforehand, and then call yarn install --pure-lockfile…
All things put together, this could lead to the following files:
./build.bash:
#!/bin/bash
tag="image-name:latest"
rm -f -r .deps # optional, to be sure that there is
# no extraneous "package.json" from a previous build
find . -type d \( -path \*/.deps \) -prune -o \
  -type f \( -name "package.json" \) \
  -exec bash -c 'dest=".deps/$1" && \
    mkdir -p -- "$(dirname "$dest")" && \
    cp -av -- "$1" "$dest"' bash '{}' \;
# instead of mkdir + cp, you may also want to use
# rsync if it is available in your environment...
sudo docker build -t "$tag" .
and
./Dockerfile:
FROM …
WORKDIR /usr/src/app
# COPY package.json . # subsumed by the following command
COPY .deps .
# and not "COPY .deps .deps", to avoid doing an extra "mv"
COPY yarn.lock .
RUN yarn install --pure-lockfile
COPY . .
# Notice that "COPY . ." will also copy the ".deps" folder; this is
# maybe a minor issue, but it could be avoided by passing more explicit
# paths than just "." (or by adapting the Dockerfile and the script and
# putting them in the parent folder of the Yarn application itself...)
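One practical caveat with the wrapper script: if a node_modules folder already exists on the host, the find command would also stage every dependency's package.json. A possible variant that prunes node_modules is sketched below. The function name stage_manifests and the left-pad entry in the test tree are hypothetical and only serve the illustration; the .deps folder name is the one used above.

```shell
# Hypothetical helper: stage_manifests is not part of the original script.
# $1: project root, $2: staging directory name (relative to the root)
stage_manifests() {
    (
        cd "$1" || exit 1
        rm -rf -- "$2"
        mkdir -p -- "$2"
        # Skip node_modules and the staging directory itself, then copy
        # every remaining package.json to the same relative path.
        find . \( -name node_modules -o -name "$2" \) -prune -o \
            -type f -name package.json -print |
        while IFS= read -r manifest; do
            dest="$2/${manifest#./}"
            mkdir -p -- "$(dirname -- "$dest")"
            cp -- "$manifest" "$dest"
        done
    )
}

# Try it on a throwaway tree.
root=$(mktemp -d)
mkdir -p "$root/packages/one" "$root/node_modules/left-pad"
echo '{}' > "$root/package.json"
echo '{}' > "$root/packages/one/package.json"
echo '{}' > "$root/node_modules/left-pad/package.json"
stage_manifests "$root" .deps
(cd "$root/.deps" && find . -type f | sort)
```

The rest of the workflow stays the same: run the staging step first, then docker build, and let the Dockerfile copy .deps before yarn install.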
Using Docker's new BuildKit executor, it has become possible to bind-mount the Docker build context, from which you can then copy any files as needed.
For example, the following snippet copies all package.json files from the Docker context into the image's /app/ directory (the workdir in the example below).
Unfortunately, changing any file in the mount still results in a layer cache miss. This can be worked around using the multi-stage approach presented by @mbelsky, but this time the explicit deletion is no longer needed.
# syntax = docker/dockerfile:1.2
FROM ... AS packages
WORKDIR /app/
RUN --mount=type=bind,target=/docker-context \
cd /docker-context/; \
find . -name "package.json" -mindepth 0 -maxdepth 4 -exec cp --parents "{}" /app/ \;
FROM ...
WORKDIR /app/
COPY --from=packages /app/ .
The mindepth/maxdepth arguments are specified to reduce the number of directories to search; they can be adjusted or removed as needed for your use case.
It may be necessary to enable the BuildKit executor using the environment variable DOCKER_BUILDKIT=1, as the traditional executor silently ignores the bind mounts.
More information about BuildKit and bind mounts can be found in the BuildKit documentation.
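The find-and-copy step from the snippet can be tried outside a build to see what ends up in the target directory. This is a sketch under assumptions: two mktemp directories stand in for the /docker-context mount and /app, the depth options are moved in front of -name, and cp --parents requires GNU cp (or a BusyBox build that supports it).

```shell
# Throwaway directories standing in for the bind-mounted context and /app.
context=$(mktemp -d)
app=$(mktemp -d)
mkdir -p "$context/packages/one" "$context/packages/two"
echo '{}' > "$context/package.json"
echo '{}' > "$context/packages/one/package.json"
echo '{}' > "$context/packages/two/package.json"
echo 'console.log(1)' > "$context/packages/one/index.js"

# Same find-and-copy step as in the Dockerfile, pointed at the scratch
# directories. --parents recreates each file's relative path under the target.
(cd "$context" && find . -mindepth 0 -maxdepth 4 -name "package.json" \
    -exec cp --parents "{}" "$app/" \;)

# Show the resulting tree: manifests only, hierarchy preserved.
(cd "$app" && find . -type f | sort)
```

Because only the manifests are copied, the stage's output directory contains nothing that needs to be deleted afterwards.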
package.json) will cause the copy step to run again, so in that sense it has no advantage over just copying the whole code and running npm install
Following @Joost's suggestion, I've created a Dockerfile that uses the power of BuildKit to achieve the following:
Faster npm install by moving npm's cache directory to the build cache
Skipping npm install if nothing changed in package.json files since last successful build
Pseudo Code:
Get all package.json files from the build context
Compare them to the package.json files from the last successful build
If changes were found, run npm install and cache the package.json files + node_modules folder
Copy the node_modules (fresh or cached) to the desired location in the image
# syntax = docker/dockerfile:1.2
FROM node:14-alpine AS builder
# https://github.com/opencollective/opencollective/issues/1443
RUN apk add --no-cache ncurses
# must run as root
RUN npm config set unsafe-perm true
WORKDIR /app
# get a temporary copy of the package.json files from the build context
RUN --mount=id=website-packages,type=bind,target=/tmp/builder \
cd /tmp/builder/ && \
mkdir /tmp/packages && \
chown 1000:1000 /tmp/packages && \
find ./ -name "package.json" -mindepth 0 -maxdepth 6 -exec cp --parents "{}" /tmp/packages/ \;
# check if package.json files were changed since the last successful build
RUN --mount=id=website-build-cache,type=cache,target=/tmp/builder,uid=1000 \
mkdir -p /tmp/builder/packages && \
cd /tmp/builder/packages && \
(diff -qr ./ /tmp/packages/ || (touch /tmp/builder/.rebuild && echo "Found an updated package.json"));
USER node
COPY --chown=node:node . /app
# run `npm install` if package.json files were changed, or use the cached node_modules/
RUN --mount=id=website-build-cache,type=cache,target=/tmp/builder,uid=1000 \
echo "Creating NPM cache folders" && \
mkdir -p /tmp/builder/.npm && \
mkdir -p /tmp/builder/modules && \
echo "Copying latest package.json files to NPM cache folders" && \
/bin/cp -rf /tmp/packages/* /tmp/builder/modules && \
cd /tmp/builder/modules && \
echo "Using NPM cache folders" && \
npm config set cache /tmp/builder/.npm && \
if test -f /tmp/builder/.rebuild; then (echo "Installing NPM packages" && npm install --no-fund --no-audit --no-optional --loglevel verbose); fi && \
echo "copy cached NPM packages" && \
/bin/cp -rfT /tmp/builder/modules/node_modules /app/node_modules && \
rm -rf /tmp/builder/packages && \
mkdir -p /tmp/builder/packages && \
cd /app && \
echo "Caching package.json files" && \
find ./ -name "package.json" -mindepth 0 -maxdepth 6 -exec cp --parents "{}" /tmp/builder/packages/ \; && \
(rm /tmp/builder/.rebuild 2> /dev/null || true);
Note: I'm only using the node_modules of the root folder since, in my case, all the packages from the inner folders are hoisted to the root.
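The change-detection step in the Dockerfile relies on diff -qr exiting with a non-zero status when two directory trees differ. A minimal sketch of that idiom in isolation follows; the function name manifests_changed is hypothetical, and the two mktemp directories merely stand in for /tmp/builder/packages and /tmp/packages.

```shell
# Hypothetical helper: manifests_changed is not part of the original Dockerfile.
# Succeeds (exit 0) when the two trees differ, mirroring the
# `diff -qr ... || touch .rebuild` idiom.
manifests_changed() {
    ! diff -qr "$1" "$2" > /dev/null 2>&1
}

previous=$(mktemp -d)   # stands in for /tmp/builder/packages
current=$(mktemp -d)    # stands in for /tmp/packages
mkdir -p "$previous/packages/one" "$current/packages/one"
echo '{"version":"1.0.0"}' > "$previous/packages/one/package.json"
echo '{"version":"1.0.0"}' > "$current/packages/one/package.json"

# Identical trees: no install needed.
if manifests_changed "$previous" "$current"; then first="install"; else first="skip"; fi

# Bump a version in the current tree and compare again.
echo '{"version":"1.1.0"}' > "$current/packages/one/package.json"
if manifests_changed "$previous" "$current"; then second="install"; else second="skip"; fi

echo "unchanged manifests: $first, changed manifest: $second"
```

Note that on the very first build the cached directory is empty, so the comparison reports a difference and the install runs, which matches the behaviour of the Dockerfile.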
Just use .dockerignore to filter out the files you don't need.
In your case, add this to your .dockerignore:
*.js
along with any other file you want to skip when copying.
I assume your files are located at paths like /home/package.json and you want to copy them to /dest in the image.
The Dockerfile would look like this:
COPY /home /dest
This copies all files under /home to /dest, except those listed in the .dockerignore file.
FROM ubuntu
WORKDIR /app
COPY */*.csproj /app/
When I ran it, here is the correct output:
$ docker run --rm -ti temp ls /app
foo.csproj bar.csproj