In this post I will discuss a recent structural change that I made on a project - switching to hosting all of our front-end code in a single repository - a monorepo.
I’ve been working on a suite of web applications, each with a consistent visual language, shared components and utility functions. This was originally developed as a single application with a run-time check determining which content the application would present. Out of necessity we quickly split the application into three standalone artifacts, which meant cloning the repository and deleting the unnecessary code. Over time we found ourselves struggling to share code between the repositories without creating more repositories and increasing the amount of administration and co-ordination required to publish a single change. It led to many more copy-and-paste changes, with the long-term risk of our “shared” libraries diverging over time and becoming tough to re-integrate. Something had to give, so I researched ways to improve our process.
Monorepos as a Productivity Tool
The NPM ecosystem favours small dependencies as a means of promoting code sharing and reusability. Investigate the node_modules folder of any Node project and you’ll see the abundance of small libraries, composed together to make larger pieces of work. But when it comes to writing applications at the user-level, we sometimes forget this compositional nature, instead hosting all of that code in one large package.
When your libraries live in separate repositories, there’s a further administrative overhead to managing those dependencies. There are multiple pull requests necessary and it becomes more involved for developers to test the integration works correctly. Especially where libraries are shared between multiple teams, implementing changes costs a notable amount of time. In larger corporate environments I’ve seen cases where code that really should live in a library ends up living in a team’s application repository instead, as it’s just quicker that way.
The reason this situation comes about is the one-to-one relationship between a repository and the publishable NPM artifact it produces. This doesn’t actually have to be the case; it’s just convention. Babel, Jest, Angular and React (amongst others) have all taken the approach of storing all their code in a single repository, breaking the convention of a single library being housed in a single repository. In this situation, a lot of the organizational headaches discussed above simply disappear, which improves developer productivity.
Lerna is a tool used to create monorepos for npm packages. In addition to describing how multiple packages should be stored, it provides a mechanism to publish artifacts to NPM and a means of ‘linking’ packages together where local dependencies exist. Furthermore, it provides an efficient means of sharing third-party dependencies.
Setting up a new Lerna repository is as simple as installing lerna globally and then running the initialisation script:
lerna init
This will create a lerna.json file at the root of your repository.
What Does It Look Like?
In addition to a root-level package.json, each package will also have its own package.json descriptor. Packages are therefore free to define dependencies and scripts just like they would in a standalone repository.
The lerna.json file describes where packages live within the repository. By default they’re expected to live in a directory called packages.
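As a rough illustration of that layout, the sketch below writes a lerna.json-style configuration as a JavaScript object and checks which directories its packages glob would cover. The version value, the directory names and the isPackageDir helper are assumptions made for this example; Lerna performs its own glob matching internally.

```javascript
// Illustrative shape of a lerna.json file, expressed as a JS object.
// The version number is a made-up starting point.
const lernaConfig = {
  packages: ["packages/*"],
  version: "0.1.0",
};

// Hypothetical helper: does a directory match a simple "<base>/*" glob?
// It only handles a single-segment base followed by one wildcard level.
function isPackageDir(dir, globs) {
  return globs.some((glob) => {
    const base = glob.replace(/\/\*$/, "");
    const parts = dir.split("/");
    return parts.length === 2 && parts[0] === base && parts[1] !== "";
  });
}

console.log(isPackageDir("packages/package-a", lernaConfig.packages)); // true
console.log(isPackageDir("tools/scripts", lernaConfig.packages)); // false
```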
For web applications, consider the difference between applications and libraries. At a high-level, applications are the entities that are built using a tool such as Webpack, Browserify or Parcel and deployed to a production environment. Libraries are smaller utilities that are composed into applications. As such, a bundler that runs from an application-level package will include the code in a library-level package. Therefore libraries don’t need to be built themselves using a bundler. I typically export my library-level packages as ES6 modules and let the bundler tool take care of it.
If you don’t wish to have your packages published to npm as standalone artifacts, set private: true in that package’s package.json.
One of the features that I really enjoy about npm is the ability to define scripts in the package.json file and run them using npm run <script-name>. This manages the complexity of common scripts that are regularly run and simplifies CI on build servers. Lerna understands this common use case and supports it. By running lerna run <script-name>, Lerna will run the script <script-name> in each package that contains a script of that name.
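To make the selection rule concrete, here is a small sketch of which packages would take part in a given lerna run call. It only models the "has a script of that name" check described above; the manifests and the packagesWithScript helper are hypothetical, and nothing here calls Lerna itself.

```javascript
// Hypothetical package manifests, keyed by package name.
const manifests = {
  "package-a": { scripts: { build: "webpack", test: "jest" } },
  "package-b": { scripts: { test: "jest" } },
  "package-c": {}, // no scripts section at all
};

// Return the names of packages that define the requested script.
function packagesWithScript(allManifests, scriptName) {
  return Object.keys(allManifests).filter((name) => {
    const scripts = allManifests[name].scripts || {};
    return Object.prototype.hasOwnProperty.call(scripts, scriptName);
  });
}

console.log(packagesWithScript(manifests, "test")); // [ 'package-a', 'package-b' ]
console.log(packagesWithScript(manifests, "build")); // [ 'package-a' ]
```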
At any point you can always cd into a package directory and run the npm scripts defined there. But sometimes it’s more convenient to have these scripts aliased in the root directory of your monorepo. That way there’s no need to jump in and out of packages. When running a script via the npm run command, it’s possible to supply a --prefix argument, which instructs npm to run the command from a given directory.
In my situation, I have a monorepo which hosts a series of applications. Sometimes I need to run all three in parallel. Each of these is bundled with Webpack, so for some packages, I could define the following scripts in my root-level package.json:
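A minimal sketch of what such root-level scripts might look like, generated from a list of application names and printed as a package.json fragment. The application names are hypothetical, and the combined start script assumes Lerna's --parallel flag for lerna run; it is one possible way to start everything at once, not necessarily the setup used in my repository.

```javascript
// Hypothetical application package names inside packages/.
const apps = ["app-one", "app-two", "app-three"];

// Build one alias per application using the --prefix argument,
// plus a script that starts every package's "start" script together.
function rootScripts(appNames) {
  const scripts = {};
  for (const name of appNames) {
    scripts[`start:${name}`] = `npm run start --prefix packages/${name}`;
  }
  // Assumption: each application defines its own "start" script.
  scripts.start = "lerna run start --parallel";
  return scripts;
}

// Print the fragment as it would appear in the root package.json.
console.log(JSON.stringify({ scripts: rootScripts(apps) }, null, 2));
```

With these in place, npm run start:app-one from the repository root runs the start script of that one package.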
This small change has a significant impact when it comes to developer familiarisation with your monorepo structure!
Importing From Existing Git Repositories
Lerna provides a script to import existing git repositories into its repository structure while preserving git history. This is vital for developer productivity, otherwise useful contextual information disappears into a black hole.
Unfortunately, I struggled to import my existing git repositories using lerna import. I suspect that this was down to merge conflicts that required manual resolution. There is a documented method for fixing this, but this involves re-writing history in a way I was not entirely comfortable with. Instead, I utilised the magic that is the git subtree command to manually import into the new destination:
- Add the old repository as a remote: git remote add <name> <path>.
- Create a subtree from the remote. Use the prefix option to specify where you would like the subtree to be written to. By default that will be a directory in packages: git subtree add -P packages/<package-name> <remote> <branch>.
If you ever have to merge from the old repository, you can handle that via a subtree merge: git subtree merge -P packages/<package-name> <remote> <branch>. Note that whilst this technique preserves git history, I lost tag information. I didn’t mind so much because I can always grep for exact commits where a tag was created.
This is where Lerna really comes into its own. Each package is free to declare its own dependencies, but for cases where several packages wish to depend on the same dependency, Lerna’s bootstrap process will “hoist” the common dependencies so that they’re stored in one location. For painless hoisting, add lerna bootstrap --hoist as a postinstall command in your top-level package.json. This means every time that npm install is run, the hoisting stage will happen immediately after, ensuring that all packages that depend on one another are correctly referenced.
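For illustration, the relevant part of a top-level package.json could look like the object below, printed as JSON. The repository name is hypothetical and marking the root as private is my assumption; only the postinstall entry comes from the description above.

```javascript
// Sketch of a root-level package.json, written as a JS object.
const rootPackageJson = {
  name: "my-monorepo", // hypothetical name
  private: true, // assumption: the root itself is never published
  scripts: {
    // Runs automatically after every `npm install` at the root.
    postinstall: "lerna bootstrap --hoist",
  },
};

console.log(JSON.stringify(rootPackageJson, null, 2));
```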
There are two packages that both depend on the same dependency, e.g. React v16.3. When you bootstrap, React will be installed at the root-level node_modules folder only. Node’s module resolution will find it as expected.
node_modules/
  react/
packages/
  package-a/
  package-b/
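The reason a hoisted dependency is still found is that Node looks for a node_modules folder in the requiring directory and then in each parent directory in turn. The sketch below is a simplified model of that lookup order, not Node's actual implementation; the /repo path and the nodeModulesCandidates helper are assumptions for the example.

```javascript
const path = require("path");

// Simplified model of Node's lookup: walk up from the starting directory
// and collect a node_modules candidate at every level.
function nodeModulesCandidates(fromDir) {
  const candidates = [];
  let dir = path.posix.resolve(fromDir);
  while (true) {
    // Node does not look for node_modules/node_modules.
    if (path.posix.basename(dir) !== "node_modules") {
      candidates.push(path.posix.join(dir, "node_modules"));
    }
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return candidates;
}

// package-a has no local copy of react, so the search continues upwards
// until it reaches the hoisted copy in /repo/node_modules.
console.log(nodeModulesCandidates("/repo/packages/package-a"));
```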
Similar to above, two packages depend on the same dependency, but require it to be available from the command-line. An example is Webpack, as you may wish to run Webpack from the package’s folder. When you bootstrap, Webpack will be installed at the root-level node_modules, although a “symlink” exists in the .bin folder of each package’s node_modules.
node_modules/
  webpack/
packages/
  package-a/
    node_modules/
      .bin/
        webpack.cmd
  package-b/
    node_modules/
      .bin/
        webpack.cmd
The Lerna repository contains two packages, where one depends on the other (package-b depends on package-a). When you bootstrap, a “symlink” to package-a exists in the node_modules folder of package-b.
packages/
  package-a/
  package-b/
    node_modules/
      package-a/
Note that in this case Lerna handles the versioning of the dependency to a local package for you, so as you publish new versions, it will update the package.json files of your packages automatically.
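As a sketch of the linking rule just described, the function below works out which dependencies point at another package in the same repository and would therefore be symlinked rather than installed. The manifests and the localLinks helper are hypothetical; this models the outcome of bootstrapping rather than Lerna's own code.

```javascript
// Hypothetical manifests for the two packages in the example.
const packages = {
  "package-a": { version: "0.1.0", dependencies: { react: "^16.3.0" } },
  "package-b": {
    version: "0.1.0",
    dependencies: { "package-a": "^0.1.0", react: "^16.3.0" },
  },
};

// List the symlinks expected after bootstrapping: one for each dependency
// whose name matches another package in the repository.
function localLinks(allPackages) {
  const links = [];
  for (const [name, manifest] of Object.entries(allPackages)) {
    for (const dep of Object.keys(manifest.dependencies || {})) {
      if (Object.prototype.hasOwnProperty.call(allPackages, dep)) {
        links.push({
          link: `packages/${name}/node_modules/${dep}`,
          target: `packages/${dep}`,
        });
      }
    }
  }
  return links;
}

console.log(localLinks(packages));
```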
The modern web development toolchain is quite advanced - including transpilation, bundling, linting and minification. Very often your packages will wish to perform these steps in a consistent manner. To achieve that, place your configuration files, such as your eslintrc files, at the root directory of the repository. Tools such as ESLint search parent directories for their configuration, so the root-level files will be found and applied consistently.
I use Jest as a test runner and assertions library so I’ll be referring exclusively to its operation here. In the interests of balance, other runners and frameworks are available!
Jest supports monorepos out of the box which is super-handy, but opinions differ regarding whether you should run your tests as one large run, or parallelize running Jest tests from each package. I’ve used both.
As One Runner
For the first option, as with other shared config files, declare your Jest config file in the root directory of your repository.
Here’s an example of a bare-bones top-level Jest config file:
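What follows is a sketch of such a root-level jest.config.js, put together from the description in this section rather than copied from a real project. The shared-lib package name and its src/index.js entry point are hypothetical, and the CSS mapping assumes identity-obj-proxy is installed as a dev dependency.

```javascript
// jest.config.js at the repository root (illustrative sketch).
const config = {
  // Look for tests and source files inside the packages directory.
  roots: ["<rootDir>/packages"],
  moduleNameMapper: {
    // Replace CSS imports with identity-obj-proxy so stylesheets are not loaded.
    "\\.css$": "identity-obj-proxy",
    // Hypothetical library package that lives inside the monorepo.
    "^shared-lib$": "<rootDir>/packages/shared-lib/src/index.js",
  },
  // Collect coverage from the packages, but never from node_modules.
  collectCoverageFrom: ["packages/**/*.js", "!**/node_modules/**"],
};

module.exports = config;
```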
This file is providing a mock (specifically identity-obj-proxy) so that CSS files aren’t included. It’s instructing Jest to look for tests and source files in the packages directory. Code coverage is configured to include this directory too, and exclude node_modules for obvious reasons. The only interesting part is the declaration of moduleNameMapper - in my sample repository this is a library package that lives within the monorepo and is used by other packages. In order for Jest to correctly find this, I found that I had to specifically add this module name transform.
From there, it’s just a case of adding "test": "jest" in the root-level package.json scripts section.
As Multiple Runners
Should you wish to have Jest run standalone in each package, that’s perfectly fine too! As you would if the repository were standalone for your package, define a jest.config.js in that package’s directory. Similar to above, that package should have "test": "jest" in its package.json scripts section.
On your CI server you’ll most likely wish to run lerna run test, which will execute the test script in each package that defines one. You can do that while developing locally too if you wish, but you may also find yourself just wanting to run the tests from a single package. It may therefore be beneficial to define a top-level package.json script to only run the tests in a given package, using the --prefix trick mentioned above. For a package called package-a, that could look like this:
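A minimal sketch of that root-level entry, printed as a package.json fragment. The script name test:package-a is my own choice for the example; any name would do.

```javascript
// Root-level scripts section with an alias for one package's tests.
const rootScripts = {
  // Hypothetical alias name; runs the "test" script defined in package-a.
  "test:package-a": "npm run test --prefix packages/package-a",
};

console.log(JSON.stringify({ scripts: rootScripts }, null, 2));
```

Running npm run test:package-a from the repository root then executes only that package's tests.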
Note however that if you wish to pass arguments, such as to instruct Jest to update snapshot files, you have to use two instances of -- in the arguments:
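The doubled separator is needed because the command passes through two npm run layers, and each one consumes a single --. The sketch below models that forwarding with plain arrays; the npmRun helper and the test:package-a script name are assumptions for illustration, and the model ignores npm's own option parsing.

```javascript
// Model of one `npm run` layer: everything after the first "--" on the
// command line is appended to the script's own command.
function npmRun(scriptCommand, cliArgs) {
  const separator = cliArgs.indexOf("--");
  const forwarded = separator === -1 ? [] : cliArgs.slice(separator + 1);
  return [...scriptCommand, ...forwarded];
}

// Root script "test:package-a" (hypothetical name) and the package's "test" script.
const rootScript = ["npm", "run", "test", "--prefix", "packages/package-a"];
const packageScript = ["jest"];

// Typed at the root: npm run test:package-a -- -- -u
const innerCommand = npmRun(rootScript, ["--", "--", "-u"]);
// The inner npm sees everything after "npm run test".
const jestCommand = npmRun(packageScript, innerCommand.slice(3));

console.log(innerCommand.join(" ")); // npm run test --prefix packages/package-a -- -u
console.log(jestCommand.join(" ")); // jest -u
```

With only one -- the inner command becomes npm run test --prefix packages/package-a -u, so the -u flag is left with the inner npm and never reaches Jest.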
Lerna has two versioning modes: fixed and independent. These versioning modes determine what occurs when you wish to publish npm packages from the Lerna repository.
Fixed mode is the default and it’s my preferred option - in this mode the lerna.json file at the root of the repository tracks a single version number for the packages in the repository. When running a publish build, a new version is published for every package that has been marked as updated (via the lerna updated command), all with the same version number.
Given a Lerna repository with three packages - if the version is currently
0.1.0 and two of the three packages have been updated, Lerna will publish the two updated packages using the version number that you specify, let’s say
Independent versioning is where the exact version number of each package can differ. Whilst this may not be immediately obvious, consider the case where a change to a library package constitutes an incremental change to one package that depends on it, but a breaking change in another. If you’re an adopter of semantic versioning then you would wish for your new version numbers to reflect this change. Managing independent versioning requires more co-ordination, so I would only recommend it if absolutely necessary. By the way, I’ve written more on Lerna Independent Versioning.
Lerna’s publish command will create a new release, utilising several lerna sub-commands. It will identify any packages that have changed, bump their version and then push those packages to NPM. Finally a commit is applied to the repository and a new tag created. The version number can be supplied to the command, either exactly or using a semver keyword like minor. Alternatively, do not supply a version for an interactive prompt. Remember to add --yes in CI environments to skip interactive prompts! A typical CI setup would look like:
1. npm install (with the above postinstall step to run lerna bootstrap --hoist)
2. lerna run build
3. lerna run test
4. lerna publish patch --yes
For further configurability you may wish to extend step four into a custom script that reads some configuration from a file, to determine the versioning scheme (cd-version or repo-version) to use, and the subsequent values of those options. This could be useful if you wish to publish a prerelease version or an exact version number from a branch, but your day-to-day incremental builds run on a patch basis. Furthermore this can be extended to a CI/CD pipeline scenario, where following your publish step you auto-deploy to environments.
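One way such a custom script could be sketched is shown below. It builds the argument list for the publish step from a small configuration object, following the positional form used in step four; with the cd-version or repo-version options the arguments would differ. The configuration shape (keyword, exactVersion) and the publishArgs helper are hypothetical, and the sketch only prints the command rather than running it.

```javascript
// Hypothetical release configuration, e.g. parsed from a JSON file in the repo:
//   { keyword: "prerelease" } or { exactVersion: "1.2.0" } or {}
function publishArgs(releaseConfig) {
  // An exact version wins over a keyword; day-to-day builds default to patch.
  const bump = releaseConfig.exactVersion || releaseConfig.keyword || "patch";
  return ["publish", bump, "--yes"];
}

// Print the command each configuration would produce.
for (const cfg of [{}, { keyword: "prerelease" }, { exactVersion: "1.2.0" }]) {
  console.log(["lerna", ...publishArgs(cfg)].join(" "));
}
```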
I’ve enjoyed the flexibility of storing multiple packages in one repository. Especially in corporate environments where there’s a level of administration required to set up repositories, Lerna’s structure affords me the ability to create packages arbitrarily. This is great for code re-use and modularisation, and it provides an efficient way of building code via dependency re-use and parallel execution of npm scripts.
Taking this one step further, Yarn Workspaces takes the concept of Lerna but tightly integrates it into the package manager CLI. This makes installing dependencies even more efficient. Lerna is fully compatible with Yarn Workspaces, so it’s not necessarily a choice of one or the other.