Duplicated code from dependencies is possibly the most needlessly endemic problem confronting modern web application bundles, making them bigger and slower. And diagnosing how the usual suspects, npm and webpack, conspire to include more than you need can be incredibly difficult and painful.
In this post we will do a deep dive into the various ways that dependencies can unnecessarily bloat bundles to better understand the problem space. Then, we will introduce the Inspectpack DuplicatesPlugin -- a power tool to help you identify nuanced, actionable information about wasted bytes from dependencies in your webpack bundles.
Dependencies, Bundles, and Duplicates
To begin to understand how duplicated dependencies hurt modern frontend bundles, we need to look at the npm and webpack projects, and how they work together to produce a rising share of the ecosystem's delivered web applications.
Dependencies
Modern frontend web applications are rarely written from scratch. Most JavaScript-based applications rely on the expansive world of open source libraries, in particular those published as packages to the npm registry. With just a few additions to your package.json file, you can instantly get any number of libraries to help your application with anything from date formatting to full-blown application frameworks like React!
Bundles
Transforming your application code and dependencies into a web application capable of running in a browser usually involves a bundling tool, the most popular of which is webpack. Webpack ingests your code and dependencies, and packages all of the necessary files into 1+ "bundle" files that together are downloaded to end-user browsers.
Duplicates
Unfortunately, this power and flexibility comes with costs and complexities. Dependencies often have dependencies of their own, which makes analysis difficult for humans trying to optimize bundles and bundling tools trying to efficiently stitch code together.
All too often bundles suffer from one or more of the following duplication situations:
- Identical code sources from the same package: Your application bundle ends up with 2+ copies of a file that are byte-for-byte identical. E.g., you have the code from file lodash@4.2.3/get.js literally included twice in your final bundle.
- Similar code files from different packages: Your application bundles two files with the same package name and file path that function similarly, but are not byte-for-byte identical. E.g., your bundle has code from the files lodash@3.1.0/get.js and lodash@4.2.3/get.js that is functionally the same, but not the actual identical code.
- Identical code sources from different packages: This time you have different packages, but the specific file included in your bundle hasn't changed across the versions and is byte-for-byte identical. E.g., your bundle has code from the files lodash@3.1.0/get.js and lodash@4.2.3/get.js that is identical (although code in other unused files in the packages may differ).
In each of these scenarios your bundle ends up with more code than necessary because the same file (identical code or not) should be collapsed to a single version. To see how and why this can happen, it helps to review how npm and webpack work.
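To make the three situations concrete, here is a minimal sketch of how bundled modules could be labeled. It is not how any particular tool implements detection. The record shape (pkg, version, installedAt, file, source) and the findDuplicates helper are hypothetical names made up for this illustration, and the sources are stand-in strings.

```javascript
// Label every bundled module that shares a package name and file path with another.
function findDuplicates(modules) {
  // Group modules by "package/file", ignoring version and install location.
  const groups = new Map();
  for (const mod of modules) {
    const key = `${mod.pkg}/${mod.file}`;
    groups.set(key, (groups.get(key) || []).concat(mod));
  }

  const report = [];
  for (const [key, mods] of groups) {
    if (mods.length < 2) { continue; } // Only one copy: nothing wasted.
    report.push({
      key,
      files: mods.map((mod) => {
        // "Twins" are other copies with byte-for-byte identical source.
        const twins = mods.filter((other) => other !== mod && other.source === mod.source);
        const sameVersion = twins.some((other) => other.version === mod.version);
        let type = "similar";
        if (twins.length > 0) {
          type = sameVersion ? "identical (same package)" : "identical (different packages)";
        }
        return { path: `${mod.installedAt}/${mod.file}`, version: mod.version, type };
      })
    });
  }
  return report;
}

// Made-up bundle contents: two identical copies of lodash@3.1.0 plus a differing lodash@4.2.3.
const duplicatesReport = findDuplicates([
  { pkg: "lodash", version: "4.2.3", installedAt: "node_modules/lodash", file: "get.js", source: "/* v4 */" },
  { pkg: "lodash", version: "3.1.0", installedAt: "node_modules/one/node_modules/lodash", file: "get.js", source: "/* v3 */" },
  { pkg: "lodash", version: "3.1.0", installedAt: "node_modules/two/node_modules/lodash", file: "get.js", source: "/* v3 */" },
  { pkg: "one", version: "1.2.3", installedAt: "node_modules/one", file: "index.js", source: "/* one */" }
]);
console.log(JSON.stringify(duplicatesReport, null, 2));
```

Grouping by package name and file path first, and only then comparing sources, mirrors the distinction drawn above: a shared path makes files candidates for collapsing, while byte equality decides whether they are identical or merely similar.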
Understanding npm and webpack
Let's begin with a common frontend build situation wherein npm (or yarn) "gets" code dependencies from the Internet and webpack then "puts" code sources in an application bundle. (Note that npm and webpack have changed how they handle duplicated packages and code, so we're going to cover the mechanics of each over time.)
We will work through a simple hypothetical application with a lot of different npm and webpack scenarios. You can find all of the discussed examples in a GitHub repository, complete with node_modules and application bundles placed into git source for easier online review.
npm Dependencies Installation
The npm tool first reads dependencies from the package.json:dependencies (and devDependencies for development) field. Let's consider an application like this:
// package.json
{
  "name": "my-app",
  "dependencies": {
    "lodash": "^4.1.0",
    "one": "1.2.3",
    "two": "2.3.4"
  }
}
with the one and two packages that "resolve" to match those dependencies. npm then looks at the resolved packages and recursively resolves the dependencies of each, essentially creating a dependency tree (or, more abstractly, a graph if there are circular dependencies). Here, we have:
// node_modules/one/package.json
{
  "name": "one",
  "dependencies": {
    "lodash": "^4.0.0"
  }
}

// node_modules/two/package.json
{
  "name": "two",
  "dependencies": {
    "lodash": "^3.0.0"
  }
}
So, what does our dependency tree look like? It's a bit complicated because the first-level dependencies in the root package like "lodash": "^4.1.0" get resolved to a single package like lodash@4.2.3 and downloaded. Only after this are the resolved package's dependencies read and semver ranges resolved to actual packages recursively. This means that our "abstract" dependency tree is really a mix of declared version ranges and the concrete resolutions made for them at a given point in time.
Aside: lodash is a real package with a get method. We will use some fictional implementations in our examples. one and two are fabricated packages. We picked lodash as a fictionalized example because it is so popular that there is a decent chance that a bundle with duplicates has some lodash duplicates.
Let's also define some quick terms that we'll use throughout the rest of this post:
- "resolved": We start with a package name (lodash) that has a declared version constraint (^4.1.0). During an npm/yarn install, one early step is to take that constraint and resolve it to a single version in the registry that can be downloaded (lodash@4.2.3).
- "installed": Resolved packages are downloaded, but not necessarily immediately placed in their final installation paths on disk in node_modules. Both npm and yarn reserve the right to move things around, flatten installed packages to single on-disk versions, etc. Only when npm or yarn finishes the installation command do we say a package is actually installed at a given location on disk (e.g., node_modules/lodash).
- "depended": We use the term "depended" to describe a logical, unique path in the abstract dependency tree that causes a package to be included. In our example above, we have three packages depending on lodash (my-app, one, and two).
For the present example, our abstract depended tree (with resolved versions in comments on the right) is:
# Depended            # Resolved
# =================== # ========
- my-app:
  - lodash@^4.1.0     # 4.2.3
  - one@1.2.3:        # 1.2.3
    - lodash@^4.0.0   # 4.2.3
  - two@2.3.4:        # 2.3.4
    - lodash@^3.0.0   # 3.1.0
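As a rough illustration of the "resolved" step, the sketch below picks the highest available version that satisfies a constraint. It is deliberately simplified and is not npm's implementation: it only understands exact versions and caret ranges with a major version of 1 or higher, and the registry contents are invented for this example. Real installs should rely on a full semver implementation.

```javascript
// Parse "4.2.3" into [4, 2, 3].
const parse = (version) => version.split(".").map(Number);

// Compare two versions: negative if a < b, positive if a > b, 0 if equal.
const compare = (a, b) => {
  const pa = parse(a);
  const pb = parse(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) { return pa[i] - pb[i]; }
  }
  return 0;
};

// Simplified matcher: exact versions, or caret ranges with major >= 1
// (same major version, and at least the stated minimum).
const satisfies = (version, range) => {
  if (!range.startsWith("^")) { return version === range; }
  const min = range.slice(1);
  return parse(version)[0] === parse(min)[0] && compare(version, min) >= 0;
};

// Resolve a constraint to the highest matching version, or null if none match.
const resolve = (name, range, registry) => {
  const matches = registry[name].filter((v) => satisfies(v, range)).sort(compare);
  return matches.length ? matches[matches.length - 1] : null;
};

// Hypothetical registry contents for our fictional packages.
const registry = {
  lodash: ["3.0.0", "3.1.0", "4.0.0", "4.1.0", "4.2.3"],
  one: ["1.2.3"],
  two: ["2.3.4"]
};

console.log(resolve("lodash", "^4.1.0", registry)); // 4.2.3
console.log(resolve("lodash", "^4.0.0", registry)); // 4.2.3
console.log(resolve("lodash", "^3.0.0", registry)); // 3.1.0
```

Note how two different constraints (^4.1.0 and ^4.0.0) land on the same resolved version, which is exactly what makes flattening possible later on.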
So, what happens when we install this on disk via npm install or yarn install? Well, the answer depends on which version of npm / yarn you use.
Old npm
Older versions of npm used to install dependencies exactly as would match the abstract dependency tree, namely:
# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3
  one                 # 1.2.3
    node_modules
      lodash          # 4.2.3
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0
Even though lodash@4.2.3 resolves to a single package version, it is installed twice.
Flattening with modern npm
In recognition of issues with wasted disk space and bloated frontend bundles, modern versions of npm and yarn implement a scheme of "flattening" the installed node_modules dependency tree. Following the Node.js require resolution algorithm, some of these dependencies can be collapsed to one package, within an acceptable semantic version range. (The actual mechanics of flattening and Node.js require resolution are complex and out of scope for this article -- we're just going to gloss over the subject.)
In the above example, resolved version 4.2.3 of lodash is compatible with both ~/lodash@^4.1.0 and ~/one@1.2.3/~/lodash@^4.0.0. Thus, by using a flattening installer, you could end up with an installed node_modules tree like this:
# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3 (for `my-app` _and_ for `one`)
  one                 # 1.2.3
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0
Note: Following a webpack convention, we use the ~ character to mean the node_modules directory. The two can be used interchangeably.
Un-flattenable dependencies in modern npm
But, let's not get our hopes up too quickly. Even with modern npm, identical packages may still not be able to be flattened.
If our abstract dependency tree changes to:
# Depended            # Resolved
# =================== # ========
- my-app:
  - lodash@^4.1.0     # 4.2.3
  - one@1.2.3:        # 1.2.3
    - lodash@^3.0.0   # 3.1.0 (CHANGED!)
  - two@2.3.4:        # 2.3.4
    - lodash@^3.0.0   # 3.1.0
The root install will be ~/lodash at 4.2.3 with two identical 3.1.0 versions like:
# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3
  one                 # 1.2.3
    node_modules
      lodash          # 3.1.0
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0
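The flattenable and un-flattenable layouts can both be reproduced with a toy placement routine. This is only a sketch under simplifying assumptions, not the algorithm npm or yarn actually use: it handles a single level of transitive dependencies, takes already-resolved versions as input, and the install helper and its input shape are hypothetical.

```javascript
// Place resolved dependencies on disk, hoisting to the root node_modules when possible.
// `tree` maps each direct dependency of the app to its resolved version and to its
// own resolved dependencies (one level deep, for brevity).
function install(tree) {
  const installed = {}; // on-disk path -> installed version

  // Direct dependencies of the app always land in the root node_modules.
  for (const [name, { version }] of Object.entries(tree)) {
    installed[`node_modules/${name}`] = version;
  }

  // Transitive dependencies share the root slot unless a different version holds it.
  for (const [parent, { deps = {} }] of Object.entries(tree)) {
    for (const [name, version] of Object.entries(deps)) {
      const rootPath = `node_modules/${name}`;
      if (installed[rootPath] === undefined || installed[rootPath] === version) {
        installed[rootPath] = version; // Flattened: reuse the root copy.
      } else {
        installed[`node_modules/${parent}/node_modules/${name}`] = version; // Conflict: nest.
      }
    }
  }
  return installed;
}

// `one` accepts the same lodash as the root, so it can be flattened.
const flattenable = install({
  lodash: { version: "4.2.3" },
  one: { version: "1.2.3", deps: { lodash: "4.2.3" } },
  two: { version: "2.3.4", deps: { lodash: "3.1.0" } }
});

// `one` and `two` agree on 3.1.0, but the root slot is already taken by 4.2.3.
const unflattenable = install({
  lodash: { version: "4.2.3" },
  one: { version: "1.2.3", deps: { lodash: "3.1.0" } },
  two: { version: "2.3.4", deps: { lodash: "3.1.0" } }
});

console.log(flattenable);
console.log(unflattenable);
```

The second call shows the crux of the problem: both nested copies are the same version, yet neither can move to the root because only one version of a package can live at a given node_modules path.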
Together, these scenarios outline different ways that npm can place packages on disk in node_modules. Once there, how do depended-on code files end up in your frontend application?
The answer depends on your bundling tool. In this post, we'll focus on the widely-used webpack project.
Webpack bundling
A Webpack build starts at an entry point, which is typically your application code. It traverses all module imports (require or import) recursively to ingest code from your app or node_modules, process it, and concatenate everything into one or more bundles. (An oversimplification, but bear with us here.)
Let's say we have an application consisting of:
// my-app/index.js (entry point)
const { get } = require("lodash"); // lodash@^4.1.0
const { getOne } = require("one");
const { getTwo } = require("two");

const OBJ = { one: { two: "hi" } };

console.log("Get from lodash", get("one.two", OBJ)); // => `"hi"`
console.log("Get from one", getOne(OBJ)); // => `{ two: "hi" }`
console.log("Get from two", getTwo(OBJ)); // => `undefined`

// node_modules/one/index.js
const { get } = require("lodash"); // lodash@^4.0.0

module.exports = {
  getOne: (obj) => get("one", obj)
};

// node_modules/two/index.js
const { get } = require("lodash"); // lodash@^3.0.0

module.exports = {
  getTwo: (obj) => get("two", obj)
};
With this setup, we will definitely end up with a bundle that includes the files:
- my-app/index.js
- node_modules/one/index.js
- node_modules/two/index.js
But the big question is: what files from lodash end up in our final webpack bundle?
And, of course, the answer is: it's complicated. It depends on how npm installed the packages into node_modules and how webpack behaves.
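Why the on-disk layout matters so much becomes clearer with a sketch of Node.js-style package lookup, which webpack's default resolution broadly follows: start in the requiring file's directory and walk upward, checking each node_modules folder along the way. The version below is a simplified illustration that works against an in-memory set of directories rather than the real file system; the resolvePackage helper is a hypothetical name.

```javascript
// Find the installed package directory that `require(name)` resolves to when
// called from a file inside `fromDir`, given the set of installed directories.
function resolvePackage(fromDir, name, installedDirs) {
  const parts = fromDir.split("/");
  // Walk up from the requiring directory toward the root.
  for (let i = parts.length; i >= 1; i--) {
    // Never look inside "node_modules/node_modules".
    if (parts[i - 1] === "node_modules") { continue; }
    const candidate = [...parts.slice(0, i), "node_modules", name].join("/");
    if (installedDirs.has(candidate)) { return candidate; }
  }
  return null;
}

// The flattened layout from the modern npm example above.
const installedDirs = new Set([
  "my-app/node_modules/lodash",
  "my-app/node_modules/one",
  "my-app/node_modules/two",
  "my-app/node_modules/two/node_modules/lodash"
]);

// The app and `one` both find the root lodash; `two` finds its own nested copy.
console.log(resolvePackage("my-app", "lodash", installedDirs));
console.log(resolvePackage("my-app/node_modules/one", "lodash", installedDirs));
console.log(resolvePackage("my-app/node_modules/two", "lodash", installedDirs));
```

Each distinct directory that a lookup lands on is a distinct module as far as the bundler is concerned, so two installs of the same version mean two copies of the code.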
Old webpack
The original version of webpack came with the webpack.optimize.DedupePlugin plugin. The plugin replaces subsequent identical sections of an original code source with a pointer reference. Configuring the plugin is as simple as:
// webpack.config.js
const webpack = require("webpack");

module.exports = {
  plugins: [
    new webpack.optimize.DedupePlugin()
  ]
};
and a bundle will conveniently omit all extraneous identical code sources using any version of npm or yarn.
New webpack
The DedupePlugin was removed in webpack@3 with an indication that modern npm flattening should be sufficient to collapse duplicates in the installed node_modules folder such that Webpack would no longer have to do anything.
... but is that really the case?
Into the weeds with various duplication scenarios
Let's look at a few scenarios that might arise in bundles using old/new npm and old/new webpack. (See our examples repository for the full inputs and build outputs.)
These scenarios use the application discussed above that imports get() from lodash, getOne() from one, and getTwo() from two. The contrived getOne() and getTwo() methods use another import of lodash at differing versions. Although lodash has a real get() method, instead we will make up two hypothetical versions as follows:
module.exports = {
  // A very, very rough and naive object getter with forEach.
  get: (path, obj) => {
    let memo = obj;
    path.split(".").forEach((key) => {
      memo = memo === undefined ? memo : memo[key];
    });
    return memo;
  }
};

module.exports = {
  // A very, very rough and naive object getter with reduce.
  get: (path, obj) => path.split(".").reduce((memo, key) => {
    return memo === undefined ? memo : memo[key];
  }, obj)
};
Scenario 1 - Old npm
Old npm installs the node_modules folder something like:
# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3
  one                 # 1.2.3
    node_modules
      lodash          # 4.2.3
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0
On disk, we have identical code sources at:
- node_modules/lodash/index.js (4.2.3)
- node_modules/one/node_modules/lodash/index.js (4.2.3)
and similar code files at:
- node_modules/two/node_modules/lodash/index.js (3.1.0)
Let's look at how old and new webpack bundle these:
Scenario 1.a - Old npm + old webpack
Old webpack's DedupePlugin deduplicates the identical code sources so that only 1 code instance remains in our bundle. Looking at these lines of the bundle:
/* 3 */
/*!*****************************************!*\
  !*** ./old-npm/~/one/~/lodash/index.js ***!
  \*****************************************/
1,
instead of real code, there is an integer 1 which points to the full identical source at index 1 in the bundle which corresponds to node_modules/lodash/index.js.
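The pointer trick is easy to picture with a small sketch. This is not the DedupePlugin's actual implementation, just the idea behind it: keep the first occurrence of each byte-identical source and swap any later occurrence for the index of that first copy. The dedupe helper, the module ordering, and the stand-in source strings are all assumptions for illustration.

```javascript
// Replace each repeated, byte-identical module source with the index of its first occurrence.
function dedupe(sources) {
  const firstIndexBySource = new Map();
  return sources.map((source, index) => {
    if (firstIndexBySource.has(source)) {
      return firstIndexBySource.get(source); // Pointer to the earlier copy.
    }
    firstIndexBySource.set(source, index);
    return source; // First occurrence keeps the real code.
  });
}

// Stand-in strings for the module sources in this scenario (ordering is assumed).
const bundle = dedupe([
  "/* my-app/index.js */",
  "/* lodash get with reduce */",  // ~/lodash/index.js
  "/* one/index.js */",
  "/* lodash get with reduce */",  // ~/one/~/lodash/index.js (identical to index 1)
  "/* two/index.js */",
  "/* lodash get with forEach */"  // ~/two/~/lodash/index.js (similar, not identical)
]);

console.log(bundle);
```

Because the comparison is on the source text alone, this approach collapses identical files regardless of which package or version they came from, but it can do nothing for files that are merely similar.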
Assessment: Examining our potential duplication problems, our bundle stacks up as follows:
- Identical code sources from the same package: None. The plugin deduplicates.
- Similar code files from different packages: Duplicates. The deduplicated file lodash@4.2.3/index.js is similar in functionality to the different file lodash@3.1.0/index.js. If only one of the two files were chosen we would save the other file's byte size.
- Identical code sources from different packages: N/A. Our example doesn't have identical sources across different packages. But if it did, the plugin would deduplicate them.
Scenario 1.b - Old npm + new webpack
Modern webpack doesn't deduplicate, so our bundle contains the following full code sources:
- node_modules/lodash/index.js (4.2.3)
- node_modules/one/node_modules/lodash/index.js (4.2.3)
- node_modules/two/node_modules/lodash/index.js (3.1.0)
Assessment:
- Identical code sources from the same package: Duplicates. No webpack plugin deduplication.
- Similar code files from different packages: Duplicates. Same as scenario 1.a.
- Identical code sources from different packages: N/A. Our example doesn't have identical sources across different packages. But if it did, they would still remain as duplicates because modern webpack doesn't deduplicate identical sources.
Scenario 2 - New npm flattened
Let's upgrade to a modern npm version or any version of yarn. Both package managers now inspect the entire dependency tree and "flatten" dependencies to single packages higher up in node_modules whenever they can.
As mentioned above, this translates to an installed layout of:
# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3 (for root _and_ for `one`)
  one                 # 1.2.3
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0
collapsing the dependencies for lodash@4.2.3 to a single on-disk location.
Thus, we end up with no identical code sources and two similar code files:
- node_modules/lodash/index.js (4.2.3)
- node_modules/two/node_modules/lodash/index.js (3.1.0)
Let's turn to webpack and bundling this installation.
Scenario 2.a - New npm flattened + old webpack
Old webpack's DedupePlugin doesn't have anything to do this time, as there are no identical duplicate code sources.
Assessment: Our bundle has the following issues:
- Identical code sources from the same package: None. npm was able to flatten away our identical sources by collapsing lodash@4.2.3.
- Similar code files from different packages: Duplicates. The different files lodash@4.2.3/index.js and lodash@3.1.0/index.js waste bytes if a single file could be used instead.
- Identical code sources from different packages: N/A. Same as scenario 1.a.
Scenario 2.b - New npm flattened + new webpack
As the old DedupePlugin never came into play in this scenario, modern webpack has pretty much exactly the same substantive bundle as in scenario 2.a with the same advantages and disadvantages.
Scenario 3 - New npm unflattened
Unfortunately, modern npm and yarn cannot flatten all semver-compatible packages because they are ultimately bound by the rules of the Node.js require resolution algorithm. Thus, if we have a slight change in dependencies (the one package now depends on lodash@3.1.0), we end up with an installed on-disk layout of:
# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3
  one                 # 1.2.3
    node_modules
      lodash          # 3.1.0 (Cannot be collapsed)
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0 (Cannot be collapsed)
Thus, we have identical code sources:
- node_modules/one/node_modules/lodash/index.js (3.1.0)
- node_modules/two/node_modules/lodash/index.js (3.1.0)
and similar code files:
- node_modules/lodash/index.js (4.2.3)
Scenario 3.a - New npm unflattened + old webpack
Old webpack's DedupePlugin is able to deduplicate the identical code sources across one and two's lodash@3.1.0. The code from two is collapsed to a reference integer 3 in these lines.
Assessment: Our bundle stacks up as follows:
- Identical code sources from the same package: None. The plugin deduplicates.
- Similar code files from different packages: Duplicates. The deduplicated file lodash@3.1.0/index.js is similar to lodash@4.2.3/index.js, but both remain in the bundle.
- Identical code sources from different packages: N/A. Same as scenario 1.a.
Scenario 3.b - New npm unflattened + new webpack
Modern webpack doesn't deduplicate, so we end up with full code sources of:
- node_modules/lodash/index.js (4.2.3)
- node_modules/one/node_modules/lodash/index.js (3.1.0)
- node_modules/two/node_modules/lodash/index.js (3.1.0)
in our bundle pretty much analogously to scenario 1.b. Ultimately, modern npm that cannot flatten is equivalent to old npm that didn't even try.
Scenario 4 - New npm flattened with identical sources
We return to our original package version setup, but introduce a different twist: now lodash@3.1.0/index.js and lodash@4.2.3/index.js are identical code sources. Both use the reduce version of our get function.
We end up with the same installed layout as scenario 2:
# Installed           # Resolved
# =================== # ========
my-app/node_modules
  lodash              # 4.2.3 (for root _and_ for `one`)
  one                 # 1.2.3
  two                 # 2.3.4
    node_modules
      lodash          # 3.1.0
collapsing the dependencies for lodash@4.2.3 to a single on-disk install.
So now we end up with no similar code files and 2 identical code sources at:
- node_modules/lodash/index.js (4.2.3)
- node_modules/two/node_modules/lodash/index.js (3.1.0)
Let's see how the different versions of webpack handle this final scenario:
Scenario 4.a - New npm flattened with identical sources + old webpack
Old webpack's DedupePlugin is able to deduplicate the identical code sources across the flattened lodash@4.2.3/index.js (for root and one) and the different package (but identical code source) of lodash@3.1.0/index.js from two. The identical code source from two's dependency is collapsed to a reference integer 1 in these lines.
Assessment: Our bundle has no duplicates anywhere!
- Identical code sources from the same package: None. New npm/yarn takes care of this.
- Similar code files from different packages: N/A. The scenario has no similar-but-not-identical code sources.
- Identical code sources from different packages: None. The old webpack DedupePlugin is now able to collapse these, even across different packages.
Scenario 4.b - New npm flattened with identical sources + new webpack
Although modern npm / yarn take care of the flattened packages of lodash@4.2.3/index.js, there is a missed opportunity for the identical code source in lodash@3.1.0/index.js.
Assessment: Our bundle now has different types of duplicates from previous scenarios:
- Identical code sources from the same package: None. npm was able to flatten away our identical sources by collapsing lodash@4.2.3.
- Similar code files from different packages: N/A. The scenario has no similar-but-not-identical code sources.
- Identical code sources from different packages: Duplicates. Without old webpack deduplication, we end up with identical code across the two lodash versions in our bundle.
Finding and fixing duplicates in real applications
It takes a surprisingly large amount of background to dig into even the most common cases of duplicate dependency creep in your bundle. We have reviewed how old and modern npm and webpack together transform source code into a full application bundle. We've investigated just a few of the many, many scenarios that demonstrate how unnecessary dependency duplicates can produce a larger bundle than you need. And, we've identified some of the things that influence duplicates, namely:
- Where dependencies are coming from;
- How npm has placed the dependencies on disk;
- How webpack has included code from dependencies into the bundle.
The tough part here is that so far, we've only analyzed a truly trivial application via human inspection of the actual application bundle. Nearly all real applications will be significantly larger and more complex, all but foreclosing such manual analysis.
How do we apply this for reals?
Real-world development and production workflows all but require some baseline of programmatic tooling support for these issues.
On the positive side, there are a multitude of different webpack analysis tools that break down the various parts of the webpack compilation process. A good starting point is SurviveJS' dedicated page for build analysis with a subsection on duplicates analysis.
Nonetheless, while many of these projects can together provide a lot of information about every part of webpack compilation and your bundle, finding a dedicated report that is specifically actionable for reducing duplicates can remain challenging.
Introducing the Inspectpack DuplicatesPlugin
The Inspectpack project has been analyzing Webpack innards for quite some time. It's the information engine behind the popular webpack-dashboard, which provides a captive terminal display with a NASA-like control center. Inspectpack also provides a powerful CLI tool that consumes a Webpack stats object from disk to report on size, duplicates, and package version information in a variety of formats (text, TSV, JSON).
At its core, the Inspectpack library can efficiently discover code duplicates as well as inspect installed node_modules directories and infer how both npm and webpack impact a final production application bundle. We're very pleased to announce that we have taken this power and concentrated it into a new easy-to-use webpack plugin -- the Inspectpack DuplicatesPlugin.
Webpack integration
Following the online guide, first install the plugin and add it to your development dependencies:
$ npm install --save-dev inspectpack
# OR
$ yarn add --dev inspectpack
Then, integrate the plugin into the plugins field of your webpack configuration file:
// webpack.config.js
const { DuplicatesPlugin } = require("inspectpack/plugin");

module.exports = {
  plugins: [
    new DuplicatesPlugin({
      // Emit compilation warning or error? (Default: `false`)
      emitErrors: false,
      // Display full duplicates information? (Default: `false`)
      verbose: false
    })
  ]
};
Our options are as follows:
- emitErrors: By default (false), the DuplicatesPlugin emits duplicate issues to compilation.warnings. If this value is true, then the plugin will emit issues to compilation.errors which will fail typical modern webpack builds.
- verbose: By default (false), the report only shows package information for packages that have files that end up duplicated somewhere in a given bundle. Setting this value to true additionally displays information for duplicated files, including file size and type (identical (I) or similar (S)).
And that's pretty much it. It's worth noting that the plugin supports every version of webpack (currently 1-4) and is fully tested on each version!
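One way the two options might be combined in practice is to stay quiet during local development but fail the build on a CI server. The configuration below is only a sketch of that idea: it assumes your CI system sets a CI environment variable, which is common but not universal, and the IS_CI name is ours.

```javascript
// webpack.config.js
const { DuplicatesPlugin } = require("inspectpack/plugin");

// Assumption: the CI system sets a `CI` environment variable.
const IS_CI = !!process.env.CI;

module.exports = {
  plugins: [
    new DuplicatesPlugin({
      // Fail the build on duplicates in CI; only warn during local development.
      emitErrors: IS_CI,
      // Include per-file details in the CI build log.
      verbose: IS_CI
    })
  ]
};
```

Whether failing the build is appropriate depends on how many duplicates you already have; starting with warnings and tightening later is a gentler path.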
Discovering and understanding duplicates
Now that we have integrated the plugin, let's try it out!
The README documentation provides a guide to understanding plugin reports, but it's probably easier to just see it in practice. Picking an exemplary situation, we return to scenario 3.b, which uses modern npm with unflattened dependencies.
Default report
Running webpack with the default options (DuplicatesPlugin()) produces the following report:
Our report is a bit terse, but contains a summary of the information from our previous, manual analysis:
- For duplicate files, we have 2 similar and 3 similar or identical code sources. We also get a summary of the total number of bytes at issue. For us here, that's 703 bytes for 3 code sources, which we could presumably roughly cut to a third the size if we fix the duplicates.
- For packages, we have 1 unique package (lodash) with 2 resolved versions (3.1.0, 4.2.3), 3 installed paths on disk (~/one/~/lodash, ~/two/~/lodash, ~/lodash), and 3 depended abstract paths (starting from the root application, one, and two).
We then get a per-asset report (just bundle.js in our case) with a per-unique-package drill-down of the form:
{PACKAGE_NAME} (Found {NUM} resolved, {NUM} installed, {NUM} depended. Latest version {VERSION}.)
  {INSTALLED_PACKAGE_VERSION NO 1} {INSTALLED_PACKAGE_PATH NO 1}
    {DEPENDENCY PATH NO 1}
    {DEPENDENCY PATH NO 2}
    ...
  {INSTALLED_PACKAGE_VERSION NO 1} {INSTALLED_PACKAGE_PATH NO 2}
    ...
  {INSTALLED_PACKAGE_VERSION NO 2} {INSTALLED_PACKAGE_PATH NO 3}
    ...
Our report contains a package summary for lodash as it is the only duplicate-producing package in the scenario. Then we drill down into each installed version / path, and further drill down into each depended graph.
That's all of the package information we manually figured out before! But what if we want a bit more information about the duplicate code sources?
Verbose report
Enter the verbose report. Configuring the option DuplicatesPlugin({ verbose: true }) will additionally produce duplicate code source information:
Now, in addition to our dependency graphs, we get a report of each duplicated code source path, a note indicating if it is identical (I) to some other file in the bundle or merely similar (S), and the file byte size (e.g., 249 and 205 bytes respectively). As discussed previously, we have:
- Identical code sources from the same package: Duplicates. We have ~/one/~/lodash/index.js and ~/two/~/lodash/index.js from separate lodash@3.1.0 installations.
- Similar code files from different packages: Duplicates. Both of these lodash@3.1.0 files are similar to ~/lodash/index.js from lodash@4.2.3 in the bundle.
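To put numbers on the potential savings, here is a quick back-of-the-envelope calculation. It assumes the two identical files are 249 bytes each and the remaining file is 205 bytes, which is the only assignment of the reported sizes that adds up to the 703-byte total; the variable names are ours.

```javascript
// Byte sizes for the three duplicated lodash sources in the bundle
// (assumed: the identical pair is 249 bytes each, the third file is 205).
const sizes = [249, 249, 205];
const total = sizes.reduce((sum, size) => sum + size, 0);

// Best case: all three collapse to the smallest single file.
const bestCaseSavings = total - Math.min(...sizes);

// Conservative case: only the byte-identical pair collapses to one copy.
const identicalOnlySavings = 249;

console.log({ total, bestCaseSavings, identicalOnlySavings });
```

The gap between the two savings figures is the payoff for resolving the similar-but-not-identical files on top of the identical ones.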
Assessing duplicates
With DuplicatesPlugin added to our webpack configuration file, we now have a detailed assessment of how npm dependencies are installed and what webpack ends up placing in the bundle. More specifically, we can answer these important questions:
- What package versions were resolved?
- Where were the resolved packages installed on disk in node_modules?
- What parent packages depended on a given package to cause it to be resolved and installed?
- Which duplicate files from a package ended up being included in the final application bundle?
Note - DedupePlugin: The DuplicatesPlugin does not specifically detect that old webpack's DedupePlugin has programmatically collapsed a duplicate code source. Given that very few folks use a pre-webpack@3 version, we chose to omit a special case report (although we could add one if there were community demand).
Fixing duplicates
So, now that we can automatically report on duplicate dependencies, how do we fix them? How do we smash our duplicates?
Staying true to our running theme, the answer is: it's complicated.
The Inspectpack documentation has an introductory guide discussing how to fix bundle duplicates.
Summarizing these for convenience, we first look to meta-level tips on prioritization and focus:
- Look first to identical code sources: When choosing what to do first, a good bet is to focus on literally identical code duplicated in your bundle and dependencies. Although not guaranteed to be collapsible (due to considerations like other depended-on code), it's a great first stop.
- Change dependencies in your root package.json: Anything you can change directly in your application's package.json is likely to have the least unintended consequences.
- Critically examine and scrutinize your dependencies: An easy win is always to just have less code. Look at your dependency tree. Do you really need all of the bundled packages? Keep a critical eye out for packages that have lots of transitive other dependencies, prioritize based on total size brought in by a package (using tools like the awesome webpack-bundle-analyzer), and see if you can live without the dependency or find a smaller, equivalent replacement.
Unfortunately, many times you won't be able to harmonize and collapse your abstract dependency graph / installed node_modules tree easily. The next step is to potentially force packages and code sources to collapse to single entities, even if the normal rules of npm/yarn/webpack would prevent it.
- Set resolve.alias in your webpack configuration: Direct webpack to use a single package when resolving dependencies that would otherwise follow normal (multi-package) resolution.
- Set the resolutions field with yarn: Direct yarn (not available on npm) to resolve packages to a single version, overriding what is normally considered an allowed resolution and node_modules installation.
These are essentially sledgehammer approaches for otherwise intractable duplicate situations, but because they can violate the assumptions and rules of semantic versioning, your application is potentially at risk for breaking behavior and bugs.
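As a hedged illustration of the first sledgehammer, the fragment below aliases lodash to the root install so that every import of it, including those from packages with their own nested copy, lands on one directory. It assumes the root install lives at node_modules/lodash next to the configuration file, and it is only safe if every consumer really works with that version.

```javascript
// webpack.config.js
const path = require("path");

module.exports = {
  resolve: {
    alias: {
      // Send every import of `lodash` (and `lodash/...` subpaths) to the root install,
      // even from packages that have their own nested copy.
      lodash: path.resolve(__dirname, "node_modules/lodash")
    }
  }
};

// The yarn counterpart lives in package.json rather than in JavaScript:
//   "resolutions": { "lodash": "4.2.3" }
```

The alias changes only what webpack bundles and leaves node_modules untouched, whereas the yarn resolutions field changes what gets installed in the first place.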
All in all, like most web application optimization work, reducing duplicate dependencies is more of an art than science. The above tips are just starting points for an overall effort that digs deep into why your bundle is so large and how to most effectively leverage this information to reduce its size.
Conclusion
After our (rather lengthy) deep dive, we have now uncovered more of how and why duplicates can occur in your application bundles. Although we only discussed a few hundred wasted bytes to keep our build scenarios comprehensible, it is easy to extrapolate an impact of several orders of magnitude for more complex, real-world web applications.
We hope you find the new Inspectpack DuplicatesPlugin to be a useful, actionable tool in finding and squashing duplicate dependencies in your webpack builds. Do your application bundles a favor and give it a whirl today!