pkgndep

A new metric named "dependency heaviness" is proposed. It measures the number of additional dependency packages that a parent package brings to its child package and that are not brought in by any of the child's other parents.


Analyzing Dependency Heaviness of R Packages

When developing R packages, we should try to avoid depending directly on
"heavy packages". The "heaviness" of a package is the number of additional
dependency packages it brings in. If your package directly depends
on a heavy package, there are several consequences:

  1. Users need to install many additional packages when installing your
    package, which increases the risk that one of them fails to install
    and, as a result, your package cannot be installed either.
  2. The set of namespaces loaded into your R session after loading your package will be huge (you can see the loaded namespaces with sessionInfo()).
  3. Your package will be "heavy" as well, and it may take a long time to load.
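
The second consequence can be checked directly in a session. The following is a minimal sketch using only base R; the helper name `namespaces_added_by` is hypothetical and not part of pkgndep. It records the loaded namespaces before and after loading a package and returns the difference.

```r
# Hypothetical helper (not part of pkgndep): report which namespaces become
# loaded as a side effect of loading the namespace of `pkg`.
namespaces_added_by <- function(pkg) {
  before <- loadedNamespaces()
  loadNamespace(pkg)  # loads pkg and, recursively, the namespaces it imports
  setdiff(loadedNamespaces(), before)
}

# "tools" ships with R, so this runs without installing anything. The result
# is empty if the namespace was already loaded in the current session.
added <- namespaces_added_by("tools")
print(added)
```

Running this in a fresh session with a heavy package in place of "tools" gives a rough impression of how many namespaces it pulls in.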

In the DESCRIPTION file of your package, the "direct dependency
packages" are listed in the Depends, Imports and LinkingTo fields. There are
also "indirect dependency packages", which are found recursively from each of
the direct dependency packages. Here, "dependency packages" means
the union of the direct and indirect dependency packages.
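
To make the "union of direct and indirect dependencies" concrete, here is a small sketch on a toy dependency graph. The package names (`myPkg`, `A`, `B`, ...) and the function `all_dependencies` are hypothetical illustrations, not how pkgndep itself is implemented.

```r
# Toy graph (hypothetical names): each element lists the direct
# Depends/Imports/LinkingTo dependencies of the named package.
direct_deps <- list(
  myPkg = c("A", "B"),
  A     = c("C"),
  B     = c("C", "D"),
  C     = character(0),
  D     = character(0)
)

# Collect direct and indirect dependencies by walking the graph.
all_dependencies <- function(pkg, graph) {
  seen  <- character(0)
  queue <- graph[[pkg]]
  while (length(queue) > 0) {
    p     <- queue[1]
    queue <- queue[-1]
    if (!(p %in% seen)) {
      seen  <- c(seen, p)
      queue <- c(queue, graph[[p]])  # NULL for unknown packages, adds nothing
    }
  }
  sort(setdiff(seen, pkg))  # never report the package itself
}

deps <- all_dependencies("myPkg", direct_deps)
print(deps)
```

Here `A` and `B` are direct dependencies, while `C` and `D` are indirect ones. `C` is reached through both `A` and `B` but is counted only once.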

There are also packages listed in the Suggests and Enhances fields of the
DESCRIPTION file, but they are not required to be installed when installing
your package. Of course, they also have "indirect dependency packages". To get
rid of heavy packages that are rarely used in your package, it is
better to move them to the Suggests/Enhances fields and to load/install
them only when they are needed.
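
A common way to use a suggested package only when it is needed is to test for it with `requireNamespace()` inside the function that uses it. The sketch below is illustrative: `use_optional` and `robust_center` are hypothetical names, and `stats` stands in for a heavy suggested package so that the example runs anywhere.

```r
# Hypothetical helper: stop with a clear message if an optional (Suggests)
# package is not installed.
use_optional <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is required for this feature. ",
         "Please install it first.", call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical function of your package that needs the optional package.
# The package is called via `::` and is never attached at load time.
robust_center <- function(x) {
  use_optional("stats")  # stand-in for a heavy suggested package
  stats::median(x)
}

result <- robust_center(c(1, 5, 100))
print(result)
```

With this pattern, users who never call `robust_center()` do not need the optional package at all.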

The pkgndep package checks the heaviness of the dependency packages
of your package. For each package listed in the Depends, Imports,
LinkingTo and Suggests/Enhances fields of the DESCRIPTION file,
pkgndep checks how many additional packages your package requires. The
dependencies are summarized and visualized as a customized heatmap.

As an example, I am developing a package called cola, which depends on a lot
of other packages. The dependency heatmap looks as follows:

[Figure: dependency heatmap of cola]

In the heatmap, rows are the packages listed in the Depends, Imports and
Suggests fields, and columns are the additional dependency packages required
by each row package. The barplots on the right show the number of required
packages, the number of imported functions/methods/classes (parsed from the
NAMESPACE file) and the quantitative measure "heaviness" (the definition of
heaviness will be introduced later).
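
As a rough illustration of the heaviness measure, the sketch below counts, for one parent of a package, the dependency packages that no other parent brings in. It follows the one-sentence description at the top of this page. The toy graph, the function names and the choice to count the parent itself are assumptions made for illustration, not the implementation used by pkgndep.

```r
# Toy graph (hypothetical names): "child" has two parents, P1 and P2.
direct_deps <- list(
  child = c("P1", "P2"),
  P1    = c("X", "Y"),
  P2    = c("Y", "Z", "W"),
  X = character(0), Y = character(0), Z = character(0), W = character(0)
)

# A package together with all of its direct and indirect dependencies.
closure <- function(pkg, graph) {
  seen  <- character(0)
  queue <- pkg
  while (length(queue) > 0) {
    p     <- queue[1]
    queue <- queue[-1]
    if (!(p %in% seen)) {
      seen  <- c(seen, p)
      queue <- c(queue, graph[[p]])
    }
  }
  seen
}

# Assumed reading of heaviness: packages brought by `parent` (including
# `parent` itself) that none of the child's other parents bring in.
heaviness_of_parent <- function(parent, child, graph) {
  others <- setdiff(graph[[child]], parent)
  brought_by_others <- unique(unlist(lapply(others, closure, graph = graph)))
  length(setdiff(closure(parent, graph), brought_by_others))
}

h1 <- heaviness_of_parent("P1", "child", direct_deps)
h2 <- heaviness_of_parent("P2", "child", direct_deps)
print(c(P1 = h1, P2 = h2))
```

In this toy graph `Y` is shared by both parents, so it contributes to neither value: `P1` uniquely brings `P1` and `X`, while `P2` uniquely brings `P2`, `Z` and `W`.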

We can see that if all the packages are put in the Depends or Imports field
(i.e. moving all suggested packages to Imports), 248 packages are required
in total, which is really a lot. In fact, some of the heavy
packages such as WGCNA, clusterProfiler and ReactomePA (the last
three packages in the heatmap rows) are not used very frequently in cola.
Moving them to the Suggests field and using them only when they are needed
greatly reduces the heaviness of cola: the number of required
packages drops to only 64.

Citation

Gu Z. et al., pkgndep: a tool for analyzing dependency heaviness of R packages. Bioinformatics 2022. https://doi.org/10.1093/bioinformatics/btac449

Gu Z, On the Dependency Heaviness of CRAN/Bioconductor Ecosystem. Journal of Systems and Software 2023. https://doi.org/10.1016/j.jss.2023.111610

Installation

The pkgndep package can be installed from CRAN with:

install.packages("pkgndep")

Usage

To use this package:

library(pkgndep)
pkg = pkgndep("package-name")
dependency_heatmap(pkg)

or

pkg = pkgndep("path-of-the-package")
dependency_heatmap(pkg)

An executable example:

library(pkgndep)
pkg = pkgndep("ComplexHeatmap")
pkg
## ComplexHeatmap, version 2.9.4
## 30 additional packages are required for installing 'ComplexHeatmap'
## 117 additional packages are required if installing packages listed in all fields in DESCRIPTION
dependency_heatmap(pkg)

Heaviness database

pkgndep ships with an integrated dependency heaviness database that covers all R packages across many R/Bioconductor versions. The database can be accessed with:

heaviness_database()

License

MIT @ Zuguang Gu

Participating organisations

German Cancer Research Center
