Packages are collections of R functions or data that you can use to do various statistical tasks. Some packages are included by default when you install R, and others are published on the internet by their creators. There are a few different ways to explore and obtain packages.

If you want to see a list of packages that are currently installed on your computer, you can click on the Packages tab in the bottom-right pane of RStudio. You can click on the name of a package to see information about what it contains.
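
The same list can also be retrieved by typing a command. This is a minimal sketch using the installed.packages() function that comes with R; the variable name pkgs is just an illustrative choice:

```r
# installed.packages() returns a matrix with one row per installed package
pkgs <- rownames(installed.packages())

# Show the first few package names
head(pkgs)

# Check whether a particular package is installed;
# 'stats' ships with R, so this should be TRUE
'stats' %in% pkgs
```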

Typical package workflow

Most packages you will use are easily obtained through RStudio or with R's default commands. R automatically downloads these packages from a website called “CRAN” (the Comprehensive R Archive Network).

Installing packages

To install one of these packages, you only need to use the install.packages() function. E.g., to install the palmerpenguins package, we do:

install.packages('palmerpenguins')

This will likely just work, although it may take a little while and some output will appear in the console.

Occasionally, one of two things might happen:

  • you may get a popup asking you to choose a repository (typically these appear as a list of locations like Melbourne, Sydney, etc.). You can just pick the closest city to where you live.
  • you may be asked if you wish to “compile the package from source” or something along those lines, by entering a number or letter in the console. In the first instance, it is best to select no when presented with prompts like this. If you are on a Windows computer you won’t be able to compile from source without Rtools (ref).
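
If you would rather avoid the repository popup, one option is to set the repository yourself before installing. This is only a sketch, assuming you are happy to use the CRAN cloud mirror at https://cloud.r-project.org, which redirects you to a nearby server:

```r
# Tell R which CRAN mirror to use for the rest of this session
options(repos = c(CRAN = 'https://cloud.r-project.org'))

# Confirm the setting
getOption('repos')
```

With this set, install.packages() should not need to ask you to choose a location.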

You only need to install a package once on your computer, so you don’t need to re-run the install.packages() line regularly.
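
One way to keep an install line in a script without re-installing every time is to check first whether the package is already present. The sketch below defines a small helper for this; the name install_if_missing is hypothetical (it is not part of R), while requireNamespace() and install.packages() are standard R functions:

```r
# Hypothetical helper (not part of R): install a package only if it is missing.
# Returns TRUE (invisibly) if an installation was attempted, FALSE otherwise.
install_if_missing <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))   # already installed, nothing to do
  }
  install.packages(pkg)        # not found, so download and install it
  invisible(TRUE)
}

# 'stats' ships with R, so nothing is downloaded here
install_if_missing('stats')
```

Calling install_if_missing('palmerpenguins') in the same way would download the package the first time only.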

Loading and using packages

Having installed a package, we use library() to load it into memory. We need to do this each time we restart R/RStudio. For the palmerpenguins package we installed above we would do:

library(palmerpenguins)

You may see some messages in the console when you do this. You should read them, but for the most part they do not indicate problems. Note: be careful about quotation marks. When you used install.packages() you needed quotation marks, but when you use library() you don’t need them.
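
To illustrate the note about quotation marks, and to check that a package really has been loaded, here is a small sketch. It uses the tools package, chosen only because it is included with every R installation; the same pattern should apply to palmerpenguins:

```r
# Both forms load the package; the unquoted form is the usual style
library(tools)
library('tools')

# search() lists what is currently attached; loaded packages
# appear as 'package:<name>'
'package:tools' %in% search()
```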

Once a package is loaded you can use the functions or data in it. The package we have loaded includes some data, so we can look at it using:

penguins
## # A tibble: 344 × 8
##    species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
##    <fct>   <fct>              <dbl>         <dbl>             <int>       <int>
##  1 Adelie  Torgersen           39.1          18.7               181        3750
##  2 Adelie  Torgersen           39.5          17.4               186        3800
##  3 Adelie  Torgersen           40.3          18                 195        3250
##  4 Adelie  Torgersen           NA            NA                  NA          NA
##  5 Adelie  Torgersen           36.7          19.3               193        3450
##  6 Adelie  Torgersen           39.3          20.6               190        3650
##  7 Adelie  Torgersen           38.9          17.8               181        3625
##  8 Adelie  Torgersen           39.2          19.6               195        4675
##  9 Adelie  Torgersen           34.1          18.1               193        3475
## 10 Adelie  Torgersen           42            20.2               190        4250
## # ℹ 334 more rows
## # ℹ 2 more variables: sex <fct>, year <int>

Getting information about packages

Every package you obtain in this way must have specific documentation that you can view. Within RStudio, your best starting point is likely the package help. You can run help(package = ...) to get a list of the help files for a particular package, like:

help(package = 'palmerpenguins')

If you want to see information about a particular function or data, you can use a question mark before its name, like:

?penguins

The help files are extremely useful. They list all the details about how to use a function, describe what it produces as output, and give examples of how to use it.

Every package obtained in this way also has documentation on the internet, in this case: https://cran.r-project.org/web/packages/palmerpenguins/index.html

On this site you can read the reference manual, find information about the authors, and see various other information. For some packages you will find files called vignettes, which are short examples of how to use the package (a great place to start).
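
Vignettes for installed packages can also be listed from within R using the vignette() function. The sketch below uses the grid package, only because it comes with R; the variable name v is illustrative, and the list may be empty if your R installation was set up without documentation:

```r
# List the vignettes available for one installed package
v <- vignette(package = 'grid')

# The result stores a table of vignette names and titles
v$results[, c('Item', 'Title'), drop = FALSE]
```

If palmerpenguins is installed, vignette(package = 'palmerpenguins') should work in the same way.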

Development packages – on GitHub

You may know that GitHub is a website that computer programmers use to share their code with others. Many people who are creating new R packages publish them on GitHub for your use. These packages may not be quite as complete as those found on CRAN, but can sometimes be useful to access. Development versions of CRAN packages can also be found on GitHub. For example, you can see the palmerpenguins package at https://github.com/allisonhorst/palmerpenguins

To install a package that is only on GitHub, you first need to install the remotes package (following the instructions above). Then you can use the install_github() function from that package to install the package you want, like:

remotes::install_github("allisonhorst/palmerpenguins")

Note: in this case, we didn’t load the remotes package using library(). Instead, we accessed a single function from that package by writing the package name followed by two colons (::) and then the function name. This is useful when you only want to use a single function from a package.
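
The same :: notation works for any installed package. As a small sketch, here it is with the tools package, which is included with R; the file name 'data.csv' is made up for the example:

```r
# Call file_ext() from the tools package without loading the package first
ext <- tools::file_ext('data.csv')
ext
## [1] "csv"
```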