Packages are collections of R functions or data that you can use to do various statistical tasks. Some packages are included by default when you install R, and others are published on the internet by their creators. There are a few different ways to explore and obtain packages.
If you want to see a list of packages that are currently installed on your computer, you can click on the Packages tab in the bottom-right pane of RStudio. You can click on the name of a package to see information about what it contains.
Most packages you will use are easily obtained through RStudio or with R's default commands. R automatically downloads these packages from a repository called “CRAN” (the Comprehensive R Archive Network).
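The same list can also be produced from the console. The sketch below uses installed.packages(), a function from the utils package that ships with R; the variable name pkgs is just an illustrative choice.

```r
# installed.packages() returns a matrix with one row per installed package.
pkgs <- installed.packages()

# Show the name and version of the first few installed packages.
head(pkgs[, c("Package", "Version")])

# Check whether one particular package is installed (TRUE or FALSE).
"palmerpenguins" %in% rownames(pkgs)
```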
To install one of these packages, you only need to use the install.packages() function. For example, to install the palmerpenguins package, we run:
install.packages('palmerpenguins')
This will usually just work. It may take a little while, and some output will appear in the console.
Occasionally, one of two things might happen:
You only need to install a package once on your computer, so you don’t need to rerun the install.packages() line each time you use R.
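One way to avoid reinstalling by accident is to check first whether the package is already present. This is only a sketch: install_if_missing is a hypothetical helper name, not a function that R provides, and it relies on the base function requireNamespace() to test whether a package can be found.

```r
# Hypothetical helper (not part of R): install a package only if it is missing.
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Invisibly report whether the package is available afterwards.
  invisible(requireNamespace(pkg, quietly = TRUE))
}
```

Calling install_if_missing('palmerpenguins') would then download the package only on a computer where it is not yet installed.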
Having installed a package, we use library() to load it into memory. We need to do this each time we restart R or RStudio. For the palmerpenguins package we installed above, we would run:
library(palmerpenguins)
You may see some messages in the console when you do this. You should read them, but for the most part they do not indicate problems.
Note: be careful about quotation marks. When you used
install.packages() you needed quotation marks, but when you
use library() you don’t need them.
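As a small illustration of the note above, the sketch below attaches the package and then confirms that it is loaded. It assumes palmerpenguins is already installed; search() is a base R function that lists the packages currently attached.

```r
# Attach the package; quotation marks are optional for library().
library(palmerpenguins)
library("palmerpenguins")  # equivalent; attaching a second time is harmless

# search() lists everything currently attached, so this checks the load worked.
"package:palmerpenguins" %in% search()
```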
Once a package is loaded you can use the functions or data in it. The package we have loaded includes some data, so we can look at it using:
penguins
## # A tibble: 344 × 8
## species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
## <fct> <fct> <dbl> <dbl> <int> <int>
## 1 Adelie Torgersen 39.1 18.7 181 3750
## 2 Adelie Torgersen 39.5 17.4 186 3800
## 3 Adelie Torgersen 40.3 18 195 3250
## 4 Adelie Torgersen NA NA NA NA
## 5 Adelie Torgersen 36.7 19.3 193 3450
## 6 Adelie Torgersen 39.3 20.6 190 3650
## 7 Adelie Torgersen 38.9 17.8 181 3625
## 8 Adelie Torgersen 39.2 19.6 195 4675
## 9 Adelie Torgersen 34.1 18.1 193 3475
## 10 Adelie Torgersen 42 20.2 190 4250
## # ℹ 334 more rows
## # ℹ 2 more variables: sex <fct>, year <int>
Every package obtained from CRAN must include documentation that you can view. Within RStudio, your best starting point is likely the package help. You can run help(package = ...) to get a list of the help files for a particular package, like:
help(package = 'palmerpenguins')
If you want to see information about a particular function or dataset, you can put a question mark before its name, like:
?penguins
The help files are extremely useful. They list the details of how to use a function and what it produces as output, and they give examples of how to use it.
Every package obtained in this way also has documentation on the internet, in this case: https://cran.r-project.org/web/packages/palmerpenguins/index.html
On this site you can read the reference manual, find information about the authors, and see various other information. For some packages you will find files called vignettes, which are short examples of how to use the package (a great place to start).
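Vignettes for an installed package can also be listed from within R. The sketch below assumes palmerpenguins is already installed and uses the vignette() function from utils; the variable name vigs is only illustrative, and the list may be empty for packages that ship no vignettes.

```r
# List the vignettes that ship with an installed package.
vigs <- vignette(package = "palmerpenguins")

# Each row gives a vignette's short name (Item) and its title.
vigs$results[, c("Item", "Title"), drop = FALSE]
```

In an interactive session, browseVignettes("palmerpenguins") opens the same list in a web browser.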
You may know that GitHub is a website that computer programmers use to share their code with others. Many people who create new R packages publish them on GitHub for others to use. These packages may not be quite as complete as those found on CRAN, but they can sometimes be useful to access. Development versions of CRAN packages can also be found on GitHub. For example, you can see the palmerpenguins package at https://github.com/allisonhorst/palmerpenguins
To install a package that is only on GitHub, you first need to install the remotes package (following the instructions above). Then you can use the install_github() function from that package to install the package you want, like:
remotes::install_github("allisonhorst/palmerpenguins")
Note: in this case, we didn’t load the remotes package using library(). Instead, we accessed a single function from that package by writing the package name followed by two colons (::) and then the function name. This is useful when you only want to use a single function from a package.
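The same :: syntax works for any installed package, and for datasets as well as functions. The sketch below assumes palmerpenguins is installed and deliberately does not call library(); the variable name n_rows is only illustrative.

```r
# Reach into a package for one object without attaching the whole package.
n_rows <- nrow(palmerpenguins::penguins)
n_rows

# The same syntax works for packages that ship with R, such as utils.
utils::head(palmerpenguins::penguins, 3)
```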