Lecture 1 - Get Started

  1. R
  2. RStudio
  3. R packages

flowchart LR
  A(R) --> B(RStudio)
  B --> C(R Packages)

1 R

1.1 Why learn R?

  1. Statistical Analysis
    R provides powerful tools for conducting complex statistical analyses, which is essential for many scientific and social research disciplines.
  2. Data Visualization
    R excels in creating high-quality, publishable graphics, enabling clear communication of data insights.
  3. Data Manipulation
    It includes extensive libraries for handling and transforming data, making it easier to prepare large datasets for analysis.
  4. Open Source
    R is free to use, with a vast community that contributes packages and support, reducing software costs and increasing accessibility.
  5. Career Opportunities
    Proficiency in R is highly valued in many careers such as data science, economics, actuarial science, and biostatistics, enhancing job prospects. Specifically, the Society of Actuaries (SOA) requires R as the main tool in its Advanced Topics in Predictive Analytics (ATPA) assessment.
  6. Ease of Learning
    While Python is often considered beginner-friendly, R has a significant advantage for data-related tasks once you master the basics. Because R was designed specifically for data manipulation and analysis, core data science skills such as data manipulation, visualization, and machine learning can be more straightforward to learn in R.

1.2 What is R and RStudio?

R is a language and environment for statistical computing and graphics. It is a free, open-source program for which there are abundant online resources to support its use. R can be downloaded from http://cran.r-project.org.

RStudio is an integrated development environment (IDE) for R. It includes a console and a syntax-highlighting editor that supports direct code execution, as well as tools for plotting, history, debugging, and workspace management. RStudio can be downloaded from https://www.rstudio.com/products/rstudio/download/.

Figure 1: RStudio Interface

1.3 What is CRAN?

R is downloaded from the Comprehensive R Archive Network (CRAN). You will see later that we can also download packages from CRAN.

1.4 Exercise A

Q1

Let’s print Hello World in R.

Q2

Check what you see when you type Sys.time() in the R console.

Q3

Which is NOT an option for a file type when you go to File > New File in RStudio?

  • A. R script
  • B. Quarto Document
  • C. Shiny Web App
  • D. R Beamer Presentation
  • E. R Markdown

2 RStudio

2.1 Script Window vs Console Window

  • The Script Window is the place to enter and run your code so that it is easily edited and saved for future use. Usually the Script Window is shown at the top left in RStudio. If this window is not shown, it will be visible when you open a previously saved R script, or when you create a new R Script.
  • To execute code in the R script, place your cursor anywhere on the line of code, and either click Run or press Cmd/Ctrl + Enter on your keyboard.
  • To execute code in the Console Window, enter it directly and hit Enter. The commands that you run are shown in the History Window at the top right of RStudio. You can save these commands for future use, but this is not recommended as a way of documenting your code.
  • Comments in R are preceded by the # symbol; anything following this symbol on the same line will not be executed. Even so, writing comments in your code is important for documentation purposes.
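As a small illustration of the points above, the following sketch can be pasted into the Script Window and run line by line with Cmd/Ctrl + Enter. The variable name total is chosen only for this example.

```r
# This whole line is a comment, so R ignores it.
total <- 1 + 2   # a comment can also follow code on the same line

# Only the code before the # symbol is executed.
print(total)
```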

2.2 Saving and Opening R Script Files

  • Saving an R Script: To save your work in R, you create what is known as an R script. This is a plain text file containing the code you’ve written, which can be run in R to perform tasks like data analysis, visualization, etc. Here’s how to save an R script:
    • Click on File > New File > R Script in the RStudio menu bar.
    • Write your code in the R script window.
    • Click on File > Save As in the RStudio menu bar.
    • Choose a location on your computer to save the file, and give it a name ending in .R (e.g., myscript.R).
    • Click Save.
  • Opening an R Script: To open an existing R script for editing or execution, follow these steps:
    • Click on File > Open File in the RStudio menu bar.
    • Navigate to the location of the R script file on your computer.
    • Click Open.
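Because an R script is just a plain text file, the same save-and-run cycle can also be sketched in code. This is only an illustration, not a replacement for the menu steps above: it writes a two-line script to R's temporary folder and runs it with source(). The file name myscript.R and the variable total are assumptions made for this example.

```r
# Build a path for the example script inside R's temporary folder.
script_path <- file.path(tempdir(), "myscript.R")

# Write two lines of R code into the file as plain text.
writeLines(c("total <- sum(1:10)", "print(total)"), script_path)

# Confirm that the file now exists on disk.
file.exists(script_path)

# Run the whole saved script from top to bottom.
source(script_path)
```

In everyday work you would normally save and open scripts through the RStudio menus; source() is handy when one script needs to run another.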

2.3 Exercise B

Q1
  1. Create a new R script in RStudio and save it as myfirstscript.R.
  2. Write code that prints Hello World in the R script and run the code.
  3. Save and Close the R script.
  4. Re-Open the R script.

3 R Packages

3.1 What is an R package?

While Base R includes numerous built-in functions for statistical computing and plotting, its capabilities can be limited. Thanks to R’s open-source nature, developers can create packages that extend this basic functionality. You can think of a package as a collection of functions, code, and sometimes data, bundled together in a well-defined, complete format. If you would like to develop your own packages, check out Hadley Wickham’s book from O’Reilly, “R Packages”.

3.2 What are repositories?

A repository is a central location where developed packages are stored and made available for download. The three major repositories are CRAN, Bioconductor, and GitHub.

3.3 CRAN task views

CRAN task views aim to provide some guidance on which CRAN packages are relevant for tasks related to a certain topic. Each task view gives a brief overview of the included packages and is intended to have a sharp focus, so that it is sufficiently clear which packages should be included (or excluded). An excerpt of the current task views is shown in Figure 2; the task views are still being updated.

Figure 2: CRAN Task Views

R Documentation is a useful search engine for packages and functions from CRAN and Bioconductor.

3.4 Install packages from CRAN

We will focus on installing packages from CRAN in this course. If you are interested, you can search online for installation instructions for packages from GitHub or Bioconductor.

Use install.packages() function

You can install a package by calling the install.packages() function in your R console with the package name in quotes. If you would like to install multiple packages at once, pass a character vector of package names instead.
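The sketch below shows one way these calls might look. The package names ggplot2 and dplyr are used purely for illustration, and the interactive() guard is an assumption added here so that nothing is downloaded when the code runs as part of a non-interactive script.

```r
# Package names used purely for illustration.
wanted <- c("ggplot2", "dplyr")

# Keep only the names that are not already in the installed-package list.
to_install <- setdiff(wanted, rownames(installed.packages()))

# Install one package:       install.packages("ggplot2")
# Install several at once:   install.packages(c("ggplot2", "dplyr"))
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Checking for missing packages first is optional; calling install.packages() on a package that is already installed simply reinstalls it.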

Use RStudio graphical interface

As shown in Figure 3 and Figure 4, click the Packages tab in the panel that contains the Plots and Packages tabs, then click the Install button. Once a window pops up, type in the name of the package you would like to install. Click Install and R will do the rest.

Figure 3: Install packages using RStudio graphical interface
Figure 4: Install packages using RStudio graphical interface

3.5 Load packages

After installing a package, you MUST load it before you can use its functions. You can do this with the library() function, e.g., library(ggplot2).
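As a minimal sketch of loading a package, the example below uses the tools package. That choice is an assumption made so the code runs without installing anything, since tools ships with R; the same pattern applies to any package you have installed yourself.

```r
# Attach the tools package so that its functions can be called by name.
library(tools)

# The package now appears on the search path.
"package:tools" %in% search()

# file_ext() comes from tools; it extracts the extension of a file name.
file_ext("myscript.R")
```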

3.6 Update, remove, unload packages

Checking packages:

Before updating or removing packages, you might want to check which packages you have already installed. You can do this by calling either installed.packages() or library() with nothing between the parentheses. Alternatively, the Packages tab in RStudio also shows a list of installed packages.
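A short sketch of checking installed packages from the console follows. The stats package is used here only because it ships with R and is therefore always present; the variable name pkgs is an assumption for this example.

```r
# installed.packages() returns a matrix with one row per installed package.
pkgs <- installed.packages()

# Show the names of the first few installed packages.
head(rownames(pkgs))

# Check whether one particular package is installed.
"stats" %in% rownames(pkgs)

# Look up the installed version of that package.
packageVersion("stats")
```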

Updating packages:

Use old.packages() to list the installed packages for which a newer version is available in the repository. To update all packages, use update.packages(). If you only want to update a specific package, simply call install.packages('packagename') again. You can also update packages using the RStudio graphical interface under the Packages tab.

Uninstalling packages:

Running remove.packages() with the package name uninstalls a package you no longer need. If you are using the RStudio graphical interface, you can remove a package by clicking the ‘X’ at the right end of its row in the packages list.

Unloading packages:

Sometimes you may want to unload a package in the middle of a script, possibly due to conflicts with another package. To unload a given package you can use the detach() function, e.g., detach("package:ggplot2", unload = TRUE) would unload the ggplot2 package (that we loaded earlier). Within the RStudio interface, under the Packages tab, you can unload a package by unchecking the box in front of the package name.
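The following sketch walks through attaching and then detaching a package. It uses the tools package, an assumption made so the example runs without installing anything; replace the name with the package you actually want to unload.

```r
# Attach a package so there is something to unload.
library(tools)
"package:tools" %in% search()

# Detach it from the search path; unload = TRUE also tries to unload
# its namespace (R gives a warning if another loaded package needs it).
detach("package:tools", unload = TRUE)

# The package is no longer on the search path.
"package:tools" %in% search()
```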

3.7 Use of Help files

To open a function’s help file, type a question mark followed by the function name in the R console. In the help file, you will see a list of the function’s arguments along with a detailed explanation of each one. At the bottom of the file, you can find examples of how to use the function. Note that some arguments are optional, and some arguments are passed on to, or inherited from, another function.
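As a hedged illustration, the sketch below looks up help for sd(), the standard deviation function from the stats package. The choice of function is an assumption for this example only; any function name works the same way.

```r
# Open the help page for sd(); the question mark is shorthand for help().
?sd
help("sd")

# Print the function's arguments; an argument shown with "= value"
# has a default and is therefore optional.
args(sd)

# Look up the default value of one particular argument.
formals(sd)$na.rm
```

Reading defaults this way is a quick complement to the help file, which remains the place to find out what each argument means.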

3.8 Exercise C

Q1
  1. Install a package called readxl and load it.
  2. Use the help file to find out what the readxl::read_excel() function does.
  3. What is the default input value for skip in readxl::read_excel()?
  4. Unload this package.
Tip

If interested, you can find more details about the ‘readxl’ package here.