Lecture 12 - Quarto

Reporting and Communicating

Reporting and communicating is the final part of the data science process. If you cannot communicate your results to other humans, it does not matter how great your analysis is.

In the realm of data science, effective communication is paramount. It bridges the gap between complex data analysis and decision-making processes, enabling stakeholders to grasp insightful conclusions and make informed choices.

Traditional tools like Microsoft Word, while ubiquitous in many professional settings, often fall short when it comes to handling the dynamic and interactive needs of data science reporting:

  • They lack the ability to seamlessly integrate statistical analysis and visualizations.
  • Their static formats can hinder the audience’s understanding and engagement, particularly when dealing with multifaceted data sets.
  • They challenge the reproducibility of reports; without the right tools, reproducing results can be labor-intensive and prone to errors, which in turn can compromise the credibility of the findings.
  • They do not allow for the display of code alongside its results, which is crucial for transparency and understanding in data science communication.

Addressing these issues, Quarto emerges as a powerful ally for users of R. Quarto is an open-source scientific and technical publishing system that enhances the creation of dynamic and reproducible reports.

  • Its compatibility with R allows analysts to embed live R code into documents, which can then be converted into various formats including HTML, PDF, and Word.
  • This integration not only ensures accuracy but also enhances the reproducibility of documents.
  • With Quarto, reports gain interactivity with elements such as expandable code outputs and interactive visualizations, which significantly boost reader engagement and comprehension.
  • Quarto’s ability to produce multiple output formats from a single source document efficiently meets diverse audience needs, simplifying the workflow for R users.

1 Quarto Basics

1.1 Get Started

Quarto is a command line interface tool, not an R package. This means that help is, by-and-large, not available through ?. Instead, as you work through this chapter, and use Quarto in the future, you should refer to the Quarto Cheatsheet or the Quarto documentation.

You need the Quarto command line interface, but you do not need to explicitly install it or load it, as RStudio automatically does both when needed. The easiest way to create a new Quarto document is using the RStudio IDE, i.e., File -> New File -> Quarto Document….

Selecting “Quarto Document…” opens the New Quarto Document dialog window, where you can choose the type of output document you would like to create. The default option is HTML, which is a good choice if you want to publish your work online or send it by email, or if you have not yet decided how you would like to output your final document. Changing to a different format later is typically as easy as changing one line of text in the document, or a few clicks in the IDE.

After you make your selection and click Create, you will get a basic Quarto template.

A Quarto file is a plain text file that has the extension .qmd. See the following example.

It contains three important types of content:

  1. An (optional) YAML header surrounded by ---s.
  2. Chunks of R code surrounded by ```.
  3. Text mixed with simple text formatting like # heading and _italics_.
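As a rough illustration of these three content types, a minimal .qmd file might look like the sketch below. The title, heading, and chunk label are made up for illustration; this is not the AQuartoExample.qmd file used in the exercises.

````markdown
---
title: "A minimal example"
format: html
---

## Stack loss data

The chunk below summarizes the built-in *stackloss* data frame.

```{r}
#| label: stackloss-summary

summary(stackloss)
```
````

The lines between the two --- markers form the YAML header, the fenced block is an R code chunk, and everything else is Markdown text.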

When you open a .qmd document in RStudio, you get a notebook interface where code and output are interleaved. You can run each code chunk by clicking the Run icon (it looks like a play button at the top of the chunk), or by pressing Cmd/Ctrl + Shift + Enter. RStudio executes the code and displays the results inline with the code.

To produce a complete report containing all text, code, and results, click Render or press Cmd/Ctrl + Shift + K. You can also do this programmatically with quarto::quarto_render("AQuartoExample.qmd"). This will display the report in the viewer pane and create an HTML file.

When you render the document, Quarto sends the .qmd file to knitr, https://yihui.org/knitr/, which executes all of the code chunks and creates a new markdown (.md) document which includes the code and its output. The markdown file generated by knitr is then processed by pandoc, https://pandoc.org, which is responsible for creating the finished file. This process is shown in the figure below. The advantage of this two step workflow is that you can create a very wide range of output formats.

Exercise A

Q1

Create a new Quarto document using File -> New File -> Quarto Document. Type something in the document and render the document in the format as you wish.

Q2

Work with AQuartoExample.qmd. Practice running the chunks individually. Then render the document by clicking the appropriate button and then by using the appropriate keyboard shortcut. Verify that you can modify the code, re-run it, and see modified output.

1.2 Visual vs. Source Editors

If you are new to computational documents like .qmd files but have experience using tools like Google Docs or MS Word, the easiest way to get started with Quarto in RStudio is the visual editor. The Visual editor in RStudio provides a WYSIWYM interface for authoring Quarto documents. Under the hood, prose in Quarto documents (.qmd files) is written in Markdown, a lightweight set of conventions for formatting plain text files.

In the visual editor you can either use the buttons on the menu bar to insert images, tables, cross-references, etc. or you can use the catch-all Cmd + / or Ctrl + / shortcut to insert just about anything. If you are at the beginning of a line, you can also enter just / to invoke the shortcut.

You can also edit Quarto documents using the Source editor in RStudio, without the assistance of the Visual editor. While the Visual editor will feel familiar to those with experience writing in tools like Google Docs, the Source editor will feel familiar to those with experience writing R scripts or R Markdown documents. The Source editor can also be useful for debugging any Quarto syntax errors since it is often easier to catch these in plain text.

The guide below shows how to use Pandoc’s Markdown for authoring Quarto documents in the source editor.


Text formatting 
------------------------------------------------------------

*italic*  or _italic_
**bold** or __bold__
`code`
~~strikeout~~
superscript^2^ and subscript~2~
[underline]{.underline} [small caps]{.smallcaps}

Headings
------------------------------------------------------------

# 1st Level Header

## 2nd Level Header

### 3rd Level Header

Lists
------------------------------------------------------------

-   Bulleted list item 1

-   Item 2

    -   Item 2a

    -   Item 2b

1.  Numbered list item 1

2.  Item 2.
    The numbers are incremented automatically in the output.

Links
------------------------------------------------------------

<http://example.com>

[linked phrase](http://example.com)

The best way to learn these is simply to try them out. It will take a few days, but soon they will become second nature, and you will not need to think about them. If you forget, you can get to a handy reference sheet with Help -> Markdown Quick Reference.

Exercise B

In the previous AQuartoExample.qmd template file we worked with:

Q1

Add a new header called “Text Formatting”.

Q2

Under the “Text Formatting” header, copy and paste the following text. Format it so that the rendered output matches what is shown here.


In this course, the main topics are organized as follows.

  1. R Basics
  2. Data Import
  3. Data Manipulation
  4. Exploratory Data Analysis
  5. Programming Basics
  6. Introduction to Machine Learning
  7. Quarto

Our textbooks are:


1.3 Code Chunks

To run code inside a Quarto document, you need to insert a chunk. There are three ways to do so:

  1. The keyboard shortcut Cmd + Option + I (Mac) or Ctrl + Alt + I (Windows)
  2. The “+C” icon in the editor toolbar.
  3. By manually typing the chunk delimiters ```{r} and ```.

It is highly recommended that you learn the keyboard shortcut. It will save you a lot of time in the long run!

You can continue to run the code line by line using the keyboard shortcut that you know by now: Cmd/Ctrl + Enter. However, chunks get a new keyboard shortcut: Cmd/Ctrl + Shift + Enter, which runs all the code in the chunk. Think of a chunk like a function. A chunk should be relatively self-contained, and focused around a single task.

The following sections describe the chunk header which consists of ```{r}, followed by an optional chunk label and various other chunk options, each on their own line, marked by #|.

Chunk Label

Chunks can be given an optional label, e.g.
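A labeled chunk could look like the following sketch. The label simple-addition is only an illustrative name; any short, unique label works the same way.

````markdown
```{r}
#| label: simple-addition

1 + 1
```
````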

This has three advantages:

  1. You can more easily navigate to specific chunks using the drop-down code navigator in the bottom-left of the script editor.

  2. Graphics produced by the chunks will have useful names that make them easier to use elsewhere.

  3. You can set up networks of cached chunks to avoid re-performing expensive computations on every run.

Your chunk labels should be short but evocative and should not contain spaces. We recommend using dashes (-) to separate words (instead of underscores, _) and avoiding other special characters in chunk labels.

You are generally free to label your chunk however you like, but there is one chunk name that imbues special behavior: setup. When you are in notebook mode, the chunk named setup will be run automatically once, before any other code is run.

Additionally, chunk labels cannot be duplicated. Each chunk label must be unique.

Exercise C

Q1
  1. Review the chunk labels in AQuartoExample.qmd. List the corresponding label of each code chunk.
  2. Create a new section named “Code Chunks” under which add a code chunk and label it “test-glimpse”. In the chunk, write the R code to take a glimpse of the stackloss data frame.
  3. Use the Outline icon in the toolbar to navigate to the “test-glimpse” code chunk.

Chunk Options

Chunk output can be customized with options, which are arguments supplied to the chunk header. knitr provides almost 60 options that you can use to customize your code chunks. Here we will cover the most important chunk options that you will use frequently. You can see the full list at http://yihui.name/knitr/options/.

The most important set of options controls if your code block is executed and what results are inserted in the finished report:

  • eval = FALSE prevents code from being evaluated. (And obviously if the code is not run, no results will be generated). This is useful for displaying example code, or for disabling a large block of code without commenting each line.

  • include = FALSE runs the code, but doesn’t show the code or results in the final document. Use this for setup code that you don’t want cluttering your report.

  • echo = FALSE prevents code, but not the results from appearing in the finished file. Use this when writing reports aimed at people who don’t want to see the underlying R code.

  • message = FALSE or warning = FALSE prevents messages or warnings from appearing in the finished file.

  • results = "hide" hides printed output.

  • fig.show = 'hide' hides plots.

  • error = TRUE causes the render to continue even if code returns an error. This is rarely something you’ll want to include in the final version of your report, but can be very useful if you need to debug exactly what is going on inside your .qmd. It’s also useful if you’re teaching R and want to deliberately include an error. The default, error = FALSE causes rendering to fail if there is a single error in the document.

Each of these chunk options gets added to the header of the chunk, following #|. For example, in the following chunk the result is not printed since eval is set to false.
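A chunk along these lines would show its code in the report without running it. This is a sketch: the label is hypothetical, and it assumes the YAML-style spelling that Quarto uses on #| lines, where eval = FALSE is written as eval: false (and, similarly, results = "hide" as results: hide and fig.show = "hide" as fig-show: hide).

````markdown
```{r}
#| label: skipped-summary
#| eval: false

summary(stackloss)
```
````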

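The following table summarizes which types of output each option suppresses (an x marks what is suppressed):

| Option            | Run code | Show code | Output | Plots | Messages | Warnings |
|:------------------|:--------:|:---------:|:------:|:-----:|:--------:|:--------:|
| eval = FALSE      |    x     |           |   x    |   x   |    x     |    x     |
| include = FALSE   |          |     x     |   x    |   x   |    x     |    x     |
| echo = FALSE      |          |     x     |        |       |          |          |
| results = "hide"  |          |           |   x    |       |          |          |
| fig.show = "hide" |          |           |        |   x   |          |          |
| message = FALSE   |          |           |        |       |    x     |          |
| warning = FALSE   |          |           |        |       |          |    x     |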

Exercise D

Q1

In AQuartoExample.qmd, to display the code that generates a plot, what steps should be taken?

Global Options

As you work more with knitr, you will discover that some of the default chunk options do not fit your needs and you want to change them.

You can do this by adding the preferred options in the document YAML, under execute. For example, if you are preparing a report for an audience who does not need to see your code but only your results and narrative, you might set echo: false at the document level. That will hide the code by default, showing only the chunks you deliberately choose to show (with echo: true). You might consider setting message: false and warning: false, but that would make it harder to debug problems because you would not see any messages in the final document.

title: "My report"
execute:
  echo: false
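With echo: false set at the document level, an individual chunk can still opt back in. The sketch below assumes a hypothetical chunk that fits a simple model on the built-in stackloss data; only the #| echo: true line matters for the point being made.

````markdown
```{r}
#| label: model-code
#| echo: true

fit <- lm(stack.loss ~ Air.Flow, data = stackloss)
coef(fit)
```
````

Chunk-level options take precedence over the document-level defaults, so this chunk's code appears in the report while all other chunks stay hidden.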

Since Quarto is designed to be multi-lingual (it works with R as well as other languages like Python, Julia, etc.), not all of the knitr options are available at the document execution level, because some of them only work with knitr and not with the other engines Quarto uses for running code in other languages (e.g., Jupyter). You can, however, still set these as global options for your document under the knitr field, under opts_chunk. For example, when writing books and tutorials we set:

title: "Tutorial"
knitr:
  opts_chunk:
    comment: "#>"
    collapse: true

This uses our preferred comment formatting and ensures that the code and output are kept closely entwined.

Exercise E

Q1

Add the global options mentioned above under the knitr field in AQuartoExample.qmd. What changes do you think these options have implemented?

Inline code

There is one other way to embed R code into a Quarto document: directly into the text, with: ` r `. This can be very useful if you mention properties of your data in the text. For example, we can write something like:

There are `r nrow(mtcars)` cars. The mean miles per gallon is `r mean(mtcars$mpg)`.

When the report is rendered, the results of these computations are inserted into the text:

There are 32 cars. The mean miles per gallon is 20.090625.

When inserting numbers into text, format() is your friend. It allows you to set the number of digits so you don’t print to a ridiculous degree of precision, and a big.mark to make numbers easier to read.

format(.12358124331, digits = 2)
[1] "0.12"
format(3452345, big.mark = ",")
[1] "3,452,345"

Hence, we can write:

There are 32 cars. The mean miles per gallon is 20.1.
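The rounded sentence above can be produced by wrapping the mean in format(). The sketch below builds the same sentence in plain R so the result can be checked in the console; the variable names are illustrative.

```r
# Format the mean to three significant digits before inserting it into text.
mean_mpg <- format(mean(mtcars$mpg), digits = 3)

# The same expression can be used inline in the .qmd text, written between
# single backticks and prefixed with the letter r:
#   r format(mean(mtcars$mpg), digits = 3)
sentence <- paste0(
  "There are ", nrow(mtcars), " cars. ",
  "The mean miles per gallon is ", mean_mpg, "."
)
print(sentence)
```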

Exercise F

Q1

Still work within AQuartoExample.qmd,

  1. Add a new section called “Inline Code”.
  2. Fill in the following blanks using inline code:

There are ____ rows and ____ columns in the CO2 data. This data set contains information for ____ (how many) unique plants. The standard error of uptake is ____ (with precision up to 2 decimal places).

2 Quarto Advanced

2.1 Callout Blocks

Callouts are an excellent way to draw extra attention to certain concepts, or to more clearly indicate that certain content is supplemental or applicable to only some scenarios.

Callout Types

There are five different types of callouts available.

  • note
  • warning
  • important
  • tip
  • caution

The color and icon will be different depending upon the type that you select. Here are what the various types look like in HTML output:

Note

Note that there are five types of callouts, including: note, tip, warning, caution, and important.

Warning

Callouts provide a simple way to attract attention, for example, to this warning.

Important

Danger, callouts will really improve your writing.

Tip with Title

This is an example of a callout with a title.

This is an example of a “collapsed” caution callout that can be expanded by the user. You can use collapse="true" to collapse it by default or collapse="false" to make a collapsible callout that is expanded by default.

Markdown Syntax

Create callouts in markdown using the following syntax (note that the first markdown heading used within the callout is used as the callout heading):

::: {.callout-note}
Note that there are five types of callouts, including:
`note`, `warning`, `important`, `tip`, and `caution`.
:::

::: {.callout-tip}
## Tip with Title

This is an example of a callout with a title.
:::

::: {.callout-caution collapse="true"}
## Expand To Learn About Collapse

This is an example of a 'folded' caution callout that can be expanded by the user. You can use `collapse="true"` to collapse it by default or `collapse="false"` to make a collapsible callout that is expanded by default.
:::

Note that above callout titles are defined by using a heading at the top of the callout. If you prefer, you can also specify the title using the title attribute. For example:

::: {.callout-tip title="Tip with Title"}
This is a callout with a title.
:::

Exercise A

Q1
  1. Within AQuartoExample.qmd, add a section called “Callout Blocks”.
  2. Insert a folded warning callout block that can be expanded by the user. Use a title of “Expand to See Warning” and include the text “This is a foldable warning callout block.”.

2.2 Figures

The figures in a Quarto document can be embedded (e.g., a PNG or JPEG file) or generated as a result of a code chunk. Below is the syntax for inserting a figure file which is in your current working directory.

![optional caption text](figure.png){fig-alt="optional alt text"}

For example:

![Quarto Logo](imgs/quartologo.png){fig-alt="insert quarto logo"}

results in the following output:

[Figure: Quarto Logo]

Tip

When specifying options in {}, do NOT put spaces before or after the =.

Alternatively, to embed an image from an external file, you can use the Insert menu in the Visual Editor in RStudio and select Figure / Image. This will pop open a menu where you can browse to the image you want to insert as well as add alternative text or caption to it and adjust its size. In the visual editor you can also simply paste an image from your clipboard into your document and RStudio will place a copy of that image in your project folder.

If you include a code chunk that generates a figure (e.g., includes a ggplot() call), the resulting figure will be automatically included in your Quarto document.
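For instance, a chunk like the following sketch would place a scatter plot in the rendered document. The label, caption, and alt text are illustrative choices, and the plot uses the built-in stackloss data rather than anything from the lecture files.

````markdown
```{r}
#| label: fig-airflow
#| fig-cap: "Stack loss versus air flow"
#| fig-alt: "Scatter plot of stack loss against air flow"

library(ggplot2)

ggplot(stackloss, aes(x = Air.Flow, y = stack.loss)) +
  geom_point()
```
````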

Figure Sizing

External file

By default figures are displayed using their actual size (subject to the width constraints imposed by the page they are rendered within). You can change the display size by adding the width and height attributes to the figure. For example

![Quarto Logo](imgs/quartologo.png){width=300}

[Figure: Quarto Logo]

Note that if only width is specified then height is calculated automatically. If you need to modify the default behavior just add an explicit height attribute.

The default units for width and height are pixels. You can also specify sizes using a percentage or a conventional measurement like inches or millimeters. For example:

![](imgs/quartologo.png){width=80%}

![](imgs/quartologo.png){width=2in}

If you have several figures that appear as a group, you can create a figure division to enclose them. For example:

::: {#fig-logos layout-ncol=2}

![Quarto](quartologo.png){#fig-Quarto}

![R Markdown](rmarkdownlogo.png){#fig-RMarkdown}

Reporting in R
:::
[Figure 1: Reporting in R. Panels: (a) Quarto, (b) R Markdown]

Tip

Note that the empty lines between the figures (and between the last figure and the caption) are required (it is what indicates that these images belong to their own paragraphs rather than being multiple images within the same paragraph).

Note also that we used a layout-ncol attribute to specify a two-column layout.

Exercise B

Q1
  1. Create a new section called “Figures” in AQuartoExample.qmd.
  2. Insert the following image AUlogo.png which can be downloaded from Canvas.
    • The image has a Caption “Arcadia Logo”
    • Add fig-alt="Insert Arcadia Logo" as the alternative text for the image.
    • Set the width to 20%.
    • Your output should look like the following.

[Figure: Arcadia Logo]

Graph generated by R code

Getting the right size and shape for a graph created by R in Quarto can be more challenging. There are five main chunk options that control figure sizing: fig-width, fig-height, fig-asp, out-width and out-height. Image sizing is challenging because there are two sizes (the size of the figure created by R and the size at which it is inserted in the output document), and multiple ways of specifying the size (i.e. height, width, and aspect ratio: pick two of three).

We recommend three of the five options:

  • Plots tend to be more aesthetically pleasing if they have consistent width. To enforce this, set fig-width: 6 (6”) and fig-asp: 0.618 (the golden ratio) in the defaults. Then in individual chunks, only adjust fig-asp.

  • Control the output size with out-width and set it to a percentage of the body width of the output document. We suggest out-width: "70%" and fig-align: center. That gives plots room to breathe, without taking up too much space.

  • To put multiple plots in a single row, set the layout-ncol to 2 for two plots, 3 for three plots, etc. This effectively sets out-width to “50%” for each of your plots if layout-ncol is 2, “33%” if layout-ncol is 3, etc. Depending on what you are trying to illustrate (e.g., show data or show plot variations), you might also tweak fig-width, as discussed below.
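One possible way to apply these defaults is to call knitr::opts_chunk$set() in a setup chunk near the top of the document. This is a sketch, not the only approach: it uses knitr's native dotted option names (fig.width rather than fig-width), which is how these options are spelled when set from R code.

```r
# Set document-wide figure defaults for all subsequent chunks.
# Values follow the recommendations above; adjust them to taste.
knitr::opts_chunk$set(
  fig.width = 6,        # width of the figure created by R, in inches
  fig.asp = 0.618,      # height / width (the golden ratio)
  out.width = "70%",    # width of the figure in the output document
  fig.align = "center"  # center figures horizontally
)

# Inspect the values that are now in effect.
current <- knitr::opts_chunk$get(c("fig.width", "fig.asp", "out.width", "fig.align"))
str(current)
```

Individual chunks can then override just fig-asp (or any other option) in their own #| lines.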

If you find that you are having to squint to read the text in your plot, you need to tweak fig-width. If fig-width is larger than the size the figure is rendered in the final doc, the text will be too small; if fig-width is smaller, the text will be too big. You will often need to do a little experimentation to figure out the right ratio between the fig-width and the eventual width in your document. To illustrate the principle, the following three plots have fig-width of 4, 6, and 8 respectively:

If you want to make sure the font size is consistent across all your figures, whenever you set out-width, you will also need to adjust fig-width to maintain the same ratio with your default out-width. For example, if your default fig-width is 6 and out-width is “70%”, when you set out-width: “50%” you will need to set fig-width to 4.3 (6 * 0.5 / 0.7).
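The arithmetic in this example can be wrapped in a small helper. The function name and arguments below are hypothetical, and output widths are passed as fractions (0.70 for "70%").

```r
# Scale fig-width so that text size stays constant when out-width changes.
# Percent widths are given as fractions, e.g. 0.70 for "70%".
scaled_fig_width <- function(default_fig_width, default_out_width, new_out_width) {
  default_fig_width * new_out_width / default_out_width
}

# Default fig-width 6 at out-width 70%, now shown at out-width 50%.
new_width <- scaled_fig_width(6, 0.70, 0.50)
print(round(new_width, 1))
```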

Figure sizing and scaling is both an art and a science, and getting things right can require an iterative trial-and-error approach. You can learn more about figure sizing in the “Taking Control of Plot Scaling” blog post.

Exercise C

Q1

Work with the plot generated in AQuartoExample.qmd. Experiment with the fig-width, fig-asp, out-width, and fig-align options to understand how each affects the appearance of the plot.

2.3 Tables

Similar to figures, you can include two types of tables in a Quarto document. They can be markdown tables that you create directly in your Quarto document (using the Insert Table menu) or they can be tables generated as a result of a code chunk.

Table Created by Markdown Syntax

A table can be created by the following syntax.

| Right | Left | Default | Center |
|------:|:-----|---------|:------:|
|   12  |  12  |    12   |    12  |
|  123  |  123 |   123   |   123  |
|    1  |    1 |     1   |     1  |

As a result, we will get:

| Right | Left | Default | Center |
|------:|:-----|---------|:------:|
|    12 | 12   | 12      |   12   |
|   123 | 123  | 123     |  123   |
|     1 | 1    | 1       |   1    |

Exercise D

Q1
  1. Add a new section in AQuartoExample.qmd called “Tables”.
  2. Insert the following table using Markdown syntax to your “Tables” section.
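
| Customer ID | Customer Name | Transaction Amount | Payment Method |
|-------------|---------------|--------------------|----------------|
| 1           | John          | 35                 | Cash           |
| 2           | David         | 47                 | Debit          |
| 3           | Amy           | 58                 | Credit         |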

Table Created in Code Chunks

In this section we will focus on tables generated via computation.

By default, Quarto prints data frames and matrices as you would see them in the console:

head(mtcars)

If you prefer that data be displayed with additional formatting you can use the knitr::kable() function. Take a look at the output generated from the code below.

knitr::kable(head(mtcars))
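
|                   |  mpg | cyl | disp |  hp | drat |    wt |  qsec | vs | am | gear | carb |
|:------------------|-----:|----:|-----:|----:|-----:|------:|------:|---:|---:|-----:|-----:|
| Mazda RX4         | 21.0 |   6 |  160 | 110 | 3.90 | 2.620 | 16.46 |  0 |  1 |    4 |    4 |
| Mazda RX4 Wag     | 21.0 |   6 |  160 | 110 | 3.90 | 2.875 | 17.02 |  0 |  1 |    4 |    4 |
| Datsun 710        | 22.8 |   4 |  108 |  93 | 3.85 | 2.320 | 18.61 |  1 |  1 |    4 |    1 |
| Hornet 4 Drive    | 21.4 |   6 |  258 | 110 | 3.08 | 3.215 | 19.44 |  1 |  0 |    3 |    1 |
| Hornet Sportabout | 18.7 |   8 |  360 | 175 | 3.15 | 3.440 | 17.02 |  0 |  0 |    3 |    2 |
| Valiant           | 18.1 |   6 |  225 | 105 | 2.76 | 3.460 | 20.22 |  1 |  0 |    3 |    1 |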

Refer to the documentation for ?knitr::kable to see the other ways in which you can customize the table. For even deeper customization, consider the gt, huxtable, reactable, kableExtra, xtable, stargazer, pander, tables, and ascii packages. Each provides a set of tools for returning formatted tables from R code.
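As one small example of such customization, the digits argument of knitr::kable() rounds numeric columns before the table is printed. The sketch below only demonstrates that single argument; the object name tbl is arbitrary.

```r
# Round all numeric columns to one decimal place in the displayed table.
tbl <- knitr::kable(head(mtcars), digits = 1)

# Outside of a rendered document, printing shows the Markdown source of the table.
print(tbl)
```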

Exercise E

Q1

Under the “Tables” section you created in Exercise D in AQuartoExample.qmd, insert a code chunk and use knitr::kable() to add a table that shows the first 5 rows of the stackloss data frame, with the caption “Table 1: Stack Loss”.

2.4 YAML Header

You can control many other “whole document” settings by tweaking the parameters of the YAML header. You might wonder what YAML stands for: it is “YAML Ain’t Markup Language”, which is designed for representing hierarchical data in a way that is easy for humans to read and write. Quarto uses it to control many details of the output. Here we will discuss self-contained documents and document parameters.

Self-contained

HTML documents typically have a number of external dependencies (e.g., images, CSS style sheets, JavaScript, etc.) and, by default, Quarto places these dependencies in a _files folder in the same directory as your .qmd file. If you publish the HTML file on a hosting platform (e.g., Quarto Pub, https://quartopub.com/), the dependencies in this directory are published with your document and hence are available in the published report. However, if you want to email the report to a colleague, you might prefer to have a single, self-contained HTML document that embeds all of its dependencies. You can do this by specifying the embed-resources option:

format:
  html:
    embed-resources: true

The resulting file will be self-contained, such that it will need no external files and no internet access to be displayed properly by a browser.

Parameters

Quarto documents can include one or more parameters whose values can be set when you render the report. Parameters are useful when you want to re-render the same report with distinct values for various key inputs. For example, you might be producing sales reports per branch, exam results by student, or demographic summaries by country. To declare one or more parameters, use the params field.

This example uses a my_class parameter to determine which class of cars to display:

---
format: html
params:
  my_class: "suv"
---

```{r}
#| label: setup
#| include: false

library(tidyverse)

class <- mpg |> filter(class == params$my_class)
```

# Fuel economy for `r params$my_class`s

```{r}
#| message: false

ggplot(class, aes(x = displ, y = hwy)) + 
  geom_point() + 
  geom_smooth(se = FALSE)
```

As you can see, parameters are available within the code chunks as a read-only list named params.
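To re-render the same report for several parameter values, one option is to loop over the values and pass each one through the execute_params argument of quarto::quarto_render(). The sketch below assumes the parameterized document above has been saved under the hypothetical name fuel-economy.qmd; the output file names are likewise made up.

```r
# Hypothetical file name; replace with the path to your parameterized .qmd.
report <- "fuel-economy.qmd"

# Values of the my_class parameter to render, one report per value.
classes <- c("suv", "compact", "pickup")
outputs <- paste0("fuel-economy-", classes, ".html")

# Render only when the file exists and the quarto R package is installed.
if (file.exists(report) && requireNamespace("quarto", quietly = TRUE)) {
  for (i in seq_along(classes)) {
    quarto::quarto_render(
      input = report,
      output_file = outputs[i],
      execute_params = list(my_class = classes[i])
    )
  }
}
```

Each call overrides the default my_class value declared in the YAML header, so the source document itself does not need to be edited between renders.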

You can write atomic vectors directly into the YAML header. You can also run arbitrary R expressions by prefacing the parameter value with !expr. This is a good way to specify date/time parameters.

---
format: html
params:
  start: !expr as.Date("2010-01-01")
---

```{r}
#| label: setup
#| include: false

library(tidyverse)

economics_subset <- economics |>
  filter(date >= params$start)
```

```{r}
#| message: false

economics_subset |>
  ggplot(aes(x = date, y = unemploy / pop)) +
  geom_line()
```

Exercise F

Q1
  1. In the YAML header of the AQuartoExample.qmd document, set a filtering threshold for Air.Flow at 60. This will enable you to render a report only for experiments with the flow of cooling air below 60.

  2. Render the report incorporating all modifications from previous exercises, and review the changes in the output HTML file compared to the original.

2.5 Troubleshooting and More

Troubleshooting

Troubleshooting Quarto documents can be challenging because you are no longer in an interactive R environment, and you will need to learn some new tricks. Additionally, the error could be due to issues with the Quarto document itself or due to the R code in the Quarto document.

One common error in documents with code chunks is duplicated chunk labels, which are especially pervasive if your workflow involves copying and pasting code chunks. To address this issue, all you need to do is to change one of your duplicated labels.
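A quick way to look for duplicated labels before rendering is to scan the file's lines. The helper below is a hypothetical sketch that only recognizes labels written as #| label: lines; labels placed inside the chunk braces are not detected.

```r
# Return chunk labels that appear more than once in a vector of lines.
find_duplicate_labels <- function(lines) {
  pattern <- "^#\\|\\s*label:\\s*"
  label_lines <- grep(pattern, lines, value = TRUE)
  labels <- trimws(sub(pattern, "", label_lines))
  unique(labels[duplicated(labels)])
}

# Hypothetical document content; in practice use readLines("your-file.qmd").
example_lines <- c(
  "#| label: setup",
  "library(tidyverse)",
  "#| label: scatter-plot",
  "plot(stackloss)",
  "#| label: scatter-plot",
  "plot(mtcars)"
)

duplicates <- find_duplicate_labels(example_lines)
print(duplicates)
```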

If the errors are due to the R code in the document, the first thing you should always try is to recreate the problem in an interactive session. Restart R, then “Run all chunks”, either from the Code menu, under Run region, or with the keyboard shortcut Cmd + Option + R (Mac) or Ctrl + Alt + R (Windows). If you are lucky, that will recreate the problem, and you can figure out what is going on interactively.

If that does not help, there must be something different between your interactive environment and the Quarto environment. You are going to need to systematically explore the options. The most common difference is the working directory: the working directory of a Quarto document is the directory in which it lives. Check that the working directory is what you expect by including getwd() in a chunk.

Next, brainstorm all the things that might cause the bug. You will need to systematically check that they are the same in your R session and your Quarto session. The easiest way to do that is to set error: true on the chunk causing the problem, then use print() and str() to check that settings are as you expect.
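A diagnostic chunk for this kind of check might look like the sketch below. The label is hypothetical, and stackloss stands in for whatever object you suspect is causing the problem.

````markdown
```{r}
#| label: debug-environment
#| error: true

getwd()
str(stackloss)
print(sessionInfo())
```
````

Because error: true is set, the render continues even if one of these calls fails, and the error message appears in the output where you can read it.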

More to Explore

In this chapter we introduced you to Quarto for authoring and publishing reproducible computational documents that include your code and your prose in one place. You have learned about writing Quarto documents in RStudio with the visual or the source editor, how code chunks work and how to customize options for them, and how to include figures and tables in your Quarto documents. Additionally, you have learned about adjusting YAML header options for creating self-contained or parametrized documents. We have also given you some troubleshooting tips.

While this introduction should be sufficient to get you started with Quarto, there is still a lot more to learn. Quarto is still relatively young, and is still growing rapidly. The best place to stay on top of innovations is the official Quarto website: https://quarto.org.

There is another important topic that we have not covered here: collaboration. Collaboration is a vital part of modern data science, and you can make your life much easier by using version control tools, like Git and GitHub. We recommend reading “Happy Git with R”, a user-friendly introduction to Git and GitHub for R users, by Jenny Bryan. The book is freely available online: https://happygitwithr.com.