with Quarto
May 1, 2024
Hello James,
Prof. Toad* gave us an old copy of the dataset. Could you redo the analysis on the updated data? Let’s aim to meet tomorrow for coffee to discuss the results. Are you free at 9 AM?
Best,
Steven
* Identity changed to protect the innocent.
Retraction Watch, by Adam Marcus, Ivan Oransky, and Alison McCook, monitors journals for papers that have been retracted.
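One widely discussed case of an Excel error affecting published results is Growth in a Time of Debt by Reinhart & Rogoff. The paper was not formally retracted, but its findings were challenged once the spreadsheet error came to light.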
How can we create a report that contains code and updates if data changes?
Replicability is present only when the exact same experiment is performed at least twice, leading to the same conclusion. This requires each experiment to have the same data collection and analysis mechanisms.
Reproducibility exists if there is a specific set of computational functions/analyses (usually specified in terms of code) that exactly reproduces all of the numbers in a published paper from raw data.
There has been a notable push toward reproducibility within Statistics. In particular, the Journal of the American Statistical Association (JASA) recently created a formal guide for reproducibility and appointed its own Associate Editors of Reproducibility!
Elsewhere, the scientific community discussed reproducibility at length in a special issue of the journal Science.
“Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.”
— Donald Knuth in Literate Programming (1984) on pg. 1
Literate programming encourages programmers to interleave code with narrative content that follows the natural logic and flow of human thought.
Structure of a literate document: Text, Code, Output, Text
https://mine-cetinkaya-rundel.github.io/quarto-tip-a-day/ | GitHub
In the top left, click the White Plus and select “Quarto Document…”
In the new prompt, enter a title, author name, and press “Create”
Annotated sections of the “Hello Quarto” document related to document information, text formatting, and code execution
Annotated source to output of the “Hello Quarto” document
Annotated “Hello Quarto” document navigation options
Example of a Word document word-document.docx
unzip word-document.docx
Sample HTML webpage mirroring the Word document
Source of the sample HTML webpage
Markdown is the lingua franca for creating any kind of document
Writing a post using markdown on Stanford’s Subreddit
Writing an issue using markdown on GitHub
Code
Output
A blank line between blocks of text creates a new paragraph.
Links can be hidden e.g. Stanford or not https://stanford.edu/.
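A minimal sketch of the paragraph and link syntax described here, assuming standard Pandoc markdown as used by Quarto. The paragraph text is illustrative only; the Stanford URL is the one from the example above.

```markdown
This is the first paragraph.

This is the second paragraph, separated by a blank line.

Links can be hidden e.g. [Stanford](https://stanford.edu/)
or not <https://stanford.edu/>.
```

The bracket-and-parenthesis form shows only the link text, while the angle-bracket form shows the URL itself as a clickable link.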
Code
Relative Path Image
Absolute Path Image
Output
(Repeated 3 Times…)
Important
Relative paths are the best to use to share your work with others as they are operating system independent. For example, do you have a user called “jjb” on your computer with a folder “img”?
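As a rough illustration of the two kinds of image paths, assuming standard Pandoc image syntax. The file name `example.png` and the captions are hypothetical; the `jjb` user and `img` folder come from the question above.

```markdown
![Relative path image](img/example.png)

![Absolute path image](/Users/jjb/img/example.png)
```

The first line resolves relative to the document's own folder, so it keeps working when the project folder is shared. The second only works on a machine that has that exact user and folder.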
Markdown Syntax | Output |
---|---|
`# Header 1` | Header 1 |
`## Header 2` | Header 2 |
`### Header 3` | Header 3 |
`#### Header 4` | Header 4 |
`##### Header 5` | Header 5 |
`###### Header 6` | Header 6 |
Important
Make sure a new line (space) exists between text and the first list item. For sublists or nested lists, indent four spaces to create a new level in the list.
Tip
To simplify ordered lists and allow for moving items in the list around, use `1.` for each item. If a list needs to be broken, numbering is only continued if each entry is labeled using the `1.`, `2.`, `3.`, … format.
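The list rules above can be sketched as follows, assuming Pandoc markdown as rendered by Quarto. The item text is illustrative only.

```markdown
Some text before the list.

1. First item
1. Second item
    - Nested item (indented four spaces)
    - Another nested item
1. Third item
```

Because every top-level entry uses `1.`, items can be reordered without renumbering; the rendered list is still numbered 1, 2, 3.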
Code
| Left | Center | Right |
|-------------------------|:---------------:|--------:|
| Hey, check it out | Colons provide | 873 |
| its **Markdown** | alignment thus | 1000 |
| right in the table | *centered* text | |
Output
Left | Center | Right |
---|---|---|
Hey, check it out | Colons provide | 873 |
its Markdown | alignment thus | 1000 |
right in the table | centered text |
Tip
`Visual` mode provides a `Table` menu to set up Quarto tables; alternatively, use the table generator website.
Quarto handles literate programming by using a series of programs:
How Quarto Works (Source)
`knitr` executes all code chunks and creates a new markdown (`.md`) file.
`pandoc` takes the generated markdown file and converts it to the desired format.
Visual Mode is a What You See Is What You Get (WYSIWYG) editor. This mode is similar to Word.
You can render a Quarto document by using this shortcut in RStudio:
macOS: Cmd (⌘) + Shift (⇧) + K
Windows/Linux: Ctrl + Shift + K
Or, you can press the “Render” button in either `Source` or `Visual` mode.
Rendering a Quarto Document using “Render”
Example
Insert a chunk into the `qmd` by typing it or by using [⌘/Ctrl + ⌥/Alt + I]
Important
Please make sure to label your code chunks! It helps with debugging.
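A minimal sketch of a labeled chunk, assuming the knitr engine and Quarto's `#|` option comments. The label name `summarize-cars` is hypothetical; `cars` is the built-in R dataset.

````markdown
```{r}
#| label: summarize-cars
summary(cars)
```
````

When rendering fails, the label appears in the progress and error messages, which makes it easier to find the offending chunk.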
Option | Description |
---|---|
`eval` | Evaluate the code chunk. |
`echo` | Include the source code in output. |
`output` | Include code output results (`true`, `false`, or `asis`). |
`warning` | Include warnings in the output. |
`error` | Include errors in the output (continues execution if error present). |
`include` | Catch-all for preventing any output (code or results) from being included. |
Note
More `knitr` chunk options are available: http://yihui.name/knitr/options
`echo: false` hides code, but shows results.
`eval: false` shows code, but does not create results.
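To make the two options concrete, here is a small sketch assuming the knitr engine. The labels and the code inside each chunk are illustrative only.

````markdown
```{r}
#| label: plot-cars
#| echo: false
plot(cars)
```

```{r}
#| label: show-only
#| eval: false
install.packages("quarto")
```
````

The first chunk contributes only its plot to the rendered document. The second displays its code but never runs it.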
Enclose the R expression using `r `.
Code:
There are `r nrow(cars)` observations in our data.
Output:
There are 50 observations in our data.
Important
If using `Visual` or `Source` mode, be advised the R expression will only substitute the value held by the variable when the Quarto document is rendered. That is, the value contained within the expression only appears in the output file.
Code:
The _mean_ of **x** is `r x_mu` and
the _standard deviation_ is `r x_sd`.
Output:
The mean of x is 5.5 and the standard deviation is 3.02765.
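For the inline expressions to resolve, `x_mu` and `x_sd` must be defined in an earlier chunk. The original data is not shown, so the sketch below assumes `x <- 1:10`, which is consistent with the mean and standard deviation in the output above. The chunk label is hypothetical.

````markdown
```{r}
#| label: compute-stats
x <- 1:10        # assumed data; not given in the original
x_mu <- mean(x)  # mean of x
x_sd <- sd(x)    # sample standard deviation of x
```

The _mean_ of **x** is `r x_mu` and
the _standard deviation_ is `r x_sd`.
````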
The title, author, date, output format, and editor type are stored at the beginning, or head, of the Quarto document. The data is stored according to the YAML Ain’t Markup Language (YAML) format.
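A minimal sketch of such a header, covering each of the fields named above. The title and author are reused from the deck's own examples; the date and the `editor: visual` setting are assumptions for illustration.

```markdown
---
title: "Hello Quarto"
author: "JJB + Course"
date: "May 1, 2024"
format: html
editor: visual
---
```

The header sits between the two `---` lines at the very top of the `.qmd` file, before any text or code.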
Render as an HTML document
Render as a PDF
Render as a Word document
Render one Quarto document to many output options like HTML, Jupyter Notebook, PDF, and Word Document.
---
title: "Hello Quarto"
author: "JJB + Course"
format:
  html: default
  ipynb: default # new format!
  pdf: default
  docx: default
---
Note
Quarto supports many formats, including PowerPoint (PPT), Revealjs, Beamer, Rich Text Format (RTF), and so on. For details, see All Formats.
Using the Render button’s drop-down menu, we can select a single output format to create.
In this example, we customize the `html` and `docx` formats.
---
title: "Hello Quarto"
author: "JJB + Course"
format:
  html:
    toc: true
    code-fold: true
  ipynb: default
  pdf: default
  docx:
    number-sections: true
    highlight-style: github
---
Note
For individual format options, please find the format on the All Formats page of the Quarto user guide.
Literate programming has been a huge focus of the R community.
Officially, the `Sweave` (.Rnw) system backed by R-core allowed for literate programming.
It was championed by Fritz Leisch, an R Core member who recently passed away.
However, the system required extensive use of LaTeX, which marks up text, to combine R code with prose.
Plus, a few useful options were missing, such as saving the results of long-running code chunks to avoid needing to recalculate the output.
Sample Sweave code chunk:
Looking at the weaknesses of the Sweave feature set, the `knitr` package was created with a focus on improving options within LaTeX.
A little bit later, the `rmarkdown` package arrived on the scene to lower the barrier to entry by allowing markdown to be used to interweave R code and results.
Sample Rmarkdown code chunk:
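The original sample is not reproduced here. As a minimal sketch of the R Markdown style, chunk options go inside the braces after the engine name; the label `cars-summary` and the chunk body are illustrative only.

````markdown
```{r cars-summary, echo=FALSE}
summary(cars)
```
````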
The focus on using `rmarkdown` to create reports drew widespread acclaim after its debut in 2014. (See J.J. describe `rmarkdown` in 2016.)
However, the name `rmarkdown` constrained the report format to just R.
As data science is a polyglot field, that is, you need to speak more than one language (e.g. Python, R, Julia, SQL, C++, …), the idea for a language-agnostic framework was born.
`Quarto` is the manifestation of being able to work with multiple languages without needing R (you could just use a Jupyter kernel).
Sample Quarto code chunk:
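The original sample is not reproduced here. As a minimal sketch of the Quarto style, options move out of the braces and into `#|` comment lines at the top of the chunk; the label and the chunk body are illustrative only.

````markdown
```{r}
#| label: cars-summary
#| echo: false
summary(cars)
```
````

Because the options live in comments rather than in R-specific chunk headers, the same `#|` convention carries over to chunks written for other engines.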