Dynamic Interactions for R and Python

Using Quarto and WebAssembly

James Balamuta

May 1, 2024

Intro

Housekeeping

For those new to Revealjs, change slides using…

  • Next slide: spacebar, j, or right arrow
  • Previous slide: k or left arrow

Additional options:

  • Options Menu: m
  • Slide Overview: o
  • Zoom: alt (windows) or option (mac)
  • Print slides

Slides and Code

QR Code that holds the link to the presentation's GitHub repository.

These slides were made using Quarto’s Revealjs format under the {quarto-stanford} theme.

See source of the presentation on GitHub at:

https://github.com/coatless-talks/stats352-guest-lectures-on-dynamic-interactions-wasm

Lecture Objectives

  1. Understand the concept of WebAssembly (WASM) and its role in bringing data science languages to the web.
  2. Explore the use of WASM in conjunction with Pyodide for Python and webR for R to empower dynamic and reproducible interactions within Quarto documents.
  3. Engage in hands-on exercises and demonstrations to reinforce understanding and proficiency in leveraging Pyodide and webR within Quarto documents for dynamic content creation.

Before we begin, thank you …

Agenda

  • What is WebAssembly (WASM)?
  • Why is running R and Python under WASM great?
  • How can webR and Pyodide WASM versions be used with Quarto?
  • Where can I publish my document?
  • What’s upcoming?

WebAssembly (WASM)

What is WebAssembly?

  • WebAssembly is a binary instruction format designed with safety in mind
    • Containerization/sandboxing (isolated “user-space” environments)
  • It has “near-native execution speed” in-browser or on system
  • Available in most web browsers

The logo for WebAssembly

“Hello World” with WebAssembly

  • WASM binaries can be created from a human-readable WebAssembly Text (.wat) file.
  • This file is written as symbolic expressions (cf. Lisp-family languages like Clojure and Scheme)

main.wat

(module
  ;; Allocate one page of linear memory (64 KiB). Export it as "memory"
  (memory (export "memory") 1)

  ;; Write the string at the start of the linear memory.
  (data (i32.const 0) "Hello, world!") ;; write string at location 0

  ;; Export the position and length of the string.
  (global (export "length") i32 (i32.const 13)) ;; "Hello, world!" is 13 bytes
  (global (export "position") i32 (i32.const 0)))

Convert to binary with:

wat2wasm main.wat -o main.wasm

Access with JavaScript:

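// Assumes main.wasm is served next to the page (run in a module or the console).
const wasmModule =
  await WebAssembly.compileStreaming(fetch('main.wasm'));
const wasmInstance =
  new WebAssembly.Instance(wasmModule, {});
const { memory, length, position } = wasmInstance.exports;
// Exported globals are WebAssembly.Global objects; read them via .value
const bytes = new Uint8Array(
  memory.buffer, position.value, length.value);
const my_string = new TextDecoder('utf8').decode(bytes);

console.log(my_string)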
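
To try the access pattern without running wat2wasm or a web server, the sketch below embeds a hand-assembled binary for the same module and reads the string back under Node.js. The byte layout is an assumption: the real wat2wasm output may be arranged differently, although it should export the same three names. The length global here is 13, the full byte length of "Hello, world!", and the names strLength and strPosition are chosen only for this example.

```javascript
// Hand-assembled bytes for the module described in main.wat (assumed layout).
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic + version 1
  // Memory section: one memory with a minimum size of 1 page
  0x05, 0x03, 0x01, 0x00, 0x01,
  // Global section: two immutable i32 globals, 13 (length) and 0 (position)
  0x06, 0x0b, 0x02, 0x7f, 0x00, 0x41, 0x0d, 0x0b, 0x7f, 0x00, 0x41, 0x00, 0x0b,
  // Export section: "memory" (memory 0), "length" (global 0), "position" (global 1)
  0x07, 0x1e, 0x03,
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00,
  0x06, 0x6c, 0x65, 0x6e, 0x67, 0x74, 0x68, 0x03, 0x00,
  0x08, 0x70, 0x6f, 0x73, 0x69, 0x74, 0x69, 0x6f, 0x6e, 0x03, 0x01,
  // Data section: the 13 bytes of "Hello, world!" written at offset 0
  0x0b, 0x13, 0x01, 0x00, 0x41, 0x00, 0x0b, 0x0d,
  0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x2c, 0x20, 0x77, 0x6f, 0x72, 0x6c, 0x64, 0x21,
]);

// Synchronous compilation is fine for a module this small.
const helloModule = new WebAssembly.Module(wasmBytes);
const helloInstance = new WebAssembly.Instance(helloModule, {});
const {
  memory,
  length: strLength,     // WebAssembly.Global holding the string length
  position: strPosition, // WebAssembly.Global holding the start offset
} = helloInstance.exports;

// View only the bytes of the string inside linear memory, then decode them.
const helloView = new Uint8Array(memory.buffer, strPosition.value, strLength.value);
const helloString = new TextDecoder('utf-8').decode(helloView);

console.log(helloString);
```

Exported globals arrive in JavaScript as WebAssembly.Global objects, which is why the sketch reads them through .value.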

WAT to WASM Demo

Demonstration of compiling a WAT into a WASM binary using the wat2wasm tool.

https://webassembly.github.io/wabt/demo/wat2wasm/

Interactions

Diagram showing the workflow for how the compiled WASM file is working within the browser

But, I have C/C++ code …

No worries! We can use Emscripten:

  • an LLVM-based compiler targeting JavaScript and WebAssembly
  • that compiles C and C++ code into WebAssembly, including the WebAssembly System Interface (WASI)
  • so that existing code is translated automatically

Note

There are more languages available for WASM compilation.

“Hello World” Emscripten Compilation

Let’s take for example the hello_world.c program

#include <stdio.h>

int main() {
  printf("Hello, world!\n");

  return 0;
}

We can convert it using Emscripten to a webpage with:

emcc hello_world.c -o hello_world.html

Previewing the generated webpage, e.g. hello_world.html, requires a local web server; opening the file directly from disk fails to show the output in most browsers.
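
Any static file server will do for the preview. As a minimal sketch, assuming Node.js is installed, the hypothetical helper createStaticServer below serves a directory and labels .wasm files as application/wasm, the content type browsers expect for streaming compilation. The helper names and the port in the usage note are assumptions for illustration, not part of Emscripten.

```javascript
const http = require('http');
const fs = require('fs');
const path = require('path');

// Content types for the files Emscripten emits.
const MIME_TYPES = {
  '.html': 'text/html',
  '.js': 'text/javascript',
  '.wasm': 'application/wasm',
};

// Look up the content type from the file extension.
function contentTypeFor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return MIME_TYPES[ext] || 'application/octet-stream';
}

// Hypothetical helper: returns an HTTP server that serves files from rootDir.
function createStaticServer(rootDir) {
  const root = path.resolve(rootDir);
  return http.createServer((req, res) => {
    let urlPath;
    try {
      urlPath = decodeURIComponent(new URL(req.url, 'http://localhost').pathname);
    } catch (err) {
      res.writeHead(400);
      res.end('Bad request');
      return;
    }
    const filePath = path.join(root, urlPath);
    // Refuse requests that resolve outside the served directory.
    if (filePath !== root && !filePath.startsWith(root + path.sep)) {
      res.writeHead(403);
      res.end('Forbidden');
      return;
    }
    fs.readFile(filePath, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end('Not found');
        return;
      }
      res.writeHead(200, { 'Content-Type': contentTypeFor(filePath) });
      res.end(data);
    });
  });
}
```

Calling createStaticServer('.').listen(8000) from the output directory and opening http://localhost:8000/hello_world.html should then show the program's output. Emscripten also ships its own emrun helper for the same purpose.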

Output View of C Program

Demo webpage showing the 'Hello, world!' example written in C translated to WASM

Is it that straightforward?

There’s a bit more nuance since R and Python extensively use Fortran in:

  • Basic Linear Algebra Subprograms (BLAS)
    • Basic vector and matrix operations
    • “AXPY” operation, \(\alpha \mathbf{X} + \mathbf{Y}\)
  • Linear Algebra Package (LAPACK)
    • Solving systems, eigenvalue and singular value problems
  • And other subroutines …
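
As a rough illustration of what the AXPY operation computes, here is a plain JavaScript sketch. The function name axpy and the use of ordinary arrays are assumptions for this example; real BLAS routines are optimized Fortran or C and typically update the second vector in place.

```javascript
// Illustrative AXPY: returns alpha * x + y, element by element.
// This only shows the arithmetic, not how BLAS implements it.
function axpy(alpha, x, y) {
  if (x.length !== y.length) {
    throw new Error('x and y must have the same length');
  }
  return x.map((xi, i) => alpha * xi + y[i]);
}

const axpyResult = axpy(2, [1, 2, 3], [10, 20, 30]);
console.log(axpyResult); // [ 12, 24, 36 ]
```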

This brings in the need to use llvm-flang. For more, see George Stagg’s Fortran on WebAssembly post.

Why WASM?

  • New in-browser experiences
  • Complex Web Applications
  • Universal execution

New Experience

Complex Web Applications

A screenshot showing a Docker container running underneath WASM in the web browser.

🔗 https://ktock.github.io/container2wasm-demo/

(Warning: Minimum 200MB download!)

webR and Pyodide

WebAssembly and R: webR

  • webR is a version of the R interpreter built for WebAssembly.
  • Allows for R code to be directly run in a web browser, without an R server.
  • Possible to even run R under Node.js.

The webR hexagon logo

webR REPL

🔗 https://webr.r-wasm.org/v0.3.3/

shinylive REPL

🔗 https://shinylive.io/r/editor

A screenshot of the shinylive editor in a web browser

Developer Tools

An animated gif that shows the process of opening the Web Developer Tools in Chrome

  • Let’s take a look at webR from just a JavaScript perspective, by using Web Developer Tools in Chrome.
  • You can open it by using either:
    • macOS: Command+Option+J
    • Windows: Control+Shift+J
  • In console, type:
console.log("Hi there!")

Using R in our Browser: Initialize

An animated gif that shows the initialization of webR and some calculations within Google Chrome's Developer Tools console.

  • In our web developer console, we need to first load and initialize webR.
  • In console, type:
var webRready = await import(
  'https://webr.r-wasm.org/latest/webr.mjs'
).then(
  async ({ WebR }) => {
    const webR = new WebR();
    await webR.init();
    return webR;
  }
);

Using R in our Browser: Version

An animated gif that shows obtaining the current webR version within Google Chrome's Developer Tools console.

  • Next, let’s determine what version of webR is in use.
    • Under the latest tag, we’re using the development version.
    • We can change latest to a specific version, e.g. v0.3.3, to pin the evaluation.
  • In console, type:
var webRready = await import(
  'https://webr.r-wasm.org/latest/webr.mjs'
).then(
  async ({ WebR }) => {
    const webR = new WebR();
    await webR.init();
    return webR;
  }
);

webRready.version

Using R in our Browser: Evaluate

An animated gif that shows an attempt at evaluating R code using webR within Google Chrome's Developer Tools console.

  • Let’s try evaluating some R code using our webR instance.

  • In console, type:

var webRready = await import(
  'https://webr.r-wasm.org/latest/webr.mjs'
).then(
  async ({ WebR }) => {
    const webR = new WebR();
    await webR.init();
    return webR;
  }
);

webRready.version

webRready.evalR(
  'set.seed(1); rnorm(10,5,1)'
);

Using R in our Browser: Await

An animated gif that shows a refined attempt at evaluating R code using webR with `await` (asynchronous programming) within Google Chrome's Developer Tools console.

  • Evaluation involves awaiting promise resolution with await.
  • Promises are integral to asynchronous programming.
    • Offloading long-running tasks keeps the main program responsive to new events.
    • Tasks are run concurrently instead of sequentially.
  • In console, type:
var webRready = await import(
  'https://webr.r-wasm.org/latest/webr.mjs'
).then(
  async ({ WebR }) => {
    const webR = new WebR();
    await webR.init();
    return webR;
  }
);

webRready.version

let result = await webRready.evalR(
  'set.seed(1); rnorm(10,5,1)'
);
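
The difference between the last two attempts can be reproduced without webR. In this sketch, fakeEvalR is a hypothetical stand-in for an asynchronous call such as evalR; it is not part of webR and only mimics the promise-returning behavior.

```javascript
// Hypothetical stand-in for an asynchronous evaluation: it resolves after a
// short delay, the way a real call resolves once the work has finished.
function fakeEvalR(code) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`result of: ${code}`), 10);
  });
}

// Without await, we only hold the pending promise, not the value.
const pending = fakeEvalR('1 + 1');
console.log(pending instanceof Promise); // true

// With await (inside an async function), we receive the resolved value.
async function main() {
  const value = await fakeEvalR('1 + 1');
  console.log(value); // result of: 1 + 1
  return value;
}

const finished = main();
```

The Developer Tools console allows await at the top level, which is why the slides can use it without wrapping the code in an async function.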

Using R in our Browser: Convert

An animated gif that shows the calculation with webR being performed and converted to a valid JavaScript object within Google Chrome's Developer Tools console.

  • With the result being a pointer, we need to convert it to a JavaScript object to see the data.
  • In console, type:
var webRready = await import(
  'https://webr.r-wasm.org/latest/webr.mjs'
).then(
  async ({ WebR }) => {
    const webR = new WebR();
    await webR.init();
    return webR;
  }
);

webRready.version

let result = await webRready.evalR(
  'set.seed(1); rnorm(10,5,1)'
);

let output = await result.toArray();
output

What are the values in R?

Open up your copy of R, what values are generated when running:

set.seed(1); 
rnorm(10,5,1)

Does it match with the webR output?

WebAssembly and Python: Pyodide

  • Pyodide is a version of the Python interpreter built for WebAssembly.
  • Features a robust, seamless JavaScript ⟺ Python foreign function interface.
  • Allows for Python code to be directly run in a web browser, without a Python server.

The Pyodide Project logo

Pyodide REPL

🔗 https://pyodide.org/en/stable/console.html

JupyterLab

🔗 JupyterLite’s JupyterLab Version

JupyterNotebook

🔗 JupyterLite’s JupyterNotebook Version

JupyterLite’s REPL

🔗 JupyterLite’s REPL

marimo

🔗 https://marimo.app/

Wait, what is a “server”?

A server is a computer that runs around the clock on the internet and responds to requests from your own computer.

We can think of servers in two ways:

  1. Compute
  2. Web

Note

There are more types of servers available; but, our discussion rests solely on those two.

Compute Servers

An animated gif showing how users send R code to the server and the server sends back results.

Web Servers

An animated gif showing how the server sends a copy of R to the end user's computer and, then, the user's computer runs the R code locally.

  • Web Servers focus on serving documents to users.

Data Science with Web Servers

A static image showing how the server sends a copy of R to the end user's computer and, then, the user's computer runs the R code locally.

Note

We can substitute the R logo with Python’s in these diagrams.

Trade-offs: Internet

  • Paradigm shifted from installed software requiring a single download to multiple downloads.
    • Internet bandwidth is precious (~1 TB Comcast cap, limited WiFi, slow internet).
  • Requires an internet connection at the start.
    • Need to obtain all resources over an internet connection.
  • Lack of persistence; ephemeral by nature.
    • Refresh page, poof work is gone!

Trade-offs: Privacy

A static image showing the source of a hidden solution on a page.

Trade-offs: Complexity

  • New layer of complexity to documents.
    • Compute happens when the document opens.
    • Not during the authoring stage.
  • Uses standard code cell markdown
    • Switch {r} -> {webr-r} or {python} -> {pyodide-python}.
  • Setup using document header fields.
    • No JavaScript manipulation required.

Trade-offs: Environment

  • Universal environments
    • Everyone has the same environment.
    • Not an exact replica of original software.
    • No license fees.
    • Shinier computers perform better!

How many R packages are available?

🔗 https://repo.r-wasm.org (Warning: Minimum 75 MB)

A screenshot showing the webR project's binary WASM R package repository.

Latest R packages from GitHub

r-universe.dev offers binaries based on an R package repository’s most recent commit:

A screenshot showing the webR binary on the r-universe.dev website alongside of an application of webR to download data.

Or, use a modified GitHub Action

How many Python packages are available?

Outside of the Python packages built into Pyodide, the number of available Python packages varies as there is no central repository.

  • If a Python package is “pure” (*py3-none-any.whl), then the package can be used as-is.
  • Otherwise, the packages must be compiled for Pyodide under specific Python and Emscripten versions.
    • e.g. *-cp310-cp310-emscripten_3_1_27_wasm32.whl
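
The distinction can be read off the wheel filename, whose last three dash-separated fields are the Python tag, the ABI tag, and the platform tag. The sketch below checks for the none/any combination that marks a pure wheel; the helper names and the example_pkg filenames are hypothetical.

```javascript
// Extract the three trailing tags from a wheel filename of the form
// {name}-{version}(-{build})?-{python}-{abi}-{platform}.whl
function parseWheelTags(filename) {
  if (!filename.endsWith('.whl')) {
    throw new Error('not a wheel file: ' + filename);
  }
  const parts = filename.slice(0, -'.whl'.length).split('-');
  if (parts.length < 5) {
    throw new Error('unexpected wheel filename: ' + filename);
  }
  const [python, abi, platform] = parts.slice(-3);
  return { python, abi, platform };
}

// A wheel is "pure" when it has no ABI requirement and runs on any platform.
function isPureWheel(filename) {
  const { abi, platform } = parseWheelTags(filename);
  return abi === 'none' && platform === 'any';
}

// Hypothetical filenames, for illustration only.
console.log(isPureWheel('example_pkg-1.0.0-py3-none-any.whl'));
console.log(isPureWheel('example_pkg-1.0.0-cp310-cp310-emscripten_3_1_27_wasm32.whl'));
```

The first call reports a pure wheel; the second reports a wheel built for a specific Python and Emscripten version.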

Quarto and quarto-{pyodide,webr}

Quarto

  • Next generation publishing system.
  • Unifies and extends the R Markdown ecosystem.
  • Develop and switch formats without hassle.

The Quarto hexagon logo.

Quarto Extensions

How the quarto-{webr,pyodide} extension works

An animated GIF showing how the {quarto-webr} extension works with Quarto, webR, and a static server.

Use cases

Next steps

  • The next slides focus on authoring documents with dynamic interactions.
  • We’ll go through the process for installing and using a Quarto extension.
  • Alternatively, you can use our authoring codespace. Discussed next…

Authoring Codespace

If you are comfortable with VS Code, you can jump right into an authoring Codespace by clicking on the following button:

Open in GitHub Codespaces

Note: Codespaces are available to Students and Teachers for free up to 180 core hours per month through GitHub Education. Otherwise, you will have up to 60 core hours and 15 GB free per month.

Install the {quarto-webr} Extension

  1. Open or Create an RStudio Quarto Project

  2. Navigate to the Terminal tab in the lower left side of RStudio

  3. Type the install command:

    quarto add coatless/quarto-webr

    and press enter.

  4. Voila! It’s installed.

Quarto Project Structure

The project directory should contain the following structure:

.
├── _extensions
│   └── coatless/quarto-webr # Added by 'quarto add'
├── _quarto.yml              # Created by 'quarto create'
└── webr-demo.qmd            # Quarto Document with webR

Important

If the _extensions directory is not found within a Quarto project, the project is not using any extensions!

Using {quarto-webr} - 4 Steps

---
title: webR in Quarto HTML Docs
format: html
engine: knitr
---

This is an R-enabled code cell 
in a Quarto HTML document.

```{r}
fit = lm(mpg ~ am, data = mtcars)

summary(fit)
```
  1. Add engine: knitr

Using {quarto-webr} - 4 Steps

---
title: webR in Quarto HTML Docs
format: html
engine: knitr
filters:
  - webr
---

This is an R-enabled code cell 
in a Quarto HTML document.

```{r}
fit = lm(mpg ~ am, data = mtcars)

summary(fit)
```
  1. Add engine: knitr
  2. Add the webr Filter

Using {quarto-webr} - 4 Steps

---
title: webR in Quarto HTML Docs
format: html
engine: knitr
filters:
  - webr
---

This is a webR-enabled code cell 
in a Quarto HTML document.

```{webr-r}
fit = lm(mpg ~ am, data = mtcars)

summary(fit)
```
  1. Add engine: knitr
  2. Add the webr Filter
  3. Use {webr-r} instead of {r}

Using {quarto-webr} - 4 Steps

---
title: webR in Quarto HTML Docs
format: html
engine: knitr
filters:
  - webr
---

This is a webR-enabled code cell 
in a Quarto HTML document.

```{webr-r}
fit = lm(mpg ~ am, data = mtcars)

summary(fit)
```
  1. Add engine: knitr
  2. Add the webr Filter
  3. Use {webr-r} instead of {r}
  4. Render the document!
    • macOS: Cmd (⌘) + Shift (⇧) + K
    • Windows: Ctrl + Shift + K

Or, you can press the “Render” button to generate a new document.

{quarto-webr}: In Action

An animated GIF showing a Quarto document inside of RStudio augmented by the {quarto-webr} extension having different values placed inside of its code cell.

{quarto-webr} Extension in Action

What about Python? Similar story…

First, install the {quarto-pyodide} extension using Terminal with:

quarto add coatless-quarto/pyodide

Next, register the extension in the Quarto Document with:

---
title: Pyodide in Quarto HTML Docs
format: html
filters:
  - pyodide
---

Finally, use {pyodide-python} instead of {python} when creating a code cell.

{quarto-pyodide}: In Action

An animated GIF showing a Quarto document inside of VS Code augmented by the {quarto-pyodide} extension having different Python code placed inside of its code cell generating a variety of outputs.

{quarto-pyodide} Extension in Action

Options for {quarto-webr}

There are two types of options in {quarto-webr}:

  • Cell-level: Customize how code is evaluated inside of the cell.
  • Document-level: Globally set different document properties.

Note

The cell-level options use a custom code cell parser called {quarto-codecelloptions} and, thus, are not exactly 1-to-1 with Quarto options.

Document-level options

For example, we could disable the status indicator and pre-load different R packages by specifying in the document’s YAML header the webr meta key:

---
title: webR in Quarto HTML Documents
format: html
engine: knitr
webr: 
  show-startup-message: false    # Disable displaying webR status 
  packages: ['ggplot2', 'dplyr'] # Install R packages on document open
filters:
  - webr
---

Cell-level options

Cell-level options direct the execution and output of executable code blocks. These options are specified within comments at the top of a code block by using a hashpipe, e.g. #| option: value.

```{webr-r}
#| autorun: true
#| fig-width: 5

1 + 1
plot(pressure)
```

context option

  • The context cell option handles how the code is executed and displayed to a user.
  • The default context is interactive, which gives us a runnable code cell.
  • Other options for context are:
    • context: output which only shows the output
    • context: setup which shows neither output nor code.
  • More details

Demos

Important

{quarto-pyodide} has yet to receive the options treatment! We only make available an interactive editor.

Publishing

Sharing Work

  • Once you are satisfied with the Quarto document, it’s time to publish your work!
  • The publishing step is important because the documents need to be served by a web server.
    • Opening the HTML file directly from disk may prevent it from working correctly under various configuration options.
  • There are multiple options for publishing with Quarto, and we’ll present two of them.

Publish Options

To make your Quarto document accessible on GitHub Pages via Quarto, use the following command in Terminal:

quarto publish gh-pages

This option is great if you want to share your document through a GitHub Pages website.

Alternatively, you can publish your Quarto document on Quarto Pub via Quarto. Use the following command in Terminal:

quarto publish quarto-pub

This option provides you with a shareable link for easy access by others and is a good choice if you prefer a dedicated platform for your documents.

Continuously Publishing

Continuous Deployment

Continuous deployment (CD) is the notion that each time a collaborator contributes code into a branch of the repository, the code is automatically built and put into a production environment.

Overview of Services

  • GitHub: Enables version control and integrates with …
  • Continuous deployment (CD) services that build and deploy code on each pushed commit

Prebuilt GitHub Actions

  • GitHub Actions have already been created for various Quarto, R language, webR, and Pyodide workflows.
  • These actions provide a way to quickly set up the necessary configuration files for the repository to be continuously deployed.
    • Actions are programmed through a combination of YAML, Shell, Node.js, or Dockerfiles.
  • Moreover, these actions usually contain the “best practices” for deployment.

Using a GitHub Action

Let’s say we want to use a GitHub Action that builds a webR package binary alongside a pkgdown website. Inside our RStudio package project, we would run:

# install.packages("usethis")
usethis::use_github_action(
  "https://raw.githubusercontent.com/r-wasm/actions/main/examples/rwasm-binary-and-pkgdown-site.yml"
)

This copies the YAML file for the action and sets it up inside of the .github/workflows folder, e.g.

.github/workflows/rwasm-binary-and-pkgdown-site.yml

What about just publishing for Quarto?

  • The prior example is great for package development.
  • But, what if we just wanted to publish our Quarto document?
  • How can we automatically render and deploy our Quarto documents to GitHub Pages?
    • We’ll need to build a custom workflow!
  • The next few slides cover the GitHub Action workflows used in the demo repo.

Stepping through the Action: Triggers

on:
  push:
    branches: [main, master]
  release:
    types: [published]
  workflow_dispatch: {}
    
name: demo-quarto-document

The Action will run on any of: a push to the main or master branch, a published release, or a manual trigger.

It’ll run under demo-quarto-document name.

Stepping through the Action: Config

# ... previous slide

jobs:
  demo-quarto-document:
    runs-on: ubuntu-latest
    concurrency:
      group: quarto-website-${{ github.event_name != 'pull_request' || github.run_id }}
    permissions:
      contents: read
      pages: write
      id-token: write

Next, we’ll specify which operating system the job runs on, place runs in a concurrency group so they do not execute at the same time, and describe the permissions the workflow has.
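
The concurrency group name is built from an expression whose || operator, like JavaScript's, yields its first truthy operand. The sketch below mimics that evaluation in plain JavaScript to show which group name each event type would produce; the function name and the run id value are assumptions for illustration.

```javascript
// Mimics the expression used in the workflow's `group:` field.
// For non-pull-request events the comparison is true, so every run shares
// one group; for pull requests the run id makes each group unique.
function concurrencyGroup(eventName, runId) {
  return `quarto-website-${eventName !== 'pull_request' || runId}`;
}

console.log(concurrencyGroup('push', 42));         // quarto-website-true
console.log(concurrencyGroup('pull_request', 42)); // quarto-website-42
```

Since this workflow is not triggered by pull requests, its runs would all land in the shared group.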

Stepping through the Action: Setup R

# ... previous slide
steps:
  - name: "Check out repository"
    uses: actions/checkout@v4

  - name: "Setup pandoc"
    uses: r-lib/actions/setup-pandoc@v2

  - name: "Setup R"
    uses: r-lib/actions/setup-r@v2

  - name: "Setup R dependencies for Quarto's knitr engine"
    uses: r-lib/actions/setup-r-dependencies@v2
    with:
      packages:
        any::knitr
        any::rmarkdown
        any::downlit
        any::xml2

Next, we obtain a copy of the repository.

Then, we specify the different software dependencies for the project.

As we’re using the engine: knitr, this requires additional R package dependencies.

Stepping through the Action: Quarto

# ... previous slides
- name: "Set up Quarto"
  uses: quarto-dev/quarto-actions/setup@v2

- name: "Install Quarto extensions"
  shell: bash
  run: |
    quarto add --no-prompt coatless/quarto-webr

- name: "Render working directory"
  uses: quarto-dev/quarto-actions/render@v2

In the next part of the file, we focus on Quarto:

  1. Installing Quarto
  2. Installing the {quarto-webr} Quarto extension
    • This step can be omitted; but, we want future updates!
    • We can also pin the version by using coatless/quarto-webr@v0.4.1
  3. Render any .qmd files in the current directory

Stepping through the Action: Deploy

# ... previous slides
- name: "Upload Pages artifact"
  uses: actions/upload-pages-artifact@v2
  with: 
    retention-days: 1

- name: "Deploy to GitHub Pages"
  id: deployment
  uses: actions/deploy-pages@v2

Finally, we take our rendered Quarto documents and create a zip archive that can be used on GitHub Pages.

Then, we deploy that archive onto GitHub Pages.

Enable GitHub Pages

A screenshot showing the settings page of a GitHub repository to set up deployment onto GitHub Pages of the content using a GitHub Action.

View the Deployed Documents

From here, our augmented Quarto documents with interactivity should be available for everyone!

https://quarto.thecoatlessprofessor.com/quarto-webr-pyodide-demo/

You can see the deployment repository here:

https://github.com/coatless-quarto/quarto-webr-pyodide-demo

Concluding

Future Work

  • Improve the {quarto-pyodide} extension features.
    • Move toward a message posting interface.
    • Improve graphing support
  • Formalize a built-in code exercise checking feature.
  • Push toward native code APIs.
  • Explore incorporating dynamic input toggles alongside code cells.

Prototype Dynamics

An animated GIF showing a demo of a control slider alongside of an interactive code cell. As the control slider is moved, the code cell is run and the output of a normal distribution histogram is shown.

Thank you! Questions?

Thank you for the invitation to talk today!

Questions