Next Generation Data Science Education

pyOpenSci Fall Festival 2024

James Balamuta

November 1, 2024

> Who are you?

Classic hello greeting sticker containing the text "Hello, my name is James"

Who am I?

Photo of Dr. James Balamuta flying a drone next to the Alma mater — Dr. James Balamuta (he/him)

Founder, HJJB LLC + Stealth Startups
Adjunct Lecturer, Department of Statistics @ Stanford
Former Visiting Assistant Prof. & Graduate Student Instructor, Department of Statistics @ UIUC
Builder of Quarto Extensions
- quarto-webr & quarto-pyodide
DTI @ Illinois Affiliate
GitHub | Website

Learning the Unknown

Learning Laboratory

“It shouldn’t be a static thing; it should be one where people learn what’s happening. And the only way to learn what’s happening is to change what’s happening.”

— Frank Oppenheimer in “Exploratorium” by Jon Boorstin (1974) at ~13:22

Main entrance to the Exploratorium at Pier 15 — Exploratorium at Pier 15 in San Francisco

Direct manipulation of concepts
Immediate visual feedback
Personal discovery
Enhanced retention

Active Learning

What is Active Learning?
- A pedagogical approach that engages students in the learning process through activities and/or discussion.
Why Active Learning?
- Increases student engagement
- Improves retention
- Enhances understanding
How to Implement Active Learning?
- Explorable Explanations

Explorable Explanations

“People currently think of text as information to be consumed. I want text to be used as an environment to think in.”

– Bret Victor in Explorable Explanations, 2011

🔄 Reactive Documents: Play with authors’ assumptions and see consequences
🎮 Interactive Examples: Make abstract concepts concrete through direct manipulation
🔍 Contextual Information: Verify claims and explore related ideas in real-time

Attempts, I’ve had a few …

Tweet from James Balamuta on the challenges of using learnr with Shiny Server

Me in ~2018 thinking learnr + Shiny Server + 100 students = 😱

Why? Unstable at scale and required a dedicated + licensed server.

Explorable Environments

Three key pieces of technology:

Pyodide: Python in the Browser without a server
Observable: Interactive JavaScript for Data Exploration
Quarto Live: Official Quarto Extension for Interactivity in Notebooks
- Or, use the community version quarto-pyodide.

A bonus piece of technology is Quarto Drop, an in-slide IDE: press the tilde/backtick (`) key to open it.

🐍 Python in the Browser

Say hello to Pyodide through Quarto Live.

Live Output
Markdown Source

```{pyodide}
#| exercise: ex_basic
# Try different values in the list comprehensions
squares = [x**2 for x in range(____)]
squares
```

1: Defines a Quarto Live cell attribute specifying this as an exercise named ex_basic
2: Provides a comment guiding the user to experiment with different values
3: Creates a list comprehension that squares numbers from 0 to a value the user needs to fill in
4: Displays the resulting squares list

```{pyodide}
#| exercise: ex_basic
#| check: true
feedback = None
if len(result) > 5:
    feedback = {"correct": True,
                "message": "Great! You created a longer sequence!"}
else:
    feedback = {"correct": False,
                "message": "Try using a larger number in range()"}
feedback
```

1: Defines a Quarto Live cell attribute specifying this as part of the ex_basic exercise
2: Specifies that this cell checks or “grades” the exercise result
3: Initializes a feedback variable as being empty
4: Checks if the length of the result is greater than 5, if true set feedback to indicate correct answer
5: Sets feedback to indicate incorrect answer and provide hint
6: Displays the feedback dictionary
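
The same grading rule can be tried outside the browser. The following is a minimal sketch in plain Python; `grade_squares` is a hypothetical helper name, not part of Quarto Live, and it assumes the student's answer arrives as a list, as `result` does in the check cell.

```python
def grade_squares(result):
    """Return a feedback dictionary shaped like the one in the check cell."""
    # Same rule as the check cell: more than five squares counts as correct.
    if len(result) > 5:
        return {"correct": True,
                "message": "Great! You created a longer sequence!"}
    return {"correct": False,
            "message": "Try using a larger number in range()"}


# A student who fills the blank with 10 passes; one who uses 3 gets the hint.
long_attempt = grade_squares([x**2 for x in range(10)])
short_attempt = grade_squares([x**2 for x in range(3)])
print(long_attempt["correct"], short_attempt["correct"])
```

Writing the rule as a function makes it easy to test the boundary case (exactly five values) before putting it in front of students.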

Explorable Mathematics

Exploring the classic equation of a line: \(y = ax + b\)

Live Output
Markdown Source


```{md}
$a =$ `{{ojs}} aParam` and $b =$ `{{ojs}} bParam`,
```

1: Displays the current values of parameters a and b using Observable JS variables in a math context

```{ojs}
//| echo: false
import {Tangle} from "@mbostock/tangle"
// Setup Tangle reactive inputs
viewof a = Inputs.input(1);
viewof b = Inputs.input(0);
aParam = Inputs.bind(Tangle({min: -5, max: 5, minWidth: "1em", step: 0.1}), viewof a);
bParam = Inputs.bind(Tangle({min: -10, max: 10, minWidth: "1em", step: 0.2}), viewof b);
```

1: Hides the code cell from output
2: Imports the Tangle library for interactive inputs
3: Comment indicating setup of reactive inputs
4: Creates an input view for parameter ‘a’ with initial value 1
5: Creates an input view for parameter ‘b’ with initial value 0
6: Binds parameter ‘a’ to a Tangle input with range [-5,5] and step size 0.1
7: Binds parameter ‘b’ to a Tangle input with range [-10,10] and step size 0.2

```{pyodide}
#| echo: false
#| autorun: true
#| input:
#| - a
#| - b
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 100)
y = a * x + b
plt.figure(figsize=(8, 4))
plt.plot(x, y)
plt.grid(True)
plt.title(f'Linear Function: y = {a}x + {b}')
plt.show()
```

1: Hides the code cell from output
2: Automatically runs the cell when inputs change
3: Specifies input parameters
4: First input parameter ‘a’
5: Second input parameter ‘b’
6: Imports NumPy for numerical operations
7: Imports Matplotlib for plotting
8: Creates x-values array from 0 to 10 with 100 points
9: Calculates y-values using linear function ax + b
10: Creates a new figure with specified size
11: Plots x versus y values
12: Adds grid to the plot
13: Sets the plot title showing the current function
14: Displays the plot

Reactive Programming

Woah? What’s this?

Quarto Live documents make extensive use of Quarto’s built-in Observable support.
Any Observable notebook or JavaScript library can be imported in a Quarto document cell.
- In the last cell, we imported Bret Victor’s Tangle library that was ported to Observable by Mike Bostock (creator of Observable) to create a slider for the a and b parameters.
Slider input can be used to update the Python code across the entire document in real-time.
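
To make the idea of reactivity concrete, here is a toy sketch of the observer pattern in plain Python. It is not how Observable or Quarto Live is implemented; `ReactiveInput` and its methods are hypothetical names used only to show a dependent "cell" re-running when an input changes.

```python
class ReactiveInput:
    """Holds a value and re-runs subscribed callbacks when it changes."""

    def __init__(self, value):
        self._value = value
        self._subscribers = []

    def subscribe(self, callback):
        # Register the callback and run it once with the current value.
        self._subscribers.append(callback)
        callback(self._value)

    def set(self, value):
        # Store the new value, then notify every dependent "cell".
        self._value = value
        for callback in self._subscribers:
            callback(value)


outputs = []
slope = ReactiveInput(1.0)
# The dependent "cell": recompute y = a * x + b at x = 2 with b fixed at 0.
slope.subscribe(lambda a: outputs.append(a * 2 + 0))
slope.set(3.0)  # moving the "slider" re-runs the cell
print(outputs)
```

The author of the dependent cell never calls it by hand; changing the input is enough, which mirrors what the `input:` option does for the Python cells.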

Explorables

```{ojs}
//| echo: false
viewof n_bins = Inputs.range([5, 1000], {
step: 1,
value: 20,
label: "Number of bins:"
})
```

1: Hides the code cell from output
2: Creates an interactive range slider for number of bins
3: Sets the increment step size to 1
4: Sets the initial value to 20 bins
5: Adds a label to describe the input control

```{pyodide}
#| autorun: true
#| echo: false
#| input:
#| - n_bins
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Generate random data
rng = np.random.RandomState(2021)
data = rng.normal(0, 1, 1000)
plt.figure(figsize=(8, 4))
plt.hist(data, bins=n_bins)
plt.title(f'Normal Distribution with {n_bins} bins')
plt.show()
```

1: Automatically runs the cell when the input changes
2: Prevents the code cell from being shown
3: Specifies input parameters section
4: Declares n_bins as an input parameter
5: Imports NumPy for numerical operations
6: Imports Matplotlib for plotting
7: Imports Seaborn for statistical visualization
8: Comments the data generation step
9: Generates 1000 random numbers from a normal distribution
10: Creates a new figure with specified size
11: Creates a histogram with user-specified number of bins
12: Sets the plot title showing current number of bins
13: Displays the plot
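
What the slider changes can also be seen without a plot. The sketch below counts values per bin using only the standard library; `bin_counts` is a hypothetical helper, and Python's `random` module will not reproduce the exact draws that NumPy's `RandomState(2021)` gives in the cell above.

```python
import random


def bin_counts(data, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    low, high = min(data), max(data)
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for value in data:
        # The maximum value belongs to the last bin rather than a new one.
        index = min(int((value - low) / width), n_bins - 1)
        counts[index] += 1
    return counts


rng = random.Random(2021)
data = [rng.gauss(0, 1) for _ in range(1000)]
for n_bins in (5, 20, 100):
    counts = bin_counts(data, n_bins)
    print(n_bins, "bins -> tallest bin holds", max(counts), "values")
```

With few bins the shape is coarse and each bar is tall; with many bins the bars shrink and sampling noise becomes visible, which is the trade-off the slider lets students discover.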

Explorable Biostatistics: Contingency

Live Output
Markdown Source

Drag the numbers to adjust the cell values and see how they affect the odds ratio:

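| | Exposed | Not Exposed | Total |
|---|---|---|---|
| Cases | `cell11` | `cell12` | `row1_total` |
| Controls | `cell21` | `cell22` | `row2_total` |
| Total | `col1_total` | `col2_total` | `table_total` |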

Calculations:

```{ojs}
//| echo: false
// Initialize the 2x2 table cells with Tangle inputs
viewof a11 = Inputs.input(20);
viewof a12 = Inputs.input(15);
viewof a21 = Inputs.input(10);
viewof a22 = Inputs.input(25);
// Create Tangle bindings for each cell
cell11 = Inputs.bind(Tangle({min: 0, max: 100, minWidth: "3em", step: 1}), viewof a11);
cell12 = Inputs.bind(Tangle({min: 0, max: 100, minWidth: "3em", step: 1}), viewof a12);
cell21 = Inputs.bind(Tangle({min: 0, max: 100, minWidth: "3em", step: 1}), viewof a21);
cell22 = Inputs.bind(Tangle({min: 0, max: 100, minWidth: "3em", step: 1}), viewof a22);
// Calculate row and column totals
row1_total = a11 + a12
row2_total = a21 + a22
col1_total = a11 + a21
col2_total = a12 + a22
table_total = row1_total + row2_total
```

1: Hides the code cell from output
2: Comment indicating initialization of table cell inputs
3: Creates input view for cell (1,1) with initial value 20
4: Creates input view for cell (1,2) with initial value 15
5: Creates input view for cell (2,1) with initial value 10
6: Creates input view for cell (2,2) with initial value 25
7: Comment indicating creation of Tangle bindings
8: Binds cell (1,1) to Tangle input with range [0,100]
9: Binds cell (1,2) to Tangle input with range [0,100]
10: Binds cell (2,1) to Tangle input with range [0,100]
11: Binds cell (2,2) to Tangle input with range [0,100]
12: Comment indicating calculation of totals
13: Calculates first row total
14: Calculates second row total
15: Calculates first column total
16: Calculates second column total
17: Calculates overall table total

```{pyodide}
#| autorun: true
#| echo: false
#| input:
#| - a11
#| - a12
#| - a21
#| - a22
import numpy as np
from scipy import stats
def analyze_contingency(a11, a12, a21, a22):
    # Create contingency table
    table = np.array([[a11, a12], [a21, a22]])
    # Calculate odds ratio
    odds_ratio = (a11 * a22) / (a12 * a21)
    # Calculate 95% CI for odds ratio
    log_or = np.log(odds_ratio)
    se_log_or = np.sqrt(1/a11 + 1/a12 + 1/a21 + 1/a22)
    ci_lower = np.exp(log_or - 1.96 * se_log_or)
    ci_upper = np.exp(log_or + 1.96 * se_log_or)
    # Perform chi-square test
    chi2, p_value = stats.chi2_contingency(table)[0:2]
    return {
        'odds_ratio': odds_ratio,
        'ci_lower': ci_lower,
        'ci_upper': ci_upper,
        'chi2': chi2,
        'p_value': p_value
    }
results = analyze_contingency(a11, a12, a21, a22)
print(f"Odds Ratio: {results['odds_ratio']:.2f}")
print(f"95% CI: ({results['ci_lower']:.2f}, {results['ci_upper']:.2f})")
print(f"Chi-square statistic: {results['chi2']:.2f}")
print(f"P-value: {results['p_value']:.4f}")
```

1: Automatically runs the cell when inputs change
2: Hides the code cell from output
3: Specifies input parameters section
4: First input parameter a11
5: Second input parameter a12
6: Third input parameter a21
7: Fourth input parameter a22
8: Imports NumPy for numerical operations
9: Imports scipy.stats for statistical tests
10: Defines function to analyze contingency table
11: Comments contingency table creation
12: Creates 2x2 contingency table as NumPy array
13: Comments odds ratio calculation
14: Calculates odds ratio
15: Comments confidence interval calculation
16: Calculates log odds ratio
17: Calculates standard error of log odds ratio
18: Calculates lower confidence interval bound
19: Calculates upper confidence interval bound
20: Comments chi-square test
21: Performs chi-square test and extracts statistics
22: Begins return dictionary
23: Includes odds ratio in results
24: Includes lower CI bound in results
25: Includes upper CI bound in results
26: Includes chi-square statistic in results
27: Includes p-value in results
28: Calls analysis function with current values
29: Prints formatted odds ratio
30: Prints formatted confidence interval
31: Prints formatted chi-square statistic
32: Prints formatted p-value
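
The table cells can be dragged down to 0, and a zero cell can make the odds ratio or its standard error divide by zero. The sketch below is one possible guard, not the approach used in the slides: it swaps in the Haldane-Anscombe correction (adding 0.5 to every cell) whenever a zero appears. The function name `odds_ratio_ci` is hypothetical.

```python
import math


def odds_ratio_ci(a11, a12, a21, a22, z=1.96):
    """Odds ratio with a Wald 95% interval computed on the log scale."""
    cells = [a11, a12, a21, a22]
    if any(cell == 0 for cell in cells):
        # Assumption: apply the Haldane-Anscombe correction (add 0.5 to
        # every cell) so that a zero cell does not cause a division by zero.
        cells = [cell + 0.5 for cell in cells]
    a, b, c, d = cells
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Starting values of the four cells: 20, 15, 10, 25.
or_, lo, hi = odds_ratio_ci(20, 15, 10, 25)
print(f"OR = {or_:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

Raising the slider minimum to 1 would be a simpler way to keep students out of the invalid state; the correction is shown only because it keeps the full range explorable.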

Explorable Physics: Pendulum motion

Live Output
Markdown Source

```{ojs}
//| echo: false
viewof length = Inputs.range([1, 10], {
step: 0.1,
value: 5,
label: "Pendulum Length (m)"
})
viewof gravity = Inputs.range([1, 15], {
step: 0.1,
value: 9.8,
label: "Gravity (m/s²)"
})
```

1: Creates a range slider for pendulum length between 1 and 10 meters
2: Sets length increment step size to 0.1 meters
3: Sets initial length value to 5 meters
4: Adds label specifying length units
5: Creates a range slider for gravity between 1 and 15 m/s²
6: Sets gravity increment step size to 0.1 m/s²
7: Sets initial gravity value to 9.8 m/s² (Earth’s gravity)
8: Adds label specifying gravity units

```{pyodide}
#| echo: false
#| autorun: true
#| input:
#| - length
#| - gravity
import numpy as np
import matplotlib.pyplot as plt
# Calculate period
T = 2 * np.pi * np.sqrt(length/gravity)
t = np.linspace(0, T*2, 1000)
theta = 0.5 * np.sin(np.sqrt(gravity/length) * t)
plt.figure(figsize=(10, 4))
plt.plot(t, theta)
plt.title(f'Pendulum Motion (Period: {T:.2f} s)')
plt.xlabel('Time (s)')
plt.ylabel('Angle (rad)')
plt.grid(True)
plt.show()
```

1: Hides the code cell from output
2: Automatically runs the cell when inputs change
3: Specifies input parameters section
4: Declares length as first input parameter
5: Declares gravity as second input parameter
6: Imports NumPy for mathematical operations
7: Imports Matplotlib for plotting
8: Comments period calculation
9: Calculates pendulum period using formula T = 2π√(L/g)
10: Creates time array for two periods of motion
11: Calculates angle using small-angle approximation
12: Creates new figure with specified size
13: Plots time vs angle
14: Sets title showing current period
15: Labels x-axis with units
16: Labels y-axis with units
17: Adds grid to plot
18: Displays the plot
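
The two formulas in this cell can be checked numerically without plotting. This is a minimal sketch using only the standard library; the function names are hypothetical, and the 0.5 rad amplitude is taken from the cell above.

```python
import math


def pendulum_period(length, gravity):
    """Small-angle period T = 2*pi*sqrt(L/g), in seconds."""
    return 2 * math.pi * math.sqrt(length / gravity)


def pendulum_angle(t, length, gravity, amplitude=0.5):
    """Angle in radians at time t, following the same sine curve as the plot."""
    return amplitude * math.sin(math.sqrt(gravity / length) * t)


T = pendulum_period(5, 9.8)  # slider defaults: 5 m and 9.8 m/s²
quarter = pendulum_angle(T / 4, 5, 9.8)
print(f"Period: {T:.2f} s, angle at T/4: {quarter:.2f} rad")
```

Because the period depends on the square root of the length, quadrupling the length doubles the period, which is a relationship students can confirm by dragging the length slider.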

Explorable ML: k-means

Live Output
Markdown Source

```{ojs}
//| echo: false
viewof n_clusters = Inputs.range([2, 8], {
step: 1,
value: 3,
label: "Number of clusters:"
})
```

1: Creates an interactive range slider for the number of clusters between 2 and 8
2: Sets the increment step size to 1
3: Sets the initial value to 3 clusters
4: Adds a descriptive label for the input control

```{pyodide}
#| autorun: true
#| echo: false
#| input:
#| - n_clusters
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
# Generate sample data
X, _ = make_blobs(n_samples=300, centers=4, random_state=42)
# Perform k-means clustering
kmeans = KMeans(n_clusters=n_clusters, random_state=42)
labels = kmeans.fit_predict(X)
# Plot results
plt.figure(figsize=(8, 6))
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='viridis')
plt.scatter(kmeans.cluster_centers_[:, 0],
 kmeans.cluster_centers_[:, 1],
 marker='x', s=200, linewidths=3,
 color='r', label='Centroids')
plt.title(f'K-means Clustering with {n_clusters} clusters')
plt.legend()
plt.show()
```

1: Automatically runs the cell when input changes
2: Hides the code cell from output
3: Specifies input parameters section
4: Declares n_clusters as an input parameter
5: Imports make_blobs for generating synthetic data
6: Imports KMeans clustering algorithm
7: Imports Matplotlib for plotting
8: Comments data generation step
9: Creates synthetic dataset with 300 samples and 4 centers
10: Comments clustering step
11: Initializes KMeans with user-specified number of clusters
12: Fits model and predicts cluster labels
13: Comments plotting section
14: Creates new figure with specified size
15: Plots data points colored by cluster assignment
16: Begins centroid plotting - x coordinates
17: Continues centroid plotting - y coordinates
18: Sets centroid marker style and size
19: Sets centroid color and label
20: Sets plot title showing current number of clusters
21: Adds legend to plot
22: Displays the plot
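
For readers curious what `KMeans` does when the slider moves, here is a bare-bones sketch of Lloyd's algorithm (assign points, then update centroids) in plain Python. It is an illustration only: scikit-learn defaults to a smarter k-means++ initialization, and the `kmeans` function and sample points below are hypothetical.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=42):
    """Lloyd's algorithm on (x, y) tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append((sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members)))
            else:
                updated.append(centroids[j])  # keep an empty cluster in place
        if updated == centroids:  # converged: nothing moved
            break
        centroids = updated
    return centroids, labels


# Two well-separated, made-up groups of points.
points = [(0.9, 0.9), (1.0, 1.0), (1.1, 1.1),
          (9.9, 9.9), (10.0, 10.0), (10.1, 10.1)]
centroids, labels = kmeans(points, k=2)
print(sorted(centroids), labels)
```

Lloyd's algorithm only finds a local optimum, so less tidy data can give different clusters for different seeds; that is one reason the cell above fixes `random_state`.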

Explorable Data Sets

Live Output
Markdown Source

```{ojs}
//| echo: false
viewof selected_species = Inputs.select(
 ["Adelie", "Gentoo", "Chinstrap", "All"],
 { value: "All", label: "Highlight Species:" }
)
```

1: Hides the code cell from output
2: Creates a dropdown select input for penguin species
3: Lists available species options including “All”
4: Sets default value to “All” and adds descriptive label

```{pyodide}
#| fig-height: 5
#| echo: false
#| autorun: true
#| input:
#| - selected_species
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Load penguins data
penguins = pd.read_csv('https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/inst/extdata/penguins.csv')
penguins = penguins.dropna()
# Create plot
plt.figure(figsize=(10, 6))
# Set alpha (opacity) based on selection
penguins['alpha'] = 0.2 # default: low opacity (faded)
if selected_species == "All":
    penguins['alpha'] = 0.7 # all visible
else:
    penguins.loc[penguins['species'] == selected_species, 'alpha'] = 0.9 # highlight selected
# Create scatter plot
for species in penguins['species'].unique():
    mask = penguins['species'] == species
    plt.scatter(
        penguins.loc[mask, 'bill_length_mm'],
        penguins.loc[mask, 'bill_depth_mm'],
        label=species,
        alpha=penguins.loc[mask, 'alpha'],
        s=100
    )
plt.title('Palmer Penguins: Bill Dimensions by Species')
plt.xlabel('Bill Length (mm)')
plt.ylabel('Bill Depth (mm)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
```

1: Sets figure height to 5 inches
2: Hides code from output
3: Automatically runs when input changes
4: Specifies input parameters section
5: Declares selected_species as input
6: Imports pandas for data handling
7: Imports seaborn for statistical visualization
8: Imports matplotlib for plotting
9: Comments data loading section
10: Loads penguins dataset
11: Removes rows with missing values
12: Comments plot creation section
13: Creates new figure with specified size
14: Comments transparency section
15: Sets a low default opacity (alpha 0.2) for all points
16: Checks if all species are selected
17: Sets a moderate opacity (alpha 0.7) for all points if “All” is selected
18: Starts else block for specific species selection
19: Sets a high opacity (alpha 0.9) for the selected species only
20: Comments scatter plot section
21: Iterates through unique species
22: Creates boolean mask for current species
23: Begins scatter plot for current species
24: Sets x-coordinates (bill length)
25: Sets y-coordinates (bill depth)
26: Labels points by species
27: Sets transparency based on selection
28: Sets point size
29: Sets plot title
30: Labels x-axis with units
31: Labels y-axis with units
32: Adds legend
33: Adds light grid
34: Displays the plot

Explorable Data Clusters

Live Output
Markdown Source

```{ojs}
//| echo: false
viewof feature1 = Inputs.select(
 ['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g'],
 { value: 'bill_length_mm', label: 'X-axis feature:' }
)
viewof feature2 = Inputs.select(
 ['bill_length_mm', 'bill_depth_mm', 'flipper_length_mm', 'body_mass_g'],
 { value: 'bill_depth_mm', label: 'Y-axis feature:' }
)
viewof n_clusters_penguins = Inputs.range([2, 5], {
step: 1,
value: 3,
label: "Number of clusters:"
})
```

1: Hides the code cell from output
2: Creates first dropdown for X-axis feature selection
3: Lists available penguin measurements for X-axis
4: Sets default X-axis to bill length and adds label
5: Creates second dropdown for Y-axis feature selection
6: Lists available penguin measurements for Y-axis
7: Sets default Y-axis to bill depth and adds label
8: Creates range slider for number of clusters between 2 and 5
9: Sets increment step size to 1
10: Sets initial number of clusters to 3
11: Adds label for cluster selection

```{pyodide}
#| autorun: true
#| echo: false
#| input:
#| - feature1
#| - feature2
#| - n_clusters_penguins
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
# Load and prepare data
penguins = pd.read_csv('https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/inst/extdata/penguins.csv')
penguins = penguins.dropna()
# Select and scale features
X = penguins[[feature1, feature2]]
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
# Perform clustering
kmeans = KMeans(n_clusters=n_clusters_penguins, random_state=42)
clusters = kmeans.fit_predict(X_scaled)
# Create plot
plt.figure(figsize=(10, 6))
scatter = plt.scatter(penguins[feature1], penguins[feature2],
 c=clusters, cmap='viridis',
 alpha=0.6, s=100)
# Add cluster centers
centers_orig = scaler.inverse_transform(kmeans.cluster_centers_)
plt.scatter(centers_orig[:, 0], centers_orig[:, 1],
 c='red', marker='x', s=200, linewidths=3,
 label='Cluster Centers')
plt.title(f'Palmer Penguins Clusters\n{feature1} vs {feature2}')
plt.xlabel(feature1.replace('_', ' ').title())
plt.ylabel(feature2.replace('_', ' ').title())
plt.legend(*scatter.legend_elements(), title="Clusters")
plt.grid(True, alpha=0.3)
plt.show()
# Print cluster sizes
cluster_sizes = pd.Series(clusters).value_counts().sort_index()
print("\nCluster sizes:")
for i, size in enumerate(cluster_sizes):
    print(f"Cluster {i}: {size} penguins")
```

1: Automatically runs when inputs change
2: Hides code from output
3: Specifies input parameters section
4: Declares first feature as input
5: Declares second feature as input
6: Declares number of clusters as input
7: Imports pandas for data handling
8: Imports matplotlib for plotting
9: Imports KMeans clustering algorithm
10: Imports StandardScaler for feature scaling
11: Comments data loading section
12: Loads penguins dataset
13: Removes rows with missing values
14: Comments feature preparation section
15: Selects user-specified features
16: Initializes scaler object
17: Scales selected features
18: Comments clustering section
19: Initializes KMeans with user-specified clusters
20: Performs clustering and gets labels
21: Comments plot creation section
22: Creates new figure with specified size
23: Creates scatter plot of selected features
24: Colors points by cluster and sets colormap
25: Sets transparency and point size
26: Comments center plotting section
27: Transforms cluster centers back to original scale
28: Plots cluster centers
29: Sets center marker style and size
30: Adds label for centers
31: Sets plot title with feature names
32: Sets x-axis label
33: Sets y-axis label
34: Adds cluster legend
35: Adds light grid
36: Displays plot
37: Comments cluster size section
38: Calculates size of each cluster
39: Prints header for cluster sizes
40: Iterates through clusters
41: Prints size of each cluster

Explorable Structured Programming

Live Output
Markdown Source

```{ojs}
viewof operation = Inputs.select(
 ['square', 'cube', 'double', 'half'],
 {value: 'square', label: "Operation: "}
)
viewof range_end = Inputs.range([1, 20], {
step: 1,
value: 5,
label: "Range end: "
})
```

1: Creates a dropdown select input for mathematical operations
2: Defines the list of available operations
3: Sets initial value to ‘square’ and adds a label
4: Creates a range slider for the end value between 1 and 20
5: Sets the increment step size to 1
6: Sets the initial value to 5
7: Adds a descriptive label for the range input

```{pyodide}
#| autorun: true
#| edit: false
#| echo: false
#|
#| input:
#| - operation
#| - range_end
operations = {
    'square': lambda x: x**2,
    'cube': lambda x: x**3,
    'double': lambda x: x*2,
    'half': lambda x: x/2
}
result = [operations[operation](x) for x in range(range_end)]
print(f"Python code: [operations['{operation}'](x) for x in range({range_end})]")
print(f"Result: {result}")
```

1: Automatically runs the cell when inputs change
2: Prevents editing of the code cell
3: Hides the code cell from output
4: Empty line in YAML header
5: Specifies input parameters section
6: Declares operation as first input parameter
7: Declares range_end as second input parameter
8: Creates dictionary of mathematical operations
9: Defines square operation as lambda function
10: Defines cube operation as lambda function
11: Defines double operation as lambda function
12: Defines half operation as lambda function
13: Applies selected operation to range using list comprehension
14: Prints the Python code being executed
15: Prints the resulting list after applying the operation

Explorable Unstructured Programming

Live Output
Markdown Source

```{pyodide}
# Let's classify the weather
temperature = 101
if temperature > 90:
    print('It is hot outside')
elif temperature > 70:
    print('It is warm outside')
elif temperature > 50:
    print('It is cool outside')
else:
    print('Wear a parka')
```

1: Comments describing the purpose of the code
2: Initializes temperature variable with value 101
3: First condition: checks if temperature is above 90
4: Prints hot weather message if first condition is true
5: Second condition: checks if temperature is above 70 (if first condition was false)
6: Prints warm weather message if second condition is true
7: Third condition: checks if temperature is above 50 (if first and second conditions were false)
8: Prints cool weather message if third condition is true
9: Else block for when all conditions are false (temperature <= 50)
10: Prints cold weather message if all conditions are false

Exercising the Mind

Live Output
Markdown Source

Filter the penguins dataset to show only Gentoo penguins with body mass greater than 5000g:

```{pyodide}
#| setup: true
#| exercise: ex_penguins
#| echo: false
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
# Load and prepare data
penguins = pd.read_csv('https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/inst/extdata/penguins.csv')
penguins = penguins.dropna()
```

1: Marks this as a setup cell to run before the exercise
2: Associates this cell with the ‘ex_penguins’ exercise
3: Hides the code cell from output
4: Imports pandas for data manipulation
5: Imports numpy for numerical operations
6: Imports StandardScaler from scikit-learn
7: Comments data loading step
8: Loads penguins dataset from URL
9: Removes rows with missing values

```{pyodide}
#| exercise: ex_penguins
filtered_df = penguins[______]
filtered_df
```

1: Associates this cell with the ‘ex_penguins’ exercise
2: Creates placeholder for filtering conditions
3: Displays the filtered dataframe

```{pyodide}
#| exercise: ex_penguins
#| check: true
correct_answer = penguins[(penguins['species'] == 'Gentoo') & (penguins['body_mass_g'] > 5000)]
if len(result) == len(correct_answer) and all(result.index == correct_answer.index):
    feedback = {
        "correct": True,
        "message": "Perfect! You correctly filtered for Gentoo penguins over 5000g."
    }
else:
    feedback = {
        "correct": False,
        "message": "Not quite. Make sure you're using both conditions: species=='Gentoo' and body_mass_g>5000"
    }
feedback
```

1: Associates this cell with the ‘ex_penguins’ exercise
2: Marks this as a check cell for verification
3: Creates correct answer using both filtering conditions
4: Checks if result matches correct answer in length and indices
5: Creates feedback dictionary for correct answer
6: Sets correct status to True
7: Provides success message
8: Starts else block for incorrect answers
9: Creates feedback dictionary for incorrect answer
10: Sets correct status to False
11: Provides hint message for incorrect answer
12: Returns feedback dictionary

Explorable Design Patterns

Progressive Disclosure
- Start simple
- Add complexity gradually
- Layer concepts
Direct Manipulation
- Interactive variables
- Real-time updates
- Visual feedback
Multiple Representations
- Code
- Visualizations
- Numerical output

Best Practices

Clear Relationships
- Show how variables affect outcomes
- Highlight connections
- Demonstrate causality
Bounded Exploration
- Set meaningful limits
- Prevent invalid states
- Guide discovery
Multiple Entry Points
- Different learning styles
- Various complexity levels
- Multiple paths to understanding

Demos

We have a few more demos to show you!
- Course Webpage with Slides
- Standalone document
Source code available on GitHub.

Implementation Tips

Installation

Install Quarto: https://quarto.org

Install Quarto Live & Drop Extensions

# Install Quarto Extensions
quarto add r-wasm/quarto-live
quarto add r-wasm/quarto-drop

Quarto extensions are installed only in the current project scope. For each new project, you will need to install the extensions again.

Document Setup

The pyodide section is used to specify Python packages to load in the WebAssembly environment. This ensures that the required packages are available for Python code execution before the code cells are unlocked for students to explore.

---
format: 
    live-revealjs:
        scrollable: true
        smaller: true
        pyodide:
            packages: ['numpy', 'matplotlib']
---

Code Cells

Tangle Integration

```{ojs}
import {Tangle} from "@mbostock/tangle"
viewof var = Inputs.range([min, max], {
    step: 0.1,
    value: initial,
    label: "Variable:"
})
varTangle = Inputs.bind(Tangle({min: min, max: max, step: 0.1}), viewof var)
```

Python Integration

```{pyodide}
#| input:
#|   - var
# Your Python code using var
```

And that in-slide IDE …

The quarto-drop extension uses the drop key to set options for the in-slide IDE that students use to interact with code, while quarto-live uses the pyodide key to configure the WebAssembly environment.

Document Header Setup

---
format: 
    live-revealjs:
        scrollable: true
        smaller: true
        drop: 
            engine: pyodide
            packages: ['matplotlib', 'numpy', 'pandas', 'seaborn']
        pyodide:
            packages: ['matplotlib', 'numpy', 'pandas', 'seaborn']
revealjs-plugins:
- drop
---

Concluding

Some Limitations

Technical Constraints
- Not all Python/R packages work in WebAssembly
- Browser memory limits affect large computations
- Initial load times can be significant
- File system access is restricted
Development
- Debugging is more challenging
- Error messages may be less helpful
- Package versions must be WebAssembly-compatible
Interactive Elements
- Limited widget support compared to Jupyter

Resources

Thank You!

Questions?

Keep in touch:

GitHub: @coatless
BlueSky: @coatless.bsky.social
Mastodon: @coatless@mastodon.social
Twitter/X: @axiomsofxyz
LinkedIn: jamesbalamuta