Hi, I’m Javier

Career & Skills Progression


---
config:
  look: handDrawn
---

flowchart LR
  id[**FINANCIAL MODELING**]

  classDef bluedrawn fill:#c0e5fe,stroke:#612200,color:#000000;
  class id bluedrawn;

---
config:
  look: handDrawn
---

flowchart LR
  id2[**DATA SCIENCE**]

  classDef peachdrawn fill:#ffddd6,stroke:#612200,color:#000000;
  class id2 peachdrawn;

---
config:
  look: handDrawn
---

flowchart LR
  A[EY] --> B[PG&E]
  B --> C[KPMG]
  C --> D{UCI's<br>MSBA<br>Program}
  D --> E[Centene]
  E --> F[Bloomreach]
  F --> G[Centene]
  
  classDef bluedrawn fill:#c0e5fe,stroke:#612200,color:#000000;
  classDef peachdrawn fill:#ffddd6,stroke:#612200,color:#000000;
  classDef whitedrawn fill:#ffffff,stroke:#612200,color:#000000;
  class A,B,C bluedrawn;
  class D whitedrawn;
  class E,F,G peachdrawn;

---
config:
  look: handDrawn
---

flowchart LR
  A[Excel, PowerPoint, Access, SQL] --> B{UCI's<br>MSBA<br>Program}
  B --> C[Automating Manual Excel Work]
  C --> D[Developing R Packages & Web Apps]
  B --> E[Code-based Analytics]
  E --> F[Machine Learning & AI]
  F --> G[R, Linux, git, Posit tools, Databricks]

  classDef bluedrawn fill:#c0e5fe,stroke:#612200,color:#000000;
  classDef peachdrawn fill:#ffddd6,stroke:#612200,color:#000000;
  classDef whitedrawn fill:#ffffff,stroke:#612200,color:#000000;
  class A,C bluedrawn;
  class D,E,F,G peachdrawn;
  class B whitedrawn;

Standing Out with Shiny


  • Shiny is a code-based web app framework for Python and R

  • Easy-ish-ly build web apps with no formal web dev experience

  • Use reactive programming that allows for dynamic, real-time updates to the app based on user input

  • Extend your Shiny apps with HTML widgets, real-time data polling, JavaScript, CSS, and more

Image: © Analythium
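To make the reactive idea concrete, here is a minimal sketch of an R Shiny app (a hypothetical slider-and-plot example, not the demo shown later): the plot re-renders automatically whenever the slider input changes.

library(shiny)

# UI: one slider input and one plot output
ui <- fluidPage(
  sliderInput("n", "Number of points", min = 10, max = 500, value = 100),
  plotOutput("scatter")
)

# Server: renderPlot() re-runs reactively whenever input$n changes
server <- function(input, output) {
  output$scatter <- renderPlot({
    plot(rnorm(input$n), rnorm(input$n))
  })
}

# Run the app
shinyApp(ui, server)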

Shiny Demo as a Resume Accessory


  • Most corporate dashboards feel clunky…

  • They’re laggy and very boxy, e.g., Power BI, Tableau, MicroStrategy

  • Dashboards typically struggle with fast, real-time calculations, search, and data manipulation capabilities

  • For hiring teams, you can use Shiny as a resume accessory to demonstrate how easy it is to develop and deploy a web app styled with the company’s brand aesthetics

Shiny Demo as a Resume Accessory

Bloomreach’s website:

Image: © 2025 Bloomreach, Inc.

Shiny Demo as a Resume Accessory

Javier’s Shiny demo:

Pro Tip! 💡 Shiny Assistant


  • The new Shiny Assistant… 🤯… An LLM-powered tool that builds functional Python and R Shiny apps in minutes!

  • Shiny Assistant builds apps using Shinylive, a web interface made possible by WebAssembly (or “wasm”)

  • The LLM’s training data is a few months old as of this writing so it won’t know about the latest Shiny features

My Professional Data Science Workflow


My Professional Data Science Toolkit


Image: © Posit Software, PBC.

My Professional Data Science Toolkit

Image: © Posit Software, PBC.

My Professional Data Science Toolkit

Typical Python Tools for Data Science

Typical Python Tools for Data Science

In Review


  • RStudio IDE’s “always on” panes welcomed R users seeking a data-analysis-first experience

  • For Python users, RStudio felt too R-centric and other tools worked just fine including VS Code, Jupyter Notebooks, PyCharm, etc.

  • There are many programming languages that can be used for data analysis, but Python and R are the de facto standards for data science

So, why isn’t there

one tool

to rule them all!?


Introducing: Positron™

Positron, it looks familiar!

About Positron

  • What is Positron? From Posit’s getting started docs:
    • A next-generation data science IDE built by Posit PBC
    • An extensible, polyglot tool for writing code and exploring data
    • A familiar environment for reproducible authoring and publishing
  • Positron is a tailor-made IDE for data science built on top of Code OSS that can be used with any combination of programming languages

VS Code OSS w/ RStudio panes!

Prerequisites


  • Windows prereqs:

    • Ensure the latest Visual C++ Redistributable is installed

    • If you’re an R package developer, note that Positron doesn’t currently bundle Rtools.

    • For reference, Rtools contains the compilers required to build R packages from source on Windows

Prerequisites


  • Python prereqs:

    • The Posit team recommends pyenv to manage Python versions, and Python versions from 3.8 to 3.12 are actively supported on Positron

    • For Linux users, install the SQLite system libraries (sqlite-devel or libsqlite3-dev) ahead of time so pyenv can build your Python version(s) of choice

    • Positron communicates with Python via the ipykernel

    • If you’re using venv or conda to manage your Python projects, you can install ipykernel manually as follows: python3 -m pip install ipykernel

Prerequisites


  • R prereqs:

    • Positron requires R 4.2 or higher. To install R, follow the instructions for your OS at https://cloud.r-project.org

    • If you’d like to have multiple R installations, rig is a great tool for macOS, Windows, and Linux that works well with Positron

Interpreter Selector


  • When Positron starts for the first time in a new workspace (or project directory), it will start Python and/or R depending on your workspace characteristics

  • In subsequent runs, Positron will start the same interpreter(s) that were running the last time you used that workspace

  • You can start, stop, and switch interpreters from the interpreter selector

Key Bindings & Command Palette

  • Key bindings trigger actions by pressing a combination of keys
  • The key binding Cmd/Ctrl+Shift+P will bring up Positron’s Command Palette
  • This lets you search and execute actions without needing to remember the key binding


Global Keyboard Shortcuts


  • Cmd/Ctrl+Enter: Run the selected code in the editor; if no code is selected, run the current statement
  • Cmd/Ctrl+Shift+0: Restart the interpreter currently open in the Console
  • Cmd/Ctrl+Shift+Enter: Run the file open in the editor (using e.g. source() or %run)
  • F1: Show contextual help for the topic under the cursor
  • Cmd/Ctrl+K, Cmd/Ctrl+R: Show contextual help for the topic under the cursor (alternate binding)
  • Cmd/Ctrl+K, F: Focus the Console
  • Ctrl+L: Clear the Console

R Keyboard Shortcuts


  • Cmd/Ctrl+Shift+M: Insert the pipe operator (|> or %>%)
  • Alt+-: Insert the assignment operator (<-)
  • Cmd/Ctrl+Shift+L: Load the current R package, if any
  • Cmd/Ctrl+Shift+B: Build and install the current R package, if any
  • Cmd/Ctrl+Shift+T: Test the current R package, if any
  • Cmd/Ctrl+Shift+E: Check the current R package, if any
  • Cmd/Ctrl+Shift+D: Document the current R package, if any

RStudio Keymap


If you’re an experienced RStudio user, you can easily set the RStudio keybindings in the Positron settings:

  • Open Positron’s settings in the UI or the keystroke Cmd/Ctrl+,
  • Search for “keymap”, or navigate to Extensions > RStudio Keymap
  • Check the “Enable RStudio key mappings for Positron” checkbox

Data Explorer Overview


  • The new Data Explorer allows for interactive exploration of various types of dataframes using Python (pandas, polars) or R (data.frame, tibble, data.table, polars)

  • The Data Explorer has three primary components
    • Data grid: Spreadsheet-like display of the data with sorting
    • Summary panel: Column name, type and missing data percentage for each column
    • Filter bar: Ephemeral filters for specific columns

Data Explorer Overview


  • To use, navigate to the Variables Pane and click on the Data Explorer icon:

Data Explorer Overview

Data Explorer’s Data Grid


  • The data grid is the primary display and scales efficiently with large in-memory datasets up to millions of rows or columns

  • At the top right of each column, there is a context menu that controls sorting and filtering in the selected column

Data Explorer’s Summary Panel


  • Displays a vertical scrolling list of all columns in the data

  • Each entry shows a sparkline histogram of that column’s data, the amount of missing data, and some summary statistics about the column

  • Double clicking on a column name will bring the column into focus in the data grid

Data Explorer’s Filter Bar


  • The filter bar has controls to add filters, show/hide existing filters, or clear filters

  • Clicking the + button quickly adds a new filter

  • The status bar at the bottom of the Data Explorer also displays the percentage and number of remaining rows relative to the original total after applying a filter

Connections Pane


  • Explore database connections established with ODBC drivers or packages

  • For Python users, the sqlite3 and SQLAlchemy packages are supported

  • For R users, packages including odbc, sparklyr, and bigrquery are supported
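As a rough sketch (the driver, server, and credentials below are placeholders, not a real configuration), an odbc connection opened through DBI is the kind of connection the Connections Pane lets you browse:

library(DBI)
library(odbc)

# Placeholder connection details -- swap in your own driver/DSN
con <- dbConnect(
  odbc(),
  Driver   = "PostgreSQL",
  Server   = "localhost",
  Database = "analytics",
  UID      = "me",
  PWD      = Sys.getenv("DB_PASSWORD")
)

# Once connected, the database's schemas and tables can be explored
# from the Connections Pane (or programmatically, as below)
dbListTables(con)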

Interactive Apps


  • Instead of running apps from a Terminal, Positron lets you run supported apps by clicking the Play button in Editor Actions

  • Supported apps include the following: Shiny, Dash, FastAPI, Flask, Gradio, and Streamlit

  • You can also start apps in Debug mode

Learn More about Positron

Special Bonus for R Users 😘


  • Ark (“an R kernel”) was created to serve as the interface between R and the Positron IDE; it is a Jupyter kernel, a Language Server Protocol (“LSP”) server, and a Debug Adapter Protocol (“DAP”) server

  • It is compatible with all frontends implementing the Jupyter protocol and is bundled with Positron

  • Ark’s LSP was written with a Rust backend to future-proof its ability to perform sophisticated static analysis of R code, and its DAP allows for advanced step-debugging

So, lesson learned…
Positron = Amazing.

But next…

let’s talk about

ULTRA

FAST

ETL 💨

Arrow, DuckDB, and Polars

About these Frameworks


  • Apache Arrow: An in-memory columnar format that lets data move quickly between systems and languages (like Python and R) and speeds up analytical tasks

  • DuckDB: An in-process SQL database management system focused on analytical query processing

  • Polars: A modern DataFrame library with a multi-threaded query engine written in Rust that uses the Apache Arrow memory model under the hood

  • Apache Parquet: A popular columnar storage format that compresses data to save space and is easily read by Arrow, DuckDB, and Polars for efficient data querying
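To see how these pieces fit together, here is a minimal sketch in R: write a data frame to Parquet with arrow, then query that same file directly with DuckDB’s SQL engine (the file name is just an example).

library(arrow)
library(duckdb)

# Write a data frame to a columnar Parquet file with arrow
write_parquet(mtcars, "mtcars.parquet")

# Read it back into R
df <- read_parquet("mtcars.parquet")

# Query the same Parquet file directly from DuckDB, no import step needed
con <- dbConnect(duckdb())
dbGetQuery(con, "SELECT cyl, AVG(mpg) AS mean_mpg FROM 'mtcars.parquet' GROUP BY cyl")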

Benefits & Use Cases


  • These frameworks provide convenient methods for reading and writing columnar file formats

  • If you’re working with a collection of .parquet files, the Arrow packages for C++, Python, and R support reading entire directories of files and treating them as a single dataset

  • Allows zero-copy data sharing inside a single process without any build-time or link-time dependency requirements. This allows, for example, R users to access Python pyarrow-based projects using R’s reticulate package.

  • Apache Spark uses Arrow as a data interchange format, and both Python’s PySpark and R’s sparklyr take advantage of Arrow for significant performance gains when transferring data

Installing Arrow, DuckDB, and Polars

Python

# Arrow env vars for AWS S3 & GCP support
import os
os.environ["LIBARROW_MINIMAL"] = "FALSE"
os.environ["ARROW_S3"] = "ON"
os.environ["ARROW_GCS"] = "ON"

# Install packages
!pip install pyarrow duckdb polars

R

# Arrow env vars for AWS S3 & GCP support
Sys.setenv(LIBARROW_MINIMAL = "false")
Sys.setenv(ARROW_S3 = "ON")
Sys.setenv(ARROW_GCS = "ON")

# Enable Polars install with pre-built
# Rust library binary 
Sys.setenv(NOT_CRAN = "true")

# Install packages
install.packages(c("arrow", "duckdb"))
install.packages("polars", repos = "https://community.r-multiverse.org")
install.packages("polarssql", repos = "https://rpolars.r-universe.dev")

Back to fast… How fast?

  • My personal laptop is a MacBook Air with 24 GB RAM
  • To test Arrow’s capabilities, I read a 40 GB dataset with more than 1.1 billion rows and 24 columns
  • The .parquet dataset was partitioned by Year and Month (120 files)
  • Note that my laptop could not load this dataset entirely into memory as a data.frame or tibble, given its limited RAM

Reading Remote Parquet Data

Python

import pyarrow.dataset as ds
import pyarrow.compute as pc
import os

# Set path for download
# NYC Taxi Data download (40 GB)
data_path = os.path.join("data", "nyc-taxi")

# Open connection to the remote dataset
nyc_dataset = ds.dataset(
  "s3://voltrondata-labs-datasets/nyc-taxi", 
  format = "parquet"
)

# Filter for years 2012 - 2021
filtered_table = nyc_dataset.to_table(
    filter = ds.field("year").isin(list(range(2012, 2022)))
)

# Write out the filtered data, partitioned by year and month
ds.write_dataset(
    filtered_table,
    base_dir = data_path,
    format = "parquet",
    partitioning = ["year", "month"]
)

R

library(here)
library(arrow)
library(dplyr)

# Set path for download
# NYC Taxi Data download (40 GB)
data_path <- here::here("data/nyc-taxi")

# Open connection to the remote dataset,
# filter for years 2012 - 2021, and
# write out the filtered data,
# partitioned by year and month
open_dataset("s3://voltrondata-labs-datasets/nyc-taxi") |>
  filter(year %in% 2012:2021) |> 
  write_dataset(data_path, partitioning = c("year", "month"))

Benchmarking Read Times

R

library(ggplot2)
library(bench)

# Benchmark Read Times
bnch <- bench::mark(
  min_iterations = 1000,
  arrow = arrow::open_dataset(here::here("data/nyc-taxi"))
)

autoplot(bnch)

Benchmarking Read Times

  • Once “locally” available, the 40 GB, 1.1 billion row dataset (benchmarked 1,000 times) can be read on average in 17 ms, since open_dataset() scans only the dataset’s metadata rather than loading the files into memory

Benchmarking Data Manipulation

  • Using R’s dplyr, we can perform ETL on an arrow table and collect() the results back to a df
# Open Arrow connection to dataset (40 GB)
nyc_taxi <- open_dataset(here::here("data/nyc-taxi"))

# Benchmark dplyr pipeline
bnch <- bench::mark(
  min_iterations = 100,
  arrow = nyc_taxi |> 
    filter(payment_type %in% c("Credit card", "Cash")) |> 
    group_by(payment_type) |> 
    summarise(mean_fare = mean(fare_amount, na.rm = T),
              mean_tip = mean(tip_amount, na.rm = T)) |> 
    ungroup() |> 
    dplyr::collect()
)

autoplot(bnch)

Benchmarking ETL

  • The results show that the ETL pipeline summarized 1.1 billion rows in 9.5 s on average, benchmarked over 100 iterations

Note for R’s Tidyverse Users

  • Many functions from the tidyverse collections of packages have 1:1 compatibility with Arrow tables
  • However, sometimes you’ll encounter a breaking point
  • Take this stringr::str_replace_na() example:

nyc_taxi |> 
  mutate(vendor_name = str_replace_na(vendor_name, "No vendor"))
#> Error: Expression str_replace_na(vendor_name, "No vendor") 
#> not supported in Arrow
  • Out of the box, Arrow does not support stringr::str_replace_na()

but wait!

a solution exists

User Defined Functions

  • Arrow allows users to create and register User Defined Functions (or “UDFs”) to the Arrow engine
  • Almost any function can be made Arrow-friendly by registering custom UDFs to the Arrow kernel
  • Let’s learn how to register stringr::str_replace_na() with the Arrow kernel

Registering UDFs

  • First, run arrow::schema() on your Arrow table to review the field name and data type pairs
arrow::schema(nyc_taxi)
#> Schema
#> vendor_name: string
#> pickup_datetime: timestamp[ms]
#> dropoff_datetime: timestamp[ms]
#> passenger_count: int64
#> trip_distance: double
#> pickup_longitude: double
#> pickup_latitude: double
#> ...
  • Since I want to mutate the vendor_name field, I know I’ll be working with an Arrow string() data type

Registering UDFs

  • Next, we’ll use arrow::register_scalar_function()
  • Name your UDF replace_arrow_nas
  • If you’re registering a tidyverse function, set auto_convert = TRUE
arrow::register_scalar_function(
  name = "replace_arrow_nas",
  # Note: The first argument must always be context
  function(context, x, replacement) {
    stringr::str_replace_na(x, replacement)
  },
  in_type = schema(
    x = string(),
    replacement = string()
  ),
  out_type = string(),
  auto_convert = TRUE
)

Trying the UDF

  • Let’s see if the registered replace_arrow_nas() Arrow UDF works…
nyc_taxi |> 
  mutate(vendor_name = replace_arrow_nas(vendor_name, "No vendor")) |> 
  distinct(vendor_name) |> 
  arrange(vendor_name) |> 
  collect()

#> # A tibble: 3 × 1
#>   vendor_name
#>   <chr>      
#> 1 CMT
#> 2 No vendor
#> 3 VTS
  • Success! It works!

DuckDB



  • DuckDB Labs created an in-process database management system, similar to a SQLite database engine but optimized for analytical query processing and larger-than-memory analysis

  • The duckdb package for Python offers a state-of-the-art optimizer that pushes down filters and projections directly into Arrow scans

  • As a result, only relevant columns and partitions are read, which significantly accelerates query execution

  • DuckDB comes with core and community extensions that expand the framework to, e.g., scan remote Databricks Unity Catalog tables without needing to spin up a Spark cluster
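As a small, hedged example of the extension mechanism (using the core httpfs extension; the Unity Catalog integration is a separate extension), extensions are installed and loaded with plain SQL:

library(duckdb)

con <- dbConnect(duckdb())

# Install and load the core httpfs extension, which lets DuckDB read
# Parquet/CSV files over HTTP(S) and from S3
dbExecute(con, "INSTALL httpfs;")
dbExecute(con, "LOAD httpfs;")

# Example (placeholder URL): query a remote Parquet file directly
# dbGetQuery(con, "SELECT COUNT(*) FROM read_parquet('https://example.com/data.parquet')")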

DuckDB Usage Basics

Python

import duckdb

# Connect to an in-memory DuckDB instance and scan
# the Parquet data set to make a temp View
con = duckdb.connect()

con.execute("""
    CREATE VIEW nyc_taxi AS
    SELECT * FROM read_parquet('data/nyc-taxi/**/*.parquet', hive_partitioning=true)
""")

# Run your SQL query
df = con.execute("""
    SELECT 
        payment_type,
        AVG(fare_amount) AS mean_fare,
        AVG(tip_amount)  AS mean_tip
    FROM nyc_taxi
    WHERE payment_type IN ('Credit card', 'Cash')
    GROUP BY payment_type
""").df()

print(df)

R

library(duckdb)

# Connect to an in-memory DuckDB instance and scan
# the Parquet data set to make a temp View
con <- dbConnect(duckdb())

dbExecute(con, "
  CREATE VIEW nyc_taxi AS 
  SELECT * FROM read_parquet('data/nyc-taxi/**/*.parquet', hive_partitioning = true)"
)

# Run your SQL query
df <- dbGetQuery(con, "
    SELECT 
        payment_type,
        AVG(fare_amount) AS mean_fare,
        AVG(tip_amount)  AS mean_tip
    FROM nyc_taxi
    WHERE payment_type IN ('Credit card', 'Cash')
    GROUP BY payment_type
")

print(df)

DuckDB Streaming with Python

# DuckDB via Python
import duckdb
import pyarrow as pa
import pyarrow.dataset as ds

# --- DuckDB streaming approach ---
# Open dataset using the year, month folder partition
nyc = ds.dataset('nyc-taxi/', partitioning=["year", "month"])

# Get database connection
con = duckdb.connect()

# Run a query that selects part of the data
query = con.execute("SELECT total_amount, passenger_count, year FROM nyc WHERE total_amount > 100 AND year > 2014")

# Create a Record Batch Reader from the query result.
# fetch_record_batch() also accepts a parameter for the desired chunk size.
record_batch_reader = query.fetch_record_batch()

# Stream through the result one batch at a time
for chunk in record_batch_reader:
    pass  # each chunk is a pyarrow RecordBatch; process it here

# --- Equivalent pandas approach, shown for comparison ---
# We must exclude one of the columns of the NYC dataset due to an unimplemented cast in Arrow
working_columns = ["vendor_id","pickup_at","dropoff_at","passenger_count","trip_distance","pickup_longitude",
    "pickup_latitude","store_and_fwd_flag","dropoff_longitude","dropoff_latitude","payment_type",
    "fare_amount","extra","mta_tax","tip_amount","tolls_amount","total_amount","year", "month"]

# Open dataset using the year, month folder partition
nyc_dataset = ds.dataset('nyc-taxi/', partitioning=["year", "month"])
# Generate a scanner to skip the problematic column
dataset_scanner = nyc_dataset.scanner(columns=working_columns)

# Materialize the dataset to an Arrow Table
nyc_table = dataset_scanner.to_table()

# Generate a DataFrame from the Arrow Table
nyc_df = nyc_table.to_pandas()

# Apply the filter
filtered_df = nyc_df[
    (nyc_df.total_amount > 100) &
    (nyc_df.year > 2014)]

# Apply the projection
res = filtered_df[["total_amount", "passenger_count", "year"]]

# Transform the result back to an Arrow Table
new_table = pa.Table.from_pandas(res)

DuckDB Streaming Speed Bump

  • The Python pandas runtime was 146.91 seconds
  • Python’s duckdb runtime was 0.05 seconds
  • Data manipulation processing time was 2,900x faster with duckdb vs pandas

For R users: duckplyr

  • duckplyr, from DuckDB Labs, offers 1:1 compatibility with dplyr functions but there are some caveats:
    • Factor columns, nested lists, and nested tibbles are not yet supported
    • To group data, use dplyr’s summarize() function with the .by argument, as group_by() is not supported
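For example, here is a minimal sketch (a tiny stand-in data frame; assuming duckplyr’s drop-in, DuckDB-backed verbs are active) of grouping with summarize()’s .by argument instead of group_by():

library(duckplyr)  # drop-in, DuckDB-backed versions of dplyr verbs

# Tiny stand-in for the NYC taxi data
trips <- data.frame(
  payment_type = c("Cash", "Credit card", "Cash", "Credit card"),
  fare_amount  = c(10, 25, 8, 30),
  tip_amount   = c(0, 5, 0, 6)
)

# Group via summarise(.by = ...) rather than group_by() |> summarise()
trips |>
  summarise(
    mean_fare = mean(fare_amount, na.rm = TRUE),
    mean_tip  = mean(tip_amount,  na.rm = TRUE),
    .by = payment_type
  )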

Polars

Polars

  • While Arrow is primarily a memory format and data transfer standard, Polars is specifically designed for data analysis with lazy evaluation, DataFrame manipulations, and a consistent API across languages

  • DuckDB is a full database system that excels at SQL queries and can integrate with both Arrow and Polars, but Polars focuses on in-memory processing and programmatic data manipulation
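For a flavor of the lazy, programmatic style, here is a rough sketch using the R polars package’s pl$ syntax (method names follow recent polars releases and may differ slightly across versions):

library(polars)

# Lazily scan the partitioned Parquet files; nothing is read yet
lazy_nyc <- pl$scan_parquet("data/nyc-taxi/**/*.parquet")

# Build the query plan, then collect() to execute it
result <- lazy_nyc$
  filter(pl$col("payment_type")$is_in(c("Credit card", "Cash")))$
  group_by("payment_type")$
  agg(
    pl$col("fare_amount")$mean()$alias("mean_fare"),
    pl$col("tip_amount")$mean()$alias("mean_tip")
  )$
  collect()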

Benchmarking Analysis

  • Arrow and DuckDB really stood out for fast manipulation of data using dplyr syntax
  • The SQL below shows the basic transformation applied to the data with each of dplyr, arrow, duckdb, duckplyr, and polars
SELECT 
    payment_type,
    AVG(fare_amount) AS mean_fare,
    AVG(tip_amount)  AS mean_tip
FROM nyc_taxi
WHERE payment_type IN ('Credit card', 'Cash')
GROUP BY payment_type;

Benchmark: 1 million rows

Benchmark: 10 million rows

Benchmark: 100 million rows

Benchmark: 500 million rows

Benchmark: 1.1 billion rows

Quarto

Just to clarify…

It’s Quarto.

Q-U-A-R-T-O.

Not #4 in Spanish (or “cuatro”).

Quarto Presentations


  • These web slides were built with Quarto!

  • Quarto is an open-source scientific and technical publishing system that can be used with Python, R, Julia, and Observable JS

  • Similar to Jupyter Notebook .ipynb files (and in many ways the successor to R Markdown .Rmd), Quarto lets you develop static and interactive, reproducible, production-quality content, including articles, presentations, dashboards, websites, blogs, and books, in HTML, PDF, MS Word, ePub, and more

  • In addition to custom styling with .css or .scss, the new _brand.yml file can be used with your Quarto and Shiny projects to provide a unifying and portable branding framework
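As a minimal sketch, a Quarto Reveal.js presentation is just a .qmd file with a YAML header, Markdown slides, and executable code chunks (the titles and content here are placeholders):

---
title: "My Talk"
format: revealjs
---

## First slide

- A bullet
- Another bullet

## A slide with executed R code

```{r}
summary(cars)
```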

Quarto Presentations

  • The following sections were copied almost entirely from Quarto’s Reveal.js demo documentation

  • The next few slides cover what you can do with Quarto and Reveal.js including:

    • Presenting code and LaTeX equations
    • Rendering code chunk computations in slide output
    • Fancy transitions, animations, and code windows

Pretty Code

  • Over 20 syntax highlighting themes available
  • Default theme optimized for accessibility
# Define a server for the Shiny app
function(input, output) {
  
  # Fill in the spot we created for a plot
  output$phonePlot <- renderPlot({
    # Render a barplot
  })
}

Code Animations

  • Over 20 syntax highlighting themes available
  • Default theme optimized for accessibility
# Define a server for the Shiny app
function(input, output) {
  
  # Fill in the spot we created for a plot
  output$phonePlot <- renderPlot({
    # Render a barplot
    barplot(WorldPhones[,input$region]*1000, 
            main=input$region,
            ylab="Number of Telephones",
            xlab="Year")
  })
}

Line Highlighting

  • Highlight specific lines for emphasis
  • Incrementally highlight additional lines
import numpy as np
import matplotlib.pyplot as plt

r = np.arange(0, 2, 0.01)
theta = 2 * np.pi * r
fig, ax = plt.subplots(subplot_kw={'projection': 'polar'})
ax.plot(theta, r)
ax.set_rticks([0.5, 1, 1.5, 2])
ax.grid(True)
plt.show()

LaTeX Equations

  • To include a LaTeX equation in Quarto, you would use the double dollar sign delimiters ($$) for a display equation on a separate line, like this:

  • $$x = \frac{-b \pm \sqrt{(b^2 - 4ac)}}{2a}$$


Counters

Great for training purposes or when you’ve got a time limit

# YOUR TURN! Complete the below 
# dplyr pipeline to group by and
# summarize the data set

library(dplyr)
data(package = "ggplot2", "diamonds")

diamonds |> 
  <your code here>
See {countdown} code
library(countdown)
countdown(
  minutes = 0, 
  seconds = 10,
  play_sound = TRUE,
  color_border = "#1A4064",
  color_text = "#76AADB",
  font_size = "3em"
)

Thank you! 🤍

questions?



Connect with me!