| Title: | A Toolkit for Research Workflows |
| Version: | 0.3.0 |
| Description: | Provides utility functions to help researchers implement best practices for their coding projects. Includes tools for reading and cleaning data files, initializing R projects with a standard folder structure, creating Quarto documents from a reproducible template, detecting the execution context across interactive, Quarto, and script-based workflows, and splitting data frames into group-level output files. |
| License: | MIT + file LICENSE |
| Depends: | R (≥ 4.2.0) |
| Encoding: | UTF-8 |
| Language: | en-US |
| RoxygenNote: | 7.3.2 |
| Suggests: | knitr, rmarkdown, spelling, testthat (≥ 3.0.0) |
| Config/testthat/edition: | 3 |
| Imports: | cli, fs, glue, janitor, purrr, readr, renv, tibble, usethis, yaml, rlang, rvest, xml2, quarto, withr |
| URL: | https://github.com/erwinlares/toolero, https://erwinlares.github.io/toolero/ |
| BugReports: | https://github.com/erwinlares/toolero/issues |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| Packaged: | 2026-04-27 15:23:42 UTC; lares |
| Author: | Erwin Lares |
| Maintainer: | Erwin Lares <erwin.lares@wisc.edu> |
| Repository: | CRAN |
| Date/Publication: | 2026-04-27 18:40:02 UTC |
Create a new Quarto document from a template
Description
Creates a new Quarto document in the specified directory, along with a sample dataset and UW-Madison branded assets. Optionally pre-populates the YAML header with user-supplied metadata.
Usage
create_qmd(
filename = NULL,
path = ".",
yaml_data = NULL,
overwrite = FALSE,
use_purl = TRUE
)
Arguments
filename |
A string or |
path |
A string. Path to the directory where the document will be
created. Defaults to "." |
yaml_data |
A string or |
overwrite |
A logical. Whether to overwrite existing files. Defaults
to FALSE |
use_purl |
Logical. If |
Details
create_qmd() performs the following steps:
1. Validates that path exists.
2. Validates that filename is supplied.
3. Creates a data/ folder under path and copies sample.csv there.
4. Checks for assets/styles.css and assets/header.html; creates the assets/ folder if needed and copies both from the package.
5. Copies the template .qmd to path/filename.
6. If yaml_data is provided, reads the YAML file and substitutes values into the document header.
7. If use_purl = TRUE, writes a _quarto.yml with a post-render hook pointing to R/purl.R, and copies purl.R from the package templates into path/R/purl.R.
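As a rough illustration of the use_purl step, the sketch below writes a minimal _quarto.yml whose project-level post-render entry points at R/purl.R. The exact contents of the file that create_qmd() writes are not shown in this manual, so the layout here is an assumption based on Quarto's documented project scripts option, and write_post_render_hook_sketch() is a hypothetical helper name.

```r
# Hypothetical helper, for illustration only; not the package source.
# Writes a minimal _quarto.yml that registers R/purl.R as a post-render script.
write_post_render_hook_sketch <- function(path, overwrite = FALSE) {
  stopifnot(dir.exists(path))
  config_file <- file.path(path, "_quarto.yml")

  # Refuse to clobber an existing config unless explicitly asked to.
  if (file.exists(config_file) && !overwrite) {
    stop("_quarto.yml already exists; set overwrite = TRUE to replace it.")
  }

  # Assumed layout: a project-level post-render list with a single script.
  config_lines <- c(
    "project:",
    "  post-render:",
    "    - R/purl.R"
  )
  writeLines(config_lines, config_file)
  invisible(config_file)
}
```

Calling write_post_render_hook_sketch(tempdir(), overwrite = TRUE) and opening the resulting file is a quick way to see the shape of such a hook before relying on the one the package generates.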
Note: filename has no default value and must always be supplied
explicitly. Use tempdir() for temporary output during testing or
exploration.
Value
Invisibly returns path.
Examples
# Create a document in a temp directory
create_qmd(path = tempdir(), filename = "analysis.qmd")
# Create with a custom filename, without the purl hook
create_qmd(path = tempdir(), filename = "report.qmd",
overwrite = TRUE, use_purl = FALSE)
# Create with pre-populated YAML
yaml_file <- tempfile(fileext = ".yml")
writeLines("author:\n - name: 'Your Name'", yaml_file)
create_qmd(path = tempdir(), filename = "analysis.qmd",
yaml_data = yaml_file, overwrite = TRUE)
Detect the current execution context
Description
Identifies which of three execution environments the code is currently
running in: an interactive R session, a quarto render call, or a
plain Rscript invocation. This is useful for writing code that behaves
correctly across all three contexts, such as resolving input file paths
in a portable way.
Usage
detect_execution_context(interactive_fn = interactive)
Arguments
interactive_fn |
A function. Used to detect whether the session is
interactive. Defaults to interactive |
Details
Detection follows a priority order:
1. If interactive() is TRUE, returns "interactive".
2. If the environment variable QUARTO_DOCUMENT_PATH is set and non-empty, returns "quarto".
3. Otherwise, returns "rscript".
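The priority order above can be sketched in a few lines of base R. This is an illustrative re-implementation under the assumptions stated in Details, not the package source; detect_context_sketch() is a hypothetical name.

```r
# Illustrative sketch of the documented priority order; not the package source.
detect_context_sketch <- function(interactive_fn = interactive) {
  # 1. An interactive session takes priority over everything else.
  if (isTRUE(interactive_fn())) {
    return("interactive")
  }
  # 2. Sys.getenv() returns "" for unset variables, so nzchar() covers both
  #    the "unset" and the "set but empty" cases.
  if (nzchar(Sys.getenv("QUARTO_DOCUMENT_PATH"))) {
    return("quarto")
  }
  # 3. Anything else is treated as a plain Rscript invocation.
  "rscript"
}
```

Because the interactivity check is passed in as a function, a test can simulate a non-interactive session with detect_context_sketch(interactive_fn = function() FALSE) without leaving the console.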
Value
A character string, one of "interactive", "quarto", or
"rscript".
Examples
context <- detect_execution_context()
input_file <- switch(context,
interactive = "data/sample.csv",
quarto = params$input_file,
rscript = commandArgs(trailingOnly = TRUE)[1]
)
Generate a KB-importable XML file from a Quarto document
Description
Takes a Quarto document and produces an XML file that is directly
importable into a UW-Madison Knowledge Base (KB) article. The function
re-renders the .qmd with embed-resources: true so all visual assets
are self-contained, extracts the HTML body, and wraps it in the KB XML
structure along with metadata drawn from the document's YAML header.
Usage
generate_kb_xml(html_path, qmd_path = NULL, output_dir = NULL)
Arguments
html_path |
A string. Path to the rendered HTML file. Used to infer
the output filename and, if |
qmd_path |
A string or |
output_dir |
A string or |
Details
generate_kb_xml() performs the following steps:
1. Validates that html_path exists.
2. Infers qmd_path from html_path if not supplied, then validates it.
3. Extracts title, description, and categories from the .qmd YAML header and maps them to kb_title, kb_summary, and kb_keywords.
4. Re-renders the .qmd in an isolated temporary directory with embed-resources: true so all CSS, images, and JS are self-contained. The data/ and assets/ folders are copied alongside the .qmd to ensure the render succeeds.
5. Extracts the <body> from the embedded HTML.
6. Escapes HTML entities in the body for XML compatibility, as required by the UW-Madison KB import format.
7. Builds the XML structure with kb_title, kb_keywords, kb_summary, and kb_body nodes.
8. Writes the .xml file to output_dir.
Temporary files are managed via withr::local_tempdir() and are
automatically cleaned up when the function exits, even on error.
When importing the resulting XML into the KB, check the "Decode HTML entity in body content" option so that the escaped body is rendered as HTML.
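To make the escaping and wrapping steps concrete, here is a minimal base-R sketch. The node names kb_title, kb_keywords, kb_summary, and kb_body come from the steps above; the root element name kb_document, the comma-separated keyword format, and the exact set of escaped characters are assumptions made for illustration, and both function names are hypothetical.

```r
# Illustrative sketch only; the real KB schema may differ.
escape_html_sketch <- function(x) {
  # "&" must be escaped first so the entities added below are not re-escaped.
  x <- gsub("&", "&amp;", x, fixed = TRUE)
  x <- gsub("<", "&lt;", x, fixed = TRUE)
  x <- gsub(">", "&gt;", x, fixed = TRUE)
  x
}

build_kb_xml_sketch <- function(title, keywords, summary, body_html) {
  # Wrap one escaped value in a named element.
  node <- function(name, value) {
    paste0("  <", name, ">", escape_html_sketch(value), "</", name, ">")
  }
  c(
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<kb_document>",  # hypothetical root element
    node("kb_title", title),
    node("kb_keywords", paste(keywords, collapse = ", ")),
    node("kb_summary", summary),
    node("kb_body", body_html),
    "</kb_document>"
  )
}

xml_lines <- build_kb_xml_sketch(
  title = "Analysis",
  keywords = c("quarto", "kb"),
  summary = "A short summary",
  body_html = "<p>Fish & chips</p>"
)
writeLines(xml_lines)
```

Since the body is stored as escaped text rather than as nested markup, the importer has to decode it again, which is why the decode option matters at import time.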
Value
Invisibly returns the path to the written .xml file.
Examples
# Infer qmd_path automatically, write XML alongside the HTML
# generate_kb_xml(html_path = "docs/analysis.html")
# Supply qmd_path explicitly and write to a specific output directory
# generate_kb_xml(
# html_path = "docs/analysis.html",
# qmd_path = "analysis.qmd",
# output_dir = "exports"
# )
Initialize a new R project with a standard folder structure
Description
init_project() creates a new R project at the given path with an
opinionated folder structure suited for research workflows. It optionally
initializes renv for package management and git for version control.
Usage
init_project(
path,
use_renv = TRUE,
use_git = TRUE,
extra_folders = NULL,
open = FALSE,
uw_branding = FALSE
)
Arguments
path |
A character string with the path and name of the new
project (e.g., |
use_renv |
Logical. If |
use_git |
Logical. If |
extra_folders |
A character vector of additional folder names to create
inside the project. Defaults to NULL |
open |
Logical. If |
uw_branding |
Logical. If |
Value
Called for its side effects. Does not return a value.
Examples
## Not run:
init_project(path = file.path(tempdir(), "project1"),
use_renv = FALSE, use_git = FALSE)
init_project(path = file.path(tempdir(), "project2"),
uw_branding = TRUE, use_renv = FALSE, use_git = FALSE)
init_project(path = file.path(tempdir(), "project3"),
extra_folders = c("notebooks"),
use_renv = FALSE, use_git = FALSE)
## End(Not run)
Read and clean a CSV file
Description
read_clean_csv() reads a CSV file and cleans the column names in one step.
It leverages readr::read_csv() for reading and janitor::clean_names() for
making column names tidyverse-friendly (lowercase, no spaces, no special
characters). By default, column type messages are suppressed. Set
verbose = TRUE to display them.
Usage
read_clean_csv(file_path, verbose = FALSE)
Arguments
file_path |
A character string with the path to the CSV file. |
verbose |
Logical. If |
Value
A tibble with clean column names.
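The read-then-clean composition named in the Description can be sketched as follows. This is an assumption about how the two calls fit together, not the package source: read_clean_csv_sketch() is a hypothetical name, and mapping verbose onto the show_col_types argument of readr::read_csv() is a guess at how the column type messages are toggled.

```r
# Illustrative sketch; requires the readr and janitor packages.
read_clean_csv_sketch <- function(file_path, verbose = FALSE) {
  stopifnot(is.character(file_path), length(file_path) == 1, file.exists(file_path))

  # show_col_types = FALSE silences the column specification message.
  data <- readr::read_csv(file_path, show_col_types = verbose)

  # clean_names() converts headers such as "Body Mass" to "body_mass".
  janitor::clean_names(data)
}
```

Keeping both steps in one function means downstream code can rely on snake_case column names regardless of how the source file was exported.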
Examples
# Read and clean a CSV file silently
sample_path <- system.file("templates", "sample.csv", package = "toolero")
data <- read_clean_csv(sample_path)
# Show column type messages
data <- read_clean_csv(sample_path, verbose = TRUE)
Split a data frame by a grouping column and write each group to a CSV file
Description
Splits a data frame by a single grouping column and writes each group to a separate CSV file. Optionally writes a manifest file listing the output files, their group values, and row counts.
Usage
write_by_group(data, group_col, output_dir = NULL, manifest = FALSE)
Arguments
data |
A data frame or tibble to split and save. |
group_col |
A string. The name of the column to group by. |
output_dir |
A string or |
manifest |
A logical. Whether to write a |
Details
Output filenames are derived from the group values of group_col.
Values are sanitized before use as filenames: converted to lowercase,
spaces and special characters replaced with -, consecutive dashes
collapsed, and leading/trailing dashes stripped.
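The sanitization rules can be approximated with a few regular expressions in base R. The sketch below is illustrative only: sanitize_group_value_sketch() is a hypothetical name, and treating every character other than a-z and 0-9 as "special" is an assumption, since the manual does not spell out the exact character class.

```r
# Illustrative sketch of the filename sanitization rules; not the package source.
sanitize_group_value_sketch <- function(x) {
  x <- tolower(as.character(x))     # convert to lowercase
  x <- gsub("[^a-z0-9]", "-", x)    # spaces and special characters become "-"
  x <- gsub("-+", "-", x)           # collapse consecutive dashes
  x <- gsub("^-+|-+$", "", x)       # strip leading and trailing dashes
  x
}

sanitize_group_value_sketch(c("Adelie", "  Gentoo (Biscoe)! ", "Chin--strap"))
```

Under these assumptions the three inputs map to adelie, gentoo-biscoe, and chin-strap. Note that distinct group values can collapse to the same slug (for example "A B" and "a-b"), which is worth checking before writing files.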
If manifest = TRUE, a manifest.csv is written to output_dir
containing three columns: group_value, n_rows, and file_path.
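A base-R sketch of the split, write, and manifest flow is shown below. It is an illustration under stated assumptions rather than the package implementation: write_by_group_sketch() is a hypothetical name, the output files are assumed to be named after the sanitized group value with a .csv extension, and the sanitization is a simplified version of the rules described above.

```r
# Illustrative sketch; not the package source.
write_by_group_sketch <- function(data, group_col, output_dir, manifest = FALSE) {
  stopifnot(
    is.data.frame(data),
    group_col %in% names(data),
    dir.exists(output_dir)
  )

  # One data frame per distinct value of the grouping column.
  groups <- split(data, data[[group_col]], drop = TRUE)

  # Simplified filename rule (assumption): lowercase, non-alphanumerics to "-",
  # leading and trailing dashes stripped.
  slugs <- gsub("^-+|-+$", "", gsub("[^a-z0-9]+", "-", tolower(names(groups))))
  paths <- file.path(output_dir, paste0(slugs, ".csv"))

  for (i in seq_along(groups)) {
    write.csv(groups[[i]], paths[i], row.names = FALSE)
  }

  if (manifest) {
    # Same three columns as described for manifest.csv.
    manifest_df <- data.frame(
      group_value = names(groups),
      n_rows = vapply(groups, nrow, integer(1)),
      file_path = paths,
      stringsAsFactors = FALSE
    )
    write.csv(manifest_df, file.path(output_dir, "manifest.csv"), row.names = FALSE)
  }

  invisible(output_dir)
}
```

The manifest is useful when the group files feed a batch job: each row gives the job one input path together with the row count it should expect.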
Note: output_dir has no default value. Always supply an explicit path
to avoid writing files to unexpected locations. Use tempdir() for
temporary output during testing or exploration.
Value
Invisibly returns output_dir.
Examples
# Split a small data frame by group and write to a temp directory
data <- data.frame(
species = c("Adelie", "Adelie", "Gentoo"),
mass = c(3750, 3800, 5000)
)
write_by_group(data, group_col = "species", output_dir = tempdir())
# Same but also write a manifest
write_by_group(data, group_col = "species",
output_dir = tempdir(), manifest = TRUE)