---
title: "Get Started"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Get Started}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# Introduction to tidyllm

**tidyllm** is an R package providing a unified interface for interacting with various large language model APIs. This vignette guides you through the basic setup and usage of **tidyllm**.

## Installation

To install **tidyllm** from CRAN:

```{r, eval=FALSE}
install.packages("tidyllm")
```

Or install the development version from GitHub:

```{r, eval=FALSE}
devtools::install_github("edubruell/tidyllm")
```

### Setting up API Keys

Set API keys as environment variables. The easiest way is to add them to your `.Renviron` file (run `usethis::edit_r_environ()`) and restart R:

```
ANTHROPIC_API_KEY="your-key-here"
OPENAI_API_KEY="your-key-here"
```

Or set them temporarily in your session:

```{r, eval=FALSE}
Sys.setenv(OPENAI_API_KEY = "your-key-here")
```

| Provider | Environment Variable | Where to get a key |
|----------|---------------------|--------------------|
| **Claude** (Anthropic) | `ANTHROPIC_API_KEY` | [Anthropic Console](https://console.anthropic.com/settings/keys) |
| **OpenAI** | `OPENAI_API_KEY` | [OpenAI API Keys](https://platform.openai.com/account/api-keys) |
| **Google Gemini** | `GOOGLE_API_KEY` | [Google AI Studio](https://aistudio.google.com/app/apikey) |
| **Mistral** | `MISTRAL_API_KEY` | [Mistral Console](https://console.mistral.ai/api-keys/) |
| **Groq** | `GROQ_API_KEY` | [Groq Console](https://console.groq.com/playground) |
| **Perplexity** | `PERPLEXITY_API_KEY` | [Perplexity API Settings](https://www.perplexity.ai/settings/api) |
| **DeepSeek** | `DEEPSEEK_API_KEY` | [DeepSeek Platform](https://platform.deepseek.com/api_keys) |
| **Voyage AI** | `VOYAGE_API_KEY` | [Voyage AI Dashboard](https://dashboard.voyageai.com/api-keys) |
| **OpenRouter** | `OPENROUTER_API_KEY` | [OpenRouter Dashboard](https://openrouter.ai/keys) |
| **Azure OpenAI** | `AZURE_OPENAI_API_KEY` | [Azure Portal](https://portal.azure.com/) |

### Running Local Models

**tidyllm** supports local inference via [Ollama](https://ollama.com/) and [llama.cpp](https://github.com/ggml-org/llama.cpp). Both run entirely on your machine; no API key required.

For Ollama, install from [ollama.com](https://ollama.com/) and pull a model:

```{r, eval=FALSE}
ollama_download_model("qwen3.5:4b")
```
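
Once a model is downloaded, it can be used like any other provider. A minimal sketch, reusing the model pulled above:

```{r, eval=FALSE}
llm_message("Hello from my laptop!") |>
  chat(ollama(), .model = "qwen3.5:4b")
```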

For llama.cpp, see the Local Models article on the tidyllm website for a step-by-step setup guide.

## Basic Usage

**tidyllm** is built around a **message-centric** interface. Interactions work through a message history created by `llm_message()` and passed to verbs like `chat()`:

```{r convo1, eval=FALSE}
library(tidyllm)

conversation <- llm_message("What is the capital of France?") |>
  chat(claude())

conversation
```
```{r convo1_out, eval=TRUE, echo=FALSE, message=FALSE}
library(tidyllm)

conversation <- llm_message("What is the capital of France?") |>
  llm_message("The capital of France is Paris.", .role = "assistant")

conversation
```

```{r convo2, eval=FALSE}
# Continue the conversation with a different provider
conversation <- conversation |>
  llm_message("What's a famous landmark in this city?") |>
  chat(openai())

get_reply(conversation)
```
```{r convo2_out, eval=TRUE, echo=FALSE}
c("A famous landmark in Paris is the Eiffel Tower.")
```

All API interactions follow the **verb + provider** pattern:

- **Verbs** define the action: `chat()`, `embed()`, `send_batch()`, `check_batch()`, `fetch_batch()`, `list_batches()`, `list_models()`, `deep_research()`, `check_job()`, `fetch_job()`, `upload_file()`, `list_files()`, `file_info()`, `delete_file()`
- **Providers** specify the API: `openai()`, `claude()`, `gemini()`, `ollama()`, `mistral()`, `groq()`, `perplexity()`, `deepseek()`, `voyage()`, `openrouter()`, `llamacpp()`, `azure_openai()`

Provider-specific functions like `openai_chat()` or `claude_chat()` also work directly and expose the full range of parameters for each API.
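
For example, the same request can go straight to a provider-specific chat function; this sketch reuses the model name shown elsewhere in this vignette:

```{r, eval=FALSE}
llm_message("What is the capital of France?") |>
  openai_chat(.model = "gpt-5.4")
```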

### Sending Images and Media

Attach images, PDFs, audio, or video to a message using the `.media` argument of `llm_message()`. Pass a single media object or a list of them:

```{r, echo=FALSE, out.width="70%", fig.alt="A photograph showing lake Garda and the scenery near Torbole, Italy."}
knitr::include_graphics("picture.jpeg")
```

```{r images, eval=FALSE}
# Single image
image_description <- llm_message("Describe this picture. Can you guess where it was taken?",
                                  .media = img("picture.jpeg")) |>
  chat(openai(.model = "gpt-5.4"))

get_reply(image_description)
```
```{r images_out, eval=TRUE, echo=FALSE}
c("The picture shows a beautiful landscape with a lake, mountains, and a town nestled below. The area appears lush and green, with agricultural fields visible. This scenery is reminiscent of northern Italy, particularly around Lake Garda.")
```

Send multiple images in one message by passing a list. The list can also mix different media types; for example, an image alongside a PDF or an audio file alongside a chart. Gemini handles mixed-type lists most flexibly:

```{r, eval=FALSE}
llm_message("Compare these two charts.",
            .media = list(img("chart_a.png"), img("chart_b.png"))) |>
  chat(claude())

# Mixed types in one message
llm_message("Does this figure match the results described in the paper?",
            .media = list(img("figure.png"), pdf_file("paper.pdf", pages = 1:5))) |>
  chat(gemini())
```

For audio and video, use `audio_file()` and `video_file()`. Gemini supports both natively; llama.cpp supports audio with compatible models (Ultravox, Qwen2.5-Omni, Qwen3-Omni); OpenRouter routes audio and video to underlying models that accept them:

```{r, eval=FALSE}
# Audio transcription or analysis
llm_message("Transcribe and summarise this recording.",
            .media = audio_file("interview.mp3")) |>
  chat(gemini())

# Video understanding
llm_message("Describe what happens in this clip.",
            .media = video_file("demo.mp4")) |>
  chat(gemini())
```

### Adding PDFs to Messages

Pass a PDF to `pdf_file()` and attach it with `.media`. By default the file is sent as binary to providers that support native PDF rendering (Claude, Gemini), and the text is automatically extracted as a fallback for others:

```{r pdf, eval=FALSE}
llm_message("Summarize the key points from this document.",
            .media = pdf_file("die_verwandlung.pdf")) |>
  chat(claude())
```
```{r pdf_out, eval=TRUE, echo=FALSE}
llm_message("Summarize the key points from this document.",
            .media = pdf_file("die_verwandlung.pdf")) |>
  llm_message("The story centres on Gregor Samsa, who wakes up transformed into a giant insect. Unable to work, he becomes isolated while his family struggles. Eventually Gregor dies, and his relieved family looks ahead to a better future.",
              .role = "assistant")
```

Restrict to specific pages with the `pages` argument:

```{r, eval=FALSE}
pdf_file("report.pdf", pages = 1:5)
```

For large or frequently reused documents, upload them once to the provider's Files API and reference the handle in any message:

```{r, eval=FALSE}
report <- upload_file(claude(), .path = "annual_report.pdf")

llm_message("What are the key risks mentioned?", .files = report) |>
  chat(claude())

# Manage uploaded files
list_files(claude())
file_info(claude(), report)
delete_file(claude(), report)
```

File management verbs are currently only supported by Claude, Gemini, and OpenAI. Files are uploaded to and stored on the specific provider's servers; a file uploaded to Claude cannot be used with Gemini or OpenAI. Supported file types vary by provider: Gemini accepts a wide range including video, audio, images, PDFs, and various document formats. OpenAI accepts 80+ formats including Office documents (PPTX, DOCX, XLSX), source code, ZIP archives, and more. Claude focuses on PDFs and images. Always check the provider's current documentation for the full list.
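
For instance, a video lecture could be uploaded to Gemini once and then referenced in a later message (a sketch; the file name is purely illustrative):

```{r, eval=FALSE}
lecture <- upload_file(gemini(), .path = "lecture.mp4")

llm_message("Summarise the main arguments of this lecture.", .files = lecture) |>
  chat(gemini())
```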

### Sending R Outputs to Models

Use `.f` to capture console output from a function and `.capture_plot` to include the last plot:

```{r routputs_base, eval=TRUE, echo=TRUE, message=FALSE, fig.alt="Scatter plot of car weight (x-axis) versus miles per gallon (y-axis) from the mtcars dataset, with a fitted linear regression line showing a negative relationship between weight and fuel efficiency."}
library(tidyverse)

ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  labs(x = "Weight", y = "Miles per gallon")
```

```{r routputs_base_out, eval=FALSE}
llm_message("Analyze this plot and data summary:",
            .capture_plot = TRUE,
            .f = ~{summary(mtcars)}) |>
  chat(claude())
```
```{r routputs_fake, eval=TRUE, echo=FALSE}
llm_message("Analyze this plot and data summary:",
            .imagefile = "file1568f6c1b4565.png",
            .f = ~{summary(mtcars)}) |>
  llm_message("The scatter plot shows a clear negative correlation between weight and fuel efficiency. Heavier cars consistently achieve lower mpg, with the linear trend confirming this relationship. Variability around the line suggests other factors, engine size and transmission, also play a role.", .role = "assistant")
```

## Getting Replies from the API

`get_reply()` retrieves assistant text from a message history. Use an index to access a specific reply, or omit it to get the last:

```{r eval=FALSE}
conversation <- llm_message("Imagine a German address.") |>
  chat(groq()) |>
  llm_message("Imagine another address.") |>
  chat(claude())

conversation
```
```{r eval=TRUE, echo=FALSE}
conversation <- llm_message("Imagine a German address.") |>
  llm_message("Herr Müller\nMusterstraße 12\n53111 Bonn", .role = "assistant") |>
  llm_message("Imagine another address.") |>
  llm_message("Frau Schmidt\nFichtenweg 78\n42103 Wuppertal", .role = "assistant")
conversation
```

```{r standard_last_reply, eval=TRUE}
conversation |> get_reply(1)   # first reply
conversation |> get_reply()    # last reply (default)
```

Convert the message history (without attachments) to a tibble with `as_tibble()`:

```{r as_tibble, eval=TRUE}
conversation |> as_tibble()
```

`get_metadata()` returns token counts, model names, and API-specific metadata for all assistant replies:

```{r get_metadata, eval=FALSE}
conversation |> get_metadata()
```
```{r get_metadata_out, eval=TRUE, echo=FALSE}
tibble::tibble(
  model             = c("groq-model", "claude-sonnet-4-6-20251001"),
  timestamp         = as.POSIXct(c("2025-11-08 14:25:43", "2025-11-08 14:26:02")),
  prompt_tokens     = c(20L, 80L),
  completion_tokens = c(45L, 40L),
  total_tokens      = c(65L, 120L),
  api_specific      = list(list(), list())
) |> print(width = 60)
```

The `api_specific` list column holds provider-only metadata such as citations (Perplexity), thinking traces (Claude, Gemini, DeepSeek), or grounding information (Gemini).
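
For example, citations from a Perplexity reply can be pulled out of this column. A sketch (the prompt is illustrative; which fields appear depends on the provider and model):

```{r, eval=FALSE}
web_answer <- llm_message("What changed in R 4.4?") |>
  chat(perplexity())

get_metadata(web_answer)$api_specific[[1]]$citations
```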

### Listing Available Models

Use `list_models()` to query which models a provider offers:

```{r list_models, eval=FALSE}
list_models(openai())
list_models(openrouter())   # 500+ models across providers
list_models(ollama())       # models installed locally
```

## Working with Structured Outputs

**tidyllm** supports JSON schema enforcement so models return structured, machine-readable data. Use `tidyllm_schema()` with field helpers to define your schema:

- `field_chr()`: text
- `field_dbl()`: numeric
- `field_lgl()`: boolean
- `field_fct(.levels = ...)`: enumeration
- `field_object(...)`: nested object (supports `.vector = TRUE` for arrays)

```{r schema, eval=FALSE}
person_schema <- tidyllm_schema(
  first_name = field_chr("A male first name"),
  last_name  = field_chr("A common last name"),
  occupation = field_chr("A quirky occupation"),
  address    = field_object(
    "The person's home address",
    street  = field_chr("Street name"),
    number  = field_dbl("House number"),
    city    = field_chr("A large city"),
    zip     = field_dbl("Postal code"),
    country = field_fct("Country", .levels = c("Germany", "France"))
  )
)

profile <- llm_message("Imagine a person profile matching the schema.") |>
  chat(openai(), .json_schema = person_schema)

profile
```
```{r schema_out, eval=TRUE, echo=FALSE}
profile <- llm_message("Imagine a person profile matching the schema.") |>
  llm_message("{\"first_name\":\"Julien\",\"last_name\":\"Martin\",\"occupation\":\"Gondola Repair Specialist\",\"address\":{\"street\":\"Rue de Rivoli\",\"number\":112,\"city\":\"Paris\",\"zip\":75001,\"country\":\"France\"}}",
              .role = "assistant")
profile@message_history[[3]]$json <- TRUE
profile
```

Extract the parsed result as an R list with `get_reply_data()`:

```{r get_reply_data}
profile |> get_reply_data() |> str()
```

Set `.vector = TRUE` on `field_object()` to get outputs that unpack directly into a data frame:

```{r fobj, eval=FALSE}
llm_message("Imagine five people.") |>
  chat(openai(),
       .json_schema = tidyllm_schema(
         person = field_object(
           first_name = field_chr(),
           last_name  = field_chr(),
           occupation = field_chr(),
           .vector    = TRUE
         )
       )) |>
  get_reply_data() |>
  bind_rows()
```
```{r fobjout, echo=FALSE, eval=TRUE}
data.frame(
  first_name = c("Alice", "Robert", "Maria", "Liam", "Sophia"),
  last_name  = c("Johnson", "Anderson", "Gonzalez", "O'Connor", "Lee"),
  occupation = c("Software Developer", "Graphic Designer", "Data Scientist",
                 "Mechanical Engineer", "Content Writer")
)
```

If the **ellmer** package is installed, you can use ellmer type objects (`ellmer::type_string()`, `ellmer::type_object()`, etc.) directly as field definitions in `tidyllm_schema()` or as the `.json_schema` argument.
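
A minimal sketch, assuming **ellmer** is installed; the field names and descriptions here are only illustrative:

```{r, eval=FALSE}
library(ellmer)

city_schema <- tidyllm_schema(
  city    = type_string("Name of a large city"),
  country = type_string("Country the city is in")
)

llm_message("Imagine a city matching the schema.") |>
  chat(openai(), .json_schema = city_schema)
```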

## API Parameters

Common arguments like `.model`, `.temperature`, and `.json_schema` can be set directly in verbs like `chat()`. Provider-specific arguments go in the provider function:

```{r temperature, eval=FALSE}
# Common args in the verb
llm_message("Write a haiku about tidyllm.") |>
  chat(ollama(), .temperature = 0)

# Provider-specific args in the provider
llm_message("Hello") |>
  chat(ollama(.ollama_server = "http://my-server:11434"), .temperature = 0)
```

When an argument appears in both `chat()` and the provider function, `chat()` takes precedence. If a common argument is not supported by the chosen provider, `chat()` raises an error rather than silently ignoring it.
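
A small sketch of this precedence rule; here the `.temperature` value passed to `chat()` (0.7) overrides the one set in the provider function:

```{r, eval=FALSE}
llm_message("Write a haiku about tidyllm.") |>
  chat(ollama(.temperature = 0), .temperature = 0.7)
```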

## Tool Use

Define R functions that the model can call during a conversation. Wrap them with `tidyllm_tool()`:

```{r, eval=FALSE}
get_current_time <- function(tz, format = "%Y-%m-%d %H:%M:%S") {
  format(Sys.time(), tz = tz, format = format, usetz = TRUE)
}

time_tool <- tidyllm_tool(
  .f          = get_current_time,
  .description = "Returns the current time in a specified timezone.",
  tz     = field_chr("The timezone identifier, e.g. 'Europe/Berlin'."),
  format = field_chr("strftime format string. Default: '%Y-%m-%d %H:%M:%S'.")
)

llm_message("What time is it in Stuttgart right now?") |>
  chat(openai(), .tools = time_tool)
```
```{r, eval=TRUE, echo=FALSE}
llm_message("What time is it in Stuttgart right now?") |>
  llm_message("The current time in Stuttgart (Europe/Berlin) is 2025-03-03 09:51:22 CET.",
              .role = "assistant")
```

When a tool is provided, the model can request its execution; tidyllm then runs the function in your R session and feeds the result back automatically. Multi-turn tool loops (where the model makes several tool calls) are handled transparently.

If you use **ellmer**, convert existing ellmer tool definitions with `ellmer_tool()`:

```{r, eval=FALSE}
library(ellmer)
btw_tool <- ellmer_tool(btw::btw_tool_files_list_files)
llm_message("List files in the R/ folder.") |>
  chat(claude(), .tools = btw_tool)
```

For a detailed guide, see the Tool Use article on the tidyllm website.

## Embeddings

Generate vector representations of text with `embed()`. The output is a tibble with one row per input:

```{r embed, eval=FALSE}
c("What is the meaning of life?",
  "How much wood would a woodchuck chuck?",
  "How does the brain work?") |>
  embed(ollama())
```
```{r embed_output, eval=TRUE, echo=FALSE}
tibble::tibble(
  input      = c("What is the meaning of life?",
                 "How much wood would a woodchuck chuck?",
                 "How does the brain work?"),
  embeddings = purrr::map(1:3, ~runif(min = -1, max = 1, n = 384))
)
```

Voyage AI supports **multimodal embeddings**, meaning text and images share the same vector space:

```{r, eval=FALSE}
list("tidyllm logo", img("docs/logo.png")) |>
  embed(voyage())
```

`voyage_rerank()` re-orders documents by relevance to a query:

```{r, eval=FALSE}
voyage_rerank(
  "best R package for LLMs",
  c("tidyllm", "ellmer", "httr2", "rvest")
)
```

## Batch Requests

Batch APIs (Claude, OpenAI, Mistral, Groq, Gemini) process large numbers of requests asynchronously, typically within 24 hours, at roughly half the cost of standard calls. Use `send_batch()` to submit, `check_batch()` to poll status, and `fetch_batch()` to retrieve results:

```{r, eval=FALSE}
# Submit a batch and save the job handle to disk
glue::glue("Write a poem about {x}", x = c("cats", "dogs", "hamsters")) |>
  purrr::map(llm_message) |>
  send_batch(claude()) |>
  saveRDS("claude_batch.rds")
```

```{r, eval=FALSE}
# Check status
readRDS("claude_batch.rds") |>
  check_batch(claude())
```
```{r, echo=FALSE}
tibble::tribble(
  ~batch_id,                          ~status, ~created_at,                      ~expires_at,                      ~req_succeeded, ~req_errored, ~req_expired, ~req_canceled,
  "msgbatch_02A1B2C3D4E5F6G7H8I9J",   "ended", as.POSIXct("2025-11-01 10:30:00"), as.POSIXct("2025-11-02 10:30:00"), 3L,             0L,           0L,           0L
)
```

```{r, eval=FALSE}
# Fetch completed results
conversations <- readRDS("claude_batch.rds") |>
  fetch_batch(claude())

poems <- purrr::map_chr(conversations, get_reply)
```

`check_job()` and `fetch_job()` are type-dispatching aliases; they work on both batch objects and Perplexity research jobs (see `deep_research()` below).
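
Because they dispatch on the object type, the same verbs also work on a saved batch handle. A sketch, reusing the batch from above:

```{r, eval=FALSE}
readRDS("claude_batch.rds") |>
  check_job(claude())
```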

## Deep Research

For long-horizon research tasks, `deep_research()` performs an extended web search and synthesizes the results. Perplexity's `sonar-deep-research` model is currently supported:

```{r, eval=FALSE}
# Blocking: waits for completion, returns an LLMMessage
result <- llm_message("Compare Rust and Go for systems programming in 2025.") |>
  deep_research(perplexity())

get_reply(result)
get_metadata(result)$api_specific[[1]]$citations
```

```{r, eval=FALSE}
# Background mode: returns a job handle immediately
job <- llm_message("Summarise recent EU AI Act developments.") |>
  deep_research(perplexity(), .background = TRUE)

check_job(job)           # poll status
result <- fetch_job(job) # retrieve when complete
```

## Streaming

Stream reply tokens to the console as they are generated with `.stream = TRUE`:

```{r, eval=FALSE}
llm_message("Write a short poem about R.") |>
  chat(claude(), .stream = TRUE)
```

Streaming is useful for monitoring long responses interactively. For production data-analysis workflows the non-streaming mode is preferred because it provides complete metadata and is more reliable.

## Choosing the Right Provider

| Provider | Strengths and tidyllm-specific features |
|----------|-----------------------------------------|
| `openai()` | Top benchmark performance across coding, math, and reasoning; `o3`/`o4` reasoning models; built-in web search via `openai_websearch()`; Files API |
| `claude()` | Best-in-class coding (SWE-bench leader); `.thinking = TRUE` for extended reasoning; Files API for document workflows; batch |
| `gemini()` | 1M-token context window; native image, audio, and video via `.media`; search grounding; `.thinking_budget` for reasoning; Files API; batch |
| `mistral()` | EU-hosted, GDPR-friendly; Magistral reasoning models; Voxtral audio models (`.media = audio_file()`); embeddings; batch |
| `groq()` | Fastest available inference (300-1200 tokens/s); `groq_transcribe()` for Whisper audio; `.json_schema`; batch |
| `perplexity()` | Real-time web search with citations in metadata; `deep_research()` for long-horizon research; search domain filter |
| `deepseek()` | Top math and reasoning benchmarks at very low cost; `.thinking = TRUE` auto-switches to `deepseek-reasoner` |
| `voyage()` | State-of-the-art retrieval embeddings; `voyage_rerank()`; multimodal embeddings (text + images); `.output_dimension` |
| `openrouter()` | 500+ models via one API key; audio and video support on Gemini, Qwen 3.6, and other multimodal routes; automatic fallback routing |
| `ollama()` | Local models with full data privacy; no API costs; `ollama_download_model()` for model management |
| `llamacpp()` | Often the most performant local inference stack; audio input with Ultravox, Qwen2.5-Omni, Qwen3-Omni, and Gemma 4 models; BNF grammar constraints; logprobs; `llamacpp_rerank()` |
| `azure_openai()` | Enterprise Azure deployments of OpenAI models; batch support |

For getting started with local models (Ollama and llama.cpp), see the Local Models article on the tidyllm website.

## Setting a Default Provider

Avoid specifying a provider on every call by setting options:

```{r, eval=FALSE}
options(tidyllm_chat_default   = openai(.model = "gpt-5.4"))
options(tidyllm_embed_default  = ollama())
options(tidyllm_sbatch_default = claude(.temperature = 0))
options(tidyllm_cbatch_default = claude())
options(tidyllm_fbatch_default = claude())
options(tidyllm_lbatch_default = claude())

# Now the provider argument can be omitted
llm_message("Hello!") |> chat()

c("text one", "text two") |> embed()
```
