---
title: "Introduction to openaq"
subtitle: "Get started with the openaq package"
author: "Russ Biggs"
date: "2025-01-17"
description: >
  "Get started with the openaq package"
output:
  rmarkdown::html_vignette:
    df_print: kable
vignette: >
  %\VignetteIndexEntry{Introduction to openaq}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---


``` r
library(openaq)
```

This guide provides an overview of the key features of the openaq package. For
detailed information on the functions provided in the package see the reference
section.

For more general documentation on the OpenAQ platform and API see the main
OpenAQ documentation site at [docs.openaq.org](https://docs.openaq.org).

## Key concepts

### API Key

An API key is required for using the OpenAQ API. Register for an account at
[https://explore.openaq.org/register](https://explore.openaq.org/register) to
get an API key.

<aside style="padding: 16px; margin:16px; background-color: #fbdcda; border-inline-start-color: #dd443c; border-inline-start-style: solid; border-inline-start-width: 4px;">
<span style="color:#8a0f3a; margin:0; padding: 6px;">Important</span>
Treat your API key as you would a password. Do not share it with others, and do
not commit it to version control. If your API key is compromised, you can
request a new one in your OpenAQ Explorer account.
</aside>

By default the OpenAQ R client looks for an API key in the `OPENAQ_API_KEY`
system environment variable. The package also provides a helper function called
`set_api_key()` to set this value.


``` r
set_api_key("my-super-secret-openaq-key-1234")
```

Alternatively, the API key can be set on individual resource function calls e.g.


``` r
list_locations(api_key = "my-super-secret-openaq-key-1234")
```

An API key passed to an individual function call always takes precedence over
an API key set in the environment variable.


``` r
set_api_key("my-super-secret-openaq-key-1234")
list_locations(api_key = "this-is-my-alternate-api-key")
```
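The precedence rule can be pictured with a small base R sketch. `resolve_api_key()` is a hypothetical helper written for this illustration, not a function exported by openaq, and the package's internal lookup may differ; only the `OPENAQ_API_KEY` variable name comes from the package.

``` r
# Hypothetical helper: an explicit argument wins, otherwise fall back to the
# OPENAQ_API_KEY environment variable.
resolve_api_key <- function(api_key = NULL) {
  if (!is.null(api_key) && nzchar(api_key)) {
    return(api_key)
  }
  env_key <- Sys.getenv("OPENAQ_API_KEY", unset = "")
  if (!nzchar(env_key)) {
    stop("No API key found: set OPENAQ_API_KEY or pass `api_key`.")
  }
  env_key
}

# Remember any existing value so the demonstration can restore it afterwards.
old_key <- Sys.getenv("OPENAQ_API_KEY", unset = NA)

Sys.setenv(OPENAQ_API_KEY = "key-from-environment")
resolve_api_key()                 # falls back to the environment variable
resolve_api_key("key-from-call")  # the explicit argument takes precedence

# Restore the previous state of the environment variable.
if (is.na(old_key)) {
  Sys.unsetenv("OPENAQ_API_KEY")
} else {
  Sys.setenv(OPENAQ_API_KEY = old_key)
}
```

To keep the key out of scripts altogether, one common option is to add a line such as `OPENAQ_API_KEY=...` to your user-level `.Renviron` file, which R reads at startup.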

### Rate limits

The OpenAQ API limits the number of requests a single API key can make in a set
time to ensure fair access for all users and prevent overuse.

The API provides custom rate limit headers to indicate the number of requests
used, the number remaining, the rate limit allowance, and the number of seconds
remaining in the current period until reset. These headers are preserved by
default in the openaq package as object attributes on the output data frame:

* `x_ratelimit_used`
* `x_ratelimit_remaining`
* `x_ratelimit_limit`
* `x_ratelimit_reset`


``` r
locations <- list_locations(
  limit = 1000,
  parameters_id = 2,
  providers_id = 166
)
headers <- attr(locations, "headers")
print(headers[["x_ratelimit_remaining"]])
```

```
## [1] 50
```

Read more about the headers and rate limits in the OpenAQ API documentation under
[Rate Limits](https://docs.openaq.org/using-the-api/rate-limits).
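As a rough illustration of how these attributes could drive a manual pause, the sketch below reads the header values from a result and reports how long to wait before the next request. `seconds_to_wait()` is a hypothetical helper, and `fake_result` stands in for a real API response so that the example runs without a network call. It assumes the `headers` attribute is a named list whose values can be converted to numbers.

``` r
# Hypothetical helper: return the number of seconds to pause before the next
# request, based on the rate limit headers attached to a result.
seconds_to_wait <- function(result) {
  headers <- attr(result, "headers")
  remaining <- as.numeric(headers[["x_ratelimit_remaining"]])
  reset <- as.numeric(headers[["x_ratelimit_reset"]])
  # Missing or unparseable headers: nothing to base a pause on.
  if (length(remaining) != 1 || length(reset) != 1 ||
      is.na(remaining) || is.na(reset)) {
    return(0)
  }
  if (remaining > 0) 0 else reset
}

# Stand-in for a data frame returned by a resource function (made-up values).
fake_result <- structure(
  data.frame(id = 1:3),
  headers = list(
    x_ratelimit_used = 60,
    x_ratelimit_remaining = 0,
    x_ratelimit_limit = 60,
    x_ratelimit_reset = 12
  )
)

seconds_to_wait(fake_result)  # 12: no requests remain, wait for the reset
```

In a script, a call such as `Sys.sleep(seconds_to_wait(locations))` placed before the next request would apply the pause.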

#### Automatic rate limit handling

The openaq package provides optional functionality to automatically throttle
requests when the rate limit has been reached. This feature uses httr2's
built-in retry mechanism to intelligently handle rate limit errors.

Using the openaq package you can enable automatic rate limiting in two ways:

**Option 1: Enable globally for your session**

``` r
# Enable automatic rate limiting for all subsequent requests
enable_rate_limit()

# Now all API calls will automatically handle rate limits
locations <- list_locations(limit = 1000, parameters_id = 2)
```

```
## Setting `max_tries = 2`.
```

``` r
nrow(locations)
```

```
## [1] 1000
```

**Option 2: Enable per request**

``` r
# Enable rate limiting for a single function call
locations <- list_locations(
  limit = 1000,
  parameters_id = 2,
  rate_limit = TRUE
)
```

```
## Setting `max_tries = 2`.
```

``` r
head(locations)
```

```
##   id                name is_mobile is_monitor     timezone countries_id
## 1  3          NMA - Nima     FALSE       TRUE Africa/Accra          152
## 2  4          NMT - Nima     FALSE       TRUE Africa/Accra          152
## 3  5     JTA - Jamestown     FALSE       TRUE Africa/Accra          152
## 4  6   ADT - Asylum Down     FALSE       TRUE Africa/Accra          152
## 5  7 ADEPA - Asylum Down     FALSE       TRUE Africa/Accra          152
## 6  8   ADA - Asylum Down     FALSE       TRUE Africa/Accra          152
##   country_name country_iso latitude  longitude datetime_first datetime_last
## 1        Ghana          GH 5.583890 -0.1996800             NA            NA
## 2        Ghana          GH 5.581650 -0.1989800             NA            NA
## 3        Ghana          GH 5.540114 -0.2103972             NA            NA
## 4        Ghana          GH 5.570722 -0.2120555             NA            NA
## 5        Ghana          GH 5.567833 -0.2040278             NA            NA
## 6        Ghana          GH 5.566722 -0.2077778             NA            NA
##                          owner_name providers_id
## 1 Unknown Governmental Organization          209
## 2 Unknown Governmental Organization          209
## 3 Unknown Governmental Organization          209
## 4 Unknown Governmental Organization          209
## 5 Unknown Governmental Organization          209
## 6 Unknown Governmental Organization          209
##                        provider_name
## 1 Dr. Raphael E. Arku and Colleagues
## 2 Dr. Raphael E. Arku and Colleagues
## 3 Dr. Raphael E. Arku and Colleagues
## 4 Dr. Raphael E. Arku and Colleagues
## 5 Dr. Raphael E. Arku and Colleagues
## 6 Dr. Raphael E. Arku and Colleagues
```

This is particularly useful when making many sequential requests or when working
with large datasets where you might exceed the rate limit. The automatic retry
mechanism will pause execution until the rate limit resets, then continue
automatically without raising an error.

### Pagination

The OpenAQ API uses pagination to provide access to large amounts of data in
"pages". The number of results per page is controlled by the `limit` parameter,
which defaults to 100 and can be set as high as 1000. If your query matches more
results than the page limit, you can page through them using the `page`
parameter. With `limit = 1000`, `page = 1` contains results 1-1000, `page = 2`
contains results 1001-2000, and so on. The `page` and `limit` parameters are
available on any resource function that returns more than one result, i.e.
"list" functions such as `list_locations()`, `list_licenses()` or
`list_sensor_measurements()`.

Examples:


``` r
locs <- list_locations(
  limit = 1000,
  page = 1
)
```

```
## Setting `max_tries = 2`.
```

``` r
head(locs)
```

```
##   id                name is_mobile is_monitor     timezone countries_id
## 1  3          NMA - Nima     FALSE       TRUE Africa/Accra          152
## 2  4          NMT - Nima     FALSE       TRUE Africa/Accra          152
## 3  5     JTA - Jamestown     FALSE       TRUE Africa/Accra          152
## 4  6   ADT - Asylum Down     FALSE       TRUE Africa/Accra          152
## 5  7 ADEPA - Asylum Down     FALSE       TRUE Africa/Accra          152
## 6  8   ADA - Asylum Down     FALSE       TRUE Africa/Accra          152
##   country_name country_iso latitude  longitude datetime_first datetime_last
## 1        Ghana          GH 5.583890 -0.1996800             NA            NA
## 2        Ghana          GH 5.581650 -0.1989800             NA            NA
## 3        Ghana          GH 5.540114 -0.2103972             NA            NA
## 4        Ghana          GH 5.570722 -0.2120555             NA            NA
## 5        Ghana          GH 5.567833 -0.2040278             NA            NA
## 6        Ghana          GH 5.566722 -0.2077778             NA            NA
##                          owner_name providers_id
## 1 Unknown Governmental Organization          209
## 2 Unknown Governmental Organization          209
## 3 Unknown Governmental Organization          209
## 4 Unknown Governmental Organization          209
## 5 Unknown Governmental Organization          209
## 6 Unknown Governmental Organization          209
##                        provider_name
## 1 Dr. Raphael E. Arku and Colleagues
## 2 Dr. Raphael E. Arku and Colleagues
## 3 Dr. Raphael E. Arku and Colleagues
## 4 Dr. Raphael E. Arku and Colleagues
## 5 Dr. Raphael E. Arku and Colleagues
## 6 Dr. Raphael E. Arku and Colleagues
```


``` r
locs <- list_locations(
  limit = 1000,
  page = 2
)
```

```
## Setting `max_tries = 2`.
```

``` r
head(locs)
```

```
##     id            name is_mobile is_monitor            timezone countries_id
## 1 1119         HANOVER     FALSE       TRUE    America/New_York          155
## 2 1120  HAMPTON - NASA     FALSE       TRUE    America/New_York          155
## 3 1121     Jerome Mack     FALSE       TRUE America/Los_Angeles          155
## 4 1122     Jersey City     FALSE       TRUE    America/New_York          155
## 5 1123 46th and Farnam     FALSE       TRUE     America/Chicago          155
## 6 1124        Joe Neal     FALSE       TRUE America/Los_Angeles          155
##    country_name country_iso latitude  longitude      datetime_first
## 1 United States          US 37.60613  -77.21880 2016-03-06 20:00:00
## 2 United States          US 37.10373  -76.38702 2016-03-10 08:00:00
## 3 United States          US 36.14187 -115.07874 2016-03-06 20:00:00
## 4 United States          US 40.73169  -74.06657 2016-03-06 20:00:00
## 5 United States          US 41.25732  -95.98383 2016-03-06 20:00:00
## 6 United States          US 36.27059 -115.23828 2016-03-06 20:00:00
##         datetime_last                        owner_name providers_id
## 1 2026-03-09 20:00:00 Unknown Governmental Organization          119
## 2 2026-03-09 20:00:00 Unknown Governmental Organization          119
## 3 2026-03-09 20:00:00 Unknown Governmental Organization          119
## 4 2026-03-09 20:00:00 Unknown Governmental Organization          119
## 5 2018-04-25 05:00:00 Unknown Governmental Organization          119
## 6 2026-03-09 20:00:00 Unknown Governmental Organization          119
##   provider_name
## 1        AirNow
## 2        AirNow
## 3        AirNow
## 4        AirNow
## 5        AirNow
## 6        AirNow
```
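To collect every page rather than requesting them one by one, a loop can keep fetching until a page comes back with fewer rows than `limit`. The sketch below is one possible approach, not part of the openaq package: `fetch_all_pages()` is a hypothetical helper, and `fake_fetch()` simulates a paginated resource of 2500 rows so that the example runs without an API key.

``` r
# Hypothetical helper: call `fetch_page(limit, page)` repeatedly and stack the
# results. Stops at the first short or empty page, or after `max_pages`.
fetch_all_pages <- function(fetch_page, limit = 1000, max_pages = 10) {
  pages <- list()
  for (page in seq_len(max_pages)) {
    res <- fetch_page(limit = limit, page = page)
    if (nrow(res) == 0) break
    pages[[length(pages) + 1]] <- res
    if (nrow(res) < limit) break  # a short page means this was the last one
  }
  do.call(rbind, pages)
}

# Simulated resource: page p holds rows (p - 1) * limit + 1 through p * limit.
fake_rows <- data.frame(id = seq_len(2500))
fake_fetch <- function(limit, page) {
  start <- (page - 1) * limit + 1
  if (start > nrow(fake_rows)) {
    return(fake_rows[0, , drop = FALSE])
  }
  end <- min(page * limit, nrow(fake_rows))
  fake_rows[start:end, , drop = FALSE]
}

all_rows <- fetch_all_pages(fake_fetch, limit = 1000)
nrow(all_rows)  # 2500, gathered from three pages
```

With the real client, the `fetch_page` argument could be a wrapper such as `function(limit, page) list_locations(limit = limit, page = page)`, assuming the returned data frames can be combined with `rbind()`. Each page counts as one request against the rate limit, so `max_pages` is worth keeping modest.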

## Features

### Queryable resources

The OpenAQ API follows a resource-oriented design, allowing developers to
retrieve air quality data through standardized HTTP requests to specific
endpoints representing data resources like measurements, locations, and
parameters. The OpenAQ R package provides functions that correspond to these API
resources, simplifying the process of querying and retrieving data resources.

#### Countries


``` r
get_country()
```



``` r
list_countries()
```

#### Instruments


``` r
get_instrument()
```



``` r
list_instruments()
```


``` r
list_manufacturer_instruments()
```

#### Latest


``` r
list_location_latest()
```


``` r
list_parameter_latest()
```

#### Licenses


``` r
list_licenses()
```


``` r
get_license()
```

#### Locations


``` r
list_locations()
```


``` r
get_location()
```

#### Manufacturers


``` r
list_manufacturers()
```


``` r
get_manufacturer()
```

#### Measurements


``` r
list_sensor_measurements()
```

#### Owners


``` r
list_owners()
```


``` r
get_owner()
```

#### Parameters


``` r
list_parameters()
```


``` r
get_parameter()
```

#### Providers

``` r
list_providers()
```


``` r
get_provider()
```

#### Sensors


``` r
get_sensor()
```


``` r
get_location_sensors()
```



### Data frames

All resource functions return a typed data frame by default. If you prefer to
work with JSON parsed as a standard list you can toggle off data frame parsing
with the `as_data_frame` function parameter.


``` r
list_locations(
  limit = 1000,
  parameters_id = 2,
  providers_id = 166,
  as_data_frame = FALSE
)

#> list()
#> attr(,"meta")
#> attr(,"meta")$name
#> [1] "openaq-api"
#>
#> attr(,"meta")$website
#> [1] "/"
#>
#> attr(,"meta")$page
#> [1] 1
#> ...
```

`as.data.frame` methods are provided for all resource classes as well.

JSON results are parsed with the `httr2::resp_body_json()` function under the hood.
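When working with the list form, records can also be reshaped by hand. The sketch below assumes a hypothetical list of two records with only `id` and `name` fields (values borrowed from the location output shown earlier); real responses carry many more fields and nested structures, so this is not the package's own conversion logic.

``` r
# Hypothetical, heavily simplified records in the nested-list shape that JSON
# parsing typically produces.
records <- list(
  list(id = 3L, name = "NMA - Nima"),
  list(id = 4L, name = "NMT - Nima")
)

# Turn each record into a one-row data frame, then stack the rows.
rows <- lapply(records, function(rec) {
  data.frame(id = rec$id, name = rec$name, stringsAsFactors = FALSE)
})
locations_df <- do.call(rbind, rows)

str(locations_df)
```

For real results, the `as.data.frame` methods provided by the package are the simpler route.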

### Automatic rate limiting

All resource functions provide an option to enable automatic rate limiting to
ensure you do not exceed account rate limits. You can of course implement your
own rate limiting, but the built-in functionality is provided as an
easy-to-use option.


``` r
list_locations(
  limit = 1000,
  parameters_id = 2,
  providers_id = 166,
  rate_limit = TRUE
)
```

This functionality uses the OpenAQ API's
[rate limit headers](https://docs.openaq.org/using-the-api/rate-limits#rate-limit-headers)
and the `httr2::req_retry()` function under the hood.
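For readers who prefer to roll their own handling, a retry loop can be sketched in base R. `with_retry()` is a hypothetical helper and `flaky_request()` simulates a rate-limited call. This illustrates the general retry idea only and is not how httr2 or openaq implement it: it retries on any error and waits a fixed time rather than reading the reset header.

``` r
# Hypothetical helper: call `fn()` up to `max_tries` times, pausing between
# attempts, and re-raise the last error if every attempt fails.
with_retry <- function(fn, max_tries = 2, wait_seconds = 1) {
  stopifnot(max_tries >= 1)
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fn(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    if (attempt < max_tries) {
      Sys.sleep(wait_seconds)
    }
  }
  stop(result)
}

# Simulated request: fails on the first call, succeeds afterwards.
calls <- 0
flaky_request <- function() {
  calls <<- calls + 1
  if (calls == 1) {
    stop("429 Too Many Requests (simulated)")
  }
  "ok"
}

result <- with_retry(flaky_request, max_tries = 2, wait_seconds = 0)
result  # "ok", returned on the second attempt
```

A real implementation would usually retry only on rate limit responses and take its wait time from `x_ratelimit_reset`, which is what makes the built-in option convenient.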

### Debugging

Every resource function provides an optional parameter named `dry_run` that
prevents a full HTTP request to the API and instead prints out a summary of how
the request would have been made.



``` r
list_locations(
  limit = 1000,
  parameters_id = 2,
  providers_id = 166,
  dry_run = TRUE
)
```

This can be helpful when debugging to identify issues and compare the raw query
URL and headers.

This functionality uses the `httr2::req_dry_run()` function under the hood.
