---
title: "Importing a database"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Importing a database from Nominatim}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup, include=FALSE}
library(photon)
```

As described in the [introduction vignette](photon.html), you can download pre-built search indices for selected country extracts. If you need more control over the geocoding data, you can instead import from an existing Nominatim database or from a JSON dump. This vignette walks through the setup and import of an external database.

# Importing from Nominatim

Nominatim databases can only be set up reliably on Linux systems. Here, we use the `mediagis/nominatim` Docker image to set up Nominatim irrespective of the operating system. You can use the helper functions `cmd_options()` and `run()` to run a Nominatim Docker container. It is important to publish port 5432 on the host machine; otherwise, photon cannot connect to the database. The example below imports the Geofabrik extract for Samoa, waits until the container reports that it is ready, and then sets the password of the `nominatim` database user.

```{r, eval=FALSE}
opts <- cmd_options(
  e = "PBF_URL=https://download.geofabrik.de/australia-oceania/samoa-latest.osm.pbf",
  e = "NOMINATIM_PASSWORD=mypassword",
  e = "FREEZE=true",
  p = "8080:8080",
  p = "5432:5432",
  name = "nominatim",
  "mediagis/nominatim:4.4",
  use_double_hyphens = TRUE
)

# Note: on Windows, make sure you have Docker Desktop running!
# `process` is the R6 class from the processx package.
nominatim <- processx::process$new("docker", c("run", opts))

# Wait until Nominatim is ready
ready <- FALSE
while (!ready) {
  Sys.sleep(5)
  logs <- run("docker", c("logs", "nominatim"))
  ready <- any(grepl("ready to accept requests", logs))
}

run(
  "docker",
  c(
    "exec", "--user", "postgres", "nominatim", "psql", "-d", "nominatim", "-c",
    "ALTER USER nominatim WITH ENCRYPTED PASSWORD 'mypassword'"
  )
)
```
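
The polling loop above runs indefinitely if the container never becomes ready, for example when the download of the extract fails. A bounded variant could look like the following sketch. `wait_until()` and `nominatim_ready()` are hypothetical helpers and not part of photon, the timeout and interval values are arbitrary, and the log check assumes that `run()` refers to `processx::run()`.

```{r, eval=FALSE}
# Hypothetical helper (not part of photon): poll `check()` until it returns
# TRUE or until `timeout` seconds have elapsed.
wait_until <- function(check, timeout = 3600, interval = 5) {
  deadline <- Sys.time() + timeout
  repeat {
    if (isTRUE(check())) return(invisible(TRUE))
    if (Sys.time() >= deadline) {
      stop("Timed out after ", timeout, " seconds.", call. = FALSE)
    }
    Sys.sleep(interval)
  }
}

# Hypothetical readiness check: search the container logs for the same
# message as the loop above. `error_on_status = FALSE` keeps a failing
# `docker logs` call from raising an R error (assumes processx::run()).
nominatim_ready <- function() {
  logs <- processx::run(
    "docker", c("logs", "nominatim"),
    error_on_status = FALSE
  )
  any(grepl("ready to accept requests", c(logs$stdout, logs$stderr)))
}

# wait_until(nominatim_ready)
```

Keeping the polling logic separate from the readiness check means the same helper can wait on any other condition by passing a different function.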

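To verify that the database accepts connections, connect to it from R using the password set above.

```{r, eval=FALSE}
library(RPostgres)
# Connection details match the container started above
db <- dbConnect(
  Postgres(),
  host = "localhost",
  port = 5432,
  dbname = "nominatim",
  user = "nominatim",
  password = "mypassword"
)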
dbGetInfo(db)
#> $dbname
#> [1] "nominatim"
#> 
#> $host
#> [1] "localhost"
#> 
#> $port
#> [1] "5432"
#> 
#> $username
#> [1] "nominatim"
#> 
#> $protocol.version
#> [1] 3
#> 
#> $server.version
#> [1] 140013
#> 
#> $db.version
#> [1] 140013
#> 
#> $pid
#> [1] 604

dbDisconnect(db)
```
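
If a script only needs a yes-or-no answer, `DBI::dbCanConnect()` attempts the connection and returns `TRUE` or `FALSE` instead of raising an error. The following is a sketch under assumptions: `nominatim_reachable()` is a hypothetical wrapper that is not part of photon, and the default host, port, user, and database name are assumed to match the container started above.

```{r, eval=FALSE}
# Hypothetical wrapper (not part of photon): TRUE if a connection to the
# Nominatim database can be opened with the given credentials.
nominatim_reachable <- function(password,
                                user = "nominatim",
                                dbname = "nominatim",
                                host = "localhost",
                                port = 5432) {
  isTRUE(DBI::dbCanConnect(
    RPostgres::Postgres(),
    dbname = dbname,
    host = host,
    port = port,
    user = user,
    password = password
  ))
}

# nominatim_reachable("mypassword")
```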

If the connection succeeds, you can create a new photon instance and import the database using `$import()`. The import creates the folder `photon_data` inside the given photon directory.

```{r, eval=FALSE}
dir <- file.path(tempdir(), "photon")
photon <- new_photon(dir, overwrite = TRUE)
#> ℹ java version "22" 2024-03-19
#> ℹ Java(TM) SE Runtime Environment (build 22+36-2370)
#> ℹ Java HotSpot(TM) 64-Bit Server VM (build 22+36-2370, mixed mode, sharing)
#> ✔ Successfully downloaded photon 1.0.0. [8.2s]        
#> ℹ No search index downloaded! Download one or import from a Nominatim database.
#> • Version: 1.0.0

# Use the password set for the `nominatim` database user above
photon$import(host = "localhost", password = "mypassword")
```
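
Because the import is described above as creating a `photon_data` folder inside the photon directory, a quick file-system check can confirm that it produced output. This is only a sketch: `has_photon_data()` is a hypothetical helper that is not part of photon, it assumes the folder sits directly below the photon directory, and a non-empty folder does not prove that the index is complete.

```{r, eval=FALSE}
# Hypothetical helper (not part of photon): TRUE if `photon_data` exists
# directly below `path` and contains at least one file or folder.
has_photon_data <- function(path) {
  data_dir <- file.path(path, "photon_data")
  dir.exists(data_dir) && length(list.files(data_dir)) > 0
}

# has_photon_data(dir)
```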

After the import has finished, you can start the photon instance and send geocoding requests to it.

```{r, eval=FALSE}
photon$start()
#> 2024-10-24 23:26:46,360 [main] WARN  org.elasticsearch.node.Node - version [5.6.16-SNAPSHOT] is a pre-release version of Elasticsearch and is not suitable for production
#> ✔ Photon is now running. [11.1s]
```

```{r, eval=FALSE}
geocode("Apia", limit = 3)
#> Simple feature collection with 3 features and 13 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -171.7631 ymin: -13.83613 xmax: -171.7512 ymax: -13.82611
#> Geodetic CRS:  WGS 84
#> # A tibble: 3 × 14
#>     idx osm_type     osm_id country osm_key city        street     countrycode osm_value name  state type  extent
#>   <int> <chr>         <int> <chr>   <chr>   <chr>       <chr>      <chr>       <chr>     <chr> <chr> <chr> <list>
#> 1     1 W        1322127938 Samoa   place   NA          NA         WS          city      Apia  Tuam… city  <dbl> 
#> 2     1 W         723300892 Samoa   landuse Matautu Tai NA         WS          harbour   Apia… Tuam… other <dbl> 
#> 3     1 W         666117780 Samoa   tourism Levili      Levili St… WS          attracti… Apia… Tuam… house <dbl> 
#> # ℹ 1 more variable: geometry <POINT [°]>
```


# Importing from a JSON dump

Since photon 0.7.0, databases can be dumped to and imported from JSON files (so-called Nominatim dump files, see the [docs](https://github.com/komoot/photon/blob/master/docs/json-dump-format-0.1.0.md)). Pre-built databases are not available for every region through `$download_data()`, but JSON dumps are. To download a JSON dump instead of a pre-built database, set `json = TRUE`.

```{r, eval=FALSE}
photon$remove_data()
photon$download_data("Andorra", json = TRUE)
```

You can then import the downloaded dump by calling `$import()` with `json = TRUE`.

```{r, eval=FALSE}
photon$import(json = TRUE)
```
