---
title: "gwas2crispr: From GWAS to CRISPR-ready files (hg38)"
pagetitle: "gwas2crispr: From GWAS to CRISPR-ready files (hg38)"
output: rmarkdown::html_vignette
vignette: >
    %\VignetteIndexEntry{gwas2crispr: From GWAS to CRISPR-ready files (hg38)}
    %\VignetteEngine{knitr::rmarkdown}
    %\VignetteEncoding{UTF-8}
---

```{r, include=FALSE}
# Keep vignette fast and CRAN-safe: disable evaluation by default
knitr::opts_chunk$set(
  collapse = TRUE, comment = "#>",
  eval = FALSE, message = FALSE, warning = FALSE
)
```

## Overview

`gwas2crispr` retrieves significant genome-wide association study (GWAS) SNPs for an Experimental Factor Ontology (EFO) trait, aggregates variant/gene/study metadata, and **optionally** exports CSV, BED, and FASTA files for downstream functional genomics and CRISPR guide design. The package targets **GRCh38/hg38**.

**Key design for CRAN compliance:** functions do **not** write files by default. Files are written **only if** you set `out_prefix`. In examples, tests, and vignettes, point `out_prefix` at a location under `tempdir()`.
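
The opt-in behaviour can be sketched in a few lines of base R. This is a minimal illustration of the pattern, not the package's internal code: `write_if_requested()` and the `_demo.csv` suffix are hypothetical names used only here.

```{r}
# Hypothetical helper illustrating opt-in file output (not part of gwas2crispr)
write_if_requested <- function(df, out_prefix = NULL) {
  if (is.null(out_prefix)) {
    return(invisible(NULL))              # default: nothing touches the file system
  }
  path <- paste0(out_prefix, "_demo.csv")
  utils::write.csv(df, path, row.names = FALSE)
  invisible(path)                        # return the path so callers can find the file
}

snps <- data.frame(rsid = c("rs1", "rs2"), p = c(1e-8, 5e-7))

no_file  <- write_if_requested(snps)                                 # no file written
csv_path <- write_if_requested(snps, file.path(tempdir(), "lung"))   # written under tempdir()
```

Returning the path invisibly keeps the console quiet while still letting callers locate the output.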

> Runtime prerequisites: the GWAS Catalog client **gwasrapidd** is required for data retrieval; **Biostrings** + **BSgenome.Hsapiens.UCSC.hg38** are required only if you want FASTA output.
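
Before running the pipeline, it may help to check which of these packages are present. The sketch below uses only base R; `missing_pkgs()` is a hypothetical helper name, not a function exported by the package.

```{r}
# Report which packages from a set are not installed (base R only)
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

required <- "gwasrapidd"                                     # data retrieval
optional <- c("Biostrings", "BSgenome.Hsapiens.UCSC.hg38")   # FASTA output only

if (length(missing_pkgs(required)) > 0) {
  message("Install before fetching: ", toString(missing_pkgs(required)))
}
if (length(missing_pkgs(optional)) > 0) {
  message("FASTA output unavailable without: ", toString(missing_pkgs(optional)))
}
```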

### Core functions

* `fetch_gwas(efo_id, p_cut)` — fetch significant associations via `gwasrapidd` with a REST fallback.
* `run_gwas2crispr(efo_id, p_cut, flank_bp, out_prefix = NULL, verbose = FALSE)` — end-to-end pipeline that returns objects; writes CSV/BED/FASTA **only** if `out_prefix` is provided.
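
The "primary client with a fallback" idea mentioned for `fetch_gwas()` can be illustrated with `tryCatch()`. The sketch is generic: `with_fallback()`, `primary_source()`, and `rest_source()` are hypothetical stand-ins, and the package's actual fallback logic may differ.

```{r}
# Generic fallback pattern (illustrative only; not the package's internals)
with_fallback <- function(primary, fallback) {
  tryCatch(
    primary(),
    error = function(e) {
      message("Primary source failed (", conditionMessage(e), "); using fallback.")
      fallback()
    }
  )
}

# Hypothetical stand-ins for a client call and a REST call
primary_source <- function() stop("service unavailable")
rest_source    <- function() data.frame(rsid = "rs1", p = 1e-8)

hits <- with_fallback(primary_source, rest_source)
```

Only errors trigger the fallback here; a primary call that succeeds is returned unchanged.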

> This vignette does not make network calls or write files (global `eval = FALSE`), which keeps CRAN checks deterministic.

## Installation

```{r}
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c("Biostrings", "BSgenome.Hsapiens.UCSC.hg38"))

install.packages("gwasrapidd")  # required for GWAS retrieval

if (!requireNamespace("devtools", quietly = TRUE))
  install.packages("devtools")
devtools::install_github("leopard0ly/gwas2crispr")
```

## Quick examples (primary + CRAN-safe)

### A) Primary workflow — write outputs to the current working directory

```{r}
library(gwas2crispr)

# Lung disease (EFO_0000707), GRCh38/hg38
run_gwas2crispr(
  efo_id     = "EFO_0000707",
  p_cut      = 1e-6,
  flank_bp   = 300,
  out_prefix = "lung"   # produces: lung_snps_full.csv / lung_snps_hg38.bed / lung_snps_flank300.fa
)
```
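
To make `flank_bp` concrete, the sketch below converts a 1-based SNP position into a 0-based, half-open BED-style interval with `flank_bp` bases on each side. A symmetric flank and clamping at the chromosome start are assumptions made for illustration; `snp_to_bed()` and the example coordinate are hypothetical and do not describe the package's exact output.

```{r}
# Hypothetical helper: 1-based SNP position -> 0-based, half-open interval
snp_to_bed <- function(chr, pos, flank_bp = 300) {
  data.frame(
    chrom = chr,
    start = pmax(pos - 1 - flank_bp, 0),  # BED start is 0-based; clamp at 0
    end   = pos + flank_bp,               # BED end is exclusive
    stringsAsFactors = FALSE
  )
}

bed <- snp_to_bed("chr1", 1000000, flank_bp = 300)
bed$end - bed$start  # window width: 2 * 300 + 1 = 601 bases
```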

### B) CRAN-safe — write into a temporary directory

```{r}
library(gwas2crispr)

tmp <- tempdir()  # CRAN-safe target
res <- run_gwas2crispr(
  efo_id     = "EFO_0000707",
  p_cut      = 1e-6,
  flank_bp   = 300,
  out_prefix = file.path(tmp, "lung"),  # writes here, not to user's home
  verbose    = FALSE
)

# Paths of the files written. The component names below assume a named list; inspect with str(res) if unsure.
res$csv
res$bed
res$fasta  # present only if BSgenome/Biostrings are installed
```
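
After a run like the one above, it can be useful to confirm what landed in the temporary directory and to clean up afterwards. The sketch assumes the `lung_snps_` file-name prefix shown in example A and uses base R only.

```{r}
tmp <- tempdir()

# List anything matching the assumed output naming scheme
written <- list.files(tmp, pattern = "^lung_snps_", full.names = TRUE)
basename(written)

# Remove the files once they are no longer needed
unlink(written)
```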

## CLI usage (optional)

```bash
SCRIPT="$(Rscript -e "cat(system.file('scripts','gwas2crispr.R', package='gwas2crispr'))")"
OUT_DIR="$(mktemp -d)"  # a tempdir() from a separate Rscript call is removed when that call exits
Rscript "$SCRIPT" -e EFO_0000707 -p 1e-6 -f 300 -o "$OUT_DIR/lung"
```

> The `-o` path should point to a temporary or user-chosen directory. Avoid writing to the package root when reproducing examples under CRAN-like conditions.

## Session info

```{r, eval=TRUE}
sessionInfo()
```