The `fuzzyfaers` package

Functions to extract drug records from the FAERS (FDA Adverse Event Reporting System) data in a postgreSQL database for analysis in R.

As drug names (whether they be generic/clinical or branded) are provided to the FDA and do not undergo standardisation, this package provides an automated way to extract records of a drug, its synonyms and their inevitable misspellings (plus superfluous text). Fuzzy string matching is required to do this, thus, fuzzyfaers.

Background and motivation

Signal detection of adverse drug events often requires analysis of tabulated count data as seen in the below table.

	`Event(s) X`	`Event(s) Y`
`Drug(s) A`	a	b
`Drug(s) B`	c	d

To get to this data, or a flat dataset with a single record for each adverse event is a time consuming process. Not only that but finding synonyms to drug names is a painfully manual process, compounded by alternative spellings and superfluous text included in the drug text fields.

This package aims to

provide data in as ready-to-go format as possible with instructions to house it in a manageable database environment accessible by R, and
automate the drug synonym (and misspelling) searching in the database to extract single record adverse event data.frames in R.

Prerequisites

All required software are free and available on Linux, Mac OS and Windows. Testing has been undertaken on Ubuntu LTS 18.04 and Windows 10 systems. You will likely need ~10Gb of disk space for the database/data installation and enough memory to deal with data.frames in R potentially with millions of records (I have found 32Gb of memory sufficient for me).

Installation of PostgreSQL server and pgAdmin which are normally both included in the installation.
The curated tables (demo, drug, indi, outc, reac, rpsr, ther) from FAERS as csvs available with the link found here. These csvs are ready-to-go data compiled from the raw and seperated quarterly extracts available here from the quarters 2013Q1 to 2020Q3 inclusive. I will endevour to update these files to include new quarterly data. For reproducibility, I also include a HTML rendered rmarkdown document detailing the pre-processing of the FAERS ascii text quarterly extracts to single files for each FAERS tables.
OHDSI (Observational Health Data Sciences and Informatics) common data model (CDM) vocabulary files. These files allow us to automatically check for drug synonyms through their excellent vocabulary mappings. The vocabulary files can be downloaded at Athena (sign up required, free).
An installation of R.

Example usage

library(devtools) # see https://www.r-project.org/nosvn/pandoc/devtools.html
devtools::install_github('tystan/fuzzyfaers')
library(fuzzyfaers)
### see help file to run example

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

The `fuzzyfaers` package

Background and motivation

Prerequisites

Example usage

`fuzzyfaers` process to extact drug specific FAERS data

Files

README.md

Latest commit

History

README.md

File metadata and controls

The fuzzyfaers package

Background and motivation

Prerequisites

Example usage

fuzzyfaers process to extact drug specific FAERS data

The `fuzzyfaers` package

`fuzzyfaers` process to extact drug specific FAERS data